Open Source Firmware Conference 2025

OpenBMC Behavior-Based Fault Injection Testing Method
2025-10-07 17:00–17:30, Main

This talk presents a behavior-based fault injection approach for OpenBMC firmware. By injecting faults at the I/O layer and using real-world failure models, we enhance grey-box testing coverage without modifying code or restarting services. Leveraging tools like Frida and eBPF, our method enables efficient, interpretable, and cross-environment validation of BMC robustness.

This presentation introduces a behavior-based fault injection testing methodology tailored for OpenBMC firmware. In BMC (Baseboard Management Controller) development, a significant portion of engineering effort goes into handling abnormal system behavior. Ensuring stability and robustness under such conditions remains a critical challenge, particularly because many OpenBMC-related faults are hard to reproduce. Traditional testing techniques often fall short here.
To overcome these limitations, we adopt a behavior-driven approach inspired by fuzz testing. By injecting faults at the I/O layer, we enrich the diversity of grey-box test scenarios. Our methodology begins with constructing fixed fault models derived from real-world failure cases, as recorded in systems like JIRA. These models are then programmatically mutated to expand the coverage of fault types and injection points. A newly developed fault injection toolchain is used to perform comprehensive validation of OpenBMC firmware, with a focus on exception handling and recovery mechanisms.
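The step from fixed fault models to programmatically mutated ones can be pictured with a small sketch. It is only an illustration under stated assumptions: the `FaultModel` fields, the `mutate` helper, and the chosen targets and errno values are hypothetical and are not taken from the toolchain described in the talk.

```python
import errno
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class FaultModel:
    """One I/O-layer fault (hypothetical schema): which call fails, how, and when."""
    target: str       # name of the I/O call to fail, e.g. "read"
    error: int        # errno value handed back to the caller
    after_calls: int  # inject on the N-th matching call (1-based)


def mutate(seed, targets, errors, call_offsets):
    """Expand one seed model into variants over injection point, error, and timing."""
    variants = []
    for target, error, offset in product(targets, errors, call_offsets):
        candidate = replace(
            seed,
            target=target,
            error=error,
            after_calls=max(1, seed.after_calls + offset),  # keep the count 1-based
        )
        # Skip the seed itself and any duplicate produced by clamping.
        if candidate != seed and candidate not in variants:
            variants.append(candidate)
    return variants


# Seed model, as it might be written down from a recorded failure case.
seed = FaultModel(target="read", error=errno.EIO, after_calls=3)
variants = mutate(
    seed,
    targets=["read", "write"],
    errors=[errno.EIO, errno.ETIMEDOUT],
    call_offsets=[-1, 0, 1],
)

for v in variants:
    print(v)
```

Each variant changes the injection point, the error, or the timing relative to the seed, so a single recorded failure yields a small family of related test cases.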
The proposed approach offers several key advantages:
- Low Dependency: Faults are injected dynamically at runtime without requiring modifications to source code, service restarts, or firmware reflashing.
- High Reliability: Tests run directly on release firmware builds with unchanged runtime configuration, so results reflect the firmware as shipped.
- Strong Interpretability: Injection events, timing, and fault models are explicitly derived from known issues, making test results easier to analyze and correlate.
- Cross-Scenario Compatibility: The method supports execution across different environments, including QEMU virtual platforms and physical hardware targets.
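The idea of injecting a fault at the I/O layer at runtime, without touching the code under test, can be illustrated in plain Python. This is a stand-in for the mechanism only, not the Frida or eBPF tooling from the talk: `read_sensor` and `inject_read_fault` are hypothetical names, and the interception is done by temporarily swapping `os.read` inside one process.

```python
import errno
import os
import tempfile
from contextlib import contextmanager


def read_sensor(path):
    """Code under test (hypothetical): returns raw bytes, or None on I/O failure."""
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.read(fd, 64)
        finally:
            os.close(fd)
    except OSError:
        return None  # the recovery path the test wants to exercise


@contextmanager
def inject_read_fault(error=errno.EIO, on_call=1):
    """Make the on_call-th os.read raise OSError while the context is active."""
    original = os.read
    calls = 0

    def faulty_read(fd, n):
        nonlocal calls
        calls += 1
        if calls == on_call:
            raise OSError(error, os.strerror(error))
        return original(fd, n)

    os.read = faulty_read
    try:
        yield
    finally:
        os.read = original  # always restore the real call


# A temporary file stands in for a sensor node.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"42\n")
    sample_path = f.name

baseline = read_sensor(sample_path)       # normal read
with inject_read_fault():
    faulted = read_sensor(sample_path)    # read fails with EIO, handler returns None
recovered = read_sensor(sample_path)      # behavior is back to normal afterwards
os.unlink(sample_path)

print(baseline, faulted, recovered)
```

The function under test is never edited or restarted; only the call it depends on is intercepted for the duration of the fault window and then restored.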
To implement this methodology, we leverage dynamic instrumentation tools such as Frida and eBPF, along with custom-developed scripts. The testing framework supports multiple techniques, including:
- Operation-sequence-based fault injection
- Log-keyword-based fault triggering
- Interception of system libraries and components
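The first two techniques in the list above both decide when a fault should fire. A minimal sketch of that trigger logic, assuming an observed stream of operation names and log lines, might look as follows. The class names, operation names, and log lines are all made up for illustration and do not reproduce real OpenBMC output or the framework's API.

```python
import re
from collections import deque


class SequenceTrigger:
    """Fires when the most recent operations match an expected sequence."""

    def __init__(self, sequence):
        self.sequence = list(sequence)
        self.recent = deque(maxlen=len(self.sequence))  # sliding window

    def observe(self, operation):
        self.recent.append(operation)
        return list(self.recent) == self.sequence


class LogKeywordTrigger:
    """Fires when a log line matches a keyword or regular expression."""

    def __init__(self, pattern):
        self.pattern = re.compile(pattern)

    def observe(self, log_line):
        return self.pattern.search(log_line) is not None


# Hypothetical operation stream: inject right after an "open" followed by a "write".
operations = ["open", "read", "open", "write", "close"]
seq_trigger = SequenceTrigger(["open", "write"])
sequence_hits = [i for i, op in enumerate(operations) if seq_trigger.observe(op)]

# Hypothetical log stream: inject when a power transition is logged.
log_lines = [
    "service started",
    "host power on requested",
    "sensor poll ok",
]
log_trigger = LogKeywordTrigger(r"power on requested")
log_hits = [i for i, line in enumerate(log_lines) if log_trigger.observe(line)]

print(sequence_hits, log_hits)
```

In a real setup the `observe` calls would be fed by instrumentation hooks or a log follower, and a hit would arm one of the fault models rather than just record an index.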
Together, these techniques shorten the test-and-fix cycle for OpenBMC and help improve its stability and robustness.

Lei Yu

Lei Yu is the firmware architect at ByteDance, bringing extensive expertise in OpenBMC community contributions and complex system design. With years of hands-on experience, he is dedicated to enhancing system reliability, increasing observability, and simplifying architectures to improve usability for Site Reliability Engineers (SREs). Lei is passionate about driving innovations that make firmware and system management more transparent, efficient, and robust, ultimately enabling smoother operations and faster issue resolution.

This speaker also appears in:

IPMI HTTPS Interface: Towards a Password-Free BMC

Chunhui Jia

Chunhui Jia is the firmware team architect at ByteDance and a seasoned expert in the firmware industry. With deep experience across UEFI BIOS, BMC, and OS driver development, he has built a strong track record in tackling complex firmware challenges. His current focus centers on improving firmware debuggability, enhancing observability, and advancing automated issue diagnosis and troubleshooting.
