Implementing a High-Assurance Bootloader Jump

In high-assurance embedded systems, the transition from a bootloader to the main application is more than a simple branch instruction. To ensure system reliability and safety, the bootloader must guarantee that the application inherits a deterministic machine state. Ideally, this state should be as close as possible to the post-reset state. Matching the post-reset state makes the presence of a bootloader transparent to the application.

In this article, we analyze the architectural requirements for this transition on ARMv7-M Cortex-M microcontrollers and define a robust strategy for a clean handover.

Hardware Foundations: The Cortex-M Boot Sequence

To understand the jump, we must first look at the hardware’s native entry sequence, referred to in ARM documentation as the TakeReset() procedure. On reset, a Cortex-M processor performs two critical steps before executing the first instruction:

  1. MSP Initialization: It loads the Main Stack Pointer (MSP) from the first word of the vector table (address 0x00000000).
  2. Reset Vector Fetch: It loads the Reset Vector (the address of the entry point) from the second word of the vector table (0x00000004).
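
A bootloader can read the same two words from an application's vector table that the hardware reads from its own. The following is a minimal sketch, assuming the table is reachable as an array of 32-bit words; the type reset_entry_t and the function read_reset_entry() are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: the two words that TakeReset() consumes. */
typedef struct {
    uint32_t initial_msp;   /* word 0 of the vector table */
    uint32_t reset_vector;  /* word 1 of the vector table */
} reset_entry_t;

/* Read the first two vector table words from a table in memory. */
static reset_entry_t read_reset_entry(const volatile uint32_t *vector_table)
{
    reset_entry_t entry;
    entry.initial_msp  = vector_table[0];
    entry.reset_vector = vector_table[1];
    return entry;
}
```

On target, the argument would be the base address of the relevant vector table cast to a pointer; that address is device- and project-specific.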

A closer look at TakeReset() shows that the post-reset state of the CPU is well-defined. It includes the following:

  • CPU Mode: The processor starts in Thread mode with privileged access.
  • Stack Pointer: The MSP is the active stack pointer (CONTROL.SPSEL = 0).
  • Interrupts: Interrupts are not globally masked by the Core (PRIMASK = 0). The NVIC has all interrupts disabled and no pending interrupts.
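
The core part of this state can be expressed as a check over a register snapshot. The sketch below is one possible formulation, not a definitive implementation: core_snapshot_t, is_post_reset_core_state() and the CONTROL_NPRIV / CONTROL_SPSEL macros are names made up for this example. On target, the snapshot would be filled from the CMSIS intrinsics __get_IPSR(), __get_CONTROL() and __get_PRIMASK().

```c
#include <stdbool.h>
#include <stdint.h>

/* Snapshot of the core registers relevant to the post-reset state. */
typedef struct {
    uint32_t ipsr;     /* exception number in bits [8:0]; 0 means Thread mode */
    uint32_t control;  /* bit 0: nPRIV, bit 1: SPSEL */
    uint32_t primask;  /* bit 0: 1 masks all configurable-priority interrupts */
} core_snapshot_t;

#define CONTROL_NPRIV (1u << 0)  /* 0 = privileged Thread mode */
#define CONTROL_SPSEL (1u << 1)  /* 0 = MSP is the active stack pointer */

/* Returns true if the snapshot matches the post-reset core state. */
static bool is_post_reset_core_state(const core_snapshot_t *s)
{
    bool thread_mode = (s->ipsr & 0x1FFu) == 0u;
    bool privileged  = (s->control & CONTROL_NPRIV) == 0u;
    bool using_msp   = (s->control & CONTROL_SPSEL) == 0u;
    bool unmasked    = (s->primask & 1u) == 0u;
    return thread_mode && privileged && using_msp && unmasked;
}
```

A check like this could serve as a debug assertion at the application's entry point. It does not cover the NVIC enable and pending state, which would need to be inspected separately.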

The following state is microcontroller-specific but generally follows a similar pattern across Cortex-M devices:

  • Clock: The system clock is in its default state, typically the internal oscillator (HSI) running at a low frequency (e.g., 8 MHz), with all PLLs disabled.
  • Peripherals: All peripherals are in their reset state, which means they are disabled and not generating interrupts.
  • Pin Configurations: Most pins are typically configured as GPIO inputs.

The Application Contract

The post-reset state forms the baseline for the application contract: the set of hardware assumptions the application makes upon entry. By matching this state, the bootloader becomes transparent, providing two key benefits:

  • Portability: The application behaves identically whether it is launched by the bootloader or runs without a bootloader, simplifying development and testing.
  • Robustness: It prevents “state pollution”—subtle bugs caused by leftover peripheral states or pending interrupts that the application does not expect.

Handling Deviations

Real-world constraints—such as hardware limitations, performance requirements, or cases where the complexity of restoration outweighs the risk—may force deviations from a pure post-reset state. Practical examples include initialized PLLs, enabled caches, or active watchdog timers.

In these scenarios, the bootloader must ensure the handover remains documented and deterministic. Any state left active should be limited to what the application would initialize anyway, ensuring the transition stays predictable. Crucially, even in this relaxed model, the application must inherit a clean and quiescent interrupt state. This is critical to prevent unexpected ISRs from firing immediately upon entry, which can lead to unpredictable behavior.

Fulfilling the Contract

There are two ways the bootloader can go about achieving this state before handing over control to the application:

  • Software-mediated state restoration: The bootloader software restores, step by step, the post-reset state of the peripherals and the core of the microcontroller.
  • Using the system reset: The bootloader uses a system reset to achieve the post-reset state before jumping to the application.

They both have their pros and cons, and the choice depends on the specific requirements and constraints of the system.

The first option, software-mediated state restoration, requires carefully tracking every piece of state the bootloader modifies and ensuring that it is properly restored before the jump. This is tedious, and it is easy to miss something, which can lead to subtle bugs in the application. In addition, this approach requires a good understanding of the ARM architecture and of the specific microcontroller being used.

The second approach, performing a system reset, is much simpler and more robust, as it relies on the hardware to ensure a clean state. However, it requires a mechanism to communicate across resets to instruct the bootloader to jump directly to the application instead of going through the normal boot sequence. Such a mechanism may not be available, or it may add too much complexity to the system.

Due to its robustness and simplicity, the second approach is generally recommended for high-assurance systems, unless there are specific constraints that make it infeasible.

Software-Mediated State Scrubbing

In this approach, the bootloader manually returns every modified hardware component to its reset or quiescent state. This process, referred to as State Scrubbing, requires an exhaustive audit of all peripherals, core registers, and clock configurations touched by the bootloader.

The sequence described below assumes that it executes in privileged Thread mode and uses the MSP as the stack pointer. Performing the handover from non-privileged mode or Handler mode requires a more complex sequence which is out of scope of this article.

The architectural sequence for a safe handover is as follows:

  1. Global Interrupt Masking: Set the PRIMASK register to mask all interrupts. This must be the first step to ensure no Interrupt Service Routine (ISR) can fire while you are dismantling the system state or relocating the stack.
  2. Peripheral Deactivation: Explicitly disable every peripheral used by the bootloader (UARTs, Timers, DMA). It is critical to ensure all ongoing DMA transfers are fully stopped to prevent background memory corruption during the transition.
  3. Data Synchronization Barrier (DSB): Execute a DSB instruction. This ensures that all previous memory-mapped I/O writes (the peripheral deactivations) have completed before the processor continues. Depending on the bus and peripheral design, a peripheral may need a few additional cycles to deassert its interrupt request, so check the device reference manual for any required read-back or delay.
  4. NVIC Sanitization: Loop through the NVIC to disable all interrupts and clear all pending bits. Because the peripheral deactivation writes have already completed, a recently disabled peripheral should no longer be able to set a pending bit again after it has been cleared.
  5. Clock Restoration: Revert the system clock to the internal oscillator (HSI) and disable all PLLs. This ensures the application inherits a stable, default timing environment. This may also involve restoring flash wait states if the bootloader had increased the clock speed, as well as reconfiguring the SysTick timer to a known state.
  6. Cache Deactivation: If the bootloader has enabled instruction or data caches (common on Cortex-M7), they must be cleaned and invalidated, then disabled. This ensures the application starts with a clean memory view and prevents background cache write-backs from corrupting application data.
  7. Instruction Synchronization Barrier (ISB): Execute an ISB instruction to flush the processor pipeline. This ensures that all previous context changes (like clock shifts, NVIC updates, or cache disabling) are fully synchronized before the final branch.
  8. Interrupt Unmasking: Clear the PRIMASK register to restore the standard post-reset state (PRIMASK = 0). This ensures the application can receive interrupts as soon as it begins execution. In practice, this is often performed within the final assembly jump sequence, directly after the stack pointer relocation and immediately before the branch.
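
The NVIC sanitization step (step 4) can be sketched as a loop over the clear-enable and clear-pending register banks. The sketch below is written against a hypothetical stand-in structure, nvic_clear_regs_t, so that it stays self-contained; on target, CMSIS exposes the corresponding registers as NVIC->ICER[] and NVIC->ICPR[]. The function name nvic_sanitize() and the reg_count parameter are assumptions made for illustration. Both banks are write-one-to-clear, which the plain-memory stand-in does not model: it only records what was written.

```c
#include <stddef.h>
#include <stdint.h>

#define NVIC_BANK_SIZE 8u  /* eight 32-bit registers cover 256 interrupt lines */

/* Hypothetical stand-in for the two NVIC register banks used here. */
typedef struct {
    volatile uint32_t icer[NVIC_BANK_SIZE];  /* Interrupt Clear-Enable  */
    volatile uint32_t icpr[NVIC_BANK_SIZE];  /* Interrupt Clear-Pending */
} nvic_clear_regs_t;

/* Disable every interrupt and clear every pending bit in the first
 * reg_count registers of each bank. Assumes PRIMASK is already set. */
static void nvic_sanitize(nvic_clear_regs_t *nvic, size_t reg_count)
{
    for (size_t i = 0; i < reg_count && i < NVIC_BANK_SIZE; ++i) {
        nvic->icer[i] = 0xFFFFFFFFu;  /* writing 1 disables the interrupt */
        nvic->icpr[i] = 0xFFFFFFFFu;  /* writing 1 clears the pending bit */
    }
}
```

The number of registers to touch depends on how many interrupt lines the device implements, which is why it is passed in rather than hard-coded.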

For clarity and brevity, this sequence leaves out a few aspects, such as floating point unit (FPU) state and hardware watchdogs which may not be resettable. For a full implementation, these need to be included in the above sequence.

Using the System Reset

This approach relies on the hardware to perform the cleanup by triggering a warm reset (software reset). This ensures all peripherals, the NVIC, and core registers return to their known reset state.

Persistent Storage: Backup Registers

To communicate across a reset, the bootloader needs a way to persist a “jump request” flag. Many Cortex-M microcontrollers provide Backup Registers (often part of the RTC or Power Control block) that are specifically designed to survive a system reset, typically because they sit in a separate power domain supplied by VBAT.

Unlike SRAM, these registers are hardware-guaranteed to retain their state through a soft reset, making them the superior choice for high-assurance systems.

The Reset Workflow

The sequence of events for a reset-mediated jump is as follows:

  1. Set Magic Value: The bootloader (or application) writes a specific, non-zero “magic value” (e.g., 0x55AA77BB) into a designated Backup Register.
  2. Trigger System Reset: Call the CMSIS NVIC_SystemReset() function.
  3. Early Boot Check: Upon reset, the very first task of the bootloader is to check that specific Backup Register. This must happen before any other initialization steps, including clock setup, peripheral initialization, or even setting up the stack.
  4. Conditional Branch:
    • If the magic value matches, the bootloader clears the register (to prevent reset loops) and immediately calls the jump_to_app() function.
    • If the value is missing or incorrect, it proceeds with the normal bootloader sequence.
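
The early boot check and conditional branch (steps 3 and 4) can be sketched as a small function that inspects and consumes the flag. This is a minimal illustration under stated assumptions: consume_jump_request() and BOOT_MAGIC_JUMP are hypothetical names, and the backup register is passed in as a pointer because its address is device-specific.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_MAGIC_JUMP 0x55AA77BBu  /* example magic value from the text */

/* Returns true if a jump to the application was requested.
 * A matching flag is cleared before returning, so that a failing
 * application cannot cause an endless reset loop. */
static bool consume_jump_request(volatile uint32_t *backup_reg)
{
    if (*backup_reg != BOOT_MAGIC_JUMP) {
        return false;  /* missing or incorrect: run the normal bootloader */
    }
    *backup_reg = 0u;
    return true;
}
```

The caller would invoke jump_to_app() when the function returns true and continue with normal bootloader initialization otherwise. Some devices require access to the backup domain to be enabled before the register can be written; that step is device-specific and not shown here.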

SRAM-based Flags

While some developers use specific addresses in SRAM to store these flags, this is generally discouraged. Microcontrollers do not typically guarantee SRAM retention across all types of resets, and the state of SRAM after a cold power-on is undefined, requiring complex checksums to avoid accidental jumps due to random noise.
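
To illustrate the extra validation an SRAM flag needs, the sketch below stores the magic value together with its bitwise complement, one simple form of redundancy check. It is an assumption-laden example rather than a recommendation: sram_flag_t, sram_flag_set() and sram_flag_is_valid() are hypothetical names, and placing the structure in a memory region that startup code does not initialize is linker-specific and not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout: the value plus its bitwise complement. */
typedef struct {
    uint32_t magic;
    uint32_t magic_inverted;
} sram_flag_t;

/* Store the magic value and its complement. */
static void sram_flag_set(volatile sram_flag_t *flag, uint32_t magic)
{
    flag->magic = magic;
    flag->magic_inverted = (uint32_t)~magic;
}

/* The flag is accepted only if both words agree with the expected value. */
static bool sram_flag_is_valid(const volatile sram_flag_t *flag, uint32_t expected)
{
    uint32_t magic = flag->magic;
    uint32_t inverted = flag->magic_inverted;
    return magic == expected && inverted == (uint32_t)~expected;
}
```

Even with this check, the retention of SRAM contents across the reset is still not guaranteed by the hardware, which is the main argument for Backup Registers.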

Implementing the Final Jump

The final step is the most sensitive: setting the stack pointer and branching to the application entry point. This must be done in a way that preserves the “post-reset” contract while preventing the compiler from generating unsafe stack accesses.

The following implementation uses inline assembly to ensure the compiler cannot interfere with the stack pointer during the handover. Crucially, it also unmasks interrupts (cpsie i) immediately after setting the new stack pointer, so that the application inherits the standard PRIMASK = 0 state expected after a hardware reset. The unmasking is only necessary if the bootloader uses the software-mediated state scrubbing approach; after a system reset, PRIMASK is already 0.

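#include <stdint.h>
// Also include the CMSIS device header for the target part;
// it provides SCB and __DSB().

__attribute__((noreturn)) void jump_to_app(uint32_t app_vtor) {
    // Read the initial MSP and the reset vector from the application's vector table
    const volatile uint32_t *vectors = (const volatile uint32_t *)app_vtor;
    uint32_t app_msp = vectors[0];
    uint32_t app_reset = vectors[1];

    // 1. Relocate VTOR
    SCB->VTOR = app_vtor;
    __DSB(); // Ensure VTOR write is finished

    // 2. Handover
    __asm volatile (
        "msr msp, %0      \n" // Set Application Stack Pointer
        "cpsie i          \n" // Unmask interrupts (not needed for reset-based jump)
        "isb              \n" // Flush pipeline
        "bx %1            \n" // Jump to Application Reset_Handler
        : : "r" (app_msp), "r" (app_reset) : "memory"
    );
    __builtin_unreachable(); // The bx above never returns (GCC/Clang builtin)
}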

This approach solves the three primary risks of a bootloader jump:

  1. Stack Corruption: By using assembly, we prevent the compiler from pushing/popping variables to a stack that we are in the middle of changing.
  2. State Pollution: By explicitly clearing PRIMASK and relocating the VTOR to the application's vector table, we ensure the application starts in a standard environment.
  3. Type Safety: We avoid casting integers to function pointers, a conversion whose result is implementation-defined in C rather than guaranteed by the language standard.
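
Since jump_to_app() reads the vector table unconditionally, a bootloader may want to run a plausibility check on the two words before committing to the jump, for example to reject an erased flash region. The sketch below is one possible check under stated assumptions: addr_range_t and app_vectors_plausible() are hypothetical names, and the RAM and flash ranges must come from the device's memory map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range: [start, end). */
typedef struct {
    uint32_t start;
    uint32_t end;
} addr_range_t;

/* Returns true if the initial MSP and reset vector look usable. */
static bool app_vectors_plausible(uint32_t initial_msp, uint32_t reset_vector,
                                  addr_range_t ram, addr_range_t flash)
{
    /* The stack grows downwards, so the initial MSP may equal ram.end. */
    bool msp_ok = (initial_msp & 0x3u) == 0u &&
                  initial_msp > ram.start && initial_msp <= ram.end;

    /* Bit 0 of the reset vector must be set to select Thumb state. */
    bool thumb_ok = (reset_vector & 1u) != 0u;

    /* The entry address itself (bit 0 cleared) must lie in flash. */
    uint32_t entry = reset_vector & ~1u;
    bool entry_ok = entry >= flash.start && entry < flash.end;

    return msp_ok && thumb_ok && entry_ok;
}
```

A check like this only catches gross errors such as a missing image. It is not a substitute for an integrity check over the whole application.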

Conclusion

Building high-assurance software requires shifting from a mindset of what “typically works” to one of absolute guarantees provided by the underlying hardware and the language specification. As we have seen, a naive jump might work in many scenarios, but it leaves the system vulnerable to subtle, non-deterministic failures caused by leftover state or compiler optimizations.

By treating the transition as a formal contract—enforced through rigorous state scrubbing, proper synchronization, and the avoidance of undefined behavior—we ensure that the application always starts from a known-good foundation. In the world of high-assurance systems, reliability is not a side effect of good luck, but the result of intentional architectural design.