Bare Metal or RTOS – which is better for your device? Advantages, disadvantages, and typical architectures

April 15, 2026 17 min read last update: April 15, 2026

Andrey Solovev

Chief Technology Officer, PhD in Physics and Mathematics

Timur Yuldashev

IT Writer, PhD in Philological Sciences

Contents

Key Properties of Bare Metal Firmware
Typical Bare-Metal Architectures
1. Superloop
2. Superloop with Interrupts
3. Event-Driven System
4. Finite State Machine
5. Using a Task Scheduler
Example of Bare-Metal Firmware: Graphic Calculator
Key Properties of RTOS-Based Embedded Software
Typical RTOS-Based Firmware Architectures
1. Multithread Architecture
2. Message Passing via Queues
3. Using Interrupts
4. RTOS-Based Event-Driven Systems
Example of RTOS-Based Embedded Software: EEG System
Conclusion

The choice between bare metal and RTOS affects project costs, time to market, and the risk of technical debt. This article explains the key differences between bare-metal firmware and embedded software based on real-time operating systems. We will talk about typical architectures used in both types of software and provide project examples from our portfolio.

If a device has strict real-time requirements, its firmware can be implemented as a bare metal program or an RTOS-based application. Making the right choice between the two options at the start determines not only the software architecture but also the economics of the entire project. This impacts hardware and license costs, time to market, and future support expenses. To make such decisions based on experience from completed projects rather than blindly, contact our team for a consultation or to kick off full-scale development.

Making the wrong choice can significantly complicate your project. For instance, you might end up with an overly complex RTOS-based system where a simple superloop would have sufficed. Or, conversely, you might build a bare-metal system that becomes expensive to rework when requirements for functionality and scalability grow.

Key Properties of Bare Metal Firmware

This type of software runs directly on the hardware, without an operating system. In its simplest form, it consists of one large superloop in which commands execute sequentially, unless the loop is paused to handle an interrupt.

If the program doesn't involve a large number of interdependent tasks, bare metal offers clear advantages over an RTOS. It is more predictable because no scheduler preempts the code, it is easier to debug, and it consumes fewer hardware resources and less energy. The lower resource requirements allow the use of simpler and cheaper microcontrollers. But as the application grows more complex, bare-metal firmware becomes harder to develop, debug, support, and scale.

Advantages

Minimal overhead.
Low resource requirements.
Predictable latencies: no hidden background tasks.
Fast boot time: minimal initialization code.

Disadvantages

Hard to scale: code quickly turns into “spaghetti.”
No built-in tools or services. The developer has to create custom ones or integrate third-party libraries.
Limited multitasking.

Bare-metal firmware is typically developed for small, simple devices that don't require multitasking or complex resource management. The list includes DC motor controllers, thermostats, relays, basic lighting controllers, keyboard controllers, and so on. Designing embedded software based on an RTOS would be unnecessarily complex for such systems. It would only make the project longer and more expensive, also raising the cost of the device's bill of materials.

Typical Bare-Metal Architectures

Keep in mind that programmers rarely use pure architectures in practice. Real code usually combines elements from different approaches. As the complexity of the application grows, developers resort to more advanced practices.

1. Superloop

This is the simplest software architecture. All its logic runs inside an infinite loop. The microcontroller polls inputs one by one, reads, processes, and records data, updates states, controls outputs, and calls the necessary functions. Once the code reaches the last line, it starts over.
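
The loop described above can be sketched in a few lines of C. This is a minimal illustration rather than production firmware: the thermostat-style logic, the `device_t` structure, and the two hardware stand-in functions are assumptions made for the example, and the sensor stub returns a fixed value so the code can run on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device state: one input, one output. */
typedef struct {
    uint16_t raw_temperature;  /* last sensor reading, tenths of a degree */
    bool     heater_on;        /* current output state */
} device_t;

/* Stand-ins for hardware access; real firmware would touch registers here. */
static uint16_t read_temperature_sensor(void) { return 180; }  /* fixed value for the sketch */
static void     write_heater_pin(bool on)     { (void)on; }

/* One pass of the loop body: poll the input, process it, drive the output. */
void superloop_iteration(device_t *dev, uint16_t setpoint)
{
    assert(dev != NULL);
    dev->raw_temperature = read_temperature_sensor();   /* 1. poll input   */
    dev->heater_on = dev->raw_temperature < setpoint;   /* 2. process      */
    write_heater_pin(dev->heater_on);                   /* 3. drive output */
}

/* The superloop itself: on a real device it never returns. */
void superloop_run(device_t *dev, uint16_t setpoint)
{
    for (;;) {
        superloop_iteration(dev, setpoint);
    }
}
```

Keeping the loop body in its own function is a design choice for the sketch: it lets a single pass be exercised in a unit test without entering the infinite loop.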

Pros

Superloops are easy to develop from scratch. There’s no complex logic or parallel processes.
The developer retains full control over timing: code execution isn't interrupted by unscripted operations.
No need to allocate hardware resources: there's always just one process.
Minimal overhead (consumption of resources for computations that aren’t directly related to the application’s functions) and requirements for Flash or RAM.
Programs with this architecture end up small and straightforward.

Cons

Poorly scalable. As developers add more tasks, the code grows, and delays in the loop increase.
As the codebase expands, it becomes unstructured, confusing, and hard to read, which makes the app more difficult to support and debug.

2. Superloop with Interrupts

If a pure superloop can’t handle the required functions, firmware can incorporate interrupts. In this architecture, part of the functions is handled by the main infinite loop, while another part is managed by interrupt handlers.

The loop still executes the main logic sequentially, but it no longer polls asynchronous or time-critical hardware events. Instead, these events trigger interrupt handlers. An interrupt pauses the main loop and saves the context of the registers in use at that moment. The handler then runs, performs the necessary computations, and records the result, after which the context is restored and the main loop resumes.
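
A common way to hand data from a handler to the main loop is a flag plus a shared variable. The sketch below assumes a hypothetical ADC interrupt; the names `adc_isr` and `main_loop_poll_adc` are invented for the example, and the `sample` parameter stands in for reading the ADC data register, since a real handler takes no arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Shared between the interrupt handler and the main loop.
 * 'volatile' keeps the compiler from caching the values in registers. */
static volatile bool     g_adc_ready  = false;
static volatile uint16_t g_adc_sample = 0;

/* Hypothetical ADC interrupt handler: keep it short, just store and flag. */
void adc_isr(uint16_t sample)
{
    g_adc_sample = sample;
    g_adc_ready  = true;
}

/* Called from the main loop. Returns true and fills *out if a sample was pending. */
bool main_loop_poll_adc(uint16_t *out)
{
    assert(out != NULL);
    if (!g_adc_ready) {
        return false;
    }
    /* On real hardware, interrupts would typically be masked around the next
     * two lines so the handler cannot overwrite the sample between the copy
     * and the clearing of the flag. */
    *out = g_adc_sample;
    g_adc_ready = false;
    return true;
}
```

The main loop would call `main_loop_poll_adc` once per iteration and run the heavier processing only when it returns true.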

Pros

Allows for achieving acceptable response times for critical hardware events without making the superloop too complicated.
The program remains predictable and understandable.
Low overhead: no scheduler or kernel sits between a hardware event and its handler.

Cons

Data from interrupts must be manually synchronized. The easiest way to do it is by using flags. Developers also need to keep an eye on flags to ensure processes run in the correct order, which somewhat complicates development, code readability, and debugging.

3. Event-Driven System

Instead of sequentially performing a hardcoded list of actions in a superloop, this architecture uses events to perform the required functions.

The developer defines a set of events: for instance, receiving a byte via UART, ADC result ready, button pressed, and so on. When an event occurs, hardware interrupts create an event object or code and place it in a queue. The event dispatcher, running in the main loop or a dedicated handler, pulls an event from the queue, checks for subscribers, and if any exist, calls the corresponding handler. The handler processes the event and returns control to the dispatcher. The process continues until the queue is empty.
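
The queue-and-dispatcher mechanism can be sketched as a small ring buffer plus a subscriber table. Everything here is an assumption made for illustration: the event names, the queue size, one subscriber per event, and a single producer context (interrupts that do not preempt each other while posting).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { EVT_UART_BYTE, EVT_ADC_READY, EVT_BUTTON, EVT_COUNT } event_id_t;

typedef struct {
    event_id_t id;
    uint32_t   data;   /* payload, e.g. the received byte */
} event_t;

typedef void (*event_handler_t)(const event_t *evt);

#define QUEUE_SIZE 8u  /* one slot stays empty, so 7 events fit */

static event_t          g_queue[QUEUE_SIZE];
static volatile uint8_t g_head = 0;                /* advanced by producers (interrupts) */
static volatile uint8_t g_tail = 0;                /* advanced by the dispatcher */
static event_handler_t  g_subscribers[EVT_COUNT];  /* NULL means no subscriber */

void event_subscribe(event_id_t id, event_handler_t handler)
{
    assert(id < EVT_COUNT);
    g_subscribers[id] = handler;
}

/* Called from interrupt handlers. Returns false if the queue is full. */
bool event_post(event_id_t id, uint32_t data)
{
    uint8_t next = (uint8_t)((g_head + 1u) % QUEUE_SIZE);
    if (next == g_tail) {
        return false;              /* full: the event is dropped */
    }
    g_queue[g_head].id   = id;
    g_queue[g_head].data = data;
    g_head = next;
    return true;
}

/* Called from the main loop: drains the queue, returns how many events had a handler. */
unsigned event_dispatch_all(void)
{
    unsigned handled = 0;
    while (g_tail != g_head) {
        event_t evt = g_queue[g_tail];
        g_tail = (uint8_t)((g_tail + 1u) % QUEUE_SIZE);
        if (g_subscribers[evt.id] != NULL) {
            g_subscribers[evt.id](&evt);
            handled++;
        }
    }
    return handled;
}

/* Example subscriber: accumulates the values of received UART bytes. */
uint32_t g_uart_sum = 0;
void on_uart_byte(const event_t *evt) { g_uart_sum += evt->data; }
```

Events without a subscriber are simply discarded by the dispatcher, which matches the "checks for subscribers" step in the description.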

Pros

Supports complex logic with many functions, while the app remains lighter than an OS-based equivalent, with lower hardware requirements.
The microcontroller only wakes on events rather than spinning an idle superloop. This reduces power consumption and latency for critical signals.
Easier to scale than a superloop with interrupts. You just need to add events and subscribers. Event handlers can be created as relatively independent modules.

Cons

Harder to debug.
As the system grows, it gets harder to track how handlers interact and to keep them from conflicting with one another.
In a pure event-driven system, it’s harder to implement true parallel processes.

4. Finite State Machine

A finite state machine, or FSM, is a model of a system that is always in exactly one of a predefined, limited set of states (for example, off, idle, launching, or error). Upon receiving specific signals or events (button press, byte received, timer expiration), the machine transitions to another state. Transitions follow predefined rules that dictate which state to move to for a given event and what actions to perform.

Finite state machines are essential when a linear superloop cannot handle the device's logic. Trying to implement the same functionality in a superloop turns the code into a mess of if/else statements and flags: “if the button is pressed and we're already connected but not in update mode, and the rightmost toggle is on, then...” This only drags out project timelines and creates support headaches.

Finite state machines are often used in event-driven systems, where they serve as the event handlers.
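
One compact way to express such rules in C is a transition table indexed by state and event. The sketch below reuses the example states from the text; the events and the specific transitions are hypothetical, and the actions attached to transitions are left out to keep it short.

```c
#include <assert.h>

typedef enum { ST_OFF, ST_IDLE, ST_LAUNCHING, ST_ERROR, ST_COUNT } state_t;
typedef enum { EV_POWER_BUTTON, EV_START, EV_TIMER_EXPIRED, EV_FAULT, EV_COUNT } fsm_event_t;

/* Transition table: the next state for every (state, event) pair. */
static const state_t k_transitions[ST_COUNT][EV_COUNT] = {
    /*                 POWER_BUTTON  START         TIMER_EXPIRED  FAULT    */
    [ST_OFF]       = { ST_IDLE,      ST_OFF,       ST_OFF,        ST_OFF   },
    [ST_IDLE]      = { ST_OFF,       ST_LAUNCHING, ST_IDLE,       ST_ERROR },
    [ST_LAUNCHING] = { ST_OFF,       ST_LAUNCHING, ST_IDLE,       ST_ERROR },
    [ST_ERROR]     = { ST_OFF,       ST_ERROR,     ST_ERROR,      ST_ERROR },
};

/* Look up the next state. Events a state does not react to map back to the same state. */
state_t fsm_step(state_t current, fsm_event_t event)
{
    assert(current < ST_COUNT && event < EV_COUNT);
    return k_transitions[current][event];
}
```

Because every state and event pair has an explicit entry, adding a state means adding one row, and a missing case is visible in the table rather than hidden in nested conditions.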

Pros

The code becomes transparent: all possible device states and transitions between them are explicitly described.
Predictable behavior: instead of vague checks across multiple flags, each state defines explicit, guaranteed responses to events.
Developers describe device operating modes as state transitions rather than tangled conditions in a superloop, simplifying the logic.
Easier scalability: just add a new state and its transitions.

Cons

States cannot be added indefinitely. As the number of states and transitions grows, the logic becomes hard to understand and modify.
If the device was initially designed for simple scenarios, but the logic then gets much more complex, you have to rebuild the structure from scratch.
Continuous computations, complex signal processing algorithms, filtering, optimization, etc., don't fit well into this architecture in its pure form. You may need to add standalone modules with “traditional” code.

5. Using a Task Scheduler

A task scheduler is a software component that allocates CPU time among the firmware's tasks, deciding which one runs and when.

Many operating systems have built-in schedulers. But when dealing with bare metal firmware, developers have to create custom ones.
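
A custom scheduler of this kind can be very small. The following sketch assumes a cooperative model with periodic tasks and a millisecond tick supplied by a hardware timer; the function names, the task limit, and the two example tasks are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn_t)(void);

typedef struct {
    task_fn_t fn;         /* task body; must return quickly (cooperative model) */
    uint32_t  period_ms;  /* how often the task should run */
    uint32_t  last_run;   /* tick value at the last run */
} task_t;

#define MAX_TASKS 4u

static task_t g_tasks[MAX_TASKS];
static size_t g_task_count = 0;

/* Register a periodic task. Tasks registered earlier run first within a tick. */
void sched_add(task_fn_t fn, uint32_t period_ms)
{
    assert(fn != NULL && period_ms > 0u && g_task_count < MAX_TASKS);
    g_tasks[g_task_count++] = (task_t){ fn, period_ms, 0 };
}

/* Run every task whose period has elapsed. 'now_ms' comes from a tick counter. */
void sched_run_due(uint32_t now_ms)
{
    for (size_t i = 0; i < g_task_count; i++) {
        /* Unsigned subtraction stays correct when the tick counter wraps around. */
        if ((uint32_t)(now_ms - g_tasks[i].last_run) >= g_tasks[i].period_ms) {
            g_tasks[i].last_run = now_ms;
            g_tasks[i].fn();
        }
    }
}

/* Example tasks that only count their own invocations. */
unsigned g_fast_runs = 0;
unsigned g_slow_runs = 0;
void fast_task(void) { g_fast_runs++; }
void slow_task(void) { g_slow_runs++; }
```

The main loop would call `sched_run_due` with the current tick on every pass. Note that nothing here can interrupt a task that runs too long, which is exactly the weakness of the cooperative model.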

Pros

A scheduler provides “lightweight” multitasking and controlled time distribution among logic components. It lets you explicitly define task priorities, rather than relying on implicit management via if-statement order and superloop iteration length.
Having a scheduler brings firmware closer to RTOS capabilities, but hardware requirements remain low.
A custom scheduler can be just a few dozen lines of code, tailored exactly to the system's needs without the abstractions or generality of a full RTOS.

Cons

As the number of tasks and timing requirements grows, the scheduler logic quickly becomes complex.
No built-in synchronization mechanisms. Semaphores, queues, events, mutexes, or priority protocols must all be implemented from scratch. This extends development time.
Most custom schedulers use the cooperative multitasking model, where tasks must voluntarily yield control. In this setup, a single poorly implemented, suspended, or computationally heavy task can block everything else.

Example of Bare-Metal Firmware: Graphic Calculator

Our graphic calculator project is a good example of bare-metal firmware development. The calculator is a relatively simple device that doesn’t have many complex and parallel tasks. Its core logic fits perfectly into the superloop paradigm. Developing RTOS-based embedded software for it would have been excessively labor-intensive and only slowed down the project. Choosing bare metal sped up development and simplified debugging. As a result, the project required fewer revisions, reducing overall development costs.

Nearly all the calculator's functionality consists of repeating tasks implemented in the superloop. At startup, the microcontroller initializes the peripheral hardware. In the main loop, the device resets watchdog timers, updates LED states, polls the battery level measurement module, and handles mathematical tasks. Most of the time, the microcontroller stays in a low-power wait mode to save battery. Every 1 ms, it wakes up and runs the main loop.

Two tasks in the firmware are time-triggered using the microcontroller's hardware timer. The device polls the keyboard every 10 ms. When a keypress is detected, the program updates the calculator's screen to reflect the change. In this project, polling was the easier and more reliable way to detect keypresses, and the MCU is fast enough that polling delays were not a concern. In addition, the microcontroller refreshes the screen every 33 ms (roughly 30 times per second), regardless of keypresses.

This approach breaks a single infinite superloop into several logic tasks that run not every cycle, but periodically via timer. No full RTOS is needed.
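
The timing arithmetic can be illustrated with a simplified sketch. It is not the calculator's actual code: it assumes a millisecond tick counter and uses modulo checks, while the counters stand in for the real keyboard and display work so the behavior can be checked on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEYBOARD_PERIOD_MS 10u
#define SCREEN_PERIOD_MS   33u

/* Counters stand in for the real work (hypothetical for this sketch). */
typedef struct {
    uint32_t loop_passes;
    uint32_t keyboard_polls;
    uint32_t screen_refreshes;
} loop_stats_t;

/* One pass of the main loop, entered once per 1 ms wake-up. */
void main_loop_pass(loop_stats_t *stats, uint32_t tick_ms)
{
    assert(stats != NULL);
    stats->loop_passes++;                    /* every-pass work: watchdog, LEDs, ... */
    if (tick_ms % KEYBOARD_PERIOD_MS == 0u) {
        stats->keyboard_polls++;             /* every 10 ms */
    }
    if (tick_ms % SCREEN_PERIOD_MS == 0u) {
        stats->screen_refreshes++;           /* every 33 ms */
    }
}
```

Over 990 ms this gives 990 loop passes, 99 keyboard polls, and 30 screen refreshes. A production version would more likely compare against the time of the last run, since a plain modulo check misbehaves when the tick counter wraps around.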

USB interaction is implemented via interrupts. When a USB data packet arrives, a handler stores it in a dedicated static buffer. The buffer is processed in the main loop, allowing quick release of the USB for the next packet. The system uses a synchronous protocol for interacting with a PC, preventing overflows in the USB receive buffer.

If you're contacting a development team about your project, we recommend discussing hardware migration possibilities right at the start. RTOS applications are generally easier to port because the OS abstracts much of the hardware, whereas bare-metal firmware is portable only if the architecture provides for it from the beginning.

For instance, when the graphic calculator project began, we selected the GD32F470 microcontroller for its optimal price-performance ratio at the time. The calculator's features have expanded significantly since then, and the MCU no longer has enough memory. No GD32 with the required characteristics is available, so we're discussing a migration to STM32 with the client. The team anticipated this situation when the project started, so we split the embedded software into two parts: one for business logic and one for hardware interaction. To migrate to a new MCU, the team only has to rework the hardware-specific part. This architecture also makes it possible to support both device versions seamlessly and to build PC and web emulators of the calculator.
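
One common way to achieve such a split in C is an interface made of function pointers. The sketch below is a generic illustration and not the calculator's actual code: the `hal_t` structure, the battery-monitor logic, and the emulator backend are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware interface: the only thing the business logic knows about the platform. */
typedef struct {
    uint16_t (*read_battery_mv)(void);
    void     (*set_led)(int on);
} hal_t;

/* Business logic: portable, no register access. Returns 1 when the battery is low. */
int battery_monitor_step(const hal_t *hal, uint16_t low_threshold_mv)
{
    assert(hal != NULL && hal->read_battery_mv != NULL && hal->set_led != NULL);
    int low = hal->read_battery_mv() < low_threshold_mv;
    hal->set_led(low);   /* the warning LED follows the battery state */
    return low;
}

/* One possible backend: a PC emulator that fakes the hardware. */
static uint16_t g_emulated_battery_mv = 3700;
static int      g_emulated_led = 0;
static uint16_t emu_read_battery_mv(void) { return g_emulated_battery_mv; }
static void     emu_set_led(int on)       { g_emulated_led = on; }

const hal_t k_emulator_hal = { emu_read_battery_mv, emu_set_led };
```

Supporting another MCU then means providing one more `hal_t` instance, while `battery_monitor_step` and the rest of the business logic stay untouched.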

Key Properties of RTOS-Based Embedded Software

Real-time operating systems (RTOS) are specialized OSes designed to meet the strict timing constraints of embedded devices. RTOS-based firmware uses the OS’s built-in services for task scheduling and synchronization as well as drivers and APIs for hardware interaction.

This type of embedded software demands more hardware resources and has higher power consumption. Some microcontrollers don’t have enough memory to accommodate an RTOS application. However, using an OS greatly simplifies and accelerates embedded software development for complex programs that must handle multiple tasks in parallel. Built-in tools make it easier to create, debug, and scale the code.

Advantages

Multitasking support.
Modularity and scalability.
Predictable real-time performance at high complexity.
Built-in tools and services.

Disadvantages

Higher resource consumption.
Requires higher developer expertise.
Overkill for simple tasks.

RTOS is ideal for designing electronics that have multiple concurrently running subsystems with strict but manageable timing requirements. RTOS-based firmware is typically used in industrial controllers, robotics, automotive electronics, medical solutions, smart devices, and such.

Typical RTOS-Based Firmware Architectures

As with bare metal, the firmware architecture grows in complexity as needed and may combine elements from different approaches. We highly recommend settling on the approach during the project estimation phase.

1. Multithread Architecture

Figure: scheme of a multithread software architecture.

This is the classic software architecture that effectively makes use of RTOS multitasking capabilities. The application is divided into several execution threads, or tasks. For example, there may be one thread for networking, one for sensor processing, one for the user interface, and so on. The RTOS kernel decides which thread runs at any given moment according to its scheduling policy. It stores thread contexts (registers, stack pointers), switches between tasks based on timers or events, and gives CPU time to the highest-priority thread that is ready to run.

In simple programs, threads have minimal interdependencies. Where threads do share resources or data, built-in OS tools like mutexes, semaphores, queues, and events keep the access thread-safe.
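
The role of a mutex can be shown with a desktop stand-in. The sketch below uses POSIX threads instead of a real RTOS so that it runs on a PC; RTOS tasks differ in that they have priorities and normally never return. The logger tasks and the shared counter are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Shared resource and the mutex that guards it. */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static uint32_t g_samples_logged = 0;

/* Thread body: each "task" logs n samples, taking the mutex for every update. */
static void *logger_task(void *arg)
{
    uint32_t n = *(const uint32_t *)arg;
    for (uint32_t i = 0; i < n; i++) {
        pthread_mutex_lock(&g_lock);
        g_samples_logged++;              /* protected read-modify-write */
        pthread_mutex_unlock(&g_lock);
    }
    return NULL;
}

/* Start two tasks, wait for both, and return the total.
 * Returns 0 if a thread could not be created. */
uint32_t run_two_loggers(uint32_t per_task)
{
    pthread_t a, b;
    assert(per_task <= UINT32_MAX / 2u);   /* the total must fit in 32 bits */
    g_samples_logged = 0;
    if (pthread_create(&a, NULL, logger_task, &per_task) != 0) {
        return 0;
    }
    if (pthread_create(&b, NULL, logger_task, &per_task) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return g_samples_logged;
}
```

Without the mutex, the two increments could interleave and some updates would be lost; with it, the total always equals twice the per-task count.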

Pros

Easy to scale by simply adding new threads.
Each thread acts like a standalone program, making the code modular and easier to support.
RTOS provides built-in inter-component interaction mechanisms.

Cons

Increased design complexity. You must carefully plan priorities and task interactions and avoid classic multithreading pitfalls such as deadlocks and priority inversion. You also have to prevent race conditions, where multiple threads access the same resources simultaneously without proper synchronization.
Races, deadlocks, and timing bugs are common in multitasking systems, and they are hard to reproduce and track down, which makes debugging harder.
Like any other RTOS architecture, this one requires more hardware resources than bare metal.

2. Message Passing via Queues

Figure: scheme of message passing in software.

This architecture resembles the event-driven model. Tasks “communicate” through messages. A producer task forms a message (a structure with data or a command) and places it in a queue. A consumer task is blocked until the message appears in the queue. Then it wakes up and processes it.

Data exchange is configured using RTOS primitives: blocking waits, timeouts, priority processing, and queue sizes. This enables clean management of data flow and load.
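
To show what such a queue does under the hood, here is a minimal blocking queue built on POSIX threads as a desktop stand-in. An RTOS provides this ready-made (FreeRTOS, for example, has `xQueueSend` and `xQueueReceive`); the message layout, the capacity, and the function names below are assumptions made for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  command;
    uint32_t value;
} message_t;

#define QUEUE_CAPACITY 4u

typedef struct {
    message_t       items[QUEUE_CAPACITY];
    unsigned        head;       /* index of the oldest message */
    unsigned        count;      /* number of stored messages */
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
} msg_queue_t;

void queue_init(msg_queue_t *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side: copies the message in. Returns false on overflow instead of blocking. */
bool queue_send(msg_queue_t *q, const message_t *msg)
{
    bool ok = false;
    assert(q != NULL && msg != NULL);
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAPACITY) {
        q->items[(q->head + q->count) % QUEUE_CAPACITY] = *msg;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->not_empty);   /* wake a blocked consumer */
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Consumer side: sleeps until a message is available, then copies it out. */
void queue_receive(msg_queue_t *q, message_t *out)
{
    assert(q != NULL && out != NULL);
    pthread_mutex_lock(&q->lock);
    while (q->count == 0) {
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    *out = q->items[q->head];
    q->head = (q->head + 1u) % QUEUE_CAPACITY;
    q->count--;
    pthread_mutex_unlock(&q->lock);
}
```

The producer and consumer never touch the same message at the same time: each side works on its own copy, and the queue is the only shared object.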

Pros

Tasks don't access memory simultaneously; instead, they pass data copies or small descriptors via the queue. This drastically reduces race condition risks and makes interaction points explicit.
Tasks “sleep” when there's no work, avoiding CPU waste on polling. It's logically cleaner and more power-efficient.

Cons

Increased complexity. Developers must think through message formats, who sends what to whom, queue management, error handling, and overflows.
Higher memory and time overhead. Queues need buffers, and sometimes big ones, as messages can be quite “heavy”.
In small systems with just 2 or 3 tasks and simple logic, message-passing architecture is overkill.

3. Using Interrupts

Real-time operating systems also support hardware interrupts. The program reacts to hardware events as quickly as possible the same way it does in the bare metal architecture with interrupts, while “heavy” logic runs in OS tasks. A hardware event triggers an interrupt. The CPU suspends the current task and executes a short interrupt handler. After the handler finishes, the RTOS scheduler decides which task to run next.
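
The pattern of a short handler that signals a task can be sketched with a counting semaphore. This is a desktop stand-in using POSIX semaphores on Linux rather than RTOS calls; the function names are hypothetical, and the `sample` argument stands in for reading a hardware register.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdint.h>

static sem_t             g_data_ready;     /* counts events signalled by the handler */
static volatile uint16_t g_last_sample;    /* latest value captured by the handler */
static uint32_t          g_processed_sum = 0;

/* Create the semaphore with an initial count of zero. */
void deferred_init(void)
{
    int rc = sem_init(&g_data_ready, 0, 0);
    assert(rc == 0);
    (void)rc;
}

/* Short "interrupt handler": capture the value, signal the task, return. */
void sensor_isr(uint16_t sample)
{
    g_last_sample = sample;
    sem_post(&g_data_ready);
}

/* Task body for one event: block until signalled, then do the heavy work. */
void processing_task_step(void)
{
    sem_wait(&g_data_ready);
    g_processed_sum += g_last_sample;   /* stands in for filtering, storage, etc. */
}

uint32_t processed_sum(void) { return g_processed_sum; }
```

With a single shared variable, a second interrupt arriving before the task runs would overwrite the first sample, so real designs usually pass the data through a queue or buffer instead.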

Pros

Lowest possible latency for external events. Work can be split into time-critical parts and “heavier” but less urgent ones.
Combines the speed and hardware reactivity of interrupts with the structure and multitasking of RTOS.

Cons

Higher development and debugging complexity. The team must carefully choose which work runs inside interrupt handlers and which is deferred to tasks, as “heavy” handlers lead to unstable latencies and missed events.
Frequent interrupts increase overhead from handler entry/exit and task context switches. Poor design can consume a significant portion of CPU time.

4. RTOS-Based Event-Driven Systems

Modern real-time operating systems also provide tools for building event-driven architectures. Instead of constantly polling states, each task waits for an event: a signal, flag, queue message, semaphore from another task, or timeout from the system timer. When the event occurs, the task wakes up, processes it, often via a finite state machine, and returns to waiting.

With this approach, one can create an RTOS version of an event-driven system, conceptually similar to bare metal but distributed across multiple threads.
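
Waiting for several conditions at once is usually done with a set of event flags. The sketch below rebuilds the idea on POSIX threads as a desktop stand-in; a real project would use the RTOS primitive (in FreeRTOS, event groups and `xEventGroupWaitBits`). The flag names and functions are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits for the sketch. */
#define FLAG_DATA_READY (1u << 0)
#define FLAG_COMMAND    (1u << 1)
#define FLAG_TIMEOUT    (1u << 2)

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    uint32_t        bits;
} event_flags_t;

void flags_init(event_flags_t *f)
{
    assert(f != NULL);
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->changed, NULL);
    f->bits = 0;
}

/* Set one or more bits and wake every task waiting on the flags. */
void flags_set(event_flags_t *f, uint32_t bits)
{
    pthread_mutex_lock(&f->lock);
    f->bits |= bits;
    pthread_cond_broadcast(&f->changed);
    pthread_mutex_unlock(&f->lock);
}

/* Block until any (or, with wait_all, every) bit in 'mask' is set.
 * Clears the matched bits and returns them. */
uint32_t flags_wait(event_flags_t *f, uint32_t mask, bool wait_all)
{
    uint32_t hit;
    pthread_mutex_lock(&f->lock);
    for (;;) {
        hit = f->bits & mask;
        if (wait_all ? (hit == mask) : (hit != 0u)) {
            break;
        }
        pthread_cond_wait(&f->changed, &f->lock);
    }
    f->bits &= ~hit;
    pthread_mutex_unlock(&f->lock);
    return hit;
}
```

A task would call `flags_wait` at the top of its loop and feed the returned bits into its state machine, sleeping whenever none of the bits it cares about are set.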

Pros

The RTOS-based event-driven architecture offers a cleaner, more efficient way to react to multiple conditions simultaneously than polling-based designs.
Tasks are blocked while waiting for events, avoiding CPU waste on polling flags or queues.
Tasks can wait for multiple conditions at once.
The event-driven architecture is easy to scale. A single set of event flags can synchronize many tasks, effectively replacing numerous semaphores and notifications and reducing memory and kernel call overhead.
Fits naturally with finite state machine models. The system transitions to the next state upon a specific event instead of polling multiple conditions in each task's superloop.

Cons

Higher risk of race conditions and lost events.
Requires more effort to ensure predictable behavior and response times.
Debugging gets harder due to asynchrony and event combination.
The RTOS kernel may be underutilized: much of its overhead goes into servicing events, while the scheduler's usual features and the synchronization primitives go underused.
Risk of architectural inconsistencies in hybrid systems: the same subsystem might use different signaling methods, making its behavior hard to predict.

Example of RTOS-Based Embedded Software: EEG System

Our team created RTOS-based firmware when developing software for an EEG system. The client designs solutions for electroencephalogram registration. They contacted Integra Sources after creating hardware for a new product and requested firmware development within a reasonable deadline.

The customer wanted the device to be able to register EEG, transmit data to a tablet, and receive commands from it at the same time. So, the system had to handle several fairly “heavy” tasks simultaneously. The most critical were processing and storing EEG data from the analog-to-digital converter (ADC), plus BLE data exchange with the tablet. The solution also required intensive data exchange between tasks.

Given these requirements, the best way to create the firmware was to use an RTOS. It provides tools for parallel tasks and synchronization, which significantly speeds up embedded software development. The team selected FreeRTOS, a high-quality, open-source OS.

The firmware architecture for this project is based on a multithreaded implementation augmented with interrupts, events, and message passing. For instance, the ADC data processing task launches upon receiving a hardware ADC interrupt. Raw converter data is placed in a queue. It acts as an event that triggers data processing and recording to a microSD card. Before finishing, the task sends a “data written to SD card” event to the BLE interaction task.

While the first task sleeps, the BLE task runs. It sends study data to the tablet in real time or receives commands from it. Incoming BLE packets are also stored in a queue. For example, if a “stop study” command packet lands in the queue, it triggers the corresponding task.

Conclusion

In embedded electronics development, choosing between bare metal and RTOS isn't a matter of technology. It is a matter of control, complexity, and system scaling.

Bare metal wins on simplicity, cost, and predictability where logic is relatively straightforward and hardware imposes tight constraints. RTOS shines in more complex systems with parallel tasks, intensive data exchange, and high demands for scalability and support. The choice should stem from functional requirements, timing needs, microcontroller resources, and product evolution plans. This is where you need an experienced team that can guide you.

If you're planning a new device but are unsure which approach fits your project, contact Integra Sources for a consultation. We'll help you choose the right tech stack and develop firmware that both gets the product through its first launch and survives multiple device generations.

Andrey Solovev

Chief Technology Officer, PhD in Physics and Mathematics

Expert with 20+ years in electronics design and embedded software development. Author of 20+ scientific publications on spectral analysis, ADC systems, and optoelectronic diagnostics.

Timur Yuldashev

IT Writer, PhD in Philological Sciences

At Integra Sources, he turns complex technical topics—like embedded systems, power electronics, and IoT—into clear and engaging stories that highlight the team’s expertise and innovative projects.
