RTEMS | Deterministic Hedged Read Library (DHRL) for DRAM Tail Latency Mitigation (#5548)

Wayne Thornton (@wmthornton-dev) Wed, 08 Apr 2026 15:25:19 -0700


Wayne Thornton created an issue: 
https://gitlab.rtems.org/rtems/rtos/rtems/-/issues/5548


Assignee: Wayne Thornton

## Summary
External dynamic RAM (DRAM) is a hidden source of non-determinism. DRAM cells
require periodic, hardware-mandated refresh commands, issued on average once
per refresh interval ($tREFI$), to prevent data loss. While a refresh is in
progress, the memory controller cannot service reads to the affected bank.

If a high-priority RTEMS control thread (such as flight software hazard
avoidance) suffers an L1/L2 cache miss, the CPU must fetch from main memory. If
that fetch collides with a refresh cycle or a row-buffer conflict, the thread
is stalled by the hardware. A memory read that normally takes 40 ns can spike
to over 300 ns. While acceptable in general-purpose OS environments, this
"tail latency" breaks Worst-Case Execution Time (WCET) bounds in RTEMS.
Developers must heavily over-pad their execution deadlines to account for
worst-case hardware refresh alignments, wasting CPU cycles.

The Deterministic Hedged Read Library (DHRL) mitigates these hardware latency
spikes by trading available memory bus bandwidth and SMP parallel compute for
strict time determinism. It relies on the statistical observation that two
physically independent memory controllers are unlikely to be refreshing at the
same instant.
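
The statistical argument can be made concrete with a back-of-the-envelope
estimate. This is only a sketch under assumptions that are not stated in this
issue: each controller is unavailable for the refresh cycle time ($tRFC$) once
per $tREFI$, the two controllers' refresh phases are independent, and the read
arrives at a uniformly random time. The numeric values are illustrative
DDR4-class figures chosen for the arithmetic, not measurements.

```latex
% Probability that one channel is mid-refresh when the read arrives
p = \frac{t_{RFC}}{t_{REFI}}

% Probability that both independent channels are mid-refresh at once
P(\text{both refreshing}) = p^{2}

% Illustrative values (assumed): t_{RFC} = 350\,\text{ns}, t_{REFI} = 7.8\,\mu\text{s}
p = \frac{350}{7800} \approx 0.045
\qquad
p^{2} \approx 0.002
```

Under these assumptions the collision probability drops by more than an order
of magnitude but does not reach zero, so a hard bound would additionally
require the two controllers' refresh schedules to be deliberately staggered.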

## Execution Flow

Redundant Mapping: Critical read-only payloads are duplicated across two
independent physical memory channels (for example, Bank A and Bank B).

SMP Thread Pinning: DHRL spawns two worker tasks and uses 
$rtems_task_set_affinity()$ to rigidly pin them to distinct physical CPU cores.

The Hedged Read ("Race" Condition): When the main application requires data, it 
fires $rtems_event_send()$ to wake both pinned workers. Both cores 
simultaneously force an AXI bus read to their respective memory controllers.

Lock-Free Resolution: Whichever memory controller is not currently refreshing
returns the data first. That "winning" thread executes a C11
$atomic_compare_exchange_strong$ to claim a shared flag, then signals the main
thread and hands over the pointer. The result of the slower read is discarded.
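
The lock-free resolution step can be sketched in portable C11. This is a
host-side illustration, not the proposed implementation: POSIX threads stand in
for the pinned RTEMS worker tasks and for the event wake-up, and every name
here ($dhrl_race_t$, $dhrl_worker$, $dhrl_fetch_demo$) is hypothetical. It only
shows that exactly one worker can claim the shared flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DHRL_UNCLAIMED (-1)

/* Hypothetical per-request state shared by both workers. */
typedef struct {
    const volatile uint32_t *replica[2]; /* same payload, one copy per channel */
    atomic_int winner;                   /* DHRL_UNCLAIMED or winning index */
    atomic_int claims;                   /* count of successful claims */
} dhrl_race_t;

typedef struct {
    dhrl_race_t *race;
    int index;
} dhrl_worker_arg_t;

static void *dhrl_worker(void *p)
{
    dhrl_worker_arg_t *arg = p;
    dhrl_race_t *race = arg->race;

    /* The volatile access forces a load instruction. On real hardware the
     * replica would also have to miss the caches to reach DRAM. */
    uint32_t first_word = *race->replica[arg->index];
    (void)first_word;

    /* Only the first worker to arrive flips the flag; the loser's CAS fails. */
    int expected = DHRL_UNCLAIMED;
    if (atomic_compare_exchange_strong(&race->winner, &expected, arg->index)) {
        atomic_fetch_add(&race->claims, 1);
    }
    return NULL;
}

/* Races two workers and returns the replica pointer of the winner. */
const volatile uint32_t *dhrl_fetch_demo(const volatile uint32_t *copy_a,
                                         const volatile uint32_t *copy_b,
                                         int *claims_out)
{
    dhrl_race_t race;
    race.replica[0] = copy_a;
    race.replica[1] = copy_b;
    atomic_init(&race.winner, DHRL_UNCLAIMED);
    atomic_init(&race.claims, 0);

    dhrl_worker_arg_t args[2] = { { &race, 0 }, { &race, 1 } };
    pthread_t tid[2];
    for (int i = 0; i < 2; ++i) {
        int rc = pthread_create(&tid[i], NULL, dhrl_worker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }

    /* Demo simplification: wait for both workers. A real caller would
     * resume as soon as the flag is claimed and ignore the slower worker. */
    for (int i = 0; i < 2; ++i) {
        pthread_join(tid[i], NULL);
    }

    if (claims_out != NULL) {
        *claims_out = atomic_load(&race.claims);
    }
    return race.replica[atomic_load(&race.winner)];
}
```

Calling $dhrl_fetch_demo(&a, &b, &claims)$ with two identical copies returns a
pointer to one of them and leaves $claims$ at 1, whichever worker ran first. In
an RTEMS port the thread creation would presumably be replaced by the
pre-pinned workers and $rtems_event_send()$ described above.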

## Trade-offs & Hardware Requirements

Cost: DHRL intentionally burns instantaneous memory bus bandwidth and requires 
dedicating parallel CPU cores to execute a single read operation.

Hardware Dependency: This software-level fix requires a target SoC with an
RTEMS SMP BSP and at least two physically independent memory controllers. It
provides zero benefit on single-channel architectures.

Benefit: Absolute bounds on memory fetch latency and "free" spatial fault 
tolerance against Single Event Functional Interrupts (SEFIs) on the memory 
controllers.

## Acceptance Criteria
- [ ] $dhrl_init()$ correctly spawns and pins worker threads using the RTEMS
  Classic API.

- [ ] $dhrl_fetch_data()$ correctly wakes workers, executes the race, and
  returns the valid pointer via C11 Atomics and RTEMS Event Sets without
  deadlocking.

- [ ] API successfully handles arbitrary memory pointers (volatile void*).

- [ ] Validation: Empirical tests on multi-channel hardware (or cycle-accurate
  simulators) demonstrate a mathematically bounded WCET for DRAM fetches,
  eliminating the tail latency distribution curve.

- [ ] Library compiles cleanly via $waf$ and integrates into the $cpukit$ build
  structure gated by compiler flags.
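
One way to phrase the validation criterion as a check is to compare the worst
observed latency of a single channel against the worst observed hedged latency,
where each hedged sample is the faster of the two paired channel reads. The
sketch below assumes the latencies have already been captured into arrays. The
function names are hypothetical, and the model ignores the overhead of the
event wake-up and the atomic claim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Latency seen by a hedged read: the faster of the two replicas. */
static uint32_t hedged_latency_ns(uint32_t chan_a_ns, uint32_t chan_b_ns)
{
    return chan_a_ns < chan_b_ns ? chan_a_ns : chan_b_ns;
}

/* Worst observed latency of a single channel over n samples. */
uint32_t worst_latency_ns(const uint32_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    uint32_t worst = samples[0];
    for (size_t i = 1; i < n; ++i) {
        if (samples[i] > worst) {
            worst = samples[i];
        }
    }
    return worst;
}

/* Worst observed hedged latency over n paired samples. */
uint32_t worst_hedged_latency_ns(const uint32_t *chan_a,
                                 const uint32_t *chan_b,
                                 size_t n)
{
    assert(chan_a != NULL && chan_b != NULL && n > 0);
    uint32_t worst = hedged_latency_ns(chan_a[0], chan_b[0]);
    for (size_t i = 1; i < n; ++i) {
        uint32_t h = hedged_latency_ns(chan_a[i], chan_b[i]);
        if (h > worst) {
            worst = h;
        }
    }
    return worst;
}
```

With synthetic inputs built from the 40 ns and 300 ns figures in the summary,
for example channel A = {40, 300, 40, 40} and channel B = {40, 40, 40, 300},
the single-channel worst case is 300 ns while the hedged worst case is 40 ns,
because the two spikes never fall on the same sample. A test on real hardware
would fail this check whenever both channels stall on the same read.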

