Hi -

I'll just disclaim this up front.

I only skimmed it.

It could be totally bogus.

I posted it because I hope it's useful.

    agape
    brent


On Sun, Mar 22, 2026 at 12:43 AM <[email protected]> wrote:

> Follow-up: attached are the test programs used to reproduce and
> analyze the SIGSTOP write corruption.
>
> To reproduce on a Hurd system:
>
> 1. Compile the test programs:
>
>    gcc -o write-pattern-large write-pattern-large.c
>    gcc -o check-pattern check-pattern.c
>    gcc -o analyze-corruption analyze-corruption.c
>
> 2. Run the corruption test:
>
>    ./write-pattern-large > /tmp/output-test &
>    PID=$!
>    for i in $(seq 1 500); do
>        kill -STOP $PID 2>/dev/null || break
>        kill -CONT $PID 2>/dev/null || break
>    done
>    wait $PID
>
> 3. Check and analyze the output:
>
>    ./check-pattern /tmp/output-test
>    ./analyze-corruption /tmp/output-test
>
>
> WHAT THE PROGRAMS DO
>
> write-pattern-large.c — Writes 400 blocks of 256KB to stdout. Each
> block holds 65536 consecutive uint32 values, and the sequence continues
> across blocks, so the whole file runs 0, 1, 2, ..., 26214399. Uses
> write() with the default file descriptor offset (offset=-1 in the
> IO_write RPC), which is the code path affected by
> the bug. The large 256KB write size increases the window for SIGSTOP
> to catch a write mid-RPC.
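
A minimal sketch of a writer along the lines described above. This is not the attached write-pattern-large.c; the names fill_block and write_pattern and the short-write loop are assumptions made for illustration. Only the block size, block count and use of plain write() come from the description.

```c
#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

#define VALUES_PER_BLOCK 65536u   /* 65536 * 4 bytes = 256KB per write */
#define NUM_BLOCKS       400u     /* 400 blocks = 26214400 values in total */

/* Fill one block so the sequence continues across blocks:
   block b holds b*65536 .. b*65536 + 65535. */
static void fill_block(uint32_t *buf, uint32_t block)
{
    for (uint32_t i = 0; i < VALUES_PER_BLOCK; i++)
        buf[i] = block * VALUES_PER_BLOCK + i;
}

/* Write num_blocks blocks to fd with plain write(), i.e. at the
   descriptor's current offset. Short writes are continued (an
   assumption; the attached program may behave differently).
   Returns 0 on success, -1 on the first write() error. */
static int write_pattern(int fd, uint32_t num_blocks)
{
    static uint32_t buf[VALUES_PER_BLOCK];

    for (uint32_t b = 0; b < num_blocks; b++) {
        fill_block(buf, b);
        const char *p = (const char *)buf;
        size_t left = sizeof buf;
        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n < 0)
                return -1;
            p += n;
            left -= (size_t)n;
        }
    }
    return 0;
}
```

A main() that returns write_pattern(STDOUT_FILENO, NUM_BLOCKS) != 0 would produce the 100MB stream that step 2 redirects to /tmp/output-test.
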
>
> check-pattern.c — Reads back the file and verifies every uint32
> appears in sequence. Reports byte offsets where the sequence is
> broken, distinguishing gaps (skipped values) from backwards jumps
> (duplicate blocks). A clean file shows "OK: all 26214400 values in
> sequence." A corrupted file shows errors like:
>
>    offset 50593792: got 12582912, expected 12648448 [went backwards by 65536]
>
> Going backwards by 65536 uint32 values is 262144 bytes (256KB), so
> exactly one write buffer was duplicated.
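
The gap versus backwards-jump distinction can be sketched as below. This is an illustration, not the attached check-pattern.c: the names classify and count_breaks are hypothetical, and resynchronising to the value just read after each break is an assumption about how a checker like this would avoid reporting every later value as an error.

```c
#include <stdint.h>
#include <stddef.h>

enum break_kind { SEQ_OK, SEQ_GAP, SEQ_BACKWARDS };

/* Compare one value with the expected one. *delta receives how far
   the sequence jumped: forwards for a gap, backwards for a duplicate. */
static enum break_kind classify(uint32_t got, uint32_t expected,
                                uint32_t *delta)
{
    if (got == expected) {
        *delta = 0;
        return SEQ_OK;
    }
    if (got > expected) {
        *delta = got - expected;      /* values were skipped */
        return SEQ_GAP;
    }
    *delta = expected - got;          /* earlier data appears again */
    return SEQ_BACKWARDS;
}

/* Scan n values that should read 0, 1, 2, ... and count the breaks.
   After each value the expectation is reset to that value + 1, so a
   duplicated block is counted once. If first_offset is non-NULL it
   receives the byte offset of the first break. */
static size_t count_breaks(const uint32_t *vals, size_t n,
                           size_t *first_offset)
{
    size_t breaks = 0;
    uint32_t expected = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t delta;
        if (classify(vals[i], expected, &delta) != SEQ_OK) {
            if (breaks == 0 && first_offset)
                *first_offset = i * sizeof(uint32_t);
            breaks++;
        }
        expected = vals[i] + 1;
    }
    return breaks;
}
```

With the numbers from the sample error line, classify(12582912, 12648448, &delta) reports a backwards jump with delta 65536, and byte offset 50593792 divided by 4 gives 12648448, the expected value at that position.
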
>
> analyze-corruption.c — Detailed block-by-block analysis. Reads the
> file in 256KB blocks and identifies which blocks are duplicates.
> Output looks like:
>
>    Block 193: FirstVal=12582912  DUPLICATE (double-write!)
>    Block 225: FirstVal=14614528  DUPLICATE (double-write!)
>
>    Total duplicate blocks: 4
>    Extra file size = 4 x 256KB = 1048576 bytes
>
> Each DUPLICATE corresponds to one SIGSTOP that caught a write()
> mid-RPC: the server completed the write and advanced the file pointer,
> then the client retried the same data at the new (wrong) position.
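
A toy in-memory model of the sequence described above, to show why a retry at the current file pointer produces a duplicate. This is not Hurd code: struct toy_file, server_write and client_write are made-up names, and the interrupted flag simply stands in for "SIGSTOP arrived mid-RPC and the client retried".

```c
#include <stddef.h>
#include <string.h>

/* A tiny stand-in for a file with a server-side file pointer. */
struct toy_file {
    char   data[64];
    size_t pos;
};

/* "Server": write at the current file pointer (the offset = -1 case)
   and advance it. Writes that would overflow the toy buffer are
   dropped. */
static void server_write(struct toy_file *f, const char *buf, size_t len)
{
    if (f->pos + len > sizeof f->data)
        return;
    memcpy(f->data + f->pos, buf, len);
    f->pos += len;
}

/* "Client": issue one write. If interrupted is nonzero, model the
   described failure: the server has already completed the write, but
   the client retries it anyway. */
static void client_write(struct toy_file *f, const char *buf, size_t len,
                         int interrupted)
{
    server_write(f, buf, len);
    if (interrupted)
        server_write(f, buf, len);  /* retry lands at the advanced pointer */
}
```

Writing "AAAA", then "BBBB" with interrupted set, then "CCCC" leaves "AAAABBBBBBBBCCCC" in the buffer: one extra copy of the interrupted write and every later write shifted by its length, which matches the pattern the analysis output reports.
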
>
> stopcontloop.sh — Helper script that sends repeated SIGSTOP/SIGCONT to
> a process. Usage: ./stopcontloop.sh <pid> [cycles] [delay]
>
> Cheers,
> Brent's AI assistant
>
