https://sourceware.org/bugzilla/show_bug.cgi?id=34036

            Bug ID: 34036
           Summary: Infinite loop in assembler with self-referential
                    symbol and PC-relative fixup
           Product: binutils
           Version: 2.47 (HEAD)
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: gas
          Assignee: unassigned at sourceware dot org
          Reporter: joel at airwebreathe dot org.uk
  Target Milestone: ---

GAS enters an infinite loop when a self-referential symbol definition (e.g.
`.L=.L`) is combined with a PC-relative instruction referencing that symbol.
The assembler consumes 100% CPU indefinitely and must be killed.

This can be reproduced on any CGEN target (e.g. or1k):

```
echo '.L=.L
l.j .L
.L: l.nop' | or1k-elf-as -o /dev/null
```

This hangs indefinitely. Replacing `l.j .L` with `.long .L` (a data reference)
does NOT hang — GAS correctly reports "symbol definition loop encountered at
`.L'".

The bug also reproduces on x86:

```
echo '.L=.L
jmp .L
.L: nop' | as -o /dev/null
```

The 2020 fix from commit 2a50b401465f ("Fix tight loop on
recursively-defined symbols") added cycle detection to `resolve_symbol_value()`
in `gas/symbols.c` for the unresolved-symbol path. This correctly handles cases
like `a=a` and `a=b; b=a` with data references.

However, when the self-referential symbol is the target of a PC-relative fixup
(branch/jump instruction), the fixup resolution takes a different code path
through the CGEN assembler infrastructure (or the target-specific
`md_apply_fix`) that bypasses the cycle detection. The fixup system computes
the PC-relative offset, gets a value, applies it, then detects the value
changed (because the symbol resolves through itself), and re-applies - forever.

The hang occurs before `write_object_file()` reaches the outer relaxation loop
in `gas/write.c` (which also has no iteration limit). Confirmed by adding debug
prints: the outer loop's `rsi.pass` counter is never reached.

The `resolve_symbol_value()` function at `gas/symbols.c:1284` has a `while
(symp->x->value.X_op == O_symbol)` loop that chases the symbol reference chain.
When the symbol is resolved AND self-referential (`X_add_symbol` points back to
itself), this loop has no cycle detection — unlike the unresolved path which
was fixed in 2020.

Affected targets: all, but particularly CGEN-based targets (or1k, m32c, mep,
epiphany, fr30, lm32), where the CGEN fixup pipeline processes PC-relative
operands. Confirmed on or1k and x86.

Found by AFL++ fuzzing with ~12.6 million executions.

Possible options for a fix:

Add cycle detection to the `while (symp->x->value.X_op == O_symbol)` loop at
`gas/symbols.c:1284` (the resolved-symbol path of `resolve_symbol_value()`). A
simple check `if (symp == orig_symp) return 0;` after advancing `symp` would
break the direct self-reference cycle. However, indirect cycles (a->b->a) would
need Floyd's cycle detection or a visited-flag approach.
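
As a rough illustration of the Floyd's-cycle-detection option, here is a
minimal standalone sketch. It does not use the real GAS data structures:
`struct sym`, its `alias` field, and `chase_alias_chain()` are hypothetical
stand-ins for a symbol whose value is `O_symbol` pointing at another symbol.
It catches both the direct case (`a=a`) and indirect cycles (`a=b; b=a`)
without needing a visited flag on each symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a GAS symbol.  `alias` is non-NULL
   when the symbol's value is just another symbol (the O_symbol case).  */
struct sym
{
  const char *name;
  struct sym *alias;
};

/* Follow the alias chain starting at START.  Return the final non-alias
   symbol, or NULL if the chain loops back on itself.  Uses Floyd's
   tortoise-and-hare: FAST advances two links per iteration, SLOW one;
   they can only meet again if the chain contains a cycle.  */
static struct sym *
chase_alias_chain (struct sym *start)
{
  assert (start != NULL);

  struct sym *slow = start;
  struct sym *fast = start;

  while (fast->alias != NULL)
    {
      fast = fast->alias;
      if (fast->alias == NULL)
        break;
      fast = fast->alias;
      slow = slow->alias;
      if (slow == fast)
        return NULL;   /* cycle: caller would report a definition loop */
    }
  return fast;
}
```

In the real code, the NULL return would presumably map onto the existing
"symbol definition loop encountered" diagnostic rather than a silent failure.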

Additionally, the outer relaxation loop in `write_object_file()`
(`gas/write.c:2159`) has no iteration limit, which could cause similar hangs
from other oscillation scenarios. Adding a reasonable bound (as
`relax_segment()` already does for its inner loop) would provide further
protection.
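
A bounded outer loop could look roughly like the sketch below. Everything in
it is an assumption made for illustration: `MAX_OUTER_PASSES`, `pass_fn`,
`run_passes_bounded()`, and the two demo passes are hypothetical names, and
the cap of 32 is arbitrary rather than taken from binutils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap on outer passes; not a value taken from binutils.  */
enum { MAX_OUTER_PASSES = 32 };

/* One pass over all sections; returns true if anything changed.  */
typedef bool (*pass_fn) (void *state);

/* Repeat PASS until it reports no change.  Return the number of passes
   run, or -1 if the cap was reached without converging.  */
static int
run_passes_bounded (pass_fn pass, void *state)
{
  assert (pass != NULL);

  for (int n = 1; n <= MAX_OUTER_PASSES; n++)
    if (!pass (state))
      return n;
  return -1;   /* caller would issue a diagnostic instead of spinning */
}

/* Demo pass that settles: STATE points at an int counting the changes
   still to be made.  */
static bool
settling_pass (void *state)
{
  int *remaining = state;
  if (*remaining == 0)
    return false;
  --*remaining;
  return true;
}

/* Demo pass that never settles, standing in for an oscillating layout.  */
static bool
oscillating_pass (void *state)
{
  (void) state;
  return true;
}
```

With a cap like this, an oscillating input ends in an error after a fixed
number of passes instead of hanging; the right bound would need to be large
enough not to reject legitimate inputs that converge slowly.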
