Issue 181461
Summary [hexagon] SIGSEGV in `std::set` red-black tree destruction HexagonMCChecker.cpp
Labels backend:Hexagon
Assignees androm3da
Reporter androm3da
Found on `release/22.x` (22.1.0-rc3).

When cross-compiling LLVM for `hexagon-unknown-linux-musl` at `-O1` or above, the resulting native Hexagon LLVM tools exhibited widespread crashes and assertion failures across multiple unrelated source files:

- **HexagonMCChecker.cpp** — SIGSEGV in `std::set` red-black tree destruction.  A tree node pointer is corrupted to `0x1` (only a color bit, no valid address) because an inlined `std::set::__insert_node_at` store-load pair executed out of order.
- **HexagonMCCompound.cpp** — "out of slots" errors on legal compound instruction packets.  The compound instruction splitting logic produced incorrect results due to stale loads.
- **Verifier.cpp** — False "Broken module found" errors on all valid LLVM IR, including trivial functions like `add i32`.  A stored flag was read back through a differently-typed pointer with stale data.
- **LiveIntervals.cpp** — "reserved computation mismatch" assertion.  Two computations of `isReservedRegUnit()` using the same iterators produced different results because iterator state was corrupted by out-of-order memory operations.
- **Systemic** — 58 out of 100 sampled `CodeGen/Hexagon` tests failed with various crashes, assertion failures, and SIGSEGV at addresses like `0x1`, `0xf`, `0xc000001`.

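None of these issues occur at `-O0`, where no packetization takes place, and all share a common root cause: a store and a subsequent load that may alias are placed in the same packet without `:mem_noshuf`.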

## Reduced tests

The following LLVM IR demonstrates the bug.  On V65+, the store and load land in the same packet **without** `:mem_noshuf`:

```llvm
; RUN: llc -march=hexagon -mcpu=hexagonv65 -O2 < %s | FileCheck %s

; CHECK-LABEL: test_store_imm_load:
; CHECK:       {
; CHECK-DAG:   memw(r0+#0) = #1
; CHECK-DAG:   r0 = memw(r1+#0)
; CHECK:       } :mem_noshuf

define i32 @test_store_imm_load(ptr %p, ptr %q) #0 {
entry:
  store i32 1, ptr %p, align 4, !tbaa !0
  %v = load i32, ptr %q, align 4, !tbaa !3
  ret i32 %v
}

attributes #0 = { nounwind }

!0 = !{!1, !1, i64 0}        ; type_a
!1 = !{!"type_a", !2}
!2 = !{!"tbaa_root"}
!3 = !{!4, !4, i64 0}        ; type_b
!4 = !{!"type_b", !2}
```

**Buggy output** (release/22.x, V65):
```asm
test_store_imm_load:
    {
        jumpr r31
        r0 = memw(r1+#0)
        memw(r0+#0) = #1
    }
```

No `:mem_noshuf` — the hardware may execute the load before the store.  If `%p == %q`, the load returns the old value instead of `1`.
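
The hazard can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not Hexagon semantics or LLVM code: `PacketOrder`, `run_store_load_packet`, and `demo_aliasing_hazard` are hypothetical names, and the model simply assumes that a packet without `:mem_noshuf` may perform the load first, while a packet with it performs the store first.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a single packet that contains one store and one load.
// Assumption: LoadFirst stands for a packet without :mem_noshuf in which
// the load happens to execute before the store; StoreFirst stands for the
// ordering that :mem_noshuf is expected to guarantee.
enum class PacketOrder { StoreFirst, LoadFirst };

// Performs `*p = value` and `result = *q` in the requested order.
std::int32_t run_store_load_packet(std::int32_t *p, std::int32_t *q,
                                   std::int32_t value, PacketOrder order) {
  if (order == PacketOrder::LoadFirst) {
    std::int32_t loaded = *q;  // load observes memory before the store
    *p = value;
    return loaded;
  }
  *p = value;  // store lands first
  return *q;   // load observes the store if q aliases p
}

// Usage sketch: the two orders only disagree when the pointers alias.
void demo_aliasing_hazard() {
  std::int32_t cell = 0;
  // %p == %q: program order expects the freshly stored 1.
  assert(run_store_load_packet(&cell, &cell, 1, PacketOrder::StoreFirst) == 1);
  cell = 0;
  // Same pointers, load first: the stale 0 is returned instead.
  assert(run_store_load_packet(&cell, &cell, 1, PacketOrder::LoadFirst) == 0);

  // Distinct addresses: the order makes no observable difference.
  std::int32_t a = 0, b = 7;
  assert(run_store_load_packet(&a, &b, 1, PacketOrder::StoreFirst) == 7);
  a = 0;
  assert(run_store_load_packet(&a, &b, 1, PacketOrder::LoadFirst) == 7);
}
```

In this model the reordering is invisible whenever the two addresses differ, which fits the report's observation that failures show up only in specific code paths where the pointers really do alias.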

**Expected output:**
```asm
test_store_imm_load:
    {
        jumpr r31
        memw(r0+#0) = #1
        r0 = memw(r1+#0)
    } :mem_noshuf
```

### Additional patterns

The same bug manifests with several variations:

1. **Tree node insertion** — store a pointer to a node field, load a pointer from another node field (models `std::set::__insert_node_at`):
```llvm
define ptr @test_tree_node_insert(ptr %new_node, ptr %parent, ptr %child_ptr) #0 {
  store ptr %parent, ptr %new_node, align 4, !tbaa !0
  %child = load ptr, ptr %child_ptr, align 4, !tbaa !3
  ret ptr %child
}
```

2. **Store-load feeding control flow** — loaded value used for a branch decision:
```llvm
define i32 @test_store_load_branch(ptr %flag_ptr, ptr %data_ptr, i32 %val) #0 {
  store i32 %val, ptr %flag_ptr, align 4, !tbaa !0
  %data = load i32, ptr %data_ptr, align 4, !tbaa !3
  %cmp = icmp eq i32 %data, 0
  br i1 %cmp, label %then, label %else
then:
  ret i32 %val
else:
  ret i32 %data
}
```

3. **Multiple store-load pairs** — each pair uses different TBAA types, all missing `:mem_noshuf`:
```llvm
define i32 @test_multi_store_load(ptr %p1, ptr %p2, ptr %p3, ptr %p4) #0 {
  store i32 10, ptr %p1, align 4, !tbaa !0
  %v1 = load i32, ptr %p2, align 4, !tbaa !3
  store i32 %v1, ptr %p3, align 4, !tbaa !5
  %v2 = load i32, ptr %p4, align 4, !tbaa !7
  %sum = add i32 %v1, %v2
  ret i32 %sum
}
```

