heguanhui opened a new issue, #64187:
URL: https://github.com/apache/doris/issues/64187
### Version
trunk (master branch)
### What's Wrong?
When running BE UT with `CMAKE_BUILD_TYPE=LSAN` on ARM64 architecture, the
test cases `IntersectOperatorTest::test_sink_large_string_data_over_4g` and
`ExceptOperatorTest::test_sink_large_string_data_over_4g` crash the entire UT
process with the following error:
```
==15521==ERROR: LeakSanitizer: requested allocation size 0x200000000 exceeds maximum supported size of 0x100000000
#0 ... in realloc
#1 ... in doris::DefaultMemoryAllocator::realloc(void*, unsigned long) allocator.h:100
#2 ... in doris::Allocator::realloc(...) allocator.cpp:411
#3 ... in doris::PODArrayBase::realloc(...) pod_array.h:191
#4 ... in doris::PODArrayBase::reserve(...) pod_array.h:261
#5 ... in doris::PODArrayBase::resize(...) pod_array.h:267
#6 ... in doris::ColumnStr::insert_range_from_ignore_overflow(...) column_string.cpp:113
#7 ... in doris::MutableBlock::merge_impl_ignore_overflow(...) block.h:586
#8 ... in doris::MutableBlock::merge_ignore_overflow(...) block.h:564
#9 ... in doris::SetSinkOperatorX<true>::sink(...) set_sink_operator.cpp:89
#10 ... in IntersectOperatorTest_test_sink_large_string_data_over_4g_Test::TestBody() set_operator_test.cpp:480
```
After this error, the UT process exits immediately, preventing any
subsequent tests from running.
### What You Expected?
The UT process should not crash. Tests that are incompatible with LSAN's
ARM64 allocation limit should be gracefully skipped so that the rest of the UT
suite can continue running.
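
One way the requested skip could look is sketched below. This is only an illustration, not the actual fix: the `UT_LSAN_BUILD` macro and the helper names are hypothetical, and the per-architecture limits are copied from the LSAN snippet quoted in the root cause analysis. `GTEST_SKIP()` is the standard GoogleTest macro and appears only in the usage comment, so the block compiles without GoogleTest.

```cpp
#include <cstdint>

// Hypothetical: assumes the build system defines UT_LSAN_BUILD for
// CMAKE_BUILD_TYPE=LSAN. The real build may expose a different macro.
#ifdef UT_LSAN_BUILD
constexpr bool kLsanBuild = true;
#else
constexpr bool kLsanBuild = false;
#endif

// Largest single allocation LSAN accepts, mirroring the per-architecture
// values quoted from lsan_allocator.cpp in this issue.
inline uint64_t lsan_max_single_alloc() {
#if defined(__i386__) || defined(__arm__)
    return 1ULL << 30;
#elif defined(__mips64) || defined(__aarch64__)
    return 4ULL << 30;
#else
    return 1ULL << 40;
#endif
}

// True when a test whose largest single allocation is `peak_alloc_bytes`
// cannot run under LSAN on the current architecture.
inline bool should_skip_under_lsan(uint64_t peak_alloc_bytes, bool lsan_enabled) {
    return lsan_enabled && peak_alloc_bytes > lsan_max_single_alloc();
}

// Possible use at the top of an affected test body:
//
//   if (should_skip_under_lsan(8ULL << 30, kLsanBuild)) {
//       GTEST_SKIP() << "single allocation exceeds LSAN's limit on this arch";
//   }
```

Passing the peak allocation explicitly keeps the check independent of the architecture, so the same test still runs under LSAN on x86_64.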
### How to Reproduce?
1. Build BE UT on ARM64 with `CMAKE_BUILD_TYPE=LSAN`
2. Run: `./run-be-ut.sh --run --filter=IntersectOperatorTest.test_sink_large_string_data_over_4g`
3. Observe the crash
### Root Cause Analysis
The crash is caused by the interaction of two factors:
**1. LSAN's ARM64 allocation limit**
In LSAN's source code (`compiler-rt/lib/lsan/lsan_allocator.cpp`), the
maximum allowed single allocation size is defined per architecture:
```cpp
#if defined(__i386__) || defined(__arm__)
static const uptr kMaxAllowedMallocSize = 1ULL << 30; // 1GB
#elif defined(__mips64) || defined(__aarch64__)
static const uptr kMaxAllowedMallocSize = 4ULL << 30; // 4GB (ARM64)
#else
static const uptr kMaxAllowedMallocSize = 1ULL << 40; // 1TB (x86_64)
#endif
```
On ARM64 (`__aarch64__`), `kMaxAllowedMallocSize = 0x100000000` (4GB). Any
single allocation request exceeding this limit triggers a fatal error.
**2. PODArray's power-of-two rounding**
`PODArrayBase::reserve()` (`pod_array.h:259-262`) rounds up the requested
size to the next power of two via `round_up_to_power_of_two_or_zero()`. When
the test accumulates ~4.1GB of string data in `ColumnStr::chars`, the resize
request of ~4.1GB gets rounded up to **8GB (0x200000000)**, which exceeds
LSAN's 4GB limit on ARM64.
The detailed calculation for the Intersect test (4200 rows × 1MB per row,
batched at 500 rows):
| Batch | chars.size() before | resize target | round_up_to_2^N | Realloc? |
|-------|---------------------|---------------|-----------------|----------|
| 5 | 1.95GB | 2.44GB | 0x100000000 (4GB) | Yes |
| 6-8 | 2.44~3.42GB | 2.93~3.91GB | 0x100000000 (4GB) | No (capacity sufficient) |
| **9** | **3.91GB** | **4.10GB** | **0x200000000 (8GB)** | **Yes → CRASH** |
This issue does **not** occur on x86_64 because LSAN's limit there is 1TB.
### Are you willing to submit PR?
- [x] Yes
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]