heguanhui opened a new issue, #64187:
URL: https://github.com/apache/doris/issues/64187

   ### Version
   
   trunk (master branch)
   
   ### What's Wrong?
   
   When running BE UT with `CMAKE_BUILD_TYPE=LSAN` on ARM64 architecture, the 
test cases `IntersectOperatorTest::test_sink_large_string_data_over_4g` and 
`ExceptOperatorTest::test_sink_large_string_data_over_4g` crash the entire UT 
process with the following error:
   
   ```
   ==15521==ERROR: LeakSanitizer: requested allocation size 0x200000000 exceeds maximum supported size of 0x100000000
       #0 ... in realloc
       #1 ... in doris::DefaultMemoryAllocator::realloc(void*, unsigned long) allocator.h:100
       #2 ... in doris::Allocator::realloc(...) allocator.cpp:411
       #3 ... in doris::PODArrayBase::realloc(...) pod_array.h:191
       #4 ... in doris::PODArrayBase::reserve(...) pod_array.h:261
       #5 ... in doris::PODArrayBase::resize(...) pod_array.h:267
       #6 ... in doris::ColumnStr::insert_range_from_ignore_overflow(...) column_string.cpp:113
       #7 ... in doris::MutableBlock::merge_impl_ignore_overflow(...) block.h:586
       #8 ... in doris::MutableBlock::merge_ignore_overflow(...) block.h:564
       #9 ... in doris::SetSinkOperatorX<true>::sink(...) set_sink_operator.cpp:89
       #10 ... in IntersectOperatorTest_test_sink_large_string_data_over_4g_Test::TestBody() set_operator_test.cpp:480
   ```
   
   After this error, the UT process exits immediately, preventing any 
subsequent tests from running.
   
   ### What You Expected?
   
   The UT process should not crash. Tests that are incompatible with LSAN's 
ARM64 allocation limit should be gracefully skipped so that the rest of the UT 
suite can continue running.
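
One way to express the "skip instead of crash" behavior is a small compile-time guard, sketched below. It is only an illustration: the `LEAK_SANITIZER` macro is an assumption about what the LSAN build type defines, and `should_skip_for_lsan` and `kLsanAarch64MaxAlloc` are hypothetical names, not existing Doris code.

```cpp
#include <cstdint>

// Assumption: the LSAN build passes -DLEAK_SANITIZER. Adjust to whatever
// macro the build system really defines for CMAKE_BUILD_TYPE=LSAN.
#if defined(LEAK_SANITIZER)
constexpr bool kLsanBuild = true;
#else
constexpr bool kLsanBuild = false;
#endif

#if defined(__aarch64__)
constexpr bool kAarch64 = true;
#else
constexpr bool kAarch64 = false;
#endif

// 4GB: the per-allocation limit reported in the error message on ARM64.
constexpr std::uint64_t kLsanAarch64MaxAlloc = 4ULL << 30;

// Returns true when a single allocation of `bytes` would be rejected by
// LSAN on ARM64. The two bool parameters default to the build's own
// settings and exist so the logic can be exercised on any platform.
constexpr bool should_skip_for_lsan(std::uint64_t bytes,
                                    bool lsan = kLsanBuild,
                                    bool aarch64 = kAarch64) {
    return lsan && aarch64 && bytes > kLsanAarch64MaxAlloc;
}

// The failing request (8GB) is rejected, a request of exactly 4GB is not.
static_assert(should_skip_for_lsan(0x200000000ULL, true, true), "8GB must be skipped");
static_assert(!should_skip_for_lsan(0x100000000ULL, true, true), "4GB is still allowed");
```

Inside a GoogleTest body this could be used as `if (should_skip_for_lsan(8ULL << 30)) GTEST_SKIP() << "allocation exceeds LSAN limit on ARM64";` so that the remaining tests keep running.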
   
   ### How to Reproduce?
   
   1. Build BE UT on ARM64 with `CMAKE_BUILD_TYPE=LSAN`
   2. Run: `./run-be-ut.sh --run --filter=IntersectOperatorTest.test_sink_large_string_data_over_4g`
   3. Observe the crash
   
   ### Root Cause Analysis
   
   The crash is caused by the interaction of two factors:
   
   **1. LSAN's ARM64 allocation limit**
   
   In LSAN's source code (`compiler-rt/lib/lsan/lsan_allocator.cpp`), the 
maximum allowed single allocation size is defined per architecture:
   
   ```cpp
   #if defined(__i386__) || defined(__arm__)
   static const uptr kMaxAllowedMallocSize = 1ULL << 30;       // 1GB
   #elif defined(__mips64) || defined(__aarch64__)
   static const uptr kMaxAllowedMallocSize = 4ULL << 30;       // 4GB (ARM64)
   #else
   static const uptr kMaxAllowedMallocSize = 1ULL << 40;       // 1TB (x86_64)
   #endif
   ```
   
   On ARM64 (`__aarch64__`), `kMaxAllowedMallocSize = 0x100000000` (4GB). Any 
single allocation request exceeding this limit triggers a fatal error.
   
   **2. PODArray's power-of-two rounding**
   
   `PODArrayBase::reserve()` (`pod_array.h:259-262`) rounds up the requested 
size to the next power of two via `round_up_to_power_of_two_or_zero()`. When 
the test accumulates ~4.1GB of string data in `ColumnStr::chars`, the resize 
request of ~4.1GB gets rounded up to **8GB (0x200000000)**, which exceeds 
LSAN's 4GB limit on ARM64.
   
   The detailed calculation for the Intersect test (4200 rows × 1MB per row, 
batched at 500 rows):
   
   | Batch | chars.size() before | resize target | Capacity after round-up to 2^N | Realloc? |
   |-------|---------------------|---------------|--------------------------------|----------|
   | 5     | 1.95GB              | 2.44GB        | 0x100000000 (4GB)              | Yes |
   | 6-8   | 2.44~3.42GB         | 2.93~3.91GB   | 0x100000000 (4GB)              | No (capacity sufficient) |
   | **9** | **3.91GB**          | **4.10GB**    | **0x200000000 (8GB)**          | **Yes → CRASH** |
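
The table can be reproduced with a short simulation, sketched below. It is a simplified model rather than the actual `PODArray` code: it assumes every row is exactly 1MB (2^20 bytes), that `chars` starts empty, and it ignores any padding the real container adds. `round_up_pow2_or_zero`, `BatchStep`, and `simulate` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Smallest power of two >= n (0 stays 0). A stand-in for the rounding
// helper named in the issue, not a copy of the Doris implementation.
std::uint64_t round_up_pow2_or_zero(std::uint64_t n) {
    if (n == 0) return 0;
    assert(n <= (1ULL << 63));  // otherwise the shift below would overflow
    std::uint64_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

struct BatchStep {
    std::uint64_t size_before;  // bytes held before this batch
    std::uint64_t target;       // bytes requested by the resize
    std::uint64_t capacity;     // capacity after this batch
    bool realloc;               // true if the capacity had to grow
};

// Appends total_rows rows of row_bytes each, batch_rows at a time, and
// records how the capacity evolves under power-of-two rounding.
std::vector<BatchStep> simulate(std::uint64_t total_rows,
                                std::uint64_t batch_rows,
                                std::uint64_t row_bytes) {
    std::vector<BatchStep> steps;
    std::uint64_t size = 0;
    std::uint64_t capacity = 0;
    for (std::uint64_t done = 0; done < total_rows;) {
        const std::uint64_t rows = std::min(batch_rows, total_rows - done);
        const std::uint64_t target = size + rows * row_bytes;
        const bool grow = target > capacity;
        if (grow) capacity = round_up_pow2_or_zero(target);
        steps.push_back({size, target, capacity, grow});
        size = target;
        done += rows;
    }
    return steps;
}
```

Calling `simulate(4200, 500, 1ULL << 20)` yields nine steps. Under these assumptions, step 5 grows the capacity to 0x100000000, steps 6 to 8 fit in that capacity, and step 9 requests 0x200000000, which is the size shown in the LeakSanitizer error.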
   
   This issue does **not** occur on x86_64 because LSAN's limit there is 1TB.
   
   ### Are you willing to submit PR?
   
   - [x] Yes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

