sahvx655-wq opened a new pull request, #64248:
URL: https://github.com/apache/doris/pull/64248

   When HyperLogLog::deserialize reads an HLL_DATA_SPARSE blob, its loop pulls 
a uint16 register index and an 8-bit value per entry and writes straight into 
_registers, a 16384-byte array sized to HLL_REGISTERS_COUNT. The only guard 
ahead of it is is_valid(), which validates the slice length (5 + 
3*num_registers) but never inspects the index values, so any index from 16384 
up to 65535 passes through. I traced this back from the column deserialisation 
paths that feed serialised HLL bytes back in: a crafted sparse entry lands the 
write up to roughly 49KB beyond the allocation, with both the offset and the 
byte under the caller's control.
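
   To make the gap concrete, here is a minimal standalone sketch (not the Doris 
code) of a length-only check accepting a blob whose single entry carries index 
16384. The layout is an assumption inferred from the 5 + 3*num_registers 
formula: a 1-byte type tag, a 4-byte entry count, then 3 bytes per entry. The 
names kRegisterCount, kSparseTag, length_only_valid, make_sparse_blob and 
first_index are hypothetical, as is the tag value, and host byte order is 
assumed on both the writing and the reading side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout, inferred from the 5 + 3*num_registers length formula:
// [1-byte type tag][4-byte entry count][count x (uint16 index, uint8 value)]
constexpr std::size_t kRegisterCount = 16384;  // stands in for HLL_REGISTERS_COUNT
constexpr std::uint8_t kSparseTag = 2;         // hypothetical tag value

// Length-only validation: constant time, never looks at any index.
bool length_only_valid(const std::vector<std::uint8_t>& blob) {
    if (blob.size() < 5 || blob[0] != kSparseTag) return false;
    std::uint32_t count = 0;
    std::memcpy(&count, blob.data() + 1, sizeof(count));
    return blob.size() == 5 + 3 * static_cast<std::size_t>(count);
}

// Build a sparse blob holding exactly one (index, value) entry.
std::vector<std::uint8_t> make_sparse_blob(std::uint16_t index, std::uint8_t value) {
    std::vector<std::uint8_t> blob(8, 0);
    blob[0] = kSparseTag;
    const std::uint32_t count = 1;
    std::memcpy(blob.data() + 1, &count, sizeof(count));
    std::memcpy(blob.data() + 5, &index, sizeof(index));
    blob[7] = value;
    return blob;
}

// Read back the register index of the first entry.
std::uint16_t first_index(const std::vector<std::uint8_t>& blob) {
    std::uint16_t index = 0;
    std::memcpy(&index, blob.data() + 5, sizeof(index));
    return index;
}
```

   With these helpers, make_sparse_blob(16384, 7) passes length_only_valid() 
even though its index is already one past the last valid register (16383).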
   
   The index has to be bounded where the write happens. is_valid() is 
deliberately O(1) and several callers reach deserialize without going through 
it, so moving the check there would miss them; rejecting an out-of-range index 
inside deserialize closes the write on every path. A guard-page run confirmed 
that the unpatched code faults on the very first write at index 16384 while 
is_valid() still reports the blob as well formed. Left alone, this is a heap 
out-of-bounds write reachable from attacker-influenced HLL data, which makes it 
a memory-corruption bug rather than a benign crash.
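
   The shape of the guard can be sketched as follows. This is an illustration 
under the same assumed layout (1-byte tag, 4-byte count, 3-byte entries, host 
byte order), not the patch itself; deserialize_sparse, encode_one and the 
constants are hypothetical names, and the type tag is not checked here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kRegisterCount = 16384;  // stands in for HLL_REGISTERS_COUNT
constexpr std::size_t kHeaderSize = 5;         // assumed: 1-byte tag + 4-byte count
constexpr std::size_t kEntrySize = 3;          // assumed: uint16 index + uint8 value

// Hypothetical sparse decoder that bounds every index at the point of the write.
// Returns false on a malformed blob or an out-of-range index.
bool deserialize_sparse(const std::uint8_t* data, std::size_t size,
                        std::array<std::uint8_t, kRegisterCount>& registers) {
    if (data == nullptr || size < kHeaderSize) return false;
    std::uint32_t count = 0;
    std::memcpy(&count, data + 1, sizeof(count));
    // Length check phrased with division so it cannot overflow.
    const std::size_t payload = size - kHeaderSize;
    if (payload % kEntrySize != 0 || payload / kEntrySize != count) return false;

    const std::uint8_t* p = data + kHeaderSize;
    for (std::uint32_t i = 0; i < count; ++i, p += kEntrySize) {
        std::uint16_t index = 0;
        std::memcpy(&index, p, sizeof(index));
        if (index >= kRegisterCount) return false;  // the missing bound
        registers[index] = p[2];
    }
    return true;
}

// Helper for experiments: encode a blob with a single (index, value) entry.
std::vector<std::uint8_t> encode_one(std::uint16_t index, std::uint8_t value) {
    std::vector<std::uint8_t> blob(kHeaderSize + kEntrySize, 0);
    blob[0] = 2;  // hypothetical sparse tag, ignored by the decoder above
    const std::uint32_t count = 1;
    std::memcpy(blob.data() + 1, &count, sizeof(count));
    std::memcpy(blob.data() + kHeaderSize, &index, sizeof(index));
    blob[kHeaderSize + 2] = value;
    return blob;
}
```

   In this sketch, entries that precede a rejected one have already been 
written, so a caller should discard the register array when the function 
returns false rather than use the partial result.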


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

