itschrispeck commented on PR #12539:
URL: https://github.com/apache/pinot/pull/12539#issuecomment-1980247003

   > Trying to understand the problem better here. Why would it cause problem 
when the source position is moved? It is allowed for `decompress()` to move the 
position of the compressed buffer, and it should not break.
   
   I don't think the last bit is always true. While it is allowed to move the 
position, `lz4_decompress_fast` can move it beyond the source buffer's limit. 
See this example, the returned bytes `read` value is greater than `src.limit`:
   
   <img width="908" alt="image" 
src="https://github.com/apache/pinot/assets/27231838/ababbb74-2491-412e-91c9-946cce27e310";>
   
   There's a note in the 
[docs](http://ppl.cs.illinois.edu/doxygen/charm/lz4_8c.shtml#fc1e392c915b02d2459c3453bd647b4a):
   
   >The function never writes past the output 
[buffer](http://ppl.cs.illinois.edu/doxygen/charm/structbuffer.shtml). However, 
since it doesn't know its 'src' size, it may read past the intended input. 
Also, because match offsets are not validated during decoding, reads from 'src' 
may underflow.
   
   I am not sure if the the fast decompressor is the best choice. It may be 
better to move to the safe decompressor which is already used for non v4 
indexes. Doing that would also address this bug.
   > 
[LZ4_decompress_fast()](http://ppl.cs.illinois.edu/doxygen/charm/lz4_8c.shtml#fc1e392c915b02d2459c3453bd647b4a)
 : **unsafe!** This function used to be a bit faster than 
[LZ4_decompress_safe()](http://ppl.cs.illinois.edu/doxygen/charm/lz4_8c.shtml#5b433861ebd45cd96e65428febd89929),
 though situation has changed in recent versions, and now 
`LZ4_decompress_safe()` can be as fast and sometimes faster than 
`LZ4_decompress_fast()`. Moreover, LZ4_decompress_fast() is not protected vs 
malformed input, as it doesn't perform full validation of compressed data. As a 
consequence, this function is no longer recommended, and may be deprecated in 
future versions. It's only remaining specificity is that it can decompress data 
without knowing its compressed size.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to