prashantwason opened a new pull request, #8526:
URL: https://github.com/apache/hudi/pull/8526

   [HUDI-6116] Optimize log block reading by removing seeks to check corrupted blocks.
   
   ### Change Logs
   
   1. Removed the eager `isBlockCorrupted` check that ran after reading the block size in `HoodieLogFileReader`
   2. Added validation checks after reading each item (version, size, blockType, content, etc.) from the log block
   3. Added a unit test that generates various corruption scenarios and validates that the corrupted blocks are found
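
The idea behind the changes listed above can be illustrated with a minimal sketch. It is not the code in this PR: the block layout (`[int version][int blockType][int contentLength][content]`), the limits `MAX_VERSION` and `NUM_BLOCK_TYPES`, and the names `LazyBlockValidationSketch`, `CorruptedBlockException`, `readBlock`, `encodeBlock`, and `isCorrupted` are all hypothetical and chosen only for illustration. The point it tries to show is that each field is validated right after it is read from a sequential stream, so no extra seek is needed to decide whether a block is corrupted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

final class LazyBlockValidationSketch {

    // Hypothetical limits for this sketch; not taken from Hudi.
    static final int MAX_VERSION = 3;
    static final int NUM_BLOCK_TYPES = 6;
    static final long HEADER_BYTES = 3L * Integer.BYTES;

    /** Hypothetical exception type signalling a corrupted block. */
    static final class CorruptedBlockException extends IOException {
        CorruptedBlockException(String message) {
            super(message);
        }
    }

    /**
     * Reads one block in the assumed layout, validating each field
     * immediately after it is read. The stream is only read forward.
     */
    static byte[] readBlock(DataInputStream in, long bytesRemaining) throws IOException {
        try {
            int version = in.readInt();
            if (version < 0 || version > MAX_VERSION) {
                throw new CorruptedBlockException("Invalid version: " + version);
            }
            int blockType = in.readInt();
            if (blockType < 0 || blockType >= NUM_BLOCK_TYPES) {
                throw new CorruptedBlockException("Invalid block type: " + blockType);
            }
            int contentLength = in.readInt();
            // The declared length must fit in what is left of the file.
            if (contentLength < 0 || contentLength > bytesRemaining - HEADER_BYTES) {
                throw new CorruptedBlockException("Invalid content length: " + contentLength);
            }
            byte[] content = new byte[contentLength];
            in.readFully(content);
            return content;
        } catch (EOFException e) {
            // Running out of bytes mid-block is treated as corruption.
            throw new CorruptedBlockException("Truncated block");
        }
    }

    /** Test helper: serializes a block, possibly with inconsistent fields. */
    static byte[] encodeBlock(int version, int blockType, int declaredLength, byte[] content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
            out.writeInt(blockType);
            out.writeInt(declaredLength);
            out.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns true if the raw bytes fail any of the per-field checks. */
    static boolean isCorrupted(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            readBlock(in, raw.length);
            return false;
        } catch (CorruptedBlockException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a corruption test would build blocks with `encodeBlock` using a bad version, a bad block type, an oversized declared length, or a truncated byte array, and assert that `isCorrupted` returns `true` for each while a well-formed block returns `false`.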
   
   ### Impact
   
   Improved performance of reading a log file when read latency is high or the file contains a large number of log blocks.
   
   ### Risk level (write none, low, medium or high below)
   
   None
   
   ### Documentation Update
   
   None
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

