prashantwason commented on code in PR #9037:
URL: https://github.com/apache/hudi/pull/9037#discussion_r1242686769
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java:
##########
@@ -195,11 +193,6 @@ protected <T> ClosableIterator<HoodieRecord<T>> lookupRecords(List<String> keys,
         blockContentLoc.getContentPositionInLogFile(),
         blockContentLoc.getBlockSize());
-    // HFile read will be efficient if keys are sorted, since on storage records are sorted by key.

Review Comment:
   Removing this means that if any code path (existing or introduced later) does not sort the keys, we may get lookup misses from the MDT. This could lead to data quality issues.

   If we do not want the overhead of re-sorting an already sorted array (how much is the overhead?), then we at least need to add checks in getRecordsByKeysIterator and getRecordsByKeyPrefixIterator that each key is greater than the previous key.
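
To make the suggested check concrete, here is a minimal sketch of a single-pass ordering validation. `SortedKeyCheck` and `requireSorted` are hypothetical names for illustration, not existing Hudi code. The sketch assumes that `String.compareTo` ordering matches the order in which keys are stored in the HFile. It tolerates duplicate keys, which is slightly looser than the strict "greater than" wording above.

```java
import java.util.List;

// Hypothetical helper, not part of Hudi: verifies that lookup keys arrive in sorted order.
final class SortedKeyCheck {
  private SortedKeyCheck() {}

  /**
   * Throws if any key sorts before its predecessor. Equal neighbours are tolerated.
   * Assumption: String.compareTo matches the order the keys are stored in.
   */
  static void requireSorted(List<String> keys) {
    String previous = null;
    for (String current : keys) {
      // One comparison per key, so the cost is linear in the number of keys.
      if (previous != null && current.compareTo(previous) < 0) {
        throw new IllegalArgumentException(
            "Keys are not sorted: '" + current + "' follows '" + previous + "'");
      }
      previous = current;
    }
  }
}
```

A call such as `SortedKeyCheck.requireSorted(keys)` at the top of each iterator method would fail fast instead of silently missing records. Changing `< 0` to `<= 0` would also reject duplicates, if strict ordering is what the readers need.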