prashantwason commented on code in PR #9037:
URL: https://github.com/apache/hudi/pull/9037#discussion_r1242686769


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java:
##########
@@ -195,11 +193,6 @@ protected <T> ClosableIterator<HoodieRecord<T>> lookupRecords(List<String> keys,
         blockContentLoc.getContentPositionInLogFile(),
         blockContentLoc.getBlockSize());
 
-    // HFile read will be efficient if keys are sorted, since on storage records are sorted by key.

Review Comment:
   Removing the sort here means that any code path (existing or introduced 
later) that passes unsorted keys may miss records when reading from the MDT. 
This could lead to data quality issues. 
   
   If we want to avoid the overhead of re-sorting an already sorted list (how 
much is the overhead?), then we at least need to add checks here that each 
key is greater than the previous key in getRecordsByKeysIterator 
and getRecordsByKeyPrefixIterator.
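
   For illustration, the kind of check meant above could look roughly like the 
sketch below. It is not taken from the PR: the class and method names 
(SortedKeysCheck, firstOutOfOrderIndex, checkSortedKeys) are hypothetical, and 
it assumes that String.compareTo ordering is an acceptable stand-in for the 
key ordering used by the HFile reader, which would need to be confirmed. 

```java
import java.util.List;

// Hypothetical helper, not part of Hudi: verifies that lookup keys are in
// strictly increasing order before they are handed to a sorted-file reader.
final class SortedKeysCheck {

  private SortedKeysCheck() {
  }

  // Returns the index of the first key that is not strictly greater than
  // its predecessor, or -1 if the whole list is strictly increasing.
  // Assumption: String.compareTo matches the reader's key comparator.
  static int firstOutOfOrderIndex(List<String> keys) {
    String previous = null;
    int index = 0;
    for (String current : keys) {
      if (previous != null && previous.compareTo(current) >= 0) {
        return index;
      }
      previous = current;
      index++;
    }
    return -1;
  }

  // Fails fast instead of silently missing records on unsorted input.
  static void checkSortedKeys(List<String> keys) {
    int index = firstOutOfOrderIndex(keys);
    if (index >= 0) {
      throw new IllegalStateException(
          "Keys must be sorted in strictly increasing order; key at index "
              + index + " is out of order: " + keys.get(index));
    }
  }
}
```

   This is a single O(n) pass with no allocation, so it should cost less than 
re-sorting. Whether duplicate keys should be rejected (as here) or tolerated 
is a separate decision for the prefix lookup path.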


