aditiwari01 edited a comment on issue #2756:
URL: https://github.com/apache/hudi/issues/2756#issuecomment-812516535


   I think I did not explain myself clearly. I am using only 
DefaultHoodieRecordPayload. I have attached a sample command that shows this.
   
   The issue is not with "combineAndGetUpdateValue", but rather with "preCombine".
    As per my understanding, combineAndGetUpdateValue is used to merge a record 
from parquet with an in-memory record, whereas preCombine is used to dedupe 
multiple in-memory records with the same key. The preCombine function uses the 
orderingVal field to pick the winner, and when we create a record from the log 
file we do not set this ordering field. Hence the issue. 
   
   The constructors are as follows:
   
   1. DefaultHoodieRecordPayload(Option<GenericRecord> record) {this(record, 0);}
   2. DefaultHoodieRecordPayload(GenericRecord record, Comparable orderingVal) 
{super(record, orderingVal);}
   
   In the read path we only call the 1st constructor and hence lose the 
ordering value.
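
   To make the effect concrete, here is a minimal standalone sketch. It does 
not use the real Hudi classes: SketchPayload, its long orderingVal, and the 
two dedupe helpers are hypothetical names for illustration, and I am assuming 
preCombine keeps the other record only when its ordering value is strictly 
greater.

```java
public class PreCombineSketch {

    /** Hypothetical, simplified payload; NOT the real Hudi class. */
    static final class SketchPayload {
        final String data;
        final long orderingVal; // simplified: the real field is a Comparable

        // Mirrors constructor 2: the ordering value is supplied by the caller.
        SketchPayload(String data, long orderingVal) {
            this.data = data;
            this.orderingVal = orderingVal;
        }

        // Mirrors constructor 1: no ordering value supplied, so it defaults to 0.
        SketchPayload(String data) {
            this(data, 0L);
        }

        // Assumed rule: keep the other payload only if its ordering value
        // is strictly greater; on a tie, this payload wins.
        SketchPayload preCombine(SketchPayload other) {
            return other.orderingVal > this.orderingVal ? other : this;
        }
    }

    // Both records carry their ordering value, so the larger one wins.
    static String dedupeWithOrdering() {
        SketchPayload first = new SketchPayload("ts=200", 200L);
        SketchPayload second = new SketchPayload("ts=100", 100L);
        return second.preCombine(first).data;
    }

    // Both records were built without an ordering value (both default to 0),
    // so the comparison is a tie and the record doing the combining wins.
    static String dedupeWithoutOrdering() {
        SketchPayload first = new SketchPayload("ts=200");
        SketchPayload second = new SketchPayload("ts=100");
        return second.preCombine(first).data;
    }

    public static void main(String[] args) {
        System.out.println(dedupeWithOrdering());
        System.out.println(dedupeWithoutOrdering());
    }
}
```

   In this sketch the first helper keeps the "ts=200" record, while the second 
keeps "ts=100", because with both ordering values at 0 the result depends only 
on which record calls preCombine.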
   
   Also, if we compact after each commit we don't see this issue, since 
"combineAndGetUpdateValue" works absolutely fine.


