[GitHub] [hudi] bhasudha commented on issue #1960: How do you change the 'hoodie.datasource.write.payload.class' configuration property?

2020-08-13 Thread GitBox


bhasudha commented on issue #1960:
URL: https://github.com/apache/hudi/issues/1960#issuecomment-673910051


   @brandon-stanley Based on your description above, you could try this:
   
   Instead of skipping the precombine field, you could add 
COALESCE(update_date, create_date) as a new column before writing to Hudi and 
pass that new column as the precombine field. I think you could use 
withColumn() in Spark to do this. Duplicates are then resolved by the latest 
value of the precombine field, which is the COALESCE() value described above, 
so you wouldn't need to worry about the payload class. 
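
A minimal PySpark sketch of the suggestion above, not a definitive implementation. The column name `precombine_ts`, the helper names, and the table name and record key passed in are all hypothetical; only `update_date`, `create_date`, and the `hoodie.*` option keys come from this thread or the Hudi configuration docs.

```python
# Hypothetical name for the derived column; any unused column name works.
PRECOMBINE_COL = "precombine_ts"


def add_precombine_column(df, update_col="update_date", create_col="create_date"):
    """Return df with a column holding COALESCE(update_date, create_date)."""
    # Imported lazily so the option helper below can be used without Spark.
    from pyspark.sql import functions as F

    return df.withColumn(
        PRECOMBINE_COL,
        F.coalesce(F.col(update_col), F.col(create_col)),
    )


def hudi_options(table_name, record_key):
    """Build write options that point the precombine field at the new column."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": PRECOMBINE_COL,
    }
```

Assuming an existing DataFrame `df` and a target path `base_path` (both placeholders), the write would then look roughly like `add_precombine_column(df).write.format("hudi").options(**hudi_options("my_table", "id")).mode("append").save(base_path)`.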
   
   Please correct me if I am missing something.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] bhasudha commented on issue #1960: How do you change the 'hoodie.datasource.write.payload.class' configuration property?

2020-08-13 Thread GitBox


bhasudha commented on issue #1960:
URL: https://github.com/apache/hudi/issues/1960#issuecomment-673336838


   @brandon-stanley `hoodie.datasource.write.precombine.field` is a 
mandatory config. If it is not specified, the default field name `ts` is 
assumed. Since your table does not have this field, you are seeing the above 
error. The payload class is not the issue here, since the stack trace you are 
pointing to occurs well before the payload class is invoked. You might want to 
point `hoodie.datasource.write.precombine.field` to a valid column in the 
table and then also pass in a payload class that ignores the precombine 
field.
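
To illustrate the failure mode described above, here is a small pure-Python sketch. It is not Hudi's actual validation code: `resolve_precombine_field` is a hypothetical helper that mimics the fallback to `ts`, the column list is assumed, and `com.example.IgnorePrecombinePayload` is a made-up class name standing in for whatever custom payload you write.

```python
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"
PAYLOAD_KEY = "hoodie.datasource.write.payload.class"
DEFAULT_PRECOMBINE_FIELD = "ts"  # default assumed when the option is not set


def resolve_precombine_field(columns, options):
    """Return the precombine field in effect, or raise if the table lacks it."""
    field = options.get(PRECOMBINE_KEY, DEFAULT_PRECOMBINE_FIELD)
    if field not in columns:
        raise ValueError(
            f"precombine field '{field}' is not a column of the table: {columns}"
        )
    return field


# Assumed table schema for illustration; it has no `ts` column.
columns = ["id", "create_date", "update_date"]

# Point the precombine field at a real column and name a custom payload class.
options = {
    PRECOMBINE_KEY: "create_date",
    PAYLOAD_KEY: "com.example.IgnorePrecombinePayload",  # hypothetical class
}
```

With these options `resolve_precombine_field(columns, options)` succeeds, whereas calling it with an empty options dict falls back to `ts` and raises, which corresponds to the situation in the original report.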


