RonBarabash commented on issue #750: Adding support for optional skipping 
single archiving failures
URL: https://github.com/apache/incubator-hudi/pull/750#issuecomment-503916979
 
 
   Hey @vinothchandar, thanks for the response!
   This error occurred sporadically after running a streaming application for 
a couple of days.
   Our setup is as follows: 
   We use Hudi as an output writer for a structured streaming job, which is 
dockerized and runs on Nomad.  The input is an event stream from Kafka, and 
we write the output to S3.
   My guess is that we might be losing Spark executors before Hudi finalises 
the writing of the .commit file, which would leave it empty, but that's just 
speculation; I didn't see this scenario in the logs or metrics. 
   Another option is that it has something to do with S3 eventual consistency; 
we are using `hoodie.consistency.check.enabled = true` in our run...
   
   How about ignoring empty commit files? Or rewriting the commit if it's empty?
   
   Nonetheless, we had to debug the application to find the name of the 
damaged file; the log message is important so that the file can be identified 
without actually debugging the app.
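
   As an illustration of the "ignore empty commit files, but log their names" 
idea, here is a minimal Python sketch. It is not Hudi code: the helper names 
`find_empty_commit_files` and `log_empty_commit_files` are hypothetical, and 
it assumes the commit files sit in a local directory (on S3 one would list 
objects and check their sizes instead).

```python
from pathlib import Path


def find_empty_commit_files(meta_dir):
    """Return the sorted names of zero-byte *.commit files in meta_dir.

    Hypothetical helper for illustration; assumes a local directory.
    """
    meta_path = Path(meta_dir)
    return sorted(
        p.name
        for p in meta_path.glob("*.commit")
        if p.is_file() and p.stat().st_size == 0
    )


def log_empty_commit_files(meta_dir):
    """Print the name of each empty commit file and return the list of names.

    Logging the file name is the point: the damaged file can then be
    identified from the log alone, without attaching a debugger.
    """
    empty = find_empty_commit_files(meta_dir)
    for name in empty:
        print(f"WARN: skipping empty commit file: {name}")
    return empty
```

   Whether skipping such a file is safe, or whether the commit should be 
rewritten instead, depends on how the file came to be empty, which is still 
an open question here.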

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
