Re: spark structured streaming crash due to decompressing gzip file failure

2019-03-07 Thread Lian Jiang
Thanks, it worked.

On Thu, Mar 7, 2019 at 5:05 AM Akshay Bhardwaj <akshay.bhardwaj1...@gmail.com> wrote:
> Hi,
>
> In your spark-submit command, try using the below config property and see
> if this solves the problem.
>
> --conf spark.sql.files.ignoreCorruptFiles=true
>
> For me this worked to

Re: spark structured streaming crash due to decompressing gzip file failure

2019-03-07 Thread Akshay Bhardwaj
Hi,

In your spark-submit command, try using the below config property and see if this solves the problem.

--conf spark.sql.files.ignoreCorruptFiles=true

For me this worked to ignore reading empty/partially uploaded gzip files in s3 bucket.

Akshay Bhardwaj
+91-97111-33849

On Thu, Mar 7, 2019
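
For readers who assemble the spark-submit call from a script, the suggestion above might be sketched as follows. This is only an illustration: the helper name `spark_submit_command` and the application file `stream_job.py` are hypothetical and do not come from the thread; only the property name and its value do.

```python
import shlex

# Property and value quoted in the message above.
IGNORE_CORRUPT = ("spark.sql.files.ignoreCorruptFiles", "true")


def spark_submit_command(app_path, confs=(IGNORE_CORRUPT,)):
    """Build a spark-submit argument list with one --conf flag per property.

    Hypothetical helper for illustration; it only assembles the command,
    it does not launch Spark.
    """
    cmd = ["spark-submit"]
    for key, value in confs:
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_path)  # "stream_job.py" below is an assumed file name
    return cmd


if __name__ == "__main__":
    # Print the command as it would be typed in a shell.
    print(shlex.join(spark_submit_command("stream_job.py")))
```

The resulting list can be handed to `subprocess.run` unchanged. Since this is a Spark SQL property, it can usually also be set on an existing session with `spark.conf.set(...)` instead of on the command line.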

spark structured streaming crash due to decompressing gzip file failure

2019-03-06 Thread Lian Jiang
Hi,

I have a structured streaming job which listens to a hdfs folder containing jsonl.gz files. The job crashed due to error:

java.io.IOException: incorrect header check
    at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method)
    at
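
An error like this generally means the decompressor was handed bytes that are not a valid gzip stream. As a rough way to find such files before the job reads them, here is a minimal sketch using only the Python standard library. It assumes the files have been copied to (or mounted on) a local path; the function names and the `*.jsonl.gz` pattern are illustrative, and treating zero-byte files as unreadable is an assumption of this sketch, not something Spark or Hadoop is known to do.

```python
import gzip
import zlib
from pathlib import Path


def is_readable_gzip(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses as gzip, False otherwise."""
    path = Path(path)
    # Assumption of this sketch: a zero-byte file counts as unreadable.
    # Python's gzip module would otherwise treat it as an empty stream.
    if path.stat().st_size == 0:
        return False
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end so truncated or corrupt data is detected.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad gzip header, EOFError a truncated stream,
        # zlib.error corrupt compressed data.
        return False


def find_unreadable(folder, pattern="*.jsonl.gz"):
    """List files in `folder` matching `pattern` that fail the check above."""
    return sorted(p for p in Path(folder).glob(pattern) if not is_readable_gzip(p))
```

Calling `find_unreadable("/some/local/copy")` (a placeholder path) returns the suspect files so they can be inspected or re-uploaded.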