Re: File not found exceptions on S3 while running spark jobs

2020-07-17 Thread Nagendra Darla
Hi,

Thanks, I know about FileNotFoundException.

This error occurs with S3 buckets, which can delay showing newly created
files (eventual consistency). The files eventually show up after some time.

These errors come up while converting a Parquet table into a Delta table.

My question is more about avoiding this error with Spark jobs that create
/ update / delete lots of files on S3 buckets.
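For what it's worth, on S3 (which at the time of this thread was only eventually consistent for listings) the usual mitigations are S3Guard plus the S3A retry settings. A minimal sketch, assuming a hadoop-aws 3.x build (S3Guard is not available in the 2.8.5 artifact from the command below) and a DynamoDB table you have already provisioned; the table name and region here are illustrative:

```shell
# Sketch only: spark-submit flags that mitigate S3 read-after-write delays.
# Assumes hadoop-aws 3.x on the classpath (S3Guard is not in 2.8.5) and a
# pre-created DynamoDB table; bucket/table/region values are placeholders.
spark-submit \
  --packages io.delta:delta-core_2.11:0.6.0,org.apache.hadoop:hadoop-aws:3.2.0 \
  --conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore \
  --conf spark.hadoop.fs.s3a.metadatastore.impl=org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore \
  --conf spark.hadoop.fs.s3a.s3guard.ddb.table=my-s3guard-table \
  --conf spark.hadoop.fs.s3a.s3guard.ddb.region=us-east-1 \
  --conf spark.hadoop.fs.s3a.retry.limit=7 \
  --conf spark.hadoop.fs.s3a.retry.interval=1s \
  --class Pipeline1 Pipeline.jar
```

S3Guard keeps a consistent view of the bucket listing in DynamoDB, and the retry settings make transient "file not found" reads retry instead of failing the task outright.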

On Thu, Jul 16, 2020 at 10:28 PM Hulio andres  wrote:

>
> https://examples.javacodegeeks.com/java-io-filenotfoundexception-how-to-solve-file-not-found-exception/
>
> Are you a programmer?
>
> Regards,
>
> Hulio
>
>
>
> > Sent: Friday, July 17, 2020 at 2:41 AM
> > From: "Nagendra Darla" 
> > To: user@spark.apache.org
> > Subject: File not found exceptions on S3 while running spark jobs
> >
> > Hello All,
> > I am converting an existing Parquet table (size: 50 GB) into Delta format.
> > It took around 1 hr 45 min to convert.
> > And I see that there are a lot of FileNotFoundExceptions in the logs:
> >
> > Caused by: java.io.FileNotFoundException: No such file or directory:
> >
> s3a://old-data/delta-data/PL1/output/denorm_table/part-00031-183e54ef-50bc-46fc-83a3-7836baa28f86-c000.snappy.parquet
> >
> > *How do I fix these errors?* I am using the below options in the
> > spark-submit command:
> >
> > spark-submit --packages
> > io.delta:delta-core_2.11:0.6.0,org.apache.hadoop:hadoop-aws:2.8.5
> > --conf
> spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore
> > --class Pipeline1 Pipeline.jar
> >
> > Thank You,
> > Nagendra Darla
> >
>
>
> --


File not found exceptions on S3 while running spark jobs

2020-07-16 Thread Nagendra Darla
Hello All,
I am converting an existing Parquet table (size: 50 GB) into Delta format.
It took around 1 hr 45 min to convert.
And I see that there are a lot of FileNotFoundExceptions in the logs:

Caused by: java.io.FileNotFoundException: No such file or directory:
s3a://old-data/delta-data/PL1/output/denorm_table/part-00031-183e54ef-50bc-46fc-83a3-7836baa28f86-c000.snappy.parquet

*How do I fix these errors?* I am using the below options in the
spark-submit command:

spark-submit --packages
io.delta:delta-core_2.11:0.6.0,org.apache.hadoop:hadoop-aws:2.8.5
--conf 
spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore
--class Pipeline1 Pipeline.jar

Thank You,
Nagendra Darla
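
For a conversion like the one described above, Delta Lake's in-place `CONVERT TO DELTA` path may sidestep much of the problem: it writes only a transaction log next to the existing Parquet files instead of reading and rewriting 50 GB of data, so far fewer files are created or listed on S3. A sketch, assuming Delta Lake 0.6.0 on the classpath; the path is the one from the stack trace in this thread:

```scala
// Sketch: register an existing Parquet table as a Delta table in place,
// assuming Delta Lake 0.6.0. convertToDelta writes only the _delta_log,
// not new data files, which reduces S3 churn during conversion.
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("ConvertToDelta")
  .config("spark.delta.logStore.class",
    "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")
  .getOrCreate()

// Identifier syntax is "parquet.`<path>`"; existing files are left as-is.
DeltaTable.convertToDelta(
  spark,
  "parquet.`s3a://old-data/delta-data/PL1/output/denorm_table`")
```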