Re: spark 2.0.0 - when saving a model to S3 spark creates temporary files. Why?

2016-08-25 Thread Steve Loughran

With Hadoop 2.7 or later, set

spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version 2
spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped true

This switches to a no-rename version of the file output committer, which is faster all round. You are still at risk of things going wrong on
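As a rough illustration of how the two settings above might be supplied to a job, here is a minimal sketch that renders them either as `--conf` arguments for spark-submit or as spark-defaults.conf lines. Only the two property names and values come from the message; the helper names `to_submit_args` and `to_defaults_lines` are hypothetical, and the property prefix is spelled `spark.hadoop`.

```python
# Properties quoted in the message above, using the "spark.hadoop." prefix.
COMMITTER_CONF = {
    "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version": "2",
    "spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped": "true",
}


def to_submit_args(conf):
    """Render properties as spark-submit --conf arguments (hypothetical helper)."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


def to_defaults_lines(conf):
    """Render the same properties as spark-defaults.conf lines (hypothetical helper)."""
    return [f"{key} {value}" for key, value in conf.items()]


if __name__ == "__main__":
    print(" ".join(to_submit_args(COMMITTER_CONF)))
    print("\n".join(to_defaults_lines(COMMITTER_CONF)))
```

Either form should have the same effect; which one fits depends on whether the setting is wanted per job or cluster-wide.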

Re: spark 2.0.0 - when saving a model to S3 spark creates temporary files. Why?

2016-08-25 Thread Takeshi Yamamuro

afaik no.

// maropu

On Thu, Aug 25, 2016 at 9:16 PM, Tal Grynbaum wrote:
> Is/was there an option similar to DirectParquetOutputCommitter to write
> json files to S3 ?
>
> On Thu, Aug 25, 2016 at 2:56 PM, Takeshi Yamamuro wrote:
>
>> Hi,
>>
>> Seems this just prevents writers from leaving pa

Re: spark 2.0.0 - when saving a model to S3 spark creates temporary files. Why?

2016-08-25 Thread Tal Grynbaum

Is/was there an option similar to DirectParquetOutputCommitter to write JSON files to S3?

On Thu, Aug 25, 2016 at 2:56 PM, Takeshi Yamamuro wrote:
> Hi,
>
> Seems this just prevents writers from leaving partial data in a
> destination dir when jobs fail.
> In the previous versions of Spark, the

Re: spark 2.0.0 - when saving a model to S3 spark creates temporary files. Why?

2016-08-25 Thread Takeshi Yamamuro

Hi, It seems this just prevents writers from leaving partial data in a destination dir when jobs fail. Previous versions of Spark had a way to write data directly to a destination, but Spark v2.0+ has no way to do that because of a critical issue on S3 (see SPARK-10063). // maropu
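To illustrate the point about partial data, here is a minimal sketch of the general write-to-temporary-then-rename pattern on a local filesystem. It is not Spark's or Hadoop's committer code; `commit_by_rename` and the `fail` flag are hypothetical names used only for this example. On S3 there is no native rename, so connectors emulate it as a copy followed by a delete, which is one reason the credentials need delete permission.

```python
import os
import shutil
import tempfile


def commit_by_rename(dest_dir, name, data, fail=False):
    """Write under dest_dir/_temporary, then rename into dest_dir on success.

    Illustrative only: mimics the temp-then-rename idea locally, it is not
    the actual Hadoop FileOutputCommitter logic.
    """
    tmp_dir = os.path.join(dest_dir, "_temporary")
    os.makedirs(tmp_dir, exist_ok=True)
    tmp_path = os.path.join(tmp_dir, name)
    try:
        with open(tmp_path, "w") as f:
            half = len(data) // 2
            f.write(data[:half])
            if fail:
                # Simulate a task dying halfway through its output.
                raise RuntimeError("simulated task failure")
            f.write(data[half:])
        # "Commit": the complete file becomes visible in one step.
        os.rename(tmp_path, os.path.join(dest_dir, name))
        return True
    except RuntimeError:
        return False
    finally:
        # Cleanup deletes the temporary directory and anything left in it.
        shutil.rmtree(tmp_dir, ignore_errors=True)


if __name__ == "__main__":
    out = tempfile.mkdtemp()
    commit_by_rename(out, "part-00000", "hello world")
    commit_by_rename(out, "part-00001", "hello world", fail=True)
    print(sorted(os.listdir(out)))
```

The failed write never appears in the destination directory; only the committed file does.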

Re: spark 2.0.0 - when saving a model to S3 spark creates temporary files. Why?

2016-08-24 Thread Tal Grynbaum

I read somewhere that it's because S3 has to know the size of the file upfront. I don't really understand this: why is it OK not to know it for the temp files but not OK for the final files? The delete permission is the minor disadvantage from my side; the worst thing is that I have a cluster

spark 2.0.0 - when saving a model to S3 spark creates temporary files. Why?

2016-08-24 Thread Aseem Bansal

Hi, When Spark saves anything to S3 it creates temporary files. Why? I am asking because this requires the access credentials to be given delete permissions along with write permissions.
