Re: Saving Parquet files to S3

2016-06-10 Thread Bijay Kumar Pathak
> Sent: 10 June 2016 11:47 AM > To: Ankur Jain > Cc: user@spark.apache.org > Subject: Re: Saving Parquet files to S3 > Hi, you'd be better off setting `parquet.block.size`. > // maropu > On Thu, Jun 9, 2016 at 7:48 AM, Daniel …

RE: Saving Parquet files to S3

2016-06-10 Thread Ankur Jain
Thanks maropu, it worked! From: Takeshi Yamamuro [mailto:linguin@gmail.com] Sent: 10 June 2016 11:47 AM To: Ankur Jain Cc: user@spark.apache.org Subject: Re: Saving Parquet files to S3 > Hi, you'd be better off setting `parquet.block.size`. > // maropu > On Thu, Jun 9, 2016 at 7:48 AM, Daniel …

Re: Saving Parquet files to S3

2016-06-10 Thread Takeshi Yamamuro
Hi, you'd be better off setting `parquet.block.size`. // maropu On Thu, Jun 9, 2016 at 7:48 AM, Daniel Siegmann wrote: > I don't believe there's any way to output files of a specific size. What > you can do is partition your data into a number of partitions such that …
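For concreteness, a minimal sketch of that setting in Spark 1.6-era Scala. The S3 paths are placeholders, and note that `parquet.block.size` caps the Parquet row-group size in bytes, so it bounds how data is chunked within each file rather than fixing the file size outright:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("parquet-block-size"))
val sqlContext = new SQLContext(sc)

// parquet.block.size controls the Parquet row-group size; a writer task
// flushes a row group once it reaches roughly this many bytes.
sc.hadoopConfiguration.setInt("parquet.block.size", 1024 * 1024 * 1024) // 1 GB

val df = sqlContext.read.json("s3n://my-bucket/input/") // placeholder input
df.write.parquet("s3n://my-bucket/output/")             // placeholder output
```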

Re: Saving Parquet files to S3

2016-06-09 Thread Daniel Siegmann
I don't believe there's any way to output files of a specific size. What you can do is partition your data into a number of partitions such that the amount of data each one contains is around 1 GB. On Thu, Jun 9, 2016 at 7:51 AM, Ankur Jain wrote: > Hello Team, > I …
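A rough sketch of that approach in Spark 1.6 Scala, assuming `df` is the DataFrame to be written. The 50 GB total is an assumed figure; in practice you'd estimate it from the input's size on S3, bearing in mind that Parquet compression will shrink the output:

```scala
// Pick a partition count so each output file lands near 1 GB.
val totalSizeBytes  = 50L * 1024 * 1024 * 1024  // assumption: ~50 GB of data
val targetFileBytes = 1L * 1024 * 1024 * 1024   // ~1 GB per output file
val numPartitions   = math.max(1, (totalSizeBytes / targetFileBytes).toInt)

// One Parquet file is written per partition under the output path.
df.repartition(numPartitions)
  .write
  .parquet("s3n://my-bucket/output/")           // placeholder path
```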

Saving Parquet files to S3

2016-06-09 Thread Ankur Jain
Hello Team, I want to write Parquet files to AWS S3, but I want each file to be around 1 GB in size. Can someone please guide me on how I can achieve this? I am using AWS EMR with Spark 1.6.1. Thanks, Ankur