Thanks maropu. It worked.
From: Takeshi Yamamuro [mailto:linguin@gmail.com]
Sent: 10 June 2016 11:47 AM
To: Ankur Jain
Cc: user@spark.apache.org
Subject: Re: Saving Parquet files to S3
Hi,
You'd be better off setting `parquet.block.size`.
// maropu
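That suggestion can be sketched as below. This is a hedged example, assuming PySpark 1.6 with a live SparkContext `sc` and an existing DataFrame `df`; the S3 path is a placeholder, and `sc._jsc` is PySpark's bridge to the JVM-side Hadoop configuration:

```python
# Assumption: `sc` is a live SparkContext and `df` a DataFrame (PySpark 1.6).
# "parquet.block.size" is a Hadoop configuration key, in bytes; 1 GB here.
ONE_GB = 1024 * 1024 * 1024
sc._jsc.hadoopConfiguration().setInt("parquet.block.size", ONE_GB)

# Each Parquet row group then targets ~1 GB. The actual on-disk file size
# still depends on compression and on how much data each task writes.
df.write.parquet("s3://my-bucket/output/")  # hypothetical destination
```

Note that `parquet.block.size` governs row-group size within a file, not the number of files; the file count is still one per task, which is why the repartitioning advice below the thread also matters.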
On Thu, Jun 9, 2016 at 7:48 AM, Daniel Siegmann wrote:
> I don't believe there's any way to output files of a specific size. What
> you can do is partition your data into a number of partitions such that
> the amount of data each one contains is around 1 GB.
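The partition-count arithmetic behind that advice can be sketched as plain Python. `total_bytes` would come from your own estimate of the dataset's in-memory size (which differs from the compressed on-disk size), and the `repartition` call is shown commented out since it needs a live Spark session:

```python
import math

def partitions_for_target_size(total_bytes, target_bytes=1024 ** 3):
    """Number of partitions so each holds roughly target_bytes of data."""
    return max(1, math.ceil(total_bytes / target_bytes))

# A ~10.5 GB dataset would need 11 partitions of ~1 GB each.
n = partitions_for_target_size(int(10.5 * 1024 ** 3))

# With a live DataFrame `df` (hypothetical), one file per partition:
# df.repartition(n).write.parquet("s3://my-bucket/out/")
```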
On Thu, Jun 9, 2016 at 7:51 AM, Ankur Jain wrote:
> Hello Team,
>
> I want to write Parquet files to AWS S3, but I want each file to be
> around 1 GB in size.
> Can someone please guide me on how I can achieve this?
> I am using AWS EMR with Spark 1.6.1.
>
> Thanks,
> Ankur