Re: [Spark 1.5.2]All data being written to only one part file rest part files are empty

nguyen duc tuan Mon, 25 Apr 2016 08:15:31 -0700

Maybe the problem is the data itself. For example, the first dataframe
might have keys in common with only one part of the second dataframe. I
think you can check whether you are in this situation by repartitioning one
dataframe and then joining it. If this is the real cause, you should see the
result distributed more evenly.
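
To illustrate the idea, here is a rough sketch in plain Python (not Spark
code). It assumes rows are assigned to partitions by hashing the join key,
with Python's built-in hash standing in for Spark's partitioner, which
behaves differently in detail. The helper names (partition_of,
rows_per_partition, round_robin) and the sample keys are made up for the
illustration.

```python
from collections import Counter

NUM_PARTITIONS = 200  # same as the number of part files in the question


def partition_of(key, num_partitions=NUM_PARTITIONS):
    # Stand-in for a hash partitioner (assumption: Spark's real hash differs).
    return hash(key) % num_partitions


def rows_per_partition(keys, num_partitions=NUM_PARTITIONS):
    # Count how many rows land in each partition when placed by key hash.
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def round_robin(n_rows, num_partitions=NUM_PARTITIONS):
    # Spread rows evenly regardless of key, as a key-independent
    # repartition would.
    counts = [0] * num_partitions
    for i in range(n_rows):
        counts[i % num_partitions] += 1
    return counts


# Hypothetical join output where every matched row has the same key.
skewed_keys = [42] * 10000
skewed = rows_per_partition(skewed_keys)
non_empty_skewed = sum(1 for c in skewed if c > 0)

# Hypothetical join output with many distinct keys.
varied_keys = list(range(10000))
varied = rows_per_partition(varied_keys)
non_empty_varied = sum(1 for c in varied if c > 0)

# The same skewed rows after a key-independent redistribution.
rebalanced = round_robin(len(skewed_keys))

print("non-empty partitions, single key:", non_empty_skewed)
print("non-empty partitions, many keys:", non_empty_varied)
print("min/max rows after round robin:", min(rebalanced), max(rebalanced))
```

If the joined rows really share one key (or very few), every row hashes to
the same partition and the other part files stay empty, which matches what
you describe. In Spark itself, calling repartition(n) on a dataframe is the
analogous way to spread the rows out.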


2016-04-25 9:34 GMT+07:00 Divya Gehlot <divya.htco...@gmail.com>:

> Hi,
>
> After joining two dataframes, I am saving the resulting dataframe using
> Spark CSV. But all the result data is being written to only one part file:
> 200 part files are created, and the other 199 are empty.
>
> What is the cause of the uneven partitioning? How can I distribute the
> data evenly?
> Would really appreciate the help.
>
>
> Thanks,
> Divya
>
