Maybe the problem is the data itself. For example, the keys in the first
dataframe might match keys in only one part of the second dataframe, so
every joined row ends up in the same partition. I think you can verify
whether you are in this situation by repartitioning one dataframe and then
joining it. If this is the real cause, you should see the result
distributed more evenly.
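
To illustrate what I mean, here is a rough plain-Python simulation, not
Spark code. It assumes the join output is placed into 200 partitions by
key hash and that every matched row shares a single key (the value 42 and
the 10,000 row count are made up). The helper names are hypothetical, and
Python's hash() only stands in for Spark's real hash function.

```python
from collections import Counter

NUM_PARTITIONS = 200  # same as the 200 part files in the question


def hash_partition(keys, num_partitions=NUM_PARTITIONS):
    """Rows per partition when each row is placed by the hash of its key."""
    counts = Counter(hash(k) % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def round_robin_partition(keys, num_partitions=NUM_PARTITIONS):
    """Rows per partition when rows are dealt out in turn, ignoring the key."""
    counts = Counter(i % num_partitions for i in range(len(keys)))
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical join output: every matched row has the same key.
joined_keys = [42] * 10_000

by_hash = hash_partition(joined_keys)
by_round_robin = round_robin_partition(joined_keys)

non_empty_hash = sum(1 for c in by_hash if c > 0)
non_empty_rr = sum(1 for c in by_round_robin if c > 0)

print("hash placement: non-empty partitions =", non_empty_hash)
print("round-robin placement: non-empty partitions =", non_empty_rr)
```

With key-based placement only one partition receives rows, which matches
the one full part file and 199 empty ones. Dealing the same rows out in
turn fills all 200. In Spark, calling repartition(200) on the joined
dataframe before saving redistributes rows without looking at the key, so
it should have a similar effect if skewed keys are the cause.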

2016-04-25 9:34 GMT+07:00 Divya Gehlot <divya.htco...@gmail.com>:

> Hi,
>
> After joining two dataframes, I am saving the resulting dataframe using
> Spark CSV. But all the result data is being written to only one part file:
> 200 part files are created, and the other 199 are empty.
>
> What is the cause of the uneven partitioning? How can I distribute the
> data evenly?
> Would really appreciate the help.
>
>
> Thanks,
> Divya
>
