Maybe the problem is the data itself. For example, the keys in the first dataframe might match keys in only one part of the second dataframe, so every joined row ends up in the same partition. I think you can verify whether you are in this situation by repartitioning one dataframe before joining it. If this is the real cause, you should see the result distributed more evenly.
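To illustrate the idea, here is a rough sketch in plain Python (not Spark code) of what I suspect is happening. It assumes the join output is placed into 200 partitions by hashing the join key, which is how I understand a shuffle join to behave with the default `spark.sql.shuffle.partitions` of 200. The key value 42 and the row count are made up for the example. If all matched rows share one key, they all hash to the same partition, while dealing rows out in turn (roughly what `repartition(n)` without a column does, as far as I know) spreads them evenly.

```python
from collections import Counter

NUM_PARTITIONS = 200  # same as the 200 part files in the question


def hash_partition(keys, num_partitions=NUM_PARTITIONS):
    """Rows per partition when each row is placed by hash(key) % n."""
    counts = Counter(hash(k) % num_partitions for k in keys)
    return [counts.get(i, 0) for i in range(num_partitions)]


def round_robin_partition(keys, num_partitions=NUM_PARTITIONS):
    """Rows per partition when rows are dealt out in turn, ignoring keys."""
    counts = Counter(i % num_partitions for i in range(len(keys)))
    return [counts.get(i, 0) for i in range(num_partitions)]


# Hypothetical join output: every matched row carries the same join key.
skewed_keys = [42] * 10_000

by_hash = hash_partition(skewed_keys)
by_round_robin = round_robin_partition(skewed_keys)

print("non-empty partitions (hash):", sum(1 for c in by_hash if c))
print("non-empty partitions (round robin):", sum(1 for c in by_round_robin if c))
```

If your real data behaves like the first case, calling `repartition` on the joined dataframe before saving it should give you part files of similar size, at the cost of one more shuffle.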
2016-04-25 9:34 GMT+07:00 Divya Gehlot <divya.htco...@gmail.com>:
> Hi,
>
> After joining two dataframes, saving dataframe using Spark CSV.
> But all the result data is being written to only one part file whereas
> there are 200 part files being created, rest 199 part files are empty.
>
> What is the cause of uneven partitioning ? How can I evenly distribute the
> data ?
> Would really appreciate the help.
>
>
> Thanks,
> Divya
>