Thanks Davies. After I used coalesce(1) to save the DataFrame as a single
Parquet file, head() returned the rows in the correct order.
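
For anyone finding this thread later, here is a minimal sketch of what that looks like. The helper name write_sorted_single_file, the column name, and the output path are made up for illustration; orderBy, coalesce and write.parquet are the regular DataFrame calls. As far as I can tell this works because all the sorted rows end up in one output file, but I would treat it as observed behaviour rather than a documented guarantee.

```python
def write_sorted_single_file(df, sort_col, path):
    """Sort a DataFrame by one column and write it as a single Parquet file.

    Hypothetical helper: `df` is assumed to be a pyspark.sql.DataFrame,
    `sort_col` a column name, and `path` an output directory.
    """
    # Global sort by the chosen column.
    sorted_df = df.orderBy(sort_col)
    # Collapse to one partition so only one part file is written.
    single_partition_df = sorted_df.coalesce(1)
    # Write the single partition out as Parquet.
    single_partition_df.write.parquet(path)


# Example use (assumes an existing SparkSession named `spark`; the column
# name and path are placeholders):
#   write_sorted_single_file(df, "timestamp", "/tmp/sorted_output")
#   spark.read.parquet("/tmp/sorted_output").head()
```

If the data is too large for a single partition, re-applying orderBy after reading the files back is probably the safer way to get a defined order.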
On Sun, May 8, 2016 at 12:29 AM, Davies Liu wrote:
> When you have multiple parquet files, the order of all the rows in
> them is not defined.
On Sat, May 7, 2016 at 11:48 PM, Buntu Dev wrote:
> I'm using the PySpark DataFrame API to sort by a specific column and then saving
> the DataFrame as a Parquet file. But the resulting parquet