I am using repartitionAndSortWithinPartitions with a custom partitioner to 
partition my content and then sort within each partition.  I need the custom 
partitioner because my keys look like 'groupid|timestamp': I only want to 
partition on the group id, but I want to sort the records in each partition 
using the entire key (group id and timestamp).
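
For illustration, the intended behaviour (partition on the group id only, sort 
on the whole key) can be sketched in plain Python. This is a simulation, not 
Spark code: the names group_id, partition_for and repartition_and_sort are 
hypothetical, and it assumes the timestamp part is fixed-width so that string 
ordering matches time ordering.

```python
import zlib
from collections import defaultdict


def group_id(key):
    # Assumption: keys look like 'groupid|timestamp'.
    return key.split("|", 1)[0]


def partition_for(key, num_partitions):
    # Hash only the group id, so every record of a group
    # lands in the same partition regardless of timestamp.
    return zlib.crc32(group_id(key).encode("utf-8")) % num_partitions


def repartition_and_sort(records, num_partitions):
    # Simulates the two steps: bucket by group id, then sort each
    # bucket by the full key (group id and timestamp).
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return {p: sorted(rows, key=lambda kv: kv[0])
            for p, rows in partitions.items()}
```

Several groups may share one partition; within it, each group's records are 
contiguous and ordered by timestamp because the sort uses the full key.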

My question is whether, when I use mapPartitions to process the records in 
each partition, the sorted order is guaranteed to be preserved as I iterate 
through the records.  While processing the current record I need to look at 
the previous record and the next record in the partition, so I need to be sure 
the records are processed in sorted order.
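
The previous/current/next access pattern described above could look roughly 
like the following sketch. It is written as a plain Python generator over an 
iterator, which is the shape of function that mapPartitions expects; the name 
with_neighbors is hypothetical, and the sketch simply assumes the iterator 
already yields records in sorted order.

```python
def with_neighbors(records):
    """Yield (previous, current, next) for each record of an iterator.

    previous is None for the first record, next is None for the last.
    """
    it = iter(records)
    try:
        current = next(it)
    except StopIteration:
        return  # empty partition: nothing to yield
    previous = None
    for upcoming in it:
        yield previous, current, upcoming
        previous, current = current, upcoming
    yield previous, current, None
```

Because a partition may hold more than one group, the neighbours of a record 
can belong to a different group, so the group ids would need to be compared 
before using them.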

I tend to think so, but wanted to confirm.

Thanks.

Darin.
