Re: Map and MapPartitions with partition-local variable

2016-11-17 Thread Rohit Verma
Using map and mapPartitions on the same DF at the same time doesn't make much sense to me. Also, without complete information, I am assuming that you have some partition strategy being defined/influenced by the map operation. In that case you can create a hashmap of map values for each partition, do
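
One way to read the suggestion above is to fold the map step into the per-partition function, so that the "partition-local variable" is just a local variable of that function. The following is only a rough sketch under that assumption, not code from this thread: `transform` and `process_partition` are hypothetical names, the rows are plain integers for illustration, and the whole partition is buffered in memory, which may not suit large partitions.

```python
from typing import Dict, Iterable, Iterator, Tuple


def transform(row: int) -> int:
    """Hypothetical per-row map step (stands in for the real map() logic)."""
    return row * 2


def process_partition(rows: Iterable[int]) -> Iterator[Tuple[int, int]]:
    # Partition-local state: created fresh on every call, so it is never
    # shared with other partitions, even on the same executor.
    seen: Dict[int, int] = {}
    mapped = []
    for row in rows:
        value = transform(row)
        # Side effect of the map step: update the partition-local hashmap.
        seen[value] = seen.get(value, 0) + 1
        mapped.append(value)
    # The "mapPartitions" step: runs after the whole partition has been
    # mapped, so it can read the completed local state.
    for value in mapped:
        yield value, seen[value]
```

With PySpark, a function of this shape could be passed as `df.rdd.mapPartitions(process_partition)`, in which case each element would be a `Row` rather than an integer. Outside Spark it can be tried on a plain list, for example `list(process_partition([1, 2, 1]))`.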

Map and MapPartitions with partition-local variable

2016-11-17 Thread Zsolt Tóth
Any comment on this one?

On 16 Nov 2016 at 12:59 PM, "Zsolt Tóth" wrote:
> Hi,
>
> I need to run a map() and a mapPartitions() on my input DF. As a side-effect of the map(), a partition-local variable should be updated, that is used in the mapPartitions()

Map and MapPartitions with partition-local variable

2016-11-16 Thread Zsolt Tóth
Hi, I need to run a map() and a mapPartitions() on my input DF. As a side-effect of the map(), a partition-local variable should be updated, which is then used in the mapPartitions() afterwards. I can't use a broadcast variable, because it is shared between partitions on the same executor. Where can I