Re: How to optimize and make this code faster using coalesce(1) and mapPartitionIndex

2016-01-13 Thread unk1102
Hi, thanks for the reply. Actually, I can't share details, as the code is classified and pretty complex to understand; the problem I am trying to solve is not a general one, it is related to database dynamic SQL order execution. I need to use Spark because my other jobs, which don't use coalesce, use Spark. My source data is

How to optimize and make this code faster using coalesce(1) and mapPartitionIndex

2016-01-12 Thread unk1102
Hi, I have the following code, which I run as part of a thread that becomes a child job of my main Spark job. It takes hours to run for large data (around 1-2 GB) because of coalesce(1); if the data is in the MB/KB range, it finishes faster. With larger data set sizes, it sometimes does not complete at all. Please