Re: Spark app performance

2015-01-01 Thread Akhil Das
It would be great if you could share the code that runs inside your mapPartition. I'm assuming you are creating or handling a lot of complex objects, which slows down performance. Here's a link to the performance tuning guide, http://spark.apache.org/docs/latest/tuning.html, if you haven't seen it.

Re: Spark app performance

2015-01-01 Thread Raghavendra Pandey
I have seen that link. I am using an RDD of byte arrays with Kryo serialization. When I measure the time inside mapPartition, it is never more than 1 ms, whereas the total time taken by the application is about 30 min. The codebase has a lot of dependencies. I'm trying to come up with a simple version where I can reproduce
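
One possible reason for a gap like this (an assumption, since the actual code was not posted in the thread) is that the function passed to mapPartitions returns a lazy iterator. In that case a timer placed inside the function only measures how long it takes to build the iterator, not how long it takes to process the records. The sketch below shows the effect in plain Python, without Spark. The names expensive and process_partition are hypothetical stand-ins, and the sleep simulates per-record work.

```python
import time


def expensive(record):
    # Hypothetical stand-in for real per-record work.
    time.sleep(0.01)
    return record * 2


def process_partition(records):
    # Mimics a function passed to mapPartitions: it takes an iterator
    # and returns an iterator. The generator expression is lazy, so no
    # record has been processed yet when this function returns.
    start = time.perf_counter()
    result = (expensive(r) for r in records)
    elapsed_inside = time.perf_counter() - start
    return result, elapsed_inside


start = time.perf_counter()
lazy_result, elapsed_inside = process_partition(iter(range(20)))
# The per-record work only runs here, when the iterator is consumed.
output = list(lazy_result)
elapsed_total = time.perf_counter() - start

print(f"measured inside the function: {elapsed_inside * 1000:.3f} ms")
print(f"measured around consumption:  {elapsed_total * 1000:.3f} ms")
```

If this is what is happening, timing the code that consumes the iterator, or looking at per-stage task times in the Spark UI, would give a more representative number than a timer inside the function block.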

Spark app performance

2014-12-30 Thread Raghavendra Pandey
I have a Spark app that involves a series of mapPartition operations followed by a keyBy operation. I have measured the time inside the mapPartition function blocks, and they take trivial time. Still, the application takes far too long, and even the Spark UI shows that much time. So I was wondering where