Re: How to speed up Spark process

2015-07-14 Thread ๏̯͡๏
genericRecordsAndKeys.persist(StorageLevel.MEMORY_AND_DISK) with 17 as the repartitioning argument throws this exception: 7/13 23:26:36 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Job aborted due to

Re: How to speed up Spark process

2015-07-14 Thread ๏̯͡๏
Any solutions for this exception? org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 1 at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:389) at

Re: How to speed up Spark process

2015-07-13 Thread Aniruddh Sharma
Hi Deepak, I'm not 100% sure, but please try increasing --executor-cores to twice the number of physical cores on your machine. Thanks and Regards, Aniruddh On Tue, Jul 14, 2015 at 9:49 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote: Its been 30 minutes and still the partitioner has not

Re: How to speed up Spark process

2015-07-13 Thread ๏̯͡๏
I reduced the number of partitions to a quarter, to 76, in order to cut the time to a quarter (from 33 to 8). But the repartition is still running beyond 15 minutes. @Nirmal, clicking on details shows the code lines but does not show why it is slow. I know that repartition is slow and want to speed it up

Re: How to speed up Spark process

2015-07-13 Thread Nirmal Fernando
If you click on +details, you can see the code that takes time. Did you already check it? On Tue, Jul 14, 2015 at 9:56 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote: Job view. Others are fast, but the first one (repartition) is taking 95% of job run time. On Mon, Jul 13, 2015 at 9:23 PM,

Re: How to speed up Spark process

2015-07-13 Thread ๏̯͡๏
It's been 30 minutes and the partitioner still has not completed; it's taking forever. Without repartition, I see this error: https://issues.apache.org/jira/browse/SPARK-5928 FetchFailed(BlockManagerId(1, imran-2.ent.cloudera.com, 55028), shuffleId=1, mapId=0, reduceId=0, message=