genericRecordsAndKeys.persist(StorageLevel.MEMORY_AND_DISK) with 17 as
repartitioning argument is throwing this exception:
7/13 23:26:36 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 15, (reason: User class threw exception:
org.apache.spark.SparkException: Job aborted due to
Any suggestions for resolving this exception?
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an
output location for shuffle 1
at
org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:389)
at
Hi Deepak
Not 100% sure, but please try increasing --executor-cores to twice the
number of physical cores on your machine.
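
As a rough sketch only, assuming a machine with 4 physical cores (so 8
executor cores); the class name and JAR below are placeholders, not taken
from your job:

```shell
# Hypothetical example: 4 physical cores assumed, so --executor-cores is set to 8.
# com.example.MyJob and my-job.jar are placeholder names.
spark-submit \
  --master yarn \
  --executor-cores 8 \
  --class com.example.MyJob \
  my-job.jar
```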
Thanks and Regards
Aniruddh
On Tue, Jul 14, 2015 at 9:49 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
It's been 30 minutes and still the partitioner has not
I reduced the number of partitions to a quarter (to 76) in order to cut the
time to a quarter (from 33 to 8), but the repartition is still running
after more than 15 minutes.
@Nirmal
Clicking on details shows the code lines but does not show why it is slow. I
know that repartition is slow and I want to speed it up.
If you click on +details, you can see the code that is taking the time. Have
you already checked it?
On Tue, Jul 14, 2015 at 9:56 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
Job view. Others are fast, but the first one (repartition) is taking 95%
of job run time.
On Mon, Jul 13, 2015 at 9:23 PM,
It's been 30 minutes and still the partitioner has not completed yet, its
ever.
Without repartition, I see this error:
https://issues.apache.org/jira/browse/SPARK-5928
FetchFailed(BlockManagerId(1, imran-2.ent.cloudera.com, 55028),
shuffleId=1, mapId=0, reduceId=0, message=