[ 
https://issues.apache.org/jira/browse/SPARK-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957958#comment-13957958
 ] 

Nishkam Ravi commented on SPARK-1097:
-------------------------------------

We could also consider adding a workaround in Spark itself, for non-CDH users
who may be running an older version of Hadoop and not updating it regularly.
For now, this fix needs to go upstream so that we can backport it to CDH. The
CDH-Spark bundle would then inherit the fix. The same issue has also been
reported in HADOOP-10456.
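
As a rough illustration of what such a workaround could look like, here is a
minimal, self-contained sketch. It does not use Hadoop or Spark classes:
SharedConf, safeCopy and safeMarkFinal are hypothetical stand-ins for a
configuration object whose copy constructor iterates a HashSet (as in the
trace below) and for a caller that guards every copy and write with the same
lock. It assumes all writers can be made to take that lock, which may not hold
for code outside Spark's control.

```java
import java.util.HashSet;
import java.util.Set;

public class ConfCopySketch {

    /** Hypothetical stand-in for a mutable, non-thread-safe configuration. */
    static final class SharedConf {
        private final Set<String> finalParams = new HashSet<>();

        SharedConf() {}

        // Copy constructor: new HashSet<>(other) iterates the source set, so it
        // can throw ConcurrentModificationException if another thread mutates
        // `other` while the copy is in progress.
        SharedConf(SharedConf other) {
            this.finalParams.addAll(new HashSet<>(other.finalParams));
        }

        void markFinal(String key) { finalParams.add(key); }
        boolean isFinal(String key) { return finalParams.contains(key); }
        int size() { return finalParams.size(); }
    }

    // Workaround sketch: copies and writes synchronize on the same object,
    // so the iteration inside the copy constructor never overlaps a write.
    static SharedConf safeCopy(SharedConf base) {
        synchronized (base) {
            return new SharedConf(base);
        }
    }

    static void safeMarkFinal(SharedConf base, String key) {
        synchronized (base) {
            base.markFinal(key);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedConf base = new SharedConf();

        // One thread keeps mutating the shared configuration...
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                safeMarkFinal(base, "key" + i);
            }
        });
        writer.start();

        // ...while this thread keeps copying it, as each task would.
        int copies = 0;
        while (writer.isAlive()) {
            safeCopy(base);
            copies++;
        }
        writer.join();

        System.out.println("copies made without error: " + copies
                + ", final size: " + base.size());
    }
}
```

The lock only helps if every code path that mutates the shared object takes
it too, which is one reason the upstream fix is still needed.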

> ConcurrentModificationException
> -------------------------------
>
>                 Key: SPARK-1097
>                 URL: https://issues.apache.org/jira/browse/SPARK-1097
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Fabrizio Milo
>         Attachments: nravi_Conf_Spark-1388.patch
>
>
> {noformat}
> 14/02/16 08:18:45 WARN TaskSetManager: Loss was due to java.util.ConcurrentModificationException
> java.util.ConcurrentModificationException
>       at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
>       at java.util.HashMap$KeyIterator.next(HashMap.java:960)
>       at java.util.AbstractCollection.addAll(AbstractCollection.java:341)
>       at java.util.HashSet.<init>(HashSet.java:117)
>       at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:554)
>       at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:439)
>       at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:110)
>       at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:154)
>       at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:149)
>       at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:64)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>       at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>       at org.apache.spark.rdd.UnionPartition.iterator(UnionRDD.scala:32)
>       at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:72)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>       at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>       at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
>       at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
>       at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
>       at org.apache.spark.scheduler.Task.run(Task.scala:53)
>       at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
>       at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
