[ https://issues.apache.org/jira/browse/SPARK-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957231#comment-13957231 ]
Nishkam Ravi commented on SPARK-1388: ------------------------------------- Here is a simple fix for this issue (patch attached). Verified with mvn compile, mvn test and mvn install. This issue may be identical to SPARK-1097. > ConcurrentModificationException in hadoop_common exposed by Spark > ----------------------------------------------------------------- > > Key: SPARK-1388 > URL: https://issues.apache.org/jira/browse/SPARK-1388 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 0.9.0 > Reporter: Nishkam Ravi > Attachments: nravi_Conf_Spark-1388.patch > > > The following exception occurs non-deterministically: > java.util.ConcurrentModificationException > at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926) > at java.util.HashMap$KeyIterator.next(HashMap.java:960) > at java.util.AbstractCollection.addAll(AbstractCollection.java:341) > at java.util.HashSet.<init>(HashSet.java:117) > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:671) > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:439) > at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:110) > at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:154) > at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:149) > at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:64) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:232) > at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:232) > at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:232) > at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:232) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:232) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102) > at org.apache.spark.scheduler.Task.run(Task.scala:53) > at > org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213) > at > org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42) > at > org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at > org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.2#6252)