I'm not sure either of those PRs will fix the concurrent adds to Configuration issue I observed. I've got a stack trace and writeup I'll share in an hour or two (traveling today).
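To make the concern concrete: the problem isn't only the deadlock shown below, but that a Configuration holds mutable internal state, so two tasks adding to one shared instance without agreeing on a single lock can step on each other. Here's a minimal Scala sketch of that hazard, using a simplified ConfLike stand-in made up for illustration - not the actual org.apache.hadoop.conf.Configuration or Spark code:

    import scala.collection.mutable.ArrayBuffer

    // Hypothetical stand-in for a shared, mutable Configuration-like object.
    class ConfLike {
      private val resources = ArrayBuffer[String]()
      def addResource(r: String): Unit = resources += r   // unsynchronized mutation
      def resourceCount: Int = resources.size
    }

    object ConcurrentAddsSketch {
      def main(args: Array[String]): Unit = {
        val shared = new ConfLike
        val threads = (1 to 2).map { i =>
          new Thread(() => (1 to 10000).foreach(n => shared.addResource(s"res-$i-$n")))
        }
        threads.foreach(_.start())
        threads.foreach(_.join())
        // With no common lock, updates can be lost or the buffer left in a bad state;
        // this frequently prints a number smaller than the expected 20000.
        println(s"expected 20000, got ${shared.resourceCount}")
      }
    }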
On Jul 14, 2014 9:50 PM, "scwf" <wangf...@huawei.com> wrote:

> hi, Cody
> I ran into this issue a few days ago and posted a PR for it
> (https://github.com/apache/spark/pull/1385).
> It's very strange: if I synchronize on conf it will deadlock, but it is ok
> when I synchronize on initLocalJobConfFuncOpt.
>
>> Here's the entire jstack output.
>>
>> On Mon, Jul 14, 2014 at 4:44 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>
>> Hey Cody,
>>
>> This jstack seems truncated, would you mind giving the entire stack
>> trace? For the second thread, for instance, we can't see where the
>> lock is being acquired.
>>
>> - Patrick
>>
>> On Mon, Jul 14, 2014 at 1:42 PM, Cody Koeninger
>> <cody.koenin...@mediacrossing.com> wrote:
>> > Hi all, just wanted to give a heads up that we're seeing a reproducible
>> > deadlock with Spark 1.0.1 and Hadoop 2.3.0-mr1-cdh5.0.2.
>> >
>> > If JIRA is a better place for this, apologies in advance - figured talking
>> > about it on the mailing list was friendlier than randomly (re)opening JIRA
>> > tickets.
>> >
>> > I know Gary had mentioned some issues with 1.0.1 on the mailing list; once
>> > we got a thread dump I wanted to follow up.
>> >
>> > The thread dump shows the deadlock occurs in the synchronized block of code
>> > that was changed in HadoopRDD.scala for the SPARK-1097 issue.
>> >
>> > Relevant portions of the thread dump are summarized below; we can provide
>> > the whole dump if it's useful.
>> >
>> > Found one Java-level deadlock:
>> > =============================
>> > "Executor task launch worker-1":
>> >   waiting to lock monitor 0x00007f250400c520 (object 0x00000000fae7dc30, a
>> >   org.apache.hadoop.conf.Configuration),
>> >   which is held by "Executor task launch worker-0"
>> > "Executor task launch worker-0":
>> >   waiting to lock monitor 0x00007f2520495620 (object 0x00000000faeb4fc8, a
>> >   java.lang.Class),
>> >   which is held by "Executor task launch worker-1"
>> >
>> > "Executor task launch worker-1":
>> >   at org.apache.hadoop.conf.Configuration.reloadConfiguration(Configuration.java:791)
>> >   - waiting to lock <0x00000000fae7dc30> (a org.apache.hadoop.conf.Configuration)
>> >   at org.apache.hadoop.conf.Configuration.addDefaultResource(Configuration.java:690)
>> >   - locked <0x00000000faca6ff8> (a java.lang.Class for org.apache.hadoop.conf.Configuration)
>> >   at org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:34)
>> >   at org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:110)
>> >   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> >   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> >   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> >   at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>> >   at java.lang.Class.newInstance0(Class.java:374)
>> >   at java.lang.Class.newInstance(Class.java:327)
>> >   at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
>> >   at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
>> >   at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
>> >   - locked <0x00000000faeb4fc8> (a java.lang.Class for org.apache.hadoop.fs.FileSystem)
>> >   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
>> >   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
>> >   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>> >   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>> >   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>> >   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>> >   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
>> >   at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:587)
>> >   at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:315)
>> >   at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:288)
>> >   at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>> >   at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>> >   at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:145)
>> >
>> > ...elided...
>> >
>> > "Executor task launch worker-0" daemon prio=10 tid=0x0000000001e71800
>> > nid=0x2d97 waiting for monitor entry [0x00007f24d2bf1000]
>> >   java.lang.Thread.State: BLOCKED (on object monitor)
>> >   at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2362)
>> >   - waiting to lock <0x00000000faeb4fc8> (a java.lang.Class for org.apache.hadoop.fs.FileSystem)
>> >   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
>> >   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
>> >   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>> >   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>> >   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>> >   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>> >   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
>> >   at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:587)
>> >   at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:315)
>> >   at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:288)
>> >   at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>> >   at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>> >   at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:145)
>
> --
> Best Regards
> Fei Wang
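For anyone skimming the dump, the cycle boils down to a lock-ordering inversion: worker-0 holds the shared Configuration's monitor (the SPARK-1097 conf.synchronized block in getJobConf) and needs the FileSystem class monitor inside loadFileSystems, while worker-1 holds the FileSystem class monitor and, via HdfsConfiguration's static init and addDefaultResource, needs the monitor of that same Configuration. Below is a minimal Scala sketch of just that ordering, with made-up stand-ins (FileSystemLock, sharedConf) rather than the real Hadoop classes or the actual HadoopRDD.getJobConf code:

    object DeadlockSketch {
      // Stand-in for the org.apache.hadoop.fs.FileSystem class monitor.
      object FileSystemLock
      // Stand-in for the broadcast org.apache.hadoop.conf.Configuration instance.
      val sharedConf = new Object

      // "worker-0" ordering: conf monitor first, then the FileSystem class monitor
      // (getJobConf -> new JobConf(conf) -> setInputPaths -> FileSystem.get -> loadFileSystems).
      def workerZero(): Unit = sharedConf.synchronized {
        FileSystemLock.synchronized { /* loadFileSystems */ }
      }

      // "worker-1" ordering: FileSystem class monitor first, then the conf monitor
      // (loadFileSystems -> HdfsConfiguration.<clinit> -> addDefaultResource -> reloadConfiguration).
      def workerOne(): Unit = FileSystemLock.synchronized {
        sharedConf.synchronized { /* reloadConfiguration */ }
      }

      def main(args: Array[String]): Unit = {
        // With unlucky timing the two threads block each other forever,
        // matching the "Found one Java-level deadlock" report above.
        new Thread(() => workerZero()).start()
        new Thread(() => workerOne()).start()
      }
    }

Having both paths take one consistent lock that is not the shared Configuration breaks the cycle, which may be why synchronizing on initLocalJobConfFuncOpt instead of conf avoided the deadlock in scwf's testing.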