Avery Ching created GIRAPH-694:
----------------------------------

             Summary: Setting configuration in GiraphConfiguration causes non thread safe copies
                 Key: GIRAPH-694
                 URL: https://issues.apache.org/jira/browse/GIRAPH-694
             Project: Giraph
          Issue Type: Bug
            Reporter: Avery Ching
            Assignee: Avery Ching

When running multithreaded loading, I found a strange problem: all threads would block on one thread that was reading what appeared to be an infinitely large map. The thread everyone was waiting on would be stuck doing the following:

"load-17" prio=10 tid=0x00007f2bac138800 nid=0x6a8e runnable [0x0000000047d7a000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.hash(HashMap.java:351)
	at java.util.HashMap.putForCreate(HashMap.java:512)
	at java.util.HashMap.putAllForCreate(HashMap.java:534)
	at java.util.HashMap.<init>(HashMap.java:320)
	at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:291)
	- locked <0x00007f2f9be162c8> (a org.apache.hadoop.mapred.JobConf)
	at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:402)
	at com.facebook.hiveio.input.HiveApiInputFormat.createRecordReader(HiveApiInputFormat.java:246)
	at org.apache.giraph.hive.input.edge.HiveEdgeInputFormat.createEdgeReader(HiveEdgeInputFormat.java:86)
	at com.facebook.digraph.affinitypropagation.io.hive.ReverseEdgeDuplicatorHiveInputFormat.createEdgeReader(ReverseEdgeDuplicatorHiveInputFormat.java:32)
	at org.apache.giraph.io.internal.WrappedEdgeInputFormat.createEdgeReader(WrappedEdgeInputFormat.java:71)
	at org.apache.giraph.worker.EdgeInputSplitsCallable.readInputSplit(EdgeInputSplitsCallable.java:123)
	at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
	at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)

"load-17" prio=10 tid=0x00007f2bac138800 nid=0x6a8e runnable [0x0000000047d7a000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.putAllForCreate(HashMap.java:533)
	at java.util.HashMap.<init>(HashMap.java:320)
	at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:291)
	- locked <0x00007f2f9be162c8> (a org.apache.hadoop.mapred.JobConf)
	at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:402)
	at com.facebook.hiveio.input.HiveApiInputFormat.createRecordReader(HiveApiInputFormat.java:246)
	at org.apache.giraph.hive.input.edge.HiveEdgeInputFormat.createEdgeReader(HiveEdgeInputFormat.java:86)
	at com.facebook.digraph.affinitypropagation.io.hive.ReverseEdgeDuplicatorHiveInputFormat.createEdgeReader(ReverseEdgeDuplicatorHiveInputFormat.java:32)
	at org.apache.giraph.io.internal.WrappedEdgeInputFormat.createEdgeReader(WrappedEdgeInputFormat.java:71)
	at org.apache.giraph.worker.EdgeInputSplitsCallable.readInputSplit(EdgeInputSplitsCallable.java:123)
	at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
	at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)

This appears to have been caused by an unsynchronized Configuration#set() in one thread racing with a copy of the same configuration in another thread. Configuration#set() is not thread-safe because of the HashMap updatingResource that it modifies. This may or may not be present in other versions of Hadoop.

The solution is simple: we synchronize on the GiraphConfiguration when setting, so the setter and the copy (which already locks the source configuration, as the trace shows) can no longer interleave, and the copy is thread-safe.

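The idea behind the fix can be sketched in isolation. The following is a minimal illustration only, not the actual Giraph patch or Hadoop's Configuration code: SketchConf and SketchConfDemo are hypothetical names, and the single HashMap stands in for the internal state that the copy constructor duplicates. The assumption, taken from the "- locked" line in the trace, is that the copy constructor holds the monitor of the source object, so a setter that takes the same monitor cannot run in the middle of a copy.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a Configuration-like class (not Hadoop or Giraph code). */
class SketchConf {
    // A plain HashMap: copying it while another thread writes to it is unsafe.
    private final Map<String, String> props = new HashMap<>();

    SketchConf() { }

    /** Copy constructor: holds the source's monitor for the duration of the copy. */
    SketchConf(SketchConf other) {
        synchronized (other) {
            props.putAll(other.props);
        }
    }

    /** Setter takes the same monitor, so it cannot interleave with a copy. */
    synchronized void set(String name, String value) {
        props.put(name, value);
    }

    synchronized String get(String name) {
        return props.get(name);
    }

    synchronized int size() {
        return props.size();
    }
}

/** Hypothetical driver: one writer thread sets keys while this thread copies. */
final class SketchConfDemo {
    static int run() {
        SketchConf shared = new SketchConf();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                shared.set("key" + i, "value");
            }
        });
        writer.start();
        // Each copy made here sees a consistent snapshot of the shared map.
        while (writer.isAlive()) {
            new SketchConf(shared);
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // All writes are visible after join(), so the final copy is complete.
        return new SketchConf(shared).size();
    }
}
```

Removing synchronized from set() in this sketch reintroduces the race: a copy could then iterate over the map while it is being resized, which is the kind of situation that can leave a reader spinning as in the traces above.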