[ https://issues.apache.org/jira/browse/GIRAPH-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750724#comment-13750724 ]
Chuan Lei commented on GIRAPH-730:
----------------------------------
From the log output that I added to getTaskResourceMap, all containers in the
thread pool try to call it during the initialization stage. In my case, without
this synchronization patch the Giraph job failed immediately whenever there was
more than one container. If needed, I can attach the log files generated from
the failed runs.
As for buildContainerLaunchContext, it should have the same race condition.
However, it does not try to do anything on HDFS, which is probably why it
doesn't cause any runtime problems.
I wonder whether you could test it with a very simple setup, say one master
with three workers. Without synchronizing getTaskResourceMap, you should be
able to reproduce this problem. Thanks.
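To illustrate the kind of synchronization I mean, here is a minimal standalone sketch. It is not the actual GiraphApplicationMaster code or the attached patch: the class name ResourceMapSketch, the setupRuns counter, the runLaunchers helper, and the placeholder map entries are all made up for illustration. It only assumes that several launcher threads from a thread pool call getTaskResourceMap at the same time and that the map should be built once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ResourceMapSketch {
    // Hypothetical stand-in for the resource map shared by all containers.
    private static Map<String, String> resourceMap;

    // Counts how often the one-time setup runs (illustration only).
    static final AtomicInteger setupRuns = new AtomicInteger();

    // synchronized: only one launcher thread at a time may check and build
    // the map, so the setup work cannot be duplicated or interleaved.
    static synchronized Map<String, String> getTaskResourceMap() {
        if (resourceMap == null) {
            setupRuns.incrementAndGet();
            Map<String, String> built = new HashMap<>();
            // Placeholder entries; a real application master would register
            // its jar and configuration file here.
            built.put("jar", "placeholder-jar-location");
            built.put("conf", "placeholder-conf-location");
            resourceMap = built;
        }
        return resourceMap;
    }

    // Simulates the thread pool: one launcher thread per container.
    // Returns the total number of setup runs observed so far.
    static int runLaunchers(int containers) {
        ExecutorService pool = Executors.newFixedThreadPool(containers);
        for (int i = 0; i < containers; i++) {
            pool.execute(() -> getTaskResourceMap());
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("launchers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return setupRuns.get();
    }

    public static void main(String[] args) {
        // One master plus three workers, as in the simple setup above.
        System.out.println("setup ran " + runLaunchers(4) + " time(s)");
    }
}
```

With the synchronized keyword in place, the setup counter stays at one no matter how many launcher threads start together. Removing the keyword makes the check-then-build sequence racy, so several threads may run the setup concurrently, although a race like that will not necessarily show up on every run.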
> GiraphApplicationMaster race condition in resource loading
> ----------------------------------------------------------
>
> Key: GIRAPH-730
> URL: https://issues.apache.org/jira/browse/GIRAPH-730
> Project: Giraph
> Issue Type: Bug
> Affects Versions: 1.0.0
> Environment: Giraph with Yarn
> Reporter: Chuan Lei
> Assignee: Chuan Lei
> Attachments: GIRAPH-730.v1.patch
>
>
> In GiraphApplicationMaster.java, the getTaskResourceMap function is not
> thread-safe, which causes the application master to fail to distribute the
> resources (jar, configuration file, etc.) to each container.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira