[
https://issues.apache.org/jira/browse/GIRAPH-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014247#comment-14014247
]
Jaeho Shin commented on GIRAPH-904:
-----------------------------------
The exact place where Giraph hangs is
{{BspServiceMaster#barrierOnWorkerList()}}. It keeps printing log lines
starting with {{"barrierOnWorkerList: Waiting on "}}, listing hostnames that
have definitely finished. For example, when launched with a single worker, it
waits for itself after finishing the loading.
On a second look, I think a more precise fix would be to do the lowercase
normalization in that function when computing the {{hostnameIdList}}, rather
than in {{GiraphConfiguration#getLocalHostname()}}. The case
difference between the entries of that list and the hostname returned by
{{TaskInfo#getHostnameId()}} is causing the set difference computation in the
while loop to fail.
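To illustrate the failure mode, here is a minimal, self-contained sketch of a case-sensitive set difference that never becomes empty, next to a variant that lowercases both sides first. It is not the actual Giraph code: the class name, the method names, and the {{_0}} suffix on the hostname IDs are assumptions made for the example, and which side carries the uppercase letters in a real job may be the other way around.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the barrier's bookkeeping; not Giraph's real classes.
public class HostnameBarrierSketch {

    /** Expected IDs that have not reported yet, compared as exact strings. */
    static Set<String> pendingExact(List<String> expected, List<String> finished) {
        Set<String> pending = new TreeSet<>(expected);
        pending.removeAll(finished); // "Foo" and "foo" are different elements
        return pending;
    }

    /** Same difference, but with both sides lowercased before comparing. */
    static Set<String> pendingNormalized(List<String> expected, List<String> finished) {
        Set<String> pending = new TreeSet<>();
        for (String id : expected) {
            // Locale.ROOT avoids locale-specific case rules (e.g. Turkish dotless i).
            pending.add(id.toLowerCase(Locale.ROOT));
        }
        for (String id : finished) {
            pending.remove(id.toLowerCase(Locale.ROOT));
        }
        return pending;
    }

    public static void main(String[] args) {
        // Assumed ID format: hostname plus a task suffix.
        List<String> expected = Arrays.asList("foo.Stanford.EDU_0");
        List<String> finished = Arrays.asList("foo.stanford.edu_0");

        // Never empties, so a loop waiting on it would spin forever.
        System.out.println(pendingExact(expected, finished));      // [foo.Stanford.EDU_0]
        // Empties as soon as the single worker reports.
        System.out.println(pendingNormalized(expected, finished)); // []
    }
}
```

Normalizing at the point of comparison, as in {{pendingNormalized}}, covers every source of hostname strings at once, which is the reason for preferring it over normalizing a single getter.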
> Giraph can hang when hostnames include uppercase letters
> --------------------------------------------------------
>
> Key: GIRAPH-904
> URL: https://issues.apache.org/jira/browse/GIRAPH-904
> Project: Giraph
> Issue Type: Bug
> Components: bsp, conf and scripts, zookeeper
> Affects Versions: 1.1.0
> Reporter: Jaeho Shin
> Fix For: 1.1.0
>
> Attachments: GIRAPH-904.patch
>
>
> We found that Giraph jobs were consistently hanging if uppercase letters were
> included in the DNS (or /etc/hosts) resolved hostnames ({{foo.stanford.edu}}
> vs. {{foo.Stanford.EDU}} from our DNS). Normalizing the hostnames to lower
> case from {{GiraphConfiguration#getLocalHostname()}} fixed our problem.