[ http://issues.apache.org/jira/browse/HADOOP-642?page=all ]
Konstantin Shvachko updated HADOOP-642:
---------------------------------------
Attachment: IpcClientTimeout.patch
> Explicit timeout for ipc.Client
> -------------------------------
>
> Key: HADOOP-642
> URL: http://issues.apache.org/jira/browse/HADOOP-642
> Project: Hadoop
> Issue Type: Bug
> Affects Versions: 0.7.2
> Reporter: Konstantin Shvachko
> Attachments: IpcClientTimeout.patch
>
>
> This bug contributed to the crash discussed in HADOOP-572.
> ipc.Client tries to establish a connection to its server with an infinite
> timeout.
> For a reason unknown to me, "infinity" amounts to 3 minutes in this case.
> I guess this limit is configured somewhere in the native socket
> implementation.
> With this timeout, data-nodes had only 3 chances to send a heartbeat during
> the 10-minute expiration interval. With a very busy name-node, this makes
> their chance of being accepted close to 0.
> I added an explicit call to Socket.connect() with a timeout of 1 minute,
> which is our default for all connections.
> Modified a log message to include information that turned out to be useful
> for debugging.
> Removed unnecessary imports.
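
For readers without the patch at hand, the idea of an explicit connect timeout can be sketched roughly as follows. This is not the content of IpcClientTimeout.patch: the class name, the helper methods and the constant are hypothetical, and only java.net.Socket.connect(SocketAddress, int) is the standard JDK call. The maxAttempts helper merely restates the arithmetic from the description, assuming every attempt blocks for the full timeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical illustration; not the code from the attached patch.
class ConnectTimeoutSketch {
    // Assumed value: the 1-minute default mentioned in the description.
    static final int CONNECT_TIMEOUT_MS = 60 * 1000;

    // Opens a socket and bounds the connection attempt by timeoutMs.
    // A timeout of 0 means "wait indefinitely" in the JDK API, which
    // leaves the effective limit to the underlying platform.
    static Socket connect(String host, int port, int timeoutMs) throws IOException {
        Socket socket = new Socket(); // created unconnected
        try {
            // Throws SocketTimeoutException (an IOException) if the
            // timeout expires before the connection is established.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the descriptor on failure
            throw e;
        }
    }

    // Upper bound on back-to-back attempts that fit into the expiration
    // interval when each attempt blocks for the full timeout.
    static long maxAttempts(long expiryMs, long timeoutMs) {
        return expiryMs / timeoutMs;
    }

    public static void main(String[] args) {
        long expiryMs = 10 * 60 * 1000L; // 10-minute expiration interval
        System.out.println("3-minute timeout: " + maxAttempts(expiryMs, 3 * 60 * 1000L) + " attempts");
        System.out.println("1-minute timeout: " + maxAttempts(expiryMs, CONNECT_TIMEOUT_MS) + " attempts");
    }
}
```

Under these assumptions, a 1-minute timeout leaves room for up to 10 attempts within the expiration interval instead of 3. A caller would use something like connect(host, port, CONNECT_TIMEOUT_MS) and treat a SocketTimeoutException as a failed attempt to be retried.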