stack wrote:
Doug Cutting wrote:
RPC also tears down idle connections, which HDFS does not. I wonder
how much doing that alone might help your case? That would probably
be much simpler to implement. Both client and server must already
handle connection failures, so it shouldn't be too great of a change
to have one or both sides actively close things down if they're idle
for more than a few seconds.
If we added teardown of idle sockets, that would work for us and, as you
suggest, should be easier to do than rewriting the client to use async
I/O. Currently, when random reading, it's probably rare that the currently
open HDFS block contains the wanted offset, so a teardown of the
current socket and an open of a new one is being done anyway.
HADOOP-2346 helps with the Datanode side of the problem. We still need
DFSClient to clean up idle connections (otherwise these sockets will
stay in CLOSE_WAIT state on the client). This would require an extra
thread on the client to clean up these connections. You could file a JIRA
for it.
Raghu.
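
For illustration only, the "extra thread on the client" idea could be sketched roughly as below. This is not DFSClient code: the class IdleConnectionReaper and its methods touch, forget and closeIdle are hypothetical names, and the idle limit is whatever the caller chooses. It assumes, as noted earlier in the thread, that the reading code already copes with a connection that has been closed underneath it.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of Hadoop): closes connections that sit idle too long. */
class IdleConnectionReaper implements Runnable {
    // Connection -> time of last use, in milliseconds since the epoch.
    private final Map<Closeable, Long> lastUsedMillis = new ConcurrentHashMap<>();
    private final long maxIdleMillis;

    IdleConnectionReaper(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Record that a connection was just used (call on open and on each read). */
    void touch(Closeable conn) {
        lastUsedMillis.put(conn, System.currentTimeMillis());
    }

    /** Stop tracking a connection the caller has closed itself. */
    void forget(Closeable conn) {
        lastUsedMillis.remove(conn);
    }

    /** Close every tracked connection idle for longer than the limit; returns the count closed. */
    int closeIdle(long nowMillis) {
        int closed = 0;
        for (Map.Entry<Closeable, Long> e : lastUsedMillis.entrySet()) {
            if (nowMillis - e.getValue() > maxIdleMillis) {
                // remove(key, value) fails if the connection was touched in the meantime,
                // in which case it is no longer idle and is left open.
                if (lastUsedMillis.remove(e.getKey(), e.getValue())) {
                    try {
                        e.getKey().close();
                    } catch (IOException ignored) {
                        // Best effort: the connection is being discarded anyway.
                    }
                    closed++;
                }
            }
        }
        return closed;
    }

    /** Body of the cleanup thread: sweep, sleep, repeat until interrupted. */
    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            closeIdle(System.currentTimeMillis());
            try {
                Thread.sleep(Math.max(1L, maxIdleMillis));
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

A client would start this on a daemon thread, call touch(socket) whenever it reads from a java.net.Socket, and call forget(socket) when it closes one itself. A reader can still lose a race with the sweep and find its socket closed, which is why the sketch relies on the existing connection-failure handling.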