[ https://issues.apache.org/jira/browse/HADOOP-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875675#action_12875675 ]
sam rash commented on HADOOP-6762:
----------------------------------
I almost have an updated patch. I loved the idea of syncing on
Connection.this.out around submit + wait, but this causes a deadlock because
the out.write() call inside is a synchronized method. I instead sync'd on the
connection object itself, which seems safe and gives the same result: with a
1:1 Connection/socket mapping, it bounds the number of threads by the number
of connections.
(Really, a very clean and clever idea to optimize thread use.)
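For readers following along, here is a rough sketch of the locking pattern described above. It is not the code from the patch: SketchConnection, SEND_EXECUTOR and send() are hypothetical names, and the error handling is an assumption. The caller holds the connection's monitor while it submits the write to an executor and waits for it, so each connection has at most one write in flight. The worker thread never needs the connection's monitor, only the stream's, which is why locking on the connection avoids the deadlock seen when locking on the stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the RPC client's Connection class.
class SketchConnection {
    // Shared executor (assumed). Daemon threads so it never blocks JVM exit.
    private static final ExecutorService SEND_EXECUTOR =
        Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "sketch-send");
            t.setDaemon(true);
            return t;
        });

    private final OutputStream out;

    SketchConnection(OutputStream out) {
        this.out = out;
    }

    void send(byte[] payload) throws IOException {
        // Lock on the connection, not on 'out': the worker below may need
        // out's monitor if write() is synchronized, and the caller would
        // otherwise hold that monitor while waiting for the worker.
        synchronized (this) {
            Future<?> pending = SEND_EXECUTOR.submit(() -> {
                // Runs on an executor thread, so an interrupt delivered to
                // the calling thread does not hit the I/O call directly.
                out.write(payload);
                out.flush();
                return null;
            });
            try {
                pending.get(); // wait while still holding the connection lock
            } catch (ExecutionException e) {
                Throwable cause = e.getCause();
                if (cause instanceof IOException) {
                    throw (IOException) cause;
                }
                throw new IOException(cause);
            } catch (InterruptedException e) {
                // Restore the flag; the submitted write may still complete.
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for send", e);
            }
        }
    }
}
```

With a java.io.ByteArrayOutputStream as the sink (its three-argument write is synchronized), send() completes normally under this scheme. Moving the synchronized block onto the stream would block the worker for any stream whose write path takes the stream's own monitor.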
One catch with the static executor instance: nothing ever shuts it down.
While this isn't an issue for actual execution, it won't cause unit tests to
hang, will it? (I made a thread factory that makes the threads daemons for
this purpose.)
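A daemon thread factory of the kind mentioned above could look roughly like this. It is only an illustration, not the factory from the patch: DaemonThreadFactory, SendExecutorHolder, SEND_EXECUTOR and the "ipc-send" prefix are made-up names. Because every thread is marked as a daemon, a static executor that is never shut down should not keep the JVM (or a test runner) alive once the non-daemon threads finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: names each thread and marks it as a daemon.
class DaemonThreadFactory implements ThreadFactory {
    private final AtomicInteger counter = new AtomicInteger();
    private final String prefix;

    DaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.incrementAndGet());
        // Daemon threads do not prevent the JVM from exiting.
        t.setDaemon(true);
        return t;
    }
}

// Hypothetical holder for the static executor discussed above.
class SendExecutorHolder {
    static final ExecutorService SEND_EXECUTOR =
        Executors.newCachedThreadPool(new DaemonThreadFactory("ipc-send"));
}
```

Idle threads in a cached thread pool are also reclaimed after the default 60-second keep-alive, so the pool shrinks when connections go quiet.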
Patch coming after I run a larger set of unit tests.
> exception while doing RPC I/O closes channel
> --------------------------------------------
>
> Key: HADOOP-6762
> URL: https://issues.apache.org/jira/browse/HADOOP-6762
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 0.20.2
> Reporter: sam rash
> Assignee: sam rash
> Attachments: hadoop-6762-1.txt, hadoop-6762-2.txt, hadoop-6762-3.txt,
> hadoop-6762-4.txt, hadoop-6762-6.txt
>
>
> If a single process creates two unique FileSystem instances to the same NN
> using FileSystem.newInstance(), and one of them issues a close(), the
> leasechecker thread is interrupted. This interrupt races with the RPC
> namenode.renew() and can cause a ClosedByInterruptException. This closes the
> underlying channel, and the other filesystem, which shares the connection,
> will get errors.