[
https://issues.apache.org/jira/browse/HADOOP-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875424#action_12875424
]
sam rash commented on HADOOP-6762:
----------------------------------
thanks for reviewing the patch so quickly.
1. latch : I agree. I'll change it to use a future & get (with timeout, see
#2).
2. actually, the timeout helped me debug a switch issue today. I would see a
flurry of timeouts when I saturated a switch that wasn't performing to spec.
getting the timeouts was far preferable to hanging indefinitely. I agree it
changes the behavior, but I think it's the same timeout we use for pings. Also,
playing devil's advocate, *if* somehow the connection thread in the executor
did die, the latch would hang indefinitely.
perhaps the timeout value should be something else? I chose the timeout value
that was used to connect the socket (pingInterval) as it seemed appropriate.
3. ah, I misread your comment above--that's a great idea. Only the actual push
out to the socket needs to be in the critical section (in theory this could
improve perf a tiny bit).
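To make points 1-3 concrete, here is a rough sketch of the shape I have in mind. It is not the code from the attached patches: the class name SendParamSketch, the send() method, and the timeoutMs field are made up for illustration, and a plain OutputStream stands in for the shared socket stream. The write runs on an executor thread, only the write itself is inside the synchronized block, and the caller waits on the future with a timeout (pingInterval would be one candidate value).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch only; names and structure are assumptions, not the patch.
class SendParamSketch {
    // Single daemon thread that performs the actual I/O, so an interrupt
    // delivered to the calling thread never reaches the thread doing the write.
    private final ExecutorService sender = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "rpc-sender");
        t.setDaemon(true);
        return t;
    });

    private final OutputStream out;  // stands in for the shared socket stream
    private final long timeoutMs;    // assumed: something like pingInterval

    SendParamSketch(OutputStream out, long timeoutMs) {
        this.out = out;
        this.timeoutMs = timeoutMs;
    }

    void send(byte[] param) throws IOException, InterruptedException {
        // Preparing the bytes happens outside the lock.
        final byte[] frame = param.clone();

        // Returning null makes this lambda a Callable, so it may throw IOException.
        Future<?> pending = sender.submit(() -> {
            synchronized (out) {     // critical section: only the push out the socket
                out.write(frame);
                out.flush();
            }
            return null;
        });

        try {
            // Bounded wait instead of a latch that could hang forever.
            pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IOException("send timed out after " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            throw new IOException("send failed", e.getCause());
        }
    }
}
```

If the caller is interrupted while waiting in get(), it sees an InterruptedException, but the sender thread is not interrupted. The sketch deliberately does not call cancel(true) on the future, since that would interrupt the sender thread.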
I'll get to the changes and post another patch.
> exception while doing RPC I/O closes channel
> --------------------------------------------
>
> Key: HADOOP-6762
> URL: https://issues.apache.org/jira/browse/HADOOP-6762
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 0.20.2
> Reporter: sam rash
> Assignee: sam rash
> Attachments: hadoop-6762-1.txt, hadoop-6762-2.txt, hadoop-6762-3.txt,
> hadoop-6762-4.txt, hadoop-6762-6.txt
>
>
> If a single process creates two unique fileSystems to the same NN using
> FileSystem.newInstance(), and one of them issues a close(), the leasechecker
> thread is interrupted. This interrupt races with the rpc namenode.renew()
> and can cause a ClosedByInterruptException. This closes the underlying
> channel, and the other filesystem, which shares the connection, will get errors.