[ 
https://issues.apache.org/jira/browse/HADOOP-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875428#action_12875428
 ] 

sam rash commented on HADOOP-6762:
----------------------------------

forgot to comment on static cachedThreadPool:

This concerns me: if we don't bound the size of the pool, we could get a massive 
number of threads; if we do bound it, then it seems to me some form of deadlock 
is possible where one RPC depends (indirectly) on another and, due to the 
limited threads, can't complete.  Basically we would want at least one 
thread per Connection, but no more (which is what we have now).

We have seen one case of distributed deadlock here on the IPC workers in the 
DN, so this isn't 100% theory.
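The deadlock shape described above can be sketched in a few lines. This is a 
hypothetical illustration only, not Hadoop code: the class name, pool sizes, 
and timeout are all made up. An "outer" task stands in for one RPC that 
(indirectly) depends on an "inner" one submitted to the same bounded pool:

```java
import java.util.concurrent.*;

public class BoundedPoolDeadlock {
    // Returns "completed" if the dependent task finishes, or
    // "deadlocked" if it starves waiting for a free thread.
    static String demo(int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // The outer task occupies a worker thread, then blocks on an
            // inner task submitted to the same bounded pool.
            Future<String> outer = pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner done");
                return inner.get(); // with one thread, this never returns
            });
            try {
                outer.get(500, TimeUnit.MILLISECONDS);
                return "completed";
            } catch (TimeoutException e) {
                return "deadlocked";
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 thread: " + demo(1));  // deadlocked
        System.out.println("2 threads: " + demo(2)); // completed
    }
}
```

With a pool of one thread the inner task can never be scheduled, so the outer 
task waits forever; with two threads it completes. One dedicated thread per 
Connection sidesteps this by construction, at the cost of an extra thread.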

While it is an extra resource, for simplicity and safety I *do* prefer one 
thread per Connection.

What do you think?



> exception while doing RPC I/O closes channel
> --------------------------------------------
>
>                 Key: HADOOP-6762
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6762
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: sam rash
>            Assignee: sam rash
>         Attachments: hadoop-6762-1.txt, hadoop-6762-2.txt, hadoop-6762-3.txt, 
> hadoop-6762-4.txt, hadoop-6762-6.txt
>
>
> If a single process creates two unique fileSystems to the same NN using 
> FileSystem.newInstance(), and one of them issues a close(), the leasechecker 
> thread is interrupted.  This interrupt races with the rpc namenode.renew() 
> and can cause a ClosedByInterruptException.  This closes the underlying 
> channel, and the other filesystem, which shares the connection, will get errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.