[ 
https://issues.apache.org/jira/browse/HADOOP-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861108#action_12861108
 ] 

Hairong Kuang commented on HADOOP-6713:
---------------------------------------

This is a great idea! Separating "accept" from "read" should also greatly 
reduce the Connection reset errors observed at the client when NameNode is 
busy. Dhruba asked me to review this patch. So here are a few comments:

1. Please remove the System.out.println or change it to be a log statement;
2. Listener#run() should drop the else branch that calls doRead();
3. Now that accept is done in a separate thread, doAccept() should accept as 
many connections as possible (not limited to 10 as in the trunk). Another 
option is to use a blocking accept channel.
4. Optional: the synchronization between the listener thread & the reader 
threads is very interesting. It took me a while to figure out that it works, 
but it seems to me that the code is hard to understand and maintain. Another 
option is that each reader thread maintains a queue of pending registration 
channels. After choosing a reader, the listener thread simply adds the accepted 
channel to that reader's pending queue and then wakes up the reader thread. 
Each reader thread registers all of its pending channels before select().
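
To make the alternative in comment 4 concrete, here is a rough sketch of a 
reader that owns a pending queue. It is not taken from the patch: the class 
name PendingQueueReader and the methods addPending(), registerPending(), 
pendingCount(), registeredCount() and shutdown() are hypothetical, and doRead() 
here only drains and discards bytes where the real server would parse RPC 
calls. The listener thread calls addPending() after accept(); only the reader 
thread touches the selector's registrations.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: all names are illustrative, not from HADOOP-6713.patch.
public class PendingQueueReader implements Runnable {
    private final Selector selector;
    // Channels accepted by the listener but not yet registered by this reader.
    private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();
    private final ByteBuffer scratch = ByteBuffer.allocate(4096);
    private volatile boolean running = true;

    public PendingQueueReader() throws IOException {
        this.selector = Selector.open();
    }

    // Called by the listener thread: enqueue, then wake the reader so that
    // it leaves select() and picks the channel up on its next loop pass.
    public void addPending(SocketChannel channel) {
        pending.add(channel);
        selector.wakeup();
    }

    // Called by the reader thread only: register every queued channel.
    public int registerPending() throws IOException {
        int registered = 0;
        SocketChannel channel;
        while ((channel = pending.poll()) != null) {
            channel.configureBlocking(false);
            channel.register(selector, SelectionKey.OP_READ);
            registered++;
        }
        return registered;
    }

    public int pendingCount() { return pending.size(); }

    public int registeredCount() { return selector.keys().size(); }

    public void shutdown() {
        running = false;
        selector.wakeup();
    }

    @Override
    public void run() {
        try {
            while (running) {
                registerPending();   // drain the queue before blocking
                selector.select();   // returns early if wakeup() was called
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isValid() && key.isReadable()) {
                        doRead(key);
                    }
                }
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            // A real server would log this; the sketch just stops the loop.
        } finally {
            try { selector.close(); } catch (IOException ignored) { }
        }
    }

    // Placeholder: reads and discards bytes, closes the channel on EOF/error.
    private void doRead(SelectionKey key) {
        SocketChannel channel = (SocketChannel) key.channel();
        try {
            scratch.clear();
            if (channel.read(scratch) < 0) {
                channel.close();
            }
        } catch (IOException e) {
            try { channel.close(); } catch (IOException ignored) { }
        }
    }
}
```

Because wakeup() makes the next select() return immediately even if it is 
called just before the reader blocks, a channel queued between 
registerPending() and select() is not left waiting, and no lock is needed 
around the selector.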

> The RPC server Listener thread is a scalability bottleneck
> ----------------------------------------------------------
>
>                 Key: HADOOP-6713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6713
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.21.0
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HADOOP-6713.patch
>
>
> The Hadoop RPC Server implementation has a single Listener thread that reads 
> data from the socket and puts them into a call queue. This means that this 
> single thread can pull RPC requests off the network only as fast as a single 
> CPU can execute. This is a scalability bottleneck in our cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
