[ https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844596#action_12844596 ]

Jay Booth commented on HDFS-918:
--------------------------------

bq. I think it is very important to have separate pools for each partition.
Otherwise, each disk will be accessed only as much as the slowest disk (when the
DN has enough load).

This would be the case if I were using a fixed-size thread pool and a 
LinkedBlockingQueue -- but I'm not. See Executors.newCachedThreadPool(): it's 
actually bounded at Integer.MAX_VALUE threads and uses a SynchronousQueue.  If 
a new thread is needed in order to start work on a task immediately, it's 
created; otherwise, an existing waiting thread is re-used.  (Threads are 
purged once they've been idle for 60 seconds.)  Either way, the underlying I/O 
request is dispatched pretty much immediately after the connection becomes 
writable.  So I don't see why separate pools per partition would help anything: 
the operating system will handle I/O requests as it can and put threads into 
runnable state as it can, regardless of which pool they're in.
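For reference, here's a minimal, self-contained sketch (not from the patch; the class name and task count are invented for illustration) of the newCachedThreadPool() behavior described above -- a SynchronousQueue never holds a task, so a burst of simultaneous blocking tasks spawns one thread each rather than queueing behind a busy worker:

```java
import java.util.concurrent.*;

public class CachedPoolDemo {
    public static void main(String[] args) throws Exception {
        // newCachedThreadPool() is configured with 0 core threads, a
        // maximum of Integer.MAX_VALUE, a 60-second idle timeout, and a
        // SynchronousQueue (direct handoff, no task queueing).
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        System.out.println("max=" + pool.getMaximumPoolSize());
        System.out.println("keepAliveSec=" + pool.getKeepAliveTime(TimeUnit.SECONDS));
        System.out.println("queue=" + pool.getQueue().getClass().getSimpleName());

        // Submit 8 tasks that all block on the same latch. Because the
        // SynchronousQueue has no capacity, each execute() that finds no
        // idle worker creates a new thread immediately -- no task ever
        // waits behind a slow one.
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> {
                try { release.await(); } catch (InterruptedException ignored) {}
            });
        }
        System.out.println("threadsDuringBurst=" + pool.getPoolSize());

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The point being: with direct handoff, a request for a fast disk is never stuck in a queue behind a request for a slow disk, which is why a single pool doesn't degrade to the slowest partition.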

RE: Netty, I'm not very knowledgeable about it beyond the Cliff's Notes 
version, but my code dealing with the Selector is pretty small -- the main loop 
is under 75 lines, and java.util.concurrent does most of the heavy lifting.  
Most of the code deals with application and protocol specifics.  So my 
instinct in general is that adding a framework may actually increase the amount 
of code, especially if there are any mismatches between what we're doing and 
what it wants us to do (the packet-header, sums data, main data format is pretty 
specific to us).  Plus, as Todd said, we can't really change the blocking I/O 
nature of the main accept() loop in DataXceiverServer without this becoming a 
much bigger patch, although I agree that we should go there in general.  That 
being said, better is better, so if a Netty implementation took up fewer lines 
of code and performed better, that would speak for itself.
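To make the "main loop is under 75 lines" point concrete, here's a toy-scale sketch of the same pattern -- a single Selector with a select/iterate/dispatch loop. This is not the patch's code; the class name, port handling, and "pong" payload are invented for illustration, and real dispatch would hand work to the thread pool rather than write inline:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class MiniSelectorLoop {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));  // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // A throwaway client that connects and reads the 4-byte reply.
        Thread client = new Thread(() -> {
            try (SocketChannel ch = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                ByteBuffer buf = ByteBuffer.allocate(4);
                while (buf.hasRemaining() && ch.read(buf) != -1) { }
                buf.flip();
                System.out.println(StandardCharsets.UTF_8.decode(buf));
            } catch (IOException e) { throw new RuntimeException(e); }
        });
        client.start();

        // The main loop: block in select(), then walk the ready keys.
        boolean done = false;
        while (!done) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    // New connection: register interest in writability.
                    SocketChannel conn = server.accept();
                    conn.configureBlocking(false);
                    conn.register(selector, SelectionKey.OP_WRITE);
                } else if (key.isWritable()) {
                    // Connection is writable: send and clean up. In the
                    // real patch this is where work gets dispatched.
                    SocketChannel conn = (SocketChannel) key.channel();
                    conn.write(ByteBuffer.wrap("pong".getBytes(StandardCharsets.UTF_8)));
                    key.cancel();
                    conn.close();
                    done = true;
                }
            }
        }
        client.join();
        server.close();
        selector.close();
    }
}
```

The skeleton really is small; the bulk of any real implementation ends up in the protocol-specific handling, which is the argument above about a framework not necessarily saving lines.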

> Use single Selector and small thread pool to replace many instances of 
> BlockSender for reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-918-20100201.patch, hdfs-918-20100203.patch, 
> hdfs-918-20100211.patch, hdfs-918-20100228.patch, hdfs-918-20100309.patch, 
> hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXceiverServer allocates a new thread 
> per request, which must allocate its own buffers and leads to 
> higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
> single selector and a small threadpool to multiplex request packets, we could 
> theoretically achieve higher performance while taking up fewer resources and 
> leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
> can be done without changing any wire protocols.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
