[ 
https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829293#action_12829293
 ] 

ryan rawson commented on HDFS-918:
----------------------------------

I have been thinking about HBase performance in relation to HDFS, and right 
now we bottleneck on reads to a single file. By moving to a re-entrant API 
(pread) we are looking to unleash that parallelism.  I think this is important 
because we want to push as many parallel reads as possible from our clients 
down into the Datanode, and from there into the kernel, to benefit from the IO 
scheduling in the kernel and hardware.
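
To illustrate what a re-entrant positional read buys, here is a minimal sketch. 
It assumes a local file and java.nio's FileChannel.read(ByteBuffer, long) as a 
stand-in for the HDFS client (where the analogous call is the positional read 
on PositionedReadable); the class and method names are hypothetical. Because 
each read carries its own offset and never moves a shared file position, many 
threads can read the same open file at once without seeking or locking.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: concurrent positional reads against one open file. */
final class PreadSketch {

    /** Reads up to len bytes at offset without touching the channel's position. */
    static byte[] pread(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            int n = ch.read(buf, offset + buf.position());
            if (n < 0) {
                break; // reached end of file before len bytes were available
            }
        }
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    /** Issues one pread per offset on a shared channel, using a fixed thread pool. */
    static List<byte[]> parallelReads(Path file, long[] offsets, int len, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (long off : offsets) {
                futures.add(pool.submit(() -> pread(ch, off, len)));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> f : futures) {
                results.add(f.get()); // wait for every read before closing the channel
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass the offsets it needs, for example 
PreadSketch.parallelReads(path, new long[] {0, 1024, 4000}, 128, 3), and get 
the results back in the same order as the offsets.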

This could mean literally dozens of parallel reads per node on a busy cluster. 
Perhaps even hundreds!  Per node. To ensure scalability we'd probably want to 
get away from the xceiver model, for more than one reason.  If I remember 
correctly, xceivers not only consume threads (hundreds of threads is OK but 
not ideal) but also consume epolls, and there are only so many epolls 
available.  So I heartily approve of the direction of this JIRA!
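
For contrast with the thread-per-request xceiver model, here is a minimal 
sketch of the single-Selector-plus-small-pool shape. It is not the attached 
patch: the class name, the toy wire format (an 8-byte request holding an int 
offset and an int length) and the in-memory "block" are all assumptions made 
for illustration. One thread waits on the Selector; when a connection becomes 
readable or writable, its key is parked and handed to a worker, which re-arms 
the key when done. It assumes a JDK where SelectionKey.interestOps may be 
changed from another thread while a select is in progress (Java 11 or later).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: one Selector thread plus a small worker pool serving reads. */
final class MultiplexedSender implements Runnable, AutoCloseable {
    /** Per-connection state, attached to the SelectionKey. */
    private static final class Conn {
        final ByteBuffer request = ByteBuffer.allocate(8); // int offset, int length
        ByteBuffer response;                               // null until a request is parsed
    }

    private final Selector selector;
    private final ServerSocketChannel server;
    private final ExecutorService workers;
    private final byte[] block; // in-memory stand-in for a block file

    MultiplexedSender(byte[] block, int workerThreads) throws IOException {
        this.block = block;
        this.workers = Executors.newFixedThreadPool(workerThreads);
        this.selector = Selector.open();
        this.server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        try {
            while (true) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel ch = server.accept();
                        if (ch == null) continue;
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ, new Conn());
                    } else {
                        key.interestOps(0); // park the key while a worker owns it
                        workers.execute(() -> service(key));
                    }
                }
            }
        } catch (IOException | RuntimeException e) {
            // Closed selector or rejected task: treated as shutdown in this sketch.
        }
    }

    /** Runs on a worker: advance one connection as far as it will go without blocking. */
    private void service(SelectionKey key) {
        SocketChannel ch = (SocketChannel) key.channel();
        Conn c = (Conn) key.attachment();
        try {
            if (c.response == null) {
                if (ch.read(c.request) < 0) { ch.close(); return; } // peer hung up
                if (c.request.hasRemaining()) { rearm(key, SelectionKey.OP_READ); return; }
                c.request.flip();
                int offset = c.request.getInt();
                int length = c.request.getInt();
                c.response = ByteBuffer.wrap(block, offset, length);
            }
            ch.write(c.response); // non-blocking: may send only part of the data
            if (c.response.hasRemaining()) {
                rearm(key, SelectionKey.OP_WRITE);
            } else {
                c.response = null;
                c.request.clear();
                rearm(key, SelectionKey.OP_READ);
            }
        } catch (IOException | RuntimeException e) {
            try { ch.close(); } catch (IOException ignored) { } // bad request or I/O error
        }
    }

    private void rearm(SelectionKey key, int ops) {
        key.interestOps(ops);
        selector.wakeup(); // make the selector thread pick up the new interest set
    }

    @Override
    public void close() throws IOException {
        selector.close();
        server.close();
        workers.shutdownNow();
    }
}
```

The point of the shape is that the thread count is fixed by the pool size 
rather than by the number of open reads, and only one Selector is held no 
matter how many clients connect.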



> Use single Selector and small thread pool to replace many instances of 
> BlockSender for reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-918-20100201.patch, hdfs-918-20100203.patch, 
> hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXceiver server allocates a new thread 
> per request, which must allocate its own buffers and leads to 
> higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
> single selector and a small threadpool to multiplex request packets, we could 
> theoretically achieve higher performance while taking up fewer resources and 
> leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
> can be done without changing any wire protocols.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
