[ 
https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Booth updated HDFS-918:
---------------------------

    Attachment: hdfs-multiplex.patch

Here's a first implementation.  It works and passes TestDistributedFileSystem, 
TestDataTransferProtocol and TestPread.  However, it has a direct dependency on 
FSDataset (not FSDatasetInterface) because it needs to get hold of files 
directly to open FileChannels.  This leads to ClassCastExceptions in all tests 
that rely on SimulatedFSDataset.  I would love to hear feedback on a way to 
resolve this.
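
To make the failure mode concrete, here is a minimal sketch of the kind of
downcast that causes it.  Every type below is a stand-in written for this
illustration; none of them are the actual HDFS classes, and the method names
are assumptions.

```java
import java.io.File;

// Stand-in for the dataset abstraction (illustrative, not the HDFS interface).
interface DatasetInterface {
    long getLength(String blockId);
}

// Stand-in for the file-backed implementation, which can hand out real files.
class FileBackedDataset implements DatasetInterface {
    public long getLength(String blockId) { return 0L; }
    File getBlockFile(String blockId) { return new File("/tmp/" + blockId); }
}

// Stand-in for the simulated implementation used by tests; it has no files.
class SimulatedDataset implements DatasetInterface {
    public long getLength(String blockId) { return 0L; }
}

class CastDemo {
    // The reader only receives the interface, but needs a File to open a
    // FileChannel, so it downcasts. This throws ClassCastException whenever
    // the dataset is not the file-backed implementation.
    static File blockFileFor(DatasetInterface data, String blockId) {
        return ((FileBackedDataset) data).getBlockFile(blockId);
    }
}
```

Calling blockFileFor with a FileBackedDataset returns a File, while passing a
SimulatedDataset fails at the cast.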

I have not benchmarked it yet.  I'll post another comment with an 
architectural description.

> Use single Selector and small thread pool to replace many instances of 
> BlockSender for reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXceiverServer allocates a new thread 
> per request.  Each thread must allocate its own buffers, which leads to 
> higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
> single selector and a small thread pool to multiplex request packets, we could 
> theoretically achieve higher performance while taking up fewer resources and 
> leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
> can be done without changing any wire protocols.
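
As a rough illustration of the idea in the description above, the following
is a minimal sketch of one Selector thread feeding a small fixed thread pool.
It is not the code in hdfs-multiplex.patch.  The class name
MultiplexedReadSketch, the Transfer type, the CHUNK size and the use of
FileChannel.transferTo are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.*;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.*;

/** Hypothetical sketch: one Selector thread plus a small pool serving many reads. */
class MultiplexedReadSketch implements Runnable, AutoCloseable {
    private static final long CHUNK = 64 * 1024; // assumed max bytes sent per wakeup

    /** State of one in-flight read (illustrative type, not an HDFS class). */
    private static final class Transfer {
        final FileChannel file;
        final WritableByteChannel out;
        final long total;
        final CompletableFuture<Long> done = new CompletableFuture<>();
        volatile long position, remaining;
        Transfer(FileChannel file, WritableByteChannel out, long position, long length) {
            this.file = file; this.out = out; this.position = position;
            this.total = length; this.remaining = length;
        }
    }

    private final Selector selector;
    private final ExecutorService pool;
    private final Queue<Runnable> registrations = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    MultiplexedReadSketch(int threads) throws IOException {
        selector = Selector.open();
        pool = Executors.newFixedThreadPool(threads);
    }

    /** Queues a read of [position, position + length) from file to client. */
    <C extends SelectableChannel & WritableByteChannel> CompletableFuture<Long> submit(
            C client, FileChannel file, long position, long length) throws IOException {
        client.configureBlocking(false);
        long len = Math.max(0, Math.min(length, file.size() - position));
        Transfer t = new Transfer(file, client, position, len);
        // Register on the selector thread so we never contend with select().
        registrations.add(() -> {
            try {
                client.register(selector, SelectionKey.OP_WRITE, t);
            } catch (ClosedChannelException e) {
                t.done.completeExceptionally(e);
            }
        });
        selector.wakeup();
        return t.done;
    }

    @Override
    public void run() {
        try (Selector s = selector) { // closes the selector when the loop exits
            while (running) {
                s.select();
                for (Runnable r; (r = registrations.poll()) != null; ) r.run();
                Iterator<SelectionKey> it = s.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isValid() && key.isWritable()) {
                        key.interestOps(0); // park the key while a worker owns it
                        pool.execute(() -> sendChunk(key));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs on a pool thread: pushes one chunk, then re-arms or finishes the key. */
    private void sendChunk(SelectionKey key) {
        Transfer t = (Transfer) key.attachment();
        try {
            long n = t.file.transferTo(t.position, Math.min(t.remaining, CHUNK), t.out);
            t.position += n;
            t.remaining -= n;
            if (t.remaining == 0) {
                key.cancel();
                t.done.complete(t.total);
            } else {
                key.interestOps(SelectionKey.OP_WRITE);
                selector.wakeup(); // make the selector pick up the re-armed key
            }
        } catch (IOException | RuntimeException e) {
            key.cancel();
            t.done.completeExceptionally(e);
        }
    }

    @Override
    public void close() {
        running = false;
        selector.wakeup();
        pool.shutdown();
    }
}
```

In this sketch a caller starts the object on its own thread and calls submit
once per read request.  The number of threads stays fixed regardless of how
many clients are connected, and no per-request buffers are allocated in user
space because transferTo moves bytes from the file to the channel directly
where the platform supports it.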

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
