[ 
https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Booth updated HDFS-918:
---------------------------

    Attachment: hdfs-918-20100309.patch

New patch and better benchmarks:

Environment:  
8x2GHz, 7GB RAM, namenode and dfs client
8x2GHz, 7GB RAM, datanode

Streaming:
Single-threaded: 60 runs over a 100MB file, presumed cached in memory, so the 
network is the chokepoint
Current DFS: 92 MB/s over 60 runs
Multiplex:   97 MB/s over 60 runs
* Either random variation, or maybe larger packet size helps

Multi-threaded: 32 threads, each reading the 100MB file 60 times
Both implementations averaged about 3.25 MB/s per thread, 104 MB/s aggregate
(network saturation)


Random reads:
The multiplexed implementation saves about 1.5 ms per read, probably by 
avoiding extra file-opens and buffer allocation.
 - 5 iterations of 2000 reads each, 32KB reads at the front of the file, 
single-threaded
 - splits for current DFS: 5.3, 4.6, 5.0, 4.4, 6.4
 - splits for multiplex:   3.2, 3.0, 4.6, 3.3, 3.2
 - multithreaded concurrent read speeds on a single host converged as threads 
increased -- probably client-side delay negotiating many new TCP connections


File handle consumption:
At rest, both hold 401 open files (mostly jars).

When doing random reads across 128 threads, BlockSender spikes to 1150 open 
files, opening a blockfile, metafile, selector, and socket for each concurrent 
connection.

MultiplexedBlockSender only jumps to 530, with the socket as the only 
per-connection resource; blockfiles, metafiles, and the single selector are 
shared.
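To illustrate the sharing described above, here is a hypothetical sketch of a shared channel pool (the class name BlockChannelPool comes from the patch, but this implementation is my own assumption, not the patch's code): concurrent readers of the same block reuse one open FileChannel instead of each opening its own blockfile.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a datanode-wide pool of open block channels,
// shared across all concurrent read connections.
public class BlockChannelPool {
    private final ConcurrentHashMap<Long, FileChannel> channels =
        new ConcurrentHashMap<>();

    // Returns the shared channel for a block, opening it at most once.
    // Callers must use positional reads (read(buf, pos)) so they don't
    // race on the shared channel's position.
    public FileChannel get(long blockId, Path blockFile) throws IOException {
        return channels.computeIfAbsent(blockId, id -> {
            try {
                return FileChannel.open(blockFile, StandardOpenOption.READ);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
    }

    public void close(long blockId) throws IOException {
        FileChannel ch = channels.remove(blockId);
        if (ch != null) ch.close();
    }
}
```

With a pool like this, 128 concurrent readers of one block cost one file handle for the blockfile rather than 128, which matches the handle counts reported above.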



I'll post a comment later with an updated description of the patch. When I get 
a chance, I'll also run some more disk-bound benchmarks; I think the 
asynchronous approach will pay some dividends there by letting the operating 
system do more of the work.

Super brief patch notes:
 - eliminated the silly additional dependency on commons-math; the patch now 
has no new dependencies
 - incorporated Zlatin's suggestions upthread to do asynchronous I/O with one 
shared selector
 - BlockChannelPool is shared across threads
 - buffers are thread-local, so they tend to be re-used rather than 
re-allocated
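The single-selector-plus-thread-local-buffer design in the notes above could look roughly like the following. This is a simplified sketch under my own assumptions (class and method names are invented, worker count and buffer size are arbitrary), not code from the patch: one Selector watches every client socket, a small worker pool sends packets for whichever connections are writable, and each worker thread reuses one buffer instead of allocating per connection.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the multiplexed sender: one shared selector,
// a small worker pool, and per-thread reusable packet buffers.
public class MultiplexSketch {
    // one reusable 64KB packet buffer per worker thread
    private static final ThreadLocal<ByteBuffer> PACKET_BUF =
        ThreadLocal.withInitial(() -> ByteBuffer.allocateDirect(64 * 1024));

    private final Selector selector;
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    public MultiplexSketch() throws IOException {
        this.selector = Selector.open();
    }

    // exposed for illustration: the calling thread's reusable buffer
    public static ByteBuffer threadBuffer() {
        return PACKET_BUF.get();
    }

    public void register(SocketChannel ch) throws IOException {
        ch.configureBlocking(false);
        ch.register(selector, SelectionKey.OP_WRITE);
    }

    // One pass of the event loop: hand each writable connection to a worker.
    // (A real implementation would also manage interest ops so a key isn't
    // selected again while a worker is still servicing it.)
    public void pump() throws IOException {
        selector.select(100);
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isValid() && key.isWritable()) {
                workers.submit(() -> sendOnePacket((SocketChannel) key.channel()));
            }
        }
    }

    private void sendOnePacket(SocketChannel ch) {
        ByteBuffer buf = threadBuffer();  // reused, not reallocated
        buf.clear();
        // ... fill buf from the shared block channel pool, then ch.write(buf) ...
    }
}
```

The point of the sketch is the resource math: buffers scale with worker threads (a handful) rather than with concurrent connections, and one selector replaces the per-connection selectors counted in the file-handle numbers above.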




> Use single Selector and small thread pool to replace many instances of 
> BlockSender for reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-918-20100201.patch, hdfs-918-20100203.patch, 
> hdfs-918-20100211.patch, hdfs-918-20100228.patch, hdfs-918-20100309.patch, 
> hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXCeiver server allocates a new thread 
> per request, which must allocate its own buffers and leads to 
> higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
> single selector and a small threadpool to multiplex request packets, we could 
> theoretically achieve higher performance while taking up fewer resources and 
> leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
> can be done without changing any wire protocols.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.