[ 
https://issues.apache.org/jira/browse/HADOOP-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613477#action_12613477
 ] 

Doug Cutting commented on HADOOP-3672:
--------------------------------------

> Why are asynchronous RPCs even relevant here?

Currently, once you ask a datanode to start sending you data, it keeps sending 
that block until you close the connection or the entire block has been sent.  
TCP's flow control is asynchronous, so there are no round-trip delays once a 
block starts streaming to the client.  Synchronous RPC would add a round trip 
per buffer, but with sufficiently large buffers those delays might not be 
significant.  For example, they might be significant for 8k buffers but not 
for 128k buffers.  We probably don't want to make buffers too large, though, 
so if the round-trip overhead proves to be significant even with 128k buffers, 
then we should consider using async RPC.  Make sense?
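
To make the buffer-size argument concrete, here is a rough back-of-envelope 
sketch.  It assumes each buffer costs one extra round trip under synchronous 
RPC, and the 0.2 ms round-trip time and 1 Gb/s link are illustrative 
assumptions, not measurements from this issue.  The class and method names are 
hypothetical.

```java
public class RoundTripEstimate {
    // Effective throughput in bytes/s when every buffer pays one round trip
    // before its payload is transferred at full link speed.
    public static double syncThroughput(double bufferBytes, double rttSeconds,
                                        double linkBytesPerSec) {
        double transferSeconds = bufferBytes / linkBytesPerSec;
        return bufferBytes / (rttSeconds + transferSeconds);
    }

    // Fraction of the link bandwidth actually achieved (between 0 and 1).
    public static double efficiency(double bufferBytes, double rttSeconds,
                                    double linkBytesPerSec) {
        return syncThroughput(bufferBytes, rttSeconds, linkBytesPerSec)
                / linkBytesPerSec;
    }

    public static void main(String[] args) {
        double rtt = 200e-6;   // assumed round-trip time: 0.2 ms
        double link = 125e6;   // assumed link speed: 1 Gb/s = 125 MB/s
        for (int kb : new int[] {8, 128}) {
            double eff = efficiency(kb * 1024.0, rtt, link);
            System.out.printf("%4dk buffer: %.0f%% of link bandwidth%n",
                              kb, 100 * eff);
        }
    }
}
```

Under these assumed numbers, 8k buffers reach only about a quarter of the link 
bandwidth, while 128k buffers reach over 80%.  Different round-trip times or 
link speeds shift the crossover point.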


> support for persistent connections to improve random read performance.
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-3672
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3672
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.17.0
>         Environment: Linux 2.6.9-55, Dual Core Opteron 280 2.4 GHz, 4 GB 
> memory
>            Reporter: George Wu
>         Attachments: pread_test.java
>
>
> pread() establishes a new connection per request. YourKit Java profiles show 
> that this connection overhead is significant on the DataNode. 
> I wrote a simple microbenchmark program which does many iterations of pread() 
> from different offsets of a large file. I hacked the DFSClient/DataNode code to 
> re-use the same connection and DataNode request handler thread. The performance 
> improvement was 7% when the data is served from disk and 80% when the data is 
> served from the OS page cache.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
