[ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050219#comment-13050219
 ] 

Kihwal Lee commented on HDFS-941:
---------------------------------

Perhaps the confusion arises because this Jira is seen as random vs. sequential 
reads. In fact, this Jira is about improving short reads, and the solution is 
to reduce the overhead of connection setup, which is present in both short and 
long reads. It by no means favors random or short reads. If a client does 
typical sequential reads multiple times from the same DN, this patch will help 
there too; the gain will be bigger when the files are smaller. 

Granted, there is a one-time overhead of a cache lookup (size: 16), but it can 
be ignored when the read size is sufficiently big. In theory, this cache 
management overhead should show up for very small accesses that are cold 
(connection-wise). So far I have only seen gains. There might be some special 
cases in which this patch actually makes reads slower, but again I don't 
believe they are typical use cases. 

Having said that, I think it is reasonable to run tests against the latest 
patch and make sure there is no performance regression. Uncommitting now may do 
more harm than good. Let's see the numbers first and then decide what to do. 
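
To make the cache lookup cost mentioned above concrete, here is a minimal 
sketch of a bounded cache of idle connections keyed by datanode address. It is 
not the code from the patch: the class name ConnectionCache, its methods, and 
the closer callback are all hypothetical, and the only detail taken from this 
comment is the small fixed size (16). 

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/**
 * Hypothetical sketch: a small, bounded cache of idle connections keyed by
 * datanode address. Names and behavior are assumptions, not the patch's code.
 */
public class ConnectionCache<K, C> {
    private final LinkedHashMap<K, C> idle;

    /**
     * @param capacity maximum number of idle connections kept (e.g. 16)
     * @param closer   called on every connection the cache discards
     */
    public ConnectionCache(final int capacity, final Consumer<C> closer) {
        // accessOrder=true: re-putting a key moves it to the "newest" end.
        this.idle = new LinkedHashMap<K, C>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, C> eldest) {
                if (size() > capacity) {
                    closer.accept(eldest.getValue()); // evict the oldest entry
                    return true;
                }
                return false;
            }
        };
        this.closer = closer;
    }

    private final Consumer<C> closer;

    /** Returns an idle connection for the key, or null on a cache miss. */
    public synchronized C take(K key) {
        return idle.remove(key);
    }

    /** Hands a connection back after an operation that ended cleanly. */
    public synchronized void put(K key, C conn) {
        C previous = idle.put(key, conn);
        if (previous != null && previous != conn) {
            closer.accept(previous); // keep only one idle connection per key
        }
    }

    public synchronized int size() {
        return idle.size();
    }
}
```

In this sketch a reader would call take() before opening a new connection and 
put() once an operation has left the stream in a well-defined state. On a miss, 
the only extra cost is one hash lookup, which is the one-time overhead discussed 
above. 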

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>             Fix For: 0.22.0
>
>         Attachments: 941.22.txt, 941.22.txt, 941.22.v2.txt, 941.22.v3.txt, 
> HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch, 
> HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.22.patch, HDFS-941-6.patch, 
> HDFS-941-6.patch, HDFS-941-6.patch, fix-close-delta.txt, hdfs-941.txt, 
> hdfs-941.txt, hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

