[ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050944#comment-13050944 ]
Konstantin Shvachko commented on HDFS-941:
------------------------------------------

150 MB/sec throughput is possible if your data.dir is on a filer, which is the case for your home directory or /tmp. This also explains the ridiculous standard deviation, because the run competed with Nicholas running ant test in his home dir, which is on the same filer.
Set data.dir to crawlspace3 and you will start getting reasonable numbers.
What is the cluster size?

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>             Fix For: 0.22.0
>
>         Attachments: 941.22.txt, 941.22.txt, 941.22.v2.txt, 941.22.v3.txt,
> HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch,
> HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.22.patch, HDFS-941-6.patch,
> HDFS-941-6.patch, HDFS-941-6.patch, fix-close-delta.txt, hdfs-941.txt,
> hdfs-941.txt, hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one
> operation.
> In the case that an operation leaves the stream in a well-defined state (e.g., a
> client reads to the end of a block successfully), the same connection could be
> reused for a second operation. This should improve random read performance
> significantly.
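
To make the connection-reuse idea in the issue description concrete, here is a minimal sketch of a client-side cache of idle connections. It is not the code from any of the attached patches. The class name `ConnectionCache`, the `connect` callable, the `clean` flag, and the `capacity` default are all hypothetical names chosen for illustration. The only rule taken from the description is that a connection is kept for a second operation only when the previous operation left the stream in a well-defined state.

```python
from collections import defaultdict, deque


class ConnectionCache:
    """Hypothetical client-side cache of idle datanode connections."""

    def __init__(self, connect, capacity=16):
        self._connect = connect          # assumed callable: address -> new connection
        self._capacity = capacity        # max idle connections kept in total
        self._idle = defaultdict(deque)  # address -> idle connections
        self._size = 0                   # current number of idle connections

    def acquire(self, address):
        """Return an idle connection to `address`, or open a new one."""
        idle = self._idle[address]
        if idle:
            self._size -= 1
            return idle.pop()
        return self._connect(address)

    def release(self, address, conn, clean):
        """Keep `conn` for reuse only if the last operation ended cleanly."""
        if clean and self._size < self._capacity:
            self._idle[address].append(conn)
            self._size += 1
        else:
            # Stream state is unknown (or the cache is full): do not reuse.
            conn.close()
```

A caller would `acquire` a connection before each read and `release` it afterwards, passing `clean=True` only when the read ran to the end of the block without error. Random reads against the same datanode then skip the connection setup cost on every operation after the first.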