[ https://issues.apache.org/jira/browse/HDFS-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400196#comment-13400196 ]
Lars Hofhansl commented on HDFS-1783:
-------------------------------------

One more point to consider: For us (Salesforce) this is mostly interesting for HBase.

A typical HBase cluster has the DataNodes co-located with the HBase RegionServers. So assuming good load distribution within HBase, the bandwidth would still be amortized across the cluster, but with lower latency for each single RegionServer (the HDFS client in this case). Overall the same number of bits is sent through the cluster as a whole.

This would only be enabled for the WAL. Other write load (like compactions) would still do the pipelining.

Andy did some cool testing on EC2 over in HBASE-6116. We'll be doing some basic testing in a real, dedicated cluster this week.

> Ability for HDFS client to write replicas in parallel
> -----------------------------------------------------
>
>                 Key: HDFS-1783
>                 URL: https://issues.apache.org/jira/browse/HDFS-1783
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: dhruba borthakur
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-1783-trunk-v2.patch, HDFS-1783-trunk-v3.patch, HDFS-1783-trunk-v4.patch, HDFS-1783-trunk-v5.patch, HDFS-1783-trunk.patch
>
>
> The current implementation of HDFS pipelines the writes to the three
> replicas. This introduces some latency for real-time, latency-sensitive
> applications. An alternate implementation that allows the client to write all
> replicas in parallel gives much better response times to these applications.
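
To make the latency and bandwidth argument concrete, here is a toy model. It is only an illustrative sketch and not taken from any of the attached patches: the function names and the per-hop numbers are hypothetical, and it ignores acknowledgement return paths, disk time, and packet streaming. It assumes a pipelined write completes after the hops are traversed one after another, while a parallel write completes when the slowest replica has received the data.

```python
# Toy model of pipelined vs. parallel replica writes.
# All names and numbers are hypothetical; this is not HDFS code.

def pipelined_latency(hop_latencies_ms):
    """Client -> DN1 -> DN2 -> DN3: hops are traversed in sequence,
    so the modelled latency is the sum of the per-hop latencies."""
    return sum(hop_latencies_ms)

def parallel_latency(hop_latencies_ms):
    """Client -> DN1, DN2, DN3 at the same time: the write is done
    when the slowest replica has the data, so the latency is the max."""
    return max(hop_latencies_ms)

def client_bytes_sent(payload_bytes, replicas, parallel):
    """Bytes leaving the client: one copy when pipelining,
    one copy per replica when writing in parallel."""
    return payload_bytes * replicas if parallel else payload_bytes

def cluster_bytes_sent(payload_bytes, replicas):
    """Total bytes moved in the cluster: one transfer per replica
    in both schemes, only the sender of each transfer differs."""
    return payload_bytes * replicas

if __name__ == "__main__":
    hops = [1, 2, 3]   # assumed per-hop latencies in ms
    payload = 1024     # assumed WAL edit size in bytes
    print("pipelined latency:", pipelined_latency(hops))
    print("parallel latency: ", parallel_latency(hops))
    print("client bytes (pipelined):", client_bytes_sent(payload, 3, False))
    print("client bytes (parallel): ", client_bytes_sent(payload, 3, True))
    print("cluster bytes (either):  ", cluster_bytes_sent(payload, 3))
```

Under these assumptions the parallel write trades more outbound traffic on the writing client for lower latency, while the total number of bytes moved through the cluster stays the same.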