[ https://issues.apache.org/jira/browse/HDFS-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398655#comment-13398655 ]
Daryn Sharp commented on HDFS-1783:
-----------------------------------

I've only quickly looked at the discussion and the patch, so please excuse me if I'm misunderstanding it. The following is predicated on the belief that this is all client-side. The client is constructing pipelines directly to all of the datanodes -- no more daisy-chaining, right?

I think a benchmark on a generally quiescent network with small writes may be misleading. The client will now consume a multiple (the replication factor) of the outgoing bandwidth it previously consumed, instead of the bandwidth being amortized over the network. This may quickly exhaust the NIC and/or congest the switches en route to the datanodes.

It would be interesting to see the benchmark with traffic shaping. For example, perhaps throttle each host's bandwidth to ~2.5-3X the raw transfer speed of one client. Run two clients simultaneously on a host, with and without parallel writes, writing files of at least a few blocks with replication factor 3 or more.

> Ability for HDFS client to write replicas in parallel
> -----------------------------------------------------
>
>                 Key: HDFS-1783
>                 URL: https://issues.apache.org/jira/browse/HDFS-1783
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: dhruba borthakur
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-1783-trunk-v2.patch, HDFS-1783-trunk-v3.patch, HDFS-1783-trunk-v4.patch, HDFS-1783-trunk-v5.patch, HDFS-1783-trunk.patch
>
>
> The current implementation of HDFS pipelines the writes to the three replicas. This introduces some latency for realtime latency sensitive applications. An alternate implementation that allows the client to write all replicas in parallel gives much better response times to these applications.
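
As a rough illustration of the bandwidth argument in the comment above, the following Python sketch estimates how much outgoing bandwidth one client needs under pipelined versus parallel writes, and what write rate each client could sustain under the proposed per-host throttle. It is a back-of-the-envelope model, not a measurement: the function names, the fair-share assumption, and the example rate of 100 units are all hypothetical, and the model ignores latency, datanode forwarding traffic, and disk limits.

```python
def client_egress(write_rate, replication, parallel):
    """Outgoing bandwidth a client needs to sustain `write_rate`.

    Pipelined writes: the client sends one copy and the datanodes
    forward it down the chain. Parallel writes: the client sends
    one copy per replica itself.
    """
    copies = replication if parallel else 1
    return write_rate * copies


def achievable_write_rate(host_cap, clients, raw_rate, replication, parallel):
    """Per-client write rate when `clients` on one host share `host_cap`.

    Assumes (hypothetically) that the host's outgoing bandwidth is
    split evenly between clients and is the only bottleneck.
    """
    copies = replication if parallel else 1
    fair_share = host_cap / clients
    # A client can never exceed its unthrottled raw transfer speed.
    return min(raw_rate, fair_share / copies)


if __name__ == "__main__":
    raw = 100          # assumed raw transfer speed of one client
    cap = 3 * raw      # host throttled to 3X one client's raw speed
    for parallel in (False, True):
        rate = achievable_write_rate(cap, clients=2, raw_rate=raw,
                                     replication=3, parallel=parallel)
        mode = "parallel" if parallel else "pipelined"
        print(mode, "write rate per client:", rate)
```

Under these assumptions, two pipelined clients each still reach their raw rate, while two parallel-writing clients are each limited to half of it, which is the kind of effect a quiescent-network benchmark with small writes would not reveal.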