[ 
https://issues.apache.org/jira/browse/HDFS-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398655#comment-13398655
 ] 

Daryn Sharp commented on HDFS-1783:
-----------------------------------

I've only skimmed the discussion and the patch, so please excuse me if I'm 
misunderstanding it.  The following assumes that this is all client-side: the 
client constructs pipelines directly to all of the datanodes, with no more 
daisy-chaining, right?

I think a benchmark on a generally quiescent network with small writes may be 
misleading.  The client will now consume a multiple (replication factor) of the 
outgoing bandwidth it previously consumed, instead of the bandwidth being 
amortized over the network.  This may quickly exhaust the NIC and/or congest 
the switches en-route to the datanodes.
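
As a rough illustration of the bandwidth concern, the sketch below compares 
the client's outgoing traffic under the two schemes.  It assumes every replica 
lives on a remote datanode and ignores protocol overhead.  The function name 
and parameters are hypothetical and are not taken from the patch.

```python
def client_egress(write_rate_mbps: float, replication: int, parallel: bool) -> float:
    """Outgoing bandwidth the client needs to sustain write_rate_mbps.

    Assumptions (not from the patch): all replicas are remote and there is
    no protocol overhead.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    # Daisy-chained pipeline: the client sends one copy to the first datanode,
    # which forwards it downstream.  Parallel writes: the client sends one
    # copy per replica itself.
    copies_sent_by_client = replication if parallel else 1
    return write_rate_mbps * copies_sent_by_client


if __name__ == "__main__":
    for parallel in (False, True):
        mode = "parallel" if parallel else "pipelined"
        print(mode, client_egress(100.0, 3, parallel))
```

On a quiet network the extra copies are free, which is why a small-write 
benchmark could hide the cost.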

It would be interesting to see the benchmark with traffic shaping.  For 
example, throttle each host's bandwidth to ~2.5-3X the raw transfer speed of 
one client, then run two clients simultaneously on one host, with and without 
parallel writes, writing files of at least a few blocks with a replication 
factor of 3 or more.
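
To see why that scenario should separate the two schemes, here is a 
back-of-the-envelope model.  It is only a sketch: it assumes all replicas are 
remote, that the throttle applies to the host's total outgoing traffic, and 
that the clients share the cap evenly.  All names are hypothetical.

```python
def effective_write_rate(raw_rate: float, host_cap: float, clients: int,
                         replication: int, parallel: bool) -> float:
    """Per-client write rate once the host's outgoing cap is applied.

    Assumptions (illustrative only): every replica is remote, the cap covers
    all outgoing traffic from the host, and clients get an equal share.
    """
    # Copies each client pushes over the host's NIC for every byte written.
    copies = replication if parallel else 1
    demand = clients * raw_rate * copies
    if demand <= host_cap:
        return raw_rate  # the cap is not the bottleneck
    # The cap is split evenly across every outgoing copy stream.
    return host_cap / (clients * copies)


if __name__ == "__main__":
    raw = 100.0       # raw transfer speed of one client (arbitrary units)
    cap = 2.5 * raw   # host throttled to 2.5X one client's raw speed
    for parallel in (False, True):
        mode = "parallel" if parallel else "pipelined"
        print(mode, effective_write_rate(raw, cap, 2, 3, parallel))
```

Under these assumptions, two pipelined clients need 2X and fit under a 2.5X 
cap, while two parallel clients at replication 3 need 6X and get throttled.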
                
> Ability for HDFS client to write replicas in parallel
> -----------------------------------------------------
>
>                 Key: HDFS-1783
>                 URL: https://issues.apache.org/jira/browse/HDFS-1783
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: dhruba borthakur
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-1783-trunk-v2.patch, HDFS-1783-trunk-v3.patch, 
> HDFS-1783-trunk-v4.patch, HDFS-1783-trunk-v5.patch, HDFS-1783-trunk.patch
>
>
> The current implementation of HDFS pipelines the writes to the three 
> replicas. This introduces some latency for realtime latency sensitive 
> applications. An alternate implementation that allows the client to write all 
> replicas in parallel gives much better response times to these applications. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
