[ 
https://issues.apache.org/jira/browse/HDFS-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398767#comment-13398767
 ] 

Lars Hofhansl commented on HDFS-1783:
-------------------------------------

Yep. That is exactly the point. HDFS does pipelining to improve throughput at 
the expense of latency. This patch allows a client to favor latency.

If the client is already operating at its NIC's throughput limit, enabling 
parallel writes will make things worse, because the client then sends every 
packet once per replica instead of once.
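
To make the tradeoff concrete, here is a rough toy model in Python. It is not 
taken from the patch: the function names are hypothetical, and it ignores disk 
I/O, queuing, acks and packet streaming. It only counts how many bytes the 
client has to push out and how many network hops the data crosses in series 
before the last replica has it.

```python
def client_egress_bytes(payload_bytes, replication, parallel):
    """Bytes the client itself must send for one payload (toy model)."""
    # Pipelined: the client sends one copy and the datanodes forward it.
    # Parallel: the client sends one copy to every replica.
    return payload_bytes * (replication if parallel else 1)


def serial_hops(replication, parallel):
    """Network hops crossed one after another before the last replica has the data."""
    # Pipelined: client -> dn1 -> dn2 -> ... (one hop per replica, in series).
    # Parallel: every replica is exactly one hop away from the client.
    return 1 if parallel else replication


if __name__ == "__main__":
    for parallel in (False, True):
        mode = "parallel" if parallel else "pipelined"
        print(mode,
              client_egress_bytes(64 * 1024, 3, parallel),
              serial_hops(3, parallel))
```

With a replication factor of 3, the parallel mode cuts the serial hops from 3 
to 1 but triples the client's outbound traffic, which is why it only helps 
when the client has NIC bandwidth to spare.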

This patch could be extended in the future to mix direct connections with 
pipelining. For example, a client could set up a 1-hop (direct) pipeline and a 
2-hop pipeline for a replication factor of 3, or two 2-hop pipelines for a 
replication factor of 4, etc.
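
The mixed layout described above could be sketched roughly as follows. This is 
only an illustration in Python, not code from the patch: 
split_into_pipelines and the datanode names are made up, and the 
even-split rule is an assumption chosen because it reproduces the two 
examples given.

```python
def split_into_pipelines(datanodes, num_pipelines):
    """Split the target datanodes into chains of near-equal length.

    Each returned list is one pipeline; the client would connect directly
    to the first datanode of every list. Shorter pipelines come first.
    """
    if num_pipelines < 1 or num_pipelines > len(datanodes):
        raise ValueError("need between 1 and len(datanodes) pipelines")
    base, extra = divmod(len(datanodes), num_pipelines)
    # `extra` pipelines carry one additional hop; put them last.
    sizes = [base] * (num_pipelines - extra) + [base + 1] * extra
    pipelines, start = [], 0
    for size in sizes:
        pipelines.append(datanodes[start:start + size])
        start += size
    return pipelines


if __name__ == "__main__":
    print(split_into_pipelines(["dn1", "dn2", "dn3"], 2))
    print(split_into_pipelines(["dn1", "dn2", "dn3", "dn4"], 2))
```

With one pipeline this degenerates to the classic HDFS write pipeline, and 
with as many pipelines as replicas it becomes the fully parallel write.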

We'll be testing this with HBase workloads. Using traffic shaping is 
interesting.

                
> Ability for HDFS client to write replicas in parallel
> -----------------------------------------------------
>
>                 Key: HDFS-1783
>                 URL: https://issues.apache.org/jira/browse/HDFS-1783
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: dhruba borthakur
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-1783-trunk-v2.patch, HDFS-1783-trunk-v3.patch, 
> HDFS-1783-trunk-v4.patch, HDFS-1783-trunk-v5.patch, HDFS-1783-trunk.patch
>
>
> The current implementation of HDFS pipelines the writes to the three 
> replicas. This introduces some latency for real-time, latency-sensitive 
> applications. An alternate implementation that allows the client to write all 
> replicas in parallel gives much better response times to these applications. 
