[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736658#comment-14736658
 ] 

Duo Zhang commented on HDFS-7966:
---------------------------------

Netty-4.1.0Beta6 is out so I'm back. I have added a simple {{asyncRead}} 
method (not fully asynchronous, since this is only a POC) to {{DFSInputStream}} 
and written a performance test for it. Here are the results (each test was 
run twice):

{noformat}
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest async /test 100 
50000 4096 // 100 here is the max concurrency, which is used to prevent OOM.
******* time based on http2 230946
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest async /test 100 
50000 4096
******* time based on http2 231066

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 100 
50000 4096 pread
******* time based on tcp 231410
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 100 
50000 4096 pread
******* time based on tcp 231038

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 100 
50000 4096 pread
******* time based on http2 236069
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 100 
50000 4096 pread
******* time based on http2 231773
{noformat}
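
For readers wondering how a max-concurrency cap like the {{100}} above can keep an async test from running out of memory, here is a minimal sketch. It is not the code from the POC: {{BoundedAsyncReads}}, {{run}} and the {{asyncRead}} supplier are hypothetical names, and I assume the read returns a {{CompletableFuture<byte[]>}}, which the real {{asyncRead}} may not do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the HDFS-7966 patch.
class BoundedAsyncReads {
    /**
     * Issues totalReads asynchronous reads while keeping at most
     * maxConcurrency of them in flight. Returns the peak number of
     * reads that were observed in flight at the same time.
     */
    static int run(int totalReads, int maxConcurrency,
                   Supplier<CompletableFuture<byte[]>> asyncRead) {
        Semaphore permits = new Semaphore(maxConcurrency);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < totalReads; i++) {
            // Blocks the submitting thread once the limit is reached, so
            // buffers for unfinished reads cannot pile up without bound.
            permits.acquireUninterruptibly();
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            // Assumption: asyncRead.get() does not throw synchronously;
            // otherwise the permit taken above would leak.
            asyncRead.get().whenComplete((buf, err) -> {
                inFlight.decrementAndGet();
                permits.release(); // frees a slot on success or failure
            });
        }
        // Wait for the remaining reads by collecting every permit.
        permits.acquireUninterruptibly(maxConcurrency);
        permits.release(maxConcurrency);
        return peak.get();
    }
}
```

Results are deliberately not collected in a list here, since holding 50000 completed buffers would defeat the purpose of the cap.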

The performance difference is within about ±4%, and async is slightly faster 
than tcp.
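
To make the comparison easy to reproduce, the arithmetic can be sketched as below. The timings are copied from the output above; averaging the two runs per mode and using tcp as the baseline are my own choices for illustration, and {{TimingSummary}} is a hypothetical name. By this calculation async comes out roughly 0.1% below tcp and http2 roughly 1.2% above it.

```java
// Illustrative arithmetic only; the timings are copied from the test output.
class TimingSummary {
    /** Mean of the two runs of one test mode. */
    static double mean(long first, long second) {
        return (first + second) / 2.0;
    }

    /** Relative difference of value against baseline, in percent. */
    static double percentDiff(double value, double baseline) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double async = mean(230946L, 231066L);
        double tcp = mean(231410L, 231038L);
        double http2 = mean(236069L, 231773L);
        // A negative value means the mode was faster than tcp.
        System.out.printf("async vs tcp: %+.2f%%%n", percentDiff(async, tcp));
        System.out.printf("http2 vs tcp: %+.2f%%%n", percentDiff(http2, tcp));
    }
}
```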

Thanks.

> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>
>                 Key: HDFS-7966
>                 URL: https://issues.apache.org/jira/browse/HDFS-7966
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Haohui Mai
>            Assignee: Qianqian Shi
>              Labels: gsoc, gsoc2015, mentor
>         Attachments: GSoC2015_Proposal.pdf, 
> TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, 
> TestHttp2ReadBlockInsideEventLoop.svg
>
>
> The current Data Transfer Protocol (DTP) implements a rich set of features 
> that span across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by the implementation. As a 
> result, implementing HDFS clients and servers requires a non-trivial amount 
> of work.
> This jira explores delegating the responsibilities of the session and 
> presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
> connection multiplexing, QoS, authentication and encryption, reducing the 
> scope of DTP to the application layer only. Leveraging an existing HTTP/2 
> library should simplify the implementation of both HDFS clients and 
> servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
