[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334432#comment-15334432
 ] 

Andrew Wang commented on HDFS-9924:
-----------------------------------

The bylaws state that code is integrated based on consensus. There is no 
consensus. I'm -1. If you really want to debate this, it's something we can 
take to the PMC list.

My stance on all of this has been the same since my comment 1.5 months ago. I 
wanted to see the following in the design doc:

* an API review, which was not in the doc, and only happened once our 
downstreams became more involved
* a performance comparison, which still hasn't happened.

If the goal of the API is to improve performance, let's see some benchmarks. 
For example:

* Completion time of a Hive workload that is currently suffering from this 
problem
* Improvement in completion time when Hive is using a threadpool; try some 
different sizes to find the sweet spot
* Improvement in completion time when using the Future API; try some different 
numbers of concurrent calls to find the sweet spot
* Comparative overhead of the threadpool, e.g. increased memory usage
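
As a rough illustration of the threadpool sweep, a harness could look something 
like the sketch below. Everything in it is an assumption for illustration: 
`PoolSizeSweep`, `blockingCall`, and `runWithPool` are hypothetical names, and 
`blockingCall` just sleeps to stand in for one blocking HDFS call. It is not an 
HDFS API, and the timings it prints say nothing about a real Hive workload. A 
real comparison would replace the stand-in with actual client calls and repeat 
the same sweep over the number of outstanding calls for the Future API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class PoolSizeSweep {
    // Hypothetical stand-in for one blocking HDFS call; it only sleeps.
    static int blockingCall(int id, long latencyMs) throws InterruptedException {
        Thread.sleep(latencyMs);
        return id;
    }

    // Runs `calls` independent blocking calls on a fixed-size pool and
    // returns the elapsed wall-clock time in milliseconds.
    static long runWithPool(int calls, int poolSize, long latencyMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                final int id = i;
                tasks.add(() -> blockingCall(id, latencyMs));
            }
            long start = System.nanoTime();
            // invokeAll blocks until every task has completed.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any exception thrown by a task
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Sweep a few pool sizes to look for the sweet spot.
        for (int size : new int[] {1, 4, 16, 64}) {
            System.out.printf("poolSize=%d elapsedMs=%d%n",
                    size, runWithPool(64, size, 10));
        }
    }
}
```

Memory overhead of the pool would need to be measured separately, e.g. by 
comparing heap and thread-stack usage across the same pool sizes.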

If these numbers are convincing, I'll be happy to lift my -1. Else, let's 
implement Deferred and ship this in a later 2.x release.

> [umbrella] Nonblocking HDFS Access
> ----------------------------------
>
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Nonblocking HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support nonblocking calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
