[jira] [Commented] (HDFS-9924) [umbrella] Nonblocking HDFS Access

Xiaobing Zhou (JIRA) Fri, 17 Jun 2016 11:48:49 -0700

    [ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336684#comment-15336684 ]


Xiaobing Zhou commented on HDFS-9924:
-------------------------------------

bq. Hardware is listed as "32 X 8 cores Intel(R) Xeon(R) CPU E52630 v3 @ 
2.40GHz", could you clarify? Is it 32 CPUs, or is this a typo?
To clarify, I have posted an updated doc. The hardware is:
Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz with HT (Hyper-Threading) enabled, 
i.e. 32 virtual processors = 2 CPUs x 8 cores per CPU x 2 hardware threads 
per core.

{quote}
Considering the best-case speedup ranges from 30-60x, I'm betting the sweet 
spot is closer to 30-60 threads. I'd be interested in seeing e.g. 25, 50, 100, 
250. I expect to see an upside-down U-shaped curve.
{quote}
I can't agree that the sweet spot is closer to 30-60 threads unless 
experiments demonstrate it.
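
For illustration only, an experiment of that kind could be shaped like the 
sketch below. It sweeps the thread counts suggested in the quote over a 
simulated blocking call. Everything in it is an assumption made for the 
example: the class ThreadSweep, the helper simulatedBlockingCall, the 5 ms 
latency and the 2000-call workload are hypothetical, and a Thread.sleep 
stands in for the RPC, so the numbers it prints say nothing about HDFS 
itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class ThreadSweep {
    // Hypothetical stand-in for one blocking RPC; this is NOT an HDFS call.
    static void simulatedBlockingCall(long latencyMillis) throws InterruptedException {
        Thread.sleep(latencyMillis);
    }

    // Runs totalCalls simulated calls on a fixed-size pool and returns calls per second.
    static double callsPerSecond(int threads, int totalCalls, long latencyMillis)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            long start = System.nanoTime();
            for (int i = 0; i < totalCalls; i++) {
                // Returning null makes this lambda a Callable, so it may throw.
                pending.add(pool.submit(() -> {
                    simulatedBlockingCall(latencyMillis);
                    return null;
                }));
            }
            // Wait for every call to finish before stopping the clock.
            for (Future<?> f : pending) {
                f.get();
            }
            long elapsedNanos = System.nanoTime() - start;
            return totalCalls / (elapsedNanos / 1e9);
        } finally {
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        // Thread counts taken from the quoted suggestion above.
        for (int threads : new int[] {25, 50, 100, 250}) {
            System.out.printf("%d threads: %.0f calls/s%n",
                    threads, callsPerSecond(threads, 2000, 5));
        }
    }
}
```

With a pure sleep the throughput simply grows with the thread count. Whether 
a real workload peaks and then falls off, as the quote expects, depends on 
contention that this sketch does not model.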

Regarding the branching, it makes much more sense not to keep this work in a 
separate branch, so that it is easy to try various async API implementations 
down the road. Say someone wants to try Deferred, CompletableFuture, or 
something else: they will still need the work already done here for RPC and 
for retry in HA. If this reusable work lives in a separate branch, won't they 
have to deal with three-way (or worse) branch merges and syncs 
(trunk/branch-2, HDFS-9924, and their own feature branches)? That makes 
things more complicated and less efficient.

I also don't understand why the resistance to keeping the reusable parts in 
branch-2/trunk is so intense, especially since HDFS-10538 has completely 
removed the API. With the time we have spent arguing here, I could already 
have done much of the work to implement the other proposals. Don't we want to 
move quickly and deliver better products than our competitors? I understand 
it's hard to balance the requirements of all parties, but the API has already 
been removed, and it was the key point of concern in every round of 
discussion. So please kindly consider keeping the reusable parts, for the 
reasons given above. Thanks.

> [umbrella] Nonblocking HDFS Access
> ----------------------------------
>
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: Async-HDFS-Performance-Report.pdf, AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Nonblocking HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support nonblocking calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.
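
As a rough sketch of the calling pattern described above (not the API 
proposed in this JIRA), the example below issues several independent calls 
from a single thread, gets a Future back from each immediately, and 
collects the return values with Future.get(). The names NonblockingSketch, 
blockingCall, nonblockingCall and issueAll are hypothetical, and a small 
thread pool merely simulates the in-flight calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class NonblockingSketch {
    // Hypothetical blocking operation standing in for a file system call.
    static String blockingCall(int id) throws InterruptedException {
        Thread.sleep(10);
        return "result-" + id;
    }

    // Hypothetical nonblocking wrapper: returns a Future without waiting.
    static Future<String> nonblockingCall(ExecutorService rpcPool, int id) {
        return rpcPool.submit(() -> blockingCall(id));
    }

    // Issues all calls first, then collects results in submission order.
    static List<String> issueAll(int calls) throws Exception {
        ExecutorService rpcPool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                // The caller is not blocked here; the call runs in the background.
                futures.add(nonblockingCall(rpcPool, i));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                // get() blocks only until this particular call has completed.
                results.add(f.get());
            }
            return results;
        } finally {
            rpcPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(issueAll(3));
    }
}
```

The caller thread never waits between submissions, so the independent calls 
overlap instead of running one after another.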



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
