[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280617#comment-15280617
 ] 

Colin Patrick McCabe commented on HDFS-9924:
--------------------------------------------

With regard to error handling, why not handle all errors as exceptions thrown 
from {{Future#get}}?  Handling some errors in a different way because they 
happened "earlier" (let's say, on the client side rather than server side) 
forces the client to put error checking code in two places.
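
A minimal sketch of the single-error-path idea, assuming a hypothetical
{{renameAsync}} method (the class, method, and messages below are illustrative
and not taken from the HDFS-9924 patch).  A client-side validation failure is
stored in the returned {{Future}} rather than thrown at call time, so the
caller handles it in the same place as a server-side failure:

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names come from the HDFS-9924 patch.
class SingleErrorPathSketch {

  // Stand-in for an async rename. Client-side validation failures are not
  // thrown here; they are stored in the returned Future instead.
  static Future<Boolean> renameAsync(String src, String dst) {
    CompletableFuture<Boolean> result = new CompletableFuture<>();
    if (src == null || src.isEmpty()) {
      // "Early" client-side error, reported the same way as a server error.
      result.completeExceptionally(new IOException("invalid source path"));
      return result;
    }
    // Pretend the server has answered; a real client would complete the
    // future later, from the RPC response handler.
    if (src.equals(dst)) {
      result.completeExceptionally(new IOException("server rejected rename"));
    } else {
      result.complete(true);
    }
    return result;
  }

  // The caller's only error-handling site: the exception thrown by get().
  static String describe(Future<Boolean> f) {
    try {
      return "ok: " + f.get();
    } catch (ExecutionException e) {
      return "failed: " + e.getCause().getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "interrupted";
    }
  }

  public static void main(String[] args) {
    System.out.println(describe(renameAsync("", "/b")));   // client-side error
    System.out.println(describe(renameAsync("/a", "/a"))); // "server" error
    System.out.println(describe(renameAsync("/a", "/b"))); // success
  }
}
```

Both failure kinds reach the caller as an {{ExecutionException}} whose cause is
the underlying {{IOException}}.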

Does the {{Future#get}} callback get made without holding any locks?  Can other 
asynchronous calls be made from this context?

{code}
public boolean rename(Path src, Path dst) throws IOException {
  if (isAsynchronousMode()) {
    return getFutureDistributedFileSystem().rename(src, dst).get();
  } else {
    ... //current implementation.
  }
}
{code}
It seems concerning that we would have to make such a large change to the 
synchronous {{DistributedFileSystem}} code.  This would also result in more GC 
load since we'd be creating lots of {{Future}} objects.  Shouldn't it be 
possible to avoid this?  I do not think having some kind of global async bit is 
a good idea.

bq. In order to avoid client abusing the server by asynchronous calls. The RPC 
client should have a configurable limit in order to limit the outstanding 
asynchronous calls. The caller may be blocked if the number of outstanding 
calls hits the limit so that the caller is slowed down.

Blocking the client seems like it could be problematic for code which expects 
to be asynchronous.  There should be an option to throw an exception in this 
case.
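
One way such an option could look, as a sketch only: {{AsyncCallLimiter}} and
its {{failFast}} flag are hypothetical names, not an existing Hadoop class or
configuration key.  A semaphore counts outstanding calls; depending on the
flag, a caller at the limit either blocks or gets an exception immediately:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch; this class and its names are not part of Hadoop.
class AsyncCallLimiter {
  private final Semaphore permits;
  private final boolean failFast;

  AsyncCallLimiter(int maxOutstanding, boolean failFast) {
    this.permits = new Semaphore(maxOutstanding);
    this.failFast = failFast;
  }

  <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> call) {
    if (failFast) {
      // Never block: reject the call when the limit has been reached.
      if (!permits.tryAcquire()) {
        throw new IllegalStateException("too many outstanding async calls");
      }
    } else {
      // Blocking mode: the caller waits here until a permit is free.
      permits.acquireUninterruptibly();
    }
    CompletableFuture<T> started;
    try {
      started = call.get();
    } catch (RuntimeException e) {
      permits.release(); // the call never started, so give the permit back
      throw e;
    }
    // Release the permit when the call completes, successfully or not.
    return started.whenComplete((value, error) -> permits.release());
  }

  public static void main(String[] args) {
    AsyncCallLimiter limiter = new AsyncCallLimiter(1, true);
    CompletableFuture<String> pending = new CompletableFuture<>();
    limiter.submit(() -> pending); // takes the only permit
    try {
      limiter.submit(() -> CompletableFuture.completedFuture("second"));
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    pending.complete("first"); // frees the permit
    System.out.println(
        limiter.submit(() -> CompletableFuture.completedFuture("third")).join());
  }
}
```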

I also think that we could maintain a queue of async calls that we have not 
submitted to the IPC layer yet, to avoid being limited by issues at the IPC 
layer.
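
A rough sketch of such a queue, under the assumption that each call can be
represented as a supplier that starts the RPC and returns a
{{CompletableFuture}}.  {{PendingCallQueue}} is a hypothetical name.  Calls
beyond the limit are parked on the client and started as earlier calls finish,
so the caller is never blocked:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch; this class is not part of Hadoop.
class PendingCallQueue {
  private final int maxOutstanding;
  private int outstanding = 0;
  private final Queue<Runnable> waiting = new ArrayDeque<>();

  PendingCallQueue(int maxOutstanding) {
    this.maxOutstanding = maxOutstanding;
  }

  <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> call) {
    CompletableFuture<T> result = new CompletableFuture<>();
    Runnable start = () -> {
      CompletableFuture<T> started;
      try {
        started = call.get(); // hand the call to the (imaginary) IPC layer
      } catch (RuntimeException e) {
        started = new CompletableFuture<>();
        started.completeExceptionally(e);
      }
      started.whenComplete((value, error) -> {
        if (error != null) {
          result.completeExceptionally(error);
        } else {
          result.complete(value);
        }
        onCallFinished();
      });
    };
    synchronized (this) {
      if (outstanding >= maxOutstanding) {
        waiting.add(start); // park the call instead of blocking the caller
        return result;
      }
      outstanding++;
    }
    start.run(); // started outside the lock
    return result;
  }

  private void onCallFinished() {
    Runnable next;
    synchronized (this) {
      next = waiting.poll();
      if (next == null) {
        outstanding--; // nothing queued: free the slot
      }
      // Otherwise the slot passes straight to the queued call.
    }
    if (next != null) {
      next.run();
    }
  }

  synchronized int queued() {
    return waiting.size();
  }
}
```

Queued calls are started outside the lock, so completion handlers can submit
further calls without deadlocking on the queue itself.  A real implementation
would presumably also bound the queue.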

bq. Support asynchronous FileContext (client API)

{{AsynchronousFileSystem}} is a separate API from {{FileSystem}}.  If there are 
issues with {{FileSystem}}, surely we can fix them in 
{{AsynchronousFileSystem}} rather than creating a fourth API?

bq. Use Java 8’s new language feature in the API (client API).

Given that Hadoop 3.x will probably require Java 8 (based on the mailing list 
discussion), why not just make the async API use JDK 8's {{CompletableFuture}} 
from day 1, rather than hacking it in later?
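
To illustrate what {{CompletableFuture}} offers over a plain {{Future}}, here
is a small sketch.  The {{mkdirs}} and {{rename}} methods are hypothetical
stand-ins that complete immediately; they are not the proposed HDFS API.
Dependent calls are chained without blocking, and errors from any stage are
handled in one place:

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical sketch; the async methods are stand-ins, not an HDFS API.
class CompletableFutureSketch {

  static CompletableFuture<Boolean> mkdirs(String dir) {
    return CompletableFuture.completedFuture(true);
  }

  static CompletableFuture<Boolean> rename(String src, String dst) {
    CompletableFuture<Boolean> f = new CompletableFuture<>();
    if (src.isEmpty()) {
      f.completeExceptionally(new IOException("no such file"));
    } else {
      f.complete(true);
    }
    return f;
  }

  // Create the directory, then rename into it, with no blocking get() calls.
  static CompletableFuture<String> moveInto(String src, String dir) {
    return mkdirs(dir)
        .thenCompose(created -> rename(src, dir + "/moved"))
        .thenApply(renamed -> renamed ? "moved" : "not moved")
        .exceptionally(t -> {
          // Failures from earlier stages may arrive wrapped in a
          // CompletionException, so unwrap before reporting.
          Throwable cause =
              (t instanceof CompletionException && t.getCause() != null)
                  ? t.getCause() : t;
          return "failed: " + cause.getMessage();
        });
  }

  public static void main(String[] args) {
    System.out.println(moveInto("/a", "/dir").join());
    System.out.println(moveInto("", "/dir").join());
  }
}
```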

> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.



