[jira] [Commented] (HADOOP-15964) Add S3A support for Async Scatter/Gather IO

Steve Loughran (JIRA) Tue, 04 Dec 2018 09:44:17 -0800


    [ https://issues.apache.org/jira/browse/HADOOP-15964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709030#comment-16709030 ]


Steve Loughran commented on HADOOP-15964:
-----------------------------------------

An initial implementation would treat async reads as completely separate from
the normal input stream: it would submit a separate GET for each block read,
limited to exactly the length of the block (i.e. coalescing is feature creep),
and rely on the performance of parallelized GET calls against shards of the
content for the speedup.

* Maybe use the existing block output queue, so as to provide fairer scheduling
of reads across multiple input streams.
* Need to think about how this works through the invoker retry logic if calls
are coalesced.
* AWS SDK 2.x will add async calls, so there is less need for an unlimited
worker pool.
* Must wire up cancel.
* Do track outstanding ops so that stream.close() will cancel them all.
* Metrics to include the number of active async reads and the total number
issued.
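
As a rough illustration of the points above (one independent read per range,
outstanding operations tracked so that close() cancels them all, and counters
for active and issued reads), a minimal Java sketch follows. Every name in it
(RangeReader, AsyncRangeStream, readAsync, activeReads, totalIssued) is
hypothetical: RangeReader merely stands in for a ranged GET, and none of this
is the S3A code or an AWS SDK API. Retry handling and fair scheduling across
streams are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for one ranged GET; not an S3A or AWS SDK type. */
interface RangeReader {
  byte[] read(long offset, int length) throws Exception;
}

/** Hypothetical sketch: every async read is its own request, never coalesced. */
class AsyncRangeStream implements AutoCloseable {
  private final RangeReader reader;
  private final ExecutorService pool;
  // Outstanding operations, tracked so that close() can cancel them all.
  private final Set<CompletableFuture<byte[]>> outstanding =
      ConcurrentHashMap.newKeySet();
  final AtomicInteger activeReads = new AtomicInteger(); // metric: active now
  final AtomicLong totalIssued = new AtomicLong();       // metric: total issued
  private boolean closed;

  AsyncRangeStream(RangeReader reader, ExecutorService pool) {
    this.reader = reader;
    this.pool = pool;
  }

  /** Submit a read of exactly [offset, offset + length). */
  synchronized CompletableFuture<byte[]> readAsync(long offset, int length) {
    CompletableFuture<byte[]> result = new CompletableFuture<>();
    if (closed) {
      result.completeExceptionally(new IllegalStateException("stream closed"));
      return result;
    }
    totalIssued.incrementAndGet();
    activeReads.incrementAndGet();
    outstanding.add(result);
    Future<?> task = pool.submit(() -> {
      try {
        result.complete(reader.read(offset, length));
      } catch (Exception e) {
        // Includes InterruptedException after a cancel; then this is a no-op.
        result.completeExceptionally(e);
      }
    });
    // Runs exactly once per read: on success, on failure, or on cancel.
    result.whenComplete((bytes, error) -> {
      outstanding.remove(result);
      activeReads.decrementAndGet();
      if (result.isCancelled()) {
        task.cancel(true); // wire cancel through to the worker thread
      }
    });
    return result;
  }

  /** Cancel everything still outstanding; later reads fail immediately. */
  @Override
  public synchronized void close() {
    closed = true;
    for (CompletableFuture<byte[]> pending : new ArrayList<>(outstanding)) {
      pending.cancel(true);
    }
  }
}
```

A caller would issue readAsync(offset, length) once per block and join the
returned futures. Cancelling a returned future, or closing the stream,
interrupts the worker if the read has already started and stops it from ever
starting if it is still queued.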

> Add S3A support for Async Scatter/Gather IO
> -------------------------------------------
>
>                 Key: HADOOP-15964
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15964
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> HADOOP-11867 proposes adding a new scatter/gather IO API.
> For an object store to take advantage of it, it should be doing things like:
> * coalescing reads even with a gap between them
> * choosing an optimal ordering of requests
> * submitting reads into the executor pool/using any async API provided by the 
> FS.
> * detecting overlapping reads (and then what?)
> * switching to HTTP/2 where supported
> Do this for S3A.
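
The first item in the description above, coalescing reads even with a gap
between them, could look roughly like the Java sketch below. It is
illustrative only: ReadRange, RangeCoalescer and the maxGap parameter are
assumed names, not part of HADOOP-11867 or S3A. Ranges are sorted by start
offset and merged whenever the gap to the previous range is at most maxGap
bytes. Overlapping ranges have a negative gap, so they are merged as well,
which is one possible (not the only) answer to the "and then what?" question.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical value type: a read of bytes [start, end) of the object. */
final class ReadRange {
  final long start;
  final long end; // exclusive

  ReadRange(long start, long end) {
    this.start = start;
    this.end = end;
  }

  @Override
  public String toString() {
    return "[" + start + "," + end + ")";
  }
}

/** Hypothetical helper: merge ranges whose gap is at most maxGap bytes. */
final class RangeCoalescer {
  private RangeCoalescer() {
  }

  static List<ReadRange> coalesce(List<ReadRange> requested, long maxGap) {
    // Work on a sorted copy so the caller's list is left untouched.
    List<ReadRange> sorted = new ArrayList<>(requested);
    sorted.sort(Comparator.comparingLong((ReadRange r) -> r.start));

    List<ReadRange> merged = new ArrayList<>();
    ReadRange current = null;
    for (ReadRange next : sorted) {
      if (current != null && next.start - current.end <= maxGap) {
        // Close enough (or overlapping): extend the current merged range.
        current = new ReadRange(current.start, Math.max(current.end, next.end));
      } else {
        if (current != null) {
          merged.add(current);
        }
        current = next;
      }
    }
    if (current != null) {
      merged.add(current);
    }
    return merged;
  }
}
```

The trade-off is that each merged request also downloads the bytes in the
gaps, so maxGap would need tuning against per-request latency; a maxGap of 0
merges only touching or overlapping ranges.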



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
