[ https://issues.apache.org/jira/browse/HADOOP-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-13695:
------------------------------------
    Parent Issue: HADOOP-13204  (was: HADOOP-11694)

> S3A to use a thread pool for async path operations
> --------------------------------------------------
>
>                 Key: HADOOP-13695
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13695
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>
> S3A path operations are often slow due to directory scanning, mock directory
> create/delete, etc. Many of these can be done asynchronously:
> * Because deletion is eventually consistent, deleting parent dirs after an
> operation has returned doesn't alter the behaviour, except in the special
> case of operation failure.
> * Scanning for paths/parents of a file in the create operation only needs to
> complete before the close() operation instantiates the object; there is no
> need to block create().
> * Parallelized COPY calls would permit asynchronous rename.
> We could either use the thread pool used for block writes, or somehow isolate
> low-cost path ops (GET, DELETE) from the more expensive calls (COPY, PUT) so
> that a thread doing basic IO doesn't block for the duration of the long op.
> Maybe also use {{Semaphore.tryAcquire()}} and only start async work if there
> actually is an idle thread, doing it synchronously if not. Maybe it depends
> on the operation. Path query/cleanup before/after a write is something which
> could be scheduled as additional futures in the block write.
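
For illustration only, here is a minimal sketch of the {{Semaphore.tryAcquire()}} idea from the description: start the work asynchronously only if a pool thread appears to be idle, otherwise run it in the calling thread. The class and method names ({{OpportunisticAsyncRunner}}, {{runAsyncIfIdle}}) are hypothetical and are not part of S3A. The sketch assumes a dedicated fixed-size pool, with one semaphore permit per thread standing in for "there is an idle thread".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

/**
 * Hypothetical helper, not S3A code: runs an operation on a pool thread
 * when a permit is free, and in the caller's thread when it is not.
 */
final class OpportunisticAsyncRunner {
  private final ExecutorService pool;
  // One permit per pool thread; a free permit approximates an idle thread.
  private final Semaphore idleThreads;

  OpportunisticAsyncRunner(int threads) {
    this.pool = Executors.newFixedThreadPool(threads);
    this.idleThreads = new Semaphore(threads);
  }

  /** Returns a future that completes when the operation has finished. */
  CompletableFuture<Void> runAsyncIfIdle(Runnable op) {
    if (idleThreads.tryAcquire()) {
      try {
        // A permit was free: hand the work to the pool and return at once.
        return CompletableFuture.runAsync(() -> {
          try {
            op.run();
          } finally {
            idleThreads.release();
          }
        }, pool);
      } catch (RejectedExecutionException e) {
        // The pool was shut down: give the permit back before failing.
        idleThreads.release();
        throw e;
      }
    }
    // No free permit: do the work synchronously in the calling thread.
    CompletableFuture<Void> result = new CompletableFuture<>();
    try {
      op.run();
      result.complete(null);
    } catch (RuntimeException e) {
      result.completeExceptionally(e);
    }
    return result;
  }

  void shutdown() {
    pool.shutdown();
  }
}
```

A caller could wrap a low-cost path operation, for example {{runner.runAsyncIfIdle(() -> deleteParentMarkers(path))}}, where {{deleteParentMarkers}} is likewise a hypothetical placeholder. Whether such a pool should be shared with block writes or kept separate from the expensive COPY/PUT calls is the open question raised in the description; this sketch does not settle it.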