[jira] [Commented] (HADOOP-17475) ABFS : add high performance listStatusIterator

Steve Loughran (Jira) Thu, 22 Apr 2021 06:20:14 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329098#comment-17329098
 ]


Steve Loughran commented on HADOOP-17475:
-----------------------------------------

Hey, I've just noticed the listings are going into the standard thread pool, 
which only has (cores - 1) threads unless the size is set in the 
"java.util.concurrent.ForkJoinPool.common.parallelism" system property.
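
For reference, a minimal sketch of how to check what that pool size is on a 
given JVM. The class and method names here are made up for illustration, and 
it assumes the listings are submitted without an explicit executor, so they 
land in the common ForkJoinPool.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolInfo {

    /** Target parallelism of the JVM-wide common ForkJoinPool. */
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        // Number of cores the JVM believes it has.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("available processors    = " + cores);
        // Normally one less than the core count, unless overridden with
        // -Djava.util.concurrent.ForkJoinPool.common.parallelism=N
        System.out.println("common pool parallelism = " + commonPoolParallelism());
    }
}
```

Running it with and without the -D option set shows whether the override is 
being picked up.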

Given the LIST calls are going to be blocking, I worry that this puts a limit 
on the performance of listing if you have many threads executing list requests, 
e.g. Spark workers.

Reviewing the code, the maximum number of list operations which can collect 
results will be limited to the number of cores; the others are going to block 
until the lists have been processed.

Which may also mean: if you have multiple incremental iterators in the same 
thread (e.g. treewalking), there's a risk that you could actually deadlock.
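
One possible way round this, as a rough sketch only and not what the ABFS 
code does today: run the blocking calls on a dedicated, explicitly sized 
executor instead of the common pool. Everything named here (the class, 
blockingList(), listAll(), the pool size) is hypothetical, and the LIST call 
is simulated with a sleep.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DedicatedListingPool {

    /** Stand-in for a blocking remote LIST call; not the real ABFS client API. */
    static String blockingList(String path) {
        try {
            Thread.sleep(50); // simulate network latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while listing " + path, e);
        }
        return "listing of " + path;
    }

    /**
     * Issues one blocking listing per path on a dedicated pool, so the
     * number of concurrent LIST calls is set by 'threads' rather than by
     * the size of the common ForkJoinPool.
     */
    static List<String> listAll(List<String> paths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String path : paths) {
                // Passing the executor explicitly keeps the work off the common pool.
                futures.add(CompletableFuture.supplyAsync(() -> blockingList(path), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                results.add(future.join()); // wait for each listing, in submission order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> results = listAll(Arrays.asList("/a", "/b", "/c"), 3);
        for (String r : results) {
            System.out.println(r);
        }
    }
}
```

A bounded dedicated pool still caps concurrency, but the cap becomes an 
explicit choice and no longer depends on the core count of the machine.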

> ABFS : add high performance listStatusIterator
> ----------------------------------------------
>
>                 Key: HADOOP-17475
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17475
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.0
>            Reporter: Bilahari T H
>            Assignee: Bilahari T H
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1
>
>          Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> The ABFS connector now implements listStatusIterator() with
> asynchronous prefetching of the next page(s) of results.
> For listing large directories this can provide tangible speedups.
> If for any reason this needs to be disabled, set
> fs.azure.enable.abfslistiterator to false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
