GitHub user Xuanwo added a comment to the discussion: Concurrent support for 
opendal list

> Is this feature primarily intended for recursive directory listing ?

No, it also works for non-recursive lists, as long as the service supports 
`starts_with`.

For example, S3 services can list very large directories this way.

> Could we design it like this ?
> 
> 1. perform a non-recursive list("dir/") operation.
> 2. divide the result from step 1 into some groups and list every group 
> concurrently.

Nice idea. I didn't choose this option because it relies heavily on assumptions 
about the users' file structures. If our assumptions are incorrect, the 
resulting cost could be significant, which would go against our promise of 
"Fast Access" with zero overhead.

For example, users can:

*Have only one level of paths*. For example, `a/1`, `a/2`, ..., `a/1000000`.

In this case, the non-recursive `list("dir/")` operation is already a full 
listing, which leaves no room for `partitions` to take effect.


*Have an extremely large number of directories, but each one contains only a small 
number of files*. For example, it's common for users to have hash-prefixed 
directory structures like `a9/f8/a9f8xxxxxxxx`. If we group them based on 
either the `a` or `a9` layers, it can lead to undesirable results.
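
To make the two cases concrete, here is a rough simulation of the "list first, then group" approach over in-memory key lists. The helper `first_level_groups` and the scaled-down layouts are made up for illustration and do not use OpenDAL's API.

```python
from collections import Counter

def first_level_groups(keys, prefix):
    """Mimic a non-recursive list of `prefix`: count keys under each returned entry."""
    groups = Counter()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition("/")
        # Files come back as "a/1"; directories come back as "a9/".
        groups[prefix + head + sep] += 1
    return groups

# Case 1 (scaled down): a single level of paths, a/1 .. a/1000.
flat = [f"a/{i}" for i in range(1, 1001)]

# Case 2 (assumed shape): hash-prefixed layout like a9/f8/a9f8xxxxxxxx,
# with only 2 files in each leaf directory.
hashed = [
    f"{a:02x}/{b:02x}/{a:02x}{b:02x}{n:08x}"
    for a in range(256)
    for b in range(256)
    for n in range(2)
]

flat_groups = first_level_groups(flat, "a/")
top_groups = first_level_groups(hashed, "")
sub_groups = first_level_groups(hashed, "a9/")

print(len(flat_groups), max(flat_groups.values()))
print(len(top_groups), max(top_groups.values()))
print(len(sub_groups), max(sub_groups.values()))
```

In the flat layout every entry returned by step 1 is already a file, so there is nothing left to fan out. In the hash-prefixed layout, grouping one level deeper means one tiny listing per leaf directory.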

So, my current idea is that instead of making assumptions about the user's file 
structure, it's better to let users decide how to split the `partitions`.
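
A minimal sketch of what user-chosen partitions could look like, assuming the service can begin a listing at an arbitrary key. The store is modeled as a sorted in-memory list, and `list_range` and `list_partitioned` are hypothetical names, not OpenDAL's API.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor

def list_range(keys, prefix, start=None, end=None):
    """Return keys under `prefix` with start <= key < end (None means unbounded)."""
    # Keys sharing a prefix are contiguous in a sorted list, so seek then scan.
    lo = bisect_left(keys, max(prefix, start or prefix))
    out = []
    for key in keys[lo:]:
        if not key.startswith(prefix) or (end is not None and key >= end):
            break
        out.append(key)
    return out

def list_partitioned(keys, prefix, boundaries, workers=4):
    """List `prefix` as independent ranges split at user-supplied boundaries."""
    # N boundaries produce N + 1 ranges; the outer ranges are open-ended.
    edges = [None, *sorted(boundaries), None]
    ranges = list(zip(edges[:-1], edges[1:]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: list_range(keys, prefix, r[0], r[1]), ranges))

if __name__ == "__main__":
    # Simulated flat directory; the user knows the key shape and picks the splits.
    keys = sorted(f"dir/{i:04d}" for i in range(1000))
    parts = list_partitioned(keys, "dir/", ["dir/0250", "dir/0500", "dir/0750"])
    print([len(p) for p in parts])
```

Because the split points come from the caller, the same code works for a flat directory and for a hash-prefixed tree without guessing at the layout.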

GitHub link: 
https://github.com/apache/opendal/discussions/6115#discussioncomment-12967642

