GitHub user Xuanwo added a comment to the discussion: Concurrent support for
opendal list
> Is this feature primarily intended for recursive directory listing ?
No, it also works for non-recursive lists, as long as the service supports
`starts_with`.
For example, S3 services can list very large directories this way.
> Could we design it like this ?
>
> 1. perform a non-recursive list("dir/") operation.
> 2. divide the result from step 1 into some groups and list every group
> concurrently.
Nice idea. I didn't choose this option because it relies heavily on assumptions
about the users' file structures. If our assumptions are incorrect, the
resulting cost could be significant, which would go against our promise of
"Fast Access" with zero overhead.
For example, users can:
- *Have only one level of paths*. For example, `a/1`, `a/2`, ..., `a/1000000`.
  Performing a non-recursive list("dir/") operation then amounts to a full
  list, leaving no room for `partitions` to take effect.
- *Have an extremely large number of directories, each containing only a small
  number of files*. For example, it's common for users to have hash-prefixed
  directory structures like `a9/f8/a9f8xxxxxxxx`. If we group them based on
  either the `a` or `a9` layers, it can lead to undesirable results.
So, my current idea is that instead of making assumptions about the user's file
structure, it's better to let users decide how to split the `partitions`.
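A minimal sketch of this idea, not OpenDAL's actual API: it assumes the
service returns keys in lexicographic order and can resume a listing from a
given key. The names `list_range` and `concurrent_list` and the in-memory key
list are hypothetical stand-ins for real list calls. The user supplies the
split points, and each worker lists one slice of the keyspace.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def list_range(sorted_keys, start_after, stop_at):
    """Simulate one worker: return keys k with start_after < k <= stop_at.

    None means "unbounded" on that side. A real service would page through
    results from `start_after`; here we slice a sorted in-memory list instead.
    """
    lo = 0 if start_after is None else bisect_right(sorted_keys, start_after)
    hi = len(sorted_keys) if stop_at is None else bisect_right(sorted_keys, stop_at)
    return sorted_keys[lo:hi]


def concurrent_list(sorted_keys, partitions):
    """List every key by splitting the keyspace at user-chosen boundaries."""
    bounds = [None] + sorted(partitions) + [None]
    ranges = list(zip(bounds[:-1], bounds[1:]))
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map yields results in the order of `ranges`, so the merged
        # output stays sorted.
        chunks = pool.map(lambda r: list_range(sorted_keys, *r), ranges)
        return [key for chunk in chunks for key in chunk]


if __name__ == "__main__":
    # The flat layout from the first case above: one level, many keys.
    keys = sorted(f"a/{i:04d}" for i in range(1, 1001))
    # The user knows the layout, so the user picks the split points.
    listed = concurrent_list(keys, ["a/0250", "a/0500", "a/0750"])
    print(len(listed), listed == keys)
```

Because the boundaries come from the user, the flat `a/1` .. `a/1000000` layout
splits just as well as a nested one, and no extra non-recursive list is needed
up front to discover the structure.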
GitHub link:
https://github.com/apache/opendal/discussions/6115#discussioncomment-12967642