Hi Steve,

Thanks for your valuable feedback, and apologies for overlooking the needs of object stores and others! I'll look into the PR.
> Then, after 3.3.0 is out, someone gets to do the FileSystem implementation,
> with specification, tests etc. Not me -you are the HDFS team -of course you
> can do this.

+1. I can help improve documentation and testing for this once 3.3.0 is out.

Chao

On Mon, Mar 2, 2020 at 5:13 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:

> On Sat, 29 Feb 2020 at 00:23, Wei-Chiu Chuang <weic...@cloudera.com.invalid> wrote:
>
> > Steve,
> >
> > You made a great point, and I'm sorry this API was implemented without
> > consideration of other FS implementations. Thank you for your direct
> > feedback.
> >
> > async -- yes
> > builder -- yes
> > cancellable -- totally agree
> >
> > There are good use cases for this API though -- Impala and Presto both
> > require lots of file system metadata operations, and this API would make
> > them much more efficient.
>
> Well, I absolutely do not want an API that we will have to maintain for
> decades, yet which lacks a rigorous specification other than "look at the
> HDFS implementation" and has seemingly not taken the needs of cloud
> storage into account.
>
> It is the long-term obligation to maintain this API which I am most
> worried about. Please: don't add new operations there unless you think
> each one is ready for broad use and, in the absence of any builder API, is
> "the perfect operation".
>
> Proposed:
>
> * The new API is pulled into a new interface marked unstable; FileSystem
>   subclasses get to implement it if they support it.
> * New classes (PartialListing) are also tagged unstable. Side issue:
>   please always mark new stuff as Evolving.
> * That interface extends PathCapabilities, so you can't implement it
>   without declaring whether paths support the feature.
> * We will define a new path capability for it.
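[Editor's note: a minimal, self-contained sketch of how that proposal could fit together. `PathCapabilities` below is a simplified stand-in for `org.apache.hadoop.fs.PathCapabilities` (the real method takes a `Path`, not a `String`); `BatchedListingOperations`, `DemoFS`, and the capability name are hypothetical illustrations, not the committed Hadoop API.]

```java
import java.io.IOException;
import java.util.Set;

// Simplified stand-in for org.apache.hadoop.fs.PathCapabilities.
interface PathCapabilities {
  boolean hasPathCapability(String path, String capability) throws IOException;
}

// The unstable interface a FileSystem subclass would opt in to (hypothetical).
interface BatchedListingOperations extends PathCapabilities {
  String CAPABILITY_BATCHED_LISTING = "fs.capability.batched.listing";
}

// A demo filesystem that declares support for the capability.
class DemoFS implements BatchedListingOperations {
  private final Set<String> caps =
      Set.of(BatchedListingOperations.CAPABILITY_BATCHED_LISTING);

  @Override
  public boolean hasPathCapability(String path, String capability) {
    return caps.contains(capability);
  }
}

public class CapabilityProbe {
  // Applications cast to the interface, then confirm the capability is
  // actually available under the given path before using it.
  static boolean supportsBatchedListing(Object fs, String path)
      throws IOException {
    return fs instanceof BatchedListingOperations
        && ((BatchedListingOperations) fs).hasPathCapability(
            path, BatchedListingOperations.CAPABILITY_BATCHED_LISTING);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(supportsBatchedListing(new DemoFS(), "/data"));
  }
}
```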
> Applications can cast to the new interface, and use PathCapabilities to
> verify it is actually available under a given path, even through filter
> filesystems.
>
> Then, after 3.3.0 is out, someone gets to do the FileSystem implementation,
> with specification, tests etc. Not me -you are the HDFS team -of course you
> can do this. But it does need to be done taking into account the fact that
> alternate stores and other file systems will want to implement this, and
> all of them will be fielding support calls related to it for a long time.
>
> > On top of that, I would also like to have a batched delete API. HBase
> > could benefit a lot from that.
>
> Another interesting bit of work, especially since Gabor and I have just
> been dealing with S3 delete throttling issues.
>
> I will gladly give advice there. At the very least, it must implement
> Progressable, so that when deletes are very slow, processes/threads can
> still send heartbeats back. It also sets expectations as to how long some
> of these things can take.
>
> -Steve
>
> > On Fri, Feb 28, 2020 at 5:48 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:
> >
> > > https://issues.apache.org/jira/browse/HDFS-13616
> > >
> > > I don't want to be territorial here -but as I keep reminding this list
> > > whenever it happens, I do not want any changes to go into the core
> > > FileSystem class without:
> > >
> > > * raising a HADOOP- JIRA
> > > * involving those of us who work on object stores. We have different
> > >   problems (latencies, failure modes) and want to move to
> > >   async/completable APIs, ideally with builder APIs for future
> > >   flexibility and per-FS options.
> > > * specifying the semantics formally enough that people implementing
> > >   and using the API know what they get
> > > * a specification in filesystem.md
> > > * contract tests to match the spec, which object stores as well as
> > >   HDFS can implement
> > >
> > > The change has ~no javadocs and doesn't even state:
> > >
> > > * whether it is recursive or not
> > > * whether it includes directories or not
> > >
> > > batchedListStatusIterator is exactly the kind of feature this should
> > > apply to -it is where we get a chance to fix those limitations of the
> > > previous calls (blocking sync, no expectation of a right to cancel
> > > listings), ...
> > >
> > > I'd like to be able to:
> > >
> > > * provide a hint on batch sizes
> > > * get an async response, so the fact that the LIST can take time is
> > >   more visible
> > > * cancel that query if it is taking too long
> > >
> > > I'd also like to be able to close an iterator; that is something we
> > > can/should retrofit, or require all implementations to add.
> > >
> > > CompletableFuture<RemoteIterator<PartialListing<FileStatus>>> listing =
> > >     batchList(path)
> > >         .recursive(true)
> > >         .opt("fs.option.batchlist.size", 100)
> > >         .build();
> > >
> > > RemoteIterator<PartialListing<FileStatus>> it = listing.get();
> > >
> > > FileStatus largeFile = null;
> > >
> > > try {
> > >   scan:
> > >   while (it.hasNext()) {
> > >     for (FileStatus st : it.next().get()) {
> > >       if (st.getLen() > 1_000_000) {
> > >         largeFile = st;
> > >         break scan;
> > >       }
> > >     }
> > >   }
> > > } finally {
> > >   if (it instanceof Closeable) {
> > >     IOUtils.closeQuietly((Closeable) it);
> > >   }
> > > }
> > >
> > > if (largeFile != null) {
> > >   processLargeFile(largeFile);
> > > }
> > >
> > > See: something for slower IO, controllable batch sizes, and a way to
> > > cancel the scan -so we can recycle the HTTP connection even when
> > > breaking out early.
> > >
> > > This is a recurrent problem, and I am getting as bored of sending
> > > these emails out as people probably are of receiving them.
> > >
> > > Please, please, at least talk to me.
> > > Yes, I'm going to add more homework, but the goal is to make this
> > > something well documented, well testable, and straightforward for
> > > other implementations to adopt, without us having to reverse-engineer
> > > HDFS's behaviour and consider that normative.
> > >
> > > What do I do here?
> > >
> > > 1. Do I overreact and revert the change until my needs are met?
> > >    Because I know that if I volunteered to do this work myself, it
> > >    would get neglected.
> > > 2. Is someone going to put their hand up to help with this?
> > >
> > > At the very least, I'm going to tag the APIs as unstable and
> > > potentially likely to break, so that anyone who uses them in
> > > hadoop-3.3.0 isn't going to be upset when they are moved to a builder
> > > API. And they will have to be, for the object stores.
> > >
> > > sorry
> > >
> > > steve
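[Editor's note: the batched delete API floated above, with Progressable heartbeats, might be sketched roughly like this. `Progressable` mirrors Hadoop's `org.apache.hadoop.util.Progressable`; the `BatchedDeleter` class and its batching policy are hypothetical illustrations only.]

```java
import java.util.List;

// Mirrors org.apache.hadoop.util.Progressable: a heartbeat callback
// invoked during a long-running operation.
interface Progressable {
  void progress();
}

class BatchedDeleter {
  /**
   * Deletes paths in fixed-size batches, invoking the progress callback
   * after each batch so slow deletes (e.g. throttled S3 bulk deletes)
   * keep the caller's heartbeat alive. Returns the number of batches.
   */
  static int deleteAll(List<String> paths, int batchSize,
                       Progressable progress) {
    int batches = 0;
    for (int i = 0; i < paths.size(); i += batchSize) {
      List<String> batch =
          paths.subList(i, Math.min(i + batchSize, paths.size()));
      issueBulkDelete(batch);   // placeholder for one bulk DELETE request
      progress.progress();      // heartbeat between batches
      batches++;
    }
    return batches;
  }

  static void issueBulkDelete(List<String> batch) {
    // A real store would issue a single bulk delete call here;
    // in this sketch it is a no-op.
  }
}
```

The key point is that the callback fires between batches, so a process deleting millions of objects against a throttled store can still report liveness to its framework.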