Hi Steve,

Thanks for your valuable feedback, and apologies for overlooking the needs of object stores and others! I'll look into the PR.
> Then, after 3.3.0 is out, someone gets to do the FileSystem implementation,
> with specification, tests etc. Not me -you are the HDFS team -of course you
> can do this.

+1. I can help improve documentation and testing for this once 3.3.0 is out.

Chao

On Mon, Mar 2, 2020 at 5:13 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:

> On Sat, 29 Feb 2020 at 00:23, Wei-Chiu Chuang <weic...@cloudera.com.invalid> wrote:
>
> > Steve,
> >
> > You made a great point, and I'm sorry this API was implemented without
> > consideration of other FS implementations. Thank you for your direct
> > feedback.
> >
> > async -- yes
> > builder -- yes
> > cancellable -- totally agree
> >
> > There are good use cases for this API though -- Impala and Presto both
> > require lots of file system metadata operations, and this API would make
> > them much more efficient.
>
> Well, I absolutely do not want an API that we will have to maintain for
> decades, yet which lacks a rigorous specification other than "look at the
> HDFS implementation" and has seemingly not taken the needs of cloud
> storage into account.
>
> It is the long-term obligation to maintain this API which I am most
> worried about. Please: don't add new operations there unless you think
> each one is ready for broad use and, in the absence of any builder API, is
> "the perfect operation".
>
> Proposed:
>
> * The new API is pulled into a new interface marked unstable; FileSystem
>   subclasses get to implement it if they support it.
> * New classes (PartialListing) are also tagged unstable. Side issue:
>   please always mark new stuff as Evolving.
> * That interface extends PathCapabilities, so you can't implement it
>   without declaring whether paths support the feature.
> * We will define a new path capability for it.
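[Editor's note: a minimal, self-contained sketch of how that proposal could fit together. `PathCapabilities` below is a simplified stand-in for `org.apache.hadoop.fs.PathCapabilities` (the real method takes a `Path`, not a `String`); `BatchedListingOperations`, `DemoFS`, and the capability name are hypothetical illustrations, not the committed Hadoop API.]

```java
import java.io.IOException;
import java.util.Set;

// Simplified stand-in for org.apache.hadoop.fs.PathCapabilities.
interface PathCapabilities {
  boolean hasPathCapability(String path, String capability) throws IOException;
}

// The unstable interface a FileSystem subclass would opt in to (hypothetical).
interface BatchedListingOperations extends PathCapabilities {
  String CAPABILITY_BATCHED_LISTING = "fs.capability.batched.listing";
}

// A demo filesystem that declares support for the capability.
class DemoFS implements BatchedListingOperations {
  private final Set<String> caps =
      Set.of(BatchedListingOperations.CAPABILITY_BATCHED_LISTING);

  @Override
  public boolean hasPathCapability(String path, String capability) {
    return caps.contains(capability);
  }
}

public class CapabilityProbe {
  // Applications cast to the interface, then confirm the capability is
  // actually available under the given path before using it.
  static boolean supportsBatchedListing(Object fs, String path)
      throws IOException {
    return fs instanceof BatchedListingOperations
        && ((BatchedListingOperations) fs).hasPathCapability(
            path, BatchedListingOperations.CAPABILITY_BATCHED_LISTING);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(supportsBatchedListing(new DemoFS(), "/data"));
  }
}
```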
> Applications can cast to the new interface, and use PathCapabilities to
> verify it is actually available under a given path, even through filter
> filesystems.
>
> Then, after 3.3.0 is out, someone gets to do the FileSystem implementation,
> with specification, tests etc. Not me -you are the HDFS team -of course you
> can do this. But it does need to be done taking into account the fact that
> alternate stores and other file systems will want to implement this, and
> all of them will be fielding support calls related to it for a long time.
>
> > On top of that, I would also like to have a batched delete API. HBase
> > could benefit a lot from that.
>
> Another interesting bit of work, especially since Gabor and I have just
> been dealing with S3 delete throttling issues.
>
> I will gladly give advice there. At the very least, it must implement
> Progressable, so that when deletes are very slow, processes/threads can
> still send heartbeats back. It also sets expectations as to how long some
> of these things can take.
>
> -Steve
>
> > On Fri, Feb 28, 2020 at 5:48 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:
> >
> > > https://issues.apache.org/jira/browse/HDFS-13616
> > >
> > > I don't want to be territorial here -but as I keep reminding this list
> > > whenever it happens, I do not want any changes to go into the core
> > > FileSystem class without:
> > >
> > > * raising a HADOOP- JIRA
> > > * involving those of us who work on object stores. We have different
> > >   problems (latencies, failure modes) and want to move to
> > >   async/completable APIs, ideally with builder APIs for future
> > >   flexibility and per-FS options.
> > > * specifying the semantics formally enough that people implementing
> > >   and using the API know what they get
> > > * a specification in filesystem.md
> > > * contract tests to match the spec, which object stores as well as
> > >   HDFS can implement
> > >
> > > The change has ~no javadocs and doesn't even state:
> > >
> > > * whether it is recursive or not
> > > * whether it includes directories or not
> > >
> > > batchedListStatusIterator is exactly the kind of feature this should
> > > apply to -it is where we get a chance to fix those limitations of the
> > > previous calls (blocking sync, no expectation of a right to cancel
> > > listings), ...
> > >
> > > I'd like to be able to:
> > >
> > > * provide a hint on batch sizes
> > > * get an async response, so the fact that the LIST can take time is
> > >   more visible
> > > * cancel that query if it is taking too long
> > >
> > > I'd also like to be able to close an iterator; that is something we
> > > can/should retrofit, or require all implementations to add.
> > >
> > > CompletableFuture<RemoteIterator<PartialListing<FileStatus>>> listing =
> > >     batchList(path)
> > >         .recursive(true)
> > >         .opt("fs.option.batchlist.size", 100)
> > >         .build();
> > >
> > > RemoteIterator<PartialListing<FileStatus>> it = listing.get();
> > >
> > > FileStatus largeFile = null;
> > >
> > > try {
> > >   scan:
> > >   while (it.hasNext()) {
> > >     for (FileStatus st : it.next().get()) {
> > >       if (st.getLen() > 1_000_000) {
> > >         largeFile = st;
> > >         break scan;
> > >       }
> > >     }
> > >   }
> > > } finally {
> > >   if (it instanceof Closeable) {
> > >     IOUtils.closeQuietly((Closeable) it);
> > >   }
> > > }
> > >
> > > if (largeFile != null) {
> > >   processLargeFile(largeFile);
> > > }
> > >
> > > See: something for slower IO, controllable batch sizes, and a way to
> > > cancel the scan -so we can recycle the HTTP connection even when
> > > breaking out early.
> > >
> > > This is a recurrent problem, and I am getting as bored of sending
> > > these emails out as people probably are of receiving them.
> > >
> > > Please, please, at least talk to me.
> > > Yes, I'm going to add more homework, but the goal is to make this
> > > something well documented, well testable, and straightforward for
> > > other implementations to adopt, without us having to reverse-engineer
> > > HDFS's behaviour and consider that normative.
> > >
> > > What do I do here?
> > >
> > > 1. Do I overreact and revert the change until my needs are met?
> > >    Because I know that if I volunteered to do this work myself, it
> > >    would get neglected.
> > > 2. Is someone going to put their hand up to help with this?
> > >
> > > At the very least, I'm going to tag the APIs as unstable and
> > > potentially likely to break, so that anyone who uses them in
> > > hadoop-3.3.0 isn't going to be upset when they are moved to a builder
> > > API. And they will have to be, for the object stores.
> > >
> > > sorry
> > >
> > > steve
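[Editor's note: the batched delete API floated above, with Progressable heartbeats, might be sketched roughly like this. `Progressable` mirrors Hadoop's `org.apache.hadoop.util.Progressable`; the `BatchedDeleter` class and its batching policy are hypothetical illustrations only.]

```java
import java.util.List;

// Mirrors org.apache.hadoop.util.Progressable: a heartbeat callback
// invoked during a long-running operation.
interface Progressable {
  void progress();
}

class BatchedDeleter {
  /**
   * Deletes paths in fixed-size batches, invoking the progress callback
   * after each batch so slow deletes (e.g. throttled S3 bulk deletes)
   * keep the caller's heartbeat alive. Returns the number of batches.
   */
  static int deleteAll(List<String> paths, int batchSize,
                       Progressable progress) {
    int batches = 0;
    for (int i = 0; i < paths.size(); i += batchSize) {
      List<String> batch =
          paths.subList(i, Math.min(i + batchSize, paths.size()));
      issueBulkDelete(batch);   // placeholder for one bulk DELETE request
      progress.progress();      // heartbeat between batches
      batches++;
    }
    return batches;
  }

  static void issueBulkDelete(List<String> batch) {
    // A real store would issue a single bulk delete call here;
    // in this sketch it is a no-op.
  }
}
```

The key point is that the callback fires between batches, so a process deleting millions of objects against a throttled store can still report liveness to its framework.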