Hi All,

> I'm just curious why this is included in the 3.2.2 release? HDFS-15567 is
> tagged with 3.2.3 and the corresponding HDFS-14272 on server side is tagged
> with 3.3.0.

I have checked the fix version tags and found 8 issues that either do not
include branch-3.2.2 correctly or are missing both branch-3.2.2 and
branch-3.2.3. I have updated them manually. Please have a look. Thanks.

HADOOP-15691
HDFS-15464
HDFS-15478
HDFS-15567
HDFS-15574
HDFS-15583
HDFS-15628
YARN-10430

Regards,
- He Xiaoqiao

On Mon, Dec 14, 2020 at 5:08 AM Konstantin Shvachko <shv.had...@gmail.com>
wrote:

> Hi Steve,
>
> I am not sure I fully understand what is broken here. It is not an
> incompatible change, right?
> Could you please explain what you think the process is.
> Would be best if you could share a link to a document describing it.
> I would be glad to follow up with tests and documentation that are needed.
>
> As you can see I proposed multiple solutions to the problem in the jira.
> Seemed nobody was objecting, so I chose one and explained why.
> I believe we call it lazy consensus.
>
> Stay safe,
> --Konstantin
>
> On Sun, Dec 13, 2020 at 10:22 AM Chao Sun <sunc...@apache.org> wrote:
>
>> > This is an API where it'd be ok to have a no-op if not implemented,
>> > correct? Or is there a requirement like Syncable that specific
>> > guarantees are met?
>>
>> Yes I think it's ok to leave it as no-op for other non-HDFS FS impls: it
>> is only used by HDFS standby reads so far.
>>
>>
>> On Sun, Dec 13, 2020 at 4:58 AM Steve Loughran <ste...@cloudera.com>
>> wrote:
>>
>> > This isn't worth holding up the RC. We'd just add something to the
>> > release notes "use with caution". And if we can get what the API does
>> > defined in a way which works, it shouldn't need changing.
>> >
>> > (which reminds me, I do need to check that RC out, don't I?)
>> >
>> > On Sun, 13 Dec 2020 at 09:00, Xiaoqiao He <hexiaoq...@apache.org>
>> > wrote:
>> >
>> >> Thanks Steve very much for your discussion here.
>> >>
>> >> I have left some comments inline. I will follow this thread and wait
>> >> for the final conclusion to decide whether we should prepare another
>> >> release candidate of 3.2.2.
>> >> Thanks again, Steve and Chao, for the warm discussion.
>> >>
>> >> On Sat, Dec 12, 2020 at 7:18 PM Steve Loughran
>> >> <ste...@cloudera.com.invalid> wrote:
>> >>
>> >> > Maybe it's not in the release; it's certainly in the 3.2 branch. Will
>> >> > check further. If it's in the release I was thinking of adding a
>> >> > warning in the notes "unstable API"; stable if invoked from DFSClient
>> >>
>> >> On Fri, 11 Dec 2020 at 18:21, Chao Sun <sunc...@apache.org> wrote:
>> >> >
>> >> > > I'm just curious why this is included in the 3.2.2 release?
>> >> > > HDFS-15567 is tagged with 3.2.3 and the corresponding HDFS-14272
>> >> > > on server side is tagged with 3.3.0.
>> >> >
>> >>
>> >> I just checked: HDFS-15567 is included in Hadoop 3.2.2 RC4. IIRC, I
>> >> cut branch-3.2.2 in early October. At that time branch-3.2.3 had been
>> >> created but the source code was not completely frozen, because several
>> >> blocker issues had been reported; the code freeze was done around
>> >> mid-October. Some issues tagged with 3.2.3 were also included in 3.2.2
>> >> during that period, including HDFS-15567. I will check them later and
>> >> make sure that we have marked the correct tags.
>> >>
>> >>
>> >> > >
>> >> > > > If it goes into FS/FC, what does it do for a viewfs with >1
>> >> > > > mounted HDFS? Should it take path, msync(path) so that viewFS
>> >> > > > knows where to forward it?
>> >> > >
>> >> > > The API shouldn't take any path - for viewFS I think it should call
>> >> > > this on all the child file systems. It might also need to handle
>> >> > > the case where some downstream clusters support this capability
>> >> > > while others don't.
>> >> > >
>> >> >
>> >> > That's an extra bit of work for ViewFS then.
>> >> > It should probe for capability and invoke as/when supported.
>> >> >
>> >> > >
>> >> > > > Options
>> >> > > > 1. I roll HDFS-15567 back "please follow process"
>> >> > > > 2. Someone does a followup patch with specification and contract
>> >> > > > test, view FS. Add even more to the java
>> >> > > > 3. We do as per HADOOP-16898 into an MSyncable interface and then
>> >> > > > FileSystem & HDFS can implement. ViewFS and filterFS still need
>> >> > > > to pass through.
>> >> > >
>> >> > > I'm slightly in favor of the hasPathCapabilities approach and make
>> >> > > this a mixin where FS impls can optionally support. Happy to hear
>> >> > > what others think.
>> >> > >
>> >> >
>> >> > Mixins are great when FC and FS can both implement; makes it easier
>> >> > to code against either. All the filtering/aggregating FS's will have
>> >> > to implement it, which means that presence of the interface doesn't
>> >> > guarantee support.
>> >> >
>> >> > This is an API where it'd be ok to have a no-op if not implemented,
>> >> > correct? Or is there a requirement like Syncable that specific
>> >> > guarantees are met?
>> >> >
>> >> > >
>> >> > > Chao
>> >> > >
>> >> > >
>> >> > > On Fri, Dec 11, 2020 at 9:00 AM Steve Loughran
>> >> > > <ste...@cloudera.com.invalid> wrote:
>> >> > >
>> >> > > > Silence from the HDFS team
>> >> > > >
>> >> > > >
>> >> > > > Hadoop 3.2.2 is in an RC; it has the new FS API call. I really
>> >> > > > don't want to veto the release just because someone pulled up a
>> >> > > > method without doing the due diligence.
>> >> >
>> >>
>> >> Thanks Steve for starting this discussion. I agree to roll back
>> >> HDFS-15567 if there are still incompatibility issues that are not
>> >> completely resolved. The release should not be a blocker here; I am
>> >> willing to prepare another RC if we reach a common agreement.
>> >> To be honest, I think it is better to involve Shvachko here.
>> >>
>> >>
>> >> > > > Is anyone in the HDFS team going to do that due diligence, or
>> >> > > > should we include something in the release notes: "msync()" must
>> >> > > > be considered unstable?
>> >> > > >
>> >> > > > Then we can do a proper msync().
>> >> > > >
>> >> > > > If it goes into FS/FC, what does it do for a viewfs with >1
>> >> > > > mounted HDFS? Should it take path, msync(path) so that viewFS
>> >> > > > knows where to forward it?
>> >> > > >
>> >> > > > Alternatively: go with an MSync interface which those few FS
>> >> > > > which implement it (hdfs) can do that, and the fact that it
>> >> > > > doesn't have doc or tests won't be a blocker any more?
>> >> > > >
>> >> > > > -steve
>> >> > > >
>> >> > > >
>> >> > > > On Thu, 10 Dec 2020 at 12:41, Steve Loughran
>> >> > > > <ste...@cloudera.com> wrote:
>> >> > > >
>> >> > > > >
>> >> > > > > Gosh, has it really been only since February since I last asked
>> >> > > > > the HDFS dev list to stop adding anything to
>> >> > > > > FileSystem/FileContext APIs without
>> >> > > > >
>> >> > > > > * mentioning this on the hadoop-common list.
>> >> > > > > * specifying what it does in filesystem.md
>> >> > > > > * with a contract test
>> >> > > > > * a new hasPathCapabilities probe. Throwing
>> >> > > > > UnsupportedOperationException only lets people work out if it
>> >> > > > > is unsupported through invocation. Being able to probe for it
>> >> > > > > is better.
>> >> > > > > * ViewFS support.
>> >> > > > > * And, for any new API, one which works well for high-latency
>> >> > > > > object stores: returning Future<Something> and
>> >> > > > > Future<RemoteIterator<Something>> when > 1 result is returned
>> >> > > > >
>> >> > > > > This needs to hold even for pulling something up from HDFS.
>> >> > > > > Because if another FS wants to implement it, they need to know
>> >> > > > > what it does, and have tests to verify this. I say this as
>> >> > > > > someone who has tried to document HDFS rename() semantics and
>> >> > > > > gave up.
>> >> > > > >
>> >> > > > > It's really frustrating that every time someone does an FS API
>> >> > > > > change like this in the past (most recently HDFS-13616) I am
>> >> > > > > the one who has to keep sending the reminders out, and then
>> >> > > > > having to try and clean up.
>> >> > > > >
>> >> > > > > So what now?
>> >> > > > >
>> >> > > > > Options
>> >> > > > > 1. I roll HDFS-15567 back "please follow process"
>> >> > > > > 2. Someone does a followup patch with specification and
>> >> > > > > contract test, view FS. Add even more to the java
>> >> > > > > 3. We do as per HADOOP-16898 into an MSyncable interface and
>> >> > > > > then FileSystem & HDFS can implement. ViewFS and filterFS still
>> >> > > > > need to pass through.
>> >> > > > >
>> >> > > > > *If nobody is going to volunteer for the specification/test
>> >> > > > > changes, I'm happy for the rollback. It'll remind people about
>> >> > > > > process.*
>> >> > > > >
>> >> > > > > Pre-emptive Warning: No matter what we do for this patch, I
>> >> > > > > will roll back the next change which adds a new API if it's not
>> >> > > > > accompanied by specification and tests.
>> >> > > > >
>> >> > > > > Unhappily yours,
>> >> > > > >
>> >> > > > > Steve