Hey Xiaoqiao,

HDFS-14272 was committed to all branches up to 2.10. The jira versions were
not updated properly. I'll ping Chen for an update. He committed it in May.

Stay safe,
--Konstantin

On Tue, Dec 15, 2020 at 1:21 AM Xiaoqiao He <hexiaoq...@apache.org> wrote:

> Hi All,
>
> >> I'm just curious why this is included in the 3.2.2 release? HDFS-15567 is
> >> tagged with 3.2.3 and the corresponding HDFS-14272 on server side is tagged
> >> with 3.3.0.
>
> Have checked the fix version tag, I found there are 8 issues which do not
> include branch-3.2.2 correctly or both branch-3.2.2 and branch-3.2.3
> missed. And have updated them manually. Please have a look. Thanks.
> HADOOP-15691
> HDFS-15464
> HDFS-15478
> HDFS-15567
> HDFS-15574
> HDFS-15583
> HDFS-15628
> YARN-10430
>
> Regards,
> - He Xiaoqiao
>
> On Mon, Dec 14, 2020 at 5:08 AM Konstantin Shvachko <shv.had...@gmail.com>
> wrote:
>
>> Hi Steve,
>>
>> I am not sure I fully understand what is broken here. It is not an
>> incompatible change, right?
>> Could you please explain what you think the process is.
>> Would be best if you could share a link to a document describing it.
>> I would be glad to follow up with tests and documentation that are needed.
>>
>> As you can see I proposed multiple solutions to the problem in the jira.
>> Seemed nobody was objecting, so I chose one and explained why.
>> I believe we call it lazy consensus.
>>
>> Stay safe,
>> --Konstantin
>>
>> On Sun, Dec 13, 2020 at 10:22 AM Chao Sun <sunc...@apache.org> wrote:
>>
>>> > This is an API where it'd be ok to have a no-op if not implemented,
>>> > correct? Or is there an requirement like Syncable that specific
>>> > guarantees are met?
>>>
>>> Yes I think it's ok to leave it as no-op for other non-HDFS FS impls:
>>> it is only used by HDFS standby reads so far.
>>>
>>> On Sun, Dec 13, 2020 at 4:58 AM Steve Loughran <ste...@cloudera.com>
>>> wrote:
>>>
>>> > This isn't worth holding up the RC. We'd just add something to the
>>> > release notes "use with caution". And if we can get what the API does
>>> > defined in a way which works, it shouldn't need changing.
>>> >
>>> > (which reminds me, I do need to check that RC out, don't I?)
>>> >
>>> > On Sun, 13 Dec 2020 at 09:00, Xiaoqiao He <hexiaoq...@apache.org>
>>> > wrote:
>>> >
>>> >> Thanks Steve very much for your discussion here.
>>> >>
>>> >> Leave some comments inline. Will focus on this thread to wait for the
>>> >> final conclusion to decide if we should prepare another release
>>> >> candidate of 3.2.2.
>>> >> Thanks Steve and Chao again for your warm discussions.
>>> >>
>>> >> On Sat, Dec 12, 2020 at 7:18 PM Steve Loughran
>>> >> <ste...@cloudera.com.invalid> wrote:
>>> >>
>>> >> > Maybe it's not in the release; it's certainly in the 3.2 branch.
>>> >> > Will check further. If it's in the release I was thinking of adding
>>> >> > a warning in the notes "unstable API"; stable if invoked from
>>> >> > DFSClient
>>> >>
>>> >> On Fri, 11 Dec 2020 at 18:21, Chao Sun <sunc...@apache.org> wrote:
>>> >> >
>>> >> > > I'm just curious why this is included in the 3.2.2 release?
>>> >> > > HDFS-15567 is tagged with 3.2.3 and the corresponding HDFS-14272
>>> >> > > on server side is tagged with 3.3.0.
>>> >> >
>>> >>
>>> >> Just checked that HDFS-15567 has been involved in Hadoop-3.2.2 RC4.
>>> >> IIRC, I have cut branch-3.2.2 in early October, at that time
>>> >> branch-3.2.3 has created but source code not freeze completely because
>>> >> several blocked issues reported and code freeze has done about mid
>>> >> October. Some issues which are tagged with 3.2.3 has also been involved
>>> >> in 3.2.2 during that period, include HDFS-15567. I will check them
>>> >> later, and make sure that we have mark the correct tags.
>>> >>
>>> >> > >
>>> >> > > > If it goes into FS/FC, what does it do for a viewfs with >1
>>> >> > > > mounted HDFS? Should it take path, msync(path) so that viewFS
>>> >> > > > knows where to forward it?
>>> >> > >
>>> >> > > The API shouldn't take any path - for viewFS I think it should
>>> >> > > call this on all the child file systems. It might also need to
>>> >> > > handle the case where some downstream clusters support this
>>> >> > > capability while others don't.
>>> >> > >
>>> >> >
>>> >> > That's an extra bit of work for ViewFS then. It should probe for
>>> >> > capability and invoke as/when supported.
>>> >> >
>>> >> > >
>>> >> > > > Options
>>> >> > > > 1. I roll HDFS-15567 back "please be follow process"
>>> >> > > > 2. Someone does a followup patch with specification and contract
>>> >> > > > test, view FS. Add even more to the java
>>> >> > > > 3. We do as per HADOOP-16898 into an MSyncable interface and then
>>> >> > > > FileSystem & HDFS can implement. ViewFS and filterFS still need
>>> >> > > > to pass through.
>>> >> > >
>>> >> > > I'm slightly in favor of the hasPathCapabilities approach and make
>>> >> > > this a mixin where FS impls can optionally support. Happy to hear
>>> >> > > what others think.
>>> >> > >
>>> >> >
>>> >> > Mixins are great when FC and FS can both implement; makes it easier
>>> >> > to code against either. All the filtering/aggregating FS's will have
>>> >> > to implement it, which means that presence of the interface doesn't
>>> >> > guarantee support.
>>> >> >
>>> >> > This is an API where it'd be ok to have a no-op if not implemented,
>>> >> > correct? Or is there an requirement like Syncable that specific
>>> >> > guarantees are met?
>>> >> >
>>> >> > >
>>> >> > > Chao
>>> >> > >
>>> >> > >
>>> >> > > On Fri, Dec 11, 2020 at 9:00 AM Steve Loughran
>>> >> > > <ste...@cloudera.com.invalid> wrote:
>>> >> > >
>>> >> > > > Silence from the HDFS team
>>> >> > > >
>>> >> > > >
>>> >> > > > Hadoop 3.2.2 is in an RC; it has the new FS API call. I really
>>> >> > > > don't want to veto the release just because someone pulled up a
>>> >> > > > method without doing the due diligence.
>>> >> >
>>> >>
>>> >> Thanks Steve started this discussion here. I agree to roll back
>>> >> HDFS-15567 if there are still some incompatible issues not resolved
>>> >> completely. And release will not be the blocked things here, I would
>>> >> like to prepare another RC if we would reach common agreement. To be
>>> >> honest, I think it is better to involve Shvachko here.
>>> >>
>>> >>
>>> >> > > > Is anyone in the HDFS going to do that due diligence or should
>>> >> > > > we include something in the release notes "msync()" must be
>>> >> > > > considered unstable.
>>> >> > > >
>>> >> > > > Then we can do a proper msync().
>>> >> > > >
>>> >> > > > If it goes into FS/FC, what does it do for a viewfs with >1
>>> >> > > > mounted HDFS? Should it take path, msync(path) so that viewFS
>>> >> > > > knows where to forward it?
>>> >> > > >
>>> >> > > > Alternatively: go with an MSync interface which those few FS
>>> >> > > > which implement it (hdfs) can do that, and the fact that it
>>> >> > > > doesn't have doc or tests won't be a blocker any more?
>>> >> > > >
>>> >> > > > -steve
>>> >> > > >
>>> >> > > >
>>> >> > > > On Thu, 10 Dec 2020 at 12:41, Steve Loughran <ste...@cloudera.com>
>>> >> > > > wrote:
>>> >> > > >
>>> >> > > > >
>>> >> > > > > Gosh, has it really been only since february since I last
>>> >> > > > > asked the HDFS dev list to stop adding anything to
>>> >> > > > > FileSystem/FileContext APIs without
>>> >> > > > >
>>> >> > > > > * mentioning this on the hadoop-common list.
>>> >> > > > > * specifying what it does in filesystem.md
>>> >> > > > > * with a contract test
>>> >> > > > > * a new hasPathCapabilities probe. Throwing
>>> >> > > > >   UnsupportedOperationException only lets people work out if
>>> >> > > > >   it is unsupported through invocation. Being able to probe
>>> >> > > > >   for it is better.
>>> >> > > > > * ViewFS support.
>>> >> > > > > * And, for any new API, one which works well for high-latency
>>> >> > > > >   object stores: returning Future<Something> and
>>> >> > > > >   Future<RemoteIterator<Something> when > 1 result is returned
>>> >> > > > >
>>> >> > > > > This needs to hold even for pulling something up from HDFS.
>>> >> > > > > Because if another FS wants to implement it, they need to know
>>> >> > > > > what it does, and have tests to verify this. I say this as
>>> >> > > > > someone who has tried to document HDFS rename() semantics and
>>> >> > > > > gave up.
>>> >> > > > >
>>> >> > > > > It's really frustrating that every time someone does an FS API
>>> >> > > > > change like this in the past (most recently HDFS-13616) I am
>>> >> > > > > the one who has to keep sending the reminders out, and then
>>> >> > > > > having to try and clean up/.
>>> >> > > > >
>>> >> > > > > So what now?
>>> >> > > > >
>>> >> > > > > Options
>>> >> > > > > 1. I roll HDFS-15567 back "please be follow process"
>>> >> > > > > 2. Someone does a followup patch with specification and
>>> >> > > > > contract test, view FS. Add even more to the java
>>> >> > > > > 3. We do as per HADOOP-16898 into an MSyncable interface and
>>> >> > > > > then FileSystem & HDFS can implement. ViewFS and filterFS
>>> >> > > > > still need to pass through.
>>> >> > > > >
>>> >> > > > > *If nobody is going to volunteer for the specification/test
>>> >> > > > > changes, I'm happy for the rollback. It'll remind people about
>>> >> > > > > process, *
>>> >> > > > >
>>> >> > > > > Pre-emptive Warning: No matter what we do for this patch, I
>>> >> > > > > will roll back the next change which adds a new API if it's
>>> >> > > > > not accompanied by specification and tests.
>>> >> > > > >
>>> >> > > > > Unhappily yours,
>>> >> > > > >
>>> >> > > > > Steve