Hey Xiaoqiao,

HDFS-14272 was committed to all branches up to 2.10. The jira versions were
not updated properly.
I'll ping Chen for an update. He committed it in May.

Stay safe,
--Konstantin

On Tue, Dec 15, 2020 at 1:21 AM Xiaoqiao He <hexiaoq...@apache.org> wrote:

> Hi All,
>
>
>> I'm just curious why this is included in the 3.2.2 release? HDFS-15567 is
>> tagged with 3.2.3 and the corresponding HDFS-14272 on server side is tagged
>> with 3.3.0.
>
>
> I have checked the fix version tags and found 8 issues which either did not
> correctly include branch-3.2.2 or were missing both branch-3.2.2 and
> branch-3.2.3. I have updated them manually. Please have a look. Thanks.
> HADOOP-15691
> HDFS-15464
> HDFS-15478
> HDFS-15567
> HDFS-15574
> HDFS-15583
> HDFS-15628
> YARN-10430
>
> Regards,
> - He Xiaoqiao
>
> On Mon, Dec 14, 2020 at 5:08 AM Konstantin Shvachko <shv.had...@gmail.com>
> wrote:
>
>> Hi Steve,
>>
>> I am not sure I fully understand what is broken here. It is not an
>> incompatible change, right?
>> Could you please explain what you think the process is.
>> Would be best if you could share a link to a document describing it.
>> I would be glad to follow up with tests and documentation that are needed.
>>
>> As you can see I proposed multiple solutions to the problem in the jira.
>> Seemed nobody was objecting, so I chose one and explained why.
>> I believe we call it lazy consensus.
>>
>> Stay safe,
>> --Konstantin
>>
>> On Sun, Dec 13, 2020 at 10:22 AM Chao Sun <sunc...@apache.org> wrote:
>>
>>> > This is an API where it'd be ok to have a no-op if not implemented,
>>> > correct? Or is there a requirement like Syncable that specific
>>> > guarantees are met?
>>>
>>> Yes I think it's ok to leave it as no-op for other non-HDFS FS impls: it
>>> is only used by HDFS standby reads so far.
>>>
>>>
>>>
>>> On Sun, Dec 13, 2020 at 4:58 AM Steve Loughran <ste...@cloudera.com>
>>> wrote:
>>>
>>> > This isn't worth holding up the RC. We'd just add something to the
>>> > release notes "use with caution". And if we can get what the API does
>>> > defined in a way which works, it shouldn't need changing.
>>> >
>>> > (which reminds me, I do need to check that RC out, don't I?)
>>> >
>>> > On Sun, 13 Dec 2020 at 09:00, Xiaoqiao He <hexiaoq...@apache.org>
>>> > wrote:
>>> >
>>> >> Thanks Steve very much for your discussion here.
>>> >>
>>> >> I left some comments inline. I will follow this thread and wait for the
>>> >> final conclusion before deciding whether we should prepare another
>>> >> release candidate of 3.2.2.
>>> >> Thanks Steve and Chao again for your warm discussions.
>>> >>
>>> >> On Sat, Dec 12, 2020 at 7:18 PM Steve Loughran
>>> >> <ste...@cloudera.com.invalid>
>>> >> wrote:
>>> >>
>>> >> > Maybe it's not in the release; it's certainly in the 3.2 branch. Will
>>> >> > check further. If it's in the release I was thinking of adding a
>>> >> > warning in the notes "unstable API"; stable if invoked from DFSClient
>>> >>
>>> >> On Fri, 11 Dec 2020 at 18:21, Chao Sun <sunc...@apache.org> wrote:
>>> >> >
>>> >> > > I'm just curious why this is included in the 3.2.2 release?
>>> >> > > HDFS-15567 is tagged with 3.2.3 and the corresponding HDFS-14272 on
>>> >> > > server side is tagged with 3.3.0.
>>> >> >
>>> >>
>>> >> I just checked: HDFS-15567 is included in Hadoop 3.2.2 RC4. IIRC, I cut
>>> >> branch-3.2.2 in early October. At that time branch-3.2.3 had been
>>> >> created, but the source code was not completely frozen because several
>>> >> blocking issues had been reported; the code freeze happened around mid
>>> >> October. Some issues tagged with 3.2.3 were also included in 3.2.2
>>> >> during that period, including HDFS-15567. I will check them later and
>>> >> make sure that we have marked the correct tags.
>>> >>
>>> >>
>>> >> > >
>>> >> > > > If it goes into FS/FC, what does it do for a viewfs with >1 mounted
>>> >> > > > HDFS? Should it take path, msync(path) so that viewFS knows where to
>>> >> > > > forward it?
>>> >> > >
>>> >> > > The API shouldn't take any path - for viewFS I think it should call
>>> >> > > this on all the child file systems. It might also need to handle the
>>> >> > > case where some downstream clusters support this capability while
>>> >> > > others don't.
>>> >> > >
>>> >> >
>>> >> > That's an extra bit of work for ViewFS then. It should probe for
>>> >> > capability and invoke as/when supported.
>>> >> >
>>> >> > >
>>> >> > > > Options
>>> >> > > > 1. I roll HDFS-15567 back "please follow process"
>>> >> > > > 2. Someone does a followup patch with specification and contract
>>> >> > > > test, view FS. Add even more to the java
>>> >> > > > 3. We do as per HADOOP-16898 into an MSyncable interface and then
>>> >> > > > FileSystem & HDFS can implement. ViewFS and filterFS still need to
>>> >> > > > pass through.
>>> >> > >
>>> >> > > I'm slightly in favor of the hasPathCapabilities approach and make
>>> >> > > this a mixin where FS impls can optionally support. Happy to hear
>>> >> > > what others think.
>>> >> > >
>>> >> >
>>> >> > Mixins are great when FC and FS can both implement; makes it easier to
>>> >> > code against either. All the filtering/aggregating FS's will have to
>>> >> > implement it, which means that presence of the interface doesn't
>>> >> > guarantee support.
>>> >> >
>>> >> > This is an API where it'd be ok to have a no-op if not implemented,
>>> >> > correct? Or is there a requirement like Syncable that specific
>>> >> > guarantees are met?
>>> >> >
>>> >> > >
>>> >> > > Chao
>>> >> > >
>>> >> > >
>>> >> > > On Fri, Dec 11, 2020 at 9:00 AM Steve Loughran
>>> >> > > <ste...@cloudera.com.invalid> wrote:
>>> >> > >
>>> >> > > > Silence from the HDFS team
>>> >> > > >
>>> >> > > >
>>> >> > > > Hadoop 3.2.2 is in an RC; it has the new FS API call. I really don't
>>> >> > > > want to veto the release just because someone pulled up a method
>>> >> > > > without doing the due diligence.
>>> >> >
>>> >>
>>> >> Thanks Steve for starting this discussion. I agree to roll back
>>> >> HDFS-15567 if there are still incompatibility issues that are not
>>> >> completely resolved. The release should not be a blocker here; I would
>>> >> be glad to prepare another RC if we reach agreement. To be honest, I
>>> >> think it is better to involve Shvachko here.
>>> >>
>>> >>
>>> >> > > > Is anyone in the HDFS going to do that due diligence or should we
>>> >> > > > include something in the release notes "msync()" must be considered
>>> >> > > > unstable.
>>> >> > > >
>>> >> > > > Then we can do a proper msync().
>>> >> > > >
>>> >> > > > If it goes into FS/FC, what does it do for a viewfs with >1 mounted
>>> >> > > > HDFS? Should it take path, msync(path) so that viewFS knows where to
>>> >> > > > forward it?
>>> >> > > >
>>> >> > > > Alternatively: go with an MSync interface which those few FS which
>>> >> > > > implement it (hdfs) can do that, and the fact that it doesn't have
>>> >> > > > doc or tests won't be a blocker any more?
>>> >> > > >
>>> >> > > > -steve
>>> >> > > >
>>> >> > > >
>>> >> > > >
>>> >> > > >
>>> >> > > > On Thu, 10 Dec 2020 at 12:41, Steve Loughran <ste...@cloudera.com>
>>> >> > > > wrote:
>>> >> > > >
>>> >> > > > >
>>> >> > > > > Gosh, has it really been only since February since I last asked
>>> >> > > > > the HDFS dev list to stop adding anything to FileSystem/FileContext
>>> >> > > > > APIs without
>>> >> > > > >
>>> >> > > > > * mentioning this on the hadoop-common list.
>>> >> > > > > * specifying what it does in filesystem.md
>>> >> > > > > * with a contract test
>>> >> > > > > * a new hasPathCapabilities probe. Throwing
>>> >> > > > > UnsupportedOperationException only lets people work out if it is
>>> >> > > > > unsupported through invocation. Being able to probe for it is
>>> >> > > > > better.
>>> >> > > > > * ViewFS support.
>>> >> > > > > * And, for any new API, one which works well for high-latency
>>> >> > > > > object stores: returning Future<Something> and
>>> >> > > > > Future<RemoteIterator<Something>> when > 1 result is returned
>>> >> > > > > This needs to hold even for pulling something up from HDFS. Because
>>> >> > > > > if another FS wants to implement it, they need to know what it does,
>>> >> > > > > and have tests to verify this. I say this as someone who has tried
>>> >> > > > > to document HDFS rename() semantics and gave up.
>>> >> > > > >
>>> >> > > > > It's really frustrating that every time someone does an FS API
>>> >> > > > > change like this in the past (most recently HDFS-13616) I am the one
>>> >> > > > > who has to keep sending the reminders out, and then having to try
>>> >> > > > > and clean up.
>>> >> > > > >
>>> >> > > > > So what now?
>>> >> > > > >
>>> >> > > > > Options
>>> >> > > > > 1. I roll HDFS-15567 back "please follow process"
>>> >> > > > > 2. Someone does a followup patch with specification and contract
>>> >> > > > > test, view FS. Add even more to the java
>>> >> > > > > 3. We do as per HADOOP-16898 into an MSyncable interface and then
>>> >> > > > > FileSystem & HDFS can implement. ViewFS and filterFS still need to
>>> >> > > > > pass through.
>>> >> > > > >
>>> >> > > > > *If nobody is going to volunteer for the specification/test changes,
>>> >> > > > > I'm happy for the rollback. It'll remind people about process, *
>>> >> > > > >
>>> >> > > > > Pre-emptive Warning: No matter what we do for this patch, I will
>>> >> > > > > roll back the next change which adds a new API if it's not
>>> >> > > > > accompanied by specification and tests.
>>> >> > > > >
>>> >> > > > > Unhappily yours,
>>> >> > > > >
>>> >> > > > > Steve
>>> >> > > > >
>>> >> > > >
>>> >> > >
>>> >> >
>>> >>
>>> >
>>>
>>
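
For readers following the ViewFS part of the thread: the "probe for capability
and invoke as/when supported" idea, applied to "call this on all the child file
systems", might look roughly like the sketch below. This is only an
illustration under stated assumptions. SketchFs, SyncingFs, PlainFs, CAP_MSYNC
and msyncAll are hypothetical names invented here; none of them are Hadoop
classes or constants, and the boolean probe merely stands in for the
hasPathCapabilities-style check discussed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MsyncSketch {

    // Hypothetical capability name, used only inside this sketch.
    public static final String CAP_MSYNC = "sketch.capability.msync";

    /** Hypothetical stand-in for a filesystem; NOT the Hadoop FileSystem class. */
    public interface SketchFs {
        String name();
        boolean hasCapability(String capability);
        void msync();
    }

    /** A child filesystem that supports msync and counts how often it is called. */
    public static final class SyncingFs implements SketchFs {
        private final String name;
        private int msyncCalls = 0;
        public SyncingFs(String name) { this.name = name; }
        public String name() { return name; }
        public boolean hasCapability(String capability) { return CAP_MSYNC.equals(capability); }
        public void msync() { msyncCalls++; }
        public int msyncCalls() { return msyncCalls; }
    }

    /** A child filesystem without msync: the probe says no, and invoking it throws. */
    public static final class PlainFs implements SketchFs {
        private final String name;
        public PlainFs(String name) { this.name = name; }
        public String name() { return name; }
        public boolean hasCapability(String capability) { return false; }
        public void msync() {
            throw new UnsupportedOperationException(name + " does not support msync");
        }
    }

    /**
     * Probe every child and call msync only where the capability is declared.
     * Returns the names of the children that were actually synced.
     */
    public static List<String> msyncAll(List<? extends SketchFs> children) {
        List<String> synced = new ArrayList<>();
        for (SketchFs fs : children) {
            if (fs.hasCapability(CAP_MSYNC)) {
                fs.msync();
                synced.add(fs.name());
            }
        }
        return synced;
    }

    public static void main(String[] args) {
        List<SketchFs> mounts = Arrays.<SketchFs>asList(
                new SyncingFs("hdfs-a"), new PlainFs("objectstore"), new SyncingFs("hdfs-b"));
        // Only the children that declare the capability are invoked.
        System.out.println(msyncAll(mounts));
    }
}
```

Probing first means a child without support is skipped rather than discovered
through an UnsupportedOperationException, which is the point made in the
original message about probes being better than finding out through invocation.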
