> This is an API where it'd be ok to have a no-op if not implemented,
> correct? Or is there a requirement like Syncable that specific guarantees
> are met?

Yes, I think it's OK to leave it as a no-op for the non-HDFS FS impls: so
far it is only used by HDFS standby reads.
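
To make the idea concrete, here is a rough sketch of how a no-op default
could combine with a capability probe and a ViewFS-style fan-out to the child
file systems. Every name in it (SketchFs, supportsMsync, HdfsLikeFs,
ObjectStoreLikeFs, ViewLikeFs) is made up for illustration; these are not the
Hadoop FileSystem/FileContext classes, and the probe only loosely stands in
for hasPathCapabilities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MsyncSketch {

    /** Hypothetical stand-in for a filesystem; NOT the Hadoop FileSystem class. */
    interface SketchFs {
        /** Hypothetical capability probe: false unless an implementation opts in. */
        default boolean supportsMsync() {
            return false;
        }

        /** No-op by default: a store without standby reads has nothing to sync. */
        default void msync() {
        }
    }

    /** Pretend HDFS client: counts calls instead of talking to a NameNode. */
    static final class HdfsLikeFs implements SketchFs {
        int syncCount = 0;

        @Override
        public boolean supportsMsync() {
            return true;
        }

        @Override
        public void msync() {
            syncCount++;
        }
    }

    /** Pretend object store: inherits the no-op and reports no support. */
    static final class ObjectStoreLikeFs implements SketchFs {
    }

    /** Pretend ViewFS: msync() takes no path and is forwarded to every child. */
    static final class ViewLikeFs implements SketchFs {
        private final List<SketchFs> children;

        ViewLikeFs(List<SketchFs> children) {
            this.children = new ArrayList<>(children);
        }

        @Override
        public boolean supportsMsync() {
            // Supported if at least one mounted child supports it.
            return children.stream().anyMatch(SketchFs::supportsMsync);
        }

        @Override
        public void msync() {
            // Probe first, then invoke only where the capability is present.
            for (SketchFs child : children) {
                if (child.supportsMsync()) {
                    child.msync();
                }
            }
        }
    }

    public static void main(String[] args) {
        HdfsLikeFs a = new HdfsLikeFs();
        HdfsLikeFs b = new HdfsLikeFs();
        ViewLikeFs view = new ViewLikeFs(Arrays.asList(a, new ObjectStoreLikeFs(), b));
        view.msync();
        System.out.println("a=" + a.syncCount + " b=" + b.syncCount);
    }
}
```

With a no-op default the probe is what tells a caller whether msync()
actually did anything; whether the real API should be a no-op or should throw
is exactly the question being discussed in this thread.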



On Sun, Dec 13, 2020 at 4:58 AM Steve Loughran <ste...@cloudera.com> wrote:

> This isn't worth holding up the RC. We'd just add something to the
> release notes "use with caution". And if we can get what the API does
> defined in a way which works, it shouldn't need changing.
>
> (which reminds me, I do need to check that RC out, don't I?)
>
> On Sun, 13 Dec 2020 at 09:00, Xiaoqiao He <hexiaoq...@apache.org> wrote:
>
>> Thanks Steve very much for your discussion here.
>>
>> I've left some comments inline. I will follow this thread and wait for the
>> final conclusion before deciding whether we should prepare another release
>> candidate of 3.2.2.
>> Thanks Steve and Chao again for your warm discussions.
>>
>> On Sat, Dec 12, 2020 at 7:18 PM Steve Loughran
>> <ste...@cloudera.com.invalid>
>> wrote:
>>
>> > Maybe it's not in the release; it's certainly in the 3.2 branch. Will
>> > check further. If it's in the release I was thinking of adding a warning
>> > in the notes: "unstable API"; stable if invoked from DFSClient
>>
>> On Fri, 11 Dec 2020 at 18:21, Chao Sun <sunc...@apache.org> wrote:
>> >
>> > > I'm just curious why this is included in the 3.2.2 release?
>> > > HDFS-15567 is tagged with 3.2.3 and the corresponding HDFS-14272 on
>> > > server side is tagged with 3.3.0.
>> >
>>
>> Just checked: HDFS-15567 is included in Hadoop 3.2.2 RC4. IIRC, I cut
>> branch-3.2.2 in early October. At that time branch-3.2.3 had been created
>> but the source code was not completely frozen, because several blocking
>> issues had been reported; the code freeze was done around mid October. Some
>> issues tagged with 3.2.3 were also included in 3.2.2 during that period,
>> including HDFS-15567. I will check them later and make sure that we have
>> marked the correct tags.
>>
>>
>> > >
>> > > > If it goes into FS/FC, what does it do for a viewfs with >1 mounted
>> > HDFS?
>> > > Should it take path, msync(path) so that viewFS knows where to forward
>> > it?
>> > >
>> > > The API shouldn't take any path - for viewFS I think it should call
>> this
>> > on
>> > > all the child file systems. It might also need to handle the case
>> where
>> > > some downstream clusters support this capability while others don't.
>> > >
>> >
>> > That's an extra bit of work for ViewFS then. It should probe for
>> capability
>> > and invoke as/when supported.
>> >
>> > >
>> > > > Options
>> > > 1. I roll HDFS-15567 back "please follow process"
>> > > 2. Someone does a followup patch with specification and contract test,
>> > view
>> > > FS. Add even more to the java
>> > > 3. We do as per HADOOP-16898 into an MSyncable interface and then
>> > > FileSystem & HDFS can implement. ViewFS and filterFS still need to
>> pass
>> > > through.
>> > >
>> > > I'm slightly in favor of the hasPathCapabilities approach and make
>> this a
>> > > mixin where FS impls can optionally support. Happy to hear what others
>> > > think.
>> > >
>> >
>> > Mixins are great when FC and FS can both implement; makes it easier to
>> code
>> > against either. All the filtering/aggregating FS's will have to
>> implement
>> > it, which means that presence of the interface doesn't guarantee
>> support.
>> >
>> > This is an API where it'd be ok to have a no-op if not implemented,
>> > correct? Or is there a requirement like Syncable that specific
>> > guarantees are met?
>> >
>> > >
>> > > Chao
>> > >
>> > >
>> > > On Fri, Dec 11, 2020 at 9:00 AM Steve Loughran
>> > <ste...@cloudera.com.invalid
>> > > >
>> > > wrote:
>> > >
>> > > > Silence from the  HDFS team
>> > > >
>> > > >
>> > > > Hadoop 3.2.2 is in an RC; it has the new FS API call. I really don't
>> > want
>> > > > to veto the release just because someone pulled up a method without
>> > doing
>> > > > the due diligence.
>> >
>>
>> Thanks Steve for starting this discussion. I agree to roll back HDFS-15567
>> if there are still incompatibility issues that are not completely resolved.
>> The release should not be the blocker here; I am willing to prepare
>> another RC if we reach agreement. To be honest, I think it is
>> better to involve Shvachko here.
>>
>>
>> > > > Is anyone in the HDFS team going to do that due diligence, or should
>> > > > we include something in the release notes: "msync() must be
>> > > > considered unstable"?
>> > > >
>> > > > Then we can do a proper msync().
>> > > >
>> > > > If it goes into FS/FC, what does it do for a viewfs with >1 mounted
>> > HDFS?
>> > > > Should it take path, msync(path) so that viewFS knows where to
>> forward
>> > > it?
>> > > >
>> > > > Alternatively: go with an MSync interface which those few FS which
>> > > > implement it (hdfs) can do that, and the fact that it doesn't have
>> doc
>> > or
>> > > > tests won't be a blocker any more?
>> > > >
>> > > > -steve
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Thu, 10 Dec 2020 at 12:41, Steve Loughran <ste...@cloudera.com>
>> > > wrote:
>> > > >
>> > > > >
>> > > > > Gosh, has it really only been since February that I last asked the
>> > > > > HDFS dev list to stop adding anything to FileSystem/FileContext APIs
>> > > > > without
>> > > > >
>> > > > > * mentioning this on the hadoop-common list.
>> > > > > * specifying what it does in filesystem.md
>> > > > > * with a contract test
>> > > > > * a new hasPathCapabilities probe. Throwing
>> > > UnsupportedOperationException
>> > > > > only lets people work out if it is unsupported through invocation.
>> > > Being
>> > > > > able to probe for it is better.
>> > > > > * ViewFS support.
>> > > > > * And, for any new API, one which works well for high-latency object
>> > > > > stores: returning Future<Something> and
>> > > > > Future<RemoteIterator<Something>> when > 1 result is returned
>> > > > >
>> > > > > This needs to hold even for pulling something up from HDFS.
>> Because
>> > if
>> > > > > another FS wants to implement it, they need to know what it does,
>> and
>> > > > have
>> > > > > tests to verify this. I say this as someone who has tried to
>> document
>> > > > HDFS
>> > > > > rename() semantics and gave up.
>> > > > >
>> > > > > It's really frustrating that every time someone has made an FS API
>> > > > > change like this in the past (most recently HDFS-13616) I am the one
>> > > > > who has to keep sending the reminders out, and then has to try and
>> > > > > clean up.
>> > > > >
>> > > > > So what now?
>> > > > >
>> > > > > Options
>> > > > > 1. I roll HDFS-15567 back "please follow process"
>> > > > > 2. Someone does a followup patch with specification and contract
>> > test,
>> > > > > view FS. Add even more to the java
>> > > > > 3. We do as per HADOOP-16898 into an MSyncable interface and then
>> > > > > FileSystem & HDFS can implement. ViewFS and filterFS still need to
>> > pass
>> > > > > through.
>> > > > >
>> > > > > *If nobody is going to volunteer for the specification/test changes,
>> > > > > I'm happy for the rollback. It'll remind people about process.*
>> > > > >
>> > > > > Pre-emptive Warning: No matter what we do for this patch, I will
>> roll
>> > > > back
>> > > > > the next change which adds a new API if it's not accompanied by
>> > > > > specification and tests.
>> > > > >
>> > > > > Unhappily yours,
>> > > > >
>> > > > > Steve
>> > > > >
>> > > >
>> > >
>> >
>>
>
