I suggest we start a formal vote thread after we finish all the work :) And #3861 is not a blocker; I think we still have some concerns about how to get the metrics collected on the region server side over to the master side. We could do it after merging back the feature branch.
Thanks.

Josh Elser <[email protected]> wrote on Wed, Dec 15, 2021 at 06:15:

> Thanks for your input, Andrew and Nick!
>
> Big thank you to Duo for your hands-on-keyboard commitment as well for
> this whole feature.
>
> I am also happy to target 2.x (and not 2.5.x) for the backport.
>
> In the interest of getting rid of this feature branch (and the
> inevitable rebase pains the longer it runs parallel to master), I'd like
> to move ahead with a concrete plan to merge.
>
> 1. Given there was no objection, do folks feel the need for a VOTE? Even
> if one person would like a VOTE, I'm happy to start that. Please just
> say so.
>
> 2. We have three outstanding PRs for the sake of SFT which are all (IMO)
> very close to merging (#3851, #3861, and #3942). I think 3851 and 3942
> are easy to include and just need one more review cycle. If we feel like
> we are still far away on 3861, I think we set that aside and revisit it
> after the feature merge is done.
>
> If there are any other concerns, please shout!
>
> - Josh
>
> On 12/8/21 9:07 PM, Andrew Purtell wrote:
> > +1 for merging to branch-2 (2.6)
> >
> >> On Dec 8, 2021, at 6:04 PM, 张铎 <[email protected]> wrote:
> >>
> >> I think here we just want this to be backported to 2.x, not 2.5.x.
> >>
> >> So thanks Andrew for the quick action.
> >>
> >> +1 on merging HBASE-26067 to master and backporting to branch-2(2.6.0).
> >>
> >> Thanks.
> >>
> >> Andrew Purtell <[email protected]> wrote on Thu, Dec 9, 2021 at 08:45:
> >>
> >>> I concur with Nick, but let me help here by branching 2.5 today. It was
> >>> always going to be somewhat arbitrary a point.
> >>>
> >>>> On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk <[email protected]> wrote:
> >>>>
> >>>> Based solely on the comments made to this thread, I would recommend
> >>>> against a merge to branch-2, given that we are very close to 2.5. The
> >>>> points about existing gaps seem like things we're not ready to publish
> >>>> in the impending minor release.
> >>>> Once we have a branch-2.5, this particular concern of mine
> >>>> will be alleviated.
> >>>>
> >>>> Thanks,
> >>>> Nick
> >>>>
> >>>>> On Wed, Dec 8, 2021 at 1:37 PM Josh Elser <[email protected]> wrote:
> >>>>>
> >>>>> I was going to wait for some other folks to chime in, but I guess I can
> >>>>> be the next one :)
> >>>>>
> >>>>> Duo, Wellington, and Szabolcs have been doing some excellent work on the
> >>>>> storefile tracking (SFT) to a degree that I never expected to see. I
> >>>>> remember some of the original "Filesystem re-do" issues on Jira. The
> >>>>> idea was exceptional, but the result seemed unreachable.
> >>>>>
> >>>>> These devs, building on the success of what Zach/Stephen first talked
> >>>>> about in HBASE-24749, came up with what I think is an excellent step
> >>>>> forward. I've yet to break it via my own testing, but do acknowledge
> >>>>> that there's always more work to be done.
> >>>>>
> >>>>> I think this is at a reasonable place to merge this back into the
> >>>>> "mainline" branches from the feature branch (HBASE-26067). I believe
> >>>>> this is ready because:
> >>>>>
> >>>>> 1. The feature is completely opt-in (HBase works the same way by default)
> >>>>> 2. There is an API to migrate tables into the new SFT implementation
> >>>>> 3. There is also an API to migrate tables back to the default implementation
> >>>>>
> >>>>> Some gaps still exist around bulk loading, documentation, snapshots, and
> >>>>> recovery tooling, but these are being worked on. In the context of S3,
> >>>>> this makes a significantly more compelling offering of HBase by removing
> >>>>> the complexity of HBOSS. For HBase in all installations, I think SFT
> >>>>> makes for a significantly more "deterministic" way of managing
> >>>>> regions/files.
> >>>>>
> >>>>> +1 from me to merge HBASE-26067 into master and branch-2
> >>>>>
> >>>>> - Josh
> >>>>>
> >>>>> On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
> >>>>>> Hello everyone,
> >>>>>>
> >>>>>> We have been making progress on the alternative way of tracking store
> >>>>>> files originally proposed by Duo in HBASE-26067.
> >>>>>>
> >>>>>> To briefly summarize it for those not following it, this feature
> >>>>>> introduces an abstraction layer to track store files still used/needed
> >>>>>> by store engines, allowing for plugging different approaches of
> >>>>>> identifying store files required by the given store. The design doc
> >>>>>> describing it in more detail is available here
> >>>>>> <https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>.
> >>>>>>
> >>>>>> Our main goal within this feature is to avoid the need for using temp
> >>>>>> files and renames when creating new hfiles (whenever flushing,
> >>>>>> compacting, splitting/merging or snapshotting). This is made possible
> >>>>>> by the pluggable tracker implementation labeled "FILE". The current
> >>>>>> behavior using temp dirs and renames would still be the default
> >>>>>> approach (labeled "DEFAULT").
> >>>>>>
> >>>>>> This "renameless" approach is appealing for deployments using Amazon S3
> >>>>>> Object store file system, where the lack of atomic rename operations
> >>>>>> imposed the necessity of an additional layer of locking (HBOSS), which
> >>>>>> combined with the s3a rename operation can have a performance overhead.
> >>>>>>
> >>>>>> Some test runs on my employer's infrastructure have shown promising
> >>>>>> results. A pure insertion YCSB run has shown ~6% performance gain on
> >>>>>> the client writes. A snapshot clone of a table with hundreds of regions
> >>>>>> completes in half the time.
> >>>>>> There are also improvements in compaction, split, and merge times.
> >>>>>>
> >>>>>> Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
> >>>>>> optimistic that the current implementation is in a good state to get
> >>>>>> merged into master branch, but it would be nice to hear other opinions
> >>>>>> about it, before we effectively commit it. Looking forward to hearing
> >>>>>> some thoughts/concerns you might have.
> >>>>>>
> >>>>>> Kind regards,
> >>>>>> Wellington.
> >>>>>>
> >>>
> >>> --
> >>> Best regards,
> >>> Andrew
> >>>
> >>> Words like orphans lost among the crosstalk, meaning torn from truth's
> >>> decrepit hands
> >>> - A23, Crosstalk
