I think here we just want this to be backported to 2.x, not 2.5.x. So thanks Andrew for the quick action.
+1 on merging HBASE-26067 to master and backporting to branch-2 (2.6.0).

Thanks.

On Thu, Dec 9, 2021 at 8:45 AM Andrew Purtell <[email protected]> wrote:

> I concur with Nick, but let me help here by branching 2.5 today. It was
> always going to be somewhat arbitrary a point.
>
> On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk <[email protected]> wrote:
>
> > Based solely on the comments made to this thread, I would recommend
> > against a merge to branch-2, given that we are very close to 2.5. The
> > points about existing gaps seem like things we're not ready to publish
> > in the impending minor release. Once we have a branch-2.5, this
> > particular concern of mine will be alleviated.
> >
> > Thanks,
> > Nick
> >
> > On Wed, Dec 8, 2021 at 1:37 PM Josh Elser <[email protected]> wrote:
> >
> > > I was going to wait for some other folks to chime in, but I guess I
> > > can be the next one :)
> > >
> > > Duo, Wellington, and Szabolcs have been doing some excellent work on
> > > the storefile tracking (SFT) to a degree that I never expected to
> > > see. I remember some of the original "Filesystem re-do" issues on
> > > Jira. The idea was exceptional, but the result seemed unreachable.
> > >
> > > These devs, building on the success of what Zach/Stephen first
> > > talked about in HBASE-24749, came up with what I think is an
> > > excellent step forward. I've yet to break it via my own testing, but
> > > do acknowledge that there's always more work to be done.
> > >
> > > I think this is at a reasonable place to merge this back into the
> > > "mainline" branches from the feature branch (HBASE-26067). I believe
> > > this is ready because:
> > >
> > > 1. The feature is completely opt-in (HBase works the same way by
> > >    default)
> > > 2. There is an API to migrate tables into the new SFT implementation
> > > 3. There is also an API to migrate tables back to the default
> > >    implementation
> > >
> > > Some gaps still exist around bulk loading, documentation, snapshots,
> > > and recovery tooling, but these are being worked on. In the context
> > > of S3, this makes a significantly more compelling offering of HBase
> > > by removing the complexity of HBOSS. For HBase in all installations,
> > > I think SFT makes for a significantly more "deterministic" way of
> > > managing regions/files.
> > >
> > > +1 from me to merge HBASE-26067 into master and branch-2
> > >
> > > - Josh
> > >
> > > On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
> > > > Hello everyone,
> > > >
> > > > We have been making progress on the alternative way of tracking
> > > > store files originally proposed by Duo in HBASE-26067.
> > > >
> > > > To briefly summarize it for those not following it, this feature
> > > > introduces an abstraction layer to track store files still
> > > > used/needed by store engines, allowing for plugging different
> > > > approaches of identifying store files required by the given
> > > > store. The design doc describing it in more detail is available
> > > > here
> > > > <https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>.
> > > >
> > > > Our main goal within this feature is to avoid the need for using
> > > > temp files and renames when creating new hfiles (whenever
> > > > flushing, compacting, splitting/merging or snapshotting). This is
> > > > made possible by the pluggable tracker implementation labeled
> > > > "FILE". The current behavior using temp dirs and renames would
> > > > still be the default approach (labeled "DEFAULT").
> > > >
> > > > This "renameless" approach is appealing for deployments using
> > > > Amazon S3 Object store file system, where the lack of atomic
> > > > rename operations imposed the necessity of an additional layer of
> > > > locking (HBOSS), which, combined with the s3a rename operation,
> > > > can have a performance overhead.
> > > >
> > > > Some test runs on my employer's infrastructure have shown
> > > > promising results. A pure insertion YCSB run has shown ~6%
> > > > performance gain on the client writes. A snapshot clone of a
> > > > table with hundreds of regions completes in half of the time.
> > > > There are also improvements in compaction, split and merge times.
> > > >
> > > > Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we
> > > > feel optimistic that the current implementation is in a good
> > > > state to get merged into master branch, but it would be nice to
> > > > hear other opinions about it, before we effectively commit it.
> > > > Looking forward to hearing some thoughts/concerns you might have.
> > > >
> > > > Kind regards,
> > > > Wellington.
> > > >
> > >
> >
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
