Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

Josh Elser Wed, 08 Dec 2021 13:37:24 -0800

I was going to wait for some other folks to chime in, but I guess I can be the next one :)

Duo, Wellington, and Szabolcs have been doing some excellent work on the storefile tracking (SFT) to a degree that I never expected to see. I remember some of the original "Filesystem re-do" issues on Jira. The idea was exceptional, but the result seemed unreachable.

These devs, building on the success of what Zach/Stephen first talked about in HBASE-24749, came up with what I think is an excellent step forward. I've yet to break it via my own testing, but do acknowledge that there's always more work to be done.

I think this is at a reasonable place to merge back into the "mainline" branches from the feature branch (HBASE-26067). I believe this is ready because:


1. The feature is completely opt-in (HBase works the same way by default)
2. There is an API to migrate tables into the new SFT implementation
3. There is also an API to migrate tables back to the default implementation

Some gaps still exist around bulk loading, documentation, snapshots, and recovery tooling, but these are being worked on. In the context of S3, this makes a significantly more compelling offering of HBase by removing the complexity of HBOSS. For HBase in all installations, I think SFT makes for a significantly more "deterministic" way of managing regions/files.


+1 from me to merge HBASE-26067 into master and branch-2

- Josh

On 12/7/21 10:31 AM, Wellington Chevreuil wrote:

Hello everyone,

We have been making progress on the alternative way of tracking store files
originally proposed by Duo in HBASE-26067.

To briefly summarize it for those not following it, this feature introduces
an abstraction layer to track store files still used/needed by store
engines, allowing for plugging different approaches of identifying store
files required by the given store. The design doc describing it in more
detail is available here
<https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>
.

Our main goal within this feature is to avoid the need for using temp files
and renames when creating new hfiles (whenever flushing, compacting,
splitting/merging or snapshotting). This is made possible by the pluggable
tracker implementation labeled "FILE". The current behavior using temp dirs
and renames would still be the default approach (labeled "DEFAULT").
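
To illustrate the difference between the two labels, here is a minimal
conceptual sketch in Python. It is not HBase code: the class names, the
".tmp" directory, and the ".filelist" manifest are hypothetical stand-ins,
and how the real tracker persists its file list safely is out of scope.
The point is only that the "DEFAULT" style makes a file live through a
rename, while the "FILE" style writes the file in place and makes it live
by recording it in a tracked list.

```python
import os
import tempfile


class DefaultTracker:
    """Temp-dir-and-rename style: a file is live once renamed into the store dir."""

    def __init__(self, store_dir):
        self.store_dir = store_dir
        self.tmp_dir = os.path.join(store_dir, ".tmp")
        os.makedirs(self.tmp_dir, exist_ok=True)

    def commit(self, name, data):
        tmp_path = os.path.join(self.tmp_dir, name)
        with open(tmp_path, "wb") as f:
            f.write(data)
        # This step depends on the file system offering a cheap, atomic rename.
        os.rename(tmp_path, os.path.join(self.store_dir, name))

    def load(self):
        # Live files are whatever regular files sit in the store directory.
        return sorted(
            n for n in os.listdir(self.store_dir)
            if os.path.isfile(os.path.join(self.store_dir, n))
        )


class FileTracker:
    """Renameless style: write in place, then record the name in a manifest."""

    MANIFEST = ".filelist"  # hypothetical name, for illustration only

    def __init__(self, store_dir):
        self.store_dir = store_dir
        self.manifest_path = os.path.join(store_dir, self.MANIFEST)

    def commit(self, name, data):
        # Write directly to the final location; no temp dir, no rename.
        with open(os.path.join(self.store_dir, name), "wb") as f:
            f.write(data)
        tracked = self.load()
        tracked.append(name)
        # Simplified: a real implementation must update the list safely.
        with open(self.manifest_path, "w") as f:
            f.write("\n".join(sorted(tracked)))

    def load(self):
        # Only files named in the manifest are live; anything else on disk
        # (for example, output of a failed flush) is ignored.
        if not os.path.exists(self.manifest_path):
            return []
        with open(self.manifest_path) as f:
            return [line for line in f.read().splitlines() if line]
```

In this sketch, a file written into the store directory but never recorded
in the manifest is simply not returned by FileTracker.load(), which is what
lets the renameless approach skip the temp directory.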

This "renameless" approach is appealing for deployments using the Amazon S3
object store as a file system, where the lack of atomic rename operations
imposed the need for an additional layer of locking (HBOSS), which,
combined with the s3a rename operation, can have a performance overhead.

Some test runs on my employer's infrastructure have shown promising results.
A pure insertion ycsb run has shown a ~6% performance gain on the client
writes. A snapshot clone of a table with hundreds of regions completes in
half the time. There are also improvements in compaction, split, and merge
times.

Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
optimistic that the current implementation is in a good state to get merged
into the master branch, but it would be nice to hear other opinions about it
before we actually commit it. Looking forward to hearing any
thoughts/concerns you might have.

Kind regards,
Wellington.
