Hi all,

The feature has been merged to the master branch.

Kudos to all contributors:

- Anuj Sharma <[email protected]>
- Kevin Geiszler <[email protected]>
- Shanmukha Haripriya Kota <[email protected]>
- Abhishek Kothalikar <[email protected]>

Huge thanks to the reviewers:

- Charles Connell <[email protected]>
- Tak Lon (Stephen) Wu <[email protected]>

We will continue the work by preparing patches for the documentation and
integration tests next week.

Best regards,
Andor

> On May 19, 2026, at 19:54, Andor Molnár <[email protected]> wrote:
>
> Hi HBase team,
>
> Just a quick heads-up for the community.
>
> The feature merge PR is all approved now. We’re working on fixing the CI to
> get a green build, and once it’s done, the PR is ready to be merged.
>
> Last chance to share your thoughts and review the code changes.
>
> Thanks to everybody who contributed for the tremendous help.
>
> Regards,
> Andor
>
>
>> On Apr 8, 2026, at 10:30, Andor Molnár <[email protected]> wrote:
>>
>> Hi all,
>>
>> We would like to propose merging the feature “Read Replica Cluster” into
>> the main branch.
>>
>> *Background*
>>
>> We’d like to implement the open source version of Amazon’s Read Replica
>> Cluster on S3 feature [1] for Apache HBase. It adds the ability to run
>> another HBase cluster on the same cloud storage location in read-only mode,
>> allowing users to share the read workload between multiple clusters. Due
>> to the characteristics of the implementation and the lack of automated
>> synchronization between the active and read-replica clusters, read replicas
>> are eventually consistent, hence they’re not suitable for reading the most
>> recent data. However, we still believe that users of open source Apache HBase
>> could take advantage of this feature and that there are use cases out there
>> which read replicas could help with. Please find more information about the
>> feature in the linked blog post.
>>
>> *Pros*
>>
>> - Running multiple clusters in different Availability Zones adds HA to the
>>   entire workload,
>> - No need for data movement or duplication (active-active replication setup),
>>   which is cost and time efficient,
>> - No limit on the number of read replica clusters
>>
>> *Cons*
>>
>> - Read Replica clusters are eventually consistent: in-memory data is not
>>   visible from read replicas,
>> - Read Replica clusters must be manually refreshed: flush on the active
>>   cluster, then refresh hfiles/meta on read replicas
>>
>> A detailed description of the design and implementation can be found in the
>> following document:
>>
>> Apache HBase Read Replica Cluster Feature [2]
>>
>> Please review and share your feedback or comments on the pull request [3].
>>
>> Best regards,
>> Andor Molnar
>>
>>
>> [1] https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>> [2] https://docs.google.com/document/d/1EI0lsURX1BZhv3DYgMvZCl4EUy-ADJRkHUc1PjzZtj0/edit?usp=sharing
>> [3] https://github.com/apache/hbase/pull/8044
