Hi all,

The feature has been merged to the master branch.

Kudos to all contributors:

- Anuj Sharma <[email protected]>
- Kevin Geiszler <[email protected]>
- Shanmukha Haripriya Kota <[email protected]>
- Abhishek Kothalikar <[email protected]>

Huge thanks to the reviewers:

- Charles Connell <[email protected]>
- Tak Lon (Stephen) Wu <[email protected]>

We will continue the work by preparing patches for the documentation and 
integration tests next week.

Best regards,

Andor



> On May 19, 2026, at 19:54, Andor Molnár <[email protected]> wrote:
> 
> Hi HBase team,
> 
> Just a quick heads-up for the community.
> 
> The feature merge PR is now fully approved. We’re working on fixing the CI to
> get a green build; once that’s done, the PR will be ready to merge.
> 
> Last chance to share your thoughts and review the code changes.
> 
> Thanks to everybody who contributed for the tremendous help.
> 
> Regards,
> Andor
> 
> 
> 
>> On Apr 8, 2026, at 10:30, Andor Molnár <[email protected]> wrote:
>> 
>> Hi all,
>> 
>> We would like to propose merging the feature “Read Replica Cluster” into 
>> the main branch.
>> 
>> *Background*
>> 
>> We’d like to implement the open source version of Amazon’s Read Replica 
>> Cluster on S3 feature [1] for Apache HBase. It adds the ability to run 
>> another HBase cluster on the same cloud storage location in read-only mode, 
>> allowing users to share the read workload between multiple clusters. Due 
>> to the characteristics of the implementation and the lack of automated 
>> synchronization between the active and read-replica clusters, read replicas 
>> are eventually consistent, hence they’re not suitable for reading the most 
>> recent data. However, we still believe that users of open source Apache HBase 
>> could take advantage of this feature and that there are use cases that 
>> read replicas could help with. Please find more information about the 
>> feature in the linked blog post.
>> 
>> *Pros*
>> 
>> - Running multiple clusters in different Availability Zones adds HA to the 
>> entire workload,
>> - No need for data movement or duplication (as in an active-active 
>> replication setup), which saves cost and time,
>> - No limit on the number of read replica clusters
>> 
>> *Cons*
>> 
>> - Read Replica clusters are eventually consistent: in-memory data is not 
>> visible from read replicas,
>> - Read Replica clusters must be manually refreshed: flush on the active 
>> cluster, then refresh hfiles/meta on the read replicas
>> 
>> A detailed description of the design and implementation can be found in the
>> following document:
>> 
>> Apache HBase Read Replica Cluster Feature [2]
>> 
>> Please review and share your feedback or comments on the pull request [3].
>> 
>> Best regards,
>> Andor Molnar
>> 
>> 
>> [1] 
>> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>> [2] 
>> https://docs.google.com/document/d/1EI0lsURX1BZhv3DYgMvZCl4EUy-ADJRkHUc1PjzZtj0/edit?usp=sharing
>> [3] https://github.com/apache/hbase/pull/8044
>> 
>> 
>> 
> 
