[
https://issues.apache.org/jira/browse/HBASE-29081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18083170#comment-18083170
]
Hudson commented on HBASE-29081:
--------------------------------
Results for branch master
[build #1454 on
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/1454/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):
(x) {color:red}-1 general checks{color}
-- For more information [see general
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/1454/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk17 hadoop3 checks{color}
-- For more information [see jdk17
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/1454/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]
> Add HBase Read Replica Cluster feature
> --------------------------------------
>
> Key: HBASE-29081
> URL: https://issues.apache.org/jira/browse/HBASE-29081
> Project: HBase
> Issue Type: Umbrella
> Components: Replication
> Reporter: Andor Molnar
> Assignee: Andor Molnar
> Priority: Major
> Labels: pull-request-available
>
> h1. Objective
> We’d like to implement an open source version of Amazon’s [Read Replica
> Cluster on
> S3|https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/]
> feature for Apache HBase. It adds the ability to run another HBase
> cluster on the same cloud storage location in read-only mode, allowing users
> to share the read workload between multiple clusters. Due to the
> characteristics of the implementation and the lack of automated
> synchronization between the active and read-replica clusters, read replicas
> are eventually consistent, so they are not suitable for reading the most
> recent data. However, we still believe that users of open source Apache HBase
> could take advantage of this feature, and that there are use cases which read
> replicas could help with. Please find more information about the feature in
> the linked blog post.
> h1. Pros
> * Running multiple clusters in different Availability Zones adds HA to the
> entire workload,
> * No need for data movement or duplication (as required by an active-active
> replication setup), which is cost- and time-efficient,
> * No limit on the number of read replica clusters
> h1. Cons
> * Read Replica clusters are eventually consistent: in-memory data on the
> active cluster is not visible from read replicas,
> * Read Replica clusters must be manually refreshed: flush on the active
> cluster, then refresh HFiles/meta on the read replicas
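
The flush-then-refresh cycle listed under Cons can be illustrated with a toy in-memory model. This is only a sketch of the consistency behaviour described above, not HBase code: `SharedStorage`, `ActiveCluster`, `ReadReplica` and their methods are hypothetical names made up for this illustration, and a plain Python list stands in for the shared cloud storage location.

```python
class SharedStorage:
    """Hypothetical stand-in for the shared cloud storage location."""

    def __init__(self):
        self.hfiles = []  # each flushed file is modelled as a dict


class ActiveCluster:
    """Toy read-write cluster: writes land in memory until flushed."""

    def __init__(self, storage):
        self.storage = storage
        self.memstore = {}

    def put(self, key, value):
        self.memstore[key] = value

    def flush(self):
        # Persist in-memory data as a new file on shared storage.
        if self.memstore:
            self.storage.hfiles.append(dict(self.memstore))
            self.memstore = {}

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.storage.hfiles):
            if key in hfile:
                return hfile[key]
        return None


class ReadReplica:
    """Toy read-only cluster: sees only the files known at last refresh."""

    def __init__(self, storage):
        self.storage = storage
        self.known_hfiles = []

    def refresh(self):
        # Manual step: re-read the file listing from shared storage.
        self.known_hfiles = list(self.storage.hfiles)

    def get(self, key):
        for hfile in reversed(self.known_hfiles):
            if key in hfile:
                return hfile[key]
        return None


storage = SharedStorage()
active = ActiveCluster(storage)
replica = ReadReplica(storage)

active.put("row1", "v1")
before_flush = replica.get("row1")    # in-memory data is not visible

active.flush()
before_refresh = replica.get("row1")  # flushed, but replica not refreshed

replica.refresh()
after_refresh = replica.get("row1")   # visible only after the refresh
```

In this model the replica returns nothing for `row1` until both the flush on the active side and the refresh on the replica side have happened, which is the window in which a read replica serves stale data.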
--
This message was sent by Atlassian Jira
(v8.20.10#820010)