[ https://issues.apache.org/jira/browse/HBASE-24877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell resolved HBASE-24877.
-----------------------------------------
    Fix Version/s: 2.4.0
                   3.0.0-alpha-1
       Resolution: Fixed

PRs were merged to master and branch-2. Resolving. File new issues for any further backports.

> Add option to avoid aborting RS process upon uncaught exceptions happen on replication source
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-24877
>                 URL: https://issues.apache.org/jira/browse/HBASE-24877
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 3.0.0-alpha-1, 2.4.0
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> Currently, we abort the entire RS process if any uncaught exception happens
> during ReplicationSource initialization. This may be too extreme on certain
> deployments, where custom replication endpoint implementations may choose to
> throw when remote peers are unavailable, but the source cluster shouldn't be
> brought down entirely. Similarly, source reader and shipper threads cause
> the RS to abort on any runtime exception raised while they are running.
> This patch adds a configuration option (false by default, to keep the
> original behaviour) to avoid aborting entire RS processes under these
> conditions. Instead, if ReplicationSource initialization fails with a
> RuntimeException, it keeps retrying the source startup. In the case of
> reader/shipper runtime errors, it refreshes the replication source,
> terminating the current source and its readers/shippers and creating new
> ones.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
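
For readers unfamiliar with the pattern, the startup behaviour described in the issue (abort on the first RuntimeException, or keep retrying when the option says not to abort) can be sketched roughly as below. This is only an illustration under stated assumptions and not the code from the merged PRs: the class `SourceStartupSketch`, the `Initializer` interface, the `abortOnError` flag, the `retrySleepMs` parameter and the `flaky` helper are all hypothetical names, and returning `ABORTED` merely stands in for aborting the RS process.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SourceStartupSketch {

    /** How a startup sequence ended. ABORTED stands in for aborting the RS. */
    enum Outcome { STARTED, ABORTED, INTERRUPTED }

    /** Outcome plus the number of initialization attempts that were made. */
    static final class Result {
        final Outcome outcome;
        final int attempts;

        Result(Outcome outcome, int attempts) {
            this.outcome = outcome;
            this.attempts = attempts;
        }
    }

    /** Hypothetical stand-in for source initialization; may throw RuntimeException. */
    interface Initializer {
        void init();
    }

    /**
     * Runs the initializer. On a RuntimeException it either gives up at once
     * (abortOnError == true) or sleeps and retries until initialization succeeds.
     */
    static Result startup(Initializer initializer, boolean abortOnError, long retrySleepMs) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                initializer.init();
                return new Result(Outcome.STARTED, attempts);
            } catch (RuntimeException e) {
                if (abortOnError) {
                    return new Result(Outcome.ABORTED, attempts);
                }
                try {
                    Thread.sleep(retrySleepMs); // back off before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return new Result(Outcome.INTERRUPTED, attempts);
                }
            }
        }
    }

    /** Test helper: fails the first {@code failures} calls, then succeeds. */
    static Initializer flaky(int failures) {
        AtomicInteger calls = new AtomicInteger();
        return () -> {
            if (calls.incrementAndGet() <= failures) {
                throw new IllegalStateException("remote peer unavailable");
            }
        };
    }

    public static void main(String[] args) {
        Result aborting = startup(flaky(2), true, 10);
        Result retrying = startup(flaky(2), false, 10);
        System.out.println(aborting.outcome + " after " + aborting.attempts + " attempt(s)");
        System.out.println(retrying.outcome + " after " + retrying.attempts + " attempt(s)");
    }
}
```

With an initializer that fails twice, the aborting variant stops after the first attempt, while the non-aborting variant succeeds on the third. The reader/shipper case from the description (terminate the current source and create a new one) is not covered by this sketch.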