ritegarg commented on code in PR #2479:
URL: https://github.com/apache/phoenix/pull/2479#discussion_r3291068286
##########
phoenix-core-client/src/main/java/org/apache/phoenix/jdbc/HAGroupStoreClient.java:
##########
@@ -1047,6 +1114,177 @@ private long validateTransitionAndGetWaitTime(HAGroupStoreRecord.HAGroupState cu
return Math.max(0, remainingTime);
}
+ // ========== Legacy /phoenix/ha CRR Sync ==========
+
+ /**
+ * Derives the combined CRR from local + peer records and CAS-writes it to {@code /phoenix/ha}.
+ * CAS losses are logged and skipped; the next consistentHA cache event or periodic cycle
+ * reconverges.
+ */
+ private void syncLegacyCRRIfRoleChanged() {
+ if (!legacyCrrSyncEnabled || !isHealthy) {
+ return;
+ }
+ // Snapshot mutable resources up front so a concurrent close() can't null them mid-method
+ // and trigger NPEs / writes through a torn-down Curator client.
+ final PhoenixHAAdmin admin = this.legacyHaAdmin;
+ final NodeCache cache = this.legacyCrrNodeCache;
+ if (admin == null || cache == null) {
+ return;
+ }
+ try {
+ HAGroupStoreRecord local = getHAGroupStoreRecord();
+ if (local == null) {
+ LOGGER.debug("Skipping legacy CRR sync for HA group {}: no local consistentHA record",
+ haGroupName);
+ return;
+ }
+ // Wait for peer URL before building the desired CRR (ctor NPEs on null url2).
+ if (StringUtils.isBlank(local.getPeerZKUrl())) {
+ LOGGER.debug("Skipping legacy CRR sync for HA group {}: peer ZK URL is blank", haGroupName);
+ return;
+ }
+ HAGroupStoreRecord peer = getHAGroupStoreRecordFromPeer();
+ // NodeCache is eventually consistent; on apparent absence, fall back to an authoritative
+ // ZK read so the equality check and CAS both see consistent state.
+ Pair<ClusterRoleRecord, Stat> snapshot = readLegacyCrrSnapshot(cache);
+ if (snapshot.getRight() == null) {
+ snapshot = admin.getClusterRoleRecordAndStatInZooKeeper(haGroupName);
+ }
+ ClusterRoleRecord existing = snapshot.getLeft();
+ Stat existingStat = snapshot.getRight();
+ if (!shouldWriteLegacyCrr(existing)) {
+ return;
+ }
+ ClusterRoleRecord desired = buildDesiredLegacyCrr(local, peer, existing);
+ if (desired.isLogicallyEqualIgnoringVersionAndRegistry(existing)) {
+ LOGGER.debug("Legacy CRR for HA group {} already up to date at version {}", haGroupName,
+ existing.getVersion());
+ return;
+ }
+ try {
+ if (existingStat == null) {
+ admin.createOrUpdateClusterRoleRecordWithCAS(haGroupName, desired,
+ PhoenixHAAdmin.LegacyCrrWriteMode.CREATE_NEW, /* ignored */ 0);
+ } else {
+ admin.createOrUpdateClusterRoleRecordWithCAS(haGroupName, desired,
+ PhoenixHAAdmin.LegacyCrrWriteMode.CAS_WITH_VERSION, existingStat.getVersion());
+ }
+ LOGGER.info("Synced legacy CRR for HA group {} (version {} -> {})", haGroupName,
+ existing != null ? existing.getVersion() : -1L, desired.getVersion());
+ } catch (StaleClusterRoleRecordVersionException stale) {
+ // CAS lost; next event/periodic cycle reconverges.
+ LOGGER.info("Legacy CRR CAS lost for HA group {} at expected stat version {}", haGroupName,
Review Comment:
Agreed, fixing. Demoted the "CAS lost" line to DEBUG. The "Synced legacy CRR
... -> ..." line just above it stays at INFO, since only the CAS winner logs
it (one INFO per state transition cluster-wide).
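
For illustration only, here is a minimal sketch of the intended log-level split. It is not the PR code: `CasLogLevelSketch` and `syncOnce` are hypothetical names, an `AtomicInteger` stands in for the versioned znode, and `java.util.logging` is used so the snippet is self-contained, with `FINE` standing in for DEBUG.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class CasLogLevelSketch {
    private static final Logger LOGGER = Logger.getLogger("CasLogLevelSketch");

    /**
     * Attempts a single compare-and-set from expectedVersion to expectedVersion + 1.
     * The winner logs at INFO; a loser logs at FINE (the DEBUG equivalent here).
     * Returns the level the outcome was logged at, so callers can inspect it.
     */
    static Level syncOnce(AtomicInteger znodeVersion, int expectedVersion) {
        if (znodeVersion.compareAndSet(expectedVersion, expectedVersion + 1)) {
            // Only one contender can reach this branch per version bump.
            LOGGER.log(Level.INFO, "Synced (version {0} -> {1})",
                new Object[] {expectedVersion, expectedVersion + 1});
            return Level.INFO;
        }
        // Expected and harmless under contention: another writer got there first.
        LOGGER.log(Level.FINE, "CAS lost at expected version {0}", expectedVersion);
        return Level.FINE;
    }

    public static void main(String[] args) {
        AtomicInteger version = new AtomicInteger(0);
        int observed = version.get();
        // Two contenders that both observed the same version: one wins, one loses.
        Level first = syncOnce(version, observed);
        Level second = syncOnce(version, observed);
        System.out.println(first + " / " + second);
    }
}
```

With N contenders racing on the same observed version, this yields one INFO line and N-1 FINE lines per transition, which is the ratio the change is aiming for.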