apurtell commented on code in PR #2489:
URL: https://github.com/apache/phoenix/pull/2489#discussion_r3352474636
##########
phoenix-core-server/src/main/java/org/apache/phoenix/replication/reader/ReplicationLogReplayService.java:
##########
@@ -229,6 +251,37 @@ protected long getConsistencyPoint() throws IOException,
SQLException {
return consistencyPoint;
}
+ /**
+ * Applies the replication replay consistency point as a floor on
maxLookbackWindowStart. On
+ * standby clusters, this prevents compaction from dropping delete markers
that have timestamps
+ * newer than the consistency point.
+ */
+ public static long applyReplicationConsistencyGuard(long
currentMaxLookbackWindowStart,
+ Configuration conf, String tableName, String columnFamilyName) {
+ try {
+ long consistencyPoint = getInstance(conf).getConsistencyPoint();
+ return adjustMaxLookbackWindowStart(currentMaxLookbackWindowStart,
consistencyPoint,
+ tableName, columnFamilyName);
+ } catch (Exception e) {
+ LOG.warn("Replication guard enabled but consistency point unavailable
for table={} store={}."
+ + " Retaining all delete markers.", tableName, columnFamilyName, e);
+ return 0L;
+ }
+ }
+
+ @VisibleForTesting
+ static long adjustMaxLookbackWindowStart(long currentMaxLookbackWindowStart,
+ long consistencyPoint, String tableName, String columnFamilyName) {
+ long adjusted = Math.min(currentMaxLookbackWindowStart, consistencyPoint);
Review Comment:
You are changing the store-level `maxLookbackWindowStart` but the value
actually used for retention is computed per row as the max of this and the
`ttlWindowStart` of the Phoenix compactor's `RowContext`. When `ttlWindowStart`
in the `RowContext` is greater than `consistencyPoint` the boundary snaps to
`ttlWindowStart` and delete markers between `consistencyPoint` and
`ttlWindowStart` get purged anyway, which I think you are trying to prevent.
The robot pointed me to this issue with this text:
"The guard is effective only when replay lag stays within the TTL window.
Concretely: table TTL = 1h, replay lag = 2h ⇒ consistencyPoint = now-2h, guard
sets store start to now-2h, but the row uses max(now-1h, now-2h) = now-1h, so
markers in the [now-2h, now-1h] band are dropped. Tests happen to pass only
because the test table has no TTL."
##########
phoenix-core-server/src/main/java/org/apache/phoenix/replication/reader/ReplicationLogReplayService.java:
##########
@@ -229,6 +251,37 @@ protected long getConsistencyPoint() throws IOException,
SQLException {
return consistencyPoint;
}
+ /**
+ * Applies the replication replay consistency point as a floor on
maxLookbackWindowStart. On
+ * standby clusters, this prevents compaction from dropping delete markers
that have timestamps
+ * newer than the consistency point.
+ */
+ public static long applyReplicationConsistencyGuard(long
currentMaxLookbackWindowStart,
+ Configuration conf, String tableName, String columnFamilyName) {
+ try {
+ long consistencyPoint = getInstance(conf).getConsistencyPoint();
Review Comment:
This will iterate every HA group and in SYNC state each call hits the
filesystem. That is at least an `exists()` + `listStatus()` RPC to the NameNode
per HA group, executed once per store per compaction, synchronously, inside the
`CompactionScanner` constructor.
The consistency point moves slowly relative to compaction frequency, so
consider caching it with a short TTL of a few seconds and generally using the
cached value, rather than recomputing on every scanner construction.
##########
phoenix-core-server/src/main/java/org/apache/phoenix/replication/reader/ReplicationLogReplayService.java:
##########
@@ -50,6 +51,17 @@ public class ReplicationLogReplayService {
*/
public static final boolean DEFAULT_REPLICATION_REPLAY_ENABLED = false;
+ /**
+ * Configuration key for enabling/disabling the replication compaction guard
+ */
+ public static final String REPLICATION_COMPACTION_GUARD_ENABLED =
+ "phoenix.replication.compaction.guard.enabled";
Review Comment:
I commented on this elsewhere, but Claude called this a "foot gun", lol.
> I'd argue the flag is a foot-gun: setting
`phoenix.replication.compaction.guard.enabled=false` on a standby silently
re-introduces exactly the data-desync bug this PR wants to fix.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]