This is an automated email from the ASF dual-hosted git repository.

baedke pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/jackrabbit-oak.git


The following commit(s) were added to refs/heads/trunk by this push:
     new 243c9ab7c2 OAK-11284: Greedy Reuse of cluster IDs may lead to synchronous LastRe… (#1948)
243c9ab7c2 is described below

commit 243c9ab7c2e2142dfe8f4b6c3c4641054a3cb7bd
Author: mbaedke <[email protected]>
AuthorDate: Wed Jan 8 13:14:28 2025 +0100

    OAK-11284: Greedy Reuse of cluster IDs may lead to synchronous LastRe… (#1948)
    
    Added documentation.
    
    ---------
    
    Co-authored-by: Julian Reschke <[email protected]>
---
 oak-doc/src/site/markdown/nodestore/documentmk.md      | 18 ++++++++++++++++++
 .../oak/plugins/document/LastRevRecoveryAgent.java     |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/oak-doc/src/site/markdown/nodestore/documentmk.md b/oak-doc/src/site/markdown/nodestore/documentmk.md
index df6db50da0..c3a504b63b 100644
--- a/oak-doc/src/site/markdown/nodestore/documentmk.md
+++ b/oak-doc/src/site/markdown/nodestore/documentmk.md
@@ -773,6 +773,24 @@ the `machine` and `instance` fields. This behaviour is new and was introduced
 with Oak 1.10. Previous versions ignore entries that do not match the
 environment and would create a new entry.
 
+Note that while this behavior is usually beneficial, there are circumstances
+under which it may lead to very slow startup times for cluster nodes that try
+to acquire an ID whose previous owner was not shut down gracefully and has
+been inactive for a long time. This is due to synchronous recovery operations
+that are necessary to guarantee the consistency of the cluster (for details see
+[Recovery for a cluster node ID](#recovery-for-a-cluster-node-id)).
+
+To avoid that, the maximum duration of the synchronous recovery may be
+limited using the system property `oak.documentMK.syncRecoveryTimeoutMillis`.
+A positive value will specify this maximum duration in milliseconds, while a
+negative value doesn't limit the recovery time. The default is `-1`.
+If the duration is exceeded, the node will stop trying to reuse the ID
+and will instead pick one that doesn't need recovery.
+
+Note that this feature has been specifically designed for unusual Oak
+deployments (requiring significantly longer lease timeouts) and is not
+recommended for general use.
+
 ### <a name="update-lease-for-a-cluster-node-id"></a> Update lease for a cluster node ID
 
 Each running cluster node updates the `leaseEnd` time of the cluster node ID
diff --git a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/LastRevRecoveryAgent.java b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/LastRevRecoveryAgent.java
index f5cf1cf646..07600881ed 100644
--- a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/LastRevRecoveryAgent.java
+++ b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/LastRevRecoveryAgent.java
@@ -81,6 +81,8 @@ public class LastRevRecoveryAgent {
 
     private final Consumer<Integer> afterRecovery;
 
+    //OAK-11284: optionally limit the maximum duration of a synchronous recovery operation that may occur when
+    //inactive node IDs are reused.
     private static final long SYNC_RECOVERY_TIMEOUT_MILLIS =
             SystemPropertySupplier
                     .create("oak.documentMK.syncRecoveryTimeoutMillis", -1)
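
The documented semantics of `oak.documentMK.syncRecoveryTimeoutMillis` can be illustrated with a small standalone sketch. It is not the code from `LastRevRecoveryAgent`: it reads the property with the JDK's `Long.getLong` rather than Oak's `SystemPropertySupplier`, and the class `SyncRecoveryTimeoutSketch` and its methods are hypothetical names chosen for illustration. Treating a value of `0` as "no limit" is also an assumption, since the documentation only describes positive and negative values.

```java
// Hypothetical illustration only; not part of Oak.
final class SyncRecoveryTimeoutSketch {

    // Property name and default (-1, no limit) as stated in the documentation above.
    static final String PROPERTY = "oak.documentMK.syncRecoveryTimeoutMillis";
    static final long DEFAULT_TIMEOUT_MILLIS = -1L;

    private SyncRecoveryTimeoutSketch() {
    }

    // Reads the timeout from the system properties; falls back to the
    // default when the property is unset or not a valid long.
    static long readTimeoutMillis() {
        return Long.getLong(PROPERTY, DEFAULT_TIMEOUT_MILLIS);
    }

    // Returns true when a positive limit is configured and the time elapsed
    // since startMillis exceeds it. Negative values never trigger the limit.
    // Assumption: zero is treated like a negative value (no limit).
    static boolean isExceeded(long timeoutMillis, long startMillis, long nowMillis) {
        return timeoutMillis > 0 && (nowMillis - startMillis) > timeoutMillis;
    }
}
```

With this reading, starting the JVM with `-Doak.documentMK.syncRecoveryTimeoutMillis=60000` would cap a synchronous recovery at one minute, after which the node would give up on the ID and pick another, as the documentation describes.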
