EdColeman commented on code in PR #2778:
URL: https://github.com/apache/accumulo/pull/2778#discussion_r934581685
##########
server/base/src/main/java/org/apache/accumulo/server/conf/ServerConfigurationFactory.java:
##########
@@ -140,4 +163,119 @@ public void connectionEvent() {
// no-op. changes handled by prop store impl
}
}
+
+ private class ConfigRefreshRunner {
+ private static final long MIN_JITTER_DELAY = 1;
+ private static final long MAX_JITTER_DELAY = 23;
+ private final ScheduledFuture<?> refreshTaskFuture;
+
+ ConfigRefreshRunner() {
+
+ Runnable refreshTask = this::verifySnapshotVersions;
+
+ ScheduledThreadPoolExecutor executor = ThreadPools.getServerThreadPools()
+ .createScheduledExecutorService(1, "config-refresh", false);
+
+ // staggering the initial delay prevents synchronization of Accumulo servers communicating
+ // with ZooKeeper for the sync process. (Value is 25% -> 100% of the refresh period.)
+ long randDelay = jitter(REFRESH_PERIOD_MINUTES / 4, REFRESH_PERIOD_MINUTES);
+ refreshTaskFuture =
+ executor.scheduleWithFixedDelay(refreshTask, randDelay, REFRESH_PERIOD_MINUTES, MINUTES);
+ }
+
+ /**
+ * Check that the stored version in ZooKeeper matches the version held in the local snapshot.
+ * When a mismatch is detected, a change event is sent to the prop store which will cause a
+ * re-load. If the Zookeeper node has been deleted, the local cache entries are removed.
+ * <p>
+ * This method is designed to be called as a scheduled task, so it does not propagate exceptions
+ * other than interrupted Exceptions so the scheduled tasks will continue to run.
+ */
+ private void verifySnapshotVersions() {
+
+ // short circuit if refresh in progress
+ if (isConfigRefreshRunning.get()) {
+ return;
+ }
+
+ // allow only one thread if missed short circuit check.
+ refreshLock.lock();
+ try {
+ isConfigRefreshRunning.set(true);
+ long refreshStart = System.nanoTime();
+ int keyCount = 0;
+ int keyChangedCount = 0;
+
+ PropStore propStore = context.getPropStore();
+ keyCount++;
+
+ // rely on store to propagate change event if different
+ propStore.validateDataVersion(SystemPropKey.of(context),
+ ((ZooBasedConfiguration) getSystemConfiguration()).getDataVersion());
+ // small yield - spread out ZooKeeper calls
+ jitterDelay();
+
+ for (Map.Entry<NamespaceId,NamespaceConfiguration> entry : namespaceConfigs.entrySet()) {
+ keyCount++;
+ PropStoreKey<?> propKey = NamespacePropKey.of(context, entry.getKey());
+ if (!propStore.validateDataVersion(propKey, entry.getValue().getDataVersion())) {
+ keyChangedCount++;
+ namespaceConfigs.remove(entry.getKey());
+ }
+ // small yield - spread out ZooKeeper calls between namespace config checks
+ jitterDelay();
Review Comment:
On a large cluster start, all tservers could come online at nearly the same
time. The initial start jitter, together with the additional jitter delays
between checks, is meant to keep these version checks from hitting ZooKeeper
in lockstep across the cluster over time.
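
For illustration only, here is a minimal sketch of what a bounded jitter helper of this kind might look like. The body of `jitter(min, max)` is not shown in this hunk, so the implementation below is an assumption, as are the class name `JitterSketch` and the value used for `REFRESH_PERIOD_MINUTES`.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch; not the code from the PR.
public class JitterSketch {

    // Assumed value, chosen only so the example runs.
    static final long REFRESH_PERIOD_MINUTES = 15;

    /** Returns a uniformly distributed value in [min, max], both inclusive. */
    static long jitter(long min, long max) {
        if (min > max) {
            throw new IllegalArgumentException("min must be <= max");
        }
        // nextLong(origin, bound) excludes the bound, so add 1 to include max.
        return ThreadLocalRandom.current().nextLong(min, max + 1);
    }

    public static void main(String[] args) {
        // Initial delay lands between 25% and 100% of the refresh period,
        // so servers started together do not all run their first check at once.
        long initialDelay = jitter(REFRESH_PERIOD_MINUTES / 4, REFRESH_PERIOD_MINUTES);
        System.out.println("initial delay (minutes): " + initialDelay);
    }
}
```

Because each server draws its own delay independently, the first checks are spread over the 25% to 100% window instead of all firing together, and the per-key delays keep later rounds from drifting back into step.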