dlmarion opened a new issue, #5368:
URL: https://github.com/apache/accumulo/issues/5368
The Manager kicks off a background thread to perform the upgrade, then in
the main thread it continues starting the metrics subsystem, starting various
threads, waiting for the tservers to come up, starting the compaction
coordinator, starting the Splitter, and starting the TabletGroupWatchers before
waiting for the upgrade to complete.
I saw the following exception during Manager startup, while the upgrade
process was upgrading the root tablet from version 11 to 12.
```
2025-02-28T20:55:07,307 [fate.FateMetrics] INFO : Failed to update userfate metrics due to exception
org.apache.accumulo.core.client.TableDeletedException: Table ID +fate was deleted
    at org.apache.accumulo.core.clientImpl.ClientContext.requireNotDeleted(ClientContext.java:621) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.clientImpl.ThriftScanner.getNextScanAddress(ThriftScanner.java:561) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:659) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.clientImpl.ScannerIterator.readBatch(ScannerIterator.java:162) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.clientImpl.ScannerIterator.getNextBatch(ScannerIterator.java:180) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.clientImpl.ScannerIterator.hasNext(ScannerIterator.java:112) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132) ~[?:?]
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845) ~[?:?]
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) ~[?:?]
    at org.apache.accumulo.core.fate.AdminUtil.lambda$getTransactionStatus$2(AdminUtil.java:353) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at java.base/java.util.Map.forEach(Map.java:713) ~[?:?]
    at org.apache.accumulo.core.fate.AdminUtil.getTransactionStatus(AdminUtil.java:351) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.fate.AdminUtil.getTransactionStatus(AdminUtil.java:206) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.manager.metrics.fate.FateMetricValues.getFateMetrics(FateMetricValues.java:82) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.manager.metrics.fate.user.UserFateMetricValues.getUserStoreMetrics(UserFateMetricValues.java:42) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.manager.metrics.fate.user.UserFateMetrics.getMetricValues(UserFateMetrics.java:41) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.manager.metrics.fate.user.UserFateMetrics.getMetricValues(UserFateMetrics.java:27) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.manager.metrics.fate.FateMetrics.update(FateMetrics.java:77) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.manager.metrics.fate.FateMetrics.lambda$registerMetrics$2(FateMetrics.java:119) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
```
I'm thinking we might want to reorganize the Manager startup sequence so
that the metrics, Splitter, and compaction coordinator are started only after
the upgrade is complete. We may also want to wait to start the metadata
TabletGroupWatcher (TGW) until after the root upgrade is complete, and the
user TGW until after the entire upgrade is complete.
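To make the proposed ordering concrete, here is a minimal, self-contained sketch. It is not Manager code: the class name `GatedStartupSketch`, the method `runStartup`, and the event strings are all hypothetical, and the upgrade is modeled as two chained `CompletableFuture` stages purely for illustration. It only shows the idea of gating each component on the upgrade stage it depends on.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; none of these names exist in Accumulo.
public class GatedStartupSketch {

  // Runs a simulated startup and returns the order in which steps happened.
  static List<String> runStartup() {
    List<String> events = new CopyOnWriteArrayList<>();

    // Stage 1 (background thread): upgrade the root tablet.
    CompletableFuture<Void> rootUpgrade =
        CompletableFuture.runAsync(() -> events.add("upgrade root"));

    // Stage 2 (background thread): the rest of the upgrade, after stage 1.
    CompletableFuture<Void> fullUpgrade =
        rootUpgrade.thenRunAsync(() -> events.add("upgrade remaining"));

    // Main thread: the metadata TGW waits only for the root upgrade.
    rootUpgrade.join();
    events.add("start metadata TGW");

    // Main thread: everything else waits for the entire upgrade.
    fullUpgrade.join();
    events.add("start metrics");
    events.add("start compaction coordinator");
    events.add("start Splitter");
    events.add("start user TGW");

    return events;
  }

  public static void main(String[] args) {
    runStartup().forEach(System.out::println);
  }
}
```

In this sketch the relative order of "start metadata TGW" and "upgrade remaining" is intentionally unconstrained, since the two can run concurrently. Only the dependencies described above are enforced.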