dlmarion opened a new issue, #5368:
URL: https://github.com/apache/accumulo/issues/5368

   The Manager kicks off a background thread to perform the upgrade. Meanwhile, 
the main thread keeps going: it starts the metrics subsystem and various 
threads, waits for the tservers to come up, and starts the compaction 
coordinator, the Splitter, and the TabletGroupWatchers, all before it waits 
for the upgrade to complete.
   
   I saw the following exception during Manager startup, while the upgrade 
process was upgrading the root tablet from version 11 to 12.
   
   ```
   2025-02-28T20:55:07,307 [fate.FateMetrics] INFO : Failed to update userfate metrics due to exception
   org.apache.accumulo.core.client.TableDeletedException: Table ID +fate was deleted
           at org.apache.accumulo.core.clientImpl.ClientContext.requireNotDeleted(ClientContext.java:621) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ThriftScanner.getNextScanAddress(ThriftScanner.java:561) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:659) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ScannerIterator.readBatch(ScannerIterator.java:162) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ScannerIterator.getNextBatch(ScannerIterator.java:180) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ScannerIterator.hasNext(ScannerIterator.java:112) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132) ~[?:?]
           at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845) ~[?:?]
           at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
           at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
           at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
           at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
           at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
           at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) ~[?:?]
           at org.apache.accumulo.core.fate.AdminUtil.lambda$getTransactionStatus$2(AdminUtil.java:353) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.util.Map.forEach(Map.java:713) ~[?:?]
           at org.apache.accumulo.core.fate.AdminUtil.getTransactionStatus(AdminUtil.java:351) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.fate.AdminUtil.getTransactionStatus(AdminUtil.java:206) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.metrics.fate.FateMetricValues.getFateMetrics(FateMetricValues.java:82) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.metrics.fate.user.UserFateMetricValues.getUserStoreMetrics(UserFateMetricValues.java:42) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.metrics.fate.user.UserFateMetrics.getMetricValues(UserFateMetrics.java:41) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.metrics.fate.user.UserFateMetrics.getMetricValues(UserFateMetrics.java:27) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.metrics.fate.FateMetrics.update(FateMetrics.java:77) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.metrics.fate.FateMetrics.lambda$registerMetrics$2(FateMetrics.java:119) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
           at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
           at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
   ```
   
   I'm thinking we might want to reorganize the Manager startup sequence so 
that the metrics, Splitter, and compaction coordinator are started only after 
the upgrade is complete. We may also want to hold off starting the metadata 
TabletGroupWatcher (TGW) until the root upgrade is complete, and the user TGW 
until the entire upgrade is complete.
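
   To make the proposed ordering concrete, here is a rough, standalone sketch. 
It is not Manager code: the class name, the step names, and the latch/future 
wiring are all made up for illustration, and the real startup involves more 
steps than shown. The upgrade runs in the background, the metadata TGW step is 
gated on the root upgrade, and everything else is gated on the whole upgrade.

   ```java
   import java.util.List;
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.CopyOnWriteArrayList;
   import java.util.concurrent.CountDownLatch;

   // Hypothetical illustration only; none of these names exist in Accumulo.
   final class StartupOrderSketch {

     // Runs the simulated startup and returns the steps in the order they ran.
     static List<String> startManager() {
       List<String> events = new CopyOnWriteArrayList<>();
       CountDownLatch rootUpgraded = new CountDownLatch(1);

       // Background upgrade thread: root first, then everything else.
       CompletableFuture<Void> upgrade = CompletableFuture.runAsync(() -> {
         events.add("root upgrade done");
         rootUpgraded.countDown();
         events.add("full upgrade done");
       });

       try {
         // Gate 1: the metadata TGW waits only for the root upgrade.
         rootUpgraded.await();
       } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
         throw new IllegalStateException("interrupted waiting for root upgrade", e);
       }
       events.add("start metadata TGW");

       // Gate 2: the remaining steps wait for the entire upgrade.
       upgrade.join();
       events.add("start user TGW");
       events.add("start metrics");
       events.add("start Splitter");
       events.add("start compaction coordinator");
       return events;
     }

     public static void main(String[] args) {
       startManager().forEach(System.out::println);
     }
   }
   ```

   With this shape, a metrics task that scans the fate table cannot run until 
the upgrade has finished, which is the situation the exception above came from.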

