dlmarion commented on code in PR #5576:
URL: https://github.com/apache/accumulo/pull/5576#discussion_r2105167839
##########
core/src/main/java/org/apache/accumulo/core/util/threads/Threads.java:
##########
@@ -55,22 +55,26 @@ public static Runnable createNamedRunnable(String name, Runnable r) {
return new NamedRunnable(name, r);
}
- public static Thread createThread(String name, Runnable r) {
- return createThread(name, OptionalInt.empty(), r, UEH);
+ public static Thread createNonCriticalThread(String name, Runnable r) {
Review Comment:
I'm not sure we need to rename these to NonCritical. The existence of
createCriticalThread already implies that the others are non-critical.
##########
server/manager/src/main/java/org/apache/accumulo/manager/Manager.java:
##########
@@ -1265,14 +1265,21 @@ public void run() {
context.getTableManager().addObserver(this);
- Thread statusThread = Threads.createThread("Status Thread", new StatusThread());
+ // TODO KEVIN RATHBUN updating the Manager state seems like a critical function. However, the
+ // thread already handles, waits, and continues in the case of any Exception, so critical or
+ // non critical doesn't make a difference here.
+ Thread statusThread = Threads.createCriticalThread("Status Thread", new StatusThread());
statusThread.start();
- Threads.createThread("Migration Cleanup Thread", new MigrationCleanupThread()).start();
+ // TODO KEVIN RATHBUN migration cleanup may be a critical function of the manager, but the
+ // thread will already handle, wait, and continue in the case of any Exception, so critical
+ // or non critical doesn't make a difference here.
+ Threads.createCriticalThread("Migration Cleanup Thread", new MigrationCleanupThread()).start();
tserverSet.startListeningForTabletServerChanges();
- Threads.createThread("ScanServer Cleanup Thread", new ScanServerZKCleaner()).start();
+ // TODO KEVIN RATHBUN Some ZK cleanup doesn't seem like a critical function of manager
+ Threads.createNonCriticalThread("ScanServer Cleanup Thread", new ScanServerZKCleaner()).start();
Review Comment:
I'm thinking we may want to make this critical. The clients find
ScanServers by looking them up in ZooKeeper. Leaving orphaned entries over a
long period of time could progressively slow the clients down. I'm not sure why
the thread might die, as all of the known exceptions are handled, but
RuntimeException is not caught by the thread, so some unchecked exception could
cause it to fail.
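
To illustrate the failure mode I mean, here is a minimal standalone sketch. It is not the Manager code: `UncaughtDemo`, `CleanupTask`, and `runAndCapture` are hypothetical names. It shows a loop that handles only the exceptions it expects, and how one unchecked exception ends the thread for good, with only the uncaught exception handler noticing.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtDemo {

  // Hypothetical stand-in for a cleanup loop that handles only expected exceptions.
  static class CleanupTask implements Runnable {
    @Override
    public void run() {
      while (true) {
        try {
          doCleanup();
          Thread.sleep(10);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        // There is no catch for RuntimeException, so an unchecked exception
        // propagates out of run() and the loop never resumes.
      }
    }

    private void doCleanup() {
      throw new IllegalStateException("unexpected failure during cleanup");
    }
  }

  // Runs the task on its own thread and returns the Throwable that killed it, or null.
  static Throwable runAndCapture(Runnable task) throws InterruptedException {
    AtomicReference<Throwable> failure = new AtomicReference<>();
    Thread t = new Thread(task, "cleanup-demo");
    // The handler runs on the dying thread, so it has completed once join() returns.
    t.setUncaughtExceptionHandler((thread, ex) -> failure.set(ex));
    t.start();
    t.join();
    return failure.get();
  }

  public static void main(String[] args) throws InterruptedException {
    Throwable cause = runAndCapture(new CleanupTask());
    System.out.println("thread died with: " + cause);
  }
}
```

Whatever the handler does at that point (log and move on, or treat it as fatal) is the whole difference between the two flavors of thread, which is why I think the choice matters for this one.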
##########
server/tserver/src/main/java/org/apache/accumulo/tserver/log/DfsLogger.java:
##########
@@ -475,7 +475,10 @@ public synchronized void open(String address) throws IOException {
throw new IOException(ex);
}
- syncThread = Threads.createThread("Accumulo WALog thread " + this, new LogSyncingTask());
+ // TODO KEVIN RATHBUN this seems like a vital thread for TabletServer, but appears that the
+ // thread will continuously be recreated, so probably fine to stay non critical
+ syncThread =
+ Threads.createNonCriticalThread("Accumulo WALog thread " + this, new LogSyncingTask());
Review Comment:
A syncThread is created for each DfsLogger; it looks like there is a 1:1
relationship. The syncThread runs a LogSyncingTask, which flushes the
write-ahead log according to the user's configured durability setting. When the
write-ahead logs are full, the TabletServer creates new ones and therefore
new sync threads. I don't think the sync thread for a specific DfsLogger is
ever recreated, which means that if it dies, that logger is left without one.
I think this might be critical.
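
A rough sketch of the 1:1 relationship I'm describing, using made-up names (`PerLoggerSyncDemo`, `DemoLogger`, `canSync`) rather than the real DfsLogger internals: each logger starts its sync thread once, and nothing in the logger replaces that thread if it dies.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PerLoggerSyncDemo {

  // Hypothetical logger that owns exactly one sync thread for its whole lifetime.
  static class DemoLogger {
    private final AtomicBoolean syncDead = new AtomicBoolean(false);
    private final Thread syncThread;

    DemoLogger(Runnable syncTask) {
      syncThread = new Thread(syncTask, "demo-sync");
      syncThread.setDaemon(true);
      // The handler only records the failure; it does not restart the thread.
      syncThread.setUncaughtExceptionHandler((t, e) -> syncDead.set(true));
      syncThread.start();
    }

    void awaitSyncThread() throws InterruptedException {
      syncThread.join();
    }

    // False once the single sync thread has died; this logger can no longer flush.
    boolean canSync() {
      return !syncDead.get() && syncThread.isAlive();
    }
  }

  // Creates a logger whose sync task fails immediately and reports whether it can still sync.
  static boolean canSyncAfterFailure() throws InterruptedException {
    DemoLogger logger = new DemoLogger(() -> {
      throw new IllegalStateException("sync failed");
    });
    logger.awaitSyncThread();
    return logger.canSync();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("can sync after failure: " + canSyncAfterFailure());
  }
}
```

New loggers get fresh sync threads, but that does not help writers still attached to the logger whose thread is gone.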