ddanielr commented on code in PR #5399:
URL: https://github.com/apache/accumulo/pull/5399#discussion_r1991729384
##########
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java:
##########
@@ -266,37 +271,62 @@ private long removeTabletServerMarkers(Map<UUID,TServerInstance> uidMap,
return result;
}
- private long removeFile(Path path) {
- try {
- if (!useTrash || !fs.moveToTrash(path)) {
- fs.deleteRecursively(path);
+ private void removeFile(ExecutorService deleteThreadPool, Path path, AtomicLong counter,
+ String msg) {
+ deleteThreadPool.execute(() -> {
+ try {
+ log.debug(msg);
+ if (!useTrash || !fs.moveToTrash(path)) {
+ fs.deleteRecursively(path);
+ }
+ counter.incrementAndGet();
+ } catch (FileNotFoundException ex) {
+ // ignored
+ } catch (IOException ex) {
+ log.error("Unable to delete wal {}", path, ex);
}
- return 1;
- } catch (FileNotFoundException ex) {
- // ignored
- } catch (IOException ex) {
- log.error("Unable to delete wal {}", path, ex);
- }
-
- return 0;
+ });
}
private long removeFiles(Collection<Pair<WalState,Path>> collection, final GCStatus status) {
+
+ final ExecutorService deleteThreadPool = ThreadPools.getServerThreadPools()
Review Comment:
When deleting rfiles, the thread pool is created and shut down for each delete
pass to ensure that all rfiles are deleted before we remove the gcCandidates
from their respective metadata location.
For WALs, the GC looks at each WAL directory location to find files, then
compares each file against the live tservers set, recovery operations, or dead
tserver list to see if the file can be removed.
Since we aren't processing gcCandidates here, do we need to create and shut
down the thread pool on each removeFiles run?
If we made a longer-lived pool, then we could enable metrics on the
thread pool and get more details about how many threads might be utilized.
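
To make the suggestion concrete, here is a rough sketch of what a longer-lived pool could look like. It is not the code in this PR: the class name `LongLivedDeletePool`, the pool size of 4, and the use of plain `Runnable`s as stand-ins for the real `fs.moveToTrash` / `fs.deleteRecursively` calls are all assumptions made for illustration. It uses only `java.util.concurrent` rather than Accumulo's `ThreadPools` helper, and waits for each pass with a `CountDownLatch` instead of shutting the pool down.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one pool shared by every removeFiles pass.
public class LongLivedDeletePool {

  // Created once and reused; the size of 4 is an arbitrary assumption.
  private final ThreadPoolExecutor pool =
      new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

  // Runs every delete task on the shared pool and returns how many succeeded.
  // Each Runnable stands in for a real WAL delete.
  public long removeFiles(List<Runnable> deletes) {
    AtomicLong counter = new AtomicLong();
    CountDownLatch done = new CountDownLatch(deletes.size());
    for (Runnable delete : deletes) {
      pool.execute(() -> {
        try {
          delete.run();
          counter.incrementAndGet();
        } catch (RuntimeException ex) {
          // A failed delete is simply not counted.
        } finally {
          done.countDown();
        }
      });
    }
    try {
      // Wait for this pass only; the pool itself stays alive.
      done.await();
    } catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
    }
    return counter.get();
  }

  // Because the pool outlives a single pass, its statistics accumulate
  // and could be exported as metrics.
  public int largestPoolSize() {
    return pool.getLargestPoolSize();
  }

  public void close() {
    pool.shutdown();
  }
}
```

With this shape, the per-pass count still comes from the `AtomicLong`, while values such as `getLargestPoolSize()` reflect usage across all passes, which is the kind of detail a metrics hook would want.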