mlbiscoc commented on code in PR #4023:
URL: https://github.com/apache/solr/pull/4023#discussion_r3162327522
##########
solr/core/src/java/org/apache/solr/handler/RestoreCore.java:
##########
@@ -107,34 +125,140 @@ public boolean doRestore() throws Exception {
DirectoryFactory.DirContext.DEFAULT,
core.getSolrConfig().indexConfig.lockType);
Set<String> indexDirFiles = new
HashSet<>(Arrays.asList(indexDir.listAll()));
- // Move all files from backupDir to restoreIndexDir
- for (String filename : repository.listAllFiles()) {
- checkInterrupted();
- try {
- if (indexDirFiles.contains(filename)) {
- Checksum cs = repository.checksum(filename);
- IndexFetcher.CompareResult compareResult;
- if (cs == null) {
- compareResult = new IndexFetcher.CompareResult();
- compareResult.equal = false;
- } else {
- compareResult = IndexFetcher.compareFile(indexDir, filename,
cs.size, cs.checksum);
+
+ // Capture directories as final for lambda access
+ final Directory finalIndexDir = indexDir;
+ final Directory finalRestoreIndexDir = restoreIndexDir;
+
+ // Only use an executor for parallel downloads when parallelism > 1
+ // When set to 1, run synchronously to avoid thread-local state issues
with CallerRunsPolicy
+ int maxParallelDownloads = DEFAULT_MAX_PARALLEL_DOWNLOADS;
+ ExecutorService executor =
+ maxParallelDownloads > 1
+ ? new ExecutorUtil.MDCAwareThreadPoolExecutor(
+ 0,
+ maxParallelDownloads,
+ 60L,
+ TimeUnit.SECONDS,
+ new SynchronousQueue<>(),
+ new SolrNamedThreadFactory("RestoreCore"),
+ new ThreadPoolExecutor.CallerRunsPolicy())
Review Comment:
I have done concurrent backups on a Solr cloud in the past and Solr allows
it so we should assume someone does so unless we explicitly reject it. My
argument is that in the current implementation, this could result in rapid
thread growth and each call needs to manage the executors lifecycle. It is
harder to control from a user perspective as it is `M * (N + 1) (M = # number
of calls, N = thread pool size)` threads vs making this a static pool with the
`CallerRunsPolicy -> N + 1`. Also the arg is called which to me implies a
global max `DEFAULT_MAX_PARALLEL_UPLOADS` but right now it is a "max per backup
call per collection". What if someone does some backup/restore over a network,
and they have bandwidth throttling? It could be harder to control right now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]