mlbiscoc commented on code in PR #4023:
URL: https://github.com/apache/solr/pull/4023#discussion_r3157353134


##########
solr/core/src/java/org/apache/solr/handler/RestoreCore.java:
##########
@@ -107,34 +125,140 @@ public boolean doRestore() throws Exception {
                   DirectoryFactory.DirContext.DEFAULT,
                   core.getSolrConfig().indexConfig.lockType);
       Set<String> indexDirFiles = new 
HashSet<>(Arrays.asList(indexDir.listAll()));
-      // Move all files from backupDir to restoreIndexDir
-      for (String filename : repository.listAllFiles()) {
-        checkInterrupted();
-        try {
-          if (indexDirFiles.contains(filename)) {
-            Checksum cs = repository.checksum(filename);
-            IndexFetcher.CompareResult compareResult;
-            if (cs == null) {
-              compareResult = new IndexFetcher.CompareResult();
-              compareResult.equal = false;
-            } else {
-              compareResult = IndexFetcher.compareFile(indexDir, filename, 
cs.size, cs.checksum);
+
+      // Capture directories as final for lambda access
+      final Directory finalIndexDir = indexDir;
+      final Directory finalRestoreIndexDir = restoreIndexDir;
+
+      // Only use an executor for parallel downloads when parallelism > 1
+      // When set to 1, run synchronously to avoid thread-local state issues 
with CallerRunsPolicy
+      int maxParallelDownloads = DEFAULT_MAX_PARALLEL_DOWNLOADS;
+      ExecutorService executor =
+          maxParallelDownloads > 1
+              ? new ExecutorUtil.MDCAwareThreadPoolExecutor(
+                  0,
+                  maxParallelDownloads,
+                  60L,
+                  TimeUnit.SECONDS,
+                  new SynchronousQueue<>(),
+                  new SolrNamedThreadFactory("RestoreCore"),
+                  new ThreadPoolExecutor.CallerRunsPolicy())

Review Comment:
   Not saying this is wrong and your argument makes sense. Seems like a all of 
our executors in Solr use a `LinkedBlockingQueue` which also seems if you do 
that, we can just set executor to 1 instead of null and avoid all these 
`executor != null` checks through this function.
   
   My main problem I can tell from this executor is that it is being created 
inside this function as a local variable. If I am not mistaken, that means if I 
do 2 backup/restore calls, I just created 2 thread pools giving us 2x the size. 
Someone with many collections and backing up with many calls can cause a thread 
explosion. This should be a global/static executor somewhere else that shares 
this.



##########
solr/solr-ref-guide/modules/deployment-guide/pages/backup-restore.adoc:
##########
@@ -396,6 +396,39 @@ Any children under the `<repository>` tag are passed as 
additional configuration
 
 Information on each of the repository implementations provided with Solr is 
provided below.
 
+=== Parallel File Transfers
+
+Backup and restore operations can transfer multiple index files in parallel to 
improve throughput, especially when using cloud storage repositories like S3 or 
GCS where latency is higher.
+The parallelism is controlled via system properties or environment variables:
+
+`solr.backup.maxparalleluploads`::
++
+[%autowidth,frame=none]
+|===
+|Optional |Default: `1`
+|===
++
+Maximum number of index files to upload in parallel during backup operations.
+Can also be set via the `SOLR_BACKUP_MAXPARALLELUPLOADS` environment variable.
+For cloud storage repositories (S3, GCS), consider setting this to `8` or 
higher to improve backup performance.

Review Comment:
   How did you come up with the number 8? Was it just the number you set? Just 
curious. I would probably not place a number here to recommend as this depends 
on a persons hardware.



##########
solr/core/src/java/org/apache/solr/handler/IncrementalShardBackup.java:
##########
@@ -191,55 +213,154 @@ private BackupStats incrementalCopy(Collection<String> 
indexFiles, Directory dir
     URI indexDir = incBackupFiles.getIndexDir();
     BackupStats backupStats = new BackupStats();
 
-    for (String fileName : indexFiles) {
-      Optional<ShardBackupMetadata.BackedFile> opBackedFile = 
oldBackupPoint.getFile(fileName);
-      Checksum originalFileCS = backupRepo.checksum(dir, fileName);
-
-      if (opBackedFile.isPresent()) {
-        ShardBackupMetadata.BackedFile backedFile = opBackedFile.get();
-        Checksum existedFileCS = backedFile.fileChecksum;
-        if (existedFileCS.equals(originalFileCS)) {
-          currentBackupPoint.addBackedFile(opBackedFile.get());
-          backupStats.skippedUploadingFile(existedFileCS);
-          continue;
+    // Only use an executor for parallel uploads when parallelism > 1
+    // When set to 1, run synchronously to avoid thread-local state issues 
with CallerRunsPolicy
+    int maxParallelUploads = DEFAULT_MAX_PARALLEL_UPLOADS;

Review Comment:
   Just use `DEFAULT_MAX_PARALLEL_UPLOADS`. No need to set to new variable and 
repass to executor



##########
solr/core/src/java/org/apache/solr/handler/IncrementalShardBackup.java:
##########
@@ -191,55 +213,154 @@ private BackupStats incrementalCopy(Collection<String> 
indexFiles, Directory dir
     URI indexDir = incBackupFiles.getIndexDir();
     BackupStats backupStats = new BackupStats();
 
-    for (String fileName : indexFiles) {
-      Optional<ShardBackupMetadata.BackedFile> opBackedFile = 
oldBackupPoint.getFile(fileName);
-      Checksum originalFileCS = backupRepo.checksum(dir, fileName);
-
-      if (opBackedFile.isPresent()) {
-        ShardBackupMetadata.BackedFile backedFile = opBackedFile.get();
-        Checksum existedFileCS = backedFile.fileChecksum;
-        if (existedFileCS.equals(originalFileCS)) {
-          currentBackupPoint.addBackedFile(opBackedFile.get());
-          backupStats.skippedUploadingFile(existedFileCS);
-          continue;
+    // Only use an executor for parallel uploads when parallelism > 1
+    // When set to 1, run synchronously to avoid thread-local state issues 
with CallerRunsPolicy
+    int maxParallelUploads = DEFAULT_MAX_PARALLEL_UPLOADS;
+    ExecutorService executor =
+        maxParallelUploads > 1
+            ? new ExecutorUtil.MDCAwareThreadPoolExecutor(
+                0,
+                maxParallelUploads,
+                60L,
+                TimeUnit.SECONDS,
+                new SynchronousQueue<>(),
+                new SolrNamedThreadFactory("IncrementalBackup"),

Review Comment:
   Maybe suffix it with `executor` like other threadpools do.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to