[
https://issues.apache.org/jira/browse/SOLR-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821860#comment-15821860
]
Timothy Potter commented on SOLR-9961:
--------------------------------------
Patch is not ready for commit. We need to think about how to provide some
config options like max time to wait and number of threads. Right now, I get
the number of threads from a sys prop, but I think it should probably come
through a marker interface that specific backup repos can implement ... will
post up a better version later today.
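
To make the configuration idea concrete, here is a minimal sketch of how the thread count and max wait time might be resolved. It is not the attached patch: `ParallelRestoreConfig`, `RestoreSettings`, the property name `solr.restore.threads`, and the 60-minute default are all hypothetical names and values chosen for illustration. A repository that implements the opt-in interface supplies its own settings; any other repository falls back to the system property.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical opt-in interface a backup repository could implement; not an existing Solr API. */
interface ParallelRestoreConfig {
    int restoreThreads();
    long maxWaitMillis();
}

final class RestoreSettings {
    // Assumed property name, used only for this sketch.
    static final String THREADS_PROP = "solr.restore.threads";

    /** Thread count from the repository if it opts in, else from the system property (default 1). */
    static int threadsFor(Object repo) {
        if (repo instanceof ParallelRestoreConfig) {
            return Math.max(1, ((ParallelRestoreConfig) repo).restoreThreads());
        }
        return Math.max(1, Integer.getInteger(THREADS_PROP, 1));
    }

    /** Max wait from the repository if it opts in, else an arbitrary 60-minute default. */
    static long maxWaitMillisFor(Object repo) {
        if (repo instanceof ParallelRestoreConfig) {
            return ((ParallelRestoreConfig) repo).maxWaitMillis();
        }
        return TimeUnit.MINUTES.toMillis(60);
    }
}
```

With this shape, a cloud-storage repository could ask for, say, 8 threads while a local-filesystem repository keeps the serial default without any change.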
> RestoreCore needs the option to download files in parallel.
> -----------------------------------------------------------
>
> Key: SOLR-9961
> URL: https://issues.apache.org/jira/browse/SOLR-9961
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Backup/Restore
> Affects Versions: 6.2.1
> Reporter: Timothy Potter
> Attachments: SOLR-9961.patch
>
>
> My backup to cloud storage (Google cloud storage in this case, but I think
> this is a general problem) takes 8 minutes ... the restore of the same core
> takes hours. The restore loop in RestoreCore is serial and doesn't allow me
> to parallelize the expensive part of this operation (the IO from the remote
> cloud storage service). We need the option to parallelize the download (like
> distcp).
> Also, I tried downloading the same directory using gsutil and it was very
> fast, like 2 minutes. So I know it's not the pipe that's limiting perf here.
> Here's a very rough patch that does the parallelization. We may also want to
> consider a two-step approach: 1) download in parallel to a temp dir, 2)
> perform all of the checksum validation against the local temp dir. That
> will save round trips to the remote cloud storage.
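
The two-step approach described in the issue could look roughly like the sketch below. It is not the attached patch and does not use RestoreCore's actual code: `RemoteFetcher` and `ParallelRestoreSketch` are hypothetical names, and a whole-file CRC32 stands in for whatever checksum validation the real restore performs. Step 1 downloads every file into a temp dir on a fixed-size thread pool with an overall time limit; step 2 validates checksums against the local copies only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.zip.CRC32;

/** Hypothetical stand-in for reading a file from the remote backup repository. */
interface RemoteFetcher {
    InputStream open(String fileName) throws IOException;
}

final class ParallelRestoreSketch {

    /** Step 1: copy every file into tempDir using a fixed-size thread pool. */
    static void downloadAll(RemoteFetcher remote, List<String> files, Path tempDir,
                            int threads, long maxWaitMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String name : files) {
                tasks.add(() -> {
                    try (InputStream in = remote.open(name)) {
                        Files.copy(in, tempDir.resolve(name));
                    }
                    return null;
                });
            }
            // invokeAll cancels any task that has not finished when the timeout expires.
            for (Future<Void> f : pool.invokeAll(tasks, maxWaitMillis, TimeUnit.MILLISECONDS)) {
                f.get(); // rethrows a download failure; CancellationException on timeout
            }
        } finally {
            pool.shutdownNow();
        }
    }

    /** Step 2: validate checksums against the local copies, with no remote round trips. */
    static void verifyAll(Path tempDir, Map<String, Long> expectedCrc) throws IOException {
        for (Map.Entry<String, Long> e : expectedCrc.entrySet()) {
            CRC32 crc = new CRC32();
            // Reads the whole file into memory for brevity; large files would be streamed.
            crc.update(Files.readAllBytes(tempDir.resolve(e.getKey())));
            if (crc.getValue() != e.getValue()) {
                throw new IOException("Checksum mismatch for " + e.getKey());
            }
        }
    }
}
```

Keeping the download and the validation as separate passes means the thread pool is only held for the remote IO, and a checksum failure can be reported without touching the remote store again.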