[
https://issues.apache.org/jira/browse/SOLR-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527505#comment-16527505
]
Jan Høydahl commented on SOLR-12523:
------------------------------------
I tested the patch, and it passes precommit. Below is the new error text the
API returns when a backup is attempted across two nodes that do NOT share the
backup drive. The same error is also written to the logs on both nodes. Will
commit soon.
{noformat}
{
"responseHeader": {
"status": 500, "QTime": 135
}, "failure": {
"10.5.0.5:8983_solr":
"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
from server at http://10.5.0.5:8983/solr: Failed to backup
core=coll2_shard1_replica_n2 because org.apache.solr.common.SolrException:
Directory to contain snapshots doesn't exist: file:///back/myback2. Backup
folder must already exist. Note also that Backup/Restore of a SolrCloud
collection requires a shared file system mounted at the same path on all nodes!"
}, "Operation backup caused exception:":
"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not backup all replicas", "exception": {
"msg": "Could not backup all replicas", "rspCode": 500
}, "error": {
"metadata": [
"error-class", "org.apache.solr.common.SolrException", "root-error-class",
"org.apache.solr.common.SolrException"
], "msg": "Could not backup all replicas", "trace":
"org.apache.solr.common.SolrException: Could not backup all replicas\n\tat
org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
[...snip...]
}
}{noformat}
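For anyone scripting against this response, here is a minimal Python sketch of
how the per-node failures could be pulled out of it. It assumes only the
response shape shown above ("responseHeader", "failure", "exception"); the
function name summarize_backup_response and the abbreviated sample string are
hypothetical and are not part of Solr or of the patch.

```python
import json


def summarize_backup_response(response_text):
    """Collect the status, top-level message and per-node failures.

    Assumes the JSON layout shown in the response above; keys that are
    missing are treated as empty rather than raising.
    """
    response = json.loads(response_text)
    header = response.get("responseHeader", {})
    exception = response.get("exception", {})
    return {
        "status": header.get("status"),
        "message": exception.get("msg"),
        # Maps node name (e.g. "10.5.0.5:8983_solr") to its error text.
        "failed_nodes": dict(response.get("failure", {})),
    }


# Abbreviated copy of the response above, used only for illustration.
SAMPLE = """
{
  "responseHeader": {"status": 500, "QTime": 135},
  "failure": {
    "10.5.0.5:8983_solr": "Failed to backup core=coll2_shard1_replica_n2"
  },
  "exception": {"msg": "Could not backup all replicas", "rspCode": 500}
}
"""

summary = summarize_backup_response(SAMPLE)
for node, error in summary["failed_nodes"].items():
    print(node, "->", error)
```

A node listed under "failure" is the one to check first for a missing or
unshared backup directory.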
One unrelated observation: part of the error response says *Could not backup
all replicas*. While any core could be called a "replica", in the context of a
collection backup it would perhaps be more precise to say *Could not backup
all shards*?
> Confusing error reporting if backup attempted on non-shared FS
> --------------------------------------------------------------
>
> Key: SOLR-12523
> URL: https://issues.apache.org/jira/browse/SOLR-12523
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Backup/Restore
> Affects Versions: 7.3.1
> Reporter: Timothy Potter
> Assignee: Jan Høydahl
> Priority: Minor
> Fix For: master (8.0), 7.5
>
> Attachments: SOLR-12523.patch
>
>
> So I have a large collection with 4 shards across 2 nodes. When I try to back
> it up with:
> {code}
> curl
> "http://localhost:8984/solr/admin/collections?action=BACKUP&name=sigs&collection=foo_signals&async=5&location=backups"
> {code}
> I either get:
> {code}
> "5170256188349065":{
> "responseHeader":{
> "status":0,
> "QTime":0},
> "STATUS":"failed",
> "Response":"Failed to backup core=foo_signals_shard1_replica_n2 because
> org.apache.solr.common.SolrException: Directory to contain snapshots doesn't
> exist: file:///vol1/cloud84/backups/sigs"},
> "5170256187999044":{
> "responseHeader":{
> "status":0,
> "QTime":0},
> "STATUS":"failed",
> "Response":"Failed to backup core=foo_signals_shard3_replica_n10 because
> org.apache.solr.common.SolrException: Directory to contain snapshots doesn't
> exist: file:///vol1/cloud84/backups/sigs"},
> {code}
> or if I create the directory, then I get:
> {code}
> {
> "responseHeader":{
> "status":0,
> "QTime":2},
> "Operation backup caused
> exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> The backup directory already exists: file:///vol1/cloud84/backups/sigs/",
> "exception":{
> "msg":"The backup directory already exists:
> file:///vol1/cloud84/backups/sigs/",
> "rspCode":400},
> "status":{
> "state":"failed",
> "msg":"found [2] in failed tasks"}}
> {code}
> I'm thinking this has to do with having 2 cores from the same collection on
> the same node, but I can't get a collection with 1 shard on each node to work
> either:
> {code}
> "ec2-52-90-245-38.compute-1.amazonaws.com:8984_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
> from server at http://ec2-52-90-245-38.compute-1.amazonaws.com:8984/solr:
> Failed to backup core=system_jobs_history_shard2_replica_n6 because
> org.apache.solr.common.SolrException: Directory to contain snapshots doesn't
> exist: file:///vol1/cloud84/backups/ugh1"}
> {code}
> What's weird is that the replica (system_jobs_history_shard2_replica_n6) is
> not even on the ec2-52-90-245-38.compute-1.amazonaws.com node! It lives on a
> different node.