[ 
https://issues.apache.org/jira/browse/SOLR-14845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401710#comment-17401710
 ] 

Jan Høydahl commented on SOLR-14845:
------------------------------------

Jeff, do you still have this problem? Does it happen every time, i.e. is it 
reproducible? Does it still occur on a Solr 8.x cluster? If you are able to 
reproduce it, please also copy the relevant ERROR sections of the solr.log 
file on the server, which may shed more light on what happened.

Most likely, there were some low-level I/O issues between your server and the 
disk system while writing the backup for one of the cores, which of course 
becomes more likely the longer the backup job runs. As such it is not a 
Solr bug, although you could wish for more retry logic or something.
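
To illustrate what client-side retry logic could look like, here is a minimal 
Python sketch. It is not part of Solr. The `run_backup` callable is 
hypothetical: it is assumed to submit the BACKUP request with a fresh async id 
and backup name, wait for the task to finish, and return the final state 
string (for example "completed" or "failed"). The attempt count and delays 
are arbitrary assumptions as well.

```python
import time
from typing import Callable


def retry_backup(
    run_backup: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a backup attempt until it reports "completed".

    run_backup is a hypothetical caller-supplied function that performs one
    full backup attempt and returns its final state string.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        state = run_backup()
        if state == "completed":
            return attempt
        if attempt < max_attempts:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"backup still failing after {max_attempts} attempts")
```

Retrying the whole collection backup is the simplest option, but with a job 
that runs for several hours each retry is expensive, so the number of attempts 
should stay small.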

I'll close this as not reproducible if there are no complaints.

> Backup failing with solr 7.7.2 java.io.IOException: Interrupted system call
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-14845
>                 URL: https://issues.apache.org/jira/browse/SOLR-14845
>             Project: Solr
>          Issue Type: Bug
>          Components: Backup/Restore
>    Affects Versions: 7.2.2
>            Reporter: Jeff
>            Priority: Critical
>
> I have a 12 node solrcloud cluster with 48 shards.
> 800GB on each node.
> 7.3 million docs and around 98 GB per shard.
>  
>  
> When I issue the backup command it runs for several hours and produces most 
> of the backup but fails on some shards.
>  
> Command issued
> curl -XPOST 
> 'http://xx.xxx.xxx.xxx:8983/solr/admin/collections?action=BACKUP&name=prod1&collection=PROD&location=/mnt/prodstorage/backup&async=111113&wt=xml'
>  
>     "Response":"TaskId: 1111127375391376590965 webapp=null path=/admin/cores 
> params={core=PROD_shard8_1_replica_n156&async=1111127375391376590965&qt=/admin/cores&name=shard8_1&action=BACKUPCORE&location=file:///mnt/prodstorage/backup/prod1&wt=javabin&version=2}
>  status=0 QTime=0"},
>   "1111127375391376904569":{
>     "responseHeader":{
>       "status":0,
>       "QTime":0},
>     "STATUS":"failed",
>     "Response":"Failed to backup core=PROD_shard19_1_replica_n263 because 
> java.io.IOException: Interrupted system call"},
>   "status":{
>     "state":"failed",    "msg":"found [111112] in failed tasks"}}
>  
>  
> Can I lengthen the timeout? Or run the backup manually?



