[ 
https://issues.apache.org/jira/browse/SOLR-15673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428893#comment-17428893
 ] 

David Smiley commented on SOLR-15673:
-------------------------------------

I haven't looked at the code but I suspect there is an assumption in the 
implementation that the shard shape (specific shards w/ hash ranges) of the 
cluster doesn't change, which isn't necessarily true.  I suspect use of manual 
sharding (aka implicit router) would be problematic as well.  I suspect/worry 
this is a hard problem to correct.

> Incremental backup attempt fails after a shard split operation has completed
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-15673
>                 URL: https://issues.apache.org/jira/browse/SOLR-15673
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Backup/Restore
>    Affects Versions: 8.9
>            Reporter: Sameer
>            Priority: Major
>         Attachments: StackTrace.odt
>
>
> I have been attempting to use the incremental backup API on Solr 8.9.0, but 
> while testing in our product we would occasionally get into a state where all 
> subsequent backup attempts would fail. After some triage we found that it was 
> happening to any collection which had undergone a shard split operation. If 
> we did a backup, completed a shard split operation, then attempted another 
> backup, the second backup would fail with a NoSuchFileException whose message 
> refers to the backup ID of the second backup. 
> Steps to reproduce:
>  
>  * Create a new collection with no associated backups
>  * Run a backup for this collection:
>    /admin/collections?action=BACKUP&name=myBackupName&collection=myCollectionName&location=/path/to/my/shared/drive
>  * Run a shard split operation:
>    /admin/collections?action=SPLITSHARD&collection=name&shard=shardID
>  * Attempt another backup
>  
> Expected Outcome:
> * If this operation is being blocked intentionally, then I would expect an 
> informative error message explaining why it failed. Otherwise I would expect 
> the backup to complete successfully.
> Actual Outcome:
> * The backup operation fails with a NoSuchFileException.
> NOTE: In the exception message below, the number in the name of the missing 
> file (in this case zk_backup_1) corresponds to the backup currently being 
> attempted. 
> java.nio.file.NoSuchFileException: 
> /path/to/my/shared/drive/reproCollectionBackup/reproCollection/zk_backup_1
> I tried a few different workaround attempts, but after going through these 
> steps I wasn’t able to run another backup for the collection.
> Workaround attempt 1:
>  * Used the API to delete the backup
>  * Used the API to purge unused backup files
>  * Restarted Solr
>  * Attempted another backup
>  * Encountered the same failure
> Workaround attempt 2:
>  * Deleted all files in my Solr backup mount location
>  * Restarted Solr
>  * Attempted another backup
>  * Encountered the same failure
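
For anyone trying to reproduce this, the steps in the description could be 
scripted roughly as below. This is only a sketch: it builds the three 
Collections API URLs (backup, split, backup again) and prints them, without 
sending anything. The base URL http://localhost:8983/solr and the shard name 
shard1 are assumptions for illustration, not values from the report, and the 
helper names are hypothetical.

```python
from urllib.parse import urlencode


def collections_url(base_url, action, **params):
    """Build a Collections API URL with URL-encoded query parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


def repro_urls(base_url, collection, shard, backup_name, location):
    """Return the requests from the report in order: BACKUP, SPLITSHARD, BACKUP."""
    backup = collections_url(
        base_url, "BACKUP",
        name=backup_name, collection=collection, location=location,
    )
    split = collections_url(
        base_url, "SPLITSHARD", collection=collection, shard=shard,
    )
    # The second backup reuses the same name and location as the first.
    return [backup, split, backup]


if __name__ == "__main__":
    # Base URL and shard name are assumed; adjust them for the cluster under test.
    for url in repro_urls(
        "http://localhost:8983/solr",
        "myCollectionName",
        "shard1",
        "myBackupName",
        "/path/to/my/shared/drive",
    ):
        print(url)
```

Each URL would need to be requested in order, waiting for the split to 
complete before issuing the final backup request.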



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
