Arda created SOLR-17460:
---------------------------

             Summary: Error During Collection Migration from Solr 7.0 to Solr 
8.4: Missing Files and Shard Restoration Failures
                 Key: SOLR-17460
                 URL: https://issues.apache.org/jira/browse/SOLR-17460
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: hdfs, SolrCloud
    Affects Versions: 8.4, 7.0
            Reporter: Arda


I was attempting to migrate a collection with 3 shards from a Solr 7.0 cluster 
to a Solr 8.4 cluster. The data is stored in HDFS. I followed the 
backup-restore process but encountered issues with two of the shards during the 
restoration.
h1. *Migration Process:*


*1- Backup Command:* To avoid timeouts, I initiated the backup with an async
parameter:


curl -k --negotiate -u : 'https://<solrNode>:<solrPort>/solr/admin/collections?action=BACKUP&name=<backupName>&collection=<solrCollectionName>x&location=<hdfsPath>&async=12346'


*2- Copy Backup to Local:* After the backup, I copied the data from HDFS to the 
local filesystem:

hdfs dfs -copyToLocal <backupPath> <localPath>

*3- Transfer Backup to New Cluster:* I then copied the backup files from the 
older Solr node to the newer one:

scp -pr <localPath> <username>@<ip>:<localPathDestination>


*4- Prepare New HDFS Path:* On the new Solr cluster, I created a new directory 
in HDFS and adjusted ownership:

hdfs dfs -mkdir <pathName2>
hdfs dfs -chown solr:solr <pathName2>


*5- Copy Backup to New HDFS Location:* I transferred the backup data from local 
to the new HDFS path:

hdfs dfs -copyFromLocal <localPathDestination> <pathName2>

*6- Restore Collection:* Finally, I ran the restore command:


curl -k --negotiate -u : 'https://<solrNode>:<solrPort>/solr/admin/collections?action=RESTORE&name=<backupName>&collection=<solrCollectionName>x&location=<hdfs_path>&async=12345'
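
Because both the backup and the restore were submitted with an async ID, their outcome has to be checked separately through the Collections API REQUESTSTATUS action. A minimal sketch of that check follows; `request_status_url` is a hypothetical helper name introduced only for illustration, and `wt=json` is an assumption about the preferred response format:

```shell
# Hypothetical helper (not part of Solr): build the REQUESTSTATUS URL
# for an async request ID.
#   $1 = Solr base URL, e.g. https://<solrNode>:<solrPort>/solr
#   $2 = the ID that was passed as async=... to BACKUP or RESTORE
request_status_url() {
  printf '%s/admin/collections?action=REQUESTSTATUS&requestid=%s&wt=json\n' "$1" "$2"
}

# Usage, with the same Kerberos options as the commands above:
#   curl -k --negotiate -u : "$(request_status_url 'https://<solrNode>:<solrPort>/solr' 12346)"
```

Checking the backup's request ID before copying anything out of HDFS would help rule out an incomplete backup as the source of the missing files.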
h1. *Issue:*


After the restore process completed, I found that two of the shards could not 
be restored. The logs displayed the following errors:

*Error During Shard Restoration:*

ERROR [c:<solrCollectionName> s:shard2 r:core_node5
x:<solrCollectionName>_shard2_replica_n4] o.a.s.h.RequestHandlerBase
org.apache.solr.common.SolrException: Error CREATEing SolrCore
'<solrCollectionName>_shard2_replica_n4': Unable to create core
[<solrCollectionName>_shard2_replica_n4]
Caused by: org.apache.solr.handler.component.QueryDocAuthorizationComponent.....

*FileNotFoundException and Index Corruption:*

WARN (parallelCoreAdminExecutor-6-thread-7-processing-n:<solrNode>:<solrPort>_solr
x:<solrCollectionName>_shard2_replica_n1 <numbers> RESTORECORE)
[x:<solrCollectionName>_shard2_replica_n1] o.a.s.h.RestoreCore Could not switch
to restored index. Rolling back to the current index =>
org.apache.lucene.index.CorruptIndexException: Unexpected file read error while
reading index. (resource=BufferedChecksumIndexInput(segments_1g9dk))
Caused by: java.io.FileNotFoundException: File does not exist:
hdfs://<hdfsPath>/core_node2/data/restore/<fileName>


It appears that Solr is looking for a file in HDFS that doesn't exist, despite 
no manual deletions being made. I cannot determine why these specific shards 
failed to restore, or why the system is unable to locate the required files.
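
One check that might narrow this down is to compare the backup tree as it was copied out of the old cluster with the tree that arrived on the new node, before it is loaded into HDFS. The following is a minimal sketch under the assumption that both copies are reachable as local directories; `manifest` and `compare_backups` are hypothetical helper names, not Solr or Hadoop commands:

```shell
# Hypothetical helper: print "checksum size ./relative/path" for every
# regular file under directory $1, sorted by path.
manifest() {
  (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# Hypothetical helper: exit 0 only when both directory trees contain the
# same files with the same sizes and checksums.
#   $1 = first copy of the backup, $2 = second copy of the backup
compare_backups() {
  src_list=$(manifest "$1") || return 2
  dst_list=$(manifest "$2") || return 2
  [ "$src_list" = "$dst_list" ]
}
```

When the two copies sit on different machines, the output of `manifest` could be saved to a file on each side and the two files compared with `diff`. On the HDFS side, `hdfs dfs -count <path>` reports the directory count, file count and total size, which gives a rough comparison between the old and new backup locations.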

Expected Behavior:
The backup and restore process should complete without errors, and all shards 
should be restored successfully to the new cluster.

Actual Behavior:
Two shards failed to restore, with errors related to missing files and index 
corruption.


