[ 
https://issues.apache.org/jira/browse/SOLR-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Goswami updated SOLR-18096:
---------------------------------
    Description: 
Forking this JIRA from parent SOLR-17725 based on [~dsmiley]'s suggestion. The 
merge policy (LatestVersionMergePolicy) part of the overall enhancement made it 
into Solr 10 as part of SOLR-17725. It lets users upgrade their index for the 
next Solr version (Solr 11), although with more write overhead and manual 
steps.

The other part of the enhancement is this JIRA, which proposes exposing an 
UPGRADECOREINDEX CoreAdmin API for standalone installations. The API performs 
the upgrade in a targeted and efficient way by rewriting only the older 
segments, and it abstracts away the setting/unsetting of the merge policy.

{+}*Design/Operational flow*{+}:

The UPGRADECOREINDEX CoreAdmin command rewrites segments written by older 
Lucene versions into the current format, provided the fields are stored=true 
or docValues=true so that their values can be read back from the index. This 
makes it possible to use the same index across multiple major versions without 
reindexing from the original data source.
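To illustrate the stored/docValues precondition, here is a minimal sketch of a check that lists fields whose values could not be rebuilt from the index. The FieldDef class, the unrecoverableFields helper and the field names are hypothetical and only model the rule stated above; they are not part of Solr or of the patch.

```java
import java.util.List;
import java.util.stream.Collectors;

class UpgradeEligibility {

    // Hypothetical, simplified stand-in for a schema field definition.
    static final class FieldDef {
        final String name;
        final boolean stored;
        final boolean docValues;

        FieldDef(String name, boolean stored, boolean docValues) {
            this.name = name;
            this.stored = stored;
            this.docValues = docValues;
        }

        // A value can be read back only if it is stored or has docValues.
        boolean isRecoverable() {
            return stored || docValues;
        }
    }

    // Returns the names of fields that are neither stored nor docValues.
    static List<String> unrecoverableFields(List<FieldDef> schema) {
        return schema.stream()
                .filter(f -> !f.isRecoverable())
                .map(f -> f.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FieldDef> schema = List.of(
                new FieldDef("id", true, false),
                new FieldDef("price", false, true),
                new FieldDef("body_txt", false, false)); // indexed-only field
        System.out.println("Not recoverable: " + unrecoverableFields(schema));
    }
}
```

A non-empty result would suggest that a rewrite from the index alone loses data for those fields, in which case reindexing from the original source remains the safer route.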

  For each core, it:
  1. Opens the existing index and sets 
[LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
 on the IndexWriter to prevent older-format segments from merging with 
latest-format segments.
  2. Identifies segments written by an older Lucene major version 
([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
 – any segment whose minVersion predates the current major version).
  3. Reads every document from those old-format segments, reconstructing 
SolrInputDocuments from all stored/docValues fields.
  4. Re-adds each document through an update processor chain, which writes it 
to a new current-format segment.
  5. Commits, which removes the now-obsolete old segments because they no 
longer contain any live docs.
  6. Restores the original merge policy.

  Documents that already reside in current-format segments are untouched. The 
operation is idempotent – re-running it skips segments that are already up to 
date.
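
The flow above can be sketched with a toy in-memory model. This is not the UpgradeCoreIndex implementation: the Segment class, the CURRENT_MAJOR value and the upgrade method are hypothetical stand-ins, and only the name shouldUpgradeSegment is borrowed from the linked source. The sketch covers steps 2 to 5 and shows why a second run has nothing left to do.

```java
import java.util.ArrayList;
import java.util.List;

class UpgradeFlowSketch {

    // Hypothetical stand-in for the current Lucene major version.
    static final int CURRENT_MAJOR = 10;

    // Toy segment: oldest Lucene major that contributed to it, plus its live docs.
    static final class Segment {
        final int minVersionMajor;
        final List<String> liveDocs;

        Segment(int minVersionMajor, List<String> liveDocs) {
            this.minVersionMajor = minVersionMajor;
            this.liveDocs = new ArrayList<>(liveDocs);
        }
    }

    // Step 2: a segment needs upgrading if its minVersion predates the current major.
    static boolean shouldUpgradeSegment(Segment s) {
        return s.minVersionMajor < CURRENT_MAJOR;
    }

    // Steps 3-5 on the toy model. Returns the number of documents rewritten.
    static int upgrade(List<Segment> index) {
        List<String> rewritten = new ArrayList<>();
        for (Segment s : index) {
            if (shouldUpgradeSegment(s)) {
                rewritten.addAll(s.liveDocs); // step 3: read every live doc
                s.liveDocs.clear();           // the re-added doc supersedes the old copy
            }
        }
        // Step 5: old-format segments without live docs are dropped.
        index.removeIf(s -> shouldUpgradeSegment(s) && s.liveDocs.isEmpty());
        // Step 4: re-added docs land in a new current-format segment.
        if (!rewritten.isEmpty()) {
            index.add(new Segment(CURRENT_MAJOR, rewritten));
        }
        return rewritten.size();
    }

    public static void main(String[] args) {
        List<Segment> index = new ArrayList<>();
        index.add(new Segment(9, List.of("doc1", "doc2")));
        index.add(new Segment(10, List.of("doc3")));
        index.add(new Segment(9, List.of("doc4")));

        System.out.println("first run rewrote " + upgrade(index) + " docs");
        System.out.println("second run rewrote " + upgrade(index) + " docs");
    }
}
```

In this model the current-format segment holding doc3 is never touched, and the second call finds no segment that satisfies shouldUpgradeSegment, which mirrors the idempotency described above.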

 

  was:
Forking this JIRA from parent SOLR-17725 based on [~dsmiley]'s suggestion. The 
merge policy (LatestVersionMergePolicy) part of the overall enhancement made it 
into Solr 10 as part of SOLR-17725 which enables users to be able to upgrade 
their index for the next Solr version (Solr 11), although with more write 
overhead and manual steps.

The other part of the enhancement is this JIRA which proposes exposing an 
UPGRADECOREINDEX CoreAdmin API for standalone installations to be able to 
achieve the upgrade in a highly targeted and efficient way by only upgrading 
older segments and abstracts the setting/unsetting of the merge policy.



Solr's UPGRADECOREINDEX CoreAdmin command rewrites segments written by older 
Lucene versions into the current format (as long as the fields are stored=true 
or docValues=true). This makes it possible to use the same index across 
multiple major versions without requiring reindexing from the original data 
source .

  For each core, it:
  1. Opens the existing index and sets 
[LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
 on the IndexWriter to prevent older-format segments from merging with 
latest-format segments.

  2. Identifies segments written by an older Lucene major version 
([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
 – any segment whose minVersion predates the current major version).
  3. Reads every document from those old-format segments, reconstructing 
SolrInputDocuments with all stored fields.
  4. Re-adds each document through an update processor chain, which writes it 
as a new current-format segment.
  5. Commits and runs expungeDeletes to remove the now-obsolete old segments. 
Also restores the original merge policy. 

  Documents that already reside in current-format segments are untouched. The 
operation is idempotent – re-running it skips segments that are already up to 
date.

 


> Introduce a UPGRADECOREINDEX CoreAdmin API to support index upgrade
> -------------------------------------------------------------------
>
>                 Key: SOLR-18096
>                 URL: https://issues.apache.org/jira/browse/SOLR-18096
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Rahul Goswami
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.1, 9.11
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>



