[
https://issues.apache.org/jira/browse/SOLR-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rahul Goswami updated SOLR-18096:
---------------------------------
Description:
Forking this JIRA from parent SOLR-17725 based on [~dsmiley]'s suggestion. The
merge policy (LatestVersionMergePolicy) part of the overall enhancement made it
into Solr 10 as part of SOLR-17725. It lets users upgrade their index for the
next Solr version (Solr 11), although with more write overhead and manual
steps.
The other part of the enhancement is this JIRA, which proposes exposing an
UPGRADECOREINDEX CoreAdmin API for standalone installations. It performs the
upgrade in a targeted and efficient way by rewriting only the older segments,
and it abstracts away the setting/unsetting of the merge policy.
{+}*Design/Operational flow*{+}:
The UPGRADECOREINDEX CoreAdmin command rewrites segments written by older
Lucene versions into the current format. This works as long as the fields are
stored=true or docValues=true, because the documents are reconstructed from
those values. This makes it possible to use the same index across multiple
major versions without reindexing from the original data source.
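For illustration, a minimal Java sketch of how a request to the new command
might be addressed. CoreAdmin commands are conventionally invoked as
/solr/admin/cores?action=<ACTION>&core=<name>; that UPGRADECOREINDEX follows
exactly this shape and takes a "core" parameter is an assumption here, as are
the host, port, and core name. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UpgradeCoreIndexRequest {

    /**
     * Builds the CoreAdmin URI for an UPGRADECOREINDEX call.
     * ASSUMPTION: the action uses the usual "action" and "core" query
     * parameters of the CoreAdmin API; the final parameter set may differ.
     */
    public static URI buildUri(String solrBaseUrl, String coreName) {
        String query = "action=UPGRADECOREINDEX&core="
                + URLEncoder.encode(coreName, StandardCharsets.UTF_8);
        return URI.create(solrBaseUrl + "/admin/cores?" + query);
    }

    public static void main(String[] args) {
        // Hypothetical standalone node and core name.
        System.out.println(buildUri("http://localhost:8983/solr", "techproducts"));
    }
}
```

The resulting URI could be sent with any HTTP client against a standalone
node; the sketch stops at building it so that it runs without a Solr instance.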
For each core, it:
1. Opens the existing index and sets
[LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
on the IndexWriter to prevent older-format segments from merging with
latest-format segments.
2. Identifies segments written by an older Lucene major version
([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
– any segment whose minVersion predates the current major version).
3. Reads every document from those old-format segments, reconstructing
SolrInputDocuments with all stored/docValues fields.
4. Re-adds each document through an update processor chain, which writes it
as a new current-format segment.
5. Commits. The commit removes the now-obsolete old segments, since they no
longer contain any live docs.
6. Restores the original merge policy.
Documents that already reside in current-format segments are untouched. The
operation is idempotent – re-running it skips segments that are already up to
date.
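The selection rule and the idempotency claim can be sketched with a toy model
in plain Java. This is not the actual UpgradeCoreIndex code: Segment is a
hypothetical stand-in for a Lucene segment, documents are plain strings, and
the merge policy handling (steps 1 and 6) is left out. Only the "minVersion
predates the current major version" check is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;

public class UpgradeFlowSketch {

    /** Hypothetical stand-in for a segment: oldest contributing major version plus live docs. */
    public record Segment(int minVersionMajor, List<String> liveDocs) {}

    /** Step 2: a segment qualifies when its minVersion predates the current major version. */
    public static boolean shouldUpgrade(Segment segment, int currentMajor) {
        return segment.minVersionMajor() < currentMajor;
    }

    /**
     * Steps 3-5 in miniature: docs from old-format segments are re-added into
     * one new current-format segment, and the emptied old segments are dropped.
     * Current-format segments are carried over untouched.
     */
    public static List<Segment> upgrade(List<Segment> index, int currentMajor) {
        List<Segment> result = new ArrayList<>();
        List<String> rewritten = new ArrayList<>();
        for (Segment segment : index) {
            if (shouldUpgrade(segment, currentMajor)) {
                rewritten.addAll(segment.liveDocs()); // read and re-add
            } else {
                result.add(segment);                  // already current
            }
        }
        if (!rewritten.isEmpty()) {
            result.add(new Segment(currentMajor, List.copyOf(rewritten)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> index = List.of(
                new Segment(9, List.of("doc1", "doc2")),
                new Segment(10, List.of("doc3")));
        List<Segment> once = upgrade(index, 10);
        List<Segment> twice = upgrade(once, 10);
        System.out.println(once.equals(twice)); // prints true: the second run finds nothing to do
    }
}
```

In this model the second run is a no-op because no remaining segment satisfies
shouldUpgrade, which is the sense in which the operation is idempotent.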
was:
Forking this JIRA from parent SOLR-17725 based on [~dsmiley]'s suggestion. The
merge policy (LatestVersionMergePolicy) part of the overall enhancement made it
into Solr 10 as part of SOLR-17725. It lets users upgrade their index for the
next Solr version (Solr 11), although with more write overhead and manual
steps.
The other part of the enhancement is this JIRA, which proposes exposing an
UPGRADECOREINDEX CoreAdmin API for standalone installations. It performs the
upgrade in a targeted and efficient way by rewriting only the older segments,
and it abstracts away the setting/unsetting of the merge policy.
Solr's UPGRADECOREINDEX CoreAdmin command rewrites segments written by older
Lucene versions into the current format (as long as the fields are stored=true
or docValues=true). This makes it possible to use the same index across
multiple major versions without requiring reindexing from the original data
source.
For each core, it:
1. Opens the existing index and sets
[LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
on the IndexWriter to prevent older-format segments from merging with
latest-format segments.
2. Identifies segments written by an older Lucene major version
([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
– any segment whose minVersion predates the current major version).
3. Reads every document from those old-format segments, reconstructing
SolrInputDocuments with all stored fields.
4. Re-adds each document through an update processor chain, which writes it
as a new current-format segment.
5. Commits and runs expungeDeletes to remove the now-obsolete old segments.
Also restores the original merge policy.
Documents that already reside in current-format segments are untouched. The
operation is idempotent – re-running it skips segments that are already up to
date.
> Introduce an UPGRADECOREINDEX CoreAdmin API to support index upgrade
> --------------------------------------------------------------------
>
> Key: SOLR-18096
> URL: https://issues.apache.org/jira/browse/SOLR-18096
> Project: Solr
> Issue Type: Improvement
> Reporter: Rahul Goswami
> Priority: Major
> Labels: pull-request-available
> Fix For: 10.1, 9.11
>
> Time Spent: 1h
> Remaining Estimate: 0h
>