[jira] [Updated] (SOLR-18190) Collection-Level Index Upgrade API in SolrCloud (UPGRADECOLLECTIONINDEX)

Rahul Goswami (Jira) Sun, 12 Apr 2026 21:57:41 -0700


     [ 
https://issues.apache.org/jira/browse/SOLR-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rahul Goswami updated SOLR-18190:
---------------------------------
    Description: 
*+Objective+*

Expose index-upgrade functionality at collection scope in SolrCloud as a new 
"UPGRADECOLLECTIONINDEX" Collections API with async support and `REQUESTSTATUS` 
tracking.
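
A minimal sketch of how a client might submit the proposed action 
asynchronously and then track it. The `UPGRADECOLLECTIONINDEX` action name 
comes from this proposal. The `collection` parameter name, the host, and the 
request id are assumptions borrowed from existing Collections API conventions, 
not a confirmed interface.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to the actual cluster.
SOLR_COLLECTIONS_API = "http://localhost:8983/solr/admin/collections"


def upgrade_request_url(collection: str, request_id: str) -> str:
    """Build the URL that would submit the proposed upgrade asynchronously."""
    params = {
        "action": "UPGRADECOLLECTIONINDEX",  # proposed in this issue, not released
        "collection": collection,            # assumed parameter name
        "async": request_id,                 # Collections API async request id
    }
    return f"{SOLR_COLLECTIONS_API}?{urlencode(params)}"


def status_request_url(request_id: str) -> str:
    """Build the REQUESTSTATUS URL used to poll the async request."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id}
    return f"{SOLR_COLLECTIONS_API}?{urlencode(params)}"
```

The two URLs would be issued in order: the first returns immediately with the 
request id, and the second is polled until the request reports a terminal 
state.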

 

+*Background: Core-Level Index Upgrade (UPGRADECOREINDEX)*+

Solr's UPGRADECOREINDEX CoreAdmin command rewrites segments written by older 
Lucene versions into the current format, provided the fields are stored=true 
or docValues=true. This makes it possible to use the same index across 
multiple major versions without reindexing from the original data source.

  For each core, it:
  1. Opens the existing index and sets 
[LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
 on the IndexWriter to prevent older-format segments from merging with 
latest-format segments.

  2. Identifies segments written by an older Lucene major version 
([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
 – any segment whose minVersion predates the current major version).
  3. Reads every document from those old-format segments, reconstructing 
SolrInputDocuments with all stored fields.
  4. Re-adds each document through an update processor chain, which writes it 
as a new current-format segment.
  5. Commits and runs expungeDeletes to remove the now-obsolete old segments. 
Also restores the original merge policy. 

  Documents that already reside in current-format segments are untouched. The 
operation is idempotent – re-running it skips segments that are already up to 
date.
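
The selection rule in step 2 and the idempotency claim can be illustrated with 
a toy model. This is only a sketch: `Segment`, `should_upgrade`, and 
`simulate_upgrade` are hypothetical names, a segment is reduced to the major 
version of its minVersion, and the real rewrite does not map old segments to 
new ones one-to-one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    name: str
    min_version_major: int  # major part of the segment's minVersion


def should_upgrade(segment: Segment, current_major: int) -> bool:
    """A segment qualifies when its minVersion predates the current major."""
    return segment.min_version_major < current_major


def segments_to_upgrade(segments: List[Segment], current_major: int) -> List[Segment]:
    return [s for s in segments if should_upgrade(s, current_major)]


def simulate_upgrade(segments: List[Segment], current_major: int) -> List[Segment]:
    """Toy model of one run: old segments are replaced by current-format ones."""
    old = segments_to_upgrade(segments, current_major)
    kept = [s for s in segments if s not in old]
    rewritten = [Segment(f"{s.name}_rewritten", current_major) for s in old]
    return kept + rewritten
```

Running `simulate_upgrade` a second time on its own output selects nothing and 
returns the same list, which is the sense in which the operation is 
idempotent.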

*+Approach+*
 * The leader is upgraded using the recently introduced CoreAdmin 
UPGRADECOREINDEX API.
 * UPGRADECOREINDEX is then called on each NRT replica to rewrite any segments 
left over after the leader upgrade. The leader upgrade forwards its writes to 
the NRT replicas, so most, if not all, of their older segments should already 
have been rewritten.
 * TLOG/PULL replicas converge via the usual replication mechanism.

The coordinator waits, within a time bound, for the replicas to converge 
before declaring success.
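
The split by replica type can be sketched as a small planning function. The 
`Replica` class, the `plan_shard` helper, and the dictionary keys are 
hypothetical and serve only to make the three cases explicit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Replica:
    name: str
    type: str            # "NRT", "TLOG" or "PULL"
    leader: bool = False


def plan_shard(replicas: List[Replica]) -> Dict[str, List[str]]:
    """Group one shard's replicas by how they are expected to get upgraded."""
    leaders = [r for r in replicas if r.leader]
    if len(leaders) != 1:
        raise ValueError("expected exactly one leader per shard")
    followers = [r for r in replicas if not r.leader]
    return {
        # Upgraded first with UPGRADECOREINDEX.
        "leader": [leaders[0].name],
        # Get a follow-up UPGRADECOREINDEX call for leftover segments.
        "nrt_followup": [r.name for r in followers if r.type == "NRT"],
        # No explicit call; they converge through replication.
        "via_replication": [r.name for r in followers if r.type in ("TLOG", "PULL")],
    }
```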

 

+*Operational Flow*+

For each shard (sequentially):
 #   The coordinator sets LatestVersionMergePolicy on the IndexWriter of each 
replica.
 #   Identify the current leader and upgrade it via the CoreAdmin 
`UPGRADECOREINDEX` action. This also causes the updates to be forwarded to the 
replicas, thereby upgrading the majority, if not all, of each replica's 
segments.
 #   Upgrade NRT non-leader replicas in sequence using the same mechanism. Very 
little version churn is expected, since most segments should already have been 
rewritten by the leader upgrade in step 2.
 #   TLOG/PULL replicas converge via their normal background replication from 
the now-upgraded leader.
 #   Convergence polling: Poll all replicas with `checkOnly=true` until every 
replica reports no old-format segments remaining. ("checkOnly" is a new 
lightweight param introduced on the UPGRADECOREINDEX CoreAdmin API to check for 
the presence of any old-format segments.)
 #   Restore the original merge policy on each replica.
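
The per-shard sequence above can be sketched as follows. All callables are 
hypothetical stand-ins for the real CoreAdmin calls (`has_old_segments` would 
wrap a `checkOnly=true` request), and the timeout values are placeholders. 
Restoring the merge policy in a `finally` block, so that it also happens on 
timeout or error, is an assumption of this sketch and is not stated in the 
description.

```python
import time
from typing import Callable, Sequence


def upgrade_shard(
    leader: str,
    nrt_replicas: Sequence[str],
    passive_replicas: Sequence[str],      # TLOG/PULL replicas
    set_merge_policy: Callable[[str], None],
    upgrade_core: Callable[[str], None],
    has_old_segments: Callable[[str], bool],
    reset_merge_policy: Callable[[str], None],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
) -> bool:
    """Return True if every replica converged before the deadline."""
    everyone = [leader, *nrt_replicas, *passive_replicas]
    for core in everyone:                 # step 1
        set_merge_policy(core)
    try:
        upgrade_core(leader)              # step 2
        for core in nrt_replicas:         # step 3, one replica at a time
            upgrade_core(core)
        # Step 4 needs no call: TLOG/PULL replicas replicate on their own.
        deadline = time.monotonic() + timeout_s
        while True:                       # step 5
            if not any(has_old_segments(core) for core in everyone):
                return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval_s)
    finally:
        for core in everyone:             # step 6
            reset_merge_policy(core)
```

A collection-level coordinator would call this once per shard, in sequence, 
and stop at the first shard that returns False.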

*+Limitations+*
 - Child/nested documents: Not supported (existing limitation in 
`UpgradeCoreIndex`). 
 - Leader election resilience: If a leader election occurs during the upgrade, 
progress may be lost and the command must be re-run. This should be fine, since 
the operation is designed to be idempotent and keeps the state consistent, 
albeit at the cost of additional rework.

  was:
*+Objective+*

Expose index-upgrade functionality at collection scope in SolrCloud as a new 
"UPGRADECOLLECTIONINDEX" Collections API with async support and `REQUESTSTATUS` 
tracking.

*+Approach+*
_Make collection read-only_ + _hybrid local upgrade_ - The collection is set to 
`readOnly` for the duration, and each replica type is upgraded via its 
designed index-update mechanism:
 * Leader is upgraded using the recently introduced CoreAdmin UPGRADECOREINDEX 
API
 * Each NRT replica gets individually upgraded using the same UPGRADECOREINDEX 
API 
 * TLOG/PULL replicas converge via the usual replication mechanism. 

Coordinator waits for replicas to converge in a timebound manner before 
declaring success.

Why not upgrade only the leader and rely on distributed forwarding to NRT 
replicas? `DistributedZkUpdateProcessor` enforces the collection-level 
`readOnly` on every node, including replicas receiving forwarded updates, so 
forwarding is blocked while the collection is read-only.

 

+*Background: Core-Level Index Upgrade (UPGRADECOREINDEX)*+

Solr's UPGRADECOREINDEX CoreAdmin command rewrites segments written by older 
Lucene versions into the current format, provided the fields are stored=true 
or docValues=true. This makes it possible to use the same index across 
multiple major versions without reindexing from the original data source.

  For each core, it:
  1. Opens the existing index and sets 
[LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
 on the IndexWriter to prevent older-format segments from merging with 
latest-format segments.

  2. Identifies segments written by an older Lucene major version 
([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
 – any segment whose minVersion predates the current major version).
  3. Reads every document from those old-format segments, reconstructing 
SolrInputDocuments with all stored fields.
  4. Re-adds each document through an update processor chain, which writes it 
as a new current-format segment.
  5. Commits and runs expungeDeletes to remove the now-obsolete old segments. 
Also restores the original merge policy. 

  Documents that already reside in current-format segments are untouched. The 
operation is idempotent – re-running it skips segments that are already up to 
date.

 

+*Operational Flow*+

1. Coordinator sets `readOnly=true` on the collection via `MODIFYCOLLECTION` 
(blocks all external writes at `DistributedZkUpdateProcessor`).

2. For each shard (sequentially):
  a. Identify the current leader 
  b. Upgrade the leader via the CoreAdmin `UPGRADECOREINDEX` action. The leader 
uses a stripped-down chain (`LogUpdateProcessor` → `RunUpdateProcessor`, no 
`DistributedUpdateProcessor`) to rewrite old segments locally, with no version 
reassignment and no distributed forwarding. The original `_version_` is 
preserved both in the indexed document and on the `AddUpdateCommand` (for tlog 
consistency). After rewriting, the original merge policy is restored before 
commit.
  c. Upgrade NRT non-leader replicas in parallel using the same mechanism. Each 
NRT replica independently rewrites its own segments with zero network I/O.
  d. TLOG/PULL replicas converge via their normal background replication from 
the now-upgraded leader.
  e. Convergence polling: Poll all replicas with `checkOnly=true` until every 
replica reports no old-format segments remaining. ("checkOnly" is a new 
lightweight param introduced on the UPGRADECOREINDEX CoreAdmin API to check for 
presence of any old-format segments)

3. Clear the read-only state (`readOnly=false`) only after all shards validate. 
On any failure, the collection remains read-only for operator intervention.
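
The read-only bracket in steps 1 to 3 of this earlier design can be sketched 
as below. `set_read_only` is a hypothetical wrapper around a 
`MODIFYCOLLECTION` call and `upgrade_shard` a hypothetical per-shard routine 
that returns True once the shard validates. Neither name comes from the 
proposal.

```python
from typing import Callable, Iterable


def upgrade_with_read_only(
    shards: Iterable[str],
    set_read_only: Callable[[bool], None],
    upgrade_shard: Callable[[str], bool],
) -> bool:
    """Return True when every shard validated and writes were re-enabled."""
    set_read_only(True)                  # step 1: block external writes
    for shard in shards:                 # step 2: shards are handled sequentially
        if not upgrade_shard(shard):
            # Step 3, failure path: leave the collection read-only so that
            # an operator can intervene.
            return False
    set_read_only(False)                 # step 3, success path
    return True
```

An exception raised by `upgrade_shard` propagates without re-enabling writes, 
which matches the stated behaviour on failure.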

 

*+Limitations+*
 - Nested documents: Not supported (existing limitation in `UpgradeCoreIndex`). 
 - Write downtime: The collection is unavailable for external writes for the 
duration of the upgrade. _Note_: The CoreAdmin API currently has no such 
limitation in standalone mode. In SolrCloud mode, however, a leader election 
can occur mid-upgrade, and overall cluster stability and state correctness must 
be considered. Blocking writes removes one variable and makes the design 
simpler to reason about.
 - Leader election resilience: If a leader election occurs during the upgrade, 
progress may be lost and the command must be re-run. This should be fine since 
the operation is designed to be idempotent.
 - Co-located replica IO: NRT replicas on the same node are upgraded in 
parallel, which may cause IO contention. Node-aware throttling is deferred to a 
future version.


> Collection-Level Index Upgrade API in SolrCloud (UPGRADECOLLECTIONINDEX)
> -------------------------------------------------------------------------
>
>                 Key: SOLR-18190
>                 URL: https://issues.apache.org/jira/browse/SOLR-18190
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Rahul Goswami
>            Assignee: Rahul Goswami
>            Priority: Major


