[jira] [Updated] (SOLR-18190) Collection-Level Index Upgrade API in SolrCloud (UPGRADECOLLECTIONINDEX)

Rahul Goswami (Jira) Thu, 02 Apr 2026 23:14:08 -0700


     [ 
https://issues.apache.org/jira/browse/SOLR-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rahul Goswami updated SOLR-18190:
---------------------------------
    Description: 
*+Objective+*

Expose index-upgrade functionality at collection scope in SolrCloud as a new 
"UPGRADECOLLECTIONINDEX" Collections API command with async support and 
`REQUESTSTATUS` tracking.
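
As a rough illustration of how a client might submit the proposed command asynchronously and then track it, the sketch below only builds the request URLs. The `action`, `collection`, `async`, and `requestid` parameters follow existing Collections API conventions; the `UPGRADECOLLECTIONINDEX` action is what this issue proposes, so its final parameter set is an assumption, as are the host and collection names.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust for the actual cluster.
BASE = "http://localhost:8983/solr/admin/collections"


def upgrade_url(collection, request_id):
    """URL that would submit the proposed command asynchronously."""
    params = {
        "action": "UPGRADECOLLECTIONINDEX",  # proposed in this issue
        "collection": collection,
        "async": request_id,                 # caller-chosen request id
    }
    return BASE + "?" + urlencode(params)


def status_url(request_id):
    """URL for checking the async request via REQUESTSTATUS."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id}
    return BASE + "?" + urlencode(params)


print(upgrade_url("techproducts", "upgrade-1"))
print(status_url("upgrade-1"))
```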

*+Approach+*
_Write freeze_ *+* _hybrid local upgrade_ - The collection is set to `readOnly` 
for the duration, and each replica type is upgraded via its designed 
index-update mechanism:
 * The leader is upgraded using the CoreAdmin UPGRADECOREINDEX API that is now 
available.
 * NRT replicas are also upgraded individually using the same UPGRADECOREINDEX 
API.
 * TLOG/PULL replicas converge via their normal background replication from 
the upgraded leader.

Why not upgrade only the leader and rely on distributed forwarding to NRT 
replicas? `DistributedZkUpdateProcessor` enforces the collection-level 
`readOnly` on every node, including replicas receiving forwarded updates, so 
forwarding is blocked by the same write freeze that protects against external 
writes.
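
A toy model of that argument, not Solr code: if every node checks the collection's `readOnly` flag before applying an update, an update forwarded from the leader is rejected just like an external one. The `Node` class and its methods are hypothetical stand-ins for the check that `DistributedZkUpdateProcessor` performs and for the local rewrite path.

```python
class ReadOnlyError(Exception):
    pass


class Node:
    def __init__(self, name, collection_state):
        self.name = name
        self.state = collection_state  # collection properties shared by all nodes
        self.docs = {}

    def process_add(self, doc_id, doc):
        # Distributed path: every node checks the flag, whether the update
        # came from a client or was forwarded by the leader.
        if self.state.get("readOnly"):
            raise ReadOnlyError(f"{self.name}: collection is read-only")
        self.docs[doc_id] = doc

    def local_rewrite(self, doc_id, doc):
        # Local upgrade path: does not go through the distributed check.
        self.docs[doc_id] = doc


state = {"readOnly": True}
leader = Node("leader", state)
replica = Node("nrt-replica", state)

leader.local_rewrite("1", {"id": "1"})      # the leader rewrites locally
try:
    replica.process_add("1", {"id": "1"})   # forwarding to the replica fails
    forwarded = True
except ReadOnlyError:
    forwarded = False

replica.local_rewrite("1", {"id": "1"})     # so the replica upgrades itself
print(forwarded, replica.docs)
```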

+*Operational Flow*+

1. Coordinator sets `readOnly=true` on the collection via `MODIFYCOLLECTION` 
(blocks all external writes at `DistributedZkUpdateProcessor`).
2. For each shard (sequentially):
a. Identify the current leader via `getLeaderRetry()`.
b. *Upgrade the leader* via CoreAdmin `UPGRADECOREINDEX` with 
`cloudMode=true`. The leader uses a stripped-down chain (`LogUpdateProcessor` → 
`RunUpdateProcessor`, no `DistributedUpdateProcessor`) to rewrite old segments 
locally. No version reassignment, no distributed forwarding. The original 
`_version_` is preserved both in the indexed document and on the 
`AddUpdateCommand` (for tlog consistency). After rewriting, the original merge 
policy is restored before commit, followed by `expungeDeletes` to clean 
tombstone segments.
c. *Upgrade NRT non-leader replicas* in parallel using the same 
mechanism. Each NRT replica independently rewrites its own segments with zero 
network I/O.
d. *TLOG/PULL replicas* converge via their normal background replication 
from the now-upgraded leader.
e. *Convergence polling*: Poll all replicas with `checkOnly=true` until 
every replica reports no old-format segments remaining (see Convergence Polling 
below).
3. Set `readOnly=false` only after all shards validate. On any failure, the 
collection remains read-only for operator intervention.
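
The steps above can be sketched as a small coordinator loop. This is a simulation under stated assumptions rather than the proposed implementation: `set_read_only`, `upgrade_core`, and `all_converged` are hypothetical hooks standing in for `MODIFYCOLLECTION`, CoreAdmin `UPGRADECOREINDEX`, and the `checkOnly=true` polling, and the shard layout is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def upgrade_collection(shards, set_read_only, upgrade_core, all_converged):
    """Run the flow over `shards`; return True if the write freeze was lifted.

    shards: dict of shard name -> {"leader": core, "nrt": [cores]}
    The three callables are caller-supplied, hypothetical hooks.
    """
    set_read_only(True)                          # step 1: write freeze
    try:
        for shard, replicas in shards.items():   # step 2: shards in sequence
            upgrade_core(replicas["leader"])     # 2b: leader first
            with ThreadPoolExecutor() as pool:   # 2c: NRT replicas in parallel
                # list() waits for all workers and re-raises any exception
                list(pool.map(upgrade_core, replicas["nrt"]))
            if not all_converged(shard):         # 2e: validate the shard
                raise RuntimeError(f"{shard} did not converge")
    except Exception:
        return False                             # stay read-only for the operator
    set_read_only(False)                         # step 3: lift the freeze
    return True


log = []
shards = {"shard1": {"leader": "s1_r1", "nrt": ["s1_r2", "s1_r3"]}}
ok = upgrade_collection(
    shards,
    set_read_only=lambda value: log.append(("readOnly", value)),
    upgrade_core=lambda core: log.append(("upgrade", core)),
    all_converged=lambda shard: True,
)
print(ok, log[0], log[-1])
```

Returning `False` without touching `readOnly` mirrors the rule that a failed run leaves the collection frozen until an operator steps in.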



*+Limitations+*

- Nested documents: Not supported (existing limitation in `UpgradeCoreIndex`). 
- Write downtime: The collection is unavailable for writes for the duration of 
the upgrade. Zero write downtime may be revisited after solving the DBQ 
resurrection problem.
- Leader election resilience: If a leader election occurs during the upgrade, 
progress may be lost and the command must be re-run. 
- Co-located replica IO: NRT replicas on the same node are upgraded in 
parallel, which may cause IO contention. Node-aware throttling is deferred to 
a future version.

  was:
# Collection-Level Index Upgrade in SolrCloud

## Problem

The existing `UPGRADECOREINDEX` CoreAdmin action rewrites old-format Lucene 
segments in-place by reconstructing documents and re-adding them as 
current-format segments. This operation is blocked in SolrCloud mode. There is 
no way to upgrade the index format of a SolrCloud collection without full 
reindexing from source.

## Goal

Expose index-upgrade functionality at collection scope in SolrCloud as a new 
`UPGRADECOLLECTIONINDEX` Collections API command with async support and 
`REQUESTSTATUS` tracking. The design must handle mixed replica types (NRT, 
TLOG, PULL), prevent index corruption, and minimize operational disruption.

## Design Decision: Write Freeze Required

Two approaches were evaluated:

**Approach A (Zero write downtime)** — Rejected. The upgrader replays documents 
while external writes continue, relying on `_version_`-based optimistic 
concurrency. This has a fatal flaw: **Delete-By-Query resurrection**. Code 
analysis confirms that `UpdateLog.lookupVersion()` does not consult the 
`deleteByQueries` list — it only checks the tlog map, the live index, and the 
`oldDeletes` LRU (which is populated only by delete-by-id, never by DBQ). When 
a document is deleted by DBQ and then replayed by the upgrader, `lookupVersion` 
returns either `null` (doc not found) or a stale tlog entry, allowing the 
re-add to succeed. The document is silently resurrected. This bug exists in 
standalone mode as well but is less likely to trigger.
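
The resurrection scenario can be reproduced with a toy model. This is not Solr's `UpdateLog`: the class below only mimics which structures the lookup consults (the tlog map is omitted for brevity), and the version comparison is deliberately simplified.

```python
class ToyUpdateLog:
    """Toy stand-in for the version lookup described above (not Solr code)."""

    def __init__(self):
        self.index = {}              # doc id -> version (live index)
        self.old_deletes = {}        # doc id -> version, delete-by-id only
        self.delete_by_queries = []  # recorded, never consulted by the lookup

    def lookup_version(self, doc_id):
        if doc_id in self.index:
            return self.index[doc_id]
        return self.old_deletes.get(doc_id)  # None for a DBQ-deleted doc

    def add(self, doc_id, version):
        known = self.lookup_version(doc_id)
        if known is not None and known >= version:
            return False             # rejected: an equal or newer version is known
        self.index[doc_id] = version
        return True

    def delete_by_id(self, doc_id, version):
        self.index.pop(doc_id, None)
        self.old_deletes[doc_id] = version

    def delete_by_query(self, predicate, version):
        self.delete_by_queries.append((predicate, version))
        for doc_id in [d for d in self.index if predicate(d)]:
            del self.index[doc_id]   # leaves no trace in old_deletes


ulog = ToyUpdateLog()
ulog.add("a", 100)
ulog.add("b", 100)
# The upgrader has read both documents at version 100; then writes arrive.
ulog.delete_by_id("a", 101)
ulog.delete_by_query(lambda doc_id: doc_id == "b", 102)

replay_a = ulog.add("a", 100)  # blocked: old_deletes remembers version 101
replay_b = ulog.add("b", 100)  # accepted: the DBQ left nothing to compare
print(replay_a, replay_b, sorted(ulog.index))
```

In this model the delete-by-id is respected on replay, while the document removed by query comes back, which is the failure mode that rules out Approach A.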

**Approach B — Selected: Write freeze + hybrid local upgrade.** Collection is 
set to `readOnly` for the duration. Each replica type is upgraded via its 
designed index-update mechanism. All concurrency edge cases are eliminated.

An alternative within Approach B — upgrading only the leader and relying on 
distributed forwarding to NRT replicas — was also rejected. 
`DistributedZkUpdateProcessor` enforces the collection-level `readOnly` on 
every node, including replicas receiving forwarded updates. Forwarding is 
blocked by the same write freeze that protects against external writes.

## Operational Flow

1. Coordinator sets `readOnly=true` on the collection via `MODIFYCOLLECTION` 
(blocks all external writes at `DistributedZkUpdateProcessor`).
2. For each shard (sequentially):
   a. Identify the current leader via `getLeaderRetry()`.
   b. **Upgrade the leader** via CoreAdmin `UPGRADECOREINDEX` with 
`cloudMode=true`. The leader uses a stripped-down chain (`LogUpdateProcessor` → 
`RunUpdateProcessor`, no `DistributedUpdateProcessor`) to rewrite old segments 
locally. No version reassignment, no distributed forwarding. The original 
`_version_` is preserved both in the indexed document and on the 
`AddUpdateCommand` (for tlog consistency). After rewriting, the original merge 
policy is restored before commit, followed by `expungeDeletes` to clean 
tombstone segments.
   c. **Upgrade NRT non-leader replicas** in parallel using the same mechanism. 
Each NRT replica independently rewrites its own segments with zero network I/O.
   d. **TLOG/PULL replicas** converge via their normal background replication 
from the now-upgraded leader.
   e. **Convergence polling**: Poll all replicas with `checkOnly=true` until 
every replica reports no old-format segments remaining (see Convergence Polling 
below).
3. Clear `readOnly=false` only after all shards validate. On any failure, the 
collection remains read-only for operator intervention.
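
The convergence polling in step 2e might look roughly like the loop below. The Convergence Polling section referenced above is not part of this message, so the timeout, interval, and return value are assumptions, and `has_old_segments` is a hypothetical hook standing in for a `checkOnly=true` request.

```python
import time


def wait_for_convergence(replicas, has_old_segments, timeout_s=600.0,
                         interval_s=5.0, clock=time.monotonic, sleep=time.sleep):
    """Poll until no replica reports old-format segments, or time out.

    has_old_segments(replica) returns True while old segments remain.
    Returns the set of replicas still pending (empty on success).
    """
    deadline = clock() + timeout_s
    pending = set(replicas)
    while pending:
        # Only re-check replicas that have not converged yet.
        pending = {r for r in pending if has_old_segments(r)}
        if not pending or clock() >= deadline:
            break
        sleep(interval_s)
    return pending


# Simulated replicas: number of polls during which old segments are still seen.
remaining = {"pull1": 2, "nrt1": 0}


def check(replica):
    if remaining[replica] > 0:
        remaining[replica] -= 1
        return True
    return False


result = wait_for_convergence(remaining, check, interval_s=0, sleep=lambda s: None)
print(result)
```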


> Collection-Level Index Upgrade API in SolrCloud (UPGRADECOLLECTIONINDEX)
> ------------------------------------------------------------------------
>
>                 Key: SOLR-18190
>                 URL: https://issues.apache.org/jira/browse/SOLR-18190
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Rahul Goswami
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

