[
https://issues.apache.org/jira/browse/SOLR-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18074611#comment-18074611
]
Rahul Goswami commented on SOLR-18190:
--------------------------------------
Adding feedback from [~lkot] on the dev list
(https://lists.apache.org/thread/57ol13m9omwj186zrz0f5ldhw44t29lo)
"Thanks for sharing this it's very interesting to me. For context we are going
through our own upgrade process. Our latest approach is to actually keep the
cores writeable and lean on Solr's monotonically increasing _version_ to
find what still needs to be rewritten to the new codec. We have some checks
in place to make sure we don't declare victory too early. We are also using
your clever version-sequestering merge policy. Anyways, I was curious if you
had considered reindexing at the Solr update API level (maybe something akin
to REINDEXCOLLECTION but without the same read-only constraints).
From my testing I get the sense that analysis + indexing are the biggest
bottlenecks so using a higher level API may not always matter so much.
The benefit would be you may not need to set the cloud to read-only as
long as you have a reliable high-watermark of the last doc written by the
old codec."
My response:
"Thanks for your thoughts on this. If you look at the
[UPGRADECOREINDEX /admin/cores API|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L289],
which does the core-level upgrade, it already uses the update-level API (to
reindex each reconstructed document from older segments) so that each field
retains its analysis after the upgrade.
Also, the core-level API doesn't have a read-only restriction, and external
parallel writes are fine in standalone mode. It relies on _version_ to ensure
that an upgrade doesn't silently overwrite a concurrent external write or
resurrect a deleted doc (one of the two clients would get an error from the
version check).
My initial reasoning for making a collection read-only in case of SolrCloud
upgrade was as below:
1) For NRT replicas, each replica processes every update independently, and
auto commits and merges can occur at different instants across replicas, so
the segment structure cannot be guaranteed to be consistent. Since the core
upgrade API only processes older segments (and not necessarily the entire
index), upgrading only the leader and letting it forward updates to the
replicas doesn't guarantee that the replicas will also end up with an upgraded
index.
2) To solve this, I decided to have each NRT replica upgraded locally. This
would require blocking the forwarding of updates to avoid version churn and
potential version conflicts, which means no DUP (DistributedUpdateProcessor).
3) With no DUP you cannot allow parallel external writes, since we lose the
version check and run the risk of silently overwriting an externally added doc.
Hence read-only.
BUT, your comment pushed me to rethink certain assumptions, and I realized that
the problem described in #1 can be solved in another way. I could upgrade the
leader with the UPGRADECOREINDEX core API while retaining the DUP, i.e. its
current default behavior. This should also upgrade _a significant_ portion of
each non-leader replica. Running the UPGRADECOREINDEX API on the remaining NRT
replicas thereafter would upgrade any older segments that the leader upgrade
did not cover. There would still be some version churn
due to the non-leader replicas, but it would be quite minimal.
Basically, the existing operational flow in the design was already solving the
problem in #1 without needing to exclude the DUP or make the collection
read-only!
So yes, the collection _can_ continue to accept writes while the upgrade is in
progress. Thank you for giving this a thought and for prompting me to
reconsider the read-only assumption.
Let me also add this discussion to the JIRA so that the design discussion is
not split across two places."
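
For illustration, the version-check argument in the response above can be modeled with a toy in-memory store. This is a minimal sketch and not Solr code: `VersionedStore`, `VersionConflict` and `reindex_from_old_segment` are hypothetical names, and the store only mimics the rule that an update carrying an expected _version_ is rejected unless it matches the document's current version.

```python
import itertools


class VersionConflict(Exception):
    """Raised when the expected version does not match the stored one."""


class VersionedStore:
    """Toy stand-in for an index keyed by document id (hypothetical, NOT Solr code)."""

    def __init__(self):
        self._docs = {}                      # id -> (version, fields)
        self._clock = itertools.count(100)   # monotonically increasing versions

    def version_of(self, doc_id):
        entry = self._docs.get(doc_id)
        return entry[0] if entry else None

    def add(self, doc_id, fields, expected_version=None):
        # With an expected version, the write succeeds only if the document
        # still exists at exactly that version.
        current = self.version_of(doc_id)
        if expected_version is not None and current != expected_version:
            raise VersionConflict(f"{doc_id}: expected {expected_version}, found {current}")
        new_version = next(self._clock)
        self._docs[doc_id] = (new_version, dict(fields))
        return new_version

    def delete(self, doc_id):
        self._docs.pop(doc_id, None)

    def get(self, doc_id):
        entry = self._docs.get(doc_id)
        return entry[1] if entry else None


def reindex_from_old_segment(store, doc_id, fields, version_seen):
    """Re-add a document read from an old segment; skip it if it changed meanwhile."""
    try:
        store.add(doc_id, fields, expected_version=version_seen)
        return True
    except VersionConflict:
        return False
```

In this model, a document that an external client overwrote or deleted after the upgrade read it is left alone: the re-add fails the version check instead of clobbering the newer state, which is the property the response relies on.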
> Collection-Level Index Upgrade API in SolrCloud (UPGRADECOLLECTIONINDEX)
> -------------------------------------------------------------------------
>
> Key: SOLR-18190
> URL: https://issues.apache.org/jira/browse/SOLR-18190
> Project: Solr
> Issue Type: Improvement
> Reporter: Rahul Goswami
> Assignee: Rahul Goswami
> Priority: Major
>
> *+Objective+*
> Expose index-upgrade functionality at collection scope in SolrCloud as a new
> "UPGRADECOLLECTIONINDEX" Collections API with async support and
> `REQUESTSTATUS` tracking.
> +*Background: Core-Level Index Upgrade (UPGRADECOREINDEX)*+
> Solr's UPGRADECOREINDEX CoreAdmin command rewrites segments written by older
> Lucene versions into the current format (as long as the fields are
> stored=true or docValues=true). This makes it possible to use the same index
> across multiple major versions without requiring reindexing from the original
> data source.
> For each core, it:
> 1. Opens the existing index and sets
> [LatestVersionMergePolicy|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicy.java]
> on the IndexWriter to prevent older-format segments from merging with
> latest-format segments.
> 2. Identifies segments written by an older Lucene major version
> ([shouldUpgradeSegment()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L250]
> – any segment whose minVersion predates the current major version).
> 3. Reads every document from those old-format segments, reconstructing
> SolrInputDocuments with all stored fields.
> 4. Re-adds each document through an update processor chain, which writes it
> as a new current-format segment.
> 5. Commits and runs expungeDeletes to remove the now-obsolete old segments.
> Also restores the original merge policy.
> Documents that already reside in current-format segments are untouched. The
> operation is idempotent – re-running it skips segments that are already up to
> date.
> +*Approach*+
> * Leader is upgraded using the recently introduced CoreAdmin
> UPGRADECOREINDEX API
> * Thereafter, the UPGRADECOREINDEX API is called on each NRT replica to
> rewrite any segments left over after upgrading the leader (since the leader
> upgrade would have forwarded writes to the NRT replicas, rewriting most, if
> not all, of the older segments)
> * TLOG/PULL replicas converge via the usual replication mechanism.
> Coordinator waits for replicas to converge in a time-bound manner before
> declaring success.
>
> +*Operational Flow (UPGRADECOLLECTIONINDEX SolrCloud Collections API)*+
> For each shard (sequentially):
> # Coordinator sets LatestVersionMergePolicy on the IndexWriter of each
> replica
> # Identify the current leader and upgrade the leader via CoreAdmin
> `UPGRADECOREINDEX` action. This would also cause the updates to be forwarded
> to the replicas, thereby upgrading the majority, if not all, of the older
> segments in each replica.
> # Upgrade NRT non-leader replicas in sequence using the same mechanism. We
> expect very little version churn since most of the older segments should have
> been rewritten already by the leader upgrade in step 2.
> # TLOG/PULL replicas converge via their normal background replication from
> the now-upgraded leader.
> # Convergence polling: Poll all replicas with `checkOnly=true` until every
> replica reports no old-format segments remaining. ("checkOnly" is a new
> lightweight param introduced on the UPGRADECOREINDEX CoreAdmin API to check
> for the presence of any old-format segments)
> # Restore the original merge policy on each replica
> *+Limitations+*
> - Child/nested documents: Not supported (existing limitation in
> `UpgradeCoreIndex`).
> - Leader election resilience: If a leader election occurs during the
> upgrade, progress may be lost and the command must be re-run. This should be
> fine since the operation is designed to be idempotent and ensures that the
> state remains consistent albeit at the cost of additional re-work.
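
The per-shard operational flow described above can be sketched as follows. This is a rough model under stated assumptions, not the proposed implementation: `FakeReplica` and `upgrade_shard` are hypothetical names, the replica methods merely stand in for the CoreAdmin calls, the forwarding of leader updates to followers is not modeled, and restoring the merge policy even when polling times out is an assumption of this sketch.

```python
import time


class FakeReplica:
    """In-memory stand-in for a replica's core; every method is hypothetical."""

    def __init__(self, name, old_segments, polls_until_converged=0):
        self.name = name
        self.old_segments = old_segments
        self.polls_until_converged = polls_until_converged  # models replication lag
        self.merge_policy = "original"
        self.log = []

    def set_merge_policy(self, policy):
        self.merge_policy = policy
        self.log.append(f"policy={policy}")

    def upgrade_core_index(self):
        # Stands in for the CoreAdmin UPGRADECOREINDEX action.
        self.old_segments = 0
        self.log.append("upgrade")

    def has_old_segments(self):
        # Stands in for UPGRADECOREINDEX with checkOnly=true.
        if self.polls_until_converged > 0:
            self.polls_until_converged -= 1
            return True
        return self.old_segments > 0


def upgrade_shard(leader, nrt_followers, passive_replicas,
                  timeout_s=5.0, poll_interval_s=0.01):
    """Run the flow for one shard; return True once every replica has converged."""
    replicas = [leader, *nrt_followers, *passive_replicas]
    for replica in replicas:                      # set the merge policy everywhere
        replica.set_merge_policy("LatestVersionMergePolicy")
    try:
        leader.upgrade_core_index()               # leader first
        for replica in nrt_followers:             # then NRT followers, in sequence
            replica.upgrade_core_index()
        deadline = time.monotonic() + timeout_s   # time-bound convergence polling
        while any(replica.has_old_segments() for replica in replicas):
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval_s)
        return True
    finally:
        for replica in replicas:                  # restore the original merge policy
            replica.set_merge_policy("original")
```

TLOG/PULL replicas are never upgraded directly here; they only take part in the polling loop, which mirrors the description's reliance on background replication from the upgraded leader.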