Hi Luke,
Thanks for your thoughts on this. If you look at the UPGRADECOREINDEX
/admin/cores API
<https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/admin/api/UpgradeCoreIndex.java#L289>,
which does the core-level upgrade, it already uses the update-level API (to
reindex each reconstructed document from older segments) so that each field
retains its analysis post upgrade.

Also, the core-level API doesn't have a read-only restriction, and external
parallel writes are fine in standalone mode. It relies on _version_ to
ensure that an upgrade doesn't silently overwrite a concurrent external
write or resurrect a deleted doc (one of the clients would get an error
from the version check).
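
As a rough illustration of that version check, here is a toy in-memory model
in Python. It is not Solr code: ToyCore, VersionConflict and the method names
are hypothetical, and the rules only mirror Solr's documented optimistic
concurrency semantics for _version_ (>1 must match exactly, 1 means the doc
must exist, <0 means it must not exist, 0 means no check).

```python
class VersionConflict(Exception):
    """Stands in for the conflict error a client gets on a failed version check."""


class ToyCore:
    """Hypothetical in-memory stand-in for a core; illustration only."""

    def __init__(self):
        self._docs = {}   # doc id -> (version, fields)
        self._clock = 1   # versions handed out start at 2, so 0 and 1 stay special

    def _check(self, doc_id, expected):
        current = self._docs.get(doc_id)
        if expected > 1 and (current is None or current[0] != expected):
            raise VersionConflict(doc_id)   # doc changed or vanished since it was read
        if expected == 1 and current is None:
            raise VersionConflict(doc_id)   # doc was required to exist
        if expected < 0 and current is not None:
            raise VersionConflict(doc_id)   # doc was required not to exist

    def add(self, doc_id, fields, expected=0):
        self._check(doc_id, expected)
        self._clock += 1
        self._docs[doc_id] = (self._clock, dict(fields))
        return self._clock

    def delete(self, doc_id, expected=0):
        self._check(doc_id, expected)
        self._docs.pop(doc_id, None)

    def get(self, doc_id):
        return self._docs.get(doc_id)


core = ToyCore()
seen = core.add("doc1", {"title": "old"})   # version the upgrade read
core.delete("doc1")                         # concurrent external delete
try:
    core.add("doc1", {"title": "old"}, expected=seen)   # upgrade's rewrite
except VersionConflict:
    print("rewrite rejected, doc stays deleted")
```

In this model the upgrade's rewrite carries the version it originally read,
so a concurrent external update or delete makes the rewrite fail instead of
clobbering the newer state.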

My initial reasoning for making a collection read-only during a SolrCloud
upgrade was as follows:
1) For NRT replicas, since each replica independently processes every update,
and auto commits and merges can occur at different instants across
replicas, the segment structure cannot be guaranteed to be consistent.
Since the core upgrade API only processes older segments (and not
necessarily the entire index), upgrading only the leader and allowing it to
forward updates to the replicas doesn't guarantee that the replicas too
will end up with an upgraded index.
2) To solve this, I decided to have each NRT replica upgraded locally. This
would require blocking forwarding of updates to avoid version churn and
potential version conflicts, which means no DUP
(DistributedUpdateProcessor).
3) With no DUP you cannot allow parallel external writes, since we lose the
version check and run the risk of silently overwriting an externally added
doc.

Hence read-only.

BUT, your comment pushed me to rethink certain assumptions and I realized
that the problem described in #1 can be solved in another way. I could
upgrade the leader with the UPGRADECOREINDEX core API while retaining the
DUP, i.e. its current default behavior. This should also upgrade a
*significant* portion of each non-leader replica. Running the
UPGRADECOREINDEX API over the remaining NRT replicas thereafter would
upgrade any older segments that couldn't be covered through the leader
upgrade. There would still be some version churn due to the non-leader
replicas, but it would be quite minimal.
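
A minimal sketch of that ordering, in Python, might look like the following.
It only builds the request URLs (leader first, then the remaining NRT
replicas) and sends nothing. The replica dictionary shape, host names, core
names and the base URL prefix are assumptions for illustration; only the
admin/cores?action=UPGRADECOREINDEX&core=<core-name> form comes from this
thread, and other replica types are simply left out of the sketch.

```python
from urllib.parse import urlencode


def upgrade_requests(replicas):
    """Return UPGRADECOREINDEX URLs for one shard: leader first, then other NRT replicas.

    `replicas` is a hypothetical list of dicts with the keys
    "core", "base_url", "type" and "leader".
    """
    nrt = [r for r in replicas if r["type"] == "NRT"]
    # sorted() is stable and False sorts before True, so the leader comes first
    ordered = sorted(nrt, key=lambda r: not r["leader"])
    return [
        r["base_url"] + "/admin/cores?"
        + urlencode({"action": "UPGRADECOREINDEX", "core": r["core"]})
        for r in ordered
    ]


shard1 = [
    {"core": "c_shard1_replica_n2", "base_url": "http://host2:8983/solr",
     "type": "NRT", "leader": False},
    {"core": "c_shard1_replica_n1", "base_url": "http://host1:8983/solr",
     "type": "NRT", "leader": True},
]
for url in upgrade_requests(shard1):
    print(url)
```

The calls would be issued one after another, so that by the time a non-leader
replica is processed most of its older segments have already been rewritten
by the updates forwarded from the leader.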

Basically, the existing operational flow in the design was already solving
the problem in #1 without needing to exclude the DUP or make the
collection read-only!

So yes, the collection *can* continue to accept writes while the upgrade is
in progress. Thank you for giving this a thought and for prompting me to
reconsider the read-only assumption.

Let me also add this discussion to the JIRA so that the design discussion
is not split across two places.

- Rahul

On Mon, Apr 6, 2026 at 11:07 AM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A) <
[email protected]> wrote:

> Hey Rahul,
>
> Thanks for sharing this; it's very interesting to me. For context, we are
> going through our own upgrade process. Our latest approach is to actually
> keep the cores writeable and lean on Solr's monotonically increasing
> _version_ to find what still needs to be rewritten to the new codec. We
> have some checks in place to make sure we don't declare victory too early.
> We are also using your clever version-sequestering merge policy. Anyway, I
> was curious if you had considered reindexing at the Solr update API level
> (maybe something akin to REINDEXCOLLECTION but without the same read-only
> constraints).
>
> From my testing I get the sense that analysis + indexing are the biggest
> bottlenecks, so using a higher-level API may not always matter so much.
> The benefit would be that you may not need to set the cloud to read-only as
> long as you have a reliable high-watermark of the last doc written by the
> old codec.
>
> Luke
>
> From: [email protected] At: 04/03/26 13:21:24 UTC-4:00
> To: [email protected]
> Subject: Re: Path to Solr major version upgrades without rebuilding index
>
> I have been working on the Collections API for index upgrade in SolrCloud.
> I have updated the design details in the JIRA. Thoughts/discussion on the
> same are welcome.
>
> https://issues.apache.org/jira/browse/SOLR-18190
>
> I also went ahead and started the implementation (draft PR attached to
> the JIRA).
>
> Thanks,
> Rahul
>
>
> On Wed, Feb 25, 2026 at 1:11 AM Rahul Goswami <[email protected]>
> wrote:
>
> > Hi Jason,
> > I'll create a JIRA and share what I have in mind for the corresponding
> > Collections API. Will update this thread once I have a draft ready.
> >
> > Cheers,
> > Rahul
> >
> > On Tue, Feb 24, 2026 at 6:57 AM Jason Gerlowski <[email protected]>
> > wrote:
> >
> >> Hey Rahul,
> >>
> >> Awesome work on the core-admin API!
> >>
> >> Are your ideas around a "collection-admin" version of this API written
> >> down anywhere, that someone could start in on it based on your
> >> description?  If not, would you be willing to put a JIRA writeup
> >> together?
> >>
> >> Best,
> >>
> >> Jason
> >>
> >> On Mon, Feb 16, 2026 at 12:47 PM Rahul Goswami <[email protected]>
> >> wrote:
> >> >
> >> > Hi all,
> >> > Wanted to circle back with a late(ish) update on this effort. PR
> >> > https://github.com/apache/solr/pull/3903 got merged a couple of weeks
> >> > back and *will be* available with Solr 9.11 (and Solr 10.1). It
> >> > exposes a new CoreAdmin API as below:
> >> >
> >> > admin/cores?action=UPGRADECOREINDEX&core=<core-name>
> >> >
> >> > This serves as a convenient REST endpoint to upgrade an older Solr
> >> > core and can work in both sync and async modes.
> >> >
> >> > So where does this leave us with the state of index upgrade in Solr?
> >> >
> >> > For an index created in Solr 8.x and running Solr 9.11 or later 9.x
> >> > series:
> >> >
> >> > *SolrCloud mode*
> >> > - Configure LatestVersionMergePolicyFactory
> >> > <https://github.com/apache/solr/blob/branch_9x/solr/core/src/java/org/apache/solr/index/LatestVersionMergePolicyFactory.java>
> >> > for the collection
> >> > - Have a client program reindex the collection
> >> >
> >> > *Standalone mode*
> >> > - Call /admin/cores?action=UPGRADECOREINDEX&core=<core-name>.
> >> > It takes care of setting/resetting the merge policy under the hood and
> >> > works in an optimized way by targeting only the specific segments that
> >> > need to be rewritten. The API also maintains continuity across restarts
> >> > without having to re-process data.
> >> >
> >> > All this while the index can remain open for searches and updates by
> >> > any other application threads, aka zero downtime.
> >> > Post this, when you upgrade to Solr 10.1, the index should open fine.
> >> > For a future upgrade to Solr 11, rinse and repeat the process while on
> >> > Solr 10.x.
> >> >
> >> > Thanks to David for the hard work in reviewing the PR
> >> > https://github.com/apache/solr/pull/3903!
> >> >
> >> > I would ideally love to have a similar Collections API to make things
> >> > easier on SolrCloud and have a rough approach in mind for the same,
> >> > accounting for the different replica types etc. It can build upon the
> >> > UPGRADECOREINDEX CoreAdmin API. However, I may not have immediate
> >> > bandwidth to work on it. If someone would like to take this up, I'd be
> >> > happy to help out.
> >> >
> >> > Best,
> >> > Rahul
> >> >
> >> >
> >> > On Fri, Dec 5, 2025 at 11:47 AM Rahul Goswami <[email protected]>
> >> > wrote:
> >> >
> >> > > Hello,
> >> > > Wanted to share an exciting development with the community. Recently
> >> > > Lucene PR https://github.com/apache/lucene/pull/15431 ("Revise
> >> > > strategy to open an index") got merged. This makes Lucene look at
> >> > > individual segments for backward compatibility rather than at when
> >> > > the index was first created. Earlier this week Solr PR
> >> > > https://github.com/apache/solr/pull/3883 got merged, which provides a
> >> > > merge policy to prevent older segments from merging. This should be
> >> > > available in the next Solr 9.x release. A combination of these PRs
> >> > > means that if an index was created in Solr 8.x and later upgraded to
> >> > > 9.x, users could upgrade to Solr 10.x in the future without having to
> >> > > completely rebuild the index from an external source, as is the
> >> > > requirement today.
> >> > >
> >> > > For indexes originally created in Solr 8.x, this enhancement would
> >> > > require users to reindex data in existing collections (onto the same
> >> > > collection) once they are on the upcoming Solr 9.x release, and works
> >> > > if the fields are stored or docValues true (or a copy field
> >> > > destination, if not stored); *without any downtime.*
> >> > > Rinse and repeat when they are on Solr 10.x, to prepare for upgrade
> >> > > to Solr 11.
> >> > >
> >> > > Thanks to Michael Sokolov and David Smiley for the review inputs and
> >> > > for the help in seeing this through!
> >> > >
> >> > > An upcoming PR is in the works which provides an /admin/cores REST
> >> > > endpoint that will do the reindexing automatically and in an
> >> > > optimized way by targeting only the data in older segments:
> >> > > https://github.com/apache/solr/pull/3903
> >> > >
> >> > > Thanks,
> >> > > Rahul Goswami
> >> > >
> >>
