[ 
https://issues.apache.org/jira/browse/SOLR-16703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845135#comment-17845135
 ] 

Rahul Goswami commented on SOLR-16703:
--------------------------------------

I have done some work in this area and am happy to take this up. I am tied up 
for the next month, but will get to this by end of June/early July 2024. 

> Clearing all documents of an index should delete traces of a previous Lucene 
> version
> ------------------------------------------------------------------------------------
>
>                 Key: SOLR-16703
>                 URL: https://issues.apache.org/jira/browse/SOLR-16703
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 7.6, 8.11.2, 9.1.1
>            Reporter: Gaël Jourdan
>            Priority: Major
>
> _This is a ticket following a discussion on Slack with_ [~elyograg] _and_ 
> [~wunder] _especially._
> h1. High level scenario
> Assume you're starting from a current Solr server in version 7.x and want to 
> upgrade to 8.x then 9.x.
> Upgrading from 7.x to 8.x works fine. Indexes of 7.x can still be read with 
> Solr 8.x.
> On a regular basis, you clear* the index to start fresh, assuming this will 
> recreate the index in version 8.x.
> This runs nicely for some time. Then you want to upgrade to 9.x. When 
> starting, you get an error saying that the index is still 7.x and cannot be 
> read by 9.x.
>  
> *This is surprising because you'd expect that starting from a fresh index in 
> 8.x would have removed any trace of 7.x.*
>  
> _* : when I say "clear", I mean "delete all docs by query {{*:*}}" and 
> then commit + optionally optimize._
> h1. What I'd like to see
> Clearing an index when running Solr version N should delete any trace of 
> Lucene version N-1.
> Otherwise this forces users to delete an index (core / collection) and 
> recreate it rather than just clearing it.
> h1. Detailed scenario to reproduce
> The following steps reproduce the issue with a standalone Solr instance 
> running in Docker, but I also experienced the issue in SolrCloud mode running 
> on VMs and/or bare-metal.
>  
> Also note that for personal troubleshooting I used the tool "luceneupgrader" 
> available at [https://github.com/hakanai/luceneupgrader] but it's not 
> necessary to reproduce the issue.
>  
> 1. Create a directory for data
> {code:java}
> $ mkdir solrdata
> $ chmod -R a+rwx solrdata {code}
>  
> 2. Start a Solr 7.x server, create a core and push some docs
> {code:java}
> $ docker run -d -v "$PWD/solrdata:/opt/solr/server/solr/mycores:rw" -p 8983:8983 --name my_solr_7 solr:7.6.0 solr-precreate gettingstarted
> $ docker exec -it my_solr_7 post -c gettingstarted example/exampledocs/manufacturers.xml
> $ curl -s 'http://localhost:8983/solr/gettingstarted/select?q=*:*' | jq .response.numFound
> 11{code}
>  
> 3. Look at the index files and check version
> {code:java}
> $ ll solrdata/gettingstarted/data/index
> total 40K
> -rw-r--r--. 1 8983 8983  718 16 mars  17:37 _0.fdt
> -rw-r--r--. 1 8983 8983   84 16 mars  17:37 _0.fdx
> -rw-r--r--. 1 8983 8983  656 16 mars  17:37 _0.fnm
> -rw-r--r--. 1 8983 8983  112 16 mars  17:37 _0_Lucene50_0.doc
> -rw-r--r--. 1 8983 8983 1,1K 16 mars  17:37 _0_Lucene50_0.tim
> -rw-r--r--. 1 8983 8983  145 16 mars  17:37 _0_Lucene50_0.tip
> -rw-r--r--. 1 8983 8983  767 16 mars  17:37 _0_Lucene70_0.dvd
> -rw-r--r--. 1 8983 8983  730 16 mars  17:37 _0_Lucene70_0.dvm
> -rw-r--r--. 1 8983 8983  478 16 mars  17:37 _0.si
> -rw-r--r--. 1 8983 8983  203 16 mars  17:37 segments_2
> -rw-r--r--. 1 8983 8983    0 16 mars  17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index 
> Lucene index version: 7
> {code}
>  
> 4. Stop Solr 7, update solrconfig.xml for Solr 8 and start a Solr 8 server
> {code:java}
> $ docker stop my_solr_7
> $ vim solrdata/gettingstarted/conf/solrconfig.xml
> $ cat solrdata/gettingstarted/conf/solrconfig.xml | grep luceneMatchVersion
>   <luceneMatchVersion>8.11.2</luceneMatchVersion>
> $ docker run -d -v "$PWD/solrdata:/var/solr/data:rw" -p 8983:8983 --name my_solr_8 solr:8.11.2{code}
>  
> 5. Check index is loaded ok and docs are still there
> {code:java}
> $ curl -s 'http://localhost:8983/solr/gettingstarted/select?q=*:*' | jq .response.numFound
> 11 {code}
>  
> 6. Clear the index and check index files / version
> {code:java}
> $ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/gettingstarted/update?commit=true' -d '{ "delete": {"query":"*:*"} }'
> $ ll solrdata/gettingstarted/data/index
> total 4,0K
> -rw-r--r--. 1 8983 8983 135 16 mars  17:45 segments_5
> -rw-r--r--. 1 8983 8983   0 16 mars  17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7
> $ curl 'http://localhost:8983/solr/gettingstarted/update?optimize=true'
> $ ll solrdata/gettingstarted/data/index
> total 4,0K
> -rw-r--r--. 1 8983 8983 135 16 mars  17:45 segments_5
> -rw-r--r--. 1 8983 8983   0 16 mars  17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7 {code}
> There are no more docs in the index, but it is still considered version 7.x.
>  
> 7. Add docs in the index again
> {code:java}
> $ docker exec -it my_solr_8 post -c gettingstarted example/exampledocs/manufacturers.xml
> $ curl -s 'http://localhost:8983/solr/gettingstarted/select?q=*:*' | jq .response.numFound
> 11
> $ ll solrdata/gettingstarted/data/index
> total 48K
> -rw-r--r--. 1 8983 8983 158 16 mars  17:47 _2.fdm
> -rw-r--r--. 1 8983 8983 832 16 mars  17:47 _2.fdt
> -rw-r--r--. 1 8983 8983  64 16 mars  17:47 _2.fdx
> -rw-r--r--. 1 8983 8983 748 16 mars  17:47 _2.fnm
> -rw-r--r--. 1 8983 8983 767 16 mars  17:47 _2_Lucene80_0.dvd
> -rw-r--r--. 1 8983 8983 750 16 mars  17:47 _2_Lucene80_0.dvm
> -rw-r--r--. 1 8983 8983  80 16 mars  17:47 _2_Lucene84_0.doc
> -rw-r--r--. 1 8983 8983 883 16 mars  17:47 _2_Lucene84_0.tim
> -rw-r--r--. 1 8983 8983  75 16 mars  17:47 _2_Lucene84_0.tip
> -rw-r--r--. 1 8983 8983 395 16 mars  17:47 _2_Lucene84_0.tmd
> -rw-r--r--. 1 8983 8983 505 16 mars  17:47 _2.si
> -rw-r--r--. 1 8983 8983 220 16 mars  17:47 segments_6
> -rw-r--r--. 1 8983 8983   0 16 mars  17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7 {code}
> An emptied index to which we add new docs through Solr 8.x is still 
> considered a 7.x index.
>  
> 8. Stop Solr 8.x, update solrconfig.xml for 9.x, start Solr 9.x
> {code:java}
> $ docker stop my_solr_8
> $ vim solrdata/gettingstarted/conf/solrconfig.xml
> $ cat solrdata/gettingstarted/conf/solrconfig.xml | grep luceneMatchVersion
>   <luceneMatchVersion>9.1.1</luceneMatchVersion>
> $ # also remove xslt response writer
> $ docker run -d -v "$PWD/solrdata:/var/solr/data:rw" -p 8983:8983 --name my_solr_9 solr:9.1.1 {code}
>  
> 9. Check the Solr logs; the core cannot be started/loaded:
> {code:java}
> $ docker logs my_solr_9
> 2023-03-16 16:53:37.046 ERROR (coreContainerWorkExecutor-2-thread-1) [] 
> o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup => 
> java.util.concurrent.ExecutionException: 
> org.apache.solr.common.SolrException: Unable to create core [gettingstarted]
> ...
> Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format version 
> is not supported (resource 
> BufferedChecksumIndexInput(MMapIndexInput(path="/var/solr/data/gettingstarted/data/index/segments_6"))):
>  This index was initially created with Lucene 7.x while the current version 
> is 9.3.0 and Lucene only supports reading the current and previous major 
> versions. This version of Lucene only supports indexes created with release 
> 8.0 and later by default. {code}

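The failure in step 9 only shows up once Solr 9.x is already starting, so it 
may help to check the recorded index version of every core before upgrading. 
The following is a minimal sketch, not an official Solr tool. It assumes the 
third-party luceneupgrader jar used in the ticket, its "Lucene index version: N" 
output line as shown above, and a data layout of <data dir>/<core>/data/index. 
The function names and the LUCENEUPGRADER_JAR variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flag cores whose index was originally created by a Lucene major
# version that the upgrade target can no longer read.
# Assumptions: luceneupgrader's "info" output format as quoted in the ticket;
# the jar location and the <data dir>/<core>/data/index layout.

LUCENEUPGRADER_JAR="${LUCENEUPGRADER_JAR:-luceneupgrader-0.6.0.jar}"

# Read tool output on stdin and print the major version from a line such as
# "Lucene index version: 7". Prints nothing if no such line is found.
parse_index_version() {
  sed -n 's/^Lucene index version: *\([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Succeed if an index created by major version $1 can be read by Lucene
# major version $2 (Lucene reads the current and the previous major only).
is_readable_by() {
  local created="$1" target="$2"
  [ "$created" -ge $((target - 1)) ] && [ "$created" -le "$target" ]
}

# Check every core index under data directory $1 against target major $2.
# Returns non-zero if any index is too old or its version is unknown.
check_cores() {
  local data_dir="$1" target="$2" status=0 index created
  for index in "$data_dir"/*/data/index; do
    [ -d "$index" ] || continue
    created="$(java -jar "$LUCENEUPGRADER_JAR" info "$index" | parse_index_version)"
    if [ -z "$created" ]; then
      echo "UNKNOWN $index"
      status=1
    elif is_readable_by "$created" "$target"; then
      echo "OK      $index (created with Lucene $created)"
    else
      echo "TOO OLD $index (created with Lucene $created)"
      status=1
    fi
  done
  return "$status"
}
```

With the layout from the ticket, "check_cores solrdata 9" run before step 8 
would be expected to flag the gettingstarted core, since the tool still 
reports version 7 for it even after the index was cleared and refilled under 
Solr 8.x.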

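Until clearing an index resets the recorded version, the workaround mentioned 
in the ticket is to delete the index and recreate it. For a standalone core 
this could be sketched with the CoreAdmin API (UNLOAD with deleteIndex=true, 
then CREATE). This is a sketch under assumptions, not a tested procedure: the 
SOLR_URL default, the helper names, and an instanceDir named after the core 
with conf/ already in place are all assumptions. SolrCloud collections would 
go through the Collections API instead.

```shell
#!/usr/bin/env bash
# Sketch: drop a standalone core's index and recreate the core, so the new
# index is created from scratch by the running Solr version.
# Assumptions: SOLR_URL points at the Solr base URL; the core's instanceDir
# is named after the core and still contains its conf/ directory.

SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print the CoreAdmin URL that unloads core $1 and deletes its index files.
unload_url() {
  printf '%s/admin/cores?action=UNLOAD&core=%s&deleteIndex=true\n' "$SOLR_URL" "$1"
}

# Print the CoreAdmin URL that creates core $1 from instanceDir $1.
create_url() {
  printf '%s/admin/cores?action=CREATE&name=%s&instanceDir=%s\n' "$SOLR_URL" "$1" "$1"
}

# Unload core $1 (deleting its index), then create it again.
# All documents are lost and must be re-indexed afterwards.
recreate_core() {
  curl -fsS "$(unload_url "$1")" && curl -fsS "$(create_url "$1")"
}
```

For the core in the ticket, this would be invoked as 
"recreate_core gettingstarted" while Solr 8.x is running, followed by a full 
re-index.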

