Hi Rahul.
Are you looking for
https://lucene.apache.org/core/9_0_0/core/org/apache/lucene/index/IndexWriter.html#forceMergeDeletes()
?

On Tue, Aug 29, 2023 at 5:20 AM Rahul Goswami <rahul196...@gmail.com> wrote:

> Hello,
> I am trying to run a program that reads documents segment by segment and
> reindexes them into the same index. I am reading with the Lucene APIs and
> indexing through the Solr API (in a core that is currently loaded).
>
> What I am observing is that even after a segment has been fully processed
> and an autoCommit (as well as autoSoftCommit ) has kicked in, the segment
> with 0 live docs gets left behind. *Upon Solr restart, the segment does get
> cleared successfully.*
>
> I tried to replicate the same thing without the code by indexing 3 docs on an
> empty test core and then reindexing the same docs. The older segment gets
> deleted as soon as the softCommit interval hits or an explicit commit=true is
> issued.
>
> Here are the two approaches that I have tried. Approach 2 borrows the way
> the merge logic accesses segments, in case opening a DirectoryReader
> externally (Approach 1) is what causes this issue.
>
> But both approaches leave undeleted segments behind until I restart Solr
> and load the core again. What am I missing? I don't have any more brain
> cells left to fry on this!
>
> Approach 1:
> =========
> try (FSDirectory dir = FSDirectory.open(Paths.get(core.getIndexDir()));
>      IndexReader reader = DirectoryReader.open(dir)) {
>     for (LeafReaderContext lrc : reader.leaves()) {
>
>         // read live docs from each leaf, create a SolrInputDocument
>         // out of each Document, and index it using the Solr API
>
>     }
> } catch (Exception e) {
>
> }
>
> Approach 2:
> ==========
> ReadersAndUpdates rld = null;
> SegmentReader segmentReader = null;
> RefCounted<IndexWriter> iwRef =
>     core.getSolrCoreState().getIndexWriter(core);
> IndexWriter iw = iwRef.get();
> try {
>   for (SegmentCommitInfo sci : segmentInfos) {
>     rld = iw.getPooledInstance(sci, true);
>     segmentReader = rld.getReader(IOContext.READ);
>
>     // process all live docs similar to above using the segmentReader
>
>     rld.release(segmentReader);
>     iw.release(rld);
>   }
> } finally {
>   if (iwRef != null) {
>     iwRef.decref();
>   }
> }
>
> Help would be much appreciated!
>
> Thanks,
> Rahul
>


-- 
Sincerely yours
Mikhail Khludnev
