Hi Rahul. Are you looking for https://lucene.apache.org/core/9_0_0/core/org/apache/lucene/index/IndexWriter.html#forceMergeDeletes() ?

On Tue, Aug 29, 2023 at 5:20 AM Rahul Goswami <[email protected]> wrote:

> Hello,
> I am trying to execute a program to read documents segment-by-segment and
> reindex to the same index. I am reading using Lucene APIs and indexing
> using the Solr API (in a core that is currently loaded).
>
> What I am observing is that even after a segment has been fully processed
> and an autoCommit (as well as autoSoftCommit) has kicked in, the segment
> with 0 live docs gets left behind. *Upon Solr restart, the segment does get
> cleared successfully.*
>
> I tried to replicate the same thing without the code by indexing 3 docs on an
> empty test core, and then reindexing the same docs. The older segment gets
> deleted as soon as the softCommit interval hits or an explicit commit=true is
> called.
>
> Here are the two approaches that I have tried. Approach 2 is inspired by
> the merge logic of accessing segments in case opening a DirectoryReader
> (Approach 1) externally is causing this issue.
>
> But both approaches leave undeleted segments behind until I restart Solr
> and load the core again. What am I missing? I don't have any more brain
> cells left to fry on this!
>
> Approach 1:
> =========
> try (FSDirectory dir = FSDirectory.open(Paths.get(core.getIndexDir()));
>      IndexReader reader = DirectoryReader.open(dir)) {
>   for (LeafReaderContext lrc : reader.leaves()) {
>
>     // read live docs from each leaf, create a SolrInputDocument
>     // out of Document and index using Solr api
>
>   }
> } catch (Exception e) {
>
> }
>
> Approach 2:
> ==========
> ReadersAndUpdates rld = null;
> SegmentReader segmentReader = null;
> RefCounted<IndexWriter> iwRef =
>     core.getSolrCoreState().getIndexWriter(core);
> IndexWriter iw = iwRef.get();
> try {
>   for (SegmentCommitInfo sci : segmentInfos) {
>     rld = iw.getPooledInstance(sci, true);
>     segmentReader = rld.getReader(IOContext.READ);
>
>     // process all live docs similar to above using the segmentReader.
>
>     rld.release(segmentReader);
>     iw.release(rld);
>   }
> } finally {
>   if (iwRef != null) {
>     iwRef.decref();
>   }
> }
>
> Help would be much appreciated!
>
> Thanks,
> Rahul
>

--
Sincerely yours
Mikhail Khludnev
