Hello, I am trying to run a program that reads documents segment by segment and reindexes them into the same index. I am reading using the Lucene APIs and indexing using the Solr API (in a core that is currently loaded).
What I am observing is that even after a segment has been fully processed and an autoCommit (as well as an autoSoftCommit) has kicked in, the segment with 0 live docs gets left behind. *Upon Solr restart, the segment does get cleared successfully.*

I tried to replicate the same thing without the code by indexing 3 docs on an empty test core and then reindexing the same docs. In that case the older segment gets deleted as soon as the softCommit interval hits or an explicit commit=true is called.

Here are the two approaches that I have tried. Approach 2 is inspired by the merge logic of accessing segments, in case opening a DirectoryReader externally (Approach 1) is causing this issue. But both approaches leave undeleted segments behind until I restart Solr and load the core again. What am I missing? I don't have any more brain cells left to fry on this!

Approach 1:
=========
try (FSDirectory dir = FSDirectory.open(Paths.get(core.getIndexDir()));
     IndexReader reader = DirectoryReader.open(dir)) {
  for (LeafReaderContext lrc : reader.leaves()) {
    // read live docs from each leaf, create a SolrInputDocument
    // out of each Document and index it using the Solr API
  }
} catch (Exception e) {
}

Approach 2:
==========
ReadersAndUpdates rld = null;
SegmentReader segmentReader = null;
RefCounted<IndexWriter> iwRef = core.getSolrCoreState().getIndexWriter(core);
IndexWriter iw = iwRef.get();
try {
  for (SegmentCommitInfo sci : segmentInfos) {
    rld = iw.getPooledInstance(sci, true);
    segmentReader = rld.getReader(IOContext.READ);
    // process all live docs similar to above using the segmentReader
    rld.release(segmentReader);
    iw.release(rld);
  }
} finally {
  if (iwRef != null) {
    iwRef.decref();
  }
}

Help would be much appreciated!

Thanks,
Rahul