(1) No. The internal RAM buffer will pretty much limit the amount of heap used, however.
(2) You actually have several segments. “.cfs” stands for “Compound File”, see: https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/lucene70/package-summary.html

“An optional "virtual" file consisting of all the other index files for systems that frequently run out of file handles.”

IOW, _0.cfs is a complete segment, _1.cfs is a different, complete segment, etc. The merge policy (TieredMergePolicy) controls when compound files are used vs. the segment being kept in separate files.

New segments are created whenever the RAM buffer is flushed or whenever you do a commit (closing the IW also creates a segment IIUC). However, under control of the merge policy, segments are merged. See: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

You’re confusing closing a writer with merging segments. Essentially, every time a commit happens, the merge policy is called to determine if segments should be merged, see Mike’s blog above.

Additionally, you say “I was hoping there would be only _0.cfs file”. This’ll pretty much never happen. Segment names always increase; at best you’d have something like _ab.cfs, if not 10-15 _ab* files.

Lucene likes file handles: essentially, when searching, a file handle will be open for _every_ file in your index all the time.

All that said, counting the number of files seems like a waste of time. If you’re running on a *nix box, the usual practice (for Solr I’ll admit, but I think it applies to Lucene as well) is to set the open-file limit to 65K or so.

And if you’re truly concerned, and since you say this is an immutable index, you can do a forceMerge. Prior to Lucene 7.5, that would by default form exactly one segment. For Lucene 7.5 and later, it’ll respect the max segment size (a parameter in TMP, defaults to 5g) unless you specify a segment count of 1.
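
To make the naming point concrete, here is a minimal sketch in plain Java (no Lucene dependency) that groups index file names by the segment they belong to. The class and method names (SegmentNames, segmentOf, counterOf, filesPerSegment) are made up for illustration and are not Lucene APIs. It assumes only that segment files start with "_" followed by a base-36 counter, which is why names like _ab show up once enough segments have been written.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative helper, not part of Lucene: groups index file names by segment.
final class SegmentNames {

    // Returns the segment prefix of an index file name, e.g. "_ab" for
    // "_ab.cfs" or "_ab_Lucene70_0.dvd". Returns null for files that do not
    // belong to a single segment, such as "segments_2" or "write.lock".
    static String segmentOf(String fileName) {
        if (!fileName.startsWith("_")) {
            return null;
        }
        int end = 1;
        while (end < fileName.length() && isBase36(fileName.charAt(end))) {
            end++;
        }
        return end > 1 ? fileName.substring(0, end) : null;
    }

    private static boolean isBase36(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z');
    }

    // The part after "_" is a counter written in base 36, so "_ab" is 371.
    static long counterOf(String segment) {
        return Long.parseLong(segment.substring(1), Character.MAX_RADIX);
    }

    // Counts how many files each segment contributes to the directory listing.
    static Map<String, Integer> filesPerSegment(List<String> fileNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : fileNames) {
            String segment = segmentOf(name);
            if (segment != null) {
                counts.merge(segment, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical directory listing with two compound-file segments.
        List<String> listing = Arrays.asList(
                "_0.cfs", "_0.cfe", "_0.si",
                "_1.cfs", "_1.cfe", "_1.si",
                "segments_2", "write.lock");
        System.out.println(filesPerSegment(listing)); // prints {_0=3, _1=3}
        System.out.println(counterOf("_ab"));         // prints 371
    }
}
```

This only shows how files map to segments; as noted above, the file count itself is not a number worth tuning for.
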

Best,
Erick

> On Nov 11, 2019, at 5:47 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 11/11/2019 1:40 PM, siddharth teotia wrote:
>> I have a few questions about Lucene indexing and file handling. It would be
>> great if someone can help with these. I had earlier asked these questions
>> on gene...@lucene.apache.org but was asked to seek help here.
>
> This mailing list (solr-user) is for Solr. Questions about Lucene do not
> belong on this list.
>
> You should ask on the java-user mailing list, which is for questions related
> to the core (Java) version of Lucene.
>
> http://lucene.apache.org/core/discussion.html#java-user-list-java-userluceneapacheorg
>
> I have put the original sender address in the BCC field just in case you are
> not subscribed here.
>
> Thanks,
> Shawn