Vadim, I suppose https://vimeo.com/32065505 is a good old explanation of all the Lucene API dimensions. It covers most of your questions. FWIW, a leaf is a segment, and postings are a list of occurrences of a term. Regarding attributes in postings, IIRC they are only used in some suggester, but now I can't even find that usage.
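To make "a leaf is a segment" and "postings are a list of occurrences" more concrete, here is a rough toy model in plain Java. It is not Lucene code: ToyIndex, Segment, Posting and globalDocIds are names I made up for illustration. The only things borrowed from Lucene are the ideas: each segment numbers its documents from 0, a term maps to a list of (doc, positions) entries, and a per-segment doc base (compare LeafReaderContext.docBase) shifts local doc ids into index-wide ones.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only, NOT Lucene: illustrates segments ("leaves") and postings. */
public class ToyIndex {

    /** One posting: a document containing the term, plus where the term occurs. */
    public static final class Posting {
        public final int docId;                                   // segment-local doc number
        public final List<Integer> positions = new ArrayList<>(); // token positions in the doc

        Posting(int docId) { this.docId = docId; }

        public int freq() { return positions.size(); }            // occurrences in this doc
    }

    /** One leaf: a self-contained mini index with its own doc numbering. */
    public static final class Segment {
        private final Map<String, List<Posting>> postingsByTerm = new TreeMap<>();
        private int maxDoc = 0;

        /** Indexes whitespace-separated tokens and returns the local doc id. */
        public int addDocument(String text) {
            int docId = maxDoc++;
            String[] tokens = text.toLowerCase().split("\\s+");
            for (int pos = 0; pos < tokens.length; pos++) {
                if (tokens[pos].isEmpty()) continue;
                List<Posting> list =
                        postingsByTerm.computeIfAbsent(tokens[pos], t -> new ArrayList<>());
                Posting last = list.isEmpty() ? null : list.get(list.size() - 1);
                if (last == null || last.docId != docId) {
                    last = new Posting(docId);                    // first hit in this doc
                    list.add(last);
                }
                last.positions.add(pos);
            }
            return docId;
        }

        /** The postings list for a term, ordered by local doc id. */
        public List<Posting> postings(String term) {
            return postingsByTerm.getOrDefault(term, Collections.emptyList());
        }

        public int maxDoc() { return maxDoc; }
    }

    /** Walks every leaf and converts local doc ids to index-wide ones via a doc base. */
    public static List<Integer> globalDocIds(List<Segment> leaves, String term) {
        List<Integer> result = new ArrayList<>();
        int docBase = 0;
        for (Segment leaf : leaves) {
            for (Posting p : leaf.postings(term)) {
                result.add(docBase + p.docId);
            }
            docBase += leaf.maxDoc();                             // next leaf starts after this one
        }
        return result;
    }

    public static void main(String[] args) {
        Segment first = new Segment();
        first.addDocument("lucene search");
        first.addDocument("search search engine");
        Segment second = new Segment();
        second.addDocument("engine");
        second.addDocument("lucene search");

        List<Segment> leaves = new ArrayList<>();
        leaves.add(first);
        leaves.add(second);
        System.out.println(globalDocIds(leaves, "search"));
    }
}
```

In this picture a "leaf reader" would be whatever reads one Segment, and a query that works per leaf only ever sees that segment's local doc ids.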
On Thu, Dec 14, 2017 at 12:15 PM, Vadim Gindin <vgin...@detectum.com> wrote:

> Hi All
>
> I have a question about API. Particularly, about used terminology.
>
> 1. LeafReader. Why it starts with "Leaf"? Can I understand that, that such
> reader is intended for reading only one leaf of index tree? Does it mean
> that it is working inside a context (LeafReaderContext) of several
> documents "physically" located in that leaf?
>
> 2. Our LeafReader is positioned in some document, and reader.terms(field)
> will return terms list for the single field from the index. Right?
>
> 3. LeafReader is the successor of IndexReader, which has getTermVectors(int
> docID)
> Can I use it in my custom Query (to be aware of all documents fields)
> instead of terms(field)
>
> 4. I.e. LeafReader contains statistical methods, methods returning the
> document values, and the methods returning terms and postings. terms() and
> postings() are intended for search.
>
> 3. What is Postings/PostingEnum? Why is it named starting with "Posting"?
> My native language is Russian and I'm a bit confused trying to find a
> corresponding meaning of this word in a search context.
>
> 5. Ok, I see PostingEnum implements some basic interface DocIdSetIterator,
> but PostingEnum is one of approximately 20 implementations of that
> interface. Why is it used in LeafReader? What the principal difference
> between these 20 implementations and which of them can be really useful?
>
> Regards,
> Vadim Gindin
>

--
Sincerely yours
Mikhail Khludnev