On Fri, Oct 29, 2010 at 21:50, Robert Muir <rcm...@gmail.com> wrote:
> I was suggesting that mathematically, the empty term makes no sense in
> an inverted index, and we shouldn't allow it.
> Its one solution.

Mathematically, an inverted index is keyed by strings. Any strings. The empty term is just the case of a string of length 0. So, for consistency, Lucene should support it. TermsEnum.seek("") should position you at the very beginning of the terms list, etc.

If you drop the support, you have to check for zero length absolutely everywhere in the API where you accept terms. Or thoroughly document the unpredictable, erratic behaviour :)

A possible use case for empty terms in an analyzer stream is slipping in various metadata: paragraph/sentence delimiters, whatever. Nothing stops you from using "##PAR#BEGIN##" kinds of things, but you may want to leave the term text alone and exploit other attributes.

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785
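
To illustrate the ordering argument: the sketch below is not Lucene code. It uses a plain java.util.TreeMap as a toy stand-in for a sorted term dictionary, and the class EmptyTermDemo and its seek() helper are hypothetical names made up for this example. It only shows that the empty string sorts before every other string, so a ceiling-style seek on "" lands on the first entry whether or not the empty term is actually present.

```java
import java.util.TreeMap;

public class EmptyTermDemo {
    // Toy stand-in for a term dictionary: term text -> document frequency.
    static TreeMap<String, Integer> buildTerms() {
        TreeMap<String, Integer> terms = new TreeMap<>();
        terms.put("lucene", 3);
        terms.put("apache", 5);
        terms.put("", 1);      // the empty term is just another key
        terms.put("index", 2);
        return terms;
    }

    // Hypothetical seek: smallest term >= target, or null if there is none.
    static String seek(TreeMap<String, Integer> terms, String target) {
        return terms.ceilingKey(target);
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> terms = buildTerms();

        // "" compares lower than any non-empty string, so it is the first key.
        System.out.println("first term is empty: " + terms.firstKey().isEmpty());
        System.out.println("seek(\"\") lands on first term: "
                + seek(terms, "").equals(terms.firstKey()));

        // Without the empty term, seeking "" still positions at the beginning.
        terms.remove("");
        System.out.println("after removal, seek(\"\") -> " + seek(terms, ""));
    }
}
```

Nothing in the sorted map needs a special case for length 0, which is the consistency point made above; the zero-length checks only appear once the empty key is forbidden.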