: why not just discard them completely in say, indexer/queryparser ?

In QueryParser: maybe, that's a high level API with assumptions about "human" interaction and text.

In the IndexWriter: it seems like a bad idea. Low level Lucene really shouldn't be making any assumptions about *how* the client code is using the library -- you and I may not have any good reasons for wanting an empty term, but we shouldn't put that as a hardcoded assumption in the low level code.

It's essentially the converse issue of IndexWriter.maxFieldLength -- which was deliberately changed to default to Integer.MAX_VALUE precisely because of this "don't assume we know how people are using the library" issue -- but we could certainly make it configurable in the same way.

(I see now that IndexWriter.maxFieldLength got deprecated in favor of IndexWriterConfig.maxFieldLength ... I thought I remembered that it had been deprecated in favor of a TokenFilter that did the limiting, hence my suggestion that we use the same pattern for "min term length" -- it could easily be an IndexWriterConfig option as well, but the TokenFilter approach seems more useful since it can be applied per field.)

-Hoss