Chuck Williams wrote:
Lucene today allows many field properties to vary at the Field level. E.g., the same field name might be tokenized in one Field on a Document while it is untokenized in another Field on the same or different Document.

The rationale for this design was to keep the API simple. I think of it like variable declarations: some languages require them and some don't. I opted to make Lucene fields like dynamically-typed variables. In part, Lucene's popularity is due to the simplicity of its API.

However, in my uses of Lucene, most documents have the same fields used in the same way, so I don't think I've ever actually taken much advantage of this functionality. It is nice to be able to add a field to an index by changing the indexing code in a single place, where the field's value is created, and not having to also change the index initialization code. We should try to keep such redundancies out of user code.

Thus I would encourage any change in this direction to continue to permit fields to be defined lazily, the first time they are added, rather than requiring all fields to be declared up front. Are there substantial optimizations that are only possible if all fields are known when the index is initialized?
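
As a rough sketch of what lazy definition could look like if field properties were fixed per field name: the first add of a field defines it, and later adds must agree. The names FieldProps and LazyFieldRegistry are hypothetical and are not part of Lucene's API. The choice to reject a conflicting later use is also an assumption made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-field properties; not a Lucene class. */
final class FieldProps {
    final boolean tokenized;
    final boolean stored;

    FieldProps(boolean tokenized, boolean stored) {
        this.tokenized = tokenized;
        this.stored = stored;
    }

    boolean sameAs(FieldProps other) {
        return tokenized == other.tokenized && stored == other.stored;
    }
}

/** Hypothetical registry that learns field definitions lazily, on first use. */
final class LazyFieldRegistry {
    private final Map<String, FieldProps> defined = new LinkedHashMap<>();

    /** Defines the field the first time it is added; later adds must match. */
    void add(String name, FieldProps props) {
        FieldProps first = defined.putIfAbsent(name, props);
        if (first != null && !first.sameAs(props)) {
            throw new IllegalArgumentException(
                "field '" + name + "' was first added with different properties");
        }
    }

    boolean isDefined(String name) {
        return defined.containsKey(name);
    }

    int size() {
        return defined.size();
    }
}

final class LazyFieldDemo {
    public static void main(String[] args) {
        LazyFieldRegistry registry = new LazyFieldRegistry();

        // No up-front declarations: each field is defined where its value is created.
        registry.add("title", new FieldProps(true, true));
        registry.add("id", new FieldProps(false, true));

        // Re-adding a field with the same properties is fine.
        registry.add("title", new FieldProps(true, true));
        System.out.println("fields defined: " + registry.size());

        // A conflicting later use is rejected in this sketch.
        try {
            registry.add("id", new FieldProps(true, true));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape, adding a new field to an index still means touching only the code that creates the field's value; there is no separate initialization step to keep in sync.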

Doug
