Hi,

During one of the discussions at ApacheCon it occurred to me that it would be useful to have an option to discard positional information but still keep the term frequency. Position-dependent queries wouldn't work then, but all other queries would still work fine and we would get the right scoring.

I believe it should be possible to do this without changing the file format if we used a negative term frequency for terms without postings. We would have to check for that condition in SegmentTermDocs, change the flags there, and flip the sign of docFreq. Eventually we may want to add a separate flag for this and bump the format version.
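
To make the sign trick a bit more concrete, here is a rough sketch of what I have in mind. It is not actual Lucene code: the class SignFlagSketch and its methods are made up for illustration, and it works on a plain int rather than on the real on-disk encoding. It only shows how the sign of a stored count could double as a "positions omitted" flag, assuming the count is always at least 1 so that zero never needs a sign.

```java
/**
 * Hypothetical illustration only: uses the sign of a stored count as a
 * "positions omitted" flag. Not part of Lucene and not its file format.
 */
public final class SignFlagSketch {
    private SignFlagSketch() {}

    /** Returns the value to store: negative when positions were discarded. */
    public static int encode(int count, boolean positionsOmitted) {
        if (count <= 0) {
            // The trick relies on the count being non-zero, since 0 has no sign.
            throw new IllegalArgumentException("count must be positive");
        }
        return positionsOmitted ? -count : count;
    }

    /** True when the stored value carries the "positions omitted" flag. */
    public static boolean positionsOmitted(int stored) {
        return stored < 0;
    }

    /** Recovers the original count by flipping the sign back if needed. */
    public static int count(int stored) {
        return stored < 0 ? -stored : stored;
    }

    public static void main(String[] args) {
        int stored = encode(7, true);
        System.out.println(positionsOmitted(stored) + " " + count(stored));
    }
}
```

A reader would call positionsOmitted() first to decide whether position data can be expected at all, and then count() to get the real value for scoring. A dedicated flag plus a format version bump would make this explicit instead of overloading the sign.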

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

