Hi,
During one of the discussions at ApacheCon it occurred to me that it
would be useful to have an option to discard positional information but
still keep the term frequency. Position-dependent queries wouldn't work
then, but all other queries would still work fine and we would still get
the right scoring.
I believe it should be possible to do this without changing the file
format, if we used a negative term frequency for terms without postings.
We would have to check for that condition in SegmentTermDocs, change
the flags there and flip the sign of docFreq. Eventually we may want to
add a separate flag for this and bump the format version.
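
To make the sign trick concrete, here is a rough sketch. It is not
Lucene code: SignFlaggedCount, encode and decode are hypothetical names,
and it assumes the count that carries the flag is always at least 1 for
an existing term, so that zero never has to be distinguished. The writer
negates the count when positions are omitted; the reader checks the
sign, records the flag and flips the value back before using it.

```java
// Hypothetical sketch; SignFlaggedCount and its methods are not Lucene APIs.
final class SignFlaggedCount {

    /** Result of decoding a stored count. */
    static final class Decoded {
        final int count;            // always positive after decoding
        final boolean hasPositions; // false if the stored value was negative

        Decoded(int count, boolean hasPositions) {
            this.count = count;
            this.hasPositions = hasPositions;
        }
    }

    /** Writer side: negate the count to mark "no positions stored". */
    static int encode(int count, boolean hasPositions) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive: " + count);
        }
        return hasPositions ? count : -count;
    }

    /** Reader side: detect the sign, set the flag, restore the real count. */
    static Decoded decode(int stored) {
        // Zero cannot carry a sign, and MIN_VALUE cannot be negated safely.
        if (stored == 0 || stored == Integer.MIN_VALUE) {
            throw new IllegalArgumentException("invalid stored count: " + stored);
        }
        return stored < 0 ? new Decoded(-stored, false) : new Decoded(stored, true);
    }
}
```

Existing indexes only contain positive counts, so under this scheme they
would decode as "has positions" with no change in behaviour.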
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com