On Fri, Feb 25, 2011 at 8:12 AM, Marvin Humphrey <[email protected]> wrote:
> At the end of a search, you will only have documents and scores --
> not sophisticated metadata about what part of the subquery matched and what
> parts didn't and how much each matching part contributed to the score.
> Keeping track of such metadata during the matching phase would be
> prohibitively expensive.

It's only prohibitive if you don't need that data. If you actually need it (as Andrew seems to), and are going to compute it in post-processing anyway, it's just the cost of doing business.

My kick has been about making it easy to swap in non-TF/IDF scorers. I think part of doing so will be adding more room for scratch data in the Hits returned. My canonical example is that I want it to be possible to do alphabetical sorting of Hits by a category field. At some point you need a collector that can see field values, which if you squint right is just a special case of what Andrew wants.

While I can see the argument that this is traditionally not the way that TF/IDF systems work, it's this potential for search/database hybridization that makes Lucy so attractive to me.

Nathan Kurz
[email protected]
