On Fri, Feb 25, 2011 at 8:12 AM, Marvin Humphrey <[email protected]> wrote:
> At the end of a search, you will only have documents and scores --
> not sophisticated metadata about what part of the subquery matched and what
> parts didn't and how much each matching part contributed to the score.
> Keeping track of such metadata during the matching phase would be
> prohibitively expensive.

It's only prohibitive if you don't need that data.  If you actually need
it (as Andrew seems to), and are going to compute it in post-processing
anyway, it's just the cost of doing business.

My kick has been about making it easy to swap in non-TF/IDF scorers.
I think part of doing so will be adding greater room for scratch data
to the Hits returned.  My canonical example is that I want it to be
possible to do alphabetical sorting of Hits by a category field.  At
some point you need a collector that can see field values, which if
you squint right is just a special case of what Andrew wants.
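
To make the idea concrete, here is a minimal sketch in Python.  It is
not Lucy's API: the Hit class, its "scratch" dict, FieldSortCollector,
and the in-memory field store are all hypothetical names invented for
illustration.  It only shows a collector that looks up a field value
at collection time and stashes it on each hit so that hits can later be
sorted alphabetically by category.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    doc_id: int
    score: float
    # Hypothetical per-hit scratch space for anything a collector wants to keep.
    scratch: dict = field(default_factory=dict)


class FieldSortCollector:
    """Hypothetical collector that can see field values while collecting."""

    def __init__(self, field_values, field_name):
        # field_values maps doc_id -> {field name: value}; it stands in for
        # whatever field storage a real index would provide.
        self.field_values = field_values
        self.field_name = field_name
        self.hits = []

    def collect(self, doc_id, score):
        # Record the field value alongside the score at match time.
        value = self.field_values[doc_id].get(self.field_name, "")
        self.hits.append(Hit(doc_id, score, {self.field_name: value}))

    def sorted_hits(self):
        # Case-insensitive alphabetical order; ties broken by descending score.
        return sorted(
            self.hits,
            key=lambda h: (h.scratch[self.field_name].lower(), -h.score),
        )


docs = {1: {"category": "tools"}, 2: {"category": "Books"}, 3: {"category": "music"}}
collector = FieldSortCollector(docs, "category")
for doc_id, score in [(1, 0.9), (2, 0.4), (3, 0.7)]:
    collector.collect(doc_id, score)

ordered = [h.doc_id for h in collector.sorted_hits()]
```

The same scratch slot could just as well hold per-clause match details
instead of a field value, which is the sense in which the two cases
look alike.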

While I can see the argument that this is traditionally not the way
that TF/IDF systems work, it's this potential for search/database
hybridization that makes Lucy so attractive to me.

Nathan Kurz
[email protected]
