On Mon, Oct 03, 2011 at 04:06:08PM +0200, goran kent wrote:
> Hi,
>
> I've scrounged around a bit, and I take it
> http://mail-archives.apache.org/mod_mbox/incubator-lucy-dev/201109.mbox/%[email protected]%3E
> is the only way to identify which field triggered a $hit, right?

I don't know of a better way.

> ie (roughly), flag all fields as highlightable, then if the
> Lucy::Highlight::Highlighter actually highlights something in a field,
> then that's your indication that something was found in it?

Yes.

> If so, feature request: my @field_hit = $hits->relevant_field() would
> be really nice ;)

I don't think this feature is mature enough to be given a prominent core API
just yet. We've struck upon the general approach of post-processing the hit
using the single-document mini-inverted-indexes needed for highlighting, but
the current implementation is arguably an abuse of Highlighter. For now, I
think it's OK that we support this feature with cookbook code or via
convenience methods in libraries which wrap Lucy.

But it's become a popular feature request, and so it's good to think about
what a Lucy API might look like in the future. Peter has provided one vision,
in SWISH::Prog::Lucy::Results.

I confess that I don't quite understand what you've shown us above. Can you
provide some context illustrating how it would be used?

> My minor problem is: I have inbound link text pointing to a page
> which is indexed along with the page content itself. Since it's never
> displayed, you might have a hit on this 'hidden' text (but highly
> relevant in my case) and no other hits, so the excerpt is void of any
> highlighting (I can just hear the wailing and gnashing of teeth from
> my future users). It would be nice to be able to flag this kind of
> search result as "Found your term in inbound text" or whatever).

OK, sure. You can abuse Highlighter to achieve your ends. :)

I assume that you have stripped all HTML tags from your data. (They would
likely mess up scoring if left in.) Thus, checking whether a highlighted
excerpt contains a "<strong>" tag suffices to indicate that a field indeed
matched.
If the primary content field produces an excerpt that does *not* contain
"<strong>", but the "inbound_text" excerpt *does* contain "<strong>", then
you know to flag that particular result.

Cheers,

Marvin Humphrey
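
A minimal sketch of the "<strong>" check described above, in Perl. It is
cookbook-style illustration, not a Lucy API. Assumptions: the helper names
(excerpts_for_hit, field_matched, inbound_only_match) and the field name
"content" are hypothetical; "inbound_text" is the field name used in this
message; both fields are indexed as highlightable; and each Highlighter was
built with Lucy::Highlight::Highlighter->new( searcher => ..., query => ...,
field => ... ) and still uses its default "<strong>" opening tag.

```perl
use strict;
use warnings;

# Build one excerpt per field for a single hit.
# $highlighters is a hashref of field name => Lucy::Highlight::Highlighter
# object, created once per query (assumption: one Highlighter per field).
sub excerpts_for_hit {
    my ( $highlighters, $hit ) = @_;
    my %excerpt;
    for my $field ( keys %$highlighters ) {
        $excerpt{$field} = $highlighters->{$field}->create_excerpt($hit);
    }
    return \%excerpt;
}

# True if the excerpt contains the opening highlight tag.  This only works
# if HTML was stripped from the indexed data and the default tag is in use.
sub field_matched {
    my ($excerpt) = @_;
    return ( defined($excerpt) && index( $excerpt, '<strong>' ) >= 0 ) ? 1 : 0;
}

# True when the hidden inbound-link text matched but the visible content
# did not, i.e. the result should be flagged for the user.
sub inbound_only_match {
    my ($excerpt) = @_;
    return ( !field_matched( $excerpt->{content} )
          && field_matched( $excerpt->{inbound_text} ) ) ? 1 : 0;
}
```

In a results loop, one would call excerpts_for_hit() for each $hit and print
something like "Found your term in inbound text" whenever
inbound_only_match() returns true.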
