So let's say I have a query like:
(dog OR cat OR bird) animal
And the text I'm indexing looks like this:
A dog is an animal
Cats and Dogs are animals
A tree is not an animal
Of course (with stemming) the following two entries should be matched:
A dog is an animal
Cats and Dogs are animals
How do I find out which query terms/phrases were found, so that I know doc_id 1
matched (dog, animal) and doc_id 2 matched (cat, dog, animal)?
I'm looking for something similar in functionality to what Whoosh[1] and
Xapian[2] offer in this regard.
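To make the desired output concrete, here is a rough Python sketch of the
result I'm after. It is not search-library code: stem() is a toy stand-in for
real stemming, find_matches() is a hypothetical name, and I'm assuming the
query means (dog OR cat OR bird) AND animal.

```python
import re

def stem(word):
    # Toy stemmer (an assumption for illustration): lowercase the word and
    # strip one trailing "s" from words longer than three letters.
    word = word.lower()
    return word[:-1] if len(word) > 3 and word.endswith("s") else word

def find_matches(docs, any_of, required):
    # Hypothetical helper: returns {doc_id: [query terms found in that doc]}
    # for docs containing at least one `any_of` term and every `required` term.
    results = {}
    for doc_id, text in docs.items():
        tokens = {stem(tok) for tok in re.findall(r"\w+", text)}
        hits_any = [t for t in any_of if stem(t) in tokens]
        hits_req = [t for t in required if stem(t) in tokens]
        if hits_any and len(hits_req) == len(required):
            results[doc_id] = hits_any + hits_req
    return results

docs = {
    1: "A dog is an animal",
    2: "Cats and Dogs are animals",
    3: "A tree is not an animal",
}
results = find_matches(docs, any_of=["dog", "cat", "bird"], required=["animal"])
for doc_id, terms in results.items():
    print(doc_id, terms)
```

The part I can't work out is how to get that per-document list of terms back
from the real index rather than by re-scanning the text myself.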
I tried looking at the highlighter source, thinking that it has to implement
similar logic, but my knowledge of C is next to nil and I didn't see
anything like that in the Perl bindings that I could use.
[1]: http://pythonhosted.org/Whoosh/searching.html#which-terms-from-my-query-matched
[2]: http://xapian.org/docs/apidoc/html/classXapian_1_1Enquire.html#dda4181ccd15beb52c39f5e24adbb25b
Regards,
--
Philip Southam
Chief Architect / Яeverse Эngineer
http://zefr.com