Hi Glenn-Erik,

Have you looked at the admin analysis tool (I think the link is /solr/admin/analysis.jsp, but I don't have it up and running at the moment to verify)? In this tool, you can see what is produced on the index side and the query side, see how the tokens are created, etc. From there, in my experience, it usually becomes obvious why things aren't matching.

Also, this question is best asked on solr-user, which is the mailing list for questions on how to use Solr. You are much more likely to reach a wider audience there, which more than likely means more insight.

Cheers,
Grant

On Aug 27, 2008, at 5:38 AM, Glenn-Erik Sandbakken wrote:

At sesam.no we want to replace a FAST (fast.no) Query Matching Server
with a Solr index.

The index we are trying to replace is not a regular index, but is specially
configured to perform phrase (and sub-phrase) matches against several
large lists (like an index with only a 'title' field).

I'm not sure of a correct, or logical, name for the behavior we are
after, but it is like a combination between Shingles and exact matching.

Some examples should explain it well.

Let's say we have the following list:
one two three
one two
two three
one
two
three
three two
two one
one three
three one


For the query "one two three", we need hits against, and only against:
one two three
one two
two three
one
two
three

For the query "one two", we need hits against, and only against:
one two
one
two

For the query "one three four" (or "four one three"), we need hits
against, and only against:
one three
one
three

For the query "one two sesam three", we need hits against, and only
against:
one two
one
two
three
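
The expected hits above are consistent with one simple rule: a list entry
matches when it is exactly equal to some contiguous run of words in the
query. A minimal Python sketch of that rule follows, purely to pin down the
semantics. It is not Solr code and says nothing about how the ShingleFilter
works internally; the names ENTRIES, sub_phrases and matching_entries are
made up for this illustration.

```python
# The example list from above, in the same order.
ENTRIES = [
    "one two three", "one two", "two three",
    "one", "two", "three",
    "three two", "two one", "one three", "three one",
]


def sub_phrases(query):
    """Return every contiguous word sequence (shingle) of the query."""
    words = query.split()
    return {
        " ".join(words[start:end])
        for start in range(len(words))
        for end in range(start + 1, len(words) + 1)
    }


def matching_entries(query, entries=ENTRIES):
    """Return the entries that exactly equal some sub-phrase of the query."""
    wanted = sub_phrases(query)
    return [entry for entry in entries if entry in wanted]


if __name__ == "__main__":
    for q in ("one two three", "one two", "one three four", "one two sesam three"):
        print(q, "->", matching_entries(q))
```

Under this reading, the index side keeps each list entry as a single
untokenized term, and only the query side is expanded into word n-grams of
every length from one word up to the full query.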


We have been testing out Solr with the ShingleFilter for this, but
without luck.
I am unsure whether the reason is a misconfiguration in schema.xml or that
the ShingleFilter actually doesn't support this type of behavior. I've
attached our current schema.xml.

I'd like to know if the ShingleFilter is at all able to do what we
want.
If it is: How should I configure schema.xml?
If not: are there any other solutions that we can incorporate
into Solr which will give us this behavior?

If there is no existing solution to this, we will probably end up
writing our own methods for it, extending the ShingleFilter and gladly
contributing to the Solr project =)

Thanks for a great product,
Glenn-Erik

<schema.xml>







