Re: Synonym filter with support for phrases?

Dawid Weiss Wed, 22 Apr 2009 06:02:57 -0700

> Your synonyms will break if you try searching for phrases.

Good point. I did write that filter, but I never actually got around to searching for exact phrases with it (there was a very specific scenario and we used prefix queries, which worked quite well).

> Building on your example, "food place in new york" will find nothing,
> because 'place' and 'in' share the same position.

You're right, but is it such a big problem in real life? What you're describing is searching for a phrase that spans both the synonym and the actual token sequence. What I had in mind was searching for phrases that are either just synonyms, or synonyms and text with an identical position layout (which is the case with single-word synonyms). I dare say this covers the majority of cases, although I have nothing to support this claim.
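
To make the position clash concrete, here is a minimal sketch in plain Python (not Lucene code). It assumes a hypothetical rule that expands "restaurant" to the phrase "food place" and stacks the phrase tokens starting at the original token's position; the rule, the function names, and the sample text are all illustrative assumptions, not the actual filter.

```python
from collections import defaultdict

# Hypothetical synonym rule: one token expands to a multi-word phrase.
SYNONYMS = {"restaurant": ["food", "place"]}


def index_with_synonyms(text):
    """Build {term: set(positions)} for one document.

    Synonym phrase tokens are stacked starting at the position of the
    original token, so the second synonym word lands on the position of
    the next real token.
    """
    postings = defaultdict(set)
    for pos, token in enumerate(text.lower().split()):
        postings[token].add(pos)
        for offset, syn_token in enumerate(SYNONYMS.get(token, [])):
            postings[syn_token].add(pos + offset)
    return postings


def phrase_match(postings, phrase):
    """Exact phrase match: every term must sit at consecutive positions."""
    terms = phrase.lower().split()
    return any(
        all(start + i in postings.get(term, ()) for i, term in enumerate(terms))
        for start in postings.get(terms[0], ())
    )


doc = index_with_synonyms("restaurant in new york")
# 'place' and 'in' both end up at position 1.
print(sorted((term, sorted(pos)) for term, pos in doc.items()))
print(phrase_match(doc, "food place"))              # synonym only
print(phrase_match(doc, "restaurant in new york"))  # original text only
print(phrase_match(doc, "food place in new york"))  # spans both
```

In this toy model the first two phrases match and the third does not, which is the case discussed above: a phrase that spans the injected synonym and the surrounding text needs 'in' one position later than where it was indexed.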

> While building the index, I inject synonym group ids instead of actual
> words, then I detect synonyms in queries and replace them with group
> ids too. Hard part comes after that, you have to adjust
> positionIncrements on syngroup id tokens, with respect to the longest

> [snip]

Yep, hairy ;)
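
For readers following along, the group id idea can be sketched roughly as below. This is a simplified illustration in plain Python: the group table and the `to_group_ids` helper are hypothetical, and it simply collapses each matched phrase into one token. It does not attempt the positionIncrement adjustment mentioned in the quoted text.

```python
# Hypothetical synonym groups: group id -> phrases, each a tuple of tokens.
GROUPS = {
    "SYN_1": [("restaurant",), ("food", "place")],
    "SYN_2": [("new", "york"), ("nyc",)],
}
PHRASE_TO_GROUP = {
    phrase: gid for gid, phrases in GROUPS.items() for phrase in phrases
}
MAX_LEN = max(len(phrase) for phrase in PHRASE_TO_GROUP)


def to_group_ids(text):
    """Replace every known synonym phrase with its group id.

    Uses greedy longest-match-first scanning; tokens that belong to no
    group are passed through unchanged.
    """
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            gid = PHRASE_TO_GROUP.get(tuple(tokens[i:i + n]))
            if gid is not None:
                out.append(gid)
                i += n
                break
        else:
            # No phrase starts here: keep the token as-is.
            out.append(tokens[i])
            i += 1
    return out


# The same function is applied to documents at index time and to queries.
indexed = to_group_ids("restaurant in nyc")
query = to_group_ids("food place in new york")
print(indexed, query, indexed == query)
```

Because both sides are normalized to the same id sequence, the phrase lines up in this toy version. The price is that the original words are no longer distinguishable, which is presumably why the real approach has to juggle position increments instead.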

> More correct approach is to index as-is and expand queries with actual
> synonym phrases instead of ids, but then queries become really
> humongous if you have any decent synonym dictionary (I have 20+ phrase
> groups).

Query expansion is not an option for me, unfortunately -- too many synonyms. It would be much better to do it once at indexing time and rely on this information afterwards.
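
A rough illustration of why expansion blows up, again in plain Python with made-up synonym groups (the `expand` helper and the group contents are assumptions for the sake of the example): every synonym slot in the query multiplies the number of phrase variants.

```python
from itertools import product

# Hypothetical synonym groups; each inner list holds interchangeable phrases.
PLACE = ["restaurant", "food place", "diner"]
CITY = ["new york", "nyc"]


def expand(segments):
    """Expand a query given as a list of alternative lists.

    A literal word is a one-element list; a synonym slot lists every
    phrase of its group. The result is one phrase per combination.
    """
    return [" ".join(combo) for combo in product(*segments)]


queries = expand([PLACE, ["in"], CITY])
print(len(queries))  # 3 * 1 * 2 variants
for q in queries:
    print(q)
```

The variant count is the product of the group sizes, so a query with several synonym slots drawn from large groups quickly turns into hundreds of phrase clauses.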


Thanks for sharing your thoughts, Кирилл.

Dawid

