Re: Question regarding synonym

Christian Zambrano Mon, 05 Oct 2009 10:12:22 -0700

You are correct.

I would recommend using the Synonym TokenFilter only at index time unless you have a very good reason to do it at query time.


On 10/05/2009 11:46 AM, darniz wrote:

Yes, that's what we decided: to expand these terms while indexing.
If we have
bayrische motoren werke =>  bmw

and I have a document which has bmw in it, searching for text:bayrische does
not give me results. I have to give
text:"bayrische motoren werke"; then it actually takes the synonym and gets
me the document.

Now if I change the synonym mapping to
bayrische motoren werke , bmw
with the expand parameter set to true, and also use this file at indexing,

then at the time I index this document, along with "bmw" I also index the
following words: "bayrische", "motoren", "werke".

Any text query like text:motoren or text:bayrische will give me results now.

Please correct me if my assumption is wrong.

Thanks
darniz
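
To illustrate the index-time expansion described above, here is a minimal Python sketch. It is a toy model, not Solr code: the function name expand_at_index_time and the sample document are made up for illustration, and token positions are ignored, so it only shows which terms would end up in the index.

```python
# Toy model of index-time synonym expansion with expand=true.
# This is NOT Solr code; names and the sample document are hypothetical.

def expand_at_index_time(tokens, synonym_group):
    """Return the set of terms indexed for a document, given one
    comma-separated synonym group treated as fully expanded."""
    # Each entry in the group is a list of words, e.g. ["bmw"].
    entries = [entry.split() for entry in synonym_group.split(",")]
    indexed = set(tokens)
    for entry in entries:
        n = len(entry)
        # Look for the entry as a run of consecutive tokens.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == entry:
                # Match: add every word of every entry in the group.
                for other in entries:
                    indexed.update(other)
    return indexed


doc_tokens = "the new bmw sedan".split()
terms = expand_at_index_time(doc_tokens, "bayrische motoren werke , bmw")

# Single-word lookups such as text:motoren can now find the document.
print("motoren" in terms, "bayrische" in terms)
```

Under these assumptions a document containing only "bmw" is also indexed under "bayrische", "motoren" and "werke", which is why the single-word queries start matching.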









Christian Zambrano wrote:



On 10/02/2009 06:02 PM, darniz wrote:

Thanks
As i said it even works by giving double quotes too.
like carDescription:"austin martin"

So is the conclusion that in order to map a two-word synonym I always have
to enclose it in double quotes, so that it does not split the words?

Yes, but there are things you need to keep in mind.

  From the solr wiki:

Keep in mind that while the SynonymFilter will happily work with
*synonyms* containing multiple words (ie:
"sea biscuit, sea biscit, seabiscuit") The recommended approach for
dealing with *synonyms* like this, is to expand the synonym when
indexing. This is because there are two potential issues that can arise
at query time:

    1. The Lucene QueryParser tokenizes on white space before giving any
       text to the Analyzer, so if a person searches for the words
       sea biscit the analyzer will be given the words "sea" and "biscit"
       separately, and will not know that they match a synonym.

    2. Phrase searching (ie: "sea biscit") will cause the QueryParser to
       pass the entire string to the analyzer, but if the SynonymFilter
       is configured to expand the *synonyms*, then when the QueryParser
       gets the resulting list of tokens back from the Analyzer, it will
       construct a MultiPhraseQuery that will not have the desired
       effect. This is because of the limited mechanism available for the
       Analyzer to indicate that two terms occupy the same position:
       there is no way to indicate that a "phrase" occupies the same
       position as a term. For our example the resulting MultiPhraseQuery
       would be "(sea | sea | seabiscuit) (biscuit | biscit)" which would
       not match the simple case of "seabiscuit" occurring in a document.
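
The first issue in the wiki excerpt can be sketched with a small Python simulation. This is only a toy model of the behaviour described there: analyze, parse_query and the one-way rule "sea biscit => seabiscuit" are hypothetical and do not correspond to real Lucene or Solr APIs.

```python
# Toy simulation of the query-time problem described in the wiki text.
# Hypothetical one-way rule: "sea biscit" => "seabiscuit".
RULES = {("sea", "biscit"): ["seabiscuit"]}


def analyze(text):
    """Toy analyzer: lowercase, split, then apply multi-word rules
    to runs of consecutive tokens."""
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        for source, target in RULES.items():
            if tuple(tokens[i:i + len(source)]) == source:
                out.extend(target)
                i += len(source)
                break
        else:
            # No rule matched at this position; keep the token as-is.
            out.append(tokens[i])
            i += 1
    return out


def parse_query(query):
    """Toy query parser: a quoted query reaches the analyzer whole;
    an unquoted one is split on whitespace BEFORE analysis."""
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        return analyze(query[1:-1])
    terms = []
    for chunk in query.split():
        terms.extend(analyze(chunk))
    return terms


unquoted = parse_query('sea biscit')    # analyzer sees "sea", then "biscit"
quoted = parse_query('"sea biscit"')    # analyzer sees both words together
print(unquoted, quoted)
```

In this model the unquoted query never triggers the rule because each word is analyzed on its own, while the quoted query does. The second issue (the MultiPhraseQuery built after expansion) is not modelled here.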







Christian Zambrano wrote:

When you use a field qualifier (fieldName:valueToLookFor) it only applies
to the word right after the colon. If you look at the debug
information you will notice that for the second word it is using the
default field.

<str name="parsedquery_toString">carDescription:austin *text*:martin</str>

the following should work:

carDescription:(austin martin)
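
A rough Python sketch of that scoping rule follows. It is a toy parser written for illustration, not the Lucene QueryParser; the function name parse and the regular expression are assumptions, and "text" is used as the default field only because that is what the debug output shows.

```python
import re

# Default field taken from the debug output above; everything else in this
# sketch is hypothetical and only mimics the scoping rule being discussed.
DEFAULT_FIELD = "text"

# field:(a b c)  |  field:term  |  bare term
CLAUSE = re.compile(r'(\w+):\(([^)]*)\)|(\w+):(\S+)|(\S+)')


def parse(query, default_field=DEFAULT_FIELD):
    """Toy parser: return a list of (field, term) pairs."""
    clauses = []
    for m in CLAUSE.finditer(query):
        if m.group(1):
            # Grouped form: the field applies to every term in parentheses.
            clauses.extend((m.group(1), t) for t in m.group(2).split())
        elif m.group(3):
            # Plain qualifier: the field applies to the next term only.
            clauses.append((m.group(3), m.group(4)))
        else:
            # Unqualified term falls back to the default field.
            clauses.append((default_field, m.group(5)))
    return clauses


print(parse("carDescription:austin martin"))
print(parse("carDescription:(austin martin)"))
```

The first call yields carDescription:austin plus text:martin, matching the parsed query in the debug output, while the parenthesized form keeps both terms on carDescription.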


On 10/02/2009 05:46 PM, darniz wrote:

This is not working. I have a document which contains the
text aston martin.

When I search carDescription:"austin martin" I get a match, but when I
don't give double quotes,
like carDescription:austin martin,
there is no match.

In the analyser, if I give austin martin without quotes, when it passes
through the synonym filter it matches aston martin;
maybe by default the analyser treats it as a phrase "austin martin". But when I
try to do a query by typing
carDescription:austin martin I get 0 documents. The following is the debug
node info with debugQuery=on:

<str name="rawquerystring">carDescription:austin martin</str>
<str name="querystring">carDescription:austin martin</str>
<str name="parsedquery">carDescription:austin text:martin</str>
<str name="parsedquery_toString">carDescription:austin text:martin</str>

I don't know why it breaks the words; maybe it's the desired behaviour.
When I give carDescription:"austin martin", of course it is able to map
to the synonym and I get the desired result.

Any opinions?

darniz



Ensdorf Ken wrote:

Hi
i have a question regarding synonymfilter
i have a one way mapping defined
austin martin, astonmartin =>    aston martin

...

Can anybody please explain if my observation is correct? This is a very
critical aspect of my work.

That is correct - the synonym filter can recognize multi-token synonyms
from consecutive tokens in a stream.
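
As a rough illustration of matching a multi-token synonym against consecutive tokens, here is a small Python sketch that uses the one-way mapping quoted above. It is a simplified model rather than the actual SynonymFilter; parse_rule and apply_rule are hypothetical helper names.

```python
# Simplified model of a one-way synonym rule applied to a token stream.
# parse_rule and apply_rule are hypothetical helpers, not Solr APIs.

def parse_rule(line):
    """Parse 'a b, c => d e' into ([('a', 'b'), ('c',)], ['d', 'e'])."""
    left, right = line.split("=>")
    sources = [tuple(s.split()) for s in left.split(",")]
    return sources, right.split()


def apply_rule(tokens, rule):
    """Replace any source sequence found in consecutive tokens
    with the rule's target tokens."""
    sources, target = rule
    # Try longer source sequences first so multi-word entries win.
    sources = sorted(sources, key=len, reverse=True)
    out, i = [], 0
    while i < len(tokens):
        for src in sources:
            if tuple(tokens[i:i + len(src)]) == src:
                out.extend(target)
                i += len(src)
                break
        else:
            # Nothing matched here; pass the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


rule = parse_rule("austin martin, astonmartin => aston martin")
print(apply_rule("austin martin db9".split(), rule))
print(apply_rule(["austin"], rule))
```

In this sketch the two-word source only matches when both tokens arrive together in the same stream; a lone "austin" passes through unchanged, which is consistent with the behaviour discussed in this thread.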
