Re: Question regarding synonym
Yes, that is what we decided: to expand these terms while indexing.

If we have the mapping

bayrische motoren werke => bmw

and I have a document which has bmw in it, searching for text:bayrische does not give me results. I have to search for text:"bayrische motoren werke"; only then does it apply the synonym and get me the document.

Now suppose I change the synonym mapping to

bayrische motoren werke, bmw

with the expand parameter set to true, and also use this file at indexing. Then when I index this document, along with bmw I also index the words bayrische, motoren and werke. Any text query like text:motoren or text:bayrische will give me results now.

Please correct me if my assumption is wrong.

Thanks
darniz

--
View this message in context: http://www.nabble.com/Question-regarding-synonym-tp25720572p25754288.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Question regarding synonym
You are correct. I would recommend using the synonym TokenFilter only at index time unless you have a very good reason to do it at query time.

On 10/05/2009 11:46 AM, darniz wrote:
> Now suppose I change the synonym mapping to bayrische motoren werke, bmw with the expand parameter set to true, and also use this file at indexing. Then when I index this document, along with bmw I also index the words bayrische, motoren and werke. Any text query like text:motoren or text:bayrische will give me results now.
> Please correct me if my assumption is wrong.
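The index-time expansion discussed above can be illustrated with a small sketch. This is not Solr code: SYNONYM_GROUPS, expand_tokens and build_index are hypothetical names, the sample documents are made up, and the toy index ignores token positions. It only shows why a document containing bmw becomes findable by motoren or bayrische once the equivalent terms are added when the document is indexed.

```python
from collections import defaultdict

# Hypothetical synonym groups, mirroring a synonyms-file line such as
# "bayrische motoren werke, bmw" used with expand=true.
SYNONYM_GROUPS = [
    [("bayrische", "motoren", "werke"), ("bmw",)],
]


def expand_tokens(tokens):
    """Return the tokens plus every token of any equivalent synonym entry."""
    expanded = list(tokens)
    for group in SYNONYM_GROUPS:
        for entry in group:
            n = len(entry)
            # Look for the entry as a run of consecutive tokens.
            found = any(tuple(tokens[i:i + n]) == entry
                        for i in range(len(tokens) - n + 1))
            if found:
                for other in group:
                    if other != entry:
                        expanded.extend(other)
    return expanded


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in expand_tokens(text.lower().split()):
            index[term].add(doc_id)
    return index


docs = {1: "used bmw for sale", 2: "used audi for sale"}
index = build_index(docs)
print(sorted(index["motoren"]))    # [1]
print(sorted(index["bayrische"]))  # [1]
print(sorted(index["audi"]))       # [2]
```

Because the extra terms are stored with the document, a single-word query needs no synonym handling at query time, which is the point of the recommendation above.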
Re: Question regarding synonym
On 10/02/2009 06:02 PM, darniz wrote:
> Thanks. As I said, it even works when I give double quotes too, like carDescription:"austin martin".
> So is the conclusion that in order to map a two-word synonym I always have to enclose it in double quotes, so that it does not split the words?

Yes, but there are things you need to keep in mind. From the Solr wiki:

Keep in mind that while the SynonymFilter will happily work with synonyms containing multiple words (ie: "sea biscuit, sea biscit, seabiscuit"), the recommended approach for dealing with synonyms like this is to expand the synonym when indexing. This is because there are two potential issues that can arise at query time:

1. The Lucene QueryParser tokenizes on white space before giving any text to the Analyzer, so if a person searches for the words sea biscit the analyzer will be given the words "sea" and "biscit" separately, and will not know that they match a synonym.

2. Phrase searching (ie: "sea biscit") will cause the QueryParser to pass the entire string to the analyzer, but if the SynonymFilter is configured to expand the synonyms, then when the QueryParser gets the resulting list of tokens back from the Analyzer, it will construct a MultiPhraseQuery that will not have the desired effect. This is because of the limited mechanism available for the Analyzer to indicate that two terms occupy the same position: there is no way to indicate that a "phrase" occupies the same position as a term. For our example the resulting MultiPhraseQuery would be "(sea | sea | seabiscuit) (biscuit | biscit)" which would not match the simple case of "seabiscuit" occurring in a document.
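The second issue in the wiki excerpt can be made concrete with a toy model. The sketch below is only an approximation of how a positional phrase query behaves, not Lucene's MultiPhraseQuery implementation; the function name matches and the sample token lists are assumptions for illustration. Each query position holds a set of alternative terms, and a document matches only if consecutive tokens satisfy every position in order.

```python
# Each position holds the set of alternative terms, as in the wiki's
# example query (sea | sea | seabiscuit) (biscuit | biscit).
multi_phrase_query = [
    {"sea", "seabiscuit"},
    {"biscuit", "biscit"},
]


def matches(query_positions, doc_tokens):
    """True if some run of consecutive doc tokens satisfies every position."""
    n = len(query_positions)
    for start in range(len(doc_tokens) - n + 1):
        window = doc_tokens[start:start + n]
        if all(tok in alts for tok, alts in zip(window, query_positions)):
            return True
    return False


print(matches(multi_phrase_query, ["sea", "biscuit"]))        # True
print(matches(multi_phrase_query, ["sea", "biscit"]))         # True
print(matches(multi_phrase_query, ["seabiscuit"]))            # False
print(matches(multi_phrase_query, ["seabiscuit", "biscit"]))  # True
```

The query always demands two positions, so a document holding the single token seabiscuit cannot match, while the odd sequence seabiscuit biscit does. Expanding at index time avoids this because the alternatives are stored alongside the document's own tokens.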
Question regarding synonym
Hi,

I have a question regarding the SynonymFilter. I have a one-way mapping defined:

austin martin, astonmartin => aston martin

What is baffling me is this: if I give the words austin martin at query time, the analysis page shows that the text first goes through the whitespace tokenizer, which generates two words, austin and martin. Then, after the synonym filter, they are replaced with the words aston martin.

That is good, and it is what I want. But I am wondering: since the text went through the whitespace tokenizer first and was split into two different words, austin and martin, how was the filter able to match the entire synonym and replace it?

If I give only austin, then after passing through the synonym filter it is not replaced with aston.

That leads me to conclude that even though austin martin went through the WhitespaceTokenizerFactory and got split in two, the word ordering is still preserved to find a synonym match.

Can anybody please explain if my observation is correct? This is a very critical aspect for my work.

Thanks
darniz

--
View this message in context: http://www.nabble.com/Question-regarding-synonym-tp25720572p25720572.html
RE: Question regarding synonym
> Hi, I have a question regarding the SynonymFilter. I have a one-way mapping defined:
> austin martin, astonmartin => aston martin
> ...
> Can anybody please explain if my observation is correct? This is a very critical aspect for my work.

That is correct - the synonym filter can recognize multi-token synonyms from consecutive tokens in a stream.
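To make the answer above concrete, here is a minimal sketch of how a filter can match multi-token synonyms over an already tokenized stream. It is not Solr's SynonymFilter; RULES, MAX_LEN and apply_synonyms are hypothetical names, and the longest-match-first strategy is an assumption chosen for the illustration. The filter looks at runs of consecutive tokens, so splitting on whitespace beforehand does not prevent a two-word match.

```python
# Hypothetical one-way rules, mirroring
# "austin martin, astonmartin => aston martin".
RULES = {
    ("austin", "martin"): ["aston", "martin"],
    ("astonmartin",): ["aston", "martin"],
}
MAX_LEN = max(len(key) for key in RULES)


def apply_synonyms(tokens):
    """Replace the longest matching run of consecutive tokens at each position."""
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            replacement = RULES.get(tuple(tokens[i:i + n]))
            if replacement is not None:
                out.extend(replacement)
                i += n
                break
        else:
            # No rule starts here; pass the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


print(apply_synonyms("austin martin".split()))    # ['aston', 'martin']
print(apply_synonyms("austin".split()))           # ['austin']
print(apply_synonyms("red astonmartin".split()))  # ['red', 'aston', 'martin']
```

The lone token austin is left alone because no rule consists of that token by itself, which matches the behaviour described in the question.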
RE: Question regarding synonym
This is not working when I search documents. I have a document which contains the text aston martin.

When I search carDescription:"austin martin" I get a match, but when I don't give double quotes, like carDescription:austin martin, there is no match.

In the analyser, if I give austin martin without quotes, it matches aston martin when it passes through the synonym filter; maybe by default the analyser treats it as the phrase austin martin. But when I try to do a query by typing carDescription:austin martin I get 0 documents.

The following is the debug node info with debugQuery=on:

<str name="rawquerystring">carDescription:austin martin</str>
<str name="querystring">carDescription:austin martin</str>
<str name="parsedquery">carDescription:austin text:martin</str>
<str name="parsedquery_toString">carDescription:austin text:martin</str>

I don't know why it breaks the words apart; maybe it is the desired behaviour. When I give carDescription:"austin martin", of course, it is able to map to the synonym and I get the desired result.

Any opinion?
darniz

Ensdorf Ken wrote:
> That is correct - the synonym filter can recognize multi-token synonyms from consecutive tokens in a stream.

--
View this message in context: http://www.nabble.com/Question-regarding-synonym-tp25720572p25723829.html
Re: Question regarding synonym
When you use a field qualifier (fieldName:valueToLookFor) it only applies to the word right after the colon. If you look at the debug information you will notice that for the second word it is using the default field:

<str name="parsedquery_toString">carDescription:austin text:martin</str>

The following should work:

carDescription:(austin martin)

On 10/02/2009 05:46 PM, darniz wrote:
> When I search carDescription:"austin martin" I get a match, but when I don't give double quotes, like carDescription:austin martin, there is no match.
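The scoping rule described above can be sketched with a toy parser. This is not the Lucene QueryParser: the regular expression, the parse function and DEFAULT_FIELD are assumptions that cover only bare words, quoted phrases and one level of parentheses. It reproduces the pattern seen in the debug output, where the second bare word falls back to the default field.

```python
import re

DEFAULT_FIELD = "text"  # assumed default search field, as in the debug output

# A clause is an optional "field:" prefix followed by a quoted phrase,
# a parenthesised group, or a single bare word.
CLAUSE = re.compile(r'(?:(\w+):)?("[^"]*"|\([^)]*\)|\S+)')


def parse(query):
    """Return (field, term) pairs for a very small subset of query syntax."""
    pairs = []
    for field, value in CLAUSE.findall(query):
        field = field or DEFAULT_FIELD
        if value.startswith("("):
            # Grouping applies the field to every word inside the parentheses.
            pairs.extend((field, word) for word in value[1:-1].split())
        else:
            pairs.append((field, value))
    return pairs


print(parse('carDescription:austin martin'))
# [('carDescription', 'austin'), ('text', 'martin')]
print(parse('carDescription:(austin martin)'))
# [('carDescription', 'austin'), ('carDescription', 'martin')]
print(parse('carDescription:"austin martin"'))
# [('carDescription', '"austin martin"')]
```

Note that only the quoted form keeps austin martin together as one clause; the parenthesised form fixes the field but still yields two separate terms.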
Re: Question regarding synonym
Thanks. As I said, it even works when I give double quotes too, like carDescription:"austin martin".

So is the conclusion that in order to map a two-word synonym I always have to enclose it in double quotes, so that it does not split the words?

Christian Zambrano wrote:
> When you use a field qualifier (fieldName:valueToLookFor) it only applies to the word right after the colon. If you look at the debug information you will notice that for the second word it is using the default field.
> The following should work: carDescription:(austin martin)

--
View this message in context: http://www.nabble.com/Question-regarding-synonym-tp25720572p25723980.html