Re: Synonyms Not Working when using SRC & DEST
> So, if instead of:
>
>   allergy test => Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons
>
> You specified
>
>   allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

I followed the above approach:

  allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

and it works as expected. Thanks for making it more clear.

Thanks
Balaji

--
View this message in context: http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3316691.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Synonyms Not Working when using SRC & DEST
Also, just to make one thing a bit clearer: you can specify two different kinds of entries in synonym files. See http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters (solr.SynonymFilterFactory).

One is replacement, where the words before the "=>" are *replaced* by the right hand side, i.e., the words on the left hand side "disappear". This is what you are currently doing according to your original message:

  #Explicit mappings match any token sequence on the LHS of "=>"
  #and replace with all alternatives on the RHS. These types of mappings
  #ignore the expand parameter in the schema.
  #Examples:
  i-pod, i pod => ipod,
  sea biscuit, sea biscit => seabiscuit

The other is equivalence, where each term is expanded into the entire list, if you do the following with expand set to true:

  #Equivalent synonyms may be separated with commas and give
  #no explicit mapping. In this case the mapping behavior will
  #be taken from the expand parameter in the schema. This allows
  #the same synonym file to be used in different synonym handling strategies.
  #Examples:
  ipod, i-pod, i pod
  foozball , foosball
  universe , cosmos

So, if instead of:

  allergy test => Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

You specified

  allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

Or

  allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

with expand set to true, then you might get the behavior you desire: "Allergy test" would get indexed, along with "Doctors" and all of the rest. The difference is that in the second case, any of those terms (e.g. "Doctors") would also get indexed as "Allergy test", which might not be what you desire, in which case the first one would do what you want.

I expect that all you really need to do is:

  allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

to solve your problem.
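To make the two entry kinds above concrete, here is a minimal Python sketch of how a single synonym-file line could be turned into a mapping. It is only an illustration, not Solr's implementation: the function name parse_rule is hypothetical, lowercasing every term is a simplification assumed for the sketch, and the expand=False branch (collapse everything onto the first term) follows the behavior described in Solr's sample synonyms file rather than anything stated in this thread.

```python
def parse_rule(line, expand=True):
    """Map each source phrase of one synonym-file line to its output phrases.

    Hypothetical helper for illustration only; it is not part of Solr.
    """
    def split_terms(text):
        # Comma-separated phrases, trimmed and lowercased (a simplification).
        return [part.strip().lower() for part in text.split(",") if part.strip()]

    if "=>" in line:
        # Explicit mapping: the LHS is replaced by the RHS, expand is ignored.
        lhs, rhs = line.split("=>", 1)
        targets = split_terms(rhs)
        return {source: targets for source in split_terms(lhs)}

    terms = split_terms(line)
    if expand:
        # Equivalence with expand=true: every term maps to the whole list.
        return {term: terms for term in terms}
    # Equivalence with expand=false: every term collapses onto the first one.
    return {term: [terms[0]] for term in terms}


replacement = parse_rule("allergy test => Doctors, Doctors-Medical")
keeps_source = parse_rule("allergy test => allergy test, Doctors, Doctors-Medical")
equivalent = parse_rule("foozball , foosball")
```

In replacement the key "allergy test" does not appear among its own outputs, which is the "disappearing" left hand side; in keeps_source it does, because it was listed again on the right hand side.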
JRJ
RE: Synonyms Not Working when using SRC & DEST
> I have a very huge schema spanning up to 10K lines , if I use query time it
> will be huge hit for me because one term will be mapped to multiple terms .
> similar in the case of allergy

I think maybe you mean the synonym file, rather than the schema? I doubt that the number of lines matters all that much, though undoubtedly it matters some. I expect that Solr loads that synonym file into some kind of hash map, rather than searching it linearly -- though I have not looked at the code for that.

> I replace allergy during the index with doctors , So it shouldn't be part of
> the document ?

Yes indeed, doctors would be in the index, and would give you a hit on that document when searched. But because your synonym file specifies replacement, allergy is *NOT* part of the index; hence, when you searched on allergy, you got no results.

As far as synonym expansion being a "huge hit", no, not really, I think. Besides, if you are not getting what you want or need, speed becomes pretty much irrelevant.

We did some performance testing on a modest single server (i.e., a laptop running Windows XP with only 2GB total memory available), pretty much configured "out of the box" with Jetty, except that we added Waffle authentication. The data was names, addresses and the like (not text): 7+ million rows, with considerable synonym expansion (200 first name synonyms, 433 last name synonyms), expanded at both index time and search time. We then did a search test driven from those same synonym files, by randomly picking a name from the first and last name lists, the idea being that most of those names likely did have some synonyms. Under Solr 3.1, once the OS file system cache got some entries in there, running with 8 concurrent client search threads sending HTTP search requests (done in Perl), we averaged about 0.50 seconds per request, or over 55,000 searches per hour.
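As a rough check on that last figure, the arithmetic can be sketched as below. The assumption (not stated in the message) is that each of the 8 client threads sends its next request as soon as the previous one returns, so throughput is simply threads divided by the average latency; the function name is made up for the example.

```python
def searches_per_hour(threads, seconds_per_request):
    """Estimate hourly throughput for back-to-back requests on each thread."""
    # Assumption: no think time between requests on any thread.
    return threads * 3600 / seconds_per_request


estimate = searches_per_hour(8, 0.50)
print(estimate)  # 57600.0
```

Under that assumption the estimate comes out at 57,600 searches per hour, which is consistent with "over 55,000".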
JRJ
Re: Synonyms Not Working when using SRC & DEST
> It won't work given your current schema. To get the desired results, you
> would need to expand your synonyms at both index AND query time. Right now
> your schema seems to specify it only at index time.

I have a very huge schema spanning up to 10K lines. If I use query time it will be a huge hit for me, because one term will be mapped to multiple terms. Similarly, in the case of allergy I don't want to go with the comma-separated form, as it will give some erroneous results; moreover, allergy and doctors are not equivalent terms to be listed with commas.

> So, as the other respondent indicated, currently you replace allergy with
> the other list when indexing, and since allergy is not replaced during
> query, it gets no hits.

I replace allergy during the index with doctors, so it shouldn't be part of the document?

Thanks
Balaji
RE: Synonyms Not Working when using SRC & DEST
It won't work given your current schema. To get the desired results, you would need to expand your synonyms at both index AND query time. Right now your schema seems to specify it only at index time.

So, as the other respondent indicated, currently you replace allergy with the other list when indexing, and since allergy is not replaced during query, it gets no hits.

It almost sounds like a case where you could consider synonym expansion only at query time, rather than at index time (though that is usually not advisable for reasons discussed on the Wiki). Then Allergy would get expanded during a search, and hit the documents with Doctors, etc.

JRJ
Re: Synonyms Not Working when using SRC & DEST
Hi Chris

The terms Doctors, Doctors-Medical are all present in my document body, title fields, etc., but Allergy Test is not. So what I am doing in the synonym file is: if a user searches for allergy test, bring me results that match Doctors etc., i.e.

  Explicit mappings match any token sequence on the LHS of "=>" and replace with all alternatives on the RHS.

So when I do a search for "allergy test" it should map to doctors and should bring me results, but it is not mapping. Is there any way I can make it work?

Hope it clarifies.

Thanks
Balaji
Re: Synonyms Not Working when using SRC & DEST
: *allergy test => Doctors, Doctors-Medical, PHYSICIANS, Physicians &
: Surgeons ..
: ...
: ...
: But when I do a search for allergy , I get 0 results

You've configured your field so that any time the terms "allergy" and "test" appear in sequence in a field value you index, those terms are removed and replaced by new terms ("Doctors", "Doctors-Medical", etc...)

So if the term "allergy" only appears in the source text followed by the term "test" then it will never actually be indexed in your document, so a search for it will never match.

You can see this exact behavior in the screen shot you posted of the analysis tool...

: http://lucene.472066.n3.nabble.com/file/n3313862/Screenshot-1.png

...after the synonym filter, the term "allergy" is not in your indexed terms.

: when i change the synonym file to a comma separated I am able to see the
: results

because when using a comma instead of "=>" you are saying "if any of these term sequences exist, expand it to *all* of these term sequences."

Please note the docs on SynonymFilter, particularly the examples...

https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

-Hoss
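The index-time replacement described in this message can be imitated with a small Python sketch. It is a toy model, not Solr's analysis chain: the function name analyze is hypothetical, and the sketch assumes whitespace tokenization, lowercasing, and no handling of token positions. It only shows why the replaced terms never reach the index.

```python
def analyze(text, rules):
    """Replace each matching source phrase with the tokens of its targets.

    Toy stand-in for an index-time synonym filter; not Solr code.
    """
    tokens = text.lower().split()
    output = []
    i = 0
    while i < len(tokens):
        for source, targets in rules.items():
            source_tokens = source.split()
            if tokens[i:i + len(source_tokens)] == source_tokens:
                # The matched source tokens are dropped; only targets are kept.
                for target in targets:
                    output.extend(target.split())
                i += len(source_tokens)
                break
        else:
            # No rule matched at this position, keep the token unchanged.
            output.append(tokens[i])
            i += 1
    return output


# Mapping as originally posted: the left hand side is not repeated on the right.
replace_only = {"allergy test": ["doctors", "physicians"]}
# Mapping with the source phrase listed again on the right hand side.
keep_source = {"allergy test": ["allergy test", "doctors", "physicians"]}

index_a = set(analyze("Allergy test clinic", replace_only))
index_b = set(analyze("Allergy test clinic", keep_source))
```

With replace_only, index_a holds "doctors", "physicians" and "clinic" but not "allergy", so a query for allergy that is not itself synonym-expanded finds nothing. With keep_source, "allergy" and "test" stay in index_b alongside the added terms.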