Re: Synonyms Not Working when using SRC & DEST

2011-09-07 Thread balaji
> So, if instead of:
>
> allergy test  =>  Doctors, Doctors-Medical, PHYSICIANS, Physicians &
> Surgeons
>
> You specified
>
>
> allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS,
> Physicians & Surgeons
>
>
   I followed the above approach, "allergy test => allergy test, Doctors,
Doctors-Medical, PHYSICIANS, Physicians & Surgeons", and it works as
expected. Thanks for making it clearer.

Thanks
Balaji


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3316691.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Synonyms Not Working when using SRC & DEST

2011-09-07 Thread Jaeger, Jay - DOT
Also, to make one thing a bit clearer: you can specify two different kinds 
of entries in synonym files.  See 
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters 
(solr.SynonymFilterFactory)


One is replacement, where the words before the "=>" are *replaced* by the right 
hand side, i.e., the words on the left hand side "disappear".  This is what you 
are currently doing according to your original message:

#Explicit mappings match any token sequence on the LHS of "=>"
#and replace with all alternatives on the RHS.  These types of mappings
#ignore the expand parameter in the schema.
#Examples:
i-pod, i pod => ipod,
sea biscuit, sea biscit => seabiscuit



The other is equivalence, where each term is expanded into the entire list.  
That is what you get if you do the following, with expand set to true:

#Equivalent synonyms may be separated with commas and give
#no explicit mapping.  In this case the mapping behavior will
#be taken from the expand parameter in the schema.  This allows
#the same synonym file to be used in different synonym handling strategies.
#Examples:
ipod, i-pod, i pod
foozball , foosball
universe , cosmos



So, if instead of:

allergy test  =>  Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

You specified


allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

Or 

allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

with expand set to true, then you might get the behavior you desire:  
"Allergy test" would get indexed, along with "Doctors" and all of the rest.  
The difference is that in the second case, any of those terms (e.g. 
"Doctors") would also get indexed as "Allergy test", which might not be what 
you desire, in which case the first one would do what you want.

I expect that all you really need to do is:

allergy test => allergy test, Doctors, Doctors-Medical, PHYSICIANS, Physicians & Surgeons

to solve your problem.
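
To make the two kinds of entries concrete, here is a minimal Python sketch of how a single synonyms-file line could be read. It is not Solr's code: parse_rule is a hypothetical helper, and lowercasing every phrase is an assumption made only to keep the comparison simple.

```python
def parse_rule(line, expand=True):
    """Turn one synonyms-file line into {source phrase: [output phrases]}.

    Hypothetical helper for illustration; not how Solr is implemented.
    """
    if "=>" in line:
        # Explicit mapping: each LHS phrase is replaced by every RHS phrase.
        lhs, rhs = line.split("=>")
        sources = [p.strip().lower() for p in lhs.split(",") if p.strip()]
        targets = [p.strip().lower() for p in rhs.split(",") if p.strip()]
        return {src: targets for src in sources}
    # Equivalence list: the behavior comes from the expand setting.
    phrases = [p.strip().lower() for p in line.split(",") if p.strip()]
    if expand:
        # Every phrase maps to the whole list, itself included.
        return {p: phrases for p in phrases}
    # With expand off, every phrase is reduced to the first one listed.
    return {p: [phrases[0]] for p in phrases}


replace_only = parse_rule("allergy test => Doctors, Physicians & Surgeons")
keep_source = parse_rule("allergy test => allergy test, Doctors")
equivalent = parse_rule("allergy test, Doctors", expand=True)
```

In replace_only the phrase "allergy test" is absent from its own output list, so it disappears; in keep_source it survives. In equivalent, "doctors" also maps back to "allergy test", which is the side effect described above.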

JRJ

-Original Message-
From: balaji [mailto:mcabal...@gmail.com] 
Sent: Tuesday, September 06, 2011 7:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms Not Working when using SRC & DEST

> It won't work given your current schema.  To get the desired results, you
> would need to expand your synonyms at both index AND query time.  Right now
> your schema seems to specify it only at index time.
>

I have a very large schema spanning up to 10K lines. If I expand at query time
it will be a huge hit for me, because one term will be mapped to multiple terms,
as in the case of allergy.

I don't want to use the comma-separated form, as it will give some erroneous
results; moreover, allergy and doctors are not equivalent terms, so they
should not be listed together with commas.


>
> So, as the other respondent indicated, currently you replace allergy with
> the other list when indexing, and since allergy is not replaced during
> query, it gets no hits.
>

I replace allergy with doctors during indexing, so shouldn't it be part of
the document?


Thanks
Balaji


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3315287.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Synonyms Not Working when using SRC & DEST

2011-09-07 Thread Jaeger, Jay - DOT
> I have a very large schema spanning up to 10K lines. If I expand at query time
> it will be a huge hit for me, because one term will be mapped to multiple terms,
> as in the case of allergy.

I think maybe you mean the synonym file, rather than the schema?  I doubt that 
the number of lines matters all that much, though it undoubtedly matters some.  
I expect that Solr loads that synonym file into some kind of hash map, rather 
than searching it linearly -- though I have not looked at the code for that.

> I replace allergy with doctors during indexing, so shouldn't it be part of
> the document?

Yes indeed, doctors would be in the index, and would give you a hit on that 
document when searched.  But because your synonym file specifies replacement, 
that means that allergy is *NOT* part of the index, hence, when you searched on 
allergy, you got no results.
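
A toy Python sketch of that effect follows. It is a simplified stand-in for an inverted index, not Solr itself; the rule table, the document text and the helper names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical index-time rule in the replacement ("=>") style.
RULES = {"allergy test": ["doctors", "physicians"]}


def index_terms(text, rules):
    """Return the terms stored for a field value after phrase replacement."""
    text = text.lower()
    terms = []
    for phrase, replacements in rules.items():
        if phrase in text:
            # The matched phrase is dropped from the text ...
            text = text.replace(phrase, " ")
            # ... and only the right-hand-side terms are kept.
            terms.extend(tok for r in replacements for tok in r.split())
    terms.extend(text.split())
    return terms


def build_index(docs, rules):
    """Map each stored term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in index_terms(text, rules):
            index[term].add(doc_id)
    return index


docs = {1: "Allergy test clinic"}
index = build_index(docs, RULES)

# Same document, but the rule also lists the source phrase on the right.
fixed = build_index(docs, {"allergy test": ["allergy test", "doctors"]})
```

With RULES, index has entries for "doctors", "physicians" and "clinic" but none for "allergy", so a lookup on "allergy" finds nothing. With the second rule, "allergy" is stored as well and the lookup finds document 1.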

As far as synonym expansion being a "huge hit", no, not really, I think.  
Besides, if you are not getting what you want or need, speed becomes pretty 
much irrelevant.  We did some performance testing:  modest single server (i.e., 
a laptop running Windows XP with only 2GB total memory available), pretty much 
configured "out of the box" with jetty, except that we added waffle 
authentication.  The data was names, addresses and the like (not text) -- 7+ 
million rows, with considerable synonym expansion:  200 first name synonyms, 
433 last name synonyms, expanded at both index time and search time.

We then did a search test driven from those same synonyms files, by randomly 
picking out a name from the first and last name list, the idea being that most 
likely names did have some synonyms.

Under Solr 3.1, once the OS file system cache got some entries in there, 
running with 8 concurrent client search threads sending HTTP search requests 
(done in Perl) we averaged about 0.50 seconds per request, or over 55,000 
searches per hour.
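
As a rough check of that figure, the arithmetic can be written out in Python. It assumes each of the 8 threads sends its next request as soon as the previous one returns, which is an assumption about the test harness, not something measured here.

```python
threads = 8            # concurrent client search threads
avg_latency_s = 0.50   # average seconds per request

# Each thread completes 1 / avg_latency_s requests per second.
requests_per_second = threads / avg_latency_s
searches_per_hour = requests_per_second * 3600

print(searches_per_hour)
```

Under that assumption the result is 57,600 searches per hour, which is consistent with "over 55,000".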

JRJ



Re: Synonyms Not Working when using SRC & DEST

2011-09-06 Thread balaji
> It won't work given your current schema.  To get the desired results, you
> would need to expand your synonyms at both index AND query time.  Right now
> your schema seems to specify it only at index time.
>

I have a very large schema spanning up to 10K lines. If I expand at query time
it will be a huge hit for me, because one term will be mapped to multiple terms,
as in the case of allergy.

I don't want to use the comma-separated form, as it will give some erroneous
results; moreover, allergy and doctors are not equivalent terms, so they
should not be listed together with commas.


>
> So, as the other respondent indicated, currently you replace allergy with
> the other list when indexing, and since allergy is not replaced during
> query, it gets no hits.
>

I replace allergy with doctors during indexing, so shouldn't it be part of
the document?


Thanks
Balaji


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3315287.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Synonyms Not Working when using SRC & DEST

2011-09-06 Thread Jaeger, Jay - DOT
It won't work given your current schema.  To get the desired results, you would 
need to expand your synonyms at both index AND query time.  Right now your 
schema seems to specify it only at index time.

So, as the other respondent indicated, currently you replace allergy with the 
other list when indexing, and since allergy is not replaced during query, it 
gets no hits.

It almost sounds like a case where you could consider synonym expansion only at 
query time, rather than at index time (though that is usually not advisable for 
reasons discussed on the Wiki).  Then Allergy would get expanded during a 
search, and hit the documents with Doctors, etc.
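
A minimal Python sketch of that query-time-only idea is below. The index contents, the rule table and the function names are hypothetical, and the caveat above about query-time expansion still applies.

```python
# Hypothetical query-time rule; the index itself is left untouched.
QUERY_RULES = {"allergy test": ["doctors", "physicians"]}


def expand_query(query, rules):
    """Replace a known phrase with its mapped terms, else split on spaces."""
    q = query.lower().strip()
    return rules.get(q, q.split())


def search(index, query, rules):
    """Return ids of documents matching any of the expanded query terms."""
    hits = set()
    for term in expand_query(query, rules):
        hits |= index.get(term, set())
    return hits


# Toy index: term -> document ids, built with no synonym filter at all.
index = {"doctors": {1, 2}, "clinic": {2}, "dentists": {3}}

allergy_hits = search(index, "Allergy Test", QUERY_RULES)
dentist_hits = search(index, "dentists", QUERY_RULES)
```

Here the documents never contain "allergy test", yet the search for it reaches documents 1 and 2 through "doctors", while unrelated queries behave as before.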

JRJ

-Original Message-
From: balaji [mailto:mcabal...@gmail.com] 
Sent: Tuesday, September 06, 2011 12:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms Not Working when using SRC & DEST

Hi Chris

The terms Doctors, Doctors-Medical, etc. are all present in my document body,
title fields, and so on, but Allergy Test is not. So what I am doing in the
synonym file is: if a user searches for allergy test, bring me results that
match Doctors etc., i.e.
explicit mappings match any token sequence on the LHS of "=>" and replace
with all alternatives on the RHS.

So when I search for "allergy test" it should map to doctors and
bring me results, but it is not mapping. Is there any way I can make it
work?

Hope that clarifies things.


Thanks
Balaji



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3314222.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Synonyms Not Working when using SRC & DEST

2011-09-06 Thread balaji
Hi Chris

The terms Doctors, Doctors-Medical, etc. are all present in my document body,
title fields, and so on, but Allergy Test is not. So what I am doing in the
synonym file is: if a user searches for allergy test, bring me results that
match Doctors etc., i.e.
explicit mappings match any token sequence on the LHS of "=>" and replace
with all alternatives on the RHS.

So when I search for "allergy test" it should map to doctors and
bring me results, but it is not mapping. Is there any way I can make it
work?

Hope that clarifies things.


Thanks
Balaji



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3314222.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Synonyms Not Working when using SRC & DEST

2011-09-06 Thread Chris Hostetter

: *allergy test  =>  Doctors, Doctors-Medical, PHYSICIANS, Physicians &
: Surgeons
..
: 
...
: 
...
: But when I do a search for allergy , I get 0 results

You've configured your field so that any time the terms "allergy" 
and "test" appear in sequence in a field value you index, those terms are 
removed and replaced by new terms ("Doctors", "Doctors-Medical", etc...)

So if the term "allergy" only appears in the source text followed by the 
term "test" then it will never actually be indexed in your document, so a 
search for it will never match.

You can see this exact behavior in the screen shot you posted of the 
analysis tool...

: http://lucene.472066.n3.nabble.com/file/n3313862/Screenshot-1.png 

...after the synonym filter, the term "allergy" is not in your indexed 
terms.

: when i change the synonym file to a comma separated I am able to see the
: results

because when using a comma instead of "=>" you are saying "if any of these 
term sequences exist, expand it to *all* of these term sequences."

Please note the docs on SynonymFilter, particularly the examples...

https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

-Hoss