A small addition to my earlier post. I wonder if it's because of the 'mm' param, which requires that for search phrases of up to 3 words, all the words must match. If I alter this now, I'd get irrelevant results for a lot of popular 1-, 2- and 3-word search terms. How can I solve this?
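To make the 'mm' behaviour concrete, here is a rough Python sketch of how I understand a min-should-match spec such as the one in my handler (3<-1 4<-1 5<-1 6<90%) to be evaluated. It is not Solr's code: the function name min_should_match is made up for illustration, it assumes there are no spaces around '<', and rounding in edge cases may differ from Solr. It only maps a clause count to the number of clauses that must match; it does not model how query-time synonyms change the clauses.

```python
def min_should_match(clause_count: int, spec: str) -> int:
    """Hypothetical helper: how many optional clauses must match for an mm spec."""
    result = clause_count
    spec = spec.strip()
    if "<" in spec:
        # Conditional form "N<value": the value applies only when clause_count > N.
        for part in spec.split():
            bound_text, value = part.split("<", 1)
            if clause_count <= int(bound_text):
                return result
            result = min_should_match(clause_count, value)
        return result
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Integer arithmetic, truncated toward zero (an assumption about rounding).
        amount = abs(clause_count * percent) // 100
        result = clause_count - amount if percent < 0 else amount
    else:
        number = int(spec)
        # A negative number means "all but that many clauses".
        result = clause_count + number if number < 0 else number
    # Clamp to the valid range 0..clause_count.
    return max(0, min(clause_count, result))


MM = "3<-1 4<-1 5<-1 6<90%"

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, "clauses ->", min_should_match(n, MM), "must match")
```

With this spec, queries of 1 to 3 clauses require every clause to match, which is why loosening 'mm' would affect the short, popular search terms.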
anuvenk wrote:
>
> I tried adding some city to state mappings in the synonyms file. I'm using
> the dismax handler for phrase matching. So as & when i add more & more
> city to state mappings, I end up with zero results for state based
> searches.
> Eg: ca,california,los angeles
> ca,california,san diego
> ca,california,san francisco
> ca,california,burbank and so on....
> now a city based search returns a few other california results but a state
> based search like dui california is returning zero results.
> I checked the parsedquery_toString and I see no 'OR' although the default
> operator is 'OR' in schema. It looks like its trying to find matches for
> all those cities as they are mapped to 'california' and hence returns zero
> results. How to force dismax to use 'OR' and not 'AND' even though the
> schema has 'OR'.
> Or is this how dismax works? Can someone explain how to overcome this
> problem.
> Here is my custom request handler that extends dismax
> <requestHandler name="qfacet" class="solr.DisMaxRequestHandler" >
>   <lst name="defaults">
>     <str name="echoParams">explicit</str>
>     <float name="tie">0.01</float>
>     <str name="qf">name^2.0 text^0.8</str>
>     <!-- until 3 all should match;4 - 3 shld match; 5 - 4 shld match; 6 - 5 shld match; above 6 - 90% match -->
>     <str name="mm">3&lt;-1 4&lt;-1 5&lt;-1 6&lt;90%</str>
>     <str name="pf">
>       text^0.8 name^2.0
>     </str>
>     <int name="qs">4</int>
>     <int name="ps">4</int>
>     <str name="fl">
>       *,score
>     </str>
>
>   </lst>
>   <lst name="invariants">
>     <!--<str name="facet.field">resourceType</str>
>     <str name="facet.field">category</str>
>     <str name="facet.field">stateName</str>-->
>     <str name="facet.sort">false</str>
>     <int name="facet.mincount">1</int>
>   </lst>
> </requestHandler>
>
> Thanks.
>
>
>
> Otis Gospodnetic wrote:
>>
>>
>> Hello,
>>
>> 300K is a pretty small index. I wouldn't worry about the number of
>> synonyms unless you are turning a single term into dozens of ORed terms.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>> ----- Original Message ----
>>> From: anuvenk <anuvenkat...@hotmail.com>
>>> To: solr-user@lucene.apache.org
>>> Sent: Tuesday, June 2, 2009 11:28:43 PM
>>> Subject: Re: Is there Downside to a huge synonyms file?
>>>
>>>
>>> I'm using query time synonyms. I have more fields in my index though.
>>> This is just an example or sample of data from my index. Yes, we don't
>>> have millions of documents. Could be around 300,000 and might increase
>>> in future. The reason i'm using query time synonyms is because of the
>>> nature of my data. I can't re-index the data everytime i add or remove
>>> a synonym. But for this particular requirement is it best to have index
>>> time synonyms because of the multi-word synonym nature. Again if i add
>>> more cities list to the synonym file, I can't be re-indexing all the
>>> data over and over again.
>>>
>>>
>>>
>>> anuvenk wrote:
>>> >
>>> > In my index i have legal faqs, forms, legal videos etc with a state
>>> > field for each resource.
>>> > Now if i search for real estate san diego, I want to be able to return
>>> > other 'california' results i.e results from san francisco.
>>> > I have the following fields in the index
>>> >
>>> > title                             state        description...
>>> > real estate san diego example 1   california   some description
>>> > real estate carlsbad example 2    california   some desc
>>> >
>>> > so when i search for real estate san francisco, since there is no
>>> > match, i want to be able to return the other real estate results in
>>> > california instead of returning none. Because sometimes they might be
>>> > searching for a real estate form and city probably doesn't matter.
>>> >
>>> > I have two things in mind.
>>> > One is adding a synonym mapping
>>> > san diego, california
>>> > carlsbad, california
>>> > san francisco, california
>>> >
>>> > (which probably isn't the best way)
>>> > hoping that search for san francisco real estate would map san
>>> > francisco to california and hence return the other two california
>>> > results
>>> >
>>> > OR
>>> >
>>> > adding the mapping of city to state in the index itself like..
>>> >
>>> > title                         state        city                                 description...
>>> > real estate san diego eg 1    california   carlsbad, san francisco, san diego   some description
>>> > real estate carlsbad eg 2     california   carlsbad, san francisco, san diego   some description
>>> >
>>> > which of the above two is better. Does a huge synonym file affect
>>> > performance. Or Is there a even better way? I'm sure there is but I
>>> > can't put my finger on it yet & I'm not familiar with java either.
>>> >
>>> >
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23844761.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>

--
View this message in context: http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23861649.html
Sent from the Solr - User mailing list archive at Nabble.com.