All of the applications I've seen with user control over synonym expansion were recall-oriented: the "give me all matches for X" kind of problem. So ranking is not as important.

wunder

On Dec 12, 2012, at 5:23 PM, Roman Chyla wrote:

> Well, this IDF problem has more sides. So, let's say your synonym file
> contains multi-token synonyms (it does, right? or perhaps you don't need
> it? well, some people do)
>
> "TV, TV set, TV foo, television"
>
> if you use the default synonym expansion, when you index 'television'
> you have increased frequency of also 'set', 'foo', so, the IDF of 'TV' is
> the same as that of 'television' - but IDF of 'foo' and 'set' has changed
> (their frequency increased, their IDF decreased) -- TV's have in fact made
> 'foo' term very frequent and undesirable
>
> So, you might be sure that IDF of 'TV' and 'television' are the same, but
> you are not aware it has 'screwed' other (desirable) terms - so it really
> depends. And I wouldn't argue these cases are esoteric.
>
> And finally: there are use cases out there, where people NEED to switch off
> synonym expansion at will (find only these documents, that contain the word
> 'TV' and not that bloody 'foo'). This cannot be done if the index contains
> all synonym terms (unless you have a way to mark the original and the
> synonym in the index).
>
> roman
>
>
> On Wed, Dec 12, 2012 at 12:50 PM, Walter Underwood
> <wun...@wunderwood.org> wrote:
>
>> Query parsers cannot fix the IDF problem or make query-time synonyms
>> faster. Query synonym expansion makes more search terms. More search terms
>> are more work at query time.
>>
>> The IDF problem is real; I've run up against it. The most rare variant of
>> the synonym has the highest score. This is probably the opposite of what you
>> want. For me, it was "TV" and "television". Documents with "TV" had higher
>> scores than those with "television".
>>
>> wunder
>>
>> On Dec 12, 2012, at 9:45 AM, Roman Chyla wrote:
>>
>>> @wunder
>>> It is a misconception (well, supported by that wiki description) that the
>>> query time synonym filter has these problems. It is actually the default
>>> parser that is causing these problems. Look at this if you still think
>>> that index time synonyms are a cure for all:
>>> https://issues.apache.org/jira/browse/LUCENE-4499
>>>
>>> @joe
>>> If you can use the flexible query parser (as linked in by @Swati) then all
>>> you need to do is to define a different field with a different tokenizer
>>> chain and then swap the field names before the analyzer processes the
>>> document (and then rewrite the field name back - for example, we have
>>> fields called "author" and "author_nosyn")
>>>
>>> roman
>>>
>>> On Wed, Dec 12, 2012 at 12:38 PM, Walter Underwood
>>> <wun...@wunderwood.org> wrote:
>>>
>>>> Query time synonyms have known problems. They are slower, cause incorrect
>>>> IDF, and don't work for phrase synonyms.
>>>>
>>>> Apply synonyms at index time and you will have none of those problems.
>>>>
>>>> See:
>>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>>>>
>>>> wunder
>>>>
>>>> On Dec 12, 2012, at 9:34 AM, Swati Swoboda wrote:
>>>>
>>>>> Query-time analyzers are still applied, even if you include a string in
>>>>> quotes. Would you expect "foo" to not match "Foo" just because it's
>>>>> enclosed in quotes?
>>>>>
>>>>> Also look at this, someone who had similar requirements:
>>>>> http://lucene.472066.n3.nabble.com/Synonym-Filter-disable-at-query-time-td2919876.html
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: joe.cohe...@gmail.com [mailto:joe.cohe...@gmail.com]
>>>>> Sent: Wednesday, December 12, 2012 12:09 PM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: Can a field with defined synonym be searched without the
>>>>> synonym?
>>>>>
>>>>>
>>>>> I'm applying only query-time synonyms, so I have the original values
>>>>> stored and indexed.
>>>>> I would've expected that if I search a string in quotation marks, I'll get
>>>>> the exact match, without applying a synonym.
>>>>>
>>>>> Any way to achieve that?
>>>>>
>>>>>
>>>>> Upayavira wrote
>>>>>> You can only search against terms that are stored in your index. If
>>>>>> you have applied index time synonyms, you can't remove them at query
>>>>>> time.
>>>>>>
>>>>>> You can, however, use copyField to clone an incoming field to another
>>>>>> field that doesn't use synonyms, and search against that field instead.
>>>>>>
>>>>>> Upayavira
>>>>>>
>>>>>> On Wed, Dec 12, 2012, at 04:26 PM, joe.cohen.m@ wrote:
>>>>>>> Hi
>>>>>>> I have a field type without defined synonym.txt which retrieves both
>>>>>>> records with "home" and "house" when I search either one of them.
>>>>>>>
>>>>>>> I want to be able to search this field on the specific value that I
>>>>>>> enter, without the synonym filter.
>>>>>>>
>>>>>>> Is it possible?
>>>>>>>
>>>>>>> thanks.
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://lucene.472066.n3.nabble.com/Can-a-field-with-defined-synonym-be-searched-without-the-synonym-tp4026381.html
>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://lucene.472066.n3.nabble.com/Can-a-field-with-defined-synonym-be-searched-without-the-synonym-tp4026381p4026405.html
>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>> --
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>>
>> --
>> Walter Underwood
>> wun...@wunderwood.org

--
Walter Underwood
wun...@wunderwood.org
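
To make the IDF discussion in this thread concrete, here is a minimal Python sketch, not Solr code. It assumes the classic Lucene-style formula idf = 1 + ln(N / (df + 1)), uses the synonym group quoted in the thread ("TV, TV set, TV foo, television"), and runs on a made-up six-document corpus. The names `DOCS`, `TRIGGERS`, `analyze`, and `idf_table` are hypothetical, and the expansion rule is deliberately simplified: any document containing "tv" or "television" receives every token of every variant.

```python
import math
from collections import Counter

# Synonym group taken from the thread; everything else here is illustrative.
SYNONYM_GROUP = ["tv", "tv set", "tv foo", "television"]
# Simplification: a document triggers expansion if it contains either token.
TRIGGERS = {"tv", "television"}

# Made-up corpus: "television" is common, "tv" is rare,
# and "set" / "foo" each occur once in unrelated documents.
DOCS = [
    "television review",
    "television repair guide",
    "television schedule",
    "tv listings",
    "chess set",
    "foo bar",
]


def analyze(text, expand):
    """Lowercase and split; optionally add every token of every synonym variant."""
    terms = set(text.lower().split())
    if expand and terms & TRIGGERS:
        for variant in SYNONYM_GROUP:
            terms.update(variant.split())
    return terms


def idf_table(docs, expand):
    """Return {term: idf} with idf = 1 + ln(N / (df + 1))."""
    df = Counter()
    for text in docs:
        df.update(analyze(text, expand))  # each term counted once per document
    n = len(docs)
    return {term: 1.0 + math.log(n / (count + 1)) for term, count in df.items()}


plain_idf = idf_table(DOCS, expand=False)     # no index-time synonyms
expanded_idf = idf_table(DOCS, expand=True)   # index-time synonym expansion

for term in ("tv", "television", "set", "foo"):
    print(f"{term:<11} plain={plain_idf[term]:.3f} expanded={expanded_idf[term]:.3f}")
```

Under these assumptions both points from the thread show up. Without index-time expansion the rarer "tv" gets a higher IDF than "television". With expansion the two become equal, but "set" and "foo" lose IDF because they are now injected into every television document.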