All of the applications I've seen with user control over synonym expansion 
were recall-oriented. The "give me all matches for X" kind of problem. So 
ranking is not as important.

wunder

On Dec 12, 2012, at 5:23 PM, Roman Chyla wrote:

> Well, this IDF problem has more sides. So, let's say your synonym file
> contains multi-token synonyms (it does, right? or perhaps you don't need
> it? well, some people do)
> 
> "TV, TV set, TV foo, television"
> 
> if you use the default synonym expansion, then when you index 'television'
> you also increase the frequency of 'set' and 'foo'. So the IDF of 'TV' is
> the same as that of 'television', but the IDF of 'foo' and 'set' has changed
> (their frequency increased, their IDF decreased). The TVs have in fact made
> the 'foo' term very frequent and undesirable.
> 
> So, you might be sure that IDF of 'TV' and 'television' are the same, but
> you are not aware it has 'screwed' other (desirable) terms - so it really
> depends. And I wouldn't argue these cases are esoteric.
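
A minimal sketch of the side effect described above, in Python rather than Solr. The corpus is made up, the expansion rule is a deliberately naive stand-in for a synonym filter, and the smoothed IDF formula is a textbook-style assumption, not a claim about what any particular Lucene version computes.

```python
import math

# Variants from the synonym line "TV, TV set, TV foo, television", lowercased.
VARIANTS = [["tv"], ["tv", "set"], ["tv", "foo"], ["television"]]
ALL_VARIANT_TOKENS = sorted({tok for variant in VARIANTS for tok in variant})
SINGLE_TOKEN_TRIGGERS = {variant[0] for variant in VARIANTS if len(variant) == 1}


def expand(tokens):
    """Naive index-time expansion (an assumption for illustration):
    if a single-token variant occurs, append every token of every
    variant that is not already present in the document."""
    if SINGLE_TOKEN_TRIGGERS & set(tokens):
        return tokens + [t for t in ALL_VARIANT_TOKENS if t not in tokens]
    return tokens


def doc_freq(term, docs):
    """Number of documents that contain the term."""
    return sum(1 for tokens in docs if term in tokens)


def idf(term, docs):
    """Smoothed IDF: log(N / (1 + df)) + 1."""
    return math.log(len(docs) / (1 + doc_freq(term, docs))) + 1


# Hypothetical toy corpus, already tokenized.
docs = [
    ["television", "repair"],
    ["television", "guide"],
    ["tv", "antenna"],
    ["foo", "fighters", "album"],
    ["radio", "set"],
    ["garden", "tools"],
]
expanded = [expand(tokens) for tokens in docs]

for term in ("tv", "television", "set", "foo"):
    print(term,
          "df before:", doc_freq(term, docs),
          "df after:", doc_freq(term, expanded),
          "idf before:", round(idf(term, docs), 3),
          "idf after:", round(idf(term, expanded), 3))
```

In this toy setup 'tv' and 'television' end up with the same document frequency, while 'set' and 'foo' become more frequent (and so get a lower IDF) even in documents that never mentioned them.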
> 
> And finally: there are use cases out there where people NEED to switch off
> synonym expansion at will (find only those documents that contain the word
> 'TV' and not that bloody 'foo'). This cannot be done if the index contains
> all synonym terms (unless you have a way to mark the original and the
> synonym in the index).
> 
> roman
> 
> 
> On Wed, Dec 12, 2012 at 12:50 PM, Walter Underwood 
> <wun...@wunderwood.org>wrote:
> 
>> Query parsers cannot fix the IDF problem or make query-time synonyms
>> faster. Query synonym expansion makes more search terms. More search terms
>> are more work at query time.
>> 
>> The IDF problem is real; I've run up against it. The rarest variant of
>> the synonym has the highest score. This is probably the opposite of what you
>> want. For me, it was "TV" and "television". Documents with "TV" had higher
>> scores than those with "television".
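
A minimal sketch of that effect, in Python rather than Solr. The corpus, the tf*idf scoring, and the smoothed IDF formula are illustrative assumptions; real Lucene scoring has more factors (length norms, coordination, and so on).

```python
import math

# Hypothetical toy corpus: "television" is common, "tv" is rare.
docs = {
    1: ["television", "stand"],
    2: ["television", "remote"],
    3: ["television", "cable"],
    4: ["tv", "stand"],
    5: ["radio", "cable"],
}
SYNONYMS = {"tv", "television"}


def idf(term, docs):
    """Smoothed IDF: log(N / (1 + df)) + 1."""
    df = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(len(docs) / (1 + df)) + 1


def score(query_terms, tokens, docs):
    """Sum of tf * idf over the query terms, with no length normalization."""
    return sum(tokens.count(t) * idf(t, docs) for t in query_terms)


# Query-time expansion: the user typed "television", the filter adds "tv".
expanded_query = ["television", "tv"]
scores = {doc_id: score(expanded_query, tokens, docs)
          for doc_id, tokens in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print("query-time expansion, best doc:", ranked[0])


def expand_at_index_time(tokens):
    """Add the missing synonym to any document that contains one of them."""
    if SYNONYMS & set(tokens):
        return tokens + sorted(SYNONYMS - set(tokens))
    return tokens


# Index-time expansion: both variants now have the same document frequency.
expanded_docs = {doc_id: expand_at_index_time(tokens)
                 for doc_id, tokens in docs.items()}
print("idf(tv) == idf(television):",
      idf("tv", expanded_docs) == idf("television", expanded_docs))
```

With query-time expansion the lone "tv" document outranks the three "television" documents, because "tv" has the lower document frequency. After index-time expansion both terms share one IDF, so that bias goes away in this toy model.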
>> 
>> wunder
>> 
>> On Dec 12, 2012, at 9:45 AM, Roman Chyla wrote:
>> 
>>> @wunder
>>> It is a misconception (well, supported by that wiki description) that the
>>> query-time synonym filter has these problems. It is actually the default
>>> parser that is causing these problems. Look at this if you still think
>>> that index-time synonyms are a cure for all:
>>> https://issues.apache.org/jira/browse/LUCENE-4499
>>> 
>>> @joe
>>> If you can use the flexible query parser (as linked in by @Swati) then all
>>> you need to do is to define a different field with a different tokenizer
>>> chain and then swap the field names before the analyzer processes the
>>> document (and then rewrite the field name back - for example, we have
>>> fields called "author" and "author_nosyn")
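
The field-swap idea can be sketched roughly as follows, in Python rather than as a Solr query parser. Everything here is hypothetical: the field names "text" and "text_nosyn", the routing rule (quoted input goes to the no-synonym field), and the toy analyzers. It only illustrates the routing, not how the flexible query parser is actually wired up.

```python
# Hypothetical synonym table, applied at query time only.
SYNONYMS = {"home": {"home", "house"}, "house": {"home", "house"}}

# Hypothetical documents; both fields are assumed to hold the same text.
docs = {1: "Home for sale", 2: "House for rent", 3: "Boat for sale"}


def index_tokens(text):
    """Index-time analysis shared by both fields: split and lowercase."""
    return {word.lower() for word in text.split()}


def analyze_query(term, field):
    """Query-time analysis: lowercase always; synonyms only for 'text'."""
    term = term.lower()
    if field.endswith("_nosyn"):
        return {term}
    return SYNONYMS.get(term, {term})


def search(raw_query):
    """Route quoted queries to the no-synonym field, everything else
    to the synonym-expanding field, then match against the index."""
    quoted = (len(raw_query) >= 2
              and raw_query[0] == '"' and raw_query[-1] == '"')
    field = "text_nosyn" if quoted else "text"
    wanted = analyze_query(raw_query.strip('"'), field)
    return sorted(doc_id for doc_id, text in docs.items()
                  if wanted & index_tokens(text))


print(search('home'))
print(search('"home"'))
```

The unquoted query matches both the "home" and the "house" document; the quoted one matches only the document that literally contains "home", while lowercasing still applies in both cases.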
>>> 
>>> roman
>>> 
>>> On Wed, Dec 12, 2012 at 12:38 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>>> 
>>>> Query-time synonyms have known problems. They are slower, cause incorrect
>>>> IDF, and don't work for phrase synonyms.
>>>> 
>>>> Apply synonyms at index time and you will have none of those problems.
>>>> 
>>>> See:
>>>> 
>>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>>>> 
>>>> wunder
>>>> 
>>>> On Dec 12, 2012, at 9:34 AM, Swati Swoboda wrote:
>>>> 
>>>>> Query-time analyzers are still applied, even if you include a string in
>>>>> quotes. Would you expect "foo" to not match "Foo" just because it's
>>>>> enclosed in quotes?
>>>>> 
>>>>> Also look at this, someone who had similar requirements:
>>>>> 
>>>>> http://lucene.472066.n3.nabble.com/Synonym-Filter-disable-at-query-time-td2919876.html
>>>>> 
>>>>> 
>>>>> -----Original Message-----
>>>>> From: joe.cohe...@gmail.com [mailto:joe.cohe...@gmail.com]
>>>>> Sent: Wednesday, December 12, 2012 12:09 PM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: Can a field with defined synonym be searched without the synonym?
>>>>> 
>>>>> 
>>>>> I'm applying only query-time synonyms, so I have the original values
>>>>> stored and indexed.
>>>>> I would've expected that if I search for a string in quotation marks, I'll
>>>>> get the exact match, without applying a synonym.
>>>>> 
>>>>> any way to achieve that?
>>>>> 
>>>>> 
>>>>> Upayavira wrote
>>>>>> You can only search against terms that are stored in your index. If
>>>>>> you have applied index time synonyms, you can't remove them at query
>>>>>> time.
>>>>>> 
>>>>>> You can, however, use copyField to clone an incoming field to another
>>>>>> field that doesn't use synonyms, and search against that field
>>>>>> instead.
>>>>>> 
>>>>>> Upayavira
>>>>>> 
>>>>>> On Wed, Dec 12, 2012, at 04:26 PM, joe.cohen.m@ wrote:
>>>>>>> Hi
>>>>>>> I have a field type with a defined synonym.txt which retrieves both
>>>>>>> records with "home" and "house" when I search either one of them.
>>>>>>> 
>>>>>>> I want to be able to search this field on the specific value that I
>>>>>>> enter, without the synonym filter.
>>>>>>> 
>>>>>>> is it possible?
>>>>>>> 
>>>>>>> thanks.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> 
>>>>>>> http://lucene.472066.n3.nabble.com/Can-a-field-with-defined-synonym-be-searched-without-the-synonym-tp4026381.html
>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> View this message in context:
>>>> 
>>>>> http://lucene.472066.n3.nabble.com/Can-a-field-with-defined-synonym-be-searched-without-the-synonym-tp4026381p4026405.html
>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>> 
>>>> --
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> --
>> Walter Underwood
>> wun...@wunderwood.org
>> 
>> 
>> 
>> 

--
Walter Underwood
wun...@wunderwood.org


