Furthermore, I would like to add that it's not just the 'highlight matches'
functionality that is horribly broken here; the output of the analysis
itself is misleading.

Let's say I take 'textTight' from the example and add the following synonym:

this is broken => broke

The query-time analysis is wrong: it clearly shows SynonymFilter
collapsing "this is broken" to "broke", but in reality, with the query parser
for that field, you are going to get 3 separate token streams and this will
never actually happen (because the query parser splits on whitespace first).

So really the output from 'Query Analyzer' is completely bogus.
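
To make the mismatch concrete, here is a toy sketch in Python. It is not Solr or Lucene code: the function names (synonym_filter, analyze, parse_query) and the in-memory synonym table are made up for illustration, and it ignores every other filter in 'textTight'. It only models the two paths described above: the analysis page feeding the whole string through one analyzer call, versus a query parser that splits on whitespace and analyzes each chunk separately.

```python
# Toy model only: hypothetical names, not the Solr/Lucene API.
# The single rule mirrors the synonym line above: this is broken => broke
SYNONYMS = {("this", "is", "broken"): ["broke"]}


def synonym_filter(tokens, rules=SYNONYMS):
    """Replace the longest matching multi-token rule at each position."""
    out, i = [], 0
    max_len = max(len(key) for key in rules)
    while i < len(tokens):
        # Try the longest candidate window first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in rules:
                out.extend(rules[key])
                i += n
                break
        else:
            # No rule matched at this position: pass the token through.
            out.append(tokens[i])
            i += 1
    return out


def analyze(text):
    """Analysis-page view: the whole string goes through one analyzer call."""
    return synonym_filter(text.lower().split())


def parse_query(text):
    """Whitespace-splitting query parser: each chunk is analyzed on its own,
    so the filter never sees more than one token at a time."""
    terms = []
    for chunk in text.split():
        terms.extend(analyze(chunk))
    return terms


print(analyze("this is broken"))      # what the analysis page implies
print(parse_query("this is broken"))  # what the query parser path produces
```

In this sketch the first call collapses the phrase to a single token, while the second leaves three separate terms, because the multi-word rule can never match a one-token stream.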

On Wed, Aug 4, 2010 at 1:57 PM, Robert Muir <rcm...@gmail.com> wrote:

>
>
> On Wed, Aug 4, 2010 at 1:45 PM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
>
>>
>> it really only attempts to identify when there is overlap between
>> analysis at query time and at indexing time so you can easily spot when
>> one analyzer or the other "breaks" things so that they no longer line up
>> (or when it "fixes" things so they start to line up)
>>
>
> It attempts badly, because it only "works" in the most trivial of cases
> (e.g. it doesn't reflect the interaction of the query parser with
> multi-word synonyms or WordDelimiterFilter).
>
> Since Solr includes these non-trivial analysis components *in the example*,
> it means that this 'highlight matches' doesn't actually even really work at
> all.
>
> Someone is going to use this thing when they don't understand why analysis
> isn't doing what they want, i.e. in cases like the ones I outlined above.
>
> For the trivial cases where it does "work", the 'highlight matches' isn't
> useful anyway, so in its current state it's completely unnecessary.
>
>
>> Even if we eliminated that highlighting as misleading, people would still
>> do it in their minds, it would just be harder -- it doesn't change the
>> underlying fact that analysis is only part of the picture.
>>
>
> I'm not suggesting that. I'm suggesting fixing the highlighting so it's not
> misleading. There are really only two choices:
> 1. remove the current highlighting
> 2. fix it.
>
> In its current state it's completely useless and misleading, except for very
> trivial cases, in which you don't need it anyway.
>
>
>>
>> : it would be better if it put your text in a memoryindex and actually parsed
>> : the query w/ queryparser, ran it, and used the highlighter to try to show
>> : any matches.
>>
>> That level of "query explanation" really only works if the user gives us a
>> full document (all fields, not just one) and a full query string, and all
>> of the possible query params -- because the query parser (either implicit
>> because of config, or explicitly specified by the user) might change its
>> behavior based on those other params.
>>
>
> That's true, but I don't see why the user couldn't be allowed to provide just
> this.
> I'd bet money a lot of people are using this thing with a specific
> query/document in mind anyway!
>
>
>> people can infer from that page.  As I said, I don't think removing the
>> "match" highlighting will actually reduce confusion, but perhaps there is
>> verbiage/disclaimers that could be added to make it more clear?
>>
>
> As I said before, I think I disagree with you. I think for stuff like this
> the technicals are less important; what's important is that this is a
> misleading checkbox that really confuses users.
>
> I suggest disabling it entirely; you are only going to remove confusion.
>
>
> --
> Robert Muir
> rcm...@gmail.com
>



-- 
Robert Muir
rcm...@gmail.com
