+1
I just had occasion to debug something where the interaction between the 
queryparser and the analyzer produced *interesting* results.  Having a separate 
jsp that includes the whole chain (i.e. analyzer/tokenizer/filter and qp) would 
be great!

Tom

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com] 
Sent: Friday, August 13, 2010 5:19 AM
To: solr-user@lucene.apache.org
Subject: Re: analysis tool vs. reality

Maybe, separate from analysis.jsp (showing only how text is analyzed),
Solr needs a debug page showing the steps the field's QueryParser goes
through on a given query, to debug such tricky QueryParser/Analyzer
interactions?

We could make a wrapper around the analyzer that records each text
fragment sent to it by the QueryParser, as a start.  It'd be great to
also see it spelled out how that then resulted in a particular part of
the query.  So for query "ABC12 FOO" you'd see that ABC12 was sent to
analyzer, it returned two tokens (ABC, 12), and then QueryParser made
a PhraseQuery from that, and then FOO was sent, and that turned into
TermQuery, and default op was AND and so a toplevel BooleanQuery with
2 MUST terms was created...
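
A rough sketch of that recording idea, as a toy model in Python rather than real Lucene/Solr code. Everything here is hypothetical and only for illustration: `RecordingAnalyzer`, `toy_analyzer` (which splits on letter/digit boundaries), and `toy_parse` (which splits on whitespace first, then builds tuples named after the Lucene query types). It only mirrors the "ABC12 FOO" walk-through above.

```python
import re


class RecordingAnalyzer:
    """Wraps an analyzer callable and records every fragment sent to it."""

    def __init__(self, analyze):
        self.analyze = analyze
        self.calls = []  # list of (fragment, tokens) in call order

    def __call__(self, text):
        tokens = self.analyze(text)
        self.calls.append((text, tokens))
        return tokens


def toy_analyzer(text):
    # Hypothetical analyzer: split into runs of letters and runs of digits.
    return re.findall(r"[A-Za-z]+|[0-9]+", text)


def toy_parse(query, analyzer, default_op="AND"):
    # Hypothetical parser: whitespace is split on *before* analysis,
    # so the analyzer only ever sees one fragment at a time.
    occur = "MUST" if default_op == "AND" else "SHOULD"
    clauses = []
    for fragment in query.split():
        tokens = analyzer(fragment)
        if not tokens:
            continue  # analyzer removed everything, no clause
        if len(tokens) == 1:
            clauses.append((occur, ("TermQuery", tokens[0])))
        else:
            clauses.append((occur, ("PhraseQuery", tokens)))
    return ("BooleanQuery", clauses)


analyzer = RecordingAnalyzer(toy_analyzer)
query = toy_parse("ABC12 FOO", analyzer)

# Show what the parser actually handed to the analyzer, then the result.
for fragment, tokens in analyzer.calls:
    print(fragment, "->", tokens)
print(query)
```

The recorded calls are what a debug page could display next to the final query, which is the part analysis.jsp cannot show because it passes the whole string to the analyzer in one piece.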

Mike

On Thu, Aug 12, 2010 at 8:39 PM, Robert Muir <rcm...@gmail.com> wrote:
> On Thu, Aug 12, 2010 at 8:07 PM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
>
>>
>> : > You say it's bogus because the qp will divide on whitespace first --
>> : > but you're assuming you know what query parser will be used ... the
>> : > "field" query parser (to name one) doesn't split on whitespace first.
>> : > That's my point: analysis.jsp doesn't make any assumptions about what
>> : > query parser *might* be used, it just tells you what your analyzers do
>> : > with strings.
>> : >
>> :
>> : you're right, we should just fix the bug that the queryparser tokenizes
>> : on whitespace first. then analysis.jsp will be significantly less
>> : confusing.
>>
>> dude .. not trying to get into a holy war here
>
> actually I'm suggesting the practical solution: that we fix the primary
> problem that makes it confusing.
>
>
>> even if you change the Lucene QueryParser so that whitespace isn't a meta
>> character it doesn't affect the underlying issue: analysis.jsp is agnostic
>> about QueryParsers.
>
>
> analysis.jsp isn't agnostic about queryparsers, it's ignorant of them, and
> your default queryparser is actually a de-facto whitespace tokenizer, don't
> try to sugarcoat it.
>
> --
> Robert Muir
> rcm...@gmail.com
>
