Erick Erickson created SOLR-12136:
-------------------------------------

             Summary: Document hl.q parameter
                 Key: SOLR-12136
                 URL: https://issues.apache.org/jira/browse/SOLR-12136
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: documentation
            Reporter: Erick Erickson
            Assignee: Erick Erickson


*********Original issue:
If I specify:

hl.fl=f1&hl.q=something

then "something" is analyzed against the default field (df) rather than f1.

So in this particular case, f1 did some diacritic folding
(GermanNormalizationFilterFactory specifically). But my guess is that
the df was still "text", or at least something that didn't reference
that filter.
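
A toy illustration of why the choice of field matters, using the example
term "Kündigung" from this report. This is only a sketch:
fold_diacritics() is a hypothetical stand-in based on Unicode
decomposition, not the actual GermanNormalizationFilterFactory behavior,
and the df field is assumed to do no folding at all.

```python
import unicodedata


def fold_diacritics(text):
    """Drop combining marks after canonical decomposition (rough stand-in only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_analyze(term, folding):
    """Hypothetical analysis chain: fold diacritics only if the field does so."""
    return fold_diacritics(term) if folding else term


# f1 folds diacritics at index time, so the indexed term has no umlaut.
indexed_in_f1 = toy_analyze("Kündigung", folding=True)

# The same text parsed against f1 matches the indexed term...
parsed_against_f1 = toy_analyze("Kündigung", folding=True)

# ...but parsed against a df without folding (an assumption here), it does not.
parsed_against_df = toy_analyze("Kündigung", folding=False)

print(indexed_in_f1, parsed_against_f1 == indexed_in_f1,
      parsed_against_df == indexed_in_f1)
```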

In what follows, "worked" means getting highlighting on "Kündigung".

so
Kündigung was indexed as Kundigung

So far so good. Now if I try to highlight on f1:

These work:
q=f1:Kündigung&hl.fl=f1
q=f1:Kündigung&hl.fl=f1&hl.q=Kundigung <= NOTE, without umlaut
q=f1:Kündigung&hl.fl=f1&hl.q=f1:Kündigung <= NOTE, with umlaut

This does not work:
q=f1:Kündigung&hl.fl=f1&hl.q=Kündigung <= NOTE, with umlaut

Testing this locally, I'd get the highlighting if I defined df as "f1"
in all the above cases.
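
For anyone wanting to reproduce the cases above, a minimal sketch of
building the request URLs. The base URL and collection name are
hypothetical, and hl=true is added on the assumption that highlighting
is not already enabled in the request handler defaults. The comments
only restate the outcomes reported above.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical endpoint; adjust host and collection for a real setup.
BASE = "http://localhost:8983/solr/mycollection/select"


def highlight_url(q, hl_fl, hl_q=None, df=None):
    """Build a select URL with highlighting, optionally setting hl.q and df."""
    params = {"q": q, "hl": "true", "hl.fl": hl_fl}
    if hl_q is not None:
        params["hl.q"] = hl_q
    if df is not None:
        params["df"] = df
    # urlencode percent-encodes the umlaut as UTF-8 and escapes ':'.
    return BASE + "?" + urlencode(params)


# Reported to highlight: hl.q names the field explicitly.
works = highlight_url("f1:Kündigung", "f1", hl_q="f1:Kündigung")

# Reported not to highlight: the bare term is parsed against df, not f1.
fails = highlight_url("f1:Kündigung", "f1", hl_q="Kündigung")

# Reported to highlight once df is set to f1.
fixed = highlight_url("f1:Kündigung", "f1", hl_q="Kündigung", df="f1")

for url in (works, fails, fixed):
    print(url)
```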

**********David Smiley's analysis
BTW hl.q is parsed by the hl.qparser param which defaults to the defType param 
which defaults to "lucene".
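
The fallback chain just described can be sketched as a small lookup.
effective_hl_qparser() is a hypothetical helper written for illustration,
not a Solr API; it only mirrors the order stated above.

```python
def effective_hl_qparser(params):
    """Pick the parser for hl.q: hl.qparser, else defType, else "lucene"."""
    return params.get("hl.qparser") or params.get("defType") or "lucene"


# No parser parameters at all: falls through to the "lucene" default.
print(effective_hl_qparser({"q": "f1:Kündigung", "hl.q": "Kündigung"}))

# defType applies to hl.q as well unless hl.qparser overrides it.
print(effective_hl_qparser({"defType": "edismax"}))
print(effective_hl_qparser({"defType": "edismax", "hl.qparser": "lucene"}))
```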

In common cases, I think this is a non-issue.  One common case is 
defType=edismax and you specify a list of fields in 'qf' (thus your query has 
parts parsed on various fields) and then you set hl.fl to some subset of those 
fields.  This will use the correct analysis.

You make a compelling point in terms of what a user might expect -- my gut 
reaction aligned with your expectation and I thought maybe we should change 
this.  But it's not as easy as it seems at first blush, and there are bad 
performance implications.  How do you *generically* tell an arbitrary query 
parser which field it should parse the string with?  We have no such standard.  
And let's say we did; then we'd have to re-parse the query string for each field 
in hl.fl (and consider hl.fl might be a wildcard!).  Perhaps both are solvable or 
constrainable with yet more parameters, but I'm pessimistic it'll be a better 
outcome.

The documentation ought to clarify this matter.  Probably in hl.fl, to say that 
the fields listed are analyzed with the analysis chain of their field type, and 
that this ought to be "compatible" (the same or similar) with the analysis that 
was used when the query was parsed.

Perhaps, like spellcheck's spellcheck.collateParam.* param prefix, highlighting 
could add a means to specify additional parameters for hl.q to be parsed (not 
just the choice of query parser).  This isn't particularly pressing though 
since this can easily be added to the front of hl.q like:

hl.q={!edismax qf=$hl.fl v=$q}
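
A minimal sketch of attaching that local-params workaround to a request.
with_hl_q_workaround() is a hypothetical helper, and the sample
parameters are illustrative only. Whether a particular hl.fl value (a
wildcard, for instance) is acceptable as qf is not settled in this
thread, so treat that as an open assumption.

```python
from urllib.parse import urlencode, parse_qs

# The hl.q value suggested above: re-parse $q with edismax over the hl.fl fields.
HL_Q_WORKAROUND = "{!edismax qf=$hl.fl v=$q}"


def with_hl_q_workaround(params):
    """Return a copy of params with hl.q set to the local-params workaround."""
    out = dict(params)
    out["hl.q"] = HL_Q_WORKAROUND
    return out


params = with_hl_q_workaround({"q": "Kündigung", "hl": "true", "hl.fl": "f1"})

# urlencode escapes the braces, '!', '$' and spaces so the value survives the URL.
query_string = urlencode(params)
print(query_string)
```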




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
