Hi Erick,
BTW hl.q is parsed by the query parser named in the hl.qparser param, which
defaults to the defType param, which in turn defaults to "lucene".
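As a rough illustration of that fallback chain, here is a small Python sketch. It only mirrors the behavior described above and is not Solr's actual code; the function name resolve_hl_qparser is made up for the example.

```python
def resolve_hl_qparser(params):
    """Hypothetical helper: pick the parser used for hl.q.

    Mirrors the fallback described above:
    hl.qparser -> defType -> "lucene".
    """
    return params.get("hl.qparser") or params.get("defType") or "lucene"


# No parser-related params at all: falls through to the default.
print(resolve_hl_qparser({}))
# defType is set, hl.qparser is not: hl.q follows defType.
print(resolve_hl_qparser({"defType": "edismax"}))
# hl.qparser is set explicitly: it wins over defType.
print(resolve_hl_qparser({"defType": "edismax", "hl.qparser": "lucene"}))
```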
In common cases, I think this is a non-issue. One common case is
defType=edismax and you specify a list of fields in 'qf' (thus your query
has parts parsed on various fields) and then you set hl.fl to some subset
of those fields. This will use the correct analysis.
You make a compelling point in terms of what a user might expect -- my gut
reaction aligned with your expectation and I thought maybe we should change
this. But it's not as easy as it seems at first blush, and there are bad
performance implications. How do you *generically* tell an arbitrary query
parser which field it should parse the string with? We have no such
standard. And let's say we did; then we'd have to re-parse the query string
for each field in hl.fl (and consider that hl.fl might be a wildcard!).
Perhaps both are solvable or constrainable with yet more parameters, but I'm
pessimistic it'll be a better outcome.
The documentation ought to clarify this matter, probably under hl.fl: it
should say that the fields listed are analyzed with the analyzer of their own
field type, and that this analysis ought to be "compatible" with (the same as
or similar to) the analysis used when the query was parsed.
Perhaps, like spellcheck's spellcheck.collateParam.* param prefix,
highlighting could add a means to specify additional parameters for hl.q to
be parsed (not just the choice of query parsers). This isn't particularly
pressing though since this can easily be added to the front of hl.q like
hl.q={!edismax qf=$hl.fl v=$q}
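To make that workaround concrete, here is a minimal Python sketch of how a client might assemble such a request. The helper name build_highlight_params and the field name f1 are only illustrative; the hl.q value is the one shown above. The sketch only builds the query string and does not contact Solr. One assumption to flag: qf expects whitespace-separated field names, so reusing $hl.fl this way is simplest when hl.fl names a single field.

```python
from urllib.parse import urlencode, parse_qs


def build_highlight_params(q, hl_fl):
    """Hypothetical helper: request params where hl.q is parsed by
    edismax against the highlighted field(s) instead of the default field."""
    return {
        "q": q,
        "hl": "true",
        "hl.fl": hl_fl,
        # Local params: parse the main query string ($q) with edismax,
        # using the highlight field list ($hl.fl) as qf.
        "hl.q": "{!edismax qf=$hl.fl v=$q}",
    }


params = build_highlight_params("Kündigung", "f1")
# urlencode percent-encodes the braces, '$', '=', spaces, and the umlaut.
query_string = urlencode(params)
print(query_string)

# Round-trip check: decoding yields the original hl.q value again.
decoded = parse_qs(query_string)
print(decoded["hl.q"][0])
```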
~ David
On Fri, Mar 23, 2018 at 4:40 PM Erick Erickson <[email protected]>
wrote:
> I was responding to a thread on the user's list and saw this. I didn't
> see a JIRA. I tested with 7.1, all defaults.
>
> It seems odd to me (apparently, I haven't traced the code) that
> if I specify:
>
> hl.fl=f1&hl.q=something
>
> then "something" is analyzed against the default field rather than f1
>
> So in this particular case, f1 did some diacritic folding
> (GermanNormalizationFilterFactory specifically). But my guess is that
> the df was still "text", or at least something that didn't reference
> that filter.
>
> I'm defining "worked" in what follows as getting highlighting on
> "Kündigung"
>
> so
> Kündigung was indexed as Kundigung
>
> So far so good. Now if I try to highlight on f1
>
> These work
> q=f1:Kündigung&hl.fl=f1
> q=f1:Kündigung&hl.fl=f1&hl.q=Kundigung <= NOTE, without umlaut
>
> This does not work
> q=f1:Kündigung&hl.fl=f1&hl.q=Kündigung <= NOTE, with umlaut
>
> Testing this locally, I'd get the highlighting if I defined df as "f1"
> in all the above cases.
>
> Worth a JIRA? Or a doc note?
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com