[
https://issues.apache.org/jira/browse/SOLR-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated SOLR-6160:
-------------------------------
Attachment:
SOLR-6160__bugfix_when_facet_query_or_range_with_group_facets_and_distributed.patch
Here's the patch against 4.7. Two lines were replaced with one. Solr's tests
pass and I would be surprised if this doesn't fix it on the production in
system but it'll take time to deploy the fix to see.
One thing about this simple patch I don't like is that group.field is having an
effect even though group=true isn't on the request. That actually was the case
before this patch as well since both callers of the method I changed simply
looked for the group.field param. This isn't a big deal; arguably the client
shouldn't be sending them in the first place. But nonetheless, it would be
nice if a sharded request could have its parameters stripped (by prefix) so as
not to have side-effects like this. Doing so would then raise the question of
how, in a GET_FIELDS phase, a facet query could get a grouped count even though
we don't _actually_ want to group the results? _<<shrug>>_ Thoughts welcome.
> Exception possible with group.facet, range faceting, with distributed search
> ----------------------------------------------------------------------------
>
> Key: SOLR-6160
> URL: https://issues.apache.org/jira/browse/SOLR-6160
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.7.1
> Reporter: David Smiley
> Attachments:
> SOLR-6160__bugfix_when_facet_query_or_range_with_group_facets_and_distributed.patch
>
>
> I'm seeing a hard to reproduce bug when the following conditions are true:
> * Distributed search
> * group=true with group.field=xxx and group.facet=true
> * facet=true with facet.field and facet.range
> On a sharded request (isShard=true, distrib=false) that has
> requestPurpose=GET_FIELDS, *sometimes* facet=true but sometimes it isn't.
> Apparently, sometimes the earlier GET_FACETS phase satisfies the faceting
> alone and sometimes more is done in GET_FIELDS. So *if* facet=true on such a
> request *and* facet.range is set (or perhaps facet.query), then the bug will
> hit. Specifically both the facet.range and facet.query logic will
> conditionally call SimpleFacets.getGroupedFacetQueryCount, and both will
> conditionally do so when they detect that "group.field" has been set. BUT,
> for a GET_FIELDS shard request, the "group" parameter flag is explicitly
> removed from the request by StoredFieldsShardRequestFactory, effectively
> disabling grouping. So SimpleFacets.getGroupedFacetQueryCount will throw an
> error. The error is that "group.field" hasn't been set which technically
> isn't true.
> It's 100% reproducible on my customer's system. Reproducing it is tricky
> because it's not going to happen if the faceting logic doesn't happen on
> GET_FIELDS, which doesn't seem to happen often. I found that for a test
> query if I changed the facet.limit to a sufficiently high number then it
> trips, but if it's low then it doesn't. I presume this has something to do
> with refining faceting counts such that a higher facet.limit increases the
> chance that the coordinating node (what do we call that?) will need to ask a
> shard for more counts beyond which was provided on the initial GET_FACETS
> phase.
> If anyone has pointers on how to reproduce this (such as in a test!), then
> that would help.
> Even though I don't have 100% understanding of the bug and have yet to
> reproduce it with test data, it seems to me the bug might be as simple as
> having SimpleFacets.getGroupedFacetQueryCount retrieve the group.field
> parameter directly from parameters instead of possibly failing to find it in
> rb.GetGroupingSpec() (because "group" wasn't set). After all, that is how
> the callers of this method determine wether or not they want to get a grouped
> query count.
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]