[
https://issues.apache.org/jira/browse/SOLR-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804701#comment-14804701
]
Hoss Man commented on SOLR-6168:
--------------------------------
bq. The intention of the FieldComparator.value API is that you would use it to
look up a value after all collection is complete, e.g. to fill the fields to
create the FieldDoc for this hit, and so the comparator would naturally not be
changing it anymore.
ok, but that's not clear/enforced anywhere in the API itself -- if you are only
suppose to fetch back all the slots after the search is complete, why support
fetching the individual slots (and compareValues()) at all?
bq. But I don't really understand what SOLR-6168 is trying to do ... maybe
there is another way to do it w/o abusing this API?
In a nutshell the CollapseQParser is another style of "group collapsing" where
it picks a single best representative doc for each group on the fly as docs are
collected -- currently it does this only based on a single field via DocValues,
SOLR-6168 is about making this work with any arbitrary "Sort" object. The code
in the patch (in SOLR-6168) is analogous to the type of logic that
FieldValueHitQueue does, but instead of tracking the "N best docs" (according
to the sort) in a single priority queue (which re-orders on the fly), it tracks
the *1* best doc for every "group" as it makes a single pass collection uses
value() + compareValues to know when a "better" doc has been found for that
group.
(The most trivial implementation would be to use a FieldValueHitQueue of size
'1' for each group, but this approach unwinds the logic a track the values
directly. When i was first looking at this approach, i initially considered
using a FieldComparator "slot" for each group, but the number of groups isn't
neccessarly known in advance and there's no way to "grow" a FieldComparator)
It doesn't feel like an abuse of the FieldComparator API to do this ... every
other Comparator works fine as is. As i mentioned I could always put hacks in
the caller code (ColapseQParser) to do instanceof/casting/deepCopy when the
FieldComparator returns BytesRefs -- but fixing Term(Ord)ValComparator.value to
return "safe" objects seemed saner.
---
Forgetting SOLR-6168, even if you asssume that all callers should know that
FieldComparator.value is only intended to be usable after the end of a search,
the BytesRefs returned by Term(Ord)ValComparator.value don't really seem to be
safe for a caller to hang on to (for later use searchAfter, etc...) in any
situation where the FieldComparator itself might get re-used -- ie: re-using a
FieldValueHitQueue (or Collector that uses FieldValueHitQueue) in a subsequent
search.
what protects the user from those values being changed out from under them in
that situation?
> enhance collapse QParser so that "group head" documents can be selected by
> more complex sort options
> ----------------------------------------------------------------------------------------------------
>
> Key: SOLR-6168
> URL: https://issues.apache.org/jira/browse/SOLR-6168
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 4.7.1, 4.8.1
> Reporter: Umesh Prasad
> Assignee: Joel Bernstein
> Attachments: CollapsingQParserPlugin-6168.patch.1-1stcut,
> SOLR-6168-group-head-inconsistent-with-sort.patch, SOLR-6168.patch,
> SOLR-6168.patch
>
>
> The fundemental goal of this issue is add additional support to the
> CollapseQParser so that as an alternative to the existing min/max localparam
> options, more robust sort syntax can be used to sort on multiple criteria
> when selecting the "group head" documents used to represent each collapsed
> group.
> Since support for arbitrary, multi-clause, sorting is almost certainly going
> to require more RAM then the existing min/max functionaly, this new
> functionality should be in addition to the existing min/max localparam
> implementation, not a replacement of it.
> (NOTE: early comments made in this jira may be confusing in historical
> context due to the way this issue was originally filed as a bug report)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]