[
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564966#action_12564966
]
Charles Hornberger commented on SOLR-236:
-----------------------------------------
Ah ... got the beginnings of a diagnosis. The problem appears when the DocSet
{{qDocSet}} returned by DocSetHitCollector.getDocSet() is a BitDocSet, and not
when it's a HashDocSet. (That call happens inside getDocListAndSetNC(), at
org.apache.solr.search.SolrIndexSearcher:1101 in trunk, or 1108 with the
field_collapsing patch applied.) As the stack trace above shows, calling
intersection() on a BitDocSet object invokes the superclass's
DocSetBase.intersection() method, which starts a call chain that blows up when
it hits the iterator() method of the NegatedDocSet passed in as the {{filter}}
parameter to getDocListAndSetNC(). NegatedDocSet.iterator() throws by design:
{{
public DocIterator iterator() {
  throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
      "Unsupported Operation");
}
}}
I see that DocSetBase.intersection(DocSet other) has special-casing logic for
{{other}} parameters that are instances of HashDocSet; does it also need
special-casing logic for {{other}} parameters that are NegatedDocSets? Or
should NegatedDocSet *really* implement iterator()? Or something else
entirely?
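
To make the first option concrete, here is a rough sketch of what such special casing might look like. It does not use the real Solr classes: SimpleDocSet, ArrayDocSet, NegatedSet and IntersectionSketch are hypothetical stand-ins I made up for illustration, and docs() plays the role of iterator(). The idea is only that when {{other}} cannot be enumerated, the intersection can enumerate {{this}} instead and use {{other}} purely for membership tests.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-ins for Solr's DocSet types; NOT the real API.
interface SimpleDocSet {
    boolean exists(int docId);  // membership test
    int[] docs();               // enumerate members (plays the role of iterator())
}

// A plain set backed by an array of doc ids.
final class ArrayDocSet implements SimpleDocSet {
    private final int[] ids;

    ArrayDocSet(int... ids) {
        this.ids = ids.clone();
    }

    public boolean exists(int docId) {
        for (int id : ids) {
            if (id == docId) {
                return true;
            }
        }
        return false;
    }

    public int[] docs() {
        return ids.clone();
    }
}

// "Everything except the wrapped set": membership is cheap, enumeration is refused.
final class NegatedSet implements SimpleDocSet {
    private final SimpleDocSet excluded;

    NegatedSet(SimpleDocSet excluded) {
        this.excluded = excluded;
    }

    public boolean exists(int docId) {
        return !excluded.exists(docId);
    }

    public int[] docs() {
        throw new UnsupportedOperationException("Unsupported Operation");
    }
}

final class IntersectionSketch {
    // Mirrors the failure mode: enumerates `other`, which a negated set refuses.
    static int[] naiveIntersection(SimpleDocSet self, SimpleDocSet other) {
        return keepMembers(other.docs(), self);
    }

    // Special-cases a negated `other`: enumerate `self`, test membership in `other`.
    static int[] safeIntersection(SimpleDocSet self, SimpleDocSet other) {
        if (other instanceof NegatedSet) {
            return keepMembers(self.docs(), other);
        }
        return naiveIntersection(self, other);
    }

    private static int[] keepMembers(int[] candidates, SimpleDocSet membership) {
        return Arrays.stream(candidates).filter(membership::exists).toArray();
    }

    public static void main(String[] args) {
        SimpleDocSet query = new ArrayDocSet(1, 2, 3, 4);
        SimpleDocSet filter = new NegatedSet(new ArrayDocSet(2, 4));

        System.out.println(Arrays.toString(safeIntersection(query, filter))); // [1, 3]

        try {
            naiveIntersection(query, filter);
        } catch (UnsupportedOperationException e) {
            System.out.println("naive path failed: " + e.getMessage());
        }
    }
}
```

Whether the real fix belongs in DocSetBase.intersection() or in NegatedDocSet itself is still an open question; the sketch only shows that the intersection is computable without ever iterating the negated set, as long as the other operand can be enumerated.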
> Field collapsing
> ----------------
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.3
> Reporter: Emmanuel Keller
> Attachments: field-collapsing-extended-592129.patch,
> field_collapsing_1.1.0.patch, field_collapsing_1.3.patch,
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff,
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
> SOLR-236-FieldCollapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given
> field to a single entry in the result set. Site collapsing is a special case
> of this, where all results for a given web site is collapsed into one or two
> entries in the result set, typically with an associated "more documents from
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)