[
https://issues.apache.org/jira/browse/SOLR-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903776#comment-13903776
]
Rob Tulloh commented on SOLR-5709:
----------------------------------
Here is an example where we get the expected result (no duplicates involved).
In this case, 3 documents map to the same storageid parent. Note that in this
result, matches is 3 (the number of documents) and ngroups is 1 (the number of
unique groups):
{noformat}
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">194</int>
  <lst name="params">
    <str name="group.ngroups">true</str>
    <str name="group.limit">1000</str>
    <str name="isPartial">0</str>
    <str name="hl.simple.pre"><b></str>
    <str name="params">{hl.requireFieldMatch=true</str>
    <str name="hl.fl">*</str>
    <str name="wt">xml</str>
    <str name="hl">true</str>
    <str name="rows">1</str>
    <str name="EmsQueryId">INTERNAL</str>
    <str name="f.mailsubject2.qf">mailsubject</str>
    <str name="shards">archive-8.ems.labmanager.net:8983/solr,archive-6.ems.labmanager.net:8983/solr</str>
    <str name="start">0</str>
    <str name="q">customerid:352</str>
    <str name="f.body2.qf">body</str>
    <str name="group.field">storageid</str>
    <str name="hl.simple.post"></b></str>
    <str name="group">true</str>
    <str name="qt">/search-any</str>
    <str name="fq">{!lucene}storageid:{44414 TO 44415]</str>
    <str name="EmsQueryTs">1392658773339}</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="storageid">
    <int name="matches">3</int>
    <int name="ngroups">1</int>
    <arr name="groups">
      <lst>
        <long name="groupValue">44415</long>
        <result name="doclist" numFound="3" start="0" maxScore="2.8401346">
{noformat}
What puzzles me is why the result with duplicates doesn't group the same way.
This result clearly shows that grouping works when no duplicates are involved.
Is there an explanation for the difference?
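
To compare the two cases side by side, the counts can be read out of the grouped section of a response programmatically. The following is a minimal Python sketch, not part of the original report. It assumes a well-formed response: the `SAMPLE` string is a trimmed stand-in for the output above, with the `<response>` wrapper and the closing tags added by hand, and `summarize_grouped` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, hand-closed stand-in for the grouped section shown above.
# The <response> wrapper and the closing tags are assumptions; the
# values (3, 1, 44415) are the ones from the pasted output.
SAMPLE = """<response>
<lst name="grouped">
  <lst name="storageid">
    <int name="matches">3</int>
    <int name="ngroups">1</int>
    <arr name="groups">
      <lst>
        <long name="groupValue">44415</long>
        <result name="doclist" numFound="3" start="0" maxScore="2.8401346"/>
      </lst>
    </arr>
  </lst>
</lst>
</response>"""


def summarize_grouped(xml_text, field):
    """Return (matches, ngroups, {groupValue: numFound}) for one group.field."""
    root = ET.fromstring(xml_text)
    grouped = root.find(f"./lst[@name='grouped']/lst[@name='{field}']")
    if grouped is None:
        raise ValueError(f"no grouped section for field {field!r}")

    matches = int(grouped.find("./int[@name='matches']").text)
    ngroups = int(grouped.find("./int[@name='ngroups']").text)

    # Map each group value to the number of documents found in that group.
    per_group = {}
    for group in grouped.findall("./arr[@name='groups']/lst"):
        value = group.find("./*[@name='groupValue']").text
        doclist = group.find("./result[@name='doclist']")
        per_group[value] = int(doclist.get("numFound"))
    return matches, ngroups, per_group


if __name__ == "__main__":
    print(summarize_grouped(SAMPLE, "storageid"))
```

Running the same helper against the response from the duplicate case would make it easy to see which of the three values differs.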
> Highlighting grouped duplicate docs from different shards with group.limit > 1 throws ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-5709
> URL: https://issues.apache.org/jira/browse/SOLR-5709
> Project: Solr
> Issue Type: Bug
> Components: highlighter
> Affects Versions: 4.3, 4.4, 4.5, 4.6, 5.0
> Reporter: Steve Rowe
> Assignee: Steve Rowe
> Fix For: 4.7, 5.0
>
> Attachments: SOLR-5709.patch
>
>
> In a sharded (non-SolrCloud) deployment, if you index a document with the
> same unique key value into more than one shard, and then try to highlight
> grouped docs with more than one doc per group, where the grouped docs contain
> at least one duplicate doc pair, you get an AIOOBE.
> Here's the stack trace I got from such a situation, with 1 doc indexed into
> each shard in a 2-shard index, with {{group.limit=2}}:
> {noformat}
> ERROR null:java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.solr.handler.component.HighlightComponent.finishStage(HighlightComponent.java:185)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:328)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:758)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:412)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:202)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>     at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>     at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>     at org.eclipse.jetty.server.Server.handle(Server.java:368)
>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>     at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>     at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>     at java.lang.Thread.run(Thread.java:724)
> {noformat}
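
The trigger described in the quoted issue is a document with the same unique key value indexed into more than one shard. Whether an index is exposed to that condition can be checked ahead of time. The following is a minimal Python sketch under stated assumptions: the per-shard lists of unique keys are collected separately (for example by querying each shard on its own), and the function name, shard names, and key values are hypothetical. It does not reproduce or fix the exception; it only finds keys present on more than one shard.

```python
from collections import defaultdict


def find_cross_shard_duplicates(ids_by_shard):
    """Return {unique_key: [shards...]} for keys present on more than one shard.

    ids_by_shard maps a shard name to an iterable of unique key values
    indexed on that shard. Both the names and the values are placeholders.
    """
    shards_by_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_by_id[doc_id].add(shard)

    # Keep only the keys that appear on at least two distinct shards.
    return {
        doc_id: sorted(shards)
        for doc_id, shards in shards_by_id.items()
        if len(shards) > 1
    }


if __name__ == "__main__":
    example = {
        "shard1": ["a", "b"],
        "shard2": ["b", "c"],
    }
    print(find_cross_shard_duplicates(example))
```

A key repeated within a single shard is not reported, since the issue concerns duplicates across shards.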