[ https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557531#action_12557531 ]
Hoss Man commented on SOLR-303:
-------------------------------

bq. OK, this version patches cleanly and includes some distributed faceting code.

I haven't looked at it ... but holy freaking cow that's cool.

bq. Note that it is theoretically possible to miss terms. A term could be just below the threshold of each shard (and thus not returned by any shard), but the total count could boost it into the top. This could be rectified by retrieving all terms above a specified count, but it could be expensive. The counts that are currently returned are exact.

One solution I've seen to mitigate problems like this is to compute a higher "limit" when querying the individual shards. Someone somewhere suggested that n**2 is a good approach (but they may have been talking out of their ass), so if the initial request says facet.limit=5, the individual shards would be queried with facet.limit=25 ... but you'd still want to use refinement requests as well.

> Distributed Search over HTTP
> ----------------------------
>
>                 Key: SOLR-303
>                 URL: https://issues.apache.org/jira/browse/SOLR-303
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Sharad Agarwal
>            Assignee: Yonik Seeley
>         Attachments: distributed.patch, distributed.patch, distributed.patch, distributed.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, fedsearch.stu.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch
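
To make the over-request idea from the comment concrete, here is a rough sketch in Python. It is not the code in the attached patches: the names `shard_top_terms`, `distributed_facet`, and `SHARDS`, and the sample counts, are all made up for illustration. Refinement is also simplified to asking every shard for the exact count of every candidate term.

```python
from collections import Counter

# Hypothetical per-shard facet counts. Term "x" is never in any shard's
# top 2, yet it has the highest total across all shards.
SHARDS = [
    {"a": 10, "b": 9, "x": 8},
    {"c": 10, "d": 9, "x": 8},
    {"e": 10, "f": 9, "x": 8},
]


def shard_top_terms(shard_counts, limit):
    """Return one shard's top `limit` (term, count) pairs, ties broken by term."""
    ranked = sorted(shard_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


def distributed_facet(shards, limit, shard_limit):
    """Merge facet counts from all shards and return the global top `limit`.

    `shard_limit` is the limit sent to each shard; passing limit ** 2
    corresponds to the over-request suggestion in the comment.
    """
    # Phase 1: every shard reports its own top terms; these are the candidates.
    candidates = set()
    for shard in shards:
        for term, _count in shard_top_terms(shard, shard_limit):
            candidates.add(term)

    # Phase 2 (simplified refinement): get the exact count of each candidate
    # from every shard, so the counts that are returned are exact.
    totals = Counter()
    for shard in shards:
        for term in candidates:
            totals[term] += shard.get(term, 0)

    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


if __name__ == "__main__":
    limit = 2
    print(distributed_facet(SHARDS, limit, shard_limit=limit))
    print(distributed_facet(SHARDS, limit, shard_limit=limit ** 2))
```

With `shard_limit=2` the term `x` never becomes a candidate, so the merged top 2 misses it even though its total (24) is the highest. With `shard_limit=4` (that is, 2**2) every shard reports `x` and it comes out on top. Over-requesting only makes a miss less likely; a term sitting below the larger per-shard cutoff on every shard would still be lost.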