Hello - We're doing something similar and ended up overriding QueryComponent 
(https://issues.apache.org/jira/browse/SOLR-7968), which first needs protected 
members instead of private ones. We could use a RankQuery and its cool 
MergeStrategy, but we would also need RankQuery to provide an entry point for 
QueryComponent.createMainQuery(). That would be ideal, because we could then 
use the Collector there for local deduplication, and a combination of 
createMainQuery and mergeIds for the distributed deduplication.
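
For what it's worth, the merge side of the distributed deduplication can be 
sketched in plain Java. This is not Solr's MergeStrategy or mergeIds API, only 
an illustration under the assumption that every shard returns a dedup key and 
a local dupe count per collapsed document. ShardDedupSketch, Group and merge 
are hypothetical names, and the counts are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShardDedupSketch {

    /** Hypothetical per-shard entry: a dedup key plus the shard-local dupe count. */
    public static final class Group {
        public final String key;
        public final long count;

        public Group(String key, long count) {
            this.key = key;
            this.count = count;
        }
    }

    /**
     * Merges the shard responses, keeping one entry per key (in order of
     * first appearance) and summing the counts reported by each shard.
     */
    public static List<Group> merge(List<List<Group>> shardResponses) {
        // LinkedHashMap preserves the order in which keys are first seen.
        Map<String, Long> merged = new LinkedHashMap<>();
        for (List<Group> shard : shardResponses) {
            for (Group g : shard) {
                merged.merge(g.key, g.count, Long::sum);
            }
        }
        List<Group> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : merged.entrySet()) {
            out.add(new Group(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed shard output: shard1 has A and B, shard2 has B and C.
        List<Group> shard1 = Arrays.asList(new Group("A", 1), new Group("B", 2));
        List<Group> shard2 = Arrays.asList(new Group("B", 3), new Group("C", 1));
        for (Group g : merge(Arrays.asList(shard1, shard2))) {
            System.out.println(g.key + " count=" + g.count);
        }
    }
}
```

With the A/B and B/C example from the original question, this returns A, B 
and C once each, with B's count summed over both shards. A real mergeIds 
implementation would presumably also have to respect the sort order and 
paging, which this sketch ignores.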

Markus
 
-----Original message-----
> From:Joel Bernstein <joels...@gmail.com>
> Sent: Wednesday 2nd September 2015 23:46
> To: solr-user@lucene.apache.org
> Subject: Re: Merging documents from a distributed search
> 
> The merge strategy probably won't work for the type of distributed collapse
> you're describing.
> 
> You may want to begin exploring the Streaming API which supports real-time
> map/reduce operations,
> 
> http://joelsolr.blogspot.com/2015/03/parallel-computing-with-solrcloud.html
> 
> Joel Bernstein
> http://joelsolr.blogspot.com/
> 
> On Wed, Sep 2, 2015 at 5:12 PM, tedsolr <tsm...@sciquest.com> wrote:
> 
> > I've read from http://heliosearch.org/solrs-mergestrategy/ that the
> > AnalyticsQuery component only works for a single instance of Solr. I'm
> > planning to "migrate" to the SolrCloud soon and I have a custom
> > AnalyticsQuery module that collapses what I consider to be duplicate
> > documents, keeping stats like a "count" of the dupes. For my purposes
> > "dupes" are determined at run time and vary by the search request. Once a
> > collection has multiple shards I will not be able to prevent "dupes" from
> > appearing across those shards. A custom merge strategy should allow me to
> > merge my stats, but I don't see how I can drop duplicate docs at that
> > point.
> >
> > If shard1 returns docs A & B and shard2 returns docs B & C (letters
> > denoting what I consider to be unique docs), can my implementation of a
> > merge strategy return only docs A, B, & C, rather than A, B, B, & C?
> >
> > thanks!
> > solr 5.2.1
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Merging-documents-from-a-distributed-search-tp4226802.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> 
