This page describes the facet parameters we're talking about:

http://wiki.apache.org/solr/SimpleFacetParameters

Query something like this:
http://localhost:8983/solr/select?q=*:*&fl=id&rows=0&facet=true&facet.limit=-1&facet.field=<your field name>&facet.mincount=2
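
If you want to do this from a script, here is a rough Python sketch of building that request and reading the result. It assumes you add wt=json to the request and that the facet counts come back in Solr's default flat JSON layout (value, count, value, count, ...). The helper names and the "price" field in the usage line are made up for illustration.

```python
from urllib.parse import urlencode

# Same endpoint as in the URL above; adjust host/core for your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def facet_url(field):
    """Build the facet request. wt=json is an addition so the reply is easy to parse."""
    params = [
        ("q", "*:*"), ("fl", "id"), ("rows", 0),
        ("facet", "true"), ("facet.limit", -1),
        ("facet.field", field), ("facet.mincount", 2),
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def duplicated_values(response, field):
    """Return the field values that occur in two or more documents.

    `response` is the parsed JSON reply; the facet list is assumed to be
    flat: [value1, count1, value2, count2, ...].
    """
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = zip(flat[0::2], flat[1::2])
    return [value for value, count in pairs if count >= 2]
```

Usage would be something like fetching facet_url("price"), parsing the body with json.loads, and passing the result to duplicated_values(..., "price").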

Then filter on each facet value returned, using a filter query as described here:
http://wiki.apache.org/solr/CommonQueryParameters
Example: q=*:*&fq=<your field name>:<your field value>
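
A small Python sketch of that filter step, in case it helps. It builds one request per duplicated value and pulls the ids out of the JSON reply. Quoting the value in the fq, the rows=1000 cap, wt=json, and the function names are all my assumptions, not anything Solr requires.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust host/core for your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def ids_query_url(field, value, rows=1000):
    """Build a request returning the ids of all documents with field == value."""
    # Quote the value so spaces, colons or a leading minus sign in it
    # are not interpreted as query syntax.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", "*:*"),
        ("fq", f'{field}:"{escaped}"'),
        ("fl", "id"),
        ("rows", rows),  # assumed upper bound on copies per value
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def ids_from_response(response):
    """Extract the document ids from a parsed JSON select reply."""
    return [doc["id"] for doc in response["response"]["docs"]]
```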

Then you would have to collect all the ids returned for each value and delete
all but the first one, using some app or script...
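
The "delete all but the first" part could look roughly like this in Python. This is only a sketch: the function names are made up, and it builds the XML delete message rather than sending it.

```python
from xml.sax.saxutils import escape


def ids_to_delete(ids):
    """Keep the first id of a duplicate group and return the rest."""
    return list(ids)[1:]


def delete_message(ids):
    """Build an XML delete-by-id message for Solr's update handler."""
    body = "".join(f"<id>{escape(str(doc_id))}</id>" for doc_id in ids)
    return "<delete>" + body + "</delete>"
```

The message would then be POSTed to http://localhost:8983/solr/update?commit=true with Content-Type text/xml. Note that which copy counts as "the first one" is arbitrary unless you sort the ids yourself, so try it on a test index before running it against real data.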

Thanks 
Robi


-----Original Message-----
From: Ali, Saqib [mailto:docbook....@gmail.com] 
Sent: Wednesday, August 21, 2013 2:34 PM
To: solr-user@lucene.apache.org
Subject: Re: removing duplicates

Thanks Aloke and Robert. Can you please give me code/query snippets?
(newbie here)


On Wed, Aug 21, 2013 at 2:31 PM, Aloke Ghoshal <alghos...@gmail.com> wrote:

> Hi,
>
> Facet by one of the duplicate fields (probably by the numeric field 
> that you mentioned) and set facet.mincount=2.
>
> Regards,
> Aloke
>
>
> On Thu, Aug 22, 2013 at 2:44 AM, Ali, Saqib <docbook....@gmail.com> wrote:
>
> > hello,
> >
> > We have documents that are duplicates, i.e. the ID is different, but the
> > rest of the fields are the same. Is there a query that can remove the
> > duplicates and leave just one copy of the document in Solr? There is one
> > numeric field that we can key off to find duplicates.
> >
> > Please advise.
> >
> > Thanks
> >
>
