Find duplicates

2014-12-02 Thread Peter Kirk
Hi, is it possible to formulate a Solr query that finds all documents that have the same value in a particular field? Note that I don't know what the value is; I just want to find all documents with duplicate values. For example, I have 5 documents: Doc1: field Name = Peter Doc2: field Name = Jac

Re: Find duplicates

2014-12-02 Thread Erik Hatcher
Sort of… If you indexed the full value of the field as a string field type (and you’re looking for truly exact matches), you could facet on that field with facet.mincount=2, and the facets returned would be the ones with duplicate values. You’d have to drill down on each of the facets returned to
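
The faceting approach above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the field name "Name" is taken from the original question, the sample response is made up for illustration, and the helper names are hypothetical. It assumes Solr's default JSON facet layout (a flat list alternating value and count) and, for the drill-down, the `{!term}` query parser.

```python
from urllib.parse import urlencode


def duplicate_facet_params(field):
    """Request parameters that ask Solr only for values occurring 2+ times."""
    return {
        "q": "*:*",            # match every document
        "rows": 0,             # no documents needed, only facet counts
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,   # keep only values shared by at least two docs
        "facet.limit": -1,     # do not cap the number of facet values
        "wt": "json",
    }


def duplicate_values(response, field):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def drill_down_filter(field, value):
    """Filter query (fq) selecting the documents that hold one value."""
    return "{!term f=%s}%s" % (field, value)


# Query string to append to a select handler URL (the URL is deployment-specific).
query_string = urlencode(duplicate_facet_params("Name"))

# Hypothetical response fragment, for illustration only.
sample_response = {"facet_counts": {"facet_fields": {"Name": ["Peter", 2]}}}
for value in duplicate_values(sample_response, "Name"):
    print(drill_down_filter("Name", value))
```

Each returned facet value still needs its own follow-up query (one `fq` per value) to list the actual documents, which is the drill-down step mentioned above.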

RE: Find duplicates

2014-12-02 Thread Gonzalo Rodriguez
Hi Is it possible to formulate a Solr query which finds all documents which have the same value in a particular field? Note, I don't know what the value is, I just want to find all documents with duplicate values. For example, I have 5 documents: Doc1:

Re: Find duplicates

2014-12-02 Thread Alexandre Rafalovitch
And if I am correct, enabling docValues does this kind of grouping at index time, as part of the docValues data structure (per segment). So all one has to do is read it back (through faceting). Regards, Alex.
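
As a rough illustration of enabling docValues on the field, the sketch below builds a Schema API `add-field` command. It assumes a managed schema and a Solr version whose Schema API accepts `add-field`; on a classic schema.xml the equivalent would be `docValues="true"` on the field definition. The field name "Name" comes from the original question, and the helper name is hypothetical.

```python
import json


def add_docvalues_field_command(name, field_type="string"):
    """Schema API command defining a string field with docValues enabled."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,   # exact-match string type, as suggested earlier in the thread
            "indexed": True,
            "stored": True,
            "docValues": True,    # column-oriented per-segment structure used for faceting
        }
    }


# JSON body to POST to the collection's schema endpoint (URL is deployment-specific).
payload = json.dumps(add_docvalues_field_command("Name"))
print(payload)
```

Changing docValues on an existing field requires reindexing the documents before the faceting query sees the new structure.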