Hi!

On Fri, Dec 9, 2011 at 17:41, Martijn v Groningen
<martijn.v.gronin...@gmail.com> wrote:
> On what field type are you grouping and what version of Solr are you
> using? Grouping by string field is faster.

The field is defined as follows:
<field name="signature" type="string" indexed="true" stored="true" />

Grouping itself is quite fast; only computing the number of groups is
slow, and its cost seems to grow linearly with the number of documents.
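
For reference, here is a minimal sketch of the two requests being compared. The parameters group, group.field and group.ngroups are the standard Solr grouping parameters; the base URL and the helper name grouped_query_url are assumptions made for illustration only.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust to the real deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def grouped_query_url(query, with_ngroups, rows=10):
    """Build a select URL that groups results on the "signature" field.

    Hypothetical helper: it only assembles the query string and does not
    contact Solr.
    """
    params = [
        ("q", query),
        ("rows", str(rows)),
        ("wt", "json"),
        ("group", "true"),
        ("group.field", "signature"),
    ]
    if with_ngroups:
        # The option whose cost is discussed in this thread.
        params.append(("group.ngroups", "true"))
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(grouped_query_url("text:solr", with_ngroups=False))
    print(grouped_query_url("text:solr", with_ngroups=True))
```

Timing both URLs against the same query should show whether the slowdown comes from group.ngroups alone.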

I was hoping for a faster way to compute the total number of
distinct documents (in other words, the number of distinct values
in the signature field). Facets came to mind, but as far as I can
see, they don't report the total number of facet values either.
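
To illustrate the facet idea, here is a rough sketch that counts the distinct values client-side from a JSON facet response. It assumes the request used facet=true, facet.field=signature, facet.limit=-1 and facet.mincount=1, and that the response uses Solr's default flat JSON layout ([value, count, value, count, ...]). The function name and the sample data are made up. Since every distinct value is sent to the client, this may well be no faster than group.ngroups on large result sets.

```python
def count_distinct_values(facet_response, field="signature"):
    """Count facet values with a non-zero count in a parsed Solr JSON response.

    Assumes the default flat layout for facet_fields:
    [value1, count1, value2, count2, ...].
    """
    flat = facet_response["facet_counts"]["facet_fields"][field]
    counts = flat[1::2]  # every second entry is a count
    return sum(1 for count in counts if count > 0)


if __name__ == "__main__":
    # Made-up sample response, for illustration only.
    sample = {
        "facet_counts": {
            "facet_fields": {
                "signature": ["sig-a", 3, "sig-b", 1, "sig-c", 0]
            }
        }
    }
    print(count_distinct_values(sample))
```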

I'm using Solr 3.5 (upgraded from Solr 3.4 without reindexing).

Thanks,
Michael

> On 9 December 2011 12:46, Michael Jakl <jakl.mich...@gmail.com> wrote:
>> Hi, I'm using the grouping feature of Solr to return a list of unique
>> documents together with a count of the duplicates.
>>
>> Essentially I use Solr's signature algorithm to create the "signature"
>> field and use grouping on it.
>>
>> To provide good numbers for paging through my result list, I'd like to
>> compute the total number of documents found (= matches) and the number
>> of unique documents (= ngroups). Unfortunately, enabling
>> "group.ngroups" considerably slows down the query (from 500ms to
>> 23000ms for a result list of roughly 300000 documents).
>>
>> Is there a faster way to compute the number of groups (or unique
>> values in the signature field) in the search result? My Solr instance
>> currently contains about 50 million documents and around 10% of them
>> are duplicates.
>>
>> Thank you,
>> Michael
>
>
>
> --
> Met vriendelijke groet,
>
> Martijn van Groningen
