[jira] [Updated] (LUCENE-3098) Grouped total count

Martijn van Groningen (JIRA) Sun, 15 May 2011 10:36:28 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Martijn van Groningen updated LUCENE-3098:
------------------------------------------

    Attachment: LUCENE-3098.patch

Attached a new patch.
* Added the total count collector to the random tests.
* Removed the upfront calculation of the maximum possible number of values. 
Instead I added an initial size. A larger initial size results in fewer 
rehashes, which is handy if you know roughly how many groups to expect.
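
To illustrate the initial-size trade-off, here is a minimal sketch in plain 
Java. It is not the code from the attached patch: the class and method names 
are hypothetical, and it assumes the group values of the matching documents 
are available as strings and that the unique groups are tracked in a hash set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the collector from the patch.
final class GroupCountSketch {

    /**
     * Counts the distinct group values among the matching documents.
     *
     * initialSize is only a sizing hint. java.util.HashSet grows (rehashes)
     * once the number of entries exceeds capacity * load factor (0.75 by
     * default), so a hint close to the expected number of groups avoids
     * repeated rehashing, at the cost of allocating a larger table upfront.
     */
    static int countGroups(String[] groupValuePerHit, int initialSize) {
        Set<String> groups = new HashSet<String>(initialSize);
        for (String groupValue : groupValuePerHit) {
            // HashSet accepts null, so documents without a group value
            // end up counted together as one group in this sketch.
            groups.add(groupValue);
        }
        return groups.size();
    }
}
```

For example, `GroupCountSketch.countGroups(hits, 1024)` would suit an index 
where several hundred groups are expected. The hint only affects memory and 
rehashing, not the resulting count.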

bq. I'm nervous that you pull a top level DocTermsIndex just to get the max 
number of unique groups.
I should have been nervous too! It turns out that the average heap usage is 
now around 59MB, a decrease of about 50%.

The random tests are really valuable! They helped me find a bug: groups with 
null values weren't handled properly. Changing the random test was a bit 
difficult for me, so I think it would be good if you took a look at it.
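
As a rough illustration of the kind of check a random test can make for null 
groups, here is a sketch in plain Java. It is not the actual test code: the 
names are hypothetical, and it assumes that all documents without a group 
value are meant to be counted together as one group.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of a reference count for a randomized test.
final class NullGroupRandomCheck {

    // Generates one group value per document; roughly one document in four
    // gets no group value at all (null).
    static String[] randomGroups(Random random, int numDocs, int numGroups) {
        String[] values = new String[numDocs];
        for (int i = 0; i < numDocs; i++) {
            values[i] = random.nextInt(4) == 0
                    ? null
                    : "group" + random.nextInt(numGroups);
        }
        return values;
    }

    // Brute-force expected count: the distinct non-null values, plus one
    // extra group if any document had a null value (assumed semantics).
    static int expectedGroupCount(String[] values) {
        Set<String> seen = new HashSet<String>();
        boolean sawNull = false;
        for (String value : values) {
            if (value == null) {
                sawNull = true;
            } else {
                seen.add(value);
            }
        }
        return seen.size() + (sawNull ? 1 : 0);
    }
}
```

A test could then compare the count reported by the collector under test 
against `expectedGroupCount(...)` for many random inputs.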

> Grouped total count
> -------------------
>
>                 Key: LUCENE-3098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3098
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Martijn van Groningen
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3098.patch, LUCENE-3098.patch
>
>
> When grouping, you can currently get two counts:
> * Total hit count, which counts all documents that matched the query.
> * Total grouped hit count, which counts all documents that have been grouped 
> in the top N groups.
> Since with grouping the end user gets groups in the search result instead of 
> plain documents, the total number of groups makes more sense as the total 
> count in many situations.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
