[ https://issues.apache.org/jira/browse/LUCENE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martijn van Groningen updated LUCENE-3098:
------------------------------------------

    Attachment: LUCENE-3098.patch

Attached new patch.
* Added the total count collector to the random tests.
* Removed calculating the max possible values upfront. Instead I added an initial size. A larger initial size results in less rehashing. Handy if you know more or less the number of groups upfront.

bq. I'm nervous that you pull a top level DocTermsIndex just to get the max number of unique groups.

I should have been nervous too! It turns out that the average heap usage is now around 59MB, a decrease of around 50%!

The random tests are really valuable! I found a bug with them: groups with null values weren't handled properly. Changing the random test was a bit difficult for me, so I think it would be good if you took a look at it.

> Grouped total count
> -------------------
>
>                 Key: LUCENE-3098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3098
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Martijn van Groningen
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3098.patch, LUCENE-3098.patch
>
>
> When grouping, you can currently get two counts:
> * Total hit count, which counts all documents that matched the query.
> * Total grouped hit count, which counts all documents that have been grouped
> in the top N groups.
> Since the end user gets groups in the search result instead of plain
> documents when grouping, the total number of groups makes more sense as the
> total count in many situations.
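
For illustration only, the idea in the comment above (count groups by remembering which group values have been seen, with a caller-supplied initial size instead of a precomputed maximum) might be sketched roughly as follows. This is not the code from the attached patch and uses no Lucene APIs. The class name GroupCountSketch, the method countGroups, and the choice to count documents without a group value (null) as one group of their own are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the LUCENE-3098 patch: counts distinct group values.
final class GroupCountSketch {

    // groupValues holds one group value per matching document; null means
    // the document has no value in the group field (assumed to form one group).
    // initialSize is the set's initial capacity: a larger value means fewer
    // internal resizes (rehashing) when many distinct groups are expected.
    static int countGroups(String[] groupValues, int initialSize) {
        Set<String> seen = new HashSet<>(initialSize);
        for (String value : groupValues) {
            // HashSet accepts a single null element, so null groups are
            // counted once rather than skipped or causing an exception.
            seen.add(value);
        }
        return seen.size();
    }
}
```

With this sketch, countGroups(new String[] {"a", "b", "a", null, null}, 16) returns 3: the groups "a", "b", and the null group. The initial size only affects memory use and rehashing, not the result.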