[ https://issues.apache.org/jira/browse/LUCENE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033801#comment-13033801 ]
Martijn van Groningen edited comment on LUCENE-3098 at 5/15/11 7:58 PM:
------------------------------------------------------------------------

{quote}
* Maybe only one ctor for TopGroups? (Ie, we just pass in null as totalGroupCount). I'm wary of ctor explosion over time...
* In TestGrouping, you don't need a separate uniqueGroupCount int? Can't you just use knownGroups.size() in the end?
* For TotalGroupCountCollector, in the jdocs for the ctor maybe state that caller should set initialSize to rough estimate of how many unique groups are expected, but that this uses up 4 bytes * initialSize? Maybe we should also add a ctor that sets a default for this (128?) and mark the other ctor as expert?
{quote}
I agree. I've updated the patch.

{quote}
Hmm, it's a little odd to have TopGroups hold the totalGroupCount? Ie, it's only the test case that makes use of this, because the 2nd pass collector just sets it to null? It'd be nice to find some way to have 2nd pass collector be able to set this...
{quote}
That would be nice. Future collectors might need something similar. I'm currently thinking about a TopGroupsEnrich interface that collectors can implement. This would allow them to add data, such as the total group count, to the TopGroups. The SecondPassGroupingCollector would hold a list of collectors that implement the TopGroupsEnrich interface. When the getTopGroups() method is executed, it iterates over these collectors and the TopGroups is enriched with their data. The downside is that the fields inside TopGroups can't be final and we would probably need setters. I think if we do something like this, we should do it in a new Jira issue.

> Grouped total count
> -------------------
>
>                 Key: LUCENE-3098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3098
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Martijn van Groningen
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3098.patch, LUCENE-3098.patch, LUCENE-3098.patch, LUCENE-3098.patch
>
>
> When grouping, you can currently get two counts:
> * Total hit count, which counts all documents that matched the query.
> * Total grouped hit count, which counts all documents that have been grouped
> in the top N groups.
> Since, with grouping, the end user gets groups in the search result instead
> of plain documents, the total number of groups makes more sense as the total
> count in many situations.
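
To make the TopGroupsEnrich idea from the comment a bit more concrete, here is a rough, self-contained sketch. It is not taken from any patch on this issue: the nested classes are simplified stand-ins that only borrow the names discussed above, and the enrich() method, the setter, and the demo() helper are assumptions made purely for illustration. It shows the trade-off mentioned in the comment, namely that the enriched field can no longer be final.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch only; these types do not mirror the real Lucene classes. */
public class TopGroupsEnrichSketch {

    /** Simplified stand-in for TopGroups. */
    static class TopGroups {
        final int totalHitCount;
        // Not final: an enriching collector sets it after construction.
        private Integer totalGroupCount; // stays null if nobody enriches it

        TopGroups(int totalHitCount) {
            this.totalHitCount = totalHitCount;
        }

        Integer getTotalGroupCount() {
            return totalGroupCount;
        }

        void setTotalGroupCount(Integer totalGroupCount) {
            this.totalGroupCount = totalGroupCount;
        }
    }

    /** Assumed shape of the proposed interface: a collector that adds data to a TopGroups. */
    interface TopGroupsEnrich {
        void enrich(TopGroups topGroups);
    }

    /** Stand-in for TotalGroupCountCollector: remembers every unique group value. */
    static class GroupCounter implements TopGroupsEnrich {
        private final Set<String> groups;

        GroupCounter(int initialSize) {
            this.groups = new HashSet<String>(initialSize);
        }

        void collect(String groupValue) {
            groups.add(groupValue);
        }

        @Override
        public void enrich(TopGroups topGroups) {
            topGroups.setTotalGroupCount(groups.size());
        }
    }

    /** Stand-in for the second pass collector that holds the enriching collectors. */
    static class SecondPass {
        private final List<TopGroupsEnrich> enrichers = new ArrayList<TopGroupsEnrich>();
        private int hits;

        void addEnricher(TopGroupsEnrich enricher) {
            enrichers.add(enricher);
        }

        void collect() {
            hits++;
        }

        TopGroups getTopGroups() {
            TopGroups result = new TopGroups(hits);
            // Let every registered collector add its data to the result.
            for (TopGroupsEnrich enricher : enrichers) {
                enricher.enrich(result);
            }
            return result;
        }
    }

    /** Feeds four documents from three distinct groups through both collectors. */
    static TopGroups demo() {
        GroupCounter counter = new GroupCounter(128);
        SecondPass secondPass = new SecondPass();
        secondPass.addEnricher(counter);
        for (String group : new String[] {"a", "b", "a", "c"}) {
            counter.collect(group);
            secondPass.collect();
        }
        return secondPass.getTopGroups();
    }

    public static void main(String[] args) {
        TopGroups topGroups = demo();
        // prints "4 hits in 3 groups"
        System.out.println(topGroups.totalHitCount + " hits in "
                + topGroups.getTotalGroupCount() + " groups");
    }
}
```

In this sketch the second pass collector never needs to know what kind of data is being added; it only iterates over whatever was registered. The cost is the mutable field and setter on TopGroups, which is the downside raised in the comment.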