[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-06-02 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3099:
---

Fix Version/s: (was: 3.2)
   3.3

> Grouping module should allow subclasses to set the group key per document
> -
>
> Key: LUCENE-3099
> URL: https://issues.apache.org/jira/browse/LUCENE-3099
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Michael McCandless
> Fix For: 3.3, 4.0
>
> Attachments: LUCENE-3099.patch, LUCENE-3099.patch, LUCENE-3099.patch, 
> LUCENE-3099.patch, LUCENE-3099.patch, LUCENE-3099.patch, LUCENE-3099.patch, 
> LUCENE-3099.patch
>
>
> The new grouping module can only group by a single-valued indexed field.
> But, if we make the 'getGroupKey' a method that a subclass could override, 
> then I think we could refactor Solr over to the module, because it could do 
> function queries and normal queries via subclass (I think).
> This also makes the impl more extensible to apps that might have their own 
> interesting group values per document.
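
A minimal sketch of the idea in the description above, in plain Java with no Lucene dependency. Only the notion of an overridable `getGroupKey` comes from the issue text; the class names `GroupKeyCollector`, `FieldGroupKeyCollector`, and `PriceBucketCollector`, the `collect` signature, and the array-backed "index" are all hypothetical and are not the module's actual API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical base class: groups doc IDs by whatever key the subclass returns.
abstract class GroupKeyCollector<K> {
    private final Map<K, List<Integer>> groups = new LinkedHashMap<>();

    // Subclasses decide how a document maps to its group key.
    protected abstract K getGroupKey(int doc);

    // Record one matching document under its group key.
    public void collect(int doc) {
        groups.computeIfAbsent(getGroupKey(doc), k -> new ArrayList<>()).add(doc);
    }

    public Map<K, List<Integer>> getGroups() {
        return groups;
    }
}

// Groups by a single-valued field, modeled here as one String per doc ID.
class FieldGroupKeyCollector extends GroupKeyCollector<String> {
    private final String[] fieldValues;

    FieldGroupKeyCollector(String[] fieldValues) {
        this.fieldValues = fieldValues;
    }

    @Override
    protected String getGroupKey(int doc) {
        return fieldValues[doc];
    }
}

// Groups by a computed value (a stand-in for a function query): price / 10.
class PriceBucketCollector extends GroupKeyCollector<Integer> {
    private final int[] prices;

    PriceBucketCollector(int[] prices) {
        this.prices = prices;
    }

    @Override
    protected Integer getGroupKey(int doc) {
        return prices[doc] / 10;
    }
}
```

Under these assumptions the grouping bookkeeping lives once in the base class, and each key source (field, function, or anything app-specific) only has to supply `getGroupKey`.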

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-06-02 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

Updated the patch to trunk. The previous patch no longer applies cleanly to 
the current trunk.





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-06-02 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

bq. Looks like LUCENE-3097 snuck into the patch (AllGroupHeadsCollector).

Oops... yes, it did. My change lists collided. Maybe it is time to use git...





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-06-02 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

Attached an updated version of the patch that addresses Mike's points. Building 
the module with ant no longer produces generics warnings. I also updated 
some documentation.

{quote}
For AbstractAllGroupsCollector, can we impl the getGroupCount in
the base class as getGroups.size()?
{quote}
Certainly! Much better, since the method is syntactic sugar anyway.
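
A minimal sketch of the suggestion quoted above, in plain Java with no Lucene dependency. The class names carry a `Sketch` suffix because they are hypothetical stand-ins; the `collect(String)` method and the `Collection` return type are assumptions made for illustration, not the signatures in the patch.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the abstract base: subclasses supply the groups,
// and the count is derived from them in one place.
abstract class AbstractAllGroupsCollectorSketch<T> {
    // Subclasses return every distinct group value seen so far.
    public abstract Collection<T> getGroups();

    // Convenience method implemented once in the base class.
    public int getGroupCount() {
        return getGroups().size();
    }
}

// Hypothetical concrete collector that tracks distinct String group values.
class StringAllGroupsCollectorSketch extends AbstractAllGroupsCollectorSketch<String> {
    private final Set<String> groups = new LinkedHashSet<>();

    public void collect(String groupValue) {
        groups.add(groupValue);
    }

    @Override
    public Collection<String> getGroups() {
        return groups;
    }
}
```

With this shape, a concrete collector cannot report a count that disagrees with the groups it returns.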





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-31 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: (was: LUCENE-3099.patch)





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-31 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

The previous patch was wrong. Here is a new one.





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-31 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

Attached a new patch with the discussed changes. The 
AbstractMatchAllGroupCollector changed a lot: most of its code has been pushed 
down to the implementation classes. While developing the fq impl, I noticed 
that the abstraction was still too specific to the terms impl.





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-29 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

Attached an updated patch.

I'm currently busy integrating the grouping module into Solr trunk. I 
noticed that FirstPassGroupingCollector and SecondPassGroupingCollector still 
had groupField as a field and constructor argument, so I moved it to 
TermsFirstPassGroupingCollector and TermSecondPassGroupingCollector. I also 
made a small change in GroupDocs regarding generics.





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-29 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

Attached a new patch that is based on Mike's patch.
* All existing grouping tests pass
* AllGroupsCollector has also been included in this infrastructure.

Only the TermSecondPassGroupingCollector didn't work: the size of the 
groupDocs array was too small.

I think the following things need to be done:
* Update the documentation in package.html
* Backport to 3x

Should we rename all abstract collectors to Abstract*, to make it clear that 
these classes are abstract?





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-29 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3099:
---

Attachment: LUCENE-3099.patch

Attached patch, with a possibly more minimal approach to enabling Solr trunk 
to cut over to the grouping module.





[jira] [Updated] (LUCENE-3099) Grouping module should allow subclasses to set the group key per document

2011-05-28 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3099:
--

Attachment: LUCENE-3099.patch

Attached an initial idea of abstracting away the source of the group values.

In the patch there are four concepts:
* GroupValue. An abstraction of a value.
* GroupValueSource. An abstraction of the source of the group value.
* GroupHolder. An abstract holder to get the groups in an efficient manner.
* GroupSpecification. A factory class that glues all the above concepts 
together.
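
To make the four concepts above concrete, here is a rough sketch in plain Java with no Lucene dependency. Only the four type names come from the message; every method signature, plus the `IntGroupValue` and `IntFieldSpecification` classes and the array-backed field, is a guess made for illustration and should not be read as the patch's actual API.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// An abstraction of a single group value (assumed shape).
interface GroupValue {
    Object value();
}

// An abstraction of where group values come from: one value per doc ID.
interface GroupValueSource {
    GroupValue valueFor(int doc);
}

// Holds the distinct groups seen so far.
interface GroupHolder {
    boolean add(GroupValue value); // returns true if the group was new

    int size();
}

// Factory that ties a value source to a holder that understands its values.
interface GroupSpecification {
    GroupValueSource createSource();

    GroupHolder createHolder();
}

// Hypothetical int-valued group value with value-based equality.
final class IntGroupValue implements GroupValue {
    private final int v;

    IntGroupValue(int v) {
        this.v = v;
    }

    @Override
    public Object value() {
        return v;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof IntGroupValue && ((IntGroupValue) o).v == v;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(v);
    }
}

// Hypothetical specification for an int field, modeled as one int per doc ID.
final class IntFieldSpecification implements GroupSpecification {
    private final int[] valuesByDoc;

    IntFieldSpecification(int[] valuesByDoc) {
        this.valuesByDoc = valuesByDoc;
    }

    @Override
    public GroupValueSource createSource() {
        return doc -> new IntGroupValue(valuesByDoc[doc]);
    }

    @Override
    public GroupHolder createHolder() {
        return new GroupHolder() {
            private final Set<GroupValue> seen = new LinkedHashSet<>();

            @Override
            public boolean add(GroupValue value) {
                return seen.add(value);
            }

            @Override
            public int size() {
                return seen.size();
            }
        };
    }
}
```

In this sketch a collector would only talk to the three interfaces it gets from the specification, so grouping by a function would mean adding another `GroupSpecification` rather than changing the collector.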

I think with this infrastructure it is quite straightforward to add the 
ability to group by a function. The patch contains implementations for all 
field types (string, int, double, etc.).

Some concepts resemble what is already in Lucene / Solr, for example 
Lucene's ValueSource and DocValues, or Solr's DocValues and MutableValue. I 
started from scratch to see what grouping really needs. Lucene's DocValues, 
for example, didn't have all the functionality grouping needs.

Furthermore, I have added research group collectors: I ported the 
MatchAllGroupsCollector and FirstPassGroupingCollector to use this new 
infrastructure, just to get a feeling for how it all fits together.

I also included a simple runner class that runs these research collectors and 
compares them to the variants already added to the code base. You can then 
easily check search times and the group results. 

