[jira] [Updated] (SOLR-2564) Integrating grouping module into Solr 4.0

Martijn van Groningen (JIRA) Sat, 04 Jun 2011 03:44:46 -0700

     [ https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Martijn van Groningen updated SOLR-2564:
----------------------------------------

    Attachment: SOLR-2564.patch

Hi Yonik,

It is good to know that you took a look at the patch!

bq. in the QueryComponent, why the change to set the GET_SCORES flag based on 
the sort(s)?
Yes, I did this because I used to set Grouping.needScores with this flag. I 
also used needScores to indicate whether the scores need to be cached. However, 
I have changed this in the updated patch, and this check is no longer done when 
setting the GET_SCORES flag.

bq. I'm not a fan of this new style for matching request parameters to enums...
We can choose to leave out the upper-casing. Solr users would then need to make 
sure that parameter options are spelled correctly. Would that be all right?
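
As an illustration of the two options, here is a minimal sketch. The enum and 
method names are hypothetical and are not taken from the patch; only the 
upper-casing trade-off comes from the discussion above.

```java
import java.util.Locale;

// Hypothetical enum standing in for a grouping option; the names are illustrative only.
enum TotalCount { UNGROUPED, GROUPED }

final class EnumParamSketch {
    // Lenient: upper-case the raw request value before the enum lookup,
    // so "grouped", "Grouped" and "GROUPED" are all accepted.
    static TotalCount parseLenient(String raw) {
        return TotalCount.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    // Strict: the request value must match the enum constant name exactly;
    // anything else makes Enum.valueOf throw IllegalArgumentException.
    static TotalCount parseStrict(String raw) {
        return TotalCount.valueOf(raw);
    }
}
```

With the strict variant, a request value such as "grouped" is rejected instead 
of being silently normalised.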

bq. "Accuracy" seems a bit mis-named?
Maybe another name is more descriptive. Maybe style or method?

bq. The parameter "group.totalCount" I would expect to return the total count 
of something, not control the pre/post faceting thing?
The Javadoc is mixed up with that of group.docSet. I also think that 
group.groupCount is a better name. I changed this in the new patch.

bq. What does "group.docSet" do?
Currently nothing. I plan to use it when I finish LUCENE-3097. Basically it 
will decide whether the docset (for FacetComponent and StatsComponent) is based 
on plain documents or groups. Since you can have more than one Command (Field / 
Function / Query), it will then select the first CommandField or 
CommandFunction. I'm not sure how we should handle the case where there is 
more than one command.

bq. I'm not sure we should default group.cache to true
The query time can really be reduced with this option, but yes, it requires 
more memory. If the cache collector threshold is met, the array is immediately 
set to null during the search, so gc might be able to clean it up during the 
search. Also, Solr users get a message in the response. Somehow I forgot to 
move that from SOLR-2524, but it is in the updated patch now.
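
The threshold behaviour described above can be sketched roughly as follows. 
This is a simplified illustration, not the collector from the patch: the class 
name, the doc-count threshold, and the method names are all assumptions.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for a caching collector.
final class CachingCollectorSketch {
    private final int maxCachedDocs;    // assumed threshold, expressed as a doc count
    private int[] cachedDocs = new int[8];
    private int size = 0;

    CachingCollectorSketch(int maxCachedDocs) {
        this.maxCachedDocs = maxCachedDocs;
    }

    // A real collector would also forward each doc to the wrapped collector;
    // this sketch only shows the caching side.
    void collect(int doc) {
        if (cachedDocs == null) {
            return;                     // cache was already abandoned
        }
        if (size >= maxCachedDocs) {
            cachedDocs = null;          // drop the reference so gc can reclaim it mid-search
            return;
        }
        if (size == cachedDocs.length) {
            cachedDocs = Arrays.copyOf(cachedDocs, size * 2);
        }
        cachedDocs[size++] = doc;
    }

    // False once the threshold was exceeded; a caller could use this to
    // report in the response that the cache was not used.
    boolean isCached() {
        return cachedDocs != null;
    }

    int cachedCount() {
        return cachedDocs == null ? 0 : size;
    }
}
```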

bq. we could dump group.cache and have a single group.cacheMB parameter that 
uses 0 as no cache, -1 as maximum needed (solr uses -1 in this manner in other 
places too)
Makes sense, grouping is then at least consistent with the rest of Solr. I made 
it default to -1 for now.
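
A minimal sketch of how a single group.cacheMB value could be interpreted. The 
helper and its byte-budget representation are hypothetical; only the 0 and -1 
conventions come from the suggestion above, and treating every negative value 
like -1 is an assumption.

```java
// Hypothetical helper; not taken from the patch.
final class CacheLimitSketch {
    // Translate a group.cacheMB value into a byte budget:
    //    0  -> caching disabled (budget of 0 bytes)
    //   -1  -> no explicit cap, cache as much as is needed
    //          (assumption: any negative value is treated the same way)
    //   n>0 -> at most n megabytes
    static long maxCacheBytes(int cacheMB) {
        if (cacheMB == 0) {
            return 0L;
        }
        if (cacheMB < 0) {
            return Long.MAX_VALUE;
        }
        return cacheMB * 1024L * 1024L;
    }

    static boolean cachingEnabled(int cacheMB) {
        return cacheMB != 0;
    }
}
```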

bq. FYI: there's a nocommit in there misspelled as "No commit"
I have removed that.

{quote}
It wasn't necessary before, and there are advantages to preserving information 
(like the fact that someone said "no limit" vs a specific number) until as late 
as possible. That was previously handled by getMax() in Grouping.java, and I 
still see it being called... so it should be OK?
{quote}
I've removed this if statement and made sure that getMax(...) is used wherever 
it is needed.
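
For readers unfamiliar with the idea, a helper in the spirit of getMax(...) 
might look roughly like the sketch below. The signature and body are an 
illustration and may differ from the actual method in Grouping.java. The point 
is that a negative length ("no limit") is kept as-is in the request and only 
resolved to a concrete number at the last moment.

```java
// Illustrative sketch only; may not match the real Grouping.getMax.
final class GetMaxSketch {
    // offset: number of leading results to skip
    // len:    number of results requested, negative meaning "no limit"
    // max:    upper bound, e.g. the number of documents in the index
    static int getMax(int offset, int len, int max) {
        int v = len < 0 ? max : offset + len;
        if (v < 0 || v > max) {
            v = max;    // v < 0 also catches int overflow of offset + len
        }
        return v;
    }
}
```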

> Integrating grouping module into Solr 4.0
> -----------------------------------------
>
>                 Key: SOLR-2564
>                 URL: https://issues.apache.org/jira/browse/SOLR-2564
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Martijn van Groningen
>            Assignee: Martijn van Groningen
>             Fix For: 4.0
>
>         Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch, 
> SOLR-2564.patch
>
>
> Since work on the grouping module is going well, I think it is time to wire 
> this up in Solr.
> Besides the current grouping features Solr provides, Solr will then also 
> support second pass caching and total count based on groups.

