[jira] [Commented] (SOLR-6354) Support stats over functions

2014-09-11 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130901#comment-14130901
 ] 

Hoss Man commented on SOLR-6354:


bq. Also, regarding changing the output key: either this was broken 
already, or I broke it somehow.

Yeah, beyond the initial problem you pointed out about code duplication dealing 
with where/how the StatsValues instances are constructed, there are also 
inconsistencies in when/how/if the localparams are parsed.  I'm tackling that 
in SOLR-6507 first.

> Support stats over functions
> 
>
> Key: SOLR-6354
> URL: https://issues.apache.org/jira/browse/SOLR-6354
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Hoss Man
> Attachments: TstStatsComponent.java
>
>
> The majority of the logic in StatsValuesFactory for dealing with stats over 
> fields just uses the ValueSource API.  There's very little reason we can't 
> generalize this to support computing aggregate stats over any arbitrary 
> function (or the scores from an arbitrary query).
> Example...
> {noformat}
> stats.field={!func key=mean_rating mean=true}prod(user_rating,pow(editor_rating,2))
> {noformat}
> ...would mean that we can compute a conceptual "rating" for each doc by 
> multiplying the user_rating field by the square of the editor_rating field, 
> and then we'd compute the mean of that "rating" across all docs in the set 
> and return it as "mean_rating"
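
To make the arithmetic in the example concrete, here is a minimal Java sketch of what the request is meant to compute. It does not use any Solr or Lucene classes; RatedDoc and MeanRatingSketch are hypothetical names invented for illustration, and the sample values are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an indexed document; it only carries the two
// fields used in the example above.
final class RatedDoc {
    final double userRating;
    final double editorRating;

    RatedDoc(double userRating, double editorRating) {
        this.userRating = userRating;
        this.editorRating = editorRating;
    }
}

final class MeanRatingSketch {
    // Per-document equivalent of prod(user_rating,pow(editor_rating,2)).
    static double rating(RatedDoc doc) {
        return doc.userRating * Math.pow(doc.editorRating, 2);
    }

    // Equivalent of mean=true: arithmetic mean of the function over the doc set.
    // Returns NaN for an empty set (a choice made for this sketch only).
    static double meanRating(List<RatedDoc> docs) {
        if (docs.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0;
        for (RatedDoc doc : docs) {
            sum += rating(doc);
        }
        return sum / docs.size();
    }

    public static void main(String[] args) {
        List<RatedDoc> docs = Arrays.asList(new RatedDoc(4.0, 2.0), new RatedDoc(3.0, 1.0));
        System.out.println("mean_rating=" + meanRating(docs));
    }
}
```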



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6354) Support stats over functions

2014-09-12 Thread Crawdaddy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131587#comment-14131587
 ] 

Crawdaddy commented on SOLR-6354:
-

Excellent - thanks Hoss.  Maybe crosstalk, but do you think some of this work 
would make it easier for us to do stats on scores?  Scores mean something in my 
application and I want to use them in the Stats component.





[jira] [Commented] (SOLR-6354) Support stats over functions

2014-09-16 Thread Crawdaddy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135528#comment-14135528
 ] 

Crawdaddy commented on SOLR-6354:
-

Nice work Hoss - thank you very much for the patch and the reminder about 
frange.  Will give this a try.



> Support stats over functions
> 
>
> Key: SOLR-6354
> URL: https://issues.apache.org/jira/browse/SOLR-6354
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Hoss Man
> Attachments: SOLR-6354.patch, TstStatsComponent.java
>
>



[jira] [Commented] (SOLR-6354) Support stats over functions

2014-09-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143512#comment-14143512
 ] 

ASF subversion and git services commented on SOLR-6354:
---

Commit 1626856 from hoss...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1626856 ]

SOLR-6354: stats.field can now be used to generate stats over the numeric 
results of arbitrary functions

> Support stats over functions
> 
>
> Key: SOLR-6354
> URL: https://issues.apache.org/jira/browse/SOLR-6354
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Hoss Man
> Attachments: SOLR-6354.patch, SOLR-6354.patch, SOLR-6354.patch, 
> TstStatsComponent.java
>
>



[jira] [Commented] (SOLR-6354) Support stats over functions

2014-09-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143660#comment-14143660
 ] 

ASF subversion and git services commented on SOLR-6354:
---

Commit 1626875 from hoss...@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1626875 ]

SOLR-6354: stats.field can now be used to generate stats over the numeric 
results of arbitrary functions (merge r1626856)




[jira] [Commented] (SOLR-6354) Support stats over functions

2014-08-18 Thread Andy Crossen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101118#comment-14101118
 ] 

Andy Crossen commented on SOLR-6354:


Hey Hoss, I think you mean StatsInfo should do the check you propose?  At 
least, that's where I found I needed to start intercepting this.

I have all but the last line in your proposal implemented in StatsInfo.parse in 
a custom copy of StatsComponent, but I'm having a little trouble seeing how to go 
from ValueSource -> StatsValues.  Can you provide a couple more pointers here?  






[jira] [Commented] (SOLR-6354) Support stats over functions

2014-08-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101414#comment-14101414
 ] 

Hoss Man commented on SOLR-6354:


bq. Hey Hoss, I think you mean StatsInfo should do the check you propose? At 
least, that's where I found I needed to start intercepting this.

Hmmm... I guess there are two code paths here that have to be considered?  I 
think I was looking at SimpleStats because that's where the individual 
{{stats.field}} values are parsed into the "localParams" variable -- so we 
definitely need to check for a type there, and then do the right thing as far 
as dealing with the NumericStatsValues in terms of methods like 
{{SimpleStats.getStatsFields()}} and/or {{SimpleStats.getFieldCacheStats()}}.

But I think you're right about StatsInfo ... looks like we need to account for 
it there as well ... I'd need to look over this more closely to understand 
what's going on there and why.

bq. ...having a little trouble seeing how to go from ValueSource -> 
StatsValues. Can you provide a couple more pointers here? 

If you look at {{StatsValuesFactory.createStatsValues}} and the existing 
{{AbstractStatsValues}} you'll see that it maintains references to the 
SchemaField/FieldType of the associated field -- but the meat of the logic is 
in asking the FieldType for its ValueSource to then accumulate values from.  
So what I had in mind was refactoring the "field"-specific bits of 
{{AbstractStatsValues}} as needed so that a (new) subclass could be completely 
field agnostic, and just do the accumulation directly from a ValueSource passed 
in (based on the FunctionQParser in most cases).
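
A rough, self-contained Java sketch of the "field agnostic" idea follows. It is only an illustration under stated assumptions: FunctionStatsSketch is a hypothetical class, and java.util.function.IntToDoubleFunction stands in for a per-document value source. Lucene's real ValueSource API and Solr's StatsValues classes are richer and are not used here.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical accumulator that knows nothing about schema fields: it is
// handed a function from doc id to number and accumulates from that alone.
final class FunctionStatsSketch {
    private final IntToDoubleFunction values; // stand-in for a value source
    private long count;
    private double sum;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    FunctionStatsSketch(IntToDoubleFunction values) {
        this.values = values;
    }

    // Pull the value for one matching document and fold it into the stats.
    void accumulate(int docId) {
        double v = values.applyAsDouble(docId);
        count++;
        sum += v;
        min = Math.min(min, v);
        max = Math.max(max, v);
    }

    long getCount() { return count; }
    double getMin() { return min; }
    double getMax() { return max; }
    double getMean() { return count == 0 ? Double.NaN : sum / count; }

    public static void main(String[] args) {
        // Made-up per-document values, indexed by doc id.
        double[] userRating = {4.0, 3.0, 5.0};
        double[] editorRating = {2.0, 1.0, 0.0};
        FunctionStatsSketch stats = new FunctionStatsSketch(
            doc -> userRating[doc] * Math.pow(editorRating[doc], 2));
        for (int doc = 0; doc < userRating.length; doc++) {
            stats.accumulate(doc);
        }
        System.out.println("count=" + stats.getCount() + " mean=" + stats.getMean());
    }
}
```

The point of the sketch is that the accumulator takes the value-producing function in its constructor, so the same class could serve a plain field or an arbitrary function, depending on what the caller passes in.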

bq. I have all but the last line in your proposal implemented...

FYI: no need to "hold back" changes until they are "done" ... yonik's law of 
patches...

bq. A half-baked patch in Jira, with no documentation, no tests and no 
backwards compatibility is better than no patch at all.






[jira] [Commented] (SOLR-6354) Support stats over functions

2014-08-18 Thread Crawdaddy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101572#comment-14101572
 ] 

Crawdaddy commented on SOLR-6354:
-

I did manage to get something working, but it has some problems.  I see what 
you mean on the StatsValuesFactory refactor.  Because I was experimenting with 
this in a copy of StatsComponent (which, by the way, is not easy to do!), I 
ended up not modifying StatsValuesFactory at all.  Instead, I wrote a couple of 
inner classes extending NumericStatsValues and FieldType that take a 
ValueSource as input.  In SimpleStats.getStatsFields() and 
SimpleStats.getFieldCacheStats(), I catch the exception that schema.getField() 
throws when trying to look up non-existent function fields, and return 
my custom NSV/FT-based classes - stored in rb._statsInfo - instead.  This seems 
to have broken stat faceting, however; I think that's because other calls to 
StatsValuesFactory.createStatsValues outside StatsComponent don't use the same 
logic.  No doubt yours is the better road to travel - I was shooting for 
quick-n-dirty to see if this was a useful approach to a stats problem.

Also, regarding changing the output key: either this was broken already, 
or I broke it somehow.

Would it be useful for me to upload what I have as a reference point for you or 
someone else to implement more coherently?  I'm not sure I have the bandwidth 
to pull down a virgin Solr and migrate the changes at this time.






[jira] [Commented] (SOLR-6354) Support stats over functions

2014-08-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101604#comment-14101604
 ] 

Hoss Man commented on SOLR-6354:


bq. Would it be useful for me to upload what I have as a reference point for 
you or someone else to implement more coherently?

Yes, absolutely ... didn't you see my last comment?

bq. A half-baked patch in Jira, with no documentation, no tests and no 
backwards compatibility is better than no patch at all.




[jira] [Commented] (SOLR-6354) Support stats over functions

2014-08-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091545#comment-14091545
 ] 

Hoss Man commented on SOLR-6354:



Proposed implementation...

* SimpleStats should check the local params of each {{stats.field}} for a 
"type" param 
** type is already treated specially in local param parsing as a way to specify 
the parser type for the body of the string, i.e. {{\{!foo\}...}} is just an 
alias for {{\{!type=foo\}...}}
* if the "type" param doesn't exist, or is the empty string, treat the param value 
as a regular field name and get its value source (just like is done today)
* if "type" does exist, do the normal QParserPlugin lookup to parse the param value
** if the resulting Query is {{instanceof FunctionQuery}}, cast it and pull out 
its ValueSource
** else: wrap the Query in a QueryValueSource
* add a new subclass of NumericStatsValues that can be constructed directly 
with a ValueSource
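
The decision steps above can be sketched in plain Java. This is a simplified illustration, not Solr code: StatsFieldDispatchSketch, parseLocalParams, classify and SourceKind are hypothetical names, the parser ignores quoting and escaping, and the choice between "function" and "wrapped query" is made from the type name alone, whereas the proposal decides it by checking whether the parsed Query is a FunctionQuery.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StatsFieldDispatchSketch {
    // Where the numeric values for a stats.field would come from.
    enum SourceKind { FIELD, FUNCTION, WRAPPED_QUERY }

    // Splits a leading "{!...}" prefix into key/value params. A bare token such
    // as "func" is treated as type=func. The remaining body is stored under "v".
    // Simplified: no support for quoted values or escapes.
    static Map<String, String> parseLocalParams(String raw) {
        Map<String, String> params = new LinkedHashMap<>();
        String body = raw;
        if (raw.startsWith("{!")) {
            int end = raw.indexOf('}');
            if (end < 0) {
                throw new IllegalArgumentException("unterminated local params: " + raw);
            }
            for (String token : raw.substring(2, end).trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                int eq = token.indexOf('=');
                if (eq < 0) {
                    params.put("type", token);
                } else {
                    params.put(token.substring(0, eq), token.substring(eq + 1));
                }
            }
            body = raw.substring(end + 1);
        }
        params.put("v", body);
        return params;
    }

    // No type (or empty type): plain field name. Otherwise a parsed query,
    // approximated here as FUNCTION for "func" and WRAPPED_QUERY for the rest.
    static SourceKind classify(Map<String, String> params) {
        String type = params.get("type");
        if (type == null || type.isEmpty()) {
            return SourceKind.FIELD;
        }
        return "func".equals(type) ? SourceKind.FUNCTION : SourceKind.WRAPPED_QUERY;
    }

    public static void main(String[] args) {
        String raw = "{!func key=mean_rating mean=true}prod(user_rating,pow(editor_rating,2))";
        Map<String, String> params = parseLocalParams(raw);
        System.out.println(params + " -> " + classify(params));
    }
}
```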


