[ https://issues.apache.org/jira/browse/SOLR-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003390#comment-15003390 ]
Dennis Gove commented on SOLR-8185:
-----------------------------------

Running into some issues turning the expression into something that would perform the expected .equals()

{code}
avg(a_f, replace(10, withValue=0))
{code}

In this example, what type is 10? Is it a long or a float or a double? The field is a float (as noted by the _f) so one would expect 10 to be a float as well. However, in converting 10 to some Object that we can call .equals(...) on we are not sure what the type is. This has been a persistent problem with this patch.

But I think I've come up with something that puts some of the decision making in the hands of the expression writer.

{code}
avg(a_f, replace(10f, withValue=0f))
{code}

In this case the value can only be converted to a float so it will be created as a float object. However, to add this new requirement on the expression creator I want to take a deeper look at what this might impact and make sure the documentation is very clear. If a user doesn't do the correct thing (gives us 10 instead of 10f) and the value in the tuple is a float then float.equals(long) == false every single time.

Anyway, this note is somewhat of a rant.

> Add operations support to streaming metrics
> -------------------------------------------
>
>                 Key: SOLR-8185
>                 URL: https://issues.apache.org/jira/browse/SOLR-8185
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Assignee: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8185.patch
>
>
> Adds support for operations on stream metrics.
> With this feature one can modify tuple values before applying to the computed
> metric. There are a lot of use-cases I can see with this - I'll describe one
> here.
> Imagine you have a RollupStream which is computing the average over some
> field but you cannot be sure that all documents have a value for that field,
> ie the value is null. When the value is null you want to treat it as a 0.
> With this feature you can accomplish that like this
> {code}
> rollup(
>   search(collection1, q=*:*, fl=\"a_s,a_i,a_f\", sort=\"a_s asc\"),
>   over=\"a_s\",
>   avg(a_i, replace(null, withValue=0)),
>   count(*),
> )
> {code}
> The operations are applied to the tuple for each metric in the stream which
> means you can perform different operations on different metrics without being
> impacted by operations on other metrics.
> Adding to our previous example, imagine you want to also get the min of a
> field but do not consider null values.
> {code}
> rollup(
>   search(collection1, q=*:*, fl=\"a_s,a_i,a_f\", sort=\"a_s asc\"),
>   over=\"a_s\",
>   avg(a_i, replace(null, withValue=0)),
>   min(a_i),
>   count(*),
> )
> {code}
> Also, the tuple is not modified for streams that might wrap this one. Ie, the
> only thing that sees the applied operation is that particular metric. If you
> want to apply operations for wrapping streams you can still achieve that with
> the SelectStream (SOLR-7669).
> One feature I'm investigating but this patch DOES NOT add is the ability to
> assign names to the resulting metric value. For example, to allow for
> something like this
> {code}
> rollup(
>   search(collection1, q=*:*, fl=\"a_s,a_i,a_f\", sort=\"a_s asc\"),
>   over=\"a_s\",
>   avg(a_i, replace(null, withValue=0), as="avg_a_i_null_as_0"),
>   avg(a_i),
>   count(*, as="totalCount"),
> )
> {code}
> Right now that isn't possible because the identifier for each metric would be
> the same "avg_a_i" and as such both couldn't be returned. It's relatively
> easy to add but I have to investigate its impact on the SQL and FacetStream
> areas.
> Depends on SOLR-7669 (SelectStream)
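
For anyone following the type question raised in the comment: the float.equals(long) == false behaviour can be reproduced in plain Java, independent of Solr. The snippet below is only a minimal sketch and is not code from the patch. The class name BoxedEqualsDemo and the helper matches(...) are hypothetical, and it assumes the replace operation compares the tuple value to the literal with Object.equals, as the comment describes.

```java
// Hypothetical demo class; not part of the SOLR-8185 patch.
class BoxedEqualsDemo {

    // Assumed comparison: the tuple value is checked against the literal
    // from the expression using Object.equals, as described in the comment.
    static boolean matches(Object tupleValue, Object literal) {
        return tupleValue.equals(literal);
    }

    public static void main(String[] args) {
        // Value as it might be read from a float field such as a_f.
        Object tupleValue = Float.valueOf(10f);

        // The literal 10 boxed as a Long versus the literal 10f boxed as a Float.
        Object literalAsLong = Long.valueOf(10L);
        Object literalAsFloat = Float.valueOf(10f);

        // Float.equals returns false for any argument that is not a Float,
        // even when the numeric values are the same.
        System.out.println("Float vs Long:  " + matches(tupleValue, literalAsLong));
        System.out.println("Float vs Float: " + matches(tupleValue, literalAsFloat));
    }
}
```

The boxed numeric types only consider themselves equal to an instance of the same class, which is why writing 10f rather than 10 in the expression lets the comparison succeed for a float field.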