[ 
https://issues.apache.org/jira/browse/SOLR-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-8185:
------------------------------
    Attachment: SOLR-8185.patch

Full patch. All tests pass.

> Add operations support to streaming metrics
> -------------------------------------------
>
>                 Key: SOLR-8185
>                 URL: https://issues.apache.org/jira/browse/SOLR-8185
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8185.patch
>
>
> Adds support for operations on stream metrics.
> With this feature one can modify tuple values before they are applied to the
> computed metric. I can see a lot of use cases for this; I'll describe one
> here.
> Imagine you have a RollupStream that computes the average over some field,
> but you cannot be sure that all documents have a value for that field, i.e.,
> the value may be null. When the value is null you want to treat it as 0.
> With this feature you can accomplish that like this:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   count(*)
> )
> {code}
> The operations are applied to the tuple for each metric in the stream, which
> means you can perform different operations on different metrics without one
> metric being impacted by operations on another.
> Adding to our previous example, imagine you also want to get the min of a
> field, but without considering null values.
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   min(a_i),
>   count(*)
> )
> {code}
> Also, the tuple is not modified for streams that might wrap this one, i.e.,
> the only thing that sees the applied operation is that particular metric. If
> you want to apply operations for wrapping streams you can still achieve that
> with the SelectStream (SOLR-7669).
> One feature I'm investigating, but which this patch DOES NOT add, is the
> ability to assign names to the resulting metric value. For example, to allow
> for something like this:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0), as="avg_a_i_null_as_0"),
>   avg(a_i),
>   count(*, as="totalCount")
> )
> {code}
> Right now that isn't possible because the identifier for both avg metrics
> would be the same, "avg_a_i", and as such both couldn't be returned. It's
> relatively easy to add, but I have to investigate its impact on the SQL and
> FacetStream areas.
> Depends on SOLR-7669 (SelectStream)
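
The per-metric behaviour described above (each metric sees its own modified view of the tuple, and the tuple itself is left untouched for other metrics and wrapping streams) can be sketched in plain Java. This is only an illustration under stated assumptions: MetricOperationsSketch, replaceNull, avg and min are hypothetical names, not the classes or methods in SOLR-8185.patch, tuples are modelled as plain maps, and metrics are assumed to skip null values when no operation is given.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins for illustration; these are NOT the classes in SOLR-8185.patch.
class MetricOperationsSketch {

    // Builds a tuple with a group key a_s and a possibly-null value a_i.
    static Map<String, Object> tuple(String aS, Integer aI) {
        Map<String, Object> t = new HashMap<>();
        t.put("a_s", aS);
        t.put("a_i", aI); // HashMap permits null values
        return t;
    }

    // Three tuples in one group; the middle one has no value for a_i.
    static List<Map<String, Object>> sampleTuples() {
        List<Map<String, Object>> tuples = new ArrayList<>();
        tuples.add(tuple("hello", 4));
        tuples.add(tuple("hello", null));
        tuples.add(tuple("hello", 8));
        return tuples;
    }

    // Hypothetical operation: returns a modified COPY, leaving the input tuple untouched.
    static UnaryOperator<Map<String, Object>> replaceNull(String field, Object withValue) {
        return tuple -> {
            Map<String, Object> copy = new HashMap<>(tuple);
            if (copy.get(field) == null) {
                copy.put(field, withValue);
            }
            return copy;
        };
    }

    // Average of a field; the operation is applied to each tuple for this metric only.
    // Assumption: values that are still null after the operation are skipped.
    static double avg(List<Map<String, Object>> tuples, String field,
                      UnaryOperator<Map<String, Object>> operation) {
        double sum = 0;
        long count = 0;
        for (Map<String, Object> tuple : tuples) {
            Object value = operation.apply(tuple).get(field);
            if (value instanceof Number) {
                sum += ((Number) value).doubleValue();
                count++;
            }
        }
        return count == 0 ? 0 : sum / count;
    }

    // Minimum of a field with no operation applied; null values are not considered.
    static Double min(List<Map<String, Object>> tuples, String field) {
        Double min = null;
        for (Map<String, Object> tuple : tuples) {
            Object value = tuple.get(field);
            if (value instanceof Number) {
                double d = ((Number) value).doubleValue();
                if (min == null || d < min) {
                    min = d;
                }
            }
        }
        return min;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> tuples = sampleTuples();
        System.out.println("avg, null as 0: " + avg(tuples, "a_i", replaceNull("a_i", 0)));
        System.out.println("avg, no op:     " + avg(tuples, "a_i", UnaryOperator.identity()));
        System.out.println("min, no op:     " + min(tuples, "a_i"));
        System.out.println("tuple intact:   " + (tuples.get(1).get("a_i") == null));
    }
}
```

Because the operation works on a copy, the min metric still sees the original null and ignores it, while the avg metric counts it as 0.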



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
