[ https://issues.apache.org/jira/browse/SOLR-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Gove updated SOLR-8185:
------------------------------
    Attachment: SOLR-8185.patch

Full patch. All tests pass.

> Add operations support to streaming metrics
> -------------------------------------------
>
>                 Key: SOLR-8185
>                 URL: https://issues.apache.org/jira/browse/SOLR-8185
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8185.patch
>
>
> Adds support for operations on stream metrics.
> With this feature one can modify tuple values before they are applied to the
> computed metric. There are a lot of use-cases I can see with this - I'll
> describe one here.
> Imagine you have a RollupStream which is computing the average over some
> field but you cannot be sure that all documents have a value for that field,
> i.e. the value may be null. When the value is null you want to treat it as 0.
> With this feature you can accomplish that like this
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   count(*)
> )
> {code}
> The operations are applied to the tuple separately for each metric in the
> stream, which means you can perform different operations on different metrics
> without being impacted by operations on other metrics.
> Adding to our previous example, imagine you also want to get the min of a
> field but do not want to consider null values.
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   min(a_i),
>   count(*)
> )
> {code}
> Also, the tuple is not modified for streams that might wrap this one. I.e.,
> the only thing that sees the applied operation is that particular metric. If
> you want to apply operations for wrapping streams you can still achieve that
> with the SelectStream (SOLR-7669).
> One feature I'm investigating but this patch DOES NOT add is the ability to
> assign names to the resulting metric value.
> For example, to allow for something like this
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0), as="avg_a_i_null_as_0"),
>   avg(a_i),
>   count(*, as="totalCount")
> )
> {code}
> Right now that isn't possible because both avg metrics would share the same
> identifier, "avg_a_i", so only one of them could be returned. It's relatively
> easy to add but I have to investigate its impact on the SQL and FacetStream
> areas.
> Depends on SOLR-7669 (SelectStream)
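
To make the per-metric semantics described above concrete, here is a minimal sketch in Python. It is not the SolrJ implementation from the patch: the names replace_null, Metric, Avg and Min are hypothetical, and the sketch assumes that a metric skips null values unless an operation replaces them. It only illustrates that each metric applies its own operations to a private view of the tuple, so other metrics and wrapping streams still see the original values.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Operation = Callable[[Row], Row]


def replace_null(field: str, with_value: float) -> Operation:
    """Hypothetical analogue of replace(null, withValue=...) for one field."""
    def apply(row: Row) -> Row:
        if row.get(field) is None:
            patched = dict(row)      # copy, so the caller's tuple is untouched
            patched[field] = with_value
            return patched
        return row
    return apply


class Metric:
    """Runs this metric's own operations on a private view of the tuple."""

    def __init__(self, field: str, operations: Iterable[Operation] = ()):
        self.field = field
        self.operations: List[Operation] = list(operations)

    def value_for(self, row: Row):
        view = row
        for operation in self.operations:
            view = operation(view)
        return view.get(self.field)


class Avg(Metric):
    def __init__(self, field: str, operations: Iterable[Operation] = ()):
        super().__init__(field, operations)
        self.total = 0.0
        self.count = 0

    def update(self, row: Row) -> None:
        value = self.value_for(row)
        if value is not None:        # assumption: nulls are skipped
            self.total += value
            self.count += 1

    def result(self):
        return self.total / self.count if self.count else None


class Min(Metric):
    def __init__(self, field: str, operations: Iterable[Operation] = ()):
        super().__init__(field, operations)
        self.smallest = None

    def update(self, row: Row) -> None:
        value = self.value_for(row)
        if value is not None and (self.smallest is None or value < self.smallest):
            self.smallest = value

    def result(self):
        return self.smallest


# One rollup group (a_s == "x") in which one document has no a_i value.
rows = [{"a_s": "x", "a_i": 4}, {"a_s": "x", "a_i": None}, {"a_s": "x", "a_i": 8}]

avg_zeroed = Avg("a_i", [replace_null("a_i", 0)])  # avg(a_i, replace(null, withValue=0))
min_plain = Min("a_i")                             # min(a_i)
avg_plain = Avg("a_i")                             # avg(a_i)

for row in rows:
    for metric in (avg_zeroed, min_plain, avg_plain):
        metric.update(row)

print(avg_zeroed.result(), min_plain.result(), avg_plain.result())
```

In this sketch the two averages differ (4.0 with the replacement, 6.0 without), the min ignores the null, and rows[1]["a_i"] is still None afterwards. A real implementation would also need distinct identifiers for the two avg metrics, which is the naming problem raised in the description.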