[ https://issues.apache.org/jira/browse/SOLR-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Bernstein updated SOLR-10651:
----------------------------------
    Description: 
This is a ticket for organizing the new statistical programming features of Streaming Expressions. It's also a place for the community to discuss what functions are needed to support statistical programming.

Basic Syntax:

{code}
let(a = timeseries(...),
    b = timeseries(...),
    c = col(a, count(*)),
    d = col(b, count(*)),
    r = regress(c, d),
    tuple (p = predict(r, 50)))
{code}

The expression above is doing the following:

1) The let expression is setting variables (a, b, c, d, r).

2) Variables *a* and *b* are the output of timeseries() Streaming Expressions. Each is stored in memory as a list of Tuples holding the time series result.

3) Variables *c* and *d* are set using the *col* evaluator. The col evaluator extracts a column of numbers from a list of tuples. In the example it is extracting the count\(*\) field from the two time series result sets.

4) Variable *r* is the output from the *regress* evaluator. The regress evaluator performs a simple regression analysis on two columns of numbers.

5) Once the variables are set, a single Streaming Expression is run by the *let* expression. In the example the *tuple* expression is run. The tuple expression outputs a single Tuple with name/value pairs. Any Streaming Expression can be run by the *let* expression, so this can be a complex program. The streaming expression run by *let* has access to all the variables defined earlier.

6) The tuple expression in the example has one name/value pair. The name *p* is set to the output of the *predict* evaluator. The predict evaluator uses the regression result stored in variable *r* to predict a value for the input 50.

7) The output of this expression will be a single tuple with the value of the predict function in the *p* field.

  was:
This is a ticket for organizing the new statistical programming features of Streaming Expressions.
It's also a place for the community to discuss what functions are needed to support statistical programming.

Basic Syntax:

{code}
let(a = timeseries(...),
    b = timeseries(...),
    c = col(a, count(*)),
    d = col(b, count(*)),
    r = regress(c, d),
    tuple (p = predict(r, 50)))
{code}

The expression above is doing the following:

1) The let expression is setting variables (a, b, c, d, r).

2) Variables *a* and *b* are the output of timeseries() Streaming Expressions. These will be stored in memory as list of Tuples with time series result.

3) Variables *c* and *d* a set using the *col* evaluator. The col evaluator extracts a column of numbers from a list of tuples. In the example it extracting the count\(*\) field from the two time series result sets.

4) Variable *r* is the output from the *regress* evaluator. The regress evaluator performs a simple regress analysis on two columns of numbers.

5) Once the variables are set, a single Streaming Expression is run by the *let* expression. In the example the *tuple* expression is run. The tuple expression outputs a single Tuple with name/value pairs. Any Streaming Expression can be run by the *let* expression so this can be a complex program. The streaming expression run by *let* has access to all the variables defined earlier.

6) The tuple expression in the example has one name / value pair. The name *p* is set to the output of the *predict* evaluator. The predict evaluator is predicting the value 50 based on the regression result stored in variable *r*.

7) The output of this expression will be a single tuple with value of the predict function in the *p* field.


> Streaming Expressions statistical functions library
> ---------------------------------------------------
>
> Key: SOLR-10651
> URL: https://issues.apache.org/jira/browse/SOLR-10651
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level.
Issues are Public)
> Reporter: Joel Bernstein
> Fix For: master (7.0)
>
>
> This is a ticket for organizing the new statistical programming features of
> Streaming Expressions. It's also a place for the community to discuss what
> functions are needed to support statistical programming.
> Basic Syntax:
> {code}
> let(a = timeseries(...),
>     b = timeseries(...),
>     c = col(a, count(*)),
>     d = col(b, count(*)),
>     r = regress(c, d),
>     tuple (p = predict(r, 50)))
> {code}
> The expression above is doing the following:
> 1) The let expression is setting variables (a, b, c, d, r).
> 2) Variables *a* and *b* are the output of timeseries() Streaming
> Expressions. Each is stored in memory as a list of Tuples holding the time
> series result.
> 3) Variables *c* and *d* are set using the *col* evaluator. The col evaluator
> extracts a column of numbers from a list of tuples. In the example it is
> extracting the count\(*\) field from the two time series result sets.
> 4) Variable *r* is the output from the *regress* evaluator. The regress
> evaluator performs a simple regression analysis on two columns of numbers.
> 5) Once the variables are set, a single Streaming Expression is run by the
> *let* expression. In the example the *tuple* expression is run. The tuple
> expression outputs a single Tuple with name/value pairs. Any Streaming
> Expression can be run by the *let* expression, so this can be a complex
> program. The streaming expression run by *let* has access to all the
> variables defined earlier.
> 6) The tuple expression in the example has one name/value pair. The name
> *p* is set to the output of the *predict* evaluator. The predict evaluator
> uses the regression result stored in variable *r* to predict a value for the
> input 50.
> 7) The output of this expression will be a single tuple with the value of the
> predict function in the *p* field.
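
For readers who want to trace the data flow of the let expression outside Solr, here is a minimal Python sketch of steps 1 through 7. It is not Solr's implementation. The functions col, regress and predict are hypothetical stand-ins named after the evaluators, tuples are modelled as plain dicts, and the two input lists are made-up data standing in for the timeseries() results. The sketch also assumes that "simple regression" means an ordinary least squares fit with the first column as the independent variable.

```python
from statistics import mean


def col(tuples, field):
    """Extract one numeric column from a list of tuples (modelled as dicts)."""
    return [float(t[field]) for t in tuples]


def regress(x, y):
    """Fit y = slope * x + intercept by ordinary least squares (assumed method)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length columns with at least two values")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {"slope": slope, "intercept": my - slope * mx}


def predict(model, value):
    """Evaluate the fitted line at the given input value."""
    return model["slope"] * value + model["intercept"]


# Made-up stand-ins for the two timeseries(...) result sets (variables a and b).
a = [{"count(*)": 10}, {"count(*)": 20}, {"count(*)": 30}, {"count(*)": 40}]
b = [{"count(*)": 25}, {"count(*)": 45}, {"count(*)": 65}, {"count(*)": 85}]

c = col(a, "count(*)")            # step 3: column extracted from a
d = col(b, "count(*)")            # step 3: column extracted from b
r = regress(c, d)                 # step 4: regression result
result = {"p": predict(r, 50)}    # steps 5-7: single tuple with field p
print(result)
```

With this made-up data the second column is exactly 2 * x + 5, so the fitted line evaluated at 50 gives p = 105.0. In Solr the variables and the final tuple are produced by the let expression itself rather than by client code.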