[ https://issues.apache.org/jira/browse/SOLR-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Bernstein updated SOLR-10651:
----------------------------------
    Description:
This is a ticket for organizing the new statistical programming features of Streaming Expressions. It's also a place for the community to discuss what functions are needed to support statistical programming.

Basic Syntax:

{code}
let(a = timeseries(...),
    b = timeseries(...),
    c = col(a, count(*)),
    d = col(b, count(*)),
    r = regress(c, d),
    tuple(p = predict(r, 50)))
{code}

The expression above does the following:

1) The *let* expression sets the variables (a, b, c, d, r).

2) Variables *a* and *b* are the output of timeseries() Streaming Expressions. Each is stored in memory as a list of tuples holding the time series result.

3) Variables *c* and *d* are set using the *col* evaluator. The col evaluator extracts a column of numbers from a list of tuples. In the example it is extracting the count(*) field from the two time series result sets.

4) Variable *r* is the output of the *regress* evaluator. The regress evaluator performs a simple regression analysis on two columns of numbers.

5) Once the variables are set, a single Streaming Expression is run by the *let* expression. In the example the *tuple* expression is run. The tuple expression outputs a single tuple with name/value pairs. Any Streaming Expression can be run by the *let* expression, so this can be a complex program. The Streaming Expression run by *let* has access to all the variables defined earlier.

6) The tuple expression in the example has one name/value pair. The name *p* is set to the output of the *predict* evaluator. The predict evaluator predicts a value for the input 50 based on the regression result stored in variable *r*.

7) The output of this expression will be a single tuple with the value of the predict function in the *p* field.

  was:
This is a ticket for organizing the new statistical programming features of Streaming Expressions.
It's also a place for the community to discuss what functions are needed to support statistical programming. This ticket will be updated shortly to show the existing syntax and functions with links to existing JIRA tickets.

Basic Syntax:

{code}
let(a = timeseries(...),
    b = timeseries(...),
    c = col(a, count(*)),
    d = col(b, count(*)),
    r = regress(c, d),
    tuple(p = predict(r, 50)))
{code}

The expression above does the following:

1) The *let* expression sets the variables (a, b, c, d, r).

2) Variables *a* and *b* are the output of timeseries() Streaming Expressions. Each is stored in memory as a list of tuples holding the time series result.

3) Variables *c* and *d* are set using the *col* evaluator. The col evaluator extracts a column of numbers from a list of tuples. In the example it is extracting the count(*) field from the two time series result sets.

4) Variable *r* is the output of the *regress* evaluator. The regress evaluator performs a simple regression analysis on two columns of numbers.

5) Once the variables are set, a single Streaming Expression is run by the *let* expression. In the example the *tuple* expression is run. The tuple expression outputs a single tuple with name/value pairs. Any Streaming Expression can be run by the *let* expression, so this can be a complex program. The Streaming Expression run by *let* has access to all the variables defined earlier.

6) The tuple expression in the example has one name/value pair. The name *p* is set to the output of the *predict* evaluator. The predict evaluator predicts a value for the input 50 based on the regression result stored in variable *r*.

7) The output of this expression will be a single tuple with the value of the predict function in the *p* field.

> Streaming Expressions statistical functions library
> ---------------------------------------------------
>
>                 Key: SOLR-10651
>                 URL: https://issues.apache.org/jira/browse/SOLR-10651
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Joel Bernstein
>             Fix For: master (7.0)
>
>
> This is a ticket for organizing the new statistical programming features of
> Streaming Expressions. It's also a place for the community to discuss what
> functions are needed to support statistical programming.
> Basic Syntax:
> {code}
> let(a = timeseries(...),
>     b = timeseries(...),
>     c = col(a, count(*)),
>     d = col(b, count(*)),
>     r = regress(c, d),
>     tuple(p = predict(r, 50)))
> {code}
> The expression above does the following:
> 1) The *let* expression sets the variables (a, b, c, d, r).
> 2) Variables *a* and *b* are the output of timeseries() Streaming
> Expressions. Each is stored in memory as a list of tuples holding the time
> series result.
> 3) Variables *c* and *d* are set using the *col* evaluator. The col evaluator
> extracts a column of numbers from a list of tuples. In the example it is
> extracting the count(*) field from the two time series result sets.
> 4) Variable *r* is the output of the *regress* evaluator. The regress
> evaluator performs a simple regression analysis on two columns of numbers.
> 5) Once the variables are set, a single Streaming Expression is run by the
> *let* expression. In the example the *tuple* expression is run. The tuple
> expression outputs a single tuple with name/value pairs. Any Streaming
> Expression can be run by the *let* expression, so this can be a complex
> program. The Streaming Expression run by *let* has access to all the
> variables defined earlier.
> 6) The tuple expression in the example has one name/value pair. The name
> *p* is set to the output of the *predict* evaluator. The predict evaluator
> predicts a value for the input 50 based on the regression result stored in
> variable *r*.
> 7) The output of this expression will be a single tuple with the value of the
> predict function in the *p* field.
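
For readers who want to see the data flow of the example outside Solr, here is a rough Python sketch of what the col, regress, and predict steps compute. It is an illustration only, not Solr's implementation: the sample tuples are made up, the helper functions are hypothetical stand-ins named after the evaluators, and it assumes that the first column passed to regress is the independent variable and the second is the dependent one, which the ticket does not state.

```python
from statistics import mean


def col(tuples, field):
    """Extract one numeric column from a list of tuples (dicts here)."""
    return [float(t[field]) for t in tuples]


def regress(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Assumption: x is the independent column and y the dependent one.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two columns of equal length with at least 2 values")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {"slope": slope, "intercept": my - slope * mx}


def predict(r, value):
    """Evaluate the fitted line at the given input value."""
    return r["slope"] * value + r["intercept"]


# Made-up stand-ins for the two timeseries(...) results (variables a and b).
a = [{"count(*)": 10}, {"count(*)": 20}, {"count(*)": 30}]
b = [{"count(*)": 25}, {"count(*)": 45}, {"count(*)": 65}]

c = col(a, "count(*)")  # [10.0, 20.0, 30.0]
d = col(b, "count(*)")  # [25.0, 45.0, 65.0]
r = regress(c, d)       # slope 2.0, intercept 5.0 for this sample data

# Counterpart of tuple(p = predict(r, 50)): one tuple with a single field p.
result = {"p": predict(r, 50)}
print(result)  # {'p': 105.0}
```

As in the let expression, each variable is computed once and then reused by later steps, and the final tuple is the only output.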