[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523097#comment-14523097
 ] 

Joel Bernstein commented on SOLR-7377:
--------------------------------------

This patch looks really good. I'm planning on making only three changes:

1) Allow the ParallelStream to use either object serialization or Streaming 
Expressions as its transport mechanism. There will be use cases where people 
want to use the Streaming API directly without bothering with Streaming 
Expression constructs.

2) Add some tests for StreamingExpressions that use the ParallelStream.

3) Add an ExpressionRunner that runs expressions from the command line. The 
initial version will support only built-in expressions. We should decide 
whether to break out the name-to-class mapping into an external properties 
file so that both Solr and the ExpressionRunner can use it.
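
To make the properties-file idea in point 3 concrete, here is a rough sketch of how a shared name-to-class mapping might be loaded. This is only an illustration, not code from the patch: the class name ExpressionNameRegistry, its methods, and the "name=fully.qualified.ClassName" file layout are all assumptions, and the usage example maps names to JDK classes as stand-ins because the real stream classes are not on the classpath here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of Solr or the SOLR-7377 patch.
// Assumes each properties entry has the form: expressionName=fully.qualified.ClassName
final class ExpressionNameRegistry {
    private final Map<String, Class<?>> classesByName = new HashMap<>();

    // Parses properties text and resolves every mapped class eagerly,
    // so a bad entry fails at load time rather than at query time.
    static ExpressionNameRegistry load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read mapping", e);
        }
        ExpressionNameRegistry registry = new ExpressionNameRegistry();
        for (String name : props.stringPropertyNames()) {
            String className = props.getProperty(name).trim();
            try {
                registry.classesByName.put(name, Class.forName(className));
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException(
                    "No class " + className + " for expression name " + name, e);
            }
        }
        return registry;
    }

    // Looks up the class registered for an expression name such as "unique".
    Class<?> classFor(String expressionName) {
        Class<?> mapped = classesByName.get(expressionName);
        if (mapped == null) {
            throw new IllegalArgumentException("Unknown expression name: " + expressionName);
        }
        return mapped;
    }
}
```

With stand-in classes, ExpressionNameRegistry.load("unique=java.util.HashSet") followed by classFor("unique") returns HashSet.class. Both Solr and a command-line runner could read the same file this way, but whether that is the right split is exactly the open question above.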

> SOLR Streaming Expressions
> --------------------------
>
>                 Key: SOLR-7377
>                 URL: https://issues.apache.org/jira/browse/SOLR-7377
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>            Reporter: Dennis Gove
>            Priority: Minor
>             Fix For: Trunk
>
>         Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, 
> SOLR-7377.patch
>
>
> It would be beneficial to add an expression-based interface to the Streaming 
> API described in SOLR-7082. Right now that API requires streaming requests to 
> come in from clients as serialized bytecode of the streaming classes. The 
> suggestion here is to support string expressions which describe the streaming 
> operations the client wishes to perform. 
> {code:java}
> search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
> {code}
> With this syntax in mind, one can now express arbitrarily complex stream 
> queries with a single string.
> {code:java}
> // merge two distinct searches together on common fields
> merge(
>   search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   on="a_f asc, a_s asc")
> // find top 20 unique records of a search
> top(
>   n=20,
>   unique(
>     search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
>     over="a_f desc"),
>   sort="a_f desc")
> {code}
> The syntax would support:
> 1. Configurable expression names (e.g. via solrconfig.xml one can map "unique" 
> to a class implementing a Unique stream). This allows users to build 
> their own streams and use them as they wish.
> 2. Named parameters (of both simple and expression types)
> 3. Unnamed, type-matched parameters (to support requiring N streams as 
> arguments to another stream)
> 4. Positional parameters
> The main goal here is to make streaming as accessible as possible and define 
> a syntax for running complex queries across large distributed systems.
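
As a rough illustration of the syntax described above, the sketch below splits an expression string into its function name and top-level parameters, and tells named parameters from unnamed ones. It is not the parser from the attached patch: ExpressionSketch and its methods are hypothetical names, and the sketch assumes double-quoted values contain no escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the parser from the SOLR-7377 patch.
final class ExpressionSketch {
    final String name;          // function name, e.g. "top"
    final List<String> params;  // raw top-level parameters, in order

    ExpressionSketch(String name, List<String> params) {
        this.name = name;
        this.params = params;
    }

    // Splits "name(p1, p2, ...)" into the name and its top-level parameters.
    static ExpressionSketch parse(String text) {
        String s = text.trim();
        int open = s.indexOf('(');
        if (open <= 0 || !s.endsWith(")")) {
            throw new IllegalArgumentException("Not a function-style expression: " + text);
        }
        String body = s.substring(open + 1, s.length() - 1);
        return new ExpressionSketch(s.substring(0, open).trim(), splitTopLevel(body));
    }

    // Splits on commas that are outside double quotes and outside nested parentheses.
    static List<String> splitTopLevel(String body) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        boolean inQuotes = false;
        int start = 0;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes && c == '(') {
                depth++;
            } else if (!inQuotes && c == ')') {
                depth--;
            } else if (!inQuotes && depth == 0 && c == ',') {
                parts.add(body.substring(start, i).trim());
                start = i + 1;
            }
        }
        String last = body.substring(start).trim();
        if (!last.isEmpty()) {
            parts.add(last);
        }
        return parts;
    }

    // Returns the name of a "name=value" parameter, or null when the parameter
    // is unnamed (for example a nested stream expression or a bare collection).
    static String paramName(String param) {
        int eq = param.indexOf('=');
        if (eq <= 0) {
            return null;
        }
        String candidate = param.substring(0, eq).trim();
        return candidate.matches("[A-Za-z_][A-Za-z0-9_]*") ? candidate : null;
    }
}
```

Parsing the top(...) example from the description this way gives the name "top" and three parameters: the named n=20, the unnamed nested unique(...) stream, and the named sort="a_f desc". Each nested parameter can be passed back through parse to walk the whole tree.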



