[jira] [Comment Edited] (SOLR-7584) Add Joins to the Streaming API and Streaming Expressions

2015-05-21 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555348#comment-14555348
 ] 

Dennis Gove edited comment on SOLR-7584 at 5/22/15 12:27 AM:
-

Adds abstract JoinStream to support joins of N sub-streams.
Adds abstract BiJoinStream to limit JoinStream to 2 sub-streams, left and right.
Adds concrete InnerJoinStream with support for merge join.

Does not handle hash joins.
Uses aliasing concept already available in CloudSolrStream.

Still work to be done.
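
For readers unfamiliar with the technique, a merge join over two sub-streams that are already sorted on the join key can be sketched roughly as below. This is only an illustration and not the code in the attached patch: the class name `MergeJoinSketch`, the use of `Map<String, Object>` as a tuple, in-memory lists in place of streams, and the single, non-null join field are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: inner merge join of two tuple lists that are both
// sorted ascending on the same join field. Not the patch's implementation.
class MergeJoinSketch {

  // Assumes the join field holds non-null, mutually comparable values.
  @SuppressWarnings("unchecked")
  private static int compareKeys(Map<String, Object> l, Map<String, Object> r, String field) {
    return ((Comparable<Object>) l.get(field)).compareTo(r.get(field));
  }

  static List<Map<String, Object>> innerJoin(
      List<Map<String, Object>> left, List<Map<String, Object>> right, String field) {
    List<Map<String, Object>> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < left.size() && j < right.size()) {
      int cmp = compareKeys(left.get(i), right.get(j), field);
      if (cmp < 0) {
        i++; // left key is smaller: advance the left side
      } else if (cmp > 0) {
        j++; // right key is smaller: advance the right side
      } else {
        // Equal keys: pair this left tuple with every right tuple in the
        // run of equal keys, then move on to the next left tuple.
        int k = j;
        while (k < right.size() && compareKeys(left.get(i), right.get(k), field) == 0) {
          Map<String, Object> joined = new HashMap<>(left.get(i));
          joined.putAll(right.get(k)); // right-side values win on name clashes
          out.add(joined);
          k++;
        }
        i++;
      }
    }
    return out;
  }
}
```

Because both inputs are consumed in sorted order, neither side has to be held in memory in full, which is the usual reason to prefer a merge join when the sub-streams can be sorted on the join fields.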


was (Author: dpgove):
Adds JoinStream to support joins of N sub-streams.
Adds BiJoinStream to limit JoinStream to 2 sub-streams, left and right.
Adds InnerJoinStream with support for merge join.

Does not handle hash joins.
Uses aliasing concept already available in CloudSolrStream.

Still work to be done.

> Add Joins to the Streaming API and Streaming Expressions
> 
>
> Key: SOLR-7584
> URL: https://issues.apache.org/jira/browse/SOLR-7584
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Reporter: Dennis Gove
>Priority: Minor
>  Labels: Streaming
> Attachments: SOLR-7584.patch
>
>
> Add InnerJoinStream, LeftOuterJoinStream, and supporting classes to the 
> Streaming API to allow for joining between sub-streams.
> At its most basic, it would look something like this:
> {code}
> innerJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   search(collection2, q=*:*, fl="fieldA, fieldD, fieldE", ...),
>   on="fieldA=fieldA"
> )
> {code}
> or with multi-field on clauses
> {code}
> innerJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   search(collection2, q=*:*, fl="fieldA, fieldD, fieldE", ...),
>   on="fieldA=fieldA, fieldB=fieldD"
> )
> {code}
> I'd also like to support the option of doing a hash join instead of the 
> default merge join but I haven't yet figured out the best way to express 
> that. I'd like to let the user tell us which sub-stream should be hashed (the 
> least-cost one).
> Also, I've been thinking about field aliasing and might want to add a 
> SelectStream which serves the purpose of allowing us to limit the fields 
> coming out and rename fields.
> Depends on SOLR-7554
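
The hash join mentioned in the description could look roughly like the sketch below, where the caller decides which side is hashed. Everything here is assumed for illustration: `HashJoinSketch`, the parameter names `hashed` and `streamed`, and the `Map<String, Object>` tuple representation do not come from the issue or the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an inner hash join on a single field.
class HashJoinSketch {

  static List<Map<String, Object>> innerJoin(
      List<Map<String, Object>> hashed, List<Map<String, Object>> streamed, String field) {
    // Build phase: index the side chosen by the caller (ideally the cheaper
    // one) by its join key. This side is held fully in memory.
    Map<Object, List<Map<String, Object>>> index = new HashMap<>();
    for (Map<String, Object> tuple : hashed) {
      index.computeIfAbsent(tuple.get(field), k -> new ArrayList<>()).add(tuple);
    }

    // Probe phase: read the other side once, in whatever order it arrives.
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> tuple : streamed) {
      List<Map<String, Object>> matches = index.get(tuple.get(field));
      if (matches == null) {
        continue; // no partner on the hashed side
      }
      for (Map<String, Object> match : matches) {
        Map<String, Object> joined = new HashMap<>(match);
        joined.putAll(tuple);
        out.add(joined);
      }
    }
    return out;
  }
}
```

Unlike a merge join, this does not need either input to be sorted, at the cost of keeping the hashed side in memory, which is why letting the user name the least-cost sub-stream matters.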



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7377) SOLR Streaming Expressions

2015-04-10 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-7377:
-

 Summary: SOLR Streaming Expressions
 Key: SOLR-7377
 URL: https://issues.apache.org/jira/browse/SOLR-7377
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Dennis Gove
Priority: Minor
 Fix For: Trunk


It would be beneficial to add an expression-based interface to the Streaming API
described in SOLR-6526. Right now that API requires streaming requests to come
in from clients as serialized bytecode of the streaming classes. The suggestion
here is to support string expressions which describe the streaming operations
the client wishes to perform.

{code:java}
search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
{code}

With this syntax in mind, one can now express arbitrarily complex stream 
queries with a single string.

{code:java}
// merge two distinct searches together on common fields
merge(
  search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  on="a_f asc, a_s asc")

// find top 20 unique records of a search
top(
  n=20,
  unique(
    search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
    over="a_f desc"),
  over="a_f desc")
{code}

The syntax would support:
1. Configurable expression names (e.g., via solrconfig.xml one can map "unique"
to a class implementing a Unique stream class). This allows users to build their
own streams and use them as they wish.
2. Named parameters (of both simple and expression types)
3. Unnamed, type-matched parameters (to support requiring N streams as 
arguments to another stream)
4. Positional parameters

The main goal here is to make streaming as accessible as possible and define a 
syntax for running complex queries across large distributed systems.
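
To make the proposed syntax more concrete, here is a rough sketch of how such a string might be split into a function name and its top-level parameters, leaving nested expressions and quoted values intact. It is not the parser from the patch: `ExpressionSplitter` and its methods are hypothetical, and the sketch assumes double quotes are never escaped inside a quoted value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits "name(arg1, arg2, ...)" into its parts.
class ExpressionSplitter {

  private static int openParen(String trimmed) {
    int open = trimmed.indexOf('(');
    if (open < 0 || !trimmed.endsWith(")")) {
      throw new IllegalArgumentException("not a function expression: " + trimmed);
    }
    return open;
  }

  // Returns e.g. "search" for "search(collection1, q=*:*)".
  static String functionName(String expr) {
    String trimmed = expr.trim();
    return trimmed.substring(0, openParen(trimmed)).trim();
  }

  // Returns the top-level parameters as raw strings. Commas inside double
  // quotes or inside nested parentheses do not split a parameter.
  static List<String> parameters(String expr) {
    String trimmed = expr.trim();
    String body = trimmed.substring(openParen(trimmed) + 1, trimmed.length() - 1);
    List<String> params = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int depth = 0;
    boolean inQuotes = false;
    for (int i = 0; i < body.length(); i++) {
      char c = body.charAt(i);
      if (c == '"') {
        inQuotes = !inQuotes;
      } else if (!inQuotes && c == '(') {
        depth++;
      } else if (!inQuotes && c == ')') {
        depth--;
      }
      if (c == ',' && depth == 0 && !inQuotes) {
        params.add(current.toString().trim());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    String last = current.toString().trim();
    if (!last.isEmpty()) {
      params.add(last);
    }
    return params;
  }
}
```

A parameter that itself looks like `name(...)` can be passed back through the same two methods, which is one way the nested `top(unique(search(...)))` form could be walked; a parameter containing `=` before any parenthesis would be treated as a named parameter.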






[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-10 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: SOLR-7377.patch

First-pass patch. Looking for initial feedback.

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>



[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-10 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Description: 
It would be beneficial to add an expression-based interface to the Streaming API
described in SOLR-6526. Right now that API requires streaming requests to come
in from clients as serialized bytecode of the streaming classes. The suggestion
here is to support string expressions which describe the streaming operations
the client wishes to perform.

{code:java}
search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
{code}

With this syntax in mind, one can now express arbitrarily complex stream
queries with a single string.

{code:java}
// merge two distinct searches together on common fields
merge(
  search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  on="a_f asc, a_s asc")

// find top 20 unique records of a search
top(
  n=20,
  unique(
    search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
    over="a_f desc"),
  sort="a_f desc")
{code}

The syntax would support:
1. Configurable expression names (e.g., via solrconfig.xml one can map "unique"
to a class implementing a Unique stream class). This allows users to build their
own streams and use them as they wish.
2. Named parameters (of both simple and expression types)
3. Unnamed, type-matched parameters (to support requiring N streams as
arguments to another stream)
4. Positional parameters

The main goal here is to make streaming as accessible as possible and define a
syntax for running complex queries across large distributed systems.

  was:
It would be beneficial to add an expression-based interface to the Streaming API
described in SOLR-6526. Right now that API requires streaming requests to come
in from clients as serialized bytecode of the streaming classes. The suggestion
here is to support string expressions which describe the streaming operations
the client wishes to perform.

{code:java}
search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
{code}

With this syntax in mind, one can now express arbitrarily complex stream
queries with a single string.

{code:java}
// merge two distinct searches together on common fields
merge(
  search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  on="a_f asc, a_s asc")

// find top 20 unique records of a search
top(
  n=20,
  unique(
    search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
    over="a_f desc"),
  over="a_f desc")
{code}

The syntax would support:
1. Configurable expression names (e.g., via solrconfig.xml one can map "unique"
to a class implementing a Unique stream class). This allows users to build their
own streams and use them as they wish.
2. Named parameters (of both simple and expression types)
3. Unnamed, type-matched parameters (to support requiring N streams as
arguments to another stream)
4. Positional parameters

The main goal here is to make streaming as accessible as possible and define a
syntax for running complex queries across large distributed systems.


> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>

[jira] [Commented] (SOLR-7275) Pluggable authorization module in Solr

2015-04-10 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490733#comment-14490733
 ] 

Dennis Gove commented on SOLR-7275:
---

I like this concept but I think the response can be expanded to add a bit more 
functionality. It would be nice if the pluggable security layer could respond 
in such a way as to not wholly reject a request but to instead restrict what is 
returned from a request. It could accomplish this by providing additional 
filters to apply to a request.

{code}
public class SolrAuthorizationResponse {
  boolean authorized;
  String additionalFilterQuery;

  ...
}
{code}

By adding additionalFilterQuery, this would give the security layer an 
opportunity to say, "yup, you're authorized but you can't see records matching 
this filter" or "yup, you're authorized but you can only see records also 
matching this filter". It provides a way to add fine-grained control of data 
access but keep that control completely outside of SOLR (as it would live in 
the pluggable security layer).

Additionally, it allows the security layer to add fine-grained control 
**without notifying the user they are being restricted** as this lives wholly 
in the SOLR <---> security layer communication. There are times when telling 
the user their request was rejected due to it returning records they're not 
privileged to see actually gives the user some information you may not want 
them to know - the fact that these restricted records even exist. Instead, by 
adding filters and just not returning records the user isn't privileged for, 
the user is none the wiser that they were restricted at all.
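
As a rough illustration of the idea, the sketch below shows how a dispatcher might fold the plugin-supplied filter into the filter queries the user asked for. None of this is an existing Solr API: `AuthorizationResponseSketch`, `effectiveFilters`, and the example filter strings are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an authorization response that can also restrict
// what an authorized request is allowed to see.
class AuthorizationResponseSketch {
  private final boolean authorized;
  private final String additionalFilterQuery; // null when no restriction applies

  AuthorizationResponseSketch(boolean authorized, String additionalFilterQuery) {
    this.authorized = authorized;
    this.additionalFilterQuery = additionalFilterQuery;
  }

  boolean isAuthorized() {
    return authorized;
  }

  String getAdditionalFilterQuery() {
    return additionalFilterQuery;
  }

  // Returns the filter queries to actually run: the ones from the request
  // plus the one supplied by the security layer, if any. The caller never
  // sees that an extra filter was added.
  static List<String> effectiveFilters(
      List<String> requestedFilters, AuthorizationResponseSketch response) {
    if (!response.isAuthorized()) {
      throw new SecurityException("request not authorized");
    }
    List<String> filters = new ArrayList<>(requestedFilters);
    if (response.getAdditionalFilterQuery() != null) {
      filters.add(response.getAdditionalFilterQuery());
    }
    return filters;
  }
}
```

A negated filter such as `-classification:secret` would express "you can't see records matching this filter", while a positive one such as `department:sales` would express "you can only see records also matching this filter".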

> Pluggable authorization module in Solr
> --
>
> Key: SOLR-7275
> URL: https://issues.apache.org/jira/browse/SOLR-7275
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
>
> Solr needs an interface that makes it easy for different authorization 
> systems to be plugged into it. Here's what I plan on doing:
> Define an interface {{SolrAuthorizationPlugin}} with one single method 
> {{isAuthorized}}. This would take in a {{SolrRequestContext}} object and 
> return a {{SolrAuthorizationResponse}} object. The object as of now would 
> only contain a single boolean value but in the future could contain more 
> information e.g. ACL for document filtering etc.
> The reason why we need a context object is so that the plugin doesn't need to 
> understand Solr's capabilities e.g. how to extract the name of the collection 
> or other information from the incoming request as there are multiple ways to 
> specify the target collection for a request. Similarly request type can be 
> specified by {{qt}} or {{/handler_name}}.
> Flow:
> Request -> SolrDispatchFilter -> isAuthorized(context) -> Process/Return.
> {code}
> public interface SolrAuthorizationPlugin {
>   public SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
> }
> {code}
> {code}
> public class SolrRequestContext {
>   UserInfo userInfo; // Will contain user context from the authentication layer.
>   HTTPRequest request;
>   OperationType operationType; // Enum; correlated with user roles.
>   String[] collectionsAccessed;
>   String[] fieldsAccessed;
>   String resource;
> }
> {code}
> {code}
> public class SolrAuthorizationResponse {
>   boolean authorized;
>   public boolean isAuthorized() { return authorized; }
> }
> {code}
> User Roles: 
> * Admin
> * Collection Level:
>   * Query
>   * Update
>   * Admin
> Using this framework, an implementation could be written for specific 
> security systems e.g. Apache Ranger or Sentry. It would keep all the security 
> system specific code out of Solr.






[jira] [Comment Edited] (SOLR-7377) SOLR Streaming Expressions

2015-04-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503160#comment-14503160
 ] 

Dennis Gove edited comment on SOLR-7377 at 4/20/15 4:46 PM:


Updated the patch based on some additional items I wanted to include in this. 
Note that this patch adds a dependency on guava in the solr/solrj/ivy.xml file. 
We may want to revisit this additional dependency. Guava is being used for some 
basic string checks (to ensure operations only include supported characters) 
and this logic could be coded up if we want to avoid adding a dependency.

{code}

{code}


was (Author: dpgove):
Updated the patch based on some additional items I wanted to include in this. 
Note that this patch adds a dependency on guava in the solr/solrj/ivy.xml file. 
We may want to revisit this additional dependency. Guava is being used for some 
basic string checks (to ensure operations only include supported characters) 
and this logic could be coded up if we want to avoid adding a dependency.



> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch
>
>
> It would be beneficial to add an expression-based interface to the Streaming API
> described in SOLR-7082. Right now that API requires streaming requests to
> come in from clients as serialized bytecode of the streaming classes. The
> suggestion here is to support string expressions which describe the streaming
> operations the client wishes to perform.
> {code:java}
> search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
> {code}
> With this syntax in mind, one can now express arbitrarily complex stream
> queries with a single string.
> {code:java}
> // merge two distinct searches together on common fields
> merge(
>   search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   on="a_f asc, a_s asc")
> // find top 20 unique records of a search
> top(
>   n=20,
>   unique(
>     search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
>     over="a_f desc"),
>   sort="a_f desc")
> {code}
> The syntax would support:
> 1. Configurable expression names (e.g., via solrconfig.xml one can map "unique"
> to a class implementing a Unique stream class). This allows users to build
> their own streams and use them as they wish.
> 2. Named parameters (of both simple and expression types)
> 3. Unnamed, type-matched parameters (to support requiring N streams as
> arguments to another stream)
> 4. Positional parameters
> The main goal here is to make streaming as accessible as possible and define
> a syntax for running complex queries across large distributed systems.






[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-20 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: SOLR-7377.patch

Updated the patch based on some additional items I wanted to include in this. 
Note that this patch adds a dependency on guava in the solr/solrj/ivy.xml file. 
We may want to revisit this additional dependency. Guava is being used for some 
basic string checks (to ensure operations only include supported characters) 
and this logic could be coded up if we want to avoid adding a dependency.
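
For a sense of how small the hand-coded replacement could be, here is a sketch of such a check using only the JDK. The set of supported characters is not stated in this thread, so ASCII letters, digits, and underscore are assumed purely for illustration; `OperationNameCheck` is a hypothetical name.

```java
// Hypothetical stand-in for the Guava-based check. The allowed character
// set (ASCII letters, digits, underscore) is an assumption.
class OperationNameCheck {

  static boolean hasOnlySupportedChars(String name) {
    if (name == null || name.isEmpty()) {
      return false;
    }
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      boolean supported =
          (c >= 'a' && c <= 'z')
              || (c >= 'A' && c <= 'Z')
              || (c >= '0' && c <= '9')
              || c == '_';
      if (!supported) {
        return false;
      }
    }
    return true;
  }
}
```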



> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch
>
>



[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-20 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: (was: SOLR-7377.patch)

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>



[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-21 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: SOLR-7377.patch

Now allows a search expression to include a zkHost (though does not require 
it). Improved performance of EqualToComparator by moving some branching logic 
into the constructor and creating a lambda for the actual comparison.
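
The general pattern of deciding once in the constructor and storing a lambda for the per-tuple work might look like the sketch below. It is an assumption-laden illustration rather than the patched `EqualToComparator`: the class name `EqualToComparatorSketch`, the `Map<String, Object>` tuples, and the particular branch (same field name on both sides versus different names) are all invented for the example.

```java
import java.util.Map;
import java.util.Objects;
import java.util.function.BiPredicate;

// Hypothetical sketch: the branch on how the comparator was configured is
// taken once, in the constructor, instead of on every comparison.
class EqualToComparatorSketch {
  private final BiPredicate<Map<String, Object>, Map<String, Object>> comparison;

  EqualToComparatorSketch(String leftField, String rightField) {
    if (rightField == null || rightField.equals(leftField)) {
      // Same field name on both sides.
      comparison = (left, right) -> Objects.equals(left.get(leftField), right.get(leftField));
    } else {
      // Different field names on each side.
      comparison = (left, right) -> Objects.equals(left.get(leftField), right.get(rightField));
    }
  }

  boolean isEqual(Map<String, Object> left, Map<String, Object> right) {
    return comparison.test(left, right); // no configuration checks here
  }
}
```

How much this helps in practice depends on how expensive the original per-call branching was, which this thread does not quantify.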

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>



[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-21 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: (was: SOLR-7377.patch)

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>



[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: (was: SOLR-7377.patch)

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
>



[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: SOLR-7377.patch

Added ability to turn an ExpressibleStream into a StreamExpression. Combined 
with the already existing ability to turn a StreamExpression into a string, we 
can now go back and forth from string <--> stream.

This will allow us to modify ParallelStream to pass along the string expression 
of the stream it wants to parallelize.
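As a rough illustration of the string <--> expression round trip described 
above, here is a minimal, self-contained sketch. ToyExpression and its methods 
are hypothetical names invented for this example; it does not use the patch's 
StreamExpression or ExpressibleStream classes, and it only treats unnamed 
parameters of the form name(...) as nested expressions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not the patch's classes): a function name plus its
// parameters, where each parameter is either a raw String or a nested ToyExpression.
final class ToyExpression {
  final String functionName;
  final List<Object> parameters = new ArrayList<>();

  ToyExpression(String functionName) {
    this.functionName = functionName;
  }

  // Parses text such as: merge(search(c1, q=*:*), search(c2, q=*:*), on="a asc")
  static ToyExpression parse(String text) {
    text = text.trim();
    int open = text.indexOf('(');
    if (open < 0 || !text.endsWith(")")) {
      throw new IllegalArgumentException("Not a function call: " + text);
    }
    ToyExpression expr = new ToyExpression(text.substring(0, open).trim());
    String body = text.substring(open + 1, text.length() - 1);
    for (String part : splitTopLevel(body)) {
      // Unnamed parameters shaped like name(...) are parsed recursively.
      expr.parameters.add(part.matches("\\w+\\(.*\\)") ? parse(part) : part);
    }
    return expr;
  }

  // Splits on commas that are outside quotes and outside nested parentheses.
  private static List<String> splitTopLevel(String body) {
    List<String> parts = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int depth = 0;
    boolean quoted = false;
    for (char c : body.toCharArray()) {
      if (c == '"') quoted = !quoted;
      if (!quoted && c == '(') depth++;
      if (!quoted && c == ')') depth--;
      if (c == ',' && depth == 0 && !quoted) {
        parts.add(current.toString().trim());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    String last = current.toString().trim();
    if (!last.isEmpty()) parts.add(last);
    return parts;
  }

  // Turns the expression back into a normalized string form.
  @Override
  public String toString() {
    StringBuilder sb = new StringBuilder(functionName).append('(');
    for (int i = 0; i < parameters.size(); i++) {
      if (i > 0) sb.append(',');
      sb.append(parameters.get(i));
    }
    return sb.append(')').toString();
  }
}
```

Parsing the printed form again yields the same string, which is the property 
a parallelizing stream would rely on when it forwards an expression to workers 
as text.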

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>
> It would be beneficial to add an expression-based interface to Streaming API 
> described in SOLR-7082. Right now that API requires streaming requests to 
> come in from clients as serialized bytecode of the streaming classes. The 
> suggestion here is to support string expressions which describe the streaming 
> operations the client wishes to perform. 
> {code:java}
> search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
> {code}
> With this syntax in mind, one can now express arbitrarily complex stream 
> queries with a single string.
> {code:java}
> // merge two distinct searches together on common fields
> merge(
>   search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   on="a_f asc, a_s asc")
> // find top 20 unique records of a search
> top(
>   n=20,
>   unique(
>     search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
>     over="a_f desc"),
>   sort="a_f desc")
> {code}
> The syntax would support
> 1. Configurable expression names (e.g., via solrconfig.xml one can map "unique" 
> to a class implementing a Unique stream). This allows users to build 
> their own streams and use them as they wish.
> 2. Named parameters (of both simple and expression types)
> 3. Unnamed, type-matched parameters (to support requiring N streams as 
> arguments to another stream)
> 4. Positional parameters
> The main goal here is to make streaming as accessible as possible and define 
> a syntax for running complex queries across large distributed systems.






[jira] [Commented] (SOLR-7377) SOLR Streaming Expressions

2015-04-23 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510208#comment-14510208
 ] 

Dennis Gove commented on SOLR-7377:
---

Well, that makes it much clearer for me. I'm sorry for deleting all the older 
patches. From my reading of "How to Contribute" I was under the impression 
that uploading a new patch (with the same name) would just replace the old one. 
Each time I uploaded a new version I would see the previous one still there and 
figured I'd done something wrong, so I went ahead and deleted the old version. It 
didn't occur to me that the old versions would stay there but be 
greyed out. I won't be deleting the old versions going forward. Thanks for 
clearing that up for me!

The size of the patch is a function of a bit of package refactoring in the 
org.apache.solr.client.solrj.io package. This seems to be resulting in the diff 
showing a bunch of deleted/added files.

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>






[jira] [Commented] (SOLR-7377) SOLR Streaming Expressions

2015-04-24 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512165#comment-14512165
 ] 

Dennis Gove commented on SOLR-7377:
---

I was thinking that all comparators, no matter their implemented comparison 
logic, return one of three basic values when comparing A and B. 

1. A and B are logically equal to each other
2. A is logically before B
3. A is logically after B

The implemented comparison logic is then wholly dependent on what one might be 
intending to use the comparator for. For example, EqualToComparator's 
implemented comparison logic will return that A and B are logically equal if 
they are in fact equal to each other. Its logically before/after response 
depends on the sort order (ascending or descending) but is basically deciding 
if A is less than B or if A is greater than B.

One could, if they wanted to, create a comparator returning that two dates are 
logically equal to each other if they occur within the same week. Or a 
comparator returning that two numbers are logically equal if their values are 
within the same logarithmic order of magnitude. So on and so forth.

My thinking is that a comparator determines the logical comparison, and the 
code using it makes no assumption about how that logic is implemented. This 
leaves open the possibility of implementing other comparators as situations arise.
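To make the "same week" example concrete, here is a minimal sketch of such a 
comparator. SameWeekComparator and weekKey are hypothetical names for this 
illustration, not classes from the patch, and the sketch assumes ISO weeks 
(Monday through Sunday) as the definition of "same week".

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.Comparator;

// Hypothetical comparator: two dates are logically equal when they fall in the
// same ISO week; otherwise the result depends on the requested sort order.
final class SameWeekComparator implements Comparator<LocalDate> {
  private final boolean ascending;

  SameWeekComparator(boolean ascending) {
    this.ascending = ascending;
  }

  // Maps a date to a sortable key: week-based year * 100 + ISO week number.
  static int weekKey(LocalDate date) {
    return date.get(IsoFields.WEEK_BASED_YEAR) * 100
        + date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
  }

  // Returns 0 (logically equal), negative (A before B), or positive (A after B).
  @Override
  public int compare(LocalDate a, LocalDate b) {
    int result = Integer.compare(weekKey(a), weekKey(b));
    return ascending ? result : -result;
  }
}
```

A stream that groups or de-duplicates on "logically equal" tuples could use 
this comparator without knowing that equality here means "same week".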

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch
>
>






[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: SOLR-7377.patch

Made ParallelStream an ExpressibleStream and modified the StreamHandler to 
accept stream expression strings instead of bytecode.

Refactored operation and operand into functionName and parameter. And 
refactored all required references and tangentially related variable/class 
names.

Renamed EqualToComparator to FieldComparator to be a little more descriptive in 
the name.

Added ability to support pluggable streams by making it something you can 
configure in solrconfig.xml.
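The idea of pluggable streams can be sketched as a lookup from function name 
to a factory. FunctionRegistry below is a hypothetical stand-in written for 
this illustration; how the patch actually reads the mapping from 
solrconfig.xml is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry: maps an expression function name (e.g. "unique")
// to a factory that builds the object implementing it.
final class FunctionRegistry<T> {
  private final Map<String, Supplier<? extends T>> factories = new HashMap<>();

  // Registers a factory under a function name; returns this for chaining.
  FunctionRegistry<T> register(String functionName, Supplier<? extends T> factory) {
    factories.put(functionName, factory);
    return this;
  }

  // Builds a new instance for the given function name, or fails if unknown.
  T create(String functionName) {
    Supplier<? extends T> factory = factories.get(functionName);
    if (factory == null) {
      throw new IllegalArgumentException("Unknown function: " + functionName);
    }
    return factory.get();
  }
}
```

With a registry like this, adding a user-defined stream is a matter of one 
more register call, driven by whatever configuration source is in use.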

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch
>
>






[jira] [Comment Edited] (SOLR-7377) SOLR Streaming Expressions

2015-04-25 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512791#comment-14512791
 ] 

Dennis Gove edited comment on SOLR-7377 at 4/26/15 12:29 AM:
-

Made ParallelStream an ExpressibleStream and modified the StreamHandler to 
accept stream expression strings instead of bytecode.

Refactored operation and operand into functionName and parameter. And 
refactored all required references and tangentially related variable/class 
names.

Renamed EqualToComparator to FieldComparator to be a little more descriptive in 
the name.

Added ability to support pluggable streams by making it something you can 
configure in solrconfig.xml.

All stream-related tests pass. At this point I'd consider this functionally 
complete. 


was (Author: dpgove):
Made ParallelStream an ExpressibleStream and modified the StreamHandler to 
accept stream expression strings instead of bytecode.

Refactored operation and operand into functionName and parameter. And 
refactored all required references and tangentially related variable/class 
names.

Renamed EqualToComparator to FieldComparator to be a little more descriptive in 
the name.

Added ability to support pluggable streams by making it something you can 
configure in solrconfig.xml.

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch
>
>






[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

2015-04-28 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Attachment: SOLR-7377.patch

Fixed a bug in CloudSolrStream when handling aliases. When filtering out the 
stream-only parameters from those that need to be passed to Solr for the query, I 
was checking for the parameter name "alias" when I should have been checking for 
"aliases".
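The kind of filtering involved can be sketched as follows. StreamParamFilter 
and its STREAM_ONLY set are hypothetical and written only for this 
illustration; the real CloudSolrStream code and its full list of stream-only 
parameters are not reproduced here.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: separates parameters meant for the stream itself
// from those that should be forwarded to Solr as query parameters.
final class StreamParamFilter {
  // Assumption for this sketch: "aliases" is the only stream-only parameter.
  static final Set<String> STREAM_ONLY = Collections.singleton("aliases");

  // Returns a copy of the parameters without the stream-only entries.
  static Map<String, String> solrParams(Map<String, String> all) {
    Map<String, String> forwarded = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : all.entrySet()) {
      if (!STREAM_ONLY.contains(entry.getKey())) {
        forwarded.put(entry.getKey(), entry.getValue());
      }
    }
    return forwarded;
  }
}
```

A filter keyed on the wrong name, such as "alias", would let the real 
"aliases" parameter pass through to the query, which is the class of bug 
described above.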

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch
>
>






[jira] [Commented] (SOLR-7377) SOLR Streaming Expressions

2015-04-30 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522440#comment-14522440
 ] 

Dennis Gove commented on SOLR-7377:
---

I'm not totally against doing that but I feel like the refactoring is a 
required piece of this patch. I could, however, create a new ticket with just 
the refactoring and then make this one depend on that one. 

I am worried that such a ticket might look like unnecessary refactoring. 
Without the expression stuff added here I think the streaming stuff has a 
reasonable home in org.apache.solr.client.solrj.io.

That said, I certainly understand the benefit of smaller patches.

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, 
> SOLR-7377.patch
>
>






[jira] [Issue Comment Deleted] (SOLR-7377) SOLR Streaming Expressions

2015-04-30 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-7377:
--
Comment: was deleted

(was: I'm not totally against doing that but I feel like the refactoring is a 
required piece of this patch. I could, however, create a new ticket with just 
the refactoring and then make this one depend on that one. 

I am worried that such a ticket might look like unnecessary refactoring. 
Without the expression stuff added here I think the streaming stuff has a 
reasonable home in org.apache.solr.client.solrj.io.

That said, I certainly understand the benefit of smaller patches.)

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, 
> SOLR-7377.patch
>
>






[jira] [Commented] (SOLR-7377) SOLR Streaming Expressions

2015-04-30 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522439#comment-14522439
 ] 

Dennis Gove commented on SOLR-7377:
---

I'm not totally against doing that but I feel like the refactoring is a 
required piece of this patch. I could, however, create a new ticket with just 
the refactoring and then make this one depend on that one. 

I am worried that such a ticket might look like unnecessary refactoring. 
Without the expression stuff added here I think the streaming stuff has a 
reasonable home in org.apache.solr.client.solrj.io.

That said, I certainly understand the benefit of smaller patches.

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, 
> SOLR-7377.patch
>
>






[jira] [Commented] (SOLR-7377) SOLR Streaming Expressions

2015-05-02 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525364#comment-14525364
 ] 

Dennis Gove commented on SOLR-7377:
---

I don't see the ExpressionRunner in the patch - am I missing it somewhere? 
Also, I noticed ParallelStream lines 94-100 have some System.out.println lines. 
I suspect you intended to remove those.

Tests look good.

> SOLR Streaming Expressions
> --
>
> Key: SOLR-7377
> URL: https://issues.apache.org/jira/browse/SOLR-7377
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dennis Gove
>Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, 
> SOLR-7377.patch, SOLR-7377.patch
>
>






[jira] [Created] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-23 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-11283:
--

 Summary: Refactor all Stream Evaluators in solrj.io.eval to simplify them
 Key: SOLR-11283
 URL: https://issues.apache.org/jira/browse/SOLR-11283
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Dennis Gove
Assignee: Dennis Gove


As Stream Evaluators have been evolving we are seeing a need to better handle 
differing types of data within evaluators. For example, allowing some to 
evaluate over individual values or arrays of values, like
{code}
sin(a)
sin(a,b,c,d)
sin([a,b,c,d])
{code}

The current structure of Evaluators makes this difficult and repetitive work. 

Also, the hierarchy of classes behind evaluators can be confusing for 
developers creating new evaluators. For example, when to use a ComplexEvaluator 
vs a BooleanEvaluator.

A full refactoring of these classes will greatly enhance the usability and 
future evolution of evaluators.
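One way to picture an evaluator that accepts either individual values or 
arrays is the sketch below. SinSketch, evaluateOne and evaluateAll are 
hypothetical names, and the choice to treat sin(a,b,c,d) the same as 
sin([a,b,c,d]) is an assumption of this example rather than a statement about 
the refactored classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical evaluator sketch: one implementation of sin that works over
// a single number, several numbers, or a list of numbers.
final class SinSketch {
  // Applies sin to a number, or element-wise to a (possibly nested) list.
  static Object evaluateOne(Object value) {
    if (value instanceof Number) {
      return Math.sin(((Number) value).doubleValue());
    }
    if (value instanceof List) {
      List<Object> results = new ArrayList<>();
      for (Object item : (List<?>) value) {
        results.add(evaluateOne(item));
      }
      return results;
    }
    throw new IllegalArgumentException("Unsupported value: " + value);
  }

  // sin(a) yields a single value; sin(a,b,c,d) is handled like sin([a,b,c,d]).
  static Object evaluateAll(Object... values) {
    if (values.length == 1) {
      return evaluateOne(values[0]);
    }
    return evaluateOne(Arrays.asList(values));
  }
}
```

Keeping the scalar-versus-list dispatch in one shared place is the sort of 
repetition a common base class for evaluators could absorb.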



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-11283:
---
Attachment: SOLR-11283.patch

Full patch. All evaluator and stream related tests pass. Have not yet run full 
tests or precommit checks.

All evaluators are backward compatible in functionality and name/parameters, 
except for addAll which has been renamed to append.

> Refactor all Stream Evaluators in solrj.io.eval to simplify them
> 
>
> Key: SOLR-11283
> URL: https://issues.apache.org/jira/browse/SOLR-11283
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dennis Gove
>Assignee: Dennis Gove
> Attachments: SOLR-11283.patch
>
>






[jira] [Commented] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-23 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139070#comment-16139070
 ] 

Dennis Gove commented on SOLR-11283:


I should add, this patch has a bunch of new tests marked to fail with 
NotImplementedException. I intend to implement these tests before committing.

> Refactor all Stream Evaluators in solrj.io.eval to simplify them
> 
>
> Key: SOLR-11283
> URL: https://issues.apache.org/jira/browse/SOLR-11283
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dennis Gove
>Assignee: Dennis Gove
> Attachments: SOLR-11283.patch
>
>
> As Stream Evaluators have been evolving we are seeing a need to better handle 
> differing types of data within evaluators. For example, allowing some to 
> evaluate over individual values or arrays of values, like
> {code}
> sin(a)
> sin(a,b,c,d)
> sin([a,b,c,d])
> {code}
> The current structure of Evaluators makes this difficult and repetitive work. 
> Also, the hierarchy of classes behind evaluators can be confusing for 
> developers creating new evaluators. For example, when to use a 
> ComplexEvaluator vs a BooleanEvaluator.
> A full refactoring of these classes will greatly enhance the usability and 
> future evolution of evaluators.






[jira] [Updated] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-11283:
---
Attachment: SOLR-11283.patch

All stream related tests pass.

The following 3 tests failed but appear to be completely unrelated to my 
changes.

{code}
[junit4] Tests with failures [seed: EE1F61CDA04C3749]:
[junit4]   - org.apache.solr.cloud.HttpPartitionTest.test
[junit4]   - org.apache.solr.cloud.ForceLeaderTest.testReplicasInLIRNoLeader
[junit4]   -
org.apache.solr.update.processor.UpdateRequestProcessorFactoryTest.testUpdateDistribChainSkipping
{code}

When run individually those tests pass.

> Refactor all Stream Evaluators in solrj.io.eval to simplify them
> 
>
> Key: SOLR-11283
> URL: https://issues.apache.org/jira/browse/SOLR-11283
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dennis Gove
>Assignee: Dennis Gove
> Attachments: SOLR-11283.patch, SOLR-11283.patch
>
>
> As Stream Evaluators have been evolving we are seeing a need to better handle 
> differing types of data within evaluators. For example, allowing some to 
> evaluate over individual values or arrays of values, like
> {code}
> sin(a)
> sin(a,b,c,d)
> sin([a,b,c,d])
> {code}
> The current structure of Evaluators makes this difficult and repetitive work. 
> Also, the hierarchy of classes behind evaluators can be confusing for 
> developers creating new evaluators. For example, when to use a 
> ComplexEvaluator vs a BooleanEvaluator.
> A full refactoring of these classes will greatly enhance the usability and 
> future evolution of evaluators.






[jira] [Assigned] (SOLR-11145) Comprehensive Unit Tests for the Analytics Component

2017-10-13 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-11145:
--

Assignee: Dennis Gove

> Comprehensive Unit Tests for the Analytics Component
> 
>
> Key: SOLR-11145
> URL: https://issues.apache.org/jira/browse/SOLR-11145
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Critical
> Fix For: 7.0
>
>
> Adding comprehensive unit tests for the new Analytics Component.






[jira] [Assigned] (SOLR-11146) Analytics Component 2.0 Bug Fixes

2017-10-13 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-11146:
--

Assignee: Dennis Gove

> Analytics Component 2.0 Bug Fixes
> -
>
> Key: SOLR-11146
> URL: https://issues.apache.org/jira/browse/SOLR-11146
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
> Fix For: 7.0
>
>
> The new Analytics Component has several small bugs in mapping functions and 
> other places. This ticket is a fix for a large number of them. This patch 
> should allow all unit tests created in SOLR-11145 to pass.






[jira] [Created] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-12355:
--

 Summary: HashJoinStream's use of String::hashCode results in 
non-matching tuples being considered matches
 Key: SOLR-12355
 URL: https://issues.apache.org/jira/browse/SOLR-12355
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 6.0
Reporter: Dennis Gove
Assignee: Dennis Gove


The following strings have been found to have hashCode conflicts and as such 
can result in HashJoinStream treating two tuples with fields of these values 
as the same.


{code:java}
"MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode()
{code}
This means these two tuples are treated as the same if we're comparing on field "foo"
{code:java}
{
  "foo":"MG!!00TNGP::Mtge::"
}
{
  "foo":"MG!!00TNH1::Mtge::"
}
{code}
and these two tuples are treated as the same if we're comparing on fields "foo,bar"
{code:java}
{
  "foo":"MG!!00TNGP",
  "bar":"Mtge"
}
{
  "foo":"MG!!00TNH1",
  "bar":"Mtge"
}
{code}
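
The collision can be checked with a few lines of plain Java. The sketch below is not HashJoinStream code; HashCollisionSketch and indexByHash are hypothetical names, and the map keyed by hashCode is only a stand-in for any lookup that relies on the hash alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not taken from HashJoinStream.
class HashCollisionSketch {

  // Groups values by hashCode alone, so distinct strings with equal hashes share a bucket.
  static Map<Integer, List<String>> indexByHash(String... values) {
    Map<Integer, List<String>> index = new HashMap<>();
    for (String value : values) {
      index.computeIfAbsent(value.hashCode(), k -> new ArrayList<>()).add(value);
    }
    return index;
  }

  public static void main(String[] args) {
    String a = "MG!!00TNGP::Mtge::";
    String b = "MG!!00TNH1::Mtge::";
    System.out.println(a.equals(b));                  // prints false
    System.out.println(a.hashCode() == b.hashCode()); // prints true
    System.out.println(indexByHash(a, b).size());     // prints 1
  }
}
```

The two strings differ only in "GP" versus "H1", and String::hashCode gives both pairs the same contribution (71*31+80 and 72*31+49 are both 2281), which is why the overall hashes match.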



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475055#comment-16475055
 ] 

Dennis Gove commented on SOLR-12355:


I have a fix for this where instead of calculating the string value's hashCode 
we just use the string value as the key in the hashed set of tuples. I'm 
creating a few test cases to verify this gives us what we want.
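
A minimal sketch of that idea follows, assuming tuples are modeled as plain maps. It is not the patch itself: StringKeyJoinSketch, key and hashJoin are hypothetical names, and the "::" separator is an assumption taken from the shape of the strings in the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the SOLR-12355 patch.
class StringKeyJoinSketch {

  // Builds the lookup key from the field values themselves rather than from their hashCode.
  static String key(Map<String, String> tuple, List<String> fields) {
    StringBuilder sb = new StringBuilder();
    for (String field : fields) {
      sb.append(tuple.get(field)).append("::");
    }
    return sb.toString();
  }

  // Indexes the right side by string key, then probes it with each left tuple.
  static List<Map<String, String>> hashJoin(List<Map<String, String>> left,
                                            List<Map<String, String>> right,
                                            List<String> fields) {
    Map<String, List<Map<String, String>>> index = new HashMap<>();
    for (Map<String, String> r : right) {
      index.computeIfAbsent(key(r, fields), k -> new ArrayList<>()).add(r);
    }
    List<Map<String, String>> out = new ArrayList<>();
    for (Map<String, String> l : left) {
      List<Map<String, String>> matches =
          index.getOrDefault(key(l, fields), Collections.<Map<String, String>>emptyList());
      for (Map<String, String> r : matches) {
        Map<String, String> merged = new HashMap<>(r);
        merged.putAll(l); // left values win when both sides carry the same field
        out.add(merged);
      }
    }
    return out;
  }
}
```

Because the map compares keys with equals as well as hashCode, two strings that merely share a hash no longer match. A separator-based key can still be ambiguous if a field value itself contains the separator, so this sketch assumes it does not.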

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream treating two tuples with fields of these 
> values as the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}






[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475060#comment-16475060
 ] 

Dennis Gove commented on SOLR-12355:


This also impacts OuterHashJoinStream.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream treating two tuples with fields of these 
> values as the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}






[jira] [Updated] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-15 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-12355:
---
Attachment: SOLR-12355.patch

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream treating two tuples with fields of these 
> values as the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}






[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-15 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475973#comment-16475973
 ] 

Dennis Gove commented on SOLR-12355:


Initial patch attached. I have not yet run the full suite of tests against this.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream treating two tuples with fields of these 
> values as the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}






[jira] [Updated] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-18 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-12355:
---
Attachment: SOLR-12355.patch

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch, SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream treating two tuples with fields of these 
> values as the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}






[jira] [Resolved] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-18 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-12355.

   Resolution: Fixed
Fix Version/s: 7.4

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12355.patch, SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream treating two tuples with fields of these 
> values as the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}






[jira] [Commented] (SOLR-10512) Innerjoin streaming expressions - Invalid JoinStream error

2018-03-09 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392851#comment-16392851
 ] 

Dennis Gove commented on SOLR-10512:


It was certainly designed such that the left field in the on clause is the 
field from the first incoming stream and the right field in the on clause is 
the field from the second incoming stream. If that is not occurring then this 
is a very clear bug.
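
To illustrate the intended mapping, here is a minimal, self-contained sketch of a merge join in which the left field of the on clause is read from the first stream and the right field from the second. It is not the JoinStream code: MergeJoinSketch and innerJoin are made-up names, tuples are modeled as plain maps, and both inputs are assumed to be sorted ascending on their join field with no missing values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the Solr JoinStream implementation.
class MergeJoinSketch {

  // leftField is looked up only in left tuples, rightField only in right tuples.
  static List<Map<String, String>> innerJoin(List<Map<String, String>> left, String leftField,
                                             List<Map<String, String>> right, String rightField) {
    List<Map<String, String>> out = new ArrayList<>();
    int l = 0;
    int r = 0;
    while (l < left.size() && r < right.size()) {
      String key = left.get(l).get(leftField);
      int cmp = key.compareTo(right.get(r).get(rightField));
      if (cmp < 0) {
        l++; // left is behind, advance it
      } else if (cmp > 0) {
        r++; // right is behind, advance it
      } else {
        // Emit this left tuple against the whole run of equal right tuples.
        for (int rr = r; rr < right.size() && key.equals(right.get(rr).get(rightField)); rr++) {
          Map<String, String> merged = new HashMap<>(right.get(rr));
          merged.putAll(left.get(l));
          out.add(merged);
        }
        l++; // r stays at the start of the run in case the next left tuple has the same key
      }
    }
    return out;
  }
}
```

In terms of the example in the ticket, on="id=id_book_s" would correspond to innerJoin(books, "id", reviews, "id_book_s"), with books sorted on id and reviews sorted on id_book_s.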

> Innerjoin streaming expressions - Invalid JoinStream error
> --
>
> Key: SOLR-10512
> URL: https://issues.apache.org/jira/browse/SOLR-10512
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 6.4.2, 6.5
> Environment: Debian Jessie
>Reporter: Dominique Béjean
>Priority: Major
>
> It looks like innerJoin streaming expressions do not work as explained in the 
> documentation. An invalid JoinStream error occurs.
> {noformat}
> curl --data-urlencode 'expr=innerJoin(
> search(books, 
>q="*:*", 
>fl="id", 
>sort="id asc"),
> searchreviews, 
>q="*:*", 
>fl="id_book_s", 
>sort="id_book_s asc"), 
> on="id=id_books_s"
> )' http://localhost:8983/solr/books/stream
>   
> {"result-set":{"docs":[{"EXCEPTION":"Invalid JoinStream - all incoming stream 
> comparators (sort) must be a superset of this stream's 
> equalitor.","EOF":true}]}}   
> {noformat}
> It is totally similar to the documentation example
> 
> {noformat}
> innerJoin(
>   search(people, q=*:*, fl="personId,name", sort="personId asc"),
>   search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"),
>   on="personId=ownerId"
> )
> {noformat}
> Queries on each collection give :
> {noformat}
> $ curl --data-urlencode 'expr=search(books, 
>q="*:*", 
>fl="id, title_s, pubyear_i", 
>sort="pubyear_i asc", 
>qt="/export")' 
> http://localhost:8983/solr/books/stream
> {
>   "result-set": {
> "docs": [
>   {
> "title_s": "Friends",
> "pubyear_i": 1994,
> "id": "book2"
>   },
>   {
> "title_s": "The Way of Kings",
> "pubyear_i": 2010,
> "id": "book1"
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 16
>   }
> ]
>   }
> }
> $ curl --data-urlencode 'expr=search(reviews, 
>q="author_s:d*", 
>fl="id, id_book_s, stars_i, review_dt", 
>sort="id_book_s asc", 
>qt="/export")' 
> http://localhost:8983/solr/reviews/stream
>  
> {
>   "result-set": {
> "docs": [
>   {
> "stars_i": 3,
> "id": "book1_c2",
> "id_book_s": "book1",
> "review_dt": "2014-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book1_c3",
> "id_book_s": "book1",
> "review_dt": "2014-12-15T12:00:00Z"
>   },
>   {
> "stars_i": 3,
> "id": "book2_c2",
> "id_book_s": "book2",
> "review_dt": "1994-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book2_c3",
> "id_book_s": "book2",
> "review_dt": "1994-12-15T12:00:00Z"
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 47
>   }
> ]
>   }
> }
> {noformat}
> After more tests, I just had to invert the "on" clause to make it work
> {noformat}
> curl --data-urlencode 'expr=innerJoin(
> search(books, 
>q="*:*", 
>fl="id", 
>sort="id asc"),
> searchreviews, 
>q="*:*", 
>fl="id_book_s", 
>sort="id_book_s asc"), 
> on="id_books_s=id"
> )' http://localhost:8983/solr/books/stream
> 
> {
>   "result-set": {
> "docs": [
>   {
> "title_s": "The Way of Kings",
> "pubyear_i": 2010,
> "stars_i": 5,
> "id": "book1",
>

[jira] [Commented] (SOLR-11914) Remove/move questionable SolrParams methods

2018-04-16 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440179#comment-16440179
 ] 

Dennis Gove commented on SOLR-11914:


I agree with [~dsmiley] - the code in the streaming classes appears to be an 
oddly round-about way of doing things. The changes you've made here appear to 
be a much better approach.

> Remove/move questionable SolrParams methods
> ---
>
> Key: SOLR-11914
> URL: https://issues.apache.org/jira/browse/SOLR-11914
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: David Smiley
>Priority: Minor
>  Labels: newdev
> Attachments: SOLR-11914.patch
>
>
> {{Map getAll(Map sink, Collection 
> params)}} 
> Is only used by the CollectionsHandler, and has particular rules about how it 
> handles multi-valued data that make it not very generic, and thus I think 
> doesn't belong here.  Furthermore, the existence of this method is confusing 
> in that it gives the user another choice to weigh against toMap (there 
> are two overloaded variants).
> {{SolrParams toFilteredSolrParams(List names)}}
> Is only called in one place, and something about it bothers me, perhaps just 
> the name or that it ought to be a view maybe.
> {{static Map toMap(NamedList params)}}
> Isn't used and I don't like it; it doesn't even involve a SolrParams!  Legacy 
> of 2006.
> {{static Map toMultiMap(NamedList params)}}
> It doesn't even involve a SolrParams! Legacy of 2006 with some updates since. 
> Used in some places. Perhaps should be moved to NamedList as an instance 
> method.






[jira] [Resolved] (SOLR-11924) Add the ability to watch collection set changes in ZkStateReader

2018-04-17 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-11924.

   Resolution: Fixed
 Assignee: Dennis Gove
Fix Version/s: (was: master (8.0))

> Add the ability to watch collection set changes in ZkStateReader
> 
>
> Key: SOLR-11924
> URL: https://issues.apache.org/jira/browse/SOLR-11924
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.4, master (8.0)
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Minor
> Fix For: 7.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Allow users to watch when the set of collections for a cluster is changed. 
> This is useful if a user is trying to discover collections within a cloud.






[jira] [Assigned] (SOLR-12271) Analytics Component reads negative float and double field values incorrectly

2018-05-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-12271:
--

Assignee: Dennis Gove

> Analytics Component reads negative float and double field values incorrectly
> 
>
> Key: SOLR-12271
> URL: https://issues.apache.org/jira/browse/SOLR-12271
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.3.1, 7.4, master (8.0)
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4, master (8.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the analytics component uses an incorrect method of converting 
> numeric doc value longs to doubles and floats.
> The fix is easy and the tests now cover this use case.
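
The ticket does not say which conversion was wrong. One plausible reading, given that only negative values were affected, is that the stored long used a sortable encoding (of the kind Lucene's NumericUtils provides) but was decoded with a plain bit cast. The sketch below demonstrates that failure mode in plain Java under that assumption; SortableDoubleSketch and its methods are hypothetical names, not code from the Analytics Component.

```java
// Hypothetical sketch of a sortable encoding for doubles; an assumption about the
// failure mode, not code from the Analytics Component.
class SortableDoubleSketch {

  // Flips the lower 63 bits when the sign bit is set; applying it twice restores the input.
  static long flipIfNegative(long bits) {
    return bits ^ ((bits >> 63) & 0x7fffffffffffffffL);
  }

  // Encodes a double so that comparing the longs orders the values like the doubles.
  static long encode(double value) {
    return flipIfNegative(Double.doubleToLongBits(value));
  }

  // Correct decode: undo the flip before reinterpreting the bits.
  static double decode(long stored) {
    return Double.longBitsToDouble(flipIfNegative(stored));
  }

  // Naive decode: reinterprets the stored bits directly, which is only right for non-negative values.
  static double decodeNaively(long stored) {
    return Double.longBitsToDouble(stored);
  }

  public static void main(String[] args) {
    System.out.println(decode(encode(-1.5)));        // prints -1.5
    System.out.println(decodeNaively(encode(1.5)));  // prints 1.5
    System.out.println(decodeNaively(encode(-1.5)) == -1.5); // prints false
  }
}
```

The same pattern applies to floats with 32-bit values. Positive values round-trip either way, which would explain why such a bug shows up only for negative field values.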






[jira] [Resolved] (SOLR-12271) Analytics Component reads negative float and double field values incorrectly

2018-05-30 Thread Dennis Gove (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-12271.

Resolution: Fixed

> Analytics Component reads negative float and double field values incorrectly
> 
>
> Key: SOLR-12271
> URL: https://issues.apache.org/jira/browse/SOLR-12271
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.4, master (8.0)
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4, master (8.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the analytics component uses an incorrect method of converting 
> numeric doc value longs to doubles and floats.
> The fix is easy and the tests now cover this use case.





