[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-17 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-11444:

Attachment: SOLR-11444.patch

Updated patch.  I think it's ready.

* double-resolve of an alias.  This used to be unsupported by both JoinQParser 
and streaming expressions, but now it works since I put this logic in 
Aliases.java where it can be shared.  I added some TODOs about this feature 
being dubious.
* ClusterStateProvider: renamed getAlias to resolveAlias and changed its 
semantics to return the input if it is not an alias.  The extra alias 
indirection happens here (new).
* Aliases.java: decided to remove a convenience method I added in the last 
patch.  I also changed one of the newer methods I added to be resolveAlias, 
with the same semantics as the one in ClusterStateProvider.
* SolrTestCaseJ4: new getCloudSolrClient(MiniSolrCloudCluster cluster) to 
randomly pick a cluster state provider based on either ZK or HTTP.  FYI 
[~ichattopadhyaya].  Perhaps MSCC.getClient's impl should random().usually() do 
what it does now (it's probably fastest) and otherwise use the HTTP provider 
one (perhaps slower?)?
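
To make the resolveAlias semantics described above concrete, here is a minimal sketch.  It is not the code from the patch: the class name, the plain Map standing in for the alias registry, and the choice to apply the extra indirection to each element of a comma delimited target are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only; a plain Map stands in for the real alias registry. */
final class AliasResolutionSketch {

  /**
   * Returns the target of the alias, or the input unchanged when the input is
   * not an alias.  Each element of the target list is looked up once more,
   * which models the single extra level of alias indirection.
   */
  static String resolveAlias(Map<String, String> aliases, String name) {
    String target = aliases.get(name);
    if (target == null) {
      return name; // not an alias: hand the input back
    }
    List<String> resolved = new ArrayList<>();
    for (String part : target.split(",")) {
      String second = aliases.get(part); // assumed: one more hop, no recursion
      resolved.add(second != null ? second : part);
    }
    return String.join(",", resolved);
  }
}
```

A caller no longer needs a null check to tell "not an alias" apart from "alias", which is the point of returning the input.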

Note: some streaming expressions code and CloudSolrClient's HTTP cluster 
state provider are coded in a way that will probably be wrong or break if an 
alias and a collection have the same name.  This is a pre-existing problem I 
observed, not a change induced by this patch. 

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR-11444.patch, SOLR-11444.patch, 
> SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
>
>
> While starting to look at SOLR-11299 I noticed some brittleness in 
> assumptions about Strings that refer to a collection.  Sometimes they are in 
> fact references to comma separated lists, which appears to have been added 
> with the introduction of collection aliases (an alias can refer to a comma 
> delimited list).  So Java's type system kind of goes out the window when we 
> do this.  In one case this leads to a bug -- CloudSolrClient will throw an 
> NPE if you try to write to such an alias.  Sending an update via HTTP will 
> allow it and send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to 
> Aliases.java plus certain key spots that deal with collection references.  I 
> don't think I want to go as far as changing the public SolrJ API except to 
> add documentation on what's possible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-15 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-11444:
--
Attachment: SOLR-11444.patch

Oops, attached the wrong patch last time.

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR-11444.patch, SOLR_11444_Aliases.patch, 
> SOLR_11444_Aliases.patch



[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-15 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-11444:
--
Attachment: (was: SOLR-11488.patch)

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch



[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-15 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-11444:
--
Attachment: SOLR-11488.patch

This is my attempt to reconcile recent changes by [~ab] with this JIRA; there 
were some merge conflicts.

Note, there is one /nocommit in HttpSolrCall that I'd like David to take a 
glance at to see if it makes sense.  I was getting an AIOOB there because the 
path was just "/select" and idx == -1, so path.substring(idx) failed.
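
For reference, String.substring with a negative index throws StringIndexOutOfBoundsException, so the idx == -1 case needs an explicit guard.  The sketch below shows one way to guard it.  How idx is actually computed in HttpSolrCall is not shown in this thread, so the indexOf call, the class name, and the method name here are assumptions.

```java
/** Illustrative only; not the HttpSolrCall code. */
final class PathSuffixSketch {

  /**
   * Returns the handler part of a path such as "/collection1/select".
   * When the path has no second slash (for example just "/select"),
   * the whole path is returned instead of calling substring(-1).
   */
  static String handlerPath(String path) {
    int idx = path.indexOf('/', 1); // assumed way of locating the second slash
    if (idx < 0) {
      return path; // nothing precedes the handler, so the path is the handler
    }
    return path.substring(idx);
  }
}
```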

I'm going to use this as a base for beefing up tests in SOLR-11218.

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch



[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-11 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-11444:

Attachment: SOLR_11444_Aliases.patch

New patch.  All existing tests pass.  Probably ready to commit but would love a 
review on some points.

New behavior: collection references in the URL path can now be comma delimited 
lists, just as is already possible with the little-known {{collection}} 
parameter.  Thus you can now do 
{{http://localhost:8983/solr/collection1,collection2/select?...}}.  The point 
of this is more consistent treatment of both options, which in turn makes the 
code that processes them simpler and more maintainable, and removes gotcha 
edge-cases that were present.  I propose that in 8.0 this {{collection}} 
parameter be purely internal (or removed entirely?), and thus not supported in 
SolrJ, since I think it's needless -- similar to {{qt}}.
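
As a rough illustration of parsing a comma delimited collection reference out of the URL path, here is a sketch.  It is not the HttpSolrCall implementation: the class and method names are hypothetical, and it assumes the webapp prefix ("/solr") has already been stripped so the path starts with the collection reference.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; names are hypothetical. */
final class CollectionPathSketch {

  /**
   * Extracts the collection names from a path such as
   * "/collection1,collection2/select".  The first path segment is treated
   * as a comma delimited list; blank entries are skipped.
   */
  static List<String> collectionsFromPath(String path) {
    int start = path.startsWith("/") ? 1 : 0;
    int end = path.indexOf('/', start);
    String ref = end < 0 ? path.substring(start) : path.substring(start, end);
    List<String> names = new ArrayList<>();
    for (String part : ref.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        names.add(trimmed);
      }
    }
    return names;
  }
}
```

A single-collection path yields a one-element list, so callers can treat both forms uniformly.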

CloudSolrClient
* {{request()}}: The {{collection}} parameter now takes top precedence over 
the collection argument to the method.  Although it might seem this is a 
break in semantics, I doubt it is, since the code I replaced in this class (in 
{{sendRequest()}}) used to compose the URL to Solr differently depending on 
whether a {{collection}} parameter was present.  After all, HttpSolrCall (Solr 
side) considers {{collection}} first (assuming the path isn't a core name).  
FYI [~markrmil...@gmail.com]
* New: you can now index (update) documents to an alias (or collection list) 
that references more than one collection.  It's routed to the first in the 
list. This change matches Solr's existing behavior (as implemented by 
HttpSolrCall).
* {{sendRequest()}}: improved clarity of gathering the URL list; no intended 
change in behavior.
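
The two CloudSolrClient rules above (the {{collection}} parameter wins over the method argument, and updates go to the first collection in the list) can be sketched as follows.  This is not the CloudSolrClient code: the class and method names are hypothetical, and the sketch assumes any alias has already been resolved to its comma delimited list.

```java
/** Illustrative only; not the CloudSolrClient implementation. */
final class UpdateRoutingSketch {

  /** The "collection" request parameter, when present, beats the method argument. */
  static String effectiveCollection(String collectionParam, String methodArg) {
    boolean hasParam = collectionParam != null && !collectionParam.isEmpty();
    return hasParam ? collectionParam : methodArg;
  }

  /** An update against a comma delimited reference is routed to the first entry. */
  static String updateTarget(String collectionRef) {
    int comma = collectionRef.indexOf(',');
    return comma < 0 ? collectionRef : collectionRef.substring(0, comma);
  }
}
```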

HttpSolrCall & V2HttpCall  (FYI [~noble.paul])
* Most changes are just a refactor to improve the code.
* Collections in the path are parsed comma-delimited now to be consistent with 
{{collection}} param.
* {{getAuthCtx()}}: Now trusts/honors {{collectionList}} when present, instead 
of duplicating or adding special case logic of how to detect the collections, 
thus easier to maintain.  [~anshumg] do you think this is fine?

AliasIntegrationTest
* Updated to ensure we more thoroughly tested all the ways that one can refer 
to collection lists and aliases. This includes comma delimited collection 
references in the URL path now.
* Test indexing with CloudSolrClient to multi-collection alias.

ClusterStateProvider
* Simplified a bit, removing one method.  FYI [~ichattopadhyaya].  Perhaps 
instead of keeping getAlias and removing getCollectionName, the reverse could 
be done?  I dunno, I could go either way.  There is a caller that specifically 
wants to know whether the name was alias-resolved, which would be awkward to 
detect with getCollectionName.

SQL handler, SolrSchema
* getTableMap: instead of attempting to expand the alias to its target 
collection, simply pretend the alias is itself a table/collection.  I believe 
this should work, whereas the code it replaces incorrectly assumed that an 
alias maps to one collection when in fact it's (potentially) a comma delimited 
list -- and I believe the related code in streaming expressions doesn't support 
collection references that are comma delimited.  That could be added, but I 
left that as a TODO.  FYI [~risdenk]
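
The idea of exposing an alias as a table under its own name, rather than expanding it, might look roughly like this.  It is only a sketch of the concept: the real getTableMap builds schema table objects, and the class name, method name, and use of plain string sets here are assumptions.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only; not the SolrSchema code. */
final class TableNamesSketch {

  /**
   * Builds the set of table names: every collection, plus every alias under
   * its own name.  The alias target (possibly a comma delimited list) is
   * never inspected here.
   */
  static Set<String> tableNames(Collection<String> collections, Collection<String> aliasNames) {
    Set<String> names = new TreeSet<>(collections);
    names.addAll(aliasNames); // an alias is treated as if it were a collection
    return names;
  }
}
```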


> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch

[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-06 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-11444:

Attachment: SOLR_11444_Aliases.patch

This patch is a WIP; I know I broke something and I'm working out what it was.

Some random notes:
* CloudSolrClient: send write to 1st in alias list
* More consistently use StrUtils.splitSmart instead of String.split
* [~joel.bernstein] {{TupleStream.getSlices}} looks identical to 
{{CloudSolrClient.getSlices}}.  Why did you copy code and commit it unmodified? 
 Perhaps there is more duplicated code; I didn't check. 
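
To show the kind of edge cases that make a bare String.split brittle for collection lists, here is a small sketch.  The helper below is hypothetical and is not how StrUtils.splitSmart is implemented; it only illustrates why routing all splitting through one shared utility helps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; a hypothetical helper, not StrUtils.splitSmart. */
final class CollectionListSplitSketch {

  /** Splits on commas, trims each entry, and skips blank entries. */
  static List<String> splitNames(String ref) {
    List<String> names = new ArrayList<>();
    for (String part : ref.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        names.add(trimmed);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Plain String.split keeps interior empty strings but drops trailing ones.
    System.out.println(Arrays.toString("a,,b,,".split(","))); // prints [a, , b]
    System.out.println(splitNames(" a,,b,, ")); // prints [a, b]
  }
}
```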



> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch