[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-08-04 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653993#comment-14653993
 ] 

Scott Blum commented on SOLR-5756:
--

1) I don't understand.  Isn't `if (children == null || children.isEmpty()) {` 
already the null check?

2) constructState() actually updates the field as a side effect, and it's the 
ONLY place I'm ever writing to the field.  I put it inside the method because I 
was just having to do clusterState = constructState() at every call site.  I 
can outline it if you think it's unclear, or I can rename the method.  I was 
just sticking with the old name, but maybe that doesn't make sense now.  
"updateView" maybe?

3) I actually intentionally left the live node and alias code alone, even 
though I have the strong urge to refactor it to match the new patterns I'm 
creating, since it wasn't directly related to the change.  That said, I would 
be totally happy to create a LiveNodeWatcher along the lines of StateWatcher 
(which would enable shared code) if you think I should go ahead with that!

4) Awesome, I was hoping for some guidance on this.  For "shared" clusterstate, 
should I use "shared" or "legacyFormat"?

As far as testing goes, I'll hit you up on IRC for some pointers on writing 
this.  Ideally, could you point me at an existing test I could start from?


> A utility API to move collections from internal to external
> ---
>
> Key: SOLR-5756
> URL: https://issues.apache.org/jira/browse/SOLR-5756
> Project: Solr
>  Issue Type: Bug
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5756-trunk.patch, SOLR-5756-vs-5.2.1.patch
>
>
> SOLR-5473 allows creation of collection with state stored outside of 
> clusterstate.json. We would need an API to  move existing 'internal' 
> collections outside



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-08-04 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653962#comment-14653962
 ] 

Shalin Shekhar Mangar commented on SOLR-5756:
-

Thanks Scott. A few more comments:

# Need a null check for "children" in ZkStateReader.refreshLazyCollections()
# I don't see where the clusterState object in ZkStateReader is initialized? 
The createClusterStateWatchersAndUpdate() method just calls constructState(new 
HashSet<>(liveNodes)); but doesn't set its return value to the clusterState 
variable.
# Nit-pick: It looks like the code for updating live nodes in 
ZkStateReader.createClusterStateWatchersAndUpdate() can use updateLiveNodes() 
directly (with a watcher)?
# Nit-pick: There was a lot of controversy on naming collections as "external" 
when this was originally implemented in SOLR-5473. In the interest of avoiding 
controversy and bike-shedding, can you rename "external" to "stateFormat2" 
wherever possible? e.g. "externalCollectionExists" to "isStateFormat2()" etc.

I think now we are in a position to write a test which:
# Creates a collection in shared cluster state
# Creates a state.json for the same collection
# Mutates state in shared cluster state and asserts that those changes are not 
visible in ZkStateReader
# Deletes old cluster state and ensures that everything still works fine.
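
The four test steps above can be walked through against a toy in-memory model. This is only a sketch: `MigrationScenario`, `FakeZk`, and `visibleState` are hypothetical names, plain strings stand in for collection state, and a real test would run against ZooKeeper and ZkStateReader rather than two maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the proposed test; not an actual Solr test. */
final class MigrationScenario {
    /** Two maps standing in for clusterstate.json and the per-collection state.json nodes. */
    static final class FakeZk {
        final Map<String, String> sharedClusterState = new HashMap<>();
        final Map<String, String> stateJson = new HashMap<>();
    }

    /** Assumed reader behaviour: once a state.json exists, the shared entry is ignored. */
    static String visibleState(FakeZk zk, String collection) {
        String split = zk.stateJson.get(collection);
        return split != null ? split : zk.sharedClusterState.get(collection);
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }

    /** Runs the four steps and returns the state visible at the end. */
    static String run() {
        FakeZk zk = new FakeZk();

        // Step 1: create the collection in shared cluster state.
        zk.sharedClusterState.put("c1", "v1");
        check("v1".equals(visibleState(zk, "c1")), "collection should be visible from shared state");

        // Step 2: create a state.json for the same collection.
        zk.stateJson.put("c1", "v1");

        // Step 3: mutate shared state; the change must not be visible to readers.
        zk.sharedClusterState.put("c1", "v2-shared-only");
        check("v1".equals(visibleState(zk, "c1")), "shared-state mutation leaked into the view");

        // Step 4: delete the old shared entry; everything should still work.
        zk.sharedClusterState.remove("c1");
        check("v1".equals(visibleState(zk, "c1")), "collection lost after removing shared entry");

        return visibleState(zk, "c1");
    }
}
```

Calling `MigrationScenario.run()` throws if any step's assertion fails in the model; the interesting part in a real test is step 3, which needs the reader to have already switched over to the state.json.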

> A utility API to move collections from internal to external
> ---
>
> Key: SOLR-5756
> URL: https://issues.apache.org/jira/browse/SOLR-5756
> Project: Solr
>  Issue Type: Bug
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5756-trunk.patch, SOLR-5756-trunk.patch, 
> SOLR-5756-vs-5.2.1.patch
>
>
> SOLR-5473 allows creation of collection with state stored outside of 
> clusterstate.json. We would need an API to  move existing 'internal' 
> collections outside






[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-08-03 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652401#comment-14652401
 ] 

Scott Blum commented on SOLR-5756:
--

Okay, per discussion on IRC let me summarize the big glaring problem with the 
existing patch:

I'm trying to use clusterState to do two very different things:
1) Serve as an in-memory copy of the contents of clusterstate.json
2) Serve as the "amalgam" result of combining shared clusterstate.json with all 
of the non-shared state.json.

I need to explicitly break this apart into two different fields.
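
One way the split into two fields could look, as a rough sketch only: `ClusterStateView`, `legacyCollections`, `stateFormat2Collections`, and `mergedView` are hypothetical names, and plain strings stand in for collection state. It is not the code from the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of keeping the raw copy and the merged view apart. */
final class ClusterStateView {
    // Field 1: in-memory copy of the contents of clusterstate.json, nothing else.
    private Map<String, String> legacyCollections = new HashMap<>();

    // Contents of the individual state.json nodes that are being tracked.
    private final Map<String, String> stateFormat2Collections = new HashMap<>();

    // Field 2: the "amalgam" handed out to callers; rebuilt on every change.
    private volatile Map<String, String> mergedView = Collections.emptyMap();

    /** Called when clusterstate.json changes. */
    synchronized void onSharedStateChanged(Map<String, String> newContents) {
        legacyCollections = new HashMap<>(newContents);
        rebuildView();
    }

    /** Called when one collection's state.json changes; null means it was removed. */
    synchronized void onCollectionStateChanged(String collection, String state) {
        if (state == null) {
            stateFormat2Collections.remove(collection);
        } else {
            stateFormat2Collections.put(collection, state);
        }
        rebuildView();
    }

    // The only place mergedView is written.
    private void rebuildView() {
        Map<String, String> merged = new HashMap<>(legacyCollections);
        merged.putAll(stateFormat2Collections); // state.json wins on conflict
        mergedView = Collections.unmodifiableMap(merged);
    }

    Map<String, String> getMergedView() {
        return mergedView;
    }

    synchronized Map<String, String> getLegacyCopy() {
        return new HashMap<>(legacyCollections);
    }
}
```

With the two kept apart, a change to either source rebuilds the merged view without overwriting the raw copy of clusterstate.json.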




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-08-03 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652366#comment-14652366
 ] 

Scott Blum commented on SOLR-5756:
--

1) Great catch, totally missed that.

2) Sure, not a problem.  I thought it was kind of implied by the classname 
(StateWatcher / ClusterStateWatcher) but explicit methods are always good.

3) Yes, but in the 99% case (no external state exists) the getData call fails, 
no watcher is left on the node, and everything should just clean itself up.  I 
don't think this adds much additional overhead, considering that 
getStateFormat2CollectionNames() checks for a state.json on *every* 
collection, even ones that are not "interesting", and this happens on every 
update, not just on startup, which is where the initial set of StateWatchers 
gets created.  So I think lower-hanging fruit would be to optimize the 
updateFromSharedClusterState -> getStateFormat2CollectionNames() code path.  I 
would love to talk more about how/why the rest of the system relies on the 
lazy-loading bits.

4) I don't think I quite follow, let's discuss on IRC.




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-08-03 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652342#comment-14652342
 ] 

Shalin Shekhar Mangar commented on SOLR-5756:
-

Thanks [~dragonsinth]. This approach looks promising. A few comments on the 
patch:

# The return value of updateFromSharedClusterState(clusterState.getLiveNodes(), 
null); in ZkStateReader.updateClusterState is discarded. How is the cluster 
state updated in that case?
# Perhaps rename StateWatcher.refresh to refreshAndAddWatch to make it more 
clear?
# All collections (whether in clusterstate.json or in individual state.json) 
having local cores are added to interestingCollections and therefore a 
StateWatcher is created even if the collection was in clusterstate.json (as a 
result one useless zk call is made?)
# ZkStateReader.removeZKWatch should not remove the collection from the cluster 
state just because all local cores belonging to that collection have been 
removed. A proper updateClusterState should be called in this case.

This patch has a lot of changes to the way synchronization is done so I will 
review it in more detail. But before that, can you please rebase the patch to 
trunk?

> A utility API to move collections from internal to external
> ---
>
> Key: SOLR-5756
> URL: https://issues.apache.org/jira/browse/SOLR-5756
> Project: Solr
>  Issue Type: Bug
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5756-vs-5.2.1.patch
>
>
> SOLR-5473 allows creation of collection with state stored outside of 
> clusterstate.json. We would need an API to  move existing 'internal' 
> collections outside






[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-31 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648849#comment-14648849
 ] 

Shalin Shekhar Mangar commented on SOLR-5756:
-

[~dragonsinth] - I am 'shalin' on irc

> A utility API to move collections from internal to external
> ---
>
> Key: SOLR-5756
> URL: https://issues.apache.org/jira/browse/SOLR-5756
> Project: Solr
>  Issue Type: Bug
>Reporter: Noble Paul
>Assignee: Noble Paul
>
> SOLR-5473 allows creation of collection with state stored outside of 
> clusterstate.json. We would need an API to  move existing 'internal' 
> collections outside






[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-30 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648794#comment-14648794
 ] 

Noble Paul commented on SOLR-5756:
--

Always prefer 'state.json'.




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-30 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648608#comment-14648608
 ] 

Anshum Gupta commented on SOLR-5756:


[~dragonsinth] people generally work on trunk and then merge/backport it to 
branch_5x.




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-30 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648469#comment-14648469
 ] 

Scott Blum commented on SOLR-5756:
--

Also, what branch should I be working against?  I have something in a basically 
working state I'd like feedback on.




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-30 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648097#comment-14648097
 ] 

Scott Blum commented on SOLR-5756:
--

[~noble.paul] [~shalinmangar] do either of you ever hang out on #solr-dev?  I 
have a few questions that I think would greatly benefit from a real time chat.




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621500#comment-14621500
 ] 

Scott Blum commented on SOLR-5756:
--

That sounds great.  I will get on this in the next couple of days.  One 
clarifying question:

Suppose that on read/reload, data exists in both the collection's `state.json` 
and the shared `clusterstate.json`.  Which one should it prefer, and should any 
corrective action happen to de-dup?  I would presume that it should prefer 
`state.json` (and eagerly remove the entry from `clusterstate.json`?) in this 
case, since it indicates someone successfully wrote out the new one but failed 
to delete the old one.
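
A small sketch of the question being asked, under the presumption stated above that `state.json` is preferred. `StateResolution` and its members are hypothetical names, and strings stand in for collection state. The sketch only reports which shared entries look stale; whether to eagerly delete them is exactly the open question.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical result of reconciling clusterstate.json with per-collection state.json. */
final class StateResolution {
    /** Effective state per collection after applying the preference rule. */
    final Map<String, String> resolved;
    /** Collections present in both places; candidates for cleanup in clusterstate.json. */
    final Set<String> staleSharedEntries;

    private StateResolution(Map<String, String> resolved, Set<String> staleSharedEntries) {
        this.resolved = resolved;
        this.staleSharedEntries = staleSharedEntries;
    }

    static StateResolution resolve(Map<String, String> shared, Map<String, String> perCollection) {
        Map<String, String> resolved = new TreeMap<>(shared);
        Set<String> stale = new TreeSet<>();
        for (Map.Entry<String, String> entry : perCollection.entrySet()) {
            if (shared.containsKey(entry.getKey())) {
                // Present in both: the new state.json was written, the old entry not removed.
                stale.add(entry.getKey());
            }
            resolved.put(entry.getKey(), entry.getValue()); // state.json takes precedence
        }
        return new StateResolution(resolved, stale);
    }
}
```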





[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-09 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620564#comment-14620564
 ] 

Noble Paul commented on SOLR-5756:
--

The steps to migrate state out of the main clusterstate:

# Add support to ZkStateReader to look up the 
{{/collections/state.json}} if a collection is suddenly 
missing from the main {{clusterstate.json}}. If the {{state.json}} is found, 
add a listener to it if it needs to be watched
# Add a collection-admin command to migrate the state outside

A user should first upgrade all servers to the new version and then execute 
the command.
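
The first step could be sketched roughly as follows. `FallbackLookup`, `stateJsonReader`, and `needsWatch` are hypothetical names, and the `Function` stands in for whatever ZooKeeper read the real reader would do. It only illustrates the lookup order and when a listener would be registered.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of falling back to state.json when clusterstate.json lacks a collection. */
final class FallbackLookup {
    private final Map<String, String> sharedState;
    // Stand-in for reading a collection's state.json; empty when the node does not exist.
    private final Function<String, Optional<String>> stateJsonReader;
    // Collections for which a listener would have been registered.
    private final Set<String> watched = new HashSet<>();

    FallbackLookup(Map<String, String> sharedState,
                   Function<String, Optional<String>> stateJsonReader) {
        this.sharedState = sharedState;
        this.stateJsonReader = stateJsonReader;
    }

    Optional<String> lookup(String collection, boolean needsWatch) {
        String fromShared = sharedState.get(collection);
        if (fromShared != null) {
            return Optional.of(fromShared);
        }
        // Missing from the main clusterstate.json: try the collection's own state.json.
        Optional<String> fromStateJson = stateJsonReader.apply(collection);
        if (fromStateJson.isPresent() && needsWatch) {
            watched.add(collection); // a real reader would set a ZooKeeper watch here
        }
        return fromStateJson;
    }

    Set<String> watchedCollections() {
        return new HashSet<>(watched);
    }
}
```

Because servers that lack this fallback would simply see the collection disappear, the ordering in the comment (upgrade every server first, then run the migrate command) matters.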




[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-08 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618910#comment-14618910
 ] 

Noble Paul commented on SOLR-5756:
--

bq. Do you think it would be safe to do this manually on a running cluster?

It is never fully safe. There are intermediate state changes which you may lose.

bq. Suppose I wanted to try to write a patch for this issue to help solve it 
for everyone, is that a reasonable thing to attempt for someone with a lot of 
ZK knowledge but pretty new to Solr? 

bq. Is it too "soon" to do that?

It is not too soon. 5.0 has been out for a while.

A lot of ZK knowledge is not required to do this. You need to know how the 
overseer updates state and how states are updated by 
{{ZkStateReader.createClusterStateWatchersAndUpdate}} in each node.

bq. Can you opine on the specifics of having an API to move the state out vs. a 
forced migration?

I shall post a follow-up comment on the steps involved in actually doing 
this.






[jira] [Commented] (SOLR-5756) A utility API to move collections from internal to external

2015-07-08 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618875#comment-14618875
 ] 

Scott Blum commented on SOLR-5756:
--

Hi, has any work started on this issue?  We have a deployment with a very large 
clusterstate.json (most of our collections are there).  New collections added 
since our last upgrade have their own split state.json, but we still have an 
enormous number of collections using the shared file.  We are suspicious that 
the large degree of contention on clusterstate.json is affecting the stability 
of our cluster, so we'd like to split it apart to see if things improve.

A few questions:

1) Do you think it would be safe to do this manually on a running cluster?  
I've only spent a few hours looking at the overseer code, but I got the 
impression that I might just be able to populate all the state.json nodes 
manually, followed by emptying clusterstate.json.  That last step should tickle 
all the running servers, forcing a reload which will get all servers into the 
right separated state.  At least, that's my theory.  Does that sound right to 
you?

2) Suppose I wanted to try to write a patch for this issue to help solve it for 
everyone, is that a reasonable thing to attempt for someone with a lot of ZK 
knowledge but pretty new to Solr?  Or are there a lot of subtleties?

3) Can you opine on the specifics of having an API to move the state out vs. a 
forced migration?  From what I read on SOLR-5473, it sounds like eventually 
we'd just want to force everyone into split state.  Is it too "soon" to do that?

(Unrelated to this specific issue, I'm actually a committer on Apache Curator, 
and I have a general interest in understanding and possibly helping improve 
overseer's ZK interactions.   Are there any docs outside of the code itself you 
might recommend for me to read?)

