Re: Ensembles failing to reach "Leader ready" state

2015-04-21 Thread Andrew Stone
A couple of things stand out here. If a node is left in the leaving state, it's likely that the system can't get quorum for the ensembles it's a part of. Nodes that leave wait until their peer membership is transferred via joint consensus and they are removed from the ensembles in question so that future
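To make the quorum point concrete, here is a minimal sketch of a majority check under joint consensus. It is a simplified model, not riak_ensemble's actual code: the node names, the `has_majority` and `has_quorum` helpers, and the rule that every view needs its own strict majority while a membership change is in flight are assumptions made for illustration.

```python
from typing import Iterable, List, Set


def has_majority(view: Iterable[str], reachable: Set[str]) -> bool:
    """Return True when a strict majority of the peers in one view are reachable."""
    peers = list(view)
    up = sum(1 for peer in peers if peer in reachable)
    return up > len(peers) // 2


def has_quorum(views: List[List[str]], reachable: Set[str]) -> bool:
    """Assumed joint-consensus rule: every view must reach its own majority."""
    return all(has_majority(view, reachable) for view in views)


# Hypothetical node names: the old view still lists two nodes that are leaving.
old_view = ["a", "b", "c", "d", "e"]
new_view = ["a", "b", "c"]

# Three of the five old peers and all three new peers respond.
print(has_quorum([old_view, new_view], {"a", "b", "c"}))

# Only two of the five old peers respond, so the old view blocks the change.
print(has_quorum([old_view, new_view], {"a", "b"}))
```

In this model, an ensemble whose old view still counts unreachable peers can stall even when the remaining nodes are healthy, which is one way a node could end up stuck in the leaving state.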

Re: Ensembles failing to reach "Leader ready" state

2015-04-21 Thread Jonathan Koff
Ok, thanks Andrew! I’ll go ahead and migrate the data to a fresh cluster.

Re: Ensembles failing to reach "Leader ready" state

2015-04-20 Thread Alexander Sicular
Hi Jonathan, regarding "staging (3 servers across NA)": if this means you're spreading your cluster across North America, I would suggest you reconsider. A Riak cluster is meant to be deployed in one data center, more specifically on one LAN. Connecting Riak nodes over a WAN introduces network latencies.

Re: Ensembles failing to reach "Leader ready" state

2015-04-20 Thread Jonathan Koff
Hi Alexander and Andrew, Thanks for the follow-up! Although I would expect to have used `riak-admin cluster leave`, it’s been months at this point and I can’t be sure. Perhaps I did something weird when I was getting started… Given the uncertain state of the system, it may make sense for me to

Re: Ensembles failing to reach "Leader ready" state

2015-04-17 Thread Andrew Stone
Hi Jonathan, sorry for the late reply. It looks like riak_ensemble still thinks that those old nodes are part of the cluster. Did you remove them with `riak-admin cluster leave`? If so, they should have been removed from the root ensemble also, and the machines shouldn't have actually left the clu

Ensembles failing to reach "Leader ready" state

2015-03-23 Thread Jonathan Koff
Hi all, I recently used Riak’s Strong Consistency functionality to get auto-incrementing IDs for a feature of an application I’m working on, and although this worked great in dev (5 nodes in 1 VM) and staging (3 servers across NA) environments, I’ve run into some odd behaviour in production (o
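For readers unfamiliar with the pattern, auto-incrementing IDs on a strongly consistent store are commonly built as a read, conditional write, retry loop. The sketch below illustrates that general technique only. It does not use the Riak client API: `InMemoryStore`, `put_if_version`, and `next_id` are hypothetical names, and the in-memory store is a stand-in for a key/value store that rejects writes based on a stale read.

```python
import threading
from typing import Dict, Optional, Tuple


class InMemoryStore:
    """Hypothetical stand-in for a strongly consistent key/value store."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[int, int]] = {}  # key -> (version, value)

    def get(self, key: str) -> Optional[Tuple[int, int]]:
        """Return (version, value), or None if the key does not exist."""
        with self._lock:
            return self._data.get(key)

    def put_if_version(self, key: str, expected: Optional[int], value: int) -> bool:
        """Write only if the stored version still matches what the caller read."""
        with self._lock:
            current = self._data.get(key)
            current_version = current[0] if current else None
            if current_version != expected:
                return False  # someone else wrote first; caller must retry
            next_version = 1 if current is None else current[0] + 1
            self._data[key] = (next_version, value)
            return True


def next_id(store: InMemoryStore, key: str, max_retries: int = 10) -> int:
    """Allocate the next ID with a read / conditional-write / retry loop."""
    for _ in range(max_retries):
        current = store.get(key)
        version, value = current if current else (None, 0)
        if store.put_if_version(key, version, value + 1):
            return value + 1
    raise RuntimeError("could not allocate an ID after retries")


store = InMemoryStore()
print([next_id(store, "counter") for _ in range(3)])
```

The retry loop only makes progress when the store can actually complete a conditional write, so an ensemble that cannot elect a leader would leave every call failing or timing out rather than handing out duplicate IDs.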