This is exactly the problem we are encountering as well: how to deal with
the ZK quorum when we have multiple DCs.  Our index is spread so that each
DC has a complete copy and *should* be able to survive on its own, but how
do we arrange ZK to cope with that?  The problem with the quorum is that ZK
needs a strict majority of the ensemble to stay up (which is why an odd
number is recommended), and that simply doesn't fit when we have 2 DCs.

Logically we would want 4 ZKs, 2 in each DC, and to be able to survive with
2 out of that 4 (or in the extreme case 1), but 2 out of 4 is not a majority,
so the ensemble can't form a quorum (4 nodes still need 3 up, so the extra
even node buys nothing). So if we run 3 or 5, we have to put more in one DC
than the other, and we can't afford to lose that DC.  It feels like we are
assigning "weight" to one DC, when in reality they should be peers...

One solution is not to use ZooKeeper/SolrCloud at all and just manually
assign/distribute the shards/queries/configuration at the application
level, but that feels like a step back to Solr 3.x?
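
To be concrete about what I mean by "application level": we would go back to
old-style distributed search, where the client names the cores itself via the
shards parameter and the routing/failover logic lives in our own code, e.g.
(hosts and core names invented):

    http://solr1.dc1.example.com:8983/solr/shard1/select?q=*:*
        &shards=solr1.dc1.example.com:8983/solr/shard1,solr2.dc1.example.com:8983/solr/shard2

It works, but we would be re-implementing the routing and failover that
SolrCloud already gives us, hence the "step back" feeling.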

Cheers, Daniel


On 23 January 2013 05:56, Timothy Potter <thelabd...@gmail.com> wrote:

> For the Zk quorum issue, we'll put nodes in 3 different AZ's so we can lose
> 1 AZ and still establish quorum with the other 2.
>
> On Tue, Jan 22, 2013 at 10:44 PM, Timothy Potter <thelabd...@gmail.com> wrote:
>
> > Hi Markus,
> >
> > Thanks for the insight. There's a pretty high cost to using the approach
> > you suggest, in that I'd have to double my node count, which won't make my
> > accounting dept. very happy.
> >
> > As for cross-AZ latency, I'm already running my cluster with nodes in 3
> > different AZ's, and our distributed query performance is acceptable for us.
> > Our AZ's are in the same region.
> >
> > However, I'm not sure I understand your point about Solr modifying
> > clusterstate.json when a node goes down. From what I understand, it will
> > assign a new shard leader, but in my case that's expected and doesn't seem
> > to cause an issue. The new shard leader will be the previous replica from
> > the other AZ, but that's OK. In this case, the cluster is still functional.
> > In other words, from my understanding, Solr is not going to change shard
> > assignments on the nodes; it's just going to select a new leader, which in
> > my case is in another AZ.
> >
> > Lastly, Erick raises a good point about Zk and cross AZ quorum. I don't
> > have a good answer to that issue but will post back if I come up with
> > something.
> >
> > Cheers,
> > Tim
> >
> > On Tue, Jan 22, 2013 at 3:11 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> >
> >> Hi,
> >>
> >> Regarding availability: since SolrCloud is not DC-aware at this moment,
> >> we 'solve' the problem by simply operating multiple identical clusters in
> >> different DCs and sending updates to them all. This works quite well, but
> >> it requires some manual intervention if a DC is down due to a prolonged
> >> DoS attack or a network or power failure.
> >>
> >> I don't think it's a very good idea to change clusterstate.json, because
> >> Solr will modify it when, for example, a node goes down. Your preconfigured
> >> state won't exist anymore. It's also a bad idea because distributed
> >> queries are going to be sent to remote locations, adding a lot of latency,
> >> again because it's not DC-aware.
> >>
> >> Any good solution to this problem should be in Solr itself.
> >>
> >> Cheers,
> >>
> >>
> >> -----Original message-----
> >> > From:Timothy Potter <thelabd...@gmail.com>
> >> > Sent: Tue 22-Jan-2013 22:46
> >> > To: solr-user@lucene.apache.org
> >> > Subject: Manually assigning shard leader and replicas during initial
> >> setup on EC2
> >> >
> >> > Hi,
> >> >
> >> > I want to split my existing Solr 4 cluster across 2 different
> >> > availability zones in EC2, as in have my initial leaders in one zone and
> >> > their replicas in another AZ. My thinking here is that if one zone goes
> >> > down, my cluster stays online. This is what the Amazon EC2 docs recommend.
> >> >
> >> > My plan here is to just cook up a clusterstate.json file that manually
> >> > sets my desired shard/replica assignments to specific nodes, after which
> >> > I can update the clusterstate.json file in Zk and then bring the nodes
> >> > online.
> >> >
> >> > The other thing to mention is that I have existing indexes that need to
> >> > be preserved, as I don't want to re-index. For this I'm planning to just
> >> > move data directories to where they need to be, based on my changes to
> >> > clusterstate.json.
> >> >
> >> > Does this sound reasonable? Any pitfalls I should look out for?
> >> >
> >> > Thanks.
> >> > Tim
> >> >
> >>
> >
> >
>
