Re: Master - Master / Upgrading a slave to master

2014-09-11 Thread Shawn Heisey
On 9/9/2014 9:20 AM, Salman Akram wrote:
> So realistically speaking you cannot have SolrCloud work for 2 data centers
> as a redundant solution because no matter how many nodes you add you still
> would need at least 1 node in the 2nd center working too.

Precisely.

> So that just leaves with non-SolrCloud solutions.
> 
> "1) Change the replication config to redefine the master and reload the core
> or restart Solr."
> 
> That of course is a simple way but the real issue is about the possible
> issues and some good practices e.g. normally the scenario would be that
> primary data center goes down for few hours and till then we upgrade one of
> the slaves in secondary to a master. Now
> 
> - IF there is no lag there won't be any issue in secondary at least but
> what if there is lag and one of the files is not completely replicated?
> That file would be discarded or there is a possibility that whole index is
> not usable?
> 
> - Once the primary comes back how would we now copy the delta from
> secondary? Make it a slave of secondary first, replicate the delta and then
> set it as a master again?

If you're handling all your replication yourself with the HTTP API, then
you would contact the old master when it comes back up and ask it to
replicate from the temporary master.  Then you switch modes in your
program that drives the replication and have it use the original master
for all replication.
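
A rough sketch of that failback sequence, in Python. The host names, core name, and function names are made up for illustration. `command=fetchindex` and its `masterUrl` parameter come from the replication HTTP API on the wiki. The exact form of `masterUrl` (here, the source's handler URL ending in /replication) is an assumption to check against your Solr version. The code only builds the URLs in the order they would be called; it does not send any requests.

```python
from urllib.parse import urlencode


def fetchindex_url(target_core, source_core):
    """URL that asks target_core to pull the index from source_core once."""
    # Assumption: masterUrl points at the source core's replication handler.
    query = urlencode({
        "command": "fetchindex",
        "masterUrl": source_core.rstrip("/") + "/replication",
    })
    return target_core.rstrip("/") + "/replication?" + query


def failback_plan(original_master, temporary_master, slaves):
    """Ordered fetchindex calls for returning to the original master."""
    # Step 1: the recovered master catches up from the temporary master.
    plan = [fetchindex_url(original_master, temporary_master)]
    # Step 2: switch modes, so every other node (including the temporary
    # master, which becomes a slave again) replicates from the original.
    for core in [temporary_master] + list(slaves):
        plan.append(fetchindex_url(core, original_master))
    return plan


if __name__ == "__main__":
    # Hypothetical core URLs.
    steps = failback_plan(
        "http://dc1-master:8983/solr/core1",
        "http://dc2-temp:8983/solr/core1",
        ["http://dc2-slave:8983/solr/core1"],
    )
    for url in steps:
        print(url)
```

A real driver would presumably also stop indexing to the temporary master before step 1 and confirm that each fetch has finished before moving on; neither is shown here.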

If you need to switch masters for non-cloud setups, it's really not
practical to have Solr be in control of the replication, because you
have to modify the config in place and kick Solr to make it re-read the
config.  It's extremely messy and prone to error.

As for an incomplete replication ... I do not know this for sure, but I
would imagine that if a replication is not complete, it won't switch
indexes, it will keep going with the one it's already got.

> In other words is there a good guide out there for this with possible
> issues and solutions? Definitely before SolrCloud people would be doing
> this and even now SolrCloud doesn't seem practical in quite a few
> situations.

SolrCloud relies on zookeeper to maintain the cluster.  It knows how to
deal with the *Solr* parts of a distributed cluster, but it leaves the
management of the cluster itself to zookeeper -- they've been doing it a
lot longer than we have, so we can use a wheel that's already invented
instead of building it ourselves.

Because Zookeeper prizes a guaranteed quorum above all else, its design
is not well-suited for a two-datacenter solution.  SolrCloud works
really well within a single data center, or with three.

Thanks,
Shawn



Re: Master - Master / Upgrading a slave to master

2014-09-11 Thread Salman Akram
Anyone?

-- 
Regards,

Salman Akram


Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Salman Akram
So realistically speaking you cannot have SolrCloud work for 2 data centers
as a redundant solution because no matter how many nodes you add you still
would need at least 1 node in the 2nd center working too.

So that just leaves with non-SolrCloud solutions.

"1) Change the replication config to redefine the master and reload the core
or restart Solr."

That of course is a simple way but the real issue is about the possible
issues and some good practices e.g. normally the scenario would be that
primary data center goes down for few hours and till then we upgrade one of
the slaves in secondary to a master. Now

- IF there is no lag there won't be any issue in secondary at least but
what if there is lag and one of the files is not completely replicated?
That file would be discarded or there is a possibility that whole index is
not usable?

- Once the primary comes back how would we now copy the delta from
secondary? Make it a slave of secondary first, replicate the delta and then
set it as a master again?

In other words is there a good guide out there for this with possible
issues and solutions? Definitely before SolrCloud people would be doing
this and even now SolrCloud doesn't seem practical in quite a few
situations.

Thanks again!!


-- 
Regards,

Salman Akram


Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Shawn Heisey
On 9/9/2014 8:46 AM, Salman Akram wrote:
> You mean 3 'data centers' or 'nodes'? I am thinking if we have 2 nodes on
> primary and 1 in secondary and we normally keep the secondary down would
> that work? Basically secondary network is just for redundancy and won't be
> as fast so normally we won't like to shift traffic there.
>
> So can we just have nodes for redundancy and NOT load balancing i.e. it has
> 3 nodes but update is only on one of them? Similarly for the slave replicas
> can we limit the searches to a certain slave or it will be auto balanced?
>
> Also apart from SOLR cloud is it possible to have multiple master in SOLR
> or a good guide to upgrade a slave to master?

You must have three zookeeper nodes for a redundant setup.  If you only
have two data centers, then you must put at least two of those nodes in
one data center.  If the data center with two zookeeper nodes goes down,
zookeeper cannot function, which means SolrCloud will not work
correctly.  There is no way to maintain SolrCloud redundancy with only
two data centers.  You might think to add a fourth ZK node and split
them between the data centers ... except that in that situation, at
least three nodes must be functional.  Two out of four nodes is not enough.
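
The two-data-center problem can be checked mechanically. The sketch below (function names are illustrative, not part of ZooKeeper or Solr) takes the number of ZK nodes placed in each data center and asks whether a majority is still up after losing any one data center entirely.

```python
def quorum(total_nodes):
    """Smallest number of nodes that is more than half of the ensemble."""
    return total_nodes // 2 + 1


def survives_any_single_dc_loss(nodes_per_dc):
    """True if a majority remains no matter which one data center fails."""
    total = sum(nodes_per_dc)
    return all(total - lost >= quorum(total) for lost in nodes_per_dc)


if __name__ == "__main__":
    # Two-data-center splits first, then one node in each of three.
    for placement in ([2, 1], [2, 2], [3, 2], [1, 1, 1]):
        print(placement, survives_any_single_dc_loss(placement))
```

With two data centers, the larger side always holds at least half of the nodes, so losing it leaves no majority however the nodes are split. Only the three-site placement passes.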

A minimal fault-tolerant SolrCloud install is three physical machines. 
Two of them run ZK and Solr, one of them runs ZK only.

If you don't use SolrCloud, then you have two choices to switch masters:

1) Change the replication config to redefine the master and reload the
core or restart Solr.
2) Write scripts that manually use the replication HTTP API to do all
your replication, rather than let Solr handle it automatically.  You can
choose the master for every replication with HTTP calls.

https://wiki.apache.org/solr/SolrReplication#HTTP_API

Thanks,
Shawn



Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Salman Akram
You mean 3 'data centers' or 'nodes'? I am thinking if we have 2 nodes on
primary and 1 in secondary and we normally keep the secondary down would
that work? Basically secondary network is just for redundancy and won't be
as fast so normally we won't like to shift traffic there.

So can we just have nodes for redundancy and NOT load balancing i.e. it has
3 nodes but update is only on one of them? Similarly for the slave replicas
can we limit the searches to a certain slave or it will be auto balanced?

Also apart from SOLR cloud is it possible to have multiple master in SOLR
or a good guide to upgrade a slave to master?

Thanks


-- 
Regards,

Salman Akram


Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Shawn Heisey
On 9/8/2014 9:54 PM, Salman Akram wrote:
> We have a redundant data center in case the primary goes down. Currently we
> have 1 master and multiple slaves on primary data center. This master also
> replicates to a slave in secondary data center. So if the primary goes down
> at least the read only part works. However, now we want writes to work on
> secondary data center too when primary goes down.
> 
> - Is it possible in SOLR to have Master - Master?
> - If not then what's the best strategy to upgrade a slave to master?
> - Naturally there would be some latency due to data centers being in
> different geographical locations so what are the normal data issues and
> best practices in case primary goes down? We would also like to shift back
> to primary as soon as it's back.

SolrCloud would work, but only if you have *three* datacenters.  Two of
them would need to remain fully operational.  SolrCloud is a true
cluster -- there is no master.  Each of the shards in a collection has
one or more replicas.  One of the replicas gets elected to be leader,
but the leader designation can change.

The reason that you need three is because of zookeeper, which is the
software that actually maintains the cluster and handles leader
elections.  A majority of zookeeper nodes (more than half of them) must
be operational for zookeeper to maintain quorum.  That means that the
minimum number of zookeepers is three, and in a three-node system, one
can go down without disrupting operation.
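
The majority rule is simple arithmetic. A small illustration in Python, with made-up function names:

```python
def quorum_size(ensemble_size):
    """Nodes that must be up: strictly more than half of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "nodes: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "down")
```

One or two nodes tolerate no failures, three tolerate one, and a fourth node adds nothing over three, which is why ensembles are normally sized at odd numbers.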

One thing that SolrCloud doesn't yet have is rack/datacenter awareness.
Requests get load balanced across the entire cluster, regardless of
where they are located.  It's something that will eventually come, but I
don't have any kind of estimate for when.

Thanks,
Shawn



Master - Master / Upgrading a slave to master

2014-09-08 Thread Salman Akram
We have a redundant data center in case the primary goes down. Currently we
have 1 master and multiple slaves on primary data center. This master also
replicates to a slave in secondary data center. So if the primary goes down
at least the read only part works. However, now we want writes to work on
secondary data center too when primary goes down.

- Is it possible in SOLR to have Master - Master?
- If not then what's the best strategy to upgrade a slave to master?
- Naturally there would be some latency due to data centers being in
different geographical locations so what are the normal data issues and
best practices in case primary goes down? We would also like to shift back
to primary as soon as it's back.


Thanks!

-- 
Regards,

Salman Akram