Re: SolrCloud (7.3) and Legacy replication slaves
Have you looked at TLOG and PULL replica types? If you’re trying to connect to separate clusters (defined by having different ZooKeepers) those won’t work...

> On May 21, 2019, at 8:15 AM, Shawn Heisey wrote:
>
> On 5/21/2019 8:48 AM, Michael Tracey wrote:
>> Is it possible to set up an existing SolrCloud cluster as the master for
>> legacy replication to a slave server or two? It looks like another option
>> is to use uni-directional CDCR, but I'm not sure what the best option is
>> in this case.
>
> You're asking for problems if you try to combine legacy replication with
> SolrCloud. The two features are not guaranteed to work together.
>
> CDCR is your best bet. This replicates from one SolrCloud cluster to
> another.
>
> Thanks,
> Shawn
Re: SolrCloud (7.3) and Legacy replication slaves
On 5/21/2019 8:48 AM, Michael Tracey wrote:
> Is it possible to set up an existing SolrCloud cluster as the master for
> legacy replication to a slave server or two? It looks like another option
> is to use uni-directional CDCR, but I'm not sure what the best option is
> in this case.

You're asking for problems if you try to combine legacy replication with SolrCloud. The two features are not guaranteed to work together.

CDCR is your best bet. This replicates from one SolrCloud cluster to another.

Thanks,
Shawn
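For concreteness: in the 7.x line, uni-directional CDCR is configured in the source cluster's solrconfig.xml through a /cdcr request handler plus a CDCR-aware update log. A minimal sketch following the shape of the 7.x reference guide, with collection names and ZooKeeper addresses as placeholder values (the target cluster needs its own /cdcr handler and a CdcrUpdateProcessor chain; see the CDCR pages of the ref guide):

    <!-- Source cluster solrconfig.xml (hosts and collection names illustrative) -->
    <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
      <lst name="replica">
        <!-- ZooKeeper ensemble of the *target* cluster -->
        <str name="zkHost">target-zk1:2181,target-zk2:2181,target-zk3:2181</str>
        <str name="source">collection1</str>
        <str name="target">collection1</str>
      </lst>
      <lst name="replicator">
        <str name="threadPoolSize">2</str>
        <str name="schedule">1000</str>
        <str name="batchSize">128</str>
      </lst>
      <lst name="updateLogSynchronizer">
        <str name="schedule">1000</str>
      </lst>
    </requestHandler>

    <!-- CDCR needs the CDCR-aware transaction log on the source side -->
    <updateHandler class="solr.DirectUpdateHandler2">
      <updateLog class="solr.CdcrUpdateLog">
        <str name="dir">${solr.ulog.dir:}</str>
      </updateLog>
    </updateHandler>

Forwarding is then started with the CDCR API against the source collection, e.g. /solr/collection1/cdcr?action=START.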
SolrCloud (7.3) and Legacy replication slaves
Is it possible to set up an existing SolrCloud cluster as the master for legacy replication to a slave server or two? It looks like another option is to use uni-directional CDCR, but I'm not sure what the best option is in this case. -- Michael Tracey
Re: Legacy replication slave node full sync
1a> Replication pulls down changed segments, which includes _merged_ segments. Say I have 10 segments in my index and they all get merged into a single segment that now contains the entire index. Then that merged segment is replicated.

1b> If your polling interval is such that all the segments get replaced between synchronizations, then the entire index will be fetched at the next poll.

2> You can turn on infoStream logging; see the reference guide. WARNING: this will produce a _lot_ of output.

Why is this important? It's expected that a replication will occasionally pull down the entire index, after doing a forceMerge on the master for instance. If you're saying that occasionally replication _replaces existing segments_ with fresh ones of the same name from the master, then that's a mystery. If you're saying that occasionally all the segments are pulled from the master and all the old segments are deleted from the slave, then that's expected; in that case there will be no segments in common.

Best,
Erick

On Thu, Mar 22, 2018 at 11:10 AM, Yunee Lee wrote:
> Hi,
> I have two questions regarding the legacy master/slave replication
> architecture. We noticed that the slave node does a full sync from time
> to time.
>
> 1. What type of event or configuration triggers the full sync on the
> slave node? I cannot pinpoint the exact time and frequency from the logs.
> Please let me know.
> 2. If the master node's index merges are what trigger the full-sync
> replication, how can I find the index-merge logging in the Solr log on
> the master node? Please share documentation I can reference.
> Thanks.
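The infoStream switch Erick mentions lives in the <indexConfig> section of solrconfig.xml. A minimal sketch (the flag itself is what the ref guide documents; there is nothing more to it):

    <!-- solrconfig.xml: emit Lucene's low-level flush/merge diagnostics
         into the Solr log. Extremely verbose; enable only while debugging. -->
    <indexConfig>
      <infoStream>true</infoStream>
    </indexConfig>

With this on, the master's log shows which segments each merge consumed and produced, which addresses question 2.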
Legacy replication slave node full sync
Hi,
I have two questions regarding the legacy master/slave replication architecture. We noticed that the slave node does a full sync from time to time.

1. What type of event or configuration triggers the full sync on the slave node? I cannot pinpoint the exact time and frequency from the logs. Please let me know.
2. If the master node's index merges are what trigger the full-sync replication, how can I find the index-merge logging in the Solr log on the master node? Please share documentation I can reference.

Thanks.
Re: legacy replication
Yeah, much as I love SolrCloud (and make most of my living working with it), it does have its complexities. My rule of thumb is that you really want to consider SolrCloud when you start having to shard or need NRT searching. For the former you trade the complexity of maintaining your own sharding for the complexity of ZooKeeper, and you can't do the latter with master/slave at all, so whatever floats your boat ;)

About ZooKeeper: with a system your size, you would absolutely _not_ need any more than three nodes. In fact, if you were willing to accept that your one ZooKeeper going down would prevent updates from happening, you could run with just one. And your ZK machines can be cheap boxes; they don't need all that much processing power.

That said, if it's just a matter of plopping Solr 7.x (or whatever) in over your existing infrastructure that's been running for years, I really can't say you should move to SolrCloud.

Best,
Erick

On Sat, Dec 16, 2017 at 9:36 AM, Shawn Heisey wrote:
> On 12/15/2017 12:12 PM, David Hastings wrote:
>> Also the complexity of adding another 3 or more machines just to do
>> nothing but ZK stuff was getting out of hand.
>
> You can run ZK on the same machines that are running Solr. The only strong
> recommendation that I would make is that it should be a completely separate
> process, not embedded within Solr. The ZK process is unlikely to need much
> of a heap, unless your ZK database is huge.
>
> It can also be useful to have ZK's data on separate disks from other things
> on the machine, but this is not usually necessary.
>
> Thanks,
> Shawn
Re: legacy replication
On 12/15/2017 12:12 PM, David Hastings wrote:
> Also the complexity of adding another 3 or more machines just to do
> nothing but ZK stuff was getting out of hand.

You can run ZK on the same machines that are running Solr. The only strong recommendation that I would make is that it should be a completely separate process, not embedded within Solr. The ZK process is unlikely to need much of a heap, unless your ZK database is huge.

It can also be useful to have ZK's data on separate disks from other things on the machine, but this is not usually necessary.

Thanks,
Shawn
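For anyone following this advice, a standalone three-node ensemble needs little more than a zoo.cfg per host. A minimal sketch, with hostnames and paths as made-up examples:

    # zoo.cfg -- identical on all three hosts (hostnames/paths illustrative)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    server.1=solr1.example.com:2888:3888
    server.2=solr2.example.com:2888:3888
    server.3=solr3.example.com:2888:3888

Each host also needs a myid file in dataDir containing just its server number (1, 2, or 3), and Solr is then pointed at the ensemble at startup with -z solr1.example.com:2181,solr2.example.com:2181,solr3.example.com:2181.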
Re: legacy replication
Understandable. Right now we have a large setup of Solr 5.x servers that has been doing great for years, but the time to upgrade has come, with some things we want that are not available in the 5.x branch. I really like legacy (master/slave) replication, for the reasons you stated, but also because the cloud setup seems perfect only if you have a handful of cheap machines around. Our production setup has 1 indexer, which has a slave polling every 5 minutes, and on releases we have 3 searching servers that poll manually. Thing is, these machines have over 32 cores and over 200 GB of RAM with 2 TB SSDs each; they were not cheap and are pretty fast with standalone Solr. Also, the complexity of adding another 3 or more machines just to do nothing but ZK stuff was getting out of hand. If it's not broken, I'm not about to fix it.

In any case I'm glad to hear legacy replication will stay.

Thanks,
-Dave

On Fri, Dec 15, 2017 at 1:15 PM, Walter Underwood wrote:
> I love legacy replication. It is simple and bulletproof. Loose coupling
> for the win! We only run Solr Cloud when we need sharding or NRT search.
> Loose coupling is a very, very good thing in distributed systems.
>
> Adding a replica (new slave) is trivial. Clone an existing one. This makes
> horizontal scaling so easy. We still haven't written the procedure and
> scripts for scaling our Solr Cloud cluster. Last time, it was 100% manual
> through the admin UI.
>
> Setting up a ZooKeeper ensemble isn't as easy as it should be. We tried to
> set up a five-node ensemble with ZK 3.4.6 and finally gave up after two
> weeks because it was blocking the release. We are using the three-node
> 3.4.5 ensemble that had been set up for something else a couple of years
> earlier. I've had root on Unix since 1981 and have been running TCP/IP
> since 1983, so I should have been able to figure this out.
>
> We've had some serious prod problems with the Solr Cloud cluster, like
> cores stuck in a permanent recovery loop. I finally manually deleted that
> core and created a new one. Ugly.
>
> Even starting Solr Cloud processes is confusing. It took a while to figure
> out they were all joining as the same host (no, I don't know why), so now
> we start them as: solr start -cloud -h `hostname`
>
> Keeping configs under source control and deploying them isn't easy. I'm
> not going to install Solr on the Jenkins executor just so it can deploy;
> that is weird and kind of a chicken-and-egg thing. I ended up writing a
> Python program to get the ZK address from the cluster, use kazoo to load
> directly to ZK, then tell the cluster to reload. Both with that and with
> the provided ZK tools I ran into so much undocumented stuff. What is
> linking? How do the file config directories map to the ZK config
> directories? And so on.
>
> The lack of a thread pool for requests is a very serious problem. If our
> 6.5.1 cluster gets overloaded, it creates 4000 threads, runs out of memory
> and fails. That is just wrong. With earlier versions of Solr, it would get
> slower and slower, but recover gracefully.
>
> Converting a slave into a master is easy. We use this in the config file:
>
> ${enable.master:false}
> …
> ${textbooks.enable.slave:false}
>
> And this at startup (slave config shown): -Denable.master=false
> -Denable.slave=true
>
> Change the properties and restart.
>
> Our 6.5.1 cluster is faster than the non-sharded 4.10.4 master/slave
> cluster, but I'm not happy with the stability in prod. We've had more
> search outages in the past six months than we had in the previous four
> years. I've had Solr in prod since version 1.2, and this is the first time
> it has really embarrassed me.
>
> There are good things. Search is faster, and we're handling double the
> query volume with 3X the docs.
>
> Sorry for the rant, but it has not been a good fall semester for our
> students (customers).
>
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>> On Dec 15, 2017, at 9:46 AM, Erick Erickson wrote:
>>
>> There's pretty much zero chance that it'll go away; too much current
>> and ongoing functionality depends on it.
>>
>> 1> Old-style replication has always been used for "full sync" in
>> SolrCloud when peer sync can't be done.
>>
>> 2> The new TLOG and PULL replica types are a marriage of old-style
>> master/slave and SolrCloud. In particular, a PULL replica is
>> essentially an old-style slave. A TLOG replica is an old-style slave
>> that also maintains a transaction log so it can take over leadership
>> if necessary.
>>
>> Best,
>> Erick
Re: legacy replication
I love legacy replication. It is simple and bulletproof. Loose coupling for the win! We only run Solr Cloud when we need sharding or NRT search. Loose coupling is a very, very good thing in distributed systems.

Adding a replica (new slave) is trivial. Clone an existing one. This makes horizontal scaling so easy. We still haven't written the procedure and scripts for scaling our Solr Cloud cluster. Last time, it was 100% manual through the admin UI.

Setting up a ZooKeeper ensemble isn't as easy as it should be. We tried to set up a five-node ensemble with ZK 3.4.6 and finally gave up after two weeks because it was blocking the release. We are using the three-node 3.4.5 ensemble that had been set up for something else a couple of years earlier. I've had root on Unix since 1981 and have been running TCP/IP since 1983, so I should have been able to figure this out.

We've had some serious prod problems with the Solr Cloud cluster, like cores stuck in a permanent recovery loop. I finally manually deleted that core and created a new one. Ugly.

Even starting Solr Cloud processes is confusing. It took a while to figure out they were all joining as the same host (no, I don't know why), so now we start them as: solr start -cloud -h `hostname`

Keeping configs under source control and deploying them isn't easy. I'm not going to install Solr on the Jenkins executor just so it can deploy; that is weird and kind of a chicken-and-egg thing. I ended up writing a Python program to get the ZK address from the cluster, use kazoo to load directly to ZK, then tell the cluster to reload. Both with that and with the provided ZK tools I ran into so much undocumented stuff. What is linking? How do the file config directories map to the ZK config directories? And so on.

The lack of a thread pool for requests is a very serious problem. If our 6.5.1 cluster gets overloaded, it creates 4000 threads, runs out of memory and fails. That is just wrong. With earlier versions of Solr, it would get slower and slower, but recover gracefully.

Converting a slave into a master is easy. We use this in the config file:

${enable.master:false}
…
${textbooks.enable.slave:false}

And this at startup (slave config shown): -Denable.master=false -Denable.slave=true

Change the properties and restart.

Our 6.5.1 cluster is faster than the non-sharded 4.10.4 master/slave cluster, but I'm not happy with the stability in prod. We've had more search outages in the past six months than we had in the previous four years. I've had Solr in prod since version 1.2, and this is the first time it has really embarrassed me.

There are good things. Search is faster, and we're handling double the query volume with 3X the docs.

Sorry for the rant, but it has not been a good fall semester for our students (customers).

Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Dec 15, 2017, at 9:46 AM, Erick Erickson wrote:
>
> There's pretty much zero chance that it'll go away; too much current
> and ongoing functionality depends on it.
>
> 1> Old-style replication has always been used for "full sync" in
> SolrCloud when peer sync can't be done.
>
> 2> The new TLOG and PULL replica types are a marriage of old-style
> master/slave and SolrCloud. In particular, a PULL replica is
> essentially an old-style slave. A TLOG replica is an old-style slave
> that also maintains a transaction log so it can take over leadership
> if necessary.
>
> Best,
> Erick
>
> On Fri, Dec 15, 2017 at 8:56 AM, David Hastings wrote:
>> So I don't step on the other thread: I want to be reassured that
>> legacy master/slave/repeater replication will continue to be supported
>> in future Solr versions. Our infrastructure is set up for this, and
>> instead of the HA redundancies that SolrCloud provides, we have already
>> spent a lot of time and resources on very expensive servers to handle
>> Solr in standalone mode.
>>
>> thanks,
>> -David
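The angle brackets in Walter's config snippet did not survive the mailing list; only the property references remain. The stock ReplicationHandler config those properties plug into looks roughly like the sketch below, where the masterUrl host, core name, and poll interval are placeholder values (Walter's actual slave property was textbooks.enable.slave):

    <!-- solrconfig.xml: one handler serves both roles; each host picks its
         role at startup via -Denable.master=... / -Denable.slave=... -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="enable">${enable.master:false}</str>
        <str name="replicateAfter">commit</str>
      </lst>
      <lst name="slave">
        <str name="enable">${enable.slave:false}</str>
        <!-- placeholder host and core name -->
        <str name="masterUrl">http://master-host:8983/solr/textbooks/replication</str>
        <str name="pollInterval">00:05:00</str>
      </lst>
    </requestHandler>

On the config-deployment pain: recent Solr releases also ship bin/solr zk upconfig -z <zkhost> -n <configname> -d <localdir>, which covers much of the same ground as the kazoo approach, though it does require a Solr install wherever it runs.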
Re: legacy replication
There's pretty much zero chance that it'll go away; too much current and ongoing functionality depends on it.

1> Old-style replication has always been used for "full sync" in SolrCloud when peer sync can't be done.

2> The new TLOG and PULL replica types are a marriage of old-style master/slave and SolrCloud. In particular, a PULL replica is essentially an old-style slave. A TLOG replica is an old-style slave that also maintains a transaction log so it can take over leadership if necessary.

Best,
Erick

On Fri, Dec 15, 2017 at 8:56 AM, David Hastings wrote:
> So I don't step on the other thread: I want to be reassured that
> legacy master/slave/repeater replication will continue to be supported
> in future Solr versions. Our infrastructure is set up for this, and
> instead of the HA redundancies that SolrCloud provides, we have already
> spent a lot of time and resources on very expensive servers to handle
> Solr in standalone mode.
>
> thanks,
> -David
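To make the replica types concrete: in the 7.x Collections API the replica mix is chosen when the collection is created. A hedged sketch with placeholder collection and configset names:

    # One TLOG replica per shard (indexes, can become leader) plus two PULL
    # replicas (copy finished segments only, like legacy slaves). Names are
    # illustrative; nrtReplicas=0 means no conventional NRT replicas at all.
    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=1&nrtReplicas=0&tlogReplicas=1&pullReplicas=2&collection.configName=myconfig"

Individual replicas can likewise be added later with action=ADDREPLICA and type=PULL (or type=TLOG).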
legacy replication
So I don't step on the other thread: I want to be reassured that legacy master/slave/repeater replication will continue to be supported in future Solr versions. Our infrastructure is set up for this, and instead of the HA redundancies that SolrCloud provides, we have already spent a lot of time and resources on very expensive servers to handle Solr in standalone mode.

thanks,
-David