Re: performance during index switch

2011-01-19 Thread Jonathan Rochkind

On 1/19/2011 2:56 PM, Tri Nguyen wrote:

> Yes, during a commit.
>
> I'm planning to do as you suggested, having a master do the indexing and
> replicate the index to a slave, which leads to my next question.
>
> While the slave replicates the index files from the master, how does it
> impact performance on the slave?


Of that I am not certain, because I haven't done it yet myself, but I am 
optimistic it will be tolerable.


As with any commit, when the slave replicates it will temporarily make a 
second copy of any changed index files (possibly the whole index). It 
will then set up new searchers on the new copy of the index and warm 
them; once warmed, it'll switch live searches over to the new index and 
delete any old copies of the index.


So you may still need a bunch of 'extra' RAM in the JVM to accommodate 
that overlap period.  You will need some extra disk space. As for the 
actual CPU: it will take some CPU for the slave to run the new warmers, 
but it should be tolerable and not very noticeable... I'm hoping.


One main benefit of the replication setup is that you can _optimize_ on 
the master, which will be completely out of the way of the slave.


Even with the replication setup, though, you still can't commit (i.e. 
pull down changes from the master) in "near real time" in 1.4. You can't 
commit so often that a new index is not done warming when the next 
commit comes in, or your Solr will grind to a halt as it uses too much 
CPU and RAM. There are various ways people have suggested you can try to 
work around this, but I haven't been too happy with any of 'em; I think 
it's best just not to commit/pull down changes from the master that 
often.  Unless you REALLY need to, and are prepared to get into the 
details of Solr to figure out how to make it work as well as it can.
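
To make the warming constraint above concrete, here is a rough
back-of-the-envelope sketch in Python. It is not Solr code. It assumes
a deliberately simple model: every commit starts one warming searcher
that takes a fixed time to warm, and commits arrive at a fixed interval.
The function names and both parameters are hypothetical, chosen only for
illustration.

```python
import math


def overlapping_warmers(warm_seconds, commit_interval_seconds):
    """Estimate how many searchers are warming at once in steady state.

    Simplified model (an assumption, not Solr's actual scheduling):
    each commit starts one warming searcher that takes exactly
    warm_seconds, and commits arrive exactly every
    commit_interval_seconds.
    """
    if warm_seconds < 0 or commit_interval_seconds <= 0:
        raise ValueError("warm_seconds must be >= 0 and the interval > 0")
    return math.ceil(warm_seconds / commit_interval_seconds)


def is_safe_interval(warm_seconds, commit_interval_seconds):
    """True when one searcher finishes warming before the next commit."""
    return overlapping_warmers(warm_seconds, commit_interval_seconds) <= 1


# Hypothetical numbers: warming takes 90 s.
print(is_safe_interval(90, 300))     # commits every 5 minutes
print(overlapping_warmers(90, 30))   # commits every 30 seconds
```

Under this model, a result above 1 from overlapping_warmers means
warming work piles up, which is the "grind to a halt" case. Solr's
maxWarmingSearchers setting in solrconfig.xml limits how many searchers
may warm concurrently; real warm times vary from commit to commit, so
measure them rather than trusting a fixed number.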


Re: performance during index switch

2011-01-19 Thread Otis Gospodnetic
Tri,

During replication:
* extra disk IO on slaves during replication - worst if you are replicating an 
optimized index, which can hurt if your index is not RAM resident
* the above will consume some of your OS buffer cache, which can hurt
* increased network usage - I've never seen this become a real problem, but if 
you are replicating a large and always-optimized index, it might cause problems

After replication:
* potentially high CPU usage during the warmup of the new IndexSearcher, 
depending on warmup queries used, cache warmup settings, etc.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message -----
> From: Tri Nguyen
> To: solr-user@lucene.apache.org
> Sent: Wed, January 19, 2011 2:56:58 PM
> Subject: Re: performance during index switch
>
> Yes, during a commit.
>
> I'm planning to do as you suggested, having a master do the indexing and
> replicate the index to a slave, which leads to my next question.
>
> While the slave replicates the index files from the master, how does it
> impact performance on the slave?
>
> Tri


Re: performance during index switch

2011-01-19 Thread Tri Nguyen
Yes, during a commit.
 
I'm planning to do as you suggested, having a master do the indexing and 
replicate the index to a slave, which leads to my next question.
 
While the slave replicates the index files from the master, how does it impact 
performance on the slave?
 
Tri


--- On Wed, 1/19/11, Jonathan Rochkind  wrote:


From: Jonathan Rochkind 
Subject: Re: performance during index switch
To: "solr-user@lucene.apache.org" 
Date: Wednesday, January 19, 2011, 11:30 AM


During commit?

A commit (and especially an optimize) can be expensive in terms of both CPU and 
RAM as your index grows larger, leaving less CPU for querying, and possibly 
less RAM which can cause Java GC slowdowns in some cases.

A common suggestion is to use Solr replication to separate out a Solr index 
that you index to, and then replicate to a slave index that actually serves 
your queries. This should minimize any performance problems on your 'live' Solr 
while indexing, although there's still something that has to be done for the 
actual replication of course. Haven't tried it yet myself.  Plan to -- my plan 
is actually to put them both on the same server (I've only got one), but in 
separate JVMs, and on a server with enough CPU cores that hopefully the 
indexing won't steal CPU the querying needs.

On 1/19/2011 2:23 PM, Tri Nguyen wrote:
> Hi,
>   Are there performance issues during the index switch?
>   As the index gets bigger, does response time slow down?  Are there any
>   studies on this?
>   Thanks,
>   Tri


Re: performance during index switch

2011-01-19 Thread Markus Jelsma
> Hi,
>  
> Are there performance issues during the index switch?

What do you mean by index switch?

>  
> As the index gets bigger, does response time slow down?  Are there any
> studies on this?

I haven't seen any studies as of yet, but response time will slow down for some 
components. Sorting and faceting will tend to consume more RAM and CPU cycles 
as the number of documents and unique values increases. Queries also become 
increasingly slow if you ask for very high start values. And, of course, cache 
warming queries usually take some more time as well, increasing the latency 
between commit and availability.
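
The cost of very high start values can be illustrated with a small
Python sketch. It is a simplified model, not Solr or Lucene code: it
assumes that, to return `rows` results beginning at offset `start`, the
engine has to collect the best start + rows hits and then discard the
first `start` of them. The function name and the score list are made up
for illustration.

```python
import heapq


def fetch_page(scores, start, rows):
    """Return one page of scores (best first) and how many hits were kept.

    Simplified model (an assumption): serving a page at offset `start`
    means collecting the top `start + rows` hits and throwing away the
    first `start`.
    """
    if start < 0 or rows < 0:
        raise ValueError("start and rows must be non-negative")
    kept = heapq.nlargest(start + rows, scores)
    return kept[start:], len(kept)


scores = list(range(100_000))  # hypothetical relevance scores

first_page, kept_first = fetch_page(scores, start=0, rows=10)
deep_page, kept_deep = fetch_page(scores, start=50_000, rows=10)

# Both pages hold 10 results, but the deep page had to keep far more hits.
print(len(first_page), kept_first)
print(len(deep_page), kept_deep)
```

In this model the work and memory grow with start + rows rather than
with rows alone, which is one way to see why deep pages get slower as
the start offset climbs.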

> Thanks,
>  
> Tri


Re: performance during index switch

2011-01-19 Thread Jonathan Rochkind

During commit?

A commit (and especially an optimize) can be expensive in terms of both 
CPU and RAM as your index grows larger, leaving less CPU for querying, 
and possibly less RAM which can cause Java GC slowdowns in some cases.


A common suggestion is to use Solr replication to separate out a Solr 
index that you index to, and then replicate to a slave index that 
actually serves your queries. This should minimize any performance 
problems on your 'live' Solr while indexing, although there's still 
something that has to be done for the actual replication of course. 
Haven't tried it yet myself.  Plan to -- my plan is actually to put them 
both on the same server (I've only got one), but in separate JVMs, and 
on a server with enough CPU cores that hopefully the indexing won't 
steal CPU the querying needs.


On 1/19/2011 2:23 PM, Tri Nguyen wrote:

> Hi,
>
> Are there performance issues during the index switch?
>
> As the index gets bigger, does response time slow down?  Are there any
> studies on this?
>
> Thanks,
>
> Tri


performance during index switch

2011-01-19 Thread Tri Nguyen
Hi,
 
Are there performance issues during the index switch?
 
As the index gets bigger, does response time slow down?  Are there any 
studies on this?
 
Thanks,
 
Tri