Re: Replication Clarification Please

Ravi Solr Mon, 09 May 2011 08:25:14 -0700

Hello Mr. Bell,
                   Thank you very much for patiently responding to my
questions. We optimize once every 2 days. Could you kindly rephrase
this part of your answer? I could not understand it: "If the amount of
time if > 10 segments, I believe that might also trigger a whole index,
since you cycled all the segments. In that case I think you might want
to increase the mergeFactor."


The current index folder details and sizes are given below

MASTER
--------------
   5K   search-data/spellchecker2
 480M   search-data/index
   5K   search-data/spellchecker1
   5K   search-data/spellcheckerFile
 480M   search-data

SLAVE
----------
   2K   search-data/index.20110509103950
 419M   search-data/index
 2.3G   search-data/index.20110429042508  ----> SLAVE is pointing to this directory
   5K   search-data/spellchecker1
   5K   search-data/spellchecker2
   5K   search-data/spellcheckerFile
 2.7G   search-data
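
For anyone who wants to reproduce a listing like the one above, here is a
minimal Python sketch that reports the size and file count of each
subdirectory of a Solr data directory and flags timestamped index.<TIMESTAMP>
directories. The function names and the 14-digit timestamp pattern are
assumptions, based only on the directory names shown above; this is not part
of Solr.

```python
import os
import re

# Assumed naming pattern for timestamped replica directories,
# e.g. index.20110429042508 (inferred from the listing, not from Solr docs).
TIMESTAMPED_INDEX = re.compile(r"^index\.\d{14}$")


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def summarize_data_dir(data_dir):
    """Map each subdirectory name to (size_bytes, file_count, is_timestamped)."""
    summary = {}
    for entry in sorted(os.listdir(data_dir)):
        full = os.path.join(data_dir, entry)
        if not os.path.isdir(full):
            continue  # only directories are of interest here
        file_count = sum(len(files) for _root, _dirs, files in os.walk(full))
        summary[entry] = (
            dir_size_bytes(full),
            file_count,
            bool(TIMESTAMPED_INDEX.match(entry)),
        )
    return summary
```

Calling summarize_data_dir on the search-data directory of the master and of
the slave makes it easy to compare file counts and to spot leftover
index.<TIMESTAMP> directories.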

Thanks,

Ravi Kiran Bhaskar

On Sat, May 7, 2011 at 11:49 PM, Bill Bell <billnb...@gmail.com> wrote:
> I did not see answers... I am not an authority, but will tell you what I
> think....
>
> Did you get some answers?
>
>
> On 5/6/11 2:52 PM, "Ravi Solr" <ravis...@gmail.com> wrote:
>
>>Hello,
>>        Pardon me if this has been already answered somewhere and I
>>apologize for a lengthy post. I was wondering if anybody could help me
>>understand Replication internals a bit more. We have a single
>>master-slave setup (solr 1.4.1) with the configurations as shown
>>below. Our environment is quite commit heavy (almost 100s of docs
>>every 5 minutes), and all indexing is done on Master and all searches
>>go to the Slave. We are seeing that the slave replication performance
>>gradually degrades: the speed drops below 1 kbps and replication
>>ultimately gets backed up. Once we reload the core on the slave it works
>>fine for some time and then it gets backed up again. We have mergeFactor set
>>to 10 and ramBufferSizeMB is set to 32MB and solr itself is running
>>with 2GB memory and locktype is simple on both master and slave.
>
> How big is your index? How many rows and GB ?
>
> Every time you replicate, there are several cache resets. So if you are
> constantly indexing, you need to be careful about the performance impact.
>
>>
>>I am hoping that the following questions might help me understand the
>>replication performance issue better (Replication Configuration is
>>given at the end of the email)
>>
>>1. Does the Slave get the whole index every time during replication or
>>just the delta since the last replication happened ?
>
>
> It depends. If you do an OPTIMIZE every time you index, then you will be
> sending the whole index down.
> If the amount of time if > 10 segments, I believe that might also trigger
> a whole index, since you cycled all the segments.
> In that case I think you might want to increase the mergeFactor.
>
>
>>
>>2. If there are a huge number of queries being done on the slave, will it
>>affect the replication ? How can I improve the performance ? (see the
>>replication details at the bottom of the page)
>
> It seems that might be one way that you get the index.* directories. At
> least I see it more frequently when there is huge load and you are trying
> to replicate.
> You could replicate less frequently.
>
>>
>>3. Will the segment names be the same on master and slave after
>>replication ? I see that they are different. Is this correct ? If it
>>is correct, how does the slave know what to fetch the next time, i.e.
>>the delta ?
>
> Yes, they had better be. In the old days you could just rsync the data
> directory from master to slave and reload the core, and that worked fine.
>
>>
>>4. When and why does the index.<TIMESTAMP> folder get created ? I see
>>this type of folder getting created only on slave and the slave
>>instance is pointing to it.
>
> I would love to know all the conditions... I believe it is supposed to
> replicate to index.*, then reload to point to it. But sometimes it gets
> stuck in index.* land and never goes back to straight index.
>
> There are several bug fixes for this in 3.1.
>
>>
>>5. Does replication process copy both the index and index.<TIMESTAMP>
>>folder ?
>
> I believe it is supposed to copy the segment or whole index/ from master
> to index.* on slave.
>
>>
>>6. What happens if the replication kicks off before the previous
>>invocation has completed ? Will the 2nd invocation block or will
>>it go through, causing more confusion ?
>
> That is not supposed to happen. If a replication is in progress, it should
> not copy again until that one is complete.
> Try it, just delete the data/*, restart SOLR, and force a replication,
> while it is syncing, force it again. Does not seem to work for me.
>>
>>7. If I have to prep a new master-slave combination, is it OK to copy
>>the respective contents into the new master and slave and start solr ? Or
>>do I have to wipe the new slave and let it replicate from its new
>>master ?
>
> If you shut down the slave, copy the data/* directory and restart, you
> should be fine. That is how we fix the data/ dir when there is corruption.
>>
>>8. Doing an 'ls | wc -l' on the index folder of master and slave gave 194
>>and 17968 respectively... The slave has a lot of segments_xxx files. Is
>>this normal ?
>
> Several bugs fixed in 3.1 for this one. Not a good thing.... You are
> getting leftover segments or index.* directories.
>>
>>MASTER
>>
>><requestHandler name="/replication" class="solr.ReplicationHandler" >
>>    <lst name="master">
>>        <str name="replicateAfter">startup</str>
>>        <str name="replicateAfter">commit</str>
>>        <str name="replicateAfter">optimize</str>
>>
>>        <str name="confFiles">schema.xml,stopwords.txt</str>
>>        <str name="commitReserveDuration">00:00:10</str>
>>    </lst>
>></requestHandler>
>>
>>
>>SLAVE
>>
>><requestHandler name="/replication" class="solr.ReplicationHandler" >
>>    <lst name="slave">
>>        <str name="masterUrl">master core url</str>
>>        <str name="pollInterval">00:03:00</str>
>>        <str name="compression">internal</str>
>>        <str name="httpConnTimeout">5000</str>
>>        <str name="httpReadTimeout">10000</str>
>>     </lst>
>></requestHandler>
>>
>>
>>REPLICATION DETAILS FROM PAGE
>>
>>Master     master core url
>>Poll Interval     00:03:00
>>Local Index     Index Version: 1296217104577, Generation: 20190
>>    Location: /data/solr/core/search-data/index.20110429042508
>>    Size: 2.1 GB
>>    Times Replicated Since Startup: 672
>>    Previous Replication Done At: Fri May 06 15:41:01 EDT 2011
>>    Config Files Replicated At: null
>>    Config Files Replicated: null
>>    Times Config Files Replicated Since Startup: null
>>    Next Replication Cycle At: Fri May 06 15:44:00 EDT 2011
>>    Current Replication Status     Start Time: Fri May 06 15:41:00 EDT 2011
>>    Files Downloaded: 43 / 197
>>    Downloaded: 477.08 KB / 588.82 MB [0.0%]
>>    Downloading File: _hdm.prx, Downloaded: 9.3 KB / 9.3 KB [100.0%]
>>    Time Elapsed: 967s, Estimated Time Remaining: 1221166s, Speed: 505 bytes/s
>>
>>
>>Ravi Kiran Bhaskar
>
>
>
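As a sanity check on the replication details quoted above, the reported speed
and remaining time follow directly from the download counters. The sketch
below is plain arithmetic, not Solr code; the function name is hypothetical,
and it assumes that the page reports KB and MB as multiples of 1024 bytes.

```python
KB = 1024          # assumption: the details page uses binary units
MB = 1024 * 1024


def replication_estimate(downloaded_bytes, total_bytes, elapsed_seconds):
    """Return (average speed in bytes/s, estimated seconds remaining)."""
    speed = downloaded_bytes / elapsed_seconds
    remaining_seconds = (total_bytes - downloaded_bytes) / speed
    return speed, remaining_seconds


# Figures taken from the quoted status: 477.08 KB of 588.82 MB after 967 s.
speed, remaining = replication_estimate(477.08 * KB, 588.82 * MB, 967)
print(f"{speed:.0f} bytes/s, roughly {remaining / 86400:.1f} days remaining")
```

Under these assumptions the result agrees with the quoted 505 bytes/s and an
estimate of about two weeks, so the page is reporting a replication that is
effectively stalled rather than miscalculating its estimate.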
