That's great information.

Thanks for all the help and guidance, it's been invaluable.

Thanks
Ben

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 26 March 2012 12:21
To: solr-user@lucene.apache.org
Subject: Re: Simple Slave Replication Question

It's the optimize step. Optimize essentially forces all the segments to be 
copied into a single new segment, which means that your entire index will be 
replicated to the slaves.
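
To make the effect concrete, here is a toy model in plain Java. It is only a sketch, not Solr's replication code: the class name, the segment names (_1, _2, ...) and the sizes in MB are all made up for illustration, and it assumes the slave fetches exactly those files it does not already hold by name (the real handler compares more than names).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of segment-based replication; NOT Solr's actual implementation.
// All names and sizes here are hypothetical.
final class SegmentReplicationSketch {

    // Size the slave must fetch: every master file the slave does not already hold.
    static long bytesToFetch(Map<String, Long> masterFiles, Map<String, Long> slaveFiles) {
        long total = 0;
        for (Map.Entry<String, Long> file : masterFiles.entrySet()) {
            if (!slaveFiles.containsKey(file.getKey())) {
                total += file.getValue();
            }
        }
        return total;
    }

    // An optimize rewrites all existing segments into one brand-new segment file.
    static Map<String, Long> optimize(Map<String, Long> files, String mergedName) {
        long size = 0;
        for (long s : files.values()) {
            size += s;
        }
        Map<String, Long> merged = new LinkedHashMap<>();
        merged.put(mergedName, size);
        return merged;
    }

    public static void main(String[] args) {
        // Slave is in sync with a roughly 5 GB index made of two segments (sizes in MB).
        Map<String, Long> slave = new LinkedHashMap<>();
        slave.put("_1", 3000L);
        slave.put("_2", 2000L);

        // An incremental commit on the master adds one small new segment.
        Map<String, Long> master = new LinkedHashMap<>(slave);
        master.put("_3", 40L);
        System.out.println("after commit:   " + bytesToFetch(master, slave) + " MB");

        // After an optimize the master holds a single segment the slave has never seen.
        Map<String, Long> optimized = optimize(master, "_4");
        System.out.println("after optimize: " + bytesToFetch(optimized, slave) + " MB");
    }
}
```

With these made-up numbers the commit alone costs the slave 40 MB, while the optimize costs it the full 5040 MB, because none of the old file names survive the merge.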

In recent Solr versions, there's usually no need to optimize, so unless and 
until you can demonstrate a noticeable change, I'd just leave the optimize step 
off. In fact, trunk renames it to forceMerge or something just because it's so 
common for people to think "of course I want to optimize my index!" and get the 
unintended consequences you're seeing, even though the optimize doesn't 
actually do that much good in most cases.

Some people just do the optimize once a day (or week or whatever) during 
off-peak hours as a compromise.
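
One way to sketch that compromise in the ingest tool, in plain Java: always commit, but only optimize when a small policy check says so. Everything here is an assumption for illustration: the class name, the 02:00 to 04:00 window, and the idea that the tool remembers the date of its last optimize.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical policy helper; not part of Solr or SolrJ.
final class OptimizePolicySketch {

    // Assumed off-peak window: 02:00 (inclusive) to 04:00 (exclusive).
    static final LocalTime WINDOW_START = LocalTime.of(2, 0);
    static final LocalTime WINDOW_END = LocalTime.of(4, 0);

    // True only inside the off-peak window, and at most once per calendar day.
    // lastOptimized may be null if no optimize has been recorded yet.
    static boolean shouldOptimize(LocalDateTime now, LocalDate lastOptimized) {
        LocalTime time = now.toLocalTime();
        boolean offPeak = !time.isBefore(WINDOW_START) && time.isBefore(WINDOW_END);
        boolean notYetToday = lastOptimized == null || lastOptimized.isBefore(now.toLocalDate());
        return offPeak && notYetToday;
    }

    public static void main(String[] args) {
        // Midday ingest: commit only, so the slaves pull just the new segments.
        System.out.println(shouldOptimize(LocalDateTime.of(2012, 3, 26, 12, 21), null));
        // Night-time ingest with no optimize recorded yet: optimize is allowed.
        System.out.println(shouldOptimize(LocalDateTime.of(2012, 3, 26, 2, 30), null));
    }
}
```

At the end of ingestion the tool would then call server.commit() every time and server.optimize() only when shouldOptimize(...) returns true, so the full-index copy happens at most once a day.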

Best
Erick


On Mon, Mar 26, 2012 at 5:02 AM, Ben McCarthy <ben.mccar...@tradermedia.co.uk> 
wrote:
> Hello,
>
> Had to leave the office so didn't get a chance to reply.  Nothing in the 
> logs.  Just ran one through from the ingest tool.
>
> Same result: a full copy of the index.
>
> Is it something to do with:
>
> server.commit();
> server.optimize();
>
> I call this at the end of the ingestion.
>
> Would optimize then work across the whole index?
>
> Thanks
> Ben
>
> -----Original Message-----
> From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com]
> Sent: 23 March 2012 15:10
> To: solr-user@lucene.apache.org
> Subject: Re: Simple Slave Replication Question
>
> Also, what happens if, instead of adding the 40K docs you add just one and 
> commit?
>
> 2012/3/23 Tomás Fernández Löbbe <tomasflo...@gmail.com>
>
>> Have you changed the mergeFactor or are you using 10 as in the
>> example solrconfig?
>>
>> What do you see in the slave's log during replication? Do you see any
>> line like "Skipping download for..."?
>>
>>
>> On Fri, Mar 23, 2012 at 11:57 AM, Ben McCarthy <
>> ben.mccar...@tradermedia.co.uk> wrote:
>>
>>> I just have an index directory.
>>>
>>> I push the documents through with a change to a field.  I'm using
>>> SolrJ to do this.  I'm using the guide from the wiki to set up the
>>> replication.  When the feed of updates to the master finishes I call
>>> a commit, again using SolrJ.  I then have a poll period of 5 minutes
>>> from the slave.  When it kicks in I see a new version of the index
>>> and then it copies the full 5 GB index.
>>>
>>> Thanks
>>> Ben
>>>
>>> -----Original Message-----
>>> From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com]
>>> Sent: 23 March 2012 14:29
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Simple Slave Replication Question
>>>
>>> Hi Ben, only new segments are replicated from master to slave. In a
>>> situation where all the segments are new, this will cause the index
>>> to be fully replicated, but this rarely happens with incremental
>>> updates. It can also happen if the slave Solr assumes it has an
>>> "invalid" index.
>>> Are you committing or optimizing on the slaves? After replication,
>>> is the index directory on the slaves called "index" or "index.<timestamp>"?
>>>
>>> Tomás
>>>
>>> On Fri, Mar 23, 2012 at 11:18 AM, Ben McCarthy <
>>> ben.mccar...@tradermedia.co.uk> wrote:
>>>
>>> > So do you just simply address this with big NICs and network pipes?
>>> >
>>> > -----Original Message-----
>>> > From: Martin Koch [mailto:m...@issuu.com]
>>> > Sent: 23 March 2012 14:07
>>> > To: solr-user@lucene.apache.org
>>> > Subject: Re: Simple Slave Replication Question
>>> >
>>> > I guess this would depend on network bandwidth, but we move around
>>> > 150G/hour when hooking up a new slave to the master.
>>> >
>>> > /Martin
>>> >
>>> > On Fri, Mar 23, 2012 at 12:33 PM, Ben McCarthy <
>>> > ben.mccar...@tradermedia.co.uk> wrote:
>>> >
>>> > > Hello,
>>> > >
>>> > > I'm looking at the replication from a master to a number of slaves.
>>> > > I have configured it and it appears to be working.  When
>>> > > updating 40K records on the master, is it standard to always copy
>>> > > over the full index, currently 5 GB in size?  If this is standard,
>>> > > what do people do who have massive 200 GB indexes? Does it not
>>> > > take a while to bring the slaves in line with the master?
>>> > >
>>> > > Thanks
>>> > > Ben
>>> > >
>>> > > ________________________________________
>>> > >
>>> > >
>>> > > This e-mail is sent on behalf of Trader Media Group Limited,
>>> > > Registered
>>> > > Office: Auto Trader House, Cutbush Park Industrial Estate,
>>> > > Danehill, Lower Earley, Reading, Berkshire, RG6 4UT (Registered in
>>> > > England No. 4768833).
>>> > > This email and any files transmitted with it are confidential
>>> > > and may be legally privileged, and intended solely for the use
>>> > > of the individual or entity to whom they are addressed. If you
>>> > > have received this email in error please notify the sender. This
>>> > > email message has been swept for the presence of computer viruses.
>>> > >
>>> > >
>>> >
>>>
>>
>
