RE: More replication questions
Thanks for the responses. If we used a poll interval of one second (for 1.4), wouldn't we still have to wait for the replication to finish? In that case, couldn't it take minutes (depending on index size) to get that data onto the slave? Or would there be a lot less data to pull down because of the high replication frequency (i.e., will it only have small files to replicate)?

-----Original Message-----
From: solr-user-return-19721-laurent.vauthrin=disney@lucene.apache.org [mailto:solr-user-return-19721-laurent.vauthrin=disney@lucene.apache.org] On Behalf Of Noble Paul
Sent: Tuesday, March 17, 2009 9:04 PM
To: solr-user@lucene.apache.org
Subject: Re: More replication questions

On Wed, Mar 18, 2009 at 12:34 AM, Vauthrin, Laurent laurent.vauth...@disney.com wrote:
> Hello,
>
> I have a couple of questions relating to replication in Solr. As far as I understand it, the replication approach for both 1.3 and 1.4 involves having the slaves poll the master for updates to the index. We're curious to know if it's possible to have a more dynamic/quicker way to propagate updates.
>
> 1. Is there a built-in mechanism for pushing out updates (inserts/deletes) received by the master to the slaves?

The pull mechanism in 1.4 can be good enough. The 'pollInterval' can be as small as 1 sec, so you will get the updates within a second. Isn't that good enough?

> 2. Is it discouraged to post updates to multiple Solr instances? (All instances can receive updates and fulfill query requests.)

This is prone to serious errors: the Solr instances may not all be in sync.

> 3. If that sort of capability is not supported, why was it not implemented this way? (So that we don't repeat any mistakes.)

Push-based replication is in the cards, but the implementation is not trivial. In Solr, commits are already expensive, so a second's delay may be all right.

> 4. Has anyone else on the list attempted to do this? The intent here is to achieve optimal performance while having the freshest data possible, if that's possible.
>
> Thanks,
> Laurent

--
--Noble Paul
Re: More replication questions
It depends on a few things:
1) the number of docs added
2) whether the index is optimized
3) autowarming

If only a few docs are added and the index is not optimized, the replication will be done in milliseconds (the changed files will be small). If there is no autowarming, there should be no delay in seeing the new data.

On Thu, Mar 19, 2009 at 6:23 AM, Vauthrin, Laurent laurent.vauth...@disney.com wrote:
> Thanks for the responses. If we used a poll interval of one second (for 1.4), wouldn't we still have to wait for the replication to finish? In that case, couldn't it take minutes (depending on index size) to get that data onto the slave? Or would there be a lot less data to pull down because of the high replication frequency (i.e., will it only have small files to replicate)?

--
--Noble Paul
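To illustrate the point above about small changed files versus an optimized index, here is a minimal Python sketch. It assumes the slave fetches only the files it does not already hold, compared here by name and size. That comparison rule, the function names, and the file names and sizes are all made up for illustration and are not taken from Solr's code.

```python
from typing import Dict, List


def files_to_fetch(master_files: Dict[str, int], slave_files: Dict[str, int]) -> List[str]:
    """Return master files the slave lacks or holds with a different size.

    Assumption: files are compared by name and size only (a simplification).
    """
    return sorted(
        name
        for name, size in master_files.items()
        if slave_files.get(name) != size
    )


def bytes_to_fetch(master_files: Dict[str, int], slave_files: Dict[str, int]) -> int:
    """Total size in bytes of the files the slave would have to download."""
    return sum(master_files[name] for name in files_to_fetch(master_files, slave_files))


# Hypothetical index listings: file name -> size in bytes.
slave = {"_0.cfs": 500_000_000, "_1.cfs": 40_000_000}

# Case 1: a few docs were added, so the master gained one small segment.
master_incremental = {**slave, "_2.cfs": 20_000}

# Case 2: the master was optimized, so all segments were merged into one new file.
master_optimized = {"_3.cfs": 540_020_000}

for label, master in [("incremental", master_incremental), ("optimized", master_optimized)]:
    print(label, files_to_fetch(master, slave), bytes_to_fetch(master, slave))
```

In the incremental case only the small new segment is pulled; after an optimize, nothing on the slave matches any more, so the whole index has to be transferred.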
Re: More replication questions
On Wed, Mar 18, 2009 at 12:34 AM, Vauthrin, Laurent laurent.vauth...@disney.com wrote:
> Hello,
>
> I have a couple of questions relating to replication in Solr. As far as I understand it, the replication approach for both 1.3 and 1.4 involves having the slaves poll the master for updates to the index. We're curious to know if it's possible to have a more dynamic/quicker way to propagate updates.
>
> 1. Is there a built-in mechanism for pushing out updates (inserts/deletes) received by the master to the slaves?

The pull mechanism in 1.4 can be good enough. The 'pollInterval' can be as small as 1 sec, so you will get the updates within a second. Isn't that good enough?

> 2. Is it discouraged to post updates to multiple Solr instances? (All instances can receive updates and fulfill query requests.)

This is prone to serious errors: the Solr instances may not all be in sync.

> 3. If that sort of capability is not supported, why was it not implemented this way? (So that we don't repeat any mistakes.)

Push-based replication is in the cards, but the implementation is not trivial. In Solr, commits are already expensive, so a second's delay may be all right.

> 4. Has anyone else on the list attempted to do this? The intent here is to achieve optimal performance while having the freshest data possible, if that's possible.
>
> Thanks,
> Laurent

--
--Noble Paul
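As a back-of-the-envelope check on "updates within a second", the sketch below adds up the pieces this thread mentions: the wait for the next poll, the transfer of changed files, and autowarming. The additive model, the function name, and every number are assumptions for illustration, not measurements or Solr behavior.

```python
def worst_case_staleness(
    poll_interval_s: float,
    bytes_to_transfer: int,
    bandwidth_bytes_per_s: float,
    autowarm_s: float = 0.0,
) -> float:
    """Rough upper bound, in seconds, on how stale a slave can be.

    Assumed model: a commit lands just after a poll, so the slave waits a full
    poll interval, then downloads the changed files, then autowarms its caches.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth_bytes_per_s must be positive")
    transfer_s = bytes_to_transfer / bandwidth_bytes_per_s
    return poll_interval_s + transfer_s + autowarm_s


# Hypothetical numbers: 1 s poll interval on a link moving 100 MB/s.
small_update = worst_case_staleness(1.0, 20_000, 100_000_000)
after_optimize = worst_case_staleness(1.0, 540_000_000, 100_000_000, autowarm_s=30.0)

print(f"small update:   {small_update:.4f} s")
print(f"after optimize: {after_optimize:.1f} s")
```

Under these assumed numbers the poll interval dominates for small updates, while transfer and warm-up dominate after an optimize, which is why a short poll interval alone does not guarantee fresh data.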