Re: should slave replication be turned off / on during master clean and re-index?
On 5/1/2012 6:55 AM, geeky2 wrote:
> you said, you don't use autocommit. if so - then why don't you use / like autocommit?

It's not really that I don't like it, I just don't need it. I think that it actually caused me problems when I first started using Solr (1.4.0), but that was long enough ago that I no longer remember the details. I use the live/build core method, so I do not need to be able to search the documents as they are being added. A commit at the end is good enough. Indexing already creates multiple Lucene segments whenever ramBufferSizeMB fills up.

I used to use the dataimporter for everything, with a Perl-based build system using cron and LWP. Now I have a multi-threaded SolrJ application that only uses the importer for full rebuilds, which are very rare. Because I could not do replication between 1.4.1 and 3.x, I had to abandon replication in order to upgrade Solr. The new build program updates both of my index chains in parallel.

Thanks,
Shawn
Re: should slave replication be turned off / on during master clean and re-index?
thanks for all of the advice / help. i appreciate it ;) -- View this message in context: http://lucene.472066.n3.nabble.com/should-slave-replication-be-turned-off-on-during-master-clean-and-re-index-tp3945531p3959088.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: should slave replication be turned off / on during master clean and re-index?
Simply turn off replication during your rebuild-from-scratch. See the "disablereplication" command at:
http://wiki.apache.org/solr/SolrReplication#HTTP_API

The autocommit thing was, I think, in reference to keeping a partially rebuilt index from being replicated. Autocommit is usually a fine thing.

So your full rebuild looks like this:
1> disable replication on the master
2> rebuild the index (autocommit on or off, makes little difference as far as replication)
3> enable replication on the master

Best
Erick

On Tue, May 1, 2012 at 8:55 AM, geeky2 wrote:
> hello shawn,
>
> thanks for the reply.
>
> ok - i did some testing and yes you are correct.
>
> autocommit is doing the "commit" work in chunks. yes - the slaves are also going to go from having everything to nothing, then slowly building back up again, lagging behind the master.
>
> ... and yes - this is probably not what we need - as far as a replication strategy for the slaves.
>
> you said, you don't use autocommit. if so - then why don't you use / like autocommit?
>
> since we have not done this here - there is no established reference point, from an operations perspective.
>
> i am looking to formulate some sort of operation strategy, so ANY ideas or input is really welcome.
>
> it seems to me that we have to account for two operational strategies -
>
> the first operational mode is a "daily" append to the solr core after the database tables have been updated. this can probably be done with a simple delta import. i would think that autocommit could remain on for the master and replication could also be left on so the slaves picked up the changes ASAP. this seems like the mode that we would / should be in most of the time.
>
> the second operational mode would be a "build from scratch" mode, where changes in the schema necessitated a full re-index of the data. given that our site (powered by solr) must be up all of the time, and that our full index time on the master (for the moment) is hovering somewhere around 16 hours - it makes sense that some sort of parallel path - with a cut-over, must be used.
>
> in this situation is it possible to have the indexing process going on in the background - then have one commit at the end - then turn replication on for the slaves?
>
> are there disadvantages to this approach?
>
> also - i really like your suggestion of a "build core" and "live core". is this the approach you use?
>
> thank you for all of the great input
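A rough Python sketch of the three-step full rebuild described in this message. The `disablereplication` and `enablereplication` commands come from the HTTP API section of the SolrReplication wiki page linked above. The host, port, core name, the `/replication` handler path, and the `rebuild` callable are placeholders assumed for illustration, not anything taken from the poster's setup.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(core_url, command):
    """Build a ReplicationHandler URL, e.g. .../replication?command=disablereplication."""
    return f"{core_url.rstrip('/')}/replication?{urlencode({'command': command})}"


def full_rebuild(master_core_url, rebuild, send=None):
    """Disable replication on the master, rebuild, then enable replication again.

    `rebuild` is a hypothetical callable that runs the actual re-index.
    `send` performs the HTTP GET; it defaults to urllib and can be replaced in tests.
    """
    if send is None:
        send = lambda url: urlopen(url).read()

    # 1> disable replication on the master
    send(replication_url(master_core_url, "disablereplication"))

    # 2> rebuild the index; if this raises, replication stays disabled so the
    #    slaves never pull a partial index
    rebuild()

    # 3> enable replication on the master
    send(replication_url(master_core_url, "enablereplication"))


if __name__ == "__main__":
    # Dry run: print the URLs instead of calling a real server.
    full_rebuild("http://master_host:8983/solr/corename",
                 rebuild=lambda: print("rebuilding..."),
                 send=print)
```

Replication is only re-enabled after a successful rebuild, so a failed run leaves the slaves serving their last good copy.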
Re: should slave replication be turned off / on during master clean and re-index?
hello shawn,

thanks for the reply.

ok - i did some testing and yes you are correct.

autocommit is doing the "commit" work in chunks. yes - the slaves are also going to go from having everything to nothing, then slowly building back up again, lagging behind the master.

... and yes - this is probably not what we need - as far as a replication strategy for the slaves.

you said, you don't use autocommit. if so - then why don't you use / like autocommit?

since we have not done this here - there is no established reference point, from an operations perspective.

i am looking to formulate some sort of operation strategy, so ANY ideas or input is really welcome.

it seems to me that we have to account for two operational strategies -

the first operational mode is a "daily" append to the solr core after the database tables have been updated. this can probably be done with a simple delta import. i would think that autocommit could remain on for the master and replication could also be left on so the slaves picked up the changes ASAP. this seems like the mode that we would / should be in most of the time.

the second operational mode would be a "build from scratch" mode, where changes in the schema necessitated a full re-index of the data. given that our site (powered by solr) must be up all of the time, and that our full index time on the master (for the moment) is hovering somewhere around 16 hours - it makes sense that some sort of parallel path - with a cut-over, must be used.

in this situation is it possible to have the indexing process going on in the background - then have one commit at the end - then turn replication on for the slaves?

are there disadvantages to this approach?

also - i really like your suggestion of a "build core" and "live core". is this the approach you use?

thank you for all of the great input
Re: should slave replication be turned off / on during master clean and re-index?
On 4/27/2012 8:33 PM, geeky2 wrote:
> well, in this case when i say, "clean" (on the Master), i mean selecting the "Full Import with Cleaning" button from the DataImportHandler Development Console page in solr. at the top of the page, i have the check boxes selected for verbose and clean (*but i don't have the commit checkbox selected*).
>
> by doing the above process - doesn't this issue a deletion query - then start the import?
>
> and as a follow-up - when actually is the commit being done?
>
> here is my config from my solrconfig.xml file on the master
>
> * 60000 1000 * 10

With commit turned off on the import, the *import* will not do a commit at any time, so something else has to do the commit or you will never see the new index. In your case, you are relying on autocommit. Because I don't use autocommit, I can't say for sure that the following is right, but I believe that it is:

With your settings during a full import, your index will go from having everything in it to having 1000 documents or less within one minute of the import starting. If that is indeed what happens (and you should definitely test to make sure) and you have replication active, your slaves would have a reduced index that would slowly build back up as the import progressed on the master. I am pretty sure that's not what you want, so it is a good idea to disable replication until the full import is complete.

There is another option, one that would be a good idea if you make additions/deletions to your index on an interval that is smaller than the time it takes for a full import: Maintain a live core and a build core on your master server. Build a new index in the build core while simultaneously keeping the live core up to date. When the build is complete, bring the build core up to date and then swap the live core and build core. If replication is set up correctly, the slaves should replicate the new index as soon as the cores are swapped.

Thanks,
Shawn
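A small Python sketch of the live core / build core approach, assuming a multi-core Solr where the CoreAdmin handler is reachable at `/admin/cores` and the DataImportHandler is registered at `/dataimport` on the build core. The core names `live` and `build`, the host, and both handler paths are assumptions for illustration; only the `SWAP` action and the `full-import` command with its `clean` and `commit` parameters are standard Solr features.

```python
from urllib.parse import urlencode


def full_import_url(core_url, clean=True, commit=True):
    """DataImportHandler URL for a full import into one core.

    With commit=true the import itself commits once at the end,
    so it does not depend on autocommit to make the new index visible.
    """
    params = urlencode({
        "command": "full-import",
        "clean": "true" if clean else "false",
        "commit": "true" if commit else "false",
    })
    return f"{core_url.rstrip('/')}/dataimport?{params}"


def swap_cores_url(solr_url, live="live", build="build"):
    """CoreAdmin URL that swaps the live core with the freshly built core."""
    params = urlencode({"action": "SWAP", "core": live, "other": build})
    return f"{solr_url.rstrip('/')}/admin/cores?{params}"


if __name__ == "__main__":
    solr = "http://master_host:8983/solr"
    # 1. Build the new index in the build core while "live" keeps serving.
    print(full_import_url(f"{solr}/build"))
    # 2. Once the build core is complete and current, swap it in.
    print(swap_cores_url(solr))
```

The URLs are only printed here; issuing them (for example with `urllib.request.urlopen`) and checking that the import has finished before swapping are left to the build program.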
Re: should slave replication be turned off / on during master clean and re-index?
I guess you're looking for "disabling replication poll on slave".

Go to the Replication dashboard [1]. There you have options like Enable/Disable Poll, Force replication, and Abort replication.

Dashboard URL: http://slave_host:port/solr/corename/admin/replication/index.jsp

Poll Disabled => slave will not poll master for replication

- Jeevanandam

[1] http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard

On Apr 28, 2012, at 8:03 AM, geeky2 wrote:
> hello,
>
> thank you for the reply,
>
>> Does a "clean" mean issuing a deletion query (e.g. *:*) prior to re-indexing all of your content? I don't think the slaves will download any changes until you've committed at some point on the master.
>
> well, in this case when i say, "clean" (on the Master), i mean selecting the "Full Import with Cleaning" button from the DataImportHandler Development Console page in solr. at the top of the page, i have the check boxes selected for verbose and clean (*but i don't have the commit checkbox selected*).
>
> by doing the above process - doesn't this issue a deletion query - then start the import?
>
> and as a follow-up - when actually is the commit being done?
>
> here is my config from my solrconfig.xml file on the master
>
> * 60000 1000 * 10
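The same poll toggle is also available through the replication HTTP API, which is handy for scripting. A minimal Python sketch, assuming the slave's ReplicationHandler is mounted at `/replication` on the core; the host, port, and core name are placeholders, while `disablepoll` and `enablepoll` are the command names listed in the HTTP API section of the SolrReplication wiki.

```python
from urllib.parse import urlencode


def poll_url(slave_core_url, enabled):
    """URL that turns polling on or off for one slave core."""
    command = "enablepoll" if enabled else "disablepoll"
    return f"{slave_core_url.rstrip('/')}/replication?{urlencode({'command': command})}"


if __name__ == "__main__":
    slave = "http://slave_host:8983/solr/corename"
    print(poll_url(slave, enabled=False))  # before the master is rebuilt
    print(poll_url(slave, enabled=True))   # after the rebuild is committed
```

With more than one slave, each slave has to be toggled separately, which is one reason disabling replication once on the master can be simpler.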
Re: should slave replication be turned off / on during master clean and re-index?
hello,

thank you for the reply,

>> Does a "clean" mean issuing a deletion query (e.g. *:*) prior to re-indexing all of your content? I don't think the slaves will download any changes until you've committed at some point on the master. <<

well, in this case when i say, "clean" (on the Master), i mean selecting the "Full Import with Cleaning" button from the DataImportHandler Development Console page in solr. at the top of the page, i have the check boxes selected for verbose and clean (*but i don't have the commit checkbox selected*).

by doing the above process - doesn't this issue a deletion query - then start the import?

and as a follow-up - when actually is the commit being done?

here is my config from my solrconfig.xml file on the master

* 60000 1000 * 10
Re: should slave replication be turned off / on during master clean and re-index?
Does a "clean" mean issuing a deletion query (e.g. *:*) prior to re-indexing all of your content? I don't think the slaves will download any changes until you've committed at some point on the master.

If you delete everything and then commit, and proceed to re-index, then the slaves will pick that up at some point, perhaps sooner than you'd like. If you expect the slaves to continue to serve queries during this process, then don't commit on the master until you want the slaves to be aware of what you've done.

If it's more complicated, where you're going to perform multiple commits on the master, or shut it down and remove the data files etc., then I think it makes sense to turn off replication during that interval. Once you're happy with your update on the master, then enable replication again.

Cheers,
Jeff

On Apr 27, 2012, at 3:59 PM, geeky2 wrote:
> hello all,
>
> i am just getting replication going on our master and two (2) slaves.
>
> from time to time, i may need to do a complete re-index and clean on the master.
>
> should replication on the slave - remain On or Off during a full clean and re-index on the Master?
>
> thank you,

--
Jeff Schmidt
535 Consulting
j...@535consulting.com
http://www.535consulting.com
(650) 423-1068
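To illustrate the "don't commit until you're ready" ordering, here is a small Python sketch that generates the XML update messages for a clean re-index: one delete-by-query, the adds, and a single commit at the very end. The document fields are made up for the example, and posting each message to the master's update handler is left out. Note that this only holds if nothing else commits in between; autocommit on the master would still issue commits of its own.

```python
import xml.etree.ElementTree as ET


def add_xml(doc):
    """Serialize one document (a dict of field name -> value) as a Solr <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def rebuild_messages(docs):
    """Yield update messages in order: delete everything, add docs, commit once."""
    yield "<delete><query>*:*</query></delete>"  # the "clean", not yet visible
    for doc in docs:
        yield add_xml(doc)                       # still uncommitted
    yield "<commit/>"                            # slaves can see the result only after this


if __name__ == "__main__":
    for message in rebuild_messages([{"id": 1, "name": "example"}]):
        print(message)
```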
should slave replication be turned off / on during master clean and re-index?
hello all,

i am just getting replication going on our master and two (2) slaves.

from time to time, i may need to do a complete re-index and clean on the master.

should replication on the slave - remain On or Off during a full clean and re-index on the Master?

thank you,