isolating solrcloud instance from peer updates
Hello all,

Is there a way to isolate an active SolrCloud instance from all incoming replication update requests from peer nodes?

--
Regards,
-Aditya Sakhuja
Re: ReplicationFactor for solrcloud
Thanks Shalin. We used maxShardsPerNode=3 as you suggested here.

On Thu, Sep 12, 2013 at 4:09 AM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> You must specify maxShardsPerNode=3 for this to happen. By default
> maxShardsPerNode is 1, so only one shard is created per node.
>
> On Thu, Sep 12, 2013 at 3:19 AM, Aditya Sakhuja wrote:
> > Hi -
> >
> > I am trying to set up 3 shards and 3 replicas for my SolrCloud
> > deployment with 3 servers, specifying replicationFactor=3 and
> > numShards=3 when starting the first node. I see each of the servers
> > allocated to one shard each; however, I do not see 3 replicas
> > allocated on each node.
> >
> > I specifically need to have 3 replicas across 3 servers with 3 shards.
> > Is there any reason not to have this configuration?
> >
> > --
> > Regards,
> > -Aditya Sakhuja
>
> --
> Regards,
> Shalin Shekhar Mangar.

--
Regards,
-Aditya Sakhuja
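[Editor's note: the layout discussed above is requested through the Collections API CREATE action. A minimal sketch of building such a request, where the host, port, and collection name are illustrative placeholders, not values from the thread:]

```python
# Build a Collections API CREATE request asking for 3 shards x 3 replicas
# across 3 nodes. maxShardsPerNode=3 allows each node to host one replica
# of every shard; with the default of 1 each node gets only one core.
# The host "host1:8893" and collection name "mycollection" are placeholders.
from urllib.parse import urlencode

params = {
    "action": "CREATE",
    "name": "mycollection",   # placeholder collection name
    "numShards": 3,
    "replicationFactor": 3,
    "maxShardsPerNode": 3,
}
url = "http://host1:8893/solr/admin/collections?" + urlencode(params)
print(url)
```

This only constructs the URL; issuing it against a running cluster (e.g. with curl) is what actually creates the collection.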
Re: solrcloud shards backup/restoration
How does one recover from an index corruption? That's what I am trying to eventually tackle here.

Thanks,
Aditya

On Thursday, September 19, 2013, Aditya Sakhuja wrote:
> Hi,
>
> Sorry for the late follow-up on this. Let me put in more details here.
>
> *The problem:*
>
> Cannot successfully restore the index backed up with
> '/replication?command=backup'. The backup was generated as
> *snapshot.yyyymmdd*.
>
> *My setup and steps:*
>
> 6 SolrCloud instances
> 7 ZooKeeper instances
>
> Steps:
>
> 1. Take a snapshot using *http://host1:8893/solr/replication?command=backup*,
> on one host only. Move *snapshot.yyyymmdd* to some reliable storage.
>
> 2. Stop all 6 Solr instances and all 7 ZK instances.
>
> 3. Delete ../collectionname/data/* on all SolrCloud nodes, i.e. delete
> the index data completely.
>
> 4. Delete zookeeper/data/version*/* on all ZooKeeper nodes.
>
> 5. Copy the index back from the backup to one of the nodes:
> cp snapshot.yyyymmdd/* ../collectionname/data/index/
>
> 6. Restart all ZK instances. Restart all SolrCloud instances.
>
> *Outcome:*
>
> All Solr instances are up. However, *num of docs = 0* for all nodes.
> Looking at the node where the index was restored, there is a new
> index.yyyymmddhhmmss directory being created, with index.properties
> pointing to it. That explains why no documents are reported.
>
> How do I have SolrCloud pick up data from the index directory on a
> restart?
>
> Thanks in advance,
> Aditya
>
> On Fri, Sep 6, 2013 at 3:41 PM, Aditya Sakhuja wrote:
>
> Thanks Shalin and Mark for your responses. I am on the same page about
> the conventions for taking the backup. However, I am less sure about the
> restoration of the index. Let's say we have 3 shards across 3 SolrCloud
> servers.
>
> 1. I am assuming we should take a backup from each of the shard leaders
> to get a complete collection. Do you think that will get the complete
> index (not worrying about what is not hard committed at the time of
> backup)?
>
> 2. How do we go about restoring the index in a fresh SolrCloud cluster?
> From the structure of the snapshot I took, I did not see any
> replication.properties or index.properties, which I normally see on
> healthy SolrCloud cluster nodes.
> If I have the snapshot named snapshot.20130905, does snapshot.20130905/*
> go into data/index?
>
> Thanks,
> Aditya
>
> On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller wrote:
>
> Phone typing. The end should not say "don't hard commit" - it should say
> "do a hard commit and take a snapshot".
>
> Mark
>
> Sent from my iPhone
>
> On Sep 6, 2013, at 7:26 AM, Mark Miller wrote:
>
> > I don't know that it's too bad though - it's always been the case that
> > if you do a backup while indexing, it's just going to get up to the
> > last hard commit. With SolrCloud that will still be the case. So just
> > make sure you do a hard commit right before taking the backup - yes,
> > it might miss a few docs in the transaction log, but if you are taking
> > a backup while indexing, you don't have great precision in any case -
> > you will roughly get a snapshot for around that time. Even without
> > SolrCloud, if you are worried about precision and getting every update
> > into that backup, you want to stop indexing and commit first. But if
> > you just want a rough snapshot for around that time, in both cases you
> > can still just don't hard commit and take a snapshot.
> >
> > Mark
> >
> > Sent from my iPhone
> >
> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar
> > <shalinman...@gmail.com> wrote:
> >
> >> The replication handler's backup command was built for pre-SolrCloud.
> >> It takes a snapshot of the index, but it is unaware of the transaction
> >> log, which is a key component in SolrCloud. Hence, unless you stop
> >> updates, commit your changes, and then take a backup, you will likely
> >> miss some updates.
> >>
> >> That being said, I'm curious to see how peer sync behaves when you try
> >> to restore from a snapshot. When you say that you haven't been
> >> successful in restoring, what exactly is the behaviour you observed?
> >>
> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja
> >> <aditya.sakh...@gmail.com> wrote:
> >>> Hello,
> >>>
> >>> I was looking for a good backup / recovery solution for the solrcloud
> >>> indexes. I am more looking for restoring the indexes from the index
> >>> snapshot, which can be taken using the replicationHandler's backup
> >>> command.
> >>>
> >>> I am looking for something that works with solrcloud 4.3 eventually,
> >>> but still relevant if you tested with a previous version.
> >>>
> >>> I haven't been successful in having the restored index replicate
> >>> across the new replicas, after I restart all the nodes, with one node
> >>> having the restored index.
> >>>
> >>> Is restoring the indexes on all the nodes the best way to do it?
> >>> --
> >>> Regards,
> >>> -Aditya Sakhuja
> >>
> >> --
> >> Regards,
> >> Shalin Shekhar Mangar.
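[Editor's note: the numbered steps in the post above can be summarized as follows. This is a sketch of the sequence being attempted, assuming the host, collection name, and paths mentioned in the thread (all placeholders), not a verified recovery recipe:]

```python
# Sketch of the backup/restore sequence from the steps above.
# "host1:8893", "collectionname", and all paths are placeholders
# copied from the thread.

# Step 1: backup is triggered on ONE host via the ReplicationHandler.
backup_url = "http://host1:8893/solr/replication?command=backup"

# Steps 3-5 as shell commands, to run only after stopping all Solr
# and ZooKeeper instances (step 2); restart everything afterwards (step 6).
restore_steps = [
    "rm -rf ../collectionname/data/*",        # step 3: wipe index data on every Solr node
    "rm -rf zookeeper/data/version*/*",       # step 4: wipe ZK state on every ZK node
    "cp -r snapshot.yyyymmdd/* ../collectionname/data/index/",  # step 5: restore on one node
]
for cmd in restore_steps:
    print(cmd)
```

Note that, per the outcome reported above, this sequence alone left the restored node pointing at a fresh timestamped index directory rather than the restored data/index, which is the open question in the thread.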
Re: solrcloud shards backup/restoration
Hi,

Sorry for the late follow-up on this. Let me put in more details here.

*The problem:*

Cannot successfully restore the index backed up with
'/replication?command=backup'. The backup was generated as
*snapshot.yyyymmdd*.

*My setup and steps:*

6 SolrCloud instances
7 ZooKeeper instances

Steps:

1. Take a snapshot using *http://host1:8893/solr/replication?command=backup*,
on one host only. Move *snapshot.yyyymmdd* to some reliable storage.

2. Stop all 6 Solr instances and all 7 ZK instances.

3. Delete ../collectionname/data/* on all SolrCloud nodes, i.e. delete the
index data completely.

4. Delete zookeeper/data/version*/* on all ZooKeeper nodes.

5. Copy the index back from the backup to one of the nodes:
cp snapshot.yyyymmdd/* ../collectionname/data/index/

6. Restart all ZK instances. Restart all SolrCloud instances.

*Outcome:*

All Solr instances are up. However, *num of docs = 0* for all nodes.
Looking at the node where the index was restored, there is a new
index.yyyymmddhhmmss directory being created, with index.properties
pointing to it. That explains why no documents are reported.

How do I have SolrCloud pick up data from the index directory on a restart?

Thanks in advance,
Aditya

On Fri, Sep 6, 2013 at 3:41 PM, Aditya Sakhuja wrote:
> Thanks Shalin and Mark for your responses. I am on the same page about
> the conventions for taking the backup. However, I am less sure about the
> restoration of the index. Let's say we have 3 shards across 3 SolrCloud
> servers.
>
> 1. I am assuming we should take a backup from each of the shard leaders
> to get a complete collection. Do you think that will get the complete
> index (not worrying about what is not hard committed at the time of
> backup)?
>
> 2. How do we go about restoring the index in a fresh SolrCloud cluster?
> From the structure of the snapshot I took, I did not see any
> replication.properties or index.properties, which I normally see on
> healthy SolrCloud cluster nodes.
> If I have the snapshot named snapshot.20130905, does snapshot.20130905/*
> go into data/index?
>
> Thanks,
> Aditya
>
> On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller wrote:
>
>> Phone typing. The end should not say "don't hard commit" - it should
>> say "do a hard commit and take a snapshot".
>>
>> Mark
>>
>> Sent from my iPhone
>>
>> On Sep 6, 2013, at 7:26 AM, Mark Miller wrote:
>>
>> > I don't know that it's too bad though - it's always been the case
>> > that if you do a backup while indexing, it's just going to get up to
>> > the last hard commit. With SolrCloud that will still be the case. So
>> > just make sure you do a hard commit right before taking the backup -
>> > yes, it might miss a few docs in the transaction log, but if you are
>> > taking a backup while indexing, you don't have great precision in any
>> > case - you will roughly get a snapshot for around that time. Even
>> > without SolrCloud, if you are worried about precision and getting
>> > every update into that backup, you want to stop indexing and commit
>> > first. But if you just want a rough snapshot for around that time, in
>> > both cases you can still just don't hard commit and take a snapshot.
>> >
>> > Mark
>> >
>> > Sent from my iPhone
>> >
>> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar
>> > <shalinman...@gmail.com> wrote:
>> >
>> >> The replication handler's backup command was built for pre-SolrCloud.
>> >> It takes a snapshot of the index, but it is unaware of the
>> >> transaction log, which is a key component in SolrCloud. Hence, unless
>> >> you stop updates, commit your changes, and then take a backup, you
>> >> will likely miss some updates.
>> >>
>> >> That being said, I'm curious to see how peer sync behaves when you
>> >> try to restore from a snapshot. When you say that you haven't been
>> >> successful in restoring, what exactly is the behaviour you observed?
>> >>
>> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja
>> >> <aditya.sakh...@gmail.com> wrote:
>> >>> Hello,
>> >>>
>> >>> I was looking for a good backup / recovery solution for the
>> >>> solrcloud indexes. I am more looking for restoring the indexes from
>> >>> the index snapshot, which can be taken using the replicationHandler's
>> >>> backup command.
>> >>>
>> >>> I am looking for something that works with solrcloud 4.3 eventually,
>> >>> but still relevant if you tested with a previous version.
>> >>>
>> >>> I haven't been successful in having the restored index replicate
>> >>> across the new replicas, after I restart all the nodes, with one
>> >>> node having the restored index.
>> >>>
>> >>> Is restoring the indexes on all the nodes the best way to do it?
>> >>> --
>> >>> Regards,
>> >>> -Aditya Sakhuja
>> >>
>> >> --
>> >> Regards,
>> >> Shalin Shekhar Mangar.
>
> --
> Regards,
> -Aditya Sakhuja

--
Regards,
-Aditya Sakhuja
ReplicationFactor for solrcloud
Hi -

I am trying to set up 3 shards and 3 replicas for my SolrCloud deployment with 3 servers, specifying replicationFactor=3 and numShards=3 when starting the first node. I see each of the servers allocated to one shard each; however, I do not see 3 replicas allocated on each node.

I specifically need to have 3 replicas across 3 servers with 3 shards. Is there any reason not to have this configuration?

--
Regards,
-Aditya Sakhuja
Re: solrcloud shards backup/restoration
Thanks Shalin and Mark for your responses. I am on the same page about the
conventions for taking the backup. However, I am less sure about the
restoration of the index. Let's say we have 3 shards across 3 SolrCloud
servers.

1. I am assuming we should take a backup from each of the shard leaders to
get a complete collection. Do you think that will get the complete index
(not worrying about what is not hard committed at the time of backup)?

2. How do we go about restoring the index in a fresh SolrCloud cluster?
From the structure of the snapshot I took, I did not see any
replication.properties or index.properties, which I normally see on healthy
SolrCloud cluster nodes.
If I have the snapshot named snapshot.20130905, does snapshot.20130905/* go
into data/index?

Thanks,
Aditya

On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller wrote:
> Phone typing. The end should not say "don't hard commit" - it should say
> "do a hard commit and take a snapshot".
>
> Mark
>
> Sent from my iPhone
>
> On Sep 6, 2013, at 7:26 AM, Mark Miller wrote:
>
> > I don't know that it's too bad though - it's always been the case that
> > if you do a backup while indexing, it's just going to get up to the
> > last hard commit. With SolrCloud that will still be the case. So just
> > make sure you do a hard commit right before taking the backup - yes,
> > it might miss a few docs in the transaction log, but if you are taking
> > a backup while indexing, you don't have great precision in any case -
> > you will roughly get a snapshot for around that time. Even without
> > SolrCloud, if you are worried about precision and getting every update
> > into that backup, you want to stop indexing and commit first. But if
> > you just want a rough snapshot for around that time, in both cases you
> > can still just don't hard commit and take a snapshot.
> >
> > Mark
> >
> > Sent from my iPhone
> >
> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar
> > <shalinman...@gmail.com> wrote:
> >
> >> The replication handler's backup command was built for pre-SolrCloud.
> >> It takes a snapshot of the index, but it is unaware of the transaction
> >> log, which is a key component in SolrCloud. Hence, unless you stop
> >> updates, commit your changes, and then take a backup, you will likely
> >> miss some updates.
> >>
> >> That being said, I'm curious to see how peer sync behaves when you try
> >> to restore from a snapshot. When you say that you haven't been
> >> successful in restoring, what exactly is the behaviour you observed?
> >>
> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja
> >> <aditya.sakh...@gmail.com> wrote:
> >>> Hello,
> >>>
> >>> I was looking for a good backup / recovery solution for the solrcloud
> >>> indexes. I am more looking for restoring the indexes from the index
> >>> snapshot, which can be taken using the replicationHandler's backup
> >>> command.
> >>>
> >>> I am looking for something that works with solrcloud 4.3 eventually,
> >>> but still relevant if you tested with a previous version.
> >>>
> >>> I haven't been successful in having the restored index replicate
> >>> across the new replicas, after I restart all the nodes, with one node
> >>> having the restored index.
> >>>
> >>> Is restoring the indexes on all the nodes the best way to do it?
> >>> --
> >>> Regards,
> >>> -Aditya Sakhuja
> >>
> >> --
> >> Regards,
> >> Shalin Shekhar Mangar.

--
Regards,
-Aditya Sakhuja
data/index naming format
Hello,

I am running Solr 4.1 for now, and am confused about the structure and naming of the contents of the data dir. I do not see index.properties being generated on a fresh Solr node start either.

Can someone clarify when one should expect to see data/index vs. data/index.<timestamp>, and index.properties along with the second variant?

--
Regards,
-Aditya Sakhuja
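[Editor's note: for context, when Solr switches to a timestamped index directory (for example after a replication or a restore, as seen elsewhere in this thread), it records which directory is active in data/index.properties. A sketch of what such a file typically contains; the timestamp below is illustrative, not taken from the thread:]

```properties
# data/index.properties (illustrative contents)
# Names the index directory currently in use instead of the plain data/index.
index=index.20130905120000
```

On a fresh node that has only ever used data/index, this file is normally absent, which matches the observation above.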
solrcloud shards backup/restoration
Hello,

I was looking for a good backup / recovery solution for the SolrCloud indexes. I am mostly looking at restoring the indexes from the index snapshot, which can be taken using the replicationHandler's backup command.

I am looking for something that works with SolrCloud 4.3 eventually, but it is still relevant if you tested with a previous version.

I haven't been successful in having the restored index replicate across the new replicas after I restart all the nodes, with one node having the restored index.

Is restoring the indexes on all the nodes the best way to do it?

--
Regards,
-Aditya Sakhuja
Solr 4.1 default commit mode
Hi,

Can someone please confirm what the default "commit" type is for SolrCloud 4.1?

As per https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig, it looks like soft commit is disabled by default (which means every index update triggers an I/O). Apparently that page applies to the upcoming SolrCloud 4.5. I would appreciate it if someone could confirm this for Solr 4.1.

My second question is: is it OK to have different commit types on different nodes which are part of my SolrCloud deployment?

Regards,
Aditya

--
Regards,
-Aditya Sakhuja
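[Editor's note: both commit types are configured in the updateHandler section of solrconfig.xml. A hedged sketch of the relevant fragment; the interval values are illustrative, and whether autoSoftCommit behaves as documented should be verified against the 4.x version actually deployed:]

```xml
<!-- solrconfig.xml fragment; maxTime values are illustrative -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes segments to disk (the I/O cost mentioned above). -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: makes documents visible to searches without an fsync. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

If neither element is configured, commits happen only when the client issues them explicitly.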