isolating solrcloud instance from peer updates

2013-09-21 Thread Aditya Sakhuja
Hello all,

Is there a way to isolate an active solr-cloud instance from all incoming
replication update requests from peer nodes?
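
For context, the pre-Cloud master/slave ReplicationHandler does expose
switches along these lines, sketched below; whether they help against
SolrCloud's distributed updates (as opposed to legacy polling) is exactly
the open question here:

    # On a master: stop serving replication to slaves.
    curl 'http://host1:8983/solr/replication?command=disablereplication'

    # On a slave: stop polling the master for updates.
    curl 'http://host1:8983/solr/replication?command=disablepoll'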

-- 
Regards,
-Aditya Sakhuja


Re: ReplicationFactor for solrcloud

2013-09-21 Thread Aditya Sakhuja
Thanks Shalin. We used maxShardsPerNode=3, as you suggested here.
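
For anyone landing here later, a minimal sketch of creating such a
collection through the Collections API (host, port, and collection name
are placeholders, not from this thread):

    # Ask for 3 shards x 3 replicas on a 3-node cluster; each node must
    # then host 3 shard-replicas, hence maxShardsPerNode=3.
    curl 'http://host1:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=3&maxShardsPerNode=3'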


On Thu, Sep 12, 2013 at 4:09 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> You must specify maxShardsPerNode=3 for this to happen. maxShardsPerNode
> defaults to 1, so only one shard is created per node.
>
> On Thu, Sep 12, 2013 at 3:19 AM, Aditya Sakhuja
>  wrote:
> > Hi -
> >
> > I am trying to set up 3 shards and 3 replicas for my solrcloud
> > deployment with 3 servers, specifying replicationFactor=3 and
> > numShards=3 when starting the first node. I see each of the servers
> > allocated to one shard each. However, I do not see 3 replicas allocated
> > on each node.
> >
> > I specifically need to have 3 replicas across 3 servers with 3 shards. Is
> > there any reason not to have this configuration?
> >
> > --
> > Regards,
> > -Aditya Sakhuja
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
Regards,
-Aditya Sakhuja


Re: solrcloud shards backup/restoration

2013-09-19 Thread Aditya Sakhuja
How does one recover from index corruption? That's what I am ultimately
trying to tackle here.
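
One low-level option worth knowing about, sketched below, is Lucene's
CheckIndex tool; the jar path here is a placeholder, and -fix permanently
drops any unreadable segments, so treat it as a last resort:

    # Inspect the index for corruption (read-only by default).
    java -cp lucene-core-4.1.0.jar org.apache.lucene.index.CheckIndex \
        /path/to/collectionname/data/index

    # With -fix, CheckIndex rewrites the index minus the broken segments;
    # back up the index directory first.
    java -cp lucene-core-4.1.0.jar org.apache.lucene.index.CheckIndex \
        /path/to/collectionname/data/index -fix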

Thanks
Aditya

On Thursday, September 19, 2013, Aditya Sakhuja wrote:

> Hi,
>
> Sorry for the late followup on this. Let me put in more details here.
>
> *The problem:*
>
> Cannot successfully restore the index backed up with
> '/replication?command=backup'. The backup was generated as
> *snapshot.yyyymmdd*.
>
> *My setup and steps:*
> 6 solrcloud instances
> 7 zookeepers instances
>
> Steps:
>
> 1.> Take a snapshot using http://host1:8893/solr/replication?command=backup
> on one host only. Move *snapshot.yyyymmdd* to some reliable storage.
>
> 2.> Stop all 6 solr instances, all 7 zk instances.
>
> 3.> Delete ../collectionname/data/* on all solrcloud nodes, i.e., delete
> the index data completely.
>
> 4.> Delete zookeeper/data/version*/* on all zookeeper nodes.
>
> 5.> Copy the index back from the backup to one of the nodes:
>     cp snapshot.yyyymmdd/* ../collectionname/data/index/
>
> 6.> Restart all zk instances. Restart all solrcloud instances.
>
>
> *Outcome:*
> All solr instances are up. However, *num of docs = 0* for all nodes.
> Looking at the node where the index was restored, there is a new
> index.yymmddhhmmss directory created, with index.properties pointing to
> it. That explains why no documents are reported.
>
>
> How do I have solrcloud pick up data from the index directory on a
> restart?
>
> Thanks in advance,
> Aditya
>
>
>
> On Fri, Sep 6, 2013 at 3:41 PM, Aditya Sakhuja 
> wrote:
>
> Thanks Shalin and Mark for your responses. I am on the same page about the
> conventions for taking the backup. However, I am less sure about the
> restoration of the index. Let's say we have 3 shards across 3 solrcloud
> servers.
>
> 1.> I am assuming we should take a backup from each of the shard leaders
> to get a complete collection. Do you think that will get the complete index
> (not worrying about what is not hard-committed at the time of backup)?
>
> 2.> How do we go about restoring the index in a fresh solrcloud cluster?
> From the structure of the snapshot I took, I did not see any
> replication.properties or index.properties, which I normally see on
> healthy solrcloud cluster nodes.
> If I have a snapshot named snapshot.20130905, does snapshot.20130905/*
> go into data/index?
>
> Thanks
> Aditya
>
>
>
> On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller  wrote:
>
> Phone typing. The end should not say "don't hard commit" - it should say
> "do a hard commit and take a snapshot".
>
> Mark
>
> Sent from my iPhone
>
> On Sep 6, 2013, at 7:26 AM, Mark Miller  wrote:
>
> > I don't know that it's too bad though - it's always been the case that if
> you do a backup while indexing, it's just going to get up to the last hard
> commit. With SolrCloud that will still be the case. So just make sure you
> do a hard commit right before taking the backup - yes, it might miss a few
> docs in the tran log, but if you are taking a back up while indexing, you
> don't have great precision in any case - you will roughly get a snapshot
> for around that time - even without SolrCloud, if you are worried about
> precision and getting every update into that backup, you want to stop
> indexing and commit first. But if you just want a rough snapshot for around
> that time, in both cases you can still just don't hard commit and take a
> snapshot.
> >
> > Mark
> >
> > Sent from my iPhone
> >
> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
> >
> >> The replication handler's backup command was built for pre-SolrCloud.
> >> It takes a snapshot of the index but it is unaware of the transaction
> >> log which is a key component in SolrCloud. Hence unless you stop
> >> updates, commit your changes and then take a backup, you will likely
> >> miss some updates.
> >>
> >> That being said, I'm curious to see how peer sync behaves when you try
> >> to restore from a snapshot. When you say that you haven't been
> >> successful in restoring, what exactly is the behaviour you observed?
> >>
> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja <
> aditya.sakh...@gmail.com> wrote:
> >>> Hello,
> >>>
> >>> I was looking for a good backup / recovery solution for the solrcloud
> >>> indexes. I am more looking for restoring the indexes from the index
> >>> snapshot, which can be taken using the replicationHandler's backup
> command.

Re: solrcloud shards backup/restoration

2013-09-19 Thread Aditya Sakhuja
Hi,

Sorry for the late followup on this. Let me put in more details here.

*The problem:*

Cannot successfully restore the index backed up with
'/replication?command=backup'. The backup was generated as
*snapshot.yyyymmdd*.

*My setup and steps:*
6 solrcloud instances
7 zookeeper instances

Steps (see the script sketch after the list):

1.> Take a snapshot using http://host1:8893/solr/replication?command=backup,
on one host only. Move *snapshot.yyyymmdd* to some reliable storage.

2.> Stop all 6 solr instances, all 7 zk instances.

3.> Delete ../collectionname/data/* on all solrcloud nodes, i.e., delete
the index data completely.

4.> Delete zookeeper/data/version*/* on all zookeeper nodes.

5.> Copy the index back from the backup to one of the nodes:
    cp snapshot.yyyymmdd/* ../collectionname/data/index/

6.> Restart all zk instances. Restart all solrcloud instances.
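
Consolidated as a rough script (host names, paths, and the snapshot
suffix are placeholders from the setup above):

    # 1. Snapshot on one host, then stash the result somewhere safe.
    curl 'http://host1:8893/solr/replication?command=backup'
    cp -r ../collectionname/data/snapshot.yyyymmdd /reliable/storage/

    # 2-4. Stop all 6 solr and all 7 zk instances, then on every node:
    rm -rf ../collectionname/data/*
    rm -rf zookeeper/data/version*/*

    # 5. Restore the snapshot into the index dir on one node only.
    mkdir -p ../collectionname/data/index
    cp /reliable/storage/snapshot.yyyymmdd/* ../collectionname/data/index/

    # 6. Restart all zk instances, then all solrcloud instances.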


*Outcome:*
All solr instances are up. However, *num of docs = 0* for all nodes.
Looking at the node where the index was restored, there is a new
index.yymmddhhmmss directory created, with index.properties pointing to it.
That explains why no documents are reported.
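
One possible workaround, a sketch only and not a verified recovery
procedure, assuming index.properties simply names the directory holding
the active index:

    # On the restored node, check which directory Solr actually chose.
    cat ../collectionname/data/index.properties
    # index=index.yymmddhhmmss

    # Move the restored files into that directory so the pointer and the
    # data agree, then restart the node.
    cp ../collectionname/data/index/* ../collectionname/data/index.yymmddhhmmss/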


How do I have solrcloud pick up data from the index directory on a restart?

Thanks in advance,
Aditya



On Fri, Sep 6, 2013 at 3:41 PM, Aditya Sakhuja wrote:

> Thanks Shalin and Mark for your responses. I am on the same page about the
> conventions for taking the backup. However, I am less sure about the
> restoration of the index. Let's say we have 3 shards across 3 solrcloud
> servers.
>
> 1.> I am assuming we should take a backup from each of the shard leaders
> to get a complete collection. Do you think that will get the complete index
> (not worrying about what is not hard-committed at the time of backup)?
>
> 2.> How do we go about restoring the index in a fresh solrcloud cluster?
> From the structure of the snapshot I took, I did not see any
> replication.properties or index.properties, which I normally see on
> healthy solrcloud cluster nodes.
> If I have a snapshot named snapshot.20130905, does snapshot.20130905/*
> go into data/index?
>
> Thanks
> Aditya
>
>
>
> On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller  wrote:
>
>> Phone typing. The end should not say "don't hard commit" - it should say
>> "do a hard commit and take a snapshot".
>>
>> Mark
>>
>> Sent from my iPhone
>>
>> On Sep 6, 2013, at 7:26 AM, Mark Miller  wrote:
>>
>> > I don't know that it's too bad though - it's always been the case that
>> if you do a backup while indexing, it's just going to get up to the last
>> hard commit. With SolrCloud that will still be the case. So just make sure
>> you do a hard commit right before taking the backup - yes, it might miss a
>> few docs in the tran log, but if you are taking a back up while indexing,
>> you don't have great precision in any case - you will roughly get a
>> snapshot for around that time - even without SolrCloud, if you are worried
>> about precision and getting every update into that backup, you want to stop
>> indexing and commit first. But if you just want a rough snapshot for around
>> that time, in both cases you can still just don't hard commit and take a
>> snapshot.
>> >
>> > Mark
>> >
>> > Sent from my iPhone
>> >
>> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar <
>> shalinman...@gmail.com> wrote:
>> >
>> >> The replication handler's backup command was built for pre-SolrCloud.
>> >> It takes a snapshot of the index but it is unaware of the transaction
>> >> log which is a key component in SolrCloud. Hence unless you stop
>> >> updates, commit your changes and then take a backup, you will likely
>> >> miss some updates.
>> >>
>> >> That being said, I'm curious to see how peer sync behaves when you try
>> >> to restore from a snapshot. When you say that you haven't been
>> >> successful in restoring, what exactly is the behaviour you observed?
>> >>
>> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja <
>> aditya.sakh...@gmail.com> wrote:
>> >>> Hello,
>> >>>
>> >>> I was looking for a good backup / recovery solution for the solrcloud
>> >>> indexes. I am more looking for restoring the indexes from the index
>> >>> snapshot, which can be taken using the replicationHandler's backup
>> command.
>> >>>
>> >>> I am looking for something that works with solrcloud 4.3 eventually,
>> but
>> >>> still relevant if you tested with a previous version.
>> >>>
>> >>> I haven't been successful in having the restored index replicate across
>> the
>> >>> new replicas, after I restart all the nodes, with one node having the
>> >>> restored index.
>> >>>
>> >>> Is restoring the indexes on all the nodes the best way to do it?
>> >>> --
>> >>> Regards,
>> >>> -Aditya Sakhuja
>> >>
>> >>
>> >>
>> >> --
>> >> Regards,
>> >> Shalin Shekhar Mangar.
>>
>
>
>
> --
> Regards,
> -Aditya Sakhuja
>



-- 
Regards,
-Aditya Sakhuja


ReplicationFactor for solrcloud

2013-09-11 Thread Aditya Sakhuja
Hi -

I am trying to set up 3 shards and 3 replicas for my solrcloud deployment
with 3 servers, specifying replicationFactor=3 and numShards=3 when
starting the first node. I see each of the servers allocated to one shard
each. However, I do not see 3 replicas allocated on each node.

I specifically need to have 3 replicas across 3 servers with 3 shards. Is
there any reason not to have this configuration?

-- 
Regards,
-Aditya Sakhuja


Re: solrcloud shards backup/restoration

2013-09-06 Thread Aditya Sakhuja
Thanks Shalin and Mark for your responses. I am on the same page about the
conventions for taking the backup. However, I am less sure about the
restoration of the index. Let's say we have 3 shards across 3 solrcloud
servers.

1.> I am assuming we should take a backup from each of the shard leaders to
get a complete collection. Do you think that will get the complete index
(not worrying about what is not hard-committed at the time of backup)?

2.> How do we go about restoring the index in a fresh solrcloud cluster?
From the structure of the snapshot I took, I did not see any
replication.properties or index.properties, which I normally see on
healthy solrcloud cluster nodes.
If I have a snapshot named snapshot.20130905, does snapshot.20130905/*
go into data/index?

Thanks
Aditya



On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller  wrote:

> Phone typing. The end should not say "don't hard commit" - it should say
> "do a hard commit and take a snapshot".
>
> Mark
>
> Sent from my iPhone
>
> On Sep 6, 2013, at 7:26 AM, Mark Miller  wrote:
>
> > I don't know that it's too bad though - it's always been the case that if
> you do a backup while indexing, it's just going to get up to the last hard
> commit. With SolrCloud that will still be the case. So just make sure you
> do a hard commit right before taking the backup - yes, it might miss a few
> docs in the tran log, but if you are taking a back up while indexing, you
> don't have great precision in any case - you will roughly get a snapshot
> for around that time - even without SolrCloud, if you are worried about
> precision and getting every update into that backup, you want to stop
> indexing and commit first. But if you just want a rough snapshot for around
> that time, in both cases you can still just don't hard commit and take a
> snapshot.
> >
> > Mark
> >
> > Sent from my iPhone
> >
> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
> >
> >> The replication handler's backup command was built for pre-SolrCloud.
> >> It takes a snapshot of the index but it is unaware of the transaction
> >> log which is a key component in SolrCloud. Hence unless you stop
> >> updates, commit your changes and then take a backup, you will likely
> >> miss some updates.
> >>
> >> That being said, I'm curious to see how peer sync behaves when you try
> >> to restore from a snapshot. When you say that you haven't been
> >> successful in restoring, what exactly is the behaviour you observed?
> >>
> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja <
> aditya.sakh...@gmail.com> wrote:
> >>> Hello,
> >>>
> >>> I was looking for a good backup / recovery solution for the solrcloud
> >>> indexes. I am more looking for restoring the indexes from the index
> >>> snapshot, which can be taken using the replicationHandler's backup
> command.
> >>>
> >>> I am looking for something that works with solrcloud 4.3 eventually,
> but
> >>> still relevant if you tested with a previous version.
> >>>
> >>> I haven't been successful in having the restored index replicate across
> the
> >>> new replicas, after I restart all the nodes, with one node having the
> >>> restored index.
> >>>
> >>> Is restoring the indexes on all the nodes the best way to do it?
> >>> --
> >>> Regards,
> >>> -Aditya Sakhuja
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Shalin Shekhar Mangar.
>
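
Mark's advice above, as a sketch against the standard update and
replication handlers (host, port, and core name are placeholders):

    # Hard commit first, so the snapshot includes everything indexed so far...
    curl 'http://host1:8983/solr/update?commit=true'

    # ...then trigger the snapshot on the same node.
    curl 'http://host1:8983/solr/replication?command=backup'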



-- 
Regards,
-Aditya Sakhuja


data/index naming format

2013-09-05 Thread Aditya Sakhuja
Hello,

I am running solr 4.1 for now, and am confused about the structure and
naming of the contents of the data dir. I do not see the index.properties
being generated on a fresh solr node start either.

Can someone clarify when one should expect to see

data/index vs. data/index.yymmddhhmmss, and the index.properties that
accompanies the second form?
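
For comparison, the two layouts look roughly like this (a sketch; the
timestamped form appears after a full replication or recovery swaps in a
replacement index):

    # Fresh node: the default directory, no index.properties.
    data/index/

    # After a full index fetch: a timestamped directory, with
    # index.properties recording which one is live.
    data/index.yymmddhhmmss/
    data/index.properties        # index=index.yymmddhhmmss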

-- 
Regards,
-Aditya Sakhuja


solrcloud shards backup/restoration

2013-09-05 Thread Aditya Sakhuja
Hello,

I was looking for a good backup / recovery solution for the solrcloud
indexes. I am more looking for restoring the indexes from the index
snapshot, which can be taken using the replicationHandler's backup command.

I am looking for something that works with solrcloud 4.3 eventually, but
still relevant if you tested with a previous version.

I haven't been successful in having the restored index replicate across the
new replicas, after I restart all the nodes, with one node having the
restored index.

Is restoring the indexes on all the nodes the best way to do it?
-- 
Regards,
-Aditya Sakhuja


Solr 4.1 default commit mode

2013-08-01 Thread Aditya Sakhuja
Hi,

Can someone please confirm what the default "commit" type is for solrcloud
4.1? As per
https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig,
it looks like softCommit is false (which means every index update triggers
I/O). Apparently that documentation applies to the upcoming solrcloud 4.5.

I would appreciate it if someone can confirm this for solr 4.1.

My second question is: Is it OK to have different commit types on
different nodes that are part of my solrCloud deployment?
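
For anyone comparing the two commit types by hand, a sketch using the
update handler's request parameters (host and core are placeholders; the
autoCommit/autoSoftCommit defaults themselves live in solrconfig.xml's
updateHandler section):

    # Hard commit: flushes segments to disk - the I/O-heavy variant.
    curl 'http://host1:8983/solr/update?commit=true'

    # Soft commit: reopens the searcher for visibility without the full flush.
    curl 'http://host1:8983/solr/update?softCommit=true'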

Regards,
Aditya


-- 
Regards,
-Aditya Sakhuja