Re: Cannot add replica during backup

2020-08-11 Thread Ashwin Ramesh
Hey Matthew,

Unfortunately, our shard leaders are spread across multiple nodes, so a single
EBS volume wouldn't work. Did you manage to get around this issue yourself?

Regards,

Ash

On Tue, Aug 11, 2020 at 9:00 PM matthew sporleder wrote:

> I can already tell you it is EFS that is slow. I had to switch to an EBS
> disk for backups on a different project because EFS couldn't keep up.
>
> > On Aug 10, 2020, at 9:43 PM, Ashwin Ramesh wrote:
> >
> > Hey Aroop, the general process for our backup is:
> > - Connect all machines to an EFS drive (AWS's NFS service)
> > - Call the collections API to backup into EFS
> > - ZIP the directory once the backup is completed
> > - Copy the ZIP into an s3 bucket
> >
> > I'll probably have to see which part of the process is the slowest.
> >
> > On another note, can you simply remove the task from the ZK path to
> > continue the execution of tasks?
> >
> > Regards,
> >
> > Ash
> >
> >> On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly wrote:
> >>
> >> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
> >> using the collection backup api.
> >> How are you taking the backup?
> >>
> >> Do you actually see any backup progress, or are you just seeing the task
> >> linger in the overseer queue?
> >> I have seen restore tasks hang in the queue forever despite the process
> >> completing in Solr 7.7, so I wouldn’t be surprised if this happens with
> >> backup as well. I have also observed that unless that task is removed
> >> from the overseer-collection-queue, the next ones do not proceed.
> >>
> >> Also, adding replicas during a backup seems like overkill. Why don’t you
> >> just have the appropriate replication factor in the first place and have
> >> autoAddReplicas=true for indemnity?
> >>
> >>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh wrote:
> >>>
> >>> Hi everybody,
> >>>
> >>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
> >>> running, we cannot add any replicas to the collection. By the looks of it,
> >>> the job to add the replica is put into the Overseer queue, but it is not
> >>> being processed. Is this expected? And are there any workarounds?
> >>>
> >>> Our backups take about 12 hours. Maybe we should try to optimize that too.
> >>>
> >>> Regards,
> >>>
> >>> Ash
> >>>
>

-- 
**
** Empowering the world to design
Share accurate 
information on COVID-19 and spread messages of support to your community.

Here are some resources 

 
that can help.

Re: Cannot add replica during backup

2020-08-11 Thread matthew sporleder
I can already tell you it is EFS that is slow. I had to switch to an EBS disk
for backups on a different project because EFS couldn't keep up.
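
A quick way to check whether the backup target itself is the bottleneck is to time a sequential write on each mount. This is a minimal sketch using only the Python standard library. The mount points mentioned in the usage note are hypothetical, and a single-stream write is only a rough proxy for what Solr does during a backup:

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory: str, size_mb: int = 64, chunk_mb: int = 4) -> float:
    """Write roughly size_mb of random data into `directory`; return observed MB/s."""
    if size_mb < chunk_mb:
        raise ValueError("size_mb must be at least chunk_mb")
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data to the mount
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the probe file
    return (chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    # The temp directory stands in for a real mount point here.
    rate = write_throughput_mb_s(tempfile.gettempdir(), size_mb=16)
    print(f"{rate:.1f} MB/s")
```

Running it once against the EFS mount and once against an EBS-backed path (for example `/mnt/efs` and `/mnt/ebs`, both placeholder names) gives a first-order comparison.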

> On Aug 10, 2020, at 9:43 PM, Ashwin Ramesh  wrote:
> 
> Hey Aroop, the general process for our backup is:
> - Connect all machines to an EFS drive (AWS's NFS service)
> - Call the collections API to backup into EFS
> - ZIP the directory once the backup is completed
> - Copy the ZIP into an s3 bucket
> 
> I'll probably have to see which part of the process is the slowest.
> 
> On another note, can you simply remove the task from the ZK path to
> continue the execution of tasks?
> 
> Regards,
> 
> Ash
> 
>> On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly wrote:
>>
>> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
>> using the collection backup api.
>> How are you taking the backup?
>>
>> Do you actually see any backup progress, or are you just seeing the task
>> linger in the overseer queue?
>> I have seen restore tasks hang in the queue forever despite the process
>> completing in Solr 7.7, so I wouldn’t be surprised if this happens with
>> backup as well. I have also observed that unless that task is removed
>> from the overseer-collection-queue, the next ones do not proceed.
>>
>> Also, adding replicas during a backup seems like overkill. Why don’t you
>> just have the appropriate replication factor in the first place and have
>> autoAddReplicas=true for indemnity?
>>
>>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh wrote:
>>>
>>> Hi everybody,
>>>
>>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
>>> running, we cannot add any replicas to the collection. By the looks of it,
>>> the job to add the replica is put into the Overseer queue, but it is not
>>> being processed. Is this expected? And are there any workarounds?
>>>
>>> Our backups take about 12 hours. Maybe we should try to optimize that too.
>>> 
>>> Regards,
>>> 
>>> Ash
>>> 


Re: Cannot add replica during backup

2020-08-11 Thread Aroop Ganguly
> We have 16 shards each approx 30GB - total is ~480GB. I'm also pretty sure
> it's a network issue. Very interesting that you can index 20x the data in
> 15 min!
Not index, but back up an index in 15 min.



>> It would also help to ensure your overseer is on a node with a role that
>> exempts it from any Solr index responsibilities.
> How would I ensure this? First I'm hearing about this!

Look up roles, snitches, and tags here:
https://lucene.apache.org/solr/guide/7_7/rule-based-replica-placement.html 
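
Separately from placement rules, the Collections API has an ADDROLE action that marks a node as a preferred overseer. Below is a minimal sketch that only builds the request URL. The base URL and node name are placeholders. ADDROLE by itself does not keep replicas off that node, which is where the rules on the page above would come in:

```python
from urllib.parse import urlencode


def add_overseer_role_url(base_url: str, node_name: str) -> str:
    """Build a Collections API URL that assigns the overseer role to a node."""
    params = {
        "action": "ADDROLE",
        "role": "overseer",   # the only role this action supports
        "node": node_name,    # node name as listed under live_nodes, e.g. host:port_solr
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and node name, for illustration only.
    print(add_overseer_role_url("http://localhost:8983/solr", "10.0.0.5:8983_solr"))
```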
  



> On Aug 10, 2020, at 6:54 PM, Ashwin Ramesh wrote:
> 
> Hi Aroop,
> 
> We have 16 shards each approx 30GB - total is ~480GB. I'm also pretty sure
> it's a network issue. Very interesting that you can index 20x the data in
> 15 min!
> 
>>> It would also help to ensure your overseer is on a node with a role that
>>> exempts it from any Solr index responsibilities.
> How would I ensure this? First I'm hearing about this!
> 
> Thanks for all the help!!
> 
> On Tue, Aug 11, 2020 at 11:48 AM Aroop Ganguly wrote:
> 
>> Hi Ashwin
>> 
>> Thanks for sharing this detail.
>> Do you mind sharing how big each of these indices is?
>> I am almost sure this is related to network capacity and constraints in
>> your AWS setup.
>>
>> Yes, if you can confirm that the backup is complete, or you just want the
>> system to move on and discard the backup process, removing the backup
>> flag from ZooKeeper will help Solr move on to the next task in the
>> queue.
>> 
>> It would also help to ensure your overseer is on a node with a role that
>> exempts it from any Solr index responsibilities.
>> 
>> 
>>> On Aug 10, 2020, at 6:43 PM, Ashwin Ramesh wrote:
>>> 
>>> Hey Aroop, the general process for our backup is:
>>> - Connect all machines to an EFS drive (AWS's NFS service)
>>> - Call the collections API to backup into EFS
>>> - ZIP the directory once the backup is completed
>>> - Copy the ZIP into an s3 bucket
>>> 
>>> I'll probably have to see which part of the process is the slowest.
>>> 
>>> On another note, can you simply remove the task from the ZK path to
>>> continue the execution of tasks?
>>> 
>>> Regards,
>>> 
>>> Ash
>>> 
>>> On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly wrote:
>>>
>>>> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
>>>> using the collection backup api.
>>>> How are you taking the backup?
>>>>
>>>> Do you actually see any backup progress, or are you just seeing the task
>>>> linger in the overseer queue?
>>>> I have seen restore tasks hang in the queue forever despite the process
>>>> completing in Solr 7.7, so I wouldn’t be surprised if this happens with
>>>> backup as well. I have also observed that unless that task is removed
>>>> from the overseer-collection-queue, the next ones do not proceed.
>>>>
>>>> Also, adding replicas during a backup seems like overkill. Why don’t you
>>>> just have the appropriate replication factor in the first place and have
>>>> autoAddReplicas=true for indemnity?
>>>>
>>>>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh wrote:
>>>>>
>>>>> Hi everybody,
>>>>>
>>>>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
>>>>> running, we cannot add any replicas to the collection. By the looks of it,
>>>>> the job to add the replica is put into the Overseer queue, but it is not
>>>>> being processed. Is this expected? And are there any workarounds?
>>>>>
>>>>> Our backups take about 12 hours. Maybe we should try to optimize that too.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Ash

Re: Cannot add replica during backup

2020-08-10 Thread Ashwin Ramesh
Hi Aroop,

We have 16 shards each approx 30GB - total is ~480GB. I'm also pretty sure
it's a network issue. Very interesting that you can index 20x the data in
15 min!
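
For scale, the two figures mentioned in this thread can be turned into rough aggregate throughput. This is a minimal sketch, assuming decimal units (1 TB = 1000 GB) and that both durations are end-to-end. The numbers are cluster-wide totals, not per-node rates:

```python
def throughput_mb_s(size_gb: float, duration_s: float) -> float:
    """Average throughput in MB/s for moving size_gb in duration_s seconds."""
    return size_gb * 1000 / duration_s


# 16 shards of roughly 30 GB each, backed up in about 12 hours.
ours = throughput_mb_s(16 * 30, 12 * 3600)

# 10 TB backed up in about 15 minutes.
theirs = throughput_mb_s(10_000, 15 * 60)

if __name__ == "__main__":
    print(f"480 GB in 12 h   : {ours:.1f} MB/s")
    print(f"10 TB in 15 min  : {theirs:.0f} MB/s")
    print(f"throughput ratio : {theirs / ours:.0f}x")
```

The data volume differs by about 20x, but the implied throughput differs by about three orders of magnitude, which points at the storage or network path rather than index size.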

>> It would also help to ensure your overseer is on a node with a role that
>> exempts it from any Solr index responsibilities.
How would I ensure this? First I'm hearing about this!

Thanks for all the help!!

On Tue, Aug 11, 2020 at 11:48 AM Aroop Ganguly wrote:

> Hi Ashwin
>
> Thanks for sharing this detail.
> Do you mind sharing how big each of these indices is?
> I am almost sure this is related to network capacity and constraints in
> your AWS setup.
>
> Yes, if you can confirm that the backup is complete, or you just want the
> system to move on and discard the backup process, removing the backup
> flag from ZooKeeper will help Solr move on to the next task in the
> queue.
>
> It would also help to ensure your overseer is on a node with a role that
> exempts it from any Solr index responsibilities.
>
>
> > On Aug 10, 2020, at 6:43 PM, Ashwin Ramesh wrote:
> >
> > Hey Aroop, the general process for our backup is:
> > - Connect all machines to an EFS drive (AWS's NFS service)
> > - Call the collections API to backup into EFS
> > - ZIP the directory once the backup is completed
> > - Copy the ZIP into an s3 bucket
> >
> > I'll probably have to see which part of the process is the slowest.
> >
> > On another note, can you simply remove the task from the ZK path to
> > continue the execution of tasks?
> >
> > Regards,
> >
> > Ash
> >
> > On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly wrote:
> >
> >> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
> >> using the collection backup api.
> >> How are you taking the backup?
> >>
> >> Do you actually see any backup progress, or are you just seeing the task
> >> linger in the overseer queue?
> >> I have seen restore tasks hang in the queue forever despite the process
> >> completing in Solr 7.7, so I wouldn’t be surprised if this happens with
> >> backup as well. I have also observed that unless that task is removed
> >> from the overseer-collection-queue, the next ones do not proceed.
> >>
> >> Also, adding replicas during a backup seems like overkill. Why don’t you
> >> just have the appropriate replication factor in the first place and have
> >> autoAddReplicas=true for indemnity?
> >>
> >>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh wrote:
> >>>
> >>> Hi everybody,
> >>>
> >>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
> >>> running, we cannot add any replicas to the collection. By the looks of it,
> >>> the job to add the replica is put into the Overseer queue, but it is not
> >>> being processed. Is this expected? And are there any workarounds?
> >>>
> >>> Our backups take about 12 hours. Maybe we should try to optimize that too.
> >>>
> >>> Regards,
> >>>
> >>> Ash
> >>>


Re: Cannot add replica during backup

2020-08-10 Thread Aroop Ganguly
Hi Ashwin

Thanks for sharing this detail.
Do you mind sharing how big each of these indices is?
I am almost sure this is related to network capacity and constraints in your
AWS setup.

Yes, if you can confirm that the backup is complete, or you just want the
system to move on and discard the backup process, removing the backup flag
from ZooKeeper will help Solr move on to the next task in the queue.

It would also help to ensure your overseer is on a node with a role that 
exempts it from any Solr index responsibilities. 


> On Aug 10, 2020, at 6:43 PM, Ashwin Ramesh  wrote:
> 
> Hey Aroop, the general process for our backup is:
> - Connect all machines to an EFS drive (AWS's NFS service)
> - Call the collections API to backup into EFS
> - ZIP the directory once the backup is completed
> - Copy the ZIP into an s3 bucket
> 
> I'll probably have to see which part of the process is the slowest.
> 
> On another note, can you simply remove the task from the ZK path to
> continue the execution of tasks?
> 
> Regards,
> 
> Ash
> 
> On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly wrote:
>
>> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
>> using the collection backup api.
>> How are you taking the backup?
>>
>> Do you actually see any backup progress, or are you just seeing the task
>> linger in the overseer queue?
>> I have seen restore tasks hang in the queue forever despite the process
>> completing in Solr 7.7, so I wouldn’t be surprised if this happens with
>> backup as well. I have also observed that unless that task is removed
>> from the overseer-collection-queue, the next ones do not proceed.
>>
>> Also, adding replicas during a backup seems like overkill. Why don’t you
>> just have the appropriate replication factor in the first place and have
>> autoAddReplicas=true for indemnity?
>>
>>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh wrote:
>>>
>>> Hi everybody,
>>>
>>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
>>> running, we cannot add any replicas to the collection. By the looks of it,
>>> the job to add the replica is put into the Overseer queue, but it is not
>>> being processed. Is this expected? And are there any workarounds?
>>>
>>> Our backups take about 12 hours. Maybe we should try to optimize that too.
>>> 
>>> Regards,
>>> 
>>> Ash
>>> 



Re: Cannot add replica during backup

2020-08-10 Thread Ashwin Ramesh
Hey Aroop, the general process for our backup is:
- Connect all machines to an EFS drive (AWS's NFS service)
- Call the collections API to backup into EFS
- ZIP the directory once the backup is completed
- Copy the ZIP into an s3 bucket

I'll probably have to see which part of the process is the slowest.
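
To find the slowest part, each step of the process above can be wrapped in a timer. This is a minimal sketch. The stage names mirror the list above, and the stage bodies are stand-ins that would be replaced by the real backup call, ZIP step, and S3 copy:

```python
import time
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[], None]]


def time_stages(stages: List[Stage]) -> Dict[str, float]:
    """Run each stage in order and record how many seconds it took."""
    timings: Dict[str, float] = {}
    for name, run in stages:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


def slowest(timings: Dict[str, float]) -> str:
    """Return the name of the stage with the largest elapsed time."""
    return max(timings, key=timings.get)


if __name__ == "__main__":
    # Placeholder stages: sleeps stand in for the real work.
    demo = [
        ("backup_to_efs", lambda: time.sleep(0.02)),
        ("zip_directory", lambda: time.sleep(0.05)),
        ("copy_to_s3", lambda: time.sleep(0.01)),
    ]
    results = time_stages(demo)
    for name, seconds in results.items():
        print(f"{name}: {seconds:.3f}s")
    print("slowest:", slowest(results))
```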

On another note, can you simply remove the task from the ZK path to
continue the execution of tasks?

Regards,

Ash

On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly wrote:

> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
> using the collection backup api.
> How are you taking the backup?
>
> Do you actually see any backup progress, or are you just seeing the task
> linger in the overseer queue?
> I have seen restore tasks hang in the queue forever despite the process
> completing in Solr 7.7, so I wouldn’t be surprised if this happens with
> backup as well. I have also observed that unless that task is removed
> from the overseer-collection-queue, the next ones do not proceed.
>
> Also, adding replicas during a backup seems like overkill. Why don’t you
> just have the appropriate replication factor in the first place and have
> autoAddReplicas=true for indemnity?
>
> > On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh wrote:
> >
> > Hi everybody,
> >
> > We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
> > running, we cannot add any replicas to the collection. By the looks of it,
> > the job to add the replica is put into the Overseer queue, but it is not
> > being processed. Is this expected? And are there any workarounds?
> >
> > Our backups take about 12 hours. Maybe we should try to optimize that too.
> >
> > Regards,
> >
> > Ash
> >


Re: Cannot add replica during backup

2020-08-10 Thread Aroop Ganguly
12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins using 
the collection backup api.
How are you taking the backup?
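
For comparison, a Collections API backup request looks roughly like the following. This is a minimal sketch that only builds the URL. The host, collection name, backup name, and location are placeholders. `location` has to be a path that every node can reach, such as a shared filesystem:

```python
from urllib.parse import urlencode


def backup_url(base_url: str, collection: str, backup_name: str,
               location: str, request_id: str) -> str:
    """Build a Collections API BACKUP URL that runs asynchronously."""
    params = {
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        "location": location,   # shared path visible to all nodes
        "async": request_id,    # return immediately; poll the status later
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # All values below are illustrative placeholders.
    print(backup_url("http://localhost:8983/solr", "mycollection",
                     "nightly", "/mnt/efs/backups", "backup-1"))
```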

Do you actually see any backup progress, or are you just seeing the task
linger in the overseer queue?
I have seen restore tasks hang in the queue forever despite the process
completing in Solr 7.7, so I wouldn’t be surprised if this happens with
backup as well. I have also observed that unless that task is removed
from the overseer-collection-queue, the next ones do not proceed.

Also, adding replicas during a backup seems like overkill. Why don’t you
just have the appropriate replication factor in the first place and have
autoAddReplicas=true for indemnity?
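
As an illustration of that suggestion, the replication factor and autoAddReplicas can both be set when the collection is created. This is a minimal sketch that only builds the CREATE URL. The collection name and counts are placeholders, and how autoAddReplicas behaves depends on the Solr version and cluster setup:

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, num_shards: int,
                          replication_factor: int) -> str:
    """Build a Collections API CREATE URL with autoAddReplicas enabled."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "autoAddReplicas": "true",
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values: 16 shards with 2 replicas each.
    print(create_collection_url("http://localhost:8983/solr", "mycollection", 16, 2))
```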

> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh  wrote:
> 
> Hi everybody,
> 
> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
> running, we cannot add any replicas to the collection. By the looks of it,
> the job to add the replica is put into the Overseer queue, but it is not
> being processed. Is this expected? And are there any workarounds?
> 
> Our backups take about 12 hours. Maybe we should try to optimize that too.
> 
> Regards,
> 
> Ash
> 



Cannot add replica during backup

2020-08-10 Thread Ashwin Ramesh
Hi everybody,

We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
running, we cannot add any replicas to the collection. By the looks of it,
the job to add the replica is put into the Overseer queue, but it is not
being processed. Is this expected? And are there any workarounds?

Our backups take about 12 hours. Maybe we should try to optimize that too.
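
For context, this is roughly how the replica request can be submitted asynchronously and then checked, which makes it visible whether the task is still waiting or actually running. This is a minimal sketch that builds the URLs and reads the state from a response. The host, collection, shard, and request id are placeholders, and the response shape (`status` containing `state`) is an assumption about the JSON returned by REQUESTSTATUS:

```python
from urllib.parse import urlencode


def add_replica_url(base_url: str, collection: str, shard: str, request_id: str) -> str:
    """Build an asynchronous Collections API ADDREPLICA URL."""
    params = {"action": "ADDREPLICA", "collection": collection,
              "shard": shard, "async": request_id}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def request_status_url(base_url: str, request_id: str) -> str:
    """Build the REQUESTSTATUS URL for a previously submitted async request."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def request_state(payload: dict) -> str:
    """Extract the task state from a parsed REQUESTSTATUS response (assumed shape)."""
    return payload.get("status", {}).get("state", "unknown")


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # placeholder host
    print(add_replica_url(base, "mycollection", "shard1", "add-1"))
    print(request_status_url(base, "add-1"))
    # A hand-written sample response, not captured from a real cluster.
    print(request_state({"status": {"state": "submitted"}}))
```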

Regards,

Ash
