[jira] [Commented] (SOLR-10285) Reduce state messages when there are leader only shards

2017-10-02 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189159#comment-16189159
 ] 

Cao Manh Dat commented on SOLR-10285:
-

Hi [~varunthacker], I don't see why we would have to wait for the leader 
message to be processed, since the change in this ticket skips the leader 
message. Even if we did send the leader message and wait for it to be 
processed, we could easily get a false positive when the replica is already 
the leader and the unset leader message is still in the queue. 

> Reduce state messages when there are leader only shards
> ---
>
> Key: SOLR-10285
> URL: https://issues.apache.org/jira/browse/SOLR-10285
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Assignee: Cao Manh Dat
> Attachments: SOLR-10285.patch, SOLR-10285.patch, SOLR-10285.patch
>
>
> For shards which have 1 replica ( leader ) we know it doesn't need to recover 
> from anyone. We should short-circuit the recovery process in this case. 
> The motivation is that we will generate fewer state events and be 
> able to mark these replicas as active again without them needing to go into 
> the 'recovering' state. 
> We already short-circuit when you set {{-Dsolrcloud.skip.autorecovery=true}} 
> but that sys prop was meant for tests only. Extending this so that the 
> code short-circuits when the core knows it's the only replica in the shard is 
> the motivation of the Jira.
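
As a rough illustration of the short-circuit described in the issue, here is a
minimal Java sketch. It is not Solr's actual code: the class name, the method
canSkipRecovery and the list-of-names representation of a shard are all
hypothetical. Only the system property name comes from the description.

```java
import java.util.List;

public class LeaderOnlyShardCheck {

    /**
     * Hypothetical helper, not a Solr API. Decides whether a core may skip
     * recovery: either the test-only system property is set, or the core is
     * the only replica in its shard and so has nobody to recover from.
     */
    public static boolean canSkipRecovery(List<String> replicasInShard,
                                          String thisCore,
                                          boolean skipAutoRecovery) {
        if (skipAutoRecovery) {
            return true; // existing short-circuit, meant for tests only
        }
        // Proposed short-circuit: this core is the sole replica of the shard.
        return replicasInShard.size() == 1
                && replicasInShard.get(0).equals(thisCore);
    }

    public static void main(String[] args) {
        // Boolean.getBoolean reads a JVM system property such as
        // -Dsolrcloud.skip.autorecovery=true
        boolean skipProp = Boolean.getBoolean("solrcloud.skip.autorecovery");
        System.out.println(canSkipRecovery(List.of("core1"), "core1", skipProp));
        System.out.println(canSkipRecovery(List.of("core1", "core2"), "core1", skipProp));
    }
}
```

In a real cluster the replica list would come from the cluster state, which
can be stale; the sketch ignores that.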



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10285) Reduce state messages when there are leader only shards

2017-10-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188454#comment-16188454
 ] 

Varun Thacker commented on SOLR-10285:
--

Hi Dat,

Do you think it would be a good idea to wait for the leader message to be 
processed before we return? 







[jira] [Commented] (SOLR-10285) Reduce state messages when there are leader only shards

2017-10-01 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187638#comment-16187638
 ] 

Cao Manh Dat commented on SOLR-10285:
-

Hi [~jhump], your patch looks good to me. About your TODO notes, I did some 
searching and found that:
- ElectionContext is the only place that uses OverseerAction.Leader ( once to 
unset the leader and once to set the leader ).
- STATE_PROP used in the second case is the replica's state, which is not even 
used in {{SliceMutator.setShardLeader}}.
So your concern about "mark the shard as inactive" does not apply, right?

The only problematic case that can occur during an upgrade is:
1. A replica ( repA ) is currently the leader.
2. The overseer is very busy.
3. repA does the unset leader operation ( which is delayed because the overseer 
is very busy ).
4. repA gets stopped in the middle of the election process ( so the set leader 
operation never gets executed ).
5. repA starts with the new code and sees that it is the leader ( the unset 
operation from step 3 has not been executed yet ), so it skips the set leader 
operation.

I think the above case is very, very rare, and even if it happens, sysadmins 
must first deal with the overseer being overwhelmed by the number of 
operations. 
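
The five steps above can be replayed with a toy model. This is only a sketch
under simplifying assumptions: the overseer queue is a plain in-memory queue,
the cluster state is a single leader field, and all names (StaleLeaderRace,
Op, run) are hypothetical rather than taken from Solr.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class StaleLeaderRace {

    // Hypothetical stand-ins for the two leader messages sent during election.
    enum Op { UNSET_LEADER, SET_LEADER }

    /**
     * Replays the scenario and returns the shard leader once the overseer
     * has drained its queue (null means the shard has no leader).
     *
     * @param skipWhenAlreadyLeader true models the new code, which skips the
     *        set leader message when the state already shows repA as leader
     */
    public static String run(boolean skipWhenAlreadyLeader) {
        String leaderInState = "repA";              // step 1: repA is the leader
        Queue<Op> overseerQueue = new ArrayDeque<>();

        // Steps 2-3: the unset is enqueued, but the busy overseer has not
        // processed it yet.
        overseerQueue.add(Op.UNSET_LEADER);

        // Step 4: repA is stopped before it can enqueue the set leader message.

        // Step 5: repA restarts and reads the (stale) state.
        boolean looksLikeLeader = "repA".equals(leaderInState);
        if (!(skipWhenAlreadyLeader && looksLikeLeader)) {
            overseerQueue.add(Op.SET_LEADER);
        }

        // The overseer finally catches up and applies the queue in order.
        Op op;
        while ((op = overseerQueue.poll()) != null) {
            leaderInState = (op == Op.SET_LEADER) ? "repA" : null;
        }
        return leaderInState;
    }

    public static void main(String[] args) {
        System.out.println("always send set leader: leader=" + run(false));
        System.out.println("skip when already leader: leader=" + run(true));
    }
}
```

In this model run(false) ends with repA as the leader, while run(true) ends
with no leader, because the delayed unset is applied after repA has already
decided to skip the set leader message.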









[jira] [Commented] (SOLR-10285) Reduce state messages when there are leader only shards

2017-05-08 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001160#comment-16001160
 ] 

Erick Erickson commented on SOLR-10285:
---

Joshua:

"Yonik's law of patches" reads "A half-baked patch with no documentation, no 
tests and no backwards compatibility is better than no patch at all.". 

Please feel free to attach a patch even if it's not complete (even if it 
doesn't even _compile_!), with appropriate disclaimers. Even if someone picks 
up this JIRA and decides to use another approach they'll be able to benefit 
from what they see of your work.

It is also good to mention if you won't be working on it; that way people 
won't wait if they want to pick it up.

Best,
Erick







[jira] [Commented] (SOLR-10285) Reduce state messages when there are leader only shards

2017-05-08 Thread Joshua Humphries (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000967#comment-16000967
 ] 

Joshua Humphries commented on SOLR-10285:
-

[~dragonsinth], I'm afraid I don't have a patch. I do have a branch where I 
made a lot of progress, but I did not finish getting unit tests to pass. The 
patch for SOLR-10277 ended up being sufficient for our restart-time objectives 
at the time, so I put it on the back-burner. This change would certainly reduce 
the restart time further, quite considerably, in fact, for deployments with a 
large number of shards that do not have multiple replicas. I'll dust it off 
today and try to assess remaining work to get it merge-worthy.







[jira] [Commented] (SOLR-10285) Reduce state messages when there are leader only shards

2017-05-07 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000303#comment-16000303
 ] 

Scott Blum commented on SOLR-10285:
---

[~jhump], did you have a patch for this, or did we only discuss it?



