[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824411#comment-15824411
 ] 

Erick Erickson commented on SOLR-9906:
--

Beasting after this latest push succeeded 100 times out of 100. Before it, the 
test failed for me 21 times out of 100.

> Use better check to validate if node recovered via PeerSync or Replication
> --
>
> Key: SOLR-9906
> URL: https://issues.apache.org/jira/browse/SOLR-9906
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 6.4
>
> Attachments: SOLR-9906.patch, SOLR-9906.patch, 
> SOLR-PeerSyncVsReplicationTest.diff
>
>
> Tests {{LeaderFailureAfterFreshStartTest}} and {{PeerSyncReplicationTest}} 
> currently rely on the number of requests made to the leader's replication 
> handler to check whether a node recovered via PeerSync or replication. This 
> check is not very reliable and we have seen failures in the past. 
> While tinkering with different ways to write a better test I found 
> [SOLR-9859|SOLR-9859]. Now that SOLR-9859 is fixed, here is an idea for a 
> better way to distinguish recovery via PeerSync vs Replication. 
> * For {{PeerSyncReplicationTest}}, if the node successfully recovers via 
> PeerSync, then the file {{replication.properties}} should not exist
> * For {{LeaderFailureAfterFreshStartTest}}, if the freshly replicated node 
> does not go into replication recovery after the leader failure, the contents 
> of {{replication.properties}} should not change 
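
The two checks proposed in the description can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class name, the method names, and the choice to pass in the directory that holds {{replication.properties}} are all hypothetical and made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Solr or of the SOLR-9906 patches.
public class ReplicationPropertiesCheck {
    static final String FILE_NAME = "replication.properties";

    // Returns the file contents, or null when the file does not exist.
    public static byte[] snapshot(Path dir) {
        Path props = dir.resolve(FILE_NAME);
        try {
            return Files.exists(props) ? Files.readAllBytes(props) : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // PeerSyncReplicationTest idea: after a PeerSync recovery the file should be absent.
    public static boolean replicationFileAbsent(Path dir) {
        return !Files.exists(dir.resolve(FILE_NAME));
    }

    // LeaderFailureAfterFreshStartTest idea: with no new replication the contents stay the same.
    public static boolean unchangedSince(byte[] before, Path dir) {
        return Arrays.equals(before, snapshot(dir));
    }

    // Exercises both checks against a temporary directory that stands in for a replica's data.
    public static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("replica-data");
            boolean absentInitially = replicationFileAbsent(dir);

            Path props = dir.resolve(FILE_NAME);
            Files.write(props, "first replication".getBytes(StandardCharsets.UTF_8));
            byte[] before = snapshot(dir);
            boolean unchanged = unchangedSince(before, dir);

            // Simulate a second replication rewriting the file.
            Files.write(props, "second replication".getBytes(StandardCharsets.UTF_8));
            boolean changeDetected = !unchangedSince(before, dir);

            return new boolean[] {absentInitially, unchanged, changeDetected};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

Comparing file contents rather than counting requests to the replication handler ties the assertion to the effect of a replication instead of to request traffic, which is the motivation given in the description.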



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824147#comment-15824147
 ] 

ASF subversion and git services commented on SOLR-9906:
---

Commit efc7ee0f0c9154fe58671601fdc053540c97ff62 in lucene-solr's branch 
refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=efc7ee0 ]

SOLR-9906: Fix dodgy test check





[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824146#comment-15824146
 ] 

ASF subversion and git services commented on SOLR-9906:
---

Commit e13a6fa078890c3f3e0d9cebb1bf3329d94e46a6 in lucene-solr's branch 
refs/heads/branch_6x from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e13a6fa ]

SOLR-9906: Fix dodgy test check





[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824145#comment-15824145
 ] 

ASF subversion and git services commented on SOLR-9906:
---

Commit 3795c997257868b66306a2c105f095f8a82326c7 in lucene-solr's branch 
refs/heads/branch_6_4 from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3795c99 ]

SOLR-9906: Fix dodgy test check





[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824113#comment-15824113
 ] 

Alan Woodward commented on SOLR-9906:
-

Yes to both - don't worry about a patch, I'll make the change and push it.




[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread Pushkar Raste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824089#comment-15824089
 ] 

Pushkar Raste commented on SOLR-9906:
-

[~romseygeek] - Thank you for catching the bug. I think the check can be fixed 
by changing {{slice.getState() == State.ACTIVE}} to 
{{slice.getLeader().getState() == Replica.State.ACTIVE}}.

Let me know if that is correct and I will attach a patch to fix it (I am not 
sure whether I have to attach a patch for this issue in its entirety or just 
the patch that fixes the slice vs replica state).

What do you mean by the log message being badly set up?
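
To illustrate why the two expressions differ, here is a small stand-alone model. The types below are simplified stand-ins written for this example, not Solr's own {{Slice}} and {{Replica}} classes, and the null guard on the leader is an added assumption.

```java
// Stand-alone model for illustration only; these are NOT Solr's classes.
public class LeaderStateCheck {
    public enum SliceState { ACTIVE, INACTIVE }
    public enum ReplicaState { ACTIVE, RECOVERING, DOWN }

    public static final class Replica {
        final String name;
        final ReplicaState state;

        public Replica(String name, ReplicaState state) {
            this.name = name;
            this.state = state;
        }
    }

    public static final class Slice {
        final SliceState state;
        final Replica leader; // null while no leader is elected

        public Slice(SliceState state, Replica leader) {
            this.state = state;
            this.leader = leader;
        }
    }

    // The check described as buggy: it looks at the shard-level state only,
    // which can be ACTIVE even though the leader replica is not.
    public static boolean sliceIsActive(Slice slice) {
        return slice.state == SliceState.ACTIVE;
    }

    // The proposed check: look at the state of the leader replica itself.
    // The null guard is an assumption added for this sketch.
    public static boolean leaderIsActive(Slice slice) {
        return slice.leader != null && slice.leader.state == ReplicaState.ACTIVE;
    }

    public static void main(String[] args) {
        Slice shard = new Slice(SliceState.ACTIVE,
                new Replica("core_node2", ReplicaState.RECOVERING));
        System.out.println("slice active:  " + sliceIsActive(shard));
        System.out.println("leader active: " + leaderIsActive(shard));
    }
}
```

In this model a shard can report an active slice while its leader is still recovering, so a wait loop that uses the slice-level check could return before the new leader is ready.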




[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-16 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823692#comment-15823692
 ] 

Alan Woodward commented on SOLR-9906:
-

This is causing lots of failures in PeerSyncReplicationTest.  I think 
AbstractDistribZkTestBase.waitForNewLeader() is buggy - the check that a new 
leader is active is looking at the slice state, not the prospective leader's 
replica state, plus the log message is badly set up.




[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795845#comment-15795845
 ] 

ASF subversion and git services commented on SOLR-9906:
---

Commit 812070a77f483149e1d83b3d1bbc7ba80f0fd868 in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=812070a ]

SOLR-9906-Use better check to validate if node recovered via PeerSync or 
Replication





[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794681#comment-15794681
 ] 

ASF subversion and git services commented on SOLR-9906:
---

Commit d5652385675d12b80a58e44a8c8b392c9f70a334 in lucene-solr's branch 
refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d565238 ]

SOLR-9906: unused import





[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2017-01-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794381#comment-15794381
 ] 

ASF subversion and git services commented on SOLR-9906:
---

Commit 3988532d26a50b1f3cf51e1d0009a0754cfd6b57 in lucene-solr's branch 
refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3988532 ]

SOLR-9906-Use better check to validate if node recovered via PeerSync or 
Replication





[jira] [Commented] (SOLR-9906) Use better check to validate if node recovered via PeerSync or Replication

2016-12-30 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787382#comment-15787382
 ] 

Noble Paul commented on SOLR-9906:
--

{{Thread.sleep(3000)}} in {{PeerSyncReplicationTest.forceNodeFailures}} needs to 
go. Unconditional waits are pretty bad.
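
One common alternative to a fixed sleep is to poll a condition until it holds or a deadline passes. The sketch below is a generic illustration with hypothetical names ({{WaitFor}}, {{waitFor}}); it is not the helper used in the Solr test framework and not the change that was committed.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of Solr's test framework.
public class WaitFor {

    // Polls the condition every pollMs milliseconds.
    // Returns true as soon as the condition holds, false on timeout or interrupt.
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can see it.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 200 ms after start.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 200, 3000, 20);
        System.out.println("condition met: " + ok);
    }
}
```

Unlike {{Thread.sleep(3000)}}, this returns as soon as the condition is true and fails visibly when it never becomes true, instead of always paying the full delay and then hoping the state has settled.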
