[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2018-10-26 Thread Tamas Penzes (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Penzes updated ZOOKEEPER-1777:

Fix Version/s: (was: 3.5.5)

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Critical
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-1777-3.4.patch, ZOOKEEPER-1777.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz, logs_trunk.tar.gz, snaps.tar
>
>
> In a 3-server ensemble, one of the followers doesn't see some of the 
> ephemeral nodes that are present in the leader and the other follower. 
> The 8 missing nodes in "the follower that is not ok" were created at the end 
> of epoch 1; the ensemble is running in epoch 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2018-10-26 Thread Tamas Penzes (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Penzes updated ZOOKEEPER-1777:

Priority: Major  (was: Critical)

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Major
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-1777-3.4.patch, ZOOKEEPER-1777.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz, logs_trunk.tar.gz, snaps.tar


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2017-03-13 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-1777:
---
Fix Version/s: (was: 3.5.3)
   3.5.4

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Critical
> Fix For: 3.5.4, 3.6.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2016-06-21 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated ZOOKEEPER-1777:
-
Fix Version/s: (was: 3.5.2)
   3.5.3

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Critical
> Fix For: 3.6.0, 3.5.3
>
> Attachments: ZOOKEEPER-1777-3.4.patch, ZOOKEEPER-1777.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz, logs_trunk.tar.gz, snaps.tar


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Fix Version/s: (was: 3.4.6)

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-08 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-1777:


Priority: Critical  (was: Blocker)

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Critical
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Attachment: ZOOKEEPER-1777.patch

The updated version of the patch for trunk seems to pass the regression tests.
However, it still lacks a specific test for the error reported in this 
JIRA.
It is not intended to be the final version; it is there to help with the 
discussion of the proposal.

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-04 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Attachment: ZOOKEEPER-1777.patch

For my case there is a simple solution: since our snapshots are very small, we 
have already applied a patch that forces snapshot synchronization and avoids 
the problem. In any case, the severity was changed by Patrick Hunt; you may want 
to check with him in case you haven't done so already.
The attached patch proposes a fix in which an incremental hash, which should be 
unique for each transaction history, is associated with each transaction. This 
hash is sent to the Leader (only if the Leader supports it).
The Leader then sends a snapshot if the hash doesn't match its history for the 
same transaction.
At least this was the intention of the change :-).
I only had time to check the patch for 3.4, and at least it passes the 
regression test.
Reviews and comments will be much appreciated.
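
To make the proposal above easier to discuss, here is a minimal sketch of an 
incremental history hash. It is not taken from the attached patch: the use of 
SHA-256, the function names (chain_hash, history_hashes, choose_sync) and the 
zxid values are all assumptions made for illustration only.

```python
import hashlib

def chain_hash(prev_hash: bytes, zxid: int, payload: bytes) -> bytes:
    """Fold one transaction into the running hash of the history."""
    h = hashlib.sha256()
    h.update(prev_hash)
    h.update(zxid.to_bytes(8, "big"))
    h.update(payload)
    return h.digest()

def history_hashes(txns):
    """Map each zxid to the hash of the whole history up to that zxid."""
    hashes = {}
    current = b"\x00" * 32  # hypothetical seed for an empty history
    for zxid, payload in txns:
        current = chain_hash(current, zxid, payload)
        hashes[zxid] = current
    return hashes

def choose_sync(leader_hashes, learner_zxid: int, learner_hash: bytes) -> str:
    """Send a snapshot unless the learner's history matches the leader's."""
    if leader_hashes.get(learner_zxid) == learner_hash:
        return "DIFF"
    return "SNAP"

# Hypothetical histories: same zxids, but the last transaction differs.
leader = history_hashes([(0x100000001, b"create /1"),
                         (0x100000002, b"create /2")])
good = history_hashes([(0x100000001, b"create /1"),
                       (0x100000002, b"create /2")])
bad = history_hashes([(0x100000001, b"create /1"),
                      (0x100000002, b"create /2bis")])

print(choose_sync(leader, 0x100000002, good[0x100000002]))
print(choose_sync(leader, 0x100000002, bad[0x100000002]))
```

Because each hash depends on the previous one, two servers that report the 
same last zxid but logged different transactions along the way end up with 
different hashes, which is the case the zxid comparison alone cannot detect.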

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch, 
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-04 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Attachment: ZOOKEEPER-1777-3.4.patch

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch, 
> ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-04 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Attachment: logs_trunk.tar.gz

It does occur in trunk also. Logs are attached in the file logs_trunk.tar.gz.
However, I did see TRUNC used for synchronization a couple of times, and 
also a message about being unable to send TRUNC because of different epochs and 
sending a snapshot instead. So it was a bit harder to reproduce.
This is the data in server A:
[Fbc, Cbc, 6a, 4a, 7bc, 5a, 8bc, 3, 2, 1, 9a, 9bc, 7a, 8a, zookeeper, Abc, 6bc, 
Bbc, Dbc, Ebc]
This is the data in server B:
[Fbc, Cbc, 7bc, 5bc, 8bc, 4bc, 3, 2, 1, 9bc, zookeeper, Abc, 6bc, Bbc, Dbc, Ebc]
I am working on the patch that sends information about the last transaction 
from the learner to the leader. That means that synchronization via snapshot 
will only happen when this problem occurs. Personally, I don't see any other way 
to solve this; please tell me if you do.
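
The difference between the two listings above is easier to see as a set 
comparison. The znode names are copied from the listings; the variable names 
are only illustrative.

```python
# Children reported by server A and server B, as listed above.
server_a = {"Fbc", "Cbc", "6a", "4a", "7bc", "5a", "8bc", "3", "2", "1",
            "9a", "9bc", "7a", "8a", "zookeeper", "Abc", "6bc", "Bbc",
            "Dbc", "Ebc"}
server_b = {"Fbc", "Cbc", "7bc", "5bc", "8bc", "4bc", "3", "2", "1", "9bc",
            "zookeeper", "Abc", "6bc", "Bbc", "Dbc", "Ebc"}

only_in_a = sorted(server_a - server_b)
only_in_b = sorted(server_b - server_a)

print("only in A:", only_in_a)  # ['4a', '5a', '6a', '7a', '8a', '9a']
print("only in B:", only_in_b)  # ['4bc', '5bc']
```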

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Attachment: ZOOKEEPER-1777.tar.gz

Thanks a lot, Flavio and Thawan, for looking into this!
I thought A does not get a TRUNC because B and C are already at a zxid that is 
higher than a9, which is the highest zxid that A has seen.
I thought a TRUNC is only sent if the leader has a lower zxid than the incoming 
learner.
The logs and data dir for this case are now attached.
This is the resulting data in the wrong follower:
[3, 2, 1, 6, zookeeper, 5, 5bis, 4]
And this is the resulting data in the leader and the other follower:
[3, 2, 1, 4bis, 6, zookeeper, 5bis]
I am not saying that this is an error in the protocol. I am only saying that I 
see it as a problem and a small modification of the protocol is one of the 
solutions. Another solution would be adding an option to force SNAP 
synchronization, and there are very likely more.
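
The understanding of TRUNC described above can be written down as a small 
model. This is a deliberately simplified sketch of the rule as stated in this 
comment, not the leader's actual code: the function name and the zxid values 
are hypothetical, and the case in which the leader falls back to SNAP is left 
out.

```python
def choose_sync(leader_last_zxid: int, learner_last_zxid: int) -> str:
    """Simplified rule: TRUNC only when the learner is ahead of the leader."""
    if learner_last_zxid > leader_last_zxid:
        return "TRUNC"
    return "DIFF"

# Hypothetical values: the learner stopped in epoch 1, the leader is in epoch 2.
leader_zxid = 0x200000002
learner_zxid = 0x100000009

print(choose_sync(leader_zxid, learner_zxid))  # DIFF
```

Under this rule a learner whose last zxid is lower than the leader's only ever 
receives a DIFF, so transactions it logged that are not part of the leader's 
history are never rolled back.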

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: snaps.tar, ZOOKEEPER-1777.tar.gz


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-02 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-1777:


Priority: Blocker  (was: Major)

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: snaps.tar


[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

2013-10-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-

Attachment: snaps.tar

Snapshots of the three members of the ZooKeeper ensemble.
The 8 missing nodes in "the follower that is not ok" were created at the end of 
epoch 1:
 < cZxid = 0x01007d
 ...
 < cZxid = 0x0100a9
 while the complete list is:
 ...
 cZxid = 0x01007b
 cZxid = 0x01007d
 ...
 cZxid = 0x0100a9
 cZxid = 0x020004
 ...
 4 of the 6 ephemeral owners of these nodes have made modifications during 
epoch 2, which makes me think that this problem might not be related to 
session expiration, but more likely to synchronization after leader election.
 Even though some of the missing znodes were modified in epoch 2, "the follower 
that is not ok" didn't use this event to notice that something was wrong and, 
e.g., restart and synchronize via snapshot.

> Missing ephemeral nodes in one of the members of the ensemble
> -
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
>Reporter: Germán Blanco
>Assignee: Germán Blanco
> Fix For: 3.4.6, 3.5.0
>
> Attachments: snaps.tar