[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Component/s:     (was: mongomk)
                 core

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: core
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.4
>
>         Attachments: OAK-2739.patch
>
>
> Currently, in an oak-cluster when (e.g.) one oak-client stops renewing its
> lease (ClusterNodeInfo.renewLease()), this will eventually be noticed by the
> others in the same oak-cluster. Those then mark this client as {{inactive}},
> start recovering it and subsequently remove that node from any further
> merge etc. operation.
> Now, whatever the reason was why that client stopped renewing the lease
> (could be an exception, deadlock, whatever) - that client itself still
> considers itself as {{active}} and continues to take part in the cluster
> action.
> This will result in an unbalanced situation where that one client 'sees'
> everybody as {{active}} while the others see this one as {{inactive}}.
> If this ClusterNodeInfo state should be something that can be built upon, and
> to avoid any inconsistency due to unbalanced handling, the inactive node
> should probably retire gracefully - or any other appropriate action should be
> taken, other than just continuing as today.
> This ticket is to keep track of ideas and actions taken wrt this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Fix Version/s:     (was: 1.3.5)
                   1.3.4

Tentatively setting fix version to 1.3.4 assuming it would be fine to go ahead with this without the testing situation resolved/clarified

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.4
>
>         Attachments: OAK-2739.patch
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Attachment:     (was: OAK-2739.patch)

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>         Attachments: OAK-2739.patch
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Attachment: OAK-2739.patch

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>         Attachments: OAK-2739.patch
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Attachment: OAK-2739.patch

Attaching [^OAK-2739.patch], which implements a lease check upon each document store invocation by using the suggested document store wrapper pattern. Review very welcome!

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
>
>         Attachments: OAK-2739.patch
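For readers without the attachment at hand, the wrapper idea mentioned in this update could look roughly like the sketch below. It is not the content of OAK-2739.patch: SimpleStore, InMemoryStore, Lease and LeaseCheckStore are hypothetical names made up for illustration, and the store interface is a two-method stand-in rather than Oak's actual document store API. The sketch only shows the general shape, namely that every store call first checks the local lease and refuses to proceed once the lease has run out, so a node that failed to renew in time stops taking part instead of continuing as if it were still active.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical two-method stand-in for a document store (NOT Oak's real interface).
interface SimpleStore {
    String find(String key);
    void put(String key, String value);
}

// Plain in-memory store, used only to have something to wrap.
class InMemoryStore implements SimpleStore {
    private final Map<String, String> data = new HashMap<>();

    public String find(String key) {
        return data.get(key);
    }

    public void put(String key, String value) {
        data.put(key, value);
    }
}

// Hypothetical local view of the lease: remembers until when it is valid.
class Lease {
    private final LongSupplier clock;   // injected so the example is testable
    private volatile long leaseEndMillis;

    Lease(LongSupplier clock, long leaseEndMillis) {
        this.clock = clock;
        this.leaseEndMillis = leaseEndMillis;
    }

    // Extends the lease by the given duration, counted from "now".
    void renew(long durationMillis) {
        leaseEndMillis = clock.getAsLong() + durationMillis;
    }

    // Fails if the lease end has been reached, i.e. it was not renewed in time.
    void check() {
        if (clock.getAsLong() >= leaseEndMillis) {
            throw new IllegalStateException("lease expired, refusing store access");
        }
    }
}

// Wrapper: checks the lease before delegating each call to the wrapped store.
class LeaseCheckStore implements SimpleStore {
    private final SimpleStore delegate;
    private final Lease lease;

    LeaseCheckStore(SimpleStore delegate, Lease lease) {
        this.delegate = delegate;
        this.lease = lease;
    }

    public String find(String key) {
        lease.check();
        return delegate.find(key);
    }

    public void put(String key, String value) {
        lease.check();
        delegate.put(key, value);
    }
}
```

Throwing an exception is only one possible reaction; the ticket itself leaves open what the "appropriate action" should be (for example retiring the node gracefully).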
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Fix Version/s:     (was: 1.4)
                   1.3.5

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.5
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Fix Version/s:     (was: 1.3.4)
                   1.4

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>              Labels: resilience
>             Fix For: 1.4
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Giannella updated OAK-2739:
----------------------------------
    Fix Version/s:     (was: 1.3.3)
                   1.3.4

Bulk move to 1.3.4

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.4
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Giannella updated OAK-2739:
----------------------------------
    Fix Version/s:     (was: 1.3.2)
                   1.3.3

Bulk move to 1.3.3.

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.3
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Fix Version/s:     (was: 1.3.1)
                   1.3.2

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.2
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli updated OAK-2739:
-----------------------------
    Fix Version/s:     (was: 1.3.0)
                   1.3.1

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.1
[jira] [Updated] (OAK-2739) take appropriate action when lease cannot be renewed (in time)
[ https://issues.apache.org/jira/browse/OAK-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2739:
-------------------------------
    Labels: resilience  (was: )

> take appropriate action when lease cannot be renewed (in time)
> --------------------------------------------------------------
>
>                 Key: OAK-2739
>                 URL: https://issues.apache.org/jira/browse/OAK-2739
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: mongomk
>    Affects Versions: 1.2
>            Reporter: Stefan Egli
>              Labels: resilience
>             Fix For: 1.3.0