Hi Michael,

Quarantine is the state a node ends up in when Akka system-level messages can 
no longer be exchanged across the nodes. These include, but are not limited 
to, heartbeats, remote deathwatch and node state updates.

This article gives a fair idea:
https://livingston.io/understanding-akkas-quarantine-state/

Some pointers on what could cause this are discussed here:
https://groups.google.com/forum/#!searchin/akka-user/quarantine|sort:date/akka-user/6cmA1RzE4-s/IaHxhxLhEgAJ

We have seen the suicide in the past during long stop-the-world GCs, as well 
as during a *deliberate* (for testing purposes) interface-down / up for 2550 …
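
To illustrate why a long stop-the-world GC can end this way, here is a minimal 
sketch of a deadline-style failure detector. It is a simplified model and not 
Akka's implementation; the class name DeadlineDetectorSketch, the 1 s heartbeat 
interval and the 10 s acceptable pause are assumptions chosen for illustration. 
The idea is only that a node frozen by GC sends no heartbeats, so its peers see 
silence and, once the tolerated pause is exceeded, stop treating it as reachable.

```java
import java.time.Duration;

/**
 * Simplified deadline-style failure detector (illustrative only, NOT Akka's code).
 * A peer counts as reachable while the time since its last heartbeat stays
 * within heartbeatInterval + acceptablePause.
 */
final class DeadlineDetectorSketch {
    private final long deadlineMillis;
    private long lastHeartbeatMillis;

    DeadlineDetectorSketch(Duration heartbeatInterval, Duration acceptablePause, long startMillis) {
        this.deadlineMillis = heartbeatInterval.plus(acceptablePause).toMillis();
        this.lastHeartbeatMillis = startMillis;
    }

    /** Record a heartbeat received from the peer at the given time. */
    void heartbeat(long nowMillis) {
        lastHeartbeatMillis = nowMillis;
    }

    /** True while the peer has been heard from within the deadline. */
    boolean isAvailable(long nowMillis) {
        return nowMillis - lastHeartbeatMillis <= deadlineMillis;
    }

    public static void main(String[] args) {
        // Assumed values for illustration: 1 s heartbeats, 10 s tolerated silence.
        DeadlineDetectorSketch fd =
            new DeadlineDetectorSketch(Duration.ofSeconds(1), Duration.ofSeconds(10), 0L);

        fd.heartbeat(1_000L);
        // Peer silent for 5 s (e.g. a short GC pause): still within the deadline.
        System.out.println("after 5 s of silence:  " + fd.isAvailable(6_000L));
        // Peer silent for 30 s (e.g. a long stop-the-world GC): deadline exceeded.
        System.out.println("after 30 s of silence: " + fd.isAvailable(31_000L));
    }
}
```

In this model a 5 s pause is tolerated while a 30 s pause is not. What happens 
after a peer is marked unreachable (and whether it ends in quarantine) depends 
on the actual Akka version and configuration, which this sketch does not cover.
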

I haven't tested this behavior on master yet.

Regards
Muthu




From: controller-dev-boun...@lists.opendaylight.org 
[mailto:controller-dev-boun...@lists.opendaylight.org] On Behalf Of Michael 
Vorburger
Sent: Thursday, July 05, 2018 11:12 PM
To: Tom Pantelis <tompante...@gmail.com>
Cc: Sridhar Gaddam <sgad...@redhat.com>; Kitt, Stephen <sk...@redhat.com>; 
controller-dev <controller-dev@lists.opendaylight.org>
Subject: Re: [controller-dev] ODL abrupt restart - System.exit() via 
QuarantinedMonitorActorPropsFactory ?

On Thu, Jul 5, 2018 at 7:39 PM, Tom Pantelis <tompante...@gmail.com> wrote:
On Thu, Jul 5, 2018 at 1:35 PM, Michael Vorburger <vorbur...@redhat.com> wrote:
Tom, or Robert, or anyone else having hit this themselves,

would you be able to remind us what in clustering can cause an ODL abrupt 
restart - System.exit() via bundleContext.getBundle(0).stop(); from 
https://github.com/opendaylight/controller/blob/master/opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/akka/osgi/impl/QuarantinedMonitorActorPropsFactory.java
 ?

I do vaguely remember an "inconsistent cluster" leading to this - could you 
clarify exactly what situation leads to that? Loss of leader? Loss of majority?

asking for https://bugzilla.redhat.com/show_bug.cgi?id=1597304 ...

That happens when akka quarantines a node - it can no longer rejoin the 
majority cluster unless the actor system is restarted, hence we restart the 
whole JVM.

and what can cause Akka to have to quarantine a node?

_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev
