Hi Michael,

Quarantine is the state a node is put into when Akka system-level messages can no longer be exchanged between the nodes. These include, but are not limited to, heartbeats, remote deathwatch, and node state updates.
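
To make the heartbeat part concrete, here is a minimal sketch of a deadline-based heartbeat monitor. It is only an illustration: Akka's real failure detector is a phi-accrual detector, and the class name `HeartbeatMonitor`, the 3-second pause and the node name `member-2` are all made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified deadline-based heartbeat monitor.
 * Hypothetical illustration only; this is NOT Akka's phi-accrual failure detector.
 */
class HeartbeatMonitor {
    // Longest silence tolerated before a node is considered unavailable (assumed value).
    private final long acceptablePauseMillis;
    // Timestamp of the last heartbeat received from each node.
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    HeartbeatMonitor(long acceptablePauseMillis) {
        this.acceptablePauseMillis = acceptablePauseMillis;
    }

    /** Record a heartbeat received from the given node at the given time. */
    void heartbeat(String node, long nowMillis) {
        lastHeartbeat.put(node, nowMillis);
    }

    /** A node is available only if it has been heard from within the acceptable pause. */
    boolean isAvailable(String node, long nowMillis) {
        Long last = lastHeartbeat.get(node);
        return last != null && nowMillis - last <= acceptablePauseMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(3_000);
        monitor.heartbeat("member-2", 0);
        // Heard from 2 s ago: still within the acceptable pause.
        System.out.println(monitor.isAvailable("member-2", 2_000));  // true
        // Silent for 12 s (e.g. the peer was paused): considered unavailable.
        System.out.println(monitor.isAvailable("member-2", 12_000)); // false
    }
}
```

In this simplified model, any silence longer than the acceptable pause makes the peer look unavailable, whatever the reason for the silence.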

This article gives a fair idea:
https://livingston.io/understanding-akkas-quarantine-state/

Some pointers on what could cause this are discussed here:
https://groups.google.com/forum/#!searchin/akka-user/quarantine|sort:date/akka-user/6cmA1RzE4-s/IaHxhxLhEgAJ

We have seen this self-restart in the past during long stop-the-world GCs, as well as when *deliberately* (for testing purposes) bringing the interface down and up for 2550 …

We haven't tested this behavior on master yet.

Regards
Muthu

From: controller-dev-boun...@lists.opendaylight.org [mailto:controller-dev-boun...@lists.opendaylight.org] On Behalf Of Michael Vorburger
Sent: Thursday, July 05, 2018 11:12 PM
To: Tom Pantelis <tompante...@gmail.com>
Cc: Sridhar Gaddam <sgad...@redhat.com>; Kitt, Stephen <sk...@redhat.com>; controller-dev <controller-dev@lists.opendaylight.org>
Subject: Re: [controller-dev] ODL abrupt restart - System.exit() via QuarantinedMonitorActorPropsFactory ?

On Thu, Jul 5, 2018 at 7:39 PM, Tom Pantelis <tompante...@gmail.com> wrote:
> On Thu, Jul 5, 2018 at 1:35 PM, Michael Vorburger <vorbur...@redhat.com> wrote:
>> Tom, or Robert, or anyone else having hit this themselves,
>>
>> would you be able to remind us what in clustering can cause an ODL abrupt restart - System.exit() via bundleContext.getBundle(0).stop(); from https://github.com/opendaylight/controller/blob/master/opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/akka/osgi/impl/QuarantinedMonitorActorPropsFactory.java ?
>>
>> I do vaguely recall an "inconsistent cluster" leading to this - could you clarify exactly what situation leads to that? Loss of leader? Loss of majority?
>>
>> asking for https://bugzilla.redhat.com/show_bug.cgi?id=1597304 ...
>
> That happens when akka quarantines a node - it can no longer rejoin the majority cluster unless the actor system is restarted, hence we restart the whole JVM.

and what can cause Akka to have to quarantine a node?

_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev