Glad to see that you are on track. In 2.4 you might find registerOnMemberRemoved useful: http://doc.akka.io/docs/akka/snapshot/scala/cluster-usage.html
Cheers,
Patrik

On Wed, Jun 24, 2015 at 6:11 PM, Anders Båtstrand <ander...@gmail.com> wrote:

> I have attached my logs showing the problem.
>
> I now think that the problem is the same as the bug you mention. I can
> read the following:
>
> 2015-06-24 17:51:54,693 INFO Cluster(akka://my-system) my-system-akka.actor.default-dispatcher-3 - Cluster Node [akka.tcp://my-system@machine2:15552] - Successfully shut down
> 2015-06-24 17:51:54,677 INFO Cluster(akka://my-system) my-system-akka.actor.default-dispatcher-17 - Cluster Node [akka.tcp://my-system@machine2:15552] - Shutting down...
> 2015-06-24 17:51:54,603 INFO Cluster(akka://my-system) my-system-akka.actor.default-dispatcher-20 - Cluster Node [akka.tcp://my-system@machine1:15552] - Marking unreachable node [akka.tcp://my-system@machine2:15552] as [Down]
> 2015-06-24 17:51:54,603 INFO Cluster(akka://my-system) my-system-akka.actor.default-dispatcher-4 - Cluster Node [akka.tcp://my-system@machine1:15552] - Leader is auto-downing unreachable node [akka.tcp://my-system@machine2:15552]
>
> The problem is that I still find three members in the
> cluster.state.members set on machine2, and all of them have status 'up'.
> This is (as far as I can tell) exactly what the bug is about.
>
> It seems the other problems I thought were related are caused by my not
> shutting down the system when this happens.
>
> I will hook into the shutdown process of the cluster, and call System.exit
> from there. That way the node will be restarted and rejoin the cluster.
>
> Anders
>
>
> On Wednesday, 24 June 2015 at 17:20:30 UTC+2, Anders Båtstrand wrote:
>
>> I am using the cluster singleton, my mistake. I somehow believed the
>> leader always had the singleton...
>>
>> Anyway, it might be that https://github.com/akka/akka/issues/17479 is
>> related. I am not downing any node manually, however, and a node will never
>> down itself, right?
>> Anyway, this bug gave me some pointers, and I will
>> investigate further.
>>
>> I avoid split-brain by calling System.exit on any system that has less
>> than half the cluster size in its member list.
>>
>> Anders
>>
>> On Wednesday, 24 June 2015 at 17:04:36 UTC+2, Patrik Nordwall wrote:
>>
>>
>>
>> On Wed, Jun 24, 2015 at 4:57 PM, Anders Båtstrand <ande...@gmail.com>
>> wrote:
>>
>> No OutOfMemory, the third node is running fine. Except it can be the
>> leader, and in that case I have two leaders...
>>
>>
>> What are you using the leader for? There is no guarantee that there will
>> not be more than one leader.
>> For that you have to use the cluster singleton, or cluster sharding.
>>
>>
>>
>> I think I have reproduced it in the following program (let me know if you
>> want the complete maven setup or similar):
>>
>>
>> Thanks, I will try that.
>>
>>
>>
>> application.conf:
>>
>> akka {
>>   actor.provider = "akka.cluster.ClusterActorRefProvider"
>>   remote.netty.tcp.hostname = "127.0.0.1"
>>
>>   cluster {
>>     seed-nodes = [
>>       "akka.tcp://system@127.0.0.1:2551",
>>       "akka.tcp://system@127.0.0.1:2552"
>>     ]
>>
>>     auto-down-unreachable-after = 10ms
>>
>>     failure-detector {
>>       heartbeat-interval = 10 ms
>>       threshold = 2.0
>>       acceptable-heartbeat-pause = 10 ms
>>       expected-response-after = 5 ms
>>     }
>>   }
>> }
>>
>>
>> And the test:
>>
>> package no.kantega.workshop.akka;
>>
>> import akka.actor.ActorRef;
>> import akka.actor.ActorSystem;
>> import akka.actor.Address;
>> import akka.cluster.Cluster;
>> import akka.cluster.Member;
>> import akka.contrib.pattern.DistributedPubSubExtension;
>> import akka.contrib.pattern.DistributedPubSubMediator;
>> import com.typesafe.config.ConfigFactory;
>> import org.junit.Test;
>> import scala.collection.Iterator;
>> import scala.collection.immutable.SortedSet;
>>
>> import java.util.HashSet;
>> import java.util.Set;
>> import java.util.concurrent.TimeUnit;
>>
>> import static com.jayway.awaitility.Awaitility.await;
>> import static com.typesafe.config.ConfigValueFactory.fromAnyRef;
>> import static org.assertj.core.api.Assertions.assertThat;
>>
>> public class ClusterConvergenceTest {
>>
>>     @Test
>>     public void test_many_times() throws InterruptedException {
>>         for (int i = 0; i < 100; i++) {
>>             cluster_membership_is_symmetric();
>>         }
>>     }
>>
>>     @Test
>>     public void cluster_membership_is_symmetric() throws InterruptedException {
>>
>>         ActorSystem actorSystem1 = ActorSystem.apply("system",
>>                 ConfigFactory.load()
>>                         .withValue("akka.remote.netty.tcp.port", fromAnyRef("2551")));
>>
>>         ActorSystem actorSystem2 = ActorSystem.apply("system",
>>                 ConfigFactory.load()
>>                         .withValue("akka.remote.netty.tcp.port", fromAnyRef("2552")));
>>
>>         ActorSystem actorSystem3 = ActorSystem.apply("system",
>>                 ConfigFactory.load()
>>                         .withValue("akka.remote.netty.tcp.port", fromAnyRef("2553")));
>>
>>         try {
>>
>>             Cluster cluster1 = Cluster.get(actorSystem1);
>>             Cluster cluster2 = Cluster.get(actorSystem2);
>>             Cluster cluster3 = Cluster.get(actorSystem3);
>>
>>             // Wait until all members can see all the others:
>>             await().atMost(20, TimeUnit.SECONDS).until(() -> cluster1.state().members().size() == 3);
>>             await().atMost(20, TimeUnit.SECONDS).until(() -> cluster2.state().members().size() == 3);
>>             await().atMost(20, TimeUnit.SECONDS).until(() -> cluster3.state().members().size() == 3);
>>
>>             System.out.println("Generate some load (we should see cluster events in the console log)...");
>>
>>             for (ActorSystem system : new ActorSystem[]{actorSystem1, actorSystem2, actorSystem3}) {
>>                 ActorRef mediator = DistributedPubSubExtension.get(system).mediator();
>>                 for (int i = 0; i < 1_000_000; i++) {
>>                     String event = "Message number " + i;
>>                     mediator.tell(new DistributedPubSubMediator.Publish("dummy stream", event),
>>                             ActorRef.noSender());
>>                 }
>>             }
>>
>>             System.out.println("Wait for things to settle down (cluster events should stop)...");
>>
>>             // Ideally I would have a way to know when
>>             // all messages were processed...
>>             Thread.sleep(30_000L);
>>
>>             System.out.println("Check that cluster membership is symmetric...");
>>
>>             membershipIsSymmetric(cluster1, cluster2);
>>             membershipIsSymmetric(cluster2, cluster3);
>>             membershipIsSymmetric(cluster1, cluster3);
>>
>>         } finally {
>>             actorSystem1.shutdown();
>>             actorSystem2.shutdown();
>>             actorSystem3.shutdown();
>>             actorSystem1.awaitTermination();
>>             actorSystem2.awaitTermination();
>>             actorSystem3.awaitTermination();
>>         }
>>     }
>>
>>     /**
>>      * Here we check that cluster1 has cluster2 as a member iff cluster2 has
>>      * cluster1 as a member.
>>      */
>>     private static void membershipIsSymmetric(Cluster cluster1, Cluster cluster2) {
>>         if (addresses(cluster1.state().members()).contains(cluster2.selfAddress())) {
>>             // node 1 sees node 2, check the opposite way
>>
>>
>> ...
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+unsubscr...@googlegroups.com.
> To post to this group, send email to akka-user@googlegroups.com.
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
>

--
Patrik Nordwall
Typesafe <http://typesafe.com/> - Reactive apps on the JVM
Twitter: @patriknw
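
The rule mentioned earlier in the thread (call System.exit on any node that has less than half the cluster size in its member list) can be sketched in plain Java roughly as follows. This is only an illustration under assumptions: QuorumCheck, hasQuorum, hasStrictMajority and the expectedClusterSize parameter are hypothetical names that are not part of Akka, and in a real node the visible member count would presumably come from cluster.state().members().size().

```java
public class QuorumCheck {

    /**
     * Rule as stated in the thread: a node keeps running unless it sees
     * LESS than half of the expected cluster size.
     * (Hypothetical helper, not an Akka API.)
     */
    public static boolean hasQuorum(int visibleMembers, int expectedClusterSize) {
        if (expectedClusterSize <= 0) {
            throw new IllegalArgumentException("expectedClusterSize must be positive");
        }
        // Multiply instead of dividing so integer division cannot round the threshold down.
        return 2 * visibleMembers >= expectedClusterSize;
    }

    /**
     * Stricter variant: keep running only when MORE than half is visible.
     * (Hypothetical helper, not an Akka API.)
     */
    public static boolean hasStrictMajority(int visibleMembers, int expectedClusterSize) {
        if (expectedClusterSize <= 0) {
            throw new IllegalArgumentException("expectedClusterSize must be positive");
        }
        return 2 * visibleMembers > expectedClusterSize;
    }

    public static void main(String[] args) {
        int expected = 3; // assumed cluster size, matching the three nodes in the test
        for (int visible = 1; visible <= expected; visible++) {
            System.out.println(visible + " of " + expected
                    + " visible, keep running: " + hasQuorum(visible, expected));
        }
    }
}
```

One design point to be aware of: with an even cluster size, the "less than half" rule lets both sides of an exact 50/50 partition keep running, whereas the strict-majority variant shuts down both sides. Which behaviour is preferable depends on the application.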