[akka-user] cluster sharding failover
Hi gang, a question regarding shard failover.

Here's my 3-node setup (using Akka 2.3.0-RC3):
- backend 1
- backend 2
- frontend

All nodes initialize ClusterSharding at startup -- the frontend supplies an empty entryProps so that it does not host any regions. The frontend starts pinging a range of sharded actors hosted on the backends -- I can see that the entries are evenly distributed across the backends.

When I invoke a clean shutdown of one of the backends, the frontend node is left in a bad state: it continually tries to connect to the dead backend node, forever. Restarting the frontend is required to get it to find the failed-over shards on the remaining backend. Am I missing something about shutting down cluster sharding?

Some logs from the frontend. They're a bit noisy -- I've tried to call out the interesting bits.

DEBUG 15:40:42,389 akka.contrib.pattern.ShardRegion - Forwarding request for shard [0] to [Actor[akka.tcp://ghost@127.0.0.1:50351/user/sharding/user#-576014717]]
DEBUG 15:40:43,405 akka.contrib.pattern.ShardRegion - Forwarding request for shard [1] to [Actor[akka.tcp://ghost@127.0.0.1:50324/user/sharding/user#-498786510]]
DEBUG 15:40:44,425 akka.contrib.pattern.ShardRegion - Forwarding request for shard [2] to [Actor[akka.tcp://ghost@127.0.0.1:50351/user/sharding/user#-576014717]]
INFO 15:40:44,825 akka.actor.LocalActorRef - Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2Fghost%40127.0.0.1%3A50351-1#-206867083] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 15:40:44,881 akka.actor.LocalActorRef - Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2Fghost%40127.0.0.1%3A50351-1#-206867083] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
*DEBUG 15:40:44,884 akka.remote.EndpointWriter - Disassociated [akka.tcp://ghost@127.0.0.1:50373] - [akka.tcp://ghost@127.0.0.1:50351]*
INFO 15:40:44,884 akka.actor.LocalActorRef - Message [akka.actor.FSM$Timer] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fghost%40127.0.0.1%3A50351-1/endpointWriter#1517188237] was not delivered. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 15:40:45,091 akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message [akka.cluster.GossipEnvelope] from Actor[akka://ghost/system/cluster/core/daemon#-1507678631] to Actor[akka://ghost/deadLetters] was not delivered. [5] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 15:40:45,097 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50373] - Marking exiting node(s) as UNREACHABLE [Member(address = akka.tcp://ghost@127.0.0.1:50351, status = Exiting)]. This is expected and they will be removed.
*INFO 15:40:45,106 com.kixeye.common.cluster.ClusterModule - member is unreachable: Member(address = akka.tcp://ghost@127.0.0.1:50351, status = Exiting)*
^-- node becomes unreachable
DEBUG 15:40:45,445 akka.contrib.pattern.ShardRegion - Forwarding request for shard [3] to [Actor[akka.tcp://ghost@127.0.0.1:50324/user/sharding/user#-498786510]]
INFO 15:40:45,535 akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message [akka.cluster.ClusterHeartbeatSender$Heartbeat] from Actor[akka://ghost/system/cluster/core/daemon/heartbeatSender#214131267] to Actor[akka://ghost/deadLetters] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
*DEBUG 15:40:46,465 akka.contrib.pattern.ShardRegion - Forwarding request for shard [4] to [Actor[akka.tcp://ghost@127.0.0.1:50351/user/sharding/user#-576014717]]*
^-- forwarding msg to a known-unreachable node
INFO 15:40:46,465 akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message [com.kixeye.common.cluster.UserBackend$UserMessageEnvelope] from Actor[akka://ghost/user/ghost-sheppard/module-com.kixeye.common.cluster.UserClientModule-8c0d9b97-8b4e-4988-a40b-91d054080afa#1779934259] to Actor[akka://ghost/deadLetters] was
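For reference, here is roughly what the setup described above looks like against the 2.3 contrib API (akka.contrib.pattern.ClusterSharding): backends start the region with entryProps set, the frontend passes None so it only forwards. This is a sketch under assumptions -- the UserEnvelope/UserActor names and the shard count are illustrative stand-ins, not taken from the original post.

```scala
import akka.actor.{ Actor, ActorSystem, Props }
import akka.contrib.pattern.{ ClusterSharding, ShardRegion }

// Hypothetical entry actor; the real one lives in com.kixeye's codebase.
class UserActor extends Actor {
  def receive = { case msg => sender() ! msg }
}

object UserSharding {
  // Hypothetical envelope used for routing; not from the original post.
  case class UserEnvelope(userId: Long, payload: Any)

  val idExtractor: ShardRegion.IdExtractor = {
    case m @ UserEnvelope(id, _) => (id.toString, m)
  }

  val shardResolver: ShardRegion.ShardResolver = {
    case UserEnvelope(id, _) => (id % 10).toString // assume 10 shards
  }

  // Backends host entries...
  def startBackend(system: ActorSystem): Unit =
    ClusterSharding(system).start(
      typeName = "user",
      entryProps = Some(Props[UserActor]),
      idExtractor = idExtractor,
      shardResolver = shardResolver)

  // ...while the frontend starts a proxy-only region (entryProps = None)
  // that forwards to wherever the coordinator says the shard lives.
  def startFrontend(system: ActorSystem): Unit =
    ClusterSharding(system).start(
      typeName = "user",
      entryProps = None,
      idExtractor = idExtractor,
      shardResolver = shardResolver)
}
```

The failover question then comes down to whether the frontend's region ever learns that shards [0], [2], [4] have moved off the dead node.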
Re: [akka-user] akka cluster seed nodes on ec2
Hi Tim, We have a similar-ish setup over here. We ended up registering all nodes in ZooKeeper and doing discovery through that. This works well for initial cluster startup as well as for nodes joining an existing cluster. It also works well for hybrid environments that are not all on AWS.

On Tue, Feb 11, 2014 at 11:07 AM, Patrik Nordwall patrik.nordw...@gmail.com wrote:

Hi Tim, You can use any nodes as seed nodes, except when you start up a fresh cluster from scratch. When you start a node you can use the AWS API to discover other EC2 instances and use all or a few as seed nodes. The special case is when starting a new cluster, and then I imagine that you can mark one node as special using AWS metadata, or a special argument in the start script. This special node must put itself first in the list of seed nodes. In all other cases a node should not include itself as a seed node. The reason for the special first seed node is to avoid creating several separate clusters when starting from an empty cluster. I would be interested in understanding why this would not be a feasible solution. Cheers, Patrik

On Tue, Feb 11, 2014 at 7:20 PM, Timothy Perrett timo...@getintheloop.eu wrote:

Hey Roland - doesn't "always on" sound like "will never fail"...? I wouldn't be comfortable manually managing that cluster, so it would need an ASG, which in turn will remove the oldest node when shrinking, so over time you'd end up losing all your original seed nodes if your cluster is expanding and contracting enough.

On Sunday, 9 February 2014 23:35:20 UTC-8, rkuhn wrote:

Hi Tim, you could have one small group which is always on and acts as entry point (seed nodes) while the rest are auto-scaling. Would that solve the issue? Regards, Roland

On 6 Feb 2014, at 23:12, Timothy Perrett tim...@getintheloop.eu wrote:

Scott, What did you ever end up doing about this?
As yet, I have not seen any decent story for auto-scaling akka-cluster on AWS; it strikes me that over a long enough period all the seed nodes could be reaped by the ASG, but there should be enough folks wanting to do this that there would be a decent solution for it. Cheers, Tim

On Thursday, 24 October 2013 01:09:08 UTC-7, Patrik Nordwall wrote:

Hi Scott, Using the AWS API together with Cluster(system).joinSeedNodes should be possible. The first seed node must be marked somehow. From the docs: "You may also use Cluster(system).joinSeedNodes, which is attractive when dynamically discovering other nodes at startup by using some external tool or API. When using joinSeedNodes you should not include the node itself except for the node that is supposed to be the first seed node, and that should be placed first in the parameter to joinSeedNodes." Regards, Patrik

On Tue, Oct 22, 2013 at 1:36 AM, Ryan Tanner ryan@gmail.com wrote:

We have Vagrant auto-update the hosts file over ssh. Kinda clunky, but it works for small clusters.

On Monday, October 21, 2013 4:58:36 PM UTC-5, Scott Clasen wrote:

Anyone have good tricks for standing up an akka cluster on EC2 WRT seed nodes? Would like to be able to auto-scale it, so not having to manually join a node is necessary. Could use elastic IPs -- would rather not. Could maybe use a TCP ELB? Not sure that would work; would rather not. Guess I can use ZK, which I already have stood up... What weird things can happen if a node attempts to join a cluster via a seed that has been partitioned away from the cluster?

-- Read the docs: http://akka.io/docs/ Check the FAQ: http://akka.io/faq/ Search the archives: https://groups.google.com/group/akka-user --- You received this message because you are subscribed to the Google Groups Akka User List group. To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com. To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/groups/opt_out.

-- Patrik Nordwall, Typesafe (http://typesafe.com/) - Reactive apps on the JVM. Twitter: @patriknw

*Dr. Roland Kuhn* *Akka Tech Lead* Typesafe (http://typesafe.com/) - Reactive apps on the JVM. Twitter: @rolandkuhn
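The rule Patrik describes -- only the designated first seed node includes itself, and it puts itself first; everyone else excludes itself -- can be captured in a small pure helper before handing the result to Cluster(system).joinSeedNodes. This is a sketch: the helper name and the plain-string address representation are my own assumptions, not an Akka API.

```scala
object SeedNodes {
  // Seed-node ordering rule from the thread (helper name is hypothetical):
  // - the designated first seed node includes itself, placed first;
  // - every other node excludes itself from the list it joins with.
  // Addresses are plain strings here; a real node would parse them
  // with akka.actor.AddressFromURIString before calling joinSeedNodes.
  def seedsFor(self: String, discovered: List[String], firstSeed: String): List[String] =
    if (self == firstSeed) self :: discovered.filterNot(_ == self)
    else discovered.filterNot(_ == self)
}
```

With EC2 discovery, `discovered` would come from the AWS API (e.g. instances tagged as cluster members), and the "first seed" marker from instance metadata or a start-script argument, as Patrik suggests.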
Re: [akka-user] clean cluster exit
Thanks for the followup, Roland.

Bonus followup round! This original issue came up while looking at the cluster sharding support in 2.3.0-RC2. We like to use ephemeral cluster nodes with random Akka ports. After the cluster shuts down, the first node that comes up and becomes ShardCoordinator recovers the previous journal and tries to reconnect to the regions recorded in the journal. These not only don't exist, they never will, due to the randomized ports. I had expected that when the ShardCoordinator shuts down without a handoff, it would wipe out its internal state. It's hard to tell whether this is an issue in cluster sharding, in ClusterSingletonManager in general, or caveat emptor when using ephemeral ports.

On Tue, Feb 11, 2014 at 2:36 AM, Akka Team akka.offic...@gmail.com wrote:

Hi James, the sequence you describe makes perfect sense to me, and the ClusterSingletonManager tries to be overly thorough here; so much so that I would call it a bug (https://www.assembla.com/spaces/ddEDvgVAKr3QrUeJe5aVNr/tickets/3869). Thanks for reporting! Regards, Roland

On Mon, Feb 10, 2014 at 9:34 PM, James Bellenger ja...@kixeye.com wrote:

Hi gang. What is the process for a node to gracefully exit a cluster? Nodes in our system go through this sequence:
- JVM gets the shutdown signal
- node calls cluster.leave(cluster.selfAddress)
- node waits until it sees MemberRemoved with its own address
- node gives singletons a grace period to migrate
- actor system is shut down
- JVM exits

This *feels* correct, but the docs (http://doc.akka.io/docs/akka/2.3.0-RC2/scala/cluster-usage.html#Leaving) are fuzzy on when the node can drop out. Moreover, ClusterSingletonManager has a hard time with this flow. Especially for 1-node clusters, it tries to hand over to a non-existent peer, fails, and then fails harder when it is restarted and the cluster service is no longer running. Is there a better way for nodes to leave the cluster? Logs below.
INFO 12:19:40,586 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Marked address [akka.tcp://ghost@127.0.0.1:50570] as [Leaving]
INFO 12:19:41,355 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Leader is moving node [akka.tcp://ghost@127.0.0.1:50570] to [Exiting]
INFO 12:19:41,356 com.kixeye.common.cluster.ClusterModule - member removed: leave completed!
INFO 12:19:41,362 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Shutting down...
INFO 12:19:41,371 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Successfully shut down
INFO 12:19:41,374 akka.contrib.pattern.ClusterSingletonManager - Exited [akka.tcp://ghost@127.0.0.1:50570]
INFO 12:19:41,376 akka.contrib.pattern.ClusterSingletonManager - Oldest observed OldestChanged: [akka.tcp://ghost@127.0.0.1:50570 - None]
INFO 12:19:41,381 akka.contrib.pattern.ClusterSingletonManager - ClusterSingletonManager state change [Oldest - WasOldest]
INFO 12:19:41,396 akka.actor.LocalActorRef - Message [akka.cluster.ClusterEvent$LeaderChanged] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon/autoDown#2017004581] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,396 akka.actor.LocalActorRef - Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://ghost/system/cluster/core/daemon/heartbeatSender#1919962524] to Actor[akka://ghost/system/cluster/core/daemon/heartbeatSender#1919962524] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,397 akka.actor.LocalActorRef - Message [akka.cluster.ClusterEvent$RoleLeaderChanged] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon/autoDown#2017004581] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,397 akka.actor.LocalActorRef - Message [akka.cluster.ClusterEvent$SeenChanged] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon/autoDown#2017004581] was not delivered. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,398 akka.actor.LocalActorRef - Message [akka.cluster.InternalClusterAction$Unsubscribe] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon#1571353727] was not delivered. [5] dead letters encountered. This logging can be turned off
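The leave sequence described in this thread can be sketched as a small watcher actor that triggers system shutdown only once the node's own removal is observed. A rough sketch against the Akka 2.3 API -- the actor name and wiring are mine, not from the post, and a real version would also bound the wait with a timeout:

```scala
import akka.actor.{ Actor, ActorSystem, Props }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.MemberRemoved

// Waits for this node's own MemberRemoved event, then stops the actor system.
class GracefulLeaver extends Actor {
  val cluster = Cluster(context.system)

  override def preStart(): Unit =
    cluster.subscribe(self, classOf[MemberRemoved])

  def receive = {
    case MemberRemoved(member, _) if member.address == cluster.selfAddress =>
      // Our removal has been confirmed by the cluster; safe to shut down.
      context.system.shutdown()
    case _ => // ignore CurrentClusterState and other members' events
  }
}

object GracefulLeaver {
  // Called from the JVM shutdown hook: start the watcher, then ask to leave.
  def leave(system: ActorSystem): Unit = {
    system.actorOf(Props[GracefulLeaver], "graceful-leaver")
    Cluster(system).leave(Cluster(system).selfAddress)
  }
}
```

Note this only covers cluster membership; as the thread shows, singleton handover (and shard coordinator state) may still need a grace period, or may fail outright on 1-node clusters.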
[akka-user] clean cluster exit
Hi gang. What is the process for a node to gracefully exit a cluster? Nodes in our system go through this sequence:
- JVM gets the shutdown signal
- node calls cluster.leave(cluster.selfAddress)
- node waits until it sees MemberRemoved with its own address
- node gives singletons a grace period to migrate
- actor system is shut down
- JVM exits

This *feels* correct, but the docs (http://doc.akka.io/docs/akka/2.3.0-RC2/scala/cluster-usage.html#Leaving) are fuzzy on when the node can drop out. Moreover, ClusterSingletonManager has a hard time with this flow. Especially for 1-node clusters, it tries to hand over to a non-existent peer, fails, and then fails harder when it is restarted and the cluster service is no longer running. Is there a better way for nodes to leave the cluster? Logs below.

INFO 12:19:40,586 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Marked address [akka.tcp://ghost@127.0.0.1:50570] as [Leaving]
INFO 12:19:41,355 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Leader is moving node [akka.tcp://ghost@127.0.0.1:50570] to [Exiting]
INFO 12:19:41,356 com.kixeye.common.cluster.ClusterModule - member removed: leave completed!
INFO 12:19:41,362 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Shutting down...
INFO 12:19:41,371 com.kixeye.common.log.AkkaLogger - Cluster Node [akka.tcp://ghost@127.0.0.1:50570] - Successfully shut down
INFO 12:19:41,374 akka.contrib.pattern.ClusterSingletonManager - Exited [akka.tcp://ghost@127.0.0.1:50570]
INFO 12:19:41,376 akka.contrib.pattern.ClusterSingletonManager - Oldest observed OldestChanged: [akka.tcp://ghost@127.0.0.1:50570 - None]
INFO 12:19:41,381 akka.contrib.pattern.ClusterSingletonManager - ClusterSingletonManager state change [Oldest - WasOldest]
INFO 12:19:41,396 akka.actor.LocalActorRef - Message [akka.cluster.ClusterEvent$LeaderChanged] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon/autoDown#2017004581] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,396 akka.actor.LocalActorRef - Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://ghost/system/cluster/core/daemon/heartbeatSender#1919962524] to Actor[akka://ghost/system/cluster/core/daemon/heartbeatSender#1919962524] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,397 akka.actor.LocalActorRef - Message [akka.cluster.ClusterEvent$RoleLeaderChanged] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon/autoDown#2017004581] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,397 akka.actor.LocalActorRef - Message [akka.cluster.ClusterEvent$SeenChanged] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon/autoDown#2017004581] was not delivered. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,398 akka.actor.LocalActorRef - Message [akka.cluster.InternalClusterAction$Unsubscribe] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon#1571353727] was not delivered. [5] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:41,398 akka.actor.LocalActorRef - Message [akka.cluster.InternalClusterAction$Unsubscribe] from Actor[akka://ghost/deadLetters] to Actor[akka://ghost/system/cluster/core/daemon#1571353727] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO 12:19:42,395 akka.contrib.pattern.ClusterSingletonManager - Retry [1], sending TakeOverFromMe to [None]
INFO 12:19:43,415 akka.contrib.pattern.ClusterSingletonManager - Retry [2], sending TakeOverFromMe to [None]
INFO 12:19:44,435 akka.contrib.pattern.ClusterSingletonManager - Retry [3], sending TakeOverFromMe to [None]
INFO 12:19:45,455 akka.contrib.pattern.ClusterSingletonManager - Retry [4], sending TakeOverFromMe to [None]
INFO 12:19:46,475 akka.contrib.pattern.ClusterSingletonManager - Retry [5], sending TakeOverFromMe to [None]
ERROR 12:19:47,517 akka.actor.OneForOneStrategy - Expected hand-over to [None] never occured akka.contrib.pattern.ClusterSingletonManagerIsStuck: Expected hand-over to [None] never occured at