Hi Endre,

I have read that part of the documentation, and after switching to a new 
Cassandra journal keyspace, everything is working as it should. I've 
confirmed that if I bump the "ECS Service" count to 2 (or greater), the ECS 
Task (describing an akka-cluster-sharding application) gets deployed on 
another ECS container instance. With the custom seed enumeration code I 
wrote (which queries the ECS cluster/service for the IP addresses of all 
the ECS container instances) and the custom code to set 
akka.remote.netty.tcp.hostname to the IP address of the container instance 
(rather than the Docker container's docker0 IP address), the cluster members 
all see each other and all nodes join the cluster.

For the (written) record, since others will undoubtedly run into the same 
issue as I did, here are the important things you have to do to get an 
akka-cluster application working on AWS/ECS.

1) You have to dynamically configure the seed nodes to the IP addresses of 
the ECS container instances in your cluster. The way I did this was as 
follows: I wrote a Scala library using the AWS Java SDK that, given the 
ECS Cluster ARN and Service Name, enumerates the tasks for the service and, 
for those tasks, enumerates the container instances on which the tasks are 
running. Given those container instances, the code determines the EC2 
instance IDs of the EC2 instances hosting them. Using the EC2 
DescribeInstances API, it then determines the IP address (private, in my 
case) of each EC2 instance. Finally, the IP addresses are mapped to the 
akka.tcp URLs required to configure the seed nodes.
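
As an illustration of the final mapping step only, here is a minimal Scala sketch. It is not the library described above: the ECS/EC2 lookups are left out, the names SeedNodes, toSeedNodeUrls and toHoconList are hypothetical, and the defaults "ClusterSystem" and 2552 are simply the values quoted elsewhere in this thread.

```scala
// Hypothetical helper: turns private IPs (already discovered via ECS/EC2)
// into akka.tcp seed-node URLs. The AWS SDK calls are deliberately omitted.
object SeedNodes {
  // Defaults mirror the system name and port used in this thread;
  // both are assumptions to adjust for your own application.
  def toSeedNodeUrls(
      ips: Seq[String],
      systemName: String = "ClusterSystem",
      port: Int = 2552): Seq[String] =
    // distinct + sorted: every node derives the same list, in the same
    // (lexicographic) order, from the same set of addresses.
    ips.distinct.sorted.map(ip => s"akka.tcp://${systemName}@${ip}:${port}")

  // Renders the URLs as a HOCON list value for akka.cluster.seed-nodes.
  def toHoconList(urls: Seq[String]): String =
    urls.map(u => "\"" + u + "\"").mkString("[", ", ", "]")
}
```

The list is sorted because Akka treats the first seed node specially, so it seemed safer for all nodes to agree on the ordering; whether that matters for a given deployment is a judgment call.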

2) You have to dynamically configure akka.remote.netty.tcp.hostname to 
be the IP address of the ECS container instance on which your Docker 
container is running. With no customizations, akka will set this to the 
container's IP address on the docker0 interface, which is a private IP 
address not accessible from other akka cluster members. Since there doesn't 
appear to be a way, on AWS ECS, for a Docker container to determine the IP 
address of the Docker host (ECS container instance), I "cheated": I used the 
metadata URL that all EC2 instances support to query the IP address 
(http://169.254.169.254/latest/meta-data/local-ipv4).
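
A minimal Scala sketch of that lookup follows, under a few assumptions: the token-less metadata endpoint is reachable from inside the container, and the names AdvertisedHost, httpGet, hostIp and overrides are hypothetical, not part of Akka or the AWS SDK. The bind-hostname value of 0.0.0.0 is the one mentioned in the quoted message below.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

// Hypothetical helper for resolving the address Akka should advertise.
object AdvertisedHost {
  // EC2 instance metadata endpoint from the text (plain GET, no token).
  val MetadataUrl = "http://169.254.169.254/latest/meta-data/local-ipv4"

  // Plain HTTP GET with short timeouts so startup fails fast off EC2.
  def httpGet(url: String): String = {
    val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    conn.setConnectTimeout(2000)
    conn.setReadTimeout(2000)
    try {
      val src = Source.fromInputStream(conn.getInputStream, "UTF-8")
      try src.mkString finally src.close()
    } finally conn.disconnect()
  }

  // The fetch function is a parameter so the lookup can be stubbed.
  def hostIp(fetch: String => String = httpGet): String =
    fetch(MetadataUrl).trim

  // Settings to layer on top of the regular configuration.
  def overrides(ip: String): Map[String, String] = Map(
    "akka.remote.netty.tcp.hostname" -> ip,
    "akka.remote.netty.tcp.bind-hostname" -> "0.0.0.0"
  )
}
```

One way to apply the result, again as an assumption rather than a recommendation, is to set each entry as a JVM system property before the configuration is loaded and the ActorSystem is created.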

-- Eric

On Tuesday, March 22, 2016 at 3:06:59 AM UTC-7, Akka Team wrote:
>
> Hi Eric,
>
> I have no experience with Docker at all, but it does feel wrong (unless 
> very specific use-cases) to have separate journals and snapshot stores 
> *per-node*. I think you might have an issue with Docker NAT. Have you read 
> this part of the documentation: 
> http://doc.akka.io/docs/akka/2.4.2/scala/remoting.html#Akka_behind_NAT_or_in_a_Docker_container
>
> -Endre
>
> On Sat, Mar 19, 2016 at 3:27 AM, Eric Swenson <er...@swenson.org> wrote:
>
>> Well, I may be able to answer my own question. It absolutely does matter 
>> that the new remote cluster system is using the same akka-persistence 
>> (cassandra keyspace) store as the old (local) one.  When I changed the 
>> cassandra keyspace to a new one, everything started working.
>>
>> So the question now is this: Do I have to have a separate 
>> akka-persistence journal and snapshot store for every node in the cluster?  
>> This is very inconvenient, as it means I have to make up keyspace names 
>> that are somehow tied to each individual node.  I guess I can add the ip 
>> address of the Docker host to the keyspace name, but this isn’t terribly 
>> resilient.  Why does akka-persistence care? The journal should reflect 
>> events that apply to all nodes. If a node goes down (getting a new address) 
>> and a new one takes its place, it should be able to recover all the events 
>> from the old node.  There must be something else at play here.
>>
>> Help!  
>>
>> — Eric
>>
>> On Mar 18, 2016, at 7:17 PM, Eric Swenson <er...@swenson.org> wrote:
>>
>> One more thing to add to this, in case it is relevant.  I see several of 
>> these messages in the log:
>>
>> [akka://ClusterSystem/system/sharding/ExperimentInstanceCoordinator/singleton/coordinator] stopped
>>
>>
>> First, why is it stopping (or why does it stop, in general), and 
>> second, is it significant that the URL prefix is akka://ClusterSystem/ 
>> rather than akka.tcp://ClusterSystem@10.0.3.170:2552/
>>
>> Also, I assume it makes no difference that I’m using the same 
>> akka-persistence journal/snapshot store as I used when the app was binding 
>> to 127.0.0.1:2552.  I get tons of log messages indicating that 
>> receiveRecover is not happy trying to recover shards associated with 
>> akka.tcp://ClusterSystem@127.0.0.1:2552/system/sharding/ExperimentInstance#-396422686.
>> I’m assuming this is expected and that akka-persistence should be able to 
>> deal with this case. It should fail to recover these and then carry on with 
>> new persistence events that are targeted to the new ClusterSystem on the 
>> new IP address.
>>
>> Examples of the rejected receiveRecover messages that I’m seeing are:
>>
>> [akka.tcp://ClusterSystem@10.0.3.170:2552/system/cassandra-journal] Starting message scan from 1
>> [DEBUG] [03/19/2016 02:09:02.712] [ClusterSystem-akka.actor.default-dispatcher-18] [akka.tcp://ClusterSystem@10.0.3.170:2552/system/sharding/ExperimentInstanceCoordinator/singleton/coordinator] receiveRecover ShardRegionRegistered(Actor[akka.tcp://ClusterSystem@127.0.0.1:2552/system/sharding/ExperimentInstance#-396422686])
>>
>> — Eric
>>
>> On Mar 18, 2016, at 6:54 PM, Eric Swenson <er...@swenson.org> wrote:
>>
>> I’ve been unsuccessful in trying to get an akka-cluster application that 
>> works fine with one instance to work when there are multiple members of the 
>> cluster.  A bit of background is in order:
>>
>> 1) the application is an akka-cluster-sharding application
>> 2) it runs in a docker container
>> 3) the cluster is composed of multiple docker hosts, each running the 
>> akka application
>> 4) the error I’m getting is this:
>>
>> [WARN] [03/19/2016 01:39:18.086] [ClusterSystem-akka.actor.default-dispatcher-3] [akka.tcp://ClusterSystem@10.0.3.170:2552/system/sharding/ExperimentInstance] Trying to register to coordinator at [Some(ActorSelection[Anchor(akka://ClusterSystem/), Path(/system/sharding/ExperimentInstanceCoordinator/singleton/coordinator)])], but no acknowledgement. Total [1] buffered messages.
>> [WARN] [03/19/2016 01:39:20.086] [ClusterSystem-akka.actor.default-dispatcher-3] [akka.tcp://ClusterSystem@10.0.3.170:2552/system/sharding/ExperimentInstance] Trying to register to coordinator at [Some(ActorSelection[Anchor(akka://ClusterSystem/), Path(/system/sharding/ExperimentInstanceCoordinator/singleton/coordinator)])], but no acknowledgement. Total [1] buffered messages.
>>
>> 5) the warning message is logged repeatedly and the cluster never 
>> initializes.
>> 6) I’ve set the following config parameters:
>> akka.remote.netty.tcp.hostname: to the actual host ip address (the one 
>> accessible from all the other docker hosts)
>> akka.remote.netty.tcp.bind-hostname: to 0.0.0.0 (so that it binds on the 
>> docker0 interface, on the ip address of the container)
>> akka.remote.netty.tcp.port: 2552
>> akka.remote.netty.tcp.bind-port: 2552
>> 7) when I start the container, I map port 2552 on the host to port 2552 
>> on the container.
>> 8) from the host, I’m able to do a “telnet <ip-address-of-host> 2552” so 
>> I should be talking to the akka-remoting handler.
>> 9) I’m setting akka.cluster.seed-nodes to a list of one element:  
>> akka.tcp://ClusterSystem@<ip-address-of-host>:2552.  I’m doing that 
>> because, as far as I know, this seed-node list is advertised to all the 
>> other members of the cluster (currently just the one) and must be 
>> accessible to them all.  The Docker container ip address (seen by the 
>> docker container) is on the private docker0 interface, which is not 
>> accessible from the outside (from outside the host).
>>
>> Now, before I tried all this, I set seed-nodes to a list of a single 
>> akka.tcp://ClusterSystem@127.0.0.1:2552 entry, and left 
>> akka.remote.netty.tcp.hostname, bind-hostname, port, and bind-port at their 
>> default values.  In this configuration, the cluster comes up perfectly fine 
>> and the application (with an akka-http frontend and a cluster sharding 
>> backend) works perfectly.  In fact, when I use two seed nodes of the same 
>> form, but different ports on the same local 127.0.0.1 host, and two 
>> instances of the app (each binding to a different port), the app works 
>> fine too.  The cluster comes up fine and each node finds the other.
>>
>> But in a real environment, the multiple instances will be on different 
>> nodes (different ip addresses), and in my case, as docker containers on 
>> different docker hosts.  So clearly, the seed nodes must have externally 
>> accessible ip addresses.  
>>
>> Can anyone shed any light on what might be going wrong?  
>>
>> How might I debug this?  Thanks. -- Eric
>>
>> -- 
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to akka-user+...@googlegroups.com.
>> To post to this group, send email to akka...@googlegroups.com.
>> Visit this group at https://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Akka Team
> Typesafe - Reactive apps on the JVM
> Blog: letitcrash.com
> Twitter: @akkateam
>
