Hi,

We have a "MonitoringMaster" actor system and N "Metrics" actor systems.
They are deployed on AWS, and to make this work we substitute the
public IP at runtime.

Akka version: 2.2.4 (we can't upgrade to 2.3.x due to a protobuf dependency)

Config file:
akka {
  loglevel = INFO
  log-config-on-start = on
  debug {
    receive = on
    lifecyle = off
  }
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    log-remote-lifecycle-events = INFO
    netty.tcp {
      hostname = "127.0.0.1" // but we substitute the real IP at runtime
    }
    secure-cookie = "#####"
    require-cookie = on
  }
}

remote {
  untrusted-mode = on
  log-received-messages = off
}
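
The runtime hostname substitution mentioned in the config comment above could look roughly like the following. This is only a minimal sketch under assumptions, not necessarily the exact code in use: the names `RemoteConfig`, `withPublicHostname`, `MetricsMain` and the `PUBLIC_HOSTNAME` environment variable are hypothetical, and only the setting `akka.remote.netty.tcp.hostname` and the system name "Metrics" come from the config and logs in this post.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.{Config, ConfigFactory}

object RemoteConfig {
  // Returns the default config (application.conf plus reference.conf) with
  // only the remoting hostname replaced by the given value.
  def withPublicHostname(host: String): Config =
    ConfigFactory
      .parseString("akka.remote.netty.tcp.hostname = \"" + host + "\"")
      .withFallback(ConfigFactory.load())
}

object MetricsMain extends App {
  // Assumption: the public address is handed to the process through an
  // environment variable; any other lookup would work the same way.
  val publicHost = sys.env.getOrElse("PUBLIC_HOSTNAME", "127.0.0.1")

  val system = ActorSystem("Metrics", RemoteConfig.withPublicHostname(publicHost))
}
```

The override is layered on top of the file-based config with `withFallback`, so every other setting shown above stays as written.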

Everything works fine with fewer than 10 clients. The problem starts when
more than 10 clients are "connecting" to the master (sometimes at 11,
sometimes at 15, ...).
In that case we observe a cascade of exceptions, and it affects all Metrics
systems:


*MonitoringMaster*:
[INFO] [07/14/2014 15:02:06.386] 
[MonitoringMaster-akka.actor.default-dispatcher-3] 
[akka://MonitoringMaster/user/master] Added producer 
Actor[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552/user/metric-producer#-1020796025]
 
with meta InstanceMeta(InstanceGlobalId(us-east-1,i-14ffd83e),XXXX)
[WARN] [07/14/2014 15:03:03.023] 
[MonitoringMaster-akka.actor.default-dispatcher-19] 
[akka://MonitoringMaster/system/remote-watcher] Detected unreachable: 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552]
[INFO] [07/14/2014 15:03:03.048] 
[MonitoringMaster-akka.actor.default-dispatcher-3] [Remoting] Address 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552] is now 
quarantined, all messages to this address will be delivered to dead letters.
[WARN] [07/14/2014 15:03:03.060] 
[MonitoringMaster-akka.actor.default-dispatcher-3] 
[akka://MonitoringMaster/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FMetrics%40ec2-54-88-77-195.compute-1.amazonaws.com%3A2552-1866/endpointWriter]
 
AssociationError 
[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551] -> 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552]: Error 
[Invalid address: 
akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552] [
akka.remote.InvalidAssociation: Invalid address: 
akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The 
remote system has a UID that has been quarantined. Association aborted.
]
[WARN] [07/14/2014 15:03:03.061] 
[MonitoringMaster-akka.actor.default-dispatcher-3] [Remoting] Tried to 
associate with unreachable remote address 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552]. Address 
is now gated for 60000 ms, all messages to this address will be delivered 
to dead letters. Reason: The remote system has a UID that has been 
quarantined. Association aborted.
[ERROR] [07/14/2014 15:03:06.205] 
[MonitoringMaster-akka.actor.default-dispatcher-19] 
[akka://MonitoringMaster/system/endpointManager/endpointWriter-akka.tcp%3A%2F%2FMetrics%40ec2-54-88-77-195.compute-1.amazonaws.com%3A2552-1867]
 
AssociationError 
[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551] <- 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552]: Error 
[Invalid address: 
akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552] [
akka.remote.InvalidAssociation: Invalid address: 
akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The 
remote system has quarantined this system. No further associations to the 
remote system are possible until this system is restarted.
]
[WARN] [07/14/2014 15:03:06.205] 
[MonitoringMaster-akka.actor.default-dispatcher-19] [Remoting] Tried to 
associate with unreachable remote address 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552]. Address 
is now gated for 60000 ms, all messages to this address will be delivered 
to dead letters. Reason: The remote system has quarantined this system. No 
further associations to the remote system are possible until this system is 
restarted.



Sometimes I also see this exception:
[ERROR] [07/14/2014 15:02:47.544] 
[MonitoringMaster-akka.actor.default-dispatcher-12] [Remoting] Error 
encountered while processing system message acknowledgement [2, 3] ACK[2, 
{1, 0}] (akka.remote.transport.Transport$InvalidAssociationException)



*Metrics*:
2014-07-14 15:02:06,381  INFO [Metrics-akka.actor.default-dispatcher-17] 
d.e.m.MetricProducerActor - Successfully connected to master 
Actor[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551/user/master#-530936949]
2014-07-14 15:03:01,174  WARN [Metrics-akka.actor.default-dispatcher-15] 
a.r.RemoteWatcher - Detected unreachable: 
[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551]
2014-07-14 15:03:01,174  INFO [Metrics-akka.actor.default-dispatcher-15] 
Remoting - Address 
[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551] is 
now quarantined, all messages to this address will be delivered to dead 
letters.
2014-07-14 15:03:01,176 ERROR [Metrics-akka.actor.default-dispatcher-17] 
a.a.OneForOneStrategy - Master terminated, need to reconnect
java.lang.RuntimeException: Master terminated, need to reconnect //Got 
Terminated message
at 
xxx.xxx.monitoring.MetricProducerActor$$anonfun$connected$1.applyOrElse(MetricProducerActor.scala:81)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at 
akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:45)
at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:338)
at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:470)
at akka.actor.ActorCell.invoke(ActorCell.scala:455)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:385)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2014-07-14 15:03:06,204  WARN [Metrics-akka.actor.default-dispatcher-2] 
a.r.EndpointWriter - AssociationError 
[akka.tcp://metr...@ec2-54-88-77-195.compute-1.amazonaws.com:2552] -> 
[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551]: 
Error [Invalid address: 
akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551] [
akka.remote.InvalidAssociation: Invalid address: 
akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The 
remote system has a UID that has been quarantined. Association aborted.
]
2014-07-14 15:03:06,204  WARN [Metrics-akka.actor.default-dispatcher-2] 
Remoting - Tried to associate with unreachable remote address 
[akka.tcp://monitoringmas...@ec2-54-82-6-7.compute-1.amazonaws.com:2551]. 
Address is now gated for 60000 ms, all messages to this address will be 
delivered to dead letters. Reason: The remote system has a UID that has 
been quarantined. Association aborted.


Why does this happen? Our Metrics actor tries to reconnect to
MonitoringMaster, but after the master is resolved successfully it becomes
unreachable.
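
For context, the connect/watch/reconnect behaviour that the Metrics stack trace suggests can be sketched as below. This is a hypothetical reconstruction, not the real `MetricProducerActor`: the class name `MetricProducerSketch` and the `masterPath` parameter are invented for illustration, and only the exception message, the log message, the `connected` behaviour name, and the use of `Terminated` are taken from the log output above. It assumes the default supervisor strategy, which restarts the actor when it throws.

```scala
import akka.actor.{Actor, ActorIdentity, ActorLogging, ActorRef, Identify, Terminated}

// Hypothetical sketch; masterPath is the full remote path of the master actor.
class MetricProducerSketch(masterPath: String) extends Actor with ActorLogging {

  // Runs on first start and again after each restart, because the default
  // postRestart calls preStart.
  override def preStart(): Unit =
    context.actorSelection(masterPath) ! Identify(masterPath)

  def receive: Receive = connecting

  def connecting: Receive = {
    case ActorIdentity(`masterPath`, Some(master)) =>
      log.info("Successfully connected to master {}", master)
      context.watch(master) // remote death watch on the master
      context.become(connected(master))
    case ActorIdentity(`masterPath`, None) =>
      log.warning("Master not found at {}", masterPath)
  }

  def connected(master: ActorRef): Receive = {
    case Terminated(`master`) =>
      // Failing here hands control to the supervisor, which restarts this
      // actor so that preStart resolves the master again.
      throw new RuntimeException("Master terminated, need to reconnect")
  }
}
```

In this sketch the `Terminated` message arrives as soon as the remote watcher marks the master's address unreachable, which matches the order of the Metrics log lines above.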


Regards,
Vitaliy
