Hi Vitaliy,
> I also tried the scenario "establish remote connection without sending any
> application specific messages"... without any success.

That is strange...

> Finally I solved the problem with the protobuf dependency in our app and
> upgraded to Akka 2.3.4.
> Everything works as expected (tried with 43 Tomcats).

Good to hear that!

-Endre

> On Tuesday, July 22, 2014 11:03:00 AM UTC+3, Akka Team wrote:
>
>> Hi Vitaliy,
>>
>> Do you send large messages or send messages without backpressure? Since
>> 2.2.x does not prioritize internal heartbeat messages over user messages, it
>> can accumulate delay. You should try throttling or backpressuring your
>> remote sends first to see if that is the problem.
>>
>> -Endre
>>
>> On Mon, Jul 21, 2014 at 7:06 PM, Vitaliy Morarian <[email protected]>
>> wrote:
>>
>>> Hi Konrad,
>>>
>>> Master is an m1.xlarge instance. I checked CPU load: about 5-10%.
>>>
>>> Unfortunately we can't upgrade to 2.3.x due to the protobuf dependency (I
>>> tried, but ran into some reflection exceptions during Tomcat startup).
>>>
>>> On Friday, July 18, 2014 1:04:33 PM UTC+3, Konrad Malawski wrote:
>>>
>>>> Hi Vitaliy,
>>>> It seems the master is overloaded.
>>>> Do you have JVM monitoring in place to check whether it is in a state of
>>>> agony?
>>>>
>>>> In other news, 2.3.4 prioritises heartbeats, so missing heartbeat messages
>>>> (and thus false positives in failure detection) because of an overloaded
>>>> machine are less likely.
>>>> I would recommend trying to upgrade (maybe you can upgrade your proto
>>>> dependency somehow?).
>>>>
>>>> You can also tweak the failure detector's timeout, but if the master is
>>>> barely keeping up anyway, that's not really a solution.
>>>>
>>>> On Mon, Jul 14, 2014 at 5:53 PM, Vitaliy Morarian <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have "MonitoringMaster" actor system and N "Metrics" actor systems.
>>>>> They are deployed in AWS, and to make it work we substitute the
>>>>> public IP at runtime.
>>>>>
>>>>> Akka version: 2.2.4 (can't upgrade to 2.3.x due to the protobuf dependency)
>>>>>
>>>>> Config file:
>>>>> akka {
>>>>>   loglevel = INFO
>>>>>   log-config-on-start = on
>>>>>   debug {
>>>>>     receive = on
>>>>>     lifecyle = off
>>>>>   }
>>>>>   actor {
>>>>>     provider = "akka.remote.RemoteActorRefProvider"
>>>>>   }
>>>>>   remote {
>>>>>     enabled-transports = ["akka.remote.netty.tcp"]
>>>>>     log-remote-lifecycle-events = INFO
>>>>>     netty.tcp {
>>>>>       hostname = "127.0.0.1" //but we substitute a real IP in runtime
>>>>>     }
>>>>>     secure-cookie = "#####"
>>>>>     require-cookie = on
>>>>>   }
>>>>> }
>>>>>
>>>>> remote {
>>>>>   untrusted-mode = on
>>>>>   log-received-messages = off
>>>>> }
>>>>>
>>>>> Everything works fine when we have fewer than 10 clients. The problem
>>>>> starts to occur when more than 10 clients are "connecting" to the master
>>>>> (sometimes 11, sometimes 15, ...).
>>>>> In this case we observe a cascade of exceptions (and it affects all
>>>>> Metrics systems):
>>>>>
>>>>> *MonitoringMaster*:
>>>>> [INFO] [07/14/2014 15:02:06.386]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-3]
>>>>> [akka://MonitoringMaster/user/master] Added producer
>>>>> Actor[akka.tcp://[email protected]:2552/user/metric-producer#-1020796025]
>>>>> with meta InstanceMeta(InstanceGlobalId(us-east-1,i-14ffd83e),XXXX)
>>>>> [WARN] [07/14/2014 15:03:03.023]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-19]
>>>>> [akka://MonitoringMaster/system/remote-watcher] Detected unreachable:
>>>>> [akka.tcp://[email protected]:2552]
>>>>> [INFO] [07/14/2014 15:03:03.048]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-3]
>>>>> [Remoting] Address
>>>>> [akka.tcp://Metrics@ec2-54-88-77-195.compute-1.amazonaws.com:2552]
>>>>> is now quarantined, all messages to this address will be delivered to
>>>>> dead letters.
>>>>> [WARN] [07/14/2014 15:03:03.060]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-3]
>>>>> [akka://MonitoringMaster/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FMetrics%40ec2-54-88-77-195.compute-1.amazonaws.com%3A2552-1866/endpointWriter]
>>>>> AssociationError
>>>>> [akka.tcp://[email protected]ws.com:2551] ->
>>>>> [akka.tcp://Metrics@ec2-54-88-77-195.compute-1.amazonaws.com:2552]:
>>>>> Error [Invalid address:
>>>>> akka.tcp://[email protected]:2552] [
>>>>> akka.remote.InvalidAssociation: Invalid address:
>>>>> akka.tcp://[email protected]:2552
>>>>> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
>>>>> The remote system has a UID that has been quarantined. Association
>>>>> aborted.
>>>>> ]
>>>>> [WARN] [07/14/2014 15:03:03.061]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-3]
>>>>> [Remoting] Tried to associate with unreachable remote address
>>>>> [akka.tcp://[email protected]:2552]. Address is now gated for 60000 ms,
>>>>> all messages to this address will be delivered to dead letters.
>>>>> Reason: The remote system has a UID that has been quarantined.
>>>>> Association aborted.
>>>>> [ERROR] [07/14/2014 15:03:06.205]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-19]
>>>>> [akka://MonitoringMaster/system/endpointManager/endpointWriter-akka.tcp%3A%2F%2FMetrics%40ec2-54-88-77-195.compute-1.amazonaws.com%3A2552-1867]
>>>>> AssociationError
>>>>> [akka.tcp://MonitoringMaster@ec2-54-82-6-7.compute-1.amazonaws.com:2551] <-
>>>>> [akka.tcp://[email protected]:2552]: Error [Invalid address:
>>>>> akka.tcp://Metrics@ec2-54-88-77-195.compute-1.amazonaws.com:2552] [
>>>>> akka.remote.InvalidAssociation: Invalid address:
>>>>> akka.tcp://[email protected]:2552
>>>>> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
>>>>> The remote system has quarantined this system. No further associations to
>>>>> the remote system are possible until this system is restarted.
>>>>> ]
>>>>> [WARN] [07/14/2014 15:03:06.205]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-19]
>>>>> [Remoting] Tried to associate with unreachable remote address
>>>>> [akka.tcp://[email protected]:2552]. Address is now gated for 60000 ms,
>>>>> all messages to this address will be delivered to dead letters.
>>>>> Reason: The remote system has quarantined this system. No further
>>>>> associations to the remote system are possible until this system is
>>>>> restarted.
>>>>>
>>>>> Sometimes I also see this exception:
>>>>> [ERROR] [07/14/2014 15:02:47.544]
>>>>> [MonitoringMaster-akka.actor.default-dispatcher-12]
>>>>> [Remoting] Error encountered while processing system message
>>>>> acknowledgement [2, 3] ACK[2, {1, 0}]
>>>>> (akka.remote.transport.Transport$InvalidAssociationException)
>>>>>
>>>>> *Metrics*:
>>>>> 2014-07-14 15:02:06,381 INFO [Metrics-akka.actor.default-dispatcher-17]
>>>>> d.e.m.MetricProducerActor - Successfully connected to master
>>>>> Actor[akka.tcp://[email protected]azonaws.com:2551/user/master#-530936949]
>>>>> 2014-07-14 15:03:01,174 WARN [Metrics-akka.actor.default-dispatcher-15]
>>>>> a.r.RemoteWatcher - Detected unreachable:
>>>>> [akka.tcp://[email protected]:2551]
>>>>> 2014-07-14 15:03:01,174 INFO [Metrics-akka.actor.default-dispatcher-15]
>>>>> Remoting - Address
>>>>> [akka.tcp://MonitoringMaster@ec2-54-82-6-7.compute-1.amazonaws.com:2551]
>>>>> is now quarantined, all messages to this address will be delivered to
>>>>> dead letters.
>>>>> 2014-07-14 15:03:01,176 ERROR [Metrics-akka.actor.default-dispatcher-17]
>>>>> a.a.OneForOneStrategy - Master terminated, need to reconnect
>>>>> java.lang.RuntimeException: Master terminated, need to reconnect //Got Terminated message
>>>>>   at xxx.xxx.monitoring.MetricProducerActor$$anonfun$connected$1.applyOrElse(MetricProducerActor.scala:81)
>>>>>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>>>>   at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:45)
>>>>>   at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:338)
>>>>>   at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:470)
>>>>>   at akka.actor.ActorCell.invoke(ActorCell.scala:455)
>>>>>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>>>>   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>>>>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:385)
>>>>>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>> 2014-07-14 15:03:06,204 WARN [Metrics-akka.actor.default-dispatcher-2]
>>>>> a.r.EndpointWriter - AssociationError
>>>>> [akka.tcp://Metrics@ec2-54-88-77-195.compute-1.amazonaws.com:2552] ->
>>>>> [akka.tcp://MonitoringMaster@ec2-54-82-6-7.compute-1.amazonaws.com:2551]:
>>>>> Error [Invalid address: akka.tcp://[email protected]:2551] [
>>>>> akka.remote.InvalidAssociation: Invalid address:
>>>>> akka.tcp://[email protected]:2551
>>>>> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
>>>>> The remote system has a UID that has been quarantined. Association
>>>>> aborted.
>>>>> ]
>>>>> 2014-07-14 15:03:06,204 WARN [Metrics-akka.actor.default-dispatcher-2]
>>>>> Remoting - Tried to associate with unreachable remote address
>>>>> [akka.tcp://[email protected]:2551]. Address is now gated for 60000 ms,
>>>>> all messages to this address will be delivered to dead letters.
>>>>> Reason: The remote system has a UID that has been quarantined.
>>>>> Association aborted.
>>>>>
>>>>> I'm curious why this happens. Our Metrics actor tries to reconnect to
>>>>> MonitoringMaster, but after it is resolved successfully it becomes
>>>>> unreachable.
>>>>>
>>>>> Regards,
>>>>> Vitaliy
>>>>>
>>>>> --
>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Akka User List" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>> --
>>>> Cheers,
>>>> Konrad 'ktoso' Malawski
>>>> hAkker @ Typesafe
>>>>
>>>> <http://typesafe.com>
--
Akka Team
Typesafe - The software stack for applications that scale
Blog: letitcrash.com
Twitter: @akkateam
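
For readers who hit the same "Detected unreachable" / quarantine cascade on 2.2.x and cannot upgrade yet: Konrad's remark in the thread that the failure detector's timeout can be tweaked might look roughly like the sketch below. The key names are the remote death-watch failure detector settings as I recall them from akka-remote's reference.conf, so check them against the reference.conf of your Akka version. The values are illustrative assumptions, not recommendations from anyone in this thread.

```
akka.remote {
  watch-failure-detector {
    # How often heartbeats are sent to watched remote systems (assumed value).
    heartbeat-interval = 1 s
    # Higher threshold means fewer false positives but slower detection (assumed value).
    threshold = 12.0
    # How long heartbeats may be missing (GC pause, busy node) before the
    # remote system is marked unreachable (assumed value).
    acceptable-heartbeat-pause = 20 s
  }
}
```

As Konrad points out, this only hides the symptom if the master is barely keeping up. Throttling or backpressuring the remote sends, or upgrading to 2.3.4 where heartbeats are prioritised, addresses the cause.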
