Hello,

I'm trying to understand why nodes get quarantined and how to fix it. I'm 
using Akka 2.3.11. On the quarantined node I see this logging:

12:45:44.204 ERROR [geyser-akka.remote.default-remote-dispatcher-6] 
a.r.EndpointWriter - AssociationError 
[akka.tcp://geyser@172.16.120.174:7000] <- 
[akka.tcp://geyser@172.17.100.105:7000]: Error [Invalid address: 
akka.tcp://geyser@172.17.100.105:7000] [
akka.remote.InvalidAssociation: Invalid address: 
akka.tcp://geyser@172.17.100.105:7000
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The 
remote system has quarantined this system. No further associations to the 
remote system are possible until this system is restarted.
]
12:45:44.205 WARN  [geyser-akka.remote.default-remote-dispatcher-25] 
Remoting - Tried to associate with unreachable remote address 
[akka.tcp://geyser@172.17.100.105:7000]. Address is now gated for 5000 ms, 
all messages to this address will be delivered to dead letters. Reason: 
[The remote system has quarantined this system. No further associations to 
the remote system are possible until this system is restarted.]

And on the node that caused the other node to be quarantined, I see this logging:

12:45:44.194 WARN  [geyser-akka.remote.default-remote-dispatcher-6] 
Remoting - Association to [akka.tcp://geyser@172.16.120.174:7000] having 
UID [-450748474] is irrecoverably failed. UID is now quarantined and all 
messages to this UID will be delivered to dead letters. Remote actorsystem 
must be restarted to recover from this situation.
12:45:44.202 WARN  [geyser-akka.remote.default-remote-dispatcher-7] 
a.r.EndpointWriter - AssociationError 
[akka.tcp://geyser@172.17.100.105:7000] -> 
[akka.tcp://geyser@172.16.120.174:7000]: Error [Invalid address: 
akka.tcp://geyser@172.16.120.174:7000] [
akka.remote.InvalidAssociation: Invalid address: 
akka.tcp://geyser@172.16.120.174:7000
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The 
remote system has a UID that has been quarantined. Association aborted.
]
12:45:44.203 WARN  [geyser-akka.remote.default-remote-dispatcher-7] 
Remoting - Tried to associate with unreachable remote address 
[akka.tcp://geyser@172.16.120.174:7000]. Address is now gated for 5000 ms, 
all messages to this address will be delivered to dead letters. Reason: 
[The remote system has a UID that has been quarantined. Association 
aborted.]
12:45:44.221 ERROR [geyser-akka.remote.default-remote-dispatcher-7] 
Remoting - Association to [akka.tcp://geyser@172.16.120.174:7000] with UID 
[-450748474] irrecoverably failed. Quarantining address.
java.lang.IllegalStateException: Error encountered while processing system message acknowledgement buffer: [-1 {}] ack: ACK[6, {}]
        at akka.remote.ReliableDeliverySupervisor$$anonfun$receive$1.applyOrElse(Endpoint.scala:288) ~[geyser.jar:1.1.17-SNAPSHOT]
        at akka.actor.Actor$class.aroundReceive(Actor.scala:467) ~[geyser.jar:1.1.17-SNAPSHOT]
Caused by: java.lang.IllegalArgumentException: Highest SEQ so far was -1 but cumulative ACK is 6
        at akka.remote.AckedSendBuffer.acknowledge(AckedDelivery.scala:103) ~[geyser.jar:1.1.17-SNAPSHOT]
        at akka.remote.ReliableDeliverySupervisor$$anonfun$receive$1.applyOrElse(Endpoint.scala:284) ~[geyser.jar:1.1.17-SNAPSHOT]
        ... 11 common frames omitted
12:45:44.221 WARN  [geyser-akka.remote.default-remote-dispatcher-7] 
Remoting - Association to [akka.tcp://geyser@172.16.120.174:7000] having 
UID [-450748474] is irrecoverably failed. UID is now quarantined and all 
messages to this UID will be delivered to dead letters. Remote actorsystem 
must be restarted to recover from this situation.

Quite a bit of data can pass between the nodes (~200 Mb/sec), so the system 
may be hitting a capacity limit, although I don't see any issue with CPU or 
memory. I noticed that the default-remote-dispatcher only has two threads. 
Are these threads used to send the data? If so, should I try increasing the 
thread count? Are there any other settings I could adjust, or things I can 
look for in the logs, that might highlight what is wrong?
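
To make the thread-count question concrete, this is the kind of override I 
have in mind for application.conf. It is only a sketch: I'm assuming the 
dispatcher is defined at akka.remote.default-remote-dispatcher with a 
fork-join-executor, as I understand it from the akka-remote reference.conf, 
and the value 4 is an arbitrary placeholder that I have not tested.

```
# Hypothetical override; path assumed from akka-remote's reference.conf
akka.remote.default-remote-dispatcher {
  fork-join-executor {
    # Raise both bounds together so the pool size is actually 4
    # (placeholder value, not a recommendation)
    parallelism-min = 4
    parallelism-max = 4
  }
}
```

I don't know whether a larger pool would help here, since the quarantine 
is triggered by the system message acknowledgement error shown above and 
not by an obvious timeout.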

Thanks,
Ben
