Hi - 

I'm being bitten by a client/server application in which multiple clients
subscribe for updates from a server: after a long GC pause on the server,
the clients are being quarantined. Here are some client logs from an
attempt to rediscover the server actor using "server ? Identify", which
times out. I can see that this is because the server has quarantined the
client.

24-Oct-2014 11:30:28:310: [(akka)Remoting - WARNING] [31]: Tried to 
associate with unreachable remote address 
[akka.tcp://gekkoremot...@loypcore4.intra.gsacapital.com:35411]. Address is 
now gated for 5000 ms, all messages to this address will be delivered to 
dead letters. Reason: The remote system has quarantined this system. No 
further associations to the remote system are possible until this system is 
restarted.


I can tell from the server's GC log when the original disconnect happened
(it's not a guess, the times line up perfectly):

1.611: [Full GC (Metadata GC Threshold)  87M->14M(52M), 0.1731620 secs]
7.622: [Full GC (Metadata GC Threshold)  175M->85M(220M), 0.4527592 secs]
8658.035: [Full GC (Metadata GC Threshold)  3638M->419M(4192M), 1.8916884 secs]
244257.679: [Full GC (Allocation Failure)  6516M->2622M(14G), 9.2884735 secs]
*391857.856: [Full GC (Allocation Failure)  7390M->3758M(12G), 13.7533193 secs]*
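
For anyone wanting to repeat the cross-check, here is a minimal sketch in plain Scala (no Akka dependency) that pulls the Full GC pauses out of a log in the format above and keeps the ones longer than a given threshold. The object and method names are hypothetical, each log entry is assumed to sit on a single line (the wrapping above comes from the mail client), and comparing a pause against a fixed number of seconds is only a rough stand-in for what the failure detectors actually compute.

```scala
object GcPauseCheck {
  // Matches single-line entries such as:
  // 391857.856: [Full GC (Allocation Failure)  7390M->3758M(12G), 13.7533193 secs]
  private val FullGc =
    """^([\d.]+): \[Full GC \(([^)]+)\).*?, ([\d.]+) secs\]$""".r

  // uptimeSecs: JVM uptime at the start of the collection
  // cause:      text inside the parentheses, e.g. "Allocation Failure"
  // pauseSecs:  reported duration of the collection
  final case class Pause(uptimeSecs: Double, cause: String, pauseSecs: Double)

  // Returns None for any line that is not a Full GC entry in this format.
  def parse(line: String): Option[Pause] = line.trim match {
    case FullGc(uptime, cause, secs) =>
      Some(Pause(uptime.toDouble, cause, secs.toDouble))
    case _ => None
  }

  // Keeps only the pauses strictly longer than thresholdSecs.
  def longPauses(lines: Seq[String], thresholdSecs: Double): Seq[Pause] =
    lines.flatMap(parse).filter(_.pauseSecs > thresholdSecs)
}
```

Calling `GcPauseCheck.longPauses(lines, 10.0)` on the five entries above (one per line, emphasis asterisks removed) keeps only the 13.75-second pause; 10 seconds is the server's `transport-failure-detector.acceptable-heartbeat-pause` from the configuration below.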


The server's Akka logs show the following at that time (the server
quarantines my client):

[WARN] [10/24/2014 10:57:22.416] 
[gekkoRemoting-akka.remote.default-remote-dispatcher-7] 
[akka.tcp://gekkoremot...@loypcore4.intra.gsacapital.com:35411/system/remote-watcher]
 
Detected unreachable: [akka.tcp://papagui@10.210.50.92:60091]
[WARN] [10/24/2014 10:57:22.416] 
[gekkoRemoting-akka.remote.default-remote-dispatcher-11015] [Remoting] 
Association to [akka.tcp://papagui@10.210.50.92:60091] having UID 
[1196983173] is irrecoverably failed. UID is now quarantined and all 
messages to this UID will be delivered to dead letters. Remote actorsystem 
must be restarted to recover from this situation.


My question is "how on earth do I code against this?" Currently I have a
"canary" which creates the client connection. The client connection picks
up the Terminated message from the server and stops itself; the canary
picks up that termination and, after a pause of a few minutes, spins up a
new client to attempt to connect to the server. Except the new client
cannot connect, because the client's actor system has been quarantined by
the server. How is my client supposed to know this? There's no "you've
been quarantined" callback, just a timeout when looking up the server. Do
I need to just assume that a failure to look up the server might indicate
a quarantine?
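
To make the question concrete, here is a rough sketch in plain Scala (no Akka dependency) of the decision I would like the canary to make. All names are hypothetical. It assumes the "Reason" text from the client-side warning quoted above can be handed to the canary as a string; how that text would reach application code is exactly the part I don't know, so treat that as an assumption rather than something Akka is known to provide.

```scala
object ReconnectPolicy {
  sealed trait Action
  // Wait, then create a new client actor in the existing ActorSystem.
  case object RetrySameSystem extends Action
  // Shut down and recreate the client's ActorSystem before reconnecting.
  case object RestartActorSystem extends Action

  // Fragment copied from the client-side warning quoted in this post.
  private val QuarantineMarker = "has quarantined this system"

  // True if the failure reason says the remote side quarantined us.
  def isQuarantined(reason: String): Boolean =
    reason != null && reason.contains(QuarantineMarker)

  // A quarantine needs a restart (per the warning text); anything else,
  // including an unknown reason, falls back to the existing retry.
  def nextAction(lastFailureReason: Option[String]): Action =
    lastFailureReason match {
      case Some(r) if isQuarantined(r) => RestartActorSystem
      case _                           => RetrySameSystem
    }
}
```

With the reason from the warning above, `ReconnectPolicy.nextAction(Some(reason))` yields `RestartActorSystem`; a plain lookup timeout with no reason yields `RetrySameSystem`, which is what the canary does today.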


Here's the server's remote configuration:

remote {
  log-remote-lifecycle-events = on
  retry-gate-closed-for = 5 s
  enabled-transports = ["akka.remote.netty.tcp"]
  netty.tcp {
    maximum-frame-size = 100 MiB
  }
  watch-failure-detector {
    acceptable-heartbeat-pause = 20 s
    heartbeat-interval = 5 s
  }
  transport-failure-detector {
    acceptable-heartbeat-pause = 10 s
    heartbeat-interval = 3 s
  }
}

Here's the client's remote configuration:

remote {
  log-remote-lifecycle-events = on
  gate-invalid-addresses-for = 5 s

  enabled-transports = ["akka.remote.netty.tcp"]
  netty.tcp {
    port = 0
    maximum-frame-size = 100 MiB
  }
  watch-failure-detector {
    acceptable-heartbeat-pause = 20 s
    heartbeat-interval = 5 s
  }
  transport-failure-detector {
    acceptable-heartbeat-pause = 12 s
    heartbeat-interval = 3 s
  }
}

Chris
