Re: [akka-user] Cluster node reconnects

Björn Antonsson Tue, 04 Nov 2014 13:31:38 -0800

Hi,

The way the cluster currently works is that the unreachable node has to be 
removed (by downing it) before a system with the same address/port is 
allowed to join the cluster. If you set auto-down-unreachable-after to a low 
value and wait to restart the "crashed" node until you see the master setting 
it to DOWN, does it work then?
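
To make the rule above concrete, here is a minimal toy model in plain Scala. It is not Akka's implementation; ToyMembership, NodeAddress and UniqueNode are hypothetical names made up for illustration, and the only behaviour it encodes is the one described above: a join from a host:port that is still a member is ignored until that member has been downed and removed.

```scala
// Toy model only: NOT Akka's code. All names here are hypothetical.
final case class NodeAddress(host: String, port: Int)
final case class UniqueNode(address: NodeAddress, uid: Int)

final class ToyMembership {
  // Members are keyed by host:port, so two incarnations of the
  // same address cannot be members at the same time.
  private var members = Map.empty[NodeAddress, UniqueNode]

  /** Returns true if the join was accepted, false if it was ignored. */
  def join(node: UniqueNode): Boolean =
    if (members.contains(node.address)) false // old incarnation still a member
    else { members += node.address -> node; true }

  /** Downing removes the member and frees its host:port for a new incarnation. */
  def down(address: NodeAddress): Unit = members -= address

  def isMember(address: NodeAddress): Boolean = members.contains(address)
}

object ToyMembershipDemo extends App {
  val cluster = new ToyMembership
  val addr = NodeAddress("127.0.0.1", 2553)

  cluster.join(UniqueNode(addr, uid = 1))                  // first incarnation joins
  val beforeDown = cluster.join(UniqueNode(addr, uid = 2)) // restarted process: ignored
  cluster.down(addr)                                       // old incarnation removed
  val afterDown = cluster.join(UniqueNode(addr, uid = 2))  // now accepted

  println(s"before down: $beforeDown, after down: $afterDown")
}
```

In a real cluster the manual counterpart of down in this sketch is Cluster(system).down(address), as an alternative to waiting for auto-down to kick in.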


The thing that seems weird in your log is that 127.0.0.1:2552 suddenly marks 
the node as reachable again instead of just downing it. If the old node had 
been downed and removed correctly, then the new one with the same address/port 
should be allowed to connect. There might be an issue with the failure detector 
and a mismatch between addresses and unique addresses (address:port:uid).

Would it be possible for you to package up a minimal project that we can use to 
reproduce this?

B/

On 4 November 2014 at 14:57:38, Behrad Zari (behr...@gmail.com) wrote:

In my three node cluster (akka 2.3.6 - scala 2.10.4) with the config below

cluster {
  seed-nodes = [
    "akka.tcp://adp@127.0.0.1:2552" // using one of the three as seed node
  ]
  auto-down-unreachable-after = 120s
}

I press `Ctrl+C` on one of my nodes to simulate a crash/termination, and I see

Remoting - Tried to associate with unreachable remote address 
[akka.tcp://adp@127.0.0.1:2553]. Address is now gated for 5000 ms, all messages 
to this address will be delivered to dead letters. Reason: Connection refused: 
/127.0.0.1:2553

but when I restart the process, its attempt to join is ignored and the nodes 
cannot interoperate, and I keep seeing the following message:

Cluster Node [akka.tcp://adp@127.0.0.1:2552] - Existing member 
[UniqueAddress(akka.tcp://adp@127.0.0.1:2553,392261992)] is trying to join, 
ignoring
13:36:18.964UTC INFO [adp-akka.actor.default-dispatcher-2] Cluster(akka://adp) 
- Cluster Node [akka.tcp://adp@127.0.0.1:2552] - Marking node(s) as REACHABLE 
[Member(address = akka.tcp://adp@127.0.0.1:2553, status = Up)]
Cluster Node [akka.tcp://adp@127.0.0.1:2552] - Existing member 
[UniqueAddress(akka.tcp://adp@127.0.0.1:2553,392261992)] is trying to join, 
ignoring
Cluster Node [akka.tcp://adp@127.0.0.1:2552] - Existing member 
[UniqueAddress(akka.tcp://adp@127.0.0.1:2553,392261992)] is trying to join, 
ignoring
Cluster Node [akka.tcp://adp@127.0.0.1:2552] - Existing member 
[UniqueAddress(akka.tcp://adp@127.0.0.1:2553,392261992)] is trying to join, 
ignoring
...



I'd expect the cluster to reconnect after one of my nodes restarts :( 
When I decrease "auto-down-unreachable-after", my crashed node is marked down 
by my seed node, so it is quarantined and won't be able to rejoin after startup 
until both nodes restart.
I am unsure what the correct pattern is for per-node restarts in a clustered 
deployment.
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: 
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to akka-user+unsubscr...@googlegroups.com.
To post to this group, send email to akka-user@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

-- 
Björn Antonsson
Typesafe – Reactive Apps on the JVM
twitter: @bantonsson
