On Sat, Oct 29, 2016 at 5:57 PM, Eugene Dzhurinsky
wrote:
> Patrick, thanks for the quick reply!
>
> In fact I do see it happening quite often, I have akka *INFO* logging
> enabled, please take a look at this one:
>
> http://depnongadsla.s3.amazonaws.com/wrapper-20161029.log
>
>
Can you somehow v
Patrick, thanks for the quick reply!
In fact I do see it happening quite often, I have akka *INFO* logging
enabled, please take a look at this one:
http://depnongadsla.s3.amazonaws.com/wrapper-20161029.log
To make it clear: all addresses *192.168.1.** are located in the same
datacenter in the US,
How long do your network partitions last?
You have increased acceptable-heartbeat-pause of the cluster failure
detector, which is good.
You use auto-down-unreachable-after. In total those timeouts would mean
that it should be able to survive a network partition of 70 seconds before
removing and quar
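
For reference, the two settings mentioned above live under akka.cluster in application.conf. The fragment below is only a sketch: the thread does not show the actual values, so the 10s and 60s are assumed placeholders chosen to add up to the 70 seconds mentioned.

```
akka.cluster {
  failure-detector {
    # Assumed value (not from the thread): how long heartbeats may be
    # missing before the node is marked unreachable.
    acceptable-heartbeat-pause = 10s
  }
  # Assumed value (not from the thread): how long a node may stay
  # unreachable before it is automatically downed.
  auto-down-unreachable-after = 60s
}
```

With these assumed values, a partition shorter than roughly 10s + 60s = 70s should heal before the unreachable node is downed and removed.
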
I have a rather unstable network spanning different geo locations
(different hemispheres actually) and from time to time my actor cluster
falls apart.
I wrote a bunch of event interceptors for quarantine events and the system is
more or less operable (less than 10% of nodes are off-cluster at a