My plan for this was to have a proxy in NodeA for NodeB's watch state. In 
normal (connected) operation it remembers the current watch states 
(Actor Ax is/isn't watching Actor By) and passes the messages on to NodeB. 
While disconnected it only remembers the watch state. On reconnect, it sends 
a snapshot of that state to NodeB.

Because it remembers state rather than messages, it stays bounded. I need 
to make sure that "isn't watching" states get pruned, but that's just the 
usual sequence number/ack stuff (while disconnected, we can prune them 
immediately, because they don't form part of the snapshot).
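
A minimal sketch of the proxy idea above, in plain Java with no Akka dependency. Everything in it is hypothetical and for illustration only: the names WatchStateProxy and Link, the string message format, and the assumption that NodeB replaces its view of NodeA's watches with the snapshot it receives on reconnect. It is not how Akka remoting implements DeathWatch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not an Akka API: NodeA-side proxy that remembers
// watch STATE for NodeB instead of buffering watch/unwatch MESSAGES.
final class WatchStateProxy {

    /** Assumed transport to NodeB; a real link would be asynchronous and lossy. */
    interface Link {
        void send(String message);
    }

    /** Latest known state for one watcher/watchee pair. */
    private static final class Entry {
        final boolean watching;
        final long seq;

        Entry(boolean watching, long seq) {
            this.watching = watching;
            this.seq = seq;
        }
    }

    private final Link link;
    // One entry per "watcher->watchee" pair, so size is bounded by the number of pairs.
    private final Map<String, Entry> states = new LinkedHashMap<>();
    private boolean connected = true;
    private long nextSeq = 0;

    WatchStateProxy(Link link) {
        this.link = link;
    }

    void watch(String watcher, String watchee) {
        update(watcher + "->" + watchee, true);
    }

    void unwatch(String watcher, String watchee) {
        update(watcher + "->" + watchee, false);
    }

    private void update(String key, boolean watching) {
        long seq = nextSeq++;
        if (connected) {
            // Connected: remember the state and pass the message on.
            states.put(key, new Entry(watching, seq));
            link.send((watching ? "WATCH " : "UNWATCH ") + key + " seq=" + seq);
        } else if (watching) {
            // Disconnected: remember the state only, send nothing.
            states.put(key, new Entry(true, seq));
        } else {
            // Disconnected "isn't watching" is not part of the snapshot: drop it now.
            states.remove(key);
        }
    }

    /** NodeB has acked everything up to seq: acked "isn't watching" entries can go. */
    void ack(long seq) {
        states.values().removeIf(e -> !e.watching && e.seq <= seq);
    }

    void disconnected() {
        connected = false;
        // Un-acked "isn't watching" entries are covered by the next snapshot.
        states.values().removeIf(e -> !e.watching);
    }

    /** On reconnect, send one snapshot; NodeB is assumed to replace its view with it. */
    void reconnected() {
        connected = true;
        link.send("SNAPSHOT " + snapshot());
    }

    /** Pairs that are currently watching, in insertion order. */
    List<String> snapshot() {
        List<String> watching = new ArrayList<>();
        states.forEach((key, e) -> {
            if (e.watching) {
                watching.add(key);
            }
        });
        return watching;
    }

    int size() {
        return states.size();
    }
}
```

In this sketch, any number of watch/unwatch pairs issued while disconnected leaves the map no larger than the set of pairs that are still watching, which is the sense in which the state stays bounded.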

Cheers

Alistair
On Tuesday, January 21, 2014 5:57:36 PM UTC, √ wrote:
>
> NodeA & NodeB are communicating
> NodeB disappears (not acking)
> NodeA still has things it needs to propagate to NodeB (watching actors on 
> that node etc.), so they need to be buffered. Also, if there are ordering 
> requirements, some things may not be transmitted/received before others, 
> so more things get buffered. At which point does NodeA 
> either A) drop the buffer or B) throw an OOME?
>
> Cheers,
> √
>
>
> On Tue, Jan 21, 2014 at 6:46 PM, Alistair George <alistai...@gmail.com> wrote:
>
>> I don't see where the unbounded buffer is needed. I'd be grateful for a 
>> bit of explanation, especially since it looks like I'm going to have to 
>> implement this stuff :)
>>
>>
>> On Tuesday, January 21, 2014 12:40:19 PM UTC, √ wrote:
>>
>>> And it also introduces the need for unbounded buffering, i.e. memory 
>>> leaks.
>>>
>>>
>>> On Tue, Jan 21, 2014 at 1:01 PM, Akka Team <akka.o...@gmail.com> wrote:
>>>
>>>> Hi Alistair,
>>>>
>>>>
>>>>
>>>>  
>>>>> I'm not sure this is desirable behaviour. I shouldn't have to restart 
>>>>> a process just to recover from a comms failure. After all, nothing in the 
>>>>> process has failed, and it may be providing services to other clients that 
>>>>> have not suffered any comms failure. They shouldn't have to take the impact 
>>>>> of a restart.
>>>>>
>>>>
>>>> No, a restart is not needed to recover from a communications failure; it 
>>>> is needed to recover from the quarantined state. 
>>>>  
>>>>
>>>>>
>>>>> One of the strengths of Akka is that it doesn't pretend to do things 
>>>>> that can't be done in a distributed context - this is essential for 
>>>>> transparent distribution. One of the things you can't do in a distributed 
>>>>> system is give reliable, timely notification of a remote event, such as 
>>>>> actor termination, and I don't think Akka should try.
>>>>>
>>>>> What I'd prefer is this:
>>>>>
>>>>>    - Reconnect attempts should continue indefinitely. 
>>>>>
>>>> This is actually what happens. Quarantining only happens if you use 
>>>> remote DeathWatch. If this is not the feature you want, you should not use 
>>>> remote DeathWatch. The contract of remote DeathWatch is that it eventually 
>>>> produces a Terminated message if the remote actor is dead, at the cost of 
>>>> false positives (i.e. the actor is alive but the remote system was not 
>>>> responding for a long time). You can tune the false positive rate by 
>>>> configuring the watch failure detector. You can even set it to trigger 
>>>> after one year.
>>>>
>>>>>
>>>>>    - The DeathWatch protocol should be extended to include (possibly 
>>>>>    multiple) Reachable/Unreachable events. 
>>>>>
>>>> This is not the purpose of DeathWatch, but you can implement such a 
>>>> thing on your own by sending heartbeats between the related actors. In 
>>>> clustering there is support for Unreachable/Reachable events. 
>>>>
>>>>>
>>>>>    - Terminated should only be delivered when the remote actor system 
>>>>>    is reachable and asserts that the watched actor does not exist. This 
>>>>>    might never happen: an actor might stay in an unreachable state forever. 
>>>>>
>>>> While that is a useful feature, it is not the purpose of DeathWatch. 
>>>> Clustering has the feature you want, and in general, clustering is 
>>>> recommended over plain remoting.
>>>>  
>>>>
>>>>> I realise I can emulate this by setting the timeout before quarantine 
>>>>> to be effectively infinite, and adding my own facility to detect 
>>>>> reachability and termination, but this isn't trivial. I'd prefer this 
>>>>> behaviour to be available out of the box, for both practical and conceptual 
>>>>> reasons.
>>>>>
>>>>
>>>> There is always the akka-contrib area for contributions :)
>>>>
>>>> In general, I understand your use case, but DeathWatch does not support 
>>>> it. Many users would be surprised if they watched a remote actor, killed 
>>>> the node running the system, and no Terminated message was generated. 
>>>> They would argue that this is wrong for practical and conceptual reasons :)
>>>>
>>>> -Endre
>>>>  
>>>>
>>>>>
>>>>> Just my $.02
>>>>>
>>>>> Cheers
>>>>>
>>>>> Alistair 
>>>>>
>>>>> -Endre
>>>>>>  
>>>>>>
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>>  Alistair
>>>>>>>
>>>>>>>
>>>>>>> On Monday, January 20, 2014 12:41:25 PM UTC, Akka Team wrote:
>>>>>>>
>>>>>>>> Hi Alistair,
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jan 16, 2014 at 9:30 AM, Alistair George <
>>>>>>>> alistai...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> If I set up a watch on a remote actor (one on a remote actor 
>>>>>>>>> system) and the network between me and the remote system fails, I get a 
>>>>>>>>> Terminated message almost immediately. In fact, the remote actor hasn't 
>>>>>>>>> terminated, 
>>>>>>>>>
>>>>>>>>
>>>>>>>> That does not matter. If you use remote DeathWatch and one of the 
>>>>>>>> systems is unreachable for long enough, it will eventually fire Terminated 
>>>>>>>> for all the watched actors on the remote system and then quarantine that 
>>>>>>>> system so it never comes back again. The DeathWatch failure detector 
>>>>>>>> settings (akka.remote.watch-failure-detector) control how sensitive 
>>>>>>>> this decision is. If you think that one hour of unreachability 
>>>>>>>> should not be considered terminal, then you should configure those settings 
>>>>>>>> accordingly.
>>>>>>>>  
>>>>>>>>
>>>>>>>>> and I can still use the ActorRef to send messages to it once comms 
>>>>>>>>> are restored. (However, if comms fail a second time I don't get a second 
>>>>>>>>> Terminated message.)
>>>>>>>>>
>>>>>>>>
>>>>>>>> This is because we made two mistakes in 2.2.x:
>>>>>>>>  - we made quarantine times configurable
>>>>>>>>  - we set it to a low value, 60 seconds
>>>>>>>>
>>>>>>>> After the quarantine elapses, the systems can communicate again 
>>>>>>>> regardless of the Terminated message. This is probably what you 
>>>>>>>> observed, and it is exactly why quarantine in 2.3 is permanent.
>>>>>>>>  
>>>>>>>>
>>>>>>>>>
>>>>>>>>> "Terminated" and "lost contact" are rather different states, and 
>>>>>>>>> may need different handling. Does anyone know of a reliable way I can 
>>>>>>>>> distinguish these? 
>>>>>>>>>
>>>>>>>>
>>>>>>>> DeathWatch sends Terminated when the remote system has been in the 
>>>>>>>> "lost contact" state for a long time. How long that is can be configured 
>>>>>>>> through the DeathWatch failure detector. "Lost contact" events are 
>>>>>>>> generated as remote lifecycle events, but I don't recommend using those 
>>>>>>>> directly. Message sends are supposed to be lossy; you can track 
>>>>>>>> reachability in your user layer with some heartbeating mechanism if you 
>>>>>>>> want it.
>>>>>>>>
>>>>>>>> Btw, there is another failure detector 
>>>>>>>> (akka.remote.transport-failure-detector) that monitors the health of 
>>>>>>>> network connections, but it does not generate Terminated events, only 
>>>>>>>> reconnect attempts.
>>>>>>>>
>>>>>>>> In 2.3, clustering will differentiate UNREACHABLE events 
>>>>>>>> (which can heal) from removals. You probably want to use those features 
>>>>>>>> instead of plain remoting.
>>>>>>>>  
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Alistair
>>>>>>>>>
>>>>>>>>> -- 
>>>>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>> >>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>>>>>>>>> --- 
>>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>>> Groups "Akka User List" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>> send an email to akka-user+...@googlegroups.com.
>>>>>>>>> To post to this group, send email to akka...@googlegroups.com.
>>>>>>>>>
>>>>>>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Akka Team
>>>>>>>> Typesafe - The software stack for applications that scale
>>>>>>>> Blog: letitcrash.com
>>>>>>>> Twitter: @akkateam
>>>>>>>>  
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> -- 
>>> Cheers,
>>> √
>>>
>>> Viktor Klang
>>> Chief Architect - Typesafe <http://www.typesafe.com/>
>>>
>>>  Twitter: @viktorklang
>>>  
>>
>
>
>

