Hi Again,

I noticed a few more things:

First of all, I ran the same test (master with a single worker) with a 
physical Android device (instead of an emulator), and the cluster 
disconnection was much less frequent (it took more than two hours to 
occur), but with the same symptoms.
I therefore suspect that the problem may be related to an intermittent 
communication failure with the device, which happens less frequently with 
physical devices (as they are more stable).

Secondly, looking at the Akka logs from the master side, after the 
disassociation event occurs, I start getting dead-letter messages:
[INFO] [05/28/2015 09:59:50.715] 
[ClusterSystem-akka.actor.default-dispatcher-25] 
[akka://ClusterSystem/deadLetters] Message [akka.cluster.GossipStatus] from 
Actor[akka://ClusterSystem/system/cluster/core/daemon#-1559364220] to 
Actor[akka://ClusterSystem/deadLetters] was not delivered. [3] dead letters 
encountered. This logging can be turned off or adjusted with configuration 
settings 'akka.log-dead-letters' and 
'akka.log-dead-letters-during-shutdown'.
[INFO] [05/28/2015 09:59:50.731] 
[ClusterSystem-akka.actor.default-dispatcher-19] 
[akka://ClusterSystem/deadLetters] Message 
[akka.contrib.pattern.DistributedPubSubMediator$Internal$Status] from 
Actor[akka://ClusterSystem/user/distributedPubSubMediator#-825410933] to 
Actor[akka://ClusterSystem/deadLetters] was not delivered. [4] dead letters 
encountered. This logging can be turned off or adjusted with configuration 
settings 'akka.log-dead-letters' and 
'akka.log-dead-letters-during-shutdown'.

After a few seconds, I notice that the connection to the worker was refused:
[WARN] [05/28/2015 09:59:54.746] 
[ClusterSystem-akka.remote.default-remote-dispatcher-26] 
[akka.tcp://ClusterSystem@10.141.4.140:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%40127.0.0.1%3A2553-0]
 
Association with remote system [akka.tcp://ClusterSystem@127.0.0.1:2553] 
has failed, address is now gated for [5000] ms. Reason: [Association failed 
with [akka.tcp://ClusterSystem@127.0.0.1:2553]] Caused by: [Connection 
refused: /127.0.0.1:2553]

Afterwards I see two more heartbeats sent to the worker (with no response) 
before it is marked unreachable. (The IP address of the other node is 
localhost because it runs on a local Android emulator.)
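
One way to check whether anything is still listening on that port at the moment the refusal is logged is a plain TCP probe run on the master host. The following is a minimal sketch using only the Java standard library. The class name `PortProbe`, the `Result` enum, and the 2-second timeout are illustrative assumptions and not part of Akka; the host and port are the ones from the log above.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class PortProbe {

    /** Possible outcomes of a single TCP connect attempt (hypothetical names). */
    public enum Result { OPEN, REFUSED, TIMEOUT, OTHER_ERROR }

    /** Tries to open a TCP connection once and classifies what happened. */
    public static Result probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Result.OPEN;            // something accepted the connection
        } catch (ConnectException e) {
            // Most commonly "Connection refused": nothing is listening on the port.
            return Result.REFUSED;
        } catch (SocketTimeoutException e) {
            return Result.TIMEOUT;         // no answer within timeoutMillis
        } catch (IOException e) {
            return Result.OTHER_ERROR;     // e.g. unresolved host or no route
        }
    }

    public static void main(String[] args) {
        // Address taken from the log above; the timeout is an arbitrary choice.
        System.out.println(probe("127.0.0.1", 2553, 2000));
    }
}
```

If this reports `REFUSED` while the worker process is still running, that would suggest nothing is accepting connections on the port at that moment (for example, if the port forwarding to the emulator or the worker's listener has gone away), rather than the worker deliberately rejecting the master. This is only a diagnostic aid and says nothing about the cause.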

My questions are:
1. Is it possible that the other node actively refuses the tcp connection? 
If so, why, and how can I avoid it?
2. The cluster can generally recover from the exception I quoted in the 
first post of this thread, right? If so, what stops it from doing so here?

Thanks,
Nozik



On Tuesday, May 26, 2015 at 4:55:39 PM UTC+3, Ran Nozik wrote:
>
> Hi Endre,
>
> Thank you for your quick response.
>
> I verified that the only protobuf version we use 
> is com.google.protobuf:protobuf-java:2.5.0 (no other versions in the 
> classpath).
>
> I'm not sure I understood your question about the remoting. We have a 
> distributed system with many (backend) Android workers and one master 
> (frontend) node. They do not interact as client and server.
>
> Regards,
> Nozik
>
> On Tue, May 26, 2015 at 4:21 PM, Endre Varga <endre.va...@typesafe.com> 
> wrote:
>
>> Caused by: com.google.protobuf.UninitializedMessageException: Message 
>> missing required fields: 
>> ... 30 more
>> ]
>>
>> This very much looks like a serialization problem though. Do you maybe 
>> have a newer protobuf version on your classpath than the one Akka uses?
>>
>> Btw, why are you using akka-remoting between android systems? Don't 
>> forget that remoting and clustering are not client-server technologies but 
>> peer-to-peer technologies: 
>> http://doc.akka.io/docs/akka/2.3.11/general/remoting.html#Peer-to-Peer_vs__Client-Server
>>
>> -Endre
>>
>> On Tue, May 26, 2015 at 3:16 PM, <rno...@quixey.com> wrote:
>>
>>> Hi,
>>>
>>> I upgraded to 2.3.11 and the problem reproduced again.
>>>
>>> Thanks.
>>>
>>>
>>> On Tuesday, May 26, 2015 at 3:12:38 PM UTC+3, √ wrote:
>>>>
>>>> Hi Nozik,
>>>>
>>>> please upgrade to the latest version and report back if you still have 
>>>> the same problem.
>>>>
>>>> On Tue, May 26, 2015 at 2:03 PM, <rno...@quixey.com> wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> I've been trying to set up an Akka cluster with one master node and 
>>>>> multiple workers. The workers are actor systems that run on Android 
>>>>> emulators.
>>>>> As a start, I work with one worker (emulator). I verify that it 
>>>>> successfully joins the cluster and start sending it messages, which are 
>>>>> handled successfully. After some time (anywhere from 2-3 to 30-40 
>>>>> minutes), however, it disconnects from the cluster.
>>>>> Trying to figure out what causes the problem, I noticed that even if 
>>>>> the worker is idle (no messages are sent), it disconnects from the 
>>>>> cluster after some time. 
>>>>>
>>>>> In the Android logcat, the following message is displayed:
>>>>>
>>>>> [ClusterSystem-akka.remote.default-remote-dispatcher-5] [akka.tcp://
>>>>>> ClusterSystem@127.0.0.1:2553/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.141.4.104%3A2551-0]
>>>>>>  
>>>>>> Association with remote system [akka.tcp://
>>>>>> ClusterSystem@10.141.4.104:2551] has failed, address is now gated 
>>>>>> for [5000] ms. Reason is: [].
>>>>>
>>>>>
>>>>> and then:
>>>>>
>>>>> [ClusterSystem-cluster-dispatcher-15] [akka.tcp://
>>>>> ClusterSystem@127.0.0.1:2553/system/cluster/core/daemon] Cluster Node 
>>>>> [akka.tcp://ClusterSystem@127.0.0.1:2553] - Marking node(s) as 
>>>>> UNREACHABLE [Member(address = akka.tcp://
>>>>> ClusterSystem@10.141.4.104:2551, status = Up)]
>>>>>
>>>>> and eventually:
>>>>>
>>>>> [ClusterSystem-cluster-dispatcher-26] [Cluster(akka://ClusterSystem)] 
>>>>> Cluster Node [akka.tcp://ClusterSystem@127.0.0.1:2553] - Leader is 
>>>>> auto-downing unreachable node [akka.tcp://
>>>>> ClusterSystem@10.141.4.104:2551]
>>>>> [ClusterSystem-cluster-dispatcher-26] [Cluster(akka://ClusterSystem)] 
>>>>> Cluster Node [akka.tcp://ClusterSystem@127.0.0.1:2553] - Marking 
>>>>> unreachable node [akka.tcp://ClusterSystem@10.141.4.104:2551] as 
>>>>> [Down]
>>>>> [ClusterSystem-cluster-dispatcher-27] [Cluster(akka://ClusterSystem)] 
>>>>> Cluster Node [akka.tcp://ClusterSystem@127.0.0.1:2553] - Leader is 
>>>>> removing unreachable node [akka.tcp://ClusterSystem@10.141.4.104:2551
>>>>> ] 
>>>>>
>>>>>
>>>>> After I subscribed to AssociationErrorEvent, I was able to get more 
>>>>> details:
>>>>>
>>>>> AssociationErrorEvent has occurred: AssociationError [akka.tcp://
>>>>>> ClusterSystem@127.0.0.1:2553] -> [akka.tcp://
>>>>>> ClusterSystem@10.141.4.104:2551]: Error [] [
>>>>>> akka.remote.EndpointException: 
>>>>>> at 
>>>>>> com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770)
>>>>>> at 
>>>>>> akka.remote.ContainerFormats$Selection$Builder.build(ContainerFormats.java:1513)
>>>>>> at 
>>>>>> akka.remote.ContainerFormats$SelectionEnvelope$Builder.addPattern(ContainerFormats.java:931)
>>>>>> at 
>>>>>> akka.remote.serialization.MessageContainerSerializer$$anonfun$serializeSelection$1.apply(MessageContainerSerializer.scala:45)
>>>>>> at 
>>>>>> akka.remote.serialization.MessageContainerSerializer$$anonfun$serializeSelection$1.apply(MessageContainerSerializer.scala:43)
>>>>>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>>>>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>>>>> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>>>>> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>>>>> at 
>>>>>> akka.remote.serialization.MessageContainerSerializer.serializeSelection(MessageContainerSerializer.scala:43)
>>>>>> at 
>>>>>> akka.remote.serialization.MessageContainerSerializer.toBinary(MessageContainerSerializer.scala:25)
>>>>>> at 
>>>>>> akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:36)
>>>>>> at 
>>>>>> akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:842)
>>>>>> at 
>>>>>> akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:842)
>>>>>> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>>>> at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:841)
>>>>>> at akka.remote.EndpointWriter.writeSend(Endpoint.scala:742)
>>>>>> at 
>>>>>> akka.remote.EndpointWriter$$anonfun$2.applyOrElse(Endpoint.scala:717)
>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>>>> at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:410)
>>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>>>> at 
>>>>>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>> at 
>>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>>>>>> at 
>>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>>>>>> at 
>>>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>> at 
>>>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>> Caused by: com.google.protobuf.UninitializedMessageException: Message 
>>>>>> missing required fields: 
>>>>>> ... 30 more
>>>>>> ]
>>>>>
>>>>>
>>>>>
>>>>> At first I thought that there was a serialization problem with one of 
>>>>> the messages that are sent to or from the worker. However, the problem 
>>>>> repeats itself even when there are no messages sent to the worker at all.
>>>>> If I restart the worker, it re-joins the cluster and everything is 
>>>>> back to normal again (until the next disconnection event), so the 
>>>>> problem isn't permanent.
>>>>>
>>>>> I'm using Akka 2.3.9 on both master and worker.
>>>>>
>>>>> What could be causing the problem? Could it be Android related?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> -- 
>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>> >>>>>>>>>> Check the FAQ: 
>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>> >>>>>>>>>> Search the archives: 
>>>>> https://groups.google.com/group/akka-user
>>>>> --- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "Akka User List" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to akka-user+...@googlegroups.com.
>>>>> To post to this group, send email to akka...@googlegroups.com.
>>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>>>
>>>> -- 
>>>> Cheers,
>>>> √
>>>>  
>>>
>>
>>
>
>

