Hi,

Communication is unlikely to affect Discovery, because they use different
thread pools.

Please follow these steps:
1) Exclude Communication from the DEBUG log level, and keep the DEBUG
level in place for the package org.apache.ignite.spi.discovery.
2) Provide the log file from the node where a message like "Node FAILED:
TcpDiscoveryNode..." appears earliest.
3) Provide the GC log file from the node that was joining.
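
For step 1, here is a minimal sketch of the logger levels I mean. It assumes
the node logs through java.util.logging (JUL), where DEBUG roughly
corresponds to Level.FINE; the class name DiscoveryDebugLogging is only for
illustration. If you use log4j instead, set the same two package levels in
your log4j configuration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Ignite: sets per-package JUL levels.
class DiscoveryDebugLogging {
    // Strong references: JUL holds loggers weakly, so a level set on an
    // unreferenced logger could be lost after garbage collection.
    static final Logger DISCOVERY =
        Logger.getLogger("org.apache.ignite.spi.discovery");
    static final Logger COMMUNICATION =
        Logger.getLogger("org.apache.ignite.spi.communication");

    static void configure() {
        // Keep debug-level output for Discovery (FINE assumed to map to DEBUG).
        DISCOVERY.setLevel(Level.FINE);
        // Exclude Communication from debug output.
        COMMUNICATION.setLevel(Level.INFO);
    }

    public static void main(String[] args) {
        configure();
        // Note: the attached handlers must also allow FINE, otherwise the
        // debug records are dropped before they reach the log file.
        System.out.println("discovery=" + DISCOVERY.getLevel()
            + ", communication=" + COMMUNICATION.getLevel());
    }
}
```

Loggers of sub-packages such as org.apache.ignite.spi.discovery.tcp inherit
these levels unless they are configured explicitly.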

Once we have received all the logs, we can move forward.

On Tue, Aug 16, 2016 at 12:45 PM, Jason <fqy...@outlook.com> wrote:

> hi Vladislav,
>
> Scanning through the log, I found that a serialization error may be the
> root cause. When the previous node of the newly added node tried to send
> the messages (NodeAddedMessage/NodeAddFinishedMessage) to the new node, it
> failed, so it marked the new node as "failed" and sent a NodeFailedMessage
> to the coordinator. When the coordinator received that, it removed the new
> node and notified all the other nodes.
>
> I've attached the serialization error. Could you help take a look at
> it?
>
> Thanks,
> -Jason
>
> serialization_error.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/
> n7092/serialization_error.txt>
>



-- 
Vladislav Pyatkov
