Hello!

It seems that you have network problems.

It's possible that your nodes have more than one network interface and that
some address combinations are not reachable from the other nodes. Please try
setting the localHost property on every node to that node's actual external
IP address.
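
To see which addresses a node actually has, a small sketch like the one
below may help. It uses only the plain JDK (it is not an Ignite API), the
class name ListLocalAddresses is made up for illustration, and it assumes
you are only interested in IPv4 addresses on interfaces that are up and
are not loopback.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists candidate values for the localHost property.
public class ListLocalAddresses {

    /** Returns "interface -> IPv4 address" for every interface that is up and not loopback. */
    static List<String> candidateAddresses() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces reported on this machine
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp() || nic.isLoopback()) {
                    continue; // skip interfaces that cannot carry cluster traffic
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    if (addr instanceof Inet4Address) {
                        result.add(nic.getName() + " -> " + addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        candidateAddresses().forEach(System.out::println);
    }
}
```

If it prints more than one address on a node, pick the one the other
nodes can reach and use it as that node's localHost value.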

Regards,
-- 
Ilya Kasnacheev


Fri, 15 Jan 2021 at 16:19, VeenaMithare <v.mith...@cmcmarkets.com>:

> Hello,
>
> We see this behaviour during our client startup:
> Client 1 -
>
> Client1.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2757/Client1.txt>
> Server003 log -
>
> SERVER3.TXT
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2757/SERVER3.TXT>
>
> 1. The client joins the cluster at around 2021-01-14T16:16:29 according to
> both client1.log and server003.log. This started the partition map exchange.
> 2. client1.log then reports a blocked system-critical thread:
> 2021-01-14T16:16:44,472 ERROR o.a.i.i.u.t.G
> [grid-timeout-worker-#21%InstanceName%]: Blocked system-critical thread has
> been detected. This can lead to cluster-wide undefined behaviour
> [workerName=partition-exchanger,
> threadName=exchange-worker-#37%InstanceName%, blockedFor=14s]
>
> After this, we see the connection to machinename003 time out:
> 2021-01-14T16:16:45,017 WARN  o.a.i.s.c.t.TcpCommunicationSpi
> [exchange-worker-#37%InstanceName%]: Connection timed out (will stop
> attempts to perform the connect)
> [node=7b9d8c6f-814c-4cb6-9822-8fa3d7f79eb7,
> connTimeoutStgy=ExponentialBackoffTimeoutStrategy [maxTimeout=10000,
> totalTimeout=15000, startNanos=14942834332684124, currTimeout=10000],
> failureDetectionTimeoutEnabled=false, timeout=9938, err=null,
> addr=/m.n.o.202:47130]
>
>
>
> 3. Looking at the logs on machinename003, the partition map exchange
> finished in 8 milliseconds:
>
>
> 2021-01-14T16:16:29,671 INFO
> o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture [exchange-worker-#566]:
> Exchange timings [startVer=AffinityTopologyVersion [topVer=47,
> minorTopVer=0], resVer=AffinityTopologyVersion [topVer=47, minorTopVer=0],
> stage="Waiting in exchange queue" (0 ms), stage="Exchange parameters
> initialization" (0 ms), stage="Determine exchange type" (4 ms),
> stage="Exchange done" (4 ms), stage="Total time" (8 ms)]
>
> If so, why was the partition-exchanger blocked on the client?
>
> 4. Despite reporting a connection timeout, the client then manages to
> connect successfully to machinename003. (Please note that m.n.o.202 and
> x.y.z.202 are IP addresses of the same server, machinename003.)
>
> 2021-01-14T16:16:45,026 INFO  o.a.i.s.c.t.TcpCommunicationSpi
> [grid-nio-worker-tcp-comm-0-#22%InstanceName%]: Established outgoing
> communication connection [locAddr=/a.b.c.21:53607,
> rmtAddr=machinename003.cmc.local/x.y.z.202:47130]
>
>
>
> Kindly help us understand what happened here.
> =========================
> 5. We have also configured the TcpCommunicationSpi timeouts as below,
> following the recommendation given in:
>
>
> http://apache-ignite-users.70518.x6.nabble.com/IgniteSpiOperationTimeoutException-Operation-timed-out-timeoutStrategy-ExponentialBackoffTimeoutStray-tp34196p34377.html
>
>             <bean
> class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi"
> scope="prototype">
>                 <property name="connectTimeout" value="5000"/>
>                 <property name="maxConnectTimeout" value="10000"/>
>
> Is the timeout we observed caused by this setting?
> ===============================
>
> 6. Our TxTimeoutOnPartitionMapExchange is 1 second:
>                 <property name="TxTimeoutOnPartitionMapExchange"
> value="1000"/>
>
> What is the ideal TxTimeoutOnPartitionMapExchange value?
> Should it be something like 50 milliseconds?
>
> =====================================
> A similar log captured during client 2 startup is attached as well.
> Client2.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2757/Client2.txt>
>
> regards,
> Veena.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
