Hi Ibrahim,

I see that one node didn't send an acknowledgment during cache creation:
[2019-09-27T15:00:17,727][WARN
][exchange-worker-#219][GridDhtPartitionsExchangeFuture] Unable to await
partitions release latch within timeout: ServerLatch [permits=1,
pendingAcks=[3561ac09-6752-4e2e-8279-d975c268d045],
super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion
[topVer=92, minorTopVer=2]]]
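
If there are many such warnings, the node IDs that never acknowledged can be pulled out of the log text with a small script. This is only a sketch: the function name pending_ack_nodes is made up for illustration, and it assumes the IDs appear inside a "pendingAcks=[...]" list exactly as in the warning above.

```python
import re

# Matches the contents of a "pendingAcks=[...]" list in a log entry.
PENDING_ACKS = re.compile(r"pendingAcks=\[([^\]]*)\]")


def pending_ack_nodes(log_text):
    """Return the node IDs found in pendingAcks lists, in first-seen order."""
    seen = []
    for match in PENDING_ACKS.finditer(log_text):
        for raw in match.group(1).split(","):
            # Strip whitespace and any '*' emphasis marks added by mail clients.
            node_id = raw.strip().strip("*")
            if node_id and node_id not in seen:
                seen.append(node_id)
    return seen


if __name__ == "__main__":
    sample = (
        "Unable to await partitions release latch within timeout: "
        "ServerLatch [permits=1, "
        "pendingAcks=[3561ac09-6752-4e2e-8279-d975c268d045], "
        "super=CompletableLatch [id=exchange]]"
    )
    print(pending_ack_nodes(sample))
```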

Do you have any logs from the node with id
"3561ac09-6752-4e2e-8279-d975c268d045"?
You can find this node by grepping the logs for
"locNodeId=3561ac09-6752-4e2e-8279-d975c268d045", which appears in a line like this:
[2019-09-27T15:24:03,532][INFO ][main][TcpDiscoverySpi] Successfully bound
to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0,
locNodeId=70b49e00-5b9f-4459-9055-a05ce358be10]
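
If the logs from all nodes are collected in one directory, the same search can be done with a short script. This is a rough sketch under assumptions: the helper find_node_logs is hypothetical, and the "*.log" file pattern and the directory path are guesses that need to be adapted to your setup.

```python
from pathlib import Path


def find_node_logs(log_dir, node_id, pattern="*.log"):
    """Return the log files under log_dir that mention locNodeId=<node_id>."""
    needle = "locNodeId=" + node_id
    hits = []
    for path in sorted(Path(log_dir).rglob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes.
        with open(path, errors="replace") as log_file:
            if any(needle in line for line in log_file):
                hits.append(path)
    return hits


if __name__ == "__main__":
    # The directory below is a placeholder, not a known Ignite default.
    for log_path in find_node_logs("./logs", "3561ac09-6752-4e2e-8279-d975c268d045"):
        print(log_path)
```

The file that matches belongs to the node that did not acknowledge, so that is the log worth attaching.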


Wed, Oct 9, 2019 at 17:34, ihalilaltun <ibrahim.al...@segmentify.com>:

> Hi There Igniters,
>
> We saw very strange cluster behaviour while creating new caches on the
> fly.
> Just after the caches are created, we start getting the following warnings
> from all cluster nodes, including the coordinator node:
>
> [2019-09-27T15:00:17,727][WARN
> ][exchange-worker-#219][GridDhtPartitionsExchangeFuture] Unable to await
> partitions release latch within timeout: ServerLatch [permits=1,
> pendingAcks=[3561ac09-6752-4e2e-8279-d975c268d045], super=CompletableLatch
> [id=exchange, topVer=AffinityTopologyVersion [topVer=92, minorTopVer=2]]]
>
> After a while, all client nodes seemed to be disconnected from the cluster,
> with no logs on the clients' side.
>
> The coordinator node has many log entries like:
> [2019-09-27T15:00:03,124][WARN
> ][sys-#337823][GridDhtPartitionsExchangeFuture] Partition states validation
> has failed for group: acc_1306acd07be78000_userPriceDrop. Partitions cache
> sizes are inconsistent for Part 129:
> [9497f1c4-13bd-4f90-bbf7-be7371cea22f=757
> 1486cd47-7d40-400c-8e36-b66947865602=2427 ] Part 138:
> [1486cd47-7d40-400c-8e36-b66947865602=2463
> f9cf594b-24f2-4a91-8d84-298c97eb0f98=736 ] Part 156:
> [b7782803-10da-45d8-b042-b5b4a880eb07=672
> 9f0c2155-50a4-4147-b444-5cc002cf6f5d=2414 ] Part 284:
> [b7782803-10da-45d8-b042-b5b4a880eb07=690
> 1486cd47-7d40-400c-8e36-b66947865602=1539 ] Part 308:
> [1486cd47-7d40-400c-8e36-b66947865602=2401
> 7750e2f1-7102-4da2-9a9d-ea202f73905a=706 ] Part 362:
> [1486cd47-7d40-400c-8e36-b66947865602=2387
> 7750e2f1-7102-4da2-9a9d-ea202f73905a=697 ] Part 434:
> [53c253e1-ccbe-4af1-a3d6-178523023c8b=681
> 1486cd47-7d40-400c-8e36-b66947865602=1541 ] Part 499:
> [1486cd47-7d40-400c-8e36-b66947865602=2505
> 7750e2f1-7102-4da2-9a9d-ea202f73905a=699 ] Part 622:
> [1486cd47-7d40-400c-8e36-b66947865602=2436
> e97a0f3f-3175-49f7-a476-54eddd59d493=662 ] Part 662:
> [b7782803-10da-45d8-b042-b5b4a880eb07=686
> 1486cd47-7d40-400c-8e36-b66947865602=2445 ] Part 699:
> [1486cd47-7d40-400c-8e36-b66947865602=2427
> f9cf594b-24f2-4a91-8d84-298c97eb0f98=646 ] Part 827:
> [62a05754-3f3a-4dc8-b0fa-53c0a0a0da63=703
> 1486cd47-7d40-400c-8e36-b66947865602=1549 ] Part 923:
> [1486cd47-7d40-400c-8e36-b66947865602=2434
> a9e9eaba-d227-4687-8c6c-7ed522e6c342=706 ] Part 967:
> [62a05754-3f3a-4dc8-b0fa-53c0a0a0da63=673
> 1486cd47-7d40-400c-8e36-b66947865602=1595 ] Part 976:
> [33301384-3293-417f-b94a-ed36ebc82583=666
> 1486cd47-7d40-400c-8e36-b66947865602=2384 ]
>
> The coordinator's log and the log of one of the cluster nodes are attached.
> coordinator_log.gz
> <
> http://apache-ignite-users.70518.x6.nabble.com/file/t2515/coordinator_log.gz>
>
> cluster_node_log.gz
> <
> http://apache-ignite-users.70518.x6.nabble.com/file/t2515/cluster_node_log.gz>
>
>
> Any help/comment is appreciated.
>
> Thanks.
>
>
>
>
>
> -----
> İbrahim Halil Altun
> Senior Software Engineer @ Segmentify
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
