Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-09-11 Thread wangsan
Yes, it was blocked when doing cache operations in discovery event listeners while node-left events arrived concurrently. I now do the cache operation in another thread, so the listener is no longer blocked. The original cause may be that the discovery event processor holds the server latch. when do cache op
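
The workaround described above (hand the cache update to another thread and let the listener return at once) can be sketched as follows. This is a minimal illustration using only java.util.concurrent, not the code from the thread: the class OffloadingListenerSketch, the method onNodeLeft, and the map fakeCache are hypothetical stand-ins, and no real Ignite listener or IgniteCache is involved.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Sketch: a listener that offloads its work instead of blocking. */
class OffloadingListenerSketch {
    // Hypothetical stand-in for a cache; a real IgniteCache is not used here.
    static final Map<String, String> fakeCache = new ConcurrentHashMap<>();

    /**
     * Simulated "node left" listener. It only submits the update to the
     * worker and returns; it never calls get() on the returned future.
     */
    static Future<?> onNodeLeft(ExecutorService worker, String nodeId) {
        return worker.submit(() -> { fakeCache.put(nodeId, "LEFT"); });
    }

    /** Runs the listener once and reads the result from a non-listener thread. */
    static String demo() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<?> pending = onNodeLeft(worker, "node-b");
            // Waiting is acceptable here because this is NOT the listener thread.
            pending.get(5, TimeUnit.SECONDS);
            return fakeCache.get("node-b");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that the listener never waits on the submitted work. In a real cluster the submitted task would run only once the exchange allows it, as the replies further down in this thread explain.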

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-09-07 Thread eugene miretsky
Hi Wangsan, so what was the original cause of the issue? Was it blocking the listener thread in your test code, or something else? We are having similar issues. Cheers, Eugene On Mon, Sep 3, 2018 at 1:23 PM Ilya Kasnacheev wrote: > Hello! > > The operation will execute after partition map excha

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-09-03 Thread Ilya Kasnacheev
Hello! The operation will execute after the partition map exchange (or possibly several of them). Just be sure to avoid waiting on the operation from the discovery event listener. Regards, -- Ilya Kasnacheev Mon, 3 Sep 2018 at 17:37, wangsan : > Thanks! > > Can I do cache operations(update cache item) in an

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-09-03 Thread wangsan
Thanks! Can I do cache operations (updating a cache item) in another thread started from discovery event listeners? And will the operation (updating a cache item) execute concurrently with the partition map exchange, or before it? -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-08-31 Thread Ilya Kasnacheev
Hello! I'm fairly confident that you should not attempt to do any cache operations from discovery threads, e.g. from event listeners. When the topology changes, you can't expect any cache operations to be performed before the partition map exchange is done. Regards, -- Ilya Kasnacheev Wed, 29 Aug 2018

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-08-28 Thread wangsan
I can reproduce the bug. The log above is what the server (first) node prints when I stop the other nodes. import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; import java.util.concurrent.CompletableFuture; import java.util.concurrent.ExecutorService; import java.util.concurrent.E

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-08-28 Thread Ilya Kasnacheev
Hello! Please check that there are no problems with connectivity in your cluster, i.e. that all nodes can open communication and discovery connections to all other nodes. From what I observe in the log, there are massive problems with cluster stability: 23:48:44.624 [tcp-disco-sock-reader-#48%te

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-08-27 Thread wangsan wang
About question 2, at debug level it looks like this: I start node a, then nodes b, c, d, e, f from multiple threads, then close them all. In the debug logs, a server latch is created with participantsSize=5 but receives only one countdown, so the latch hangs. Simplified logs: >>> ++ > >>> Topology snapshot. > >
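
The symptom reported above (a latch created for 5 participants that receives only one countdown) can be reproduced in miniature with the JDK's java.util.concurrent.CountDownLatch. This is only an analogy under stated assumptions: it is not Ignite's internal exchange latch, and the class LatchHangSketch and the method awaitWithAcks are hypothetical names introduced for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Sketch: a latch that is never fully counted down never releases its waiter. */
class LatchHangSketch {
    /**
     * Creates a latch for the given number of participants, applies the given
     * number of acknowledgements, then waits up to timeoutMillis.
     * Returns true if the latch opened, false if the wait timed out.
     */
    static boolean awaitWithAcks(int participants, int acks, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(participants);
        for (int i = 0; i < acks; i++) {
            latch.countDown();
        }
        try {
            // A bounded wait is used so the demo ends; an unbounded await() would hang.
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitWithAcks(5, 1, 200)); // one ack out of five
        System.out.println(awaitWithAcks(5, 5, 200)); // all five acks
    }
}
```

With one acknowledgement out of five the bounded wait times out, which matches the hang described in the message. With all five it returns immediately.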

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-08-27 Thread Ilya Kasnacheev
Hello! 1. As far as I understand, Apache Ignite has no OOM handling that would guarantee the cluster does not crash, so you should be extra careful with that. This is because after an OOM a node doesn't have a chance to quit gracefully. Maybe other nodes will be able to eve

ignte cluster hang with GridCachePartitionExchangeManager

2018-08-24 Thread wangsan
Now my cluster topology is nodes a, b, c, d, all with persistence enabled and peerclassloader set to false. b, c, d have a different class (cache b) from a. 1. When any node crashes with OOM (memory or stack), all nodes hang with " - Still waiting for initial partition map exchange ". 2. When a start first, b,c,d start