Hi. We encountered the same situation again. This time we got the following error:

12/Apr/2017 10:08:17 ERROR 72832749 exchange-worker-#199%null% org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture(L:495) - Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=46, minorTopVer=45], nodeId=172e4264, evt=DISCOVERY_CUSTOM_EVT]
java.lang.NullPointerException
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.initStartedCacheOnCoordinator(CacheAffinitySharedManager.java:747)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onCacheChangeRequest(CacheAffinitySharedManager.java:413)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onCacheChangeRequest(GridDhtPartitionsExchangeFuture.java:571)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:454)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1670)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:745)

12/Apr/2017 10:08:18 ERROR 72833126 exchange-worker-#199%null% org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager(L:495) - Runtime error caught during grid runnable execution: GridWorker name=partition-exchanger, gridName=null, finished=false, isCancelled=false, hashCode=1466668876, interrupted=false, runner=exchange-worker-#199%null%
java.lang.NullPointerException
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.partitionMap(GridDhtPartitionTopologyImpl.java:973)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.createPartitionsFullMessage(GridCachePartitionExchangeManager.java:855)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.createPartitionsMessage(GridDhtPartitionsExchangeFuture.java:966)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendAllPartitions(GridDhtPartitionsExchangeFuture.java:977)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendAllPartitions(GridDhtPartitionsExchangeFuture.java:1313)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceive(GridDhtPartitionsExchangeFuture.java:1142)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$7.apply(GridCachePartitionExchangeManager.java:1295)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$7.apply(GridCachePartitionExchangeManager.java:1292)
    at org.apache.ignite.internal.util.future.GridFutureAdapter$ArrayListener.apply(GridFutureAdapter.java:456)
    at org.apache.ignite.internal.util.future.GridFutureAdapter$ArrayListener.apply(GridFutureAdapter.java:439)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:271)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:259)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:389)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:355)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1053)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:88)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:343)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:512)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1670)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:745)

After that, the exchange-worker thread dies and all put and get operations block.

1 - What does the above error mean? When does it happen?
2 - Is there a way to avoid or recover from that state?

Regards.

On Thu, Apr 6, 2017 at 5:50 PM, Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:

> Hi Alper,
>
> I see no starvation here. It looks like WriteBehindStore is waiting for its
> queue to be flushed.
> A thread dump is needed to understand whether the Flusher threads are also
> waiting for something.
>
> On Thu, Apr 6, 2017 at 4:40 PM, Alper Tekinalp <al...@evam.com> wrote:
>
>> Hi Andrey.
>>
>> The Ignite logs are attached.
>>
>> The interruption exception is at 30/Mar/2017 15:56:03. The starvation log
>> is on the second node.
>>
>> I do not have a thread dump. I can provide one if the problem repeats.
>>
>> Thanks for your help!
>>
>> On Thu, Apr 6, 2017 at 4:27 PM, Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
>>
>>> Hi Alper,
>>>
>>> Would you please provide a full thread dump and the full log?
>>>
>>> On Thu, Apr 6, 2017 at 4:08 PM, nragon <nuno.goncalves@wedotechnologies.com> wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>> Please note that the same Flink job runs at around 30k/s without Ignite;
>>>> with the Ignite get method it drops to 2k/s. If you don't mind, please
>>>> take a look at
>>>> http://apache-ignite-users.70518.x6.nabble.com/Client-near-cache-with-Apache-Flink-td11627.html.
>>>> I only said it was related because the thread dump Alper mentioned
>>>> seemed alike and because his client/server architecture is also very
>>>> similar to mine.
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-ignite-users.70518.x6.nabble.com/Client-got-stucked-on-get-operation-tp11313p11780.html
>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>>>
>>>
>>> --
>>> Best regards,
>>> Andrey V. Mashenkov
>>>
>>
>> --
>> Alper Tekinalp
>>
>> Software Developer
>> Evam Streaming Analytics
>>
>> Atatürk Mah. Turgut Özal Bulv.
>> Gardenya 5 Plaza K:6 Ataşehir
>> 34758 İSTANBUL
>>
>> Tel: +90 216 455 01 53 Fax: +90 216 455 01 54
>> www.evam.com.tr
>> <http://www.evam.com>
>>
>
> --
> Best regards,
> Andrey V. Mashenkov
>

--
Alper Tekinalp

Software Developer
Evam Streaming Analytics

Atatürk Mah. Turgut Özal Bulv.
Gardenya 5 Plaza K:6 Ataşehir
34758 İSTANBUL

Tel: +90 216 455 01 53 Fax: +90 216 455 01 54
www.evam.com.tr
<http://www.evam.com>