[ https://issues.apache.org/jira/browse/IGNITE-21478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luchnikov Alexander updated IGNITE-21478:
-----------------------------------------
    Description:
Use cases:
1) Frequent entry/exit of a thick client into the topology leads to a crash of the server node due to OOM.
2) Frequent creation and destruction of caches leads to a server node crash due to OOM.

topVer=20098
Part of the log before the OOM crash; pay attention to *topVer=20098*
{code:java}
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=f080abcd, uptime=3 days, 09:00:55.274]
    ^-- Cluster [hosts=4, CPUs=6, servers=2, clients=2, topVer=20098, minorTopVer=6]
    ^-- Network [addrs=[192.168.1.2, 127.0.0.1], discoPort=47500, commPort=47100]
    ^-- CPU [CPUs=2, curLoad=86.83%, avgLoad=21.9%, GC=23.9%]
    ^-- Heap [used=867MB, free=15.29%, comm=1024MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=7, qSize=0]
    ^-- System thread pool [active=0, idle=8, qSize=0]
    ^-- Striped thread pool [active=0, idle=8, qSize=0]
{code}
Histogram from the heap dump taken after the node failed
!histo.png!

*MinorTop example*
{code:java}
@Test
public void testMinorVer() throws Exception {
    // One server node and one thick client node.
    Ignite server = startGrids(1);
    IgniteEx client = startClientGrid();

    String cacheName = "cacheName";

    // Repeatedly create and destroy the same cache from the client.
    for (int i = 0; i < 500; i++) {
        client.getOrCreateCache(cacheName);
        client.destroyCache(cacheName);
    }

    // Keep the JVM alive long enough to take a heap dump.
    System.err.println("Heap dump time");
    Thread.sleep(1000000);
}
{code}
{code:java}
[INFO ][exchange-worker-#149%internal.IgniteOomTest%][GridCachePartitionExchangeManager] AffinityTopologyVersion [topVer=2, minorTopVer=1000], evt=DISCOVERY_CUSTOM_EVT, evtNode=52b4c130-1a01-4858-813a-ebc8a5dabf1e, client=true]
{code}

> OOM crash with unstable topology
> --------------------------------
>
>                 Key: IGNITE-21478
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21478
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Luchnikov Alexander
>            Priority: Minor
>              Labels: ise
>         Attachments: HistoMinorTop.png, histo.png
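
The relationship between the 500-iteration test and the logged minorTopVer=1000 can be illustrated with a small counting model. This is only a sketch, not Ignite code: the class TopologyVersionModel and its methods are hypothetical. It assumes that each node join or leave increments topVer and resets minorTopVer, and that each cache create and each cache destroy increments minorTopVer once, which is consistent with the log line in the MinorTop example.

```java
/**
 * Hypothetical, simplified model of how the (topVer, minorTopVer) pair advances.
 * It is not part of Ignite and does not reproduce the OOM itself.
 */
final class TopologyVersionModel {
    private long topVer;
    private int minorTopVer;

    /** Assumption: a node joining or leaving starts a new major version and resets the minor one. */
    void nodeJoinedOrLeft() {
        topVer++;
        minorTopVer = 0;
    }

    /** Assumption: each cache start or stop is one custom discovery event that bumps the minor version. */
    void cacheStartedOrStopped() {
        minorTopVer++;
    }

    long topVer() {
        return topVer;
    }

    int minorTopVer() {
        return minorTopVer;
    }

    public static void main(String[] args) {
        TopologyVersionModel model = new TopologyVersionModel();

        model.nodeJoinedOrLeft(); // server node starts
        model.nodeJoinedOrLeft(); // thick client joins

        // Mirrors the loop in testMinorVer(): one create and one destroy per iteration.
        for (int i = 0; i < 500; i++) {
            model.cacheStartedOrStopped(); // getOrCreateCache
            model.cacheStartedOrStopped(); // destroyCache
        }

        // Prints: topVer=2, minorTopVer=1000
        System.out.println("topVer=" + model.topVer() + ", minorTopVer=" + model.minorTopVer());
    }
}
```

Under the same assumptions, the first use case (a thick client repeatedly entering and leaving) would advance topVer by two per cycle, which is one plausible way to reach a value such as topVer=20098 within a few days of uptime.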