[jira] [Updated] (IGNITE-21478) OOM crash with unstable topology

Luchnikov Alexander (Jira) Wed, 07 Feb 2024 00:45:23 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-21478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Luchnikov Alexander updated IGNITE-21478:
-----------------------------------------
    Description: 
Use cases:
1) A thick client frequently joining and leaving the topology leads to a crash
of the server node due to OOM.
2) Frequent creation and destruction of caches leads to a server node crash
due to OOM.

Part of the log before the OOM crash; note *topVer=20098*:
{code:java}
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=f080abcd, uptime=3 days, 09:00:55.274]
    ^-- Cluster [hosts=4, CPUs=6, servers=2, clients=2, topVer=20098, minorTopVer=6]
    ^-- Network [addrs=[192.168.1.2, 127.0.0.1], discoPort=47500, commPort=47100]
    ^-- CPU [CPUs=2, curLoad=86.83%, avgLoad=21.9%, GC=23.9%]
    ^-- Heap [used=867MB, free=15.29%, comm=1024MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=7, qSize=0]
    ^-- System thread pool [active=0, idle=8, qSize=0]
    ^-- Striped thread pool [active=0, idle=8, qSize=0]
{code}
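
To spot this pattern in logs, the fields of interest can be pulled from a metrics snapshot with a few regular expressions. The following is only a minimal sketch using the plain JDK: the class name MetricsSnapshot, its method names, and the thresholds passed to underMemoryPressure are hypothetical and are not part of Ignite.

```java
import java.util.OptionalDouble;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an Ignite class): extracts a few fields from the
// "Metrics for local node" log snapshot shown above.
final class MetricsSnapshot {
    // Case-sensitive, so "minorTopVer=" is not matched by mistake.
    private static final Pattern TOP_VER = Pattern.compile("\\btopVer=(\\d+)");
    private static final Pattern HEAP_FREE = Pattern.compile("Heap \\[[^\\]]*free=([0-9.]+)%");
    private static final Pattern GC = Pattern.compile("\\bGC=([0-9.]+)%");

    static OptionalLong topVer(String log) {
        Matcher m = TOP_VER.matcher(log);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    static OptionalDouble heapFreePct(String log) {
        return percent(HEAP_FREE, log);
    }

    static OptionalDouble gcPct(String log) {
        return percent(GC, log);
    }

    private static OptionalDouble percent(Pattern p, String log) {
        Matcher m = p.matcher(log);
        return m.find() ? OptionalDouble.of(Double.parseDouble(m.group(1))) : OptionalDouble.empty();
    }

    // Thresholds are supplied by the caller; the values used in main are
    // illustrative assumptions, not recommendations.
    static boolean underMemoryPressure(String log, double minFreePct, double maxGcPct) {
        OptionalDouble free = heapFreePct(log);
        OptionalDouble gc = gcPct(log);
        return free.isPresent() && gc.isPresent()
            && free.getAsDouble() < minFreePct
            && gc.getAsDouble() > maxGcPct;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
            "    ^-- Cluster [hosts=4, CPUs=6, servers=2, clients=2, topVer=20098, minorTopVer=6]",
            "    ^-- CPU [CPUs=2, curLoad=86.83%, avgLoad=21.9%, GC=23.9%]",
            "    ^-- Heap [used=867MB, free=15.29%, comm=1024MB]");

        System.out.println("topVer=" + topVer(log).getAsLong());
        System.out.println("heapFreePct=" + heapFreePct(log).getAsDouble());
        System.out.println("gcPct=" + gcPct(log).getAsDouble());
        System.out.println("pressure=" + underMemoryPressure(log, 20.0, 10.0));
    }
}
```

Tracking topVer together with free heap over successive snapshots would show whether heap usage grows as the topology version grows, which is the correlation this issue points at.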

Histogram from the heap dump taken after the node failed (see attachment histo.png)



> OOM crash with unstable topology
> --------------------------------
>
>                 Key: IGNITE-21478
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21478
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Luchnikov Alexander
>            Priority: Minor
>         Attachments: histo.png
>
>
> Use cases:
> 1) A thick client frequently joining and leaving the topology leads to a
> crash of the server node due to OOM.
> 2) Frequent creation and destruction of caches leads to a server node crash
> due to OOM.
> Part of the log before the OOM crash; note *topVer=20098*:
> {code:java}
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=f080abcd, uptime=3 days, 09:00:55.274]
>     ^-- Cluster [hosts=4, CPUs=6, servers=2, clients=2, topVer=20098, minorTopVer=6]
>     ^-- Network [addrs=[192.168.1.2, 127.0.0.1], discoPort=47500, commPort=47100]
>     ^-- CPU [CPUs=2, curLoad=86.83%, avgLoad=21.9%, GC=23.9%]
>     ^-- Heap [used=867MB, free=15.29%, comm=1024MB]
>     ^-- Outbound messages queue [size=0]
>     ^-- Public thread pool [active=0, idle=7, qSize=0]
>     ^-- System thread pool [active=0, idle=8, qSize=0]
>     ^-- Striped thread pool [active=0, idle=8, qSize=0]
> {code}
> Histogram from the heap dump taken after the node failed (see attachment histo.png)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
