[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814374#comment-16814374 ] Dmitriy Pavlov commented on IGNITE-9305: [~Pavlukhin], I think no, because tickets are slightly different. If we will agree that Page Memory regions should not be used, then IGNITE-5583 may be closed as won't fix > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813256#comment-16813256 ] Ivan Pavlukhin commented on IGNITE-9305: [~xtern], [~dpavlov], [~dmagda], can we close a ticket IGNITE-5583 linked as a duplicate? > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629149#comment-16629149 ] ASF GitHub Bot commented on IGNITE-9305: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4593 > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627539#comment-16627539 ] Dmitriy Pavlov commented on IGNITE-9305: Thank you, [~xtern], it seems it is not related to the fix. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627537#comment-16627537 ] Pavel Pereslegin commented on IGNITE-9305: -- [~dpavlov], I updated test results, but it seems like the "MVCC Queries" suite is flaky. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627532#comment-16627532 ] Ignite TC Bot commented on IGNITE-9305: --- {panel:title=Possible Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1} {color:#d04437}MVCC Queries{color} [[tests 3|https://ci.ignite.apache.org/viewLog.html?buildId=1936881]] * IgniteCacheMvccSqlTestSuite: CacheMvccPartitionedSqlCoordinatorFailoverTest.testReadInProgressCoordinatorFailsSimple_FromClient - 0,0% fails in last 100 master runs. * IgniteCacheMvccSqlTestSuite: CacheMvccReplicatedSqlCoordinatorFailoverTest.testReadInProgressCoordinatorFailsSimple_FromClient - 0,0% fails in last 100 master runs. * IgniteCacheMvccSqlTestSuite: CacheMvccReplicatedSqlTxQueriesWithReducerTest.testQueryReducerDeadlockInsert - 0,0% fails in last 100 master runs. {panel} [TeamCity Run All|http://ci.ignite.apache.org/viewLog.html?buildId=1936882buildTypeId=IgniteTests24Java8_RunAll] > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625680#comment-16625680 ] Dmitriy Pavlov commented on IGNITE-9305: [~xtern] could you please include Ignite TC Bot visa to the ticket? > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617350#comment-16617350 ] Pavel Pereslegin commented on IGNITE-9305: -- [~dmagda], [~dpavlov] patch is ready for merge, tests look good. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613277#comment-16613277 ] Aleksey Plekhanov commented on IGNITE-9305: --- [~xtern], looks good to me. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604030#comment-16604030 ] Pavel Pereslegin commented on IGNITE-9305: -- [~alex_pl], review these changes. please. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603733#comment-16603733 ] Denis Magda commented on IGNITE-9305: - [~xtern], thanks, let's merge the changes then after someone from the community reviews them. -- Denis > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602417#comment-16602417 ] Pavel Pereslegin commented on IGNITE-9305: -- [~dmagda], there is no significant performance impact, we add reading of metrics offHeapSize and totalAllocatedPages, which are always enabled for each data region regardless metricsEnabled flag. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599333#comment-16599333 ] Denis Magda commented on IGNITE-9305: - [~xtern], The format looks good to me. Let's type "size=uknown" for "metastoreMemPlc" persistence region and create a ticket to see how this can be fixed in the future. Btw, would there be any significant performance impact if we print out these detailed metrics periodically? Did you need to enable resources intensive metrics collection? In my understanding, there should be no impact. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598940#comment-16598940 ] Pavel Pereslegin commented on IGNITE-9305: -- Hello [~dmagda]. I changed format as you suggested, here an example of output: {noformat} ^-- H/N/C [hosts=1, nodes=2, CPUs=8] ^-- CPU [cur=7.8%, avg=11.65%, GC=0%] ^-- PageMemory [pages=7447] ^-- Heap [used=114MB, free=96.78%, comm=242MB] ^-- Off-heap [used=29MB, free=87.2%, comm=230MB] ^-- sysMemPlc region [used=0MB, free=99.98%, comm=100MB] ^-- default region [used=29MB, free=2.1%, comm=30MB] ^-- metastoreMemPlc region [used=0MB, free=99.96%, comm=100MB] ^-- Ignite persistence [used=35MB] ^-- sysMemPlc region [used=0MB] ^-- default region [used=35MB] ^-- metastoreMemPlc region [used=0MB] ^-- Outbound messages queue [size=0] ^-- Public thread pool [active=0, idle=6, qSize=0] ^-- System thread pool [active=0, idle=6, qSize=0] ^-- Custom executor 0 [active=0, idle=0, qSize=0] ^-- Custom executor 1 [active=0, idle=0, qSize=0] {noformat} But there is a problem with tracking allocated pages (disk usage) for "metastoreMemPlc" persistence region (metastore). Total allocated pages metric is always zero, because this region is recreated after FilePageStore was created. I see the following options: # Output actual value of this metric (don't change anything here) and create separate ticket for the problem. # Exclude "metastoreMemPlc" region from persistence regions in log output. # Output only those persistence regions for which metrics are enabled. Any thoughts? > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596958#comment-16596958 ] Denis Magda commented on IGNITE-9305: - [~xtern], [~NSAmelchev], Let's use a little bit modified format suggested by Nikita. {noformat} ^-- H/N/C [hosts=1, nodes=2, CPUs=8] ^-- CPU [cur=15.73%, avg=0%, GC=0%] ^-- PageMemory [pages=25] ^-- Heap [used=71MB, free=97.99%, comm=203MB] ^-- Off-heap [used=0MB, free=99.96%, comm=220MB]: ^-- sysMemPlc region [used=0MB, free=99.98%, comm=100MB] ^-- default region [used=0MB, free=99.8%, comm=20MB] ^-- metastoreMemPlc region [used=0MB, free=99.96%, comm=100MB] ^-- Ignite persistence [used=0MB] ^-- defaul region [used=XMB] ^-- {Name} region [used=XMB] ^-- {Name} region [used=XMB] ^-- Outbound messages queue [size=0] ^-- Public thread pool [active=0, idle=6, qSize=0] ^-- System thread pool [active=0, idle=6, qSize=0] {noformat} We show the total for off-heap and persistence and also do a breakdown per region. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596303#comment-16596303 ] Amelchev Nikita commented on IGNITE-9305: - Hello, [~xtern], [~dmagda], I suggest more readable format for off-heap data regions: {noformat} ^-- H/N/C [hosts=1, nodes=2, CPUs=8] ^-- CPU [cur=15.73%, avg=0%, GC=0%] ^-- PageMemory [pages=25] ^-- Heap [used=71MB, free=97.99%, comm=203MB] ^-- Off-heap [used=0MB, free=99.96%, comm=220MB] ^-- Off-heap data regions: ^-- sysMemPlc [used=0MB, free=99.98%, comm=100MB] ^-- default [used=0MB, free=99.8%, comm=20MB] ^-- metastoreMemPlc [used=0MB, free=99.96%, comm=100MB] ^-- Ignite persistence region: default [used=0MB] ^-- Outbound messages queue [size=0] ^-- Public thread pool [active=0, idle=6, qSize=0] ^-- System thread pool [active=0, idle=6, qSize=0] {noformat} Any thoughts? > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594258#comment-16594258 ] Pavel Pereslegin commented on IGNITE-9305: -- [~dmagda], is this format ok? > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591363#comment-16591363 ] Pavel Pereslegin commented on IGNITE-9305: -- [~dmagda], I changed format and off heap metrics look not very well readable: {noformat} ^-- H/N/C [hosts=1, nodes=2, CPUs=8] ^-- CPU [cur=2.73%, avg=1.6%, GC=0%] ^-- PageMemory [pages=34] ^-- Heap [used=71MB, free=97.99%, comm=204MB] ^-- Off-heap [used=0MB, free=99.94%, comm=220MB] ^-- Off-heap sysMemPlc region [used=0MB, free=99.98%, comm=100MB] ^-- Off-heap default region [used=0MB, free=99.62%, comm=20MB] ^-- Off-heap metastoreMemPlc region [used=0MB, free=99.96%, comm=100MB] ^-- Ignite persistence default region [used=0MB] ^-- Outbound messages queue [size=0] ^-- Public thread pool [active=0, idle=6, qSize=0] ^-- System thread pool [active=0, idle=7, qSize=0] ^-- Custom executor 0 [active=0, idle=0, qSize=0] ^-- Custom executor 1 [active=0, idle=0, qSize=0] {noformat} May be change format to something like this? {noformat} Data region: {name}, off-heap [{params}] Ignite persistence region: {name}, disk [{params}] {noformat} Example of such output: {noformat} ^-- H/N/C [hosts=1, nodes=2, CPUs=8] ^-- CPU [cur=15.73%, avg=0%, GC=0%] ^-- PageMemory [pages=25] ^-- Heap [used=71MB, free=97.99%, comm=203MB] ^-- Off-heap [used=0MB, free=99.96%, comm=220MB] ^-- Data region: sysMemPlc, off-heap [used=0MB, free=99.98%, comm=100MB] ^-- Data region: default, off-heap [used=0MB, free=99.8%, comm=20MB] ^-- Data region: metastoreMemPlc, off-heap [used=0MB, free=99.96%, comm=100MB] ^-- Ignite persistence region: default, disk [used=0MB] ^-- Outbound messages queue [size=0] ^-- Public thread pool [active=0, idle=6, qSize=0] ^-- System thread pool [active=0, idle=6, qSize=0] ^-- Custom executor 0 [active=0, idle=0, qSize=0] ^-- Custom executor 1 [active=0, idle=0, qSize=0] {noformat} > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590964#comment-16590964 ] Denis Magda commented on IGNITE-9305: - [~xtern], That's a good point. Let's show the aggregated information first and the breakdown by specific regions as a sublist. Plus please add "region" to the region names. For instance, "default" -> "default region", "systemMemPlc" -> "systemMemPlc region" > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589960#comment-16589960 ] Pavel Pereslegin commented on IGNITE-9305: -- Hi [~dmagda], Do I understand correctly that we don't logging aggregated information about off-heap usage? Example of such output: {noformat} ^-- H/N/C [hosts=1, nodes=2, CPUs=8] ^-- CPU [cur=2.57%, avg=4.44%, GC=0%] ^-- PageMemory [pages=34] ^-- Heap [used=130MB, free=96.31%, comm=244MB] ^-- Off-heap sysMemPlc [used=0MB, free=99.98%, comm=100MB] ^-- Off-heap default [used=0MB, free=99.62%, comm=20MB] ^-- Off-heap metastoreMemPlc [used=0MB, free=99.96%, comm=100MB] ^-- Ignite persistence default [used=0MB] ^-- Outbound messages queue [size=0] ^-- Public thread pool [active=0, idle=6, qSize=0] ^-- System thread pool [active=0, idle=7, qSize=0] ^-- Custom executor 0 [active=0, idle=0, qSize=0] ^-- Custom executor 1 [active=0, idle=0, qSize=0] {noformat} > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9305) Wrong off-heap size is reported for a node
[ https://issues.apache.org/jira/browse/IGNITE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588967#comment-16588967 ] ASF GitHub Bot commented on IGNITE-9305: GitHub user xtern opened a pull request: https://github.com/apache/ignite/pull/4593 IGNITE-9305 You can merge this pull request into a Git repository by running: $ git pull https://github.com/xtern/ignite IGNITE-9305 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4593.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4593 commit bd7a871d1cc20101bfba68fb05d07cd4f4c31206 Author: pereslegin-pa Date: 2018-08-22T12:19:34Z IGNITE-9305 Non heap mem metrics logging. commit bf8eb41ed66363adffaec56e86c828f6bb14 Author: pereslegin-pa Date: 2018-08-22T13:42:14Z IGNITE-9305 Log mem metrics separately for each region. commit 5014b414969689152dc6fe4d6ea268c17fb8d543 Author: pereslegin-pa Date: 2018-08-22T14:52:01Z IGNITE-9305 Persistence region metrics log format check. > Wrong off-heap size is reported for a node > -- > > Key: IGNITE-9305 > URL: https://issues.apache.org/jira/browse/IGNITE-9305 > Project: Ignite > Issue Type: Task >Affects Versions: 2.6 >Reporter: Denis Magda >Assignee: Pavel Pereslegin >Priority: Blocker > Fix For: 2.7 > > > Was troubleshooting an Ignite deployment today and couldn't find out from the > logs what was the actual off-heap space used. > Those were the given memory resoures (Ignite 2.6): > {code} > [2018-08-16 15:07:49,961][INFO ][main][GridDiscoveryManager] Topology > snapshot [ver=1, servers=1, clients=0, CPUs=64, offheap=30.0GB, heap=24.0GB] > {code} > And that weird stuff was reported by the node (pay attention to the last > line): > {code} > [2018-08-16 15:45:50,211][INFO > ][grid-timeout-worker-#135%cluster_31-Dec-2017%][IgniteKernal%cluster_31-Dec-2017] > > Metrics for local node (to disable set 'metricsLogFrequency' to 0) > ^-- Node [id=c033026e, name=cluster_31-Dec-2017, uptime=00:38:00.257] > ^-- H/N/C [hosts=1, nodes=1, CPUs=64] > ^-- CPU [cur=0.03%, avg=5.54%, GC=0%] > ^-- PageMemory [pages=6997377] > ^-- Heap [used=9706MB, free=61.18%, comm=22384MB] > ^-- Non heap [used=144MB, free=-1%, comm=148MB] - this line is always the > same! > {code} > Had to change the code by using > {code}dataRegion.getPhysicalMemoryPages(){code} to find out that actual > off-heap usage size was > {code} > >>> Physical Memory Size: 28651614208 => 27324 MB, 26 GB > {code} > The logs have to report the following instead: > {code} > ^-- Off-heap {Data Region 1} [used={dataRegion1.getPhysicalMemorySize()}, > free=X%, comm=dataRegion1.maxSize()] > ^-- Off-heap {Data Region 2} [used={dataRegion2.getPhysicalMemorySize()}, > free=X%, comm=dataRegion2.maxSize()] > {code} > If Ignite persistence is enabled then the following extra lines have to be > added to see the disk used space: > {code} > ^-- Ignite persistence {Data Region 1}: > used={dataRegion1.getTotalAllocatedSize() - > dataRegion1.getPhysicalMemorySize()} > ^-- Ignite persistence {Data Region 2} > [used={dataRegion2.getTotalAllocatedSize() - > dataRegion2.getPhysicalMemorySize()}] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)