RE: High cpu on ignite server nodes
I don't have much to suggest beyond reducing or rebalancing the on-heap cache size. If you keep a lot of objects on heap, you obviously need a large heap, and if you constantly add and remove on-heap objects, you create a lot of work for the GC. Perhaps you can review your architecture and avoid on-heap caching altogether, to minimize the impact of Java GC.

Stan
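[For reference, a minimal sketch of what keeping this cache purely off-heap looks like in an Ignite 2.x CacheConfiguration; the cache name is taken from later in this thread, and the rest is illustrative, not the poster's actual config:]

import org.apache.ignite.configuration.CacheConfiguration;

public class OffHeapOnlyCacheConfig {
    public static CacheConfiguration<String, Object> create() {
        CacheConfiguration<String, Object> cfg =
            new CacheConfiguration<>("playerSessionInfoCacheIgnite");
        // On-heap caching is opt-in in Ignite 2.x; leaving it disabled keeps
        // entries in off-heap page memory, out of reach of the Java GC.
        cfg.setOnheapCacheEnabled(false);
        return cfg;
    }
}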
RE: High cpu on ignite server nodes
Hi Stan,

Thanks for your analysis. We have increased the on-heap cache size (50) and added an expiry policy (30 mins); a configuration sketch of this expiry policy follows the GC log below. The expiry policy is expiring entries, so the cache never reaches its max size. But now we see high heap usage, and because of it GCs are happening frequently. A full GC happened only once in two days; after that no further full GC occurred, only frequent minor GCs. Heap usage stays above 59% on all nodes and climbs to 94% within 40 to 60 minutes; once a GC happens it comes back down to about 60%. Following are the GC logs.

Desired survivor size 10485760 bytes, new threshold 1 (max 15) [PSYoungGen: 2086889K->10213K(2086912K)] 5665084K->3680259K(6281216K), 0.0704050 secs] [Times: user=0.54 sys=0.00, real=0.07 secs]
2018-06-22T09:53:38.873-0400: 374604.772: Total time for which application threads were stopped: 0.0794010 seconds
2018-06-22T09:55:00.332-0400: 374686.231: Total time for which application threads were stopped: 0.0084890 seconds
2018-06-22T09:55:00.340-0400: 374686.239: Total time for which application threads were stopped: 0.0075450 seconds
2018-06-22T09:55:00.348-0400: 374686.247: Total time for which application threads were stopped: 0.0078560 seconds
2018-06-22T09:55:26.847-0400: 374712.746: Total time for which application threads were stopped: 0.0090060 seconds
2018-06-22T10:00:26.857-0400: 375012.756: Total time for which application threads were stopped: 0.0105490 seconds
2018-06-22T10:02:48.740-0400: 375154.639: Total time for which application threads were stopped: 0.0093160 seconds
2018-06-22T10:02:48.748-0400: 375154.647: Total time for which application threads were stopped: 0.000 seconds
2018-06-22T10:02:48.757-0400: 375154.656: Total time for which application threads were stopped: 0.0092110 seconds
2018-06-22T10:05:26.867-0400: 375312.766: Total time for which application threads were stopped: 0.0098100 seconds
2018-06-22T10:05:52.775-0400: 375338.674: Total time for which application threads were stopped: 0.0083580 seconds
2018-06-22T10:05:52.783-0400: 375338.682: Total time for which application threads were stopped: 0.0074860 seconds
2018-06-22T10:05:52.790-0400: 375338.689: Total time for which application threads were stopped: 0.0073980 seconds
2018-06-22T10:06:48.756-0400: 375394.655: Total time for which application threads were stopped: 0.0086660 seconds
2018-06-22T10:06:48.764-0400: 375394.662: Total time for which application threads were stopped: 0.0076080 seconds
2018-06-22T10:06:48.771-0400: 375394.670: Total time for which application threads were stopped: 0.0076890 seconds
2018-06-22T10:07:05.603-0400: 375411.501: Total time for which application threads were stopped: 0.0077390 seconds
2018-06-22T10:07:05.610-0400: 375411.509: Total time for which application threads were stopped: 0.0074570 seconds
2018-06-22T10:07:05.617-0400: 375411.516: Total time for which application threads were stopped: 0.0073410 seconds
2018-06-22T10:07:05.626-0400: 375411.525: Total time for which application threads were stopped: 0.0072380 seconds
2018-06-22T10:07:05.633-0400: 375411.532: Total time for which application threads were stopped: 0.0073070 seconds
2018-06-22T10:10:26.876-0400: 375612.775: Total time for which application threads were stopped: 0.0091690 seconds
2018-06-22T10:15:26.887-0400: 375912.786: Total time for which application threads were stopped: 0.0111650 seconds
2018-06-22T10:20:26.897-0400: 376212.796: Total time for which application threads were stopped: 0.0099680 seconds
2018-06-22T10:22:30.917-0400: 376336.816: Total time for which application threads were stopped: 0.0085330 seconds
2018-06-22T10:25:26.907-0400: 376512.806: Total time for which application threads were stopped: 0.0094760 seconds
2018-06-22T10:26:04.247-0400: 376550.145: Total time for which application threads were stopped: 0.0077120 seconds
2018-06-22T10:26:04.254-0400: 376550.153: Total time for which application threads were stopped: 0.0075380 seconds
2018-06-22T10:26:04.262-0400: 376550.161: Total time for which application threads were stopped: 0.0073460 seconds
2018-06-22T10:30:26.918-0400: 376812.817: Total time for which application threads were stopped: 0.0107140 seconds
2018-06-22T10:35:26.929-0400: 377112.827: Total time for which application threads were stopped: 0.0102250 seconds
2018-06-22T10:40:26.939-0400: 377412.838: Total time for which application threads were stopped: 0.0096620 seconds
2018-06-22T10:41:06.178-0400: 377452.077: Total time for which application threads were stopped: 0.0085630 seconds
2018-06-22T10:41:06.186-0400: 377452.085: Total time for which application threads were stopped: 0.0079250 seconds
2018-06-22T10:41:06.194-0400: 377452.092: Total time for which application threads were stopped: 0.0074940 seconds
2018-06-22T10:42:57.088-0400: 377562.987: Total time for which application threads were stopped: 0.0090560 seconds
2018-06-22T10:42:57.096-0400: 377562.995: Total time for which applic
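[A minimal sketch of how a 30-minute creation-based expiry policy like the one described above can be configured; the class wrapper and the choice of the standard JCache CreatedExpiryPolicy are illustrative assumptions, since the actual configuration wasn't shared in the thread:]

import java.util.concurrent.TimeUnit;
import javax.cache.expiry.CreatedExpiryPolicy;
import javax.cache.expiry.Duration;
import org.apache.ignite.configuration.CacheConfiguration;

public class ExpiryCacheConfig {
    public static CacheConfiguration<String, Object> create() {
        CacheConfiguration<String, Object> cfg =
            new CacheConfiguration<>("playerSessionInfoCacheIgnite");
        // Entries are removed 30 minutes after they were created.
        cfg.setExpiryPolicyFactory(
            CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.MINUTES, 30)));
        return cfg;
    }
}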
RE: High cpu on ignite server nodes
There is no default expiry policy, and no default eviction policy for on-heap caches (just in case: expiry and eviction are not the same thing, see https://apacheignite.readme.io/docs/evictions and https://apacheignite.readme.io/docs/expiry-policies). I see that most of the threads in the dump you've shared are executing on-heap eviction code. Perhaps you've just hit the eviction size of your caches, and cache updates have become more expensive as a result. You can try increasing the maximum size in the eviction policy.

Thanks,
Stan

From: praveeng
Sent: 16 June 2018 18:38
To: user@ignite.apache.org
Subject: RE: High cpu on ignite server nodes
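[A sketch of what raising the eviction maximum size might look like; the LRU policy and the 200,000 limit are illustrative assumptions, not values from the thread. Note that an eviction policy only applies when on-heap caching is enabled:]

import org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory;
import org.apache.ignite.configuration.CacheConfiguration;

public class EvictionCacheConfig {
    public static CacheConfiguration<String, Object> create() {
        CacheConfiguration<String, Object> cfg =
            new CacheConfiguration<>("playerSessionInfoCacheIgnite");
        cfg.setOnheapCacheEnabled(true);
        // Evict least-recently-used on-heap entries once the on-heap copy
        // exceeds 200,000 entries (the off-heap copy is unaffected).
        cfg.setEvictionPolicyFactory(new LruEvictionPolicyFactory<>(200_000));
        return cfg;
    }
}

[On earlier Ignite 2.x releases without the factory class, cfg.setEvictionPolicy(new LruEvictionPolicy<>(200_000)) is the equivalent.]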
RE: High cpu on ignite server nodes
Hi Stan,

The high CPU usage is on all servers. One doubt: what is the default expiry policy for a cache if we don't set one? Following are the stats of one cache, collected from ignitevisor.

Cache 'playerSessionInfoCacheIgnite(@c18)':
+---------------------------+------------------------------------+
| Name(@)                   | playerSessionInfoCacheIgnite(@c18) |
| Nodes                     | 7                                  |
| Total size Min/Avg/Max    | 0 / 53201.29 / 151537              |
| Heap size Min/Avg/Max     | 0 / 21857.29 / 50001               |
| Off-heap size Min/Avg/Max | 0 / 31344.00 / 101536              |
+---------------------------+------------------------------------+

Nodes for: playerSessionInfoCacheIgnite(@c18) (the Hi/Mi/Rd/Wr counters are 0 everywhere they appear):

+-----------------------------+------+-----------+----------+--------------+--------+-------+----------+-----------------+
| Node ID8(@), IP             | CPUs | Heap Used | CPU Load | Up Time      | Total  | Heap  | Off-Heap | Off-Heap Memory |
+-----------------------------+------+-----------+----------+--------------+--------+-------+----------+-----------------+
| 54F7EA58(@n4), ip.ip.ip.ip1 |    4 |   43.81 % |   2.43 % | 24:20:08:018 |   1000 |  1000 |        0 |               0 |
| D3A97470(@n7), ip.ip.ip.ip2 |    8 |   41.88 % |   0.27 % | 02:26:29:576 | 151536 |     5 |   101536 |           100mb |
| 6BA0FEA2(@n5), ip.ip.ip.ip3 |    8 |   25.74 % |   0.30 % | 02:29:02:915 | 151529 |     5 |   101529 |           100mb |
| E41C47FD(@n6), ip.ip.ip.ip4 |    8 |   38.53 % |   0.30 % | 02:27:35:184 |  66344 | 50001 |    16343 |            16mb |
| D487DD7A(@n3), ip.ip.ip.ip5 |    4 |   36.07 % |   1.90 % | 24:27:24:711 |   1000 |  1000 |        0 |               0 |
| A30CC6D1(@n2), ip.ip.ip.ip6 |    4 |   29.72 % |   0.50 % | 24:33:45:581 |      0 |     0 |        0 |               0 |
| 4FA3F3EB(@n1), ip.ip.ip.ip7 |    4 |   28.57 % |   1.77 % | 24:38:18:856 |   1000 |  1000 |      ... |             ... |
+-----------------------------+------+-----------+----------+--------------+--------+-------+----------+-----------------+
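[The same per-node heap/off-heap split that visor reports can also be checked programmatically; a small sketch, assuming it runs inside a JVM where an Ignite node is already started:]

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;

public class LocalSizeCheck {
    public static void main(String[] args) {
        Ignite ignite = Ignition.ignite(); // default node in this JVM
        IgniteCache<String, Object> cache =
            ignite.cache("playerSessionInfoCacheIgnite");
        // Entry counts on this node, split by storage tier.
        int onHeap  = cache.localSize(CachePeekMode.ONHEAP);
        int offHeap = cache.localSize(CachePeekMode.OFFHEAP);
        System.out.printf("on-heap=%d, off-heap=%d%n", onHeap, offHeap);
    }
}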
RE: High cpu on ignite server nodes
Hi Praveen,

Is it on all nodes or just one? How many nodes do you have? Is the cluster fully responsive, or do some operations appear frozen forever?

It looks like a memory/GC issue to me: jmap shows 2 GB of on-heap data. What is the max heap size? Try increasing it.

Thanks,
Stan

From: praveeng
Sent: 16 June 2018 17:15
To: user@ignite.apache.org
Subject: Re: High cpu on ignite server nodes
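[For context: heap sizing and GC visibility are controlled by standard HotSpot options, e.g. -Xms8g -Xmx8g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -Xloggc:/path/to/gc.log; the sizes and log path here are placeholders, not a recommendation for this workload. The "Total time for which application threads were stopped" lines quoted elsewhere in this thread are produced by -XX:+PrintGCApplicationStoppedTime.]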
Re: High cpu on ignite server nodes
Hi,

Following are the threads consuming the most CPU (from top):

  PID USER   PR NI VIRT    RES    SHR   S %CPU %MEM TIME+     COMMAND
36708 gmedia 20  0 11.031g 6.749g 14720 R 53.5 43.5 267:15.83 java
36709 gmedia 20  0 11.031g 6.749g 14720 R 51.5 43.5 266:53.53 java
34433 gmedia 20  0 11.031g 6.749g 14720 R 48.8 43.5 268:50.46 java
35687 gmedia 20  0 11.031g 6.749g 14720 R 48.8 43.5 270:04.85 java
36706 gmedia 20  0 11.031g 6.749g 14720 R 48.8 43.5 266:05.29 java
36712 gmedia 20  0 11.031g 6.749g 14720 R 48.5 43.5 268:34.80 java
36713 gmedia 20  0 11.031g 6.749g 14720 R 48.5 43.5 270:16.17 java
37366 gmedia 20  0 11.031g 6.749g 14720 R 48.5 43.5 269:50.37 java
37367 gmedia 20  0 11.031g 6.749g 14720 R 48.5 43.5 267:32.84 java
48957 gmedia 20  0 11.031g 6.749g 14720 R 48.2 43.5 266:50.72 java
36707 gmedia 20  0 11.031g 6.749g 14720 R 47.8 43.5 268:30.12 java
36714 gmedia 20  0 11.031g 6.749g 14720 R 47.5 43.5 266:10.18 java
37811 gmedia 20  0 11.031g 6.749g 14720 R 47.5 43.5 267:44.25 java
34438 gmedia 20  0 11.031g 6.749g 14720 R 47.2 43.5 269:45.54 java
34439 gmedia 20  0 11.031g 6.749g 14720 R 46.8 43.5 268:24.83 java
36710 gmedia 20  0 11.031g 6.749g 14720 R 46.5 43.5 269:43.68 java

Thanks,
Praveen
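[A standard way to tie these top entries back to Java threads, using generic JDK tooling rather than anything the poster ran: the PIDs above are native thread IDs, so convert one to hex (e.g. printf '%x\n' 36708 prints 8f64), take a thread dump with jstack <java-pid>, and search it for nid=0x8f64 to find the matching Java thread and its stack trace.]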