Yes, I was able to get 128 with the following configuration:

<property name="systemThreadPoolSize" value="128"/>

Here is a sample log:

[2021-09-27T00:00:19,359][INFO ][grid-timeout-worker-#198][IgniteKernal] Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=f0025abe, uptime=2 days, 21:40:21.138]
    ^-- Cluster [hosts=39, CPUs=236, servers=10, clients=29, topVer=49, minorTopVer=0]
    ^-- Network [addrs=[10.240.242.1, 10.240.245.1, 127.0.0.1], discoPort=47500, commPort=47100]
    ^-- CPU [CPUs=12, curLoad=3.53%, avgLoad=7.1%, GC=0%]
    ^-- Heap [used=10076MB, free=17.99%, comm=12288MB]
    ^-- Off-heap memory [used=36152MB, free=2.98%, allocated=37063MB]
    ^-- Page memory [pages=9147903]
    ^-- sysMemPlc region [type=internal, persistence=true, lazyAlloc=false, ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.99%, allocRam=99MB, allocTotal=0MB]
    ^-- default region [type=default, persistence=true, lazyAlloc=true, ... initCfg=256MB, maxCfg=36864MB, usedRam=36152MB, freeRam=1.93%, allocRam=36864MB, allocTotal=81296MB]
    ^-- metastoreMemPlc region [type=internal, persistence=true, lazyAlloc=false, ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.57%, allocRam=0MB, allocTotal=0MB]
    ^-- TxLog region [type=internal, persistence=true, lazyAlloc=false, ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=99MB, allocTotal=0MB]
    ^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true, ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
    ^-- Ignite persistence [used=81296MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=128, qSize=0]

On 2021/09/29 12:24:19, Zhenya Stanilovsky <arzamas...@mail.ru> wrote:
> systemThreadPoolSize and the other pools are sized by default from
> Runtime.getRuntime().availableProcessors(). If you somehow obtain 128, please
> file a ticket with all env info.
> Thanks!
>
>> After many configuration changes and optimizations, I think I've solved the
>> heap problem.
>>
>> Here are the changes that I applied to the system.
>> JVM changes (this article helped a lot):
>> https://medium.com/@hoan.nguyen.it/how-did-g1gc-tuning-flags-affect-our-back-end-web-app-c121d38dfe56
>>
>> Nodes are running on servers with 12 cores and 64 GB of memory. I've added
>> the following JVM parameters:
>>
>> -XX:ParallelGCThreads=6
>> -XX:ConcGCThreads=2
>> -XX:MaxGCPauseMillis=200
>> -XX:InitiatingHeapOccupancyPercent=40
>>
>> In the Ignite configuration I've changed all thread pool sizes, which were
>> much larger than these:
>>   <property name="systemThreadPoolSize" value="12"/>
>>   <property name="publicThreadPoolSize" value="12"/>
>>   <property name="queryThreadPoolSize" value="12"/>
>>   <property name="serviceThreadPoolSize" value="12"/>
>>   <property name="stripedPoolSize" value="12"/>
>>   <property name="dataStreamerThreadPoolSize" value="12"/>
>>   <property name="rebalanceThreadPoolSize" value="12"/>
>>
>> Here is the GC report covering 16 hours:
>> https://gceasy.io/diamondgc-report.jsp?p=c2hhcmVkLzIwMjEvMDkvMjkvLS1nYy5sb2cuMC5jdXJyZW50LS04LTU4LTMx&channel=WEB
>>
>> On 2021/09/27 17:11:21, Ilya Korol <llivezk...@gmail.com> wrote:
>>> Actually, the Query interface doesn't define a close() method, but
>>> QueryCursor does.
>>> In your snippets you're using the try-with-resources construct for SELECT
>>> queries, which is good, but when you run a MERGE INTO query you also
>>> get a QueryCursor as the result of
>>>
>>> igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery);
>>>
>>> so maybe these QueryCursor objects still hold some resources/memory.
>>> The Javadoc for QueryCursor states that you should always close cursors.
>>>
>>> To simplify cursor closing, there is a cursor.getAll() method that will
>>> do this for you under the hood.
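
The cursor-closing advice above can be sketched in a few lines of Java. This is only an illustration, not code from the thread: so that it runs without Ignite on the classpath, `StubCursor`, `query(...)` and the `OPEN` counter are hypothetical stand-ins for `QueryCursor`, `IgniteCache.query(...)` and whatever resources a real cursor holds. Whether unclosed MERGE cursors actually caused the heap growth is not established in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class CursorCloseSketch {
    /** Hypothetical counter of cursors that were opened and not yet closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    static final String MERGE_SQL =
        "MERGE INTO \"ProductLabel\" (\"productId\", \"label\", \"language\") VALUES (?, ?, ?)";

    /** Hypothetical stand-in for a query cursor: holds rows and must be closed. */
    static final class StubCursor implements AutoCloseable {
        private final List<String> rows;
        private boolean closed;

        StubCursor(List<String> rows) {
            this.rows = rows;
            OPEN.incrementAndGet();
        }

        /** Assumed to mirror getAll() as described above: read everything, then close. */
        List<String> getAll() {
            try {
                return new ArrayList<>(rows);
            } finally {
                close();
            }
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }
    }

    /** Hypothetical stand-in for cache.query(...): a DML statement returns a cursor too. */
    static StubCursor query(String sql) {
        return new StubCursor(Arrays.asList("updated=1"));
    }

    /** Pattern from the thread: the MERGE result is ignored, so nothing closes the cursor. */
    static void mergeWithoutClose() {
        query(MERGE_SQL);
    }

    /** Same call wrapped in try-with-resources, as already done for the SELECT queries. */
    static void mergeWithTryWithResources() {
        try (StubCursor ignored = query(MERGE_SQL)) {
            // Nothing to read for a DML statement; leaving the block closes the cursor.
        }
    }

    /** Alternative: getAll() drains the cursor and closes it in one call. */
    static List<String> mergeWithGetAll() {
        return query(MERGE_SQL).getAll();
    }

    public static void main(String[] args) {
        mergeWithoutClose();
        System.out.println("open after ignoring the cursor: " + OPEN.get());

        OPEN.set(0);
        mergeWithTryWithResources();
        mergeWithGetAll();
        System.out.println("open after closing variants: " + OPEN.get());
    }
}
```

With the real API, the same two closing variants would wrap the existing `igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery)` call instead of the stub.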
>>>
>>> On 2021/09/13 06:17:21, Ibrahim Altun <i...@segmentify.com> wrote:
>>>> Hi Ilya,
>>>>
>>>> Since this is a production environment, I cannot risk taking a heap
>>>> dump for now, but I will try to convince my superiors to get one and
>>>> analyze it.
>>>>
>>>> Queries are heavily used in our system, but aren't they AutoCloseable
>>>> objects? Do we have to close them anyway?
>>>>
>>>> Here are some usage examples from our system.
>>>>
>>>> The insert query looks like this:
>>>> MERGE INTO "ProductLabel" ("productId", "label", "language") VALUES (?, ?, ?)
>>>> igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery);
>>>>
>>>> Another usage example, with an SqlFieldsQuery:
>>>> String sql = "SELECT _val FROM \"UserRecord\" WHERE \"email\" IN (?)";
>>>> SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery(sql);
>>>> sqlFieldsQuery.setLazy(true);
>>>> sqlFieldsQuery.setArgs(emails.toArray());
>>>>
>>>> try (QueryCursor<List<?>> ignored = igniteCacheService.getCache(ID, IgniteCacheType.USER).query(sqlFieldsQuery)) {...}
>>>>
>>>> On 2021/09/12 20:28:09, Shishkov Ilya <sh...@gmail.com> wrote:
>>>>> Hi, Ibrahim!
>>>>> Have you analyzed the heap dump of the server node JVMs?
>>>>> If your application executes queries, are their cursors closed?
>>>>>
>>>>> On Fri, 10 Sep 2021 at 11:54, Ibrahim Altun <ib...@segmentify.com> wrote:
>>>>>> Igniters, any comment on this issue? We are facing huge GC problems
>>>>>> in our production environment. Please advise.
>>>>>>
>>>>>> On 2021/09/07 14:11:09, Ibrahim Altun <ib...@segmentify.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> 400-600K reads/writes/updates in total
>>>>>>> 12 cores
>>>>>>> 64 GB RAM
>>>>>>> no iowait
>>>>>>> 10 nodes
>>>>>>>
>>>>>>> On 2021/09/07 12:51:28, Piotr Jagielski <pj...@touk.pl> wrote:
>>>>>>>> Hi,
>>>>>>>> Can you provide some information on how you use the cluster? How many
>>>>>>>> reads/writes/updates per second? Also the CPU / RAM spec of the
>>>>>>>> cluster nodes?
>>>>>>>>
>>>>>>>> We observed full GC / CPU load / OOM killer when loading a large amount
>>>>>>>> of data (15 million records, data streamer + allowOverwrite=true). We've
>>>>>>>> seen 200-400k updates per sec on JMX metrics, but load up to 10 on nodes
>>>>>>>> and iowait up to 30%. Our cluster is 3 x 4CPU, 16GB RAM (already
>>>>>>>> upgrading to 8CPU, 32GB RAM).
>>>>>>>> Ignite 2.10
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Piotr
>>>>>>>>
>>>>>>>> On 2021/09/02 08:36:07, Ibrahim Altun <ib...@segmentify.com> wrote:
>>>>>>>>> After upgrading from version 2.7.1 to version 2.10.0, Ignite nodes are
>>>>>>>>> facing huge full GC operations 24-36 hours after node start.
>>>>>>>>>
>>>>>>>>> We tried increasing the heap size, but no luck. Here is the start
>>>>>>>>> configuration for the nodes:
>>>>>>>>>
>>>>>>>>> JVM_OPTS="$JVM_OPTS -Xms12g -Xmx12g -server
>>>>>>>>> -javaagent:/etc/prometheus/jmx_prometheus_javaagent-0.14.0.jar=8090:/etc/prometheus/jmx.yml
>>>>>>>>> -Dcom.sun.management.jmxremote
>>>>>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>>>>>> -Dcom.sun.management.jmxremote.port=49165
>>>>>>>>> -Dcom.sun.management.jmxremote.host=localhost
>>>>>>>>> -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=1g
>>>>>>>>> -DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true
>>>>>>>>> -DIGNITE_WAL_MMAP=true -DIGNITE_BPLUS_TREE_LOCK_RETRIES=100000
>>>>>>>>> -Djava.net.preferIPv4Stack=true"
>>>>>>>>>
>>>>>>>>> JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch -XX:+UseG1GC
>>>>>>>>> -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC
>>>>>>>>> -XX:+UseStringDeduplication -Xloggc:/var/log/apache-ignite/gc.log
>>>>>>>>> -XX:+PrintGCDetails -XX:+PrintGCDateStamps
>>>>>>>>> -XX:+PrintTenuringDistribution -XX:+PrintGCCause
>>>>>>>>> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
>>>>>>>>> -XX:GCLogFileSize=100M"
>>>>>>>>>
>>>>>>>>> Here is the GC analysis report covering 80 hours:
>>>>>>>>>
>>>>>>>>> https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjEvMDgvMzEvLS1nYy5sb2cuMC5jdXJyZW50LnppcC0tNS01MS0yOQ==&channel=WEB
>>>>>>>>>
>>>>>>>>> Do we need a larger heap, or is there a bug that we need to be aware of?
>>>>>>>>>
>>>>>>>>> Here is the node configuration:
>>>>>>>>>
>>>>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>>>>> <beans xmlns="http://www.springframework.org/schema/beans"
>>>>>>>>>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>>>>>        xsi:schemaLocation="
>>>>>>>>>        http://www.springframework.org/schema/beans
>>>>>>>>>        http://www.springframework.org/schema/beans/spring-beans.xsd">
>>>>>>>>>   <bean id="ignite.cfg"
>>>>>>>>>         class="org.apache.ignite.configuration.IgniteConfiguration">
>>>>>>>>>     <property name="gridLogger">
>>>>>>>>>       <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
>>>>>>>>>         <constructor-arg type="java.lang.String"
>>>>>>>>>                          value="/etc/apache-ignite/ignite-log4j2.xml"/>
>>>>>>>>>       </bean>
>>>>>>>>>     </property>
>>>>>>>>>     <property name="communicationSpi">
>>>>>>>>>       <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
>>>>>>>>>         <property name="usePairedConnections" value="true"/>
>>>>>>>>>       </bean>
>>>>>>>>>     </property>
>>>>>>>>>     <property name="failureDetectionTimeout" value="60000"/>
>>>>>>>>>     <property name="systemThreadPoolSize" value="128"/>
>>>>>>>>>     <property name="publicThreadPoolSize" value="128"/>
>>>>>>>>>     <property name="queryThreadPoolSize" value="128"/>
>>>>>>>>>     <property name="serviceThreadPoolSize" value="128"/>
>>>>>>>>>     <property name="stripedPoolSize" value="128"/>
>>>>>>>>>     <property name="dataStreamerThreadPoolSize" value="4"/>
>>>>>>>>>     <property name="rebalanceThreadPoolSize" value="16"/>
>>>>>>>>>
>>>>>>>>>     <!-- Explicitly enable peer class loading. -->
>>>>>>>>>     <property name="peerClassLoadingEnabled" value="true"/>
>>>>>>>>>
>>>>>>>>>     <!-- Enable deploymentSpi,
>>>>>>>>>          /usr/share/apache-ignite/libs/segmentify directory will be checked
>>>>>>>>>          every 5 seconds for changed files-->
>>>>>>>>>     <property name="deploymentSpi">
>>>>>>>>>       <bean class="org.apache.ignite.spi.deployment.uri.UriDeploymentSpi">
>>>>>>>>>         <property name="temporaryDirectoryPath"
>>>>>>>>>                   value="/tmp/temp_ignite_libs"/>
>>>>>>>>>         <property name="uriList">
>>>>>>>>>           <list>
>>>>>>>>>             <value>file://freq=5000@localhost /usr/share/apache-ignite/libs/segmentify/</value>
>>>>>>>>>           </list>
>>>>>>>>>         </property>
>>>>>>>>>       </bean>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>     <property name="cacheConfiguration">
>>>>>>>>>       <list>
>>>>>>>>>         <!-- Partitioned cache example configuration (Atomic mode). -->
>>>>>>>>>         <bean class="org.apache.ignite.configuration.CacheConfiguration">
>>>>>>>>>           <property name="name" value="default"/>
>>>>>>>>>           <property name="atomicityMode" value="ATOMIC"/>
>>>>>>>>>           <property name="backups" value="1"/>
>>>>>>>>>         </bean>
>>>>>>>>>       </list>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>     <!-- Explicitly configure TCP discovery SPI to provide list of
>>>>>>>>>          initial nodes. -->
>>>>>>>>>     <property name="discoverySpi">
>>>>>>>>>       <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>>>>>>>>>         <property name="networkTimeout" value="60000"/>
>>>>>>>>>         <property name="ipFinder">
>>>>>>>>>           <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>>>>>>>>>             <property name="addresses">
>>>>>>>>>               <list>
>>>>>>>>>                 <!-- THERE ARE 10 NODES -->
>>>>>>>>>               </list>
>>>>>>>>>             </property>
>>>>>>>>>           </bean>
>>>>>>>>>         </property>
>>>>>>>>>       </bean>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>     <!-- Enabling Apache Ignite native persistence. -->
>>>>>>>>>     <property name="dataStorageConfiguration">
>>>>>>>>>       <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>>>>>>>>>         <property name="defaultDataRegionConfiguration">
>>>>>>>>>           <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>>>>>>>>>             <property name="persistenceEnabled" value="true"/>
>>>>>>>>>             <property name="checkpointPageBufferSize"
>>>>>>>>>                       value="#{ 2L * 1024 * 1024 * 1024}"/>
>>>>>>>>>             <property name="maxSize" value="#{ 40L * 1024 * 1024 * 1024 }"/>
>>>>>>>>>           </bean>
>>>>>>>>>         </property>
>>>>>>>>>         <property name="storagePath" value="/srv/ignite/persist"/>
>>>>>>>>>         <property name="walPath" value="/srv/ignite/wal"/>
>>>>>>>>>         <property name="walArchivePath" value="/srv/ignite/wal"/>
>>>>>>>>>         <property name="walMode" value="LOG_ONLY"/>
>>>>>>>>>         <property name="walSegmentSize" value="#{ 256L * 1024 * 1024 }"/>
>>>>>>>>>         <property name="walFlushFrequency" value="5000"/>
>>>>>>>>>         <property name="maxWalArchiveSize" value="#{ 512L * 1024 * 1024 }"/>
>>>>>>>>>         <property name="writeThrottlingEnabled" value="true"/>
>>>>>>>>>         <property name="checkpointFrequency" value="300000"/>
>>>>>>>>>         <property name="checkpointWriteOrder" value="SEQUENTIAL" />
>>>>>>>>>       </bean>
>>>>>>>>>     </property>
>>>>>>>>>   </bean>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> İbrahim Halil Altun
>>>>>>>>> Senior Software Engineer
>>>>>>>>> +90 536 3327510 • segmentify.com
>>>>>>>>> UK • Germany • Turkey