Hi support,

I'm running a performance test that writes 4000 entries per second to a cache that is:
1. TRANSACTIONAL
2. partitioned
3. configured with 1 backup (and an affinity function with exclude neighbors enabled)
4. using write synchronization mode FULL_ASYNC
5. indexed on key and value (and enabled for SQL queries)

Writes are performed by a client node using a data streamer with a StreamVisitor and autoFlushFrequency set to 1 second.

We have configured:
1. failureDetectionTimeout set to 120000 ms
2. Data region (only 1):
   a. persistence enabled
   b. max size 8 GB
   c. checkpointPageBufferSize 2 GB
3. WAL mode LOG_ONLY
4. WAL archiving disabled (WAL path and WAL archive path set to the same value)
5. Pages write throttling enabled

After a few hours, having submitted about 20 million entries without problems, the client node starts to show delays: the queue from which the Ignite client node reads messages starts to grow. The server and client node logs contain no error messages, but the WAL statistics in the log show high fsync values.

Could you help me understand why, despite a constant write rate and a constant CPU usage of about 30%, the server seems to slow down only after a certain number of entries? Maybe there is some parameter to tune that I missed?

Below is the configuration used for the simulation.

8 server nodes in total, distributed as follows:
HOST1: 4 server nodes and 1 client node, on HDD
HOST2: 4 server nodes, on HDD
Both hosts are machines with 16 cores, 256 GB of memory and HDD disks.

The DataStorageConfiguration for each server node is as follows:

<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="writeThrottlingEnabled" value="true" />
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true" />
                <property name="maxSize" value="#{8L * 1024 * 1024 * 1024}"/>
                <property name="checkpointPageBufferSize" value="#{2048L * 1024 * 1024}" />
            </bean>
        </property>
        <property name="walMode" value="LOG_ONLY" />
        <property name="walPath" value="wal/path" />
        <property name="walArchivePath" value="wal/path" />
    </bean>
</property>

JVM options used to start each server node:

-server -Xms4g -Xmx8g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC

Here are the WAL statistics from the log of node 1.

At the start of the simulation:

2019-02-22 10:19:44.195 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=c115ead9-643d-45e5-be41-cd7ae5caac14, pages=11891, markPos=FileWALPointer [idx=1, fileOff=36517886, len=79426], walSegmentsCleared=0, walSegmentsCovered=[0], markDuration=34ms, pagesWrite=87ms, fsync=1931ms, total=2052ms]
2019-02-22 10:22:44.742 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=d40c9096-2e10-46ca-ae8e-2a39e242b768, pages=66732, markPos=FileWALPointer [idx=7, fileOff=63806638, len=79426], walSegmentsCleared=7, walSegmentsCovered=[1 - 6], markDuration=98ms, pagesWrite=407ms, fsync=2085ms, total=2590ms]
2019-02-22 10:25:44.900 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=5124f12f-6ed8-4ed3-9644-3c58957600ed, pages=70253, markPos=FileWALPointer [idx=14, fileOff=47159207, len=79426], walSegmentsCleared=6, walSegmentsCovered=[7 - 13], markDuration=98ms, pagesWrite=402ms, fsync=2241ms, total=2741ms]
2019-02-22 10:28:47.866 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=9094d36c-cd79-4d30-9b90-e48de78fa3e6, pages=72524, markPos=FileWALPointer [idx=21, fileOff=39728290, len=79426], walSegmentsCleared=8, walSegmentsCovered=[14 - 20], markDuration=83ms, pagesWrite=365ms, fsync=5255ms, total=5703ms]
2019-02-22 10:31:53.635 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=7132d53a-e2a6-4ac8-b1a8-2621cc39c82b, pages=77471, markPos=FileWALPointer [idx=28, fileOff=64681287, len=79426], walSegmentsCleared=7, walSegmentsCovered=[21 - 27], markDuration=494ms, pagesWrite=748ms, fsync=10136ms, total=11472ms]

At the end of the simulation:

2019-02-22 11:52:36.339 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=dc8369b9-b100-4dd0-bfaf-9a5b13620072, pages=129942, markPos=FileWALPointer [idx=309, fileOff=19048810, len=79426], walSegmentsCleared=11, walSegmentsCovered=[298 - 308], markDuration=77ms, pagesWrite=797ms, fsync=171049ms, total=171923ms]
2019-02-22 11:56:24.001 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=1fb1ced8-a257-4c6f-be97-a1867ba6692e, pages=133096, markPos=FileWALPointer [idx=320, fileOff=13420410, len=79426], walSegmentsCleared=11, walSegmentsCovered=[309 - 319], markDuration=1707ms, pagesWrite=1537ms, fsync=216332ms, total=219576ms]
2019-02-22 12:00:23.052 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=0d48f6d5-7601-4839-b422-55b228978da5, pages=150587, markPos=FileWALPointer [idx=332, fileOff=47800048, len=79426], walSegmentsCleared=12, walSegmentsCovered=[320 - 331], markDuration=2275ms, pagesWrite=752ms, fsync=236023ms, total=239051ms]
2019-02-22 12:04:05.562 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=b23526c4-5121-48a9-9902-522a5ffb3a28, pages=155805, markPos=FileWALPointer [idx=345, fileOff=40020477, len=79426], walSegmentsCleared=13, walSegmentsCovered=[332 - 344], markDuration=525ms, pagesWrite=1324ms, fsync=220654ms, total=222504ms]
2019-02-22 12:07:54.005 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=1bb2a0c0-7a89-47f2-af9f-90e90d44c14b, pages=149055, markPos=FileWALPointer [idx=357, fileOff=51666923, len=79426], walSegmentsCleared=12, walSegmentsCovered=[345 - 356], markDuration=995ms, pagesWrite=1559ms, fsync=225888ms, total=228442ms]
2019-02-22 12:11:49.962 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=19f14a04-842f-409e-9c6b-afb59193e419, pages=153022, markPos=FileWALPointer [idx=370, fileOff=16234647, len=79426], walSegmentsCleared=13, walSegmentsCovered=[357 - 369], markDuration=1773ms, pagesWrite=1044ms, fsync=233139ms, total=235957ms]
2019-02-22 12:15:59.332 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=c1316fb7-1ecc-4358-bf90-a87772969c03, pages=159668, markPos=FileWALPointer [idx=383, fileOff=21979375, len=79426], walSegmentsCleared=13, walSegmentsCovered=[370 - 382], markDuration=1249ms, pagesWrite=1693ms, fsync=246428ms, total=249370ms]
2019-02-22 12:20:05.814 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=a9e394f2-0011-4f89-8b5a-a6ed77774103, pages=156891, markPos=FileWALPointer [idx=396, fileOff=3956799, len=79426], walSegmentsCleared=13, walSegmentsCovered=[383 - 395], markDuration=1030ms, pagesWrite=1275ms, fsync=244176ms, total=246482ms]
2019-02-22 12:24:40.217 INFO 5271 --- [oint-thread-#67] i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished [cpId=0bc71cc5-97c4-4f7d-95c2-430f379eeeb0, pages=148039, markPos=FileWALPointer [idx=407, fileOff=57767331, len=79426], walSegmentsCleared=11, walSegmentsCovered=[396 - 406], markDuration=323ms, pagesWrite=1620ms, fsync=272460ms, total=274403ms]

Thanks in advance.
Antonio
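
To put a number on the trend in the checkpoint lines above, here is a minimal sketch that pulls pages, fsync and total out of a "Checkpoint finished" line and derives the fsync time per page and the fsync share of the whole checkpoint. It is plain Java with no Ignite dependency; the class name CheckpointStats and its methods are hypothetical helpers written for this post, and the regular expression only assumes the field layout seen in the log lines quoted here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Ignite): reads timings from "Checkpoint finished" log lines. */
public class CheckpointStats {
    // Assumes the field order seen in the quoted log: pages=..., then fsync=...ms, total=...ms]
    private static final Pattern LINE = Pattern.compile(
        "Checkpoint finished \\[.*?pages=(\\d+),.*?fsync=(\\d+)ms, total=(\\d+)ms\\]");

    final long pages;
    final long fsyncMs;
    final long totalMs;

    CheckpointStats(long pages, long fsyncMs, long totalMs) {
        this.pages = pages;
        this.fsyncMs = fsyncMs;
        this.totalMs = totalMs;
    }

    /** Returns the parsed values, or null if the line is not a "Checkpoint finished" line. */
    static CheckpointStats parse(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new CheckpointStats(
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)));
    }

    /** Average fsync time spent per checkpointed page, in milliseconds. */
    double fsyncMsPerPage() {
        return (double) fsyncMs / pages;
    }

    /** Fraction of the total checkpoint time spent in fsync (0.0 to 1.0). */
    double fsyncShare() {
        return (double) fsyncMs / totalMs;
    }

    public static void main(String[] args) {
        // First and last checkpoint messages from the log of node 1, copied from above.
        String[] lines = {
            "Checkpoint finished [cpId=c115ead9-643d-45e5-be41-cd7ae5caac14, pages=11891, "
                + "markPos=FileWALPointer [idx=1, fileOff=36517886, len=79426], walSegmentsCleared=0, "
                + "walSegmentsCovered=[0], markDuration=34ms, pagesWrite=87ms, fsync=1931ms, total=2052ms]",
            "Checkpoint finished [cpId=0bc71cc5-97c4-4f7d-95c2-430f379eeeb0, pages=148039, "
                + "markPos=FileWALPointer [idx=407, fileOff=57767331, len=79426], walSegmentsCleared=11, "
                + "walSegmentsCovered=[396 - 406], markDuration=323ms, pagesWrite=1620ms, fsync=272460ms, total=274403ms]"
        };
        for (String line : lines) {
            CheckpointStats s = parse(line);
            System.out.printf("pages=%d fsync=%dms fsyncPerPage=%.3fms fsyncShare=%.1f%%%n",
                s.pages, s.fsyncMs, s.fsyncMsPerPage(), 100.0 * s.fsyncShare());
        }
    }
}
```

Comparing the two lines this way separates "more pages per checkpoint" from "each page takes longer to sync": the page count grows by roughly an order of magnitude between the first and last checkpoint, while fsync time grows by roughly two orders of magnitude.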