Hi support,

I'm running a performance test that writes 4000 entries per second to a cache that is:
1. TRANSACTIONAL
2. partitioned
3. configured with 1 backup (and an affinity function with exclude neighbors enabled)
4. using write synchronization mode FULL_ASYNC
5. indexed on key and value (and enabled for SQL queries)

Writes are performed by a client node using a data streamer with a StreamVisitor
and autoFlushFrequency set to 1 sec.

We have configured:
1. failureDetectionTimeout set to 120000 ms
2. Data region (only 1):
   a. Persistence enabled
   b. max size 8 GB
   c. checkpointPageBufferSize 2 GB
3. WAL mode LOG_ONLY
4. WAL archiving disabled (WAL path and WAL archive path set to the same
   value)
5. Pages write throttling enabled


After a few hours, having submitted about 20 million entries without problems,
the client node starts to experience delays: the queue from which the Ignite
client node reads messages starts to grow.

Checking the logs of the server and client nodes, there is no error message,
but the checkpoint statistics show high fsync values.

Could you help me understand why, in spite of a constant write rate and a
constant CPU usage of about 30%, the servers seem to slow down only after a
certain number of entries has been written?

Maybe there is some parameter to tune that I missed?

Below is the configuration used for the simulation:

There are 8 server nodes in total, distributed as follows:
HOST1 with 4 server nodes and 1 client node, on HDD
HOST2 with 4 server nodes, on HDD


Both hosts are machines with 16 cores, 256 GB of memory and HDD disks.

The DataStorageConfiguration for each server node is as follows:


<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="writeThrottlingEnabled" value="true" />
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true" />
                <property name="maxSize" value="#{8L * 1024 * 1024 * 1024}" />
                <property name="checkpointPageBufferSize" value="#{2048L * 1024 * 1024}" />
            </bean>
        </property>

        <property name="walMode" value="LOG_ONLY" />
        <property name="walPath" value="wal/path" />
        <property name="walArchivePath" value="wal/path" />
    </bean>
</property>


JVM options used to start each server node:

-server -Xms4g -Xmx8g -XX:+AlwaysPreTouch -XX:+UseG1GC
-XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC



Here are the checkpoint statistics from the log of node 1:

At simulation start:
2019-02-22 10:19:44.195  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=c115ead9-643d-45e5-be41-cd7ae5caac14, pages=11891,
markPos=FileWALPointer [idx=1, fileOff=36517886, len=79426],
walSegmentsCleared=0, walSegmentsCovered=[0], markDuration=34ms,
pagesWrite=87ms, fsync=1931ms, total=2052ms]
2019-02-22 10:22:44.742  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=d40c9096-2e10-46ca-ae8e-2a39e242b768, pages=66732,
markPos=FileWALPointer [idx=7, fileOff=63806638, len=79426],
walSegmentsCleared=7, walSegmentsCovered=[1 - 6], markDuration=98ms,
pagesWrite=407ms, fsync=2085ms, total=2590ms]
2019-02-22 10:25:44.900  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=5124f12f-6ed8-4ed3-9644-3c58957600ed, pages=70253,
markPos=FileWALPointer [idx=14, fileOff=47159207, len=79426],
walSegmentsCleared=6, walSegmentsCovered=[7 - 13], markDuration=98ms,
pagesWrite=402ms, fsync=2241ms, total=2741ms]
2019-02-22 10:28:47.866  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=9094d36c-cd79-4d30-9b90-e48de78fa3e6, pages=72524,
markPos=FileWALPointer [idx=21, fileOff=39728290, len=79426],
walSegmentsCleared=8, walSegmentsCovered=[14 - 20], markDuration=83ms,
pagesWrite=365ms, fsync=5255ms, total=5703ms]
2019-02-22 10:31:53.635  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=7132d53a-e2a6-4ac8-b1a8-2621cc39c82b, pages=77471,
markPos=FileWALPointer [idx=28, fileOff=64681287, len=79426],
walSegmentsCleared=7, walSegmentsCovered=[21 - 27], markDuration=494ms,
pagesWrite=748ms, fsync=10136ms, total=11472ms]


At end of simulation:

2019-02-22 11:52:36.339  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=dc8369b9-b100-4dd0-bfaf-9a5b13620072, pages=129942,
markPos=FileWALPointer [idx=309, fileOff=19048810, len=79426],
walSegmentsCleared=11, walSegmentsCovered=[298 - 308], markDuration=77ms,
pagesWrite=797ms, fsync=171049ms, total=171923ms]
2019-02-22 11:56:24.001  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=1fb1ced8-a257-4c6f-be97-a1867ba6692e, pages=133096,
markPos=FileWALPointer [idx=320, fileOff=13420410, len=79426],
walSegmentsCleared=11, walSegmentsCovered=[309 - 319], markDuration=1707ms,
pagesWrite=1537ms, fsync=216332ms, total=219576ms]
2019-02-22 12:00:23.052  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=0d48f6d5-7601-4839-b422-55b228978da5, pages=150587,
markPos=FileWALPointer [idx=332, fileOff=47800048, len=79426],
walSegmentsCleared=12, walSegmentsCovered=[320 - 331], markDuration=2275ms,
pagesWrite=752ms, fsync=236023ms, total=239051ms]
2019-02-22 12:04:05.562  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=b23526c4-5121-48a9-9902-522a5ffb3a28, pages=155805,
markPos=FileWALPointer [idx=345, fileOff=40020477, len=79426],
walSegmentsCleared=13, walSegmentsCovered=[332 - 344], markDuration=525ms,
pagesWrite=1324ms, fsync=220654ms, total=222504ms]
2019-02-22 12:07:54.005  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=1bb2a0c0-7a89-47f2-af9f-90e90d44c14b, pages=149055,
markPos=FileWALPointer [idx=357, fileOff=51666923, len=79426],
walSegmentsCleared=12, walSegmentsCovered=[345 - 356], markDuration=995ms,
pagesWrite=1559ms, fsync=225888ms, total=228442ms]
2019-02-22 12:11:49.962  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=19f14a04-842f-409e-9c6b-afb59193e419, pages=153022,
markPos=FileWALPointer [idx=370, fileOff=16234647, len=79426],
walSegmentsCleared=13, walSegmentsCovered=[357 - 369], markDuration=1773ms,
pagesWrite=1044ms, fsync=233139ms, total=235957ms]
2019-02-22 12:15:59.332  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=c1316fb7-1ecc-4358-bf90-a87772969c03, pages=159668,
markPos=FileWALPointer [idx=383, fileOff=21979375, len=79426],
walSegmentsCleared=13, walSegmentsCovered=[370 - 382], markDuration=1249ms,
pagesWrite=1693ms, fsync=246428ms, total=249370ms]
2019-02-22 12:20:05.814  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=a9e394f2-0011-4f89-8b5a-a6ed77774103, pages=156891,
markPos=FileWALPointer [idx=396, fileOff=3956799, len=79426],
walSegmentsCleared=13, walSegmentsCovered=[383 - 395], markDuration=1030ms,
pagesWrite=1275ms, fsync=244176ms, total=246482ms]
2019-02-22 12:24:40.217  INFO 5271 --- [oint-thread-#67]
i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
[cpId=0bc71cc5-97c4-4f7d-95c2-430f379eeeb0, pages=148039,
markPos=FileWALPointer [idx=407, fileOff=57767331, len=79426],
walSegmentsCleared=11, walSegmentsCovered=[396 - 406], markDuration=323ms,
pagesWrite=1620ms, fsync=272460ms, total=274403ms]
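
To make the trend in the records above easier to compare, here is a minimal sketch of how the fsync share of each checkpoint could be pulled out of the log. CheckpointStats and its methods are hypothetical names of my own, not part of Ignite, and the pattern only assumes the "Checkpoint finished [...]" layout shown above. In the first record fsync takes 1931 ms of a 2052 ms checkpoint; in the last one it takes 272460 ms of 274403 ms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Ignite): extracts timings from "Checkpoint finished" records. */
public class CheckpointStats {

    // DOTALL lets ".*?" cross the line breaks that wrap each record in the log or mail.
    private static final Pattern RECORD = Pattern.compile(
        "Checkpoint finished\\s*\\[.*?pages=(\\d+),.*?fsync=(\\d+)ms,\\s*total=(\\d+)ms\\]",
        Pattern.DOTALL);

    /** One parsed checkpoint: page count plus fsync and total durations in milliseconds. */
    public static final class Checkpoint {
        public final long pages;
        public final long fsyncMs;
        public final long totalMs;

        Checkpoint(long pages, long fsyncMs, long totalMs) {
            this.pages = pages;
            this.fsyncMs = fsyncMs;
            this.totalMs = totalMs;
        }

        /** Fraction of the whole checkpoint spent in fsync (0.0 to 1.0). */
        public double fsyncShare() {
            return totalMs == 0 ? 0.0 : (double) fsyncMs / totalMs;
        }

        /** Average fsync time per checkpointed page, in milliseconds. */
        public double fsyncMsPerPage() {
            return pages == 0 ? 0.0 : (double) fsyncMs / pages;
        }
    }

    /** Returns every checkpoint record found in the given log text, in order. */
    public static List<Checkpoint> parse(String log) {
        List<Checkpoint> result = new ArrayList<>();
        Matcher m = RECORD.matcher(log);
        while (m.find()) {
            result.add(new Checkpoint(
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))));
        }
        return result;
    }

    public static void main(String[] args) {
        // First and last records from node 1, shortened to the fields the pattern needs.
        String log =
            "Checkpoint finished\n[cpId=c115ead9-643d-45e5-be41-cd7ae5caac14, pages=11891,\n"
            + "markDuration=34ms,\npagesWrite=87ms, fsync=1931ms, total=2052ms]\n"
            + "Checkpoint finished\n[cpId=0bc71cc5-97c4-4f7d-95c2-430f379eeeb0, pages=148039,\n"
            + "markDuration=323ms,\npagesWrite=1620ms, fsync=272460ms, total=274403ms]\n";

        for (Checkpoint c : parse(log)) {
            System.out.printf("pages=%d fsync=%dms share=%.1f%% perPage=%.3fms%n",
                c.pages, c.fsyncMs, 100.0 * c.fsyncShare(), c.fsyncMsPerPage());
        }
    }
}
```

The per-page figure is included because the number of pages per checkpoint also grows during the run, so the raw fsync time alone does not show how much slower each page write has become.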


Thanks in advance.

Antonio



