Hi Surinder,

> I increased it, but it started showing timeout errors

These are INFO messages, not errors. A checkpoint can be triggered for
several reasons, including:
1) Checkpoint timeout. You set 'checkpointFrequency' to 30 seconds, which
   means a checkpoint is attempted at least every 30 seconds.
2) Too many dirty pages.
3) Creating a snapshot.
4) WAL segment rollover.

On Wed, 23 Mar 2022 at 18:10, Surinder Mehra <[email protected]> wrote:

> Hi,
> We have a 4 node cluster with 64G RAM and 40G DISK per node attached for
> /work and /walarchive each. WAL dir is 10G per node
>
> Below is our data region configuration and jvm_opts. We are getting
> timeouts on checkpointing and WalArchive is getting filled up and no data
> is moving to the /work directory. Checkpointing error is mentioned below
>
> could you suggest what's wrong with these configs(Note : with
> walarchveSize as 16G and checkpointingBUffSize as 4G, it was writing but
> very slow and throttling rate was 35% so I increased it, but it started
> showing timeout errors)
>
> JVM_OPTS: -XX:MaxDirectMemorySize=2g -Xms20g -Xmx25g
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC
> -XX:+DisableExplicitGC
>
> Ignite config:
>
> <property name="dataStorageConfiguration">
>     <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>         <property name="walBufferSize" value="#{256L * 1024 * 1024}"/>
>         <property name="checkpointFrequency" value="30000"/>
>         <property name="checkpointThreads" value="12"/>
>         <property name="walSegmentSize" value="#{512L * 1024 * 1024}"/>
>         <property name="maxWalArchiveSize" value="#{32L * 1024 * 1024 * 1024}"/>
>         <property name="writeThrottlingEnabled" value="true"/>
>         <property name="defaultDataRegionConfiguration">
>             <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>                 <property name="persistenceEnabled" value="true"/>
>                 <!-- https://ignite.apache.org/docs/latest/persistence/persistence-tuning#adjusting-checkpointing-buffer-size-->
>                 <property name="checkpointPageBufferSize" value="#{8L * 1024 * 1024 * 1024}"/>
>                 <!--<property name="pageReplacementMode" value="SEGMENTED_LRU"/>-->
>             </bean>
>         </property>
>
>         <property name="walPath" value="/ignite/wal"/>
>         <property name="walArchivePath" value="/ignite/walarchive"/>
>     </bean>
>
> </property>
>
>
> Checkpointing logs:
>
> [14:56:55,209][INFO][db-checkpoint-thread-#105][Checkpointer] Checkpoint
> started [checkpointId=4b071279-8f25-4cd0-a5ef-e6f58f3a5653,
> startPtr=WALPointer [idx=930, fileOff=527943956, len=51411],
> checkpointBeforeLockTime=9ms, checkpointLockWait=0ms,
> checkpointListenersExecuteTime=6ms, checkpointLockHoldTime=9ms,
> walCpRecordFsyncDuration=7ms, writeCheckpointEntryDuration=3ms,
> splitAndSortCpPagesDuration=45ms, pages=35542, reason='timeout']
> [14:56:55,921][INFO][db-checkpoint-thread-#105][Checkpointer] Checkpoint
> finished [cpId=4b071279-8f25-4cd0-a5ef-e6f58f3a5653, pages=35542,
> markPos=WALPointer [idx=930, fileOff=527943956, len=51411],
> walSegmentsCovered=[], markDuration=64ms, pagesWrite=108ms, fsync=604ms,
> total=785ms]
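
For reference, reason='timeout' in the quoted log is the 'checkpointFrequency' timer firing, which is why it appears every 30 seconds with this configuration. If that is more frequent than you want, the interval can be raised in the same DataStorageConfiguration bean. A minimal sketch, assuming 180000 ms suits your workload (as far as I know this is Ignite's default; the value is illustrative, not something measured on your cluster):

```xml
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <!-- Assumption: 180000 ms (3 minutes) instead of the current 30000 ms.
             Checkpoints can still start earlier for the other reasons
             (too many dirty pages, snapshot creation, WAL segment rollover). -->
        <property name="checkpointFrequency" value="180000"/>
        <!-- All other properties stay as in your current configuration. -->
    </bean>
</property>
```

With a longer interval each checkpoint will usually cover more dirty pages, so it is worth watching the 'pages' and 'total' fields in the "Checkpoint finished" lines after the change.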
