Hi Surinder,

> I increased it, but it started showing timeout errors

These are INFO messages, not errors. A checkpoint can be triggered by
several causes, including:
1) Checkpoint timeout. You set 'checkpointFrequency' to 30 seconds, which
means that a checkpoint is attempted at least every 30 seconds. This is what
reason='timeout' in your log indicates.
2) Too many dirty pages.
3) Creating a snapshot.
4) WAL segment rollover.
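
If it helps when scanning a long log, the trigger of each checkpoint is
recorded in the reason field of the "Checkpoint started" line. Here is a
minimal sketch in plain Java (no Ignite dependency) that extracts that field.
The class and method names are hypothetical, the sample line is a shortened
copy of the one quoted below, and the regex assumes the log format shown in
your message:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: reads the checkpoint trigger
// from a "Checkpoint started" log line.
class CheckpointReason {
    // Assumes the reason is logged as reason='...' (as in the quoted log).
    private static final Pattern REASON = Pattern.compile("reason='([^']*)'");

    // Returns the reason text, or empty if the line has no reason field.
    static Optional<String> reasonOf(String logLine) {
        Matcher m = REASON.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened copy of the "Checkpoint started" line from the thread.
        String line = "[14:56:55,209][INFO][db-checkpoint-thread-#105][Checkpointer] "
            + "Checkpoint started [checkpointId=4b071279-8f25-4cd0-a5ef-e6f58f3a5653, "
            + "pages=35542, reason='timeout']";
        System.out.println(reasonOf(line).orElse("unknown"));
    }
}
```

For the line above this prints "timeout", meaning the checkpoint was started
by the 'checkpointFrequency' timer and nothing failed.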


On Wed, 23 Mar 2022 at 18:10, Surinder Mehra <redni...@gmail.com> wrote:

> Hi,
> We have a 4 node cluster with 64G RAM and 40G DISK per node attached for
> /work and /walarchive each.  WAL dir is 10G per node
>
> Below is our data region configuration and jvm_opts. We are getting
> timeouts on checkpointing and WalArchive is getting filled up and no data
> is moving to the /work directory. Checkpointing error is mentioned below
>
> Could you suggest what's wrong with these configs? (Note: with
> walArchiveSize at 16G and checkpointingBuffSize at 4G, it was writing but
> very slowly and the throttling rate was 35%, so I increased it, but it started
> showing timeout errors.)
>
> JVM_OPTS:     -XX:MaxDirectMemorySize=2g -Xms20g -Xmx25g
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC
> -XX:+DisableExplicitGC
>
> Ignite config:
>
> <property name="dataStorageConfiguration">
>             <bean
> class="org.apache.ignite.configuration.DataStorageConfiguration">
>                 <property name="walBufferSize" value="#{256L * 1024 *
> 1024}"/>
>                 <property name="checkpointFrequency" value="30000"/>
>                 <property name="checkpointThreads" value="12"/>
>                 <property name="walSegmentSize" value="#{512L * 1024 *
> 1024}"/>
>                 <property name="maxWalArchiveSize" value="#{32L * 1024 *
> 1024 * 1024}"/>
>                 <property name="writeThrottlingEnabled" value="true"/>
>                 <property name="defaultDataRegionConfiguration">
>                     <bean
> class="org.apache.ignite.configuration.DataRegionConfiguration">
>                         <property name="persistenceEnabled" value="true"/>
>                         <!-- https://ignite.apache.org/docs/latest/persistence/persistence-tuning#adjusting-checkpointing-buffer-size -->
>                         <property name="checkpointPageBufferSize"
> value="#{8L * 1024 * 1024 * 1024}"/>
>                         <!--<property name="pageReplacementMode"
> value="SEGMENTED_LRU"/>-->
>                     </bean>
>                 </property>
>
>                 <property name="walPath" value="/ignite/wal"/>
>                 <property name="walArchivePath"
> value="/ignite/walarchive"/>
>             </bean>
>
>         </property>
>
>
> Checkpointing logs:
>
> [14:56:55,209][INFO][db-checkpoint-thread-#105][Checkpointer] Checkpoint
> started [checkpointId=4b071279-8f25-4cd0-a5ef-e6f58f3a5653,
> startPtr=WALPointer [idx=930, fileOff=527943956, len=51411],
> checkpointBeforeLockTime=9ms, checkpointLockWait=0ms,
> checkpointListenersExecuteTime=6ms, checkpointLockHoldTime=9ms,
> walCpRecordFsyncDuration=7ms, writeCheckpointEntryDuration=3ms,
> splitAndSortCpPagesDuration=45ms, pages=35542, reason='timeout']
> [14:56:55,921][INFO][db-checkpoint-thread-#105][Checkpointer] Checkpoint
> finished [cpId=4b071279-8f25-4cd0-a5ef-e6f58f3a5653, pages=35542,
> markPos=WALPointer [idx=930, fileOff=527943956, len=51411],
> walSegmentsCovered=[], markDuration=64ms, pagesWrite=108ms, fsync=604ms,
> total=785ms]
>
