Re: Questions related to check pointing

Ilya Kasnacheev Mon, 28 Dec 2020 02:53:35 -0800

Hello!

1. If we knew the specific circumstances in which a specific setting value
will yield the most benefit, we would've already set it to that value. A
setting means that you may tune it and get better results, or not. But in
general we can't promise you anything. I did see improvements from
increasing this setting in a very specific setup, but in general you may
leave it as is.


2. More frequent checkpoints mean increased write amplification, so
reducing this value may overwhelm your system with load that it was able to
handle previously. You can set this setting to an arbitrarily small value,
in which case checkpoints will run back to back without any pauses
between them.
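
To illustrate the write amplification point, here is a rough sketch and not
Ignite code. It assumes a simplified model in which a page is flushed at most
once per checkpoint, and only if it was updated since the previous one. The
class and method names are made up for the example.

```java
public class CheckpointWriteEstimate {
    /**
     * Counts how many times a single page is flushed over a time window,
     * assuming (simplified model) that the page is updated at times
     * 0, p, 2p, ... where p = updatePeriodSec, and that a checkpoint fires
     * every checkpointIntervalSec seconds.
     */
    public static long pageWrites(long windowSec, long updatePeriodSec, long checkpointIntervalSec) {
        long writes = 0;
        long lastFlushedUpdate = -1; // time of the newest update already on disk
        for (long t = checkpointIntervalSec; t <= windowSec; t += checkpointIntervalSec) {
            // Newest update at or before this checkpoint.
            long latestUpdate = (t / updatePeriodSec) * updatePeriodSec;
            if (latestUpdate > lastFlushedUpdate) {
                writes++; // page is dirty, so this checkpoint writes it
                lastFlushedUpdate = latestUpdate;
            }
        }
        return writes;
    }

    public static void main(String[] args) {
        long window = 180; // seconds, one default checkpoint interval
        for (long interval : new long[] {180, 60, 10, 5}) {
            System.out.println("interval=" + interval + "s -> writes of one hot page in "
                + window + "s: " + pageWrites(window, 1, interval));
        }
    }
}
```

Under this model, a page updated once per second is written once in 180
seconds at the default interval, but 36 times with a 5 second interval.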

3. I don't think that the default throttling mechanism will emit any warnings.
What do you see in thread dumps?
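
If running jstack against the node process is inconvenient, a dump can also
be taken from inside the JVM with the standard java.lang.management API. The
sketch below assumes you can run code in the same JVM as the node. The class
name and the "checkpoint" name filter are only assumptions for illustration,
since actual thread names depend on the Ignite version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class ThreadDumpFilter {
    /** Returns a formatted dump of live threads whose name contains nameFragment. */
    public static List<String> dump(String nameFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> out = new ArrayList<>();
        // false, false: skip locked monitor and synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().contains(nameFragment)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(info.getThreadName()).append("\" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                sb.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            sb.append(System.lineSeparator());
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append(System.lineSeparator());
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical default filter; pass another fragment as the first argument.
        String fragment = args.length > 0 ? args[0] : "checkpoint";
        dump(fragment).forEach(System.out::println);
    }
}
```

Taking a few dumps several seconds apart while a put is stalled should show
which lock the put thread is waiting on and which thread holds it.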

Regards,
-- 
Ilya Kasnacheev


On Wed, 23 Dec 2020 at 12:48, Raymond Wilson <raymond_wil...@trimble.com> wrote:

> Hi,
>
> We have been investigating some issues which appear to be related to
> checkpointing. We currently use Apache Ignite 2.8.1 with the C# client.
>
> I have been trying to gain clarity on how certain aspects of the Ignite
> configuration relate to the checkpointing process:
>
> 1. Number of checkpointing threads. This defaults to 4, but I don't
> understand how it applies to the checkpointing process. Are more threads
> generally better (e.g. because it makes the disk IO parallel across the
> threads), or does it only have a positive effect if you have many data
> storage regions? Or something else? If this could be clarified in the
> documentation (or a pointer to it which Google has not yet found), that
> would be good.
>
> 2. Checkpoint frequency. This defaults to 180 seconds. I was thinking
> that reducing this time would result in smaller, less disruptive
> checkpoints. Setting it to 60 seconds seems pretty safe, but is there a
> practical lower limit that should be used for use cases with new data
> constantly being added, e.g. 5 seconds, 10 seconds?
>
> 3. Write exclusivity constraints during checkpointing. I understand that
> while a checkpoint is occurring, ongoing writes will be supported into the
> caches being checkpointed, and if those are writes to existing pages then
> those will be duplicated into the checkpoint buffer. If this buffer becomes
> full or stressed then Ignite will throttle, and perhaps block, writes until
> the checkpoint is complete. If this is the case then Ignite will emit
> logging (warning or informational?) that writes are being throttled.
>
> We have cases where simple puts to caches (a few requests per second) are
> taking up to 90 seconds to execute when there is an active checkpoint
> occurring, where the checkpoint has been triggered by the checkpoint
> timer. When a checkpoint is not occurring the time to do this is usually in
> the milliseconds. The checkpoints themselves can take 90 seconds or longer,
> and are updating up to 30,000-40,000 pages, across a pair of data storage
> regions, one with 4 GB in-memory space allocated (which should be 1,000,000
> pages at the standard 4 KB page size), and one small region with 128 MB.
> There is no 'throttling' logging being emitted that we can tell, so the
> checkpoint buffer (which should be 1 GB for the first data region and 256 MB
> for the second smaller region in this case) does not look like it can fill
> up during the checkpoint.
>
> It seems like the checkpoint is affecting the put operations, but I don't
> understand why that may be given the documented checkpointing process, and
> the checkpoint itself (at least via Informational logging) is not
> advertising any restrictions.
>
> Thanks,
> Raymond.
>
> --
> <http://www.trimble.com/>
> Raymond Wilson
> Solution Architect, Civil Construction Software Systems (CCSS)
>
>
