I tend to agree with Mitchell that the cluster should not crash. If the
crash is unavoidable given the current architecture, then the message
should at least be more descriptive.

Ignite persistence experts, could you please join the conversation and shed
more light on the reported behavior?

-
Denis


On Wed, Dec 11, 2019 at 3:25 AM Mitchell Rathbun (BLOOMBERG/ 731 LEX) <
mrathb...@bloomberg.net> wrote:

> 2 GB is not reasonable for off-heap memory for our use case. In general,
> even if off-heap is very low, performance should just degrade and calls
> should become blocking; I don't think that we should crash. Either way, the
> issue seems to be with putAll, not concurrent updates of different caches
> in the same data region. If I use Ignite's DataStreamer API instead of
> putAll, I get much better performance and no OOM exception. Any insight
> into why this might be the case would be appreciated.
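>
> Roughly what this looks like in code (simplified, assuming a Long -> String
> cache; not our actual types or cache name):
>
> import java.util.Map;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteDataStreamer;
>
> // Before: batched puts, which triggered IgniteOutOfMemoryException for us
> // cache.putAll(batch);
>
> // After: streaming the same entries, which performed better and did not OOM
> void load(Ignite ignite, Map<Long, String> batch) {
>     try (IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("myCache")) {
>         streamer.allowOverwrite(true);          // behave like putAll for existing keys
>         for (Map.Entry<Long, String> e : batch.entrySet())
>             streamer.addData(e.getKey(), e.getValue());
>     } // close() flushes any remaining buffered entries
> }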
>
> From: user@ignite.apache.org At: 12/10/19 11:24:35
> To: Mitchell Rathbun (BLOOMBERG/ 731 LEX ) <mrathb...@bloomberg.net>,
> user@ignite.apache.org
> Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with
> persistence enabled
>
> Hello!
>
> 10 MB is a very low value for testing disk performance, considering how
> Ignite's WAL and checkpoints are structured. As already mentioned, it does
> not even work properly.
>
> I recommend using a 2 GB value instead. Just load enough data so that you
> can observe constant checkpoints.
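>
> For example, via the Java API (a sketch; apply the same values in your XML
> if you configure Ignite through Spring):
>
> import org.apache.ignite.configuration.DataStorageConfiguration;
> import org.apache.ignite.configuration.IgniteConfiguration;
>
> DataStorageConfiguration storageCfg = new DataStorageConfiguration();
> storageCfg.getDefaultDataRegionConfiguration()
>     .setPersistenceEnabled(true)
>     .setInitialSize(2L * 1024 * 1024 * 1024)   // 2 GB
>     .setMaxSize(2L * 1024 * 1024 * 1024);      // 2 GB
>
> IgniteConfiguration cfg = new IgniteConfiguration()
>     .setDataStorageConfiguration(storageCfg);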
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> ср, 4 дек. 2019 г. в 03:16, Mitchell Rathbun (BLOOMBERG/ 731 LEX) <
> mrathb...@bloomberg.net>:
>
>> For the requested full Ignite log, where would this be found if we are
>> running in LOCAL mode? We are not explicitly running a separate Ignite
>> node, and our WorkDirectory does not seem to have any logs.
>>
>> From: user@ignite.apache.org At: 12/03/19 19:00:18
>> To: user@ignite.apache.org
>> Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with
>> persistence enabled
>>
>> For our configuration properties, our DataRegion initialSize and maxSize
>> were set to 11 MB and persistence was enabled. For DataStorage, our pageSize
>> was set to 8192 instead of 4096. For the cache, write-behind is disabled,
>> on-heap caching is disabled, and the atomicity mode is ATOMIC.
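>>
>> In code, the relevant part of our setup looks roughly like this (key/value
>> types and the cache name are simplified):
>>
>> import org.apache.ignite.cache.CacheAtomicityMode;
>> import org.apache.ignite.cache.CacheMode;
>> import org.apache.ignite.configuration.CacheConfiguration;
>> import org.apache.ignite.configuration.DataStorageConfiguration;
>>
>> DataStorageConfiguration storageCfg = new DataStorageConfiguration()
>>     .setPageSize(8192);                           // instead of the default 4096
>> storageCfg.getDefaultDataRegionConfiguration()
>>     .setInitialSize(11L * 1024 * 1024)            // 11 MB
>>     .setMaxSize(11L * 1024 * 1024)
>>     .setPersistenceEnabled(true);
>>
>> CacheConfiguration<Long, Object> cacheCfg = new CacheConfiguration<Long, Object>("myCache")
>>     .setCacheMode(CacheMode.LOCAL)
>>     .setAtomicityMode(CacheAtomicityMode.ATOMIC)
>>     .setWriteBehindEnabled(false)                 // write-behind disabled
>>     .setOnheapCacheEnabled(false);                // on-heap cache disabled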
>>
>> From: user@ignite.apache.org At: 12/03/19 13:40:32
>> To: user@ignite.apache.org
>> Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with
>> persistence enabled
>>
>> Hi Mitchell,
>>
>> Looks like it could be easily reproduced with low off-heap sizes; I tried
>> with simple puts and got the same exception:
>>
>> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Failed to
>> find a page for eviction [segmentCapacity=1580, loaded=619,
>> maxDirtyPages=465, dirtyPages=619, cpPages=0, pinnedInSegment=0,
>> failedToPrepare=620]
>> Out of memory in data region [name=Default_Region, initSize=10.0 MiB,
>> maxSize=10.0 MiB, persistenceEnabled=true] Try the following:
>> ^-- Increase maximum off-heap memory size
>> (DataRegionConfiguration.maxSize)
>> ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
>> ^-- Enable eviction or expiration policies
>>
>> It looks like Ignite must issue a proper warning in this case, and a couple
>> of issues must be filed against the Ignite JIRA.
>>
>> Check out this article on the persistent store, available in the Ignite
>> Confluence, as well:
>>
>> https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStore-underthehood-Checkpointing
>>
>> I've managed to make a similar example work with a 20 MB region with a bit
>> of tuning, by adding the following properties to
>> org.apache.ignite.configuration.DataStorageConfiguration:
>>
>> <property name="checkpointFrequency" value="1500"/>
>> <property name="writeThrottlingEnabled" value="true"/>
>>
>> The whole idea behind this is to trigger checkpoints on a timeout rather
>> than on the dirty-pages percentage threshold. Note that the checkpoint page
>> buffer size may not exceed the data region size, which is 10 MB here, and
>> the buffer might be overflown during a checkpoint as well.
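>>
>> The same tuning via the Java API, with the checkpoint page buffer sized
>> explicitly (a sketch; the buffer value is simply the region size from this
>> example):
>>
>> import org.apache.ignite.configuration.DataStorageConfiguration;
>>
>> DataStorageConfiguration storageCfg = new DataStorageConfiguration()
>>     .setCheckpointFrequency(1500)        // checkpoint on timeout, not only on the dirty-pages threshold
>>     .setWriteThrottlingEnabled(true);    // throttle writers instead of failing when buffers fill up
>> storageCfg.getDefaultDataRegionConfiguration()
>>     .setCheckpointPageBufferSize(10L * 1024 * 1024);   // must not exceed the data region size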
>>
>> I assume that a checkpoint is never triggered in this case because of
>> per-partition overhead: Ignite writes some metadata per partition, and it
>> looks like at least one meta page is utilized for each partition, which
>> results in some amount of off-heap devoured by these meta pages. With the
>> lowest possible region size, this might consume more than 3 MB for a cache
>> with 1k partitions, and the 70% dirty data pages threshold would never be
>> reached.
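>>
>> As a rough illustration (assuming the default 1024 partitions, the default
>> 4 KB page size and one meta page per partition): 1024 * 4096 bytes = 4 MB,
>> so close to half of a 10 MB region can be consumed by partition metadata
>> before any data pages are written, and the 70% dirty-pages threshold may
>> never be reached.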
>>
>> However, I found another issue where it is not possible to save a meta page
>> at checkpoint begin; this reproduces on a 10 MB data region with the storage
>> configuration options mentioned above.
>>
>> Could you please describe your configuration if you have anything different
>> from the defaults (page size, WAL mode, partition count) and the key/value
>> types that you use? And if possible, could you please attach the full Ignite
>> log from the node that suffered the IOOM?
>>
>> As for data regions and caches, in reality cache groups also play a role
>> here. But generally I would recommend going with one data region for all
>> caches unless you have a particular reason to have multiple regions. For
>> example, if you have some really important data in a cache that always needs
>> to be available in durable off-heap memory, then you should have a separate
>> data region for that cache, since I'm not aware of any possibility to
>> disallow evicting pages for a specific cache.
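>>
>> A sketch of what a dedicated region for such a cache could look like
>> (region and cache names here are made up):
>>
>> import org.apache.ignite.configuration.CacheConfiguration;
>> import org.apache.ignite.configuration.DataRegionConfiguration;
>> import org.apache.ignite.configuration.DataStorageConfiguration;
>>
>> DataRegionConfiguration importantRegion = new DataRegionConfiguration()
>>     .setName("Important_Region")
>>     .setPersistenceEnabled(true)
>>     .setInitialSize(256L * 1024 * 1024)
>>     .setMaxSize(256L * 1024 * 1024);
>>
>> DataStorageConfiguration storageCfg = new DataStorageConfiguration()
>>     .setDataRegionConfigurations(importantRegion);   // in addition to the default region
>>
>> CacheConfiguration<Long, Object> importantCache =
>>     new CacheConfiguration<Long, Object>("importantCache")
>>         .setDataRegionName("Important_Region");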
>>
>> Cache groups documentation link:
>> https://apacheignite.readme.io/docs/cache-groups
>>
>> By default (when a cache doesn't have the cacheGroup property defined), each
>> cache has its own cache group with the very same name, which lives in the
>> specified data region or the default one. You might use cache groups or not,
>> depending on your goal: use them when you want to reduce metadata
>> overhead/checkpoints/partition exchanges and share internal structures to
>> save a bit of space, or do not use them if you want to speed up
>> inserts/lookups by giving each cache its own dedicated partition maps and
>> B+ trees.
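>>
>> For example, putting two caches into one group (names are illustrative):
>>
>> import org.apache.ignite.configuration.CacheConfiguration;
>>
>> CacheConfiguration<Long, Object> ordersCfg = new CacheConfiguration<Long, Object>("orders")
>>     .setGroupName("accountData");    // shares partition maps and B+ tree structures with the group
>> CacheConfiguration<Long, Object> paymentsCfg = new CacheConfiguration<Long, Object>("payments")
>>     .setGroupName("accountData");    // omit setGroupName to keep a dedicated group per cache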
>>
>> Best regards,
>> Anton
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>>
>>
>>
>
