For our configuration properties, the DataRegion initialSize and maxSize were set to 11 MB and persistence was enabled. For DataStorage, the pageSize was set to 8192 instead of 4096. For the cache, write-behind is disabled, on-heap caching is disabled, and the atomicity mode is ATOMIC.
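For reference, a minimal sketch of this setup using Ignite's Java configuration API; the cache name, key/value types, and LOCAL cache mode are assumptions (the last taken from the thread subject), everything else mirrors the values above:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheAtomicityMode;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class LocalPersistentCacheConfig {
        public static void main(String[] args) {
            // Data region: initialSize and maxSize of 11 MB, persistence enabled.
            DataRegionConfiguration region = new DataRegionConfiguration()
                .setName("Default_Region")
                .setInitialSize(11L * 1024 * 1024)
                .setMaxSize(11L * 1024 * 1024)
                .setPersistenceEnabled(true);

            // Data storage: 8192-byte pages instead of the 4096 default.
            DataStorageConfiguration storage = new DataStorageConfiguration()
                .setPageSize(8192)
                .setDefaultDataRegionConfiguration(region);

            // Cache: LOCAL mode, ATOMIC, write-behind and on-heap caching disabled
            // (the last two are also the defaults, set explicitly here for clarity).
            CacheConfiguration<Integer, byte[]> cache =
                new CacheConfiguration<Integer, byte[]>("localCache")
                    .setCacheMode(CacheMode.LOCAL)
                    .setAtomicityMode(CacheAtomicityMode.ATOMIC)
                    .setWriteBehindEnabled(false)
                    .setOnheapCacheEnabled(false);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storage)
                .setCacheConfiguration(cache);

            try (Ignite ignite = Ignition.start(cfg)) {
                // Persistence requires explicit cluster activation before caches can be used.
                ignite.cluster().active(true);
                ignite.getOrCreateCache(cache).put(1, new byte[1024]);
            }
        }
    }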
From: user@ignite.apache.org
At: 12/03/19 13:40:32
To: user@ignite.apache.org
Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with persistence enabled

Hi Mitchell,

Looks like it can easily be reproduced with low off-heap sizes; I tried with simple puts and got the same exception:

class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Failed to find a page for eviction [segmentCapacity=1580, loaded=619, maxDirtyPages=465, dirtyPages=619, cpPages=0, pinnedInSegment=0, failedToPrepare=620]
Out of memory in data region [name=Default_Region, initSize=10.0 MiB, maxSize=10.0 MiB, persistenceEnabled=true] Try the following:
  ^-- Increase maximum off-heap memory size (DataRegionConfiguration.maxSize)
  ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
  ^-- Enable eviction or expiration policies

It looks like Ignite should issue a proper warning in this case, and a couple of issues should be filed in the Ignite JIRA.

Check out this article on the persistent store in the Ignite Confluence as well:
https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStore-underthehood-Checkpointing

I've managed to get a similar example working with a 20 MB region with a bit of tuning, by adding the following properties to org.apache.ignite.configuration.DataStorageConfiguration:

<property name="checkpointFrequency" value="1500"/>
<property name="writeThrottlingEnabled" value="true"/>

The whole idea behind this is to trigger checkpoints on a timeout rather than on the dirty-pages percentage threshold. The checkpoint page buffer size may not exceed the data region size, which is 10 MB, and it might also overflow during a checkpoint. I assume that a checkpoint is never triggered in this case because of per-partition overhead: Ignite writes some metadata per partition, and it looks like at least one meta page is used for each partition, so a portion of the off-heap memory is consumed by these meta pages. With the lowest possible region size, this can take more than 3 MB for a cache with 1,000 partitions, so the 70% dirty data pages threshold would never be reached. However, I found another issue where it is not possible to save a meta page on checkpoint begin; this reproduces on a 10 MB data region with the storage configuration options mentioned above.
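A rough sketch of the same tuning expressed through the Java configuration API rather than Spring XML; the 20 MB persistent region mirrors the example above, the region name is taken from the log output, and the rest is assumed:

    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class CheckpointTuningSketch {
        public static IgniteConfiguration tunedConfiguration() {
            DataStorageConfiguration storage = new DataStorageConfiguration()
                // Checkpoint on a 1.5 s timeout rather than waiting for the
                // dirty-pages percentage threshold to be reached.
                .setCheckpointFrequency(1500)
                // Throttle writers when dirty pages accumulate too quickly.
                .setWriteThrottlingEnabled(true)
                .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
                    .setName("Default_Region")
                    .setInitialSize(20L * 1024 * 1024)   // the 20 MB region from the example above
                    .setMaxSize(20L * 1024 * 1024)
                    .setPersistenceEnabled(true));

            return new IgniteConfiguration().setDataStorageConfiguration(storage);
        }
    }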
Could you please describe your configuration if you have anything different from the defaults (page size, WAL mode, partition count) and the types of the keys/values that you use? And, if possible, could you please attach the full Ignite log from the node that suffered from the IOOM?

As for the data region/cache: in reality, cache groups also play a role here. Generally, I would recommend going with one data region for all caches unless you have a particular reason to have multiple regions. For example, if you have some really important data in a cache that always needs to be available in durable off-heap memory, then you should have a separate data region for that cache, as I'm not aware of a way to disallow evicting pages for a specific cache.

Cache groups documentation link: https://apacheignite.readme.io/docs/cache-groups

By default (when a cache doesn't have the cacheGroup property defined), each cache has its own cache group with the very same name, which lives in the specified data region or in the default data region. You might use cache groups or not, depending on your goal: use them when you want to reduce meta overhead, checkpoints, and partition exchanges, and to share internal structures to save some space; do not use them if you want to speed up inserts/lookups by giving each cache its own dedicated partition maps and B+ trees.

Best regards,
Anton

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
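As an illustration of the cache-group configuration discussed above, a minimal sketch using the Java API; the cache names and the group name are hypothetical:

    import org.apache.ignite.cache.CacheAtomicityMode;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class CacheGroupSketch {
        public static IgniteConfiguration withSharedGroup() {
            // Both caches declare the same group name, so they share one cache
            // group and its internal structures (partition maps, B+ trees),
            // reducing per-cache meta overhead as described above.
            CacheConfiguration<Integer, String> orders =
                new CacheConfiguration<Integer, String>("orders")
                    .setGroupName("sharedGroup")
                    .setAtomicityMode(CacheAtomicityMode.ATOMIC);

            CacheConfiguration<Integer, String> customers =
                new CacheConfiguration<Integer, String>("customers")
                    .setGroupName("sharedGroup")
                    .setAtomicityMode(CacheAtomicityMode.ATOMIC);

            // Omitting setGroupName() leaves each cache in its own implicit group
            // named after the cache, which is the default behavior.
            return new IgniteConfiguration().setCacheConfiguration(orders, customers);
        }
    }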