I tend to agree with Mitchell that the cluster should not crash. If the crash is unavoidable given the current architecture, then the error message should be more descriptive.
Ignite persistence experts, could you please join the conversation and shed more light on the reported behavior?

- Denis

On Wed, Dec 11, 2019 at 3:25 AM Mitchell Rathbun (BLOOMBERG/ 731 LEX) <mrathb...@bloomberg.net> wrote:

> 2 GB is not reasonable for off-heap memory for our use case. In general,
> even if off-heap is very low, performance should just degrade and calls
> should become blocking; I don't think that we should crash. Either way, the
> issue seems to be with putAll, not concurrent updates of different caches
> in the same data region. If I use Ignite's DataStreamer API instead of
> putAll, I get much better performance and no OOM exception. Any insight
> into why this might be would be appreciated.
>
> From: user@ignite.apache.org At: 12/10/19 11:24:35
> To: Mitchell Rathbun (BLOOMBERG/ 731 LEX) <mrathb...@bloomberg.net>, user@ignite.apache.org
> Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with persistence enabled
>
> Hello!
>
> 10M is a very low-ball value for testing the performance of disk, considering
> how Ignite's WAL/checkpoints are structured. As already mentioned, it does not
> even work properly.
>
> I recommend using a 2G value instead. Just load enough data so that you can
> observe constant checkpoints.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Wed, Dec 4, 2019 at 03:16, Mitchell Rathbun (BLOOMBERG/ 731 LEX) <mrathb...@bloomberg.net> wrote:
>
>> For the requested full Ignite log, where would this be found if we are
>> running in LOCAL mode? We are not explicitly running a separate Ignite
>> node, and our WorkDirectory does not seem to have any logs.
>>
>> From: user@ignite.apache.org At: 12/03/19 19:00:18
>> To: user@ignite.apache.org
>> Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with persistence enabled
>>
>> For our configuration properties, our DataRegion initialSize and maxSize
>> were set to 11 MB and persistence was enabled. For DataStorage, our pageSize
>> was set to 8192 instead of 4096.
>> For Cache, write-behind is disabled, on-heap cache is disabled, and
>> Atomicity Mode is ATOMIC.
>>
>> From: user@ignite.apache.org At: 12/03/19 13:40:32
>> To: user@ignite.apache.org
>> Subject: Re: IgniteOutOfMemoryException in LOCAL cache mode with persistence enabled
>>
>> Hi Mitchell,
>>
>> Looks like it can be easily reproduced with low off-heap sizes; I tried
>> with simple puts and got the same exception:
>>
>> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Failed to
>> find a page for eviction [segmentCapacity=1580, loaded=619,
>> maxDirtyPages=465, dirtyPages=619, cpPages=0, pinnedInSegment=0,
>> failedToPrepare=620]
>> Out of memory in data region [name=Default_Region, initSize=10.0 MiB,
>> maxSize=10.0 MiB, persistenceEnabled=true] Try the following:
>>   ^-- Increase maximum off-heap memory size (DataRegionConfiguration.maxSize)
>>   ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
>>   ^-- Enable eviction or expiration policies
>>
>> It looks like Ignite should issue a proper warning in this case, and a
>> couple of issues should be filed against Ignite JIRA.
>>
>> Check out this article on the persistent store, available in the Ignite
>> confluence, as well:
>>
>> https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStore-underthehood-Checkpointing
>>
>> I've managed to make a similar example work with a 20 MB region with a bit
>> of tuning, by adding the following properties to
>> org.apache.ignite.configuration.DataStorageConfiguration:
>>
>> <property name="checkpointFrequency" value="1500"/>
>> <property name="writeThrottlingEnabled" value="true"/>
>>
>> The whole idea behind this is to trigger checkpoints on timeout rather than
>> on the dirty-pages percentage threshold. The checkpoint page buffer size
>> may not exceed the data region size, which is 10 MB here, and it might be
>> overflown during a checkpoint as well.
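For anyone configuring Ignite programmatically rather than through Spring XML, the two properties Anton quotes map onto the Java API roughly as follows. This is an untested sketch: the region name is taken from the exception message above, and the 20 MB size is the value Anton reports working; adjust both for your setup.

```java
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CheckpointTuningSketch {
    /** Builds a configuration mirroring the XML tuning quoted in the thread. */
    public static IgniteConfiguration build() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Trigger checkpoints on a timeout (1.5 s) rather than waiting for
        // the dirty-pages threshold, and throttle writes when the
        // checkpointer falls behind.
        storageCfg.setCheckpointFrequency(1500);
        storageCfg.setWriteThrottlingEnabled(true);

        DataRegionConfiguration regionCfg = new DataRegionConfiguration();
        regionCfg.setName("Default_Region");     // name from the exception above
        regionCfg.setMaxSize(20L * 1024 * 1024); // 20 MB, per Anton's experiment
        regionCfg.setPersistenceEnabled(true);

        storageCfg.setDefaultDataRegionConfiguration(regionCfg);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
        return cfg;
    }
}
```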
>> I assume that the checkpoint is never triggered in this case because of
>> per-partition overhead: Ignite writes some metadata per partition, and it
>> looks like at least one meta page is utilized for each partition, which
>> results in some amount of off-heap devoured by these meta pages. With the
>> lowest possible region size, this might consume more than 3 MB for a cache
>> with 1k partitions (for instance, 1024 partitions times one 4 KiB meta page
>> is already 4 MiB), so the 70% dirty-data-pages threshold would never be
>> reached.
>>
>> However, I found another issue where it is not possible to save the meta
>> page on checkpoint begin; this reproduces on a 10 MB data region with the
>> mentioned storage configuration options.
>>
>> Could you please describe your configuration if you have anything different
>> from the defaults (page size, WAL mode, partition count) and the types of
>> key/value that you use? And if possible, could you please attach the full
>> Ignite log from the node that suffered from the IOOM?
>>
>> As for the data region/cache, in reality you also have cache groups playing
>> a role here. Generally I would recommend you go with one data region for
>> all caches unless you have a particular reason to have multiple regions.
>> For example, if you have some really important data in a cache that always
>> needs to be available in durable off-heap memory, then you should have a
>> separate data region for that cache, as I'm not aware of any possibility to
>> disallow evicting pages for a specific cache.
>>
>> Cache groups documentation link:
>> https://apacheignite.readme.io/docs/cache-groups
>>
>> By default (when a cache doesn't have the cacheGroup property defined),
>> each cache has its own cache group with the very same name, which lives in
>> the specified data region or in the default data region.
>> You might use them or not, depending on the goal you have: use them when
>> you want to reduce meta overhead/checkpoints/partition exchanges and share
>> internal structures to save some space, or do not use them when you want to
>> speed up inserts/lookups by giving each cache its own dedicated partition
>> maps and B+ trees.
>>
>> Best regards,
>> Anton
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
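As a footnote to Mitchell's putAll-vs-DataStreamer observation earlier in the thread: the streamer buffers entries and flushes them in batches instead of pushing one large bulk update through the regular atomic write path, which is one plausible reason it behaves better against a small persistent region. A minimal, untested sketch of both approaches (the cache name and data are made up for illustration; this requires a running Ignite node):

```java
import java.util.Map;
import java.util.TreeMap;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class StreamerVsPutAll {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("myCache");

            // putAll: one bulk update; with persistence enabled, every entry
            // goes through the regular write path, WAL, and checkpointer.
            Map<Long, String> batch = new TreeMap<>();
            for (long i = 0; i < 10_000; i++)
                batch.put(i, "value-" + i);
            ignite.<Long, String>cache("myCache").putAll(batch);

            // DataStreamer: entries are buffered and flushed in batches.
            try (IgniteDataStreamer<Long, String> streamer =
                     ignite.dataStreamer("myCache")) {
                streamer.allowOverwrite(true); // update keys that already exist
                for (long i = 0; i < 10_000; i++)
                    streamer.addData(i, "value-" + i);
            } // close() flushes any remaining buffered entries
        }
    }
}
```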