Just a correction to the context of the data region that ran out of memory:
this one does not have a queue of items or a continuous query operating on a
cache within it.

Thanks,
Raymond.

On Thu, Jun 11, 2020 at 4:12 PM Raymond Wilson <raymond_wil...@trimble.com>
wrote:

> Pavel,
>
> I have run into a different instance of an out of memory error in a data
> region, in a different context from the one I wrote the reproducer for. In
> this case, there is an activity which queues items for processing at a
> point in the future and which does use a continuous query; however, there is
> also significant vanilla put/get activity against a range of other caches.
>
> This data region was permitted to grow to 1 GiB and has persistence enabled.
> We are now using Ignite 2.8.
>
> I would like to understand if this is a possible failure mode given that
> the data region has persistence enabled. The underlying cause appears to be
> 'Failed to find a page for eviction'. Should this be expected on data
> regions with persistence?
>
> I have included the error below.
>
> This is the initial error reported by Ignite:
>
> 2020-06-11 12:53:35,082 [98] ERR [ImmutableCacheComputeServer] JVM will be
> halted immediately due to the failure: [failureCtx=FailureContext
> [type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException:
> Failed to find a page for eviction [segmentCapacity=13612, loaded=5417,
> maxDirtyPages=4063, dirtyPages=5417, cpPages=0, pinnedInSegment=0,
> failedToPrepare=5417]
> Out of memory in data region [name=Default-Immutable, initSize=128.0 MiB,
> maxSize=1.0 GiB, persistenceEnabled=true] Try the following:
>   ^-- Increase maximum off-heap memory size
> (DataRegionConfiguration.maxSize)
>   ^-- Enable Ignite persistence
> (DataRegionConfiguration.persistenceEnabled)
>   ^-- Enable eviction or expiration policies]]
>
> Following this error is a lock dump, where this is the only thread with a
> lock. (I am assuming the structureId member with the value
> 'Spatial-SubGridSegment-Mutable-602' refers to a remote actor holding a
> lock against an item in the local node.)
>
> Thread=[name=sys-stripe-11-#12%TRex-Immutable%, id=26], state=RUNNABLE
> Locked pages = [284060547022916[0001025a00000044](r=0|w=1)]
> Locked pages log: name=sys-stripe-11-#12%TRex-Immutable%
> time=(1591836815071, 2020-06-11 12:53:35.071)
> L=1 -> Write lock pageId=284060547022916,
> structureId=Spatial-SubGridSegment-Mutable-602 [pageIdHex=0001025a00000044,
> partId=602, pageIdx=68, flags=00000001]
>
> Following the lock dump is this final error before the Ignite node stops:
>
> 2020-06-11 12:53:35,082 [98] ERR [ImmutableCacheComputeServer] JVM will be
> halted immediately due to the failure: [failureCtx=FailureContext
> [type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException:
> Failed to find a page for eviction [segmentCapacity=13612, loaded=5417,
> maxDirtyPages=4063, dirtyPages=5417, cpPages=0, pinnedInSegment=0,
> failedToPrepare=5417]
> Out of memory in data region [name=Default-Immutable, initSize=128.0 MiB,
> maxSize=1.0 GiB, persistenceEnabled=true] Try the following:
>   ^-- Increase maximum off-heap memory size
> (DataRegionConfiguration.maxSize)
>   ^-- Enable Ignite persistence
> (DataRegionConfiguration.persistenceEnabled)
>   ^-- Enable eviction or expiration policies]]
>
>
>
>
> On Wed, May 13, 2020 at 2:15 AM Raymond Wilson <raymond_wil...@trimble.com>
> wrote:
>
>> Hi Pavel,
>>
>> The reproducer is not the actual use case, which is too big to use; it's
>> a small example using the same mechanisms. I have not used a data streamer
>> before, so I'll read up on it.
>>
>> I'll try running the reproducer again against 2.8 (I used 2.7.6 for the
>> reproducer).
>>
>> Thanks,
>> Raymond.
>>
>>
>> On Tue, May 12, 2020 at 11:18 PM Pavel Tupitsyn <ptupit...@apache.org>
>> wrote:
>>
>>> Hi Raymond,
>>>
>>> First, I could not reproduce the issue. The attached program runs to
>>> completion on my machine.
>>>
>>> Second, I see a few issues with the attached code:
>>> - Cache.PutIfAbsent is used instead of DataStreamer
>>> - ICacheEntryEventFilter is used to remove cache entries, and is called
>>> twice - on add and on remove
>>>
>>> My recommendation is to use a "classic" combination of Data Streamer,
>>> Continuous Query, and Expiry Policy.
>>> Set expiry policy to a few seconds, and you won't keep much data in
>>> memory. Ignite will handle the removal for you.
>>> Let me know if I should prepare an example.
>>>
>>> Also, it is not clear why persistence is needed for such a "buffer" cache.
>>> Items are removed almost immediately, so it would be much more efficient
>>> to disable persistence.
>>>
>>> Thanks,
>>> Pavel
>>>
>>> On Tue, May 12, 2020 at 12:23 PM Raymond Wilson <
>>> raymond_wil...@trimble.com> wrote:
>>>
>>>> Well, it appears I was wrong. It reappeared. :(
>>>>
>>>> I thought I had sent a reply to this thread but cannot find it, so I am
>>>> resending it now.
>>>>
>>>> Attached is a C# reproducer that throws Ignite out of memory errors in
>>>> the situation I outlined above: cache operations against a small cache
>>>> with persistence enabled.
>>>>
>>>> Let me know if you're able to reproduce it on your local systems.
>>>>
>>>> Thanks,
>>>> Raymond.
>>>>
>>>>
>>>> On Tue, Mar 3, 2020 at 1:31 PM Raymond Wilson <
>>>> raymond_wil...@trimble.com> wrote:
>>>>
>>>>> It's possible this is user (me) error.
>>>>>
>>>>> I discovered I had set the cache size to be 64 MB in the server, but
>>>>> 65 MB (typo!) in the client. Making these two values consistent appeared to
>>>>> prevent the error.
>>>>>
>>>>> Raymond.
>>>>>
>>>>>
>>>>> On Tue, Mar 3, 2020 at 12:58 PM Raymond Wilson <
>>>>> raymond_wil...@trimble.com> wrote:
>>>>>
>>>>>> I'm using Ignite v2.7.5 with the C# client.
>>>>>>
>>>>>> I have an error where Ignite throws an out of memory exception, like
>>>>>> this:
>>>>>>
>>>>>> 2020-03-03 12:02:58,036 [287] ERR [MutableCacheComputeServer] JVM
>>>>>> will be halted immediately due to the failure: [failureCtx=FailureContext
>>>>>> [type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException:
>>>>>> Out of memory in data region [name=TAGFileBufferQueue, initSize=64.0 MiB,
>>>>>> maxSize=64.0 MiB, persistenceEnabled=true] Try the following:
>>>>>>   ^-- Increase maximum off-heap memory size
>>>>>> (DataRegionConfiguration.maxSize)
>>>>>>   ^-- Enable Ignite persistence
>>>>>> (DataRegionConfiguration.persistenceEnabled)
>>>>>>   ^-- Enable eviction or expiration policies]]
>>>>>>
>>>>>> I don't have an eviction policy set (is this even a valid
>>>>>> recommendation when using persistence?)
>>>>>>
>>>>>> Increasing the off heap memory size for the data region does prevent
>>>>>> this error, but I want to minimise the in-memory size for this buffer as
>>>>>> it is essentially just a queue.
>>>>>>
>>>>>> The suggestion of enabling data persistence is strange as this data
>>>>>> region already has persistence enabled.
>>>>>>
>>>>>> My assumption is that Ignite manages the memory in this cache by
>>>>>> saving and loading values as required.
>>>>>>
>>>>>> The test workflow in this failure is one where ~14,500 objects
>>>>>> totalling ~440 MB in size (average object size = ~30 KB) are added to the
>>>>>> cache, and are then drained by a processor using a continuous query.
>>>>>> Elements are removed from the cache as the processor completes them.
>>>>>>
>>>>>> Is this kind of out of memory error supposed to be possible when
>>>>>> using persistent data regions?
>>>>>>
>>>>>> Thanks,
>>>>>> Raymond.
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> --
>>>> <http://www.trimble.com/>
>>>> Raymond Wilson
>>>> Solution Architect, Civil Construction Software Systems (CCSS)
>>>> 11 Birmingham Drive | Christchurch, New Zealand
>>>> +64-21-2013317 Mobile
>>>> raymond_wil...@trimble.com
>>>>
>>>>
>>>> <https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>
>>>>
>>>
>>
>
>

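For anyone reading the lock dump quoted above, the decimal pageId and the pageIdHex, partId, pageIdx and flags fields printed beside it appear to be the same 64-bit value split into bit fields. The sketch below decodes it in Python. The field widths (32-bit page index, 16-bit partition id, 8-bit flags) are an assumption inferred from that one log line, not taken from the Ignite documentation, and decode_page_id is a hypothetical helper name.

```python
# Assumed field widths, inferred from the log line
# "pageIdHex=0001025a00000044, partId=602, pageIdx=68, flags=00000001".
PAGE_IDX_BITS = 32
PART_ID_BITS = 16
FLAG_BITS = 8


def decode_page_id(page_id: int) -> dict:
    """Split a 64-bit page id into the fields shown in the lock dump."""
    page_idx = page_id & ((1 << PAGE_IDX_BITS) - 1)
    part_id = (page_id >> PAGE_IDX_BITS) & ((1 << PART_ID_BITS) - 1)
    flags = (page_id >> (PAGE_IDX_BITS + PART_ID_BITS)) & ((1 << FLAG_BITS) - 1)
    return {
        "hex": f"{page_id:016x}",
        "part_id": part_id,
        "page_idx": page_idx,
        "flags": flags,
    }


# The page id reported as write-locked in the dump.
decoded = decode_page_id(284060547022916)
print(decoded)
```

Under that assumed layout, the decoded partition id (602) matches the suffix of the structureId 'Spatial-SubGridSegment-Mutable-602', which suggests the suffix is a partition number.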

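The counters in the 'Failed to find a page for eviction' message quoted above can also be checked mechanically. This is a minimal sketch that only parses the numbers and compares them. Reading dirtyPages and maxDirtyPages as "dirty pages in the segment" and "the limit on them" is an assumption based on the counter names alone.

```python
import re

# Counter text copied from the quoted error message.
message = (
    "Failed to find a page for eviction [segmentCapacity=13612, loaded=5417, "
    "maxDirtyPages=4063, dirtyPages=5417, cpPages=0, pinnedInSegment=0, "
    "failedToPrepare=5417]"
)

# Pull every name=number pair out of the message.
counters = {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", message)}

# Every loaded page is reported as dirty.
all_loaded_dirty = counters["dirtyPages"] == counters["loaded"]
# The dirty page count is above the reported maximum.
over_dirty_limit = counters["dirtyPages"] > counters["maxDirtyPages"]
dirty_excess = counters["dirtyPages"] - counters["maxDirtyPages"]
# Fraction of the segment capacity that is actually loaded.
fill = counters["loaded"] / counters["segmentCapacity"]

print(all_loaded_dirty, over_dirty_limit, dirty_excess, round(fill, 3))
```

Taken at face value, the segment was only about 40% full when the failure occurred. All 5417 loaded pages were dirty, none were checkpoint or pinned pages, and every one of them failed preparation for eviction.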