Hello Amit,

There are plans to make the cluster heal itself by kicking out unstable 
nodes or unblocking pending transactions when an abnormal situation occurs:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-5+Cluster+reaction+if+node+detects+an+extraordinary+situations

Created a ticket for your particular problem:
https://issues.apache.org/jira/browse/IGNITE-6953

Please attach the logs to the ticket to help with building a reproducer.

Anyway, for now I would find out why the OOM happens: identify the root cause 
and fix it.
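
As a starting point for that investigation, here is a minimal sketch of a heap
headroom check built only on the standard java.lang.management API. It is not
an Ignite API: the class name HeapHeadroom, its methods, and the 0.8 threshold
are assumptions chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper (not part of Ignite): reports how close the JVM heap is to its limit. */
public class HeapHeadroom {
    /** Fraction of the heap limit in use; returns -1 when the limit is undefined. */
    public static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return (double) usedBytes / maxBytes;
    }

    /** True when the limit is known and usage is at or above the threshold (e.g. 0.8 for 80%). */
    public static boolean isLow(long usedBytes, long maxBytes, double threshold) {
        double fraction = usedFraction(usedBytes, maxBytes);
        return fraction >= 0 && fraction >= threshold;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long used = heap.getUsed();
        long max = heap.getMax(); // -1 when the JVM reports no defined maximum

        System.out.printf("heap used: %d of %d bytes%n", used, max);
        if (isLow(used, max, 0.8)) { // assumed threshold: warn above 80% usage
            System.out.println("WARNING: less than 20% heap headroom left");
        }
    }
}
```

Running this periodically on each node may show which nodes become hot spots
before they fail. On HotSpot JVMs, starting the node with
-XX:+HeapDumpOnOutOfMemoryError also produces a heap dump at the moment of the
OOM, which usually helps in finding the root cause.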

—
Denis

> On Nov 14, 2017, at 4:01 AM, Ilya Kasnacheev <ilya.kasnach...@gmail.com> 
> wrote:
> 
> Hello!
> 
> My recommendation here is to always leave some extra RAM and heap so that a 
> hot spot won't cause OOM. Maybe use less RAM-intensive algorithms.
> 
> Without stack traces and logs it's hard to say more, but OOM may not be a 
> recoverable error with Ignite.
> 
> Regards,
> 
> -- 
> Ilya Kasnacheev
> 
> 2017-11-11 19:12 GMT+03:00 Amit Pundir <amitpun...@gmail.com>:
> Hi Ilya,
> Thanks for the response.
> 
> I have been following the release notes for every release - 2.1/2.2/2.3. I
> haven't seen any fixes around this (or similar sounding) issue. Since I am
> using Ignite in a very critical application, I would like to use a stable
> version which meets my requirements. I don't have a use case for disk
> persistence so I haven't upgraded.
> 
> If there is an open transaction in the grid and OOM happens on one of the
> client nodes, would it stall the complete cluster? I have tried to allocate
> enough memory to the cluster but there is a chance of creating hot spots with
> some nodes getting a higher share of cache occupancy.
> 
> I'll share the logs soon.
> 
> 
> Thanks
> 
> 
> 
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> 
