From the GC logs at the end of the files I see Full GC pauses like this:

    2017-02-17T04:29:22.118-0800: 21122.643: [Full GC (Allocation Failure) 10226M->8526M(10G), 26.8952036 secs] [Eden: 0.0B(512.0M)->0.0B(536.0M) Survivors: 0.0B->0.0B Heap: 10226.0M(10.0G)->8526.8M(10.0G)], [Metaspace: 77592K->77592K(1120256K)]
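For reference, log lines in this format come from JDK 8's G1 collector with the classic (pre-unified) GC logging flags enabled. A minimal sketch of those options; the log path is a placeholder:

    # JDK 8 HotSpot flags that produce GC log lines like the one above
    -XX:+UseG1GC
    -verbose:gc
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -Xloggc:/path/to/gc.log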
Your heap is exhausted: even after a 26.9-second Full GC, about 8.5 GB of
the 10 GB heap is still live. During such long GC pauses discovery doesn't
receive heartbeats, so nodes get stopped due to segmentation. Please check
your nodes' logs for the NODE_SEGMENTED pattern. If that is your case, try
to tune GC or reduce the load on it (see [1] for details).

[1] https://apacheignite.readme.io/docs/jvm-and-system-tuning

On Fri, Feb 17, 2017 at 6:35 PM, Anil <anilk...@gmail.com> wrote:
> Hi Andrey,
>
> The query execution time is very high with limit 10000+250.
>
> There is 10 GB of heap memory for both the client and the servers. I
> have attached the GC logs of the 4 servers. Could you please take a
> look? Thanks.
>
> On 17 February 2017 at 20:52, Anil <anilk...@gmail.com> wrote:
>>
>> Hi Andrey,
>>
>> I checked the GC logs and everything looks good.
>>
>> Thanks
>>
>> On 17 February 2017 at 20:45, Andrey Gura <ag...@apache.org> wrote:
>>>
>>> Anil,
>>>
>>> IGNITE-4003 isn't related to your problem.
>>>
>>> I think the nodes are going out of topology due to long GC pauses.
>>> You can easily check this using the GC logs.
>>>
>>> On Fri, Feb 17, 2017 at 6:04 PM, Anil <anilk...@gmail.com> wrote:
>>> > Hi,
>>> >
>>> > We noticed that whenever long-running queries are fired, nodes go
>>> > out of topology and the entire Ignite cluster goes down.
>>> >
>>> > In my case, a filter criterion can match 5L (500,000) records, and
>>> > each API request fetches 250 of them. As the page number increases,
>>> > the query execution time grows and the entire cluster goes down.
>>> >
>>> > Is https://issues.apache.org/jira/browse/IGNITE-4003 related to this?
>>> >
>>> > Can we set a separate thread pool for query execution, compute jobs
>>> > and other services instead of the common public thread pool?
>>> >
>>> > Thanks
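On the paging question above: with limit 10000+250 the engine typically has to produce and then discard the first 10,000 rows on every request, so deep pages get progressively more expensive and put that much more garbage on the heap. One way to reduce the load is to stream a single cursor with a modest page size instead of re-running the query per page. A minimal sketch, assuming Ignite's public SqlFieldsQuery API; the cache name, table and columns here are made up:

    import java.util.List;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.cache.query.QueryCursor;
    import org.apache.ignite.cache.query.SqlFieldsQuery;

    public class StreamingQueryExample {
        // Streams results through one server-side cursor instead of
        // re-running the query with an ever-growing LIMIT/OFFSET.
        static void streamResults(Ignite ignite) {
            // "personCache" and the Person table are hypothetical names.
            IgniteCache<Long, Object> cache = ignite.cache("personCache");

            SqlFieldsQuery qry = new SqlFieldsQuery(
                "SELECT id, name FROM Person WHERE status = ?").setArgs("ACTIVE");
            qry.setPageSize(250); // rows fetched per network round-trip

            try (QueryCursor<List<?>> cursor = cache.query(qry)) {
                for (List<?> row : cursor)
                    process(row); // hand each row to the API layer as it arrives
            }
        }

        static void process(List<?> row) {
            // application-specific handling
        }
    }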
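And on the separate-thread-pool question: recent Ignite versions expose a dedicated SQL query pool on IgniteConfiguration, separate from the public pool; check whether your version has it. Independently, raising the failure detection timeout gives nodes more headroom to survive long GC pauses without being segmented. A hedged sketch; both values are illustrative, not recommendations:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class NodeConfigExample {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Allow longer GC pauses before peers consider this node dead
            // (milliseconds; illustrative value).
            cfg.setFailureDetectionTimeout(30_000);

            // Size of the dedicated SQL query pool, separate from the
            // public pool (available in recent Ignite versions).
            cfg.setQueryThreadPoolSize(16);

            Ignition.start(cfg);
        }
    }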