All Ignite Nodes Crashes After SQL Select Complex Queries

Farhan Abdul Shakoor Wed, 06 Jul 2022 11:13:07 -0700

Hi Folks,

We are running into strange issues when running queries against Ignite. Here is
our current setup:


- 8-node Ignite cluster on 128 GB VMs, deployed on Azure Kubernetes
- Persistence enabled, with a 30 GB data region size

With the following node configuration:
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="metricsEnabled" value="true"/>
        <property name="pageSize" value="#{8 * 1024}"/>
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true"/>
                <property name="maxSize" value="#{30L * 1024 * 1024 * 1024}"/>
                <property name="pageReplacementMode" value="SEGMENTED_LRU"/>
                <property name="pageEvictionMode" value="RANDOM_2_LRU"/>
                <property name="metricsEnabled" value="true"/>
            </bean>
        </property>
        <property name="walSegmentSize" value="#{128L * 1024 * 1024}"/>
        <property name="walPath" value="/ignite/wal"/>
        <property name="walArchivePath" value="/ignite/walarchive"/>
        <property name="walMode" value="FSYNC"/>
    </bean>
</property>
<property name="failureHandler">
    <bean class="org.apache.ignite.failure.RestartProcessFailureHandler"/>
</property>


When the query exceptions start, we get multiple waiting-thread errors like this:

Thread [name="main", id=1, state=WAITING, blockCnt=5, waitCnt=2636]
    Lock [object=java.util.concurrent.CountDownLatch$Sync@b027ad0, ownerName=null, ownerId=-1]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
        at o.a.i.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:398)
[14:25:07,980][SEVERE][disco-event-worker-#67][FailureProcessor] Ignite node is in invalid state due to a critical failure.

After that, all nodes crash.

Please suggest whether there is any configuration value we can change to
terminate long-running queries.
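
To illustrate the kind of setting we are asking about: a minimal sketch,
assuming Ignite 2.9 or later, where SqlConfiguration exposes a
defaultQueryTimeout property (in milliseconds). The 30-second value is purely
illustrative and is not part of our current setup; we have not verified that
this prevents the crash described above.

```xml
<!-- Hypothetical addition to the IgniteConfiguration bean; not in our current config. -->
<property name="sqlConfiguration">
    <bean class="org.apache.ignite.configuration.SqlConfiguration">
        <!-- Assumed value: cancel SQL queries that run longer than 30 000 ms. -->
        <property name="defaultQueryTimeout" value="30000"/>
    </bean>
</property>
```

A per-query alternative would be calling SqlFieldsQuery.setTimeout(...) on the
individual queries, if changing the client code is an option.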

Thanks
