Re: gridgain ultimate edition snapshot error
Hello there,

Yes, it's not a bug. The open files limit is still at its default and needs to be raised. Consider adding a ulimit call to your start script.

Very likely it worked for you on Ignite but not on GridGain because you're running Ignite on another machine (a testing VM?) where you have fewer caches, hence fewer files.

Cheers
Gianluca

On Tue, 7 Jun 2022 at 10:35, Surinder Mehra wrote:
> Hi,
> Thanks for your reply. Current limits are highlighted below. As suggested
> in prev reply, I will change limits and try again.
>
> Limit                     Soft Limit           Hard Limit           Units
> *Max open files           1024                 4096                 files*
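The advice above is to raise the open files limit before the node starts, typically with `ulimit -n` in the start script. As a rough sketch of the same idea in Python, a launcher could raise its own soft limit and then start the node as a child process, since child processes inherit resource limits. The function name `raise_nofile_soft_limit` is made up for illustration and is not part of GridGain or Ignite.

```python
import resource


def raise_nofile_soft_limit(desired):
    """Raise this process's soft open-files limit toward `desired`.

    Returns the (soft, hard) pair in effect afterwards. The limit is never
    lowered, and the hard limit is left untouched.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, hard  # already unlimited, nothing to raise
    # An unprivileged process can raise the soft limit only up to the hard limit.
    target = desired if hard == resource.RLIM_INFINITY else min(desired, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

With the limits reported in this thread (soft 1024, hard 4096), a call such as `raise_nofile_soft_limit(65536)` could only reach 4096. Going higher means raising the hard limit at the system level first, as described in the ulimits documentation linked elsewhere in this thread.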
Re: gridgain ultimate edition snapshot error
Hi,

Thanks for your reply. Current limits are shown below, with the open files row highlighted. As suggested in the previous reply, I will change the limits and try again.

Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             63306                63306                processes
*Max open files           1024                 4096                 files*
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       63306                63306                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

On Tue, Jun 7, 2022 at 1:38 PM Gianluca Bonetti wrote:
> Hello
>
> What is returned by this command?
>
> # cat /proc/PID/limits
>
> Cheers
> Gianluca
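To check the limits of a running node without reading the table by eye, the raw output of `cat /proc/PID/limits` can be parsed. The sketch below assumes the usual Linux layout (a `Limit` header row followed by `Max ...` rows whose columns are padded with spaces). The name `parse_proc_limits` is hypothetical, and the input must be the raw file contents, without the asterisks used for highlighting in the message above.

```python
import re

# One row: limit name, soft value, hard value, optional units.
# The name may contain single spaces, so it ends at the first run of 2+ spaces.
_ROW = re.compile(r"^(Max .+?)\s{2,}(\S+)\s+(\S+)(?:\s+(\S+))?\s*$")


def _value(text):
    """Convert a limit value to int, using None for 'unlimited'."""
    return None if text == "unlimited" else int(text)


def parse_proc_limits(text):
    """Map each limit name to a (soft, hard, units) tuple."""
    limits = {}
    for line in text.splitlines():
        match = _ROW.match(line)
        if match:  # the header row does not start with "Max" and is skipped
            name, soft, hard, units = match.groups()
            limits[name] = (_value(soft), _value(hard), units)
    return limits
```

For the values reported here, `parse_proc_limits(...)["Max open files"]` would give a soft limit of 1024 and a hard limit of 4096, which is the row that matters for the "Too many open files" error.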
Re: gridgain ultimate edition snapshot error
Hello

What is returned by this command?

# cat /proc/PID/limits

Cheers
Gianluca

On Tue, 7 Jun 2022 at 07:35, Surinder Mehra wrote:
> Hi,
> I was going through this post on stackoverflow which is about the same
> issue. The fact that snapshot works for apache ignite but not in ultimate
> edition indicates there is some bug in the latter. Could you please confirm. We
> have around 15 caches with 2 backups. I changed backups to zero but still
> see this issue. Could you please advise further.
>
> https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster
Re[2]: gridgain ultimate edition snapshot error
Hi,

You need to change the limits [1].

[1] https://www.gridgain.com/docs/latest/perf-troubleshooting-guide/general-perf-tips#ulimits

> Tuesday, 7 June 2022, 8:35 +03:00 from Surinder Mehra:
>
> Hi,
> I was going through this post on stackoverflow which is about the same issue.
> The fact that snapshot works for apache ignite but not in ultimate edition
> indicates there is some bug in the latter. Could you please confirm. We have around
> 15 caches with 2 backups. I changed backups to zero but still see this issue.
> Could you please advise further.
>
> https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster
Re: gridgain ultimate edition snapshot error
Hi,

I was going through this post on Stack Overflow, which is about the same issue. The fact that the snapshot works for Apache Ignite but not in the Ultimate Edition suggests there is some bug in the latter. Could you please confirm? We have around 15 caches with 2 backups. I changed backups to zero but still see this issue. Could you please advise further?

https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster

On Mon, Jun 6, 2022 at 9:13 PM Surinder Mehra wrote:
> Hi,
> I was experimenting with the GG ultimate edition to take snapshots and
> encountered the below error and cluster stops. Please note that this works
> in the ignite free version and we don't see too many files open error. Is
> this a bug or we are missing some configuration?
>
> version: gridgain-8.8.19
>
> /bin./snapshot-utility.sh snapshot -type=full
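A back-of-the-envelope estimate may help explain why setting backups to zero did not make the error go away. The sketch below rests on several assumptions not confirmed in this thread: every cache is partitioned with the default of 1024 partitions, each partition held by a node is one `part-N.bin` file (as in the path in the stack trace), and a snapshot may need all of them open at once. The node counts are hypothetical, and the estimate ignores index files, WAL segments, and sockets.

```python
import math


def estimate_partition_files(caches, partitions, nodes, backups):
    """Rough upper bound on partition files held by one node.

    Each partition exists as one primary plus `backups` copies, but a node
    stores at most one copy of a given partition, so the number of copies
    is capped by the number of nodes.
    """
    copies = min(backups + 1, nodes)
    per_cache = math.ceil(partitions * copies / nodes)
    return caches * per_cache


# 15 caches as in this thread; node counts are assumed for illustration.
print(estimate_partition_files(15, 1024, nodes=1, backups=2))  # 15360
print(estimate_partition_files(15, 1024, nodes=3, backups=0))  # 5130
```

Under these assumptions, even a three-node cluster with no backups could hold several thousand partition files per node, well above a soft limit of 1024, so the limit itself is the more likely culprit than the backup count.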
gridgain ultimate edition snapshot error
Hi,

I was experimenting with the GG Ultimate Edition to take snapshots and encountered the error below, after which the cluster stops. Please note that this works in the free Ignite version, where we don't see the "too many open files" error. Is this a bug, or are we missing some configuration?

version: gridgain-8.8.19

/bin./snapshot-utility.sh snapshot -type=full

[21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name/part-88.bin]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name/part-88.bin
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.nio.file.FileSystemException: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name/part-88.bin: Too many open files
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
    at java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
    at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
    at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
    at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
    at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
    ... 14 more
[21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] No deadlocked threads detected.
[21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] Thread dump at 2022/06/06 21:03:51 IST
Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
    Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356, ownerName=null, ownerId=-1]
    at java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
    at java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
    at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
    at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
    at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)