Re: gridgain ultimate edition snapshot error

2022-06-07 Thread Gianluca Bonetti
Hello there

Yes, it's not a bug.
The open files limit is still at the default (soft limit 1024) and needs to be raised.
You could add a ulimit call to your start script.
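
A minimal sketch of such a ulimit step, assuming the start script is bash. The function name raise_nofile, the value 4096 and the ignite.sh path in the usage comment are illustrative assumptions, not taken from GridGain's scripts:

```shell
# raise_nofile TARGET
# Raise the soft open-files limit of the current shell (and of every process
# it starts afterwards) to TARGET, capped at the current hard limit.
raise_nofile() {
    target="$1"
    hard="$(ulimit -Hn)"
    if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
        echo "requested $target exceeds hard limit $hard; using $hard" >&2
        target="$hard"
    fi
    ulimit -Sn "$target"
}

# Hypothetical usage near the top of a start script:
#   raise_nofile 4096 && exec "$IGNITE_HOME/bin/ignite.sh"
```

A non-root process can raise its soft limit only up to the hard limit, so going beyond the hard limit needs that limit changed first, for example by an administrator.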

Most likely it worked for you on Ignite but not on GridGain because you are
running Ignite on another machine (a testing VM?) where you have fewer caches
and hence fewer files.
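
To check that explanation on each machine, one could count the partition files under the node's work directory. This is only a rough sketch: it assumes the work/db/<node>/<cache>/part-N.bin layout seen in the stack trace in this thread, and the function name is made up for illustration:

```shell
# count_partition_files DIR
# Print how many partition files (part-N.bin) exist anywhere below DIR.
count_partition_files() {
    find "$1" -type f -name 'part-*.bin' | wc -l
}

# Hypothetical usage, with the work directory from the error message:
#   count_partition_files /home/usr/tools/gridgain-ultimate-8.8.19/work/db
```

A count well above the soft limit would be consistent with the error, although not every partition file is necessarily open at the same time.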

Cheers
Gianluca


Re: gridgain ultimate edition snapshot error

2022-06-07 Thread Surinder Mehra
Hi,
Thanks for your reply. The current limits are below, with the open files
limit highlighted. As suggested in the previous reply, I will change the
limits and try again.

Limit                     Soft Limit   Hard Limit   Units
Max cpu time              unlimited    unlimited    seconds
Max file size             unlimited    unlimited    bytes
Max data size             unlimited    unlimited    bytes
Max stack size            8388608      unlimited    bytes
Max core file size        unlimited    unlimited    bytes
Max resident set          unlimited    unlimited    bytes
Max processes             63306        63306        processes
*Max open files           1024         4096         files*
Max locked memory         65536        65536        bytes
Max address space         unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max pending signals       63306        63306        signals
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us


Re: gridgain ultimate edition snapshot error

2022-06-07 Thread Gianluca Bonetti
Hello

What is returned by this command?

# cat /proc/PID/limits
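
For reference, a small sketch that pulls just the open-files row out of that output and counts the descriptors a process currently holds. It assumes Linux with /proc mounted and a bash shell; both function names are illustrative:

```shell
# open_files_limits FILE
# Print "<soft> <hard>" from the "Max open files" row of a
# /proc/<pid>/limits-style file.
open_files_limits() {
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# open_fd_count PID
# Print the number of file descriptors the process has open right now.
open_fd_count() {
    ls "/proc/$1/fd" | wc -l
}

# Hypothetical usage (replace PID with the node's process id):
#   open_files_limits /proc/PID/limits
#   open_fd_count PID
```

If the open descriptor count is close to the soft limit while the node is running normally, a snapshot that touches many partition files is likely to hit the limit.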

Cheers
Gianluca

Re[2]: gridgain ultimate edition snapshot error

2022-06-07 Thread Zhenya Stanilovsky

Hi, you need to change the limits [1].

[1] https://www.gridgain.com/docs/latest/perf-troubleshooting-guide/general-perf-tips#ulimits
  

Re: gridgain ultimate edition snapshot error

2022-06-06 Thread Surinder Mehra
Hi,
I was going through this post on Stack Overflow, which is about the same
issue. The fact that the snapshot works on Apache Ignite but not on the
Ultimate Edition suggests there is a bug in the latter. Could you please
confirm? We have around 15 caches with 2 backups. I changed backups to zero
but still see this issue. Could you please advise further?

https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster


gridgain ultimate edition snapshot error

2022-06-06 Thread Surinder Mehra
Hi,
I was experimenting with the GridGain Ultimate Edition to take snapshots and
encountered the error below, after which the cluster stops. Please note that
this works in the free Ignite version, where we don't see the "too many open
files" error. Is this a bug, or are we missing some configuration?

version:  gridgain-8.8.19

/bin./snapshot-utility.sh snapshot -type=full

[21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical system
error detected. Will be handled accordingly to configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize
partition file:
/home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name/part-88.bin]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name/part-88.bin
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.nio.file.FileSystemException: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name/part-88.bin: Too many open files
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
    at java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
    at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
    at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
    at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
    at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
    ... 14 more
[21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] No deadlocked threads detected.
[21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] Thread dump at 2022/06/06 21:03:51 IST
Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
    Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356, ownerName=null, ownerId=-1]
        at java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
        at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
        at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
        at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)