Hi,

Is this related to the issue seen in IGNITE-13912?
I hit IGNITE-13912 when I was using the Ignite 2.9 release. I have yet to try my use case with the fix provided as part of IGNITE-13912.

Regards,
Vishwas

On Tue, 26 Jan, 2021, 21:18 ткаленко кирилл, <tkalkir...@yandex.ru> wrote:

> Hello, everyone!
>
> Currently, property DataStorageConfiguration#maxWalArchiveSize is not
> working as expected by users. We can easily go beyond this limit and
> overflow the disk, which will lead to errors and a crash of the node. I
> propose to fix this behavior and not let the WAL archive overflow.
>
> It is suggested not to add segments to the archive if we can exceed
> DataStorageConfiguration#maxWalArchiveSize, and to wait until space
> becomes available.
>
> Thus, we may have a deadlock:
> Get checkpointReadLock -> write to WAL -> need to roll over WAL segment ->
> need to clean WAL archive -> need to complete checkpoint (impossible
> because checkpointReadLock is taken).
>
> To avoid such situations, I suggest adding a custom heuristic - do not
> give an IgniteCacheDatabaseSharedManager#checkpointReadLock if there are
> few (default 1) segments left.
> But this will not allow us to completely avoid archive overflow
> situations. Therefore, I suggest failing the node by FH when a deadlock
> is detected, since the outcome would be the same if there were no disk
> space left.
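
For reference, the heuristic proposed in the quoted message could be modelled roughly as below. This is only an illustrative sketch, not Ignite code: the class WalArchiveGate and all of its methods are hypothetical, the archive limit is counted in segments rather than bytes for simplicity, and a plain ReentrantReadWriteLock stands in for the checkpoint lock. The idea is to refuse the checkpoint read lock while only a few free archive segments remain, so that a checkpoint can still complete and the archive can be cleaned.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical model of the proposed heuristic; not part of Apache Ignite. */
final class WalArchiveGate {
    /** Stand-in for maxWalArchiveSize, expressed in segments (assumption). */
    private final long maxArchiveSegments;

    /** The "few segments left" threshold; the proposal suggests a default of 1. */
    private final long minFreeSegments;

    /** Number of segments currently held in the archive. */
    private long archivedSegments;

    /** Stand-in for the checkpoint lock. */
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    WalArchiveGate(long maxArchiveSegments, long minFreeSegments) {
        this.maxArchiveSegments = maxArchiveSegments;
        this.minFreeSegments = minFreeSegments;
    }

    /** Segments that can still be archived before the limit is reached. */
    synchronized long freeSegments() {
        return maxArchiveSegments - archivedSegments;
    }

    /** Grants the read lock only while more than minFreeSegments remain free. */
    synchronized boolean tryCheckpointReadLock() {
        if (freeSegments() <= minFreeSegments)
            return false; // Too close to the limit: let the checkpoint finish first.

        return checkpointLock.readLock().tryLock();
    }

    /** Releases a read lock previously granted by tryCheckpointReadLock(). */
    void checkpointReadUnlock() {
        checkpointLock.readLock().unlock();
    }

    /** Archives one segment, or returns false if that would exceed the limit. */
    synchronized boolean archiveSegment() {
        if (archivedSegments >= maxArchiveSegments)
            return false;

        archivedSegments++;

        return true;
    }

    /** Models archive cleanup after a completed checkpoint. */
    synchronized void cleanArchive(long segments) {
        archivedSegments = Math.max(0, archivedSegments - segments);
    }
}
```

With a limit of 3 segments and a threshold of 1, the lock is granted while the archive is empty, refused once 2 segments are archived, and granted again after cleanup. As the quoted message notes, such a gate only reduces the chance of the deadlock; it does not remove it.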