[ https://issues.apache.org/jira/browse/IGNITE-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269478#comment-17269478 ]
shivakumar edited comment on IGNITE-13912 at 1/21/21, 5:43 PM:
---------------------------------------------------------------

Hi [~ktkale...@gridgain.com]

I have now attached the log file (server1-full-wal-checkpoint.log), which is filtered with

cat complete_server.log | grep -i -e "wal" -e "checkpoint" | grep -v "db-checkpoint-thread"

Not sure if this is going to help or not.

Below is the stack trace which you added for debugging:

[2021-01-21T13:47:37,493][WARN ][sys-#310][FileWriteAheadLogManager] Reserved WAL stack
java.lang.Exception: null
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.reserve(FileWriteAheadLogManager.java:1015) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.reserveHistoryForPreloading(GridCacheDatabaseSharedManager.java:1716) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:2555) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:4073) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3840) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:3339) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:3126) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:3114) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceiveSingleMessage(GridDhtPartitionsExchangeFuture.java:3114) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processSinglePartitionUpdate(GridCachePartitionExchangeManager.java:2069) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$2.onMessage(GridCachePartitionExchangeManager.java:452) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$2.onMessage(GridCachePartitionExchangeManager.java:420) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:3845) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:3824) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1908) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1529) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1422) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
[2021-01-21T13:47:37,496][DEBUG][sys-#310][FileWriteAheadLogManager] Min reserved WAL segment: 11
[2021-01-21T13:47:37,496][DEBUG][wal-file-cleaner%null-#82][FileWriteAheadLogManager] Finish await available truncate for WAL clean: 4
[2021-01-21T13:47:37,497][INFO ][wal-file-cleaner%null-#82][FileWriteAheadLogManager] Starting to clean WAL archive [highIdx=7, currSize=5.6 GB, maxSize=10.0 GB]
[2021-01-21T13:47:37,497][DEBUG][sys-#310][FileWriteAheadLogManager] Released WAL pointer: WALPointer [idx=6, fileOff=576103521, len=9572]
[2021-01-21T13:47:37,499][DEBUG][sys-#310][GridDhtPartitionTopologyImpl] Updated rebalanced version [grp=groupEternal, ver=AffinityTopologyVersion [topVer=10, minorTopVer=0]]
[2021-01-21T13:47:37,499][DEBUG][sys-#310][GridCachePartitionExchangeManager] Exchange done [topVer=AffinityTopologyVersion [topVer=10, minorTopVer=0], err=null]


was (Author: shm):
Hi [~ktkale...@gridgain.com]

I have now attached the log file (server1-full-wal-checkpoint.log), which is filtered with

cat complete_server.log | grep -i -e "wal" -e "checkpoint" | grep -v "db-checkpoint-thread"

Not sure if this is going to help or not.

> Incorrect calculation of WAL segments that should be deleted from WAL archive
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-13912
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13912
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Kirill Tkalenko
>            Assignee: Kirill Tkalenko
>            Priority: Critical
>             Fix For: 2.10
>
>         Attachments: server1-full-wal-checkpoint.log, wal-checkpoint-logs,
> wal_dir_contents, wal_grows_from_peak.PNG, wal_issue_reproduced.PNG,
> wal_usage.PNG, wal_usage_dec12.PNG, wal_usage_dec22nd_binary.PNG
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the WAL segments that should be deleted from the WAL archive are
> calculated incorrectly. We delete only those segments whose total size does
> not exceed *DataStorageConfiguration#maxWalArchiveSize *
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*, but segments should be deleted
> up to *DataStorageConfiguration#maxWalArchiveSize *
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*. As a result,
> *DataStorageConfiguration#maxWalArchiveSize* is exceeded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
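
For readers following the issue description, here is a minimal sketch of one possible reading of the intended calculation: delete the oldest archived segments until the remaining archive size is no larger than maxWalArchiveSize * IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE. This is not Ignite code. The class WalArchiveCleanupSketch, the method segmentsToDelete and its parameters are hypothetical names made up for illustration, and the sketch ignores reserved segments (such as the "Min reserved WAL segment" seen in the log), which a real cleaner would have to respect.

```java
/** Hypothetical illustration only; not part of Apache Ignite. */
final class WalArchiveCleanupSketch {
    /**
     * Counts how many of the oldest segments would have to be deleted so that
     * the remaining archive size is at most maxWalArchiveSize * thresholdPercentage.
     *
     * @param segmentSizes Sizes of the archived segments, oldest first (assumed input).
     * @param maxWalArchiveSize Configured maximum archive size, same unit as the segment sizes.
     * @param thresholdPercentage Fraction of the maximum the archive is cleaned down to (assumed meaning).
     * @return Number of oldest segments to delete.
     */
    static int segmentsToDelete(long[] segmentSizes, long maxWalArchiveSize, double thresholdPercentage) {
        // Size the archive should not exceed once cleaning is done.
        long target = (long)(maxWalArchiveSize * thresholdPercentage);

        // Current total size of the archive.
        long total = 0;

        for (long size : segmentSizes)
            total += size;

        int cnt = 0;

        // Drop the oldest segments one by one until the archive fits the target.
        while (cnt < segmentSizes.length && total > target) {
            total -= segmentSizes[cnt];

            cnt++;
        }

        return cnt;
    }
}
```

For example, with four segments of 4 units each, a maximum of 10 and a threshold of 0.5 (an illustrative value), the target is 5 units and the sketch reports 3 segments to delete, which leaves 4 units in the archive.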