[ https://issues.apache.org/jira/browse/IGNITE-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269554#comment-17269554 ]
Kirill Tkalenko edited comment on IGNITE-13912 at 1/21/21, 6:58 PM:
--------------------------------------------------------------------

Hi, [~shm]!

The last reserved segment is 11 (it is reserved because of the PME), and I cannot see its release, so no segment with an index greater than or equal to 11 can be deleted.
{noformat}
[2021-01-21T13:47:37,493][DEBUG][sys-#310][FileWriteAheadLogManager] Reserved WAL pointer: WALPointer [idx=11, fileOff=540780430, len=9572]
[2021-01-21T13:47:37,493][WARN ][sys-#310][FileWriteAheadLogManager] Reserved WAL stack
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.reserve(FileWriteAheadLogManager.java:1015) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
{noformat}
Here is an example of the last released segment; all segments after and before it were deleted.
{noformat}
[2021-01-21T13:47:37,497][DEBUG][sys-#310][FileWriteAheadLogManager] Released WAL pointer: WALPointer [idx=6, fileOff=576103521, len=9572]
{noformat}
You need to understand why the PME happened. Anyway, without a reproducer it is impossible to understand what the problem is.


was (Author: ktkale...@gridgain.com):
Hi, [~shm]!

The last reserved segment is 11 (it is reserved because of the PME), and I cannot see its release, so no segment with an index greater than or equal to 11 can be deleted.
{noformat}
[2021-01-21T13:47:37,493][DEBUG][sys-#310][FileWriteAheadLogManager] Reserved WAL pointer: WALPointer [idx=11, fileOff=540780430, len=9572]
[2021-01-21T13:47:37,493][WARN ][sys-#310][FileWriteAheadLogManager] Reserved WAL stack
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.reserve(FileWriteAheadLogManager.java:1015) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
{noformat}
Here is an example of the last released segment; all segments after it were deleted.
{noformat}
[2021-01-21T13:47:37,497][DEBUG][sys-#310][FileWriteAheadLogManager] Released WAL pointer: WALPointer [idx=6, fileOff=576103521, len=9572]
{noformat}
You need to understand why the PME happened. Anyway, without a reproducer it is impossible to understand what the problem is.

> Incorrect calculation of WAL segments that should be deleted from WAL archive
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-13912
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13912
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Kirill Tkalenko
>            Assignee: Kirill Tkalenko
>            Priority: Critical
>             Fix For: 2.10
>
>         Attachments: server1-full-wal-checkpoint.log, wal-checkpoint-logs,
> wal_dir_contents, wal_grows_from_peak.PNG, wal_issue_reproduced.PNG,
> wal_usage.PNG, wal_usage_dec12.PNG, wal_usage_dec22nd_binary.PNG
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the WAL segments that should be deleted from the WAL archive are
> calculated incorrectly. We delete only those segments whose total size does
> not exceed *DataStorageConfiguration#maxWalArchiveSize *
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*, whereas segments should be
> deleted until the archive size is down to
> *DataStorageConfiguration#maxWalArchiveSize *
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*. As a result,
> *DataStorageConfiguration#maxWalArchiveSize* is exceeded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
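
To make the two points in this thread concrete (the size target from the issue description and the effect of a reserved segment from the comment), here is a minimal Java sketch. It is not the Ignite implementation. The class WalArchiveCleanupSketch, the Segment holder, the segmentsToDelete method and all of its parameter names are hypothetical. It also assumes one reading of the description: archived segments are deleted oldest first until the archive size is down to maxWalArchiveSize * threshold, and deletion stops at the first segment whose index is greater than or equal to the lowest reserved index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Ignite API. */
class WalArchiveCleanupSketch {
    /** Hypothetical stand-in for an archived WAL segment. */
    static final class Segment {
        final long idx;       // absolute segment index
        final long sizeBytes; // size of the archived file

        Segment(long idx, long sizeBytes) {
            this.idx = idx;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Returns the indexes of segments to delete, oldest first.
     *
     * @param archive archived segments, assumed sorted by ascending index
     * @param maxWalArchiveSize configured archive limit in bytes
     * @param thresholdPercentage fraction of the limit to shrink the archive to
     * @param minReservedIdx lowest reserved index, or Long.MAX_VALUE if none
     */
    static List<Long> segmentsToDelete(
        List<Segment> archive,
        long maxWalArchiveSize,
        double thresholdPercentage,
        long minReservedIdx
    ) {
        // Size the archive should shrink to (assumption, see the text above).
        long target = (long)(maxWalArchiveSize * thresholdPercentage);

        long total = 0;
        for (Segment s : archive)
            total += s.sizeBytes;

        List<Long> toDelete = new ArrayList<>();

        for (Segment s : archive) {
            if (total <= target)
                break; // archive is already small enough

            if (s.idx >= minReservedIdx)
                break; // this segment and all later ones are held by a reservation

            toDelete.add(s.idx);
            total -= s.sizeBytes;
        }

        return toDelete;
    }

    public static void main(String[] args) {
        List<Segment> archive = new ArrayList<>();

        // Segments 5..12, 100 bytes each (made-up numbers).
        for (long i = 5; i <= 12; i++)
            archive.add(new Segment(i, 100));

        // Target is 500 bytes: deleting the three oldest segments is enough.
        System.out.println(segmentsToDelete(archive, 1000, 0.5, 11));

        // Target is 100 bytes: deletion stops at reserved segment 11,
        // so the archive stays above the target.
        System.out.println(segmentsToDelete(archive, 200, 0.5, 11));
    }
}
```

In the second call the reservation at index 11 is the limiting factor, which mirrors the situation in the comment: as long as segment 11 is not released, the archive cannot shrink below the segments from 11 onward, whatever the size target is.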