[ https://issues.apache.org/jira/browse/IGNITE-16264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474346#comment-17474346 ]
Pavel Pereslegin commented on IGNITE-16264:
-------------------------------------------

[~prom1se],
1. You can verify that "PageRestore" marks the page as dirty by applying this trick to a data partition file (part-N.bin, not index.bin).
2. Your cache (by default) has no indexes, so I think Ignite simply doesn't restore a snapshot of the metapage from the WAL in this case. But I'm not sure about that, so I'll investigate this case more thoroughly.

> [PageMemoryImpl] PageRestore does not mark the restored page as dirty.
> ----------------------------------------------------------------------
>
> Key: IGNITE-16264
> URL: https://issues.apache.org/jira/browse/IGNITE-16264
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Affects Versions: 2.11.1
> Reporter: Fedor Malchikov
> Assignee: Pavel Pereslegin
> Priority: Major
> Attachments: grid.1.node.2.log
>
>
> {code:java} > [15:53:09,313][INFO][main][GridCacheDatabaseSharedManager] Starting binary > memory restore for: [1587982887, 1586135659, 1586135625, 1588907152, > 1134507861, 1587059273, -2100569601, 374280889, 374280888, 374280887, > 374280886, 374280885, 374280884, > 1587059204][15:53:09,583][INFO][main][CheckpointMarkersStorage] Read > checkpoint status > [startMarker=/storage/ssd/prtagent/tiden/sow-220111-155046/test_iep_14/ignite.server.2/work/db/node_1_2/cp/1641905556201-40b99ac3-a6c5-48e6-956b-8b2b0a5804c9-START.bin, > > endMarker=/storage/ssd/prtagent/tiden/sow-220111-155046/test_iep_14/ignite.server.2/work/db/node_1_2/cp/1641905556201-40b99ac3-a6c5-48e6-956b-8b2b0a5804c9-END.bin][15:53:09,583][INFO][main][GridCacheDatabaseSharedManager] > Checking memory state [lastValidPos=FileWALPointer [idx=41, fileOff=734466, > len=52185], lastMarked=FileWALPointer [idx=41, fileOff=734466, len=52185], > lastCheckpointId=40b99ac3-a6c5-48e6-956b-8b2b0a5804c9][15:53:09,588][INFO][main][GridCacheDatabaseSharedManager] > Found last checkpoint marker [cpId=40b99ac3-a6c5-48e6-956b-8b2b0a5804c9, >
pos=FileWALPointer [idx=41, fileOff=734466, > len=52185]][15:53:09,590][INFO][main][GridCacheDatabaseSharedManager] Binary > memory state restored at node startup [restoredPtr=FileWALPointer [idx=41, > fileOff=786651, len=0]][15:53:09,593][INFO][main][FileWriteAheadLogManager] > Resuming logging to WAL segment > [file=/storage/ssd/prtagent/tiden/sow-220111-155046/test_iep_14/ignite.server.2/work/db/wal/node_1_2/0000000000000001.wal, > offset=786651, ver=2][15:53:09,611][INFO][main][PageMemoryImpl] Started page > memory [memoryAllocated=500.0 MiB, pages=124064, tableSize=9.7 MiB, > replacementSize=15.3 KiB, checkpointBuffer=256.0 > MiB][15:53:09,731][INFO][main][GridCacheProcessor] Started cache in recovery > mode [name=cache_group_3_088, id=1587982887, dataRegionName=Default_Region, > mode=PARTITIONED, atomicity=TRANSACTIONAL, backups=3, > mvcc=false][15:53:09,769][INFO][main][GridCacheProcessor] Started cache in > recovery mode [name=cache_group_1_028, id=1586135659, > dataRegionName=Default_Region, mode=PARTITIONED, atomicity=TRANSACTIONAL, > backups=3, mvcc=false][15:53:09,774][WARNING][main][PageMemoryImpl] Failed to > read page (data integrity violation encountered, will try to restore using > existing WAL) [fullPageId=FullPageId [pageId=0002ffff00000000, > effectivePageId=0000ffff00000000, grpId=1134507861]]class > org.apache.ignite.internal.processors.cache.persistence.wal.crc.IgniteDataIntegrityViolationException: > Failed to read page (CRC validation failed) [id=0002ffff00000000, off=0, > file=/storage/ssd/prtagent/tiden/sow-220111-155046/test_iep_14/ignite.server.2/work/db/node_1_2/cacheGroup-test_cache_group/index.bin, > fileSize=24576, savedCrc=93303030, curCrc=93e69aca, > 
page=0b0002000000000000000000ffff0200000000000000000000000000000000000000000000000000000000000000000002000000ffff020003000000ffff02000000000000000000000000000000000001000000000000000000000000000000050000000500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000] > at > org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:441) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68) > at > org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:582) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:904) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:723) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:704) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.getOrAllocateCacheMetas(GridCacheOffheapManager.java:1199) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.initDataStructures(GridCacheOffheapManager.java:193) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.start(IgniteCacheOffheapManagerImpl.java:199) > at > org.apache.ignite.internal.processors.cache.CacheGroupContext.start(CacheGroupContext.java:1130) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheGroup(GridCacheProcessor.java:2522) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$getOrCreateCacheGroupContext$17(GridCacheProcessor.java:2201) > at > org.apache.ignite.internal.util.InitializationProtector.protect(InitializationProtector.java:59) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.getOrCreateCacheGroupContext(GridCacheProcessor.java:2198) > at > 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheInRecoveryMode(GridCacheProcessor.java:2326) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.access$1700(GridCacheProcessor.java:226) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor$CacheRecoveryLifecycle.afterBinaryMemoryRestore(GridCacheProcessor.java:5382) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreBinaryMemory(GridCacheDatabaseSharedManager.java:1105) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.startMemoryRestore(GridCacheDatabaseSharedManager.java:1938) > at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1210) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1784) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1706) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1143) > at > org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1058) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:944) at > org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:843) at > org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:713) at > org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:682) at > org.apache.ignite.Ignition.start(Ignition.java:344) at > org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:290)[15:53:09,982][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=test_cache_001, id=-454739833, > group=test_cache_group, dataRegionName=Default_Region, mode=PARTITIONED, > atomicity=TRANSACTIONAL, backups=2, > mvcc=false][15:53:09,987][INFO][main][GridCacheProcessor] Started cache in > recovery mode [name=cache_group_3_075, id=1587982853, group=cache_group_3, > dataRegionName=Default_Region, mode=PARTITIONED, atomicity=TRANSACTIONAL, > 
backups=3, mvcc=false][15:53:09,990][INFO][main][GridCacheProcessor] Started > cache in recovery mode [name=cache_group_1_015, id=1586135625, > dataRegionName=Default_Region, mode=PARTITIONED, atomicity=TRANSACTIONAL, > backups=3, mvcc=false][15:53:09,993][INFO][main][GridCacheProcessor] Started > cache in recovery mode [name=cache_group_3_061, id=1587982818, > group=cache_group_1, dataRegionName=Default_Region, mode=PARTITIONED, > atomicity=TRANSACTIONAL, backups=3, > mvcc=false][15:53:09,997][INFO][main][GridCacheProcessor] Started cache in > recovery mode [name=cache_group_4_118, id=1588907152, > dataRegionName=Default_Region, mode=REPLICATED, atomicity=TRANSACTIONAL, > backups=2147483647, mvcc=false][15:53:10,000][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=cache_group_1_008, id=1586135597, > group=cache_group_2, dataRegionName=Default_Region, mode=PARTITIONED, > atomicity=TRANSACTIONAL, backups=3, > mvcc=false][15:53:10,003][INFO][main][GridCacheProcessor] Started cache in > recovery mode [name=test_cache_002, id=-454739832, group=test_cache_group, > dataRegionName=Default_Region, mode=PARTITIONED, atomicity=TRANSACTIONAL, > backups=2, mvcc=false][15:53:10,005][INFO][main][GridCacheProcessor] Started > cache in recovery mode [name=cache_group_3_068, id=1587982825, > group=cache_group_2, dataRegionName=Default_Region, mode=PARTITIONED, > atomicity=TRANSACTIONAL, backups=3, > mvcc=false][15:53:10,008][INFO][main][GridCacheProcessor] Started cache in > recovery mode [name=cache_group_2_058, id=1587059273, > dataRegionName=Default_Region, mode=REPLICATED, atomicity=TRANSACTIONAL, > backups=2147483647, mvcc=false][15:53:10,011][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=cache_group_2_038, id=1587059211, > group=cache_group_5, dataRegionName=Default_Region, mode=REPLICATED, > atomicity=TRANSACTIONAL, backups=2147483647, > mvcc=false][15:53:10,013][INFO][main][GridCacheProcessor] Started cache in > recovery mode 
[name=cache_group_4_098, id=1588906439, group=cache_group_5, > dataRegionName=Default_Region, mode=REPLICATED, atomicity=TRANSACTIONAL, > backups=2147483647, mvcc=false][15:53:10,015][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=ignite-sys-cache, id=-2100569601, > dataRegionName=sysMemPlc, mode=REPLICATED, atomicity=TRANSACTIONAL, > backups=2147483647, mvcc=false][15:53:10,019][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=cache_group_4_091, id=1588906432, > group=cache_group_4, dataRegionName=Default_Region, mode=REPLICATED, > atomicity=TRANSACTIONAL, backups=2147483647, > mvcc=false][15:53:10,022][INFO][main][GridCacheProcessor] Started cache in > recovery mode [name=cache_group_4_105, id=1588907118, group=cache_group_6, > dataRegionName=Default_Region, mode=REPLICATED, atomicity=TRANSACTIONAL, > backups=2147483647, mvcc=false][15:53:10,026][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=cache_group_2_031, id=1587059204, > dataRegionName=Default_Region, mode=REPLICATED, atomicity=TRANSACTIONAL, > backups=2147483647, mvcc=false][15:53:10,028][INFO][main][GridCacheProcessor] > Started cache in recovery mode [name=cache_group_2_045, id=1587059239, > group=cache_group_6, dataRegionName=Default_Region, mode=REPLICATED, > atomicity=TRANSACTIONAL, backups=2147483647, > mvcc=false][15:53:10,043][INFO][main][GridCacheDatabaseSharedManager] Binary > recovery performed in 729 > ms.[15:53:10,043][INFO][main][CheckpointMarkersStorage] Read checkpoint > status > [startMarker=/storage/ssd/prtagent/tiden/sow-220111-155046/test_iep_14/ignite.server.2/work/db/node_1_2/cp/1641905556201-40b99ac3-a6c5-48e6-956b-8b2b0a5804c9-START.bin, > > endMarker=/storage/ssd/prtagent/tiden/sow-220111-155046/test_iep_14/ignite.server.2/work/db/node_1_2/cp/1641905556201-40b99ac3-a6c5-48e6-956b-8b2b0a5804c9-END.bin][15:53:10,044][INFO][main][GridCacheDatabaseSharedManager] > Applying lost cache updates since last checkpoint 
record > [lastMarked=FileWALPointer [idx=41, fileOff=734466, len=52185], > lastCheckpointId=40b99ac3-a6c5-48e6-956b-8b2b0a5804c9][15:53:10,050][INFO][main][GridCacheDatabaseSharedManager] > Finished applying WAL changes [updatesApplied=0, time=0 > ms][15:53:10,051][INFO][main][GridCacheProcessor] Restoring partition state > for local groups.[15:53:10,709][INFO][main][GridCacheProcessor] Finished > restoring partition state for local groups [groupsProcessed=14, > partitionsProcessed=2740, time=661ms] {code}
> The page has been restored from the WAL, but it is not marked as dirty. As a result, the restored page is never written to disk, and the recovery process is repeated on the next start.
> STR:
> 1) Start the cluster, activate it, and upload data.
> 2) Stop the cluster.
> 3) Corrupt the zero page of index.bin (for example: printf "000" | dd of=\{work_path}/index.bin bs=1 seek=4100 count=3 conv=notrunc).
> 4) Start the cluster.
> Full logs are attached. Reproduced on GridGain 8.8.11.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
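
A minimal sketch of step 3 on a scratch file, to show why the log reports savedCrc=93303030 against curCrc=93e69aca. It rests on assumptions inferred from the log values rather than taken from Ignite's source: a 4096-byte file header, a CRC field stored as a little-endian 32-bit integer at bytes 4..7 of the page, and an original stored CRC equal to the reported curCrc. The names overwrite, read_crc, and simulate are hypothetical helpers, and the script never touches a real index.bin.

```python
import struct
import tempfile
from pathlib import Path

HEADER_SIZE = 4096         # assumption: file header length, implied by seek=4100
CRC_OFFSET = 4             # assumption: CRC field occupies bytes 4..7 of the page
FILE_SIZE = 24576          # fileSize reported in the log
ORIGINAL_CRC = 0x93E69ACA  # assumption: stored CRC before corruption (curCrc in the log)


def overwrite(path, offset, payload):
    """Overwrite bytes in place, like dd bs=1 seek=offset conv=notrunc."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)


def read_crc(path):
    """Read the assumed little-endian CRC field of the zero page."""
    with open(path, "rb") as f:
        f.seek(HEADER_SIZE + CRC_OFFSET)
        (crc,) = struct.unpack("<I", f.read(4))
    return crc


def simulate():
    """Build a zero-filled stand-in for index.bin, corrupt it, return (crc, size)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "index.bin"
        path.write_bytes(bytes(FILE_SIZE))
        # Store the original CRC where the page is assumed to keep it.
        overwrite(path, HEADER_SIZE + CRC_OFFSET, struct.pack("<I", ORIGINAL_CRC))
        # Same effect as: printf "000" | dd of=index.bin bs=1 seek=4100 count=3 conv=notrunc
        overwrite(path, 4100, b"000")
        return read_crc(path), path.stat().st_size


if __name__ == "__main__":
    crc, size = simulate()
    print(f"savedCrc={crc:08x}, fileSize={size}")
```

Under these assumptions the three ASCII "0" bytes (0x30) replace the low three bytes of the CRC field and leave the high byte 0x93 intact, which matches the savedCrc value in the log. The file size is unchanged because the write is done in place.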