[ https://issues.apache.org/jira/browse/IGNITE-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Goncharuk updated IGNITE-8797:
-------------------------------------
    Labels: MakeTeamcityGreenAgain  (was: )

> Error during writeCheckpointEntry is not passed to failure handler during checkpoint finish
> -------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8797
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8797
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexey Goncharuk
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>
> I observed the following failure in Cache 3 suite:
> {code}
> [13:10:55]W: [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138910%paged.PageEvictionMultinodeMixedRegionsTest2%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> [13:10:55]W: [org.apache.ignite:ignite-core] class org.apache.ignite.internal.processors.cache.persistence.file.PersistentStorageIOException: Failed to write checkpoint entry [ptr=FileWALPointer [idx=0, fileOff=219747, len=1947], cpTs=1528971054548, cpId=d8b42759-ca5e-4613-b091-ed0356b3915d, type=END]
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2757)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.access$8100(GridCacheDatabaseSharedManager.java:178)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3716)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3277)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3053)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> [13:10:55]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748)
> [13:10:55]W: [org.apache.ignite:ignite-core] Caused by: java.nio.file.NoSuchFileException: /data/teamcity/work/c182b70f2dfa6507/work/db/node03-c5dcc243-fc3c-4b2f-8002-81e88d8cff7d/cp/1528971054548-d8b42759-ca5e-4613-b091-ed0356b3915d-END.bin.tmp
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> [13:10:55]W: [org.apache.ignite:ignite-core] at java.nio.file.Files.move(Files.java:1395)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2752)
> [13:10:55]W: [org.apache.ignite:ignite-core] ... 6 more
> [13:10:55]W: [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138914%paged.PageEvictionMultinodeMixedRegionsTest3%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> {code}
> I see two issues here:
> 1) Some concurrent process is removing the work folder which results in the exception above
> 2) The checkpoint exception is not passed to the failure handler.
> This is due to a catch block marked {{// TODO-ignite-db how to handle exception?}} in {{Checkpointer}}, which leaves the checkpoint future uncompleted.
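
To make the second issue concrete, here is a minimal standalone sketch of the two catch-block shapes. It is not Ignite's code: the class {{CheckpointFailureSketch}}, its methods, and the {{Consumer<Throwable>}} standing in for a failure handler are all hypothetical names chosen for illustration. The missing temp file simulates the concurrently removed work folder from the stack trace.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names are Ignite's actual API. */
final class CheckpointFailureSketch {
    /** Moves the temporary marker file into place, similar to the Files.move call in the trace. */
    static void writeEndMarker(Path tmp, Path target) throws IOException {
        Files.move(tmp, target);
    }

    /** Problematic shape: the error is only logged, so the future is never completed. */
    static void finishSwallowing(Path tmp, Path target, CompletableFuture<Void> cpFut) {
        try {
            writeEndMarker(tmp, target);
            cpFut.complete(null);
        }
        catch (IOException e) {
            // Corresponds to the "how to handle exception?" TODO: log and move on.
            System.err.println("Failed to create checkpoint: " + e);
        }
    }

    /** One possible alternative: fail the future and notify a failure handler. */
    static void finishPropagating(Path tmp, Path target, CompletableFuture<Void> cpFut,
        Consumer<Throwable> failureHnd) {
        try {
            writeEndMarker(tmp, target);
            cpFut.complete(null);
        }
        catch (IOException e) {
            cpFut.completeExceptionally(e); // Waiters on the future see the error.
            failureHnd.accept(e);           // Stand-in for a node-level failure handler.
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cp-sketch");
        Path missingTmp = dir.resolve("END.bin.tmp"); // Never created: simulates concurrent removal.
        Path target = dir.resolve("END.bin");

        CompletableFuture<Void> swallowed = new CompletableFuture<>();
        finishSwallowing(missingTmp, target, swallowed);
        System.out.println("swallowing variant: done=" + swallowed.isDone());

        CompletableFuture<Void> propagated = new CompletableFuture<>();
        finishPropagating(missingTmp, target, propagated,
            e -> System.out.println("failure handler got " + e.getClass().getSimpleName()));
        System.out.println("propagating variant: failed=" + propagated.isCompletedExceptionally());
    }
}
```

In the first variant, anything waiting on the future never learns that the checkpoint failed. Whether the actual fix should complete the future exceptionally, invoke the failure handler, or both is a design decision for the ticket; the sketch only shows the difference in observable behavior.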