[ 
https://issues.apache.org/jira/browse/IGNITE-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518097#comment-16518097
 ] 

ASF GitHub Bot commented on IGNITE-8797:
----------------------------------------

GitHub user alex-plekhanov opened a pull request:

    https://github.com/apache/ignite/pull/4229

    IGNITE-8797 cleanPersistenceDir() before stopAllGrids()

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alex-plekhanov/ignite ignite-8797

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/4229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4229
    
----
commit d36cf94ba4bdf3f7f5bc27dce18a6a915fda55ac
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-20T12:35:08Z

    IGNITE-8797 cleanPersistenceDir() before stopAllGrids()

----


> Error during writeCheckpointEntry is not passed to failure handler during checkpoint finish
> -------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8797
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8797
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexey Goncharuk
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>
> I observed the following failure in Cache 3 suite:
> {code}
> [13:10:55]W:           [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138910%paged.PageEvictionMultinodeMixedRegionsTest2%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> [13:10:55]W:           [org.apache.ignite:ignite-core] class org.apache.ignite.internal.processors.cache.persistence.file.PersistentStorageIOException: Failed to write checkpoint entry [ptr=FileWALPointer [idx=0, fileOff=219747, len=1947], cpTs=1528971054548, cpId=d8b42759-ca5e-4613-b091-ed0356b3915d, type=END]
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2757)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.access$8100(GridCacheDatabaseSharedManager.java:178)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3716)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3277)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3053)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at java.lang.Thread.run(Thread.java:748)
> [13:10:55]W:           [org.apache.ignite:ignite-core] Caused by: java.nio.file.NoSuchFileException: /data/teamcity/work/c182b70f2dfa6507/work/db/node03-c5dcc243-fc3c-4b2f-8002-81e88d8cff7d/cp/1528971054548-d8b42759-ca5e-4613-b091-ed0356b3915d-END.bin.tmp
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at java.nio.file.Files.move(Files.java:1395)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2752)
> [13:10:55]W:           [org.apache.ignite:ignite-core]        ... 6 more
> [13:10:55]W:           [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138914%paged.PageEvictionMultinodeMixedRegionsTest3%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> {code}
> I see two issues here:
> 1) Some concurrent process removes the work folder, which results in the 
> exception above.
> 2) The checkpoint exception is not passed to the failure handler. This is due 
> to a catch block marked {{// TODO-ignite-db how to handle exception?}} in 
> {{Checkpointer}}, which leaves the checkpoint future uncompleted.
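
For illustration only, the two issues described above can be reproduced in a minimal, self-contained sketch. This is not Ignite code: the class CheckpointFailureSketch, the methods writeEntry and checkpoint, the "propagate" flag and the Consumer used as a stand-in failure handler are all hypothetical names chosen for this example. It only assumes that the checkpoint entry is written to a .tmp file and then moved into place with Files.move, as the stack trace suggests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

public class CheckpointFailureSketch {
    /** Writes an entry through a temp file followed by a move (hypothetical helper). */
    static void writeEntry(Path cpDir, String name, boolean simulateRace) throws IOException {
        Path tmp = cpDir.resolve(name + ".tmp");
        Files.write(tmp, new byte[] {1});

        if (simulateRace) {
            // Stand-in for a concurrent cleanup that removes the work folder.
            Files.delete(tmp);
            Files.delete(cpDir);
        }

        // Fails with NoSuchFileException if the temp file is gone.
        Files.move(tmp, cpDir.resolve(name));
    }

    /**
     * Runs one "checkpoint". With propagate == false the IOException is
     * swallowed and the returned future never completes (issue 2).
     * With propagate == true the handler is notified and the future fails.
     */
    static CompletableFuture<Void> checkpoint(Path cpDir, boolean simulateRace,
        boolean propagate, Consumer<Throwable> failureHnd) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        try {
            writeEntry(cpDir, "cp-END.bin", simulateRace);
            fut.complete(null);
        }
        catch (IOException e) {
            if (propagate) {
                failureHnd.accept(e);
                fut.completeExceptionally(e);
            }
            // Otherwise: exception dropped, anyone waiting on fut hangs.
        }

        return fut;
    }

    public static void main(String[] args) throws IOException {
        CompletableFuture<Void> swallowed =
            checkpoint(Files.createTempDirectory("cp"), true, false, t -> { });
        System.out.println("swallowed: done=" + swallowed.isDone());

        CompletableFuture<Void> propagated =
            checkpoint(Files.createTempDirectory("cp"), true, true,
                t -> System.out.println("handler got: " + t));
        System.out.println("propagated: failed=" + propagated.isCompletedExceptionally());
    }
}
```

In the swallowed case a caller blocking on the future would wait forever, which is why passing the error to a handler and failing the future is the safer shape. How Ignite actually wires this into its failure handler is not shown here.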



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
