[ https://issues.apache.org/jira/browse/IGNITE-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506087#comment-16506087 ]

ASF GitHub Bot commented on IGNITE-8529:
----------------------------------------

GitHub user alex-plekhanov opened a pull request:

    https://github.com/apache/ignite/pull/4159

    IGNITE-8529 Implement testing framework for checking WAL delta records consistency

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alex-plekhanov/ignite ignite-8529

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/4159.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4159
    
----
commit a5c142daf7c46a354d5417dac7cf7c3c79a9488b
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-07T10:39:52Z

    IGNITE-8529 Draft 3 WIP

commit 0ddd4d82c3625e45f21650267685bd2020997cb1
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-07T12:25:42Z

    IGNITE-8529 Draft 3 WIP

commit ada909a74d5b000ac741c07421da7f5bcc955023
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-07T16:46:33Z

    IGNITE-8529 Draft 3 WIP

commit 3f570c578b4946c6d599e9efbabf6260a45bce50
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-07T16:51:08Z

    IGNITE-8529 Draft 3 WIP

commit 883acf9447c2619799f6078523504082ada4dc21
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-07T21:36:02Z

    IGNITE-8529 Draft 2 WIP

commit 7cb3d90ff758e42ef7d876d17cb4d597fb0ee240
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-08T07:46:42Z

    IGNITE-8529 Draft 3 WIP

commit 41d2dc6a44c3a3775254f9d68595e04ba4198e98
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-08T10:43:18Z

    IGNITE-8529 Implement testing framework for checking WAL delta records consistency

commit 4678f6a6b4c7a5922063f2118bb4810f5e2b6d12
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-08T12:52:01Z

    IGNITE-8529 Made page memory reusable after cache destroy.

commit c64719bf6be1562b0ad8f660eecf780cafca4334
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-08T14:23:14Z

    IGNITE-8529 Made page memory reusable after cache destroy (fix).

commit 755cae5c68ef472a56871a891095721aebe60ff0
Author: Aleksey Plekhanov <plehanov.alex@...>
Date:   2018-06-08T14:32:47Z

    IGNITE-8529 Cleanup

----


> Implement testing framework for checking WAL delta records consistency
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-8529
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8529
>             Project: Ignite
>          Issue Type: New Feature
>          Components: persistence
>            Reporter: Ivan Rakov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>             Fix For: 2.6
>
>
> We use sharp checkpointing of page memory in persistent mode. That implies
> that we write two types of records to the write-ahead log: logical (e.g. data
> records) and physical (page snapshots + binary delta records). Physical
> records are applied only when a node crashes or stops during an ongoing
> checkpoint. We have the following invariant: checkpoint #(n-1) + all
> physical records = checkpoint #n.
> If the correctness of physical records is broken, an Ignite node may recover
> with an incorrect page memory state, which in turn can cause unexpected
> delayed errors. However, the consistency of physical records is poorly
> tested: only a small part of our autotests perform node restarts, and even
> fewer of them stop a node while a checkpoint is in progress.
> We should implement an abstract test that:
> 1. Forces a checkpoint and freezes the memory state at the moment of the
> checkpoint.
> 2. Performs the necessary test load.
> 3. Forces a checkpoint again, replays the WAL and checks that the page store
> at the moment of the previous checkpoint, with all physical records applied,
> exactly equals the current checkpoint state.
> Besides checking correctness, the test framework should do the following:
> 1. Gather statistics (such as a histogram) on the types of written physical
> records. That will help us know which types of physical records are covered
> by the test.
> 2. Visualize the expected and actual page state (with all physical records
> applied) if an incorrect page state is detected.
> Regarding implementation, I suppose we can use the checkpoint listener
> mechanism to freeze the page memory state at the moment of a checkpoint.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
