[ 
https://issues.apache.org/jira/browse/IGNITE-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-19904:
--------------------------------------
    Attachment: failure2.16_with_thread_dump.log

> Assertion in defragmentation
> ----------------------------
>
>                 Key: IGNITE-19904
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19904
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.12
>            Reporter: Vladimir Steshin
>            Priority: Major
>              Labels: ise
>         Attachments: default-config.xml, failure2.16_with_thread_dump.log, 
> ignite.log, ignite_wierd_other_failureNPE.log, jvm.opts
>
>
> Defragmentation fails with:
> {code:java}
> java.lang.AssertionError: Invalid state. Type is 0! pageId = 0001000d00024cbf
>               at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.copyPageForCheckpoint(PageMemoryImpl.java:1359)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
>               at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1277)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
>               at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
>               at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> {code}
> It is difficult to write a test, and I can't reproduce it on my computers :(. 
> It appears flakily on a server (4 core x 4 cpu) with 100G of test cache data 
> and a million+ pages to checkpoint during defragmentation. It occurs more 
> often with pageSize 1024 (which produces more pages).
> Based on my diagnostic build, I suppose that a fresh, empty page is caught 
> in defragmentation. Here is a page dump with a test-extended PAGE_OVERHEAD 
> (=64) and the same error raised a bit before copyPageForCheckpoint():
> {code:java}
> org.apache.ignite.IgniteException: Wrong page type in checkpointWritePage1. 
> Page: Data region = 'defragPartitionsDataRegion'.
>  FullPageId [pageId=281878703760205, effectivePageId=403727049549, 
> grpId=-1368047378].
>  PageDump = page_id: 281878703760205, rel_id: 48603, cache_id: -1368047378, 
> pin: 0, lock: 65536, tmp_buf: 72057594037927935, test_val: 1. data_hex: 
> 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> 00000000000000000000000000000000000000000000000000
>               at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1240)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
>               at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
>               at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150)
>  ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> {code}
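
The two IDs in the dump above differ by exactly 2^48 (281878703760205 - 403727049549 = 281474976710656), so effectivePageId appears to be the low 48 bits of pageId. Below is a minimal sketch of splitting such an ID into fields. It assumes a layout of a 32-bit page index, a 16-bit partition ID and an 8-bit flag, which is how I recall Ignite's PageIdUtils packing them; the report itself does not state the layout. The class and method names are hypothetical and are not part of Ignite.

```java
// Hypothetical helper for reading page IDs from the logs; not Ignite code.
final class PageIdDecode {
    // Assumed field widths (not confirmed by the report).
    static final int PAGE_IDX_BITS = 32;
    static final int PART_ID_BITS = 16;
    static final int FLAG_BITS = 8;

    /** Low 32 bits: index of the page inside the partition. */
    static long pageIndex(long pageId) {
        return pageId & ((1L << PAGE_IDX_BITS) - 1);
    }

    /** Next 16 bits: partition ID. */
    static int partId(long pageId) {
        return (int)((pageId >>> PAGE_IDX_BITS) & ((1L << PART_ID_BITS) - 1));
    }

    /** Next 8 bits: page flag. */
    static int flag(long pageId) {
        return (int)((pageId >>> (PAGE_IDX_BITS + PART_ID_BITS)) & ((1L << FLAG_BITS) - 1));
    }

    /** Low 48 bits: matches the effectivePageId value printed in the dump. */
    static long effectivePageId(long pageId) {
        return pageId & ((1L << (PAGE_IDX_BITS + PART_ID_BITS)) - 1);
    }

    static String describe(long pageId) {
        return String.format("pageId=%016x flag=%d partId=%d pageIdx=%d effectivePageId=%d",
            pageId, flag(pageId), partId(pageId), pageIndex(pageId), effectivePageId(pageId));
    }

    public static void main(String[] args) {
        // Decimal ID from the page dump.
        System.out.println(describe(281878703760205L));
        // Hex ID from the AssertionError message.
        System.out.println(describe(0x0001000d00024cbfL));
    }
}
```

Under that assumed layout, the dumped page decodes to flag 1, partition 94, page index 123725, and the page from the AssertionError to flag 1, partition 13, page index 150719.
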
> 'test_val' is my diagnostic extension of the page prefix. Various values are 
> written to it wherever tmp_buf is assigned (by `PageHeader::tempBufferPointer()`). 
> The value '1' comes from PageHeader::initNew(long absPtr, long relative):
> {code:java}
>     // In PageHeader:
>     public static void initNew(long absPtr, long relative) {
>         relative(absPtr, relative);
>         tempBufferPointer(absPtr, PageMemoryImpl.INVALID_REL_PTR);
>         GridUnsafe.putLong(absPtr, PAGE_MARKER);
>         GridUnsafe.putInt(absPtr + PAGE_PIN_CNT_OFFSET, 0);
>         // For diagnostic purposes.
>         PageUtils.setTestTmpValue(absPtr, 1);
>     }
>     
>     // In PageUtils:
>     public static void setTestTmpValue(long absPtr, int val) {
>         // Written into the last 4 bytes of the page overhead.
>         putInt(absPtr, PAGE_OVERHEAD - 4, val);
>     }
>     
>     // In PageMemoryImpl:
>     public static final int PAGE_OVERHEAD = 64;
> {code}
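
The diagnostic marker described above can be modeled outside Ignite. The following is a rough sketch only: it uses a heap ByteBuffer in place of the off-heap page, takes PAGE_OVERHEAD = 64 and pageSize 1024 from the report, and all class and method names are hypothetical stand-ins rather than the reporter's actual patch. It shows why a dump with test_val = 1 and an all-zero payload is consistent with a page that was initialized but never written.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model of the diagnostic marker; not Ignite code.
final class DiagMarkerSketch {
    static final int PAGE_OVERHEAD = 64;               // value from the report
    static final int PAGE_SIZE = 1024;                 // pageSize from the report
    static final int TEST_VAL_OFF = PAGE_OVERHEAD - 4; // last 4 bytes of the overhead

    /** Allocates a zero-filled buffer holding the overhead plus the page payload. */
    static ByteBuffer allocate() {
        return ByteBuffer.allocate(PAGE_OVERHEAD + PAGE_SIZE).order(ByteOrder.nativeOrder());
    }

    /** Stand-in for the diagnostic write: stores the marker inside the overhead. */
    static void setTestTmpValue(ByteBuffer page, int val) {
        page.putInt(TEST_VAL_OFF, val);
    }

    static int testTmpValue(ByteBuffer page) {
        return page.getInt(TEST_VAL_OFF);
    }

    /** Stand-in for the header initialization: only the marker write is modeled. */
    static void initNew(ByteBuffer page) {
        setTestTmpValue(page, 1);
    }

    /** True when every payload byte after the overhead is still zero. */
    static boolean payloadIsZero(ByteBuffer page) {
        for (int i = PAGE_OVERHEAD; i < page.capacity(); i++) {
            if (page.get(i) != 0)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ByteBuffer page = allocate();
        initNew(page);
        System.out.println("test_val=" + testTmpValue(page) + " payloadIsZero=" + payloadIsZero(page));
    }
}
```

Because the marker sits inside the overhead, writing it never touches the payload, so the two observations in the dump are independent of each other.
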



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
