[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-09-14 Thread Nikita Amelchev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604038#comment-17604038
 ] 

Nikita Amelchev commented on IGNITE-16926:
--

[~ibessonov], thanks for the clarification.

I prefer option 1. Once it is merged, I will cherry-pick it to 2.14 if CI passes.

> Interrupted compute job may fail a node
> ---
>
> Key: IGNITE-16926
> URL: https://issues.apache.org/jira/browse/IGNITE-16926
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: ise.lts
> Fix For: 2.14
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> {code:java}
> Critical system error detected. Will be handled accordingly to configured 
> handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
> o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is 
> corrupted [groupId=1234619879, pageIds=[7290201467513], 
> cacheId=645096946, cacheName=*, indexName=*, msg=Runtime failure on row: 
> Row@79570772[ key: 1168930235, val: Data hidden due to 
> IGNITE_SENSITIVE_DATA_LOGGING flag. ][ data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden 
> ","logger_name":"ROOT","thread_name":"pub-#1278%x%","level":"ERROR","level_value":4,"stack_trace":"org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
>  B+Tree is corrupted [groupId=1234619879, pageIds=[7290201467513], 
> cacheId=645096946, cacheName=*, indexName=*, msg=Runtime failure on row: 
> Row@79570772[ key: 1168930235, val: Data hidden due to 
> IGNITE_SENSITIVE_DATA_LOGGING flag. ][ data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden ]] at 
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.corruptedTreeException(H2Tree.java:1003)
>  at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2492)
>  at 
> org.apache.ignite.internal.processors.cache.persistence.tre

[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-09-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604032#comment-17604032
 ] 

Ivan Bessonov commented on IGNITE-16926:


[~NSAmelchev] I see now, thank you!

That was clearly a mistake on my side. In your PR I would recommend putting 
*walWriter.close();* into the *else* branch of the mmap check.

I believe an alternative fix would look like this, but it now seems more 
dangerous to me; we would need to look into the code more carefully:
{code:java}
Index: modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/wal/filehandle/FileHandleManagerImpl.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/wal/filehandle/FileHandleManagerImpl.java b/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/wal/filehandle/FileHandleManagerImpl.java
--- a/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/wal/filehandle/FileHandleManagerImpl.java    (revision 18ff1592f9c7f78abad2b62b9c7a2034bb72796e)
+++ b/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/wal/filehandle/FileHandleManagerImpl.java    (date 1663154842011)
@@ -216,8 +216,7 @@
 
     /** {@inheritDoc} */
     @Override public void resumeLogging() {
-        if (!mmap)
-            walWriter.restart();
+        walWriter.restart();
 
         if (cctx.kernalContext().clientNode())
             return;
@@ -475,7 +474,7 @@
          * @param expPos Expected position.
          */
         void flushBuffer(long expPos) throws IgniteCheckedException {
-            if (mmap)
+            if (mmap && expPos >= 0)
                 return;
 
             Throwable err = walWriter.err;
 {code}

[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-09-14 Thread Nikita Amelchev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604017#comment-17604017
 ] 

Nikita Amelchev commented on IGNITE-16926:
--

[~ibessonov], I have created a reproducer and a fix: 
[PR|https://github.com/apache/ignite/pull/10250/files#diff-50f80de931797123ff52b15641131a3aacd8a45510297e5fc9388907b7906a6c].

Could you take a look, please?

[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-09-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603926#comment-17603926
 ] 

Ivan Bessonov commented on IGNITE-16926:


Hi [~NSAmelchev],

as far as I can see, {{FileHandleManagerImpl.WALWriter#body}} does close a 
{{fileIO}} instance when we call {{walWriter.close()}}.

Can you explain your scenario in more detail? Is this a different {{fileIO}} 
instance?

The idea of the fix was to move all remaining {{fileIO}} manipulations to the 
WAL writer thread, thus preventing channel interruption if a user thread is 
interrupted.
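
The mechanism behind that idea can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not the Ignite code: {{InterruptSafeWriteSketch}}, {{writeDirectly}}, {{writeViaWriterThread}} and {{demo}} are hypothetical names, and a {{FileChannel}} plus a single-thread executor stand in for {{fileIO}} and the WAL writer thread. It relies on the documented JDK behaviour that a thread whose interrupt status is set closes an interruptible channel as soon as it starts a blocking I/O operation on it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these names come from Ignite. */
final class InterruptSafeWriteSketch {
    /** Writes from the calling thread. Returns whether the channel is still open afterwards. */
    static boolean writeDirectly(FileChannel ch) throws IOException {
        try {
            ch.write(ByteBuffer.wrap(new byte[] {1}));
        }
        catch (ClosedByInterruptException e) {
            // The caller's interrupt status made the JDK close the channel.
        }

        return ch.isOpen();
    }

    /** Hands the write to a dedicated thread, so the caller's interrupt status never touches the channel. */
    static boolean writeViaWriterThread(FileChannel ch, ExecutorService writer)
        throws ExecutionException {
        Future<Integer> fut = writer.submit(() -> ch.write(ByteBuffer.wrap(new byte[] {1})));

        // Remember and clear the caller's interrupt status so that waiting is possible.
        boolean interrupted = Thread.interrupted();

        try {
            while (true) {
                try {
                    fut.get();

                    return ch.isOpen();
                }
                catch (InterruptedException e) {
                    interrupted = true; // Keep waiting; the write itself is unaffected.
                }
            }
        }
        finally {
            if (interrupted)
                Thread.currentThread().interrupt(); // Restore the status for the caller.
        }
    }

    /** Returns {open after delegated write, open after direct write} for an interrupted caller. */
    static boolean[] demo() throws Exception {
        Path file = Files.createTempFile("wal-sketch", ".bin");
        ExecutorService writer = Executors.newSingleThreadExecutor();

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            Thread.currentThread().interrupt(); // Simulate an interrupted compute job.

            boolean openAfterDelegated = writeViaWriterThread(ch, writer);
            boolean openAfterDirect = writeDirectly(ch);

            return new boolean[] {openAfterDelegated, openAfterDirect};
        }
        finally {
            Thread.interrupted(); // Clear the simulated interrupt before cleanup.
            writer.shutdown();
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] res = demo();

        System.out.println("open after delegated write: " + res[0]);
        System.out.println("open after direct write: " + res[1]);
    }
}
```

In this sketch the delegated write leaves the channel open, while the same write issued directly by the interrupted thread closes it for every other user of the channel. That second case is the failure mode the fix is meant to avoid.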

[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-09-13 Thread Nikita Amelchev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603580#comment-17603580
 ] 

Nikita Amelchev commented on IGNITE-16926:
--

[~ibessonov], [~ktkale...@gridgain.com], Hi guys.

Could you please explain why the fileIO is not being closed now?

Because of this, on Unix these files cannot be deleted (and their space freed) 
until the process completes.

[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-05-06 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532700#comment-17532700
 ] 

Kirill Tkalenko commented on IGNITE-16926:
--

[~ibessonov] Looks good to me.

> Interrupted compute job may fail a node
> ---
>
> Key: IGNITE-16926
> URL: https://issues.apache.org/jira/browse/IGNITE-16926
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {code:java}
> Critical system error detected. Will be handled accordingly to configured 
> handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
> o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is 
> corrupted [groupId=1234619879, pageIds=[7290201467513], 
> cacheId=645096946, cacheName=*, indexName=*, msg=Runtime failure on row: 
> Row@79570772[ key: 1168930235, val: Data hidden due to 
> IGNITE_SENSITIVE_DATA_LOGGING flag. ][ data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden 
> ","logger_name":"ROOT","thread_name":"pub-#1278%x%","level":"ERROR","level_value":4,"stack_trace":"org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
>  B+Tree is corrupted [groupId=1234619879, pageIds=[7290201467513], 
> cacheId=645096946, cacheName=*, indexName=*, msg=Runtime failure on row: 
> Row@79570772[ key: 1168930235, val: Data hidden due to 
> IGNITE_SENSITIVE_DATA_LOGGING flag. ][ data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden, data hidden, data hidden, data hidden, data hidden, data hidden, 
> data hidden ]] at 
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.corruptedTreeException(H2Tree.java:1003)
>  at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2492)
>  at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putx(BPlusTree.java:2432)
>  at 
> org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:500)
>  at 

[jira] [Commented] (IGNITE-16926) Interrupted compute job may fail a node

2022-05-06 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532687#comment-17532687
 ] 

Ignite TC Bot commented on IGNITE-16926:


{panel:title=Branch: [pull/10011/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10011/head] Base: [master] : New Tests (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}PDS (Indexing){color} [[tests 2|https://ci.ignite.apache.org/viewLog.html?buildId=6556894]]
* {color:#013220}IgnitePdsWithIndexingCoreTestSuite: IgnitePdsThreadInterruptionRandomAccessWalTest.testInterruptsOnRead - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingCoreTestSuite: IgnitePdsThreadInterruptionRandomAccessWalTest.testInterruptsOnWALWrite - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=6556937&buildTypeId=IgniteTests24Java8_RunAll]
