[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746746#comment-16746746
 ] 

Dmitriy Govorukhin commented on IGNITE-10974:
-

Merged to master.

> Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()
>
> Key: IGNITE-10974
> URL: https://issues.apache.org/jira/browse/IGNITE-10974
> Project: Ignite
> Issue Type: Bug
> Reporter: Dmitriy Govorukhin
> Assignee: Dmitriy Govorukhin
> Priority: Critical
> Fix For: 2.8
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> A node may hang while stopping. The bug is in 
> PageMemoryImpl.writeUnlockPage(): this method invokes 
> PageMemoryImpl.beforeReleaseWrite(), which may throw an exception, in which 
> case writeUnlockPage() does not release the page lock. Afterwards the 
> checkpointer tries to perform the final checkpoint (if the stop was graceful), 
> and the next access to this page hangs while acquiring the write lock.
> {code:java}
> [2019-01-17 14:35:15,953][WARN ][main][root] Thread dump at 2019/01/17 14:35:15 UTC
> [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#857%failure.IoomFailureHandlerTest0%", id=931, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
> [17:35:15]W: [org.apache.ignite:ignite-core] Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1]
> [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748)
> [17:35:15]W: [org.apache.ignite:ignite-core] 
> [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#856%failure.IoomFailureHandlerTest0%", id=930, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
> [17:35:15]W: [org.apache.ignite:ignite-core] Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1]
> [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748)
> [17:35:15]W: [org.apache.ignite:ignite-core] 
> [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#855%failure.IoomFailureHandlerTest0%", id=929, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
> [17:35:15]W: [org.apache.ignite:ignite-core] Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1]
> [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> [17:35:15]W: [org.apache.ignite:ignite-core] at 
> 
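The failure mode described in the issue (a pre-release hook throws, so the unlock step is skipped and the page stays write-locked) can be illustrated with a minimal Java sketch. Everything below is hypothetical: PageLockSketch, Page, BeforeRelease, writeUnlockLeaky and writeUnlockSafe are stand-in names, not Ignite classes or methods, and the actual change merged for IGNITE-10974 is not shown in this thread. The sketch only assumes that releasing the lock in a finally block is one common way to avoid this kind of leak.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration only; this is not Ignite's PageMemoryImpl. */
public class PageLockSketch {
    /** Stand-in for a page guarded by a write lock. */
    static final class Page {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        void writeLock() { lock.writeLock().lock(); }

        void writeUnlock() { lock.writeLock().unlock(); }

        boolean isWriteLocked() { return lock.isWriteLocked(); }
    }

    /** Stand-in for a pre-release hook that may fail. */
    interface BeforeRelease {
        void run(Page page);
    }

    /** Leaky shape: if the hook throws, the unlock is never reached. */
    static void writeUnlockLeaky(Page page, BeforeRelease hook) {
        hook.run(page);
        page.writeUnlock();
    }

    /** Safer shape: the exception still propagates, but the lock is released. */
    static void writeUnlockSafe(Page page, BeforeRelease hook) {
        try {
            hook.run(page);
        }
        finally {
            page.writeUnlock();
        }
    }

    public static void main(String[] args) {
        BeforeRelease failing = p -> { throw new IllegalStateException("simulated failure"); };

        Page leaky = new Page();
        leaky.writeLock();
        try {
            writeUnlockLeaky(leaky, failing);
        }
        catch (IllegalStateException e) {
            // Expected: the hook failed before the unlock.
        }
        System.out.println("leaky variant still locked: " + leaky.isWriteLocked());

        Page safe = new Page();
        safe.writeLock();
        try {
            writeUnlockSafe(safe, failing);
        }
        catch (IllegalStateException e) {
            // Expected: the hook failed, but finally released the lock.
        }
        System.out.println("safe variant still locked: " + safe.isWriteLocked());
    }
}
```

In the leaky variant the page remains write-locked after the exception, so any later attempt to lock the same page from another thread (for example during a final checkpoint) would block. In the safe variant the lock is released and the exception is still visible to the caller.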

[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746731#comment-16746731
 ] 

Ignite TC Bot commented on IGNITE-10974:


{panel:title=-- Run :: All: No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=2835958&buildTypeId=IgniteTests24Java8_RunAll]


[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746722#comment-16746722
 ] 

Dmitriy Govorukhin commented on IGNITE-10974:
-

I guess the problem is in the tests, because they were run on a Windows agent. 
I have started a new re-run.


[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746607#comment-16746607
 ] 

Ignite TC Bot commented on IGNITE-10974:


{panel:title=-- Run :: All: Possible Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}JDBC Driver{color} [[tests 36|https://ci.ignite.apache.org/viewLog.html?buildId=2835866]]
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testDefaultCharsetPacketSize1 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testWrongCharset_Utf8AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testDefaultCharset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testWrongCharset_Win1251AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testBulkLoadToNonAffinityNode - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testWrongCharset_Utf8AsWin1251 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testBulkLoadToNonAffinityNode - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testWrongCharset_Utf8AsWin1251 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testUtf8Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testDefaultCharsetPacketSize1 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testWrongCharset_Utf8AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testWrongCharset_Win1251AsUtf8 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testUtf8Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testDefaultCharset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testWin1251Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testDefaultCharsetPacketSize1 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testWrongCharset_Utf8AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testDefaultCharset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testWrongCharset_Win1251AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedSelfTest.testWrongCharset_Win1251AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testWrongCharset_Win1251AsUtf8 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testUtf8Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testWin1251Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testWrongCharset_Win1251AsUtf8 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testDefaultCharsetPacketSize1 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testWrongCharset_Utf8AsWin1251 - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testWrongCharset_Utf8AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedNearSelfTest.testWin1251Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testDefaultCharset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadTransactionalPartitionedSelfTest.testUtf8Charset - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testWrongCharset_Win1251AsAscii - 0,0% fails in last 572 master runs.
* IgniteJdbcDriverTestSuite: JdbcThinBulkLoadAtomicPartitionedNearSelfTest.testBulkLoadToNonAffinityNode - 0,0% fails in 

[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746315#comment-16746315
 ] 

Dmitriy Pavlov commented on IGNITE-10974:
-

LGTM from me, provided that it is approved by the Bot.


[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746310#comment-16746310
 ] 

Dmitriy Govorukhin commented on IGNITE-10974:
-

[~dpavlov] Fixed, please take a look again.


[jira] [Commented] (IGNITE-10974) Node may hangs if an exception is throw from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746284#comment-16746284
 ] 

Dmitriy Pavlov commented on IGNITE-10974:
-

I left several comments and questions in the PR. Provided that the TC bot 
issues a green visa and the comments are addressed, I agree with the change.
