[jira] [Created] (IGNITE-11358) Bug in ZK tests occurs periodically
Pavel Voronkin created IGNITE-11358: --- Summary: Bug in ZK tests occurs periodically Key: IGNITE-11358 URL: https://issues.apache.org/jira/browse/IGNITE-11358 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
java.lang.NullPointerException
    at org.apache.ignite.spi.discovery.zk.ZookeeperDiscoverySpi.allNodesSupport(ZookeeperDiscoverySpi.java:342)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.isHandshakeWaitSupported(TcpCommunicationSpi.java:4109)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$400(TcpCommunicationSpi.java:277)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onConnected(TcpCommunicationSpi.java:430)
    at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionOpened(GridNioFilterChain.java:251)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionOpened(GridNioFilterAdapter.java:88)
    at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionOpened(GridNioCodecFilter.java:66)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionOpened(GridNioFilterAdapter.java:88)
    at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onSessionOpened(GridConnectionBytesVerifyFilter.java:58)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionOpened(GridNioFilterAdapter.java:88)
    at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionOpened(GridNioServer.java:3525)
    at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionOpened(GridNioFilterChain.java:139)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.register(GridNioServer.java:2639)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1997)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1818)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11350) doInParallel interruption is not properly handled in ExchangeFuture.
Pavel Voronkin created IGNITE-11350: --- Summary: doInParallel interruption is not properly handled in ExchangeFuture. Key: IGNITE-11350 URL: https://issues.apache.org/jira/browse/IGNITE-11350 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11308) Add soLinger parameter support in TcpDiscoverySpi .NET configuration.
Pavel Voronkin created IGNITE-11308: --- Summary: Add soLinger parameter support in TcpDiscoverySpi .NET configuration. Key: IGNITE-11308 URL: https://issues.apache.org/jira/browse/IGNITE-11308 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11288) Missing SO_LINGER in TcpDiscovery and TcpCommunicationSpi causing SSLSocket.close() deadlock.
Pavel Voronkin created IGNITE-11288: --- Summary: Missing SO_LINGER in TcpDiscovery and TcpCommunicationSpi causing SSLSocket.close() deadlock. Key: IGNITE-11288 URL: https://issues.apache.org/jira/browse/IGNITE-11288 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
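For readers unfamiliar with the option named in the summary, the following is a minimal sketch of how SO_LINGER is set on a plain java.net.Socket. It is not taken from the Ignite code base; the class name SoLingerSketch, the helper names, and the chosen linger values are assumptions made for illustration only. With SO_LINGER enabled and a timeout of 0, close() is documented to reset the connection immediately instead of waiting for unsent data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

/** Illustrative only: hypothetical helper, not part of Ignite. */
final class SoLingerSketch {
    /** Enables SO_LINGER with the given timeout in seconds; a negative value disables it. */
    static void applySoLinger(Socket sock, int lingerSeconds) throws SocketException {
        if (lingerSeconds < 0)
            sock.setSoLinger(false, 0);
        else
            sock.setSoLinger(true, lingerSeconds);
    }

    /** Applies the setting to a fresh unconnected socket and reads it back (-1 means disabled). */
    static int roundTrip(int lingerSeconds) {
        try (Socket sock = new Socket()) {
            applySoLinger(sock, lingerSeconds);
            return sock.getSoLinger();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_LINGER read back: " + roundTrip(0));
    }
}
```

Whether a zero or a positive linger value is appropriate for discovery and communication sockets is a decision for the issue itself; the sketch only shows the JDK calls involved.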
[jira] [Created] (IGNITE-11255) Fix test failure after IGNITE-7648
Pavel Voronkin created IGNITE-11255: --- Summary: Fix test failure after IGNITE-7648 Key: IGNITE-11255 URL: https://issues.apache.org/jira/browse/IGNITE-11255 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11221) Refactor timeout logic in TcpDiscovery
Pavel Voronkin created IGNITE-11221: --- Summary: Refactor timeout logic in TcpDiscovery Key: IGNITE-11221 URL: https://issues.apache.org/jira/browse/IGNITE-11221 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
We need to reimplement IgniteSpiOperationTimeoutHelper because it mixes exception handling with timeout calculation. We need to reuse ExponentialBackoffTimeout to encapsulate the logic of calculating the different sets of timeouts separately and to get rid of many local variables.
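The idea of keeping timeout calculation in one small object, separate from exception handling, can be sketched as below. This is a generic exponential backoff calculator written for illustration; the class BackoffTimeoutSketch, its constructor parameters, and the method nextTimeout() are hypothetical and do not describe the API of Ignite's ExponentialBackoffTimeout.

```java
/** Illustrative only: hypothetical class, not Ignite's ExponentialBackoffTimeout. */
final class BackoffTimeoutSketch {
    private final long maxTimeout;
    private final double multiplier;
    private long cur;

    BackoffTimeoutSketch(long startTimeout, long maxTimeout, double multiplier) {
        if (startTimeout <= 0 || maxTimeout < startTimeout || multiplier < 1.0)
            throw new IllegalArgumentException("Invalid backoff parameters");

        this.cur = startTimeout;
        this.maxTimeout = maxTimeout;
        this.multiplier = multiplier;
    }

    /** Returns the timeout for the current attempt and grows the next one, capped at maxTimeout. */
    long nextTimeout() {
        long res = cur;

        cur = Math.min(maxTimeout, (long)(cur * multiplier));

        return res;
    }

    public static void main(String[] args) {
        BackoffTimeoutSketch t = new BackoffTimeoutSketch(100, 500, 2.0);

        for (int i = 0; i < 5; i++)
            System.out.println("attempt " + i + ": " + t.nextTimeout() + " ms");
    }
}
```

A caller would hold one such object per kind of timeout (for example, connect and handshake) instead of tracking several local variables, while retry and exception handling stay in the calling code.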
[jira] [Created] (IGNITE-11201) ConnectorConfiguration and TransactionConfiguration toString is not properly implemented.
Pavel Voronkin created IGNITE-11201: --- Summary: ConnectorConfiguration and TransactionConfiguration toString is not properly implemented. Key: IGNITE-11201 URL: https://issues.apache.org/jira/browse/IGNITE-11201 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11172) On receiving duplicated connections we got exception.
Pavel Voronkin created IGNITE-11172: --- Summary: On receiving duplicated connections we got exception. Key: IGNITE-11172 URL: https://issues.apache.org/jira/browse/IGNITE-11172 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin [2019-01-31 16:10:19,072][INFO ][grid-nio-worker-tcp-comm-5-#45][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=e0668107-3c19-41ba-b9f5-9f073711d3ce, locNodeOrder=1, rmtNode=848095e3-29bf-4d67-a5d7-117f44001b70, rmtNodeOrder=2] [2019-01-31 16:10:20,310][ERROR][grid-nio-worker-tcp-comm-6-#46][TcpCommunicationSpi] Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=GridWorker [name=grid-nio-worker-tcp-comm-6, igniteInstanceName=null, finished=false, hashCode=848731852, interrupted=false, runner=grid-nio-worker-tcp-comm-6-#46]AbstractNioClientWorker [idx=6, bytesRcvd=28540977, bytesSent=0, bytesRcvd0=30504, bytesSent0=0, select=true, super=]DirectNioClientWorker [super=], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32511 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=484549, resendCnt=0, rcvCnt=443208, sentCnt=532641, reserved=true, lastAck=443200, nodeLeft=false, node=TcpDiscoveryNode [id=848095e3-29bf-4d67-a5d7-117f44001b70, addrs=ArrayList [172.25.1.12], sockAddrs=HashSet [lab12.gridgain.local/172.25.1.12:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1548940115834, loc=false, ver=2.5.5#20190131-sha1:38e914f7, isClient=false], connected=false, connectCnt=16, queueLimit=4096, reserveCnt=17, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=484549, resendCnt=0, rcvCnt=443208, sentCnt=532641, reserved=true, lastAck=443200, nodeLeft=false, node=TcpDiscoveryNode [id=848095e3-29bf-4d67-a5d7-117f44001b70, addrs=ArrayList [172.25.1.12], sockAddrs=HashSet [lab12.gridgain.local/172.25.1.12:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1548940115834, loc=false, ver=2.5.5#20190131-sha1:38e914f7, isClient=false], connected=false, connectCnt=16, queueLimit=4096, reserveCnt=17, pairedConnections=false], super=GridNioSessionImpl [locAddr=/172.25.1.11:58372, rmtAddr=lab12.gridgain.local/172.25.1.12:47100, createTime=1548940219095, closeTime=0, bytesSent=5750672, bytesRcvd=23544, bytesSent0=5750672, bytesRcvd0=23544, sndSchedTime=1548940219095, lastSndTime=1548940219306, lastRcvTime=1548940219115, readsPaused=false, filterChain=FilterChain[filters=[, GridConnectionBytesVerifyFilter, SSL filter], accepted=false, markedForClose=true]]] javax.net.ssl.SSLException: Failed to encrypt data (SSL engine error) [status=CLOSED, handshakeStatus=NEED_UNWRAP, ses=GridSelectorNioSessionImpl [worker=GridWorker [name=grid-nio-worker-tcp-comm-6, igniteInstanceName=null, finished=false, hashCode=848731852, interrupted=false, runner=grid-nio-worker-tcp-comm-6-#46]AbstractNioClientWorker [idx=6, bytesRcvd=28540977, bytesSent=0, bytesRcvd0=30504, bytesSent0=0, select=true, super=]DirectNioClientWorker [super=], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32511 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=484549, resendCnt=0, rcvCnt=443208, sentCnt=532641, reserved=true, lastAck=443200, nodeLeft=false, node=TcpDiscoveryNode [id=848095e3-29bf-4d67-a5d7-117f44001b70, addrs=ArrayList [172.25.1.12], sockAddrs=HashSet [lab12.gridgain.local/172.25.1.12:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1548940115834, loc=false, ver=2.5.5#20190131-sha1:38e914f7, isClient=false], connected=false, connectCnt=16, queueLimit=4096, reserveCnt=17, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=484549, resendCnt=0, rcvCnt=443208, sentCnt=532641, reserved=true, lastAck=443200, nodeLeft=false, node=TcpDiscoveryNode [id=848095e3-29bf-4d67-a5d7-117f44001b70, addrs=ArrayList [172.25.1.12], sockAddrs=HashSet 
[lab12.gridgain.local/172.25.1.12:47500], discPort=47500, order=2, intOrder=2, lastExchangeTime=1548940115834, loc=false, ver=2.5.5#20190131-sha1:38e914f7, isClient=false], connected=false, connectCnt=16, queueLimit=4096, reserveCnt=17, pairedConnections=false], super=GridNioSessionImpl [locAddr=/172.25.1.11:58372, rmtAddr=lab12.gridgain.local/172.25.1.12:47100, createTime=1548940219095, closeTime=0, bytesSent=5750672, bytesRcvd=23544, bytesSent0=5750672, bytesRcvd0=23544, sndSchedTime=1548940219095, lastSndTime=1548940219306, lastRcvTime=1548940219115, readsPaused=false, filterChain=FilterChain[filters=[, GridConnectionBytesVerifyFilter, SSL filter], accepted=false, markedForClose=true]]] at org.apache.ignite.internal.util.nio.ssl.GridNioSslHandler.encrypt(GridNioSslHandler.java:380) at
[jira] [Created] (IGNITE-11126) Rework TcpCommunicationSpi.createShmemClient failure detection logic.
Pavel Voronkin created IGNITE-11126: --- Summary: Rework TcpCommunicationSpi.createShmemClient failure detection logic. Key: IGNITE-11126 URL: https://issues.apache.org/jira/browse/IGNITE-11126 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11061) Copyright still points out 2018
Pavel Voronkin created IGNITE-11061: --- Summary: Copyright still points out 2018 Key: IGNITE-11061 URL: https://issues.apache.org/jira/browse/IGNITE-11061 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11054) GridNioServer.processWrite() reordered socket.write and onMessageWritten callback.
Pavel Voronkin created IGNITE-11054: --- Summary: GridNioServer.processWrite() reordered socket.write and onMessageWritten callback. Key: IGNITE-11054 URL: https://issues.apache.org/jira/browse/IGNITE-11054 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11031) Improve test coverage on ssl and fix existing ssl tcp communication spi tests.
Pavel Voronkin created IGNITE-11031: --- Summary: Improve test coverage on ssl and fix existing ssl tcp communication spi tests. Key: IGNITE-11031 URL: https://issues.apache.org/jira/browse/IGNITE-11031 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11026) Support TcpCommunicationSpi.NeedWaitDelay, TcpCommunicationSpi.MaxNeedWaitDelay.
Pavel Voronkin created IGNITE-11026: --- Summary: Support TcpCommunicationSpi.NeedWaitDelay, TcpCommunicationSpi.MaxNeedWaitDelay. Key: IGNITE-11026 URL: https://issues.apache.org/jira/browse/IGNITE-11026 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11017) OffheapEntriesCount metrics calculate size on all not EVICTED partitions
Pavel Voronkin created IGNITE-11017: --- Summary: OffheapEntriesCount metrics calculate size on all not EVICTED partitions Key: IGNITE-11017 URL: https://issues.apache.org/jira/browse/IGNITE-11017 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-11016) NEED_WAIT write message failed in case of SSL
Pavel Voronkin created IGNITE-11016: --- Summary: NEED_WAIT write message failed in case of SSL Key: IGNITE-11016 URL: https://issues.apache.org/jira/browse/IGNITE-11016 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10877) GridAffinityAssignment.initPrimaryBackupMaps memory pressure
Pavel Voronkin created IGNITE-10877: --- Summary: GridAffinityAssignment.initPrimaryBackupMaps memory pressure Key: IGNITE-10877 URL: https://issues.apache.org/jira/browse/IGNITE-10877 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10876) finishExchangeOnCoordinator parallelization
Pavel Voronkin created IGNITE-10876: --- Summary: finishExchangeOnCoordinator parallelization Key: IGNITE-10876 URL: https://issues.apache.org/jira/browse/IGNITE-10876 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10851) Improve WalCompactionSwitchOnTest to rely on rollOver() instead of hardcoded values.
Pavel Voronkin created IGNITE-10851: --- Summary: Improve WalCompactionSwitchOnTest to rely on rollOver() instead of hardcoded values. Key: IGNITE-10851 URL: https://issues.apache.org/jira/browse/IGNITE-10851 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10686) NPE on query execution on node kill
Pavel Voronkin created IGNITE-10686: --- Summary: NPE on query execution on node kill Key: IGNITE-10686 URL: https://issues.apache.org/jira/browse/IGNITE-10686 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin Attachments: IndexingTestNpe.java
[jira] [Created] (IGNITE-10681) PME benchmarks become unstable at high number of partitions per cache.
Pavel Voronkin created IGNITE-10681: --- Summary: PME benchmarks become unstable at high number of partitions per cache. Key: IGNITE-10681 URL: https://issues.apache.org/jira/browse/IGNITE-10681 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
With an increased number of partitions per cache (20-32k), some of the stages that happen during PME become very unstable, both per node and per run. This must be investigated because it blocks us from fine-tuning PME time optimizations further.
Example: same configuration, same servers, same version (8.5.1-p150).
A run on 2018-11-13:
{noformat}
Exchange [13,0] during LEAVE 1 server(s): 8942 msec, 8952 msec
New topology version after JOIN 1 server(s): 14
Exchange [14,0] during JOIN 1 server(s): 7558 msec, 8225 msec
Exchange [14,1] during JOIN 1 server(s): 19510 msec, 19562 msec
{noformat}
A run on 2018-11-14:
{noformat}
Exchange [13, 0] during LEAVE 1 server(s): 14434 msec, 14448 msec
New topology version after JOIN 1 server(s): 14
Exchange [14, 1] during JOIN 1 server(s): 9089 msec, 9512 msec
Exchange [14, 0] during JOIN 1 server(s): 14455 msec, 14671 msec
{noformat}
[jira] [Created] (IGNITE-10679) Add more debug info for 'Affinity changes' PME stage.
Pavel Voronkin created IGNITE-10679: --- Summary: Add more debug info for 'Affinity changes' PME stage. Key: IGNITE-10679 URL: https://issues.apache.org/jira/browse/IGNITE-10679 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
At the current stage of PME time optimizations, the next possible target is the 'Affinity changes' apply stage, because it starts to slow down PME when the configured number of partitions is increased (up to 32k). We need to add more debug information to understand the sources of the slowdown.
Example log:
{noformat}
[15:21:55,326][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Received full message, will finish exchange [node=8ca2f878-70cf-4d34-a3ea-97b244f8d3d6, resVer=AffinityTopologyVersion [topVer=13, minorTopVer=0]]
[15:21:55,725][INFO][sys-#121][CacheAffinitySharedManager] Affinity applying from full message performed in 398 ms.
[15:21:56,173][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=740ms, reason='timeout']
[15:21:57,150][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=13ms, checkpointLockHoldTime=699ms, reason='timeout']
[15:21:58,189][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=4ms, checkpointLockHoldTime=730ms, reason='timeout']
[15:21:58,340][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Affinity changes applied in 3013 ms.
[15:21:59,131][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=6ms, checkpointLockHoldTime=664ms, reason='timeout']
[15:21:59,311][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Full map updating for 67 groups performed in 971 ms.
[15:21:59,311][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], err=null]
[15:22:00,039][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=4ms, checkpointLockHoldTime=569ms, reason='timeout']
[15:22:00,108][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 796 ms.
[15:22:00,376][WARNING][sys-stripe-14-#15][finish] Received finish request for completed transaction (the message may be too late) [txId=GridCacheVersion [topVer=153677859, order=1542197978690, nodeOrder=10], dhtTxId=null, node=ca22eac2-862a-4fed-a84b-4abe85bfab25, commit=false]
[15:22:00,376][WARNING][sys-stripe-9-#10][finish] Received finish request for completed transaction (the message may be too late) [txId=GridCacheVersion [topVer=153677859, order=1542197978695, nodeOrder=10], dhtTxId=null, node=ca22eac2-862a-4fed-a84b-4abe85bfab25, commit=false]
[15:22:00,376][WARNING][sys-stripe-10-#11][GridDhtColocatedCache] Failed to acquire lock (transaction has been completed): GridCacheVersion [topVer=153677859, order=1542197978695, nodeOrder=10]
[15:22:00,376][WARNING][sys-stripe-11-#12][GridDhtColocatedCache] Failed to acquire lock (transaction has been completed): GridCacheVersion [topVer=153677859, order=1542197978690, nodeOrder=10]
[15:22:01,404][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=2ms, checkpointLockHoldTime=928ms, reason='timeout']
[15:22:02,069][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Checkpoint started [checkpointId=4a3ff280-b07c-48fc-8352-dd1b09e4ddbb, startPtr=FileWALPointer [idx=35, fileOff=11841470, len=11791007], checkpointLockWait=5ms, checkpointLockHoldTime=568ms, walCpRecordFsyncDuration=14ms, pages=12, reason='timeout']
[15:22:02,070][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Checkpoint finished [cpId=4a3ff280-b07c-48fc-8352-dd1b09e4ddbb, pages=12, markPos=FileWALPointer [idx=35, fileOff=11841470, len=11791007], walSegmentsCleared=0, walSegmentsCovered=[], markDuration=584ms, pagesWrite=0ms, fsync=1ms, total=590ms]
[15:22:02,476][INFO][exchange-worker-#66][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=13, minorTopVer=0], evt=NODE_FAILED, node=a10011b8-7431-4bba-8b17-565a2a2be222]
[15:22:02,760][WARNING][sys-stripe-11-#12][finish] Received finish request for completed transaction (the message may be too late) [txId=GridCacheVersion [topVer=153677859, order=1542197978805, nodeOrder=10], dhtTxId=null, node=ca22eac2-862a-4fed-a84b-4abe85bfab25, commit=false]
[15:22:03,044][INFO][db-checkpoint-thread-#85][GridCacheDatabaseSharedManager] Skipping
[jira] [Created] (IGNITE-10671) Double initialization of segmentAware and FileArchiver lead to race breaking file compression.
Pavel Voronkin created IGNITE-10671: --- Summary: Double initialization of segmentAware and FileArchiver lead to race breaking file compression. Key: IGNITE-10671 URL: https://issues.apache.org/jira/browse/IGNITE-10671 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin Attachments: WalCompactionSwitchOverTest.java
[jira] [Created] (IGNITE-10658) Rebalance status in Visor stays on 99.99%.
Pavel Voronkin created IGNITE-10658: --- Summary: Rebalance status in Visor stays on 99.99%. Key: IGNITE-10658 URL: https://issues.apache.org/jira/browse/IGNITE-10658 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10648) Ignite hangs on stop if the node wasn't started completely. GridTcpRestNioListener hangs on latch.
Pavel Voronkin created IGNITE-10648: --- Summary: Ignite hangs on stop if the node wasn't started completely. GridTcpRestNioListener hangs on latch. Key: IGNITE-10648 URL: https://issues.apache.org/jira/browse/IGNITE-10648 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10644) CorruptedTreeException might occur after force node kill during transaction
Pavel Voronkin created IGNITE-10644: --- Summary: CorruptedTreeException might occur after force node kill during transaction Key: IGNITE-10644 URL: https://issues.apache.org/jira/browse/IGNITE-10644 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10638) Improve CacheNoAffinityExchangeTest.testNoAffinityChangeOnClientLeftWithMergedExchanges to cover persistence case
Pavel Voronkin created IGNITE-10638: --- Summary: Improve CacheNoAffinityExchangeTest.testNoAffinityChangeOnClientLeftWithMergedExchanges to cover persistence case Key: IGNITE-10638 URL: https://issues.apache.org/jira/browse/IGNITE-10638 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10511) disco-event-worker can be deadlocked by BinaryContext.metadata running in sys striped pool waiting for cache entry lock
Pavel Voronkin created IGNITE-10511: --- Summary: disco-event-worker can be deadlocked by BinaryContext.metadata running in sys striped pool waiting for cache entry lock Key: IGNITE-10511 URL: https://issues.apache.org/jira/browse/IGNITE-10511 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin Attachments: race.txt
[jira] [Created] (IGNITE-10417) notifyDiscoveryListener can be lost
Pavel Voronkin created IGNITE-10417: --- Summary: notifyDiscoveryListener can be lost Key: IGNITE-10417 URL: https://issues.apache.org/jira/browse/IGNITE-10417 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10344) Speed up cleanupRestoredCaches
Pavel Voronkin created IGNITE-10344: --- Summary: Speed up cleanupRestoredCaches Key: IGNITE-10344 URL: https://issues.apache.org/jira/browse/IGNITE-10344 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin
    if (!cctx.kernalContext().clientNode() && !isLocalNodeInBaseline()) {
        // Stop all recovered caches and groups.
        cctx.cache().onKernalStopCaches(true);
        cctx.cache().stopCaches(true);
        cctx.database().cleanupRestoredCaches();

        // Set initial node started marker.
        cctx.database().nodeStart(null);
    }
With many cache groups, cleanupRestoredCaches() takes a long time (about 30 seconds). We need to speed up this phase and add metrics for it.
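As a rough illustration of the "add metrics" part of this request, the sketch below times an arbitrary phase with System.nanoTime(). The class PhaseTimerSketch and the method timeMillis() are hypothetical names, the empty lambda merely stands in for the cleanupRestoredCaches() call, and nothing here reflects how Ignite actually records metrics.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: hypothetical helper, not an Ignite metrics API. */
final class PhaseTimerSketch {
    /** Runs the phase and returns its wall-clock duration in milliseconds. */
    static long timeMillis(Runnable phase) {
        long start = System.nanoTime();

        phase.run();

        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // The empty body is a stand-in for cctx.database().cleanupRestoredCaches().
        long ms = timeMillis(() -> { });

        System.out.println("cleanupRestoredCaches stand-in took " + ms + " ms");
    }
}
```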
[jira] [Created] (IGNITE-10324) Disallow fallback to Scanner in control.sh when asking password
Pavel Voronkin created IGNITE-10324: --- Summary: Disallow fallback to Scanner in control.sh when asking password Key: IGNITE-10324 URL: https://issues.apache.org/jira/browse/IGNITE-10324 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
[jira] [Created] (IGNITE-10316) control.sh --baseline remove outputs wrong error message when trying to remove node from baseline when cluster is inactive.
Pavel Voronkin created IGNITE-10316: --- Summary: control.sh --baseline remove outputs wrong error message when trying to remove node from baseline when cluster is inactive. Key: IGNITE-10316 URL: https://issues.apache.org/jira/browse/IGNITE-10316 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
1. Start 2 nodes from clean LFS.
2. Grep node consistent IDs:
{noformat}
$ grep "Consistent ID" work/log/*.log
work/log/ignite-15958745.0.log:[16:59:13,190][INFO][main][PdsFoldersResolver] Consistent ID used for local node is [7ee8018b-3f5f-4c58-b6dd-f53ed7af2679] according to persistence data storage folders
work/log/ignite-300c7412.0.log:[16:59:15,678][INFO][main][PdsFoldersResolver] Consistent ID used for local node is [5adcf3a1-7ad9-40fa-88ac-40b488dc6b34] according to persistence data storage folders
{noformat}
3. Try to remove a node from baseline.
Expected: error message about cluster inactive state.
Actual: error message about node ID not found (BUG):
{noformat}
Caused by: java.lang.IllegalStateException: Node not found for consistent ID: 7ee8018b-3f5f-4c58-b6dd-f53ed7af2679
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.remove(VisorBaselineTask.java:178)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:208)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:52)
    at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:69)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6726)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
    ... 19 more
{noformat}
However, when trying to add a node to baseline, the error message is correct:
{noformat}
Caused by: class org.apache.ignite.IgniteException: Changing BaselineTopology on inactive cluster is not allowed.
    at org.apache.ignite.internal.cluster.IgniteClusterImpl.validateBeforeBaselineChange(IgniteClusterImpl.java:406)
    at org.apache.ignite.internal.cluster.IgniteClusterImpl.setBaselineTopology(IgniteClusterImpl.java:356)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.set0(VisorBaselineTask.java:87)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.add(VisorBaselineTask.java:162)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:205)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:52)
    at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:69)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6726)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:491)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1123)
    at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1407)
    at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:660)
    at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:532)
    ... 13 more
{noformat}
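The expected behaviour in the report above amounts to checking the cluster state before looking up the node. A minimal sketch of that ordering follows; the class BaselineRemoveSketch and its fields are hypothetical and are not the VisorBaselineTask implementation. Only the two message texts are taken from the stack traces in the report.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: hypothetical model, not Ignite's VisorBaselineTask. */
final class BaselineRemoveSketch {
    private final boolean clusterActive;
    private final Set<String> baseline;

    BaselineRemoveSketch(boolean clusterActive, Collection<String> consistentIds) {
        this.clusterActive = clusterActive;
        this.baseline = new HashSet<>(consistentIds);
    }

    /** Validates cluster state first, so an inactive cluster is reported before a missing node. */
    void remove(String consistentId) {
        if (!clusterActive)
            throw new IllegalStateException("Changing BaselineTopology on inactive cluster is not allowed.");

        if (!baseline.remove(consistentId))
            throw new IllegalStateException("Node not found for consistent ID: " + consistentId);
    }

    public static void main(String[] args) {
        BaselineRemoveSketch inactive = new BaselineRemoveSketch(false, Arrays.asList("node-a"));

        try {
            inactive.remove("node-a");
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```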
[jira] [Created] (IGNITE-10315) control.sh --baseline remove outputs wrong error message when trying to remove node from baseline when cluster is inactive.
Pavel Voronkin created IGNITE-10315: --- Summary: control.sh --baseline remove outputs wrong error message when trying to remove node from baseline when cluster is inactive. Key: IGNITE-10315 URL: https://issues.apache.org/jira/browse/IGNITE-10315 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin
1. Start 2 nodes from clean LFS.
2. Grep node consistent IDs:
{noformat}
$ grep "Consistent ID" work/log/*.log
work/log/ignite-15958745.0.log:[16:59:13,190][INFO][main][PdsFoldersResolver] Consistent ID used for local node is [7ee8018b-3f5f-4c58-b6dd-f53ed7af2679] according to persistence data storage folders
work/log/ignite-300c7412.0.log:[16:59:15,678][INFO][main][PdsFoldersResolver] Consistent ID used for local node is [5adcf3a1-7ad9-40fa-88ac-40b488dc6b34] according to persistence data storage folders
{noformat}
3. Try to remove a node from baseline.
Expected: error message about cluster inactive state.
Actual: error message about node ID not found (BUG):
{noformat}
Caused by: java.lang.IllegalStateException: Node not found for consistent ID: 7ee8018b-3f5f-4c58-b6dd-f53ed7af2679
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.remove(VisorBaselineTask.java:178)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:208)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:52)
    at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:69)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6726)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
    ... 19 more
{noformat}
However, when trying to add a node to baseline, the error message is correct:
{noformat}
Caused by: class org.apache.ignite.IgniteException: Changing BaselineTopology on inactive cluster is not allowed.
    at org.apache.ignite.internal.cluster.IgniteClusterImpl.validateBeforeBaselineChange(IgniteClusterImpl.java:406)
    at org.apache.ignite.internal.cluster.IgniteClusterImpl.setBaselineTopology(IgniteClusterImpl.java:356)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.set0(VisorBaselineTask.java:87)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.add(VisorBaselineTask.java:162)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:205)
    at org.apache.ignite.internal.visor.baseline.VisorBaselineTask$VisorBaselineJob.run(VisorBaselineTask.java:52)
    at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:69)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6726)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:491)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1123)
    at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1407)
    at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:660)
    at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:532)
    ... 13 more
{noformat}
[jira] [Created] (IGNITE-10300) control.sh: incorrect error message after three tries on unsuccessful authorization
Pavel Voronkin created IGNITE-10300: --- Summary: control.sh: incorrect error message after three tries on unsuccessful authorization Key: IGNITE-10300 URL: https://issues.apache.org/jira/browse/IGNITE-10300 Project: Ignite Issue Type: Bug Reporter: Pavel Voronkin

1. Start a grid with security enabled.
2. Try to issue control.sh --cache; authentication credentials are asked.
3. Enter incorrect credentials three times.

Expected: "Authentication error" is printed and logged.
Actual: "Latest topology update failed" error is printed.
{noformat}
IGNITE_HOME=`pwd` bin/control.sh --cache list .
Control utility [ver. 2.5.1-p160#20181113-sha1:5f845ca7]
2018 Copyright(C) Apache Software Foundation
User: mshonichev
Authentication error, try connection again.
user: password:
Authentication error, try connection again.
user: password:
Authentication error, try connection again.
user: password:
Authentication error.
Error: Latest topology update failed.
{noformat}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10295) Rework Sending Full Message logging.
Pavel Voronkin created IGNITE-10295: --- Summary: Rework Sending Full Message logging. Key: IGNITE-10295 URL: https://issues.apache.org/jira/browse/IGNITE-10295 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10085) Make compressed wal archives user friendly.
Pavel Voronkin created IGNITE-10085: --- Summary: Make compressed wal archives user friendly. Key: IGNITE-10085 URL: https://issues.apache.org/jira/browse/IGNITE-10085 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin Compressed wal archives are created with ZipEntry(""). In some ZIP GUIs those archives are shown empty which can really confuse users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
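To illustrate the difference described in IGNITE-10085, here is a minimal, self-contained sketch using only `java.util.zip`. It writes one archive with `ZipEntry("")` and one with a named entry, then reads the entry names back. It is not the Ignite WAL compression code: the class `ZipEntryNameDemo`, its helper methods and the segment name `0000000000000001.wal` are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryNameDemo {
    /** Writes a single-entry ZIP archive into memory using the given entry name. */
    static byte[] zipWithEntry(String entryName, byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(payload);
            zos.closeEntry();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        return bytes.toByteArray();
    }

    /** Returns the name of the first entry in the archive, or null if there is none. */
    static String firstEntryName(byte[] archive) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry = zis.getNextEntry();

            return entry == null ? null : entry.getName();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "wal record bytes".getBytes(StandardCharsets.UTF_8);

        // Entry with an empty name, as described in the issue.
        System.out.println("unnamed: '" + firstEntryName(zipWithEntry("", payload)) + "'");

        // Entry with a file-like name (hypothetical segment name).
        System.out.println("named:   '" + firstEntryName(zipWithEntry("0000000000000001.wal", payload)) + "'");
    }
}
```

Both archives carry the same payload; only the second gives a file browser a name to display, which is presumably why some ZIP GUIs show the first one as empty.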
[jira] [Created] (IGNITE-9970) Add ability to set nodeId for VisorIdleVerifyDumpTask executed from ./control.sh --host HOST --cache idle_verify
Pavel Voronkin created IGNITE-9970: -- Summary: Add ability to set nodeId for VisorIdleVerifyDumpTask executed from ./control.sh --host HOST --cache idle_verify Key: IGNITE-9970 URL: https://issues.apache.org/jira/browse/IGNITE-9970 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9942) We need ability in WebConsole to disable selfregistration feature.
Pavel Voronkin created IGNITE-9942: -- Summary: We need ability in WebConsole to disable selfregistration feature. Key: IGNITE-9942 URL: https://issues.apache.org/jira/browse/IGNITE-9942 Project: Ignite Issue Type: Improvement Reporter: Pavel Voronkin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9793) Deactivation, segmentation of one node, activation may lead to hang activation forever
Pavel Voronkin created IGNITE-9793: -- Summary: Deactivation, segmentation of one node, activation may lead to hang activation forever Key: IGNITE-9793 URL: https://issues.apache.org/jira/browse/IGNITE-9793 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Pavel Voronkin

There is a coordinator and a ring of nodes: coordinator -> 1 -> 2 -> 3 -> 4

Coordinator deactivated:
2018-09-24 15:09:01.609 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully deactivated data structures, services and caches [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false, topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:01.620 [DEBUG][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager] Server latch is created [latch=CompletableLatchUid{id='exchange', topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]}, participantsSize=160]
2018-09-24 15:09:01.621 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%]

Nodes 1, 2, 3, 4 were deactivated:
2018-09-24 15:09:01.609 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully deactivated data structures, services and caches [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false, topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.328 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully deactivated data structures, services and caches [nodeId=22a58223-47b5-43c2-897b-e70e8e50edf7, client=false, topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.334 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully deactivated data structures, services and caches [nodeId=973eb8ce-3b8c-463d-a6ab-00ac66d93f13, client=false, topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.332 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully deactivated data structures, services and caches [nodeId=a904bac4-aaed-4f69-90f3-c13bc4d331d1, client=false, topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]

Node 2 SEGMENTED:
2018-09-24 15:17:50.068 [WARN ][tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%][o.a.i.s.d.tcp.TcpDiscoverySpi] Node is out of topology (probably, due to short-time network problems).
2018-09-24 15:17:50.069 [WARN ][disco-event-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Local node SEGMENTED: TcpDiscoveryNode [id=a904bac4-aaed-4f69-90f3-c13bc4d331d1, addrs=ArrayList [10.116.206.98], sockAddrs=HashSet [grid724.domain/10.116.206.98:47500], discPort=47500, order=110, intOrder=110, lastExchangeTime=1537791470063, loc=true, ver=2.5.1#20180906-sha1:ebde6c79, isClient=false]

Coordinator started activation on the topology without node 2:
2018-09-24 15:19:48.686 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Start activation process [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false, topVer=AffinityTopologyVersion [topVer=188, minorTopVer=1]]

But node 3, which is next to node 2, hasn't received the activation message. Coordinator sent activation to all except
2018-09-24 15:24:25.911 [INFO ][sys-#28144%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=188, minorTopVer=1], node=073f1598-6b70-49df-8f45-126735611775, allReceived=false]

GridDhtPartitionsExchangeFuture hangs forever.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9433) Refactoring to improve constant usage for file suffixes
Pavel Voronkin created IGNITE-9433: -- Summary: Refactoring to improve constant usage for file suffixes Key: IGNITE-9433 URL: https://issues.apache.org/jira/browse/IGNITE-9433 Project: Ignite Issue Type: Task Reporter: Pavel Voronkin Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)