[jira] [Created] (IGNITE-14476) Get rid of using storage implementation explicitly in ConfigurationRoot annotation
Vladislav Pyatkov created IGNITE-14476: -- Summary: Get rid of using storage implementation explicitly in ConfigurationRoot annotation Key: IGNITE-14476 URL: https://issues.apache.org/jira/browse/IGNITE-14476 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov Fix For: 3.0.0-alpha2 Today we use generated schema classes in the public API, but we do not want the public API to expose a storage implementation. For example: {code:java} @ConfigurationRoot(rootName = "rest", storage = InMemoryConfigurationStorage.class) public class RestConfigurationSchema { ... {code} The reference to InMemoryConfigurationStorage should be replaced with a specific constant: {code:java} @ConfigurationRoot(rootName = "rest", storage = IgniteConsts.MEMORY_CONFIGURATION_STORAGE) public class RestConfigurationSchema { ... {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
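One detail worth keeping in mind for the proposal above: Java accepts only compile-time constant expressions, class literals, enum constants and nested annotations as annotation values, so the constant cannot be a static final Class field. The sketch below shows one possible shape, using an enum. The StorageType enum, the simplified ConfigurationRoot declaration and StorageResolver are assumptions made for illustration and are not the actual Ignite 3 classes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical storage kinds visible from the public API; no implementation class is referenced. */
enum StorageType { MEMORY, DISTRIBUTED }

/** Simplified stand-in for the real annotation, declared here only so that the sketch compiles. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ConfigurationRoot {
    String rootName();

    StorageType storage() default StorageType.MEMORY;
}

/** Schema class from the ticket, now free of any storage implementation reference. */
@ConfigurationRoot(rootName = "rest", storage = StorageType.MEMORY)
class RestConfigurationSchema {
}

/** Hypothetical internal helper that reads the storage kind back from a schema class. */
class StorageResolver {
    static StorageType storageOf(Class<?> schemaClass) {
        ConfigurationRoot root = schemaClass.getAnnotation(ConfigurationRoot.class);

        if (root == null)
            throw new IllegalArgumentException("Not a configuration root: " + schemaClass.getName());

        return root.storage();
    }
}
```

With this shape, only an internal module needs to map StorageType.MEMORY to a concrete storage class, so the public API stays implementation-free.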
[jira] [Created] (IGNITE-14419) SPI suite hangs sporadically on private TC
Vladislav Pyatkov created IGNITE-14419: -- Summary: SPI suite hangs sporadically on private TC Key: IGNITE-14419 URL: https://issues.apache.org/jira/browse/IGNITE-14419 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov The [SPI|https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Spi?branch=%3Cdefault%3E=overview=builds] suite times out on the master branch from time to time. Hangs and successful runs both occur with no changes or with unrelated changes. Logs from the last three timeouts are attached. From analysis of the last three timed-out runs: # The suite hangs on either the testJoinErrorMissedAddFinishedMessage2 or the testClientConnectToCluster test. # In both cases there are no obvious exceptions or assertions in the logs, but internal components output warnings about a hanging exchange: *Still waiting for initial partition map exchange*. Most likely both hangs are caused by the same problem with a hanging PME. The tests need further investigation.
[jira] [Created] (IGNITE-14185) Synchronous checkpoints on several nodes greatly increase a latency of distributed transaction
Vladislav Pyatkov created IGNITE-14185: -- Summary: Synchronous checkpoints on several nodes greatly increase a latency of distributed transaction Key: IGNITE-14185 URL: https://issues.apache.org/jira/browse/IGNITE-14185 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov If several nodes have identically configured checkpoints (the same frequency), distributed transaction processing can lag noticeably, even though each node on its own holds the exclusive lock only for a very short time.
[jira] [Created] (IGNITE-14140) Checkpointer thread holds write lock too long
Vladislav Pyatkov created IGNITE-14140: -- Summary: Checkpointer thread holds write lock too long Key: IGNITE-14140 URL: https://issues.apache.org/jira/browse/IGNITE-14140 Project: Ignite Issue Type: Bug Components: persistence Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov The free-list flush optimization can keep db-checkpoint-thread busy while it holds the write lock, which may block all transactions for several hundred milliseconds. {noformat} "db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000] java.lang.Thread.State: RUNNABLE at sun.misc.Unsafe.getObjectVolatile(Native Method) at java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130) at java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125) at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690) at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374) at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247) at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281) at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388) at
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) at java.lang.Thread.run(Thread.java:748) {noformat} We can reduce the time spent under the write lock by switching the optimization off before the lock is acquired and re-enabling it after the lock is released. The image confirms that all of the time is consumed by storing the metadata cache.
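The idea of switching the optimization off around the lock can be sketched as follows. This is a minimal model under stated assumptions: CachedFreeList and CheckpointSketch are hypothetical classes, not the Ignite PagesList or Checkpointer, and the on-heap cache is reduced to a simple queue of page ids.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical model of a free list with an on-heap cache in front of it. */
class CachedFreeList {
    private final Deque<Long> cache = new ArrayDeque<>();

    private volatile boolean cachingEnabled = true;

    /** Caches the page id while the optimization is on; otherwise it would be written directly (omitted). */
    synchronized void put(long pageId) {
        if (cachingEnabled)
            cache.add(pageId);
    }

    /** Drains the cache and returns the number of entries that had to be flushed. */
    synchronized int flush() {
        int flushed = cache.size();

        cache.clear();

        return flushed;
    }

    void setCachingEnabled(boolean enabled) {
        cachingEnabled = enabled;
    }
}

class CheckpointSketch {
    /** Returns how many entries still had to be flushed while the write lock was held. */
    static int checkpoint(CachedFreeList list, ReentrantReadWriteLock lock) {
        list.setCachingEnabled(false); // 1. Switch the optimization off before taking the lock.
        list.flush();                  // 2. The expensive drain happens outside the lock.

        int flushedUnderLock;

        lock.writeLock().lock();

        try {
            flushedUnderLock = list.flush(); // 3. Little or nothing is left to flush here.
        }
        finally {
            lock.writeLock().unlock();
        }

        list.setCachingEnabled(true);  // 4. Re-enable the optimization once the lock is released.

        return flushedUnderLock;
    }
}
```

In this model the work done under the write lock no longer depends on how many entries accumulated in the cache since the previous checkpoint.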
[jira] [Created] (IGNITE-14139) Incorrect initialize checkpoint-runner-cpu thread pool
Vladislav Pyatkov created IGNITE-14139: -- Summary: Incorrect initialize checkpoint-runner-cpu thread pool Key: IGNITE-14139 URL: https://issues.apache.org/jira/browse/IGNITE-14139 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov The first initialization of the checkpoint thread pool for CPU work is incorrect. Look at the constructor of {{CheckpointWorkflow}}. First, we initialize the pool: {code:java} this.checkpointCollectPagesInfoPool = initializeCheckpointPool(); {code} and only afterwards do we set the size of the pool: {code:java} this.checkpointCollectInfoThreads = checkpointCollectInfoThreads; {code} As a result, the pool is created before its configured size has been assigned.
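The ordering problem can be reproduced in isolation. The sketch below is a reduction, not the actual CheckpointWorkflow code: BuggyWorkflow and FixedWorkflow are hypothetical classes, and the pool is represented only by the size it would be created with.

```java
/** Hypothetical reduction of the constructor ordering problem. */
class BuggyWorkflow {
    /** Size the pool was actually created with. */
    final int poolSize;

    /** Configured number of threads. */
    final int collectInfoThreads;

    BuggyWorkflow(int collectInfoThreads) {
        this.poolSize = initializePool();             // Reads the field before it is assigned.
        this.collectInfoThreads = collectInfoThreads;
    }

    private int initializePool() {
        return collectInfoThreads; // Still holds the default value 0 at this point.
    }
}

/** Same class with the two assignments swapped. */
class FixedWorkflow {
    final int poolSize;

    final int collectInfoThreads;

    FixedWorkflow(int collectInfoThreads) {
        this.collectInfoThreads = collectInfoThreads; // Assign the size first.
        this.poolSize = initializePool();             // Now the pool sees the configured value.
    }

    private int initializePool() {
        return collectInfoThreads;
    }
}
```

The compiler does not flag the buggy version, because the read of the unassigned final field happens inside a method rather than directly in the constructor body.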
[jira] [Created] (IGNITE-14138) Historical rebalance kills cluster
Vladislav Pyatkov created IGNITE-14138: -- Summary: Historical rebalance kills cluster Key: IGNITE-14138 URL: https://issues.apache.org/jira/browse/IGNITE-14138 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} [2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Failed to continue supplying [grp=SQL_USAGES_EPE, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1 org.apache.ignite.IgniteCheckedException: Failed to continue supplying [grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]] at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571) [ignite-core.jar] at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398) [ignite-core.jar] at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489) [ignite-core.jar] at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474) [ignite-core.jar] at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142) [ignite-core.jar] at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591) [ignite-core.jar] at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109) [ignite-core.jar] at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1707) [ignite-core.jar] at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1721) [ignite-core.jar] at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:157) [ignite-core.jar] at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:3011) [ignite-core.jar] at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1662) [ignite-core.jar] at org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:157) [ignite-core.jar] at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1629) [ignite-core.jar] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:834) [?:?] Caused by: org.apache.ignite.IgniteCheckedException: Could not find start pointer for partition [part=4, partCntrSince=1115] at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchEarliestWalPointer(CheckpointHistory.java:557) ~[ignite-core.jar] at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.historicalIterator(GridCacheOffheapManager.java:1121) ~[ignite-core.jar] at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.rebalanceIterator(IgniteCacheOffheapManagerImpl.java:1195) ~[ignite-core.jar] at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:322) ~[ignite-core.jar] ... 
16 more {noformat} I believe that it should throw IgniteHistoricalIteratorException instead of IgniteCheckedException, so that the error can be handled properly and the rebalance can fall back to a full rebalance instead of killing nodes.
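The proposed handling can be sketched as follows: a dedicated exception type marks "history is unavailable", and only that type triggers the fallback. This is an illustration under assumptions. HistoryUnavailableFailure stands in for IgniteHistoricalIteratorException, and SupplierSketch with its methods is hypothetical and does not mirror GridDhtPartitionSupplier.

```java
/** Stand-in for a dedicated "history is unavailable" exception. */
class HistoryUnavailableFailure extends RuntimeException {
    HistoryUnavailableFailure(String msg) {
        super(msg);
    }
}

class SupplierSketch {
    enum Mode { HISTORICAL, FULL }

    /** Hypothetical lookup that fails with the dedicated type when the WAL pointer is missing. */
    static void searchEarliestWalPointer(boolean pointerPresent) {
        if (!pointerPresent)
            throw new HistoryUnavailableFailure("Could not find start pointer for partition");
    }

    /** Picks the rebalance mode; only the dedicated exception triggers the fallback. */
    static Mode handleDemand(boolean pointerPresent) {
        try {
            searchEarliestWalPointer(pointerPresent);

            return Mode.HISTORICAL;
        }
        catch (HistoryUnavailableFailure e) {
            // Recoverable: supply the whole partition instead of failing the node.
            return Mode.FULL;
        }
    }
}
```

Any other exception type still propagates, so genuinely critical errors keep reaching the failure handler.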
[jira] [Created] (IGNITE-14073) False alarm to lose all transaction nodes
Vladislav Pyatkov created IGNITE-14073: -- Summary: False alarm to lose all transaction nodes Key: IGNITE-14073 URL: https://issues.apache.org/jira/browse/IGNITE-14073 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov This exception is thrown when the primary node and one other node are lost during a transaction. However, the message may be untrue, because the transaction can continue on the backups (if they are still alive). {noformat} [2021-01-23 22:32:50,584][ERROR][test-runner-#1%near.IgniteTxExceptionNodeFailTest%][root] Transaction was not committed. class org.apache.ignite.IgniteException: Failed to commit a transaction (all partition owners have left the grid, partition data has been lost) [cacheName=default, partition=3, key=386050343] at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1096) at org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.commit(TransactionProxyImpl.java:323) at org.apache.ignite.internal.processors.cache.distributed.near.IgniteTxExceptionNodeFailTest.cacheWithBackups(IgniteTxExceptionNodeFailTest.java:280) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2367) at java.lang.Thread.run(Thread.java:748) Caused by: class org.apache.ignite.internal.processors.cache.CacheInvalidStateException:
Failed to commit a transaction (all partition owners have left the grid, partition data has been lost) [cacheName=default, partition=3, key=386050343] at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture$FinishMiniFuture.onNodeLeft(GridNearTxFinishFuture.java:993) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.onNodeLeft(GridNearTxFinishFuture.java:167) at org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:265) at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1393) at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:888) at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:873) at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:349) at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:312) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2948) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:3164) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2968) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) ... 1 more {noformat} This will alarm a client, because it looks like data loss.
[jira] [Created] (IGNITE-13977) Code enhancement after review of encryption persistent storage
Vladislav Pyatkov created IGNITE-13977: -- Summary: Code enhancement after review of encryption persistent storage Key: IGNITE-13977 URL: https://issues.apache.org/jira/browse/IGNITE-13977 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov # There are many hard-to-follow code snippets in `GridCacheOffheapManager` where the page type is chosen. # The CacheGroupReencryptionTest.testPhysicalRecoveryWithUpdates test is flaky when a checkpoint is triggered earlier than expected.
[jira] [Created] (IGNITE-13866) Validate index does not stop after process of control.sh was interrupted
Vladislav Pyatkov created IGNITE-13866: -- Summary: Validate index does not stop after process of control.sh was interrupted Key: IGNITE-13866 URL: https://issues.apache.org/jira/browse/IGNITE-13866 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov The validate index command might continue running in the cluster even after the command has been terminated abruptly. For example, we can press CTRL+C in the console to terminate an incorrect command invocation, but the command is not terminated in the cluster. As a result, we end up with several processes whose results are no longer needed.
[jira] [Created] (IGNITE-13864) Assertion error happens on stale latch's acknowledge
Vladislav Pyatkov created IGNITE-13864: -- Summary: Assertion error happens on stale latch's acknowledge Key: IGNITE-13864 URL: https://issues.apache.org/jira/browse/IGNITE-13864 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov There are several assertion errors in the TC logs; they are related to the exchange latch. It seems this happens because the latch manager does not handle stale acknowledgements. {noformat} {{[18:39:26]W: [org.gridgain:ignite-core] [2020-03-26 18:39:26,680][ERROR][sys-#53190%distributed.CacheLoadingConcurrentGridStartSelfTest2%][GridIoManager] An error occurred processing the message [msg=GridIoMessage [plc=2, topic=TOPIC_EXCHANGE, topicOrd=31, ordered=false, timeout=0, skipOnTimeout=false, msg=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.LatchAckMessage@779ce9da], nodeId=5bd19ec1-da96-41a1-a3e0-ebb55321]. [18:39:26]W: [org.gridgain:ignite-core] java.lang.AssertionError [18:39:26]W: [org.gridgain:ignite-core] at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.processAck(ExchangeLatchManager.java:399) [18:39:26]W: [org.gridgain:ignite-core] at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.lambda$new$0(ExchangeLatchManager.java:119) [18:39:26]W: [org.gridgain:ignite-core] at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1654) [18:39:26]W: [org.gridgain:ignite-core] at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1274) [18:39:26]W: [org.gridgain:ignite-core] at org.apache.ignite.internal.managers.communication.GridIoManager.access$4500(GridIoManager.java:145) [18:39:26]W: [org.gridgain:ignite-core] at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1159) [18:39:26]W: [org.gridgain:ignite-core] at
org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50) [18:39:26]W: [org.gridgain:ignite-core] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [18:39:26]W: [org.gridgain:ignite-core] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [18:39:26]W: [org.gridgain:ignite-core] at java.lang.Thread.run(Thread.java:748)}} {noformat}
[jira] [Created] (IGNITE-13594) Model classes require manual deserialization if used inside Job loaded by p2p
Vladislav Pyatkov created IGNITE-13594: -- Summary: Model classes require manual deserialization if used inside Job loaded by p2p Key: IGNITE-13594 URL: https://issues.apache.org/jira/browse/IGNITE-13594 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov After the fix in IGNITE-5038 ([GG-28146|https://ggsystems.atlassian.net/browse/GG-28146]), users can use model classes inside ComputeJobs, but if they want to do so they still need to change their code and add manual deserialization such as {{Object personVal = binaryVal.deserialize(testClsLdr);}}. I believe that we can do this under the hood and choose the proper class loader automatically.
[jira] [Created] (IGNITE-13593) IgniteClientCacheStartFailoverTest.testRebalanceStateConcurrentStart (Cache 2) is flaky
Vladislav Pyatkov created IGNITE-13593: -- Summary: IgniteClientCacheStartFailoverTest.testRebalanceStateConcurrentStart (Cache 2) is flaky Key: IGNITE-13593 URL: https://issues.apache.org/jira/browse/IGNITE-13593 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov [https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=749390831986783178=testDetails_IgniteTests24Java8=%3Cdefault%3E] The flaky rate is 14%. There are two kinds of failures in this test (according to TC): # An exception on the MVCC cache, because the test adds identical keys at the same moment. This exception will be fixed here. # An assertion error, because the cache size differs from the expected one. This behavior is difficult to reproduce and happens very rarely on TC. It will be fixed in another ticket if it appears again after this issue is closed. The reason this test is flaky is an exception on the MVCC cache: {noformat} javax.cache.CacheException: class org.apache.ignite.transactions.TransactionSerializationException: Cannot serialize transaction due to write conflict (transaction is marked for rollback) at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1265) at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.cacheException(IgniteCacheProxyImpl.java:2077) at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1313) at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:817) at org.apache.ignite.internal.processors.cache.IgniteClientCacheStartFailoverTest$8.call(IgniteClientCacheStartFailoverTest.java:399) at org.apache.ignite.internal.processors.cache.IgniteClientCacheStartFailoverTest$8.call(IgniteClientCacheStartFailoverTest.java:375) at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:87) Caused by: class org.apache.ignite.transactions.TransactionSerializationException: Cannot serialize transaction due to
write conflict (transaction is marked for rollback) at org.apache.ignite.internal.util.IgniteUtils$16.apply(IgniteUtils.java:1011) at org.apache.ignite.internal.util.IgniteUtils$16.apply(IgniteUtils.java:1009) ... 7 more Caused by: class org.apache.ignite.internal.transactions.IgniteTxSerializationCheckedException: Cannot serialize transaction due to write conflict (transaction is marked for rollback) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.serializationError(GridCacheMapEntry.java:7123) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.access$700(GridCacheMapEntry.java:136) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$MvccUpdateLockListener.apply(GridCacheMapEntry.java:5629) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$MvccUpdateLockListener.apply(GridCacheMapEntry.java:5482) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:407) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:355) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:343) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:520) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:498) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:464) at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl$LockFuture.run(MvccProcessorImpl.java:1952) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13501) AssertionError: Invalid value in testMergeServersFail1_8
Vladislav Pyatkov created IGNITE-13501: -- Summary: AssertionError: Invalid value in testMergeServersFail1_8 Key: IGNITE-13501 URL: https://issues.apache.org/jira/browse/IGNITE-13501 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov java.lang.AssertionError: Invalid value [node=distributed.CacheExchangeMergeTest0, client=false, order=1, cache=c6] expected:<1> but was: Reproduced by [1] [2] [1] https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-3187056670751319047=testDetails This covers the case of a one-phase commit transaction with zero backups. The reason for this issue is that the primary node fails while a near node is sending a request.
[jira] [Created] (IGNITE-13417) Cache Interceptors deserialization on client nodes
Vladislav Pyatkov created IGNITE-13417: -- Summary: Cache Interceptors deserialization on client nodes Key: IGNITE-13417 URL: https://issues.apache.org/jira/browse/IGNITE-13417 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov After the fix for https://issues.apache.org/jira/browse/IGNITE-1903, cache interceptors still don't work. It looks like we need to add @SerializeSeparately to this field too.
[jira] [Created] (IGNITE-13402) [Suite] PDS 3 flaky failed on TC
Vladislav Pyatkov created IGNITE-13402: -- Summary: [Suite] PDS 3 flaky failed on TC Key: IGNITE-13402 URL: https://issues.apache.org/jira/browse/IGNITE-13402 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} java.lang.AssertionError: Invalid topology version [topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], group=Group1] at org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.readyTopologyVersion(GridDhtPartitionTopologyImpl.java:317) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.nextVersion(GridCacheAdapter.java:3663) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2821) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2747) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:1090) at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:242) at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.lambda$body$0(GridCacheSharedTtlCleanupManager.java:178) at java.util.concurrent.ConcurrentHashMap.computeIfPresent(ConcurrentHashMap.java:1769) [2020-07-28 06:36:24,540][INFO ][exchange-worker-#38244%persistence.IgnitePdsContinuousRestartTestWithExpiryPolicy2%][FileWriteAheadLogManager] Resuming logging to WAL segment [file=/opt/buildagent/work/bde9b45ddb020b34/incubator-ignite/work/db/wal/persistence_IgnitePdsContinuousRestartTestWithExpiryPolicy2/.wal, offset=3573451, ver=2] at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:177) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) at java.lang.Thread.run(Thread.java:748) 
[2020-07-28 06:36:24,544][ERROR][ttl-cleanup-worker-#38230%persistence.IgnitePdsContinuousRestartTestWithExpiryPolicy2%][IgniteTestResources] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: GridWorker [name=ttl-cleanup-worker, igniteInstanceName=persistence.IgnitePdsContinuousRestartTestWithExpiryPolicy2, finished=true, heartbeatTs=1595907384509]]] class org.apache.ignite.IgniteException: GridWorker [name=ttl-cleanup-worker, igniteInstanceName=persistence.IgnitePdsContinuousRestartTestWithExpiryPolicy2, finished=true, heartbeatTs=1595907384509] at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1859) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1854) at org.apache.ignite.internal.worker.WorkersRegistry.onStopped(WorkersRegistry.java:168) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:152) at java.lang.Thread.run(Thread.java:748) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13379) Exception occur on SQL caches when client reconnect
Vladislav Pyatkov created IGNITE-13379: -- Summary: Exception occur on SQL caches when client reconnect Key: IGNITE-13379 URL: https://issues.apache.org/jira/browse/IGNITE-13379 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov When a client has started only a subset of the cluster's caches, it can have issues on reconnect. If a cache isn't started on the client, the client still registers some SQL structures for it: {{GridQueryProcessor#initQueryStructuresForNotStartedCache}} but these structures are not cleared on disconnect: {{GridCacheProcessor#onReconnected}} This leads to an exception on reconnect: {noformat} class org.apache.ignite.IgniteCheckedException: Type with name 'Timestamp' already indexed in cache 'TEST_CACHE2'. at org.apache.ignite.internal.processors.query.GridQueryProcessor.registerCache0(GridQueryProcessor.java:1712) at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStart0(GridQueryProcessor.java:834) at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStart(GridQueryProcessor.java:911) at org.apache.ignite.internal.processors.query.GridQueryProcessor.initQueryStructuresForNotStartedCache(GridQueryProcessor.java:889) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processCacheStartRequests(CacheAffinitySharedManager.java:968) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onCacheChangeRequest(CacheAffinitySharedManager.java:857) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest(GridDhtPartitionsExchangeFuture.java:1205) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:850) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3258) at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3104) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) at java.lang.Thread.run(Thread.java:748) {noformat}
[jira] [Created] (IGNITE-13377) WalModeChangeAdvancedSelfTest.testServerRestartNonCoordinator
Vladislav Pyatkov created IGNITE-13377: -- Summary: WalModeChangeAdvancedSelfTest.testServerRestartNonCoordinator Key: IGNITE-13377 URL: https://issues.apache.org/jira/browse/IGNITE-13377 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov
[jira] [Created] (IGNITE-13265) Historical iterator for atomic group should transfer few more rows than required
Vladislav Pyatkov created IGNITE-13265: -- Summary: Historical iterator for atomic group should transfer few more rows than required Key: IGNITE-13265 URL: https://issues.apache.org/jira/browse/IGNITE-13265 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov During a historical rebalance, updates are transferred from one node to another, and the same update may have a different order on different nodes. Reordering can only happen within a small interval, but the current implementation of the atomic protocol cannot avoid it entirely. This means we can reduce the probability of losing an update if we add a margin before the initial counter for the historical iterator on an atomic cache.
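The margin idea reduces to simple arithmetic on the starting counter. The sketch below is an illustration only: HistoricalIteratorMargin, startCounter and the margin parameter are hypothetical names, and the ticket does not specify how large the margin should be.

```java
/** Hypothetical helper that shifts the start of the historical iterator back by a safety margin. */
class HistoricalIteratorMargin {
    /**
     * @param demandedCntr Update counter the demander asked to start from.
     * @param margin Number of extra, possibly reordered, updates to transfer again.
     * @return Counter the historical iterator should actually start from, never below zero.
     */
    static long startCounter(long demandedCntr, long margin) {
        if (demandedCntr < 0 || margin < 0)
            throw new IllegalArgumentException("Counter and margin must be non-negative");

        return Math.max(0L, demandedCntr - margin);
    }
}
```

For example, with a demanded counter of 1115 and a margin of 10, the iterator would start from 1105 and re-send ten updates that the demander may already have.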
[jira] [Created] (IGNITE-13254) Historical rebalance iterator may skip checkpoint if it not contains updates
Vladislav Pyatkov created IGNITE-13254: -- Summary: Historical rebalance iterator may skip checkpoint if it not contains updates Key: IGNITE-13254 URL: https://issues.apache.org/jira/browse/IGNITE-13254 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov
[jira] [Created] (IGNITE-13253) Advanced heuristics for historical rebalance
Vladislav Pyatkov created IGNITE-13253: -- Summary: Advanced heuristics for historical rebalance Key: IGNITE-13253 URL: https://issues.apache.org/jira/browse/IGNITE-13253 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov Previously, the cluster decided which partitions should not be rebalanced by history based on their size. This threshold can be set through the system property IGNITE_PDS_WAL_REBALANCE_THRESHOLD. However, deciding which partitions will be rebalanced from the WAL only by their size is not accurate. The WAL can contain many more records than the partition holds (many updates to one key), in which case historical rebalance requires more data than a full transfer over the network. We need to implement a heuristic that can estimate the data size.
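One possible shape of such a heuristic is sketched below. It is an assumption, not the implemented design: it supposes that the difference between update counters approximates the number of WAL records to replay and that an average record size is known. RebalanceHeuristic and its methods are hypothetical names.

```java
/** Hypothetical heuristic that compares the estimated WAL volume with the full partition size. */
class RebalanceHeuristic {
    /** Estimates the bytes to replay from the WAL between two update counters. */
    static long estimateWalBytes(long fromCntr, long toCntr, long avgRecordBytes) {
        long records = Math.max(0L, toCntr - fromCntr);

        // multiplyExact fails loudly on overflow instead of silently producing a wrong estimate.
        return Math.multiplyExact(records, avgRecordBytes);
    }

    /** Prefers historical rebalance only when it is expected to move less data than a full transfer. */
    static boolean preferHistorical(long walBytes, long partitionBytes) {
        return walBytes < partitionBytes;
    }
}
```

A partition that received many updates to a few keys yields a large WAL estimate relative to its size, so the heuristic would choose a full rebalance for it.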
[jira] [Created] (IGNITE-13245) Rebalance future might hangs in no final state though all partitions are owned
Vladislav Pyatkov created IGNITE-13245: -- Summary: Rebalance future might hangs in no final state though all partitions are owned Key: IGNITE-13245 URL: https://issues.apache.org/jira/browse/IGNITE-13245 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov This is a very specific case: the supplier leaves the cluster and, at the same time, its partitions do not need rebalancing in the new topology. Look at my PR to understand it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13191) Public-facing API for "waiting for backups on shutdown"
Vladislav Pyatkov created IGNITE-13191: -- Summary: Public-facing API for "waiting for backups on shutdown" Key: IGNITE-13191 URL: https://issues.apache.org/jira/browse/IGNITE-13191 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov We should introduce a "should wait for backups on shutdown" flag in Ignition and/or IgniteConfiguration. Maybe we should do the same for the "cancel compute tasks" flag. Also make sure that we can shut down a node explicitly, overriding this flag but without JVM termination. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13168) Retrigger historical rebalance if it was cancelled in case WAL history is still available
Vladislav Pyatkov created IGNITE-13168: -- Summary: Retrigger historical rebalance if it was cancelled in case WAL history is still available Key: IGNITE-13168 URL: https://issues.apache.org/jira/browse/IGNITE-13168 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov If historical rebalance is cancelled, full rebalance will be unconditionally triggered on the PME that caused the cancellation (only outdated OWNING partitions can be rebalanced by history in the current implementation). We have to allow MOVING partitions to be historically rebalanced as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13072) Synchronization problems when different classloaders are used for deployment of same class
Vladislav Pyatkov created IGNITE-13072: -- Summary: Synchronization problems when different classloaders are used for deployment of same class Key: IGNITE-13072 URL: https://issues.apache.org/jira/browse/IGNITE-13072 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov If you concurrently deploy one class using different classloaders you can get error: {noformat} 2020-04-28 14:36:42.523[ERROR][sys-stripe-45-#46%GRID%GridNodeName%][o.a.i.i.m.d.GridDeploymentLocalStore] Found more than one active deployment for the same resource [cls=class org.some.class.old.InvokeIndexRemover, depMode=SHARED, dep=GridDeployment [ts=1588067100125, depMode=SHARED, clsLdr=org.some.class.factory.NodeClassLoader@14035d21, clsLdrId=85ab310c171-a9fad11c-9f8c-4d2a-8146-6c87254303e7, userVer=0, loc=true, sampleClsName=org.some.class.predicates.CompositePredicate, pendingUndeploy=false, undeployed=false, usage=0]] 2020-04-28 14:36:42.544[ERROR][sys-stripe-45-#46%GRID%GridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed to process message [senderId=f104e069-9d80-4202-b50a-b3dc1804ac89, msg=GridNearAtomicSingleUpdateRequest [key=KeyCacheObject [hasValBytes=true], super=GridNearAtomicSingleUpdateRequest [key=KeyCacheObject [hasValBytes=true], parent=GridNearAtomicAbstractSingleUpdateRequest [nodeId=null, futId=1376257, topVer=AffinityTopologyVersion [topVer=35, minorTopVer=0], parent=GridNearAtomicAbstractUpdateRequest [res=null, flags=] java.lang.AssertionError: null at org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.getDeployment(GridDeploymentLocalStore.java:203) at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getLocalDeployment(GridDeploymentManager.java:383) at org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager$CacheClassLoader.findClass(GridCacheDeploymentManager.java:802) at 
org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager$CacheClassLoader.loadClass(GridCacheDeploymentManager.java:794) at org.apache.ignite.internal.util.IgniteUtils.forName(IgniteUtils.java:8561) at org.apache.ignite.internal.MarshallerContextImpl.getClass(MarshallerContextImpl.java:374) at org.apache.ignite.internal.binary.BinaryContext.descriptorForTypeId(BinaryContext.java:700) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1757) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716) at org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:313) at org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:99) at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:82) at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9959) at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10017) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateInvokeRequest.finishUnmarshal(GridNearAtomicSingleUpdateInvokeRequest.java:200) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.unmarshall(GridCacheIoManager.java:1560) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:582) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:386) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:312) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:102) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:301) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:546) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) at java.lang.Thread.run(Thread.java:748) {noformat} Looks like we lack synchronization for modifying {{LocalDeploymentSpi.ldrRsrcs}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
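To illustrate the kind of synchronization IGNITE-13072 asks for, here is a minimal sketch that registers at most one active deployment per resource name, even when several class loaders race. It is not the fix applied in Ignite and does not model `LocalDeploymentSpi.ldrRsrcs`; `DeploymentRegistry` and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical registry; only illustrates an atomic check-then-register. */
final class DeploymentRegistry {
    /** Resource name -> class loader of the single active deployment. */
    private final ConcurrentMap<String, ClassLoader> active = new ConcurrentHashMap<>();

    /**
     * Registers {@code ldr} for the resource unless another loader won the race.
     * Returns the loader that is active for the resource after the call.
     */
    ClassLoader register(String rsrcName, ClassLoader ldr) {
        // putIfAbsent is atomic, so two concurrent callers cannot both become active.
        ClassLoader prev = active.putIfAbsent(rsrcName, ldr);

        return prev != null ? prev : ldr;
    }

    /** Number of resources that have an active deployment. */
    int size() {
        return active.size();
    }
}
```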
[jira] [Created] (IGNITE-12935) Disadvantages in log of historical rebalance
Vladislav Pyatkov created IGNITE-12935: -- Summary: Disadvantages in log of historical rebalance Key: IGNITE-12935 URL: https://issues.apache.org/jira/browse/IGNITE-12935 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov # Mention in the log only partitions for which there are no nodes that suit as historical supplier For these partitions, print minimal counter (since which we should perform historical rebalancing) with corresponding node and maximum reserved counter (since which cluster can perform historical rebalancing) with corresponding node. This will let us know: ## Whether history was reserved at all ## How much reserved history we lack to perform a historical rebalancing ## I see resulting output like this: Historical rebalancing wasn't scheduled for some partitions: History wasn't reserved for: [list of partitions and groups] History was reserved, but minimum present counter is less than maximum reserved: [[grp=GRP, part=ID, minCntr=cntr, minNodeId=ID, maxReserved=cntr, maxReservedNodeId=ID], ...] ## We can also aggregate previous message by (minNodeId) to easily find the exact node (or nodes) which were the reason of full rebalance. # Log results of reserveHistoryForExchange(). They can be compactly represented as mappings: (grpId -> checkpoint (id, timestamp)). For every group, also log message about why the previous checkpoint wasn't successfully reserved. There can be three reasons: ## Previous checkpoint simply isn't present in the history (the oldest is reserved) ## WAL reservation failure (call below returned false) {code:java} chpEntry = entry(cpTs);boolean reserved = cctx.wal().reserve(chpEntry.checkpointMark());// If checkpoint WAL history can't be reserved, stop searching. 
if (!reserved) break; {code} ## Checkpoint was marked as inapplicable for historical rebalancing {code:java} for (Integer grpId : new HashSet<>(groupsAndPartitions.keySet())) if (!isCheckpointApplicableForGroup(grpId, chpEntry)) groupsAndPartitions.remove(grpId); {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
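The aggregation by minNodeId proposed in IGNITE-12935 could look roughly like the sketch below. It is an illustration only: `HistoryLogAggregator` and `PartInfo` are hypothetical, and the field names simply follow the log message suggested in the ticket.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical aggregation of per-partition log entries; not Ignite code. */
final class HistoryLogAggregator {
    /** One entry of the proposed log line. */
    static final class PartInfo {
        final String grp;
        final int part;
        final String minNodeId;

        PartInfo(String grp, int part, String minNodeId) {
            this.grp = grp;
            this.part = part;
            this.minNodeId = minNodeId;
        }
    }

    /** Groups "grp:part" labels by the node that holds the minimal counter. */
    static Map<String, List<String>> byMinNode(List<PartInfo> infos) {
        return infos.stream().collect(Collectors.groupingBy(
            i -> i.minNodeId,
            TreeMap::new, // Sorted keys keep the log output stable between runs.
            Collectors.mapping(i -> i.grp + ":" + i.part, Collectors.toList())));
    }
}
```

A node that appears with many partitions in the result is a likely reason for the full rebalance.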
[jira] [Created] (IGNITE-12818) SoLinger is not set for reader-sockets in discovery
Vladislav Pyatkov created IGNITE-12818: -- Summary: SoLinger is not set for reader-sockets in discovery Key: IGNITE-12818 URL: https://issues.apache.org/jira/browse/IGNITE-12818 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} Thread [name="tcp-disco-client-message-worker-#29%DPL_GRID%DplGridNodeName%", id=543, state=RUNNABLE, blockCnt=0, waitCnt=109538] at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) at java.net.SocketOutputStream.write(SocketOutputStream.java:155) at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431) at sun.security.ssl.OutputRecord.write(OutputRecord.java:417) at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:879) at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:850) at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123) - locked sun.security.ssl.AppOutputStream@6e6441c6 at java.io.OutputStream.write(OutputStream.java:75) at o.a.i.spi.discovery.tcp.TcpDiscoverySpi.writeToSocket(TcpDiscoverySpi.java:1613) at o.a.i.spi.discovery.tcp.ServerImpl$ClientMessageWorker.processMessage(ServerImpl.java:7281) at o.a.i.spi.discovery.tcp.ServerImpl$ClientMessageWorker.processMessage(ServerImpl.java:7156) at o.a.i.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7538) at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120) at o.a.i.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7469) at o.a.i.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) Thread [name="grid-timeout-worker-#39%DPL_GRID%DplGridNodeName%", id=230, state=WAITING, blockCnt=49, waitCnt=902487] Lock [object=java.util.concurrent.locks.ReentrantLock$NonfairSync@7dcea545, ownerName=tcp-disco-client-message-worker-#29%DPL_GRID%DplGridNodeName%, ownerId=543] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:848) at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:720) at sun.security.ssl.SSLSocketImpl.sendAlert(SSLSocketImpl.java:2066) at sun.security.ssl.SSLSocketImpl.warning(SSLSocketImpl.java:1893) at sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1656) - locked sun.security.ssl.SSLSocketImpl@5c2090f8 at sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:1594) at o.a.i.i.util.IgniteUtils.closeQuiet(IgniteUtils.java:4089) at o.a.i.spi.discovery.tcp.TcpDiscoverySpi$SocketTimeoutObject.onTimeout(TcpDiscoverySpi.java:2462) at o.a.i.i.processors.timeout.GridSpiTimeoutObject.onTimeout(GridSpiTimeoutObject.java:42) at o.a.i.i.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:279) at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120) at java.lang.Thread.run(Thread.java:748) {noformat} SoLinger needs to be set for the socket obtained through `sock = srvrSock.accept();`, as is done in `TcpDiscoverySpi#createSocket`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
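As an illustration of the request in IGNITE-12818, the sketch below applies SO_LINGER to a socket returned by `ServerSocket.accept()` using the standard `java.net.Socket#setSoLinger` call. It is not the Ignite patch: the class, the method, and the linger value passed by the caller are hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper; shows SO_LINGER on an accepted (reader) socket. */
final class AcceptWithLinger {
    /**
     * Accepts one connection and enables SO_LINGER with the given timeout in seconds.
     * A negative {@code lingerSec} leaves the option at its default (disabled).
     */
    static Socket acceptWithLinger(ServerSocket srvSock, int lingerSec) throws IOException {
        Socket sock = srvSock.accept();

        try {
            if (lingerSec >= 0)
                sock.setSoLinger(true, lingerSec);
        }
        catch (IOException e) {
            // Do not leak the accepted socket if the option cannot be applied.
            sock.close();

            throw e;
        }

        return sock;
    }
}
```

With SO_LINGER enabled, `close()` blocks for at most the linger timeout while unsent data is flushed.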
[jira] [Created] (IGNITE-12780) Deadlock between db-checkpoint-thread and checkpoint-runner
Vladislav Pyatkov created IGNITE-12780: -- Summary: Deadlock between db-checkpoint-thread and checkpoint-runner Key: IGNITE-12780 URL: https://issues.apache.org/jira/browse/IGNITE-12780 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Look at this run: https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_PdsIndexing/5121878?buildTab=log=3 {noformat} "db-checkpoint-thread-#46926%db.IgniteSequentialNodeCrashRecoveryTest0%" #55580 prio=5 os_prio=0 tid=0x7efb2000c800 nid=0x77e waiting on condition [0x7eff31add000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.fillCacheGroupState(GridCacheDatabaseSharedManager.java:4367) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:4147) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3728) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3617) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) at java.lang.Thread.run(Thread.java:748) "checkpoint-runner-#46927%db.IgniteSequentialNodeCrashRecoveryTest0%" #55581 prio=5 os_prio=0 tid=0x7efbd4009000 nid=0x77f waiting on condition [0x7eff317da000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xe5c23ed8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at 
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1645) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init0(GridCacheOffheapManager.java:1688) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.fullSize(GridCacheOffheapManager.java:2061) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.lambda$fillCacheGroupState$1(GridCacheDatabaseSharedManager.java:4336) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer$$Lambda$565/253081186.run(Unknown Source) at org.apache.ignite.internal.util.IgniteUtils.lambda$wrapIgniteFuture$3(IgniteUtils.java:11392) at org.apache.ignite.internal.util.IgniteUtils$$Lambda$561/471384364.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12689) Partitions should become owned after a checkpoint, regardless of a topology change. Nevertheless a rebalance is not required.
Vladislav Pyatkov created IGNITE-12689: -- Summary: Partitions should become owned after a checkpoint, regardless of a topology change. Nevertheless a rebalance is not required. Key: IGNITE-12689 URL: https://issues.apache.org/jira/browse/IGNITE-12689 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov After the checkpoint completes, we try to own all partitions of the rebalanced cache (see WalStateManager#onGroupRebalanceFinished): {code} cpFut.futureFor(FINISHED).listen(new IgniteInClosureX() { @Override public void applyx(IgniteInternalFuture future) { if (X.hasCause(future.error(), NodeStoppingException.class)) return; for (Integer grpId0 : groupsToEnable) { try { cctx.database().walEnabled(grpId0, true, true); } catch (Exception e) { if (!X.hasCause(e, NodeStoppingException.class)) throw e; } CacheGroupContext grp = cctx.cache().cacheGroup(grpId0); if (grp != null) grp.topology().ownMoving(lastGroupTop); else if (log.isDebugEnabled()) log.debug("Cache group was destroyed before checkpoint finished, [grpId=" + grpId0 + ']'); } if (log.isDebugEnabled()) log.debug("Refresh partitions due to rebalance finished"); // Trigger exchange for switching to ideal assignment when all nodes are ready. cctx.exchange().refreshPartitions(); } }); {code} But if the topology changes while the checkpoint is in progress, we need to invoke rebalance manually (see GridDhtPartitionTopologyImpl#ownMoving): {code} if (lastAffChangeVer.compareTo(rebFinishedTopVer) > 0) { if (log.isInfoEnabled()) { log.info("Affinity topology changed, no MOVING partitions will be owned " + "[rebFinishedTopVer=" + rebFinishedTopVer + ", lastAffChangeVer=" + lastAffChangeVer + "]"); } {code} This will hardly ever happen, but if it does, the whole rebalance is restarted (over all partitions). I advise starting rebalance only when it is needed and marking partitions as owned when it is definitely not needed (when the topology change does not affect the assignment). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12671) Update of partition's states can stuck when rebalance completed during exchange
Vladislav Pyatkov created IGNITE-12671: -- Summary: Update of partition's states can stuck when rebalance completed during exchange Key: IGNITE-12671 URL: https://issues.apache.org/jira/browse/IGNITE-12671 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov A single message is ignored during exchange: {code:java|GridCachePartitionExchangeManager.java} if (exchangeInProgress()) { if (log.isInfoEnabled()) log.info("Ignore single message without exchange id (there is exchange in progress) [nodeId=" + node.id() + "]"); return; } {code} Because of this, the message is not processed after the exchange either. As a result, waiting for the ideal assignment is stuck until the next rebalance. -- This message was sent by Atlassian Jira (v8.3.4#803005)
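One way to avoid losing such a message, shown only as a sketch and not as the fix made for IGNITE-12671, is to queue single messages that arrive during an exchange and process them once the exchange is done. `DeferredSingleMessages` and all of its methods are hypothetical, and a message is reduced to a node ID string for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical handler that defers messages instead of dropping them. */
final class DeferredSingleMessages {
    /** Messages received while an exchange was in progress. */
    private final Queue<String> pending = new ArrayDeque<>();

    /** Messages that were actually processed, in processing order. */
    private final List<String> processed = new ArrayList<>();

    private boolean exchangeInProgress;

    synchronized void onExchangeStart() {
        exchangeInProgress = true;
    }

    synchronized void onSingleMessage(String nodeId) {
        if (exchangeInProgress)
            pending.add(nodeId); // Keep it for later instead of ignoring it.
        else
            processed.add(nodeId);
    }

    synchronized void onExchangeDone() {
        exchangeInProgress = false;

        String nodeId;

        // Drain everything that arrived during the exchange.
        while ((nodeId = pending.poll()) != null)
            processed.add(nodeId);
    }

    synchronized List<String> processed() {
        return new ArrayList<>(processed);
    }
}
```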
[jira] [Created] (IGNITE-12522) Extend test coverage [IGNITE-12104] Check deployment from cache before to load it from local or version storage
Vladislav Pyatkov created IGNITE-12522: -- Summary: Extend test coverage [IGNITE-12104] Check deployment from cache before to load it from local or version storage Key: IGNITE-12522 URL: https://issues.apache.org/jira/browse/IGNITE-12522 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12290) Re-balance fully restart for case when WAL disabled
Vladislav Pyatkov created IGNITE-12290: -- Summary: Re-balance fully restart for case when WAL disabled Key: IGNITE-12290 URL: https://issues.apache.org/jira/browse/IGNITE-12290 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Re-balance restarts on any topology event. When WAL is disabled, the new re-balance clears all already re-balanced partitions and starts over. Data about re-balanced partitions should be stored and carried over when re-balance is cancelled and started again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12104) Check deployment from cache before to load it from local or version storage
Vladislav Pyatkov created IGNITE-12104: -- Summary: Check deployment from cache before to load it from local or version storage Key: IGNITE-12104 URL: https://issues.apache.org/jira/browse/IGNITE-12104 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov Fix For: 2.8 {noformat} "pub-#3217917%DPL_GRID%DplGridNodeName%" #3223897 prio=5 os_prio=0 tid=0x7f47a414f800 nid=0x1dca46 runnable [0x7eaca31b] java.lang.Thread.State: RUNNABLE at java.lang.String.concat(String.java:2034) at java.net.URLClassLoader$1.run(URLClassLoader.java:364) at java.net.URLClassLoader$1.run(URLClassLoader.java:362) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:361) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) - locked <0x7f4c8dd6c888> (a java.lang.Object) at java.lang.ClassLoader.loadClass(ClassLoader.java:411) - locked <0x7f4c8db4f530> (a java.lang.Object) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) at java.lang.ClassLoader.loadClass(ClassLoader.java:411) - locked <0x7f4ba0138340> (a com.sbt.core.envelope.container.loader.NamedClassLoader) at java.lang.ClassLoader.loadClass(ClassLoader.java:411) - locked <0x7f4ba012a800> (a com.sbt.core.envelope.container.loader.ImplClassLoader) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.getDeployment(GridDeploymentLocalStore.java:191) at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getGlobalDeployment(GridDeploymentManager.java:462) at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:983) at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1921) at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (IGNITE-11844) Should to filtered indexes by cache name instead of validate all caches in group
Vladislav Pyatkov created IGNITE-11844: -- Summary: Should to filtered indexes by cache name instead of validate all caches in group Key: IGNITE-11844 URL: https://issues.apache.org/jira/browse/IGNITE-11844 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov The validate_indexes method of the control.sh utility checks all indexes of all caches in a group. If you specify just one cache (from a shared group) in the caches list, all indexes of all caches in that group will be validated, and this can take more time than checking the indexes of the specified caches only. It would be correct to validate only the indexes of the specified caches; for this purpose, caches in a shared group need to be filtered by the list from the parameters. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
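The filtering described in IGNITE-11844 amounts to intersecting the caches of a shared group with the caches named in the command parameters. A minimal sketch, assuming that an empty parameter list means "validate everything"; `ValidateIndexesFilter` and `cachesToValidate` are hypothetical names, not part of control.sh.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical filter for caches of one shared cache group. */
final class ValidateIndexesFilter {
    /**
     * Returns the caches of the group whose indexes should be validated.
     * An empty or null request is treated as "all caches of the group".
     */
    static Set<String> cachesToValidate(Collection<String> grpCaches, Collection<String> requested) {
        // LinkedHashSet keeps the group's cache order for predictable output.
        Set<String> res = new LinkedHashSet<>(grpCaches);

        if (requested != null && !requested.isEmpty())
            res.retainAll(requested);

        return res;
    }
}
```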
[jira] [Created] (IGNITE-11834) Confusing message on rebalance
Vladislav Pyatkov created IGNITE-11834: -- Summary: Confusing message on rebalance Key: IGNITE-11834 URL: https://issues.apache.org/jira/browse/IGNITE-11834 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov When rebalance is scheduled, a message like the following is printed for caches: {noformat} Rebalancing scheduled [order=[c8], top=AffinityTopologyVersion [topVer=6, minorTopVer=0], force=true, evt=NODE_JOINED, node=9b5ff0c4-cfd7-489d-a02d-470342d5] {noformat} but the force flag ({{force=true}}) does not mean that this is a forced rebalance. I suggest logging the force flag according to the value of {{forcePreload}} and renaming the existing flag, for example to {{exchangeRebalance}} or {{exchange}}, according to its meaning. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11763) GridP2PComputeWithNestedEntryProcessorTest failed on TC
Vladislav Pyatkov created IGNITE-11763: -- Summary: GridP2PComputeWithNestedEntryProcessorTest failed on TC Key: IGNITE-11763 URL: https://issues.apache.org/jira/browse/IGNITE-11763 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Test failed with exception: {noformat} [2019-04-16 19:50:21,725][ERROR][main][root] Test failed. javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Failed to execute query on node [query=GridCacheQueryBean [qry=GridCacheQueryAdapter [type=SCAN, clsName=null, clause=null, filter=org.apache.ignite.tests.p2p.pedicates.CompositePredicate@2b939b78, transform=null, part=null, incMeta=false, metrics=GridCacheQueryMetricsAdapter [minTime=9223372036854775807, maxTime=0, sumTime=0, avgTime=0.0, execs=0, completed=0, fails=0], pageSize=1024, timeout=0, incBackups=false, forceLocal=false, dedup=false, prj=null, keepBinary=true, subjId=008694d2-98a2-4add-9ccc-b7674e6d717f, taskHash=0, mvccSnapshot=null, dataPageScanEnabled=null], rdc=null, trans=null], nodeId=8575809f-3373-4c47-8684-a318c221] at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1318) at org.apache.ignite.internal.processors.cache.query.GridCacheQueryFutureAdapter.next(GridCacheQueryFutureAdapter.java:168) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$5.onHasNext(GridCacheDistributedQueryManager.java:643) at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.hasNextX(GridCloseableIteratorAdapter.java:53) at org.apache.ignite.internal.util.lang.GridIteratorAdapter.hasNext(GridIteratorAdapter.java:45) at org.apache.ignite.internal.processors.cache.QueryCursorImpl.getAll(QueryCursorImpl.java:123) at org.apache.ignite.p2p.GridP2PComputeWithNestedEntryProcessorTest.scanByCopositeFirstPredicate(GridP2PComputeWithNestedEntryProcessorTest.java:205) at 
org.apache.ignite.p2p.GridP2PComputeWithNestedEntryProcessorTest.scnaCacheData(GridP2PComputeWithNestedEntryProcessorTest.java:188) at org.apache.ignite.p2p.GridP2PComputeWithNestedEntryProcessorTest.processTest(GridP2PComputeWithNestedEntryProcessorTest.java:140) at org.apache.ignite.p2p.GridP2PComputeWithNestedEntryProcessorTest.testContinuousMode(GridP2PComputeWithNestedEntryProcessorTest.java:105) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2044) at java.lang.Thread.run(Thread.java:748) Caused by: class org.apache.ignite.IgniteCheckedException: Failed to execute query on node [query=GridCacheQueryBean [qry=GridCacheQueryAdapter [type=SCAN, clsName=null, clause=null, filter=org.apache.ignite.tests.p2p.pedicates.CompositePredicate@2b939b78, transform=null, part=null, incMeta=false, metrics=GridCacheQueryMetricsAdapter [minTime=9223372036854775807, maxTime=0, sumTime=0, avgTime=0.0, execs=0, completed=0, fails=0], pageSize=1024, timeout=0, incBackups=false, forceLocal=false, dedup=false, prj=null, keepBinary=true, subjId=008694d2-98a2-4add-9ccc-b7674e6d717f, taskHash=0, mvccSnapshot=null, dataPageScanEnabled=null], rdc=null, trans=null], 
nodeId=8575809f-3373-4c47-8684-a318c221] at org.apache.ignite.internal.processors.cache.query.GridCacheQueryFutureAdapter.onPage(GridCacheQueryFutureAdapter.java:384) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager.processQueryResponse(GridCacheDistributedQueryManager.java:402) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager.access$000(GridCacheDistributedQueryManager.java:64) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$1.apply(GridCacheDistributedQueryManager.java:94) at
[jira] [Created] (IGNITE-11734) IgniteCache.replace(k, v, nv) requires classes when element is null or old value - null
Vladislav Pyatkov created IGNITE-11734: -- Summary: IgniteCache.replace(k, v, nv) requires classes when element is null or old value - null Key: IGNITE-11734 URL: https://issues.apache.org/jira/browse/IGNITE-11734 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11698) Issue with P2P class loader
Vladislav Pyatkov created IGNITE-11698: -- Summary: Issue with P2P class loader Key: IGNITE-11698 URL: https://issues.apache.org/jira/browse/IGNITE-11698 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Sometimes classes of a remote query filter are loaded incorrectly. {noformat} Exception in thread "main" javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Failed to execute query on node [query=GridCacheQueryBean [qry=GridCacheQueryAdapter [type=SCAN, clsName=null, clause=null, filter=CompositePredicate@7ba93755, transform=null, part=null, incMeta=false, metrics=GridCacheQueryMetricsAdapter [minTime=9223372036854775807, maxTime=0, sumTime=0, avgTime=0.0, execs=0, completed=0, fails=0], pageSize=1024, timeout=0, incBackups=false, forceLocal=false, dedup=false, prj=null, keepBinary=false, subjId=f4870536-0f68-4e19-a87c-3862cbd30497, taskHash=0, mvccSnapshot=null, dataPageScanEnabled=null], rdc=null, trans=null], nodeId=40a03665-a203-4dc0-9a79-9aaede7a5dfa] at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1318) at org.apache.ignite.internal.processors.cache.query.GridCacheQueryFutureAdapter.next(GridCacheQueryFutureAdapter.java:168) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$5.onHasNext(GridCacheDistributedQueryManager.java:643) at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.hasNextX(GridCloseableIteratorAdapter.java:53) at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:38) at org.apache.ignite.internal.util.lang.GridIteratorAdapter.next(GridIteratorAdapter.java:35) at org.apache.ignite.internal.processors.cache.AutoClosableCursorIterator.next(AutoClosableCursorIterator.java:59) at ClientP2P.query(ClientP2P.java:61) at ClientP2P.main(ClientP2P.java:45) Caused by: class org.apache.ignite.IgniteCheckedException: Failed to execute query on node [query=GridCacheQueryBean 
[qry=GridCacheQueryAdapter [type=SCAN, clsName=null, clause=null, filter=CompositePredicate@7ba93755, transform=null, part=null, incMeta=false, metrics=GridCacheQueryMetricsAdapter [minTime=9223372036854775807, maxTime=0, sumTime=0, avgTime=0.0, execs=0, completed=0, fails=0], pageSize=1024, timeout=0, incBackups=false, forceLocal=false, dedup=false, prj=null, keepBinary=false, subjId=f4870536-0f68-4e19-a87c-3862cbd30497, taskHash=0, mvccSnapshot=null, dataPageScanEnabled=null], rdc=null, trans=null], nodeId=40a03665-a203-4dc0-9a79-9aaede7a5dfa] at org.apache.ignite.internal.processors.cache.query.GridCacheQueryFutureAdapter.onPage(GridCacheQueryFutureAdapter.java:384) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager.processQueryResponse(GridCacheDistributedQueryManager.java:402) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager.access$000(GridCacheDistributedQueryManager.java:64) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$1.apply(GridCacheDistributedQueryManager.java:94) at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$1.apply(GridCacheDistributedQueryManager.java:92) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1126) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1691) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1561) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:127) at 
org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2753) at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1521) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:127) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1490) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: class org.apache.ignite.IgniteException: BinaryPredicateFirst
[jira] [Created] (IGNITE-11643) Optimize GC pressure on GridDhtPartitionTopologyImpl#updateRebalanceVersion
Vladislav Pyatkov created IGNITE-11643: -- Summary: Optimize GC pressure on GridDhtPartitionTopologyImpl#updateRebalanceVersion Key: IGNITE-11643 URL: https://issues.apache.org/jira/browse/IGNITE-11643 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov There is a redundant HashMap in the method {{GridDhtPartitionTopologyImpl#updateRebalanceVersion}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11474) Add possibility to run idle_verify in not idle cluster
Vladislav Pyatkov created IGNITE-11474: -- Summary: Add possibility to run idle_verify in not idle cluster Key: IGNITE-11474 URL: https://issues.apache.org/jira/browse/IGNITE-11474 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov We are able to put the cluster into a kind of READ_ONLY mode that blocks all data load. Using this mode, we should add a specific parameter for idle_verify that blocks data load and continues the task once the cluster has switched to READ_ONLY.
[jira] [Created] (IGNITE-11425) Log information about inaccessible nodes through Communication
Vladislav Pyatkov created IGNITE-11425: -- Summary: Log information about inaccessible nodes through Communication Key: IGNITE-11425 URL: https://issues.apache.org/jira/browse/IGNITE-11425 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov If obtaining a communication TCP client takes a long time (longer than CONNECTION_ESTABLISH_THRESHOLD_MS = 100), the following message is printed: {noformat} [sys-#20167%dht.CacheGetReadFromBackupFailoverTest0%][TcpCommunicationSpi] TCP client created [client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=3, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=dht.CacheGetReadFromBackupFailoverTest0, finished=false, heartbeatTs=1550512236151, hashCode=140561231, interrupted=false, runner=grid-nio-worker-tcp-comm-3-#20147%dht.CacheGetReadFromBackupFailoverTest0%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=8a660330-6ddb-4031-b955-4cb4f4b2, addrs=ArrayList [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=5, intOrder=4, lastExchangeTime=1550512235890, loc=false, ver=2.8.0#20190218-sha1:29232e37, isClient=false], connected=false, connectCnt=2, queueLimit=4096, reserveCnt=2, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=8a660330-6ddb-4031-b955-4cb4f4b2, addrs=ArrayList [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=5, intOrder=4, lastExchangeTime=1550512235890, loc=false, ver=2.8.0#20190218-sha1:29232e37, isClient=false], connected=false, connectCnt=2, queueLimit=4096, 
reserveCnt=2, pairedConnections=false], super=GridNioSessionImpl [locAddr=/127.0.0.1:38770, rmtAddr=/127.0.0.1:45212, createTime=1550512236151, closeTime=0, bytesSent=0, bytesRcvd=0, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1550512236151, lastSndTime=1550512236151, lastRcvTime=1550512236151, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@d240a48, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]], super=GridAbstractCommunicationClient [lastUsed=1550512236151, closed=false, connIdx=0]], duration=211ms] {noformat} But in some cases we cannot get a client before the timeout expires, and the message is reduced to: TCP client created [client=null, duration=60004 ms] From this message you cannot tell which nodes were inaccessible. Moreover, we want to see the connection trouble earlier than 10 minutes later. We should log the IP/host so that it is clear which node it was, and log a WARN message each time the timeout needs to be increased: {code} if (lastWaitingTimeout < 6) lastWaitingTimeout *= 2; {code}
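The logging proposed in IGNITE-11425 could look roughly like the sketch below. It is only an illustration under stated assumptions: the class `ConnectBackoff`, its fields, the cap `maxTimeoutMs` and the message text are hypothetical and are not taken from TcpCommunicationSpi. It shows the idea of emitting a WARN line with the remote host and port every time the waiting timeout is doubled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not Ignite code): doubles a wait timeout and records a WARN line per increase. */
final class ConnectBackoff {
    /** Collected warnings; a real implementation would call the node logger instead. */
    final List<String> warnings = new ArrayList<>();

    /** Assumed upper bound for the timeout, in milliseconds. */
    private final long maxTimeoutMs;

    /** Current timeout, in milliseconds. */
    private long timeoutMs;

    ConnectBackoff(long initialTimeoutMs, long maxTimeoutMs) {
        this.timeoutMs = initialTimeoutMs;
        this.maxTimeoutMs = maxTimeoutMs;
    }

    /** Doubles the timeout (capped at the maximum) and warns with the remote address. */
    long increase(String host, int port) {
        if (timeoutMs < maxTimeoutMs) {
            timeoutMs = Math.min(timeoutMs * 2, maxTimeoutMs);

            warnings.add("WARN Still waiting for TCP client, timeout increased [rmtAddr="
                + host + ':' + port + ", timeout=" + timeoutMs + "ms]");
        }

        return timeoutMs;
    }
}
```

With such a helper the address of the unreachable node appears in the log at the first increase, rather than only after the final timeout.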
[jira] [Created] (IGNITE-11291) Assertion error in time of rebalance completion lead to to critical failure node
Vladislav Pyatkov created IGNITE-11291: -- Summary: Assertion error in time of rebalance completion lead to to critical failure node Key: IGNITE-11291 URL: https://issues.apache.org/jira/browse/IGNITE-11291 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov {noformat} java.lang.AssertionError: Got removed exception on entry with dht local candidate: [IgniteTxEntry [key=KeyCacheObjectImpl [part=11859, val=3338011748811769508, hasValBytes=true], cacheId=-313938805, txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=11859, val=3338011748811769508, hasValBytes=true], cacheId=-313938805], val=[op=UPDATE, val=com.sbt.bm.ucp.common.dpl.model.party.hashcodes.DPartyHashCode_DPL_PROXY [idHash=1985166276, hash=1530579783, colocationKey=11859, sync Flag=null, hashUpdateTime=Sat Feb 09 20:16:55 MSK 2019, lastChangeDate=1549732615042, partition_DPL_id=4, ownerId=ucp, externalSystem=21, serializedHashCodesMap={"Address":["581a1367d4f50172c41168747c99105df1d3a4c6","24d3413bc5d3151dd784e088eaf4603e8e86c40b","fd7ce1449cd539a5976d358fb15060820c630f36"],"BirthDate":["a5cc7c1ac9821a6b19a455ae7eac55f4a4475bd7","415fe3810f8c4555a7188fe62275b34cdd1384cd","1311875998ef2aaccc5860ac6f991107ad4fa558"],"BirthPlace":["2fb5ea4cf7e9cfbdf77baae18f26804a10f51d57","76ddc5f2d959d27044c180957d9d0aed7369c0a2"],"Gender":["ad99678bb0e6de91d6a2ce620ba4fa98206aa9b9"],"IndividualIdentification":["c5b73f0087461fa4b980362344d38afddd1677b6","157952428fbe8c5b80da4db12c945cb4ad1f33f8","4d2257f62b1aa9cf290728d147776dd569b226a7"],"IndividualName":["256dba12a3b9d95dc1ada524d1a67cb590eb3ec2","2302eda39fb8f3751f3646542ad54ae717f5bc2c","9b8084de661530d5dc6ea140df793f4aa417a114"],"PartyToPartyGroup":["8d41984edf2e9ff916da0a74b884554cc2fceca9"],"PhoneNumber":["c89cbfb741ba5ea0fa13091a1cc7591c69374c0c","149081529b36fe29bc72bf88c66e51bdab3ae2d7","3c6bc9cc9e47ecd2f79b1506a22799ba69b95227","389830cd794fd6748f46ab7d8d878a2fec75cf63","9d5c3ecd5c951a16745cefd771b79c17bbc8c665","dfd8959391fd18e89505bfde31e0aa9a8051
3fec"],"Individual":["82dc325513d53990709bafbf1a334b602025ba17","544aab947d3a33787c7feffc93c4a4572e957dca","2728a40ce9759fa7110bc5d0d3c8b5c193210a2d"]}, uid=null, isDeleted=false, isImmutable=false, checksum=null, id=3338011748811769508, externalClientId=75662144, colocationId=1216693183554769678]], prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=11859, val=3338011748811769508, hasValBytes=true], val=null, startVer=1551196730698, ver=GridCacheVersion [topVer=156972865, order=1546089814280, nodeOrder=65], hash=777160356, extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion [topVer=2147483647, order=0, nodeOrder=0]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=11859, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0, serReadVer=GridCacheVersion[topVer=156972865, order=1546089814280, nodeOrder=65], xidVer=null]] at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.checkReadConflict(GridDhtTxPrepareFuture.java:1164) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare0(GridDhtTxPrepareFuture.java:1223) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.access$000(GridDhtTxPrepareFuture.java:109) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture$2.apply(GridDhtTxPrepareFuture.java:701) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture$2.apply(GridDhtTxPrepareFuture.java:696) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtForceKeysFuture.onDone(GridDhtForceKeysFuture.java:153) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtForceKeysFuture.onDone(GridDhtForceKeysFuture.java:69) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) at
[jira] [Created] (IGNITE-11270) Batch join to topology
Vladislav Pyatkov created IGNITE-11270: -- Summary: Batch join to topology Key: IGNITE-11270 URL: https://issues.apache.org/jira/browse/IGNITE-11270 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov On the first cluster start, many nodes try to join at once. This leads to many time-consuming join processes (TcpDiscoveryJoinRequestMessage -> TcpDiscoveryNodeAddedMessage -> TcpDiscoveryNodeAddFinishedMessage). As a result, assembling the topology takes too much time. We can merge several TcpDiscoveryJoinRequestMessage messages and join the nodes to the topology as one batch.
[jira] [Created] (IGNITE-11269) Optimize node join to topology
Vladislav Pyatkov created IGNITE-11269: -- Summary: Optimize node join to topology Key: IGNITE-11269 URL: https://issues.apache.org/jira/browse/IGNITE-11269 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov When the coordinator has received a TcpDiscoveryJoinRequestMessage and sent the corresponding TcpDiscoveryNodeAddedMessage, it should not process newly received TcpDiscoveryJoinRequestMessage messages until the first joining node has completely joined (TcpDiscoveryNodeAddFinishedMessage has been sent). This solution allows nodes to join the topology faster, without blocking the ring with huge TcpDiscoveryNodeAddedMessage messages.
[jira] [Created] (IGNITE-11262) Compression on Discovery data bag
Vladislav Pyatkov created IGNITE-11262: -- Summary: Compression on Discovery data bag Key: IGNITE-11262 URL: https://issues.apache.org/jira/browse/IGNITE-11262 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov The size of GridComponent data may increase significantly in large deployments. Examples: 1) With more than 3K caches with QueryEntry configured, the {{GridCacheProcessor}} part of the {{DiscoveryDataBag}} takes more than 20 MB. 2) If the cluster contains more than 13K objects, the {{GridMarshallerMappingProcessor}} data is more than 1 MB. 3) In a cluster with more than 3K types in binary format, the {{CacheObjectBinaryProcessorImpl}} data can grow to 10 MB. In most cases the data contains duplicated structure, so simple zip compression can reduce its size considerably.
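To illustrate the point made in IGNITE-11262 about duplicated structure, here is a small self-contained sketch that uses the standard `java.util.zip` DEFLATE classes. The class `DataBagCompression` and the sample payload are hypothetical; the sketch does not show how Ignite serializes or transfers its discovery data bag.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper (not Ignite code): DEFLATE round trip for a serialized payload. */
final class DataBagCompression {
    /** Compresses the given bytes with DEFLATE at the fastest level. */
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(data);
        deflater.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];

        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));

        deflater.end();

        return out.toByteArray();
    }

    /** Restores the original bytes; fails on truncated or corrupted input. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];

        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);

                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated or invalid data");

                out.write(buf, 0, n);
            }
        }
        catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        }
        finally {
            inflater.end();
        }

        return out.toByteArray();
    }

    /** Builds an artificial payload with a repeated structure, similar in spirit to many cache descriptors. */
    static byte[] samplePayload(int entries) {
        StringBuilder sb = new StringBuilder();

        for (int i = 0; i < entries; i++)
            sb.append("Entry[keyType=java.lang.Long, valueType=com.example.Value").append(i).append("];");

        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

On a payload like `samplePayload(1000)` the compressed form is a small fraction of the original, which is the effect the issue expects for real data bags. The actual ratio for Ignite data would have to be measured.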
[jira] [Created] (IGNITE-11120) Remove static fields from GridDhtLockFuture
Vladislav Pyatkov created IGNITE-11120: -- Summary: Remove static fields from GridDhtLockFuture Key: IGNITE-11120 URL: https://issues.apache.org/jira/browse/IGNITE-11120 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov {code} /** Logger reference. */ private static final AtomicReference<IgniteLogger> logRef = new AtomicReference<>(); /** Logger. */ private static IgniteLogger log; /** Logger. */ private static IgniteLogger msgLog; {code} With these static fields we can miss log messages when a node is restarted without restarting the JVM.
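The problem described in IGNITE-11120 can be reproduced in miniature. The sketch below assumes that the static reference is set only once per JVM (here via `compareAndSet`), so a restarted node keeps using the logger of the stopped one. All class names are hypothetical stand-ins, not Ignite classes.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo (not Ignite code): a static logger reference outlives a node restart. */
final class StaticLoggerPitfall {
    /** Stand-in for a per-node logger. */
    static final class NodeLogger {
        final String nodeName;

        NodeLogger(String nodeName) {
            this.nodeName = nodeName;
        }
    }

    /** Static variant: the reference is initialized once per JVM and never replaced. */
    static final class StaticFuture {
        private static final AtomicReference<NodeLogger> logRef = new AtomicReference<>();

        StaticFuture(NodeLogger nodeLog) {
            // Only the first caller wins; later node instances are ignored.
            logRef.compareAndSet(null, nodeLog);
        }

        String loggerNode() {
            return logRef.get().nodeName;
        }
    }

    /** Instance variant: every future uses the logger of the node that created it. */
    static final class InstanceFuture {
        private final NodeLogger log;

        InstanceFuture(NodeLogger nodeLog) {
            this.log = nodeLog;
        }

        String loggerNode() {
            return log.nodeName;
        }
    }
}
```

After a simulated restart, `StaticFuture` still reports the logger of the first node, while `InstanceFuture` reports the current one.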
[jira] [Created] (IGNITE-11023) Processing data bag on GridMarshallerMappingProcessor consume many time
Vladislav Pyatkov created IGNITE-11023: -- Summary: Processing data bag on GridMarshallerMappingProcessor consume many time Key: IGNITE-11023 URL: https://issues.apache.org/jira/browse/IGNITE-11023 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov I have measure a processing data bag time on each join node and discovered what GridMarshallerMappingProcessor consume more time then others. It slow down on collecting topology, in particular case if joining some node simultaneous. {noformat} 2019-01-11 20:35:01.207 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Starting processing discovery data bag 2019-01-11 20:35:01.207 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component ClusterProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.207 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component IgnitePluginProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.208 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component CacheObjectBinaryProcessorImpl processed joining node data bag in 0ms 2019-01-11 20:35:01.208 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component IgniteAuthenticationProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.219 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridCacheProcessor processed joining node data bag in 10ms 2019-01-11 20:35:01.219 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridQueryProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.219 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridContinuousProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.463 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component 
GridMarshallerMappingProcessor processed joining node data bag in 242ms 2019-01-11 20:35:01.463 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Total time of processing discovery data bag: 252ms 2019-01-11 20:35:01.780 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Starting processing discovery data bag 2019-01-11 20:35:01.781 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component ClusterProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.781 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component IgnitePluginProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.781 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component CacheObjectBinaryProcessorImpl processed joining node data bag in 0ms 2019-01-11 20:35:01.781 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component IgniteAuthenticationProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.791 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridCacheProcessor processed joining node data bag in 10ms 2019-01-11 20:35:01.792 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridQueryProcessor processed joining node data bag in 0ms 2019-01-11 20:35:01.792 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridContinuousProcessor processed joining node data bag in 0ms 2019-01-11 20:35:02.134 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component GridMarshallerMappingProcessor processed joining node data bag in 338ms 2019-01-11 20:35:02.134 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Total time of processing discovery data bag: 348ms 2019-01-11 20:35:02.326 [INFO 
][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Starting processing discovery data bag 2019-01-11 20:35:02.326 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component ClusterProcessor processed joining node data bag in 0ms 2019-01-11 20:35:02.326 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component IgnitePluginProcessor processed joining node data bag in 0ms 2019-01-11 20:35:02.326 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component CacheObjectBinaryProcessorImpl processed joining node data bag in 0ms 2019-01-11 20:35:02.326 [INFO ][tcp-disco-msg-worker-#2%NodeName%][o.a.i.i.m.d.GridDiscoveryManager] Component IgniteAuthenticationProcessor processed joining node data bag in 0ms 2019-01-11 20:35:02.337 [INFO
[jira] [Created] (IGNITE-10933) Node may hang on join to topology and not move forward
Vladislav Pyatkov created IGNITE-10933: -- Summary: Node may hang on join to topology and not move forward Key: IGNITE-10933 URL: https://issues.apache.org/jira/browse/IGNITE-10933 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Several nodes join the topology simultaneously and hang for a long time. This can happen on the first start of all cluster nodes or when nodes join an already assembled topology. The logs of the problem nodes contain messages like: {noformat} 2019-01-11 18:37:39.296 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000] 2019-01-11 18:43:09.374 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000] ... {noformat} and nothing else for a long time.
[jira] [Created] (IGNITE-10108) Non-static class is passed between cluster nodes
Vladislav Pyatkov created IGNITE-10108: -- Summary: Non-static class is passed between cluster nodes Key: IGNITE-10108 URL: https://issues.apache.org/jira/browse/IGNITE-10108 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov We need to avoid passing anonymous classes to compute, because this leads to serializing the whole test-class context. For that reason, the following code {code} ignite.compute().withTimeout(5_000).broadcastAsync(new IgniteRunnable() { ... }); {code} in method {{GridCommonAbstractTest#manualCacheRebalancing}} needs to be refactored into a private static nested class.
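The effect described in IGNITE-10108 can be shown with plain Java serialization. In the sketch below, `SerializableRunnable` is a hypothetical stand-in for `IgniteRunnable`, and `TestContext` stands in for a test class. Ignite uses its own marshaller, so the exact behavior there may differ, but the captured reference to the enclosing instance is the same.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo (not Ignite code): anonymous closures drag their enclosing instance along. */
final class ClosureCapture {
    /** Stand-in for IgniteRunnable: a runnable that can be serialized. */
    interface SerializableRunnable extends Runnable, Serializable {
    }

    /** Stand-in for a test class; it is not serializable itself. */
    static final class TestContext {
        private final byte[] heavyState = new byte[1024];

        /** The anonymous class uses outer state, so it holds a reference to this TestContext. */
        SerializableRunnable anonymous() {
            return new SerializableRunnable() {
                @Override public void run() {
                    System.out.println("State size: " + heavyState.length);
                }
            };
        }
    }

    /** Static nested class: no hidden reference to any enclosing instance. */
    private static final class RebalanceJob implements SerializableRunnable {
        private static final long serialVersionUID = 0L;

        @Override public void run() {
            System.out.println("Rebalancing");
        }
    }

    static SerializableRunnable nested() {
        return new RebalanceJob();
    }

    /** Returns true if the object graph can be written with Java serialization. */
    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);

            return true;
        }
        catch (NotSerializableException e) {
            return false;
        }
        catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here the anonymous closure fails to serialize because its enclosing `TestContext` is not serializable. If the enclosing class were serializable, the whole instance would be written along with the closure, which is the overhead the issue wants to avoid.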
[jira] [Created] (IGNITE-10092) Race in partition state when checkpoint started at the middle of starts caches
Vladislav Pyatkov created IGNITE-10092: -- Summary: Race in partition state when checkpoint started at the middle of starts caches Key: IGNITE-10092 URL: https://issues.apache.org/jira/browse/IGNITE-10092 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov
[jira] [Created] (IGNITE-10028) Incorrect handling of page on replacement
Vladislav Pyatkov created IGNITE-10028: -- Summary: Incorrect handling of page on replacement Key: IGNITE-10028 URL: https://issues.apache.org/jira/browse/IGNITE-10028 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov We can pass an incorrect page version to IgniteCacheSnapshotManager.
[jira] [Created] (IGNITE-9934) Improve logging on partition map exchange
Vladislav Pyatkov created IGNITE-9934: - Summary: Improve logging on partition map exchange Key: IGNITE-9934 URL: https://issues.apache.org/jira/browse/IGNITE-9934 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov Partition Map Exchange (PME) is a cluster-wide process; for that reason it does not complete until every node has done its part of the job. The coordinator, as the node that manages the process, can print how many nodes have finished their stage of PME and which nodes have not finished yet.
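The progress message suggested in IGNITE-9934 could be assembled along the following lines. This is a minimal sketch: the class `ExchangeProgress`, its methods and the message format are hypothetical and do not correspond to the exchange future implementation in Ignite.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical tracker (not Ignite code): which nodes the coordinator is still waiting for. */
final class ExchangeProgress {
    /** Nodes that have not reported completion yet, in registration order. */
    private final Set<String> waiting;

    /** Total number of nodes participating in the exchange. */
    private final int total;

    ExchangeProgress(Collection<String> nodeIds) {
        waiting = new LinkedHashSet<>(nodeIds);
        total = waiting.size();
    }

    /** Marks a node as having finished its stage. */
    synchronized void onNodeFinished(String nodeId) {
        waiting.remove(nodeId);
    }

    /** Builds a one-line summary suitable for a periodic log message. */
    synchronized String summary() {
        return "PME progress [finished=" + (total - waiting.size())
            + ", total=" + total + ", waitingFor=" + waiting + ']';
    }
}
```

For three nodes `n1`, `n2`, `n3`, after `n2` finishes the summary lists one finished node and names `n1` and `n3` as pending.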
[jira] [Created] (IGNITE-9885) Issue in termination of GridWorkerFuture
Vladislav Pyatkov created IGNITE-9885: - Summary: Issue in termination of GridWorkerFuture Key: IGNITE-9885 URL: https://issues.apache.org/jira/browse/IGNITE-9885 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov A closure can be started through a method like {{GridClosureProcessor#runLocalSafe(java.lang.Runnable)}}, but it is not possible to wait for the task to terminate after cancellation. To understand why, look at {{GridWorkerFuture.cancel}}. The method sets the {{interrupted}} flag on the executing thread but does not wait for the task to terminate. Having an instance of GridWorkerFuture, you cannot await the real termination of the task.
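The difference between interrupting a task and waiting for it to really finish, as raised in IGNITE-9885, can be sketched with standard concurrency utilities. The class `AwaitableTask` below is hypothetical and only illustrates one possible shape of such an API; it is not how GridWorkerFuture is implemented.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper (not Ignite code): cancellation plus a way to await real termination. */
final class AwaitableTask {
    /** Released when the task body has actually returned. */
    private final CountDownLatch done = new CountDownLatch(1);

    /** Thread that executes the task body. */
    private final Thread runner;

    private AwaitableTask(Runnable body) {
        runner = new Thread(() -> {
            try {
                body.run();
            }
            finally {
                done.countDown();
            }
        });
    }

    /** Creates the task and starts it on its own thread. */
    static AwaitableTask start(Runnable body) {
        AwaitableTask task = new AwaitableTask(body);

        task.runner.start();

        return task;
    }

    /** Only requests cancellation: sets the interrupted flag, does not wait. */
    void cancel() {
        runner.interrupt();
    }

    /** Waits until the body has returned; returns false on timeout or if the caller is interrupted. */
    boolean awaitTermination(long timeoutMs) {
        try {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            return false;
        }
    }
}
```

A caller would invoke `cancel()` and then `awaitTermination(...)` before releasing resources that the task may still be using.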
[jira] [Created] (IGNITE-9738) Client node can suddenly fail on start
Vladislav Pyatkov created IGNITE-9738: - Summary: Client node can suddenly fail on start Key: IGNITE-9738 URL: https://issues.apache.org/jira/browse/IGNITE-9738 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov If a client joins a large topology, it can spend some time waiting for {{TcpDiscoveryNodeAddFinishedMessage}}, and during that time it cannot send {{TcpDiscoveryClientMetricsUpdateMessage}}. For that reason the server can drop the client from the topology. We should send {{TcpDiscoveryClientMetricsUpdateMessage}} as soon as possible, without waiting for the join procedure to finish.
[jira] [Created] (IGNITE-9707) TouchedExpiryPolicy with persistent on atomic cache, update TTL without lock
Vladislav Pyatkov created IGNITE-9707: - Summary: TouchedExpiryPolicy with persistent on atomic cache, update TTL without lock Key: IGNITE-9707 URL: https://issues.apache.org/jira/browse/IGNITE-9707 Project: Ignite Issue Type: Test Reporter: Vladislav Pyatkov Attachments: AtomicCacheWithTtlTest.java {noformat} [2018-09-26 18:06:29,882][ERROR][sys-stripe-0-#86%internal.AtomicCacheWithTtlTest2%][GridCacheIoManager] Failed to process message [senderId=949211e1-5e2a-41fe-b66b-195ae3300033, messageType=class o.a.i.i.processors.cache.distributed.near.GridNearSingleGetRequest] java.lang.AssertionError at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1247) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1528) at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:352) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:3605) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:3581) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.updateTtl(GridCacheMapEntry.java:2468) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.updateTtl(GridCacheMapEntry.java:2444) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerGet0(GridCacheMapEntry.java:680) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerGetVersioned(GridCacheMapEntry.java:554) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.getAllAsync0(GridCacheAdapter.java:1994) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtAllAsync(GridDhtCacheAdapter.java:781) at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.getAsync(GridDhtGetSingleFuture.java:360) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map0(GridDhtGetSingleFuture.java:254) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map(GridDhtGetSingleFuture.java:237) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.init(GridDhtGetSingleFuture.java:161) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtSingleAsync(GridDhtCacheAdapter.java:878) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetRequest(GridDhtCacheAdapter.java:893) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$300(GridDhtAtomicCache.java:130) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$4.apply(GridDhtAtomicCache.java:252) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$4.apply(GridDhtAtomicCache.java:247) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9448) Change ZooKeeper version to 3.4.13
Vladislav Pyatkov created IGNITE-9448: - Summary: Change ZooKeeper version to 3.4.13 Key: IGNITE-9448 URL: https://issues.apache.org/jira/browse/IGNITE-9448 Project: Ignite Issue Type: Test Components: zookeeper Reporter: Vladislav Pyatkov We should update the ZooKeeper dependency to the latest release, which is currently 3.4.13.
[jira] [Created] (IGNITE-8965) Add logs in SegmentReservationStorage on exchange process
Vladislav Pyatkov created IGNITE-8965: - Summary: Add logs in SegmentReservationStorage on exchange process Key: IGNITE-8965 URL: https://issues.apache.org/jira/browse/IGNITE-8965 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov
[jira] [Created] (IGNITE-8913) Uninformative SQL query cancellation message
Vladislav Pyatkov created IGNITE-8913: - Summary: Uninformative SQL query cancellation message Key: IGNITE-8913 URL: https://issues.apache.org/jira/browse/IGNITE-8913 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Fix For: 2.5 When query timeouted or cancelled or other exception, we getting message: "The query was cancelled while executing". Need make message more clear - text of query, node which the cancelled, reason of cancel query e.t.c. {noformat} 2018-06-19 00:00:10.653[ERROR][query-#93192%DPL_GRID%DplGridNodeName%][o.a.i.i.p.q.h.t.GridMapQueryExecutor] Failed to execute local query. org.apache.ignite.cache.query.QueryCancelledException: The query was cancelled while executing. at org.apache.ignite.internal.processors.query.GridQueryCancel.set(GridQueryCancel.java:53) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQuery(IgniteH2Indexing.java:1115) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(IgniteH2Indexing.java:1207) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(IgniteH2Indexing.java:1185) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest0(GridMapQueryExecutor.java:683) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest(GridMapQueryExecutor.java:527) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onMessage(GridMapQueryExecutor.java:218) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor$2.onMessage(GridMapQueryExecutor.java:178) at org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2333) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2018-06-19 00:00:11.629[ERROR][query-#93187%DPL_GRID%DplGridNodeName%][o.a.i.i.p.q.h.t.GridMapQueryExecutor] Failed to execute local query. org.apache.ignite.cache.query.QueryCancelledException: The query was cancelled while executing. at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest0(GridMapQueryExecutor.java:670) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest(GridMapQueryExecutor.java:527) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onMessage(GridMapQueryExecutor.java:218) at org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor$2.onMessage(GridMapQueryExecutor.java:178) at org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2333) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8866) Need attempt to upload class until node leave or fail topology by discovery SPI
Vladislav Pyatkov created IGNITE-8866: - Summary: Need attempt to upload class until node leave or fail topology by discovery SPI Key: IGNITE-8866 URL: https://issues.apache.org/jira/browse/IGNITE-8866 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov After a single failed attempt to upload a class, client code gets an exception: {noformat} 10:04:46,253 INFO [stdout] (Thread-732) java.lang.NoClassDefFoundError: ru/sbt/deposit_pf_api/core/utils/DplUtils 10:04:46,253 INFO [stdout] (Thread-732) at ru.sbt.deposit_pf_api.comparators.CommonPredicate.nodeIdIgnite(CommonPredicate.java:225) 10:04:46,253 INFO [stdout] (Thread-732) at ru.sbt.deposit_pf_api.comparators.CommonPredicate.cacheEntities(CommonPredicate.java:191) 10:04:46,253 INFO [stdout] (Thread-732) at ru.sbt.deposit_pf_api.comparators.CommonPredicate.(CommonPredicate.java:116) {noformat} The log also contains related warnings: {noformat} 018-06-19 10:04:18.459 [WARN ][pub-#3308%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDeploymentCommunication] Failed to receive peer response from node within duration [node=5861d763-a552-463e-817a-0742f7aad114, duration=5008] 2018-06-19 10:04:18.459 [WARN ][pub-#3308%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDeploymentPerVersionStore] Failed to send class-loading request to node (is node alive?) [node=5861d763-a552-463e-817a-0742f7aad114, clsName=ru.sbt.deposit_pf_api.core.utils.DplUtils, clsPath=ru/sbt/deposit_pf_api/core/utils/DplUtils.class, clsLdrId=370f1361461-5861d763-a552-463e-817a-0742f7aad114, parentClsLdr=com.sbt.dpl.gridgain.ignite.NodeClassLoader@1ce4a752] {noformat} I think the class upload through p2p should be retried for as long as the node is present in the topology. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
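The retry behaviour proposed in IGNITE-8866 could look roughly like the sketch below. It is not Ignite code: `RetryWhileNodeAlive`, the `attempt` supplier (which returns null on a failed or timed-out request) and the `nodeInTopology` check are hypothetical names introduced only for illustration.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: repeats a class-loading attempt while the peer node is alive. */
class RetryWhileNodeAlive {
    /**
     * @param attempt        one class-loading request; returns null if it failed or timed out
     * @param nodeInTopology true while the peer node is still present in the topology
     * @param pauseMs        pause between attempts, in milliseconds
     * @return the first non-null result, or null once the node has left the topology
     */
    static <T> T load(Supplier<T> attempt, BooleanSupplier nodeInTopology, long pauseMs) {
        while (nodeInTopology.getAsBoolean()) {
            T res = attempt.get();

            if (res != null)
                return res;

            try {
                Thread.sleep(pauseMs);
            }
            catch (InterruptedException e) {
                // Preserve the interrupt flag and give up.
                Thread.currentThread().interrupt();

                return null;
            }
        }

        // The node left or failed, so further attempts are pointless.
        return null;
    }
}
```

With this shape a single timeout no longer surfaces as a NoClassDefFoundError; the request gives up only when discovery reports that the node is gone.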
[jira] [Created] (IGNITE-8829) Some configuration properties of TcpCommunicationSpi does not annotated appropriately
Vladislav Pyatkov created IGNITE-8829: - Summary: Some configuration properties of TcpCommunicationSpi does not annotated appropriately Key: IGNITE-8829 URL: https://issues.apache.org/jira/browse/IGNITE-8829 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov While checking all properties of TcpCommunicationSpi, I found a problem with collecting the configuration properties from code: some of them are not configuration properties but part of the SPI's runtime state. To get rid of this problem, all configurable properties must be annotated with {{IgniteSpiConfiguration}}, but this has not been done for every property. I found at least two properties that lack the annotation: {{connectionsPerNode}} {{usePairedConnections}} and one property that does not follow the property contract (it has only a setter, no getter): {{addressResolver}} All properties of CommunicationSpi need to be reviewed and corrected. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
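The review requested in IGNITE-8829 could be automated with reflection. The sketch below is only an illustration: `SpiConfig` is a stand-in for {{IgniteSpiConfiguration}}, and `DemoSpi` and `SpiPropertyAudit` are hypothetical classes that reproduce the two kinds of defects described above (an unannotated setter and a setter without a getter).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Stand-in for the real annotation (assumption: it is placed on setters). */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SpiConfig {
    boolean optional();
}

/** Hypothetical SPI that mimics the defects described in the issue. */
class DemoSpi {
    private int localPort;
    private int connectionsPerNode = 1;
    private boolean usePairedConnections;
    private Object addressResolver;

    @SpiConfig(optional = true)
    public void setLocalPort(int localPort) { this.localPort = localPort; }
    public int getLocalPort() { return localPort; }

    // Defect: setter is not annotated.
    public void setConnectionsPerNode(int cnt) { connectionsPerNode = cnt; }
    public int getConnectionsPerNode() { return connectionsPerNode; }

    // Defect: setter is not annotated.
    public void setUsePairedConnections(boolean val) { usePairedConnections = val; }
    public boolean isUsePairedConnections() { return usePairedConnections; }

    // Defect: setter without a matching getter.
    @SpiConfig(optional = true)
    public void setAddressResolver(Object rslvr) { addressResolver = rslvr; }
}

class SpiPropertyAudit {
    /** Returns a sorted list of problems found in the public setters of the class. */
    static List<String> audit(Class<?> cls) {
        TreeSet<String> problems = new TreeSet<>();

        for (Method m : cls.getMethods()) {
            String name = m.getName();

            if (!name.startsWith("set") || name.length() <= 3 || m.getParameterCount() != 1)
                continue;

            String prop = name.substring(3);

            if (!m.isAnnotationPresent(SpiConfig.class))
                problems.add(prop + ": setter is not annotated");

            if (!hasGetter(cls, prop))
                problems.add(prop + ": no getter");
        }

        return new ArrayList<>(problems);
    }

    /** Checks for a public no-argument getXxx() or isXxx() method. */
    private static boolean hasGetter(Class<?> cls, String prop) {
        for (Method m : cls.getMethods()) {
            boolean nameMatches = m.getName().equals("get" + prop) || m.getName().equals("is" + prop);

            if (nameMatches && m.getParameterCount() == 0)
                return true;
        }

        return false;
    }
}
```

Calling `SpiPropertyAudit.audit(DemoSpi.class)` reports the three defective properties and leaves `LocalPort` alone. A check of this kind could run as a unit test over every SPI class.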
[jira] [Created] (IGNITE-8754) Node outside of baseline does not start when service configured
Vladislav Pyatkov created IGNITE-8754: - Summary: Node outside of baseline does not start when service configured Key: IGNITE-8754 URL: https://issues.apache.org/jira/browse/IGNITE-8754 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Attachments: ServiceOnNodeOutOfBaselineTest.java It is enough to configure a service in {{ServiceConfiguration}}: a node that is outside of the baseline then does not start. {noformat} "async-runnable-runner-1" #287 prio=5 os_prio=0 tid=0x24e0c800 nid=0x4e6c waiting on condition [0xe87fe000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart0(GridServiceProcessor.java:287) at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart(GridServiceProcessor.java:228) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1105) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) - locked <0x00076c142400> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:649) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:882) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:845) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:833) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:799) at 
org.gridgain.internal.ServiceOnNodeOutOfBaselineTest.lambda$test$0(ServiceOnNodeOutOfBaselineTest.java:107) at org.gridgain.internal.ServiceOnNodeOutOfBaselineTest$$Lambda$22/781127963.run(Unknown Source) at org.apache.ignite.testframework.GridTestUtils.lambda$runAsync$1(GridTestUtils.java:898) at org.apache.ignite.testframework.GridTestUtils$$Lambda$23/1655470614.call(Unknown Source) at org.apache.ignite.testframework.GridTestUtils.lambda$runAsync$2(GridTestUtils.java:956) at org.apache.ignite.testframework.GridTestUtils$$Lambda$24/1782331932.run(Unknown Source) at org.apache.ignite.testframework.GridTestUtils$6.call(GridTestUtils.java:1254) at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8710) Applying WAL works long time or fail at all, when *.wal files been removed
Vladislav Pyatkov created IGNITE-8710: - Summary: Applying WAL works long time or fail at all, when *.wal files been removed Key: IGNITE-8710 URL: https://issues.apache.org/jira/browse/IGNITE-8710 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov In specific cases, when *.wal files have been removed or WAL directories have been unmounted, we get warning messages on start: {noformat} 2018-06-02 12:10:06.127[INFO ][Thread-100][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Checking memory state [lastValidPos=FileWALPointer [idx=0, fileOff=0, len=0], lastMarked=FileWALPointer [idx=0, fileOff=0, len=0], lastCheckpointId=----] 2018-06-02 12:10:06.546[WARN ][Thread-100][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Found unexpected checkpoint marker, skipping [cpId=94b5ce03-87b7-489e-b08b-b4c5dc522bd5, expCpId=----, pos=FileWALPointer [idx=0, fileOff=44266869, len=977]] 2018-06-02 12:10:57.860[WARN ][Thread-100][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Found unexpected checkpoint marker, skipping [cpId=3f6ab238-23f7-4924-b4ef-0cb68d914a04, expCpId=----, pos=FileWALPointer [idx=7, fileOff=872888269, len=460112]] 2018-06-02 12:11:46.600[INFO ][Thread-100][o.a.i.i.p.c.p.w.FileWriteAheadLogManager] Stopping WAL iteration due to an exception: EOF at position [1073741824] expected to read [1] bytes, ptr=FileWALPointer [idx=15, fileOff=1073741824, len=0] 2018-06-02 12:12:21.181[WARN ][Thread-100][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Found unexpected checkpoint marker, skipping [cpId=3fe33806-ee11-49b7-8c47-648cd1adacbc, expCpId=----, pos=FileWALPointer [idx=23, fileOff=693360866, len=460112]] {noformat} The attempt to recover from the WAL then runs for a long time without success. Instead, the node should stop and print a message saying that the necessary WAL files were not found. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
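A fail-fast check of the kind asked for in IGNITE-8710 could compare the segment files found on disk with the range of segment indexes that recovery needs. The sketch below is an assumption-laden illustration, not Ignite code: `WalSegmentCheck` is a hypothetical class, and it assumes segment files are named by their zero-padded index followed by `.wal`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical pre-recovery check for gaps in the WAL segment sequence. */
class WalSegmentCheck {
    /**
     * @param fileNames names of the files found in the WAL directories
     * @param fromIdx   first segment index required for recovery (inclusive)
     * @param toIdx     last segment index required for recovery (inclusive)
     * @return indexes of the required segments that are missing, in ascending order
     */
    static List<Long> missingSegments(Collection<String> fileNames, long fromIdx, long toIdx) {
        Set<Long> present = new HashSet<>();

        for (String name : fileNames) {
            if (!name.endsWith(".wal"))
                continue;

            try {
                // Assumed naming scheme: "<zero-padded index>.wal".
                present.add(Long.parseLong(name.substring(0, name.length() - 4)));
            }
            catch (NumberFormatException ignored) {
                // Not a segment file, skip it.
            }
        }

        List<Long> missing = new ArrayList<>();

        for (long idx = fromIdx; idx <= toIdx; idx++) {
            if (!present.contains(idx))
                missing.add(idx);
        }

        return missing;
    }
}
```

If the returned list is not empty, the node could log the missing indexes and stop instead of iterating over the remaining segments.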
[jira] [Created] (IGNITE-8606) Node hangs on next exchange, when no access to marshaller's folder
Vladislav Pyatkov created IGNITE-8606: - Summary: Node hangs on next exchange, when no access to marshaller's folder Key: IGNITE-8606 URL: https://issues.apache.org/jira/browse/IGNITE-8606 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} 2018-05-18 11:12:57.572 [ERROR][tcp-disco-msg-worker-#3%DPL_GRID%DplGridNodeName%][o.a.i.i.MarshallerMappingFileStore] Failed to write class name to file [platformId=0id=1713316383, clsName=com.sbt.dpl.gridgain.affinity.DPLIndexAffinityPrimaryFilter, file=/u01/pprb/work/marshaller/1713316383.classname0] java.io.FileNotFoundException: /u01/pprb/work/marshaller/1713316383.classname0 (No such file or directory) at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.(FileOutputStream.java:213) at java.io.FileOutputStream.(FileOutputStream.java:162) at org.apache.ignite.internal.MarshallerMappingFileStore.writeMapping(MarshallerMappingFileStore.java:94) at org.apache.ignite.internal.MarshallerMappingFileStore.mergeAndWriteMapping(MarshallerMappingFileStore.java:207) at org.apache.ignite.internal.MarshallerContextImpl.onMappingDataReceived(MarshallerContextImpl.java:201) at org.apache.ignite.internal.processors.marshaller.GridMarshallerMappingProcessor.processIncomingMappings(GridMarshallerMappingProcessor.java:356) at org.apache.ignite.internal.processors.marshaller.GridMarshallerMappingProcessor.onJoiningNodeDataReceived(GridMarshallerMappingProcessor.java:336) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5.onExchange(GridDiscoveryManager.java:908) at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.onExchange(TcpDiscoverySpi.java:1939) at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4220) at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2744) at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536) at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775) at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621) at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8458) AffinityAssigment absorbs a lot of java heap
Vladislav Pyatkov created IGNITE-8458: - Summary: AffinityAssigment absorbs a lot of java heap Key: IGNITE-8458 URL: https://issues.apache.org/jira/browse/IGNITE-8458 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov With more than a hundred caches and several thousand partitions, the size can grow beyond 10 GB. In my case the heap held ~5K {{HistoryAffinityAssigment}} instances. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8440) Transaction may hangs on node in PREPARED state
Vladislav Pyatkov created IGNITE-8440: - Summary: Transaction may hangs on node in PREPARED state Key: IGNITE-8440 URL: https://issues.apache.org/jira/browse/IGNITE-8440 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov In some specific cases a transaction hangs on one node in the {{PREPARED}} state but does not hang on the others. The affected node waits for a {{TxFinishRequest}} that never arrives and keeps printing the _long running transaction message_. When a transaction stays in the PREPARED state without progress, the node should check the other nodes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8348) Debug information about discovery messages in TcpDiscoverySpiMBean
Vladislav Pyatkov created IGNITE-8348: - Summary: Debug information about discovery messages in TcpDiscoverySpiMBean Key: IGNITE-8348 URL: https://issues.apache.org/jira/browse/IGNITE-8348 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov With some discovery issues, such as: 1) behavior on an unstable network 2) segmentation of several nodes, or others where the SPI does not behave in an obvious way, we want to know what kinds of messages have been sent or received by a particular node. For that reason I want to add the following methods to TcpDiscoverySpiMBean: {code} /** * Print a list of discarded messages. */ @MXBeanDescription("Print a list of discarded messages to log.") public void printListOfDiscardedMessages(); /** * Print a list of received messages. */ @MXBeanDescription("Print a list of received messages to log.") public void printListOfReceivedMessages(); /** * Print a list of sent messages. */ @MXBeanDescription("Print a list of sent messages to log.") public void printListOfSentMessages(); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8136) Discovery service wrong works if node stopping by segmentation and hangs
Vladislav Pyatkov created IGNITE-8136: - Summary: Discovery service wrong works if node stopping by segmentation and hangs Key: IGNITE-8136 URL: https://issues.apache.org/jira/browse/IGNITE-8136 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8087) Assertion error in time to rebalancing
Vladislav Pyatkov created IGNITE-8087: - Summary: Assertion error in time to rebalancing Key: IGNITE-8087 URL: https://issues.apache.org/jira/browse/IGNITE-8087 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} 2018-03-30 10:06:17.936[ERROR][sys-#308516%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=4754f275-a46b-4df5-b263-8369a9cb899b, msg=GridDhtPartitionSupplyMessage [updateSeq=151421, topVer=AffinityTopologyVersion [topVer=546, minorTopVer=8], missed=null, clean=null, msgSize=524554, estimatedKeysCnt=-1, size=1, parts=[45], super=GridCacheGroupIdMessage [grpId=218536256]]] java.lang.AssertionError: GridDhtCacheEntry [rdrs=[], part=45, super=GridDistributedCacheEntry [super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=45, val=1005, hasValBytes=true], val=null, startVer=1522624104073, ver=GridCacheVersion [topVer=133119151, order=1522038055581, nodeOrder=13], hash=1005, extras=null, flags=3]]] at org.apache.ignite.internal.processors.cache.GridCacheContext.onDeferredDelete(GridCacheContext.java:1644) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:446) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:377) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.initialValue(GridCacheMapEntry.java:2713) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.preloadEntry(GridDhtPartitionDemander.java:798) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.handleSupplyMessage(GridDhtPartitionDemander.java:678) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleSupplyMessage(GridDhtPreloader.java:375) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:354) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:99) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1609) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:126) at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2751) at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1515) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:126) at org.apache.ignite.internal.managers.communication.GridIoManager$10.run(GridIoManager.java:1484) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8030) Cluster hangs on deactivation process in time stopping indexed cache
Vladislav Pyatkov created IGNITE-8030: - Summary: Cluster hangs on deactivation process in time stopping indexed cache Key: IGNITE-8030 URL: https://issues.apache.org/jira/browse/IGNITE-8030 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Attachments: thrdump-server.log {noformat} "sys-#10283%DPL_GRID%DplGridNodeName%" #13068 prio=5 os_prio=0 tid=0x7f07040eb000 nid=0x2e0f waiting on condition [0x7e6deb9b8000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x7f0bd2b0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireInterruptibly(AbstractQueuedSynchronizer.java:897) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1222) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lockInterruptibly(ReentrantReadWriteLock.java:998) at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.lock(GridH2Table.java:292) at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.lock(GridH2Table.java:253) at org.h2.command.ddl.DropTable.prepareDrop(DropTable.java:87) at org.h2.command.ddl.DropTable.update(DropTable.java:113) at org.h2.command.CommandContainer.update(CommandContainer.java:101) at org.h2.command.Command.executeUpdate(Command.java:260) - locked <0x7f0c276c85b8> (a org.h2.engine.Session) at org.h2.jdbc.JdbcStatement.executeUpdateInternal(JdbcStatement.java:137) - locked <0x7f0c276c85b8> (a org.h2.engine.Session) at org.h2.jdbc.JdbcStatement.executeUpdate(JdbcStatement.java:122) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.dropTable(IgniteH2Indexing.java:654) at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.unregisterCache(IgniteH2Indexing.java:2482) at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop0(GridQueryProcessor.java:1684) - locked <0x7f0b69f822d0> (a java.lang.Object) at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop(GridQueryProcessor.java:879) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCache(GridCacheProcessor.java:1189) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStop(GridCacheProcessor.java:2063) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onExchangeDone(GridCacheProcessor.java:2219) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1518) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:2538) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:2297) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:2034) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$100(GridDhtPartitionsExchangeFuture.java:122) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:1891) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:1879) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) at 
org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:353) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceiveSingleMessage(GridDhtPartitionsExchangeFuture.java:1879) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processSinglePartitionUpdate(GridCachePartitionExchangeManager.java:1523) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1000(GridCachePartitionExchangeManager.java:133) at
[jira] [Created] (IGNITE-8021) Destroyed caches can be return to life by restart grid
Vladislav Pyatkov created IGNITE-8021: - Summary: Destroyed caches can be return to life by restart grid Key: IGNITE-8021 URL: https://issues.apache.org/jira/browse/IGNITE-8021 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Cache configuration files remain on the file system after the {{destroy}} method is invoked. As a result, all destroyed caches are started again after a grid restart. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8006) Starting multiple caches inhibits exchange process on joining node
Vladislav Pyatkov created IGNITE-8006: - Summary: Starting multiple caches inhibits exchange process on joining node Key: IGNITE-8006 URL: https://issues.apache.org/jira/browse/IGNITE-8006 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov In some cases, when we start many caches (over 2K), the exchange stalls when a new node joins the cluster. The coordinator node waits to receive a single message from every other node, but the last node (the one joining the cluster) is stuck starting caches: {noformat} Stack trace at java.lang.Thread.dumpStack(Thread.java:1329) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCache(GridCacheProcessor.java:1159) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1900) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCachesOnLocalJoin(GridCacheProcessor.java:1764) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCachesOnLocalJoin(GridDhtPartitionsExchangeFuture.java:740) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:622) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2329) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} This blocks the cluster exchange process until all caches have started on the joining node. We should start caches in parallel threads or move this step out of the exchange init process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
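The first option suggested in IGNITE-8006, starting caches in parallel threads, can be sketched with a plain thread pool. This is a minimal illustration under assumptions: `ParallelCacheStarter` is a hypothetical class, and the `starter` callback stands in for whatever per-cache start logic the node runs.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical helper that starts many caches on a fixed-size thread pool. */
class ParallelCacheStarter {
    /**
     * @param names   names of the caches to start
     * @param threads number of worker threads
     * @param starter per-cache start logic (stand-in for the real start routine)
     * @return names of the started caches, in the order they were submitted
     */
    static List<String> startAll(Collection<String> names, int threads, Consumer<String> starter) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        try {
            List<Callable<String>> tasks = new ArrayList<>();

            for (String name : names) {
                tasks.add(() -> {
                    starter.accept(name);

                    return name;
                });
            }

            List<String> started = new ArrayList<>();

            // invokeAll blocks until every task has completed.
            for (Future<String> fut : pool.invokeAll(tasks))
                started.add(fut.get());

            return started;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException("Interrupted while starting caches", e);
        }
        catch (ExecutionException e) {
            throw new IllegalStateException("Failed to start a cache", e.getCause());
        }
        finally {
            pool.shutdown();
        }
    }
}
```

The call still blocks until every cache has started, so it shortens the stall rather than removing it. Moving the start out of the exchange init step, the second option in the issue, would be a separate change.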
[jira] [Created] (IGNITE-7930) Partition map hang in incorrect state when backup filter is assigned
Vladislav Pyatkov created IGNITE-7930: - Summary: Partition map hang in incorrect state when backup filter is assigned Key: IGNITE-7930 URL: https://issues.apache.org/jira/browse/IGNITE-7930 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Attachments: IgnitePdsRebalanceCompletionTest.java The test ([^IgnitePdsRebalanceCompletionTest.java]) shows that some partitions end up in the OWNING state (which should not happen) and the whole cluster hangs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7896) Files of evicted partitions do not removed from disk storage
Vladislav Pyatkov created IGNITE-7896: - Summary: Files of evicted partitions do not removed from disk storage Key: IGNITE-7896 URL: https://issues.apache.org/jira/browse/IGNITE-7896 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Attachments: IgnitePdsRebalanceCompletionAndPartitionFilesTest.java See the reproducer test: [^IgnitePdsRebalanceCompletionAndPartitionFilesTest.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7703) Add a method gets caches in batch
Vladislav Pyatkov created IGNITE-7703: - Summary: Add a method gets caches in batch Key: IGNITE-7703 URL: https://issues.apache.org/jira/browse/IGNITE-7703 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Ignite allows starting (and/or getting) caches in batch, but it does not allow getting them in batch without starting. In some cases we need to start a particular subset of all cluster caches, but if we call _org.apache.ignite.Ignite#cache_ for them one by one, we risk overloading the discovery layer with _DynamicCacheChangeRequest_ messages. It would be better to add a dedicated method for getting caches in batch: _org.apache.ignite.Ignite#caches_ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-6991) SharedDeploymentTest.testDeploymentFromSecondAndThird Test fails in 100 percentage cases
Vladislav Pyatkov created IGNITE-6991: - Summary: SharedDeploymentTest.testDeploymentFromSecondAndThird Test fails in 100 percentage cases Key: IGNITE-6991 URL: https://issues.apache.org/jira/browse/IGNITE-6991 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} java.lang.ClassNotFoundException: org.apache.ignite.tests.p2p.compute.ExternalCallable2 at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at org.apache.ignite.testframework.GridTestExternalClassLoader.findClass(GridTestExternalClassLoader.java:143) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at org.apache.ignite.testframework.GridTestExternalClassLoader.loadClass(GridTestExternalClassLoader.java:152) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at org.apache.ignite.p2p.SharedDeploymentTest.runJob2(SharedDeploymentTest.java:124) at org.apache.ignite.p2p.SharedDeploymentTest.testDeploymentFromSecondAndThird(SharedDeploymentTest.java:82) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6922) Class can not undeploy from grid in some specific cases
Vladislav Pyatkov created IGNITE-6922: - Summary: Class can not undeploy from grid in some specific cases Key: IGNITE-6922 URL: https://issues.apache.org/jira/browse/IGNITE-6922 Project: Ignite Issue Type: Bug Security Level: Public (Viewable by anyone) Reporter: Vladislav Pyatkov -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6799) Check of starvation in striped thread pool
Vladislav Pyatkov created IGNITE-6799: - Summary: Check of starvation in striped thread pool Key: IGNITE-6799 URL: https://issues.apache.org/jira/browse/IGNITE-6799 Project: Ignite Issue Type: Improvement Security Level: Public (Viewable by anyone) Reporter: Vladislav Pyatkov We get false alarms like: {noformat} 2017-10-30 14:01:40.308[WARN ][grid-timeout-worker-#63%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.util.typedef.G] >>> Possible starvation in striped pool. 2017-10-30 13:56:41.538[WARN ][grid-timeout-worker-#63%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.util.typedef.G] >>> Possible starvation in striped pool. 2017-10-30 13:46:40.488[WARN ][grid-timeout-worker-#63%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.util.typedef.G] >>> Possible starvation in striped pool. 2017-10-30 13:37:45.481[WARN ][grid-timeout-worker-#63%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.util.typedef.G] >>> Possible starvation in striped pool. {noformat} This usually happens during a checkpoint, but it is a false trigger: the thread had been inactive for a long time and only recently became active. We should save the last active state per stripe, as is done with completedCntrs, and rewrite the condition: {code} completedCntrs[i] != -1 && completedCntrs[i] == completedCnt && actives[i] == active && active {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
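The condition proposed in IGNITE-6799 can be tried out in isolation. The sketch below keeps the `completedCntrs` and `actives` arrays named in the issue; the surrounding class `StripeStarvationCheck` and its `check` method are hypothetical and are not the actual pool code.

```java
import java.util.Arrays;

/** Hypothetical holder of the per-stripe state sampled on every periodic check. */
class StripeStarvationCheck {
    /** Completed task counter seen at the previous check, -1 if not sampled yet. */
    private final long[] completedCntrs;

    /** Active flag seen at the previous check. */
    private final boolean[] actives;

    StripeStarvationCheck(int stripes) {
        completedCntrs = new long[stripes];
        actives = new boolean[stripes];

        Arrays.fill(completedCntrs, -1);
    }

    /**
     * @param i            stripe index
     * @param completedCnt current number of tasks completed by the stripe
     * @param active       whether the stripe is executing a task right now
     * @return true if the stripe was active at both checks and made no progress in between
     */
    boolean check(int i, long completedCnt, boolean active) {
        boolean starved = completedCntrs[i] != -1
            && completedCntrs[i] == completedCnt
            && actives[i] == active
            && active;

        // Remember the current sample for the next check.
        completedCntrs[i] = completedCnt;
        actives[i] = active;

        return starved;
    }
}
```

A stripe that was idle at the previous check and has just picked up a task has an unchanged counter but a changed active flag, so it no longer triggers the warning. A stripe that is busy with the same task at two consecutive checks still does.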
[jira] [Created] (IGNITE-6737) GridDeploymentPerVersionStore retries loading class infinitely
Vladislav Pyatkov created IGNITE-6737: - Summary: GridDeploymentPerVersionStore retries loading class infinitely Key: IGNITE-6737 URL: https://issues.apache.org/jira/browse/IGNITE-6737 Project: Ignite Issue Type: Bug Security Level: Public (Viewable by anyone) Reporter: Vladislav Pyatkov {noformat} 2017-10-24 14:34:06 [DEBUG] [org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore] [pub-#5258%DPL_GRID%DplGridNodeName%] - Deployment meta for local deployment: GridDeploymentMetadata [depMode=SHARED, alias=com.sbt.bgp.task.AffinityApplicationTaskCallable, clsName=com.sbt.bgp.task.AffinityApplicationTaskCallable, userVer=null, sndNodeId=1b852edd-1f41-4489-af78-dbe8226a9b16, clsLdrId=null, clsLdr=null, participants=null, parentLdr=null, record=true, nodeFilter=null, seqNum=n/a] 2017-10-24 14:34:06 [DEBUG] [org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore] [pub-#5258%DPL_GRID%DplGridNodeName%] - Failed to load class for local auto-deployment [ldr=grid:com.sbt.core.envelope.container.FileCl assLoader@3e4327dc, meta=GridDeploymentMetadata [depMode=SHARED, alias=com.sbt.bgp.task.AffinityApplicationTaskCallable, clsName=com.sbt.bgp.task.AffinityApplicationTaskCallable, userVer=null, sndNodeId=1b852edd-1f41-4489-af78-dbe8226a9b 16, clsLdrId=null, clsLdr=null, participants=null, parentLdr=null, record=true, nodeFilter=null, seqNum=n/a]] 2017-10-24 14:34:06 [DEBUG] [org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore] [pub-#5258%DPL_GRID%DplGridNodeName%] - Deployment cannot be reused (class does not exist on participating nodes) [dep=SharedDeployment [rmv=false, super=GridDeployment [ts=1508810401226, depMode=SHARED, clsLdr=GridDeploymentClassLoader [id=7953e0c4f51-1b852edd-1f41-4489-af78-dbe8226a9b16, singleNode=false, nodeLdrMap={bc5a1eaa-e056-4bd8-b7d3-684e75522b81=373cd8c4f51-bc5a1eaa-e056-4bd8-b7d3-684e75522b81, 3018f0bb-7c94-410e-9a0f-028c3fbc8aab=a5b822c4f51-3018f0bb-7c94-410e-9a0f-028c3fbc8aab, 
f1774f8d-84e9-43c3-86a3-d7a47c291f45=afd441c4f51-f1774f8d-84e9-43c3-86a3-d7a47c291f45, 5a0b56e8-a8ae-4742-834c-d688592866c4=a6e985c4f51-5a0b56e8-a8ae-4742-834c-d688592866c4, 65fdae9e-78c7-49a2-b9ee-a8e99dbb87ea=bcd257c4f51-65fdae9e-78c7-49a2-b9ee-a8e99dbb87ea, 045ddd4d-3e39-4b25-bf52-c264f59efbc6=e6ec81c4f51-045ddd4d-3e39-4b25-bf52-c264f59efbc6, afadbbce-542d-435c-b85a-78d395b463a5=967664c4f51-afadbbce-542d-435c-b85a-78d395b463a5, 4b2662e9-d525-4d96-936c-8cc645464e65=591541c4f51-4b2662e9-d525-4d96-936c-8cc645464e65}, p2pTimeout=5000, usrVer=0, depMode=SHARED, quiet=false], clsLdrId=7953e0c4f51-1b852edd-1f41-4489-af78-dbe8226a9b16, userVer=0, loc=false, sampleClsName=com.sbt.fea_cc.services.business.autoStopTurnkeySettings.AutoStopTurnkeySettingsService$FindOrderTurnkeyForSuspend, pendingUndeploy=false, undeployed=false, usage=0]], meta=GridDeploymentMetadata [depMode=SHARED, alias=com.sbt.bgp.task.AffinityApplicationTaskCallable, clsName=com.sbt.bgp.task.AffinityApplicationTaskCallable, userVer=0, sndNodeId=4457016c-5f93-450f-b2a7-86bd25f536cf, clsLdrId=898962c4f51-4457016c-5f93-450f-b2a7-86bd25f536cf, clsLdr=null, participants=null, parentLdr=null, record=true, nodeFilter=null, seqNum=150888744]] 2017-10-24 14:34:06 [DEBUG] [org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore] [pub-#5258%DPL_GRID%DplGridNodeName%] - Deployment cannot be reused (random class could not be loaded from sender node) [dep=SharedDeployment [rmv=false, super=GridDeployment [ts=1508810401226, depMode=SHARED, clsLdr=GridDeploymentClassLoader [id=7953e0c4f51-1b852edd-1f41-4489-af78-dbe8226a9b16, singleNode=false, nodeLdrMap={bc5a1eaa-e056-4bd8-b7d3-684e75522b81=373cd8c4f51-bc5a1eaa-e056-4bd8-b7d3-684e75522b81, 3018f0bb-7c94-410e-9a0f-028c3fbc8aab=a5b822c4f51-3018f0bb-7c94-410e-9a0f-028c3fbc8aab, f1774f8d-84e9-43c3-86a3-d7a47c291f45=afd441c4f51-f1774f8d-84e9-43c3-86a3-d7a47c291f45, 5a0b56e8-a8ae-4742-834c-d688592866c4=a6e985c4f51-5a0b56e8-a8ae-4742-834c-d688592866c4, 
65fdae9e-78c7-49a2-b9ee-a8e99dbb87ea=bcd257c4f51-65fdae9e-78c7-49a2-b9ee-a8e99dbb87ea, 045ddd4d-3e39-4b25-bf52-c264f59efbc6=e6ec81c4f51-045ddd4d-3e39-4b25-bf52-c264f59efbc6, afadbbce-542d-435c-b85a-78d395b463a5=967664c4f51-afadbbce-542d-435c-b85a-78d395b463a5, 4b2662e9-d525-4d96-936c-8cc645464e65=591541c4f51-4b2662e9-d525-4d96-936c-8cc645464e65}, p2pTimeout=5000, usrVer=0, depMode=SHARED, quiet=false], clsLdrId=7953e0c4f51-1b852edd-1f41-4489-af78-dbe8226a9b16, userVer=0, loc=false, sampleClsName=com.sbt.fea_cc.services.business.autoStopTurnkeySettings.AutoStopTurnkeySettingsService$FindOrderTurnkeyForSuspend, pendingUndeploy=false, undeployed=false, usage=0]], meta=GridDeploymentMetadata [depMode=SHARED, alias=com.sbt.bgp.task.AffinityApplicationTaskCallable,
[jira] [Created] (IGNITE-6589) Encountered incompatible class loaders for cache
Vladislav Pyatkov created IGNITE-6589: - Summary: Encountered incompatible class loaders for cache Key: IGNITE-6589 URL: https://issues.apache.org/jira/browse/IGNITE-6589 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov For unknown reasons, DeploymentManager enforces the use of objects with compatible class loaders. {noformat} class org.apache.ignite.IgniteCheckedException: Encountered incompatible class loaders for cache [class1=org.apache.ignite.tests.p2p.cache.Person, class2=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap] at org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager.registerClass(GridCacheDeploymentManager.java:642) at org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager.registerClass(GridCacheDeploymentManager.java:586) at org.apache.ignite.internal.processors.cache.GridCacheMessage.prepareObject(GridCacheMessage.java:223) at org.apache.ignite.internal.processors.cache.GridCacheMessage.marshalInvokeArguments(GridCacheMessage.java:444) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateInvokeRequest.prepareMarshal(GridNearAtomicSingleUpdateInvokeRequest.java:192) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onSend(GridCacheIoManager.java:1120) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1154) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1205) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:311) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6579) WAL history does not used when node returns to cluster again
Vladislav Pyatkov created IGNITE-6579: - Summary: WAL history does not used when node returns to cluster again Key: IGNITE-6579 URL: https://issues.apache.org/jira/browse/IGNITE-6579 Project: Ignite Issue Type: Bug Components: persistence Reporter: Vladislav Pyatkov I set a sufficiently large value for "WAL history size" and stopped a node for 20 minutes. After starting the node again, I got the following messages from the coordinator (order=1): {noformat} 2017-10-06 15:46:33.429 [WARN ][sys-#10740%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.GridDhtPartitionTopologyImpl] Partition has been scheduled for rebalancing due to outdated update counter [nodeId=e51a1db2-f49b-44a9-b122-adde4016d9e7, cacheOrGroupName=CACHEGROUP_PARTICLE_DServiceZone, partId=2424, haveHistory=false] 2017-10-06 15:46:33.429 [WARN ][sys-#10740%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.GridDhtPartitionTopologyImpl] Partition has been scheduled for rebalancing due to outdated update counter [nodeId=e51a1db2-f49b-44a9-b122-adde4016d9e7, cacheOrGroupName=CACHEGROUP_PARTICLE_DServiceZone, partId=2427, haveHistory=false] 2017-10-06 15:46:33.429 [WARN ][sys-#10740%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.GridDhtPartitionTopologyImpl] Partition has been scheduled for rebalancing due to outdated update counter [nodeId=e51a1db2-f49b-44a9-b122-adde4016d9e7, cacheOrGroupName=CACHEGROUP_PARTICLE_DServiceZone, partId=2426, haveHistory=false] {noformat} I think the history size should have been sufficient, but the logs show that the history was not used (haveHistory=false). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6552) The ability to set WAL history size in time units
Vladislav Pyatkov created IGNITE-6552: - Summary: The ability to set WAL history size in time units Key: IGNITE-6552 URL: https://issues.apache.org/jira/browse/IGNITE-6552 Project: Ignite Issue Type: Improvement Components: persistence Affects Versions: 2.2 Reporter: Vladislav Pyatkov Currently the size of the WAL history can only be set as a number of checkpoints. {code} org.apache.ignite.configuration.PersistentStoreConfiguration#setWalHistorySize {code} This is not convenient for the end user: nobody can say how many checkpoints will occur over several minutes. I think it would be better to have the ability to set the WAL history size in time units (milliseconds, for example). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
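To illustrate the inconvenience described in IGNITE-6552, here is a minimal sketch of the conversion a user has to do by hand today. It assumes that checkpoints are triggered only by a timer with a fixed interval; in practice checkpoints may also start earlier, so the result is only an estimate. The class `WalHistoryEstimator` and its method are hypothetical and not part of Ignite.

```java
import java.time.Duration;

/** Hypothetical helper (not an Ignite API): converts a retention time into a checkpoint count. */
final class WalHistoryEstimator {
    private WalHistoryEstimator() {
        // Static utility, no instances.
    }

    /**
     * Estimates how many checkpoints cover the given retention time,
     * assuming one checkpoint per checkpointInterval (an assumption of this sketch).
     */
    static int checkpointsFor(Duration retention, Duration checkpointInterval) {
        long retentionMs = retention.toMillis();
        long intervalMs = checkpointInterval.toMillis();

        if (retentionMs < 0 || intervalMs <= 0)
            throw new IllegalArgumentException("Retention must be >= 0 and interval must be >= 1 ms");

        // Round up: a partially covered interval still needs one more checkpoint in history.
        long cnt = retentionMs / intervalMs + (retentionMs % intervalMs == 0 ? 0 : 1);

        // Keep at least one checkpoint and stay within the int range.
        return (int) Math.min(Integer.MAX_VALUE, Math.max(1, cnt));
    }
}
```

For example, `WalHistoryEstimator.checkpointsFor(Duration.ofMinutes(20), Duration.ofMinutes(3))` yields 7, which the user would then pass to `setWalHistorySize`. A time-based setting would remove the need for this guesswork.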
[jira] [Created] (IGNITE-6549) Web agent: Able to start web agent internal of grid node
Vladislav Pyatkov created IGNITE-6549: - Summary: Web agent: Able to start web agent internal of grid node Key: IGNITE-6549 URL: https://issues.apache.org/jira/browse/IGNITE-6549 Project: Ignite Issue Type: Improvement Components: wizards Affects Versions: 2.2 Reporter: Vladislav Pyatkov We should be able to start the web agent inside a grid node. This would simplify the connection between the web console and the cluster and reduce latency. In addition, the node would not need to start additional services for interaction with the web agent (HTTP for Ignite REST). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6224) Node stoping does not wait all transactions completion
Vladislav Pyatkov created IGNITE-6224: - Summary: Node stoping does not wait all transactions completion Key: IGNITE-6224 URL: https://issues.apache.org/jira/browse/IGNITE-6224 Project: Ignite Issue Type: Bug Affects Versions: 2.1 Reporter: Vladislav Pyatkov I started a grid node and began executing a transaction on a cache. Then I stopped the node in the middle of the transaction and got the following exception from the transaction: {noformat} java.lang.IllegalStateException: class org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): cache at org.apache.ignite.internal.processors.cache.GridCacheGateway.enter(GridCacheGateway.java:164) at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1656) at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:869) at org.apache.ignite.TransactionBehindStopNodeTest.testOneNode(TransactionBehindStopNodeTest.java:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at junit.framework.TestCase.runTest(TestCase.java:176) at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2000) at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:132) at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1915) at java.lang.Thread.run(Thread.java:745) {noformat} Note that I stopped the node with the {{canceled}} flag set to _false_: {code} G.stop(getTestIgniteInstanceName(0), false); {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6213) Unexpected setting local deployment owner anyone node
Vladislav Pyatkov created IGNITE-6213: - Summary: Unexpected setting local deployment owner anyone node Key: IGNITE-6213 URL: https://issues.apache.org/jira/browse/IGNITE-6213 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov In my test I saw one node unexpectedly set the {{locDepOwner}} flag. {noformat} 16:55:47.868 [ DEBUG] [ o.a.i.i.p.c.GridCacheDeploymentManager] [ T:] - Prepared grid cache deployable [ dep=GridDeploymentInfoBean [ clsLdrId=aefa3c4fd51-12bb727e-4815-4ab2-8f8c-cc6fd52c8553, depMode=SHARED, userVer=0, locDepOwner=true, participants=null], deployable=GridNearAtomicSingleUpdateRequest [ key=UserKeyCacheObjectImpl [ part=111, val=4翿翿, hasValBytes=true], super=GridNearAtomicSingleUpdateRequest [ key=UserKeyCacheObjectImpl [ part=111, val=4翿翿, hasValBytes=true], parent=GridNearAtomicAbstractSingleUpdateRequest [ nodeId=45acc827-8a2d-47d3-aa04-94936ad25ac2, futId=81921, topVer=AffinityTopologyVersion [ topVer=4, minorTopVer=0], parent=GridNearAtomicAbstractUpdateRequest [ res=null, flags=] {noformat} Because of this, a global participant was registered: {noformat} 16:55:47.871 [ DEBUG] [ o.a.i.i.m.d.GridDeploymentPerVersionStore] [ T:] - Explicitly added participant [ dep=SharedDeployment [ rmv=false, super=GridDeployment [ ts=1503050146264, depMode=SHARED, clsLdr=GridDeploymentClassLoader [ id=acaa3c4fd51-45acc827-8a2d-47d3-aa04-94936ad25ac2, singleNode=false, nodeLdrMap={12bb727e-4815-4ab2-8f8c-cc6fd52c8553=aefa3c4fd51-12bb727e-4815-4ab2-8f8c-cc6fd52c8553, 101abc71-83b4-4a87-bb07-14e4cbc7226e=2c044c4fd51-101abc71-83b4-4a87-bb07-14e4cbc7226e, 9d30737f-44d2-4414-b84d-25f032484290=e70b3c4fd51-9d30737f-44d2-4414-b84d-25f032484290}, p2pTimeout=5000, usrVer=0, depMode=SHARED, quiet=false], clsLdrId=acaa3c4fd51-45acc827-8a2d-47d3-aa04-94936ad25ac2, userVer=0, loc=false, sampleClsName=com.sbt.dpl.gridgain.index.InvokeIndexAdder, pendingUndeploy=false, undeployed=false, usage=0]], nodeId=12bb727e-4815-4ab2-8f8c-cc6fd52c8553, 
ldrId=aefa3c4fd51-12bb727e-4815-4ab2-8f8c-cc6fd52c8553] {noformat} After that, I get an exception when a class is requested from a node where the class is not located: {noformat} 16:55:50.684 [ERROR] [o.a.i.i.p.job.GridJobProcessor] [T:] - Task was not deployed or was redeployed since task execution [taskName=com.sbt.azimuth_psi.publisher.forms.computing.parallelBatchCollectForm$TestMapFunction, taskClsName=com.sbt.azimuth_psi.publisher.forms.computing.parallelBatchCollectForm$TestMapFunction, codeVer=0, clsLdrId=2c044c4fd51-101abc71-83b4-4a87-bb07-14e4cbc7226e, seqNum=1503050088642, depMode=SHARED, dep=null] org.apache.ignite.IgniteDeploymentException: Task was not deployed or was redeployed since task execution [taskName=com.sbt.azimuth_psi.publisher.forms.computing.parallelBatchCollectForm$TestMapFunction, taskClsName=com.sbt.azimuth_psi.publisher.forms.computing.parallelBatchCollectForm$TestMapFunction, codeVer=0, clsLdrId=2c044c4fd51-101abc71-83b4-4a87-bb07-14e4cbc7226e, seqNum=1503050088642, depMode=SHARED, dep=null] at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1160) ~[ignite-core-2.1.3.jar:2.1.3] at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1908) [ignite-core-2.1.3.jar:2.1.3] at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) [ignite-core-2.1.3.jar:2.1.3] at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) [ignite-core-2.1.3.jar:2.1.3] at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126) [ignite-core-2.1.3.jar:2.1.3] at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097) [ignite-core-2.1.3.jar:2.1.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80] {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6113) Partition eviction prevents exchange from completion
Vladislav Pyatkov created IGNITE-6113: - Summary: Partition eviction prevents exchange from completion Key: IGNITE-6113 URL: https://issues.apache.org/jira/browse/IGNITE-6113 Project: Ignite Issue Type: Bug Affects Versions: 2.1 Reporter: Vladislav Pyatkov Customer has waited for 3 hours for completion without any success. exchange-worker is blocked. {noformat} "exchange-worker-#92%DPL_GRID%grid554.ca.sbrf.ru%" #173 prio=5 os_prio=0 tid=0x7f0835c2e000 nid=0xb907 runnable [0x7e74ab1d] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x7efee630a7c0> (a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition$1) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:189) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:139) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.assign(GridDhtPreloader.java:340) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1801) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:748) Locked ownable synchronizers: - None {noformat} {noformat} "sys-#124%DPL_GRID%grid554.ca.sbrf.ru%" #278 prio=5 os_prio=0 tid=0x7e731c02d000 nid=0xbf4d runnable [0x7e734e7f7000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcherImpl.write0(Native Method) at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at 
sun.nio.ch.IOUtil.write(IOUtil.java:51) at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211) - locked <0x7f056161bf88> (a java.lang.Object) at org.gridgain.grid.cache.db.wal.FileWriteAheadLogManager$FileWriteHandle.writeBuffer(FileWriteAheadLogManager.java:1829) at org.gridgain.grid.cache.db.wal.FileWriteAheadLogManager$FileWriteHandle.flush(FileWriteAheadLogManager.java:1572) at org.gridgain.grid.cache.db.wal.FileWriteAheadLogManager$FileWriteHandle.addRecord(FileWriteAheadLogManager.java:1421) at org.gridgain.grid.cache.db.wal.FileWriteAheadLogManager$FileWriteHandle.access$800(FileWriteAheadLogManager.java:1331) at org.gridgain.grid.cache.db.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:339) at org.gridgain.grid.internal.processors.cache.database.pagemem.PageMemoryImpl.beforeReleaseWrite(PageMemoryImpl.java:1287) at org.gridgain.grid.internal.processors.cache.database.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1142) at org.gridgain.grid.internal.processors.cache.database.pagemem.PageImpl.releaseWrite(PageImpl.java:167) at org.apache.ignite.internal.processors.cache.database.tree.util.PageHandler.writeUnlock(PageHandler.java:193) at org.apache.ignite.internal.processors.cache.database.tree.util.PageHandler.writePage(PageHandler.java:242) at org.apache.ignite.internal.processors.cache.database.tree.util.PageHandler.writePage(PageHandler.java:119) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree$Remove.doRemoveFromLeaf(BPlusTree.java:2886) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree$Remove.removeFromLeaf(BPlusTree.java:2865) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree$Remove.access$6900(BPlusTree.java:2515) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.removeDown(BPlusTree.java:1607) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.removeDown(BPlusTree.java:1574) at 
org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.removeDown(BPlusTree.java:1574) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.removeDown(BPlusTree.java:1574) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.removeDown(BPlusTree.java:1574) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.doRemove(BPlusTree.java:1481) at org.apache.ignite.internal.processors.cache.database.tree.BPlusTree.remove(BPlusTree.java:1451) at
[jira] [Created] (IGNITE-6083) Null value have appear in the entry processor, but the entry is existing
Vladislav Pyatkov created IGNITE-6083: - Summary: Null value have appear in the entry processor, but the entry is existing Key: IGNITE-6083 URL: https://issues.apache.org/jira/browse/IGNITE-6083 Project: Ignite Issue Type: Bug Components: cache Affects Versions: 2.1 Reporter: Vladislav Pyatkov One thread loads some data into a cache; after that I execute an OPTIMISTIC, SERIALIZABLE transaction with two {{IgniteCache.invoke()}} calls. The value is correct in the first {{EntryProcessor}}, but it is NULL in the second. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5941) Index with long name stores incorrect
Vladislav Pyatkov created IGNITE-5941: - Summary: Index with long name stores incorrect Key: IGNITE-5941 URL: https://issues.apache.org/jira/browse/IGNITE-5941 Project: Ignite Issue Type: Bug Components: persistence Reporter: Vladislav Pyatkov An SQL query that uses an index with a long name returns inconsistent results after the cluster is restarted and recovered from storage. At the same time, a query that uses another index (with a shorter name) works correctly both before and after recovery. An example of a long index name: {code} QueryIndex index = new QueryIndex("name", true, "COM.SBT.AZIMUTH_PSI.PUBLISHER.ENTITIES.PUB.PARTICLES.CARPORT#MODELCOM.SBT.AZIMUTH_PSI.PUBLISHER.ENTITIES.PUB.PARTICLES.CARPORT"); {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5889) Exception message confused when double cache configuration by some mistake.
Vladislav Pyatkov created IGNITE-5889: - Summary: Exception message confused when double cache configuration by some mistake. Key: IGNITE-5889 URL: https://issues.apache.org/jira/browse/IGNITE-5889 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} javax.cache.CacheException: Failed to start client cache (a cache with the given name is not started): cache1 at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.clientCachesToStart(CacheAffinitySharedManager.java:378) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCacheStartRequests(CacheAffinitySharedManager.java:411) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:603) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:410) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:1789) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1878) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5602) By bytes access to binary format
Vladislav Pyatkov created IGNITE-5602: - Summary: By bytes access to binary format Key: IGNITE-5602 URL: https://issues.apache.org/jira/browse/IGNITE-5602 Project: Ignite Issue Type: New Feature Affects Versions: 2.0 Reporter: Vladislav Pyatkov We need to avoid additional memory allocation when passing bytes to a stream. Currently the only option is: {code} BinaryObject get = (BinaryObject) cache.get(key); byte[] dataFromCache = get.field("data"); System.out.write(dataFromCache, 0, dataFromCache.length); {code} But we want to write the bytes to the stream directly, without allocating an additional {{byte[]}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
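The request in IGNITE-5602 can be sketched as follows. This is only an illustration under stated assumptions: the serialized object is held in a single on-heap `byte[]`, and the offset and length of the field inside that array are already known. The class `BinaryFieldView` and its `writeTo` method are hypothetical and do not exist in Ignite.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical view of one field inside a serialized object (not an Ignite API). */
final class BinaryFieldView {
    /** Array that holds the whole serialized object; it is shared, never copied. */
    private final byte[] backing;

    /** Offset of the first byte of the field inside the backing array. */
    private final int off;

    /** Length of the field in bytes. */
    private final int len;

    BinaryFieldView(byte[] backing, int off, int len) {
        if (off < 0 || len < 0 || off > backing.length - len)
            throw new IndexOutOfBoundsException("off=" + off + ", len=" + len);

        this.backing = backing;
        this.off = off;
        this.len = len;
    }

    /** Writes the field bytes straight from the backing array, without an intermediate byte[]. */
    void writeTo(OutputStream out) throws IOException {
        out.write(backing, off, len);
    }
}
```

With such a view, the snippet from the issue would call `writeTo(System.out)` on the field instead of first materializing `dataFromCache`. Whether the real binary format can expose a field this way (for example for off-heap or compressed data) is outside the scope of this sketch.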
[jira] [Created] (IGNITE-5340) AssertionError in index name check
Vladislav Pyatkov created IGNITE-5340: - Summary: AssertionError in index name check Key: IGNITE-5340 URL: https://issues.apache.org/jira/browse/IGNITE-5340 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov {noformat} java.lang.AssertionError: null at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.escapeName(IgniteH2Indexing.java:1980) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$TableDescriptor.createUserIndex(IgniteH2Indexing.java:3214) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$TableDescriptor.createUserIndexes(IgniteH2Indexing.java:3199) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.createTable(IgniteH2Indexing.java:2041) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.registerType(IgniteH2Indexing.java:1909) at org.apache.ignite.internal.processors.query.GridQueryProcessor.registerCache0(GridQueryProcessor.java:1298) at org.apache.ignite.internal.processors.query.GridQueryProcessor.initializeCache(GridQueryProcessor.java:751) at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStart(GridQueryProcessor.java:809) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCache(GridCacheProcessor.java:1261) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1930) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1817) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onCacheChangeRequest(CacheAffinitySharedManager.java:384) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onCacheChangeRequest(GridDhtPartitionsExchangeFuture.java:764) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:556) at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1824) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IGNITE-4882) Sometime network request will be produce when cache entry is locked
Vladislav Pyatkov created IGNITE-4882: - Summary: Sometime network request will be produce when cache entry is locked Key: IGNITE-4882 URL: https://issues.apache.org/jira/browse/IGNITE-4882 Project: Ignite Issue Type: Bug Affects Versions: 1.7 Reporter: Vladislav Pyatkov Look at the trace: {noformat} "utility-#10740%null%" #10774 prio=5 os_prio=0 tid=0x7f06c8026000 nid=0x556 waiting on condition [0x7f06c33f] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00074e9e65f8> (a org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ConnectFuture) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:117) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2094) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1989) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1955) at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1148) at org.apache.ignite.internal.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1384) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1335) at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1306) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1288) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:949) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:892) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:803) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$700(CacheContinuousQueryHandler.java:91) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$1.onEntryUpdated(CacheContinuousQueryHandler.java:412) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:347) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1291) - locked <0x000683f62f38> (a org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCacheEntry) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:784) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.finish(GridNearTxLocal.java:747) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:418) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$4.apply(GridNearTxLocal.java:868) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$4.apply(GridNearTxLocal.java:860) at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onComplete(GridNearOptimisticTxPrepareFuture.java:287) at org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onDone(GridNearOptimisticTxPrepareFuture.java:264) at
[jira] [Created] (IGNITE-4747) Memory leak on massive cache operations over atomic cache
Vladislav Pyatkov created IGNITE-4747: - Summary: Memory leak on massive cache operations over atomic cache Key: IGNITE-4747 URL: https://issues.apache.org/jira/browse/IGNITE-4747 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov When several nodes are started with the application (I tested with 4 nodes), after some time (depending on the heap size) we get a Full GC pause and grid segmentation. After analyzing a heap dump, I found some suspicious objects:
{noformat}
Class                                                    Instance Count   Total Size
class java.util.concurrent.ConcurrentSkipListMap$Node    63622411         1526937864
class java.util.concurrent.ConcurrentSkipListMap$Index   31810595         763454280
class java.lang.Long                                     63622318         508978544
{noformat}
The leak is in the "updates" field of class GridDhtLocalPartition. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IGNITE-4579) Cluster ceased to process transaction, after massive operations of cache on compute task
Vladislav Pyatkov created IGNITE-4579: - Summary: Cluster ceased to process transaction, after massive operations of cache on compute task Key: IGNITE-4579 URL: https://issues.apache.org/jira/browse/IGNITE-4579 Project: Ignite Issue Type: Bug Affects Versions: 1.6 Reporter: Vladislav Pyatkov The cluster stopped processing transactions after massive cache operations in a compute task. Some of the threads are blocked on an NIO session: {noformat} at java.util.concurrent.Semaphore.acquireUninterruptibly(Semaphore.java:335) at org.apache.ignite.internal.util.nio.GridSelectorNioSessionImpl.offerFuture(GridSelectorNioSessionImpl.java:190) at org.apache.ignite.internal.util.nio.GridNioServer.send0(GridNioServer.java:434) {noformat} Others, while invoking a cache operation, are calling another node synchronously: {noformat} at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1193) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:778) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:927) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicUpdateFuture.map(GridDhtAtomicUpdateFuture.java:406) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1588) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1337) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4491) Commutation loss between two nodes leads to hang whole cluster.
Vladislav Pyatkov created IGNITE-4491: - Summary: Commutation loss between two nodes leads to hang whole cluster. Key: IGNITE-4491 URL: https://issues.apache.org/jira/browse/IGNITE-4491 Project: Ignite Issue Type: Bug Affects Versions: 1.8 Reporter: Vladislav Pyatkov Priority: Critical
Reproduction steps:
1) Start nodes:
   DC1                DC2
   1 (10.116.172.1)   8 (10.116.64.11)
   2 (10.116.172.2)   7 (10.116.64.12)
   3 (10.116.172.3)   6 (10.116.64.13)
   4 (10.116.172.4)   5 (10.116.64.14)
   Each node has a client which runs on the same host as the server (see the source in the attachment).
2) Drop connections:
   Between 1 (10.116.172.1) and 8 (10.116.64.11), drop all input and output traffic. Invoke from 10.116.172.1:
   iptables -A INPUT -s 10.116.64.11 -j DROP
   iptables -A OUTPUT -d 10.116.64.11 -j DROP
   Between 4 (10.116.172.4) and 5 (10.116.64.14). Invoke from 10.116.172.4:
   iptables -A INPUT -s 10.116.64.14 -j DROP
   iptables -A OUTPUT -d 10.116.64.14 -j DROP
3) Stop the grid, after several seconds
In the logs you can find which nodes were segmented after the traffic was dropped (note which clients were not segmented):
[12:04:33,914][INFO][disco-event-worker-#211%null%][GridDiscoveryManager] Topology snapshot [ver=18, servers=6, clients=8, CPUs=456, heap=68.0GB]
All operations stopped at the same time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3994) Client buffer CacheContinuousQueryEntry on pendingEvts after reconnect to alive cluster.
Vladislav Pyatkov created IGNITE-3994: - Summary: Client buffer CacheContinuousQueryEntry on pendingEvts after reconnect to alive cluster. Key: IGNITE-3994 URL: https://issues.apache.org/jira/browse/IGNITE-3994 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3806) Node hangs on invocation of .clear() method of Ignite queue.
Vladislav Pyatkov created IGNITE-3806: - Summary: Node hangs on invocation of .clear() method of Ignite queue. Key: IGNITE-3806 URL: https://issues.apache.org/jira/browse/IGNITE-3806 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Steps for reproduction: 1) Start two nodes from the example. 2) Everything works fine as long as none of the nodes leaves the topology. 3) The node hangs on .clear() in the listener. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3748) Data rebalancing of large cache can hang out.
Vladislav Pyatkov created IGNITE-3748: - Summary: Data rebalancing of large cache can hang out. Key: IGNITE-3748 URL: https://issues.apache.org/jira/browse/IGNITE-3748 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov See [the thread for details | http://apache-ignite-users.70518.x6.nabble.com/Failed-to-wait-for-initial-partition-map-exchange-tt6252.html#a7171] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3618) Client can not load data after server restarts
Vladislav Pyatkov created IGNITE-3618: - Summary: Client can not load data after server restarts Key: IGNITE-3618 URL: https://issues.apache.org/jira/browse/IGNITE-3618 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov
Start the server and the client.
After the client has printed "Sleep", restart the server.
Wait for the topology update; the client will reconnect.
Press Enter in the client console and you will see "No object in cache" in the client console.
The server throws an exception:
{noformat}
Caused by: class org.apache.ignite.binary.BinaryObjectException: Cannot find metadata for object with compact footer: -995427962
at org.apache.ignite.internal.binary.BinaryReaderExImpl.getOrCreateSchema(BinaryReaderExImpl.java:1687)
at org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:255)
at org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:168)
at org.apache.ignite.internal.binary.BinaryObjectImpl.reader(BinaryObjectImpl.java:572)
at org.apache.ignite.internal.binary.BinaryObjectImpl.reader(BinaryObjectImpl.java:585)
at org.apache.ignite.internal.binary.BinaryObjectImpl.hasField(BinaryObjectImpl.java:395)
at org.apache.ignite.internal.processors.query.GridQueryProcessor$BinaryProperty.value(GridQueryProcessor.java:1990)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$RowDescriptor.columnValue(IgniteH2Indexing.java:2513)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2AbstractKeyValueRow.getValue(GridH2AbstractKeyValueRow.java:289)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2IndexBase.compareRows(GridH2IndexBase.java:119)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2TreeIndex.compare(GridH2TreeIndex.java:248)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2TreeIndex.compare(GridH2TreeIndex.java:49)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap$2.compareTo(GridOffHeapSnapTreeMap.java:1350)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap$2.compareTo(GridOffHeapSnapTreeMap.java:1346)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap.attemptUpdate(GridOffHeapSnapTreeMap.java:2102)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap.updateUnderRoot(GridOffHeapSnapTreeMap.java:2034)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap.update(GridOffHeapSnapTreeMap.java:1915)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap.put(GridOffHeapSnapTreeMap.java:1864)
at org.apache.ignite.internal.util.offheap.unsafe.GridOffHeapSnapTreeMap.put(GridOffHeapSnapTreeMap.java:108)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2TreeIndex.put(GridH2TreeIndex.java:403)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.doUpdate(GridH2Table.java:405)
at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:339)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:539)
at org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:700)
at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:407)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.updateIndex(GridCacheMapEntry.java:4024)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1244)
at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:802)
... 29 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3362) Event EVT_CACHE_REBALANCE_STOPPED fires prematurely.
Vladislav Pyatkov created IGNITE-3362: - Summary: Event EVT_CACHE_REBALANCE_STOPPED fires prematurely. Key: IGNITE-3362 URL: https://issues.apache.org/jira/browse/IGNITE-3362 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Semen Boikov EVT_CACHE_REBALANCE_STOPPED fires before the partitions have been rebalanced. See the details here: [How do I know the cache rebalance is finished?|http://apache-ignite-users.70518.x6.nabble.com/How-do-I-know-the-cache-rebalance-is-finished-tc5219.html#a5746] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3218) Partition can not be reserved
Vladislav Pyatkov created IGNITE-3218: - Summary: Partition can not be reserved Key: IGNITE-3218 URL: https://issues.apache.org/jira/browse/IGNITE-3218 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov
A ScanQuery with a partition set fails with the error:
{noformat}
Caused by: class org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtUnreservedPartitionException [part=3, msg=Partition can not be reserved.]
at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.onheapIterator(GridCacheQueryManager.java:1042)
at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.scanIterator(GridCacheQueryManager.java:854)
at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.scanQueryLocal(GridCacheQueryManager.java:1761)
at org.apache.ignite.internal.processors.cache.query.GridCacheQueryAdapter$ScanQueryFallbackClosableIterator.init(GridCacheQueryAdapter.java:677)
... 27 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-3190) OffHeap cache metrics do not detected get from OffHeap
Vladislav Pyatkov created IGNITE-3190: - Summary: OffHeap cache metrics do not detected get from OffHeap Key: IGNITE-3190 URL: https://issues.apache.org/jira/browse/IGNITE-3190 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov
In a simple cache configuration with the OffHeap tiered memory mode (statistics must be enabled), the count of gets from OffHeap never increases (CacheMetrics#getOffHeapGets is always 0).
{code}
cache.put(46744, "val 46744");
cache.get(46744);
{code}
{noformat}
2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries count 0
2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) entries count 1
2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
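To compare the tiers in the output above at a glance, the put and get counters can be extracted per tier. This is a rough sketch only, assuming the log layout printed by the reporter's ServerNode class; TierGetCounters is a hypothetical helper, not an Ignite API, and the parenthesized pair is ignored because the report does not say what it means.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an Ignite class): reads "<tier> put N get N" counters from the log. */
final class TierGetCounters {
    // Assumes the layout shown in the report, e.g. "OffHeap put 1 get 0 (0, 0) entries count 1".
    private static final Pattern TIER =
        Pattern.compile("(Swap|OffHeap|OnHeap) put (\\d+) get (\\d+)");

    /** Maps each tier name to {puts, gets}, in the order the tiers appear in the log. */
    static Map<String, long[]> parse(String log) {
        Map<String, long[]> counters = new LinkedHashMap<>();
        Matcher m = TIER.matcher(log);
        while (m.find()) {
            long puts = Long.parseLong(m.group(2));
            long gets = Long.parseLong(m.group(3));
            counters.put(m.group(1), new long[] {puts, gets});
        }
        return counters;
    }

    public static void main(String[] args) {
        String log =
            "2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries count 0\n"
            + "2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) entries count 1\n"
            + "2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0)\n";
        for (Map.Entry<String, long[]> e : parse(log).entrySet())
            System.out.println(e.getKey() + ": puts=" + e.getValue()[0] + ", gets=" + e.getValue()[1]);
    }
}
```

On the log from the report this lists one put and no gets for OffHeap against one put and one get for OnHeap, which is the discrepancy the issue describes.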