[ https://issues.apache.org/jira/browse/IGNITE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maksim Timonin reassigned IGNITE-17908: --------------------------------------- Assignee: Maksim Timonin > AssertionError LWM after reserved on data insertion after the cluster restart > ----------------------------------------------------------------------------- > > Key: IGNITE-17908 > URL: https://issues.apache.org/jira/browse/IGNITE-17908 > Project: Ignite > Issue Type: Sub-task > Reporter: Anton Vinogradov > Assignee: Maksim Timonin > Priority: Major > Labels: ise > Attachments: LwmAfterReservedTest.java > > > After the cluster restart you may see the following assertion: > {code} > java.lang.AssertionError: LWM after reserved: lwm=2030, reserved=2010, > cntr=Counter [lwm=2030, missed=[], hwm=2030, reserved=2011] > at > org.apache.ignite.internal.processors.cache.PartitionUpdateCounterTrackingImpl.reserve(PartitionUpdateCounterTrackingImpl.java:270) > at > org.apache.ignite.internal.processors.cache.PartitionUpdateCounterErrorWrapper.reserve(PartitionUpdateCounterErrorWrapper.java:58) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.getAndIncrementUpdateCounter(IgniteCacheOffheapManagerImpl.java:1594) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.getAndIncrementUpdateCounter(GridCacheOffheapManager.java:2483) > at > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtLocalPartition.getAndIncrementUpdateCounter(GridDhtLocalPartition.java:942) > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.calculatePartitionUpdateCounters(IgniteTxLocalAdapter.java:510) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare0(GridDhtTxPrepareFuture.java:1356) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.mapIfLocked(GridDhtTxPrepareFuture.java:726) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare(GridDhtTxPrepareFuture.java:1132) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.prepareAsyncLocal(GridNearTxLocal.java:4282) > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.prepareColocatedTx(IgniteTxHandler.java:303) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.proceedPrepare(GridNearOptimisticTxPrepareFuture.java:565) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.prepareSingle(GridNearOptimisticTxPrepareFuture.java:392) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.prepare0(GridNearOptimisticTxPrepareFuture.java:335) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFutureAdapter.prepareOnTopology(GridNearOptimisticTxPrepareFutureAdapter.java:205) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFutureAdapter.prepare(GridNearOptimisticTxPrepareFutureAdapter.java:129) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.prepareNearTxLocal(GridNearTxLocal.java:3946) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.commitNearTxLocalAsync(GridNearTxLocal.java:3994) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.optimisticPutFuture(GridNearTxLocal.java:3051) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.putAsync0(GridNearTxLocal.java:729) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.putAsync(GridNearTxLocal.java:484) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter$20.op(GridCacheAdapter.java:2511) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter$20.op(GridCacheAdapter.java:2509) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4284) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2509) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2487) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2466) > at > org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1332) > at > org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:867) > at > org.apache.ignite.util.BrokenRebalanceTest.testCountersOnCrachRecovery(BrokenRebalanceTest.java:191) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2506) > {code} > when clusted was under the load and has a counters gaps on restart. > See [^LwmAfterReservedTest.java] > Possible solution is to fix > {{PartitionUpdateCounterTrackingImpl#reservedCntr}} calculation > from > {code} > long max = Math.max(val, curLwm); > {code} > to something like > {code} > long max = Math.max(val, curLwm); > max = Math.max(max, highestAppliedCounter()); > {code} > Another possible fix is to get rid or > {{PartitionUpdateCounterTrackingImpl#reservedCntr}} and always use > {{highestAppliedCounter()}}. > This may slowdown the system, but look like a correct fix. So, some > optimisation may be required. > Please make sure that counters are correct after the fix using idle_verify. -- This message was sent by Atlassian Jira (v8.20.10#820010)