Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-06-08 Thread agura
Hi,

I think writeSynchronizationMode doesn't affect key locking. Actually, this
mode is just a hint for the transaction-originating node: it defines when the
tx should return control flow to user code, either when all nodes have
responded or when only the primary nodes have responded (without waiting for
replies from backups).
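
For illustration, roughly where this mode is set (a minimal sketch; the cache
name and types are invented):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheWriteSynchronizationMode;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class SyncModeSketch {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start()) {
                CacheConfiguration<String, String> cfg =
                    new CacheConfiguration<>("exampleCache");
                // PRIMARY_SYNC: control returns to the caller once primary
                // nodes respond; backups are written without waiting.
                // FULL_SYNC would also wait for backups. Neither setting
                // changes key-locking behavior.
                cfg.setWriteSynchronizationMode(
                    CacheWriteSynchronizationMode.PRIMARY_SYNC);
                ignite.getOrCreateCache(cfg).put("k", "v");
            }
        }
    }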





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-05-22 Thread Andrey Gura
Hi,

The issue mentioned earlier is related to both pessimistic and optimistic
transactions, so it is still possible that it is your case.

Anyway, I don't have any other ideas about the problem. It would be
great to have a reproducer in order to debug the issue, but from my point
of view migrating to a newer version is the best way forward.

On Fri, May 19, 2017 at 12:23 PM, yfernando <yohan.ferna...@tudor.com> wrote:
> The issue you mention doesn't seem to be related to the issue we are
> having, as:
> - The entire grid does not lock up as in the situation described in the bug
> - There are no threads blocked or locked when our key lock occurs
> - The bug seems to occur on Optimistic locking whereas our scenario occurs
> on Pessimistic
>
>
> agura wrote
>> There was a problem with incorrect transaction timeout handling [1]
>> that was fixed in Ignite 1.8. It is possible that it is your case.
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-2797
>
> Another observation we had was that the locked key belongs to a cache that
> gets loaded at startup and never changes content. The cache does, however,
> take part in transactions in terms of cache reads (using
> IgniteCache.get()).


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-05-19 Thread yfernando
The issue you mention doesn't seem to be related to the issue we are
having, as:
- The entire grid does not lock up as in the situation described in the bug
- There are no threads blocked or locked when our key lock occurs
- The bug seems to occur on Optimistic locking whereas our scenario occurs
on Pessimistic


agura wrote
> There was a problem with incorrect transaction timeout handling [1]
> that was fixed in Ignite 1.8. It is possible that it is your case.
> 
> [1] https://issues.apache.org/jira/browse/IGNITE-2797

Another observation we had was that the locked key belongs to a cache that
gets loaded at startup and never changes content. The cache does, however,
take part in transactions in terms of cache reads (using IgniteCache.get()).





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-05-16 Thread Andrey Gura
Hi,

There was a problem with incorrect transaction timeout handling [1]
that was fixed in Ignite 1.8. It is possible that it is your case.

[1] https://issues.apache.org/jira/browse/IGNITE-2797

On Thu, May 11, 2017 at 1:51 AM, bintisepaha <binti.sep...@tudor.com> wrote:
> Hey guys, we had a key lock issue again on 1.7.0. Here is a suspicious
> thread dump. Is this helpful for tracking down our issue further?
> We did not see any topology changes or any other exceptions.
>
> Attaching the entire thread dump too: tdump.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/n12611/tdump.zip>
>
> "pub-#7%DataGridServer-Production%" Id=47 in WAITING on
> lock=org.apache.ignite.internal.util.future.GridFutureAdapter$ChainFuture@2094df59
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159)
>   at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:117)
>   at
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4800)
>   at
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4783)
>   at
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1395)
>   at
> org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:956)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.updatePosition(OrderHolderSaveRunnable.java:790)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelPosition(OrderHolderSaveRunnable.java:805)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelExistingTradeOrderForPositionUpdate(OrderHolderSaveRunnable.java:756)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.processOrderHolders(OrderHolderSaveRunnable.java:356)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.run(OrderHolderSaveRunnable.java:109)
>   at
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4V2.execute(GridClosureProcessor.java:2184)
>   at
> org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
>   at
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6521)
>   at
> org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
>   at
> org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
>   at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at
> org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1161)
>   at
> org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
>   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>
>   Locked synchronizers: count = 1
>      <1237e0be>  - java.util.concurrent.ThreadPoolExecutor$Worker@1237e0be


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-05-10 Thread bintisepaha
Hey guys, we had a key lock issue again on 1.7.0. Here is a suspicious thread
dump. Is this helpful for tracking down our issue further?
We did not see any topology changes or any other exceptions.

Attaching the entire thread dump too: tdump.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n12611/tdump.zip>  

"pub-#7%DataGridServer-Production%" Id=47 in WAITING on
lock=org.apache.ignite.internal.util.future.GridFutureAdapter$ChainFuture@2094df59
  at sun.misc.Unsafe.park(Native Method)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
  at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
  at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
  at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159)
  at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:117)
  at
org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4800)
  at
org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4783)
  at
org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1395)
  at
org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:956)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.updatePosition(OrderHolderSaveRunnable.java:790)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelPosition(OrderHolderSaveRunnable.java:805)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelExistingTradeOrderForPositionUpdate(OrderHolderSaveRunnable.java:756)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.processOrderHolders(OrderHolderSaveRunnable.java:356)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.run(OrderHolderSaveRunnable.java:109)
  at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4V2.execute(GridClosureProcessor.java:2184)
  at
org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
  at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6521)
  at
org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
  at
org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
  at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
  at
org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1161)
  at
org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
  at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
  at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
  at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
  at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
  at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

  Locked synchronizers: count = 1
 <1237e0be>  - java.util.concurrent.ThreadPoolExecutor$Worker@1237e0be







Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-26 Thread Andrey Gura
Hi,

The fix is merged into ignite-2.0, which will be released soon.

On Tue, Apr 25, 2017 at 4:52 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> We have not been able to migrate to 1.8 yet due to business reasons, and now
> we are planning to migrate to 1.9 directly and are testing it.
>
> Do you have an ETA for the fix? Also do you have the code for reproducing
> it?
>
> We are also changing our clients to become long running servers so that the
> topology does not change a lot.
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-24 Thread Andrey Gura
Hi Binti,

Unfortunately, I don't have any other ideas about why this NPE happens
(except for the two mentioned above). But I believe that the fix for this
problem will prevent this NPE in all the other cases.

About the key remaining locked: in some cases the mentioned NPE led to it.

BTW, did you migrate to Ignite 1.8?


On Fri, Apr 21, 2017 at 10:39 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Andrey, we never start a txn on the client side. The key that gets locked on
> our end and stays locked even after a successful txn is never read or
> updated in a txn from the client side. Are you also able to reproduce the
> key remaining locked issue?
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-21 Thread bintisepaha
Andrey, we never start a txn on the client side. The key that gets locked on
our end and stays locked even after a successful txn is never read or
updated in a txn from the client side. Are you also able to reproduce the
key remaining locked issue?

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-20 Thread Andrey Gura
Hi,

I have a simple reproducer and JIRA ticket is created:
https://issues.apache.org/jira/browse/IGNITE-5041

The workaround looks strange, but it should work: call getOrCreateCache on
the client node for all of your caches, or disable deadlock detection (see
the ticket).
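
Roughly, both workarounds look like this on the client side (a sketch; the
cache name is a placeholder, and the IGNITE_TX_DEADLOCK_DETECTION_MAX_ITERS
system property set to 0 is what disables detection). Either workaround alone
should be enough:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;

    public class ClientWorkaroundSketch {
        public static void main(String[] args) {
            // Workaround 2: disable deadlock detection (see IGNITE-5041);
            // 0 iterations means detection is skipped entirely.
            System.setProperty("IGNITE_TX_DEADLOCK_DETECTION_MAX_ITERS", "0");

            Ignition.setClientMode(true);
            try (Ignite client = Ignition.start()) {
                // Workaround 1: touch each cache from the client so the
                // cache descriptor exists locally; repeat for every cache
                // ("someCache" is a placeholder name).
                client.getOrCreateCache("someCache");
            }
        }
    }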


On Thu, Apr 20, 2017 at 12:32 AM, bintisepaha <binti.sep...@tudor.com> wrote:
> This is positive news Andrey. Thanks a lot.
>
> Please keep us posted about reproducing this. We are definitely not using
> node filters, and we suspect topology changes to be causing issues, but
> irrespective of that, we are not able to reproduce it. We also do not see
> deadlock issues reported anywhere. The last time we got a key lock, last
> week, we did not see the NPE, only a topology change for a client.
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-19 Thread bintisepaha
This is positive news Andrey. Thanks a lot.

Please keep us posted about reproducing this. We are definitely not using
node filters, and we suspect topology changes to be causing issues, but
irrespective of that, we are not able to reproduce it. We also do not see
deadlock issues reported anywhere. The last time we got a key lock, last
week, we did not see the NPE, only a topology change for a client.

Thanks,
Binti






Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-19 Thread Andrey Gura
2449097, partId=-1,
> txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
> valueDate=19000101], hasValBytes=true], cacheId=812449097], val=[op=READ,
> val=null], prevVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
> conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
> filters=null, filtersPassed=false, filtersSet=true,
> entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=PositionId
> [fundAbbrev=BVI, clearBrokerId=12718, insIid=679675, strategy=AFI,
> traderId=6531, valueDate=19000101], hasValBytes=true], val=null,
> startVer=1492027414018, ver=GridCacheVersion [topVer=103141926,
> time=1492107344360, order=1492027414018, nodeOrder=13], hash=-1282641846,
> extras=null, flags=0]]], prepared=0, locked=false,
> nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871, locMapped=false,
> expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
> serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
> time=1492107344341, order=1492027414014, nodeOrder=13]]}],
> super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=103141926,
> time=1492107344341, order=1492027414014, nodeOrder=13], writeVer=null,
> implicit=false, loc=true, threadId=83, startTime=1492107344336,
> nodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, startVer=GridCacheVersion
> [topVer=103141926, time=1492107344341, order=1492027414014, nodeOrder=13],
> endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
> timeout=1, sysInvalidate=false, sys=false, plc=2, commitVer=null,
> finalizing=NONE, preparing=false, invalidParts=null, state=MARKED_ROLLBACK,
> timedOut=false, topVer=AffinityTopologyVersion [topVer=1979,
> minorTopVer=34], duration=10006ms, onePhaseCommit=false], size=2


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-17 Thread bintisepaha
, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
marketPrice=null, interestAccrualDate=null, sourceAppId=103,
initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
[tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
[tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
trimDb=]]], hasValBytes=true], startVer=1492027414017, ver=GridCacheVersion
[topVer=103141923, time=1492107302158, order=1492027411508, nodeOrder=14],
hash=383868216, extras=null, flags=0]]], prepared=0, locked=true,
nodeId=89f19ec9-462e-4d99-a267-1189457c124b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
time=1492107344341, order=1492027414014, nodeOrder=13]], IgniteTxKey
[key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], cacheId=812449097]=IgniteTxEntry
[key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], cacheId=812449097, partId=-1,
txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], cacheId=812449097], val=[op=READ,
val=null], prevVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=null, filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=PositionId
[fundAbbrev=BVI, clearBrokerId=12718, insIid=679675, strategy=AFI,
traderId=6531, valueDate=19000101], hasValBytes=true], val=null,
startVer=1492027414018, ver=GridCacheVersion [topVer=103141926,
time=1492107344360, order=1492027414018, nodeOrder=13], hash=-1282641846,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
time=1492107344341, order=1492027414014, nodeOrder=13]]}],
super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=103141926,
time=1492107344341, order=1492027414014, nodeOrder=13], writeVer=null,
implicit=false, loc=true, threadId=83, startTime=1492107344336,
nodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, startVer=GridCacheVersion
[topVer=103141926, time=1492107344341, order=1492027414014, nodeOrder=13],
endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
timeout=1, sysInvalidate=false, sys=false, plc=2, commitVer=null,
finalizing=NONE, preparing=false, invalidParts=null, state=MARKED_ROLLBACK,
timedOut=false, topVer=AffinityTopologyVersion [topVer=1979,
minorTopVer=34], duration=10006ms, onePhaseCommit=false], size=2





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-13 Thread bintisepaha
Andrey, in the exception above, we see this:

 pendingLocks=[KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true]], super=GridCompoundIdentityFuture
[super=GridCompoundFuture [rdc=Bool reducer: true, initFlag=0, lsnrCalls=0,
done=true, cancelled=false, err=null, futs=[]]]
class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=1,
tx=org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter$1@7597f862]

What does this mean? Clearly it's a pending lock, and we know that to be the
behavior because every subsequent update to this key fails with a txn
timeout. But up until this point, all updates were fine.

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-13 Thread bintisepaha
(IgniteTxLocalAdapter.java:3874)
at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745)







Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-12 Thread Andrey Gura
As far as I know, no one else has such a problem.

On Wed, Apr 12, 2017 at 4:15 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Thanks for trying. Is the node filter an issue someone else is seeing?


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-12 Thread bintisepaha
Thanks for trying. Is the node filter an issue someone else is seeing?





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-10 Thread Andrey Gura
Hi,

Unfortunately, I haven't had a chance to reproduce it. But, anyway, I'm going
to reproduce the case with node filters. I still don't have any idea about
your case. Hopefully I'll get some ideas while reproducing and fixing the
problem that can happen when a cache exists only on some subset of nodes.

On Thu, Apr 6, 2017 at 6:41 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Andrey,
>
> We start the caches only at server startup time. Two of our caches are
> replicated; all other caches are partitioned with 1 backup. Do you think
> replicated caches might be causing this issue when clients leave and join
> the cluster?
>
> All server nodes start with the same cache config. All client nodes start
> with the same config but have no knowledge of cache creation.
>
> We don't use any node filters. For IgniteCompute we start the Runnable on
> server nodes, but for cache creation there are no filters.
>
> Any luck reproducing? Has anyone else had an issue where a txn seems like it
> is finished but the key lock is not released?
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-06 Thread bintisepaha
Andrey, 

We start the caches only at server startup time. Two of our caches are
replicated; all other caches are partitioned with 1 backup. Do you think
replicated caches might be causing this issue when clients leave and join
the cluster?
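
For reference, the layout is roughly this (a sketch; the cache names are
changed for the example):

    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class CacheLayoutSketch {
        // One of the replicated, load-once-at-startup caches.
        static CacheConfiguration<String, Object> replicatedCfg() {
            return new CacheConfiguration<String, Object>("referenceCache")
                .setCacheMode(CacheMode.REPLICATED);
        }

        // A partitioned cache keeping 1 backup copy of every partition.
        static CacheConfiguration<String, Object> partitionedCfg() {
            return new CacheConfiguration<String, Object>("positionCache")
                .setCacheMode(CacheMode.PARTITIONED)
                .setBackups(1);
        }
    }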

All server nodes start with the same cache config. All client nodes start
with the same config but have no knowledge of cache creation. 

We don't use any node filters. For IgniteCompute we start the Runnable on
server nodes, but for cache creation there are no filters.

Any luck reproducing? Has anyone else had an issue where a txn seems like it
is finished but the key lock is not released?

Thanks,
Binti








Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-04 Thread Andrey Gura
It's obviously a bug. The NPE occurs when a node doesn't have a cache but
receives a response message about this cache from another node during
deadlock detection. That's possible, for example, when you have a node filter
and caches start on some subset of the cluster's nodes. Is that your case?

I'll try to reproduce this problem, create a JIRA ticket, and answer here.

On Mon, Apr 3, 2017 at 8:23 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Sorry for the late response. We do not close/destroy caches dynamically.
> Could you please explain this NPE?


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-04-03 Thread bintisepaha
Sorry for the late response. We do not close/destroy caches dynamically.
Could you please explain this NPE?





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-27 Thread Andrey Gura
It's a very strange NPE. It looks like the cache is closed or destroyed. Is
it possible that you start and close/destroy caches dynamically?

On Fri, Mar 24, 2017 at 5:43 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Hey Andrey, thanks a lot for getting back.
>
> These errors were a result of a bad client connected to the grid.
>
> We have been running clients that leave and join the cluster constantly in
> order to see if we can reproduce this issue. Last night we saw this issue
> again. Here is one of the errors that a sys thread hit on a client node
> that initiates a transaction. The client node was not restarted or
> disconnected; it kept working fine.
> We do not restart these clients, but there are some other clients that
> leave and join the cluster.
>
> Do you think this is helpful in locating the cause?
>
> Exception in thread "sys-#41%DataGridServer-Production%"
> java.lang.NullPointerException
> at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxKey.finishUnmarshal(IgniteTxKey.java:92)
> at
> org.apache.ignite.internal.processors.cache.transactions.TxLocksResponse.finishUnmarshal(TxLocksResponse.java:190)
> at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.unmarshall(IgniteTxManager.java:2427)
> at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.onMessage(IgniteTxManager.java:2317)
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
> at
> org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> [20:02:18] Topology snapshot [ver=7551, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
> [20:02:22] Topology snapshot [ver=7552, servers=16, clients=52, CPUs=213,
> heap=740.0GB]
> [20:02:28] Topology snapshot [ver=7553, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
> [20:02:36] Topology snapshot [ver=7554, servers=16, clients=54, CPUs=217,
> heap=740.0GB]
> [20:02:40] Topology snapshot [ver=7555, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
> [20:02:41] Topology snapshot [ver=7556, servers=16, clients=54, CPUs=217,
> heap=740.0GB]
> [20:02:48] Topology snapshot [ver=7557, servers=16, clients=53, CPUs=217,
> heap=740.0GB]


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-24 Thread bintisepaha
Hey Andrey, thanks a lot for getting back.

These errors were a result of a bad client connected to the grid.

We have been running clients that leave and join the cluster constantly in
order to see if we can reproduce this issue. Last night we saw this issue
again. Here is one of the errors that a sys thread hit on a client node that
initiates a transaction. The client node was not restarted or disconnected;
it kept working fine.
We do not restart these clients, but there are some other clients that leave
and join the cluster.

Do you think this is helpful in locating the cause?

Exception in thread "sys-#41%DataGridServer-Production%"
java.lang.NullPointerException
at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxKey.finishUnmarshal(IgniteTxKey.java:92)
at
org.apache.ignite.internal.processors.cache.transactions.TxLocksResponse.finishUnmarshal(TxLocksResponse.java:190)
at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.unmarshall(IgniteTxManager.java:2427)
at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.onMessage(IgniteTxManager.java:2317)
at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[20:02:18] Topology snapshot [ver=7551, servers=16, clients=53, CPUs=217,
heap=740.0GB]
[20:02:22] Topology snapshot [ver=7552, servers=16, clients=52, CPUs=213,
heap=740.0GB]
[20:02:28] Topology snapshot [ver=7553, servers=16, clients=53, CPUs=217,
heap=740.0GB]
[20:02:36] Topology snapshot [ver=7554, servers=16, clients=54, CPUs=217,
heap=740.0GB]
[20:02:40] Topology snapshot [ver=7555, servers=16, clients=53, CPUs=217,
heap=740.0GB]
[20:02:41] Topology snapshot [ver=7556, servers=16, clients=54, CPUs=217,
heap=740.0GB]
[20:02:48] Topology snapshot [ver=7557, servers=16, clients=53, CPUs=217,
heap=740.0GB]






Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-23 Thread Andrey Gura
Hi,

From your errors list I see many connection errors. Why does that happen?

> how ignite treats cluster topology when transactions are running on server
> nodes

When Ignite executes some operation in a transaction, it maps all
involved keys to affinity nodes, and the transaction is then
propagated to those nodes. If the transaction isn't finished and some of
the involved nodes leave the topology, then the transaction will be failed
and rolled back.

Client nodes don't affect transactions when they join or leave the cluster,
except in cases where the client node is the transaction initiator or has a
local cache that participates in the transaction.
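
For orientation, the transaction pattern discussed in this thread looks
roughly like this (a sketch only; pessimistic, REPEATABLE_READ, and the
10-second timeout match the posted dumps, while the cache name and key are
invented):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.transactions.Transaction;
    import org.apache.ignite.transactions.TransactionConcurrency;
    import org.apache.ignite.transactions.TransactionIsolation;

    public class TxSketch {
        static void update(Ignite ignite, String key) {
            IgniteCache<String, Long> cache = ignite.cache("positionCache");

            // 10_000 ms timeout and tx-size hint 0, as in the posted dumps.
            try (Transaction tx = ignite.transactions().txStart(
                    TransactionConcurrency.PESSIMISTIC,
                    TransactionIsolation.REPEATABLE_READ,
                    10_000, 0)) {
                // In PESSIMISTIC mode this get() acquires the key lock.
                Long v = cache.get(key);
                cache.put(key, v == null ? 1L : v + 1L);
                // commit() releases the lock; if commit() is never reached,
                // try-with-resources rolls the tx back, which should also
                // release it.
                tx.commit();
            }
        }
    }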


On Tue, Mar 21, 2017 at 11:40 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Sorry to keep following up on this, but this is becoming a major issue for
> us, and we need to understand how Ignite treats cluster topology when
> transactions are running on server nodes. How do clients joining and
> leaving the cluster affect txns?
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-21 Thread bintisepaha
Sorry to keep following up on this, but this is becoming a major issue for
us, and we need to understand how Ignite treats cluster topology when
transactions are running on server nodes. How do clients joining and leaving
the cluster affect txns?

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-20 Thread bintisepaha
Andrey, would you be able to look at the attached errors and advise, please?

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-16 Thread bintisepaha
Evans, we don't have a grid hanging issue. Also, we do not have server nodes
leaving the cluster; the topology changes are for clients.

I am attaching some errors we saw when a key got locked.
Could someone please look at them and help us find out why a client topology
change is causing key locks?

Errors.txt
<http://apache-ignite-users.70518.x6.nabble.com/file/n11254/Errors.txt>  





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-15 Thread Evans Ye
I'm not quite sure whether this is relevant to your issue, but I've
experienced a hanging issue on Ignite 1.8.0 when the topology changed.
https://issues.apache.org/jira/browse/IGNITE-4450 might fix the problem.
FYR.


2017-03-16 1:03 GMT+08:00 bintisepaha <binti.sep...@tudor.com>:

> Andrey, do you think topology changes with client nodes (not server nodes)
> leaving and joining the cluster may have any impact on transactions?
>
> Twice out of the 4 times, we saw topology changes while the last successful
> update was done to the key that was eventually locked.
>
> Yesterday we saw this same issue on another cache. We are due for the 1.8
> upgrade soon, but given that no one else has seen this issue before, we are
> not sure whether 1.8 would have fixed it either.
>
> Let us know what you think.
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-15 Thread bintisepaha
Andrey, do you think topology changes with client nodes (not server nodes)
leaving and joining the cluster may have any impact on transactions?

Twice out of the 4 times, we saw topology changes while the last successful
update was done to the key that was eventually locked.

Yesterday we saw this same issue on another cache. We are due for the 1.8
upgrade soon, but given that no one else has seen this issue before, we are
not sure whether 1.8 would have fixed it either.

Let us know what you think.

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-09 Thread bintisepaha
This is helpful, and we have seen it earlier in our code too. So you are
right that the reason for the issue we are seeing is not a deadlock.

It feels like a strange bug where a pessimistic txn commits but doesn't
release the key. Let's see if 1.8 helps; we will upgrade to it in 2 weeks.

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-08 Thread Andrey Gura
LockFuture.java:1355)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture$LockTimeoutObject$1.apply(GridDhtColocatedLockFuture.java:1349)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:264)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:252)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:382)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:348)
at 
org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.onDone(TxDeadlockDetection.java:537)
at 
org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.onDone(TxDeadlockDetection.java:158)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:325)
at 
org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.detect(TxDeadlockDetection.java:313)
at 
org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.onResult(TxDeadlockDetection.java:513)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.onMessage(IgniteTxManager.java:2525)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1082)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:710)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:102)
at 
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:673)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

On Wed, Mar 8, 2017 at 12:32 AM, bintisepaha <binti.sep...@tudor.com> wrote:
> Could you please tell us what the logs would say if there was a deadlock?
> We roll back the txn on any exception explicitly.
>
> We are running 1.8 in the UAT environment this week. It is not a simple
> upgrade to production; we need to let it burn in. But in UAT we never saw
> this issue with 1.7 either. It has already happened in production, so far
> 3 times.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-07 Thread bintisepaha
Could you please tell us what the logs would say if there was a deadlock?
We roll back the txn on any exception explicitly.

We are running 1.8 in the UAT environment this week. It is not a simple
upgrade to production; we need to let it burn in. But in UAT we never saw
this issue with 1.7 either. It has already happened in production, so far 3
times.





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-06 Thread Andrey Gura
Hi,

You can learn more about deadlock detection here [1].

The mentioned ticket is about optimistic transactions. Deadlock detection
for pessimistic transactions (your case) is implemented in the 1.7 release.

I don't think that is your case, because on a TransactionDeadlockException
the transaction should be rolled back and all locks should be released. But
it is possible that there is a bug in this place.

By the way, do you have the same issue on 1.8?

[1] https://apacheignite.readme.io/docs/transactions#deadlock-detection
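
For example, per [1], a detected deadlock surfaces as a
TransactionDeadlockException in the cause chain of the timeout error (a
sketch; the cache name and keys are arbitrary):

    import javax.cache.CacheException;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.transactions.Transaction;
    import org.apache.ignite.transactions.TransactionConcurrency;
    import org.apache.ignite.transactions.TransactionDeadlockException;
    import org.apache.ignite.transactions.TransactionIsolation;
    import org.apache.ignite.transactions.TransactionTimeoutException;

    public class DeadlockReportSketch {
        static void tryUpdate(Ignite ignite) {
            IgniteCache<Integer, Integer> cache = ignite.cache("someCache");

            try (Transaction tx = ignite.transactions().txStart(
                    TransactionConcurrency.PESSIMISTIC,
                    TransactionIsolation.REPEATABLE_READ, 10_000, 0)) {
                cache.put(1, 1);
                cache.put(2, 2);
                tx.commit();
            }
            catch (CacheException e) {
                // A detected deadlock arrives as the cause of the timeout
                // error; its message lists the deadlocked keys and txns.
                if (e.getCause() instanceof TransactionTimeoutException
                    && e.getCause().getCause()
                        instanceof TransactionDeadlockException)
                    System.err.println(e.getCause().getCause().getMessage());
            }
        }
    }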

On Mon, Mar 6, 2017 at 5:16 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Andrey, could you please tell us a little bit more about the deadlock
> detection feature in 1.8?
>
> https://issues.apache.org/jira/browse/IGNITE-2969
>
> How would we be able to know that there is a deadlock? Do you think that is
> the case for us now on 1.7, but we can't be sure because we have no such
> feature?
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-03-06 Thread bintisepaha
Andrey, could you please tell us a little bit more about the deadlock
detection feature in 1.8?

https://issues.apache.org/jira/browse/IGNITE-2969

How would we be able to know that there is a deadlock? Do you think that is
the case for us now on 1.7, but we can't be sure because we have no such
feature?

Thanks,
Binti





Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-28 Thread Andrey Gura
> The long pause is between 2 different threads, so isn't that normal?

Usually it is not normal, so some investigation is needed in order to
understand why this pause happens.

> Also the 9990 ms and 10 ms used earlier for some previous step in the txn,
> is this how ignite breaks out the time?

When Ignite starts a transaction on the initiating node, it passes the
timeout to the transaction. After mapping the keys onto the nodes of the
cluster, Ignite starts transactions on the remote nodes and passes the
remaining transaction time to them. So it is possible that this is your
case.

> Is there any way we can force release a locked key without restarting the
> whole cluster?

There is no such way.

On Mon, Feb 27, 2017 at 5:39 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Andrey, thanks for getting back.
> The long pause is between 2 different threads, so isn't that normal?
>
> Also the 9990 ms and 10 ms used earlier for some previous step in the txn:
> is this how ignite breaks out the time? We have always seen the timeouts
> for the value in ms that we set in our code. Also, the stack trace does not
> come from our code, which it usually does on genuine timeouts that do not
> leave the key locked.
>
> We are trying 1.8 in the UAT environment and will release it to production
> soon. Unfortunately this issue does not happen in UAT, and we have no way
> of reproducing it.
>
> Is there any way we can force release a locked key without restarting the
> whole cluster?
>
> Thanks,
> Binti


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-27 Thread bintisepaha
Andrey, thanks for getting back.
The long pause is between 2 different threads, so isn't that normal?

Also the 9990 ms and 10 ms used earlier for some previous step in the txn:
is this how ignite breaks out the time? We have always seen the timeouts for
the value in ms that we set in our code. Also, the stack trace does not come
from our code, which it usually does on genuine timeouts that do not leave
the key locked.

We are trying 1.8 in the UAT environment and will release it to production
soon. Unfortunately this issue does not happen in UAT, and we have no way of
reproducing it.

Is there any way we can force release a locked key without restarting the
whole cluster?

Thanks,
Binti







Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-27 Thread Andrey Gura
.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
> 109746 at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
> 109747 at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
> 109748 at
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
> 109749 at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 109750 at java.lang.Thread.run(Thread.java:745)
> 109751
> 109752 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
> error
> 109753 SEVERE: Failed to initialize job
> [jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1,
> ses=GridJobSessionImpl [ses=GridTaskSessionImpl
> [taskName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunn
> able, dep=LocalDeployment [super=GridDeployment [ts=1487430097277,
> depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
> clsLdrId=1051eb15a51-985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, userVer=0,
> loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
> undeployed=false, usage=0]],
> taskClsName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable,
> sesId=63652976a51-2c4ae429-3284-4588   -a96e-7e487f6653a1,
> startTime=1487796377472, endTime=9223372036854775807,
> taskNodeId=2c4ae429-3284-4588-a96e-7e487f6653a1,
> clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, closed=false, cpSpi=null,
> fail   Spi=null, loadSpi=null, usage=1, fullSup=false,
> subjId=2c4ae429-3284-4588-a96e-7e487f6653a1, mapFut=IgniteFuture
> [orig=GridFutureAdapter [resFlag=0, res=null, startTime=1487796377491,
> endTime=0, ignoreIn   terrupts=false, state=INIT]]],
> jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1]]
> 109754 class org.apache.ignite.IgniteCheckedException: Failed to deserialize
> object with given class loader: sun.misc.Launcher$AppClassLoader@18b4aac2
> 109755 at
> org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:250)
> 109756 at
> org.apache.ignite.internal.processors.job.GridJobWorker.initialize(GridJobWorker.java:409)
> 109757 at
> org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1089)
> 109758 at
> org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
> 109759 at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
> 109760 at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
> 109761 at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
> 109762 at
> org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
> 109763 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 109764 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 109765 at java.lang.Thread.run(Thread.java:745)
> 109766 Caused by: java.io.EOFException: Attempt to read beyond the end of
> the stream [pos=2216, more=1, max=2215]
> 109767 at
> org.apache.ignite.internal.util.io.GridUnsafeDataInput.offset(GridUnsafeDataInput.java:171)
> 109768 at
> org.apache.ignite.internal.util.io.GridUnsafeDataInput.readByte(GridUnsafeDataInput.java:398)
> 109769 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:197)
> 109770 at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
> 109771 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
> 109772 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
> 109773 at
> org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
> 109774 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
> 109775 at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
> 109776 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
> 109777 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
> 109778 at
> org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
> 109779 at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
> 109780 at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
> 109781 at
> org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:247)
> 109782 ... 10 more
> 109783


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-27 Thread Andrey Gura
Hi

The timeout value of 9990 is OK: it is just the remaining timeout for the
next step of the transaction, because 10 ms had already been spent on the
previous steps (presumably a 10,000 ms total minus 10 ms elapsed leaves
9,990 ms).

On Thu, Feb 23, 2017 at 8:48 AM, bintisepaha <binti.sep...@tudor.com> wrote:
> Andrey,
>
> I finally have an error that might help. This happened again in production
> today for us.
> Ignite-Console-3.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/n10828/Ignite-Console-3.zip>
>
> This is the last update that threw an error; after this error, every update
> just times out.
>
> The error shows timeout=9990, but none of our transactions are configured
> with this timeout. Do you think this is an Ignite bug? If you look at the
> stack trace, this is not caused by our code.
>
> There is, however, a marshalling error. How can we narrow it down further?
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: 
> http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10828.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-23 Thread bintisepaha
-3284-4588-a96e-7e487f6653a1, mapFut=IgniteFuture [orig=GridFutureAdapter [resFlag=0, res=null, startTime=1487796377491, endTime=0, ignoreInterrupts=false, state=INIT]]], jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1]]
class org.apache.ignite.IgniteCheckedException: Failed to deserialize object with given class loader: sun.misc.Launcher$AppClassLoader@18b4aac2
at org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:250)
at org.apache.ignite.internal.processors.job.GridJobWorker.initialize(GridJobWorker.java:409)
at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1089)
at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException: Attempt to read beyond the end of the stream [pos=2216, more=1, max=2215]
at org.apache.ignite.internal.util.io.GridUnsafeDataInput.offset(GridUnsafeDataInput.java:171)
at org.apache.ignite.internal.util.io.GridUnsafeDataInput.readByte(GridUnsafeDataInput.java:398)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:197)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
at org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
at org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
at org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
at org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:247)
... 10 more




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10833.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-22 Thread bintisepaha
Andrey,

I finally have an error that might help. This happened again in production
today for us.
Ignite-Console-3.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10828/Ignite-Console-3.zip>

This is the last update that threw an error; after this error, every update
just times out.

The error shows timeout=9990, but none of our transactions are configured
with this timeout. Do you think this is an Ignite bug? If you look at the
stack trace, this is not caused by our code.

There is, however, a marshalling error. How can we narrow it down further?

Thanks,
Binti



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10828.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-21 Thread bintisepaha
Attaching the console log file too; it contains the above error.
Unfortunately, we have lost the log files for the older updates to this key;
they rolled over.
Ignite-Console-1.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10779/Ignite-Console-1.zip>

The "Transaction has been already completed" error could also happen because
we may call an explicit rollback on any RuntimeException:

tx.rollback();
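
Roughly, the full pattern looks like this (a simplified sketch, not our
exact production code; ignite, cache, key, and newVal are placeholder
names):

import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;

// Simplified sketch of the update; names are placeholders.
Transaction tx = ignite.transactions().txStart(
    TransactionConcurrency.PESSIMISTIC, TransactionIsolation.REPEATABLE_READ);

try {
    cache.get(key);       // acquires the key lock inside the pessimistic tx
    cache.put(key, newVal);
    tx.commit();
}
catch (RuntimeException e) {
    tx.rollback();        // the explicit rollback mentioned above
    throw e;
}
finally {
    tx.close();           // no-op if the tx was already committed or rolled back
}

If a timeout has already rolled the transaction back by the time rollback()
runs, that might be where the "Transaction has been already completed"
message comes from, but that is a guess on our part.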

We do get genuine timeouts in the code, but when those occur, we never see
this exception.

Hope this helps.

Thanks,
Binti





--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10779.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-21 Thread bintisepaha
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:88)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1$2$1.run(GridCacheIoManager.java:222)
at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6577)
at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:781)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
... 3 more




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10778.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-21 Thread Andrey Gura
There is nothing suspicious in the stack trace.

You can check whether a key is locked using the IgniteCache.isLocalLocked()
method. For remote nodes, you can run a task that performs this check, for
example:
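
A sketch along these lines (the cache name "myCache" and the Integer key
type are placeholders):

import java.util.Collection;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.IgniteInstanceResource;

// Sketch: ask every node whether 'key' is locked locally on that node.
public class KeyLockCheck implements IgniteCallable<Boolean> {
    @IgniteInstanceResource
    private Ignite ignite;

    private final Integer key;

    public KeyLockCheck(Integer key) { this.key = key; }

    @Override public Boolean call() {
        IgniteCache<Integer, Object> cache = ignite.cache("myCache");

        // byCurrThread = false: report a lock held by any thread on this node.
        return cache.isLocalLocked(key, false);
    }
}

// Usage from any node, one result per node:
// Collection<Boolean> locked = ignite.compute().broadcast(new KeyLockCheck(suspectKey));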

Could you please provide full logs for analysis?

On Tue, Feb 21, 2017 at 6:48 PM, bintisepaha <binti.sep...@tudor.com> wrote:
> Andrey, thanks for getting back.
> I am attaching the stack trace. I don't think the cause is a deadlock, but
> the trace is long, so maybe I am missing something; let me know if you find
> anything useful.
>
> We cannot reproduce this issue ourselves, as there are no errors on the
> prior successful update. It feels like the txn was marked successful, but
> the lock on one of the keys was not released. And later, when we try to
> access the key, it's locked, hence the exceptions.
>
> There are no messages in the logs about long-running txns or futures.
> Killing the node that holds the key does not release the lock.
>
> Is there a way to query Ignite to see if a lock is being held on a
> particular key? Is there any code we can run to salvage such locks?
>
> Any other suggestions?
>
> Thanks,
> Binti
>
>
>
>
>
>
> --
> View this message in context: 
> http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10764.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-21 Thread bintisepaha
Andrey, thanks for getting back.
I am attaching the stack trace. I don't think the cause is a deadlock, but
the trace is long, so maybe I am missing something; let me know if you find
anything useful.

We cannot reproduce this issue ourselves, as there are no errors on the
prior successful update. It feels like the txn was marked successful, but
the lock on one of the keys was not released. And later, when we try to
access the key, it's locked, hence the exceptions.

There are no messages in the logs about long-running txns or futures.
Killing the node that holds the key does not release the lock.

Is there a way to query Ignite to see if a lock is being held on a
particular key? Is there any code we can run to salvage such locks?

Any other suggestions?

Thanks,
Binti






--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10764.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-20 Thread Andrey Gura
Hi,

Actually, deadlock detection for pessimistic transactions was introduced in
the Apache Ignite 1.7 release. But it is still a good idea to try the same
code on the newest release.

In case of a transaction timeout, deadlock detection should be triggered,
and the TransactionTimeoutException should contain a
TransactionDeadlockException as its cause if a deadlock was detected. Is
that your case?
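
For example, a sketch of checking the cause chain (runTransaction() is a
placeholder for your transactional update):

import javax.cache.CacheException;
import org.apache.ignite.transactions.TransactionDeadlockException;
import org.apache.ignite.transactions.TransactionTimeoutException;

try {
    runTransaction();  // placeholder for the code that runs the tx
}
catch (CacheException e) {
    if (e.getCause() instanceof TransactionTimeoutException
        && e.getCause().getCause() instanceof TransactionDeadlockException)
        // The deadlock exception message describes the detected lock cycle.
        System.out.println(e.getCause().getCause().getMessage());
}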

If a transaction doesn't release its locks (or holds non-transactional
locks), you will see messages in the logs about long-running transactions
or futures, because partition map exchange can't finish in this case. Do
you see any messages like this?

Is it possible to create a reproducer project that we can analyze?


On Sat, Feb 18, 2017 at 1:42 AM, bintisepaha <binti.sep...@tudor.com> wrote:
> Thanks Andrey,
>
> The same thing happened again today. Clearly the key is locked: we get the
> timeout exceptions. But the prior update to the same key did not throw any
> exceptions. Suddenly one update fails with timeout exceptions, and those
> exceptions are how we learn that the key is locked.
>
> We will upgrade to 1.8, but in the meantime, is there a way to free up this
> locked key using some code?
>
> We tried killing nodes, but we have one backup, and it looks like the lock
> is carried over too, which would be the right thing to do.
>
> Outside the transaction, we can read this key (a dirty read). This is
> becoming an issue for us, since it's a production system and the only way
> to free the key up is to restart the cluster.
>
> Please point us in a direction where we can avoid this or free it up.
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: 
> http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10713.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-17 Thread bintisepaha
Thanks Andrey,

The same thing happened again today. Clearly the key is locked: we get the
timeout exceptions. But the prior update to the same key did not throw any
exceptions. Suddenly one update fails with timeout exceptions, and those
exceptions are how we learn that the key is locked.

We will upgrade to 1.8, but in the meantime, is there a way to free up this
locked key using some code?

We tried killing nodes, but we have one backup, and it looks like the lock
is carried over too, which would be the right thing to do.

Outside the transaction, we can read this key (a dirty read). This is
becoming an issue for us, since it's a production system and the only way
to free the key up is to restart the cluster.

Please point us in a direction where we can avoid this or free it up.

Thanks,
Binti



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10713.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-17 Thread Andrey Mashenkov
Hi,

I see nothing suspicious in the thread dump. Possibly there is a deadlock,
as there are no known issues with pessimistic TX on a stable topology for
Ignite 1.7. Deadlock detection was implemented in Ignite 1.8; you can try
upgrading to 1.8.
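
If you upgrade, one option is to start pessimistic transactions with an
explicit timeout so that deadlock detection can kick in when the timeout
fires. A minimal sketch with placeholder values (ignite, cache, key, value):

import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;

// Sketch: 10-second timeout, estimated size of 1 entry (both placeholders).
try (Transaction tx = ignite.transactions().txStart(
        TransactionConcurrency.PESSIMISTIC,
        TransactionIsolation.REPEATABLE_READ,
        10_000L,  // timeout in milliseconds
        1)) {     // expected number of entries in the tx
    cache.put(key, value);
    tx.commit();
}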


On Thu, Feb 16, 2017 at 9:32 PM, bintisepaha <binti.sep...@tudor.com> wrote:

> Hello, any help on this would be very much appreciated.
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10683.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Best regards,
Andrey V. Mashenkov


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-16 Thread bintisepaha
Hello, any help on this would be very much appreciated.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10683.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-13 Thread bintisepaha
Hi Andrey,

We are using Ignite 1.7.0.
I am attaching the thread dumps. As far as we can tell, the grid was working
fine and no threads were hanging. Only subsequent updates to this key were
hung.

The topology was stable. The client sends an object to be saved to the
server, and the transaction is actually only on the server side.
prodThreadDump20170208.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10601/prodThreadDump20170208.zip>

Please let us know if you see anything useful.

Thanks,
Binti




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10601.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

2017-02-13 Thread Andrey Mashenkov
Hi Binti,

It looks like something happened while the transaction was rolling back and
some locks were not released properly.
What Ignite version do you use?

Did it happen on a stable topology or not?
Did you see hanging threads?
Are there any clients on the topology that were involved in the transaction?

Please provide more information, logs and thread-dumps if possible.

On Fri, Feb 10, 2017 at 1:36 AM, bintisepaha <binti.sep...@tudor.com> wrote:

> Guys,
>
> We use pessimistic txns in Ignite, and they have been working great for
> months now. However, yesterday we ran into an issue where, on multiple
> updates to the same key (although minutes apart), many updates went
> through, but at some point we started getting TransactionTimeoutException
> because the key was locked. After that, any update to the same key failed
> with the same error.
>
> We checked the logs for the last successful update, and everything looks
> good. Something happened such that, although the txn was completed, the
> key was still not released. We had to restart the cluster after business
> hours to resolve the issue.
>
> 1) Is there a way to avoid this?
> 2) Can we monitor objects via Visor or JConsole to see if some keys are
> locked for a prolonged period of time?
> 3) Is there a way to free up the lock remotely or by running some salvaging
> code?
>
> Please advise; your help is much appreciated.
>
> Thanks,
> Binti
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>