[orientdb] Re: Server gets into useless state - Maximum lock count exceeded - version 1.6.4

StevenTomer Mon, 20 Jan 2014 14:36:50 -0800

Andrey,

The fix for issue 1977 does resolve this bug.


Is there any chance that we could get it in a hotfix?

Steve

On Friday, January 17, 2014 11:03:21 PM UTC-7, StevenTomer wrote:
>
> We are getting the server into a bad state, after which every transaction 
> fails with a java.lang.Error (Maximum lock count exceeded).  We're using a 
> remote connection.
>
> I wrapped a try block around the lock.readLock().lock() call in the class 
> OModificationLock, catching java.lang.Error (and printing a stack trace 
> on failure), and added an instance counter.
>
> I found that there is a path for which calls to requestModificationLock 
> are not being followed up by calls to releaseModificationLock.
>
> The "Maximum lock count exceeded" message appears when the counter 
> reaches roughly 520,000.
>
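For context on the limit itself: java.util.concurrent.locks.ReentrantReadWriteLock documents a maximum of 65535 read locks, and an acquire beyond that throws java.lang.Error. Below is a minimal, self-contained sketch (not OrientDB code; the class and method names are hypothetical) that leaks read locks in the same way and counts how many acquisitions succeed before the Error:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockLeakDemo {

    /**
     * Acquires the read lock over and over without ever releasing it,
     * mimicking a request that is never followed by a release.
     * Returns the number of acquisitions that succeeded before the
     * lock implementation threw java.lang.Error.
     */
    static long countAcquisitionsBeforeError() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        long acquired = 0;
        try {
            while (true) {
                lock.readLock().lock(); // never unlocked: this is the leak
                acquired++;
            }
        } catch (Error e) {
            // Catching Error is for diagnosis only, as in the instrumentation
            // described in the message; it is not a recovery strategy.
            System.out.println("Failed with: " + e.getMessage());
            return acquired;
        }
    }

    public static void main(String[] args) {
        System.out.println("Successful acquisitions: " + countAcquisitionsBeforeError());
    }
}
```

On a standard JDK this should fail once the documented limit is reached. The instance counter described above is separate instrumentation, so its value need not match the number of held read locks one-to-one.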
> When I get the error, the stack trace I printed is as follows:
> java.lang.Error (Maximum lock count exceeded)
> com.orientechnologies.common.concur.lock.OModificationLock.requestMoficationLock(OModificationLock.java:51)
> com.orientechnologies.orient.core.index.OIndexAbstract.acquireModificationLock(OIndexAbstract.java:1084)
> com.orientechnologies.orient.core.index.OClassIndexManager.acquireModificationLock(OClassIndexManager.java:556)
> com.orientechnologies.orient.core.index.OClassIndexManager.checkIndexesAndAcquireLock(OClassIndexManager.java:522)
> com.orientechnologies.orient.core.index.OClassIndexManager.onRecordBeforeUpdate(OClassIndexManager.java:129)
> com.orientechnologies.orient.core.hook.ODocumentHookAbstract.onTrigger(ODocumentHookAbstract.java:263)
> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.callbackHooks(ODatabaseRecordAbstract.java:1065)
> com.orientechnologies.orient.core.tx.OTransactionOptimistic.addRecord(OTransactionOptimistic.java:313)
> com.orientechnologies.orient.server.tx.OTransactionOptimisticProxy.begin(OTransactionOptimisticProxy.java:129)
> com.orientechnologies.orient.core.db.record.ODatabaseRecordTx.begin(ODatabaseRecordTx.java:94)
> com.orientechnologies.orient.core.db.record.ODatabaseRecordTx.begin(ODatabaseRecordTx.java:38)
> com.orientechnologies.orient.core.db.ODatabaseRecordWrapperAbstract.begin(ODatabaseRecordWrapperAbstract.java:126)
> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.commit(ONetworkProtocolBinary.java:1047)
> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:303)
> com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:125)
> com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:45)
>
>
>
> Tracing into the problem, I found that it only occurs for us when sending 
> a transaction, and that the number of locks only increases for creates in 
> that transaction (not for updates).
>
> (OTransactionOptimistic.java)
> protected void addRecord(final ORecordInternal<?> iRecord, final byte iStatus, final String iClusterName) {
>     switch (iStatus) {
>         case ORecordOperation.CREATED:
>             database.checkSecurity(ODatabaseSecurityResources.CLUSTER, ORole.PERMISSION_CREATE, iClusterName);
>             // !! This gets the Modification Lock !!
>             database.callbackHooks(TYPE.BEFORE_CREATE, iRecord);
>             break;
>
>
>     ...
>
>
>     switch (iStatus) {
>         case ORecordOperation.CREATED:
>             // !! This calls the hooks that apply after a create, which should free the Modification Lock !!
>             database.callbackHooks(TYPE.AFTER_CREATE, iRecord);
>             break;
>
>
>     ...
>
>
> (ODatabaseRecordAbstract.java)
> public ORecordHook.RESULT callbackHooks(final TYPE iType, final OIdentifiable id) {
>     // !! There are some returns at the beginning of this function that may not be safe - won't free the lock !!
>     ...
>     // !! Call the trigger !!
>     final RESULT res = hook.onTrigger(iType, rec);
>     ...
>
>
> (ODocumentHookAbstract.java)
> public RESULT onTrigger(final TYPE iType, final ORecord<?> iRecord) {
>     if (ODatabaseRecordThreadLocal.INSTANCE.isDefined()
>             && ODatabaseRecordThreadLocal.INSTANCE.get().getStatus() != STATUS.OPEN)
>         // !! Probably not safe - won't free the lock !!
>         return RESULT.RECORD_NOT_CHANGED;
>
>
>     if (!(iRecord instanceof ODocument))
>         // !! Probably not safe - won't free the lock !!
>         return RESULT.RECORD_NOT_CHANGED;
>
>
>     final ODocument document = (ODocument) iRecord;
>     if (!filterBySchemaClass(document))
>         // !! Not safe - won't free the lock !!
>         // !! The creates are falling in here when iType = TYPE.AFTER_CREATE !!
>         return RESULT.RECORD_NOT_CHANGED;
>     ...
>
>
>     switch (iType) {
>         case AFTER_CREATE:
>             // !! releases the lock - we don't get here !!
>             onRecordAfterCreate(document);
>             break;
>         case CREATE_FAILED:
>             // !! releases the lock - we don't get here !!
>             onRecordCreateFailed(document);
>             break;
>     ...
>
>
> It seems to me that filterBySchemaClass is not doing the right thing - the 
> variable includeClasses is set, but only with single items like OFunction 
> or OUser.  I don't think this is how it was intended to be used.
>
> The return statements that don't call the hook also seem like a problem. 
> Maybe it would be wise to look into whether the return statements are the 
> right strategy, considering the implications of not releasing the locks. 
> It looks like it is possible to get into a similar situation with Updates 
> or Deletes, although we have not yet experienced it.
>
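One common way to make the release unconditional is to pair the acquire and the release in a try/finally, so that an early return cannot skip the unlock. The sketch below is illustrative only: it is not the actual fix for issue 1977, the class and method names are hypothetical, and in OrientDB the acquire and release happen in separate before/after hooks, so the same idea would have to be applied at that level.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

public class BalancedLockSketch {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /**
     * Runs the given body while holding the read lock. The finally block
     * releases the lock on every exit path, including early returns and
     * exceptions thrown by the body.
     */
    <T> T withModificationLock(Supplier<T> body) {
        lock.readLock().lock();
        try {
            return body.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Number of read locks currently held on this lock, for diagnostics. */
    int heldReadLocks() {
        return lock.getReadLockCount();
    }

    public static void main(String[] args) {
        BalancedLockSketch sketch = new BalancedLockSketch();
        // Far more iterations than the read-lock limit; each one returns early.
        for (int i = 0; i < 100_000; i++) {
            sketch.withModificationLock(() -> "RECORD_NOT_CHANGED");
        }
        System.out.println("Read locks still held: " + sketch.heldReadLocks());
    }
}
```

Because every acquisition is matched by a release, the held count returns to zero after each call and the loop can run past the lock's limit without an Error.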
> We're really in a bind with this problem - we don't want to have to 
> downgrade back to the 1.1.0 server we have been using.
>
> Thanks for your help.  I hope I've given you enough information to put you 
> right on it.  I've made issue 1977 for this.
>
> Steve
>
>
>
