Our server is getting into a bad state, after which every transaction
fails with a java.lang.Error (Maximum lock count exceeded). We're using a
remote connection.
I wrapped the lock.readLock().lock() call in the class OModificationLock
in a try block that catches java.lang.Error (printing a stack trace on
failure), and added an instance counter.
I found that there is a path on which calls to requestModificationLock are
not followed by calls to releaseModificationLock.
The Maximum lock count exceeded error occurs when the counter reaches
roughly 520,000.
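For reference, here is a minimal sketch of how this error arises, assuming OModificationLock wraps a java.util.concurrent.locks.ReentrantReadWriteLock (the lock.readLock().lock() call and the error message suggest this, but it is an assumption). The JDK documents a limit of 65535 read holds per lock and throws an Error once it is exceeded. The class and method names below are hypothetical and not part of OrientDB. How the per-lock limit relates to the ~520,000 counter value depends on how many lock instances the counter covers.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not OrientDB code.
class ReadLockLeakDemo {
    // Acquires the read lock repeatedly without ever releasing it and
    // returns the number of successful acquisitions before the JDK throws.
    static int acquireUntilError(ReentrantReadWriteLock lock) {
        int held = 0;
        try {
            while (true) {
                lock.readLock().lock(); // never paired with unlock(): a leak
                held++;
            }
        } catch (Error e) {
            // The JDK throws java.lang.Error("Maximum lock count exceeded")
            System.out.println("After " + held + " holds: " + e.getMessage());
        }
        return held;
    }

    public static void main(String[] args) {
        acquireUntilError(new ReentrantReadWriteLock());
    }
}
```

Once a lock reaches this state, every later read-lock request on it fails the same way, which would match the "every transaction fails" symptom.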
When I get the error, the stack trace I printed is as follows:
java.lang.Error (Maximum lock count exceeded)
  com.orientechnologies.common.concur.lock.OModificationLock.requestMoficationLock(OModificationLock.java:51)
  com.orientechnologies.orient.core.index.OIndexAbstract.acquireModificationLock(OIndexAbstract.java:1084)
  com.orientechnologies.orient.core.index.OClassIndexManager.acquireModificationLock(OClassIndexManager.java:556)
  com.orientechnologies.orient.core.index.OClassIndexManager.checkIndexesAndAcquireLock(OClassIndexManager.java:522)
  com.orientechnologies.orient.core.index.OClassIndexManager.onRecordBeforeUpdate(OClassIndexManager.java:129)
  com.orientechnologies.orient.core.hook.ODocumentHookAbstract.onTrigger(ODocumentHookAbstract.java:263)
  com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.callbackHooks(ODatabaseRecordAbstract.java:1065)
  com.orientechnologies.orient.core.tx.OTransactionOptimistic.addRecord(OTransactionOptimistic.java:313)
  com.orientechnologies.orient.server.tx.OTransactionOptimisticProxy.begin(OTransactionOptimisticProxy.java:129)
  com.orientechnologies.orient.core.db.record.ODatabaseRecordTx.begin(ODatabaseRecordTx.java:94)
  com.orientechnologies.orient.core.db.record.ODatabaseRecordTx.begin(ODatabaseRecordTx.java:38)
  com.orientechnologies.orient.core.db.ODatabaseRecordWrapperAbstract.begin(ODatabaseRecordWrapperAbstract.java:126)
  com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.commit(ONetworkProtocolBinary.java:1047)
  com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:303)
  com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:125)
  com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:45)
Tracing the problem, we found that it only occurs for us when sending a
transaction, and that the number of held locks only increases for creates
in that transaction (not for updates).
(OTransactionOptimistic.java)
  protected void addRecord(final ORecordInternal<?> iRecord, final byte iStatus,
      final String iClusterName) {
    switch (iStatus) {
    case ORecordOperation.CREATED:
      database.checkSecurity(ODatabaseSecurityResources.CLUSTER,
          ORole.PERMISSION_CREATE, iClusterName);
      // !! This gets the Modification Lock !!
      database.callbackHooks(TYPE.BEFORE_CREATE, iRecord);
      break;
    ...
    switch (iStatus) {
    case ORecordOperation.CREATED:
      // !! This calls the hooks that apply after a create, which
      // should free the Modification Lock !!
      database.callbackHooks(TYPE.AFTER_CREATE, iRecord);
      break;
    ...
(ODatabaseRecordAbstract.java)
  public ORecordHook.RESULT callbackHooks(final TYPE iType,
      final OIdentifiable id) {
    // !! There are some returns at the beginning of this function that may
    // not be safe - won't free the lock !!
    ...
    // !! Call the trigger !!
    final RESULT res = hook.onTrigger(iType, rec);
    ...
(ODocumentHookAbstract.java)
  public RESULT onTrigger(final TYPE iType, final ORecord<?> iRecord) {
    if (ODatabaseRecordThreadLocal.INSTANCE.isDefined()
        && ODatabaseRecordThreadLocal.INSTANCE.get().getStatus() != STATUS.OPEN)
      // !! Probably not safe - won't free the lock !!
      return RESULT.RECORD_NOT_CHANGED;
    if (!(iRecord instanceof ODocument))
      // !! Probably not safe - won't free the lock !!
      return RESULT.RECORD_NOT_CHANGED;
    final ODocument document = (ODocument) iRecord;
    if (!filterBySchemaClass(document))
      // !! Not safe - won't free the lock !!
      // !! The creates are falling in here when iType = TYPE.AFTER_CREATE !!
      return RESULT.RECORD_NOT_CHANGED;
    ...
    switch (iType) {
    case AFTER_CREATE:
      // !! releases the lock - we don't get here !!
      onRecordAfterCreate(document);
      break;
    case CREATE_FAILED:
      // !! releases the lock - we don't get here !!
      onRecordCreateFailed(document);
      break;
    ...
It seems to me that filterBySchemaClass is not doing the right thing - the
variable includeClasses is set, but only with single items like OFunction
or OUser. I don't think this is how it was intended to be used.
The early return statements that skip the hook callbacks also seem like a
problem. It may be worth checking whether returning early is the right
strategy, given that the locks are then never released.
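To illustrate the concern, here is a minimal sketch of the leak pattern and one possible way to close it. It is not OrientDB code: the class, the accepts() filter (a stand-in for filterBySchemaClass), and the method names are all hypothetical, and it assumes the "before" callback takes the read lock unconditionally while the "after" callback can return early.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a hook; names are illustrative only.
class HookLeakSketch {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Stand-in for filterBySchemaClass: false means the hook ignores the record.
    boolean accepts(String className) {
        return "Indexed".equals(className);
    }

    // "Before" callback: takes the lock regardless of the filter (assumption).
    void beforeCreate(String className) {
        lock.readLock().lock();
    }

    // Leaky shape: the early return skips the release.
    void afterCreateLeaky(String className) {
        if (!accepts(className)) {
            return; // lock acquired in beforeCreate is never released
        }
        lock.readLock().unlock();
    }

    // One possible fix: release in finally so every exit path unlocks.
    void afterCreateSafe(String className) {
        try {
            if (!accepts(className)) {
                return;
            }
            // ... work that needs the lock would go here ...
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Another option would be to apply the same filter on both the acquire and the release side, so a record that is skipped on the way out was never locked on the way in. Which of the two fits the actual hook design is for the maintainers to judge.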
It looks like it is possible to get into a similar situation with Updates
or Deletes, although we have not yet experienced it.
We're really in a bind with this problem - we don't want to have to
downgrade back to the 1.1.0 server we have been using.
Thanks for your help. I hope I've given you enough information to put you
right on it. I've made issue 1977 for this.
Steve