Re: deadlock when running SYSCS_UTIL.SYSCS_COMPRESS_TABLE

2022-09-30 Thread Nathan Jensen
Ok, below is the log output for the first two compresses.  Running select
right before the first compress showed no locks so I included the output
for the second compress as that was more interesting, though the
transaction ids don't match.  I removed a bunch of normal qpid data flow
logging stuff to shorten the output for clarity, and I formatted the locks
printout to hopefully be more readable.

-

2022-09-30T20:48:32,186Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - connection.getAutoCommit() is false
2022-09-30T20:48:32,186Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - Running select from lock table
2022-09-30T20:48:32,259Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - Locks are:

2022-09-30T20:48:32,260Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - Compressing qpid_queue_entries
2022-09-30T20:48:52,554Z ERROR [IO-/147.18.139.60:45964]
(o.a.q.s.d.DerbyRepackingMessageStore) - Failed to enqueue messages
java.sql.SQLTransactionRollbackException: A lock could not be obtained due
to a deadlock, cycle of locks and waiters is:
Lock : TABLE, QPID_QUEUE_ENTRIES, Tablelock
  Waiting XID : {627716261, IX} , APP, INSERT INTO QPID_QUEUE_ENTRIES
(queue_id, message_id) values (?,?)
  Granted XID : {627715588, X}
Lock : ROW, SYSCONGLOMERATES, (7,6)
  Waiting XID : {627715588, X} , APP, alter table
"APP"."QPID_QUEUE_ENTRIES" compress
  Granted XID : {627715588, S} , {627716262, S} , {627716261, S}
. The selected victim is XID : 627716261.
at
org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:100)
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:230)
at
org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:431)
at
org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:360)
at
org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2405)
at
org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:88)
at
org.apache.derby.impl.jdbc.EmbedPreparedStatement.(EmbedPreparedStatement.java:152)
at
org.apache.derby.impl.jdbc.EmbedPreparedStatement42.(EmbedPreparedStatement42.java:41)
at
org.apache.derby.iapi.jdbc.Driver42.newEmbedPreparedStatement(Driver42.java:56)
at
org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(EmbedConnection.java:1740)
at
org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(EmbedConnection.java:1568)
at
org.apache.qpid.server.store.jdbc.AbstractJDBCMessageStore.enqueueMessages(AbstractJDBCMessageStore.java:684)
at
org.apache.qpid.server.store.jdbc.AbstractJDBCMessageStore.access$700(AbstractJDBCMessageStore.java:81)
at
org.apache.qpid.server.store.jdbc.AbstractJDBCMessageStore$JDBCTransaction.lambda$new$0(AbstractJDBCMessageStore.java:1226)
at
org.apache.qpid.server.store.jdbc.AbstractJDBCMessageStore$JDBCTransaction.doPreCommitActions(AbstractJDBCMessageStore.java:1289)
at
org.apache.qpid.server.store.jdbc.AbstractJDBCMessageStore$JDBCTransaction.commitTranAsync(AbstractJDBCMessageStore.java:1278)
at
org.apache.qpid.server.txn.AsyncAutoCommitTransaction.enqueue(AsyncAutoCommitTransaction.java:294)
at org.apache.qpid.server.message.RoutingResult.send(RoutingResult.java:124)
at
org.apache.qpid.server.protocol.v1_0.NodeReceivingDestination.send(NodeReceivingDestination.java:117)

2022-09-30T20:49:12,632Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - Running select from lock table
2022-09-30T20:49:12,644Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - Locks are:
 627716294 TABLE IX QPID_MESSAGE_CONTENT  Tablelock
GRANT T 2 null
 627716294 ROWX QPID_MESSAGE_METADATA (118,8)
GRANT T 1 null
 627716294 ROWX QPID_QUEUE_ENTRIES(96,32)
GRANT T 1 null
 627716294 TABLE IX QPID_QUEUE_ENTRIESTablelock
GRANT T 2 null
 627716285 TABLE IX QPID_MESSAGE_METADATA Tablelock
GRANT T 2 null
 627716294 TABLE IX QPID_MESSAGE_METADATA Tablelock
GRANT T 2 null
 627716285 ROWX QPID_MESSAGE_METADATA (150,13)
 GRANT T 2 null
 627716294 ROWX QPID_MESSAGE_CONTENT  (81,85)
GRANT T 1 null
2022-09-30T20:49:12,644Z INFO  [RepackTimer]
(o.a.q.s.d.DerbyRepackingMessageStore) - Compressing qpid_message_metadata

2022-09-30T20:49:32,682Z ERROR [IO-/147.18.139.173:44710]
(o.a.q.s.u.ServerScopedRuntimeException) - Exception on enqueuing message
into message store25904045
java.sql.SQLTransactionRollbackException: A lock could not be obtained due
to a deadlock, cycle of locks and waiters is:
Lock : TABLE, QPID_MESSAGE_METADATA, Tablelock
  Waiting XID : {627716335, IX} , APP, INSERT INTO QPID_MESSAGE_METADATA(
message_id , meta_data ) values (?, 

Re: deadlock when running SYSCS_UTIL.SYSCS_COMPRESS_TABLE

2022-09-30 Thread Rick Hillegas
I don't have any theories yet about what is going on. Can you try 
issuing the following statement just before you compress the table:


SELECT * FROM SYSCS_DIAG.LOCK_TABLE;

I don't understand why the INSERT transaction grabs a lock on a 
SYSCONGLOMERATES row.


Thanks,
-Rick


On 9/29/22 1:24 PM, Nathan Jensen wrote:

Thanks for the response!  To clarify a few things, we're using Derby
10.15.2.0 and Java 11.  Our project is sticking with Java 11 for the time
being so I can't upgrade to Derby 10.16 at this time.  The version of Qpid
we're using at present is 7.1.12.

I don't see anything suspicious regarding the INSERT statement, but maybe
you could see something that I don't.  I believe the relevant code is here:
https://github.com/apache/qpid-broker-j/blob/7.1.12/broker-plugins/jdbc-store/src/main/java/org/apache/qpid/server/store/jdbc/AbstractJDBCMessageStore.java#L675
That does not commit the transaction, the transaction is committed
elsewhere in the same file, here:
https://github.com/apache/qpid-broker-j/blob/7.1.12/broker-plugins/jdbc-store/src/main/java/org/apache/qpid/server/store/jdbc/AbstractJDBCMessageStore.java#L1264
I should also note that one time I saw the deadlock happen with a DELETE
statement instead of an INSERT, so I don't think the unexpected row locking
in SYSCONGLOMERATES is limited to INSERT.

Here is the DDL as produced by Derby's dblook:
-- --
-- DDL Statements for tables
-- --

CREATE TABLE "APP"."QPID_QUEUE_ENTRIES" ("QUEUE_ID" VARCHAR(36) NOT NULL,
"MESSAGE_ID" BIGINT NOT NULL);

CREATE TABLE "APP"."QPID_MESSAGE_CONTENT" ("MESSAGE_ID" BIGINT NOT NULL,
"CONTENT" BLOB(2147483647));

CREATE TABLE "APP"."QPID_XID_ACTIONS" ("FORMAT" BIGINT NOT NULL,
"GLOBAL_ID" VARCHAR (64) FOR BIT DATA NOT NULL, "BRANCH_ID" VARCHAR (64)
FOR BIT DATA NOT NULL, "ACTION_TYPE" CHAR(1) NOT NULL, "QUEUE_ID"
VARCHAR(36) NOT NULL, "MESSAGE_ID" BIGINT NOT NULL);

CREATE TABLE "APP"."AMQP_1_0_LINKS" ("LINK_KEY" VARCHAR(44) NOT NULL,
"REMOTE_CONTAINER_ID" BLOB(2147483647), "LINK_NAME" BLOB(2147483647),
"LINK_ROLE" INTEGER, "SOURCE" BLOB(2147483647), "TARGET" BLOB(2147483647));

CREATE TABLE "APP"."QPID_MESSAGE_METADATA" ("MESSAGE_ID" BIGINT NOT NULL,
"META_DATA" BLOB(2147483647));

CREATE TABLE "APP"."QPID_DB_VERSION" ("VERSION" INTEGER NOT NULL);

CREATE TABLE "APP"."QPID_XIDS" ("FORMAT" BIGINT NOT NULL, "GLOBAL_ID"
VARCHAR (64) FOR BIT DATA NOT NULL, "BRANCH_ID" VARCHAR (64) FOR BIT DATA
NOT NULL);

CREATE TABLE "APP"."AMQP_1_0_LINKS_VERSION" ("VERSION" VARCHAR(10) NOT
NULL, "VERSION_TIME" TIMESTAMP);

-- --
-- DDL Statements for keys
-- --

-- PRIMARY/UNIQUE
ALTER TABLE "APP"."AMQP_1_0_LINKS" ADD CONSTRAINT
"SQL86-b2280141-0183-5b33-aec6-16c47528" PRIMARY KEY
("LINK_KEY");

ALTER TABLE "APP"."QPID_MESSAGE_METADATA" ADD CONSTRAINT
"SQL82-e333810d-0183-5b33-aec6-16c47528" PRIMARY KEY
("MESSAGE_ID");

ALTER TABLE "APP"."QPID_QUEUE_ENTRIES" ADD CONSTRAINT
"SQL81-61228105-0183-5b33-aec6-16c47528" PRIMARY KEY
("QUEUE_ID", "MESSAGE_ID");

ALTER TABLE "APP"."QPID_XID_ACTIONS" ADD CONSTRAINT
"SQL85-69c68125-0183-5b33-aec6-16c47528" PRIMARY KEY ("FORMAT",
"GLOBAL_ID", "BRANCH_ID", "ACTION_TYPE", "QUEUE_ID", "MESSAGE_ID");

ALTER TABLE "APP"."QPID_XIDS" ADD CONSTRAINT
"SQL84-e785811d-0183-5b33-aec6-16c47528" PRIMARY KEY ("FORMAT",
"GLOBAL_ID", "BRANCH_ID");

ALTER TABLE "APP"."QPID_MESSAGE_CONTENT" ADD CONSTRAINT
"SQL83-65548115-0183-5b33-aec6-16c47528" PRIMARY KEY
("MESSAGE_ID");

ALTER TABLE "APP"."AMQP_1_0_LINKS_VERSION" ADD CONSTRAINT
"SQL87-c14b-0183-5b33-aec6-16c47528" PRIMARY KEY
("VERSION");


And just in case I'm doing something stupid, here is the current version of
my code.  conn.getAutoCommit() is false, but I also tried it with an
autocommit connection and without the commit statements that time.  The
code is inside a try-catch-finally that is inside a TimerTask.

conn = newConnection();
LOGGER.info("connection.getAutoCommit() is " + conn.getAutoCommit());
LOGGER.info("Compressing qpid_queue_entries");
conn.prepareStatement("call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP',
'QPID_QUEUE_ENTRIES', 0)").executeUpdate();
conn.commit();
LOGGER.info("Compressing qpid_message_metadata");
conn.prepareStatement("call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP',
'QPID_MESSAGE_METADATA', 0)").executeUpdate();
conn.commit();
LOGGER.info("Compressing qpid_message_content");
conn.prepareStatement("call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP',
'QPID_MESSAGE_CONTENT', 0)").executeUpdate();
conn.commit();

Thanks for any help you can provide!

On Thu, Sep 29, 2022 at 1:28 PM Rick Hillegas 
wrote:


I'm not an expert on the table compression code, but it does need an
overhaul (see https://issues.apache.org/jira/browse/DERBY-3683).

If I'm interpreting the error correctly, th