Hi,
To give you a background of what I was testing:
We have process that uses database as our guranteed message store. To give a
stress test to Derby, this program was run to store about 70k+ messages. Then
a subscriber was started to fetch the messages. But it seems that the database
has stopped responding to the requests.
Initially I was able to get message count using:
ij> select count(*) from M_20070129MESSAGES;
1
-----------
54598
After ~15 minutes.
1 row selected
ij> select count(*) from M_20070129MESSAGES;
ERROR 40XL1: A lock could not be obtained within the time requested
ij> select count(*) from M_20070129MESSAGES;
ERROR 40XL1: A lock could not be obtained within the time requested
ij> select count(*) from M_20070129MESSAGES;
ERROR 40XL1: A lock could not be obtained within the time requested
ij> select * from M_20070129MESSAGES where MESSAGEID='000000000000000000001746'
;
MESSAGEID |MES&
-----------------------------
ERROR 40XL1: A lock could not be obtained within the time requested
When I did prstat on the running derby process it gave the following result:
-------------------------------------------
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
27823 schitale 76 1.0 - - - - 21 - 73 127 767 0 java/28
27823 schitale 0.5 0.0 - - - - 62 - 2 38 40 0 java/2
27823 schitale 0.5 0.0 - - - - 62 - 2 36 38 0 java/4
27823 schitale 0.5 0.0 - - - - 62 - 2 37 39 0 java/3
27823 schitale 0.5 0.0 - - - - 62 - 2 36 38 0 java/5
27823 schitale 0.2 0.0 - - - - 62 - 0 0 0 0 java/11
27823 schitale 0.2 0.0 - - - - 62 - 0 0 0 0 java/12
27823 schitale 0.1 0.0 - - - - 62 - 1 0 5 0 java/6
27823 schitale 0.1 0.0 - - - - 62 - 1 0 8 0 java/27
27823 schitale 0.0 0.0 - - - - 63 - 4 0 12 0 java/8
27823 schitale 0.0 0.0 - - - - 63 - 5 0 12 0 java/7
27823 schitale 0.0 0.0 - - - - 63 - 0 0 0 0 java/1
27823 schitale 0.0 0.0 - - - - 63 - 0 0 0 0 java/10
27823 schitale 0.0 0.0 - - - - 63 - 0 0 0 0 java/15
27823 schitale 0.0 0.0 - - - - 100 - 0 0 0 0 java/14
NLWP USERNAME SIZE RSS MEMORY TIME CPU
20 schitale 6610M 5433M 34% 1:59.03 23%
Total: 1 processes, 20 lwps, load averages: 1.12, 1.12, 1.21
Thread no 28 was taking us most of the CPU time, and 23% was the total CPU
cycle share was taken up by derby process.
To find out which java thread was taking time I did a jstack of the process
which revealed the following:
jstack 27823
--------------------------------------------------------------------------
Thread [EMAIL PROTECTED]: (state = IN_JAVA)
- java.util.Hashtable.get(java.lang.Object) @bci=45, line=339 (Compiled
frame; information may be imprecise)
- org.apache.derby.impl.store.raw.data.FileContainer.getUserPage
(org.apache.derby.impl.store.raw.data.BaseContainerHandle, long, boolean,
boolean) @bci=48 (
Compiled frame)
- org.apache.derby.impl.store.access.btree.BTreeScan.reposition
(org.apache.derby.impl.store.access.btree.BTreeRowPosition, boolean) @bci=114
(Compiled frame
)
- org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows
(org.apache.derby.impl.store.access.btree.BTreeRowPosition,
org.apache.derby.iapi.types
.DataValueDescriptor[][], org.apache.derby.iapi.types.RowLocation[],
org.apache.derby.iapi.store.access.BackingStoreHashtable, long, int[]) @bci=31
(Compiled
frame)
- org.apache.derby.impl.store.access.btree.BTreeScan.fetchNext
(org.apache.derby.iapi.types.DataValueDescriptor[]) @bci=29 (Compiled frame)
- org.apache.derby.iapi.store.access.DiskHashtable.get(java.lang.Object)
@bci=4 (Interpreted frame)
- org.apache.derby.iapi.store.access.BackingStoreHashtable.get
(java.lang.Object) @bci=27 (Interpreted frame)
- org.apache.derby.impl.sql.execute.HashScanResultSet.getNextRowCore()
@bci=64 (Interpreted frame)
-
org.apache.derby.impl.sql.execute.NestedLoopLeftOuterJoinResultSet.getNextRowCo
re() @bci=98 (Compiled frame)
- org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore()
@bci=30 (Compiled frame)
- org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore()
@bci=30 (Compiled frame)
- org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore()
@bci=30 (Compiled frame)
- org.apache.derby.impl.sql.execute.AnyResultSet.getNextRowCore() @bci=25
(Interpreted frame)
- org.apache.derby.exe.ac2c864112x0112x0e5dx37f1x00000ec63ae01.g0() @bci=330
(Interpreted frame)
- org.apache.derby.exe.ac2c864112x0112x0e5dx37f1x00000ec63ae01.e1() @bci=1
(Interpreted frame)
- org.apache.derby.impl.services.reflect.DirectCall.invoke(java.lang.Object)
@bci=72 (Compiled frame)
- org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore()
@bci=71 (Compiled frame)
- org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore
(org.apache.derby.iapi.sql.execute.NoPutResultSet) @bci=1 (Compiled frame)
- org.apache.derby.impl.sql.execute.DeleteResultSet.open() @bci=5
(Interpreted frame)
- org.apache.derby.impl.sql.GenericPreparedStatement.execute
(org.apache.derby.iapi.sql.Activation, boolean, long) @bci=341 (Compiled frame)
- org.apache.derby.impl.jdbc.EmbedStatement.executeStatement
(org.apache.derby.iapi.sql.Activation, boolean, boolean) @bci=183 (Compiled
frame)
- org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement
(org.apache.derby.iapi.sql.Activation, boolean, boolean) @bci=16 (Compiled
frame)
- org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute() @bci=9
(Compiled frame)
- org.apache.derby.impl.drda.DRDAStatement.execute() @bci=4 (Interpreted
frame)
- org.apache.derby.impl.drda.DRDAConnThread.parseEXCSQLSTT() @bci=668
(Compiled frame)
- org.apache.derby.impl.drda.DRDAConnThread.processCommands() @bci=1153
(Compiled frame)
- org.apache.derby.impl.drda.DRDAConnThread.run() @bci=111 (Compiled frame)
-----------------------------------------------------------------------
Then I ran a script which recorded these two outputs at ~30 sec interval. Most
of the time the Derby process was executing the following method:
org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(...)
The derby process is now running for more than 5hrs but the command:
kill -3 <derby-pid>
doest show any deadlock.
I would like to send the log of the jstat and prstat output (~164kb), to Derby
developers if interested.
Regards,
Sachin