Hi Richard,

Thanks for the answer.

>> In any case, each process uses a *single* SQLite session serialized between
>> all threads with a mutex.
>
> Are you really, really, positively sure that this mutex is working?

Yes, I'm sure.

> If you only run a single process at a time, does the problem go
> away?  (If it does, that indicates that the problem is in the
> locking code - an area where Fedora has given us no end of problems
> in the past - not in the BTree layer.)

There's no problem with a single process, even a multi-threaded one. I keep a 'DB locked' counter, and with a single multi-threaded process it always stays at zero, which is another reason I'm sure the mutex works. The problem occurs almost immediately when the server runs in multi-process, single-threaded mode, I think because DB contention is highest in this mode. It is much less frequent, but still occurs, in multi-process, multi-threaded mode, where contention is much lower (because of the per-process mutex). I believe this is a very subtle locking problem during auto-vacuum. The machine I'm running on is very powerful, and that's why the problem appears on it more often than on other, weaker servers.
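
For readers who want to picture the setup described above, here is a minimal sketch in Python (not the server's actual C code) of one SQLite session shared by all threads of a process behind a mutex, with a 'DB locked' counter. The class name `SerializedSession` and the attribute `locked_count` are hypothetical; the 2-second timeout is an assumption taken from the busyTimeout = 2000 value in the dump further down.

```python
import sqlite3
import threading


class SerializedSession:
    """One SQLite connection per process, shared by all threads (hypothetical helper)."""

    def __init__(self, path):
        # check_same_thread=False lets several threads use the connection;
        # the mutex below is what actually serializes access to it.
        self._conn = sqlite3.connect(path, timeout=2.0, check_same_thread=False)
        self._mutex = threading.Lock()
        self.locked_count = 0  # the 'DB locked' counter

    def execute(self, sql, params=()):
        with self._mutex:
            try:
                with self._conn:  # commit on success, roll back on error
                    return self._conn.execute(sql, params).fetchall()
            except sqlite3.OperationalError as exc:
                # "database is locked" / "database table is locked"
                if "locked" in str(exc):
                    self.locked_count += 1
                raise
```

With only one process and one connection, no other session can hold a conflicting lock, so the counter would be expected to stay at zero, which is consistent with the observation above. It says nothing about what happens between processes.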

> Can you record the sequence of SQL statements that are being
> executed and send them to me?


1. INSERT INTO q_data VALUES (:id,:deq_time,:flag,0,:enq_time,:delay,:msg_flags, 0, :cor_id, null, :data)
2. UPDATE q_data SET id = ?, FLAG = ? WHERE
  ROWID = (SELECT ROWID FROM q_data WHERE
     id ISNULL AND deq_time <= ? ORDER BY deq_time LIMIT 1)
3. SELECT id,deq_time,msg_flags,cor_id,msg_data FROM q_data WHERE id = ?
4. DELETE FROM q_data WHERE id = ?
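
The four statements form an enqueue / claim / read / delete cycle. The following is a rough Python sketch of that cycle, under stated assumptions: the real schema is not shown in this message, so only `id`, `deq_time`, `flag`, `msg_flags`, `cor_id` and `msg_data` are taken from the statements, while the remaining column names, all column types, the bound values, and the helper names (`open_queue`, `enqueue`, `dequeue`, `delete_entry`) are hypothetical. It is not expected to reproduce the corruption by itself; it only shows how the statements fit together with auto-vacuum enabled.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema: unknown_a, unknown_b, unknown_c, enq_time, delay and
# every column type are guesses; the INSERT only tells us there are 11 columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS q_data (
    id TEXT, deq_time INTEGER, flag INTEGER, unknown_a INTEGER,
    enq_time INTEGER, delay INTEGER, msg_flags INTEGER, unknown_b INTEGER,
    cor_id TEXT, unknown_c, msg_data BLOB
)
"""


def open_queue(path):
    conn = sqlite3.connect(path, timeout=2.0)
    # auto_vacuum has to be set before the first table is created.
    conn.execute("PRAGMA auto_vacuum = FULL")
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def enqueue(conn, deq_time, data, cor_id=None):
    # Statement 1: the row starts with id NULL so that it can be claimed later.
    with conn:
        conn.execute(
            "INSERT INTO q_data VALUES (:id,:deq_time,:flag,0,:enq_time,:delay,"
            ":msg_flags, 0, :cor_id, null, :data)",
            {"id": None, "deq_time": deq_time, "flag": 0, "enq_time": deq_time,
             "delay": 0, "msg_flags": 0, "cor_id": cor_id, "data": data},
        )


def dequeue(conn, msg_id, now):
    # Statements 2 and 3: claim the oldest due row, then read it back by id.
    with conn:
        cur = conn.execute(
            "UPDATE q_data SET id = ?, FLAG = ? WHERE ROWID = "
            "(SELECT ROWID FROM q_data WHERE id ISNULL AND deq_time <= ? "
            "ORDER BY deq_time LIMIT 1)",
            (msg_id, 1, now),
        )
        if cur.rowcount == 0:
            return None  # nothing due yet
        return conn.execute(
            "SELECT id,deq_time,msg_flags,cor_id,msg_data FROM q_data WHERE id = ?",
            (msg_id,),
        ).fetchone()


def delete_entry(conn, msg_id):
    # Statement 4: the commit of this DELETE is where auto-vacuum may relocate pages.
    with conn:
        return conn.execute("DELETE FROM q_data WHERE id = ?", (msg_id,)).rowcount
```

To approximate the failing scenario, one would presumably run several processes that each loop over `enqueue`, `dequeue` and `delete_entry` against the same database file; that part is left out here.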

> Please tell me what the RCS version number on your btree.c file is.
> For 3.3.8, it should be 1.328, but my line numbers do not match yours.
>
> Please recompile with all optimization turned off and see if the
> problem goes away.


The RCS version is correct; the line numbers didn't match because of optimizations.
I recompiled without optimizations and the problem remains. This time the line numbers match, and autoVacuumCommit() is in the stack trace.
Below is some information from a corruption caught in the non-optimized SQLite build:

(gdb) bt
#0  0x0014637f in sqlite3Corrupt () at ../sqlite-3.3.8/src/main.c:1153
#1 0x0012029d in ptrmapPut (pBt=0x8b7c0d0, key=0, eType=5 '\005', parent=45)
   at ../sqlite-3.3.8/src/btree.c:826
#2  0x001234f8 in setChildPtrmaps (pPage=0x8b95600) at ../sqlite-3.3.8/src/btree.c:2193
#3  0x00123774 in relocatePage (pBt=0x8b7c0d0, pDbPage=0x8b95600, eType=5 '\005',
   iPtrPage=3, iFreePage=45) at ../sqlite-3.3.8/src/btree.c:2301
#4  0x00123c50 in autoVacuumCommit (pBt=0x8b7c0d0, nTrunc=0xbf96df48)
   at ../sqlite-3.3.8/src/btree.c:2452
#5  0x0012db21 in sqlite3BtreeSync (p=0x8b74078, zMaster=0x0)
   at ../sqlite-3.3.8/src/btree.c:6576
#6  0x00176ca8 in vdbeCommit (db=0x8b73e28) at ../sqlite-3.3.8/src/vdbeaux.c:1037
#7  0x001773ad in sqlite3VdbeHalt (p=0x8b89298) at ../sqlite-3.3.8/src/vdbeaux.c:1353
#8  0x0016821f in sqlite3VdbeExec (p=0x8b89298) at ../sqlite-3.3.8/src/vdbe.c:641
#9  0x00173791 in sqlite3_step (pStmt=0x8b89298) at ../sqlite-3.3.8/src/vdbeapi.c:231
#10 0x009f05bf in sql_step (db_sess=0x8b32328, stmt=0x8b89298) at q_utils.c:371
#11 0x009f2b5f in Q_Delete_entry (db_sess=0x8b32328,
msgID=0x8b8fc80 "6117-3086022336-77769482-533", pChanges=0xbf96e798) at q_utils.c:1364
...
(gdb) print *pBt
$1 = {pPager = 0x8b80218, pCursor = 0x0, pPage1 = 0x8b923d0, inStmt = 0 '\0', readOnly = 0 '\0', maxEmbedFrac = 64 '@', minEmbedFrac = 32 ' ', minLeafFrac = 32 ' ', pageSizeFixed = 1 '\001', autoVacuum = 1 '\001', pageSize = 1024, usableSize = 1024, maxLocal = 230, minLocal = 103, maxLeaf = 989, minLeaf = 103, pBusyHandler = 0x8b73f3c,
 inTransaction = 2 '\002', nRef = 1, nTransaction = 1, pSchema = 0x8b7fe68,
 xFreeSchema = 0x1352f7 <sqlite3SchemaFree>, pLock = 0x0, pNext = 0x0}
(gdb) print *pBt->pPager
$2 = {journalOpen = 1 '\001', journalStarted = 0 '\0', useJournal = 1 '\001', noReadlock = 0 '\0', stmtOpen = 0 '\0', stmtInUse = 0 '\0', stmtAutoopen = 0 '\0', noSync = 1 '\001', fullSync = 0 '\0', full_fsync = 0 '\0', state = 2 '\002', tempFile = 0 '\0', readOnly = 0 '\0', needSync = 0 '\0', dirtyCache = 1 '\001', alwaysRollback = 0 '\0', memDb = 0 '\0', setMaster = 0 '\0', errCode = 0, dbSize = 50, origDbSize = 50, stmtSize = 0, nRec = 9, cksumInit = 2006923303, stmtNRec = 0, nExtra = 80, pageSize = 1024, nPage = 13, nMaxPage = 157, nRef = 2, mxPage = 2000,
 aInJournal = 0x8b80130 "\016\036", aInStmt = 0x0,
 zFilename = 0x8b802e0 "/usr/local/apache2/q-db/mq/zz.db",
 zJournal = 0x8b80322 "/usr/local/apache2/q-db/mq/zz.db-journal",
zDirectory = 0x8b80301 "/usr/local/apache2/q-db/mq", fd = 0x8b7cff0, jfd = 0x8b7fd90, stfd = 0x0, pBusyHandler = 0x8b73f3c, pFirst = 0x8b93f88, pLast = 0x8b94d38, pFirstSynced = 0x8b93f88, pAll = 0x8b951c8, pStmt = 0x0, pDirty = 0x8b951c8, journalOff = 9800, journalHdr = 0, stmtHdrOff = 0, stmtCksum = 0, stmtJSize = 0,
 sectorSize = 512, xDestructor = 0x121dc6 <pageDestructor>,
xReiniter = 0x121e4c <pageReinit>, xCodec = 0, pCodecArg = 0x0, nHash = 256,
 aHash = 0x8b7d4a0}
(gdb) print *pBt->pPage1
$3 = {isInit = 0 '\0', idxShift = 0 '\0', nOverflow = 0 '\0', intKey = 0 '\0',
 leaf = 0 '\0', zeroData = 0 '\0', leafData = 0 '\0', hasData = 0 '\0',
hdrOffset = 100 'd', childPtrSize = 0 '\0', maxLocal = 0, minLocal = 0, cellOffset = 0, idxParent = 0, nFree = 0, nCell = 0, aOvfl = {{pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}}, pBt = 0x8b7c0d0, aData = 0x8b91fd0 "SQLite format 3", pgno = 1, pParent = 0x0}
(gdb) up
#2 0x001234f8 in setChildPtrmaps (pPage=0x8b95600) at ../sqlite-3.3.8/src/btree.c:2193
2193        rc = ptrmapPut(pBt, childPgno, PTRMAP_BTREE, pgno);
(gdb) print *pPage
$4 = {isInit = 1 '\001', idxShift = 0 '\0', nOverflow = 0 '\0', intKey = 0 '\0',
 leaf = 0 '\0', zeroData = 0 '\0', leafData = 0 '\0', hasData = 1 '\001',
hdrOffset = 0 '\0', childPtrSize = 4 '\004', maxLocal = 230, minLocal = 103, cellOffset = 12, idxParent = 0, nFree = 65524, nCell = 0, aOvfl = {{pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, { pCell = 0x0, idx = 0}}, pBt = 0x8b7c0d0, aData = 0x8b95200 "", pgno = 45,
 pParent = 0x0}
(gdb) up
#3 0x00123774 in relocatePage (pBt=0x8b7c0d0, pDbPage=0x8b95600, eType=5 '\005',
   iPtrPage=3, iFreePage=45) at ../sqlite-3.3.8/src/btree.c:2301
2301        rc = setChildPtrmaps(pDbPage);
(gdb) print *pDbPage
$5 = {isInit = 1 '\001', idxShift = 0 '\0', nOverflow = 0 '\0', intKey = 0 '\0',
 leaf = 0 '\0', zeroData = 0 '\0', leafData = 0 '\0', hasData = 1 '\001',
hdrOffset = 0 '\0', childPtrSize = 4 '\004', maxLocal = 230, minLocal = 103, cellOffset = 12, idxParent = 0, nFree = 65524, nCell = 0, aOvfl = {{pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, {pCell = 0x0, idx = 0}, { pCell = 0x0, idx = 0}}, pBt = 0x8b7c0d0, aData = 0x8b95200 "", pgno = 45,
 pParent = 0x0}
(gdb) up
#4  0x00123c50 in autoVacuumCommit (pBt=0x8b7c0d0, nTrunc=0xbf96df48)
   at ../sqlite-3.3.8/src/btree.c:2452
2452        rc = relocatePage(pBt, pDbMemPage, eType, iPtrPage, iFreePage);
(gdb) print *nTrunc
$6 = 0
(gdb) up
#5  0x0012db21 in sqlite3BtreeSync (p=0x8b74078, zMaster=0x0)
   at ../sqlite-3.3.8/src/btree.c:6576
6576          rc = autoVacuumCommit(pBt, &nTrunc);
(gdb) print *p
$7 = {pSqlite = 0x8b73e28, pBt = 0x8b7c0d0, inTrans = 2 '\002'}
(gdb) print *p->pSqlite
$10 = {nDb = 2, aDb = 0x8b73f4c, flags = 32832, errCode = 0, errMask = 255,
autoCommit = 1 '\001', temp_store = 0 '\0', nTable = 4, pDfltColl = 0x8b73f80,
 lastRowid = 9415, priorNewRowid = 0, magic = -264537850, nChange = 1,
nTotalChange = 1642, init = {iDb = 0, newTnum = 1, busy = 0 '\0'}, nExtension = 0, aExtension = 0x0, pVdbe = 0x8b89298, activeVdbeCnt = 1, xTrace = 0, pTraceArg = 0x0,
 xProfile = 0, pProfileArg = 0x0, pCommitArg = 0x0, xCommitCallback = 0,
pRollbackArg = 0x0, xRollbackCallback = 0, pUpdateArg = 0x0, xUpdateCallback = 0, xCollNeeded = 0, xCollNeeded16 = 0, pCollNeededArg = 0x0, pErr = 0x8b6fa70, zErrMsg = 0x0, zErrMsg16 = 0x0, u1 = {isInterrupted = 0, notUsed1 = 0}, xAuth = 0, pAuthArg = 0x0, xProgress = 0, pProgressArg = 0x0, nProgressOps = 0, aModule = {
   keyClass = 3 '\003', copyKey = 0 '\0', count = 0, first = 0x0,
xMalloc = 0x1656bb <sqlite3MallocX>, xFree = 0x165692 <sqlite3FreeX>, htsize = 0, ht = 0x0}, pVTab = 0x0, aVTrans = 0x0, nVTrans = 0, aFunc = {keyClass = 3 '\003', copyKey = 0 '\0', count = 38, first = 0x8b7cd60, xMalloc = 0x1656bb <sqlite3MallocX>, xFree = 0x165692 <sqlite3FreeX>, htsize = 64, ht = 0x8b7d248}, aCollSeq = {
   keyClass = 3 '\003', copyKey = 0 '\0', count = 2, first = 0x8b7d118,
xMalloc = 0x1656bb <sqlite3MallocX>, xFree = 0x165692 <sqlite3FreeX>, htsize = 8, ht = 0x8b73fd8}, busyHandler = {xFunc = 0, pArg = 0x0, nBusy = 0}, busyTimeout = 2000,
 aDbStatic = {{zName = 0x189bb5 "main", pBt = 0x8b74078, inTrans = 0 '\0',
safety_level = 1 '\001', pAux = 0x0, xFreeAux = 0, pSchema = 0x8b7fe68}, { zName = 0x189bba "temp", pBt = 0x0, inTrans = 0 '\0', safety_level = 1 '\001',
     pAux = 0x0, xFreeAux = 0, pSchema = 0x8b80140}}}
(gdb)

Thanks,
Ron
