I use "insert or replace" heavily. In debug mode, I set it to use temp file on disk so, I can watch the disk IO, in release mode I set it to temp file in memory.
The danger is that if you ever do anything that requires a bunch of temp file, you can easily run out of RAM. Adding and deleting indexes for example while temp store is set to memory will run you out of address space on a larger DB and a 32 bit system (windows). In my testing, "insert or replace" uses temp store most of the time. Wednesday, January 23, 2013, 9:16:42 AM, you wrote: DK> On 01/23/2013 04:20 AM, Heiles, Katrina wrote: >> I am in the process of updating from 3.6.4 to the latest and greatest >> version (finally :-) ). >> While running performance unit tests, i found a major regression (10K/sec vs >> 30k/sec) >> on a common use case for us. The regression occurs when using "insert or >> replace". >> >> I narrowed this down as follows: >> - the regression was introduced in version 3.6.19, and has been there ever >> since. >> - If i comment out line 74643 in sqlite3.c of version 3.6.19, performance is >> good again. >> >> QUESTION: I realize that commenting this line out is not the correct >> solution. >> Is there any way to improve performance of "insert or replace"? DK> Out of interest, is performance improved any with 3.7.15.2 if you DK> execute "PRAGMA temp_store = memory" before calling _do_batch_insert()? DK> Thanks, DK> Dan. >> >> >> I am including 2 other attachments: >> ------------------------------------ >> 1. test_batch_insert.c >> - this is the unit test i created to reproduce the issue. It output >> 2. notes.txt >> - this contains the performance output from running test_batch_insert.c >> on 3 different versions of sqlite >> - 3.6.18 (fast) >> - 3.6.19 (slow) >> - 3.6.19 with line 74643 commented out (fast) >> >> >> Below are detailed (but cryptic) notes on what causes the issue. Please let >> me know if you need more info. >> >> system: >> Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:59:20 EDT 2006 x86_64 x86_64 >> x86_64 GNU/Linux >> >> NOTE: all line number references are based on 3_6_19 version of sqlite3.c, >> where the regression was first introduced. >> >> ROOT CAUSE: >> If I remove one line of code from sqlite3.c (line 74643) that was added in >> this release, performance >> regression is resolved. >> ----------------------------------------------------------------------------------------------- >> 74640 default: { >> 74641 Trigger *pTrigger = 0; >> 74642 assert( onError==OE_Replace ); >> 74643 sqlite3MultiWrite(pParse);<----------- THIS IS THE GUILTY >> LINE!!!! >> 74644 if( pParse->db->flags&SQLITE_RecTriggers ){ >> 74645 pTrigger = sqlite3TriggersExist(pParse, pTab, TK_DELETE, >> 0, 0); >> 74646 } >> 74647 sqlite3GenerateRowDelete( >> 74648 pParse, pTab, baseCur, regR, 0, pTrigger, OE_Replace >> 74649 ); >> 74650 seenReplace = 1; >> 74651 break; >> 74652 } >> ----------------------------------------------------------------------------------------------- >> >> >> DETAILS OF WHY THIS LINE CAUSES THE REGRESSION: >> The effect of including line 74643 when running a batch insert: >> >> HIGH LEVEL: >> in pager_write function, we<sometimes> end up going into this if statement, >> which creates a >> subjournal, causing creation of etilqs_xxx file. >> NOTE: using the attached test_batch_insert.c file with max_entries set to >> 40000, this >> results in creation of 15 etilqs_xxx and 262122 MORE writes!! >> ----------------------------------------------------------------------------------------------- >> 35536 /* If the statement journal is open and the page is not in >> it, >> 35537 ** then write the current page to the statement journal. >> Note that >> 35538 ** the statement journal format differs from the standard >> journal format >> 35539 ** in that it omits the checksums and the header. >> 35540 */ >> 35541 if( subjRequiresPage(pPg) ){ >> 35542 rc = subjournalPage(pPg);<---- we create a subjournal >> 35543 } >> 35544 } >> ----------------------------------------------------------------------------------------------- >> >> >> LOWER LEVEL DETAILS OF WHY subJRequiresPage IS CALLED WHEN LINE 74643 IS IN >> THE CODE >> when running sample test_batch_insert.c file (included in this bug) >> The reason that subjRequiresPage() returns true is that when >> sqlite3GenerateConstraintChecks is called from sqlite3Insert, >> this calls the guilty line,(74643)sqlite3MultiWrite(pParse); >> >> - This sets pParse->isMultiWrite to 1 >> - then sqlite3FinishCoding calls sqlite3VdbeMakeReady with >> pParse->isMultiWrite =1 >> causing 3rd param of call to be TRUE. >> - This causes Vdbe.usesStmtJournal to be set to TRUE in >> sqliteVdbeMakeReady >> - Then sqlite3BtreeBeginStmt is called from sqlite3VdbeExec (case >> OP_Transaction:) >> here, p->iStatement is set to 1 because p->usesStmtJournal is 1 >> 54698 if( pOp->p2&& p->usesStmtJournal<--- we go INTO >> this if statement >> 54699&& (db->autoCommit==0 || db->activeVdbeCnt>1) >> 54700 ){ >> 54701 assert( sqlite3BtreeIsInTrans(u.as.pBt) ); >> 54702 if( p->iStatement==0 ){ >> 54703 assert( db->nStatement>=0&& db->nSavepoint>=0 >> ); >> 54704 db->nStatement++;<---- this sets to 1, causing >> next line to set iStatement to 1 >> 54705 p->iStatement = db->nSavepoint + db->nStatement; >> 54706 } >> 54707 rc = sqlite3BtreeBeginStmt(u.as.pBt, >> p->iStatement); >> >> - sqlite3BtreeBeginStmt calls sqlite3PagerOpenSavepoint using >> iStatement as 2nd parameter, >> therefore nSavepoint is set to 1 >> >> - eventually we call sqlite3BtreeInsert which calls insertCell which >> calls sqlite3PagerWrite which >> calls pager_write a second time for the SAME pPg->pgno (see below for >> example page 6) >> - because we've gone through once for the same page, we do not >> enter the if clause >> on line 35464, >> 35464 /* The transaction journal now exists and we have a >> RESERVED or an >> 35465 ** EXCLUSIVE lock on the main database file. Write >> the current page to >> 35466 ** the transaction journal if it is not there >> already. >> 35467 */ >> 35468 if( !pageInJournal(pPg)&& isOpen(pPager->jfd) ){ >> >> - therefore we DO NOT add this page to the savepoint bitVec on line >> 35517 >> 35517 rc = sqlite3BitvecSet(pPager->pInJournal, >> pPg->pgno); >> >> - therefore we DO go into the if(subjRequiresPage(pPg)), because >> nSavepoint is 1 but the bit is NOT set >> 35536 /* If the statement journal is open and the page is >> not in it, >> 35537 ** then write the current page to the statement >> journal. Note that >> 35538 ** the statement journal format differs from the >> standard journal format >> 35539 ** in that it omits the checksums and the header. >> 35540 */ >> 35541 if( subjRequiresPage(pPg) ){ >> 35542 rc = subjournalPage(pPg); >> 35543 } >> from gdb... >> - (gdb) p *pPg >> $17= {pPage = 0x2b84618, pData = 0x2b83618, pExtra = >> 0x2b84698, pDirty = 0x0, pPager = 0x2b666c8, >> pgno = 6, flags = 6, nRef = 1, pCache = 0x2b667d0, >> pDirtyNext = 0x2b83540, pDirtyPrev = 0x0} >> (gdb) p pageInJournal(pPg) >> $18 = 1 >> (gdb) p >> sqlite3BitvecTest(pPager->aSavepoint[0]->pInSavepoint,6) >> $19 = 0 >> >> - therefore we end up CALLING subjournalPage(pPg) from line 35542, >> causing temp file to be created and lots of small writes to the >> file >> >> >> >> _______________________________________________ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users DK> _______________________________________________ DK> sqlite-users mailing list DK> sqlite-users@sqlite.org DK> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Best regards, Teg mailto:t...@djii.com _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users