I am in the process of updating from 3.6.4 to the latest and greatest version (finally :-) ). While running performance unit tests, i found a major regression (10K/sec vs 30k/sec) on a common use case for us. The regression occurs when using "insert or replace".
I narrowed this down as follows: - the regression was introduced in version 3.6.19, and has been there ever since. - If i comment out line 74643 in sqlite3.c of version 3.6.19, performance is good again. QUESTION: I realize that commenting this line out is not the correct solution. Is there any way to improve performance of "insert or replace"? I am including 2 other attachments: ------------------------------------ 1. test_batch_insert.c - this is the unit test i created to reproduce the issue. It output 2. notes.txt - this contains the performance output from running test_batch_insert.c on 3 different versions of sqlite - 3.6.18 (fast) - 3.6.19 (slow) - 3.6.19 with line 74643 commented out (fast) Below are detailed (but cryptic) notes on what causes the issue. Please let me know if you need more info. system: Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:59:20 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux NOTE: all line number references are based on 3_6_19 version of sqlite3.c, where the regression was first introduced. ROOT CAUSE: If I remove one line of code from sqlite3.c (line 74643) that was added in this release, performance regression is resolved. ----------------------------------------------------------------------------------------------- 74640 default: { 74641 Trigger *pTrigger = 0; 74642 assert( onError==OE_Replace ); 74643 sqlite3MultiWrite(pParse); <----------- THIS IS THE GUILTY LINE!!!! 74644 if( pParse->db->flags&SQLITE_RecTriggers ){ 74645 pTrigger = sqlite3TriggersExist(pParse, pTab, TK_DELETE, 0, 0); 74646 } 74647 sqlite3GenerateRowDelete( 74648 pParse, pTab, baseCur, regR, 0, pTrigger, OE_Replace 74649 ); 74650 seenReplace = 1; 74651 break; 74652 } ----------------------------------------------------------------------------------------------- DETAILS OF WHY THIS LINE CAUSES THE REGRESSION: The effect of including line 74643 when running a batch insert: HIGH LEVEL: in pager_write function, we <sometimes> end up going into this if statement, which creates a subjournal, causing creation of etilqs_xxx file. NOTE: using the attached test_batch_insert.c file with max_entries set to 40000, this results in creation of 15 etilqs_xxx and 262122 MORE writes!! ----------------------------------------------------------------------------------------------- 35536 /* If the statement journal is open and the page is not in it, 35537 ** then write the current page to the statement journal. Note that 35538 ** the statement journal format differs from the standard journal format 35539 ** in that it omits the checksums and the header. 35540 */ 35541 if( subjRequiresPage(pPg) ){ 35542 rc = subjournalPage(pPg); <---- we create a subjournal 35543 } 35544 } ----------------------------------------------------------------------------------------------- LOWER LEVEL DETAILS OF WHY subJRequiresPage IS CALLED WHEN LINE 74643 IS IN THE CODE when running sample test_batch_insert.c file (included in this bug) The reason that subjRequiresPage() returns true is that when sqlite3GenerateConstraintChecks is called from sqlite3Insert, this calls the guilty line,(74643)sqlite3MultiWrite(pParse); - This sets pParse->isMultiWrite to 1 - then sqlite3FinishCoding calls sqlite3VdbeMakeReady with pParse->isMultiWrite =1 causing 3rd param of call to be TRUE. - This causes Vdbe.usesStmtJournal to be set to TRUE in sqliteVdbeMakeReady - Then sqlite3BtreeBeginStmt is called from sqlite3VdbeExec (case OP_Transaction:) here, p->iStatement is set to 1 because p->usesStmtJournal is 1 54698 if( pOp->p2 && p->usesStmtJournal <--- we go INTO this if statement 54699 && (db->autoCommit==0 || db->activeVdbeCnt>1) 54700 ){ 54701 assert( sqlite3BtreeIsInTrans(u.as.pBt) ); 54702 if( p->iStatement==0 ){ 54703 assert( db->nStatement>=0 && db->nSavepoint>=0 ); 54704 db->nStatement++; <---- this sets to 1, causing next line to set iStatement to 1 54705 p->iStatement = db->nSavepoint + db->nStatement; 54706 } 54707 rc = sqlite3BtreeBeginStmt(u.as.pBt, p->iStatement); - sqlite3BtreeBeginStmt calls sqlite3PagerOpenSavepoint using iStatement as 2nd parameter, therefore nSavepoint is set to 1 - eventually we call sqlite3BtreeInsert which calls insertCell which calls sqlite3PagerWrite which calls pager_write a second time for the SAME pPg->pgno (see below for example page 6) - because we've gone through once for the same page, we do not enter the if clause on line 35464, 35464 /* The transaction journal now exists and we have a RESERVED or an 35465 ** EXCLUSIVE lock on the main database file. Write the current page to 35466 ** the transaction journal if it is not there already. 35467 */ 35468 if( !pageInJournal(pPg) && isOpen(pPager->jfd) ){ - therefore we DO NOT add this page to the savepoint bitVec on line 35517 35517 rc = sqlite3BitvecSet(pPager->pInJournal, pPg->pgno); - therefore we DO go into the if(subjRequiresPage(pPg)), because nSavepoint is 1 but the bit is NOT set 35536 /* If the statement journal is open and the page is not in it, 35537 ** then write the current page to the statement journal. Note that 35538 ** the statement journal format differs from the standard journal format 35539 ** in that it omits the checksums and the header. 35540 */ 35541 if( subjRequiresPage(pPg) ){ 35542 rc = subjournalPage(pPg); 35543 } from gdb... - (gdb) p *pPg $17= {pPage = 0x2b84618, pData = 0x2b83618, pExtra = 0x2b84698, pDirty = 0x0, pPager = 0x2b666c8, pgno = 6, flags = 6, nRef = 1, pCache = 0x2b667d0, pDirtyNext = 0x2b83540, pDirtyPrev = 0x0} (gdb) p pageInJournal(pPg) $18 = 1 (gdb) p sqlite3BitvecTest(pPager->aSavepoint[0]->pInSavepoint,6) $19 = 0 - therefore we end up CALLING subjournalPage(pPg) from line 35542, causing temp file to be created and lots of small writes to the file
version 3.6.18 -------------- run 1 batch insert of 400000 entries took: 10.69 seconds 37418 entries/second run 2 batch insert of 400000 entries took: 11.16 seconds 35842 entries/second strace output with num_entries set to 40000 shows 0 etilqs_xxx (temp files) generated 8512 writes version 3.6.19 --------------- run 1 batch insert of 400000 entries took: 37.63 seconds 10630 entries/second run 2 batch insert of 400000 entries took: 36.49 seconds 10962 entries/second strace output with num_entries set to 40000 shows 15 etilqs_xxx (temp files) generated 270634 writes (note this is 262122 more writes than the 8512 done in 3.6.18) ------------------------------------------------------------ version 3.6.19 WITH line 74643 of sqlite3.c commented out ------------------------------------------------------------ run 1 batch insert of 400000 entries took: 11.27 seconds 35492 entries/second run 2 batch insert of 400000 entries took: 10.90 seconds 36697 entries/second
#include <stdio.h> #include "sqlite3.h" #include <time.h> static void _do_batch_insert(sqlite3 *pdb) { int max_entries=40000; // int max_entries=400000; // int max_entries=5; int serr; sqlite_uint64 i; sqlite3_stmt *stmt; const char *pzTail; char *zErrMsg; const char *key_prefix = "__extend.extend.extend.extend.extend.extend.extend"; const char *value = "_value.value.value.value.value.value.value.value"; char key[128]; /* * Create the table to do the test batch insert */ serr = sqlite3_exec(pdb, "create table test_batch_insert_tbl (tst_key text primary key not null, value text not null)", NULL, 0, &zErrMsg); if (serr != SQLITE_OK) { sqlite3_free(zErrMsg); fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_prepare_v2(pdb, "insert or replace into test_batch_insert_tbl (tst_key, value) values (?,?)", -1, &stmt, &pzTail); if (serr != SQLITE_OK || stmt == NULL) { fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_bind_text(stmt, 2, value, -1, SQLITE_STATIC); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_exec(pdb, "begin immediate transaction", NULL, 0, &zErrMsg); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } /* Do the batch insert */ clock_t start = clock(); for (i = 0; i < max_entries; i++) { snprintf(key, sizeof(key), "%s%llu", key_prefix, i); serr = sqlite3_bind_text(stmt, 1, key, -1, SQLITE_STATIC); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_step(stmt); if (serr != SQLITE_DONE) { fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_reset(stmt); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } /* * Don't let the journal get too big, commit periodically. */ if (i % 10000 == 0) { serr = sqlite3_exec(pdb, "commit", NULL, 0, &zErrMsg); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_exec(pdb, "begin immediate transaction", NULL, 0, &zErrMsg); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } } } clock_t end = clock(); double total_time = (double)(end - start) / CLOCKS_PER_SEC; fprintf(stderr, "batch insert of %d entries took: %.2f seconds\n", max_entries, total_time); fprintf(stderr, "%.0f entries/second\n", max_entries/total_time); serr = sqlite3_exec(pdb, "commit", NULL, 0, &zErrMsg); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } serr = sqlite3_finalize(stmt); serr = sqlite3_exec(pdb, "drop table test_batch_insert_tbl", NULL, 0, &zErrMsg); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } } int main(int argc, char **argv) { sqlite3 *pdb; char *zErrMsg = 0; int serr; char * dbname="/tmp/ktstslow.db"; int flags = (SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE); serr = sqlite3_open_v2(dbname, &pdb, flags, NULL); if (serr != SQLITE_OK) { fprintf(stderr, "SQLite error %d", serr); return; } _do_batch_insert(pdb); sqlite3_close(pdb); // delete the db unlink(dbname); return 0; }
_______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users