Re: [sqlite] Process memory space exhausted in 3.7.0
A couple of seconds before the failure occurs, the test.db-wal file is 5,502,389 KB and test.db-shm is 10,688 KB. The private bytes (probably the best measure of how much memory a Windows application is using) are perhaps a few MB above 130 MB.

The change to have it commit every 25 records, going from this:

    rc = sqlite3_step( statement );
    if ( rc != SQLITE_DONE )
    {
       std::string errorMessage( sqlite3_errmsg( connection ) );
       std::runtime_error ex( errorMessage );
       sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
       sqlite3_finalize( statement );
       sqlite3_close( connection );
       statement = NULL;
       connection = NULL;
       std::cerr << errorMessage;
       throw ex;
    }
    sqlite3_reset( statement );

to this:

    rc = sqlite3_step( statement );
    if ( rc != SQLITE_DONE )
    {
       std::string errorMessage( sqlite3_errmsg( connection ) );
       std::runtime_error ex( errorMessage );
       sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
       sqlite3_finalize( statement );
       sqlite3_close( connection );
       statement = NULL;
       connection = NULL;
       std::cerr << errorMessage;
       throw ex;
    }
    sqlite3_reset( statement );
    if ( recIdx % 25 == 0 )
    {
       sqlite3_exec( connection, "COMMIT TRANSACTION;BEGIN IMMEDIATE TRANSACTION;", NULL, NULL, NULL );
    }

is currently running, but it looks like it will take quite a bit of time to finish, so I'll have to update you tomorrow.

Victor

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0

On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge insert.
> The code that I'm including below compiles under bcc32 from Embarcadero's
> C++ Builder 2007 and cl from vs2005. Since it's more likely that people
> have MS compilers available, the compilation line, with this source file
> and the sqlite3 3.7.0 amalgamation files in the same folder, is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of 8000
> like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering why
> the numbers being inserted are more complicated than they need to be,
> it's because I wanted the table and indices to look as much as possible
> like the actual data that our application stores in sqlite; I had not
> realized that the failure could be reproduced by simply inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable with perfmon and look at its virtual
> bytes, you'll see them hit 2GB, and then the next time the insert
> statement is stepped it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm" files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped into the process's address space) grows to about 24 MB. The test.db-wal file grows to about 12 GB. This is as expected - the *-shm file requires 8 bytes of space for each page written into the *-wal log, so with your 4KB pages the *-wal file should be roughly 512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed. In the checkpoint code there is a big heap memory allocation - say 2.5 bytes for each page in the WAL, so in this case maybe 6-7 MB. It's not ideal to be making allocations this big, but it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130 MB. That's the 85 MB of configured cache space (20,000 * 4KB pages), plus the mapping of the test.db-shm file, plus the big allocation made during the checkpoint. Plus something else, I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN") every 10,000 inserts the program still does not run to completion?

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Process memory space exhausted in 3.7.0
I'll run the test again tonight to give you the size of the -wal and -shm files at the point of failure.

On WinXP (32 bits) the memory used by the application is not a problem, as it also stays in the low hundreds of MB even though the virtual bytes hit the 2 GB max. (Perfmon explains virtual bytes like this: "Virtual Bytes is the current size, in bytes, of the virtual address space the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. Virtual space is finite, and the process can limit its ability to load libraries.")

Committing without closing the connection did not help. I'll write the code for the corresponding tests and reply again later today.

Later!

Victor

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0

On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN") every 10,000 inserts the program still does not run to completion?

Dan.
Re: [sqlite] Process memory space exhausted in 3.7.0
On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge insert.
> The code that I'm including below compiles under bcc32 from Embarcadero's
> C++ Builder 2007 and cl from vs2005. Since it's more likely that people
> have MS compilers available, the compilation line, with this source file
> and the sqlite3 3.7.0 amalgamation files in the same folder, is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of 8000
> like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering why
> the numbers being inserted are more complicated than they need to be,
> it's because I wanted the table and indices to look as much as possible
> like the actual data that our application stores in sqlite; I had not
> realized that the failure could be reproduced by simply inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable with perfmon and look at its virtual
> bytes, you'll see them hit 2GB, and then the next time the insert
> statement is stepped it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm" files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped into the process's address space) grows to about 24 MB. The test.db-wal file grows to about 12 GB. This is as expected - the *-shm file requires 8 bytes of space for each page written into the *-wal log, so with your 4KB pages the *-wal file should be roughly 512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed. In the checkpoint code there is a big heap memory allocation - say 2.5 bytes for each page in the WAL, so in this case maybe 6-7 MB. It's not ideal to be making allocations this big, but it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130 MB. That's the 85 MB of configured cache space (20,000 * 4KB pages), plus the mapping of the test.db-shm file, plus the big allocation made during the checkpoint. Plus something else, I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN") every 10,000 inserts the program still does not run to completion?

Dan.
Re: [sqlite] Process memory space exhausted in 3.7.0
As it turns out, I can reproduce the failure using a single huge insert. The code that I'm including below compiles under bcc32 from Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more likely that people have MS compilers available, the compilation line, with this source file and the sqlite3 3.7.0 amalgamation files in the same folder, is:

cl -EHsc -Fefail.exe main.cpp sqlite3.c

You can then invoke fail.exe with a single command line argument of 8000 like this:

Fail.exe 8000

The source for the executable is listed below. If you're wondering why the numbers being inserted are more complicated than they need to be, it's because I wanted the table and indices to look as much as possible like the actual data that our application stores in sqlite; I had not realized that the failure could be reproduced by simply inserting. Beware that there is no handling of incorrect command line arguments.

If you monitor this executable with perfmon and look at its virtual bytes, you'll see them hit 2GB, and then the next time the insert statement is stepped it fails with an I/O disc error.

I understand that this may not be the intended use case of sqlite, but I don't remember reading anything anywhere that forces you to limit the size of insert operations, so a priori there should not be any reason why simply inserting a lot of rows in a single transaction should make the db fail. If you break up the insert into chunks _and_close_the_connection_between_chunks_ then the error does not occur. This is even more bothersome because there is also no documentation saying that there would be a limit on the number of operations that can be performed on a single connection.

Should I open a bug for this?

Thanks a lot!

Victor

----- start main.cpp -----

#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <utility>
#include <stdexcept>
#include "sqlite3.h"

int main( int argc, char* argv[] )
{
   // boost::lexical_cast where art thou?
   int nRecords = 400;
   if ( argc > 1 )
   {
      std::stringstream sstr;
      sstr << argv[1];
      sstr >> nRecords;
   }

   sqlite3* connection = NULL;
   sqlite3_stmt* statement = NULL;

   // Open db
   std::cout << "Opening db." << std::endl;
   int rc = sqlite3_open( "./test.db", &connection );
   if ( rc )
   {
      std::string errorMessage( sqlite3_errmsg( connection ) );
      std::runtime_error ex( errorMessage );
      sqlite3_close( connection );
      connection = NULL;
      std::cerr << errorMessage;
      throw ex;
   }
   else
   {
      int theTimeout = 5000;
      sqlite3_exec( connection, "PRAGMA page_size = 4096; PRAGMA foreign_keys = 1; PRAGMA cache_size = 20000; PRAGMA journal_mode=WAL;", NULL, NULL, NULL );
      sqlite3_busy_timeout( connection, theTimeout );
   }

   // Schema
   std::cout << "Creating schema." << std::endl;
   sqlite3_exec( connection, "BEGIN IMMEDIATE TRANSACTION;", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE TABLE IF NOT EXISTS TEST_DATA( ID INTEGER PRIMARY KEY AUTOINCREMENT, GROUPID INTEGER, USERID INTEGER, CONTEXTID INTEGER, TSTAMP INTEGER, SOURCE STRING, TYPE STRING, SERVERID INTEGER, NSIG INTEGER, NNOI INTEGER, LON FLOAT, LAT FLOAT, CONF FLOAT );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_ID ON TEST_DATA( ID );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_USERID_GROUPID_TSTAMP ON TEST_DATA( USERID, GROUPID, TSTAMP );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_GROUPID_TSTAMP ON TEST_DATA( GROUPID, TSTAMP );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_CONTEXTID_TSTAMP ON TEST_DATA( CONTEXTID, TSTAMP );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_LAT_LON ON TEST_DATA( LAT, LON );", NULL, NULL, NULL );
   sqlite3_exec( connection, "COMMIT TRANSACTION;", NULL, NULL, NULL );

   std::string insertRecordStatementStr( "INSERT INTO TEST_DATA ( GROUPID, USERID, CONTEXTID, TSTAMP, SOURCE, TYPE, SERVERID, NSIG, NNOI, LON, LAT, CONF ) VALUES ( NULL, ?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, NULL, NULL, NULL );" );

   std::cout << "Preparing statement db." << std::endl;
   rc = sqlite3_prepare_v2( connection, insertRecordStatementStr.c_str(), -1, &statement, 0 );
   if ( rc != SQLITE_OK )
   {
      std::string errorMessage( sqlite3_errmsg( connection ) );
      std::runtime_error ex( errorMessage );
      sqlite3_finalize( statement );
      sqlite3_close( connection );
      statement = NULL;
      connection = NULL;
      std::cerr << errorMessage;
      throw ex;
   }

   std::map< int, std::pair< int, int > > userId2ContextIdCount;
   const char* sourceStr = "test_failure";
   const char* types[] = { "type_01_1234567890_1234567890",