Re: [sqlite] Process memory space exhausted in 3.7.0

2010-08-12 Thread Victor Morales-Duarte
A couple of seconds before the failure occurs, the test.db-wal file is
5,502,389KB and test.db-shm is 10,688KB.
The private bytes counter (probably the best measure of how much memory a
Windows application is using) is perhaps a few megs above 130MB.
Making the change to have it commit every 25 records by changing the
source code from this:

  rc = sqlite3_step( statement );
  if ( rc != SQLITE_DONE )
  {
 std::string errorMessage( sqlite3_errmsg( connection ) );
 std::runtime_error ex( errorMessage );
 sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
 sqlite3_finalize( statement );
 sqlite3_close( connection );
 statement = NULL;
 connection = NULL;
 std::cerr << errorMessage;
 throw ex;
  }
  sqlite3_reset( statement );

To this:

  rc = sqlite3_step( statement );
  if ( rc != SQLITE_DONE )
  {
 std::string errorMessage( sqlite3_errmsg( connection ) );
 std::runtime_error ex( errorMessage );
 sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
 sqlite3_finalize( statement );
 sqlite3_close( connection );
 statement = NULL;
 connection = NULL;
 std::cerr << errorMessage;
 throw ex;
  }
  sqlite3_reset( statement );
  if ( recIdx % 25 == 0 )
  {
 sqlite3_exec( connection, "COMMIT TRANSACTION;BEGIN IMMEDIATE TRANSACTION;", NULL, NULL, NULL );
  }

This is currently running, but it looks like it will take quite a bit of time
to finish, so I'll have to update you tomorrow.

Victor

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian
Filter detected spam


On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge insert.
> The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line
> having this source file and the sqlite3 3.7.0 amalgamation files in the
> same folder is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering about
> why the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as possible
> like the actual data that our application stores in sqlite because I had
> not realized that the failure could be reproduced with simply inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable run with perfmon and look at its virtual
> bytes, you'll see them hit 2GB and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm"
files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped
into the process's address space) grows to about 24 MB. The
test.db-wal file grows to about 12 GB. This is as expected - the *-shm
file requires 8 bytes of space for each page written into the *-wal
log. So with your 4KB pages, the *-wal file should be roughly
512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed.
In the checkpoint code there is a big heap memory allocation -
say 2.5 bytes for each page in the WAL. So in this case maybe
6-7MB. It's not ideal to be making allocations this big, but
it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130MB. That's the 85MB of
configured cache space (20,000 * 4KB pages), plus the mapping
of the test.db-shm file plus the big allocation made during
the checkpoint. Plus something else I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not  
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN")
every 10,000 inserts the program still does not run to completion?

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian Filter detected spam

2010-08-12 Thread Victor Morales-Duarte
I'll run the test again tonight to give you the size of the -wal and
-shm file at the point of failure.

On WinXP (32 bits) the memory used by the application is not a problem,
as it also stays in the low hundreds of MB, even though the virtual bytes
(explained by perfmon as follows: "Virtual Bytes is the current
size, in bytes, of the virtual address space the process is using. Use
of virtual address space does not necessarily imply corresponding use of
either disk or main memory pages. Virtual space is finite, and the
process can limit its ability to load libraries.") hit the 2GB max.

Committing without closing the connection did not help. I'll write the
code for the corresponding tests and reply again later today.

Later!

Victor





Re: [sqlite] Process memory space exhausted in 3.7.0

2010-08-10 Thread Dan Kennedy

On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge insert.
> The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line
> having this source file and the sqlite3 3.7.0 amalgamation files in the
> same folder is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering about
> why the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as possible
> like the actual data that our application stores in sqlite because I had
> not realized that the failure could be reproduced with simply inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable run with perfmon and look at its virtual
> bytes, you'll see them hit 2GB and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm"
files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped
into the process's address space) grows to about 24 MB. The
test.db-wal file grows to about 12 GB. This is as expected - the *-shm
file requires 8 bytes of space for each page written into the *-wal
log. So with your 4KB pages, the *-wal file should be roughly
512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed.
In the checkpoint code there is a big heap memory allocation -
say 2.5 bytes for each page in the WAL. So in this case maybe
6-7MB. It's not ideal to be making allocations this big, but
it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130MB. That's the 85MB of
configured cache space (20,000 * 4KB pages), plus the mapping
of the test.db-shm file plus the big allocation made during
the checkpoint. Plus something else I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not  
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN")
every 10,000 inserts the program still does not run to completion?

Dan.



Re: [sqlite] Process memory space exhausted in 3.7.0

2010-08-09 Thread Victor Morales-Duarte
As it turns out, I can reproduce the failure using a single huge insert.
The code that I'm including below compiles under bcc32 from
Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
likely that people have MS compilers available, the compilation line
having this source file and the sqlite3 3.7.0 amalgamation files in the
same folder is:

cl -EHsc -Fefail.exe main.cpp sqlite3.c

You can then invoke fail.exe with a single command line argument of
8000 like this:

Fail.exe 8000

The source for the executable is listed below. If you're wondering about
why the numbers being inserted are more complicated than need be, it's
because I just wanted the table and indices to look as much as possible
like the actual data that our application stores in sqlite because I had
not realized that the failure could be reproduced with simply inserting.
Beware that there is no handling of incorrect command line arguments.

If you monitor this executable run with perfmon and look at its virtual
bytes, you'll see them hit 2GB and then the next time the insert
statement is stepped, it fails with an I/O disc error. I understand that
this may not be the intended use case of sqlite, but I don't remember
reading anything anywhere that forces you to limit the size of insert
operations, so a priori, there should not be any reason why simply
inserting a lot of rows in a single transaction should make the db fail.
If you break up the insert into chunks
_and_close_the_connection_between_chunks_ then the error does not occur.
This is even more bothersome because there is also no documentation
saying that there would be a limit on the number of operations that can
be performed on a single connection.

Should I open a bug for this?

Thanks a lot!

Victor


start main.cpp-
// Header names inferred from usage below (the archive stripped the <...> parts).
#include <iostream>   // std::cout / std::cerr
#include <sstream>    // std::stringstream
#include <string>
#include <map>
#include <utility>    // std::pair
#include <stdexcept>  // std::runtime_error

#include "sqlite3.h"

int main( int argc, char* argv[] )
{
   // boost::lexical_cast where art thou?
   int nRecords = 400;
   if ( argc > 1 )
   {
  std::stringstream sstr;
  sstr << argv[1];
  sstr >> nRecords;
   }

   sqlite3* connection = NULL;
   sqlite3_stmt* statement = NULL;
   
   // Open db
   std::cout << "Opening db." << std::endl;
   int rc = sqlite3_open( "./test.db", &connection );
   if ( rc )
   {
  std::string errorMessage( sqlite3_errmsg( connection ) );
  std::runtime_error ex( errorMessage );
  sqlite3_close( connection );
  connection = NULL;
  std::cerr << errorMessage;
  throw ex;
   }
   else
   {
  int theTimeout = 5000;
  sqlite3_exec( connection,
                "PRAGMA page_size = 4096; PRAGMA foreign_keys = 1; "
                "PRAGMA cache_size = 20000; PRAGMA journal_mode=WAL;",
                NULL, NULL, NULL );
  sqlite3_busy_timeout( connection, theTimeout );
   }

   // Schema
   std::cout << "Creating schema." << std::endl;
   sqlite3_exec( connection, "BEGIN IMMEDIATE TRANSACTION;", NULL, NULL, NULL );
   sqlite3_exec( connection,
                 "CREATE TABLE IF NOT EXISTS TEST_DATA( ID INTEGER PRIMARY KEY AUTOINCREMENT, "
                 "GROUPID INTEGER, USERID INTEGER, CONTEXTID INTEGER, TSTAMP INTEGER, "
                 "SOURCE STRING, TYPE STRING, SERVERID INTEGER, NSIG INTEGER, NNOI INTEGER, "
                 "LON FLOAT, LAT FLOAT, CONF FLOAT );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_ID "
                 "ON TEST_DATA( ID );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_USERID_GROUPID_TSTAMP "
                 "ON TEST_DATA( USERID, GROUPID, TSTAMP );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_GROUPID_TSTAMP "
                 "ON TEST_DATA( GROUPID, TSTAMP );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_CONTEXTID_TSTAMP "
                 "ON TEST_DATA( CONTEXTID, TSTAMP );", NULL, NULL, NULL );
   sqlite3_exec( connection, "CREATE INDEX IF NOT EXISTS IDX_TEST_DATA_LAT_LON "
                 "ON TEST_DATA( LAT, LON );", NULL, NULL, NULL );
   sqlite3_exec( connection, "COMMIT TRANSACTION;", NULL, NULL, NULL );
   
   std::string insertRecordStatementStr(
      "INSERT INTO TEST_DATA ( GROUPID, USERID, CONTEXTID, TSTAMP, SOURCE, TYPE, "
      "SERVERID, NSIG, NNOI, LON, LAT, CONF ) VALUES ( NULL, ?1, ?2, ?3, ?4, ?5, "
      "?6, ?7, ?8, NULL, NULL, NULL );" );

   std::cout << "Preparing statement." << std::endl;
   rc = sqlite3_prepare_v2( connection, insertRecordStatementStr.c_str(), -1,
                            &statement, 0 );
   if ( rc != SQLITE_OK )
   {
  std::string errorMessage( sqlite3_errmsg( connection ) );
  std::runtime_error ex( errorMessage );
  sqlite3_finalize( statement );
  sqlite3_close( connection );
  statement = NULL;
  connection = NULL;
  std::cerr << errorMessage;
  throw ex;
   }

   std::map< int, std::pair< int, int > > userId2ContextIdCount;
   const char* sourceStr = "test_failure";
   const char* types[] = { "type_01_1234567890_1234567890",