Re: [sqlite] sqlite-analyzer largish db?
On 01/26/2013 05:06 AM, Petite Abeille wrote:
> Hello,
>
> [3.7.15.2 2013-01-09 11:53:05 c0e09560d26f0a6456be9dd3447f5311eb4f238f]
> [sqlite-analyzer-osx-x86-3071502]
>
> sqlite3_analyzer seems to, hmmm, overflow or something given a largish db.

Looks like the OSX build on sqlite.org was dynamically linked against Tcl 8.4.0, which is the version installed by default on OSX 10.5.x. In that version, the [expr] command sometimes uses 32-bit integer arithmetic, resulting in an overflow when calculating the bytes of storage space used by a large db. Please try the version on sqlite.org now, which is statically linked against Tcl 8.6.

Dan.

> For example, for a db of 8,434,233,344 bytes (8.43 GB on disk):
>
> --8<--
> *** All tables and indices ***
>
> Percentage of total database......  100.000%
> Number of entries.................  143344555
> Bytes of storage consumed.........  -155705344
> Bytes used after compression......  8434229248   -5416.8%
> Bytes of payload..................  6972547011   -4478.0%
> --8<--
>
> Note the negative 'Bytes of storage consumed' and associated negative
> percentages. Bug or feature?
>
> Thanks in advance.
>
> Cheers,
> PA.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
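A minimal sketch (Python, not the analyzer's actual Tcl code) of the arithmetic Dan describes: a byte count above 2^31 wraps to a negative value when truncated to a signed 32-bit integer, exactly reproducing the figure in the report.

```python
# Illustrative only: how a byte count larger than 2^31 wraps negative
# under signed 32-bit (two's complement) arithmetic, as in Tcl 8.4's [expr].

def to_signed_32bit(n: int) -> int:
    """Truncate an integer to a signed 32-bit value."""
    n &= 0xFFFFFFFF                        # keep only the low 32 bits
    return n - 0x100000000 if n >= 0x80000000 else n

# The ~8.4 GB "Bytes used after compression" figure from the report
# wraps to the negative number shown for "Bytes of storage consumed":
print(to_signed_32bit(8434229248))         # -155705344
```

Tcl 8.5 and later switched [expr] to arbitrary-precision integers, which is why the rebuilt binary linked against Tcl 8.6 avoids the problem.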
Re: [sqlite] bug report: out of memory error for most operations on a large database
To answer your question:
- Disk space is not a problem - there is over 500 GB of free space on that disk (with a database size of 80 GB).
- Memory - 16 GB, with at least 8 free at all times during testing.
- Platform - I moved the platform to Windows Server 2012.

I did more testing. I have a large csv file (100 GB / 500m lines). Originally I imported the full file into the database (using sqlite3 .import) and then ran the select query, which resulted in an out-of-memory error. Over the last 24 hours I tried the following:
- split the csv file into smaller pieces (10m lines each)
- import the smaller pieces one-by-one (either in one sqlite3 session or one session per piece - same result)
- after each import, run the query to check whether we have the out-of-memory condition

The out-of-memory condition:
1. does not affect import - I can still load the rows, and count shows that they are loaded
2. once it occurs, all subsequent select...group by queries show out of memory - adding new rows does not change that
3. is _random_ - I got it after a three-piece import yesterday and after a two-piece import today (on the same machine, creating a new db each time). So this is _not_ deterministic, but it shows up over a larger timescale (a few hours, not minutes).
- It does not depend on the database size - the 20m-line database is only 3 GB.
- In the full database I have tables of 20 and 30 GB with different schemas - many more columns but many fewer rows.
- It is not caused by malformed data - the data is from a bcp out (SQL Server). (I will test it in the next 24h.)

Additional observations, looking at the task manager and the sqlite3 process:
- during import the memory grows up to 800 MB (the cache size)
- during select count(*) it goes up to 1.5 GB
- when starting select ... group by, the memory goes up to 1.6 GB for successful runs, and up to 2 GB for the out-of-memory condition

My current suspicion: this is not just the size, but (mostly?) the query.
-----Original Message-----
From: Michael Black [mailto:mdblac...@yahoo.com]
Sent: Friday, January 25, 2013 5:41 AM
To: ammon_lympha...@hotmail.com; 'General Discussion of SQLite Database'
Subject: RE: [sqlite] bug report: out of memory error for most operations on a large database

How much free disk space do you have? Your temp tables might be exceeding capacity.

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of ammon_lymphater
Sent: Thursday, January 24, 2013 1:41 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] bug report: out of memory error for most operations on a large database

Summary: except for select count(*), all operations on a large table (500m rows / 80 GB) result in an out-of-memory error.

Details:
a. Using SQLite3 ver 3.7.15.2, Windows 7 64-bit AMD (but the error also occurs in 3.6.xx and on other Windows platforms)
b. Created a table (schema attached), imported data from a csv file using sqlite3 (no problems)
c. select Name5, count(*) from LargeData group by name5 order by name5 results in "Error: out of memory" (the cardinality of Name5 is 12)
d. sqlite> select count(*) from StorageHourlyBySIDCL2; gives 587608348, as it should
e. The above is independent of cache_size (from 0.3 to 1.5 GB); happens for page_size 32 KB and 64 KB (others not tested)

(personally not urgent for me - just testing the limits - but may be useful for others)

table schema / sqlite3_analyzer output:

CREATE TABLE largedata (
  name1  smalldatetime,
  Name2  uniqueidentifier,
  Name3  varchar (16),
  Name4  varchar (8),
  Name5  varchar (80),
  Name6  real,
  Name7  real,
  Name8  real,
  Name9  real,
  Name10 real,
  Name11 real,
  Name12 real,
  Name13 real,
  Name14 smallint,
  Name15 tinyint,
  Name16 tinyint,
  Name17 smalldatetime,
  Name18 real,
  Name19 tinyint
);

-- SQLITE3_ANALYZER output
/** Disk-Space Utilization Report For h:\temp\convert\import2.db

Page size in bytes.................... 32768
Pages in the whole file (measured).... 2578119
Pages in the whole file (calculated).. 2578118
Pages that store data................. 2578118      100.000%
Pages on the freelist (per header).... 0            0.0%
Pages on the freelist (calculated).... 1            0.0%
Pages of auto-vacuum overhead......... 0            0.0%
Number of tables in the database...... 11
Number of indices..................... 0
Number of named indices............... 0
Automatically generated indices....... 0
Size of the file in bytes............. 84479803392
Bytes of user payload stored.......... 79293861071  93.9%

*** Page counts for all tables with their indices

LargeData............................. 2578108      100.000%
smalldata............................. 1            0.0%
(zero-length tables omitted)

*** All tables
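For readers following along, a rough Python stand-in for the workflow above (the shell's ".import file.csv largedata" followed by the failing GROUP BY). The two-column schema and the sample rows are invented for illustration; the poster's real table has 19 columns and an on-disk file.

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")          # the poster used an on-disk db
con.execute("CREATE TABLE largedata (name1 TEXT, name5 TEXT)")

# Equivalent of importing one csv chunk; a real run would stream from a file.
data = io.StringIO("2013-01-24 00:00,alpha\n2013-01-25 00:00,beta\n")
with con:                                  # commit the chunk as one transaction
    con.executemany("INSERT INTO largedata VALUES (?, ?)", csv.reader(data))

# The shape of the query that triggered the out-of-memory error:
rows = con.execute(
    "SELECT name5, COUNT(*) FROM largedata GROUP BY name5 ORDER BY name5"
).fetchall()
print(rows)                                # [('alpha', 1), ('beta', 1)]
```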
Re: [sqlite] using the same sqlite parameter more than once causes premature memory deallocation
i fixed it!! you were right! it's not to do with the guts of sqlite.. rather it's to do with my incorrect sql statement.. i used MATCH instead of =.. for more details see the Stack Overflow response i made here: http://stackoverflow.com/a/14534957/766570

A

On Fri, Jan 25, 2013 at 9:18 PM, Roger Binns [via SQLite] ml-node+s1065341n66718...@n5.nabble.com wrote:

> On 24/01/13 17:37, abbood wrote:
>> Btw I'm curious how did you find out about this auto release thing?
>> Can you add more detail about that?
>
> I used a debugger to set a breakpoint in malloc_error_debug as the
> message says. That gave a stack trace of an autorelease pool being
> drained. Since most frees happen in the autorelease pool drain it
> doesn't particularly help with telling you which item is the problem. I
> asked on the valgrind list and got a way of keeping the allocation
> stack trace instead of the free one. (The next valgrind release will
> allow you to print both.)
>
> http://article.gmane.org/gmane.comp.debugging.valgrind/12755
>
> The cause of that original issue in my code was NSInvocation and
> dealing with returned objects. The cause of the most recent issue was
> that NSData was taking ownership of a buffer passed to it that I didn't
> want it to.
>
>> Ie I put a break point in malloc_error_break.. But then I just jump to
>> the internals of sqlite.. And basically within the sqlite internals
>> it's freeing an operation that doesn't exist..
>
> Using valgrind will narrow down the problem. What you are seeing is the
> consequence of earlier memory errors.
>
>> But im not sure if it's a good idea to modify the guts of sqlite.. Is it?
>
> SQLite is *extremely* unlikely to have a bug. Some other piece of code
> has the bug, and SQLite is the victim. Remember that virtually every
> web browser on virtually every platform is using SQLite - an error
> would show up for someone.
>
> http://www.sqlite.org/testing.html
>
>> And then I made the second part use inline variables as opposed to
>> bindings.. Ie NSString stringWithFormat..
>
> Do remember sqlite3_mprintf for that sort of thing, especially if
> strings are involved, to avoid sql injection attacks/bugs:
>
> http://www.sqlite.org/c3ref/mprintf.html
>
>> So I deleted that row from the dbase and it works fine.. I couldn't
>> find anything different on that row..
>
> Memory allocators typically have buckets for different sized
> allocations, often in powers of two. It could be that the row had a
> string needing 33 bytes while others needed less, which then caused
> memory to come out of a different bucket, which then changes where the
> victim of the actual bug shows up.
>
> Roger
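On the thread's title question: in SQLite itself, reusing a parameter is safe. A short sketch using Python's stdlib `sqlite3` wrapper (a different binding from the poster's Objective-C code; table and data are invented) shows the same named parameter appearing twice in one statement - SQLite binds the value once and reuses it at both sites, and binding also avoids the injection risk Roger raises about building SQL with string formatting.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE songs (title TEXT, artist TEXT)")
con.executemany("INSERT INTO songs VALUES (?, ?)",
                [("Hello", "Adele"), ("Hello", "Hello")])

# The same named parameter :q is referenced twice in one statement;
# the value is bound once and reused, with no caller-managed memory.
rows = con.execute(
    "SELECT title, artist FROM songs WHERE title = :q OR artist = :q",
    {"q": "Hello"},
).fetchall()
print(rows)    # both rows match
```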
Re: [sqlite] bug report: out of memory error for most operations on a large database
ammon_lymphater wrote:
> - during import the memory grows up to 800 MB (the cache size)
> - during select count(*) it goes up to 1.5 GB
> - when starting select ... group by, the memory goes up to 1.6 GB for
>   successful runs, up to 2 GB for the out-of-memory condition

This looks as if all temporary data is stored in memory (of which you don't have too much in a 32-bit process). What is the value of PRAGMA temp_store; and the default setting of TEMP_STORE (PRAGMA compile_options;)?

Regards,
Clemens
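The two pragmas Clemens asks about can be checked from any wrapper; a small sketch in Python's stdlib `sqlite3`:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# 0 = DEFAULT (the compile-time TEMP_STORE setting decides),
# 1 = FILE, 2 = MEMORY
print(con.execute("PRAGMA temp_store").fetchone()[0])

# Push temporary structures (e.g. GROUP BY transient storage) to disk:
con.execute("PRAGMA temp_store = FILE")
print(con.execute("PRAGMA temp_store").fetchone()[0])   # 1

# The compile-time default, if this build exposes it, shows up here:
for (opt,) in con.execute("PRAGMA compile_options"):
    if "TEMP_STORE" in opt:
        print(opt)
```

Forcing temp_store to FILE is one way to keep large intermediate results out of the limited address space of a 32-bit process.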
Re: [sqlite] using the same sqlite parameter more than once causes premature memory deallocation
On 25/01/13 23:45, abbood wrote:
> i fixed it!! you were right! it's not to do with the guts of sql..
> rather it's to do with my incorrect sql statement..

Huh? There is no SQL statement, valid or not, that can cause memory errors. It looks like SQLite is the victim of some other memory mismanagement in your app, and changing the SQL has just changed what code will fall victim to it. It is a very good idea to run valgrind before proceeding.

Roger
Re: [sqlite] bug report: out of memory error for most operations on a large database
Clemens wrote:
> This looks as if all temporary data is stored in memory (of which you
> don't have too much in a 32-bit process.)

Yeah, this. Typically you only have 2 gigs of address space to work with in 32-bit Windows. I've run out of RAM by having temp_store=memory before. I also have 30-60 GB DB files that I operate on that don't have any issues like this.

--
Best regards,
Teg <t...@djii.com>
Re: [sqlite] identify virtual tables?
On 25/01/13 12:57, Petite Abeille wrote:
> and lower( sql ) like '% virtual %'

It is worth pointing out that SQLite normalises the create table text for a table in the sqlite_master table, ie it always begins CREATE VIRTUAL TABLE (uppercased) no matter how the table was created. By contrast the columns are left exactly as they were, whitespace, comments and all.

DDL:

  create /* foo */ vIrTuAl table foo using fts3(content /* ab */ TEXT);

Corresponding sqlite_master:

  CREATE VIRTUAL TABLE foo using fts3(content /* ab */ TEXT)

Of course this is an implementation detail and not specified anywhere, but it is good enough. Your like query has the problem that it will match rows where "virtual" appears anywhere.

Roger
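Roger's point can be demonstrated without a virtual table at all: any ordinary table whose stored SQL happens to contain " virtual " trips the substring test, while the normalised prefix does not. A sketch (the contrived column name is invented for the demonstration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# An ordinary table whose stored SQL contains " virtual " by accident:
con.execute('CREATE TABLE notes ("my virtual col" TEXT)')

# The substring test matches it - a false positive ...
fuzzy = con.execute(
    "SELECT name FROM sqlite_master WHERE lower(sql) LIKE '% virtual %'"
).fetchall()
print(fuzzy)     # [('notes',)]

# ... while the normalised prefix Roger describes does not:
strict = con.execute(
    "SELECT name FROM sqlite_master WHERE sql LIKE 'CREATE VIRTUAL TABLE%'"
).fetchall()
print(strict)    # []
```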
Re: [sqlite] SELECT and UPDATE in single query
On Fri, Jan 25, 2013 at 1:36 AM, Yongil Jang <yongilj...@gmail.com> wrote:
> There is the sqlite3_update_hook() function that returns the rowid of
> the changed record.

Sounds promising, but apparently not supported by the python wrapper.
Re: [sqlite] SELECT and UPDATE in single query
>> There is the sqlite3_update_hook() function that returns the rowid of
>> the changed record.
>
> Sounds promising, but apparently not supported by the python wrapper.

Not by the included sqlite3/pysqlite dbapi wrapper - though it is supported by Roger Binns' APSW wrapper.

http://code.google.com/p/apsw/
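With the stdlib wrapper and no update hook, one portable pattern is to collect the affected rowids with a SELECT and then UPDATE exactly those rowids, committed together. A sketch (table and column names are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT)")
con.executemany("INSERT INTO jobs (state) VALUES (?)",
                [("new",), ("new",), ("done",)])

with con:   # commit the updates together; rolls back on error
    changed = [r[0] for r in con.execute(
        "SELECT id FROM jobs WHERE state = 'new'")]
    con.executemany("UPDATE jobs SET state = 'running' WHERE id = ?",
                    [(i,) for i in changed])

print(changed)   # the rowids that were updated: [1, 2]
```

With a single writer connection this gives the same information an update hook would: the list of rowids that changed.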
[sqlite] Can I safely use the pragma synchronization = OFF?
Hi all, I read the documentation about the synchronization pragma. It has got to do with how often the xSync method is called. With synchronization = FULL, xSync is called after each and every change to the database file, as far as I understand...

Observing the VFS interface used by SQLite:

  typedef struct sqlite3_io_methods sqlite3_io_methods;
  struct sqlite3_io_methods {
    int iVersion;
    int (*xClose)(sqlite3_file*);
    int (*xRead)(sqlite3_file*, void*, int iAmt, sqlite3_int64 iOfst);
    int (*xWrite)(sqlite3_file*, const void*, int iAmt, sqlite3_int64 iOfst);
    int (*xTruncate)(sqlite3_file*, sqlite3_int64 size);
    int (*xSync)(sqlite3_file*, int flags);
    /* ... */
  };

I see both xWrite and xSync... Does this mean that xWrite initiates a FS write to the file? Does that mean that xSync makes sure that the FS-buffered changes are synced to disk? I guess it is calling fsync in the case of Linux/FreeBSD, am I right?

If the above is correct and SQLite operates over a modern reliable FS that has journaling with each write, then despite the fact that the write buffer caches are not fully synced, they are protected by the FS journal, which fully records all the changes to the file and is going to be replayed on FS mount after a system crash.

If my understanding is correct, then assuming the FS journaling is bullet proof, I can safely operate with synchronization = OFF with SQLite and still be fully protected by the FS journal in case of a system crash, right? Meaning synchronization = NORMAL doesn't buy me anything; in fact it severely slows the database operations. Am I missing something here?

Shuki
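For reference, the pragma in question is actually spelled `synchronous`, and its level can be inspected and changed per connection; a small sketch from Python's stdlib `sqlite3`:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Numeric readback: 0 = OFF, 1 = NORMAL, 2 = FULL (newer builds add 3 = EXTRA)
print(con.execute("PRAGMA synchronous").fetchone()[0])

con.execute("PRAGMA synchronous = OFF")    # xSync is skipped: fast, least durable
print(con.execute("PRAGMA synchronous").fetchone()[0])   # 0

con.execute("PRAGMA synchronous = FULL")   # xSync at every critical moment
print(con.execute("PRAGMA synchronous").fetchone()[0])   # 2
```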
Re: [sqlite] Can I safely use the pragma synchronization = OFF?
On Sat, Jan 26, 2013 at 6:50 PM, Shuki Sasson <gur.mons...@gmail.com> wrote:
> Hi all, I read the documentation about the synchronization pragma. It
> has got to do with how often the xSync method is called. With
> synchronization = FULL, xSync is called after each and every change to
> the database file, as far as I understand...
> [...]
> I see both xWrite and xSync... Does this mean that xWrite initiates a
> FS write to the file?

Yes, in the sense that a subsequent read, without a power cut on the machine, will return the written data.

> Does that mean that xSync makes sure that the FS-buffered changes are
> synced to disk?

Yes.

> I guess it is calling fsync in the case of Linux/FreeBSD, am I right?

fdatasync(), I think.

> If the above is correct and SQLite operates over a modern reliable FS
> that has journaling with each write, then despite the fact that the
> write buffer caches are not fully synced, they are protected by the FS
> journal, which fully records all the changes to the file and is going
> to be replayed on FS mount after a system crash. If my understanding is
> correct, then assuming the FS journaling is bullet proof, I can safely
> operate with synchronization = OFF with SQLite and still be fully
> protected by the FS journal in case of a system crash, right?

I really doubt journaling filesystems work like that. Yes, your file will be restored using the journal if the journal records made it to disk. But the FS just can't physically write every record of the journal to disk at the moment that record is created. If it did that, your computer would be really slow. And as the FS doesn't do that, fdatasync still makes sense if you want a guarantee that when COMMIT execution has finished it is safe to cut the power or crash.

> Meaning synchronization = NORMAL doesn't buy me anything; in fact it
> severely slows the database operations. Am I missing something here?

Please re-check the documentation on how journaling filesystems work.

Pavel