Re: [sqlite] sqlite-analyzer largish db?

2013-01-26 Thread Dan Kennedy

On 01/26/2013 05:06 AM, Petite Abeille wrote:

Hello,

[3.7.15.2 2013-01-09 11:53:05 c0e09560d26f0a6456be9dd3447f5311eb4f238f]
[sqlite-analyzer-osx-x86-3071502]

sqlite3_analyzer seems to, hmmm, overflow or something given a largish db.


Looks like the OS X build on sqlite.org was dynamically linked against
Tcl 8.4.0, which is the version installed by default on OS X 10.5.x.
In that version, the [expr] command sometimes uses 32-bit integer
arithmetic, resulting in an overflow when calculating the bytes of
storage space used by a large db.

Please try the version on sqlite.org now, which is statically linked 
against Tcl 8.6.
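
The arithmetic is easy to check. A minimal C sketch using the figures
from the report below (assuming the usual two's-complement truncation,
which is effectively what a 32-bit [expr] computes):

#include <stdio.h>
#include <stdint.h>

int main(void){
  /* True byte count from the report below; needs 64 bits. */
  int64_t bytes = 8434229248LL;

  /* What 32-bit arithmetic effectively computes: the low 32 bits,
  ** reinterpreted as a signed two's-complement integer. */
  int32_t wrapped = (int32_t)(uint32_t)bytes;

  printf("%lld -> %d\n", (long long)bytes, wrapped);
  /* Prints: 8434229248 -> -155705344, the negative value reported. */
  return 0;
}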


Dan.






For example, for a db of 8,434,233,344 bytes (8.43 GB on disk):

--8<--

*** All tables and indices ***

Percentage of total database.......... 100.000%
Number of entries..................... 143344555
Bytes of storage consumed............. -155705344
Bytes used after compression.......... 8434229248   -5416.8%
Bytes of payload...................... 6972547011   -4478.0%

--8<--

Note the negative 'Bytes of storage consumed' and associated negative 
percentages.

Bug or feature?

Thanks in advance.

Cheers,

PA.




Re: [sqlite] bug report: out of memory error for most operations on a large database

2013-01-26 Thread ammon_lymphater
To answer your question:
 - Disk space is not a problem - there is over 500 GB free on that disk
(database size 80 GB).
 - Memory - 16 GB, with at least 8 GB free at all times during testing.
 - Platform - I moved the platform to Windows Server 2012.

I did more testing. I have a large csv file (100 GB / 500m lines). Originally I
imported the full file into the database (using sqlite3 .import) and then
ran the select query, which resulted in the out-of-memory error.

Over the last 24 hours I tried the following:
 - split the csv file into smaller pieces (10m lines each)
 - import the smaller pieces one by one (either in one sqlite3 session or one
session per piece - same result)
 - after each import, run the query to check whether we have the out-of-memory
condition

The out-of-memory condition:
1. does not affect import - I can still load the rows, and count shows that
they are loaded
2. once it occurs, all subsequent select...group by queries report out of
memory - adding new rows does not change that
3. is _random_ - I got it after a three-piece import yesterday and after a
two-piece import today (on the same machine, creating a new db each time).

So this is _not_ deterministic, but it emerges over a larger timescale (a few
hours rather than minutes).
 - It does not depend on the database size - the 20m-line database is only 3 GB.
 - In the full database I have tables of 20 and 30 GB with different
schemas - many more columns but far fewer rows.
 - It is not caused by malformed data - the data comes from a bcp export (SQL
Server). (I will test it in the next 24h.)

Additional observations, watching the sqlite3 process in Task Manager:
- during import, memory grows up to 800 MB (the cache size)
- during select count(*), it goes up to 1.5 GB
- when starting select ... group by, memory goes up to 1.6 GB
for successful imports, and up to 2 GB for the out-of-memory condition

My current suspicion: this is not just the size, but (mostly?) the query.





-----Original Message-----
From: Michael Black [mailto:mdblac...@yahoo.com] 
Sent: Friday, January 25, 2013 5:41 AM
To: ammon_lympha...@hotmail.com; 'General Discussion of SQLite Database'
Subject: RE: [sqlite] bug report: out of memory error for most operations on
a large database

How much free disk space do you have?  Your temp tables might be exceeding
capacity.

-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of ammon_lymphater
Sent: Thursday, January 24, 2013 1:41 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] bug report: out of memory error for most operations on a
large database

Summary: except for select count(*), all operations on a large table (500m
rows / 80 GB) result in an out-of-memory error

 

Details:

a.  Using SQLite3 ver 3.7.15.2, Windows 7 64-bit AMD (but the error
also occurs in 3.6.xx and on other Windows platforms)

b.  Created a table (schema attached), imported data from a csv file
using sqlite3 (no problems)

c.  select Name5, count(*) from LargeData group by name5 order by name5
results in Error: out of memory (the cardinality of Name5 is 12)

d.  sqlite> select count(*) from StorageHourlyBySIDCL2; gives 587608348,
as it should

e.  The above is independent of cache_size (from 0.3 to 1.5 GB); it happens
for page_size 32 KB and 64 KB (others not tested)

 

(personally not urgent for me - just testing the limits -  but may be useful
for others)

 

---- table schema and sqlite3_analyzer output ----

CREATE TABLE largedata (
   name1 smalldatetime
 , Name2 uniqueidentifier
 , Name3 varchar (16)
 , Name4 varchar (8)
 , Name5 varchar (80)
 , Name6 real
 , Name7 real
 , Name8 real
 , Name9 real
 , Name10 real
 , Name11 real
 , Name12 real
 , Name13 real
 , Name14 smallint
 , Name15 tinyint
 , Name16 tinyint
 , Name17 smalldatetime
 , Name18 real
 , Name19 tinyint
);

 

-- SQLITE3_ANALYZER output

/** Disk-Space Utilization Report For h:\temp\convert\import2.db

Page size in bytes.................... 32768
Pages in the whole file (measured).... 2578119
Pages in the whole file (calculated).. 2578118
Pages that store data................. 2578118      100.000%
Pages on the freelist (per header).... 0            0.0%
Pages on the freelist (calculated).... 1            0.0%
Pages of auto-vacuum overhead......... 0            0.0%
Number of tables in the database...... 11
Number of indices..................... 0
Number of named indices............... 0
Automatically generated indices....... 0
Size of the file in bytes............. 84479803392
Bytes of user payload stored.......... 79293861071  93.9%

*** Page counts for all tables with their indices ***

LargeData............................. 2578108      100.000%
smalldata............................. 1            0.0%
(zero-length tables omitted)

*** All tables ***

Re: [sqlite] using the same sqlite parameter more than once causes premature memory deallocation

2013-01-26 Thread abbood
I fixed it!! You were right! It's not to do with the guts of SQLite.. rather
it's to do with my incorrect SQL statement..

I used MATCH instead of =.. for more details see the Stack Overflow
response I made here: http://stackoverflow.com/a/14534957/766570

A


On Fri, Jan 25, 2013 at 9:18 PM, Roger Binns [via SQLite] 
ml-node+s1065341n66718...@n5.nabble.com wrote:


 On 24/01/13 17:37, abbood wrote:
  Btw I'm curious how you found out about this autorelease thing? Can
  you add more detail about that?

 I used a debugger to set a breakpoint in malloc_error_break, as the message
 says.  That gave a stack trace of an autorelease pool being drained.

 Since most frees happen in the autorelease pool drain it doesn't
 particularly help with telling you which item is the problem.  I asked on
 the valgrind list and got a way of keeping the allocation stack trace
 instead of the free one.  (The next valgrind release will allow you to
 print both.)

   http://article.gmane.org/gmane.comp.debugging.valgrind/12755

 The cause of that original issue in my code was NSInvocation and dealing
 with returned objects.  The cause of the most recent issue was that
 NSData was taking ownership of a buffer passed to it when I didn't want it to.

  I.e. I put a breakpoint in malloc_error_break.. But then I just jump to
  the internals of sqlite.. And basically within the sqlite internals
  it's freeing an allocation that doesn't exist..

 Using valgrind will narrow down the problem.  What you are seeing is the
 consequence of earlier memory errors.

  But I'm not sure if it's a good idea to modify the guts of sqlite.. Is
  it?

 SQLite is *extremely* unlikely to have a bug.  Some other piece of code
 has the bug, and SQLite is the victim.  Remember that virtually every web
 browser on virtually every platform is using SQLite - an error would show
 up for someone.

   http://www.sqlite.org/testing.html


  And then I made the second part use inline variables as opposed to
  bindings.. i.e. NSString stringWithFormat..

 Do remember sqlite3_mprintf for that sort of thing, especially if strings
 are involved, to avoid SQL injection attacks/bugs:

   http://www.sqlite.org/c3ref/mprintf.html
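
 A minimal sketch of the kind of call meant here (the table and column
 names are invented for illustration):

 #include <sqlite3.h>

 /* %q escapes embedded single quotes, so user_text cannot break out
 ** of the SQL string literal. Strings returned by sqlite3_mprintf()
 ** must be released with sqlite3_free(). */
 int insert_note(sqlite3 *db, const char *user_text){
   char *sql = sqlite3_mprintf(
       "INSERT INTO notes(body) VALUES('%q')", user_text);
   int rc = sql ? sqlite3_exec(db, sql, 0, 0, 0) : SQLITE_NOMEM;
   sqlite3_free(sql);
   return rc;
 }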


  So I deleted that row from the dbase and it works fine.. I couldn't
  find anything different on that row..

 Memory allocators typically have buckets for different-sized allocations,
 often in powers of two.  It could be that that row had a string needing 33
 bytes while others needed less, which caused memory to come out of a
 different bucket, which in turn changes where the victim of the actual bug
 shows up.

 Roger







Re: [sqlite] bug report: out of memory error for most operations on a large database

2013-01-26 Thread Clemens Ladisch
ammon_lymphater wrote:
 - during import the memory growth up to 800 mb (the cache size)
 - during select count(*) it goes up to 1.5gb
 - when starting select ... group by the memory it goes up to 1.6 gb
   for successful imports, up to 2gb for out of memory condition

This looks as if all temporary data is stored in memory (of which you
don't have too much in a 32-bit process).

What is the value of PRAGMA temp_store; and the default setting of
TEMP_STORE (PRAGMA compile_options;)?
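
A minimal C sketch of checking both settings and forcing temporary data
to disk (the database file name is taken from the report earlier in the
thread):

#include <stdio.h>
#include <sqlite3.h>

/* Print each result row; used for the PRAGMA queries below. */
static int print_row(void *ctx, int argc, char **argv, char **name){
  (void)ctx; (void)name;
  printf("%s\n", (argc > 0 && argv[0]) ? argv[0] : "NULL");
  return 0;
}

int main(void){
  sqlite3 *db;
  if( sqlite3_open("import2.db", &db)!=SQLITE_OK ) return 1;

  /* Current setting: 0 = default, 1 = file, 2 = memory. */
  sqlite3_exec(db, "PRAGMA temp_store;", print_row, 0, 0);

  /* Compile-time options; look for a TEMP_STORE= line. */
  sqlite3_exec(db, "PRAGMA compile_options;", print_row, 0, 0);

  /* Keep temporary tables and indices on disk, not in memory. */
  sqlite3_exec(db, "PRAGMA temp_store = FILE;", 0, 0, 0);

  sqlite3_close(db);
  return 0;
}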


Regards,
Clemens


Re: [sqlite] using the same sqlite parameter more than once causes premature memory deallocation

2013-01-26 Thread Roger Binns

On 25/01/13 23:45, abbood wrote:
 I fixed it!! You were right! It's not to do with the guts of SQLite..
 rather it's to do with my incorrect SQL statement..

Huh?  There is no SQL statement, valid or not, that can cause memory
errors.  It looks like SQLite is the victim of some other memory
mismanagement in your app and changing the SQL has just changed what code
will fall victim to it.

It is a very good idea to run valgrind before proceeding.

Roger


Re: [sqlite] bug report: out of memory error for most operations on a large database

2013-01-26 Thread Teg
CL This looks as if all temporary data is stored in memory (of which you
CL don't have too much in a 32-bit process.)


Yeah, this. Typically you only have 2 GB of address space to work
with in 32-bit Windows. I've run out of RAM by having
temp_store=memory before. I also have 30-60 GB DB files that I
operate on that don't have any issues like this.






-- 
Best regards,
 Teg                            mailto:t...@djii.com



Re: [sqlite] identify virtual tables?

2013-01-26 Thread Roger Binns

On 25/01/13 12:57, Petite Abeille wrote:
 and   lower( sql ) like '% virtual %'

It is worth pointing out that SQLite normalises the create table text for
a table in the sqlite_master table, i.e. it always begins CREATE VIRTUAL
TABLE (uppercased) no matter how the table was created.  By contrast the
columns are left exactly as they were, whitespace, comments and all.

DDL:

  create /* foo */ vIrTual table foo using fts3(content /* ab */ TEXT);

Corresponding sqlite_master:

  CREATE VIRTUAL TABLE foo using fts3(content /* ab */ TEXT)

Of course this is an implementation detail and not specified anywhere, but
it is good enough.  Your LIKE query has the problem that it will match
rows where 'virtual' appears anywhere in the statement.
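
A prefix match against the normalised text avoids that. A minimal C
sketch (the database file name is invented):

#include <stdio.h>
#include <sqlite3.h>

/* Print each table name returned by the query below. */
static int print_name(void *ctx, int argc, char **argv, char **name){
  (void)ctx; (void)name;
  if( argc > 0 && argv[0] ) printf("%s\n", argv[0]);
  return 0;
}

int main(void){
  sqlite3 *db;
  if( sqlite3_open("test.db", &db)!=SQLITE_OK ) return 1;
  /* Relies on the normalised CREATE VIRTUAL TABLE prefix described
  ** above; LIKE is case-insensitive for ASCII by default. */
  sqlite3_exec(db,
      "SELECT name FROM sqlite_master"
      " WHERE type = 'table'"
      "   AND sql LIKE 'CREATE VIRTUAL TABLE%'",
      print_name, 0, 0);
  sqlite3_close(db);
  return 0;
}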

Roger


Re: [sqlite] SELECT and UPDATE in single query

2013-01-26 Thread Richard Baron Penman
On Fri, Jan 25, 2013 at 1:36 AM, Yongil Jang yongilj...@gmail.com wrote:
 There is a sqlite3_update_hook() function that returns the rowid of a changed
 record.

Sounds promising, but apparently not supported by the Python wrapper.


Re: [sqlite] SELECT and UPDATE in single query

2013-01-26 Thread Keith Medcalf
 There is a sqlite3_update_hook() function that returns the rowid of a changed record.

 Sounds promising, but apparently not supported by the Python wrapper.

Not by the included sqlite3/pysqlite DBAPI wrapper -- though it is supported by
Roger Binns' APSW wrapper.

http://code.google.com/p/apsw/
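
In C the hook itself looks like this (a minimal sketch; the table and
statements are invented):

#include <stdio.h>
#include <sqlite3.h>

/* Called once per row changed by INSERT, UPDATE or DELETE. */
static void on_change(void *ctx, int op, const char *db_name,
                      const char *table, sqlite3_int64 rowid){
  (void)ctx;
  const char *what = (op==SQLITE_INSERT) ? "INSERT"
                   : (op==SQLITE_UPDATE) ? "UPDATE" : "DELETE";
  printf("%s on %s.%s rowid=%lld\n", what, db_name, table,
         (long long)rowid);
}

int main(void){
  sqlite3 *db;
  if( sqlite3_open(":memory:", &db)!=SQLITE_OK ) return 1;
  sqlite3_update_hook(db, on_change, 0);
  sqlite3_exec(db,
      "CREATE TABLE t(x);"
      "INSERT INTO t VALUES(1);"
      "UPDATE t SET x = 2;", 0, 0, 0);
  sqlite3_close(db);
  return 0;
}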

---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org






[sqlite] Can I safely use the pragma synchronization = OFF?

2013-01-26 Thread Shuki Sasson
Hi all, I read the documentation about the synchronization pragma.
It has to do with how often the xSync method is called.
With synchronization = FULL, xSync is called after each and every change to
the database file, as far as I understand...

Observing the VFS interface used by SQLite:

typedef struct sqlite3_io_methods sqlite3_io_methods;
struct sqlite3_io_methods {
  int iVersion;
  int (*xClose)(sqlite3_file*);
  int (*xRead)(sqlite3_file*, void*, int iAmt, sqlite3_int64 iOfst);
  int (*xWrite)(sqlite3_file*, const void*, int iAmt, sqlite3_int64 iOfst);
  int (*xTruncate)(sqlite3_file*, sqlite3_int64 size);
  int (*xSync)(sqlite3_file*, int flags);
  ...

I see both xWrite and xSync...

Does this mean that xWrite initiates an FS write to the file?

Does that mean that xSync makes sure that the FS-buffered changes are
synced to disk?

I guess it is calling fsync in the case of Linux/FreeBSD, am I right?


If the above is correct and SQLite operates over a modern reliable FS
that journals each write, then despite the fact that the
write buffer cache is not fully synced, it is protected by the FS
journal, which fully records all the changes to the file and is
going to be replayed when the FS is mounted after a system crash.

If my understanding is correct, then assuming the FS journaling is
bullet-proof, I can safely operate with synchronization = OFF in
SQLite and still be fully protected by the FS journal in case of a system
crash, right?

Meaning synchronization = NORMAL doesn't buy me anything; in fact it
severely slows the database operations.

Am I missing something here?


Shuki


Re: [sqlite] Can I safely use the pragma synchronization = OFF?

2013-01-26 Thread Pavel Ivanov
On Sat, Jan 26, 2013 at 6:50 PM, Shuki Sasson gur.mons...@gmail.com wrote:

 Hi all, I read the documentation about the synchronization pragma.
 It has to do with how often the xSync method is called.
 With synchronization = FULL, xSync is called after each and every change to
 the database file, as far as I understand...

 Observing the VFS interface used by SQLite:

 typedef struct sqlite3_io_methods sqlite3_io_methods;
 struct sqlite3_io_methods {
   int iVersion;
   int (*xClose)(sqlite3_file*);
   int (*xRead)(sqlite3_file*, void*, int iAmt, sqlite3_int64 iOfst);
   int (*xWrite)(sqlite3_file*, const void*, int iAmt, sqlite3_int64 iOfst);
   int (*xTruncate)(sqlite3_file*, sqlite3_int64 size);
   int (*xSync)(sqlite3_file*, int flags);
   ...

 I see both xWrite and xSync...

 Does this mean that xWrite initiates an FS write to the file?

Yes, in the sense that a subsequent read (without a power cut on the
machine) will return the written data.


 Does that mean that xSync makes sure that the FS-buffered changes are
 synced to disk?

Yes.

 I guess it is calling fsync in the case of Linux/FreeBSD, am I right?

fdatasync() I think.
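
Roughly, a unix xSync boils down to something like this - a simplified
sketch, not the real os_unix.c (which also handles full-sync logic and
directory syncs); the DemoFile struct is invented:

#include <unistd.h>
#include "sqlite3.h"

/* Hypothetical file object: the sqlite3_file base must come first,
** because SQLite hands xSync a sqlite3_file pointer. */
typedef struct DemoFile {
  sqlite3_file base;
  int fd;              /* underlying POSIX file descriptor */
} DemoFile;

/* Simplified xSync: push OS-buffered writes to stable storage.
** SQLITE_SYNC_FULL requests the stronger fsync(); otherwise
** fdatasync(), which may skip some metadata, is enough. */
static int demoSync(sqlite3_file *pFile, int flags){
  DemoFile *p = (DemoFile*)pFile;
  int rc = ((flags & 0x0F)==SQLITE_SYNC_FULL)
               ? fsync(p->fd)
               : fdatasync(p->fd);
  return rc ? SQLITE_IOERR_FSYNC : SQLITE_OK;
}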

 If the above is correct and SQLite operates over a modern reliable FS
 that journals each write, then despite the fact that the
 write buffer cache is not fully synced, it is protected by the FS
 journal, which fully records all the changes to the file and is
 going to be replayed when the FS is mounted after a system crash.

 If my understanding is correct, then assuming the FS journaling is
 bullet-proof, I can safely operate with synchronization = OFF in
 SQLite and still be fully protected by the FS journal in case of a system
 crash, right?

I really doubt journaling filesystems work like that. Yes, your file
will be restored using the journal if the journal records made it to disk.
But the FS just can't physically write every journal record to disk at
the moment of its creation; if it did, your computer would be really
slow. And since the FS doesn't do that, fdatasync still makes sense if
you want to guarantee that once COMMIT has finished, it's safe to cut
the power or crash.

 Meaning synchronization = NORMAL doesn't buy me anything; in fact it
 severely slows the database operations.

 Am I missing something here?

Please re-check the documentation on how journaling filesystems work.
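
The pragma itself (its actual name is synchronous) is cheap to
experiment with; a minimal C sketch (the database file name is
invented):

#include <sqlite3.h>

int main(void){
  sqlite3 *db;
  if( sqlite3_open("test.db", &db)!=SQLITE_OK ) return 1;

  /* OFF    - never call xSync: fastest, but a power cut can corrupt
  **          the database or lose recent commits.
  ** NORMAL - sync only at the most critical moments.
  ** FULL   - sync after every change: the safe default. */
  sqlite3_exec(db, "PRAGMA synchronous = FULL;", 0, 0, 0);

  sqlite3_close(db);
  return 0;
}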


Pavel