To answer your questions:

1.  I do not use SQLite from C or any other language; all calls are made from
the command line using the sqlite3 command-line utility, hence no
sqlite3_memory_used() calls. Memory used by the sqlite3 process was obtained
by watching the process in the Windows Task Manager.

2.  I have appended at the bottom of this mail the previous report, with the
table schema and the query that triggers the memory problem (as well as
sqlite3_analyzer results for the database).

3.  [Over the weekend I also tried running with different settings for
journal_mode and synchronous - no change in results. I also ran on a Windows
Server 2012 machine in the cloud, with no other software installed - again no
change in results.]

4.  Thus, I believe, only three elements are involved: the database file
containing one table, the query, and the SQLite 3 command-line utility.

5.  I do not understand the internals of sqlite3. What baffles me is the
different memory-use profile when run with a cache_size of 2,000 pages versus
10,000 pages - constant in the former case, growing linearly in the latter
(a minimal reproduction sketch follows below).

From: drhsql...@gmail.com [mailto:drhsql...@gmail.com] On Behalf Of Richard
Hipp
Sent: Monday, January 28, 2013 6:18 AM
To: ammon_lympha...@hotmail.com; General Discussion of SQLite Database
Cc: Michael Black
Subject: Re: [sqlite] bug report: out of memory error for most operations on
a large database

On Sun, Jan 27, 2013 at 2:57 PM, ammon_lymphater
<ammon_lympha...@hotmail.com> wrote:


I think I have localized the bug - a potential query-memory problem with
larger cache sizes.

Details:
        - created a database and, in a loop of 10 iterations, { imported 1m
records; ran the query }
        - for each iteration, observed the memory use of the sqlite3 process
in Task Manager
        - the minimum use occurred during the load, the maximum during the
query (a sketch of one iteration follows)
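
A sketch of one such iteration, assuming the group-by query quoted later in
this thread; the database and piece file names are hypothetical, since the
actual commands were not included:

    sqlite3 test.db
    .separator ","
    .import piece01.csv LargeData    -- load the next 1m rows (hypothetical file name)
    SELECT Name5, count(*) FROM LargeData GROUP BY Name5 ORDER BY Name5;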

Below are the observed memory sizes (in MB) for three different cache sizes
(page size 32768).

Comments:
        - for 2k pages, the maximum levels off at d (4m rows) at 220 MB, so
the query memory use is 220 - 60 = 160 MB
        - for 10k pages, the maximum grows by 50 MB for each million rows;
query memory is 160 MB initially, but 500 MB for 10m rows
        - for 30k pages, the picture is similar to that of 10k pages

Summary: it looks like, for large cache sizes, the query has some sort of
memory mismanagement. Run against the full table with 500m records, but with
the cache size limited to 2k pages, the same query was fine.
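
(For scale, with the 32 KB page size these settings work out to: 2,000 pages
x 32 KB ~ 64 MB, 10,000 pages ~ 320 MB, and 30,000 pages ~ 960 MB of page
cache - which lines up with the minimum-memory plateaus in the tables below.)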


Have you measured SQLite's memory utilization directly using the
sqlite3_memory_used() routine described at

     http://www.sqlite.org/c3ref/memory_highwater.html 


Do you have any other evidence to suggest that the problem is in SQLite and
not in your application?

(The encoding 'f 66 220' means that during loading of the 6th million rows
(f) the minimum memory reported was 66 MB and the maximum 220 MB.)

cache 2k pages:
a 90       b 64 166   c 64 ...   d 64 219   e 64 220
f 66 220   g 65 220   h 65 220   i 65 220   j 65 220

cache 10k pages:
a 135 187  b 202 372  c 307 469  d 320 521  e 320 571
f 318 622  g 319 672  h 322 722  i 318 772  j 323 824

cache 30k pages:
a 187 270  b .. 373   c 363 559  d 421 746  e 562 932
f 711 1175 g 820 1302 h 974 1354 i 960 1403 j 961 1455



-- 
D. Richard Hipp
d...@sqlite.org

-----Original Message-----
From: ammon_lymphater [mailto:ammon_lympha...@hotmail.com] 
Sent: Friday, January 25, 2013 1:40 PM
To: 'Michael Black'; 'General Discussion of SQLite Database'
Subject: RE: [sqlite] bug report: out of memory error for most operations on
a large database

 

To answer your questions:

 - Disk space is not a problem - there is over 500 GB of free space on that
disk (the database is 80 GB).
 - Memory: 16 GB, with at least 8 GB free at all times during testing.
 - Platform: I moved to Windows Server 2012.

 

I did more testing. I have a large csv file (100 GB / 500m lines). Originally
I imported the full file into the database (using sqlite3 .import) and then
ran the select query, which resulted in the out-of-memory error.

 

Over the last 24 hours I tried the following:

- split the csv file into smaller pieces (10m lines each)

- imported the smaller pieces one by one (either in one sqlite3 session or
one session per piece - same result)

- after each import, ran the query to check whether the out-of-memory
condition had appeared (a sketch of the procedure follows below)
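
One way to drive the one-session-per-piece variant is a small script per
piece, read by a fresh sqlite3 invocation each time. The script and file
names are hypothetical reconstructions; the actual commands were not included
in the report:

    -- check_piece01.sql: load one 10m-line piece, then probe for the error
    .separator ","
    .import piece01.csv LargeData
    SELECT Name5, count(*) FROM LargeData GROUP BY Name5 ORDER BY Name5;

Run it once per piece against the same database file, e.g.:
sqlite3 big.db ".read check_piece01.sql"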

 

The out-of-memory condition:

1. does not affect import - I can still load rows, and count(*) shows they
are loaded;

2. once it occurs, all subsequent select ... group by queries show out of
memory - adding new rows does not change that;

3. is _random_ - I got it after a three-piece import yesterday and after a
two-piece import today (on the same machine, creating a new db each time).

So this is _not_ deterministic, but it shows up over a larger timescale (a
few hours, not minutes).

- It does not depend on the database size - the 20m-line database is only
3 GB.

- In the full database I have tables of 20 and 30 GB with different schemas -
many more columns but far fewer rows.

- It is not caused by malformed data - the data comes from a bcp out (SQL
Server). (I will test this in the next 24h.)

 

Additional observations, from watching the sqlite3 process in Task Manager:

                - during import, memory grows up to 800 MB (the cache size)

                - during select count(*) it goes up to 1.5 GB

                - when starting select ... group by, memory goes up to 1.6 GB
for successful runs, and up to 2 GB for the out-of-memory condition
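
(For scale: with a 32 KB page size, an 800 MB page cache corresponds to a
cache_size of roughly 800 MB / 32 KB = 25,000 pages; the exact cache_size
setting used is not stated here.)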

 

My current suspicion: it is not just the database size, but (mostly?) the
query.

-----Original Message-----
From: Michael Black [mailto:mdblac...@yahoo.com]
Sent: Friday, January 25, 2013 5:41 AM
To: ammon_lympha...@hotmail.com; 'General Discussion of SQLite Database'
Subject: RE: [sqlite] bug report: out of memory error for most operations on
a large database

 

How much free disk space do you have?  Your temp tables might be exceeding
capacity.

 

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org]
On Behalf Of ammon_lymphater
Sent: Thursday, January 24, 2013 1:41 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] bug report: out of memory error for most operations on a
large database

 

Summary: except for select count(*), all operations on a large table (500m
rows / 80 GB) result in an out-of-memory error.

Details:

a. Using SQLite3 ver 3.7.15.2 on Windows 7 64-bit AMD (but the error also
occurs in 3.6.xx and on other Windows platforms).

b. Created a table (schema attached) and imported data from a csv file using
sqlite3 (no problems).

c. "select Name5, count(*) from LargeData group by name5 order by name5"
results in Error: out of memory (the cardinality of Name5 is 12).

d. "sqlite> select count(*) from StorageHourlyBySIDCL2;" gives 587608348, as
it should.

e. The above is independent of cache_size (from 0.3 to 1.5 GB); it happens
for page_size 32 KB and 64 KB (others not tested).

(Personally this is not urgent for me - I am just testing the limits - but it
may be useful to others.)
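
Putting items b through d together, a condensed shell transcript of the
failure mode would look roughly like this (reconstructed from the
descriptions above, not a verbatim capture; the analyzer output below lists
the data under both the LargeData and StorageHourlyBySIDCL2 names):

    sqlite3 import2.db
    SELECT count(*) FROM StorageHourlyBySIDCL2;
    -- 587608348, as expected
    SELECT Name5, count(*) FROM LargeData GROUP BY Name5 ORDER BY Name5;
    -- Error: out of memory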

====   table schema & sqlite3_analyzer output

 

CREATE TABLE largedata (
      name1 smalldatetime
     , Name2 uniqueidentifier
     , Name3 varchar (16)
     , Name4 varchar (8)
     , Name5 varchar (80)
     , Name6 real
     , Name7 real
     , Name8 real
     , Name9 real
     , Name10 real
     , Name11 real
     , Name12 real
     , Name13 real
     , Name14 smallint
     , Name15 tinyint
     , Name16 tinyint
     , Name17 smalldatetime
     , Name18 real
     , Name19 tinyint
);

-- SQLITE3_ANALYZER output

/** Disk-Space Utilization Report For h:\temp\convert\import2.db

Page size in bytes.................... 32768
Pages in the whole file (measured).... 2578119
Pages in the whole file (calculated).. 2578118
Pages that store data................. 2578118    100.000%
Pages on the freelist (per header).... 0            0.0%
Pages on the freelist (calculated).... 1            0.0%
Pages of auto-vacuum overhead......... 0            0.0%
Number of tables in the database...... 11
Number of indices..................... 0
Number of named indices............... 0
Automatically generated indices....... 0
Size of the file in bytes............. 84479803392
Bytes of user payload stored.......... 79293861071  93.9%

*** Page counts for all tables with their indices ********************

LargeData............................. 2578108    100.000%
smalldata............................. 1            0.0%
(zero-length tables omitted)

*** All tables *******************************************************

Percentage of total database.......... 100.000%
Number of entries..................... 587608358
Bytes of storage consumed............. 84479770624
Bytes of payload...................... 79293871126  93.9%
Average payload per entry............. 134.94
Average unused bytes per entry........ 0.32
Average fanout........................ 2716.00
Fragmentation.........................   0.11%
Maximum payload per entry............. 1933
Entries that use overflow............. 0            0.0%
Index pages used...................... 949
Primary pages used.................... 2577169
Overflow pages used................... 0
Total pages used...................... 2578118
Unused bytes on index pages........... 3904523     12.6%
Unused bytes on primary pages......... 184122910    0.22%
Unused bytes on overflow pages........ 0
Unused bytes on all pages............. 188027433    0.22%

--$ removed unused tables

 

*** Table STORAGEHOURLYBYSIDCL2 **************************************

Percentage of total database.......... 100.000%
Number of entries..................... 587608348
Bytes of storage consumed............. 84479442944
Bytes of payload...................... 79293861071  93.9%
Average payload per entry............. 134.94
Average unused bytes per entry........ 0.32
Average fanout........................ 2716.00
Fragmentation.........................   0.11%
Maximum payload per entry............. 183
Entries that use overflow............. 0            0.0%
Index pages used...................... 949
Primary pages used.................... 2577159
Overflow pages used................... 0
Total pages used...................... 2578108
Unused bytes on index pages........... 3904523     12.6%
Unused bytes on primary pages......... 183805515    0.22%
Unused bytes on overflow pages........ 0
Unused bytes on all pages............. 187710038    0.22%


*** Table SQLITE_MASTER **********************************************

Percentage of total database..........   0.0%
Number of entries..................... 10
Bytes of storage consumed............. 32768
Bytes of payload...................... 10055       30.7%
Average payload per entry............. 1005.50
Average unused bytes per entry........ 2255.50
Maximum payload per entry............. 1933
Entries that use overflow............. 0            0.0%
Primary pages used.................... 1
Overflow pages used................... 0
Total pages used...................... 1
Unused bytes on primary pages......... 22555       68.8%
Unused bytes on overflow pages........ 0
Unused bytes on all pages............. 22555       68.8%


*** Definitions ******************************************************

--$ removed definitions

**********************************************************************

The entire text of this report can be sourced into any SQL database
engine for further analysis.  All of the text above is an SQL comment.
The data used to generate this report follows:

*/

 

BEGIN;

CREATE TABLE space_used(
   name clob,        -- Name of a table or index in the database file
   tblname clob,     -- Name of associated table
   is_index boolean, -- TRUE if it is an index, false for a table
   nentry int,       -- Number of entries in the BTree
   leaf_entries int, -- Number of leaf entries
   payload int,      -- Total amount of data stored in this table or index
   ovfl_payload int, -- Total amount of data stored on overflow pages
   ovfl_cnt int,     -- Number of entries that use overflow
   mx_payload int,   -- Maximum payload size
   int_pages int,    -- Number of interior pages used
   leaf_pages int,   -- Number of leaf pages used
   ovfl_pages int,   -- Number of overflow pages used
   int_unused int,   -- Number of unused bytes on interior pages
   leaf_unused int,  -- Number of unused bytes on primary pages
   ovfl_unused int,  -- Number of unused bytes on overflow pages
   gap_cnt int,      -- Number of gaps in the page layout
   compressed_size int  -- Total bytes stored on disk
);

INSERT INTO space_used
VALUES('sqlite_master','sqlite_master',0,10,10,10055,0,0,1933,0,1,0,0,22555,0,0,32768);

--$ removed unused tables

INSERT INTO space_used
VALUES('LargeData','LargeData',0,590185506,587608348,79293861071,0,0,183,949,2577159,0,3904523,183805515,0,2844,84479442944);

COMMIT;


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
