Re: [sqlite] Much unused space in sqlite3 database when using blobs
Richard, Paul thanks! You help me so much. Omitting WITHOUT ROWID do the trick. 2014-12-02 3:04 GMT+07:00 Paul Sanderson : > Further to Richards comment the formula to determine when a record > overflows into a page file is given in section 1.5 of the file format > doc > > http://www.sqlite.org/fileformat.html > > The relevant section states: > > The amount of payload that spills onto overflow pages also depends on > the page type. For the following computations, let U be the usable > size of a database page, the total page size less the reserved space > at the end of each page. And let P be the payload size. > > Table B-Tree Leaf Cell: > > If the payload size P is less than or equal to U-35 then the entire > payload is stored on the b-tree leaf page. Let M be > ((U-12)*32/255)-23. If P is greater than U-35 then the number of byte > stored on the b-tree leaf page is the smaller of M+((P-M)%(U-4)) and > U-35. Note that number of bytes stored on the leaf page is never less > than M. > > So in your case with a max blob size of 49870 bytes (which is > U-35) > so as far as I see it all the blobs can fit onto a single page and > acccording to that formula they should be. > > Certainly in my testing I used your schema (with rowids) and wrote a > single blob of size 6 bytes to a DB with a pagesize of 65536 bytes > and the complete blob was stored in one page as expected (and as shown > below). > > http://sandersonforensics.com/pics/largeblobhdr.jpg > > http://sandersonforensics.com/pics/largeblob.jpg > > You have truncated some of the data for DBs with page size > 512 - I > assume you have determined that this is not relevant but without the > info I can't draw any conclusions. I also assume that you started a > new instance of sqlite between tests? > > Perhaps I could suggest that you do a similar test to mine just adding > one (or a small handful) of blobs to the table and see if this sheds > any light on what is going on. > > > > > > > > > Paul > www.sandersonforensics.com > skype: r3scue193 > twitter: @sandersonforens > Tel +44 (0)1326 572786 > http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit > -Forensic Toolkit for SQLite > http://sandersonforensics.com/forum/content.php?168-Reconnoitre - VSC > processing made easy > > > > On 1 December 2014 at 13:13, Richard Hipp wrote: > > On Sun, Nov 30, 2014 at 11:06 PM, Александр Гурьянов < > caiiiy...@gmail.com> > > wrote: > > > >> Hi. Hi. I am writing an app that stores some blobs in sqlite3 database. > My > >> question is about disk space usage. > >> For analyst i use sqlite_analyzer. > >> > >> My database is created with this script: > >> > >> PRAGMA page_size = 512; > >> PRAGMA synchronous = OFF; > >> PRAGMA journal_mode = MEMORY; > >> begin; > >> create table grid_blobs(grid_id integer, blob blob, PRIMARY > KEY(grid_id)) > >> WITHOUT ROWID; > >> > > > > You'll have much better success here if you omit the WITHOUT ROWID. See > > section 4.0 "When to use WITHOUT ROWID" at > > https://www.sqlite.org/withoutrowid.html for additional information. > > > > See also: http://www.sqlite.org/affcase1.html#smaller > > > > > > > > > >> insert into grid_blobs values(0, randomblob(70)); > >> insert into grid_blobs values(1, randomblob(66)); > >> insert into grid_blobs values(2, randomblob(82)); > >> insert into grid_blobs values(3, randomblob(79)); > >> insert into grid_blobs values(4, randomblob(67)); > >> insert into grid_blobs values(5, randomblob(124)); > >> insert into grid_blobs values(6, randomblob(119)); > >> insert into grid_blobs values(7, randomblob(70)); > >> insert into grid_blobs values(8, randomblob(82)); > >> insert into grid_blobs values(9, randomblob(75)); > >> --... (other entries) > >> insert into grid_blobs values(2703, randomblob(698)); > >> commit; > >> > >> There are 2703 entries with blob length from 31 bytes up to 49870 bytes > >> (avg: 2297, median: 332). > >> The output of sqlite_analyzer for this table is: > >> > >> sqlite3 test.db "PRAGMA page_size=512; vacuum;" && > >> ~/Downloads/sqlite3_analyzer test.db > analyze > >> Percentage of total database.. 99.984% > >> Number of entries. 2437 > >> Bytes of storage consumed. 3229184 > >> Bytes of payload.. 2608484 80.8% > >> Average payload per entry. 1070.37 > >> Average unused bytes per entry 235.34 > >> Average fanout 8.00 > >> Non-sequential pages.. 00.0% > >> Maximum payload per entry. 49877 > >> Entries that use overflow. 232195.2% > >> Index pages used.. 35 > >> Primary pages used 268 > >> Overflow pages used... 6004 > >> Total pages used.. 6307 > >> Unused
Re: [sqlite] Much unused space in sqlite3 database when using blobs
Further to Richards comment the formula to determine when a record overflows into a page file is given in section 1.5 of the file format doc http://www.sqlite.org/fileformat.html The relevant section states: The amount of payload that spills onto overflow pages also depends on the page type. For the following computations, let U be the usable size of a database page, the total page size less the reserved space at the end of each page. And let P be the payload size. Table B-Tree Leaf Cell: If the payload size P is less than or equal to U-35 then the entire payload is stored on the b-tree leaf page. Let M be ((U-12)*32/255)-23. If P is greater than U-35 then the number of byte stored on the b-tree leaf page is the smaller of M+((P-M)%(U-4)) and U-35. Note that number of bytes stored on the leaf page is never less than M. So in your case with a max blob size of 49870 bytes (which is > U-35) so as far as I see it all the blobs can fit onto a single page and acccording to that formula they should be. Certainly in my testing I used your schema (with rowids) and wrote a single blob of size 6 bytes to a DB with a pagesize of 65536 bytes and the complete blob was stored in one page as expected (and as shown below). http://sandersonforensics.com/pics/largeblobhdr.jpg http://sandersonforensics.com/pics/largeblob.jpg You have truncated some of the data for DBs with page size > 512 - I assume you have determined that this is not relevant but without the info I can't draw any conclusions. I also assume that you started a new instance of sqlite between tests? Perhaps I could suggest that you do a similar test to mine just adding one (or a small handful) of blobs to the table and see if this sheds any light on what is going on. Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite http://sandersonforensics.com/forum/content.php?168-Reconnoitre - VSC processing made easy On 1 December 2014 at 13:13, Richard Hipp wrote: > On Sun, Nov 30, 2014 at 11:06 PM, Александр Гурьянов > wrote: > >> Hi. Hi. I am writing an app that stores some blobs in sqlite3 database. My >> question is about disk space usage. >> For analyst i use sqlite_analyzer. >> >> My database is created with this script: >> >> PRAGMA page_size = 512; >> PRAGMA synchronous = OFF; >> PRAGMA journal_mode = MEMORY; >> begin; >> create table grid_blobs(grid_id integer, blob blob, PRIMARY KEY(grid_id)) >> WITHOUT ROWID; >> > > You'll have much better success here if you omit the WITHOUT ROWID. See > section 4.0 "When to use WITHOUT ROWID" at > https://www.sqlite.org/withoutrowid.html for additional information. > > See also: http://www.sqlite.org/affcase1.html#smaller > > > > >> insert into grid_blobs values(0, randomblob(70)); >> insert into grid_blobs values(1, randomblob(66)); >> insert into grid_blobs values(2, randomblob(82)); >> insert into grid_blobs values(3, randomblob(79)); >> insert into grid_blobs values(4, randomblob(67)); >> insert into grid_blobs values(5, randomblob(124)); >> insert into grid_blobs values(6, randomblob(119)); >> insert into grid_blobs values(7, randomblob(70)); >> insert into grid_blobs values(8, randomblob(82)); >> insert into grid_blobs values(9, randomblob(75)); >> --... (other entries) >> insert into grid_blobs values(2703, randomblob(698)); >> commit; >> >> There are 2703 entries with blob length from 31 bytes up to 49870 bytes >> (avg: 2297, median: 332). >> The output of sqlite_analyzer for this table is: >> >> sqlite3 test.db "PRAGMA page_size=512; vacuum;" && >> ~/Downloads/sqlite3_analyzer test.db > analyze >> Percentage of total database.. 99.984% >> Number of entries. 2437 >> Bytes of storage consumed. 3229184 >> Bytes of payload.. 2608484 80.8% >> Average payload per entry. 1070.37 >> Average unused bytes per entry 235.34 >> Average fanout 8.00 >> Non-sequential pages.. 00.0% >> Maximum payload per entry. 49877 >> Entries that use overflow. 232195.2% >> Index pages used.. 35 >> Primary pages used 268 >> Overflow pages used... 6004 >> Total pages used.. 6307 >> Unused bytes on index pages... 1774 9.9% >> Unused bytes on primary pages. 8836 6.4% >> Unused bytes on overflow pages 562916 18.3% >> Unused bytes on all pages. 573526 17.8% >> >> sqlite3 test.db "PRAGMA page_size=1024; vacuum;" && >> ~/Downloads/sqlite3_analyzer test.db > ana
Re: [sqlite] Much unused space in sqlite3 database when using blobs
On Sun, Nov 30, 2014 at 11:06 PM, Александр Гурьянов wrote: > Hi. Hi. I am writing an app that stores some blobs in sqlite3 database. My > question is about disk space usage. > For analyst i use sqlite_analyzer. > > My database is created with this script: > > PRAGMA page_size = 512; > PRAGMA synchronous = OFF; > PRAGMA journal_mode = MEMORY; > begin; > create table grid_blobs(grid_id integer, blob blob, PRIMARY KEY(grid_id)) > WITHOUT ROWID; > You'll have much better success here if you omit the WITHOUT ROWID. See section 4.0 "When to use WITHOUT ROWID" at https://www.sqlite.org/withoutrowid.html for additional information. See also: http://www.sqlite.org/affcase1.html#smaller > insert into grid_blobs values(0, randomblob(70)); > insert into grid_blobs values(1, randomblob(66)); > insert into grid_blobs values(2, randomblob(82)); > insert into grid_blobs values(3, randomblob(79)); > insert into grid_blobs values(4, randomblob(67)); > insert into grid_blobs values(5, randomblob(124)); > insert into grid_blobs values(6, randomblob(119)); > insert into grid_blobs values(7, randomblob(70)); > insert into grid_blobs values(8, randomblob(82)); > insert into grid_blobs values(9, randomblob(75)); > --... (other entries) > insert into grid_blobs values(2703, randomblob(698)); > commit; > > There are 2703 entries with blob length from 31 bytes up to 49870 bytes > (avg: 2297, median: 332). > The output of sqlite_analyzer for this table is: > > sqlite3 test.db "PRAGMA page_size=512; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Percentage of total database.. 99.984% > Number of entries. 2437 > Bytes of storage consumed. 3229184 > Bytes of payload.. 2608484 80.8% > Average payload per entry. 1070.37 > Average unused bytes per entry 235.34 > Average fanout 8.00 > Non-sequential pages.. 00.0% > Maximum payload per entry. 49877 > Entries that use overflow. 232195.2% > Index pages used.. 35 > Primary pages used 268 > Overflow pages used... 6004 > Total pages used.. 6307 > Unused bytes on index pages... 1774 9.9% > Unused bytes on primary pages. 8836 6.4% > Unused bytes on overflow pages 562916 18.3% > Unused bytes on all pages. 573526 17.8% > > sqlite3 test.db "PRAGMA page_size=1024; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Unused bytes on index pages... 467910.6% > Unused bytes on primary pages. 172335.7% > Unused bytes on overflow pages 933114 28.7% > Unused bytes on all pages. 955026 26.5% > > sqlite3 test.db "PRAGMA page_size=2048; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Unused bytes on index pages... 13738 14.6% > Unused bytes on primary pages. 378826.6% > Unused bytes on overflow pages 1184110 37.0% > Unused bytes on all pages. 1235730 31.9% > > sqlite3 test.db "PRAGMA page_size=4096; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Unused bytes on index pages... 16017 12.6% > Unused bytes on primary pages. 576456.3% > Unused bytes on overflow pages 126 43.2% > Unused bytes on all pages. 1338106 33.7% > > sqlite3 test.db "PRAGMA page_size=8192; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Unused bytes on index pages... 31248 18.2% > Unused bytes on primary pages. 700635.5% > Unused bytes on overflow pages 1535111 54.6% > Unused bytes on all pages. 1636422 38.4% > > sqlite3 test.db "PRAGMA page_size=16384; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Unused bytes on index pages... 52549 24.7% > Unused bytes on primary pages. 889745.1% > Unused bytes on overflow pages 1202291 59.7% > Unused bytes on all pages. 1343814 33.9% > > sqlite3 test.db "PRAGMA page_size=32768; vacuum;" && > ~/Downloads/sqlite3_analyzer test.db > analyze > Unused bytes on index pages... 36957 18.8% > Unused bytes on primary pages. 104610 5.0% > Unused bytes on overflow pages 1104895 70.2% > Unused bytes on al
[sqlite] Much unused space in sqlite3 database when using blobs
Hi. Hi. I am writing an app that stores some blobs in sqlite3 database. My question is about disk space usage. For analyst i use sqlite_analyzer. My database is created with this script: PRAGMA page_size = 512; PRAGMA synchronous = OFF; PRAGMA journal_mode = MEMORY; begin; create table grid_blobs(grid_id integer, blob blob, PRIMARY KEY(grid_id)) WITHOUT ROWID; insert into grid_blobs values(0, randomblob(70)); insert into grid_blobs values(1, randomblob(66)); insert into grid_blobs values(2, randomblob(82)); insert into grid_blobs values(3, randomblob(79)); insert into grid_blobs values(4, randomblob(67)); insert into grid_blobs values(5, randomblob(124)); insert into grid_blobs values(6, randomblob(119)); insert into grid_blobs values(7, randomblob(70)); insert into grid_blobs values(8, randomblob(82)); insert into grid_blobs values(9, randomblob(75)); --... (other entries) insert into grid_blobs values(2703, randomblob(698)); commit; There are 2703 entries with blob length from 31 bytes up to 49870 bytes (avg: 2297, median: 332). The output of sqlite_analyzer for this table is: sqlite3 test.db "PRAGMA page_size=512; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Percentage of total database.. 99.984% Number of entries. 2437 Bytes of storage consumed. 3229184 Bytes of payload.. 2608484 80.8% Average payload per entry. 1070.37 Average unused bytes per entry 235.34 Average fanout 8.00 Non-sequential pages.. 00.0% Maximum payload per entry. 49877 Entries that use overflow. 232195.2% Index pages used.. 35 Primary pages used 268 Overflow pages used... 6004 Total pages used.. 6307 Unused bytes on index pages... 1774 9.9% Unused bytes on primary pages. 8836 6.4% Unused bytes on overflow pages 562916 18.3% Unused bytes on all pages. 573526 17.8% sqlite3 test.db "PRAGMA page_size=1024; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 467910.6% Unused bytes on primary pages. 172335.7% Unused bytes on overflow pages 933114 28.7% Unused bytes on all pages. 955026 26.5% sqlite3 test.db "PRAGMA page_size=2048; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 13738 14.6% Unused bytes on primary pages. 378826.6% Unused bytes on overflow pages 1184110 37.0% Unused bytes on all pages. 1235730 31.9% sqlite3 test.db "PRAGMA page_size=4096; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 16017 12.6% Unused bytes on primary pages. 576456.3% Unused bytes on overflow pages 126 43.2% Unused bytes on all pages. 1338106 33.7% sqlite3 test.db "PRAGMA page_size=8192; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 31248 18.2% Unused bytes on primary pages. 700635.5% Unused bytes on overflow pages 1535111 54.6% Unused bytes on all pages. 1636422 38.4% sqlite3 test.db "PRAGMA page_size=16384; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 52549 24.7% Unused bytes on primary pages. 889745.1% Unused bytes on overflow pages 1202291 59.7% Unused bytes on all pages. 1343814 33.9% sqlite3 test.db "PRAGMA page_size=32768; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 36957 18.8% Unused bytes on primary pages. 104610 5.0% Unused bytes on overflow pages 1104895 70.2% Unused bytes on all pages. 1246462 32.2% sqlite3 test.db "PRAGMA page_size=65536; vacuum;" && ~/Downloads/sqlite3_analyzer test.db > analyze Unused bytes on index pages... 115861 44.2% Unused bytes on primary pages. 179123 7.2% Unused bytes on overflow pages 427834 72.5% Unused bytes on all pages. 722818 21.6% As you can see there are initially 18% of unused space i
[sqlite] undesirable output from ~/.sqliterc processing
Version: 3.8.7.1 2014-10-29 13:59:56 3b7b72c4685aa5cf5e675c2c47ebec10d9704221 I added a single line to my .sqliterc file to load pcre like so: .load /usr/lib/sqlite3/pcre.so Now there's annoying output on ever run: $ sqlite3 test.db .schema -- Loading resources from /home/zen/.sqliterc CREATE TABLE foobar ( ... The 'Loading' part is on stderr which is OK, but still annoying. However, that extra 2nd line is on stdout which is BAD. I believe the culprit is `process_sqliterc()` when it calls `process_input()` will keep a `stdin_is_interactive = 1` setting and `process_input()` thinking it's reading from the user will emit a newline for some reason or another. I did something like this to quash these annoying things: static int process_sqliterc( ... int original_stdin_is_interactive = stdin_is_interactive; ... if( in ){ /* XXX ugh noisy if( stdin_is_interactive ){ fprintf(stderr,"-- Loading resources from %s\n",sqliterc); } */ stdin_is_interactive = 0; rc = process_input(p,in); stdin_is_interactive = original_stdin_is_interactive; fclose(in); } ... I do think the spurious output to stdout is something that shouldn't happen. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users