On Nov 28, 2008, at 7:37 AM, Simon Bulman wrote:

> Ahhh, sorry, I wrongly calculated the number of rows in table 2. It
> actually has 29581 rows. Still surprised at the 7x size increase, but
> perhaps your estimate does not account for the overheads?
The overhead should typically be about 20% or less, not 7x. Something else is going wrong. Run sqlite3_analyzer.exe and post the results somewhere that we can read them. I suspect the output of sqlite3_analyzer.exe will give us the clues we need to figure out what is amiss.

> Sorry for the misleading info,
>
> Cheers,
> S.
>
> -----Original Message-----
> From: Jens Miltner [mailto:[EMAIL PROTECTED]
> Sent: 28 November 2008 08:38
> To: [EMAIL PROTECTED]
> Cc: General Discussion of SQLite Database
> Subject: Re: [sqlite] Database file size
>
> Am 28.11.2008 um 09:20 schrieb Simon Bulman:
>
>> Hi Jens,
>>
>> Thanks for your input. UTF-8 did not make a difference. I expected
>> that the SQLite file would be larger on disk than our proprietary
>> format because of the overheads that you mention - I am surprised,
>> however, that it is at least 7x larger.
>
> To be honest - given your table definitions below, I'm surprised the
> database is _that_ large, too:
>
> Table 1 - according to your definition - should contain at most about
> 50 bytes of pure data per row (plus the overhead needed by SQLite).
> Table 2 would only contain ~16 bytes of data per row.
>
> Dividing the database disk size by the total number of rows you
> mentioned would indicate a whopping 8k per row.
>
> I did a quick test: I created a schema similar to what you outlined
> and filled it with data (the same number of rows you mentioned, with
> 28 and ~20 characters per row for the two varchar columns), and my
> database ended up being 71 kB in size instead of the 11.8 MB you
> saw...
>
> Are there any other tables that contain non-negligible amounts of
> data? Are the data sizes indeed what's indicated in the schema?
> (Since SQLite doesn't really care about the varchar size constraints,
> you can actually put any amount of data into a varchar(30) column.)
>
> </jum>
>
>> I am actually recreating the whole database (delete file and
>> recreate) programmatically, so vacuuming has no effect.
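If sqlite3_analyzer.exe is not at hand, a rough first look at where the space goes can be had from SQLite's built-in PRAGMAs. The sketch below (file and table names are made up for illustration; sqlite3_analyzer gives a far more detailed per-table report) shows how to compare the on-disk size with `page_size * page_count` and check for unused freelist pages:

```python
# Rough footprint check using only built-in PRAGMAs.
# "test.db", table "t1" and the row contents are illustrative.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, name VARCHAR(30))")
con.executemany("INSERT INTO t1 (name) VALUES (?)",
                [("row%d" % i,) for i in range(1000)])
con.commit()

page_size = con.execute("PRAGMA page_size").fetchone()[0]
page_count = con.execute("PRAGMA page_count").fetchone()[0]
freelist = con.execute("PRAGMA freelist_count").fetchone()[0]
con.close()

file_size = os.path.getsize(path)
print("file size on disk  :", file_size)
print("page_size * pages  :", page_size * page_count)
print("free (unused) pages:", freelist)
```

A large `freelist_count`, or a big gap between data volume and `page_size * page_count`, is the kind of anomaly the analyzer output would pin down precisely.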
>> Cheers,
>> S.
>>
>> -----Original Message-----
>> From: Jens Miltner [mailto:[EMAIL PROTECTED]
>> Sent: 27 November 2008 13:48
>> To: General Discussion of SQLite Database
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: [sqlite] Database file size
>>
>> Am 27.11.2008 um 09:12 schrieb Simon Bulman:
>>
>>> I have been playing around with SQLite as an alternative to one of
>>> our proprietary file formats used to read large amounts of data.
>>> Our proprietary format performs very badly, i.e. takes a long time
>>> to load some data; as expected, SQLite is lightning quick in
>>> comparison - great!
>>>
>>> One considerable stumbling block is the footprint (size) of the
>>> database file on disk. It turns out that the SQLite file is roughly
>>> 7x larger than our proprietary format - this is prohibitive. The
>>> data is pretty simple really, 2 tables:
>>>
>>> Table 1
>>>   BIGINT (indexed), VARCHAR(30), VARCHAR(10)
>>>
>>> Table 2
>>>   BIGINT (indexed), FLOAT
>>>
>>> For a particular data set, Table 1 has 1165 rows and Table 2 has
>>> 323 rows; however, Table 2 typically becomes bigger for larger
>>> models. The size on disk of this file is 11.8 MB (compared to
>>> 1.7 MB for our proprietary format). I have noticed that if I drop
>>> the indexes the size drops dramatically - however, the query
>>> performance suffers to an unacceptable level.
>>>
>>> For a larger model the DB footprint is 2.2 GB compared to 267 MB
>>> for the proprietary format.
>>>
>>> Does anybody have any comments on this? Are there any configuration
>>> options or ideas I could use to reduce the footprint of the db
>>> file?
>>
>> I don't think you'll be able to make SQLite as efficient (regarding
>> storage size) as a custom file format, because it has to have some
>> overhead for indexes, etc.
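The quick test Jens describes can be reproduced along these lines. The table and column names, index definitions, and filler strings are invented for illustration (the original post did not give the full schema); with the posted row counts the file ends up orders of magnitude smaller than 11.8 MB:

```python
# Sketch of Jens's quick test: a schema resembling the one posted,
# filled with the row counts from the original mail.
# Names (table1, table2, name, unit, value) are hypothetical.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "model.db")
con = sqlite3.connect(path)
con.executescript("""
    CREATE TABLE table1 (id BIGINT, name VARCHAR(30), unit VARCHAR(10));
    CREATE INDEX table1_id ON table1(id);
    CREATE TABLE table2 (id BIGINT, value FLOAT);
    CREATE INDEX table2_id ON table2(id);
""")
con.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                [(i, "n" * 28, "u" * 10) for i in range(1165)])
con.executemany("INSERT INTO table2 VALUES (?, ?)",
                [(i, i * 0.5) for i in range(323)])
con.commit()
con.close()

size = os.path.getsize(path)
print("database size: %d bytes (~%.0f kB)" % (size, size / 1024.0))
```

If the same schema and row counts produce a multi-megabyte file on the poster's machine, something beyond per-row overhead (e.g. oversized blobs in the varchar columns, or other tables) must be involved.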
>> However, one thing that comes to mind is the way string data is
>> stored: if you're concerned about disk space and your string data is
>> mostly ASCII, make sure your strings are stored as UTF-8 - for ASCII
>> string data, this will save you one byte per character in the string
>> data storage.
>> To enforce UTF-8 string storage, execute "PRAGMA encoding='UTF-8'"
>> as the first command when creating the database (before you create
>> any tables). You can query the format using "PRAGMA encoding" -
>> UTF-16 encodings will store two bytes per character, regardless of
>> the actual characters...
>>
>> Note that this doesn't mean your database size will shrink to half
>> the size - it merely means you'll be able to fit more rows onto a
>> single page, so eventually you should see a decrease in file size
>> when comparing UTF-16 vs. UTF-8 databases.
>>
>> BTW: are you aware that SQLite databases won't shrink by themselves?
>> You'll have to vacuum them to reclaim unused space (see
>> <http://www.sqlite.org/faq.html#q12>)
>>
>> HTH,
>> </jum>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

D. Richard Hipp
[EMAIL PROTECTED]
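Both suggestions from the quoted mail - setting the encoding before any tables exist, and vacuuming to reclaim space - can be sketched as follows (the file name and table are illustrative; note that UTF-8 is already SQLite's default encoding, so the pragma only matters if something would otherwise create the database as UTF-16):

```python
# Sketch: "PRAGMA encoding" must be issued before any tables are
# created, and VACUUM reclaims space after large deletes.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "utf8.db")
con = sqlite3.connect(path)
con.execute("PRAGMA encoding = 'UTF-8'")  # first command, before any DDL
con.execute("CREATE TABLE t (s VARCHAR(30))")
con.executemany("INSERT INTO t VALUES (?)",
                [("x" * 30,) for _ in range(5000)])
con.commit()

enc = con.execute("PRAGMA encoding").fetchone()[0]
print(enc)  # -> UTF-8

before = os.path.getsize(path)
con.execute("DELETE FROM t")  # pages go to the freelist, file keeps its size
con.commit()
con.execute("VACUUM")         # rebuilds the file, returning space to the OS
con.close()

after = os.path.getsize(path)
print(after < before)  # True: the file shrank after VACUUM
```

As the quoted mail notes, simply deleting rows (or recreating content) never shrinks the file by itself; only VACUUM, or deleting and recreating the whole file, does.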