On 28.11.2008, at 13:37, Simon Bulman wrote:

> Ahhh, sorry, I wrongly calculated the number of rows in table 2. It
> actually has 29581 rows. I'm still surprised at the 7x size increase,
> but perhaps your calculation does not account for the overheads?

I still can't reproduce your database sizes - creating 3000 rows in
table 1 and 30000 in table 2, I end up with a 745 kB database file.
Even with 30000 rows in table 1, I end up with only a 2.9 MB file...

And the overhead I was talking about is overhead that SQLite maintains
for each row in every database, so that should be the same on your end
and my end.

Unless your row counts are much, much higher, I suspect that either
(1) other tables contribute the major part of the database file size, or
(2) your database has lots of free space and needs to be vacuumed.

You could try running sqlite3_analyzer on your database to see the
space used by each table, the amount of free space, etc.
Are you using a stock sqlite3 installation, or did you modify/customize
it through build settings?
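If sqlite3_analyzer isn't handy, a rough sketch in Python can at least show how much of the file sits on the free list (the database path is an assumption, not from this thread):

```python
import sqlite3

# Hypothetical free-space check: a large freelist_count relative to
# page_count suggests the file needs a VACUUM to shrink.
conn = sqlite3.connect("model.db")  # path is an assumption
page_count = conn.execute("PRAGMA page_count").fetchone()[0]
free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
print(f"file uses {page_count * page_size} bytes, "
      f"{free_pages * page_size} of them on the free list")
conn.close()
```

This won't break the size down per table the way sqlite3_analyzer does, but it's enough to tell whether free space explains the bloat.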

</jum>


>
> -----Original Message-----
> From: Jens Miltner [mailto:[EMAIL PROTECTED]
> Sent: 28 November 2008 08:38
> To: [EMAIL PROTECTED]
> Cc: General Discussion of SQLite Database
> Subject: Re: [sqlite] Database file size
>
>
> Am 28.11.2008 um 09:20 schrieb Simon Bulman:
>
>> Hi Jens,
>>
>> Thanks for your input. UTF-8 did not make a difference. I expected
>> that the SQLite file would be larger on disk than our proprietary
>> format because of the overheads that you mention - I am surprised,
>> however, that it is at least 7x larger.
>
> To be honest - given your table definitions below, I'm surprised the
> database is _that_ large, too:
>
> Table 1 - according to your definition - should contain at most about
> 50 bytes of pure data per row (plus the overhead needed by SQLite).
> Table 2 would only contain ~ 16 bytes of data per row.
>
> Dividing the database disk size by the total number of rows you
> mentioned would indicate a whopping 8 kB per row.
>
> I did a quick test and created a schema similar to what you outlined
> and filled it with data (the same number of rows you mentioned and 28
> and ~20 characters per row for the two varchar columns) and my
> database ended up being 71kB in size instead of the 11.8 MB you saw...
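For reference, the quick test mentioned above would look roughly like this - table and column names are made up for illustration, since Simon's actual DDL isn't in the thread:

```python
import os
import sqlite3

# Rough reconstruction of the quick size test: same row counts as the
# original mail, ~28 and ~20 characters for the two varchar columns.
path = "size_test.db"
if os.path.exists(path):
    os.remove(path)
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t1 (id BIGINT, name VARCHAR(30), unit VARCHAR(10))")
conn.execute("CREATE TABLE t2 (id BIGINT, value FLOAT)")
conn.execute("CREATE INDEX t1_id ON t1(id)")
conn.execute("CREATE INDEX t2_id ON t2(id)")
with conn:
    conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)",
                     ((i, "x" * 28, "y" * 10) for i in range(1165)))
    conn.executemany("INSERT INTO t2 VALUES (?, ?)",
                     ((i, i * 0.5) for i in range(323)))
conn.close()
print(os.path.getsize(path), "bytes")  # well under 1 MB, nowhere near 11.8 MB
size = os.path.getsize(path)
```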
>
> Are there any other tables that contain non-negligible amounts of
> data?
> Are the data sizes indeed what's indicated in the schema? (Since
> SQLite doesn't really enforce varchar size constraints, you can
> actually put any amount of data into a varchar(30) column.)
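To illustrate that last point, a minimal sketch (the table and column names are invented for the demo):

```python
import sqlite3

# SQLite gives VARCHAR(30) TEXT affinity; the declared length limit is
# not enforced, so an oversized string is stored in full.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name VARCHAR(30))")
big = "x" * 10000  # far beyond the declared 30-character limit
conn.execute("INSERT INTO demo VALUES (?)", (big,))
stored = conn.execute("SELECT length(name) FROM demo").fetchone()[0]
print(stored)  # 10000 - the full string was stored
```

So if the application ever writes long strings into those columns, they all end up in the file, schema limits notwithstanding.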
>
> </jum>
>
>
>>
>> I am actually recreating the whole database (delete file and
>> recreate) programmatically, so vacuuming has no effect.
>>
>> Cheers,
>> S.
>>
>> -----Original Message-----
>> From: Jens Miltner [mailto:[EMAIL PROTECTED]
>> Sent: 27 November 2008 13:48
>> To: General Discussion of SQLite Database
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: [sqlite] Database file size
>>
>>
>> Am 27.11.2008 um 09:12 schrieb Simon Bulman:
>>
>>> I have been playing around with SQLite to use as an alternative to
>>> one of
>>> our proprietary file formats used to read large amounts of data. Our
>>> proprietary format performs very badly, i.e. it takes a long time to
>>> load some data; as expected, SQLite is lightning quick in comparison
>>> - great!
>>>
>>> One considerable stumbling block is the footprint (size) of the
>>> database
>>> file on disk. It turns out that SQLite is roughly 7x larger than our
>>> proprietary format - this is prohibitive. The data is pretty simple
>>> really,
>>> 2 tables
>>>
>>> Table 1
>>>
>>> BIGINT (index),  VARCHAR(30), VARCHAR(10)
>>>
>>>
>>> Table 2
>>>
>>> BIGINT (index), FLOAT
>>>
>>>
>>> For a particular data set, Table 1 has 1165 rows and Table 2 has 323
>>> rows; however, typically Table 2 becomes bigger for larger models.
>>> The size on disk of this file is 11.8 MB (compared to 1.7 MB for our
>>> proprietary format). I have noticed that if I drop the indexes the
>>> size drops dramatically - however, the query performance then
>>> suffers to an unacceptable level.
>>>
>>> For a larger model the DB footprint is 2.2 GB, compared to 267 MB
>>> for the proprietary format.
>>>
>>> Does anybody have any comments on this? Are there any configuration
>>> options
>>> or ideas I could use to reduce the footprint of the db file?
>>
>>
>> I don't think you'll be able to make SQLite as efficient (regarding
>> storage size) as a custom file format, because it has to have some
>> overhead for indexes, etc.
>>
>> However, one thing that comes to mind is the way string data is
>> stored:
>> If you're concerned about disk space and your string data is mostly
>> ASCII, make sure your strings are stored as UTF-8 - for ASCII string
>> data, this will save you one byte per character in the string data
>> storage.
>> To enforce UTF-8 string storage, execute "PRAGMA encoding='UTF-8'" as
>> the first command when creating the database (before you create any
>> tables).
>> You can query the format using "PRAGMA encoding" - UTF-16 encodings
>> will store two bytes per character, regardless of the actual
>> characters...
>>
>> Note that this doesn't mean your database size will shrink to half  
>> the
>> size - it merely means you'll be able to fit more rows onto a single
>> page, thus eventually you should see a decrease in file size when
>> comparing UTF-16 vs. UTF-8 databases.
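As an inline illustration of the encoding effect (a sketch with invented file names, not from the original mail):

```python
import os
import sqlite3

# Build two databases holding the same ASCII rows, one stored as UTF-8
# and one as UTF-16. PRAGMA encoding must run before any table exists.
def build(path, encoding):
    if os.path.exists(path):
        os.remove(path)
    conn = sqlite3.connect(path)
    conn.execute(f"PRAGMA encoding='{encoding}'")
    conn.execute("CREATE TABLE t (s TEXT)")
    with conn:
        conn.executemany("INSERT INTO t VALUES (?)",
                         (("x" * 100,) for _ in range(5000)))
    conn.close()
    return os.path.getsize(path)

utf8_size = build("utf8.db", "UTF-8")
utf16_size = build("utf16.db", "UTF-16le")
print(utf8_size, utf16_size)  # the UTF-16 file is noticeably larger
```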
>>
>> BTW: are you aware that SQLite databases won't shrink by themselves?
>> You'll have to vacuum them to reclaim unused space (see
>> <http://www.sqlite.org/faq.html#q12>).
>>
>> HTH,
>> </jum>
>>
>

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
