Re: [sqlite] Page Size question

2019-04-18 Thread Richard Hipp
On 4/18/19, Stephen Chrzanowski wrote:
> I'm wondering
> if there is going to be any optimization internally from SQLite by making
> the page size, say, 180k?

SQLite handles storing 200KB blobs just fine.  It has been doing so
routinely for time out of mind.  Do not stress over the details of how
it does so - that is a premature optimization.  Just be assured that
it works.

Once you get your application up and working, if you want to optimize
at that point, simply construct some test databases and manually
change the page size to each of the eight different allowed SQLite
page sizes (512, 1024, 2048, 4096, 8192, 16384, 32768, and 65536) and
check to see the storage efficiency and performance at each page size.
Then use whichever one works best.  They should all work correctly -
it is only a question of performance and size.  And even then, the
difference will be marginal.
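
A minimal sketch of that experiment in Python, using the bundled
sqlite3 module.  The file names, payload, and row count are
placeholders, not anything from this thread:

    import os
    import sqlite3

    BLOB = os.urandom(174_848)  # one sample payload, sized like the real files

    for page_size in (512, 1024, 2048, 4096, 8192, 16384, 32768, 65536):
        path = "test_%d.db" % page_size
        if os.path.exists(path):
            os.remove(path)
        con = sqlite3.connect(path)
        # page_size must be set before the database is first written;
        # note the maximum is 65536 bytes, so a ~175KB blob always
        # spans overflow pages regardless of this setting
        con.execute("PRAGMA page_size = %d" % page_size)
        con.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, data BLOB)")
        with con:
            con.executemany("INSERT INTO t(data) VALUES (?)",
                            [(BLOB,) for _ in range(100)])
        con.close()
        print(page_size, os.path.getsize(path))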

If you are concerned about unused space in the file, run the
"sqlite3_analyzer.exe" utility on a database to get a detailed
analysis of the unused space in the database file.
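
For example, assuming a database file named mydb.db, the report can be
captured to a text file:

    sqlite3_analyzer.exe mydb.db > analysis.txt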

Typically, storing blobs in an SQLite database will use *less* space
than storing those same blobs as separate files in the filesystem.
See https://www.sqlite.org/fasterthanfs.html for further analysis.

-- 
D. Richard Hipp
d...@sqlite.org


[sqlite] Page Size question

2019-04-18 Thread Stephen Chrzanowski
When I get my act together and get un-lazy about it, I want to take a large
set of "small" files and put them into a SQLite database as blobs in a
table, then have my app run some kind of integrity check
(CRC32/MD5SUM/SHA1/custom/whatever; I haven't decided between a checksum
and byte-for-byte comparisons) to identify byte-exact duplicate files.  I'm
not so much concerned about the physical size of the database file, or
about how the page size can cross 4k-sized cluster boundaries on a disk
drive, as about a possible optimization of the page size within the
database for reading and writing the data from and to the database file.
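
A minimal sketch of that duplicate-detection pass in Python, assuming a
hypothetical table named "files"; SHA-256 stands in for whichever
checksum gets chosen:

    import hashlib
    import sqlite3

    con = sqlite3.connect("files.db")
    con.execute("""CREATE TABLE IF NOT EXISTS files(
                       id   INTEGER PRIMARY KEY AUTOINCREMENT,
                       hash TEXT,
                       data BLOB)""")

    def insert_file(path):
        # store the blob along with its digest so duplicates can be
        # found with a GROUP BY instead of pairwise comparisons
        with open(path, "rb") as f:
            blob = f.read()
        with con:
            con.execute("INSERT INTO files(hash, data) VALUES (?, ?)",
                        (hashlib.sha256(blob).hexdigest(), blob))

    # rows sharing a digest are duplicate candidates; a final
    # byte-for-byte comparison of the blobs confirms a true match
    dupes = con.execute("""SELECT hash, COUNT(*) FROM files
                           GROUP BY hash HAVING COUNT(*) > 1""").fetchall()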

ALL of the files are going to be a multiple of 174,848 bytes.  So given
that, plus the checksum, plus an integer-based auto-incremented field, I'm
wondering if there is going to be any optimization internally from SQLite
by making the page size, say, 180k? (I know there'll be unused bytes in the
page, but that's a low-priority thing -- I'll tighten up the page size and
re-import the data when I get closer to a "production" state.)

My app is going to be on Windows, and I'm not concerned about constraints
on memory or CPU or anything of the sort.  I'm just thinking about reading
in and out of the database as a POSSIBLE early (not exactly premature, I
believe) optimization.  I figure that if SQLite is going to read in the
174,848-byte blob, it'd be quicker to read that one page into memory than
to take multiple stabs across a fragmented file, with the full
understanding, of course, that fragmentation will still happen at the file
system level.

These raw blobs are going to sit in a table by themselves, with a separate
many-to-one table sitting elsewhere for other nefarious reasons.  I know
that this table, plus any others, will end up sitting in the 180k-sized
pages, but beyond that, is there any practical gain in going to 180k
instead of the default 4k?
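
A hypothetical sketch of that two-table layout (all table and column
names invented for illustration):

    import sqlite3

    con = sqlite3.connect("files.db")
    con.executescript("""
        -- raw blobs in a table by themselves
        CREATE TABLE IF NOT EXISTS blobs(
            id   INTEGER PRIMARY KEY AUTOINCREMENT,
            data BLOB
        );
        -- many rows here may reference one blob
        CREATE TABLE IF NOT EXISTS file_info(
            id      INTEGER PRIMARY KEY,
            blob_id INTEGER REFERENCES blobs(id),
            name    TEXT
        );
    """)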