Hi everyone,

As I mentioned a few months ago, I'm using SQLite to continuously log data
collected from several sources on a Linux system.

This is the current approach (a rough code sketch follows the list):
- There is just one single-threaded writer process, which also periodically
issues DELETE statements to remove older data.
- In order to prevent long-running read queries from blocking the
writer, I'm using WAL mode.
- The database is opened with PRAGMA synchronous=NORMAL (i.e. 1), which
in WAL mode should rule out corruption after a power failure, at the
cost of possibly losing the most recently committed transactions.
- I'm not doing any manual checkpoints, just limiting the journal size
to 100MB.
- The page size is left at its default.
- Underlying storage is a F2FS partition on a commodity 32GB SSD.
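
In code, that setup would look roughly like this (a sketch, not my
actual program: the function name is just for illustration, error
handling is trimmed, and expressing the 100MB cap via
journal_size_limit is my reading of that pragma):

    #include <stddef.h>
    #include <sqlite3.h>

    /* Sketch of the writer's connection setup; error checking
     * omitted.  journal_size_limit takes bytes, hence 100*1024*1024. */
    static sqlite3 *open_logger_db(const char *path)
    {
        sqlite3 *db = NULL;
        if (sqlite3_open(path, &db) != SQLITE_OK)
            return NULL;
        sqlite3_exec(db, "PRAGMA journal_mode=WAL;", 0, 0, 0);
        sqlite3_exec(db, "PRAGMA synchronous=NORMAL;", 0, 0, 0); /* =1 */
        sqlite3_exec(db, "PRAGMA journal_size_limit=104857600;", 0, 0, 0);
        return db;
    }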

The idea would be to end up with a ~20GB database holding about 10 days'
worth of rolling data.

There are two (apparently) opposing requirements:
- in case of a power outage, all data collected up to at most N seconds
prior to the power failure should be readable. Ideally N=0, but what's
important is that the database never gets corrupted.
- the SSD's wear due to continuous writes should be reduced to a minimum

Of course there's no silver bullet and some sort of compromise must be
accepted.
However, it's not clear to me how to control or even understand the overall
behavior of the current system -- in terms of potential data loss and SSD
wear-out rate -- other than by empirical testing.

There are just too many layers that affect the end result:
- how the application interacts with SQLite (pragmas, statements,
transactions, explicit checkpoints, etc...)
- how SQLite interacts with the OS (write(), fsync()...)
- how the F2FS filesystem interacts with the SSD (block writes, TRIM...)
- how the SSD controller interacts with the underlying flash chips
(buffering, erases, writes, wear leveling...)

Any suggestions on how to proceed, or where to start?
What should I be assuming as "already-taken-care-of" and what should I
rather concentrate on?
Final database size? Commit/checkpoint frequency? Page size?

For instance, one crazy idea might be to put the WAL file on a ramdisk
instead (if that's at all possible) and run manual checkpoints at
periodic intervals. But that would gain nothing if the OS in fact never
writes the WAL out to disk until a sync occurs anyway.
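
The manual-checkpoint half of that idea at least seems simple; I
imagine something along these lines (again only a sketch, with
SQLITE_CHECKPOINT_TRUNCATE being my guess at the mode that keeps the
WAL smallest):

    #include <sqlite3.h>

    /* Called every N seconds from the writer's main loop.  Automatic
     * checkpoints would be disabled beforehand with
     * "PRAGMA wal_autocheckpoint=0;".  TRUNCATE moves all WAL frames
     * into the main database and resets the WAL to zero length. */
    static void periodic_checkpoint(sqlite3 *db)
    {
        int log_frames = 0, ckpt_frames = 0;
        sqlite3_wal_checkpoint_v2(db, 0 /* all attached dbs */,
                                  SQLITE_CHECKPOINT_TRUNCATE,
                                  &log_frames, &ckpt_frames);
    }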

I found the "Total_Erase_Count" reported by smartmontools to be an
interesting end-to-end metric in the long run and I believe its growth rate
is what needs to be minimized in the end.
But I have no idea what the theoretical minimum would be (as a function of
data rate and acceptable amount of data loss), and if there's a more
immediate way of measuring SSD wear rate.
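
One cruder but more immediate proxy I've considered is sampling the
sectors-written counter that Linux exposes in /proc/diskstats and
watching its growth rate. The device name below is of course just an
example, and the SSD's internal write amplification still sits on top
of whatever this reports:

    #include <stdio.h>
    #include <string.h>

    /* Total 512-byte sectors written to `dev` (e.g. "sda") as
     * reported by /proc/diskstats (10th field), or 0 on failure.
     * Sampling this periodically gives the host-side write rate. */
    static unsigned long long sectors_written(const char *dev)
    {
        FILE *f = fopen("/proc/diskstats", "r");
        char line[256];
        unsigned long long result = 0;

        if (!f)
            return 0;
        while (fgets(line, sizeof line, f)) {
            unsigned int major, minor;
            char name[32];
            unsigned long long rd_ios, rd_merges, rd_sec, rd_ticks;
            unsigned long long wr_ios, wr_merges, wr_sec;
            if (sscanf(line,
                       "%u %u %31s %llu %llu %llu %llu %llu %llu %llu",
                       &major, &minor, name, &rd_ios, &rd_merges,
                       &rd_sec, &rd_ticks, &wr_ios, &wr_merges,
                       &wr_sec) == 10
                && strcmp(name, dev) == 0) {
                result = wr_sec;
                break;
            }
        }
        fclose(f);
        return result;
    }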

I could start fiddling with everything that comes to mind and measure
the end result, but I suspect this is a well-known problem and I'm just
missing the basic assumptions here...

Thank you!
Gerlando