Re: [sqlite] SQLite for datalogging - best practices

Gerlando Falauto Mon, 29 Oct 2018 10:11:33 -0700

On Mon, Oct 29, 2018 at 2:28 PM Richard Hipp <[email protected]> wrote:

> On 10/28/18, Gerlando Falauto <[email protected]> wrote:
> >
> > This is the current approach:
> > - There is just one single-threaded writer process, which also
> periodically
> > issues DELETE statements to remove older data.
> > - In order to prevent long-running reading queries from blocking the
> > writer, I'm using WAL mode.
> > - The database is opened with SYNCHRONOUS=1 to prevent database
> corruption.
> > - Not doing any manual checkpoints, just limiting journal size to 100MB.
> > - Page size is at its default
> > - Underlying storage is a F2FS partition on a commodity 32GB SSD.
> >
>
> Other comments on this thread have been good.  Thanks, everybody!
>

Thank you (and BTW, what an honor...) :-)

For peak performance and efficiency, omit indexes from the log table.
> Adding to a log is just appending to a table, which is particularly
> fast.  But if there are indexes, entries must be inserted into the
> middle of the indexes (so that the all entries end up in index-order)
> which requires more I/O.
>

I see. However, giving up on indexes kind of defeats the whole purpose of
having a database.
I assume there is no way to leverage the fact that e.g. rows would be
inherently sorted by timestamp, without recurring to indexes.

If possible, combine two or more (as many as practical, subject to
> your timing constraints) inserts into a single transaction.
>

That would have been my guess, so I'm glad to hear it makes sense then.

> Using the latest Linux kernels and F2FS filesystem modules, and if you
> compile SQLite using SQLITE_ENABLE_BATCH_ATOMIC_WRITE, and if you use
> DELETE mode rather than WAL mode, then SQLite will try to take
> advantage of the batch atomic write capabilities provided by F2FS.
> This can be *much* more efficient in terms of performance and I/O and
> also reduced wear on the flash chips.

Cool, I had no idea about this capability, I'll have to dig it further.

> The downsides to this approach
> are (1) there is an exclusive lock held for a few milliseconds while
> doing each write, so readers and writers cannot overlap as they can in
> WAL mode and

Actually my biggest concern with DELETE mode is not the writer blocking
readers,
as much as readers (which might potentially take a very long time to
consume all data for e.g. analytics)
blocking the writer.
I don't believe there is a way to give up any of the ACID properties to
work around this limitation, is there?

> (2) there are scattered reports of instabilities with
> F2FS following power failures in the middle of transactions - it is
> unclear if these reports are true or not and they are rare in any
> event, but F2FS is relatively new technology (compared to EXT4) so
> there *might* still be some obscure problems on optimized code paths.
>

Thank you for the heads-up. I was not aware of any such issues.
I guess some continuous testing would be in order here.
What kind of worries me is the chance of getting stranded with a corrupt
database
and no way to automatically recover the oldest (long untouched) parts of it.

Having the WAL file on a ramdisk could result in database corruption
> if you take a power failure in the middle of a checkpoint.
>

OK, that was a stupid idea anyway.
I believe just leaving SYNC=1 is probably enough to reduce writes to a
minimum, isn't it?

Thank you,
Gerlando
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] SQLite for datalogging - best practices

Reply via email to