Hi Simon,
See my comments below.
Thank you for your time,
Greg
On 10/04/14 18:25, Simon Slavin wrote:
On 10 Apr 2014, at 5:04pm, Grzegorz Sikorski <g.sikor...@kelvatek.com> wrote:
The problem is we occasionally observe database corruption and whole database
becomes rubbish. The problem is very rare and occurs only (as far as we were
able to confirm so far) if there is a power lose during write to the database.
After power is lost, presumably you then reboot the device and restart the app.
Does your rebooting procedure, or anything else done before the database is
reopened, mess with the database or journal file in any way ? Especially, does
it rename, move or delete the journal file ?
No, the journal files (-wal/-shm) are not modified at all. The only
think that may be done (but rarely) is changing its privileges to
rw-rw-r--. We need to be able to open database in read-only mode by
group and as far as I know there is no other way to change privileges in
SQLite3. Actually, I think this is kind of missing API/bug, but minor
and I did not report it so far. SQLite3 should at least respect
process's umask when it creates -shm/-wal files (during openning DB in
read-only or read-write mode), but this is not the case, unfortunately
(at least it did not work for me). But as far as I know filesystem, this
should not hurt the files itself.
Does you app use any PRAGMAs when it opens the database file (apart from you
having put it in WAL mode once upon a time) ?
Yes. For readers I do:
PRAGMA cache_size=8192;
and for writer (just in case the database is empty):
PRAGMA journal_mode=wal;
When the database is corrupted, I assume that the file length remains the same
(or at least is plausible for an uncorrupted database) but that some of the
file is overwritten with bytes which shouldn't be there. Do these bytes look
like data which could be from elsewhere in the database file, or elsewhere on
the storage medium, or something which might have been in memory, or complete
gibberish ?
Unfortunately, I am not really familiar with internal structures of
SQLite3 DB. If it helps, here is an example output from 'pragma
integrity_check;' command:
sqlite> pragma integrity_check;
*** in database main ***
On tree page 9173 cell 3: Rowid 13335 out of order (previous was
1397130453176)
On tree page 9173 cell 30: Rowid 1397130684092 out of order (previous
was 1397667554004)
On tree page 9173 cell 36: Rowid 1397130694597 out of order (previous
was 1534569645968)
On tree page 9173 cell 39: Rowid 1397130866645 out of order (previous
was 1397130961910)
On tree page 9173 cell 40: Rowid 1397130803109 out of order (previous
was 1397130866645)
On tree page 9173 cell 45: Rowid 3 out of order (previous was 1397130807113)
On tree page 9173 cell 50: Rowid 3 out of order (previous was 1397130906159)
On tree page 9173 cell 53: Rowid 1397130914407 out of order (previous
was 178832756909971)
On tree page 9173 cell 60: Rowid 0 out of order (previous was 1397130922549)
On tree page 9173 cell 67: Rowid 1397130948151 out of order (previous
was 1397198039630)
On tree page 9173 cell 68: Rowid 1397130932767 out of order (previous
was 1397130948151)
On tree page 9173 cell 70: Rowid 1397130935768 out of order (previous
was 1397399370224)
On tree page 9173 cell 73: Rowid 1397130939282 out of order (previous
was 3596154193322)
On tree page 9173 cell 77: Rowid 3 out of order (min less than parent
min of 1397130450131)
Corruption detected in cell 0 on page 9173
Corruption detected in cell 3 on page 9173
Corruption detected in cell 26 on page 9173
Corruption detected in cell 50 on page 9173
Corruption detected in cell 67 on page 9173
Corruption detected in cell 69 on page 9173
Corruption detected in cell 76 on page 9173
Multiple uses for byte 265 of page 9173
Fragmentation of 11 bytes reported as 0 on page 9173
Corruption detected in cell 0 on page 9176
Fragmentation of 965 bytes reported as 0 on page 9176
This is an example log, there is many other scenarios, which I can share
on request.
If you can't answer some of the above, don't worry. It might help, might not.
We admit there may be a problem with SD drivers, POSIX mutex implementation or ext4
driver (we are running relatively old version of Linux kernel), however we have never
observed corruption of any other file on the same partition, even though this partition
is heavily used for other purposes (syslog, all temporary data, configuration, some
exported data files in zip format). We also use POSIX mutexes in other programs and have
never discovered any issue (apart from invalid use in developer's code;) Does anyone know
if there is any "unstable/unknown" part of SQLite3 code which may cause this
issue?
Thank you for the details which has saved a lot of laborious back-and-forth.
I'm not aware of any issues involving corruption as you describe, though
someone else who reads this list may be. Or may be able to solve your problem
another way.
Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
--
Asylia ExchangeDefender Message Security: Click below to verify authenticity
http://www.exchangedefender.com/verify.asp?id=s3BAnOq1026118&from=g.sikor...@camlintechnologies.com
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users