Hi Simon,

See my comments below.

Thank you for your time,
Greg

On 10/04/14 18:25, Simon Slavin wrote:
On 10 Apr 2014, at 5:04pm, Grzegorz Sikorski <g.sikor...@kelvatek.com> wrote:

The problem is we occasionally observe database corruption and whole database 
becomes rubbish. The problem is very rare and occurs only (as far as we were 
able to confirm so far) if there is a power lose during write to the database.
After power is lost, presumably you then reboot the device and restart the app. 
 Does your rebooting procedure, or anything else done before the database is 
reopened, mess with the database or journal file in any way ?  Especially, does 
it rename, move or delete the journal file ?
No, the journal files (-wal/-shm) are not modified at all. The only think that may be done (but rarely) is changing its privileges to rw-rw-r--. We need to be able to open database in read-only mode by group and as far as I know there is no other way to change privileges in SQLite3. Actually, I think this is kind of missing API/bug, but minor and I did not report it so far. SQLite3 should at least respect process's umask when it creates -shm/-wal files (during openning DB in read-only or read-write mode), but this is not the case, unfortunately (at least it did not work for me). But as far as I know filesystem, this should not hurt the files itself.

Does you app use any PRAGMAs when it opens the database file (apart from you 
having put it in WAL mode once upon a time) ?
Yes. For readers I do:
PRAGMA cache_size=8192;
and for writer (just in case the database is empty):
PRAGMA journal_mode=wal;

When the database is corrupted, I assume that the file length remains the same 
(or at least is plausible for an uncorrupted database) but that some of the 
file is overwritten with bytes which shouldn't be there.  Do these bytes look 
like data which could be from elsewhere in the database file, or elsewhere on 
the storage medium, or something which might have been in memory, or complete 
gibberish ?
Unfortunately, I am not really familiar with internal structures of SQLite3 DB. If it helps, here is an example output from 'pragma integrity_check;' command:
sqlite> pragma integrity_check;
*** in database main ***
On tree page 9173 cell 3: Rowid 13335 out of order (previous was 1397130453176) On tree page 9173 cell 30: Rowid 1397130684092 out of order (previous was 1397667554004) On tree page 9173 cell 36: Rowid 1397130694597 out of order (previous was 1534569645968) On tree page 9173 cell 39: Rowid 1397130866645 out of order (previous was 1397130961910) On tree page 9173 cell 40: Rowid 1397130803109 out of order (previous was 1397130866645)
On tree page 9173 cell 45: Rowid 3 out of order (previous was 1397130807113)
On tree page 9173 cell 50: Rowid 3 out of order (previous was 1397130906159)
On tree page 9173 cell 53: Rowid 1397130914407 out of order (previous was 178832756909971)
On tree page 9173 cell 60: Rowid 0 out of order (previous was 1397130922549)
On tree page 9173 cell 67: Rowid 1397130948151 out of order (previous was 1397198039630) On tree page 9173 cell 68: Rowid 1397130932767 out of order (previous was 1397130948151) On tree page 9173 cell 70: Rowid 1397130935768 out of order (previous was 1397399370224) On tree page 9173 cell 73: Rowid 1397130939282 out of order (previous was 3596154193322) On tree page 9173 cell 77: Rowid 3 out of order (min less than parent min of 1397130450131)
Corruption detected in cell 0 on page 9173
Corruption detected in cell 3 on page 9173
Corruption detected in cell 26 on page 9173
Corruption detected in cell 50 on page 9173
Corruption detected in cell 67 on page 9173
Corruption detected in cell 69 on page 9173
Corruption detected in cell 76 on page 9173
Multiple uses for byte 265 of page 9173
Fragmentation of 11 bytes reported as 0 on page 9173
Corruption detected in cell 0 on page 9176
Fragmentation of 965 bytes reported as 0 on page 9176

This is an example log, there is many other scenarios, which I can share on request.

If you can't answer some of the above, don't worry.  It might help, might not.

We admit there may be a problem with SD drivers, POSIX mutex implementation or ext4 
driver (we are running relatively old version of Linux kernel), however we have never 
observed corruption of any other file on the same partition, even though this partition 
is heavily used for other purposes (syslog, all temporary data, configuration, some 
exported data files in zip format). We also use POSIX mutexes in other programs and have 
never discovered any issue (apart from invalid use in developer's code;) Does anyone know 
if there is any "unstable/unknown" part of SQLite3 code which may cause this 
issue?
Thank you for the details which has saved a lot of laborious back-and-forth.  
I'm not aware of any issues involving corruption as you describe, though 
someone else who reads this list may be.  Or may be able to solve your problem 
another way.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


--
Asylia ExchangeDefender Message Security: Click below to verify authenticity
http://www.exchangedefender.com/verify.asp?id=s3BAnOq1026118&from=g.sikor...@camlintechnologies.com


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to