I am debugging db corruption. After I get some corrupted db, I found that they all corrupted by writing null data. So, I decide to add some check and dump call stackin the source code in order to find out who corrupts the db.
Here is the code I added in the source code. int sqlite3CheckNullData(const unsigned char* data, const int length) { const size_t* s = (const size_t*)data; const unsigned char* d = (const unsigned char*)data; int n = length/sizeof(size_t); int i; for (i = 0; i n; i++) { if (s[i]!=0) { return 0; } } for (i = i*sizeof(size_t); ilength; i++) { if (d[i]!=0) { return 0; } } return 1; } static int unixWrite( sqlite3_file *id, const void *pBuf, int amt, sqlite3_int64 offset ){ unixFile *pFile = (unixFile*)id; if (amt0sqlite3CheckNullData(pBuf, amt)) { SQLITE_KNOWN_ERROR(SQLITE_CORRUPT, "writing null data into %s from %d length %d", unixGetFilename(pFile-zPath), offset, amt); } ... } The code is simple. I check the data whether is all null in [sqlite3CheckNullData], and add a macro [SQLITE_KNOWN_ERROR], which is defined as [sqlite_log], to throw this error outside SQLite. Outside SQLite, I dump the call stack of all thread, and I got this: 0x195774000 + 113628 objc_msgSend (in libobjc.dylib) + 28 0x1000f8000 + 7781724 _ZL9LogSQLitePviPKc,WCDataBase.mm,line 81 0x1000f8000 + 2836888 sqlite3_vlog,printf.c,line 1023 0x1000f8000 + 2778664 sqlite3KnownError,main.c,line 3192 0x1000f8000 + 2554560 unixWrite,os_unix.c,line 3335 0x1000f8000 + 2821984 sqlite3WalCheckpoint,wal.c,line 1798 0x1000f8000 + 2819864 sqlite3WalClose,wal.c,line 1914 0x1000f8000 + 2529964 sqlite3PagerClose,pager.c,line 3995 0x1000f8000 + 2574152 sqlite3BtreeClose,btree.c,line 2516 0x1000f8000 + 2774444 sqlite3LeaveMutexAndCloseZombie,main.c,line 10834297741736 0x1000f8000 + 2774220 sqlite3Close,main.c,line 1026 This is the only thread operating database. All other call stack of threads make no sense. You can see the SQLite checkpointing. That is the reason why my database corrupt. And I have no idea how this happened even I checking the source code. Here is some of my conclusion: 1. This checking null data also work for writing into WAL file, but there is no report that WAL is been written by null data. 2.Some rogue file descriptor may write the null data into WAL file. But, I have several db with the same problem. It?s a rare event that the rogue writter only write the null data into the WAL, not all other db files or normal files. 3. I guess it could be a problem of operating system. I work on iOS, but I have no any further idea. 4. It would happened in normal knee. But it could easily happen when the disk free space is low. I also haveno any further idea about this. So, this is my confusion: 1. Does anyone have any idea about this? 2. What can I do to reserve this type of corruption? Note that if a page of sqlite_master is been rewritten by null data, the [.dump] shell command will not work to repair the database.