[sqlite] SQLite Corruption By Writing NULL Data

2016-03-04 Thread sanhua.zh
I use the C API.
I think it would not be possible to get the whole call stack if I were 
continuing to use a released SQLite pointer.



From: Simon Slavin slavins at bigfraud.org
To: SQLite mailing list sqlite-users at mailinglists.sqlite.org
Date: 2016-03-04 17:50
Subject: Re: [sqlite] SQLite Corruption By Writing NULL Data


> On 4 Mar 2016, at 8:22am, sanhua.zh sanhua.zh at foxmail.com wrote:
>
>> 3. I guess it could be a problem of operating system. I work on iOS, but I
>> have no any further idea.
>
> Almost all of these problems are caused by your program doing one of these
>
> A) Writing its own data into a pointer made by SQLite
> B) Releasing a SQLite pointer and then continuing to use it
>
> Which API are you using to call SQLite ?  Are you calling the SQLite API
> using C commands ?  Or are you using another language or another API ?
>
> How are you including SQLite in your project ?  Are you calling a library
> you supply ?  Are you calling a library already present in your programming
> language ?  Or are you including the 'sqlite.c' and 'sqlite.h' files in your
> project ?
>
> Simon.


[sqlite] SQLite Corruption By Writing NULL Data

2016-03-04 Thread sanhua.zh
I am debugging database corruption. After collecting some corrupted databases, 
I found that they were all corrupted by null data being written into them.
So I decided to add a check to the source code that dumps the call stack, in 
order to find out who corrupts the database.


Here is the code I added to the source code.


/* Return 1 if the buffer contains only zero bytes, 0 otherwise. */
int sqlite3CheckNullData(const unsigned char* data, const int length)
{
  const size_t* s = (const size_t*)data;
  const unsigned char* d = (const unsigned char*)data;
  int n = length/sizeof(size_t);
  int i;
  /* Compare one machine word at a time first. */
  for (i = 0; i < n; i++) {
    if (s[i]!=0) {
      return 0;
    }
  }
  /* Then check the remaining tail bytes. */
  for (i = i*sizeof(size_t); i < length; i++) {
    if (d[i]!=0) {
      return 0;
    }
  }
  return 1;
}

static int unixWrite(
 sqlite3_file *id,
 const void *pBuf,
 int amt,
 sqlite3_int64 offset
){
 unixFile *pFile = (unixFile*)id;
 if (amt>0 && sqlite3CheckNullData(pBuf, amt)) {
  /* offset is a 64-bit value, so it is formatted with %lld */
  SQLITE_KNOWN_ERROR(SQLITE_CORRUPT,
                     "writing null data into %s from %lld length %d",
                     unixGetFilename(pFile->zPath), offset, amt);
 }
...
}

The code is simple. In [sqlite3CheckNullData] I check whether the data is all 
null, and I added a macro [SQLITE_KNOWN_ERROR], which is defined as 
[sqlite_log], to report this error outside SQLite. Outside SQLite, I dump the 
call stacks of all threads, and I got this:

0x195774000 + 113628   objc_msgSend (in libobjc.dylib) + 28
0x1000f8000 + 7781724   _ZL9LogSQLitePviPKc,WCDataBase.mm,line 81
0x1000f8000 + 2836888   sqlite3_vlog,printf.c,line 1023
0x1000f8000 + 2778664   sqlite3KnownError,main.c,line 3192
0x1000f8000 + 2554560   unixWrite,os_unix.c,line 3335
0x1000f8000 + 2821984   sqlite3WalCheckpoint,wal.c,line 1798
0x1000f8000 + 2819864   sqlite3WalClose,wal.c,line 1914
0x1000f8000 + 2529964   sqlite3PagerClose,pager.c,line 3995
0x1000f8000 + 2574152   sqlite3BtreeClose,btree.c,line 2516
0x1000f8000 + 277   sqlite3LeaveMutexAndCloseZombie,main.c,line 
10834297741736

0x1000f8000 + 2774220   sqlite3Close,main.c,line 1026


This is the only thread operating on the database. The call stacks of all the 
other threads are unrelated.
You can see that SQLite is checkpointing. That is the reason my database became 
corrupted. And I have no idea how this happened, even after checking the source 
code.


Here are some of my conclusions:
1. The null-data check also covers writes to the WAL file, but it never 
reported null data being written to the WAL.
2. Some rogue file descriptor may have written the null data into the WAL file. 
But I have several databases with the same problem. It is unlikely that a rogue 
writer would write null data only into the WAL and never into other database 
files or ordinary files.
3. I guess it could be a problem with the operating system. I work on iOS, but 
I have no further ideas.
4. It can happen under normal conditions, but it happens more easily when free 
disk space is low. I have no further ideas about this either.


So, these are my questions:
1. Does anyone have any idea about this?
2. What can I do to resolve this type of corruption?


Note that if a page of sqlite_master has been overwritten with null data, the 
[.dump] shell command will not be able to repair the database.
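
To see which pages of a corrupted database were overwritten, one option is to 
scan a copy of the file for pages that contain only zero bytes. The following 
is a minimal sketch and not part of SQLite: is_all_zero() and 
count_zero_pages() are hypothetical helper names, and the page size is passed 
in by the caller as an assumption, because the database header sits at the 
start of page 1 and is lost if that page itself was zeroed. An all-zero page 
is only a hint, so treat the result as a starting point for inspection.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if every byte in buf[0..len) is zero, 0 otherwise. */
int is_all_zero(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) return 0;
    }
    return 1;
}

/*
 * Count the all-zero pages in a database file (hypothetical helper).
 * page_size is supplied by the caller (assumption: it is known in advance).
 * If first_zero is not NULL, it receives the 1-based number of the first
 * all-zero page, or 0 if there is none.
 * Returns the number of all-zero pages, or -1 on error.
 */
long count_zero_pages(const char *path, size_t page_size, long *first_zero) {
    if (first_zero) *first_zero = 0;
    if (path == NULL || page_size == 0) return -1;

    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    unsigned char *page = malloc(page_size);
    if (!page) { fclose(f); return -1; }

    long zero_pages = 0;
    long pgno = 0;
    /* A short read at the end (truncated last page) is ignored. */
    while (fread(page, 1, page_size, f) == page_size) {
        pgno++;
        if (is_all_zero(page, page_size)) {
            if (zero_pages == 0 && first_zero) *first_zero = pgno;
            zero_pages++;
        }
    }

    int io_error = ferror(f);
    free(page);
    fclose(f);
    return io_error ? -1 : zero_pages;
}
```

If the first all-zero page reported is page 1, the sqlite_master root page is 
affected, which is the case where [.dump] cannot help.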


[sqlite] SQLite Corruption By Writing NULL Data

2016-03-04 Thread Clemens Ladisch
sanhua.zh wrote:
> I am debugging db corruption. After I get some corrupted db, I found that 
> they all corrupted by writing null data.
>
> 0x1000f8000 + 2778664   sqlite3KnownError,main.c,line 3192
> 0x1000f8000 + 2554560   unixWrite,os_unix.c,line 3335
> 0x1000f8000 + 2821984   sqlite3WalCheckpoint,wal.c,line 1798
>
> You can see the SQLite checkpointing. That is the reason why my database 
> corrupt.

The checkpoint operation just copies data from the WAL file to the
actual database file.  For this to be a bug in the database, you'd have
to catch it writing those zero bytes to the WAL file first.

Unless you do, we can conclude that the zero bytes were never written by
the database; the WAL file got corrupted for other reasons (probably
broken flash).
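
One way to check this from outside the database is to inspect a copy of the
WAL file before it is checkpointed and look for frames whose page data is
already all zero.  The sketch below is only an illustration under stated
assumptions: count_zero_wal_frames() is a hypothetical helper, it relies on
the documented WAL layout (32-byte file header with the page size at offset
8, then frames made of a 24-byte header plus one page), and it does not
verify salts or checksums, so it cannot tell valid frames from stale ones.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

enum { WAL_HDR_SIZE = 32, WAL_FRAME_HDR_SIZE = 24 };

/* Decode a big-endian 32-bit integer. */
static uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Count WAL frames whose page data consists only of zero bytes.
 * Returns the count, or -1 if the file cannot be read or does not
 * look like a WAL file.
 */
long count_zero_wal_frames(const char *wal_path) {
    assert(wal_path != NULL);
    FILE *f = fopen(wal_path, "rb");
    if (!f) return -1;

    unsigned char hdr[WAL_HDR_SIZE];
    if (fread(hdr, 1, sizeof hdr, f) != sizeof hdr) { fclose(f); return -1; }

    uint32_t magic = read_be32(hdr);
    uint32_t page_size = read_be32(hdr + 8);
    int bad_magic = (magic != 0x377f0682u && magic != 0x377f0683u);
    int bad_size = page_size < 512 || page_size > 65536 ||
                   (page_size & (page_size - 1)) != 0;
    if (bad_magic || bad_size) { fclose(f); return -1; }

    size_t frame_size = WAL_FRAME_HDR_SIZE + (size_t)page_size;
    unsigned char *frame = malloc(frame_size);
    if (!frame) { fclose(f); return -1; }

    long zero_frames = 0;
    long frame_no = 0;
    /* A trailing partial frame is ignored. */
    while (fread(frame, 1, frame_size, f) == frame_size) {
        frame_no++;
        const unsigned char *data = frame + WAL_FRAME_HDR_SIZE;
        size_t i = 0;
        while (i < page_size && data[i] == 0) i++;
        if (i == page_size) {
            /* The first 4 bytes of the frame header hold the page number. */
            fprintf(stderr, "frame %ld (db page %lu): page data is all zero\n",
                    frame_no, (unsigned long)read_be32(frame));
            zero_frames++;
        }
    }

    int io_error = ferror(f);
    free(frame);
    fclose(f);
    return io_error ? -1 : zero_frames;
}
```

If such a scan finds zeroed frames although the write check in unixWrite
never fired for the WAL file, that would point at something other than the
database writing those bytes.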


Regards,
Clemens


[sqlite] SQLite Corruption By Writing NULL Data

2016-03-04 Thread Simon Slavin

On 4 Mar 2016, at 8:22am, sanhua.zh  wrote:

> 3. I guess it could be a problem of operating system. I work on iOS, but I 
> have no any further idea.

Almost all of these problems are caused by your program doing one of these

A) Writing its own data into a pointer made by SQLite
B) Releasing a SQLite pointer and then continuing to use it

Which API are you using to call SQLite ?  Are you calling the SQLite API using 
C commands ?  Or are you using another language or another API ?

How are you including SQLite in your project ?  Are you calling a library you 
supply ?  Are you calling a library already present in your programming 
language ?  Or are you including the 'sqlite.c' and 'sqlite.h' files in your 
project ?

Simon.