Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption

2017-01-23 Thread Rowan Worth
So I tracked down the commit which introduced the "truncated page is fine"
behaviour:

$ fossil info 647e3b156e
uuid: 647e3b156e32e37debd60b0079fc5a52bdc9b8c8 2009-03-28 06:59:41
UTC
parent:   1c6521e53b846eec2e046b1e9c04c60658b8e0e8 2009-03-27 15:26:03
UTC
child:c9fa329f54736de517cddaf747595c9eca931f32 2009-03-28 07:03:42
UTC
tags: trunk
comment:  Fix readDbPage() so that if an SQLITE_IOERR_SHORT_READ is
  encountered, the page content is zeroed. Ticket #3756. (CVS
6395)
  (user: danielk1977)



Based on (CVS 6395) I'm guessing this was pre-fossil. I can't find any
reference to ticket #3756 in the current system - is that still around or
lost forever?

-Rowan

On 14 January 2017 at 11:38, Rowan Worth  wrote:

> On 13 January 2017 at 22:59, David Raymond 
> wrote:
>
>> My view is that the general thinking of the program here is simply: "just
>> don't make things worse." It can't help what pragmas (ie
>> ignore_check_constraints, writable_schema etc) others may have turned on
>> for their connections, or what sort of junk was there when it arrived.
>
>
> ignore_check_constraints doesn't seem to affect NOT NULL constraints:
>
>   sqlite> create table a(b INTEGER NOT NULL);
>   sqlite> insert into a values(NULL);
>   Error: NOT NULL constraint failed: a.b
>   sqlite> pragma ignore_check_constraints = 1;
>   sqlite> insert into a values(NULL);
>   Error: NOT NULL constraint failed: a.b
>
> I can't see any pragmas that do? writable_schema is fair, but the
> advertised procedure[1] _does_ come with the warning that a mistake using
> it will corrupt your database. It's not entirely unreasonable to classify
> "using writable_schema to add a NOT NULL constraint to a column without
> ensuring all rows are NOT NULL" as a mistake. I also agree that calling the
> result corrupted is not an ideal result, but it provides benefits in the
> form of sanity checks elsewhere.
>
> [1] https://www.sqlite.org/lang_altertable.html#otheralter
>
> I haven't figured out any other way to get a NULL value into a NOT NULL
> column. The regular ALTER TABLE command refuses:
>
>   sqlite> alter table a add column c INTEGER NOT NULL;
>   Error: Cannot add a NOT NULL column with default value NULL
>
> Nor does sqlite allow them to propagate from tables affected by truncation
> as in my original post:
>
>   sqlite> create table table2 (C INTEGER NOT NULL);
>   sqlite> insert into table2 select a from test_table;
>   Error: NOT NULL constraint failed: table2.C
>
> As to the truncated page it would be kind of bad for it to say "hey,
>> someone else corrupted 28 bytes, so I won't give you info on the other 99%
>> that I can read just fine." All that being said though it might be nice to
>> know there was a noticed issue.
>>
>
> I don't buy this one at all. Especially as sqlite fills pages backwards
> (starting from the end), so a truncated page almost certainly represents
> lost user data. I can't see anything in the hardware assumptions[2] to
> suggest a truncated page is ever a valid configuration.
>
> [2] https://www.sqlite.org/atomiccommit.html#hardware
>
> It does clarify that sqlite goes to no effort to detect bit errors and the
> like, but again I'm not asking for checksums or error correction codes. It
> also says "SQLite assumes that the data it reads is exactly the same data
> that it previously wrote". I'm pretty sure it never does any write that is
> not a multiple of the page size, so a short read due to EOF is a pretty
> good indication that this assumption has been violated!
>
>
>> Also, quietly dealing with a different number of fields found in the file
>> compared to what it parsed from the schema is how it handles "alter table
>> add column" without needing to rewrite the entire table when you do that.
>> That might also explain why you can see nulls for fields that got
>> truncated.
>
>
> Nope, see above. If you can provide me a legitimate way to violate a NOT
> NULL constraint then I'll happily concede that the presence of a NULL cell
> cannot be trusted as an indicator of corruption, but so far all my attempts
> have failed. And at this stage no I don't consider shooting yourself in the
> foot with writable_schema a legitimate way :P
>
> I've been testing with 3.14.2 - if it was possible/easier in previous
> versions to violate NOT NULL then backwards compatibility may also rule
> this out.
>
> -Rowan
>
>
>
>   sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT
>> NULL);"
>>  for N in `seq 1 10`
>>  do
>>  sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);"
>>  done
>>
>>  dd if=test.db of=cropped.db bs=2020 count=1
>>  sqlite3 cropped.db "SELECT * FROM test_table;"
>>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption

2017-01-13 Thread Rowan Worth
On 13 January 2017 at 22:59, David Raymond  wrote:

> My view is that the general thinking of the program here is simply: "just
> don't make things worse." It can't help what pragmas (ie
> ignore_check_constraints, writable_schema etc) others may have turned on
> for their connections, or what sort of junk was there when it arrived.


ignore_check_constraints doesn't seem to affect NOT NULL constraints:

  sqlite> create table a(b INTEGER NOT NULL);
  sqlite> insert into a values(NULL);
  Error: NOT NULL constraint failed: a.b
  sqlite> pragma ignore_check_constraints = 1;
  sqlite> insert into a values(NULL);
  Error: NOT NULL constraint failed: a.b

I can't see any pragmas that do? writable_schema is fair, but the
advertised procedure[1] _does_ come with the warning that a mistake using
it will corrupt your database. It's not entirely unreasonable to classify
"using writable_schema to add a NOT NULL constraint to a column without
ensuring all rows are NOT NULL" as a mistake. I also agree that calling the
result corrupted is not an ideal result, but it provides benefits in the
form of sanity checks elsewhere.

[1] https://www.sqlite.org/lang_altertable.html#otheralter

I haven't figured out any other way to get a NULL value into a NOT NULL
column. The regular ALTER TABLE command refuses:

  sqlite> alter table a add column c INTEGER NOT NULL;
  Error: Cannot add a NOT NULL column with default value NULL

Nor does sqlite allow them to propagate from tables affected by truncation
as in my original post:

  sqlite> create table table2 (C INTEGER NOT NULL);
  sqlite> insert into table2 select a from test_table;
  Error: NOT NULL constraint failed: table2.C

As to the truncated page it would be kind of bad for it to say "hey,
> someone else corrupted 28 bytes, so I won't give you info on the other 99%
> that I can read just fine." All that being said though it might be nice to
> know there was a noticed issue.
>

I don't buy this one at all. Especially as sqlite fills pages backwards
(starting from the end), so a truncated page almost certainly represents
lost user data. I can't see anything in the hardware assumptions[2] to
suggest a truncated page is ever a valid configuration.

[2] https://www.sqlite.org/atomiccommit.html#hardware

It does clarify that sqlite goes to no effort to detect bit errors and the
like, but again I'm not asking for checksums or error correction codes. It
also says "SQLite assumes that the data it reads is exactly the same data
that it previously wrote". I'm pretty sure it never does any write that is
not a multiple of the page size, so a short read due to EOF is a pretty
good indication that this assumption has been violated!


> Also, quietly dealing with a different number of fields found in the file
> compared to what it parsed from the schema is how it handles "alter table
> add column" without needing to rewrite the entire table when you do that.
> That might also explain why you can see nulls for fields that got
> truncated.


Nope, see above. If you can provide me a legitimate way to violate a NOT
NULL constraint then I'll happily concede that the presence of a NULL cell
cannot be trusted as an indicator of corruption, but so far all my attempts
have failed. And at this stage no I don't consider shooting yourself in the
foot with writable_schema a legitimate way :P

I've been testing with 3.14.2 - if it was possible/easier in previous
versions to violate NOT NULL then backwards compatibility may also rule
this out.

-Rowan



  sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT
> NULL);"
>  for N in `seq 1 10`
>  do
>  sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);"
>  done
>
>  dd if=test.db of=cropped.db bs=2020 count=1
>  sqlite3 cropped.db "SELECT * FROM test_table;"
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption

2017-01-13 Thread James K. Lowden
On Fri, 13 Jan 2017 11:17:11 +0800
Rowan Worth  wrote:

> I wonder if this is something sqlite could catch in normal operation
> and return SQLITE_CORRUPT? Or are there reasons/history which would
> render this conclusion inaccurate?

Not without cost.  

In general, it's difficult for any program to verify the integrity of a
file.  One way would be to keep a checksum of some kind in the file,
and update it whenever the file is closed.  That imposes the cost of
validating the checksum when the file is opened, and of computing it
when it's closed.  And the cost of a false positive: perhaps the data
are OK, but the checksum was corrupted.  

A checksum like that couldn't be added to SQLite without changing the
file format, which has been fixed for over a decade.  

The guard against FTP truncation is well known.  Transfer the file to a
temporary name, and rename it to its proper name only when the transfer
is successfully concluded.  TCP ensures reliable delivery; only a bug
in the FTP program (or underlying software or hardware) could introduce
data corruption. 

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption

2017-01-13 Thread David Raymond
My view is that the general thinking of the program here is simply: "just don't 
make things worse." It can't help what pragmas (ie ignore_check_constraints, 
writable_schema etc) others may have turned on for their connections, or what 
sort of junk was there when it arrived. In its head it's going to hear its 
Mom's voice saying "if all the other kids were ignoring not null constraints, 
would you?" and simply not let you add new errors. As to the truncated page it 
would be kind of bad for it to say "hey, someone else corrupted 28 bytes, so I 
won't give you info on the other 99% that I can read just fine." All that being 
said though it might be nice to know there was a noticed issue.

Remember that with the select you're asking it to tell you what "is" in there, 
not what "should" be in there. Also, quietly dealing with a different number of 
fields found in the file compared to what it parsed from the schema is how it 
handles "alter table add column" without needing to rewrite the entire table 
when you do that. That might also explain why you can see nulls for fields that 
got truncated.

So I suppose what it comes down to then is whether it should return OK, or some 
other code that says "I did what you told me, but I found a problem that wasn't 
so bad that it keep me from completing my task"



(Apologies for those who won't get the following reference from my psychotic 
mind)

create table objects (object_type, object_location,
  check (case object_location when 'out on the wing' then object_type = 
'nothing' else 1 end));

select * from objects where object_location = 'out on the wing';
object_type|object_location
something|out on the wing

Should that return SQLITE_OK, SQLITE_CORRUPT or 
SQLITE_UHH_SOMETHINGS_NOT_QUITE_RIGHT_HERE ?

Or should it have looked at the check constraint and just returned...

object_type|object_location
nothing|out on the wing

...despite it seeing...

object_type|object_location
something|out on the wing

...because it just shouldn't exist.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Rowan Worth
Sent: Thursday, January 12, 2017 10:17 PM
To: General Discussion of SQLite Database
Subject: [sqlite] Null returned from NOT NULL column thanks to undetected 
corruption

Hi guys,

Ran into an interesting situation recently where a database was transferred
via FTP and the client somehow managed to truncate the file. As a result
the last page was only 337 bytes rather than the expected 1024.
Surprisingly running a SELECT on the affected table works without sqlite
returning any error!

However several of the returned rows are completely blank, despite 7 out of
8 columns having a NOT NULL constraint. Anyway we came up with a simple
reproducer:


 sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT
NULL);"
 for N in `seq 1 10`
 do
 sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);"
 done

 dd if=test.db of=cropped.db bs=2020 count=1
 sqlite3 cropped.db "SELECT * FROM test_table;"


Which produces results:

|
|
|
|
0|0
6|6
7|7
8|8
9|9
10|10

test.db is two pages long, so the truncation here is only 28 bytes. I
realise that some types of corruption are too small to notice without some
kind of checksum/error checking code, which feels a bit heavyweight for the
general case. But here it seems like there are some pretty significant red
flags:

1. the database contains an incomplete page
2. a NOT NULL column contains a NULL cell

"PRAGMA integrity_check" does flag problems with cropped.db, but because of
these two conditions I wonder if this is something sqlite could catch in
normal operation and return SQLITE_CORRUPT? Or are there reasons/history
which would render this conclusion inaccurate?

I notice that if I modify the database and another page gets added, sqlite
*does* start returning SQLITE_CORRUPT so I wonder if there's something
special about the last page?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Null returned from NOT NULL column thanks to undetected corruption

2017-01-12 Thread Rowan Worth
Hi guys,

Ran into an interesting situation recently where a database was transferred
via FTP and the client somehow managed to truncate the file. As a result
the last page was only 337 bytes rather than the expected 1024.
Surprisingly running a SELECT on the affected table works without sqlite
returning any error!

However several of the returned rows are completely blank, despite 7 out of
8 columns having a NOT NULL constraint. Anyway we came up with a simple
reproducer:


 sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT
NULL);"
 for N in `seq 1 10`
 do
 sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);"
 done

 dd if=test.db of=cropped.db bs=2020 count=1
 sqlite3 cropped.db "SELECT * FROM test_table;"


Which produces results:

|
|
|
|
0|0
6|6
7|7
8|8
9|9
10|10

test.db is two pages long, so the truncation here is only 28 bytes. I
realise that some types of corruption are too small to notice without some
kind of checksum/error checking code, which feels a bit heavyweight for the
general case. But here it seems like there are some pretty significant red
flags:

1. the database contains an incomplete page
2. a NOT NULL column contains a NULL cell

"PRAGMA integrity_check" does flag problems with cropped.db, but because of
these two conditions I wonder if this is something sqlite could catch in
normal operation and return SQLITE_CORRUPT? Or are there reasons/history
which would render this conclusion inaccurate?

I notice that if I modify the database and another page gets added, sqlite
*does* start returning SQLITE_CORRUPT so I wonder if there's something
special about the last page?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users