Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption
So I tracked down the commit which introduced the "truncated page is fine" behaviour: $ fossil info 647e3b156e uuid: 647e3b156e32e37debd60b0079fc5a52bdc9b8c8 2009-03-28 06:59:41 UTC parent: 1c6521e53b846eec2e046b1e9c04c60658b8e0e8 2009-03-27 15:26:03 UTC child:c9fa329f54736de517cddaf747595c9eca931f32 2009-03-28 07:03:42 UTC tags: trunk comment: Fix readDbPage() so that if an SQLITE_IOERR_SHORT_READ is encountered, the page content is zeroed. Ticket #3756. (CVS 6395) (user: danielk1977) Based on (CVS 6395) I'm guessing this was pre-fossil. I can't find any reference to ticket #3756 in the current system - is that still around or lost forever? -Rowan On 14 January 2017 at 11:38, Rowan Worthwrote: > On 13 January 2017 at 22:59, David Raymond > wrote: > >> My view is that the general thinking of the program here is simply: "just >> don't make things worse." It can't help what pragmas (ie >> ignore_check_constraints, writable_schema etc) others may have turned on >> for their connections, or what sort of junk was there when it arrived. > > > ignore_check_constraints doesn't seem to affect NOT NULL constraints: > > sqlite> create table a(b INTEGER NOT NULL); > sqlite> insert into a values(NULL); > Error: NOT NULL constraint failed: a.b > sqlite> pragma ignore_check_constraints = 1; > sqlite> insert into a values(NULL); > Error: NOT NULL constraint failed: a.b > > I can't see any pragmas that do? writable_schema is fair, but the > advertised procedure[1] _does_ come with the warning that a mistake using > it will corrupt your database. It's not entirely unreasonable to classify > "using writable_schema to add a NOT NULL constraint to a column without > ensuring all rows are NOT NULL" as a mistake. I also agree that calling the > result corrupted is not an ideal result, but it provides benefits in the > form of sanity checks elsewhere. > > [1] https://www.sqlite.org/lang_altertable.html#otheralter > > I haven't figured out any other way to get a NULL value into a NOT NULL > column. The regular ALTER TABLE command refuses: > > sqlite> alter table a add column c INTEGER NOT NULL; > Error: Cannot add a NOT NULL column with default value NULL > > Nor does sqlite allow them to propagate from tables affected by truncation > as in my original post: > > sqlite> create table table2 (C INTEGER NOT NULL); > sqlite> insert into table2 select a from test_table; > Error: NOT NULL constraint failed: table2.C > > As to the truncated page it would be kind of bad for it to say "hey, >> someone else corrupted 28 bytes, so I won't give you info on the other 99% >> that I can read just fine." All that being said though it might be nice to >> know there was a noticed issue. >> > > I don't buy this one at all. Especially as sqlite fills pages backwards > (starting from the end), so a truncated page almost certainly represents > lost user data. I can't see anything in the hardware assumptions[2] to > suggest a truncated page is ever a valid configuration. > > [2] https://www.sqlite.org/atomiccommit.html#hardware > > It does clarify that sqlite goes to no effort to detect bit errors and the > like, but again I'm not asking for checksums or error correction codes. It > also says "SQLite assumes that the data it reads is exactly the same data > that it previously wrote". I'm pretty sure it never does any write that is > not a multiple of the page size, so a short read due to EOF is a pretty > good indication that this assumption has been violated! > > >> Also, quietly dealing with a different number of fields found in the file >> compared to what it parsed from the schema is how it handles "alter table >> add column" without needing to rewrite the entire table when you do that. >> That might also explain why you can see nulls for fields that got >> truncated. > > > Nope, see above. If you can provide me a legitimate way to violate a NOT > NULL constraint then I'll happily concede that the presence of a NULL cell > cannot be trusted as an indicator of corruption, but so far all my attempts > have failed. And at this stage no I don't consider shooting yourself in the > foot with writable_schema a legitimate way :P > > I've been testing with 3.14.2 - if it was possible/easier in previous > versions to violate NOT NULL then backwards compatibility may also rule > this out. > > -Rowan > > > > sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT >> NULL);" >> for N in `seq 1 10` >> do >> sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);" >> done >> >> dd if=test.db of=cropped.db bs=2020 count=1 >> sqlite3 cropped.db "SELECT * FROM test_table;" >> > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption
On 13 January 2017 at 22:59, David Raymondwrote: > My view is that the general thinking of the program here is simply: "just > don't make things worse." It can't help what pragmas (ie > ignore_check_constraints, writable_schema etc) others may have turned on > for their connections, or what sort of junk was there when it arrived. ignore_check_constraints doesn't seem to affect NOT NULL constraints: sqlite> create table a(b INTEGER NOT NULL); sqlite> insert into a values(NULL); Error: NOT NULL constraint failed: a.b sqlite> pragma ignore_check_constraints = 1; sqlite> insert into a values(NULL); Error: NOT NULL constraint failed: a.b I can't see any pragmas that do? writable_schema is fair, but the advertised procedure[1] _does_ come with the warning that a mistake using it will corrupt your database. It's not entirely unreasonable to classify "using writable_schema to add a NOT NULL constraint to a column without ensuring all rows are NOT NULL" as a mistake. I also agree that calling the result corrupted is not an ideal result, but it provides benefits in the form of sanity checks elsewhere. [1] https://www.sqlite.org/lang_altertable.html#otheralter I haven't figured out any other way to get a NULL value into a NOT NULL column. The regular ALTER TABLE command refuses: sqlite> alter table a add column c INTEGER NOT NULL; Error: Cannot add a NOT NULL column with default value NULL Nor does sqlite allow them to propagate from tables affected by truncation as in my original post: sqlite> create table table2 (C INTEGER NOT NULL); sqlite> insert into table2 select a from test_table; Error: NOT NULL constraint failed: table2.C As to the truncated page it would be kind of bad for it to say "hey, > someone else corrupted 28 bytes, so I won't give you info on the other 99% > that I can read just fine." All that being said though it might be nice to > know there was a noticed issue. > I don't buy this one at all. Especially as sqlite fills pages backwards (starting from the end), so a truncated page almost certainly represents lost user data. I can't see anything in the hardware assumptions[2] to suggest a truncated page is ever a valid configuration. [2] https://www.sqlite.org/atomiccommit.html#hardware It does clarify that sqlite goes to no effort to detect bit errors and the like, but again I'm not asking for checksums or error correction codes. It also says "SQLite assumes that the data it reads is exactly the same data that it previously wrote". I'm pretty sure it never does any write that is not a multiple of the page size, so a short read due to EOF is a pretty good indication that this assumption has been violated! > Also, quietly dealing with a different number of fields found in the file > compared to what it parsed from the schema is how it handles "alter table > add column" without needing to rewrite the entire table when you do that. > That might also explain why you can see nulls for fields that got > truncated. Nope, see above. If you can provide me a legitimate way to violate a NOT NULL constraint then I'll happily concede that the presence of a NULL cell cannot be trusted as an indicator of corruption, but so far all my attempts have failed. And at this stage no I don't consider shooting yourself in the foot with writable_schema a legitimate way :P I've been testing with 3.14.2 - if it was possible/easier in previous versions to violate NOT NULL then backwards compatibility may also rule this out. -Rowan sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT > NULL);" > for N in `seq 1 10` > do > sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);" > done > > dd if=test.db of=cropped.db bs=2020 count=1 > sqlite3 cropped.db "SELECT * FROM test_table;" > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption
On Fri, 13 Jan 2017 11:17:11 +0800 Rowan Worthwrote: > I wonder if this is something sqlite could catch in normal operation > and return SQLITE_CORRUPT? Or are there reasons/history which would > render this conclusion inaccurate? Not without cost. In general, it's difficult for any program to verify the integrity of a file. One way would be to keep a checksum of some kind in the file, and update it whenever the file is closed. That imposes the cost of validating the checksum when the file is opened, and of computing it when it's closed. And the cost of a false positive: perhaps the data are OK, but the checksum was corrupted. A checksum like that couldn't be added to SQLite without changing the file format, which has been fixed for over a decade. The guard against FTP truncation is well known. Transfer the file to a temporary name, and rename it to its proper name only when the transfer is successfully concluded. TCP ensures reliable delivery; only a bug in the FTP program (or underlying software or hardware) could introduce data corruption. --jkl ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Null returned from NOT NULL column thanks to undetected corruption
My view is that the general thinking of the program here is simply: "just don't make things worse." It can't help what pragmas (ie ignore_check_constraints, writable_schema etc) others may have turned on for their connections, or what sort of junk was there when it arrived. In its head it's going to hear its Mom's voice saying "if all the other kids were ignoring not null constraints, would you?" and simply not let you add new errors. As to the truncated page it would be kind of bad for it to say "hey, someone else corrupted 28 bytes, so I won't give you info on the other 99% that I can read just fine." All that being said though it might be nice to know there was a noticed issue. Remember that with the select you're asking it to tell you what "is" in there, not what "should" be in there. Also, quietly dealing with a different number of fields found in the file compared to what it parsed from the schema is how it handles "alter table add column" without needing to rewrite the entire table when you do that. That might also explain why you can see nulls for fields that got truncated. So I suppose what it comes down to then is whether it should return OK, or some other code that says "I did what you told me, but I found a problem that wasn't so bad that it keep me from completing my task" (Apologies for those who won't get the following reference from my psychotic mind) create table objects (object_type, object_location, check (case object_location when 'out on the wing' then object_type = 'nothing' else 1 end)); select * from objects where object_location = 'out on the wing'; object_type|object_location something|out on the wing Should that return SQLITE_OK, SQLITE_CORRUPT or SQLITE_UHH_SOMETHINGS_NOT_QUITE_RIGHT_HERE ? Or should it have looked at the check constraint and just returned... object_type|object_location nothing|out on the wing ...despite it seeing... object_type|object_location something|out on the wing ...because it just shouldn't exist. -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Rowan Worth Sent: Thursday, January 12, 2017 10:17 PM To: General Discussion of SQLite Database Subject: [sqlite] Null returned from NOT NULL column thanks to undetected corruption Hi guys, Ran into an interesting situation recently where a database was transferred via FTP and the client somehow managed to truncate the file. As a result the last page was only 337 bytes rather than the expected 1024. Surprisingly running a SELECT on the affected table works without sqlite returning any error! However several of the returned rows are completely blank, despite 7 out of 8 columns having a NOT NULL constraint. Anyway we came up with a simple reproducer: sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT NULL);" for N in `seq 1 10` do sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);" done dd if=test.db of=cropped.db bs=2020 count=1 sqlite3 cropped.db "SELECT * FROM test_table;" Which produces results: | | | | 0|0 6|6 7|7 8|8 9|9 10|10 test.db is two pages long, so the truncation here is only 28 bytes. I realise that some types of corruption are too small to notice without some kind of checksum/error checking code, which feels a bit heavyweight for the general case. But here it seems like there are some pretty significant red flags: 1. the database contains an incomplete page 2. a NOT NULL column contains a NULL cell "PRAGMA integrity_check" does flag problems with cropped.db, but because of these two conditions I wonder if this is something sqlite could catch in normal operation and return SQLITE_CORRUPT? Or are there reasons/history which would render this conclusion inaccurate? I notice that if I modify the database and another page gets added, sqlite *does* start returning SQLITE_CORRUPT so I wonder if there's something special about the last page? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Null returned from NOT NULL column thanks to undetected corruption
Hi guys, Ran into an interesting situation recently where a database was transferred via FTP and the client somehow managed to truncate the file. As a result the last page was only 337 bytes rather than the expected 1024. Surprisingly running a SELECT on the affected table works without sqlite returning any error! However several of the returned rows are completely blank, despite 7 out of 8 columns having a NOT NULL constraint. Anyway we came up with a simple reproducer: sqlite3 test.db "CREATE TABLE test_table (A INT4 NOT NULL, B INT4 NOT NULL);" for N in `seq 1 10` do sqlite3 test.db "INSERT INTO test_table (A, B) VALUES ($N, $N);" done dd if=test.db of=cropped.db bs=2020 count=1 sqlite3 cropped.db "SELECT * FROM test_table;" Which produces results: | | | | 0|0 6|6 7|7 8|8 9|9 10|10 test.db is two pages long, so the truncation here is only 28 bytes. I realise that some types of corruption are too small to notice without some kind of checksum/error checking code, which feels a bit heavyweight for the general case. But here it seems like there are some pretty significant red flags: 1. the database contains an incomplete page 2. a NOT NULL column contains a NULL cell "PRAGMA integrity_check" does flag problems with cropped.db, but because of these two conditions I wonder if this is something sqlite could catch in normal operation and return SQLITE_CORRUPT? Or are there reasons/history which would render this conclusion inaccurate? I notice that if I modify the database and another page gets added, sqlite *does* start returning SQLITE_CORRUPT so I wonder if there's something special about the last page? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users