Re: [sqlite] Bad db feature request

2016-06-30 Thread Eduardo Morras
On Thu, 30 Jun 2016 10:12:05 +0100
Paul Sanderson  wrote:

> The OP wanted something to check his systems for corruption - if the
> SQLite dev team don't want to add checksums then the OP could possibly
> solve his own problem by adding them to his own internal version.
> 
> Extension may have been a bad choice of word - he can modify/compile
> his own version of SQLite add checksums - and also add a corruption
> engine if he wants.
> Paul

Your right

A simple trigger on insert/update a row to calculate a crc or hash of other 
columns content in the row and update a hidden column with the value can do the 
trick at row level. It will make performance lower, undesired but expected 
effect.

---   ---
Eduardo Morras 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-30 Thread Paul Sanderson
The OP wanted something to check his systems for corruption - if the
SQLite dev team don't want to add checksums then the OP could possibly
solve his own problem by adding them to his own internal version.

Extension may have been a bad choice of word - he can modify/compile
his own version of SQLite add checksums - and also add a corruption
engine if he wants.
Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence


On 29 June 2016 at 22:10, Dominique Devienne  wrote:
> On Wed, Jun 29, 2016 at 9:54 PM, Paul Sanderson <
> sandersonforens...@gmail.com> wrote:
>
>> As mentioned above there is (or can be) reserved space at the end of
>> each page (documented in the DB header) that can be used for checksums
>> - you just need to write your own extension :)
>>
>> https://www.sqlite.org/fileformat2.html
>
>
> Can you really? What kind of extension?
>
> The pager is not an extension point.
>
> The VFL is one, but https://www.sqlite.org/c3ref/io_methods.html is byte
> oriented, not page oriented.
> Plus there are different types of pages, with different format. Does the
> checksum go in the same place for all them?
> Official SQLite would ignore the extra info, and probably either not
> rewrite it, or keep it as-is, when modifying a page,
> making the page appear invalid. This is precisely the kind of things that
> if not built in is kinda useless I'm afraid.
>
> Plus w/o a format change, even if built-in to the latest SQLite, older
> SQLite apps wouldn't keep the checksums up-to-date too.
> This and index-based-row-counts and probably other things do require a
> backward-incompatible change IMHO. --DD
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Dominique Devienne
On Wed, Jun 29, 2016 at 9:54 PM, Paul Sanderson <
sandersonforens...@gmail.com> wrote:

> As mentioned above there is (or can be) reserved space at the end of
> each page (documented in the DB header) that can be used for checksums
> - you just need to write your own extension :)
>
> https://www.sqlite.org/fileformat2.html


Can you really? What kind of extension?

The pager is not an extension point.

The VFL is one, but https://www.sqlite.org/c3ref/io_methods.html is byte
oriented, not page oriented.
Plus there are different types of pages, with different format. Does the
checksum go in the same place for all them?
Official SQLite would ignore the extra info, and probably either not
rewrite it, or keep it as-is, when modifying a page,
making the page appear invalid. This is precisely the kind of things that
if not built in is kinda useless I'm afraid.

Plus w/o a format change, even if built-in to the latest SQLite, older
SQLite apps wouldn't keep the checksums up-to-date too.
This and index-based-row-counts and probably other things do require a
backward-incompatible change IMHO. --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Paul Sanderson
As mentioned above there is (or can be) reserved space at the end of
each page (documented in the DB header) that can be used for checksums
- you just need to write your own extension :)

https://www.sqlite.org/fileformat2.html

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence


On 29 June 2016 at 18:52, Scott Hess  wrote:
> On Wed, Jun 29, 2016 at 10:36 AM, Simon Slavin  wrote:
>> On 29 Jun 2016, at 5:45pm, Drago, William @ CSG - NARDA-MITEQ 
>>  wrote:
>>> Aren't there things like that already built in to the hard disk controllers 
>>> (CRC, Reed Solomon, etc.)?
>>
>> Yes.  But they operate at the level they understand.  For instance ...
>>
>> A change is made in a field which involves changing just one page of data.  
>> In terms of the SQLite file format this would mean that a table page is 
>> overwritten -- a one disk sector change.  If SQLite checksums existed then 
>> this would mean that the checksum, stored in the table pointer page, would 
>> also have to be updated.  Which would mean that another disk sector has to 
>> be changed too.
>>
>> Now suppose there's a big in the storage medium driver which means it 
>> occasionally writes the correct data to the wrong sector on disk.  Without 
>> checksums this fault would not be noticed: since the wrong sector on disk 
>> was updated the wrong checksum on disk would be updated too.  The errors 
>> would match.
>
> I think the bigger problem is that delegating this means that you
> assume the entire underlying stack is working correctly.  For
> instance, the disk may have elaborate error-correction protocols that
> are working correctly per sector, but SQLite's pages may span sectors.
> Or the underlying disk may be perfect and the filesystem doesn't
> provide the same guarantees.  Or someone is running things over NFS.
> Having the page checksum embedded in the page at the SQLite level
> would provide end-to-end confidence.
>
> Chaining the checksums is a whole different level of assurance.  To
> the best of my knowledge _all_ legitimately (1) corrupted databases
> I've seen had pages which were individually valid, but not valid when
> taken together.  Like an index page referred to a row which wasn't
> present in the table page.  This implies that the atomicity guarantees
> SQLite relies on were broken at the filesystem or disk level.
>
> -scott
>
> (1) I consider a system where the filesystem is simply broken to not
> be legitimate corruption.  For instance, if you get a page of random
> which doesn't appear to have every been SQLite data in the first
> place.  There's not much SQLite can do about that kind of thing.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Scott Hess
On Wed, Jun 29, 2016 at 10:36 AM, Simon Slavin  wrote:
> On 29 Jun 2016, at 5:45pm, Drago, William @ CSG - NARDA-MITEQ 
>  wrote:
>> Aren't there things like that already built in to the hard disk controllers 
>> (CRC, Reed Solomon, etc.)?
>
> Yes.  But they operate at the level they understand.  For instance ...
>
> A change is made in a field which involves changing just one page of data.  
> In terms of the SQLite file format this would mean that a table page is 
> overwritten -- a one disk sector change.  If SQLite checksums existed then 
> this would mean that the checksum, stored in the table pointer page, would 
> also have to be updated.  Which would mean that another disk sector has to be 
> changed too.
>
> Now suppose there's a big in the storage medium driver which means it 
> occasionally writes the correct data to the wrong sector on disk.  Without 
> checksums this fault would not be noticed: since the wrong sector on disk was 
> updated the wrong checksum on disk would be updated too.  The errors would 
> match.

I think the bigger problem is that delegating this means that you
assume the entire underlying stack is working correctly.  For
instance, the disk may have elaborate error-correction protocols that
are working correctly per sector, but SQLite's pages may span sectors.
Or the underlying disk may be perfect and the filesystem doesn't
provide the same guarantees.  Or someone is running things over NFS.
Having the page checksum embedded in the page at the SQLite level
would provide end-to-end confidence.

Chaining the checksums is a whole different level of assurance.  To
the best of my knowledge _all_ legitimately (1) corrupted databases
I've seen had pages which were individually valid, but not valid when
taken together.  Like an index page referred to a row which wasn't
present in the table page.  This implies that the atomicity guarantees
SQLite relies on were broken at the filesystem or disk level.

-scott

(1) I consider a system where the filesystem is simply broken to not
be legitimate corruption.  For instance, if you get a page of random
which doesn't appear to have every been SQLite data in the first
place.  There's not much SQLite can do about that kind of thing.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread R Smith



On 2016/06/29 4:28 PM, Simon Slavin wrote:

On 29 Jun 2016, at 10:17am, R Smith  wrote:


I know I can write nonsense over the file, but I was hoping to be able to cause 
specific common corruptions, like Invalid-Schema, Missing Index entries, 
Missing pages etc. with this command. It need not be fancy, just find the first 
table that can be corrupted in the DB, and mess with it enough to fail an 
integrity check, but still work if possible.

Trouble is that whatever checking you develop might spot precisely the fault 
that the simulator introduces but not a slightly different one.  For instance 
if there's something special about the last row in a page it might spot all 
missing rows in the index except if the missing row is the last one on the page.


Of course, but that's an aside wish for me, the main aim here is to 
enable the systems we make to be tested for reporting of and surviving 
of corrupted DBs for different styles of corruption. One that can 
repeatedly be invoked in a test suite even.


Agreed though, this will never be enough for developing recovery 
mechanisms, though they, if working correctly, should at a minimum 
report and deal with these basic corruptions too.



I prefer your idea of opening the file and randomly overwriting a few bytes.  
Of course some of them will just be values in unindexed fields, so there would 
be no damage SQLite could notice.


Yes quite, but I was actually making this exact thing when I realised 
it's quite useless - I'm testing something that will near NEVER happen. 
In real world, it is far more likely to find an Index missing some 
entries or a Schema being unreadable or page-count wrong etc, than 
random data being zapped into your DB file by virtue of Neutrino's from 
Sol (or a stray cataclysm).


Like when car manufacturers test the integrity of their roll-cages, they 
don't drop airplanes on the roof of the car - though it's not 100% 
inconceivable that this might occur in life, but they rather ram it 
head-first into some barrier - exactly the most common type of accident 
out there. This is what I'm trying to achieve with the request.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Simon Slavin

On 29 Jun 2016, at 5:45pm, Drago, William @ CSG - NARDA-MITEQ 
 wrote:

> Aren't there things like that already built in to the hard disk controllers 
> (CRC, Reed Solomon, etc.)?

Yes.  But they operate at the level they understand.  For instance ...

A change is made in a field which involves changing just one page of data.  In 
terms of the SQLite file format this would mean that a table page is 
overwritten -- a one disk sector change.  If SQLite checksums existed then this 
would mean that the checksum, stored in the table pointer page, would also have 
to be updated.  Which would mean that another disk sector has to be changed too.

Now suppose there's a big in the storage medium driver which means it 
occasionally writes the correct data to the wrong sector on disk.  Without 
checksums this fault would not be noticed: since the wrong sector on disk was 
updated the wrong checksum on disk would be updated too.  The errors would 
match.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Scott Hess
On Wed, Jun 29, 2016 at 2:17 AM, R Smith  wrote:
> In response to a recent forum post and many other posts, where SQLite
> corrupt files or Index integrity was the problem at hand, I was wondering if
> we could ask for an API function that would corrupt a DB for us.

I have done some things like this in Chromium's sql/ wrapper.  There
are helpers at:
   https://cs.chromium.org/chromium/src/sql/test/test_helpers.h

sql::test::CorruptSizeInHeader() is a helper function which uses
mostly libc operations to modify the page count in the header, which
results in SQLITE_CORRUPT for most operations.

In the same file, sql::test::CorruptTableOrIndex() is a helper which
snapshots the root page of a table or index, runs a given update
query, then restores the snapshot.  This can be used to build
mis-matched indices or tables.

These have example use in:
   https://cs.chromium.org/chromium/src/sql/connection_unittest.cc
   https://cs.chromium.org/chromium/src/sql/recovery_unittest.cc
Of course, none of this would be directly useful, since it's written
in Chromium C++.  But it would be straightforward to write them using
libc and the SQLite C API, or your appropriate wrapper.

Unfortunately, one thing I found in this is that it's kind of hard to
inject real corruption reliably.  You have to figure out ways to be
wrong enough to be noticeable, and right enough that SQLite can make
progress to see the corruption, and then you have to make sure SQLite
is running through the code paths which will actually expose the
corruption.

It might be cleaner to just have a way to do error injection.  Like
maybe something like the authorizer interface, but where it could
detect the table row/column being touched and introduce an error.  Or
maybe you could write a vtable which proxies a real table and
introduces errors.

-scott
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Roger Binns
On 29/06/16 09:45, Drago, William @ CSG - NARDA-MITEQ wrote:
> Aren't there things like that [checksums] already built in to the hard disk 
> controllers (CRC, Reed Solomon, etc.)?

They are at a different level and can only detect issues in what they
see.  For example SQLite can create a page of data, and then hand it off
to the C library which then hands it off to the kernel which then hands
it off to various filesystem drivers which then hand it off to various
block devices which then hand it off over a bus of some sort to the
storage.  If corruption happens at any point before getting to the
storage then the corrupted version is going to be considered correct by
the storage.

Having checksums at the SQLite level means that SQLite can itself verify
that what it wrote (and went through any number of other layers) is what
it gets back.  Short of extremely robust C libraries, operating systems,
drivers, and hardware, SQLite is the sensible place to add checksums.
The "Lite" bit guarantees that SQLite is not run on robust everything,
but usually on less reliable components.  That is why I am somewhat
disappointed the SQLite team doesn't see value in implementing the request.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Drago, William @ CSG - NARDA-MITEQ
> boun...@mailinglists.sqlite.org] On Behalf Of Dominique Devienne
> Sent: Wednesday, June 29, 2016 10:52 AM
> To: SQLite mailing list
> Subject: Re: [sqlite] Bad db feature request
>
> On Wed, Jun 29, 2016 at 4:28 PM, Simon Slavin <slav...@bigfraud.org>
> wrote:
>
> I wish for the day SQLite has page checksums to detect any such random
> corruption.
> Yes, I know, it's a format change, and will likely slow things down a little, 
> but
> it's worth it IMHO.

Aren't there things like that already built in to the hard disk controllers 
(CRC, Reed Solomon, etc.)?

--
Bill Drago
Staff Engineer
L3 Narda-MITEQ
435 Moreland Road
Hauppauge, NY 11788
631-272-5947 / william.dr...@l-3com.com

CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence of viruses as L-3 does not accept any 
liability associated with the transmission of this e-mail. If you have received 
this communication in error, please notify the sender by reply e-mail and 
immediately delete this message and any attachments.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Roger Binns
On 29/06/16 07:51, Dominique Devienne wrote:
> I wish for the day SQLite has page checksums to detect any such random
> corruption.

Agreed.  The SQLite team rejected doing so:

  http://www.sqlite.org/src/tktview/72b01a982a84f64d4284

> Yes, I know, it's a format change, and will likely slow things down a
> little, but it's worth it IMHO.

Note that it isn't as big a change as you think, and could be done
today.  SQLite already allows a portion of each page to be used for
other purposes, with the big user being encryption.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Dominique Devienne
On Wed, Jun 29, 2016 at 4:28 PM, Simon Slavin  wrote:

> I prefer your idea of opening the file and randomly overwriting a few
> bytes.  Of course some of them will just be values in unindexed fields, so
> there would be no damage SQLite could notice.


I wish for the day SQLite has page checksums to detect any such random
corruption.
Yes, I know, it's a format change, and will likely slow things down a
little, but it's worth it IMHO.
SQLite4, as you used to say Simon... --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Simon Slavin

On 29 Jun 2016, at 10:17am, R Smith  wrote:

> I know I can write nonsense over the file, but I was hoping to be able to 
> cause specific common corruptions, like Invalid-Schema, Missing Index 
> entries, Missing pages etc. with this command. It need not be fancy, just 
> find the first table that can be corrupted in the DB, and mess with it enough 
> to fail an integrity check, but still work if possible.

Trouble is that whatever checking you develop might spot precisely the fault 
that the simulator introduces but not a slightly different one.  For instance 
if there's something special about the last row in a page it might spot all 
missing rows in the index except if the missing row is the last one on the page.

I prefer your idea of opening the file and randomly overwriting a few bytes.  
Of course some of them will just be values in unindexed fields, so there would 
be no damage SQLite could notice.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bad db feature request

2016-06-29 Thread R Smith

Hi SQLite devs,

In response to a recent forum post and many other posts, where SQLite 
corrupt files or Index integrity was the problem at hand, I was 
wondering if we could ask for an API function that would corrupt a DB 
for us.


I would like to use this to test all our systems' ability to catch bad 
DBs or survive them at least. I also maintain one of the SQLite DB 
management tools and would like to test it and make it more apt at 
detecting, handling and perhaps even solving some of the corruptions. (I 
know someone here made some recovery tools already).


A simple:
  int sqlite3_makecorrupt(*sqlite3_db , Int corruptionType)

I know I can write nonsense over the file, but I was hoping to be able 
to cause specific common corruptions, like Invalid-Schema, Missing Index 
entries, Missing pages etc. with this command. It need not be fancy, 
just find the first table that can be corrupted in the DB, and mess with 
it enough to fail an integrity check, but still work if possible. (Or 
whatever check is appropriate to detecting that kind of corruption, if 
any). Also, making it part of the connection API, it means the file I am 
corrupting had to already be opened successfully and so must be a valid 
SQLite DB to start with.


A returned error code might indicate whether successfully corrupted (I 
might ask to corrupt by missing Index entries, and the DB might have no 
usable Indices). Other returned codes might be the usual SQLITE_BUSY, 
SQLITE_NOT_A_DB and the like. (This is the kind of things hard to check 
if I am just messing with the file from outside a connection).


Another possibility is to produce a file-copy (perhaps via the backup 
API) of the DB that gets corrupted in stead of the DB itself - this is 
probably safer and lends itself better to testing environments where I 
can manufacture many different corruptions from the same DB file. Maybe 
add a parameter where I can suggest the new filename.


Perhaps force a new file named to clearly indicate it (to avoid 
accidents and people with nefarious intents from messing with working 
DBs easily) such as:

  MyDB.sqlite.corrupted
(from MyDB.sqlite as the original or suggested name).


The one thing I'm not sure about, maybe you or others have thoughts on 
this, is whether the corruption should be random or repeatable.


I'm thinking this must already exist somewhat in the vast SQLite 
test-suite in some form.



Thanks,
Ryan

PS: It would probably be very useful too for embedded/phone type systems 
to test battery-fail or unplugged-cable type corruptions - though these 
are easy to simulate already, but if in the API, it could more easily be 
included in test scripts.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users