Re: [sqlite] Please review this email to sqlite's mailing list

2012-05-16 Thread Adam DeVita
Did you check out
http://www.sqlite.org/inmemorydb.html

Could you use an in-memory db to act as a db for a save point?

When you are ready to commit, do so from one in-memory db, while accumulating
into another in preparation for the next savepoint.
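
For instance, something along these lines (untested sketch; "events" just
stands in for whatever table accumulates your data):

  ATTACH DATABASE ':memory:' AS staging;
  -- empty copy of the main table's columns (constraints/indexes not copied)
  CREATE TABLE staging.events AS SELECT * FROM main.events WHERE 0;

  -- while the main db is busy committing, new rows go into staging.events;
  -- queries that need to see everything can union the two:
  SELECT * FROM main.events UNION ALL SELECT * FROM staging.events;

  -- once the main commit finishes, fold the staged rows back in:
  BEGIN;
  INSERT INTO main.events SELECT * FROM staging.events;
  DELETE FROM staging.events;
  COMMIT;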

Adam

On Tue, May 15, 2012 at 1:55 PM, Charles Samuels char...@cariden.com wrote:


 I'm using sqlite in addition to another database (otherdb) that stores data
 in a specific manner. I'm trying to keep my disk commits atomic. It can take
 several minutes for otherdb to commit, and while it commits it can already
 start accumulating data for a future transaction.

 Some of the data coming into this application also goes into the sqlite
 database, but I'd like to keep what's on the oxide consistent between sqlite
 and otherdb. Let's accept that otherdb

 At some point, we get a checkpoint; at this instant, what is in otherdb and
 what is in sqlite is what we want committed to disk. If either of them fails,
 we can roll both of them back and both databases return to a consistent state
 from a previous checkpoint. The problem is that in the time between
 checkpoint 1 being taken and checkpoint 1 being committed to disk, more data
 is arriving.

 The question here is: where can I put that additional data so that it won't
 be part of checkpoint 1, but is still accessible by sqlite select statements?
 (Accept that otherdb allows asynchronous commits such that I can add more
 data to it that doesn't wind up on disk.)

 There are a few possibilities, each with some serious disadvantages:

 * When otherdb completes its checkpoint, I commit sqlite; until otherdb and
 sqlite finish their commits, any data going into sqlite instead goes into a
 mirror sqlite that I can query in the meantime (but then I have to replay
 *all* of those modifications against the primary sqlite). This can cost huge
 amounts of memory because the sqlite database can get big: 3GiB or more. It's
 also slow, because all of a sudden I have to execute a whole bunch of sqlite
 statements. It's even slower because now every update I do *normally* has to
 be cloned.

 * I could write a virtual filesystem layer for sqlite that somehow
 accumulates changes that I can later merge in with insert statements. It's
 like the previous solution, but with some hand-waving, smoke, and mirrors so
 that I at least don't end up with two complete copies of the database. The
 problem with this one is that I don't know how to do it, and even if I did, I
 wouldn't know how reliable it was.

 * If sqlite had a "commit transaction to savepoint X" feature, then sqlite
 could commit to the oxide everything up to a specific savepoint while keeping
 the savepoints after that one active and uncommitted. The only disadvantage I
 can think of to this is that sqlite has no such feature.

 So how could I do this?


 Charles

-- 
VerifEye Technologies Inc.
905-948-0015x245
151 Whitehall Dr, Unit 2
Markham ON, L3R 9T1
Canada


Re: [sqlite] Please review this email to sqlite's mailing list

2012-05-16 Thread Charles Samuels
On Wednesday, May 16, 2012 10:00:37 AM Adam DeVita wrote:
 Did you check out
 http://www.sqlite.org/inmemorydb.html
 
 Could you use an in-memory db to act as a db for a save point?

Yes, but this is incredibly hard: I have to maintain the schema twice, I can't 
allow the user to enter their own SQL statements, I still have to copy all the 
accumulated stuff over, and schema changes would be very difficult to handle.

Charles


Re: [sqlite] Please review this email to sqlite's mailing list

2012-05-16 Thread Igor Tandetnik

On 5/15/2012 1:55 PM, Charles Samuels wrote:

The question here is: where can I put that additional data so that it won't be
part of checkpoint 1, but is still accessible by sqlite select statements?
(Accept that otherdb allows asynchronous commits such that I can add more data
to it that doesn't wind up on disk.)


Well, SQLite doesn't. Basically, you want SQLite to maintain multiple 
not-yet-committed versions of the same record. I don't believe there's 
any way to get it to do that automatically - you'll have to come up with 
an explicit versioning scheme (like your idea of creating a temporary 
shadow database).
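
As a very rough sketch of the sort of thing I mean (hypothetical schema,
untested; "events" and the checkpoint-number bookkeeping are made up):

  CREATE TABLE meta (key TEXT PRIMARY KEY, value INTEGER);
  -- every row is tagged with the checkpoint it belongs to
  CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT,
                       checkpoint INTEGER NOT NULL);

  -- readers only see rows from checkpoints that are already durable
  CREATE VIEW committed_events AS
    SELECT id, payload FROM events
    WHERE checkpoint <= (SELECT value FROM meta
                         WHERE key = 'last_durable_checkpoint');

  -- once the other database reports its fsync finished, bump the marker
  UPDATE meta SET value = value + 1 WHERE key = 'last_durable_checkpoint';

New data keeps arriving tagged with the next checkpoint number, and only
becomes visible through the view once you advance the marker.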



* If sqlite had a "commit transaction to savepoint X" feature, then sqlite
could commit to the oxide everything up to a specific savepoint while keeping
the savepoints after that one active and uncommitted. The only disadvantage I
can think of to this is that sqlite has no such feature.


Oh yes, I thought of something... but unfortunately, it rather involved 
being on the other side of this airtight hatchway. -- The Hitchhiker's 
Guide to the Galaxy.

--
Igor Tandetnik



Re: [sqlite] Please review this email to sqlite's mailing list

2012-05-16 Thread Charles Samuels
On Wednesday, May 16, 2012 11:33:02 AM Igor Tandetnik wrote:
 Well, SQLite doesn't. Basically, you want SQLite to maintain multiple 
 not-yet-committed versions of the same record.

Well, yes, but it already does; you can roll back to a savepoint.

If I could roll back to a savepoint and then un-roll it back after doing a
commit, that'd be the equivalent, if I could trick the pager module.

 I don't believe there's 
 any way to get it to do that automatically - you'll have to come up with 
 an explicit versioning scheme (like your idea of creating a temporary 
 shadow database).

I'm trying that now; it involves temporary tables and views, but the
bookkeeping for it is looking increasingly difficult and it is also going to
make my own code unmaintainable.
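
Roughly, for each table it comes out looking something like this (simplified;
"events" is just an example table name):

  -- rows arriving while otherdb is still fsync'ing land in a TEMP table
  CREATE TEMP TABLE pending_events AS SELECT * FROM events WHERE 0;

  -- selects go through a view that sees committed plus pending rows
  CREATE TEMP VIEW all_events AS
    SELECT * FROM events
    UNION ALL
    SELECT * FROM pending_events;

  -- once otherdb's checkpoint is durable, fold the pending rows in
  BEGIN;
  INSERT INTO events SELECT * FROM pending_events;
  DELETE FROM pending_events;
  COMMIT;

and that has to be repeated, and kept in sync, for every table and every query
in the schema, which is where the bookkeeping gets out of hand.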

I'm willing to pay the sqlite team to add this feature to a future version of
sqlite.

Charles

p.s. please excuse my Subject line.


Re: [sqlite] Please review this email to sqlite's mailing list

2012-05-16 Thread Simon Slavin
On Tue, May 15, 2012 at 1:55 PM, Charles Samuels char...@cariden.com wrote:

 At some point, we get a checkpoint; at this instant, what is in otherdb and
 what is in sqlite is what we want committed to disk. If either of them fails,
 we can roll both of them back and both databases return to a consistent state
 from a previous checkpoint. The problem is that in the time between
 checkpoint 1 being taken and checkpoint 1 being committed to disk, more data
 is arriving.

You will find that in almost all situations where either of them fails,
whatever method you're using to protect the integrity of your data will fail
too, because it depends on the same thing.  Not only that, but your method of
assuring synchrony is likely to add lots of complication, which will make the
whole system fail more often than a simple one-SQL-engine implementation
would.  For data integrity of the type you describe, there's not much you can
do short of using duplicate servers talking to RAIDs with redundancy.

On 16 May 2012, at 7:40pm, Charles Samuels char...@cariden.com wrote:

 I'm trying that now; it involves temporary tables and views, but the
 bookkeeping for it is looking increasingly difficult and it is also going to
 make my own code unmaintainable.

You are inventing a versioning system for database rows.  This has been
discussed a few times on this list, but never with any useful outcome in the
form of code.  There's no reason it couldn't be done, but it can turn
complicated fast, especially if you are going to deal correctly with
multi-thread, multi-process or multi-user access.  Two standard approaches are
savepoints and an endless undo mechanism.  Each has its advantages and
disadvantages.  The fact that there is no well-known way to do this should
give you an idea of how difficult or useful it will end up being.

To reduce the complication you have already noted, I recommend you try to 
divorce the versioning system from the code of your project.  Try to write 
yourself a database library which would be useful for many projects, rather 
than have any one routine which includes both knowledge about your project and 
knowledge about how versioning works.

Another way to do it is simply to use ROLLBACK points as defined in the SQL 
standard and implemented by SQLite.  We already know that they work correctly 
and there is a lot of documentation on their use and drawbacks.
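
For reference, the standard savepoint commands look something like this (the
names are arbitrary):

  BEGIN;
  -- work belonging to checkpoint 1
  SAVEPOINT after_checkpoint_1;
  -- work that arrived after checkpoint 1
  ROLLBACK TO after_checkpoint_1;  -- discards the later work; transaction stays open
  COMMIT;                          -- writes checkpoint 1 to disk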

Simon.


Re: [sqlite] Please review this email to sqlite's mailing list

2012-05-16 Thread Charles Samuels
On Wednesday, May 16, 2012 1:28:17 PM Simon Slavin wrote:
 On Tue, May 15, 2012 at 1:55 PM, Charles Samuels char...@cariden.com wrote:
  At some point, we get a checkpoint; at this instant, what is in otherdb and
  what is in sqlite is what we want committed to disk. If either of them
  fails, we can roll both of them back and both databases return to a
  consistent state from a previous checkpoint. The problem is that in the time
  between checkpoint 1 being taken and checkpoint 1 being committed to disk,
  more data is arriving.
 
 You will find that in almost all situations where either of them fails,
 whatever method you're using to protect the integrity of your data will fail
 too, because it depends on the same thing.  Not only that, but your method of
 assuring synchrony is likely to add lots of complication, which will make the
 whole system fail more often than a simple one-SQL-engine implementation
 would.  For data integrity of the type you describe, there's not much you can
 do short of using duplicate servers talking to RAIDs with redundancy.

Sorry, I misspoke here a little bit, and I think it's confusing you as to the
actual problem I'm having. To clarify: after I do a commit to otherdb, I'm
*still* able to atomically roll it back, simply because I keep a history of
its previous versions in the sqlite db. I have it such that if the sqlite
database itself rolls back, then the data in otherdb automatically does as
well (it involves lots of COW pages and such). This is a solved problem.

The real problem is that while I'm waiting for otherdb to synchronize, more
data is still coming in to the system as a whole. Otherdb is able to take that
data and put it in a future version, but sqlite can't. After otherdb
synchronizes (i.e., with fsync), I can commit the sqlite db, and even if
otherdb has had more future-version data added to it, the sqlite database
causes that future data to be safely ignored.

However, while I can safely add to otherdb's future data, there's nowhere to
put the corresponding data in SQLite, as I still haven't committed sqlite's
present version - and I can't do that until otherdb reports that it has
committed its entire present version.

 [...]

 To reduce the complication you have already noted, I recommend you try to
 divorce the versioning system from the code of your project.  Try to write
 yourself a database library which would be useful for many projects,
 rather than have any one routine which includes both knowledge about your
 project and knowledge about how versioning works.
I have already done that; I have SQL-friendly data stored in SQLite and
non-SQL data stored in otherdb. But either I can't accept new data while
waiting for otherdb to do its full fsync (because I have nowhere to put the
sqlite side of things), or I need a way to have sqlite not commit the data
that arrives between the start of otherdb's checkpoint and the point where it
finishes its fsync.

 
 Another way to do it is simply to use ROLLBACK points as defined in the SQL
 standard and implemented by SQLite.  We already know that they work
 correctly and there is a lot of documentation on their use and drawbacks.

If by rollback points you mean savepoints, then they're simply not useful if
they discard the data being rolled back (as opposed to giving it back to me).
To keep that data, I would need a shadow database of some sort, which would of
course require me to have many, many routines that include both knowledge of
my project and knowledge of how versioning works.

Charles


[sqlite] Please review this email to sqlite's mailing list

2012-05-15 Thread Charles Samuels

I'm using sqlite in addition to another database (otherdb) that stores data in
a specific manner. I'm trying to keep my disk commits atomic. It can take
several minutes for otherdb to commit, and while it commits it can already
start accumulating data for a future transaction.

Some of the data coming into this application also goes into the sqlite
database, but I'd like to keep what's on the oxide consistent between sqlite
and otherdb. Let's accept that otherdb

At some point, we get a checkpoint; at this instant, what is in otherdb and
what is in sqlite is what we want committed to disk. If either of them fails,
we can roll both of them back and both databases return to a consistent state
from a previous checkpoint. The problem is that in the time between checkpoint
1 being taken and checkpoint 1 being committed to disk, more data is arriving.

The question here is: where can I put that additional data so that it won't be
part of checkpoint 1, but is still accessible by sqlite select statements?
(Accept that otherdb allows asynchronous commits such that I can add more data
to it that doesn't wind up on disk.)

There are a few possibilities, each with some serious disadvantages:

* When otherdb completes its checkpoint, I commit sqlite; until otherdb and
sqlite finish their commits, any data going into sqlite instead goes into a
mirror sqlite that I can query in the meantime (but then I have to replay
*all* of those modifications against the primary sqlite). This can cost huge
amounts of memory because the sqlite database can get big: 3GiB or more. It's
also slow, because all of a sudden I have to execute a whole bunch of sqlite
statements. It's even slower because now every update I do *normally* has to
be cloned.

* I could write a virtual filesystem layer for sqlite that somehow accumulates
changes that I can later merge in with insert statements. It's like the
previous solution, but with some hand-waving, smoke, and mirrors so that I at
least don't end up with two complete copies of the database. The problem with
this one is that I don't know how to do it, and even if I did, I wouldn't know
how reliable it was.

* If sqlite had a "commit transaction to savepoint X" feature, then sqlite
could commit to the oxide everything up to a specific savepoint while keeping
the savepoints after that one active and uncommitted. The only disadvantage I
can think of to this is that sqlite has no such feature.

So how could I do this?


Charles