Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-09-03 Thread Keith Medcalf

On Tuesday, 3 September, 2019 15:01, Kees Nuyt  wrote:

>On Tue, 3 Sep 2019 18:26:01 +0100, you wrote:

>>> // do SELECT on db1
>>> // do UPDATE on db2

>> Do you expect the SELECT to see the results of the previous
>> UPDATE ?  It won't, until the transaction has ended
>> (unless you arrange this explicitly).

>That's the nice thing about this construct:
>isolation between the SELECT and the UPDATE,
>the pattern is indeed:

>* Iterate over unchanged rows,
>  without seeing DELETEs, UPDATEs and INSERTs
>  on the tables in the select,

>* Be free to act upon the original rows
>  without disturbance of the read.

>I think it only works for WAL journal mode.

It will work just fine in non-WAL mode provided that the number of changed 
pages made by the updates does not exceed the size of the cache (that is, the 
changes on db2 do not have to spill the cache and thus obtain an exclusive 
lock, that it will not be able to obtain, prior to commit time).  You of course 
have to start an ordinary transaction on db1 and an immediate transaction on 
db2 before starting the query/change loop, and commit db1 before committing db2 
after the loop runs out of rows (so that you do not deadlock yourself when the 
commit on db2 attempts to upgrade to an exclusive lock).

In WAL journal mode you do not have to worry any of that at all since an open 
read will not prevent a write on a different connection from spilling the cache 
or committing.

You can do it all on one connection using only a single immediate transaction 
(whether in WAL journal mode or not) if you are sure the query in the outer 
loop cannot be using an index modified by the inner update/delete/insert 
operations.  One way to do this is to require the use of a sorter by that query 
so that all the results must be gathered before the first row is returned (such 
as by using the ORDER BY +colname implementation detail).

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-09-03 Thread Kees Nuyt
On Tue, 3 Sep 2019 18:26:01 +0100, you wrote:

>> // do SELECT on db1
>> // do UPDATE on db2
>
> Do you expect the SELECT to see the results of the previous
> UPDATE ?  It won't, until the transaction has ended
> (unless you arrange this explicitly).

That's the nice thing about this construct: 
isolation between the SELECT and the UPDATE, 
the pattern is indeed:

* Iterate over unchanged rows,
  without seeing DELETEs, UPDATEs and INSERTs 
  on the tables in the select, 

* Be free to act upon the original rows
  without disturbance of the read.

I think it only works for WAL journal mode.

-- 
Regards,
Kees Nuyt

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-09-03 Thread Simon Slavin
On 3 Sep 2019, at 5:34pm, Alexander Vega  wrote:

> sqlite3_open("database1", );
> sqlite3_open("database1", );

Bear in mind that SQLite is not a server/client DBMS.  The database is not kept 
in memory (unless you arrange this explicitly).  All operations have to wait 
for the storage that holds the database.

So yes, you can open two connections.  But it's not going to magically double 
the speed of your program.

> // do SELECT on db1
> // do UPDATE on db2

Do you expect the SELECT to see the results of the previous UPDATE ?  It won't, 
until the transaction has ended (unless you arrange this explicitly).
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-09-03 Thread Alexander Vega
So in the original code if I added a NOT INDEXED it would be valid? Also,
would an ORDER BY Auth_id ASC fix the issue, since I an not adding any new
rows the auth_ids would remain constant?

Wow I did not know that you could call open multiple times on the same
database! So the following is valid and safe (considering POSIX locking
etc.)
sqlite3_open("database1", );
sqlite3_open("database1", );
// loop start
// BEGIN TRANSACTION BOTH dbs
// do SELECT on db1
// do UPDATE on db2
// END TRANSACTION
// loop end
// close db1
// use db2 for rest of the program...

I have not seen this pattern within any sqlite3 code before so I think it
is not well known that this is a design pattern for searching and updating
at the same time.

If I were determined to stay within defined behavior AND to only use one
connection, would the following updated code attain that.

#define SELECT_EXPIRED_IDS \
"SELECT Auth_id FROM AuthTable WHERE Auth_id > ? AND expiration < ? " \
"ORDER BY Auth_id ASC LIMIT 128;"

#define UPDATE_SESID_EXPIRED \
"UPDATE AuthTable SET sesCookie=?, expiration=? WHERE Auth_id=?;"

static void
expire_sesid(void)
{
int auth_ids[128] = {0};
int i=0, j;
/* get raw current time */
current_time = get_current_db_time(false);

/* prepare SQL queries */
sqlite3_prepare_v2(db,   SELECT_EXPIRED_IDS,
-1, _info, NULL);
sqlite3_prepare_v2(db,  UPDATE_SESID_EXPIRED,
-1, _ses_expired, NULL);

do {
  sqlite3_bind_int(expire_info, 1, auth_ids[i]);
  sqlite3_bind_int(expire_info, 2, current_time);
  /* while there is work to be done */
  i=0;
  while (sqlite3_step(expire_info) == SQLITE_ROW) {
auth_ids[i++] = sqlite3_column_int(expire_info, 0);   /* auth_id */
  }
  sqlite3_reset(expire_info);

  for (j=0; j < i; j++) {
/* generate new session id */
generate_ses_id(ses_id);
/* update ses_id and internal expiration to a year ahead, log in
will set it to a month for user log in */
sqlite3_bind_text(update_ses_expired, 1, ses_id,
   16, SQLITE_STATIC);
sqlite3_bind_int64(update_ses_expired, 2, current_time_plus_year);
sqlite3_bind_int(update_ses_expired, 3, auth_ids[j]);
sqlite3_step(update_ses_expired);
sqlite3_reset(update_ses_expired);
  }
} while ( i == 128 );
/* all work has completed */
sqlite3_finalize(expire_info);
sqlite3_finalize(update_ses_expired);
return;
}




On Sun, Sep 1, 2019 at 3:47 PM Keith Medcalf  wrote:

>
> On Sunday, 1 September, 2019 11:12, Alexander Vega 
> wrote:
>
> >Thank you Keith for your answer. It has led me to more questions.
>
> >"though you may or may not have visited all rows"
> >From the documentation I did not get the impression that you would
> >ever not visit ALL ROWS at least once. Is there a technical reason
> >for this? I would assume a full table scan is walking the un-ordered
> >leaf pages of the B*tree?
>
> How do you know that you are doing a table scan?  This certainly cannot be
> assumed.  Perhaps the AuthTable has 57 columns with a total length of
> several hundred bytes per row but there also happens to be an index on a
> subset of the columns that includes the two columns that you have asked
> for.  Perhaps you are "table scanning" that covering index instead (because
> it is cheaper than reading the actual table)?  There are ways to insist on
> a table scan (select ... from table NOT INDEXED ...) for example.  However,
> you left it up to the database engine to choose the most cost effective way
> to answer your select (which is how SQL works ... it is a declarative
> language ... you declare what you want and the database figures out the
> best way to go about giving you what you asked for).
>
> As a result of updating the first such row thus received the index has now
> changed such that the row you are operating on became the last row in the
> index being scanned.  Therefore there is no "next" row.  You will have
> visited only one row, even though there might have been millions of rows in
> the table.
>
> >"Your outer query should probably be "select auth_id, expiration from
> >AuthTable where expiration <= ? order by +auth_id, +expiration" and
> >binding current_time as the parameter since there is no point in
> >retrieving rows that you will not be updating is there?  "
>
> >You are correct that does make sense. I guess I was trying avoid any
> >ambiguities of a WHERE clause on the SELECT because I do not
> >understand its behavior in this circumstance.
>
> If you cannot understand the behaviour with a WHERE clause, then what
> would make you think that one without a WHERE clause would be any more
> transparent, especially given that all Relational Databases are designed to
> provide you the results you asked for as efficiently as possible?  Perhaps
> in a few days you will discover that you need to create another index for
> some other purpose, and that causes SQLite3 to obtain what you said you
> wanted in an entirely different manner.  When you make any change to the
> database do you re-evaluate the implementation 

Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-09-01 Thread Keith Medcalf

On Sunday, 1 September, 2019 11:12, Alexander Vega  wrote:

>Thank you Keith for your answer. It has led me to more questions.

>"though you may or may not have visited all rows"
>From the documentation I did not get the impression that you would
>ever not visit ALL ROWS at least once. Is there a technical reason 
>for this? I would assume a full table scan is walking the un-ordered 
>leaf pages of the B*tree?

How do you know that you are doing a table scan?  This certainly cannot be 
assumed.  Perhaps the AuthTable has 57 columns with a total length of several 
hundred bytes per row but there also happens to be an index on a subset of the 
columns that includes the two columns that you have asked for.  Perhaps you are 
"table scanning" that covering index instead (because it is cheaper than 
reading the actual table)?  There are ways to insist on a table scan (select 
... from table NOT INDEXED ...) for example.  However, you left it up to the 
database engine to choose the most cost effective way to answer your select 
(which is how SQL works ... it is a declarative language ... you declare what 
you want and the database figures out the best way to go about giving you what 
you asked for).

As a result of updating the first such row thus received the index has now 
changed such that the row you are operating on became the last row in the index 
being scanned.  Therefore there is no "next" row.  You will have visited only 
one row, even though there might have been millions of rows in the table.

>"Your outer query should probably be "select auth_id, expiration from
>AuthTable where expiration <= ? order by +auth_id, +expiration" and
>binding current_time as the parameter since there is no point in 
>retrieving rows that you will not be updating is there?  "

>You are correct that does make sense. I guess I was trying avoid any
>ambiguities of a WHERE clause on the SELECT because I do not
>understand its behavior in this circumstance.

If you cannot understand the behaviour with a WHERE clause, then what would 
make you think that one without a WHERE clause would be any more transparent, 
especially given that all Relational Databases are designed to provide you the 
results you asked for as efficiently as possible?  Perhaps in a few days you 
will discover that you need to create another index for some other purpose, and 
that causes SQLite3 to obtain what you said you wanted in an entirely different 
manner.  When you make any change to the database do you re-evaluate the 
implementation details of every previously written SQL statement to see if it 
still compatible with the details you depended on?  What about it you update 
the version of SQLite3?  You should not be dependent on the peculiarities of 
the implementation since they might change at any time.

>You mentioned two database connections to the same database. Is this
>going to work if I am using Threadsafe mode = 0? 

Yes.  Threadsafe mode only affects programs having multiple threads making 
calls into the sqlite3 library.  These are independent variables (that is you 
can have X threads and Y connections, and X is independent of Y) just because 
you have 47 connections does not mean that you have more than 1 thread, nor 
does having 47 threads mean that you have more than 1 connection.  Threads are 
commenced with _beginthread (or equivalent for the OS) calls and connections 
are commenced with sqlite3_open* calls.  The _beginthread operations result in 
the creation of a thread and the sqlite3_open* calls create a database 
connection -- they are not related to each other in any way.  Also consider 
that it is entirely possible for a program to have hundreds of threads yet 
still only be single-threaded as far as sqlite3 is concerned if only one of 
those threads makes use of the sqlite3 library, and that one thread may use 
hundreds of database connections either serially or in parallel or in some 
combination thereof.

>Would the second connection be done through an attach?

No.  The attach statement attaches a database to a connection.  You have to 
have opened the connection first.  Connections are created with the 
sqlite3_open* functions which return a pointer to a database connection.

>Does this conversation change if I wrap the whole select and updates
>in one transaction? e.g. BEGIN...END

No, because isolation is only BETWEEN connections, not WITHIN connections.  And 
the transaction state is per connection.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-09-01 Thread Alexander Vega
Thank you Keith for your answer. It has led me to more questions.

"though you may or may not have visited all rows"
From the documentation I did not get the impression that you would ever not
visit ALL ROWS at least once. Is there a technical reason for this? I would
assume a full table scan is walking the un-ordered leaf pages of the B*
tree?

"Your outer query should probably be "select auth_id, expiration from
AuthTable where expiration <= ? order by +auth_id, +expiration" and binding
current_time as the parameter since there is no point in retrieving rows
that you will not be updating is there?  "
You are correct that does make sense. I guess I was trying avoid any
ambiguities of a WHERE clause on the SELECT because I do not understand its
behavior in this circumstance.

You mentioned two database connections to the same database. Is this going
to work if I am using Threadsafe mode = 0? Would the second connection be
done through an attach?

Does this conversation change if I wrap the whole select and updates in one
transaction? e.g. BEGIN...END

Thanks



On Sun, Sep 1, 2019 at 1:32 AM Keith Medcalf  wrote:

>
> > Having read :  https://www.sqlite.org/isolation.html
> > Specifically the line "And the application can UPDATE the current row
> > or any prior row, though doing so might cause that row to reappear in a
> > subsequent sqlite3_step()."
>
> > Is it possible to create and endless loop
>
> Eventually you will have no more rows to update and therefore the
> underlying structures become stable and the select loop will eventually run
> out of rows, though you may or may not have visited all rows, and may visit
> some rows two or more times (once before update and more than once after).
>
> If you change the outer query to "select auth_id, expiration from
> AuthTable order by +auth_id, +expiration" then you will PROBABLY never have
> a problem since the results will LIKELY be from a sorter and not from the
> underlying table, and therefore mutation of the underlying tables and
> indexes will not interfere with the result of the outer select, even if
> those mutations affect the AuthTable or the indexes on it.  Some SQL
> varients use a FOR UPDATE clause on a SELECT to tell the query planner that
> you intend to dally-about with the underlying datastore without having the
> proper isolation in place.  The SQLite way of doing this is by requesting a
> row sorter not dependent on indexes by using the +columnname syntax in an
> order by on the select.
>
> Your outer query should probably be "select auth_id, expiration from
> AuthTable where expiration <= ? order by +auth_id, +expiration" and binding
> current_time as the parameter since there is no point in retrieving rows
> that you will not be updating is there?
>
> The correct solution is, of course, to use separate connections so that
> you have isolation between the select and the updates.
>
> You SHOULD be executing the outer select on one connection and the updates
> on another connection.  This will work for journal mode delete unless the
> number of changed pages is too large to fit in sqlite's cache, in which
> case you may get an error from the update statement when it needs to spill
> the cache, and you will need to kaibosh the whole thing and do the updates
> in smaller chunks by putting a limit on the outer select and looping the
> whole thing until there are no more rows to process.  (or increase the
> cache_size to be sufficient).
>
> You can avoid that particular problem by having the database in
> journal_mode=WAL in which case you can even process each update in its own
> transaction if you wish (get rid of the db2.beginimmediate() and
> db2.commit(), though then you will have to handle the eventuality of
> getting errors on the UPDATE).
>
> db1 = Connection('database.db')
> db1.executescript('pragma journal_mode=WAL;')
> db2 = Connection('database.db')
> current_time = datetime.now()
> current_time_plus_one_year = current_time.add(years=1)
> sess_id = ... some constant ...
> db2.beginimmediate()
> for row in db1.execute('select auth_id, expiration from authtable where
> expiration <= ?;',
>(current_time,)):
> db2.execute('update authtable set sesCookie = ?, expiration = ? where
> auth_id = ?;',
> (generate_ses_id(sess_id), current_time_plus_one_year,
> row.auth_id,))
> db2.commit()
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-08-31 Thread Keith Medcalf

> Having read :  https://www.sqlite.org/isolation.html
> Specifically the line "And the application can UPDATE the current row
> or any prior row, though doing so might cause that row to reappear in a
> subsequent sqlite3_step()."

> Is it possible to create and endless loop

Eventually you will have no more rows to update and therefore the underlying 
structures become stable and the select loop will eventually run out of rows, 
though you may or may not have visited all rows, and may visit some rows two or 
more times (once before update and more than once after).

If you change the outer query to "select auth_id, expiration from AuthTable 
order by +auth_id, +expiration" then you will PROBABLY never have a problem 
since the results will LIKELY be from a sorter and not from the underlying 
table, and therefore mutation of the underlying tables and indexes will not 
interfere with the result of the outer select, even if those mutations affect 
the AuthTable or the indexes on it.  Some SQL varients use a FOR UPDATE clause 
on a SELECT to tell the query planner that you intend to dally-about with the 
underlying datastore without having the proper isolation in place.  The SQLite 
way of doing this is by requesting a row sorter not dependent on indexes by 
using the +columnname syntax in an order by on the select.

Your outer query should probably be "select auth_id, expiration from AuthTable 
where expiration <= ? order by +auth_id, +expiration" and binding current_time 
as the parameter since there is no point in retrieving rows that you will not 
be updating is there?

The correct solution is, of course, to use separate connections so that you 
have isolation between the select and the updates.  

You SHOULD be executing the outer select on one connection and the updates on 
another connection.  This will work for journal mode delete unless the number 
of changed pages is too large to fit in sqlite's cache, in which case you may 
get an error from the update statement when it needs to spill the cache, and 
you will need to kaibosh the whole thing and do the updates in smaller chunks 
by putting a limit on the outer select and looping the whole thing until there 
are no more rows to process.  (or increase the cache_size to be sufficient).

You can avoid that particular problem by having the database in 
journal_mode=WAL in which case you can even process each update in its own 
transaction if you wish (get rid of the db2.beginimmediate() and db2.commit(), 
though then you will have to handle the eventuality of getting errors on the 
UPDATE).

db1 = Connection('database.db')
db1.executescript('pragma journal_mode=WAL;')
db2 = Connection('database.db')
current_time = datetime.now()
current_time_plus_one_year = current_time.add(years=1)
sess_id = ... some constant ...
db2.beginimmediate()
for row in db1.execute('select auth_id, expiration from authtable where 
expiration <= ?;', 
   (current_time,)):
db2.execute('update authtable set sesCookie = ?, expiration = ? where 
auth_id = ?;', 
(generate_ses_id(sess_id), current_time_plus_one_year, 
row.auth_id,))
db2.commit()

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Endless loop possible with simultaneous SELECT and UPDATE?

2019-08-31 Thread Alexander Vega
 Having read :  https://www.sqlite.org/isolation.html
Specifically the line "And the application can UPDATE the current row or
any prior row, though doing so might cause that row to reappear in a
subsequent sqlite3_step()."

Is it possible to create and endless loop with the following (pseudo)code?

#define SELECT_EXPIRE_INFO \
"SELECT Auth_id, expiration FROM AuthTable;"

#define UPDATE_SESID_EXPIRED \
"UPDATE AuthTable SET sesCookie=?, expiration=? WHERE Auth_id=?;"

static void
expire_sesid(void)
{

/* get raw current time */
current_time = get_current_db_time(false);

/* prepare SQL queries */
sqlite3_prepare_v2(db,  SELECT_EXPIRE_INFO,
-1, _info, NULL);
sqlite3_prepare_v2(db,  UPDATE_SESID_EXPIRED,
-1, _ses_expired, NULL);

/* while there is work to be done */
while (sqlite3_step(expire_info) == SQLITE_ROW) {
auth_id = sqlite3_column_int(expire_info, 0);   /* auth_id */
expiration_time = sqlite3_column_int64(expire_info, 1);   /* expiration
*/

/* if the session is expired, today is greater than expiration date */
if ( current_time >  expiration_time ) {
  /* generate new invalid session id */
  generate_ses_id(ses_id);
  /* invalidate ses_id and set internal expiration to a year ahead, log
in
  will set it to a month for user log in */
  sqlite3_bind_text(update_ses_expired, 1, ses_id,
 16, SQLITE_STATIC);
  sqlite3_bind_int64(update_ses_expired, 2, current_time_plus_year);
  sqlite3_bind_int(update_ses_expired, 3, auth_id);
  sqlite3_step(update_ses_expired);
  sqlite3_reset(update_ses_expired);
  }
  }
/* all work has completed */
sqlite3_finalize(expire_info);
sqlite3_finalize(update_ses_expired);
return;
}

I appreciate everyone's time,
-Alex V
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users