Re: [sqlite] advice about schema versioning and WAL

2017-12-29 Thread Gwendal Roué
Thanks Dan for this information!

(For some reason, your reply appears on the mailing list archive, but never 
found the way of my inbox. I'm replying to my own message)

Gwendal

> Dan Kennedy wrote:
> 
> "PRAGMA schema_version" should work the same way in WAL mode. The pragma 
> will read the "database header" from the newest version of page 1 in the 
> wal file if required.
> 
> Dan.
> 
>> Le 28 déc. 2017 à 19:28, Gwendal Roué  a écrit :
>> 
>> Hello,
>> 
>> Season's greetings to all SQLite fellows!
>> 
>> 
>> I'm developping a library that would like to keep a "cache" of some 
>> information about the database schema. Such information are the columns of a 
>> table, its primary key, or its indexes. The purpose of this cache is not 
>> really speed, even if it may help, but mainly to avoid cluttering the 
>> sqlite3_trace/sqlite3_trace_v2 hooks with noisy pragmas whenever the library 
>> needs to infer some implicit information from the actual database schema.
>> 
>> This cache has to be invalidated whenever the schema changes. The 
>> Compile-Time Authorization Callback [1] is the perfect tool for the job, 
>> since it allows to identify statements that create, drop, alter tables and 
>> indexes.
>> 
>> Everything is fine and easy when a single connection is used in a 
>> single-threaded way: statements are executed one after the other, and the 
>> management of the schema cache is trivial.
>> 
>> 
>> It's much less trivial with the WAL mode. I focus on a setup which uses a 
>> unique writer connection, and several reader connections. All connections 
>> are used sequentially in their own thread, but readers and writer can run 
>> concurrently in order to take advantage from the WAL mode.
>> 
>> When a read-only connection uses a deferred transaction to enter snapshot 
>> isolation, it doesn't see the changes performed by other transactions. For 
>> example, if a reader acquires snapshot isolation before a table is altered 
>> by the writer, it won't see the alteration until it commits its deferred 
>> transaction. I wish my schema cache would behave the same.
>> 
>> To be precise, I only have two important needs:
>> 
>> 1. A connection's schema cache is correct, which means that it never 
>> contains information that does not match SQLite's genuine view of the 
>> database schema. Being invalidated/empty is correct, if not efficient (the 
>> missing information is then loaded from SQLite).
>> 2. Synchronization points between readers and writers are avoided 
>> (non-blocking access is the whole point of WAL, and I want to avoid locks as 
>> much as possible)
>> 
>> I was hoping that a connection would have a "schema version": an 
>> automatically incremented value that SQLite bumps whenever the schema is 
>> changed. That would have been enough for my use case. Unfortunately, PRAGMA 
>> schema_version reads the database header, and I thus guess that it does not 
>> play well with WAL (I'm not sure). Furthermore, PRAGMA schema_version 
>> clutters the tracing hook.
>> 
>> The most simple solution I have is to invalidate a reader's schema cache 
>> each time it is used. This would unfortunately invalidate the readers' 
>> caches too often, since most real-life uses only alter the schema at 
>> application start-up, which means that the schema is, practically speaking, 
>> stable after this initialisation phase.
>> 
>> Do any of you have any better idea?
>> 
>> Thanks in advance, regards,
>> Gwendal Roué
>> 
>> [1] https://sqlite.org/c3ref/set_authorizer.html
> 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] advice about schema versioning and WAL

2017-12-28 Thread Dan Kennedy

On 12/29/2017 01:28 AM, Gwendal Roué wrote:

Hello,

Season's greetings to all SQLite fellows!


I'm developping a library that would like to keep a "cache" of some information 
about the database schema. Such information are the columns of a table, its primary key, 
or its indexes. The purpose of this cache is not really speed, even if it may help, but 
mainly to avoid cluttering the sqlite3_trace/sqlite3_trace_v2 hooks with noisy pragmas 
whenever the library needs to infer some implicit information from the actual database 
schema.

This cache has to be invalidated whenever the schema changes. The Compile-Time 
Authorization Callback [1] is the perfect tool for the job, since it allows to 
identify statements that create, drop, alter tables and indexes.

Everything is fine and easy when a single connection is used in a 
single-threaded way: statements are executed one after the other, and the 
management of the schema cache is trivial.


It's much less trivial with the WAL mode. I focus on a setup which uses a 
unique writer connection, and several reader connections. All connections are 
used sequentially in their own thread, but readers and writer can run 
concurrently in order to take advantage from the WAL mode.

When a read-only connection uses a deferred transaction to enter snapshot 
isolation, it doesn't see the changes performed by other transactions. For 
example, if a reader acquires snapshot isolation before a table is altered by 
the writer, it won't see the alteration until it commits its deferred 
transaction. I wish my schema cache would behave the same.

To be precise, I only have two important needs:

1. A connection's schema cache is correct, which means that it never contains 
information that does not match SQLite's genuine view of the database schema. 
Being invalidated/empty is correct, if not efficient (the missing information 
is then loaded from SQLite).
2. Synchronization points between readers and writers are avoided (non-blocking 
access is the whole point of WAL, and I want to avoid locks as much as possible)

I was hoping that a connection would have a "schema version": an automatically 
incremented value that SQLite bumps whenever the schema is changed. That would have been 
enough for my use case. Unfortunately, PRAGMA schema_version reads the database header, 
and I thus guess that it does not play well with WAL (I'm not sure).



"PRAGMA schema_version" should work the same way in WAL mode. The pragma 
will read the "database header" from the newest version of page 1 in the 
wal file if required.


Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] advice about schema versioning and WAL

2017-12-28 Thread Gwendal Roué
Hello,

Season's greetings to all SQLite fellows!


I'm developping a library that would like to keep a "cache" of some information 
about the database schema. Such information are the columns of a table, its 
primary key, or its indexes. The purpose of this cache is not really speed, 
even if it may help, but mainly to avoid cluttering the 
sqlite3_trace/sqlite3_trace_v2 hooks with noisy pragmas whenever the library 
needs to infer some implicit information from the actual database schema.

This cache has to be invalidated whenever the schema changes. The Compile-Time 
Authorization Callback [1] is the perfect tool for the job, since it allows to 
identify statements that create, drop, alter tables and indexes.

Everything is fine and easy when a single connection is used in a 
single-threaded way: statements are executed one after the other, and the 
management of the schema cache is trivial.


It's much less trivial with the WAL mode. I focus on a setup which uses a 
unique writer connection, and several reader connections. All connections are 
used sequentially in their own thread, but readers and writer can run 
concurrently in order to take advantage from the WAL mode.

When a read-only connection uses a deferred transaction to enter snapshot 
isolation, it doesn't see the changes performed by other transactions. For 
example, if a reader acquires snapshot isolation before a table is altered by 
the writer, it won't see the alteration until it commits its deferred 
transaction. I wish my schema cache would behave the same.

To be precise, I only have two important needs:

1. A connection's schema cache is correct, which means that it never contains 
information that does not match SQLite's genuine view of the database schema. 
Being invalidated/empty is correct, if not efficient (the missing information 
is then loaded from SQLite).
2. Synchronization points between readers and writers are avoided (non-blocking 
access is the whole point of WAL, and I want to avoid locks as much as possible)

I was hoping that a connection would have a "schema version": an automatically 
incremented value that SQLite bumps whenever the schema is changed. That would 
have been enough for my use case. Unfortunately, PRAGMA schema_version reads 
the database header, and I thus guess that it does not play well with WAL (I'm 
not sure). Furthermore, PRAGMA schema_version clutters the tracing hook.

The most simple solution I have is to invalidate a reader's schema cache each 
time it is used. This would unfortunately invalidate the readers' caches too 
often, since most real-life uses only alter the schema at application start-up, 
which means that the schema is, practically speaking, stable after this 
initialisation phase.

Do any of you have any better idea?

Thanks in advance, regards,
Gwendal Roué

[1] https://sqlite.org/c3ref/set_authorizer.html
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users