Re: [sqlite] Data visibility problem

João Ramos Wed, 25 Jun 2014 02:20:08 -0700

On Mon, Jun 23, 2014 at 3:03 PM, Igor Tandetnik <i...@tandetnik.org> wrote:

> On 6/23/2014 5:43 AM, João Ramos wrote:
>
>> If multiple connections in shared cache mode are treated as a single
>> "real"
>> connection, should I change the thread mode to Serialized?
>>
>
> What mode do you use now?
>
> In any case, I doubt the thread mode has any bearing on the problem. The
> scenario may occur even if no two threads ever run in parallel. It's only
> necessary that SELECTs on different connections to shared cache interleave:
> you call sqlite3_step on connection A, then later call sqlite3_step on
> connection B (while the statement in A has not been reset) - that second
> traversal would see the data as of the start of the first traversal.
>
>
I'm using multi-thread because DB connections aren't used simultaneously in
two or more threads. But since you said that in shared cache mode multiple
connections behave as a "single" connection, I was afraid that this mode
would somehow change the meaning of "simultaneously in two or more threads".

For example, imagine a scenario with two threads, each with its own
connection, both with shared cache and in multi-thread mode. According to
the documentation, I don't need the serialized thread mode because the
connections aren't being used simultaneously. However, internally, they
will simultaneously access the shared cache. I'm sure the shared cache is
protected against this, and I was just making sure that this is the case
(and not an omission in the docs, like for example: "Multi-thread. In this
mode, SQLite can be safely used by multiple threads provided that no single
database connection is used simultaneously in two or more threads *and* one
or more connections aren't in shared cache mode.")

>
>  I tough this
>> "single" connection was in terms of page cache
>>
>
> Well, yes. So you have a version of a page in this cache - that version
> must be as of the time of the oldest read transaction still outstanding. So
> all connections to that shared cache see the same version of that page, and
> cannot observe any changes made to it later.

Ok, I see what you mean.

>
>  and would not have any
>> implications on how the connections are accessed (or not) concurrently,
>> especially transaction control.
>>
>
> Concurrent access and transaction control are unrelated concepts. I'm not
> sure why you are bringing them up together. You can observe transaction
> isolation effects with a single thread interleaving access to two database
> connections. Start traversing a SELECT statement on one connection; between
> two calls to sqlite3_step, modify the data on the other (WAL journal mode
> would let you); continue traversal of the statement - it would not observe
> the changes the same thread has just made.

I never meant to imply that they are related somehow (see the first
response). Only asking if I could be stumbling against an improper use of
thread-mode + shared cache, and that could cause a connection not seeing
committed data.

>
>  Either way, the example you gave does not apply. What was happening with
>> my
>> code was that a transaction A started at T1, inserted a row and was
>> committed successfully at T2. Another transaction B started at T3 and
>> didn't see the new row.
>>
>
> Was there, by any chance, a seemingly unrelated transaction C using the
> same shared cache that started reading at T0, and has not completed by T3?

Yes, this could very possible be the case.

>
>
>  I can't see how the fact that transaction A was in
>> shared cache mode and B in private cache mode
>>
>
> Shouldn't it be the other way round? I thought you said all readers (B
> among them) use shared cache, while a writer (like A) uses a private
> connection.

Your're correct, sorry for my mistake.

>
>
>  could influence the new row
>> not being seen in transaction B, when its transaction started after
>> transaction A commit. Doesn't this break ACID?
>>
>
> ACID provides guarantees on when changes made in one transaction will
> *not* be seen by another. It says nothing about when changes in one
> transaction *will* be seen by another.
>
> Yes, connections sharing a cache are not quite as isolated from each other
> as independent private connections (it's even possible to enable
> read-uncommitted mode, whereby one such connection can see
> not-yet-committed changes made by another). As any other option, shared
> cache brings some benefits and some limitations (if it were all benefits,
> it wouldn't be an option).

Sorry, but I'm going to disagree (in part) with what you're saying. I agree
with what you say concerning "consistency" but the scenario you describe
completely violates "isolation". A transaction creates a well defined area
of isolation that always, and without question, terminates in the commit.
What you're saying is that its impossible to control when the transaction
isolation terminates. If the example above, if the transaction that started
at T0 lasted a few minutes, it would take that long for any other read
connection to see the committed data. In my opinion that's is not proper
isolation (or even consistency).

And I understand the limitations a shared-cache mode may have, but if this
is to be the case, it should be documented. I can be wrong, but I didn't
find anywhere in the docs where this scenario is presented (or has some
sort of warning).

>
> --
> Igor Tandetnik
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Data visibility problem

Reply via email to