Vlad,
On 06/23/2014 03:11 AM, Vlad Khorsun wrote:
21.06.2014 0:52, Nikolay Samofatov wrote:
Hello, All!
We have encountered subtle errors and data corruptions when using complex
triggers/stored procedures in READ COMMITTED
transactions.
I assume you tell about logical data corruptions, not physical, correct ?
Well, I was talking about logical corruption.
There is also a fix for physical corruption with READ_ONLY + READ COMMITTED transactions which this
patch depends on, but it is unrelated to cursors stability behavior.
AFAIK, concurrent GC in pre-committed transactions without transaction lock was unsafe, and created
permanently broken databases.
I believe corruption has become especially apparent after commit intended to fix CORE-3515. I don't
know who fixed it, but Red Soft builds include a patch for this problem for about a year. I was
surprised to learn last week than vanilla Firebird still has the problem.
Most applied programmers don't think that while their SP/trigger is executing
the world is changing underneath them.
So lost updates and inconsistent data changes are happening during concurrent
operation.
I.e. no real data corruptions, just from application POV ?
Correct. There is no physical database corruption. Only broken invariants
application depends on.
To address this problem, in our engine builds we changed behavior of READ
COMMITTED + REC_VERSION transactions to ensure
cursor stability.
There are two close but very different things about cursors. Let me first
speak about both to avoid
confusing:
a) cursor stability, when cursor could see changes made but cursor itself or by
statements running
at nested level. It have nothing common with transaction isolation modes
and, iirc, standard
describes such behavior as vendor-specific. It is addressed in fb3, btw.
b) cursor sensitivity, when cursor could see changes made by statements running
in separate transactions,
committed while cursor reads data. Of course it is impossible to have
sensitive cursor in snapshot
transaction. IIRC, standard allows both types of cursors (sensitive and
insensitive) and default
behavior is vendor-specific.
I assume you speak about (b), so lets name things correctly, please.
I've read your patch not applying it, so i could miss something - please
correct me, if i write
stupid things below.
Each request is executed in its own snapshot, that is released when execution
of request ends.
Could you explicitly describe behavior of sub-requests, dynamic request in
the same transaction
(EXECUTE STATEMENT) and requests, running in autonomous transactions ?
Subrequests and EXECUTE STATEMENT requests share snapshot with top-level
request.
Autonomous transactions establish their own snapshot.
Sub-transaction logic now applies not only to isc_tpb_concurrency, but also to
isc_tpb_read_committed transactions.
We tried to be extra careful not to hurt performance while establishing these
snapshots.
> We do it much more efficiently than done for isc_tpb_concurrency.
AFAIU, you maintain list of active transactions at top-level request. You
create this list
using new Lock Manager call which iterates thru LCK_tra locks and calls
callback which fills
list (actually BePlusTree) of active transactions.
Later reader consult this list (i.e. search BePlusTree) to check
transaction state.
Yes, the data structure used is a bit more involved, but in general you are
correct.
BePlusTree based SparseBitmap implementation is quite quick and is currently used in the most
performance-sensitive parts of the code (data index retrieval by index, for example).
Is it really faster than copy part of (already cached) TIP contents into
transaction and then
test a bit in a plain vector ? Also, it allows to reuse existing code.
No, this actually would have horrible performance characteristics. Take
BroadView, for example.
They have many millions transactions processed daily, and TIP size of 1500-4000
pages is typical.
Imagine how costly is to copy such a thing?
Actually, my code would probably improve performance of isc_tpb_concurrency transactions if used
there, but I didn't try it.
Next, i see that you trying to adjust transaction lock data to change
garbage collection
threshold. At first look it is correct and highly necessary, but i think it
can't work as
it can't change values already cached by concurrent transactions. I.e. you
can't make OST
less than it was known to someone (except of yourself). Therefore i consider
that new
transaction mode should set its lock data not to the self number but to the OAT
value found
at transaction start (i.e. same as snapshot transaction).
No, I think you should go through the logic again. I might be in error, but I analyzed engine's GC
watermarking algorithms for a few days with pen and paper, so this is unlikely. Consider this logic:
1. There is no need to hold GC watermark (lck_data) past transaction number if there are no request
snapshots.
2. Cached values you are concerned about cannot go down because we started a request snapshot,
because at the time we create a snapshot oldest active transaction is indeed ACTIVE (and thus
holding cached GC watermarks because of its own existence). Possible concurrency during setting up
of a snapshot is handled by the loop in TRA_setup_request_snapshot and is explained in a comment.
I tried to be extra-careful and tested this logic. Please point me to an error.
This mimics Oracle's behavior in that regard but Oracle doesn't always
implement nesting correctly, we do.
Could you explain, please ?
Well, in READ COMMITTED mode Oracle Database always creates snapshot at the
start of each statement.
They call it *statement-level read consistency*//. This is why I named setting
ReadConsistency.
They also say that their cursors are "always stable".
So yes, let's call new behavior "statement-level read consistency" to avoid
confusion.
AFAIK, not all sub-requests for Oracle's request share snapshot with parent
request correctly.
But this is a problem of Oracle, and not ours. It might be fixed already, I
don't know.
I attach patch for this functionality to give you an idea of implementation. It
depends on a couple other changes so it
doesn't apply to FB2.5 cleanly.
Any objections if I start porting this functionality to FB3?
The principal thing there is how new isolation mode is introduced. You
introduced a new
setting in firebird.conf, i.e. ANY read-committed transaction will have new
behavior. I
think it will be better to introduce new option for read-committed mode
(probably make it
default) to allow user to have a choice.
I was contemplating this for a while as well, consulting applied software
developers.
I believe that no sane person really wants inconsistent reads. Most people make implicit or explicit
assumption in their PSQL code that their reads are consistent.
When I ask applied developers how do they feel about inconsistent reads in Firebird, they stare at
me and tell that they didn't know about such buggy behavior.
In my past projects I thought that I can handle inconsistent reads properly using pessimistic locks,
external synchronization, etc. Now I understand that my code was naive and probably unsafe.
So IMO, introducing statement-level read consistency is really a bugfix. This is how READ COMMITTED
should have been implemented from the beginning.
Configuration setting to revert to legacy behavior appears to be appropriate. This is similar to
OldParameterOrdering setting, and other Old* entries.
And if the user is willing to deal with all sorts of instability, he can still use NO_REC_VERSION
mode (which is the default, BTW). :-)
However if you also propose to make REC_VERSION a default - I vote both hands for it, but this is
unrelated to the change under discussion.
Regards,
Vlad
Thank you,
Nikolay Samofatov
------------------------------------------------------------------------------
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel