Re: [HACKERS] SSI and 2PC

2011-01-12 Thread Dan Ports
On Tue, Jan 11, 2011 at 12:34:44PM -0600, Kevin Grittner wrote:
 Agreed; but I am starting to get concerned about whether this
 particular area can be completed by the start of the CF.  I might
 run a few days over on 2PC support.  Unless ... Dan?  Could you look
 into this while I chase down the issue Anssi raised?

I'll take a look at it, but be forewarned that I currently know
extremely little about 2PC in Postgres...

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Jeff Davis
On Mon, 2011-01-10 at 11:50 -0600, Kevin Grittner wrote:
 I'm trying not to panic here, but I haven't looked at 2PC before
 yesterday and am just dipping into the code to support it, and time
 is short.  Can anyone give me a pointer to anything I should read
 before I dig through the 2PC code, which might accelerate this?

I don't see much about 2PC outside of twophase.c.

Regarding the original post, I agree that we should have two-phase
commit support for SSI. We opted not to support it for
notifications, but there was a fairly reasonable argument why users
wouldn't value the combination of 2PC and NOTIFY.

I don't expect this to be a huge roadblock for the feature though. It
seems fairly contained. I haven't read the 2PC code either, but I don't
expect that you'll need to change the rest of your algorithm just to
support it.

Regards,
Jeff Davis




Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Florian Pflug
On Jan10, 2011, at 18:50 , Kevin Grittner wrote:
 I'm trying not to panic here, but I haven't looked at 2PC before
 yesterday and am just dipping into the code to support it, and time
 is short.  Can anyone give me a pointer to anything I should read
 before I dig through the 2PC code, which might accelerate this?


It roughly works as follows:

Upon PREPARE, the locks previously held by the transaction are transferred
to a kind of virtual backend which only consists of a special proc array
entry. The transaction will thus still appear to be running, and will still
be holding its locks, even after the original backend is gone. The information
necessary to reconstruct that proc array entry is also written to the 2PC state,
and used to recreate the virtual backend after a restart or crash.

There are also some additional pieces of transaction state which are stored
in the 2PC state file, like the full list of subtransaction xids (the proc array
entry may not contain all of them if it overflowed).

Upon COMMIT PREPARED, the information in the 2PC state file is used to write
a COMMIT wal record and to update the clog. The transaction is then committed,
and the special proc array entry is removed and all lockmgr locks it held are
released.
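
To make the flow above concrete, here is a minimal Python sketch of it. It is a
toy model, not PostgreSQL code: LockTable, prepare, recover, commit_prepared,
the "dummy:<gid>" naming and the JSON state format are all hypothetical names
invented for this illustration, and the clog is just a dict.

```python
import json


class LockTable:
    """Toy lock manager: maps a lock name to the proc entry holding it."""

    def __init__(self):
        self.holder = {}

    def acquire(self, proc, name):
        self.holder[name] = proc

    def transfer(self, old_proc, new_proc):
        # Reassign existing keys only, so iterating while updating is safe.
        for name, proc in self.holder.items():
            if proc == old_proc:
                self.holder[name] = new_proc

    def release_all(self, proc):
        self.holder = {n: p for n, p in self.holder.items() if p != proc}

    def held_by(self, proc):
        return sorted(n for n, p in self.holder.items() if p == proc)


def prepare(locks, backend, gid, xid, subxids):
    """PREPARE: move locks to a dummy proc entry; return the state to persist."""
    dummy = f"dummy:{gid}"
    locks.transfer(backend, dummy)
    state = {"gid": gid, "xid": xid, "subxids": subxids,
             "locks": locks.held_by(dummy)}
    return json.dumps(state)


def recover(state_json):
    """After a restart: rebuild the dummy entry and its locks from saved state."""
    state = json.loads(state_json)
    locks = LockTable()
    for name in state["locks"]:
        locks.acquire(f"dummy:{state['gid']}", name)
    return locks, state


def commit_prepared(locks, state, clog):
    """COMMIT PREPARED: mark the xids committed, then release the dummy's locks."""
    for xid in [state["xid"], *state["subxids"]]:
        clog[xid] = "committed"
    locks.release_all(f"dummy:{state['gid']}")
```

The point of the sketch is only that the locks outlive the original backend and
can be rebuilt from the persisted state; it ignores WAL entirely.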

For 2PC to work for SSI transactions, I guess you must check for conflicts
during PREPARE - at any later point the COMMIT may only fail transiently,
not permanently. Any transaction that adds a conflict with an already
prepared transaction must check if that conflict completes a dangerous
structure, and abort if this is the case, since the already PREPAREd transaction
can no longer be aborted. COMMIT PREPARED then probably doesn't need to do
anything special for SSI transactions, apart from some cleanup actions maybe.
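
The rule in the paragraph above can be sketched roughly as follows. This is a
simplified illustration under stated assumptions, not the SSI patch's logic: a
"dangerous structure" is approximated as a transaction with both an incoming
and an outgoing rw-conflict, and an actual implementation would likely check
more conditions (commit order, for example). The function names and the edge
representation are hypothetical.

```python
def is_pivot(tx, rw_edges):
    """True if tx has both an incoming and an outgoing rw-conflict edge."""
    has_in = any(dst == tx for _, dst in rw_edges)
    has_out = any(src == tx for src, _ in rw_edges)
    return has_in and has_out


def victim_for_new_conflict(reader, writer, rw_edges, prepared):
    """Return the transaction to abort after adding reader -rw-> writer.

    rw_edges is a set of (reader, writer) pairs already recorded; prepared is
    the set of transactions that have run PREPARE TRANSACTION. Returns None
    when the new edge does not complete a dangerous structure.
    """
    edges = set(rw_edges) | {(reader, writer)}
    if not any(is_pivot(tx, edges) for tx in (reader, writer)):
        return None
    # A prepared transaction can no longer be aborted, so pick an active one.
    for tx in (reader, writer):
        if tx not in prepared:
            return tx
    raise RuntimeError("both transactions prepared; check belongs at PREPARE")
```

For example, with T1 prepared and an existing edge T2 -rw-> T1, adding
T1 -rw-> T2 makes the active transaction T2 the only possible victim.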

Unfortunately, it seems that doing things this way will undermine the guarantee
that retrying a failed SSI transaction won't fail due to the same conflict as
it did originally. Consider:

T1 BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
T1 SELECT * FROM T
T1 UPDATE T ...
T1 PREPARE TRANSACTION

T2 BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
T2 SELECT * FROM T
T2 UPDATE T ...
-> Serialization Error

Retrying T2 won't help as long as T1 isn't COMMITTED.

There doesn't seem to be a way around that, however - any correct implementation
of 2PC for SSI will have to behave that way, I fear :-(

Hope this helps, and best regards,
Florian Pflug




Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 On Jan10, 2011, at 18:50 , Kevin Grittner wrote:
 I'm trying not to panic here, but I haven't looked at 2PC before
 yesterday and am just dipping into the code to support it, and
 time is short.  Can anyone give me a pointer to anything I should
 read before I dig through the 2PC code, which might accelerate
 this?
 
 
 It roughly works as follows
 
 Upon PREPARE, the locks previously held by the transaction are
 transferred to a kind of virtual backend which only consists of a
 special proc array entry. The transaction will thus still appear
 to be running, and will still be holding its locks, even after the
 original backend is gone. The information necessary to reconstruct
 that proc array entry is also written to the 2PC state, and used
 to recreate the virtual backend after a restart or crash.
 
 There are also some additional pieces of transaction state which
 are stored in the 2PC state file like the full list of
 subtransaction xids (The proc array entry may not contain all of
 them if it overflowed). 
 
 Upon COMMIT PREPARED, the information in the 2PC state file is
 used to write a COMMIT wal record and to update the clog. The
 transaction is then committed, and the special proc array entry is
 removed and all lockmgr locks it held are released.
 
 For 2PC to work for SSI transaction, I guess you must check for
 conflicts during PREPARE - at any later point the COMMIT may only
 fail transiently, not permanently. Any transaction that adds a
 conflict with an already prepared transaction must check if that
 conflict completes a dangerous structure, and abort if this is the
 case, since the already PREPAREd transaction can no longer be
 aborted. COMMIT PREPARED then probably doesn't need to do anything
 special for SSI transactions, apart from some cleanup actions
 maybe.
 
Thanks; that all makes sense.  The devil, as they say, is in the
details.  As far as I've worked it out, the PREPARE must persist
both the predicate locks and any conflict pointers which are to
other prepared transactions.  That leaves some fussy work around the
coming and going of prepared transactions, because on recovery you
need to be prepared to ignore conflict pointers with prepared
transactions which committed or rolled back.
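
As a rough illustration of that bookkeeping, here is a minimal Python sketch.
Everything in it is an assumption made for the example: the record layout, the
JSON encoding, and the names serialize_ssi_state and recover_ssi_state are
hypothetical and say nothing about where or how PostgreSQL would persist this.

```python
import json


def serialize_ssi_state(gid, predicate_locks, conflicts_out, conflicts_in,
                        prepared_gids):
    """Build the record to persist at PREPARE time.

    Only conflict pointers to other *prepared* transactions are kept.
    """
    keep = set(prepared_gids)
    return json.dumps({
        "gid": gid,
        "predicate_locks": sorted(predicate_locks),
        "conflicts_out": sorted(c for c in conflicts_out if c in keep),
        "conflicts_in": sorted(c for c in conflicts_in if c in keep),
    })


def recover_ssi_state(record, still_prepared):
    """Reload a record, dropping pointers to transactions since resolved."""
    state = json.loads(record)
    alive = set(still_prepared)
    state["conflicts_out"] = [c for c in state["conflicts_out"] if c in alive]
    state["conflicts_in"] = [c for c in state["conflicts_in"] if c in alive]
    return state
```

The filter in recover_ssi_state is the "fussy" part: a pointer to a prepared
transaction that has committed or rolled back by recovery time is ignored.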
 
What I haven't found yet is the right place and means to persist and
recover this stuff, but that's just a matter of digging through
enough source code.  Any tips regarding that may save time.  I'm
also not clear on what, if anything, needs to be written to WAL. I'm
really fuzzy on that, still.
 
 Unfortunately, it seems that doing things this way will undermine
 the guarantee that retrying a failed SSI transaction won't fail
 due to the same conflict as it did originally.
 
I hadn't thought of that, but you're right.  Of course, I can't
enforce that guarantee, anyway, without some other patch first being
there to allow me to cancel other transactions with
serialization_failure, even if they are idle in transaction.
 
 There doesn't seems a way around that, however - any correct
 implementation of 2PC for SSI will have to behave that way I fear
 :-(
 
I think you're right.
 
 Hope this helps  best regards,
 
It does.  Even the parts which just confirm my tentative conclusions
save me time in not feeling like I need to cross-check so much.  I
can move forward with more confidence.  Thanks.
 
-Kevin



Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Kevin Grittner
Jeff Davis pg...@j-davis.com wrote:
 
 I don't expect this to be a huge roadblock for the feature though.
 It seems fairly contained. I haven't read the 2PC code either, but
 I don't expect that you'll need to change the rest of your
 algorithm just to support it.
 
Agreed; but I am starting to get concerned about whether this
particular area can be completed by the start of the CF.  I might
run a few days over on 2PC support.  Unless ... Dan?  Could you look
into this while I chase down the issue Anssi raised?
 
-Kevin



Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Heikki Linnakangas

On 11.01.2011 20:08, Florian Pflug wrote:

 Unfortunately, it seems that doing things this way will undermine the guarantee
 that retrying a failed SSI transaction won't fail due to the same conflict as
 it did originally. Consider

 T1  BEGIN TRANSACTION ISOLATION SERIALIZABLE
 T1  SELECT * FROM T
 T1  UPDATE T ...
 T1  PREPARE TRANSACTION

 T2  BEGIN TRANSACTION ISOLATION SERIALIZABLE
 T2  SELECT * FROM T
 T2  UPDATE T ...
  -  Serialization Error

 Retrying T2 won't help as long as T1 isn't COMMITTED.


T2 should block until T1 commits. I would be very surprised if it 
doesn't behave like that already. In general, a prepared transaction 
should be treated like an in-progress transaction - it might still abort 
too.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 11.01.2011 20:08, Florian Pflug wrote:
 Unfortunately, it seems that doing things this way will undermine
 the guarantee that retrying a failed SSI transaction won't fail
 due to the same conflict as it did originally. Consider

 T1  BEGIN TRANSACTION ISOLATION SERIALIZABLE
 T1  SELECT * FROM T
 T1  UPDATE T ...
 T1  PREPARE TRANSACTION

 T2  BEGIN TRANSACTION ISOLATION SERIALIZABLE
 T2  SELECT * FROM T
 T2  UPDATE T ...
  -  Serialization Error

 Retrying T2 won't help as long as T1 isn't COMMITTED.
 
 T2 should block until T1 commits. I would be very surprised if it 
 doesn't behave like that already. In general, a prepared
 transaction should be treated like an in-progress transaction - it
 might still abort too.
 
It shouldn't block if the updates were to different rows, which is
what I took Florian to mean; otherwise this would be handled by SI
and would have nothing to do with the SSI patch.  SSI doesn't
introduce any new blocking (with the one exception of the READ ONLY
DEFERRABLE style we invented to support long-running reports and
backups, and all blocking there is at the front -- once it's
running, it's going full speed ahead).
 
-Kevin



Re: [HACKERS] SSI and 2PC

2011-01-11 Thread Florian Pflug
On Jan11, 2011, at 19:41 , Heikki Linnakangas wrote:
 On 11.01.2011 20:08, Florian Pflug wrote:
 Unfortunately, it seems that doing things this way will undermine the
 guarantee that retrying a failed SSI transaction won't fail due to the
 same conflict as it did originally. Consider
 
 T1  BEGIN TRANSACTION ISOLATION SERIALIZABLE
 T1  SELECT * FROM T
 T1  UPDATE T ...
 T1  PREPARE TRANSACTION
 
 T2  BEGIN TRANSACTION ISOLATION SERIALIZABLE
 T2  SELECT * FROM T
 T2  UPDATE T ...
 -  Serialization Error
 
 Retrying T2 won't help as long as T1 isn't COMMITTED.
 
 T2 should block until T1 commits.

The serialization error will occur even if T1 and T2 update *different* rows.
This is due to the SELECTs in the interleaved schedule above returning the
state of T prior to both T1 and T2, which is of course never the case for a
serial schedule.
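
A small Python sketch can check that claim by brute force. It is only a model
of the schedule above, with assumptions of my own: the table T is a two-row
dict, T1 updates row "a" and T2 updates row "b", and each UPDATE is an
increment.

```python
from itertools import permutations

INITIAL = {"a": 0, "b": 0}

# Each transaction reads the whole table, then updates one (different) row.
WRITES = {"T1": "a", "T2": "b"}


def run_serial(order):
    """Run the transactions one after another; return what each SELECT saw."""
    table = dict(INITIAL)
    seen = {}
    for tx in order:
        seen[tx] = dict(table)      # SELECT * FROM T
        table[WRITES[tx]] += 1      # UPDATE T ... (one row)
    return seen


def run_interleaved():
    """Both SELECTs run on snapshots taken before either UPDATE is visible."""
    return {tx: dict(INITIAL) for tx in WRITES}


serial_results = [run_serial(order) for order in permutations(WRITES)]
interleaved = run_interleaved()
matches_some_serial_order = interleaved in serial_results
```

In either serial order the second transaction's SELECT sees the first one's
update, so the interleaved result matches neither order.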

best regards,
Florian Pflug




Re: [HACKERS] SSI and 2PC

2011-01-10 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 In going back through old emails to see what issues might have
 been raised but not yet addressed for the SSI patch, I found the
 subject issue described in a review by Jeff Davis here:
  
 http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
 
After reviewing the docs and testing things, I'm convinced that more
work is needed.  Because the transaction's writes aren't visible
until COMMIT PREPARED is run, and write-write conflicts are still
causing serialization failures after PREPARE TRANSACTION, some of
the work being done for SSI on PREPARE TRANSACTION needs to be moved
to COMMIT PREPARED.
 
It seems likely that shops who use prepared transactions are more
likely than most to care about truly serializable transactions, so I
don't think I should write this off as a limitation for the 9.1
implementation.  Unless someone sees some dire problem with the
patch which I've missed, this seems like my top priority to fix
before cutting a patch.
 
-Kevin



Re: [HACKERS] SSI and 2PC

2011-01-10 Thread David Fetter
On Mon, Jan 10, 2011 at 08:49:12AM -0600, Kevin Grittner wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  
  In going back through old emails to see what issues might have
  been raised but not yet addressed for the SSI patch, I found the
  subject issue described in a review by Jeff Davis here:
   
  http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
  
 After reviewing the docs and testing things, I'm convinced that more
 work is needed.  Because the transaction's writes aren't visible
 until COMMIT PREPARED is run, and write-write conflicts are still
 causing serialization failures after PREPARE TRANSACTION, some of
 the work being done for SSI on PREPARE TRANSACTION needs to be moved
 to COMMIT PREPARED.
  
 It seems likely that shops who use prepared transactions are more
 likely than most to care about truly serializable transactions, so I
 don't think I should write this off as a limitation for the 9.1
 implementation.  Unless someone sees some dire problem with the
 patch which I've missed, this seems like my top priority to fix
 before cutting a patch.

Could people fix it after the patch?  ISTM that a great way to test it
is to make very sure it's available ASAP to a wide range of people via
the next alpha (or beta, if that's where we're going next).

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] SSI and 2PC

2011-01-10 Thread Kevin Grittner
David Fetter da...@fetter.org wrote:
 
 Could people fix it after the patch?  ISTM that a great way to
 test it is to make very sure it's available ASAP to a wide range
 of people via the next alpha (or beta, if that's where we're going
 next).
 
People can always pull from the git repo:
 
git://git.postgresql.org/git/users/kgrittn/postgres.git
 
Also, I can post a patch against HEAD at any time.  Should I post
one now, and then again after this is solved?
 
Full disclosure requires that I mention that while Dan has completed
code to fix the page split/combine issues Heikki raised, I don't
think he's done testing it.  (It's hard to test because you don't
hit the problem unless you have a page split or combine right at the
point where the hash table for predicate locks becomes full.)  So,
anyway, there could possibly be some wet paint there.
 
-Kevin



Re: [HACKERS] SSI and 2PC

2011-01-10 Thread David Fetter
On Mon, Jan 10, 2011 at 08:59:45AM -0600, Kevin Grittner wrote:
 David Fetter da...@fetter.org wrote:
  Could people fix it after the patch?  ISTM that a great way to
  test it is to make very sure it's available ASAP to a wide range
  of people via the next alpha (or beta, if that's where we're going
  next).
  
 People can always pull from the git repo:
  
 git://git.postgresql.org/git/users/kgrittn/postgres.git
  
 Also, I can post a patch against HEAD at any time.  Should I post
 one now, and then again after this is solved?
  
 Full disclosure requires that I mention that while Dan has completed
 code to fix the page split/combine issues Heikki raised, I don't
 think he's done testing it.  (It's hard to test because you don't
 hit the problem unless you have a page split or combine right at the
 point where the hash table for predicate lock becomes full.)  So,
 anyway, there could possibly be some wet paint there.

Short of a test suite that can inject faults at the exact kinds of
places where this occurs and a way to enumerate all those faults,
there's only so much testing that's possible to do /in vitro/.  Oh,
and such enumerations tend to be combinatorial explosions anyhow. :P

At some point, and that point is rapidly approaching if it's not
already here, you've done what you can to shake out bugs and
infelicities, and the next steps are up to people testing alphas,
betas, and to be completely frank, 9.1.0 and possibly later versions.

This is way, way too big a feature to expect you can get a perfect
handle on it by theory alone.

Cheers,
David.



Re: [HACKERS] SSI and 2PC

2011-01-10 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote: 
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  
 In going back through old emails to see what issues might have
 been raised but not yet addressed for the SSI patch, I found the
 subject issue described in a review by Jeff Davis here:
  
 http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
  
 After reviewing the docs and testing things, I'm convinced that
 more work is needed.  Because the transaction's writes aren't
 visible until COMMIT PREPARED is run, and write-write conflicts
 are still causing serialization failures after PREPARE
 TRANSACTION, some of the work being done for SSI on PREPARE
 TRANSACTION needs to be moved to COMMIT PREPARED.
 
I'm now also convinced that Jeff is right in his assessment that
when a transaction is prepared, information about predicate locks
and conflicts with other prepared transactions must be persisted
somewhere.  (Jeff referred to a 2PC state file.)
 
I'm trying not to panic here, but I haven't looked at 2PC before
yesterday and am just dipping into the code to support it, and time
is short.  Can anyone give me a pointer to anything I should read
before I dig through the 2PC code, which might accelerate this?
 
-Kevin



[HACKERS] SSI and 2PC

2011-01-09 Thread Kevin Grittner
In going back through old emails to see what issues might have been
raised but not yet addressed for the SSI patch, I found the subject
issue described in a review by Jeff Davis here:
 
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01159.php
 
I think this is already handled based on feedback from Heikki:
 
http://archives.postgresql.org/pgsql-hackers/2010-09/msg00789.php
 
Basically, the PREPARE TRANSACTION statement plays the same role for
the prepared transaction that COMMIT does for other transactions --
final conflict checking is done and the transaction becomes immune to
later serialization_failure rollback.  A transaction which starts
after PREPARE TRANSACTION executes is not considered to overlap with
the prepared transaction.
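
The overlap rule as described in this message could be sketched like this in
Python. It is an illustration only: the dict fields started_at, prepared_at
and committed_at and both function names are hypothetical, and the times are
plain integers rather than anything PostgreSQL tracks.

```python
def effective_end(tx):
    """Time at which a transaction stops being concurrent under this rule."""
    if tx.get("prepared_at") is not None:
        return tx["prepared_at"]     # PREPARE TRANSACTION acts like COMMIT
    return tx.get("committed_at")    # None while the transaction still runs


def overlaps(a, b):
    """True if the two transactions are considered concurrent."""
    def started_before_end(x, y):
        end = effective_end(y)
        return end is None or x["started_at"] < end

    return started_before_end(a, b) and started_before_end(b, a)
```

Under this rule a transaction that starts after the PREPARE is treated as
non-overlapping, while one that started earlier still overlaps.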
 
In Jeff's example, if Session2 runs a query before Session1
executes PREPARE TRANSACTION, one of them will fail.  (I tested to
make sure.)
 
Does that sound sane, or is something else needed here?
 
-Kevin

