Re: [HACKERS] 2-phase commit

2003-10-25 Thread Rob Butler
Of course I have no time to work on it : (, but in my opinion XA interface
and support for the JDBC driver is absolutely necessary.  I think that 2pc
will generally be used more for supporting 2pc transactions between the DB
and JMS than it would be for 2pc across 2 db's.

Glad to see some progress on 2PC with Postgres though.

Later
Rob


> The next step is going to be writing 2PC support to the JDBC driver using
> the new backend commands. XA interface would be very nice too, but I'm
> personally not that interested in that. Any volunteers?
>
> Please comment! I'd like to know what you guys think about this. Am I
> heading in the right direction?



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] 2-phase commit

2003-10-24 Thread Heikki Linnakangas
On Fri, 10 Oct 2003, Heikki Linnakangas wrote:

> On Thu, 9 Oct 2003, Bruce Momjian wrote:
>
> > Agreed.  Let's get it into 7.5 and see it in action.  If we need to
> > adjust it, we can, but right now, we need something for distributed
> > transactions, and this seems like the logical direction.
>
> I've started working on two-phase commits last week, and the very
> basic stuff is now working. Still a lot of bugs though.

I have done more work on my 2PC patch. I still need to work out
notifications and CREATE statements, but otherwise I'm quite happy with it
now. I received no feedback on the first version, so I'll try to clarify
how it works a bit.

The patch is against the current cvs tip. I'll post it to the
patches-list, and you can also grab it from here:
http://www.hut.fi/~hlinnaka/twophase2.diff

The patch introduces three new commands, PREPCOMMIT, COMMITPREPARED and
ABORTPREPARED.

PREPCOMMIT is called in place of COMMIT, to put the active transaction
block into prepared state. PREPCOMMIT takes a string argument that
becomes the Global Transaction Identifier (GID) for the transaction. The
GID is used as a handle to COMMITPREPARED/ABORTPREPARED commands to finish
the 2nd phase commit. After the PREPCOMMIT command finishes, the
transaction is no longer associated with any specific backend.

COMMITPREPARED/ABORTPREPARED commands are used to finish the prepared
transaction. They can be issued from any backend.

There's also a new system view, pg_prepared_xacts, that shows all prepared
transactions.
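
[Editor's sketch] The command semantics described above can be modelled in a few lines of Python. This is only a toy under stated assumptions, not code from the patch: the class ToyServer, its method names, and the idea of representing a transaction as a dict of pending writes are all made up for illustration. It shows PREPCOMMIT detaching the work from the backend under a GID, COMMITPREPARED/ABORTPREPARED finishing it by GID, and a listing that stands in for pg_prepared_xacts.

```python
class PreparedXactError(Exception):
    """Raised for an unknown or duplicate GID."""


class ToyServer:
    """Toy model of the behaviour described above; not PostgreSQL code."""

    def __init__(self):
        self.committed = {}  # rows visible to every backend
        self.prepared = {}   # GID -> pending writes, owned by no backend

    def prepcommit(self, pending_writes, gid):
        # Phase 1: park the backend's pending writes under the GID.
        if gid in self.prepared:
            raise PreparedXactError(f"GID {gid!r} is already in use")
        self.prepared[gid] = dict(pending_writes)
        pending_writes.clear()  # the backend no longer owns the transaction

    def commitprepared(self, gid):
        # Phase 2, commit: callable from any backend that knows the GID.
        self.committed.update(self._take(gid))

    def abortprepared(self, gid):
        # Phase 2, abort: throw the parked writes away.
        self._take(gid)

    def prepared_xacts(self):
        # Rough analogue of SELECT * FROM pg_prepared_xacts.
        return sorted(self.prepared)

    def _take(self, gid):
        if gid not in self.prepared:
            raise PreparedXactError(f"no prepared transaction {gid!r}")
        return self.prepared.pop(gid)
```

In this model, calling prepcommit({"a": 2}, "foobar_update1") leaves the committed data untouched until some caller, not necessarily the original one, issues commitprepared("foobar_update1").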

Here's a little step-by-step tutorial for trying out the patch:

1. apply patch, patch -p0 < twophase2.diff
2. compile
3. create a new database system with initdb.
4. run postmaster
5. psql template1
6. CREATE TABLE foobar (a integer);
7. INSERT INTO foobar values (1);

8. BEGIN; UPDATE foobar SET a = 2 WHERE a = 1;
9. SELECT * FROM foobar;
10. PREPCOMMIT 'foobar_update1';

The transaction is now in prepared state, and it's no longer associated
with this backend, as you can see by issuing:

11. SELECT * FROM foobar;
12. SELECT * FROM pg_prepared_xacts;

Let's commit it then.

13. COMMITPREPARED 'foobar_update1';
14. SELECT * FROM pg_prepared_xacts;
15. SELECT * FROM foobar;

Next repeat steps 8-15 but try killing postmaster somewhere after step 9,
and observe that the transaction is not lost. Also try doing another
update with a different backend, and see that the locks held by the
prepared transaction survive the crash.
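
[Editor's sketch] The crash test above relies on the prepared state, including its locks, being durable. A minimal way to picture that, assuming a JSON file as a stand-in for whatever on-disk structure the patch really uses (the class name, file format and lock naming are hypothetical):

```python
import json
import os


class DurablePreparedStore:
    """Keeps prepared transactions in a file so they outlive the process."""

    def __init__(self, path):
        self.path = path
        self.prepared = {}
        if os.path.exists(path):
            # "Recovery": reload whatever was prepared before the crash.
            with open(path) as f:
                self.prepared = json.load(f)

    def prepare(self, gid, writes, locked_rows):
        self.prepared[gid] = {"writes": writes, "locks": sorted(locked_rows)}
        self._flush()

    def finish(self, gid):
        # Second phase (commit or abort): forget the entry and its locks.
        entry = self.prepared.pop(gid)
        self._flush()
        return entry

    def locked_rows(self):
        # Locks held by prepared transactions, including reloaded ones.
        return {row for e in self.prepared.values() for row in e["locks"]}

    def _flush(self):
        # Write a temporary file, force it to disk, then rename it into
        # place so a crash never leaves a half-written state file.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.prepared, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)
```

Creating a second DurablePreparedStore on the same path plays the role of restarting postmaster: the prepared entry and its locks are still there.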


I also took a look at Satoshi's patches. The main difference is that
his implementation made modifications to the BE/FE protocol, while my
implementation works at the statement level. His patches don't handle
shutdowns or broken connections yet, but that was on his TODO list.

When I started working on 2PC, I didn't know about Satoshi's patches,
otherwise I probably would have taken them as a starting point.

The next step is going to be writing 2PC support to the JDBC driver using
the new backend commands. XA interface would be very nice too, but I'm
personally not that interested in that. Any volunteers?

Please comment! I'd like to know what you guys think about this. Am I
heading in the right direction?

Some people have expressed concerns about performance issues with 2PC in
general. Please note that this patch doesn't change the traditional
commit routines, so it won't affect your performance if you don't use 2PC.

- Heikki




Re: [HACKERS] 2-phase commit

2003-10-23 Thread Bruce Momjian

Satoshi, can you get this ready for inclusion in 7.5?  We need a formal
proposal of how it will work from the user's perspective (new
commands?), and how it will work internally.  It seems Heikki Linnakangas
has also started working on this and perhaps he can help.

Ideally, we should have this proposal when we start 7.5 development in a
few weeks.

I know some people have concerns about 2-phase commit, from a
performance perspective and from a network failure perspective, but I
think there are enough people who want it that we should see how this
can be implemented with the proper safeguards.


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-10-23 Thread Satoshi Nagayasu
Bruce,

Ok, I will write my proposal.

BTW, my 2PC work is now suspended because of my master's thesis.
My master's thesis will (must) be finished in the next few months.

To finish the 2PC work, I feel 2 or 3 months are needed after that.


-- 
NAGAYASU Satoshi [EMAIL PROTECTED]




Re: [HACKERS] 2-phase commit

2003-10-23 Thread Bruce Momjian
Satoshi Nagayasu wrote:
> Bruce,
>
> Ok, I will write my proposal.
>
> BTW, my 2PC work is now suspended because of my master's thesis.
> My master's thesis will (must) be finished in the next few months.
>
> To finish the 2PC work, I feel 2 or 3 months are needed after that.

Oh, OK, that is helpful.  Perhaps Heikki Linnakangas could help too.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-10-14 Thread Hans-Jürgen Schönig
I'm tired of this kind of "2PC is too slow" arguments. I think
Satoshi, the only guy who made a trial implementation of 2PC for
PostgreSQL, has already showed that 2PC is not that slow.


Where does Satoshi's implementation sit right now?  Will it patch to v7.4?
Can it provide us with a base to work from, or is it complete?


It is not ready yet.
You can find it at ...
http://snaga.org/pgsql/

It is based on 7.3

* the 2-phase commit protocol (precommit and commit)
* the multi-master replication using 2PC
* distributed transaction (distributed query)

current work

* restarting (from 2nd phase) when the session is disconnected in
  2nd phase (XLOG stuffs)
* XA compliance

future work

* hot failover and recovery in PostgreSQL cluster
* data partitioning on different servers

I have compiled it a while ago.
Seems to be pretty nice :).
	Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706 or +43/660/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at




Re: [HACKERS] 2-phase commit

2003-10-14 Thread Heikki Linnakangas
On Thu, 9 Oct 2003, Bruce Momjian wrote:

 Agreed.  Let's get it into 7.5 and see it in action.  If we need to
 adjust it, we can, but right now, we need something for distributed
 transactions, and this seems like the logical direction.

I've started working on two-phase commits last week, and the very
basic stuff is now working. Still a lot of bugs though.

I posted the stuff I've put together to patches-list. I'd appreciate any
comments.

- Heikki




Re: [HACKERS] 2-phase commit

2003-10-14 Thread Hans-Jürgen Schönig
> > Why would you spend time on implementing a mechanism whose ultimate
> > benefit is supposed to be increasing reliability and performance, when you
> > already realize that it will have to lock up at the slightest sign of
> > trouble?  There are better mechanisms out there that you can use instead.
>
> If you want cross-server transactions, what other methods are there that
> are more reliable?  It seems network unreliability is going to be a
> problem no matter what method you use.


I guess we need something like PITR to make this work, because otherwise
I cannot see a way to get in sync again.
Maybe I should call the desired mechanism "entire cluster back to
transaction X" recovery.
Did anybody hear about PITR recently?

How else would you recover from any kind of problem?
No matter what you are doing, network reliability will be a problem, so we
have to live with it.
Having some way of going back to something consistent is necessary anyway.
People might argue now that committed transactions might be lost. If
people knew which ones, it's OK. 90% of all people will understand that
in case of a crash something evil might happen.

	Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706 or +43/660/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at




Re: [HACKERS] 2-phase commit

2003-10-13 Thread Dann Corbit
> -----Original Message-----
> From: Jeroen T. Vermeulen [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, October 11, 2003 5:36 AM
> To: Dann Corbit
> Cc: Christopher Browne; [EMAIL PROTECTED]
> Subject: Re: [HACKERS] 2-phase commit
>
> On Fri, Oct 10, 2003 at 09:37:53PM -0700, Dann Corbit wrote:
> > Why not apply the effort to something already done and compatibly
> > licensed?
> >
> > This:
> > http://dog.intalio.com/ots.html
> >
> > Appears to be a Berkeley style licensed:
> > http://dog.intalio.com/license.html
> >
> > Transaction monitor.
>
> I'd say this is complementary, not an alternative to 2PC
> implementation issues.

My notion is that the specification has been created that describes how
the system should operate, what the APIs are, etc.  I think that most
of the work is involved in that area.  The notion is that if you program
to this spec, it will already have been well thought out and it should
be standards based when completed.

> The transaction monitor lives on the other side of the
> problem.  2PC is needed in the database _so that_ the
> transaction monitor can do its job.

Theoretically, if any database in the chain supports 2PC, you could make
all connected systems 2PC compliant by using the one functional system
as a persistent store.  But you are right.  PostgreSQL still would need
the "I promise to commit when you ask" method if it is to really support
it.

I think another way it could be handled is with nested transactions.
Just have the promise phase be an inner transaction commit but have an
outer transaction bracket that one for the actual commit.

> That said, having a 3-tier model is probably a good idea if
> distributed transaction management is what we want.  :-)

In real life, I think it is _always_ done this way.



Re: [HACKERS] 2-phase commit

2003-10-13 Thread Rod Taylor
> I think another way it could be handled is with nested transactions.
> Just have the promise phase be an inner transaction commit but have an
> outer transaction bracket that one for the actual commit.

Not really. In the event of a crash, most 2PC systems will expect the
participant to come back in the same state it crashed in.

Our nested-transaction implementation (like our standard transaction
implementation) aborts all transactions on crash.




Re: [HACKERS] 2-phase commit

2003-10-13 Thread Jordan Henderson
On Monday 13 October 2003 20:11, Rod Taylor wrote:
> > I think another way it could be handled is with nested transactions.
> > Just have the promise phase be an inner transaction commit but have an
> > outer transaction bracket that one for the actual commit.
>
> Not really. In the event of a crash, most 2PC systems will expect the
> participant to come back in the same state it crashed in.


Yes, this is correct.  There are certain phases of the protocol in which the 
transaction state must be re-instated from the log file after a crash of the 
DB server.  The re-instatement must occur prior to any connections being 
accepted by the server.  Additionally, the coordinator must be fully 
recoverable as well.  The coordinator may, depending on the phase of the 
commit/abort, contact child servers after it crashes.  The requirement is 
that during log replay, the transaction structures might have to be fully 
reconstructed and remain in-place after log replay has completed, until the 
disposition of the (sub)transaction is settled by the coordinator.  All 
dependent on the phase of course.
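
[Editor's sketch] The replay requirement described above can be illustrated with a small function. It assumes a simplified log of (record kind, xid) pairs, which is an invented format and not PostgreSQL's WAL. Transactions that never reached the prepare step are rolled back as after any crash, while prepared ones are rebuilt and stay pending until the coordinator settles them.

```python
FINAL = {"commit": "committed", "abort": "aborted"}


def replay(log):
    """Return {xid: state} after replaying (kind, xid) records in order."""
    state = {}
    for kind, xid in log:
        if kind == "begin":
            state[xid] = "active"
        elif kind == "prepare":
            state[xid] = "prepared"
        elif kind in FINAL:
            state[xid] = FINAL[kind]
        else:
            raise ValueError(f"unknown log record {kind!r}")
    # Anything still merely active at the end of the log did not survive
    # the crash; it is rolled back like an ordinary transaction.
    for xid, current in state.items():
        if current == "active":
            state[xid] = "aborted"
    return state


def in_doubt(state):
    """Prepared transactions that must be kept until the coordinator decides."""
    return sorted(xid for xid, current in state.items() if current == "prepared")
```

The server would have to finish this reconstruction before accepting connections, so that the locks of the in-doubt transactions are in place again.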

> Our nested-transaction implementation (like our standard transaction
> implementation) aborts all transactions on crash.

Jordan Henderson




Re: [HACKERS] 2-phase commit

2003-10-13 Thread Jan Wieck
Bruce Momjian wrote:

> Tatsuo Ishii wrote:
> > > Yes.  I don't think that 2PC is a solution for robustness in face of
> > > network failure.  It's too slow, to begin with.  Some sort of
> > > multi-master system is very desirable for network failures, etc., but
> > > I don't think anybody does active/hot standby with 2PC any more; the
> > > performance is too bad.
> >
> > I'm tired of this kind of "2PC is too slow" arguments. I think
> > Satoshi, the only guy who made a trial implementation of 2PC for
> > PostgreSQL, has already showed that 2PC is not that slow.
>
> Agreed.  Let's get it into 7.5 and see it in action.  If we need to
> adjust it, we can, but right now, we need something for distributed
> transactions, and this seems like the logical direction.

Are you guys kidding or what?

2PC is not too slow in normal operations when everything is purring like
little kittens and you're just wasting your excess bandwidth on it. The
point is that it behaves horribly, like a dirty backstreet cat, at the
time when things go wrong ... basically it's a neat thing to have, but
from the second you need it, it becomes useless.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [HACKERS] 2-phase commit

2003-10-11 Thread Jeroen T. Vermeulen
On Fri, Oct 10, 2003 at 09:37:53PM -0700, Dann Corbit wrote:
> Why not apply the effort to something already done and compatibly
> licensed?
>
> This:
> http://dog.intalio.com/ots.html
>
> Appears to be a Berkeley style licensed:
> http://dog.intalio.com/license.html
>
> Transaction monitor.

I'd say this is complementary, not an alternative to 2PC implementation
issues.  

The transaction monitor lives on the other side of the problem.  2PC is
needed in the database _so that_ the transaction monitor can do its job.

That said, having a 3-tier model is probably a good idea if distributed
transaction management is what we want.  :-)


Jeroen




Re: [HACKERS] 2-phase commit

2003-10-10 Thread Zeugswetter Andreas SB SD

I was wondering whether we need to keep WAL online for 2PC,
or whether only something like clog is sufficient.

What if:
1. phase 1 commit must pass the slave xid that will be used for 2nd phase
   (it needs to return some sort of identification anyway)
2. the coordinator must keep a list of slave xids along with
   corresponding (commit/rollback) info

Is that not sufficient? Why would WAL be needed in the first place?
This is not replication, the slave has its own WAL anyway.

I also don't buy the argument with the lockup. If today somebody connects
with psql, starts a transaction, modifies something, and then never commits
or aborts, there is also no built-in mechanism that will eventually kill
it automatically. 2PC will simply need to have means for the administrator
to rollback/commit an in-doubt transaction manually.
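
[Editor's sketch] The two-point proposal above amounts to a small bookkeeping structure on the coordinator. The following is a hypothetical illustration only; the class CoordinatorLog, its method names and the slave names are invented. It records the xid each slave returned in phase 1, stores the decision next to it, and can list what has to be sent to each slave, or applied by hand by an administrator, in phase 2.

```python
class CoordinatorLog:
    """Hypothetical coordinator-side record of phase-1 results and outcomes."""

    def __init__(self):
        self.slave_xids = {}  # global id -> {slave name: slave xid}
        self.outcome = {}     # global id -> "commit" or "rollback"

    def phase1_ok(self, gid, slave, slave_xid):
        # Point 1: the slave reports the xid to use for the second phase.
        self.slave_xids.setdefault(gid, {})[slave] = slave_xid

    def decide(self, gid, outcome):
        # Point 2: record the decision next to the slave xids.
        if outcome not in ("commit", "rollback"):
            raise ValueError(outcome)
        self.outcome[gid] = outcome

    def phase2_orders(self, gid):
        # What to send to each slave, or what an administrator would
        # apply manually to resolve an in-doubt transaction.
        verdict = self.outcome.get(gid)
        if verdict is None:
            return []  # still in doubt: nothing can be ordered yet
        slaves = self.slave_xids.get(gid, {})
        return [(slave, xid, verdict) for slave, xid in sorted(slaves.items())]
```

For this to help after a coordinator crash, the structure itself would have to be stored durably, which the sketch does not attempt.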

Andreas



Re: [HACKERS] 2-phase commit

2003-10-10 Thread Andrew Sullivan
On Fri, Oct 10, 2003 at 09:46:35AM +0900, Tatsuo Ishii wrote:
> Satoshi, the only guy who made a trial implementation of 2PC for
> PostgreSQL, has already showed that 2PC is not that slow.

If someone has a fast implementation, so much the better.  I'm not
opposed to fast implementations! 

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-10-10 Thread Andrew Sullivan
On Thu, Oct 09, 2003 at 11:53:46PM -0400, Christopher Browne wrote:
 
> If 2PC gets implemented, that simply means that there will be another
> module that some will be interested in, and which many people won't
> bother using.  Which shouldn't seem to be a particularly big deal.

I think the reason this is controversial, however, is that while PL/R
(e.g.) doesn't make big changes to the internals, 2PC certainly will
touch the fundamentals.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-10-10 Thread Satoshi Nagayasu

Andrew Sullivan [EMAIL PROTECTED] wrote:
> On Fri, Oct 10, 2003 at 09:46:35AM +0900, Tatsuo Ishii wrote:
> > Satoshi, the only guy who made a trial implementation of 2PC for
> > PostgreSQL, has already showed that 2PC is not that slow.
>
> If someone has a fast implementation, so much the better.  I'm not
> opposed to fast implementations!

The pgbench results of my experimental 2PC implementation
and plain postgresql are available.

PostgreSQL 7.3
  http://snaga.org/pgsql/pgbench/pgbench-REL7_3.log

Experimental 2PC in PostgreSQL 7.3
  http://snaga.org/pgsql/pgbench/pgbench-TPC0_0_2.log

I can't see a grave overhead from this comparison.

 
 


-- 
NAGAYASU Satoshi [EMAIL PROTECTED]




Re: [HACKERS] 2-phase commit

2003-10-10 Thread Christopher Browne
Martha Stewart called it a Good Thing [EMAIL PROTECTED] (Dann Corbit) wrote:
> > I can't see a grave overhead from this comparison.
>
> 2PC is absolutely essential when you have to have both parts of the
> transaction complete for a logical unit of work.  For a project that
> needs it, if you don't have it you will be forced to go to another
> tool, or perform lots of custom programming to work around it.
>
> If you have 2PC and it is ten times slower than without it, you will
> still need it for projects requiring that capability.

Just so.

I would be completely unsurprised if an attempt to use 2PC to support
generalized multimaster replication would involve 10-fold slowdowns
as compared to having all the activity take place on one database.

Which would imply that 2PC is not a tool that may be appropriately
used to naively do replication.  But that should not come as any grand
surprise.

To each tool the right job, and to each job the right tool...

There seems to be enough room for there to be evidence both of 2PC
being useful for improving performance, and for it to cut
performance:

 - TPC benchmarks often specify the inclusion of Tuxedo as a
   component; the combination of vendors would surely NOT put it
   on the list if it were not an aid to performance;

 - There is also indication that there can be a cost, notably in the
   form of the concerns of deadlock, but it should also be obvious
   that slow network links would lead to _hideous_ increases in
   latency.
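
[Editor's sketch] The latency point can be made concrete with a back-of-the-envelope model. It assumes, as a simplification, that a 2PC commit costs two sequential message rounds (prepare and vote, then the commit order and its acknowledgement) and that each round waits for the slowest participant. The function name and the numbers in the example are illustrative, not measurements.

```python
def commit_latency_ms(round_trips_ms, local_commit_ms=0.0):
    """Estimate 2PC commit latency under a simple two-round model."""
    if not round_trips_ms:
        return local_commit_ms  # no remote participants: ordinary commit
    slowest = max(round_trips_ms)
    # Round 1 collects the votes, round 2 delivers the decision; both
    # rounds are gated by the slowest link.
    return 2 * slowest + local_commit_ms
```

With one participant 1 ms away and another 80 ms away, and a 5 ms local commit, this model gives 165 ms per commit, almost all of it due to the slow link.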

As you say, even if there is a substantial cost, it's still worthwhile
if a project needs it.

> Now, a good model to start with is a very good idea.  So some
> discussion and analysis is a good thing.  From the looks of it,
> Satoshi Nagayasu has done a very good job.  Having a functional 2PC
> would be a huge feather in the cap of PostgreSQL.

It would seem so.  I look forward to seeing how this progresses.
-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','acm.org').
http://cbbrowne.com/info/linuxdistributions.html
XFS might  (or might not)  come out before  the year 3000.  As far as
kernel patches go,  SGI are brilliant.  As far as graphics, especially
OpenGL,  go,  SGI is  untouchable.  As  far as   filing  systems go, a
concussed doormouse in a tarpit would move faster.  -- jd on Slashdot



Re: [HACKERS] 2-phase commit

2003-10-10 Thread Dann Corbit
Why not apply the effort to something already done and compatibly
licensed?

This:
http://dog.intalio.com/ots.html

Appears to be a Berkeley style licensed:
http://dog.intalio.com/license.html

Transaction monitor.

Overview
The OpenORB Transaction Service is a very scalable transaction monitor
which also provides several extensions like XA management, a management
interface to control all transaction processes and a high reliable
recovery system. 

By coordinating OpenORB and OpenORB Transaction Service, you provide a
reliable and powerful foundation for building large scalable distributed
applications. 

Datasheet
The OpenORB Transaction Service is a fully compliant implementation of
the OMG Transaction Service specification. 
The OpenORB Transaction Service features are:
* Management of distributed transactions with a two phase commit
  protocol
* Sub Transactions management (nested transactions)
* Propagation of the transaction context between CORBA objects
* Management of distributed transactions propagation through databases
  with the XA protocol
* Automatic logs to be able to make recovery in case of failures
* Can be used as a transaction initiator or subordinate
* High-performance, multiple thread architecture
* Developed with POA
* Provides a management interface to control all transactions
* Full support of JTA
* JDBC pooling and automatic resource enlistment

Download
To download the OpenORB Transaction Service, do one of the following:
* CVS: you can use CVS to grab the sources directly.
* FTP: you get either a CVS snapshot or a prebuilt version
To use one of these possibilities, go to the Download Services page.

ChangeLog
August 15th 2001. Version 1.2.0.
* Changed the transaction client side to support late binding to the
  transaction monitor.
* Bug fixed in the transactional client interceptor. This bug was due to
  a change in the OpenORB behavior concerning the slot


To get previous change log, please refer to the CHANGELOG file available
within this service distribution.



Re: [HACKERS] 2-phase commit

2003-10-10 Thread Dann Corbit
Here is a sourceforge version of the same thing
http://openorb.sourceforge.net/



Re: [HACKERS] 2-phase commit

2003-10-09 Thread Andrew Sullivan
On Wed, Oct 08, 2003 at 05:43:49PM -0400, Bruce Momjian wrote:
 
> OK, I think we came to the conclusion that we want 2-phase commit, but
> want some way to mark a server as offline/read-only, or notify an

That sounds to me like the conclusion, to the extent there was one,
yes.  I'd still like to hear from those who continue to have strong
objections on the grounds of the impossibility of a guaranteed
recovery method.  Does the proposal of allowing dbas to run that
risk, provided there's a mechanism to tell them about it, satisfy the
objection (assuming, of course, 2PC can be turned off)?

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Peter Eisentraut
Andrew Sullivan writes:

> Does the proposal of allowing dbas to run that risk, provided there's a
> mechanism to tell them about it, satisfy the objection (assuming, of
> course, 2PC can be turned off)?

Why would you spend time on implementing a mechanism whose ultimate
benefit is supposed to be increasing reliability and performance, when you
already realize that it will have to lock up at the slightest sign of
trouble?  There are better mechanisms out there that you can use instead.

-- 
Peter Eisentraut   [EMAIL PROTECTED]




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Bruce Momjian
Peter Eisentraut wrote:
> Andrew Sullivan writes:
>
> > Does the proposal of allowing dbas to run that risk, provided there's a
> > mechanism to tell them about it, satisfy the objection (assuming, of
> > course, 2PC can be turned off)?
>
> Why would you spend time on implementing a mechanism whose ultimate
> benefit is supposed to be increasing reliability and performance, when you
> already realize that it will have to lock up at the slightest sign of
> trouble?  There are better mechanisms out there that you can use instead.

If you want cross-server transactions, what other methods are there that
are more reliable?  It seems network unreliability is going to be a
problem no matter what method you use.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-10-09 Thread Andrew Sullivan
On Thu, Oct 09, 2003 at 04:22:13PM +0200, Peter Eisentraut wrote:
 Why would you spend time on implementing a mechanism whose ultimate
 benefit is supposed to be increasing reliability and performance, when you
 already realize that it will have to lock up at the slightest sign of
 trouble?  There are better mechanisms out there that you can use instead.

"The slightest sign of trouble" seems to me to be overstating the
matter rather.  2PC cannot recover only in the case where the first phase
of commit has happened everywhere, and then the master crashes.

We are talking, after all, about a pretty exotic feature in the first
place.  I presume that anyone who is using it is also using it on
machines with ultra-high-reliability, "the CPU can catch on fire
and the box stays up" sort of hardware.  I'll grant you that running a
pair of "B0b'5 C0mpu73r5 Ultra kewl sooper fa5t overclocked specials"
with serial ATA with the write cache enabled is a recipe for data
loss.  But that's a disaster no matter what.

But you cannot have XA-like stuff without 2PC.  You can't easily have
heterogeneous systems without 2PC.  And folks have already generously
volunteered to work on this problem; I think that they deserve
support, assuming we can come up with some idea of what kinds of
compromises are acceptable ones.  There's no question that 2PC
requires some unpleasant compromises.  But if you want someone to be
able to add a Postgres member to a heterogeneous cluster, you're
going to need to be able to accept some compromises, because the DBA
(or, more likely, his management) already has.

I'm not sure that 2PC is actually intended to increase reliability or
performance, by the way.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Peter Eisentraut
Bruce Momjian writes:

 If you want cross-server transactions, what other methods are there that
 are more reliable?

3-phase commit

-- 
Peter Eisentraut   [EMAIL PROTECTED]




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Zeugswetter Andreas SB SD

  Why would you spend time on implementing a mechanism whose ultimate
  benefit is supposed to be increasing reliability and performance, when you
  already realize that it will have to lock up at the slightest sign of
  trouble?  There are better mechanisms out there that you can use instead.
 
 If you want cross-server transactions, what other methods are there that
 are more reliable?  It seems network unreliability is going to be a
 problem no matter what method you use.

And unless you have 2-phase (or 3-phase) commit, all other methods are going 
to be worse, since their time window for possible critical failure is
going to be substantially larger. (extending 2-phase to 3-phase should not be 
too difficult)

A lot of use cases for 2PC are not for manipulating the same data on more than 
one server (replication), but different data that needs to be manipulated in an
all or nothing transaction. In this scenario it is not about reliability but about 
physically locating data (e.g. in LA vs New York) where it is needed most often.

Andreas



Re: [HACKERS] 2-phase commit

2003-10-09 Thread Mike Mascari
Bruce Momjian wrote:

 Peter Eisentraut wrote:
 
Andrew Sullivan writes:

Does the proposal of allowing dbas to run that risk, provided there's a
mechanism to tell them about it, satisfy the objection (assuming, of
course, 2PC can be turned off)?

Why would you spend time on implementing a mechanism whose ultimate
benefit is supposed to be increasing reliability and performance, when you
already realize that it will have to lock up at the slightest sign of
trouble?  There are better mechanisms out there that you can use instead.
 
 If you want cross-server transactions, what other methods are there that
 are more reliable?  It seems network unreliability is going to be a
 problem no matter what method you use.

What is the stated goal of distributed transactions in PostgreSQL?

1) XA-compatibility/interoperability

or

2) Robustness in the face of network failure

The implementation chosen depends upon the answer, does it not? Is
there an implementation (e.g. 3PC) that can simulate 2PC behavior for
interoperability purposes and satisfy both requirements?

Mike Mascari
[EMAIL PROTECTED]












Re: [HACKERS] 2-phase commit

2003-10-09 Thread Bruce Momjian
Peter Eisentraut wrote:
 Bruce Momjian writes:
 
  If you want cross-server transactions, what other methods are there that
  are more reliable?
 
 3-phase commit

OK, how is that going to make things safer, or does it just shrink the
failure window?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-10-09 Thread Rod Taylor
On Thu, 2003-10-09 at 11:14, Peter Eisentraut wrote:
 Bruce Momjian writes:
 
  If you want cross-server transactions, what other methods are there that
  are more reliable?
 
 3-phase commit

How about a real world example of a transaction manager that has
actually implemented 3PC?

But yes, the ability for the participants to talk to each other in the
event the controller is unavailable seems an obvious fix.




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Andrew Sullivan
On Thu, Oct 09, 2003 at 11:22:05AM -0400, Mike Mascari wrote:
 The implementation chosen depends upon the answer, does it not? Is
 there an implementation (e.g. 3PC) that can simulate 2PC behavior for
 interoperability purposes and satisfy both requirements?

I don't know.  What I know is that someone showed up working on 2PC,
and got a frosty reception.  I'm trying to learn what criteria would
make the work acceptable.  For my purposes, the feature would be
really nice, so I'd hate to see the opportunity lost.  If someone has
an idea even how 3PC might be implemented, I'd be happy to hear it.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Robert Treat
On Thu, 2003-10-09 at 12:07, Andrew Sullivan wrote:
 On Thu, Oct 09, 2003 at 11:22:05AM -0400, Mike Mascari wrote:
  The implementation chosen depends upon the answer, does it not? Is
  there an implementation (e.g. 3PC) that can simulate 2PC behavior for
  interoperability purposes and satisfy both requirements?
 
 I don't know.  What I know is that someone showed up working on 2PC,
 and got a frosty reception.  I'm trying to learn what criteria would
 make the work acceptable.  For my purposes, the feature would be
 really nice, so I'd hate to see the opportunity lost.  If someone has
 an idea even how 3PC might be implemented, I'd be happy to hear it.
 

Can you elaborate on your purposes?  Do they fall into the
"XA-compatibility" bit or the "Robustness in the face of network
failure"?

On the likely chance that 50% fall into 1 and the other 50% into 2, can we
accept a solution that doesn't address both?

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Andrew Sullivan
On Thu, Oct 09, 2003 at 02:17:28PM -0400, Robert Treat wrote:
 Can you elaborate on your purposes?  Do they fall into the
 XA-compatibility bit or the Robustness in the face of network
 failure?  

Yes.  I don't think that 2PC is a solution for robustness in the face of
network failure.  It's too slow, to begin with.  Some sort of
multi-master system is very desirable for network failures, etc., but
I don't think anybody does active/hot standby with 2PC any more; the
performance is too bad.

I'm interested in the ability to use it for XA(ish) compatibility and
heterogeneous database support.  Arguments with
people-who-think-Gartner-reports-are-good-guides-for-what-to-do would
be a lot easier if I had that, to begin with.

A 

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Tatsuo Ishii
 Yes.  I don't think that 2PC is a solution for robustness in face of
 network failure.  It's too slow, to begin with.  Some sort of
 multi-master system is very desirable for network failures, etc., but
 I don't think anybody does active/hot standby with 2PC any more; the
 performance is too bad.

I'm tired of this kind of "2PC is too slow" argument. I think
Satoshi, the only guy who made a trial implementation of 2PC for
PostgreSQL, has already shown that 2PC is not that slow.
--
Tatsuo Ishii



Re: [HACKERS] 2-phase commit

2003-10-09 Thread Bruce Momjian
Tatsuo Ishii wrote:
  Yes.  I don't think that 2PC is a solution for robustness in face of
  network failure.  It's too slow, to begin with.  Some sort of
  multi-master system is very desirable for network failures, etc., but
  I don't think anybody does active/hot standby with 2PC any more; the
  performance is too bad.
 
 I'm tired of this kind of "2PC is too slow" argument. I think
 Satoshi, the only guy who made a trial implementation of 2PC for
 PostgreSQL, has already shown that 2PC is not that slow.

Agreed.  Let's get it into 7.5 and see it in action.  If we need to
adjust it, we can, but right now, we need something for distributed
transactions, and this seems like the logical direction.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-10-09 Thread Marc G. Fournier


On Fri, 10 Oct 2003, Tatsuo Ishii wrote:

  Yes.  I don't think that 2PC is a solution for robustness in face of
  network failure.  It's too slow, to begin with.  Some sort of
  multi-master system is very desirable for network failures, etc., but
  I don't think anybody does active/hot standby with 2PC any more; the
  performance is too bad.

 I'm tired of this kind of "2PC is too slow" argument. I think
 Satoshi, the only guy who made a trial implementation of 2PC for
 PostgreSQL, has already shown that 2PC is not that slow.

Where does Satoshi's implementation sit right now?  Will it patch to v7.4?
Can it provide us with a base to work from, or is it complete?




Re: [HACKERS] 2-phase commit

2003-10-09 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] (Tatsuo Ishii) wrote:
 I'm tired of this kind of "2PC is too slow" argument. I think
 Satoshi, the only guy who made a trial implementation of 2PC for
 PostgreSQL, has already shown that 2PC is not that slow.

I'm tired of it for a different reason, namely that there are use
cases where speed is not _relevant_.  The REAL problem that is taking
place is that people are talking past each other.

- Some say, "It's too slow; no point in doing it."

  The fact that it may be too slow _for them_ means they probably
  shouldn't use it.  I somehow doubt that there are Vastly Faster
  alternatives waiting in the wings.

- The other problem that gets pointed out:  "2PC is inherently
  fragile, and prone to deadlock."

  Again, those that _need_ to use 2PC will necessarily need to address
  those concerns in the way they manage their systems.

  Those that can't afford the fragility are not 'customers' for use of
  2PC.  And, pointing back to the speed controversy, it is not at all
  obvious that there is any other alternative for handling distributed
  processing that _totally addresses_ the concerns about fragility.

Those that can't afford these costs associated with 2PC will simply
Not Use It.

Probably in much the same way that most people _aren't_ using
replication.  And most people _aren't_ using PL/R.  And most people
_aren't_ using any number of the contributed things.

If 2PC gets implemented, that simply means that there will be another
module that some will be interested in, and which many people won't
bother using.  Which shouldn't seem to be a particularly big deal.
-- 
aa454,@,freenet.carleton.ca
http://www.ntlug.org/~cbbrowne/
The way to a man's heart is with a broadsword.



Re: [HACKERS] 2-phase commit

2003-10-08 Thread Bruce Momjian
Andrew Sullivan wrote:
 On Sat, Sep 27, 2003 at 09:13:27AM -0300, Marc G. Fournier wrote:
  
  I think it was Andrew that suggested it ... when the slave times out, it
  should trigger a READ ONLY mode on the slave, so that when/if the master
  tries to start to talk to it, it can't ...
  
  As for the master itself, it should be smart enough that if it times out,
  it knows to actually abandon the slave and not continue to try ...
 
 Yes, but now we're talking as though this is master-slave
 replication.  Actually, master and slave are only useful terms in
 a transaction for 2PC.  So every machine is both a master and a
 slave.
 
 It seems that one way out is just to fall back to read only as soon
 as a single failure happens.  That's the least graceful but maybe
 safest approach to failure, analogous to what fsck does to your root
 filesystem at boot time.  Of course, since there's no read only
 mode at the moment, this is all pretty hand-wavy on my part :-/

OK, I think we came to the conclusion that we want 2-phase commit, but
want some way to mark a server as offline/read-only, or notify an
administrator.  Can we communicate this to the Japanese guys working on
2-phase commit so they can start working toward including it in 7.5?


Added to TODO:

* Add two-phase commit to all distributed transactions with
  offline/readonly server status or administrator notification 
  for failure

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-10-07 Thread Hans-Jürgen Schönig
Marc G. Fournier wrote:
On Sat, 27 Sep 2003, Bruce Momjian wrote:


I have been thinking it might be time to start allowing external
programs to be called when certain events occur that require
administrative attention --- this would be a good case for that.
Administrators could configure shell scripts to be run when the network
connection fails or servers drop off the network, alerting them to the
problem.  Throwing things into the server logs isn't _active_ enough.


Actually, apparently you can do this now ... there is apparently a mail
module for PostgreSQL that you can use to have the database send emails
out ...


I guess something such as

CREATE TRIGGER my_trig ON BEGIN / COMMIT
EXECUTE ...

would be nice. I think this could be used for many purposes (not
necessarily 2PC), if a trigger could handle database events and not
just events on tables.

ON BEGIN
ON COMMIT
ON CREATE TABLE , ...
We could have used that so often in the past in countless applications.
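
To make the idea of database-level events concrete, here is a minimal
sketch in Python rather than SQL. Nothing in it is PostgreSQL API: the
EventHooks class, the event names and the "gid" detail are all
hypothetical, and a real implementation would live in the backend.

```python
from collections import defaultdict


class EventHooks:
    """Hypothetical registry of callbacks for database-level events."""

    def __init__(self):
        # event name -> list of callbacks registered for that event
        self._hooks = defaultdict(list)

    def on(self, event, callback):
        """Register a callback, roughly what ON BEGIN / ON COMMIT would do."""
        self._hooks[event].append(callback)

    def fire(self, event, **details):
        """Run every callback registered for the event and collect results."""
        return [callback(event, details) for callback in self._hooks[event]]


if __name__ == "__main__":
    hooks = EventHooks()
    hooks.on("commit", lambda event, details: print(event, details))
    hooks.fire("commit", gid="tx-1")
```

A notification script for administrators could be registered the same
way, as just another callback on a failure event.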

	Regards,

		Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706 or +43/660/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Hiroshi Inoue
I seem to have misunderstood the problem completely.
I apologize to you all (especially Tom) for disturbing
this thread.

I wonder if there might be such a nice solution when
some of the systems or communications are dead.
And as many people already mentioned, there's not so
much allowance if we only adopt an XA-based protocol.

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/

Tom Lane wrote:
 
 Hiroshi Inoue [EMAIL PROTECTED] writes:
  The simplest scenario (though there could be variations) is
 
  [At participant(master)'s side]
    Because the commit operation is done, it does nothing.
 
  [At coordinator(slave)'s side]
     1) After a while
     2) re-establish the communication path to the
        participant(master)'s TM.
     3) resend the "commit request" to the participant's TM.
    1)2)3) would be repeated until the coordinator receives
    the "commit ok" message from the participant.
 
 [ scratches head ] I think you are using the terms "master" and "slave"
 oppositely than I would.  But in any case, this is not an answer to the
 concern I had.  You're assuming that the "coordinator(slave)" side is
 willing to resend a request indefinitely, and also that the
 "participant(master)" side is willing to retain per-transaction commit
 state indefinitely so that it can correctly answer belated questions
 from the other side.  What I was complaining about was that I don't
 think either side can afford to remember per-transaction state
 indefinitely.  2PC in the abstract is a useless academic abstraction ---
 where the rubber meets the road is defining how you cope with failures
 in the commit protocol.
 
 regards, tom lane

Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD

The simplest scenario (though there could be variations) is
  
[At participant(master)'s side]
  Because the commit operation is done, it does nothing.
  
[At coordinator(slave)'s side]
   1) After a while
   2) re-establish the communication path to the
  participant(master)'s TM.
   3) resend the "commit request" to the participant's TM.
  1)2)3) would be repeated until the coordinator receives
  the "commit ok" message from the participant.
  
   [ scratches head ] I think you are using the terms "master" and "slave"
   oppositely than I would.
  
  Oops my mistake, sorry.
  But is it 2-phase commit protocol in the first place ?
 
 That is, in your example below
 
  Example:
 
 Master                  Slave
 ------                  -----
 commit ready-->

This is the commit for phase 1. This commit is allowed to return all 
sorts of errors, like violated deferred checks, out of diskspace, ...

                   <--OK
 commit done->XX

This is the commit for phase 2; the slave *must* answer with success
in all but hardware failure cases. (Note that the master could
instead send rollback, e.g. because some other slave aborted.)

 is the commit done message needed ?

So, yes this is needed.
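
The two phases described above can be sketched roughly as follows. This
is only an illustration in Python, not the patch under discussion: the
Participant class, its can_commit flag and the two_phase_commit function
are hypothetical names, and all network and crash handling is left out.

```python
from enum import Enum


class Vote(Enum):
    OK = "ok"
    ABORT = "abort"


class Participant:
    """Hypothetical slave that votes in phase 1 and obeys in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1 ("commit ready"): any error may surface here,
        # e.g. a violated deferred check or a full disk.
        if self.can_commit:
            self.state = "prepared"
            return Vote.OK
        self.state = "aborted"
        return Vote.ABORT

    def commit(self):
        # Phase 2 ("commit done"): a prepared participant must succeed.
        assert self.state == "prepared"
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if everyone committed, False if everyone rolled back."""
    votes = [p.prepare() for p in participants]
    if all(vote is Vote.OK for vote in votes):
        for p in participants:
            p.commit()
        return True
    # At least one slave refused in phase 1, so the master sends rollback.
    for p in participants:
        p.rollback()
    return False
```

The point of the sketch is where refusal is allowed: only prepare() may
say no, which is why the second message cannot be dropped.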

Andreas



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Marc G. Fournier


On Mon, 29 Sep 2003, Hiroshi Inoue wrote:



 Hiroshi Inoue wrote:
 
  Tom Lane wrote:
  
   Hiroshi Inoue [EMAIL PROTECTED] writes:
The simplest scenario (though there could be variations) is
  
[At participant(master)'s side]
  Because the commit operation is done, it does nothing.
  
[At coordinator(slave)'s side]
   1) After a while
   2) re-establish the communication path to the
  participant(master)'s TM.
   3) resend the "commit request" to the participant's TM.
  1)2)3) would be repeated until the coordinator receives
  the "commit ok" message from the participant.
  
   [ scratches head ] I think you are using the terms master and slave
   oppositely than I would.
 
  Oops my mistake, sorry.
  But is it 2-phase commit protocol in the first place ?

 That is, in your example below

  Example:

 Master                  Slave
 ------                  -----
 commit ready-->
                   <--OK
 commit done->XX

 is the commit done message needed ?

Of course ... how else will the Slave commit?  From my understanding, the
concept is that the master sends a "commit ready" to the slave, but the OK
back is that "OK, I'm ready to commit whenever you are", at which point
the master does its commit and tells the slave to do its ...




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Bruce Momjian
Marc G. Fournier wrote:
  Master                  Slave
  ------                  -----
  commit ready-->
                    <--OK
  commit done->XX
 
  is the commit done message needed ?
 
 Of course ... how else will the Slave commit?  From my understanding, the
 concept is that the master sends a commit ready to the slave, but the OK
 back is that OK, I'm ready to commit whenever you are, at which point
 the master does its commit and tells the slave to do its ...

Or the slave could reject the request.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Bruce Momjian
Tom Lane wrote:
  [At participant(master)'s side]
    Because the commit operation is done, it does nothing.
 
  [At coordinator(slave)'s side]
     1) After a while
     2) re-establish the communication path to the
        participant(master)'s TM.
     3) resend the "commit request" to the participant's TM.
    1)2)3) would be repeated until the coordinator receives
    the "commit ok" message from the participant.
 
 [ scratches head ] I think you are using the terms master and slave
 oppositely than I would.  But in any case, this is not an answer to the
 concern I had.  You're assuming that the coordinator(slave) side is
 willing to resend a request indefinitely, and also that the
 participant(master) side is willing to retain per-transaction commit
 state indefinitely so that it can correctly answer belated questions
 from the other side.  What I was complaining about was that I don't
 think either side can afford to remember per-transaction state
 indefinitely.  2PC in the abstract is a useless academic abstraction ---
 where the rubber meets the road is defining how you cope with failures
 in the commit protocol.

I don't think there is any way to handle cases where the master or slave
just disappears.  The other machine isn't under the server's control, so
it has no way of knowing. I think we have to allow the administrator
to set a timeout, or ask to wait indefinitely, and allow them to call an
external program to record the event or notify administrators.
Multi-master replication has the same issues.

My original point was that multi-master replication has the same
limitations, but people still want it.  Same for two-phase commit --- it
has the same limitations, but people want it.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Jeff
Tom Lane wrote:

 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 ... You can make this work, but the resource costs
 are steep.
 
 So, after 'n' seconds of waiting, we abandon the slave and the slave
 abandons the master.
 
 [itch...]  But you surely cannot guarantee that the slave and the master
 time out at exactly the same femtosecond.  What happens when the comm
 link comes back online just when one has timed out and the other not?
 (Hint: in either order, it ain't good.  Double plus ungood if, say, the
 comm link manages to deliver the master's commit confirm message a
 little bit after the master has timed out and decided to abort after all.)
 
 In my book, timeout-based solutions to this kind of problem are certain
 disasters.
 
 regards, tom lane

What do commercial databases do about 2PC or other multi-master solutions?
You've done a good job of convincing me that it's unreliable no matter what
(through your posts on this topic over a long time). However, I would think
that something like Oracle or DB2 have some kind of answer for
multi-master, and I'm curious what it is. If they don't, is it reasonable
to make a test case that leaves their database inconsistent or hanging?

I can (probably) get access to a SQL Server system to run some tests, if
someone is interested.

regards,
jeff davis






Re: [HACKERS] 2-phase commit

2003-09-29 Thread Marc G. Fournier


On Mon, 29 Sep 2003, Bruce Momjian wrote:

 Marc G. Fournier wrote:
   Master                  Slave
   ------                  -----
   commit ready-->
                     <--OK
   commit done->XX
  
   is the commit done message needed ?
 
  Of course ... how else will the Slave commit?  From my understanding, the
  concept is that the master sends a commit ready to the slave, but the OK
  back is that OK, I'm ready to commit whenever you are, at which point
  the master does its commit and tells the slave to do its ...

 Or the slave could reject the request.

Huh?  The slave has that option??  In what circumstance?



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD

 I don't think there is any way to handle cases where the master or slave
 just disappears.  The other machine isn't under the server's control, so
 it has no way of knowing. I think we have to allow the administrator
 to set a timeout, or ask to wait indefinitely, and allow them to call an
 external program to record the event or notify administrators.
 Multi-master replication has the same issues.

Needs to wait indefinitely, a timeout is not acceptable since it leads to 
inconsistent data. Human (or monitoring software) intervention is needed
if they can't reach each other in a reasonable time.
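
A toy model shows why independent timeouts lead to inconsistent data. It
is a deliberately simplified sketch: the outcome function, its time
units and the assumption that the acknowledgement takes as long as the
request are all made up for illustration. The master decides to abort if
no acknowledgement arrives within its timeout, while the slave commits
if the "commit done" message arrives within its own.

```python
def outcome(latency, master_timeout, slave_timeout):
    """Return (master_state, slave_state) for one "commit done" message.

    latency is the one-way delivery time; the acknowledgement is assumed
    to need the same time again on the way back (hypothetical model).
    """
    # The slave commits only if the message beats its own timeout.
    slave = "committed" if latency <= slave_timeout else "aborted"
    # The master keeps the commit only if the ack returns in time.
    master = "committed" if 2 * latency <= master_timeout else "aborted"
    return master, slave


if __name__ == "__main__":
    for latency in (1, 3, 6):
        print(latency, outcome(latency, master_timeout=5, slave_timeout=5))
```

With both timeouts set to 5, a latency of 3 leaves the slave committed
and the master aborted, which is exactly the in-doubt case that needs a
human or monitoring software to resolve.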

I think this needs to be kept dumb. Different sorts of use cases will simply  
need different answers to resolve in-doubt transactions. What is needed is an
interface that allows listing and commit/rollback of in-doubt transactions 
(preferably from a newly started client, or a direct command for the postmaster).
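
As a rough sketch of what such a "dumb" interface could look like, here
is a small in-memory model in Python. The InDoubtRegistry class and its
method names are hypothetical, chosen only to mirror the three
operations asked for: list, commit and rollback.

```python
class InDoubtRegistry:
    """Hypothetical store of prepared transactions awaiting a verdict."""

    def __init__(self):
        self._pending = {}  # global transaction id -> description

    def prepare(self, gid, description):
        """Record a transaction that finished phase 1 and is now in doubt."""
        self._pending[gid] = description

    def list_in_doubt(self):
        """Return the ids of all unresolved transactions, sorted."""
        return sorted(self._pending)

    def resolve(self, gid, action):
        """Apply the administrator's verdict and forget the transaction."""
        if action not in ("commit", "rollback"):
            raise ValueError(f"unknown action: {action}")
        if gid not in self._pending:
            raise KeyError(gid)
        del self._pending[gid]
        return f"{gid}: {action}"
```

The registry makes no decision of its own; the policy stays with
whoever calls resolve(), whether a person or monitoring software.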

Andreas



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD

   Master                  Slave
   ------                  -----
   commit ready-->
                     <--OK
   commit done->XX
  
   is the commit done message needed ?
  
  Of course ... how else will the Slave commit?  From my understanding, the
  concept is that the master sends a commit ready to the slave, but the OK
  back is that OK, I'm ready to commit whenever you are, at which point
  the master does its commit and tells the slave to do its ...
 
 Or the slave could reject the request.

At this point only because of a hardware error. In case of network
problems the "commit done" either did not reach the slave or the "success"
answer did not reach the master.

That is what it's all about. Phase 2 is supposed to be low overhead and very 
fast to allow keeping the time window for failure (that produces in-doubt 
transactions) as short as possible.

Andreas



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Bruce Momjian
Marc G. Fournier wrote:
is the commit done message needed ?
  
   Of course ... how else will the Slave commit?  From my understanding, the
   concept is that the master sends a commit ready to the slave, but the OK
   back is that OK, I'm ready to commit whenever you are, at which point
   the master does its commit and tells the slave to do its ...
 
  Or the slave could reject the request.
 
 Huh?  The slave has that option??  In what circumstance?

I thought the slave could reject if someone local already had the row
locked.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 But is it 2-phase commit protocol in the first place ?

 That is, in your example below

  Example:

 Master                  Slave
 ------                  -----
 commit ready-->
                   <--OK
 commit done->XX

 is the commit done message needed ?

Absolutely --- otherwise, we'd not be having this whole discussion.  The
problem is that the slave is holding ready to commit but doesn't know
whether he should or not ... or alternatively, he did commit but the
master didn't get the acknowledgement.

It's not that big a deal for the master to remember past committed
transactions until it knows all slaves have acknowledged committing
them; you only need a bit or so per transaction.  It's a much bigger
deal if the slave has to hold the transaction ready-to-commit for a
long time.  That transaction is holding locks, and also the sheer
volume of log data is way bigger.  (For comparison, we recycle pg_xlog
details about a transaction much sooner than we recycle pg_clog.)
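
The master-side bookkeeping can be sketched in a few lines. This is an
illustration only, with hypothetical names (CommitLog, record_commit,
acknowledge, outcome); it is not how pg_clog works, it just shows that
the master needs very little state per transaction and can drop it once
every slave has acknowledged.

```python
class CommitLog:
    """Hypothetical master-side record of which slaves still owe an ack."""

    def __init__(self):
        self._unacked = {}  # gid -> set of slave names yet to acknowledge

    def record_commit(self, gid, slaves):
        """Remember a committed transaction until all slaves confirm it."""
        self._unacked[gid] = set(slaves)

    def acknowledge(self, gid, slave):
        """Note one slave's ack; forget the transaction after the last one."""
        waiting = self._unacked.get(gid)
        if waiting is None:
            return  # already forgotten; a duplicate ack is harmless
        waiting.discard(slave)
        if not waiting:
            del self._unacked[gid]

    def outcome(self, gid):
        """Answer a belated question from a slave about this transaction."""
        return "committed" if gid in self._unacked else "unknown"
```

Once the entry is gone the master can no longer answer a late question,
which is the cheap half of the problem; the expensive half is the slave
holding locks while it waits.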

I think you really want some way for the slave to decide it can time out
and abort the transaction after all ... but I don't see how you do
that without breaking the 2PC protocol.

regards, tom lane



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD

   Or the slave could reject the request.
  
  Huh?  The slave has that option??  In what circumstance?
 
 I thought the slave could reject if someone local already had the row
 locked.

No, not at all. The slave would need to reject phase 1 commit ready
for this.

Andreas



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Sat, Sep 27, 2003 at 09:13:27AM -0300, Marc G. Fournier wrote:
 
 I think it was Andrew that suggested it ... when the slave times out, it
 should trigger a READ ONLY mode on the slave, so that when/if the master
 tries to start to talk to it, it can't ...
 
 As for the master itself, it should be smart enough that if it times out,
 it knows to actually abandon the slave and not continue to try ...

Yes, but now we're talking as though this is master-slave
replication.  Actually, master and slave are only useful terms in
a transaction for 2PC.  So every machine is both a master and a
slave.

It seems that one way out is just to fall back to read only as soon
as a single failure happens.  That's the least graceful but maybe
safest approach to failure, analogous to what fsck does to your root
filesystem at boot time.  Of course, since there's no read only
mode at the moment, this is all pretty hand-wavy on my part :-/

A


-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Bruce Momjian
Zeugswetter Andreas SB SD wrote:
 
Or the slave could reject the request.
   
   Huh?  The slave has that option??  In what circumstance?
  
  I thought the slave could reject if someone local already had the row
  locked.
 
 No, not at all. The slave would need to reject phase 1 commit ready
 for this.

Oh, yea, thanks.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Marc G. Fournier wrote:
  Or the slave could reject the request.
  
  Huh?  The slave has that option??  In what circumstance?
 
  I thought the slave could reject if someone local already had the row
  locked.
 
 All normal reasons for transaction failure are supposed to be checked
 for before the slave responds that it's ready to commit.  Otherwise it's
 supposed to say it can't commit.
 
 Basically the weak spot of 2PC is that it assumes there are no possible
 reasons for failure after ready to commit is sent.  You can make that
 approximately true, with sufficient investment of resources, but it's
 definitely not a pleasant assumption.

Yep.  There is no full solution.  I think it is like running with fsync
off --- if the OS crashes, you have to clean up --- if you fail on a
2-phase commit, you have to clean up.  Multi-master will be the same.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Sun, Sep 28, 2003 at 11:58:24AM -0700, Kevin Brown wrote:
  But the postmaster doesn't connect to any database, and in a serious
  failure, might not be able to start one.
 
 Ah, true.  But I figured that in the context of 2PC and replication that
 most of the associated failures were likely to occur in an active
 backend or something equivalent, where a stored procedure was likely to
 be accessible.

As you go on to note, that's not always a possibility.  For instance,
server C crashes and can't come back because, say, its WAL is
scrambled.  All it will currently be able to do is scream at you in
the logs, which won't solve all the problems one has with 2PC (among
other problems).

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Bruce Momjian
Andrew Sullivan wrote:
 On Sat, Sep 27, 2003 at 09:13:27AM -0300, Marc G. Fournier wrote:
  
  I think it was Andrew that suggested it ... when the slave times out, it
  should trigger a READ ONLY mode on the slave, so that when/if the master
  tries to start to talk to it, it can't ...
  
  As for the master itself, it should be smart enough that if it times out,
  it knows to actually abandon the slave and not continue to try ...
 
 Yes, but now we're talking as though this is master-slave
 replication.  Actually, master and slave are only useful terms in
 a transaction for 2PC.  So every machine is both a master and a
 slave.
 
 It seems that one way out is just to fall back to read only as soon
 as a single failure happens.  That's the least graceful but maybe
 safest approach to failure, analogous to what fsck does to your root
 filesystem at boot time.  Of course, since there's no read only
 mode at the moment, this is all pretty hand-wavy on my part :-/

Yes, but that affects all users, not just the transaction we were
working on. I think we have to get beyond the idea that this can be made
failure-proof, and just outline the behaviors for failure, and it has to
be configurable by the administrator.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Mon, Sep 29, 2003 at 11:14:30AM -0300, Marc G. Fournier wrote:
 
  Or the slave could reject the request.
 
 Huh?  The slave has that option??  In what circumstance?

In every circumstance where a stand-alone machine would have it. 
Machine A may not yet know about conflicting transactions on machine
B.  This is why 2PC is hard ;-)

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Marc G. Fournier wrote:
 Or the slave could reject the request.
 
 Huh?  The slave has that option??  In what circumstance?

 I thought the slave could reject if someone local already had the row
 locked.

All normal reasons for transaction failure are supposed to be checked
for before the slave responds that it's ready to commit.  Otherwise it's
supposed to say it can't commit.

Basically the weak spot of 2PC is that it assumes there are no possible
reasons for failure after ready to commit is sent.  You can make that
approximately true, with sufficient investment of resources, but it's
definitely not a pleasant assumption.
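
As a rough illustration of that rule, here is a small Python sketch of the slave's phase-1 decision. The function name vote and the idea of passing named check callables are assumptions made for the example, not anything from a real implementation.

```python
def vote(checks):
    """Run every local check before promising anything to the master.

    `checks` is a list of (name, callable) pairs; each callable returns
    True when that potential reason for failure has been ruled out.
    """
    for name, check in checks:
        if not check():
            # Say "can't commit" now, while refusing is still allowed.
            return ("abort", name)
    # After this answer the protocol assumes nothing can go wrong locally.
    return ("ready", None)
```

For instance, vote([("locks", lambda: True), ("constraints", lambda: False)]) refuses on the constraint check; the uncomfortable part is everything that can still fail after ("ready", None) has been sent.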

regards, tom lane



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Sat, Sep 27, 2003 at 08:36:36AM +, Jeff wrote:
 
 What do commercial databases do about 2PC or other multi-master solutions?
 You've done a good job of convincing me that it's unreliable no matter what
 (through your posts on this topic over a long time). However, I would think
 that something like Oracle or DB2 have some kind of answer for
 multi-master, and I'm curious what it is. If they don't, is it reasonable
 to make a test case that leaves their database inconsistent or hanging?

Most real replication systems are not doing 2PC.  For me, 2PC-based
replication is not real interesting anyway, because the point of
multi-master replication is often at least partly speed, and 2PC is
nothing if not a good way to make sure that every database is at
least as slow as the slowest node.

But 2PC is important for application-server-based, XA-type work, and
for heterogeneous databases.  Both of those would be real nice
features to support.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Fri, Sep 26, 2003 at 05:15:37PM -0400, Rod Taylor wrote:
  The first problem is the restart/rejoin problem.  When a 2PC member
  goes away, it is supposed to come back with all its former locks and
  everything in place, so that it can know what to do.  This is also
  extremely tricky, but I think the answer is sort of easy.  A member
  which re-joins without crashing (that is, it has open transactions,
 
 I think you may be confusing 2PC with replication.

No, I'm not.  One needs to decide how to handle the situation where a
slave database in a 2PC transaction goes away and comes back, for
whatever reasons that may happen.  Since the idea here is to come up
with ways of handling the failure of 2PC in some cases, we need
something which notices that members are not playing nice. 

 PostgreSQL's 2PC implementation should follow enough of the XA rules to
 play nice in a mixed environment where something else is managing the
 transactions (application servers are becoming more common all the
 time).

I agree.  But we still need to decide how to handle cases where
things go away, and if there are some transaction managers that don't
fit that model, then we should not accept such managers.  Of course,
what such managers do is important data in deciding what sorts of
compromises are acceptable.

A
-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Rod Taylor
  It seems that one way out is just to fall back to read only as soon
  as a single failure happens.  That's the least graceful but maybe
  safest approach to failure, analogous to what fsck does to your root
  filesystem at boot time.  Of course, since there's no read only
  mode at the moment, this is all pretty hand-wavy on my part :-/
 
 Yes, but that affects all users, not just the transaction we were
 working on. I think we have to get beyond the idea that this can be made
 failure-proof, and just outline the behaviors for failure, and it has to
 be configurable by the administrator.

Yes, but holding locks on the affected rows IS appropriate until the
administrator issues something like:

ALTER SYSTEM ABORT GLOBAL TRANSACTION 123;




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Peter Eisentraut
Tom Lane writes:

 No.  The real problem with 2PC in my mind is that its failure modes
 occur *after* you have promised commit to one or more parties.  In
 multi-master, if you fail you know it before you have told the client
 his data is committed.

I have a book here which claims that the solution to the problems of
2-phase commit is 3-phase commit, which goes something like this:

coordinator                     participant
-----------                     -----------
INITIAL                         INITIAL
        prepare -->
WAIT
                                <-- vote commit
                                READY
(all voted commit)
        prepare-to-commit -->
PRE-COMMIT
                                <-- ready-to-commit
                                PRE-COMMIT
        global-commit -->
COMMIT                          COMMIT


If the coordinator fails and all participants are in state READY, they can
safely decide to abort after some timeout.  If some participant is already
in state PRE-COMMIT, it becomes the new coordinator and sends the
global-commit message.
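
The recovery rule in the paragraph above can be sketched in Python roughly as follows. The function name and the string state labels are hypothetical; treating an already-committed participant like a PRE-COMMIT one, and aborting when anyone is still before READY, are assumptions added for completeness and are not from the book.

```python
def decide_after_coordinator_failure(participant_states):
    """Pick the outcome the surviving participants should converge on."""
    states = set(participant_states)
    if "PRE-COMMIT" in states or "COMMIT" in states:
        # Someone got past READY, so every participant voted commit:
        # that participant takes over and sends global-commit.
        return "global-commit"
    if states == {"READY"}:
        # Nobody was told to pre-commit; aborting after the timeout is safe.
        return "abort"
    # Assumption: with anyone still before READY, abort as well.
    return "abort"
```

This only covers the decision itself; electing the new coordinator and handling participants that are unreachable are left out.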

Details are left as an exercise. :-)

-- 
Peter Eisentraut   [EMAIL PROTECTED]




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Rod Taylor
 No, I'm not.  One needs to decide how to handle the situation where a
 slave database in a 2PC transaction goes away and comes back, for
 whatever reasons that may happen.  Since the idea here is to come up
 with ways of handling the failure of 2PC in some cases, we need
 something which notices that members are not playing nice. 

Yes, you're right. The part about the member reinitializing led me to
believe that you were thinking replication (read it as copying data from
source location to bring it back up to speed -- which is not what you
intended). 






Re: [HACKERS] 2-phase commit

2003-09-29 Thread Dann Corbit
 -Original Message-
 From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
 Sent: Monday, September 29, 2003 7:10 AM
 To: Marc G. Fournier
 Cc: Hiroshi Inoue; Tom Lane; 'Zeugswetter Andreas SB SD'; 
 'Andrew Sullivan'; [EMAIL PROTECTED]
 Subject: Re: [HACKERS] 2-phase commit
 
 
 Marc G. Fournier wrote:
   Master                  Slave
   ------                  -----
   commit ready -->
                       <-- OK
   commit done ->XX
  
   is the commit done message needed ?
  
  Of course ... how else will the Slave commit?  From my understanding,
  the concept is that the master sends a commit ready to the slave, but
  the OK back is that OK, I'm ready to commit whenever you are, at
  which point the master does its commit and tells the slave to do its
  ...
 
 Or the slave could reject the request.
 

Here is a BSD-like licensed transaction monitor:

http://tyrex.sourceforge.net/tpmonitor.html

The stuff that eventually became Tuxedo and Encina was open source from
MIT (not sure what came of it).  You used to be able to download the
source code for their transaction monitor that worked on the IBM RS/2.

This is the Transaction Internet Protocol:
http://www.ietf.org/html.charters/OLD/tip-charter.html
It should be considered very seriously as a general solution to the
problem.

I mention this, because a transaction monitor is the next logical step
in managing database activity.
Two phase commit is a subset of transaction processing.

Interesting discussion:
http://www.developer.com/db/article.php/10920_2246481_2
http://www.developer.com/java/data/article.php/10932_3066301_4

Article worth a look (win32 specific, but talks about developing a
transaction monitor):
http://archive.devx.com/free/mgznarch/vcdj/1998/octmag98/dtc1.asp

Some simple background for those who have not spent much time looking
into it:
http://www.geocities.com/rajesh_purohit/db/twophasecommit.html




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Manfred Spraul
Peter Eisentraut wrote:

Tom Lane writes:

 

No.  The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties.  In
multi-master, if you fail you know it before you have told the client
his data is committed.
   

I have a book here which claims that the solution to the problems of
2-phase commit is 3-phase commit, which goes something like this:

coordinator                     participant
-----------                     -----------
INITIAL                         INITIAL
        prepare -->
WAIT
                                <-- vote commit
                                READY
(all voted commit)
        prepare-to-commit -->
PRE-COMMIT
                                <-- ready-to-commit
                                PRE-COMMIT
        global-commit -->
COMMIT                          COMMIT

If the coordinator fails and all participants are in state READY, they can
safely decide to abort after some timeout.  If some participant is already
in state PRE-COMMIT, it becomes the new coordinator and sends the
global-commit message.

Details are left as an exercise. :-)
 

Ok. Let's assume one coordinator, two participants.
Global commit sent to both by coordinator. One replies with ok, the 
other one remains silent.
What should the coordinator do? It can't fail the transaction - the 
first participant has committed its part. It can't complete the 
transaction, because the ok from the 2nd participant is still outstanding.
I think Bruce is right: It's an admin decision. If a timeout expires, a 
user-supplied app should be called, with a safe default (database 
shutdown?).
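
One way to picture that hook, as a hedged Python sketch: the function name, the admin_hook parameter and the three decision strings are all made up for illustration, and a plain callable stands in for the user-supplied app.

```python
ALLOWED = ("commit", "abort", "shutdown")

def handle_commit_timeout(gid, admin_hook=None):
    """Ask the administrator's hook what to do with a stuck transaction."""
    if admin_hook is None:
        # No hook configured: use the safe default floated above.
        return "shutdown"
    decision = admin_hook(gid)
    if decision not in ALLOWED:
        # Assumption: an unrecognised answer also falls back to the default.
        return "shutdown"
    return decision
```

With no hook configured, handle_commit_timeout("gid-123") returns "shutdown"; a site that prefers to keep running could pass a hook that returns "abort" or "commit" per its own policy.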

--
   Manfred


Re: [HACKERS] 2-phase commit

2003-09-29 Thread Peter Eisentraut
Manfred Spraul writes:

 Ok. Let's assume one coordinator, two participants.
 Global commit sent to both by coordinator. One replies with ok, the
 other one remains silent.
 What should the coordinator do? It can't fail the transaction - the
 first participant has committed its part. It can't complete the
 transaction, because the ok from the 2nd participant is still outstanding.

If a participant doesn't reply in an orderly fashion (say, after timeout),
it just gets kicked out of the whole mechanism.  That isn't the
interesting part.  The interesting part is what happens when the
coordinator fails.

-- 
Peter Eisentraut   [EMAIL PROTECTED]




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Rod Taylor
On Mon, 2003-09-29 at 15:55, Peter Eisentraut wrote:
 Manfred Spraul writes:
 
  Ok. Let's assume one coordinator, two participants.
  Global commit sent to both by coordinator. One replies with ok, the
  other one remains silent.
  What should the coordinator do? It can't fail the transaction - the
  first participant has committed its part. It can't complete the
  transaction, because the ok from the 2nd participant is still outstanding.
 
 If a participant doesn't reply in an orderly fashion (say, after timeout),
 it just gets kicked out of the whole mechanism.  That isn't the
 interesting part.  The interesting part is what happens when the
 coordinator fails.

The hot-standby coordinator picks up where the first one left off. Just
like when the participant fails the hot-standby for that participant
steps up to the plate.

For the application server side in Java, I believe the standard is OTS
(Object Transaction Service).





Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Mon, Sep 29, 2003 at 12:59:55PM -0400, Bruce Momjian wrote:
 working on. I think we have to get beyond the idea that this can be made
 failure-proof, and just outline the behaviors for failure, and it has to
 be configurable by the administrator.

Exactly.  There are plenty of cases where graceless failure is
acceptable to someone as the right answer to the compromise.  Of
course, this is not to pretend they're not compromises.  There's a
world of difference between saying, "This is not safe, but if you
want to do it, here are some potential failure modes," and, "Hey, you
can use this even though it can't roll back 100% of the time, because
your application should check that."  Any comparison with any actual
application I have had to use is strictly coincidental. ;-)

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Dann Corbit
Commercial systems use:

Mainframe:
CICS

UNIX:
Tuxedo
Encina

Win32:
MTS

DEC/COMPAQ/HP:
ACMS

Probably lots of others that I have never heard about.



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Andrew Sullivan
On Mon, Sep 29, 2003 at 12:48:30PM -0400, Andrew Sullivan wrote:
 In every circumstance where a stand-alone machine would have it. 

Oops.  Wrong stage.  Never mind.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] 2-phase commit

2003-09-29 Thread Christopher Browne
[EMAIL PROTECTED] (Dann Corbit) writes:
 Tuxedo

Note that this is probably the only one of the lot that is _really_
worth looking at in a serious way, as the XA standard was essentially
based on Tuxedo.  (Irrelevant Aside: BEA had releases of CICS running
on both Unix and Windows NT, so it isn't quite fair to call that
mainframe code...)

There might be some value in looking at how Berkeley DB supports XA,
as there is actually support for using Berkeley DB as an XA resource
manager.

http://www.sleepycat.com/docs/ref/xa/xa_intro.html

While it would obviously be exceedingly inappropriate to copy any of
SleepyCat's software, there is some very useful background information
there on care and feeding which can give an idea of how a TP monitor
might be used and configured.
-- 
cbbrowne,@,libertyrms.info
http://dev6.int.libertyrms.com/
Christopher Browne
(416) 646 3304 x124 (land)



Re: [HACKERS] 2-phase commit

2003-09-29 Thread Dann Corbit
A really nice overview of how various transaction managers are modeled:

http://www.ti5.tu-harburg.de/Lecture/99ws/TP/06-OverviewOfTPSystemsAndProducts/sld001.htm



Re: [HACKERS] 2-phase commit

2003-09-28 Thread Kevin Brown
Bruce Momjian wrote:
 Kevin Brown wrote:
  Actually, all that's really necessary is the ability to call a stored
  procedure when some event occurs.  The stored procedure can take it from
  there, and since it can be written in C it can do anything the postgres
  user can do (for good or for ill, of course).
 
 But the postmaster doesn't connect to any database, and in a serious
 failure, might not be able to start one.

Ah, true.  But I figured that in the context of 2PC and replication that
most of the associated failures were likely to occur in an active
backend or something equivalent, where a stored procedure was likely to
be accessible.

But yes, you certainly want to account for failures where the database
itself is unavailable.  So I guess my original comment isn't strictly
true.  :-)


-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [HACKERS] 2-phase commit

2003-09-28 Thread Rod Taylor
  Actually, all that's really necessary is the ability to call a stored
  procedure when some event occurs.  The stored procedure can take it from
  there, and since it can be written in C it can do anything the postgres
  user can do (for good or for ill, of course).
 
 But the postmaster doesn't connect to any database, and in a serious
 failure, might not be able to start one.

In the event of a catastrophic failure, the 'nothing is running' scenario is
one that standard monitoring software should pick up on easily enough, and one
that PostgreSQL cannot help with anyway (normally this is admin error).

Something simple much like pg_locks with transaction state (idle,
waiting on local lock, waiting on 3rd party, etc.), time transaction
started, time of last status change would be plenty. The monitor
software folks (Big Brother, etc. etc.) can write jobs to query those
elements and create the appropriate SNMP events when say waiting on 3rd
party for  N minutes (log at 1, trouble ticket at 2, SysAdmin page at
5, escalate to VP Pager at 20 minutes or whatever corporate policy is).
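
The escalation ladder above could be sketched in Python like this. It assumes the monitoring job has already read "minutes waiting on 3rd party" from some view of the kind proposed here (no such view exists yet); the names ESCALATION and escalation_for are hypothetical, and the thresholds are just the example figures from the paragraph.

```python
# (threshold in minutes, action), highest threshold first.
ESCALATION = [
    (20, "escalate to VP pager"),
    (5, "page sysadmin"),
    (2, "open trouble ticket"),
    (1, "log"),
]

def escalation_for(minutes_waiting):
    """Return the strongest action whose threshold has been reached."""
    for threshold, action in ESCALATION:
        if minutes_waiting >= threshold:
            return action
    return None  # still within normal waiting time
```

The point of keeping this outside the backend is that the thresholds are corporate policy, which the monitoring software is already built to express.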

An alternative is to package an SNMP daemon (much like the stats daemon)
into the backend to generate SNMP events -- but I think this is overkill
if views are available.




Re: [HACKERS] 2-phase commit

2003-09-28 Thread Hiroshi Inoue
Hiroshi Inoue wrote:
 
  -Original Message-
  From: Tom Lane
 
  Bruce Momjian [EMAIL PROTECTED] writes:
   Tom Lane wrote:
   You're not considering the possibility of a transient communication
   failure.
 
   Can't the master re-send the request after a timeout?
 
  Not "it can", but "it has to".
 
 Why ? Mainly the coordinator(slave), not the participant(master),
 has the responsibility to resolve the in-doubt transaction.

As far as I see, it's the above point which prevents the
advance of this topic and the issue must be solved ASAP.

As opposed to your answer
   Not "it can", but "it has to",
my answer is
   Yes "it can", but "it doesn't have to".

The simplest scenario (though there could be variations) is

[At participant(master)'s side]
  Because the commit operation is done, it does nothing.

[At coordinator(slave)'s side]
   1) After a while
   2) re-establish the communication path to the
      participant(master)'s TM.
   3) resend the "commit request" to the participant's TM.
      Steps 1) to 3) would be repeated until the coordinator receives
      the "commit ok" message from the participant.
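
Steps 1) to 3) can be sketched as a small Python loop. This is only an illustration: resend_until_acked, send_commit and the "commit ok" return value are assumed names, and send_commit is expected to reconnect and resend in one call, raising ConnectionError when the path is down.

```python
import time

def resend_until_acked(send_commit, delay_seconds=1.0, sleep=time.sleep):
    """Repeat wait / reconnect / resend until "commit ok" comes back.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            if send_commit() == "commit ok":
                return attempts
        except ConnectionError:
            pass  # communication path is down; try again next round
        sleep(delay_seconds)  # step 1): "after a while"
```

As written the loop has no upper bound, so the sender must keep the transaction's state for as long as the other side stays silent.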

If there's no objection from you, I would assume I'm right.
Please don't dodge my question this time.

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/


Re: [HACKERS] 2-phase commit

2003-09-28 Thread Marc G. Fournier

On Mon, 29 Sep 2003, Hiroshi Inoue wrote:

 The simplest scenario (though there could be variations) is

 [At participant(master)'s side]
   Because the commit operation is done, it does nothing.

 [At coordinator(slave)'s side]
    1) After a while
    2) re-establish the communication path to the
       participant(master)'s TM.
    3) resend the "commit request" to the participant's TM.
       Steps 1) to 3) would be repeated until the coordinator receives
       the "commit ok" message from the participant.

 If there's no objection from you, I would assume I'm right.

'K, but what happens if the slave never gets a 'commit ok'?  Does the
slave keep trying ad nauseam?



Re: [HACKERS] 2-phase commit

2003-09-28 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 The simplest scenario (though there could be variations) is

 [At participant(master)'s side]
   Because the commit operation is done, it does nothing.

 [At coordinator(slave)'s side]
    1) After a while
    2) re-establish the communication path to the
       participant(master)'s TM.
    3) resend the "commit request" to the participant's TM.
       Steps 1) to 3) would be repeated until the coordinator receives
       the "commit ok" message from the participant.

[ scratches head ] I think you are using the terms "master" and "slave"
oppositely than I would.  But in any case, this is not an answer to the
concern I had.  You're assuming that the "coordinator(slave)" side is
willing to resend a request indefinitely, and also that the
"participant(master)" side is willing to retain per-transaction commit
state indefinitely so that it can correctly answer belated questions
from the other side.  What I was complaining about was that I don't
think either side can afford to remember per-transaction state
indefinitely.  2PC in the abstract is a useless academic abstraction ---
where the rubber meets the road is defining how you cope with failures
in the commit protocol.

regards, tom lane



Re: [HACKERS] 2-phase commit

2003-09-28 Thread Hiroshi Inoue
Tom Lane wrote:

 Hiroshi Inoue [EMAIL PROTECTED] writes:
  The simplest scenario (though there could be variations) is
 
  [At participant(master)'s side]
    Because the commit operation is done, it does nothing.
 
  [At coordinator(slave)'s side]
     1) After a while
     2) re-establish the communication path to the
        participant(master)'s TM.
     3) resend the "commit request" to the participant's TM.
        Steps 1) to 3) would be repeated until the coordinator receives
        the "commit ok" message from the participant.
 
 [ scratches head ] I think you are using the terms "master" and "slave"
 oppositely than I would.

Oops my mistake, sorry.
But is it 2-phase commit protocol in the first place ?

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/


Re: [HACKERS] 2-phase commit

2003-09-28 Thread Hiroshi Inoue


Hiroshi Inoue wrote:
 
 Tom Lane wrote:
 
  Hiroshi Inoue [EMAIL PROTECTED] writes:
   The simplest scenario (though there could be variations) is
 
   [At participant(master)'s side]
     Because the commit operation is done, it does nothing.
 
   [At coordinator(slave)'s side]
      1) After a while
      2) re-establish the communication path to the
         participant(master)'s TM.
      3) resend the "commit request" to the participant's TM.
         Steps 1) to 3) would be repeated until the coordinator receives
         the "commit ok" message from the participant.
 
  [ scratches head ] I think you are using the terms "master" and "slave"
  oppositely than I would.
 
 Oops my mistake, sorry.
 But is it 2-phase commit protocol in the first place ?

That is, in your example below

 Example:

Master                  Slave
------                  -----
commit ready -->
                    <-- OK
commit done ->XX

is the "commit done" message needed ?

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/


Re: [HACKERS] 2-phase commit

2003-09-28 Thread Hiroshi Inoue
Tom Lane wrote:

 Hiroshi Inoue [EMAIL PROTECTED] writes:
  The simplest scenario (though there could be variations) is

  [At participant(master)'s side]
    Because the commit operation is done, does nothing.

  [At coordinator(slave)'s side]
    1) After a while
    2) re-establish the communication path to the
       participant(master)'s TM.
    3) resend the "commit request" to the participant's TM.
    1)2)3) would be repeated until the coordinator receives
    the "commit ok" message from the participant.

 [ scratches head ] I think you are using the terms "master" and "slave"
 oppositely than I would.  But in any case, this is not an answer to the
 concern I had.  You're assuming that the "coordinator(slave)" side is
 willing to resend a request indefinitely, and also that the
 "participant(master)" side is willing to retain per-transaction commit
 state indefinitely so that it can correctly answer belated questions
 from the other side.  What I was complaining about was that I don't
 think either side can afford to remember per-transaction state
 indefinitely.

OK, maybe I understand your complaint.
Basically, such a situation can occur when either side
is down. Especially when the coordinator(master) is down,
the participants are in trouble. In such cases the XA
interface, for example, allows a heuristic commit on the participants.

If one or more participants are down, the coordinator
may have to remember per-transaction state indefinitely.
Is that a big problem?
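
To make the heuristic-commit trade-off concrete, here is a minimal Python sketch. It is an illustration only: the `Participant` class, its method names, and the outcome strings are hypothetical and are not taken from the XA specification or from any PostgreSQL patch. It shows a prepared participant that stops waiting and commits on its own, and what the coordinator finds when it comes back with its real decision.

```python
class Participant:
    """Hypothetical participant that has already voted "ready" (prepared)."""

    def __init__(self):
        self.state = "prepared"   # in doubt: it promised that it can commit
        self.heuristic = False    # True once it decided without the coordinator

    def heuristic_commit(self):
        """An operator or policy commits while the coordinator is unreachable."""
        if self.state == "prepared":
            self.state = "committed"
            self.heuristic = True

    def resolve(self, decision):
        """Apply the coordinator's real decision once it is reachable again.

        Returns "ok" when the outcomes agree and "heuristic mismatch" when the
        participant already decided differently on its own.
        """
        if self.state == "prepared":
            self.state = decision
            return "ok"
        return "ok" if self.state == decision else "heuristic mismatch"


lucky = Participant()
lucky.heuristic_commit()
lucky_result = lucky.resolve("committed")

unlucky = Participant()
unlucky.heuristic_commit()
unlucky_result = unlucky.resolve("aborted")

print(lucky_result, unlucky_result)
```

The second case is the price of a heuristic decision: the participant is no longer blocked, but the mismatch has to be repaired by hand.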

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/

Re: [HACKERS] 2-phase commit

2003-09-27 Thread Richard Huxton
On Saturday 27 September 2003 06:59, Tom Lane wrote:
 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
  ... You can make this work, but the resource costs
  are steep.
 
  So, after 'n' seconds of waiting, we abandon the slave and the slave
  abandons the master.

 [itch...]  But you surely cannot guarantee that the slave and the master
 time out at exactly the same femtosecond.  What happens when the comm
 link comes back online just when one has timed out and the other not?
 (Hint: in either order, it ain't good.  Double plus ungood if, say, the
 comm link manages to deliver the master's commit confirm message a
 little bit after the master has timed out and decided to abort after all.)

 In my book, timeout-based solutions to this kind of problem are certain
 disasters.

I might be (well, am actually) a bit out of my depth here, but surely what 
happens is if you have machines A,B,C and *any* of them thinks machine C has 
a problem then it does. If C can still communicate with the others then it is 
told to reinitialise/go away/start the sirens. If C can't communicate then 
it's all a bit academic.

Granted, if you have intermittent problems on a link and set your timeouts 
badly then you'll have a very brittle system, but if A thinks C has died, you 
can't just reverse that decision.

-- 
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] 2-phase commit

2003-09-27 Thread Marc G. Fournier


On Sat, 27 Sep 2003, Tom Lane wrote:

 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
  ... You can make this work, but the resource costs
  are steep.

  So, after 'n' seconds of waiting, we abandon the slave and the slave
  abandons the master.

 [itch...]  But you surely cannot guarantee that the slave and the master
 time out at exactly the same femtosecond.  What happens when the comm
 link comes back online just when one has timed out and the other not?
 (Hint: in either order, it ain't good.

I think it was Andrew that suggested it ... when the slave times out, it
should trigger a READ ONLY mode on the slave, so that when/if the master
tries to start to talk to it, it can't ...

As for the master itself, it should be smart enough that if it times out,
it knows to actually abandon the slave and not continue to try ...
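
A rough Python sketch of the read-only idea follows. Everything in it is hypothetical (the `Slave` class, the `timeout_s` parameter, the return strings), and it says nothing about what happens to a transaction the slave had already promised to commit, which is the hard part of the problem.

```python
class Slave:
    """Hypothetical slave that fences itself off after losing the master."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_contact = 0.0
        self.read_only = False

    def check_timeout(self, now):
        # Once the master has been silent for too long, go read-only.
        if now - self.last_contact > self.timeout_s:
            self.read_only = True   # stays set until an administrator clears it

    def on_master_message(self, now):
        self.check_timeout(now)
        if self.read_only:
            return "refused"
        self.last_contact = now
        return "accepted"


slave = Slave(timeout_s=30)
early = slave.on_master_message(now=10.0)    # within the timeout
late = slave.on_master_message(now=100.0)    # master was silent for 90 seconds
after = slave.on_master_message(now=101.0)   # still fenced off

print(early, late, after)
```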



Re: [HACKERS] 2-phase commit

2003-09-27 Thread Bruce Momjian
Richard Huxton wrote:
  [itch...]  But you surely cannot guarantee that the slave and the master
  time out at exactly the same femtosecond.  What happens when the comm
  link comes back online just when one has timed out and the other not?
  (Hint: in either order, it ain't good.  Double plus ungood if, say, the
  comm link manages to deliver the master's commit confirm message a
  little bit after the master has timed out and decided to abort after all.)
 
  In my book, timeout-based solutions to this kind of problem are certain
  disasters.
 
 I might be (well, am actually) a bit out of my depth here, but surely what 
 happens is if you have machines A,B,C and *any* of them thinks machine C has 
 a problem then it does. If C can still communicate with the others then it is 
 told to reinitialise/go away/start the sirens. If C can't communicate then 
 it's all a bit academic.
 
 Granted, if you have intermittent problems on a link and set your timeouts 
 badly then you'll have a very brittle system, but if A thinks C has died, you 
 can't just reverse that decision.

I have been thinking it might be time to start allowing external
programs to be called when certain events occur that require
administrative attention --- this would be a good case for that. 
Administrators could configure shell scripts to be run when the network
connection fails or servers drop off the network, alerting them to the
problem.  Throwing things into the server logs isn't _active_ enough.
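
As a sketch of what such a hook could look like, here is a small Python example. It is not PostgreSQL code: the `EVENT_HOOKS` table, the event names, and `fire_event` are made up for illustration, and the "script" is just a Python one-liner standing in for an administrator's shell script.

```python
import subprocess
import sys

# Hypothetical table mapping an event name to the command an administrator
# configured for it. A real setup would name a shell script here.
EVENT_HOOKS = {
    "peer_unreachable": [
        sys.executable, "-c", "import sys; print('ALERT', sys.argv[1])",
    ],
}


def fire_event(event, detail):
    """Run the external program configured for `event`, if there is one.

    Returns the program's exit status, or None when no hook is configured.
    """
    command = EVENT_HOOKS.get(event)
    if command is None:
        return None
    # The timeout keeps a hung alert script from blocking the caller forever.
    completed = subprocess.run(command + [detail], timeout=10)
    return completed.returncode


status = fire_event("peer_unreachable", "slave-2")
ignored = fire_event("disk_full", "n/a")
```

Running the program outside the server process is the point of the design: the alert still goes out when the database itself is in trouble.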

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-27 Thread Shridhar Daithankar
On Saturday 27 September 2003 20:17, Bruce Momjian wrote:
 Richard Huxton wrote:
 I have been thinking it might be time to start allowing external
 programs to be called when certain events occur that require
 administrative attention --- this would be a good case for that.
 Administrators could configure shell scripts to be run when the network
 connection fails or servers drop off the network, alerting them to the
 problem.  Throwing things into the server logs isn't _active_ enough.

I would say calling events from external libraries would be a good extension.
That could allow extending PostgreSQL in novel ways, e.g. calling a
log-record copy event after a WAL record is written, for near-real-time
replication. :-)

 Shridhar




Re: [HACKERS] 2-phase commit

2003-09-27 Thread Richard Huxton
On Saturday 27 September 2003 15:47, Bruce Momjian wrote:
 Richard Huxton wrote:
[snip]
  I might be (well, am actually) a bit out of my depth here, but surely
  what happens is if you have machines A,B,C and *any* of them thinks
  machine C has a problem then it does. If C can still communicate with the
  others then it is told to reinitialise/go away/start the sirens. If C
  can't communicate then it's all a bit academic.
 
[snip]

 I have been thinking it might be time to start allowing external
 programs to be called when certain events occur that require
 administrative attention --- this would be a good case for that.
 Administrators could configure shell scripts to be run when the network
 connection fails or servers drop off the network, alerting them to the
 problem.  Throwing things into the server logs isn't _active_ enough.

Actually, from the discussion I'd assumed there was some sort of plug-in 
policy daemon that was making decisions when things went wrong. Given the 
different scenarios 2 phase-commit will be used in, one size is unlikely to 
fit all.

The idea of a more general system is _very_ interesting. I know Wietse Venema 
has decided to provide an external policy interface for his Postfix 
mailserver, precisely because he wants to keep the core system fairly clean.
-- 
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] 2-phase commit

2003-09-27 Thread Marc G. Fournier


On Sat, 27 Sep 2003, Bruce Momjian wrote:

 I have been thinking it might be time to start allowing external
 programs to be called when certain events occur that require
 administrative attention --- this would be a good case for that.
 Administrators could configure shell scripts to be run when the network
 connection fails or servers drop off the network, alerting them to the
 problem.  Throwing things into the server logs isn't _active_ enough.

Actually, apparently you can do this now ... there is a mail module for
PostgreSQL that you can use to have the database send emails
out ...




Re: [HACKERS] 2-phase commit

2003-09-27 Thread Bruce Momjian
Marc G. Fournier wrote:
 
 
 On Sat, 27 Sep 2003, Bruce Momjian wrote:
 
  I have been thinking it might be time to start allowing external
  programs to be called when certain events occur that require
  administrative attention --- this would be a good case for that.
  Administrators could configure shell scripts to be run when the network
  connection fails or servers drop off the network, alerting them to the
  problem.  Throwing things into the server logs isn't _active_ enough.
 
 Actually, apparently you can do this now ... there is apparently a mail
 module for PostgreSQL that you can use to have the database send email's
 out ...

The only part that needs to be added is the ability to call an external
program when some event occurs, like a database write failure.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-27 Thread Hiroshi Inoue
 -----Original Message-----
 From: Tom Lane

 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  You're not considering the possibility of a transient communication
  failure.

  Can't the master re-send the request after a timeout?

 Not "it can", but "it has to".

Why?  It is mainly the coordinator(slave), not the participant(master),
that has the responsibility to resolve the in-doubt transaction.

regards,
Hiroshi Inoue

Re: [HACKERS] 2-phase commit

2003-09-27 Thread Kevin Brown
Bruce Momjian wrote:
 Marc G. Fournier wrote:
  
  
  On Sat, 27 Sep 2003, Bruce Momjian wrote:
  
   I have been thinking it might be time to start allowing external
   programs to be called when certain events occur that require
   administrative attention --- this would be a good case for that.
   Administrators could configure shell scripts to be run when the network
   connection fails or servers drop off the network, alerting them to the
   problem.  Throwing things into the server logs isn't _active_ enough.
  
  Actually, apparently you can do this now ... there is apparently a mail
  module for PostgreSQL that you can use to have the database send email's
  out ...
 
 The only part that needs to be added is the ability to call an external
 program when some event occurs, like a database write failure.

Actually, all that's really necessary is the ability to call a stored
procedure when some event occurs.  The stored procedure can take it from
there, and since it can be written in C it can do anything the postgres
user can do (for good or for ill, of course).


-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [HACKERS] 2-phase commit

2003-09-27 Thread Bruce Momjian
Kevin Brown wrote:
 Bruce Momjian wrote:
  Marc G. Fournier wrote:
   
   
   On Sat, 27 Sep 2003, Bruce Momjian wrote:
   
    I have been thinking it might be time to start allowing external
    programs to be called when certain events occur that require
    administrative attention --- this would be a good case for that.
    Administrators could configure shell scripts to be run when the network
    connection fails or servers drop off the network, alerting them to the
    problem.  Throwing things into the server logs isn't _active_ enough.
   
   Actually, apparently you can do this now ... there is apparently a mail
   module for PostgreSQL that you can use to have the database send email's
   out ...
  
  The only part that needs to be added is the ability to call an external
  program when some event occurs, like a database write failure.
 
 Actually, all that's really necessary is the ability to call a stored
 procedure when some event occurs.  The stored procedure can take it from
 there, and since it can be written in C it can do anything the postgres
 user can do (for good or for ill, of course).

But the postmaster doesn't connect to any database, and in a serious
failure, might not be able to start one.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Bruce Momjian
Zeugswetter Andreas SB SD wrote:
 
   From our previous discussion of 2-phase commit, there was concern that
   the failure modes of 2-phase commit were not solvable.  However, I think
   multi-master replication is going to have similar non-solvable failure
   modes, yet people still want multi-master replication.
  
  No.  The real problem with 2PC in my mind is that its failure modes
  occur *after* you have promised commit to one or more parties.  In
  multi-master, if you fail you know it before you have told the client
  his data is committed.
 
 Hmm ? The appl cannot take the first phase commit as its commit info. It 
 needs to wait for the second phase commit. The second phase is only finished
 when all coservers have reported back. 2PC is synchronous.
 
 The problems with 2PC are when after second phase commit was sent to all
 servers and before all report back one of them becomes unreachable/down ...
 (did it receive and do the 2nd commit or not) Such a transaction must stay
 open until the coserver is reachable again or an administrator committed/aborted it. 
 
 It is multi master replication that usually has an asynchronous mode for
 performance, and there the trouble starts.

Let me diagram this so we can see the issues.  Normal operation is:

Master                  Slave
------                  -----
commit ready -->
                        <-- OK
commit done -->
                        <-- OK
completed

One possible failure is:

Master                  Slave
------                  -----
commit ready -->
                        <-- OK
commit done -->
                        dies here
stuck waiting

Another possible failure is:

Master                  Slave
------                  -----
commit ready -->
                        <-- OK
dies here
                        stuck waiting

Are these the issues?  Can't we just add GUC timeouts to cause the
commit to fail, and the slave to stop waiting?  I suppose a problem is:

Master                  Slave
------                  -----
commit ready -->
                        <-- OK
sleep
                        stuck waiting, times out
commit done -->

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.
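
The timeout problem in the last diagram can be replayed with a few lines of Python. This is only a toy model with simulated clock ticks; the `run` function and its parameters are invented for the illustration and do not correspond to any GUC setting.

```python
def run(master_delay, slave_timeout):
    """Replay the last diagram with simulated clock ticks.

    master_delay  -- ticks the master sleeps between receiving OK and
                     sending "commit done"
    slave_timeout -- ticks the slave waits for "commit done" before giving up
    Returns the final (master, slave) outcomes.
    """
    slave = "prepared"   # phase 1 already succeeded: the slave answered OK
    for tick in range(1, master_delay + 1):
        if slave == "prepared" and tick > slave_timeout:
            slave = "aborted"       # the slave stops waiting on its own
    master = "committed"            # the master wakes up and commits
    if slave == "prepared":
        slave = "committed"         # "commit done" arrived in time
    return master, slave


in_time = run(master_delay=3, slave_timeout=5)
too_late = run(master_delay=8, slave_timeout=5)

print(in_time, too_late)
```

In the second run the two sides end up with different outcomes for the same transaction, which is the case a timeout alone cannot rule out.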

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Could we allow slaves to check if the backend is still alive, perhaps by
 asking the postmaster, similar to what we do with the cancel signal ---
 that way, the slave would never time out and always wait if the master
 was alive.

You're not considering the possibility of a transient communication
failure.  The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

Master                  Slave
------                  -----
commit ready -->
                        <-- OK
commit done ->XX

where ->XX means the message gets lost due to network failure.  Now
what?  The slave cannot abort; he promised he could commit, and he does
not know whether the master has committed or not.  The master does not
know the slave's state either; maybe he got the second message, and
maybe he didn't.  Both sides are forced to keep information about the 
open transaction indefinitely.  Timing out on either side could yield
the wrong result.
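
The same example as a minimal Python sketch, for anyone who wants to poke at it. The `State` and `Slave` names and the `run` helper are hypothetical; the sketch only models the three messages in the diagram and whether the last one is delivered.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PREPARED = "prepared"     # voted OK, outcome unknown ("in doubt")
    COMMITTED = "committed"
    ABORTED = "aborted"


class Slave:
    def __init__(self):
        self.state = State.ACTIVE

    def on_commit_ready(self):
        self.state = State.PREPARED
        return "OK"

    def on_commit_done(self):
        self.state = State.COMMITTED

    def may_abort_unilaterally(self):
        # Before voting the slave may abort freely; once it has answered OK
        # it has given up that right.
        return self.state is State.ACTIVE


def run(deliver_commit_done):
    """Play the exchange; optionally lose the "commit done" message."""
    slave = Slave()
    master = State.ACTIVE
    if slave.on_commit_ready() == "OK":
        master = State.COMMITTED       # the master decides and records commit
        if deliver_commit_done:
            slave.on_commit_done()
    return master, slave


master, slave = run(deliver_commit_done=False)
print(master.value, slave.state.value, slave.may_abort_unilaterally())
```

With the message lost, the master is committed while the slave is still prepared and may not abort, so neither side can drop its record of the transaction.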

regards, tom lane



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Could we allow slaves to check if the backend is still alive, perhaps by
  asking the postmaster, similar to what we do with the cancel signal ---
  that way, the slave would never time out and always wait if the master
  was alive.
 
 You're not considering the possibility of a transient communication
 failure.  The fact that you cannot currently contact the other guy
 is not proof that he's not still alive.
 
 Example:
 
   Master                  Slave
   ------                  -----
   commit ready -->
                           <-- OK
   commit done ->XX
 
 where ->XX means the message gets lost due to network failure.  Now
 what?  The slave cannot abort; he promised he could commit, and he does
 not know whether the master has committed or not.  The master does not
 know the slave's state either; maybe he got the second message, and
 maybe he didn't.  Both sides are forced to keep information about the 
 open transaction indefinitely.  Timing out on either side could yield
 the wrong result.

Can't the master re-send the request after a timeout?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Marc G. Fournier


On Fri, 26 Sep 2003, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Could we allow slaves to check if the backend is still alive, perhaps by
  asking the postmaster, similar to what we do with the cancel signal ---
  that way, the slave would never time out and always wait if the master
  was alive.

 You're not considering the possibility of a transient communication
 failure.  The fact that you cannot currently contact the other guy
 is not proof that he's not still alive.

 Example:

   Master                  Slave
   ------                  -----
   commit ready -->
                           <-- OK
   commit done ->XX

 where ->XX means the message gets lost due to network failure.  Now

'k, but isn't a lot of that a retry issue?  we're talking TCP here, not
UDP, which I *thought* was designed for transient network problems ... ?
I would think that any implementation would have a timeout/retry GUC
variable associated with it ... 'if no answer in x seconds, retry up to y
times' ...

if we are talking two computers sitting next to each other on a switch,
you'd expect those to be low ... but if you were talking about two
separate geographical locations (and yes, I realize you are adding lag to
the mix with waiting for responses), you'd expect those #s to rise ...
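
The 'retry up to y times' rule could look roughly like the Python sketch below. The names `send_with_retry` and `flaky_link` are made up for the example, and the link is faked so that no real network or GUC variable is involved.

```python
def send_with_retry(send, retries, timeout_s):
    """Call send(timeout_s) up to `retries` times; return the reply or None.

    `send` is any callable that returns a reply or raises TimeoutError.
    """
    for _ in range(retries):
        try:
            return send(timeout_s)
        except TimeoutError:
            continue    # no answer in timeout_s seconds, try again
    return None


def flaky_link(fail_first):
    """Build a fake sender that times out `fail_first` times, then answers."""
    calls = {"n": 0}

    def send(timeout_s):
        calls["n"] += 1
        if calls["n"] <= fail_first:
            raise TimeoutError("no answer in %s seconds" % timeout_s)
        return "OK"

    return send


recovered = send_with_retry(flaky_link(fail_first=2), retries=3, timeout_s=5)
gave_up = send_with_retry(flaky_link(fail_first=5), retries=3, timeout_s=5)

print(recovered, gave_up)
```

Note what the `None` case does not tell the sender: whether the other side never saw the message, or saw it and only the reply was lost.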





Re: [HACKERS] 2-phase commit

2003-09-26 Thread Marc G. Fournier


On Fri, 26 Sep 2003, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Could we allow slaves to check if the backend is still alive, perhaps by
  asking the postmaster, similar to what we do with the cancel signal ---
  that way, the slave would never time out and always wait if the master
  was alive.

 You're not considering the possibility of a transient communication
 failure.  The fact that you cannot currently contact the other guy
 is not proof that he's not still alive.

 Example:

   Master                  Slave
   ------                  -----
   commit ready -->
                           <-- OK
   commit done ->XX

 where ->XX means the message gets lost due to network failure.  Now
 what?

'k, but isn't a lot of that a retry issue?  we're talking TCP here, not
UDP, which I *thought* was designed for transient network problems ... ?
I would think that any implementation would have a timeout/retry GUC
variable associated with it ... 'if no answer in x seconds, retry up to y
times' ...

if we are talking two computers sitting next to each other on a switch,
you'd expect those to be low ... but if you were talking about two
separate geographical locations (and yes, I realize you are adding lag to
the mix with waiting for responses), you'd expect those #s to rise ...




Re: [HACKERS] 2-phase commit

2003-09-26 Thread Patrick Welche
On Fri, Sep 26, 2003 at 02:49:30PM -0300, Marc G. Fournier wrote:
... 
 if we are talking two computers sitting next to each other on a switch,
 you'd expect those to be low ... but if you were talking about two
 separate geographical locations (and yes, I realize you are adding lag to
 the mix with waiting for responses), you'd expect those #s to rise ...

Which I thought was the whole point of using a group communication protocol
such as spread in postgresql-r. It seemed solved there...

Cheers,

Patrick



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Bruce Momjian
Patrick Welche wrote:
 On Fri, Sep 26, 2003 at 02:49:30PM -0300, Marc G. Fournier wrote:
 ... 
  if we are talking two computers sitting next to each other on a switch,
  you'd expect those to be low ... but if you were talking about two
  separate geographical locations (and yes, I realize you are adding lag to
  the mix with waiting for responses), you'd expect those #s to rise ...
 
 Which I thought was the whole point of using a group communication protocol
 such as spread in postgresql-r. It seemed solved there...

Right, but I think we want to try to do two-phase commit without spread.
Spread seems overkill for this usage.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 You're not considering the possibility of a transient communication
 failure.

 Can't the master re-send the request after a timeout?

Not "it can", but "it has to".  The master *must* keep hold of that
request forever (or until the slave responds, or until we reconfigure
the system not to consider that slave valid anymore).  Similarly, the
slave cannot forget the maybe-committed transaction on pain of not being
a valid slave anymore.  You can make this work, but the resource costs
are steep.  For instance, in Postgres, you don't get to truncate the WAL
log, for what could be a really really long time --- more disk space
than you wanted to spend on WAL anyway.  The locks held by the
maybe-committed transaction are another potentially unpleasant problem;
you can't release them, no matter what else they are blocking.
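
A small Python sketch of the "must keep hold of that request" rule is below. The `Master` class, its `pending` table, and the `slave_reachable` flag are hypothetical; a real coordinator would have to keep this table on disk so that it survives a restart.

```python
class Master:
    """Hypothetical coordinator that remembers every unacknowledged commit."""

    def __init__(self):
        self.pending = {}   # gid -> number of times "commit done" was sent

    def commit(self, gid):
        # From here on the decision is final and must not be forgotten.
        self.pending[gid] = 0

    def resend_all(self, slave_reachable):
        """One retry round; forget a transaction only after the slave's ack."""
        for gid in list(self.pending):
            self.pending[gid] += 1
            if slave_reachable:
                del self.pending[gid]   # ack received, state can be released


master = Master()
master.commit("tx-1")
for _ in range(3):                      # the link is down for three rounds
    master.resend_all(slave_reachable=False)
still_held = dict(master.pending)
master.resend_all(slave_reachable=True)

print(still_held, master.pending)
```

As long as the slave stays unreachable the entry never leaves `pending`, and in the real system that entry stands for retained WAL and held locks.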

regards, tom lane



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  You're not considering the possibility of a transient communication
  failure.
 
  Can't the master re-send the request after a timeout?
 
 Not "it can", but "it has to".  The master *must* keep hold of that
 request forever (or until the slave responds, or until we reconfigure
 the system not to consider that slave valid anymore).  Similarly, the
 slave cannot forget the maybe-committed transaction on pain of not being
 a valid slave anymore.  You can make this work, but the resource costs
 are steep.  For instance, in Postgres, you don't get to truncate the WAL
 log, for what could be a really really long time --- more disk space
 than you wanted to spend on WAL anyway.  The locks held by the
 maybe-committed transaction are another potentially unpleasant problem;
 you can't release them, no matter what else they are blocking.

I think we would need a configurable timeout to say a slave is no longer
valid, like 60 seconds, and then let everyone release.  We can let the
administrator decide how long he wants to try to keep two hosts
communicating.  I don't see this as much different from multi-master
replication problems.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] 2-phase commit

2003-09-26 Thread Marc G. Fournier


On Fri, 26 Sep 2003, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  You're not considering the possibility of a transient communication
  failure.

  Can't the master re-send the request after a timeout?

 Not "it can", but "it has to".  The master *must* keep hold of that
 request forever (or until the slave responds, or until we reconfigure
 the system not to consider that slave valid anymore).  Similarly, the
 slave cannot forget the maybe-committed transaction on pain of not being
 a valid slave anymore.

Hr ... is there no way of having part of the protocol be a message
sent back saying whether it's a valid/invalid slave?  i.e. the slave has an
uncommitted transaction, never hears back from the master to actually do the
commit, so after x-secs * y-retries any messages it does try to send to the
master have a bit flag set to 'invalid'?




Re: [HACKERS] 2-phase commit

2003-09-26 Thread Christopher Browne
[EMAIL PROTECTED] (Bruce Momjian) writes:
 Patrick Welche wrote:
 On Fri, Sep 26, 2003 at 02:49:30PM -0300, Marc G. Fournier wrote:
 ... 
  if we are talking two computers sitting next to each other on a switch,
  you'd expect those to be low ... but if you were talking about two
  separate geographical locations (and yes, I realize you are adding lag to
  the mix with waiting for responses), you'd expect those #s to rise ...
 
 Which I thought was the whole point of using a group communication
 protocol such as spread in postgresql-r. It seemed solved there...

 Right, but I think we want to try to do two-phase commit without
 spread.  Spread seems overkill for this usage.

Is there some big demerit to _having_ that overkill?  If there is no
major price to pay, then I don't see why it isn't reasonable to simply
say "Sure, we'll use that!"

After all, PostgreSQL is set up to do _everything_ inside
transactions, even though there are some actions you might take that
don't forcibly need to be transactional.  That's overkill, and nobody
(well, barring fans of Certain Other Databases) complains that it's
overkill.
-- 
let name=cbbrowne and tld=libertyrms.info in String.concat @ [name;tld];;
http://dev6.int.libertyrms.com/
Christopher Browne
(416) 646 3304 x124 (land)


