subject:"\[HACKERS\] streaming replication breaks horribly if master crashes"

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-17 Thread Heikki Linnakangas

On 17/06/10 02:40, Greg Stark wrote: On Thu, Jun 17, 2010 at 12:16 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Greg Stark gsst...@mit.edu wrote: TCP keepalives are for detecting broken network connections Yeah. That seems like what we have here. If you shoot the OS in the head,

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-17 Thread Rafael Martinez

Heikki Linnakangas wrote: We're not talking about a timeout for promoting standby to master. The problem is that the standby doesn't notice that from the master's point of view, the connection has been broken. Whether it's because of a network

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-17 Thread Fujii Masao

On Thu, Jun 17, 2010 at 4:02 PM, Rafael Martinez r.m.guerr...@usit.uio.no wrote: I tested this yesterday and I could not get any reaction from the wal receiver even after using minimal values compared to the default values. The default values in linux for tcp_keepalive_time,

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-17 Thread Magnus Hagander

On Thu, Jun 17, 2010 at 09:20, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Jun 17, 2010 at 4:02 PM, Rafael Martinez r.m.guerr...@usit.uio.no wrote: I tested this yesterday and I could not get any reaction from the wal receiver even after using minimal values compared to the default values

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-17 Thread Tom Lane

Fujii Masao masao.fu...@gmail.com writes: On Thu, Jun 17, 2010 at 5:26 AM, Robert Haas robertmh...@gmail.com wrote: The real problem here is that we're sending records to the slave which might cease to exist on the master if it unexpectedly reboots. I believe that what we need to do is make

[HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Robert Haas

On Mon, Jun 14, 2010 at 7:55 AM, Simon Riggs si...@2ndquadrant.com wrote: But that change would cause the problem that Robert pointed out. http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php Presumably this means that if synchronous_commit = off on primary that SR in 9.0 will no

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Joshua D. Drake

On Wed, 2010-06-16 at 15:47 -0400, Robert Haas wrote: So, obviously at this point my slave database is corrupted beyond repair due to nothing more than an unexpected crash on the master. That's bad. What is worse is that the system only detected the corruption because the slave had crossed

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote: So, obviously at this point my slave database is corrupted beyond repair due to nothing more than an unexpected crash on the master. Certainly that's true for resuming replication. From your description it sounds as though the slave would be usable

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Stefan Kaltenbrunner

On 06/16/2010 09:47 PM, Robert Haas wrote: On Mon, Jun 14, 2010 at 7:55 AM, Simon Riggs si...@2ndquadrant.com wrote: But that change would cause the problem that Robert pointed out. http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php Presumably this means that if

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote: I don't know what to do about this This probably is out of the question for 9.0 based on scale of change, and maybe forever based on the impact of WAL volume, but -- if we logged before images along with the after, we could undo the work of the

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Kevin Grittner

Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: well this is likely caused by the OS not noticing that the connections went away (linux has really long timeouts here) - maybe we should unconditionally enable keepalive on systems that support that for replication connections (if that is
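
As an illustration of the keepalive idea raised in this message, here is a minimal sketch of turning on TCP keepalive probes for a client socket. It is not taken from the thread or from PostgreSQL's code: the helper name `enable_keepalive` and the timing values are assumptions chosen for the example, and the per-socket tuning options are the Linux names, so they are applied only where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keepalive on an existing socket (hypothetical helper).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the connection is declared dead
    """
    # Portable part: ask the OS to send keepalive probes at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning: these option names are Linux-specific, so each
    # one is set only if this platform's socket module defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s, idle=30, interval=5, count=3)
    enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    print("keepalive enabled:", enabled)
    s.close()
```

With settings like these, a peer that vanishes without closing the connection should be noticed after roughly idle + interval * count seconds, rather than after the much longer system-wide defaults.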

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Josh Berkus

The first problem I noticed is that the slave never seems to realize that the master has gone away. Every time I crashed the master, I had to kill the wal receiver process on the slave to get it to reconnect; otherwise it just sat there waiting, either forever or at least for longer than I

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Robert Haas

On Wed, Jun 16, 2010 at 4:00 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: So, obviously at this point my slave database is corrupted beyond repair due to nothing more than an unexpected crash on the master. Certainly that's true for resuming

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Robert Haas

On Wed, Jun 16, 2010 at 4:14 PM, Josh Berkus j...@agliodbs.com wrote: The first problem I noticed is that the slave never seems to realize that the master has gone away. Every time I crashed the master, I had to kill the wal receiver process on the slave to get it to reconnect; otherwise it

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: So, obviously at this point my slave database is corrupted beyond repair due to nothing more than an unexpected crash on the master. Certainly that's true for

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Magnus Hagander

On Wed, Jun 16, 2010 at 22:26, Robert Haas robertmh...@gmail.com wrote: and this just makes it more likely. After the most recent crash, the master thought pg_current_xlog_location() was 1/86CD4000; the slave thought pg_last_xlog_receive_location() was 1/8733C000. After reconnecting to the

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Rafael Martinez

Robert Haas wrote: The first problem I noticed is that the slave never seems to realize that the master has gone away. Every time I crashed the master, I had to kill the wal receiver process on the slave to get it to reconnect; otherwise it

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes: The first problem I noticed is that the slave never seems to realize that the master has gone away. Every time I crashed the master, I had to kill the wal receiver process on the slave to get it to reconnect; otherwise it just sat there waiting,

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Josh Berkus

On 6/16/10 1:26 PM, Robert Haas wrote: Similarly with synchronous_commit=off, I believe that the next checkpoint will still fsync WAL, but the lag might be long. That's not a showstopper. Just tell people that having synch_commit=off on the master might increase the lag to the slave, and

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Pierre C

The real problem here is that we're sending records to the slave which might cease to exist on the master if it unexpectedly reboots. I believe that what we need to do is make sure that the master only sends WAL it has already fsync'd How about this : - pg records somewhere the xlog

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Greg Stark

On Wed, Jun 16, 2010 at 9:56 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The first problem I noticed is that the slave never seems to realize that the master has gone away. Every time I crashed the master, I had to kill the wal receiver process on the

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Kevin Grittner

Greg Stark gsst...@mit.edu wrote: TCP keepalives are for detecting broken network connections Yeah. That seems like what we have here. If you shoot the OS in the head, the network connection is broken rather abruptly, without the normal packets exchanged to close the TCP connection. It

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Kevin Grittner

Kevin Grittner kevin.gritt...@wicourts.gov wrote: It sounds like it behaves just fine except for not detecting a broken connection. Of course I meant in terms of the slave's attempts at retrieving more WAL, not in terms of it applying a second time line. TCP keepalive timeouts don't help

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Greg Stark

On Thu, Jun 17, 2010 at 12:22 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: It sounds like it behaves just fine except for not detecting a broken connection. Of course I meant in terms of the slave's attempts at retrieving more WAL,

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Greg Stark

On Thu, Jun 17, 2010 at 12:16 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Greg Stark gsst...@mit.edu wrote: TCP keepalives are for detecting broken network connections Yeah. That seems like what we have here. If you shoot the OS in the head, the network connection is broken

Re: [HACKERS] streaming replication breaks horribly if master crashes

2010-06-16 Thread Fujii Masao

On Thu, Jun 17, 2010 at 5:26 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 16, 2010 at 4:14 PM, Josh Berkus j...@agliodbs.com wrote: The first problem I noticed is that the slave never seems to realize that the master has gone away. Every time I crashed the master, I had to kill

26 matches
