On Tue, Jun 29, 2010 at 10:58 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
Robert Haas robertmh...@gmail.com wrote:
If someone is sloppy about how they copy the WAL files around,
they could temporarily have a truncated file.
Can you explain the scenario you're concerned about in
On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
On the other hand, I like immediate-panicking. And I don't want the standby
to retry reconnecting the master infinitely.
On second thought, the peremptory PANIC is not good for HA system. If the
master unfortunately has
On Tue, Jun 29, 2010 at 3:55 AM, Fujii Masao masao.fu...@gmail.com wrote:
On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
On the other hand, I like immediate-panicking. And I don't want the standby
to retry reconnecting the master infinitely.
On second thought, the
On Tue, Jun 29, 2010 at 6:59 AM, Robert Haas robertmh...@gmail.com wrote:
On Tue, Jun 29, 2010 at 3:55 AM, Fujii Masao masao.fu...@gmail.com wrote:
On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
On the other hand, I like immediate-panicking. And I don't want the
Robert Haas robertmh...@gmail.com wrote:
...with this patch, following the above, you get:
FATAL: invalid record in WAL stream
HINT: Take a new base backup, or remove recovery.conf and restart
in read-write mode.
LOG: startup process (PID 6126) exited with exit code 1
LOG:
On Tue, Jun 29, 2010 at 10:21 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
Robert Haas robertmh...@gmail.com wrote:
...with this patch, following the above, you get:
FATAL: invalid record in WAL stream
HINT: Take a new base backup, or remove recovery.conf and restart
in
Robert Haas robertmh...@gmail.com wrote:
If someone is sloppy about how they copy the WAL files around,
they could temporarily have a truncated file.
Can you explain the scenario you're concerned about in more
detail?
If someone uses cp or scp to copy a WAL file from the pg_xlog
On Tue, Jun 29, 2010 at 7:59 PM, Robert Haas robertmh...@gmail.com wrote:
On Tue, Jun 29, 2010 at 3:55 AM, Fujii Masao masao.fu...@gmail.com wrote:
On Tue, Jun 15, 2010 at 11:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
On the other hand, I like immediate-panicking. And I don't want the
On Tue, Jun 29, 2010 at 10:03 PM, Fujii Masao masao.fu...@gmail.com wrote:
This is true. But what I'm concerned about is:
1. Backend writes and fsyncs the WAL to the disk
2. The WAL on the disk gets corrupted
3. Walsender reads and sends that corrupted WAL image
4. The master crashes because
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something.
Heikki Linnakangas wrote:
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when
On Mon, Jun 14, 2010 at 12:16, Bruce Momjian br...@momjian.us wrote:
Heikki Linnakangas wrote:
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some
Magnus Hagander wrote:
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will do. How about
WARNING: invalid record length at 0/4005330
HINT: An invalid record was
On Mon, Jun 14, 2010 at 13:11, Bruce Momjian br...@momjian.us wrote:
Magnus Hagander wrote:
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will do. How about
WARNING:
Magnus Hagander wrote:
On Mon, Jun 14, 2010 at 13:11, Bruce Momjian br...@momjian.us wrote:
Magnus Hagander wrote:
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will
On Mon, Jun 14, 2010 at 7:18 AM, Magnus Hagander mag...@hagander.net wrote:
On Mon, Jun 14, 2010 at 13:11, Bruce Momjian br...@momjian.us wrote:
Magnus Hagander wrote:
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC.
On 14/06/10 13:16, Bruce Momjian wrote:
Heikki Linnakangas wrote:
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then
Bruce Momjian br...@momjian.us writes:
Magnus Hagander wrote:
It means that we can't prevent people from configuring their tools to
ignore important warning. We can't prevent them from ignoring ERROR or
FATAL either...
My point is that most tools are going to look at the tag first to
On Mon, Jun 14, 2010 at 10:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Bruce Momjian br...@momjian.us writes:
Magnus Hagander wrote:
It means that we can't prevent people from configuring their tools to
ignore important warning. We can't prevent them from ignoring ERROR or
FATAL either...
My
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 10:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
The correct log level for this message is LOG. End of discussion.
Why?
Because it's not being issued in a user's session. The only place it
can go is to the system log, and if you
On Mon, Jun 14, 2010 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 10:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
The correct log level for this message is LOG. End of discussion.
Why?
Because it's not being issued in a
Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 10:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
The correct log level for this message is LOG. End of discussion.
Why?
Because it's not being issued in a user's session. The only place it
can go is to the
Robert Haas robertmh...@gmail.com writes:
I'm willing to buy the above, but nobody has explained to my
satisfaction why it's remotely sane to go into an infinite retry loop
on an unrecoverable error.
That's a different question altogether ;-). I assume you're not
satisfied by the change
On Mon, Jun 14, 2010 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
I'm willing to buy the above, but nobody has explained to my
satisfaction why it's remotely sane to go into an infinite retry loop
on an unrecoverable error.
That's a different
On Mon, 2010-06-14 at 10:30 -0400, Tom Lane wrote:
I'm totally unimpressed by the argument that log-filtering
applications don't know enough to pay attention to LOG messages.
There are already a lot of those that are quite important to notice.
We have a log level where 1 log entry in a
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
That's a different question altogether ;-). I assume you're not
satisfied by the change Heikki committed a couple hours ago?
It will at least try to do something to recover.
Yeah,
On Mon, Jun 14, 2010 at 10:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
That's a different question altogether ;-). I assume you're not
satisfied by the change Heikki committed a couple
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to LOG? That will
certainly help high availability as well.
If a message is being issued in a non-user-connected session, there
is basically not a lot of point in WARNING or below. It should either
be LOG,
On Mon, Jun 14, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to LOG? That will
certainly help high availability as well.
If a message is being issued in a non-user-connected session, there
is
On Mon, 2010-06-14 at 11:14 -0400, Robert Haas wrote:
On Mon, Jun 14, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to LOG? That will
certainly help high availability as well.
If a message is
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
If a message is being issued in a non-user-connected session, there
is basically not a lot of point in WARNING or below. It should either
be LOG, or ERROR/FATAL/PANIC (which are
On Mon, Jun 14, 2010 at 11:34 AM, Simon Riggs si...@2ndquadrant.com wrote:
On Mon, 2010-06-14 at 11:14 -0400, Robert Haas wrote:
On Mon, Jun 14, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to LOG? That will
certainly help high availability as well.
If a message is being issued in a
On Mon, 2010-06-14 at 18:11 +0200, Dimitri Fontaine wrote:
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 14, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to LOG? That will
certainly
On Mon, Jun 14, 2010 at 12:31 PM, Simon Riggs si...@2ndquadrant.com wrote:
If that's the case, I guess Tom's right, once more, saying that LOG is
fine here. If we want to be more subtle than that, we'd need to revise
each and every error message and attribute it the right level, which it
Robert Haas robertmh...@gmail.com writes:
Not sure I agree with this - what I think the problem is here is we
need to make a clear distinction between recoverable errors and
unrecoverable errors.
Um, if it's recoverable, it's not really an error ...
regards, tom lane
On Mon, Jun 14, 2010 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
Not sure I agree with this - what I think the problem is here is we
need to make a clear distinction between recoverable errors and
unrecoverable errors.
Um, if it's recoverable,
On Mon, 2010-06-14 at 11:09 -0400, Tom Lane wrote:
Simon Riggs si...@2ndquadrant.com writes:
Should I be downgrading Hot Standby breakages to LOG? That will
certainly help high availability as well.
If a message is being issued in a non-user-connected session, there
is basically not a lot
Simon Riggs si...@2ndquadrant.com writes:
LOG is already over-used and so anything said at that level is drowned.
This is nonsense.
regards, tom lane
Simon Riggs si...@2ndquadrant.com wrote:
LOG is already over-used and so anything said at that level is
drowned. In many areas of code we cannot use a higher level
without trauma. That is a problem since we have no way to separate
the truly important from the barely interesting.
The fact
Kevin Grittner kevin.gritt...@wicourts.gov writes:
Simon Riggs si...@2ndquadrant.com wrote:
LOG is already over-used and so anything said at that level is
drowned. In many areas of code we cannot use a higher level
without trauma. That is a problem since we have no way to separate
the truly
On 6/14/10 7:57 AM, Tom Lane wrote:
However, I do agree that it's not helpful to loop forever. If we can
easily make it retry once and then PANIC, I'd be for that --- otherwise
I tend to agree that the best thing is just to PANIC immediately. There
are many many situations where a slave
On Mon, Jun 14, 2010 at 20:22, Tom Lane t...@sss.pgh.pa.us wrote:
Simon Riggs si...@2ndquadrant.com writes:
LOG is already over-used and so anything said at that level is drowned.
This is nonsense.
Whether it's over-used or not may be, but that doesn't make the
general issue nonsense.
But
Tom Lane t...@sss.pgh.pa.us wrote:
Kevin Grittner kevin.gritt...@wicourts.gov writes:
The fact that LOG is categorized the same as INFO has led me to
believe that they are morally equivalent --
They are not morally equivalent. INFO is for output that the user
has explicitly requested
On Tue, Jun 15, 2010 at 12:09 AM, Robert Haas robertmh...@gmail.com wrote:
The testing that I have been doing while we've been discussing this
reveals that you are correct. I set up an HS/SR master and slave
(running on the same machine), ran pgbench on the master, and then
started randomly
On Mon, Jun 14, 2010 at 10:35 PM, Fujii Masao masao.fu...@gmail.com wrote:
On Tue, Jun 15, 2010 at 12:09 AM, Robert Haas robertmh...@gmail.com wrote:
The testing that I have been doing while we've been discussing this
reveals that you are correct. I set up an HS/SR master and slave
(running
On Thu, 2010-06-10 at 09:57 -0400, Robert Haas wrote:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao masao.fu...@gmail.com
wrote:
When an error is found in the WAL streamed from the master, a
warning
message is repeated without interval forever in the standby. This
consumes CPU load very
On 11/06/10 07:18, Fujii Masao wrote:
On Fri, Jun 11, 2010 at 1:01 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a corrupt
record means
On Fri, Jun 11, 2010 at 8:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
On Thu, 2010-06-10 at 09:57 -0400, Robert Haas wrote:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao masao.fu...@gmail.com
wrote:
When an error is found in the WAL streamed from the master, a
warning
message is repeated
On Fri, Jun 11, 2010 at 9:32 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Hmm, right now it doesn't even reconnect when it sees a corrupt record
streamed from the master. It's really pointless to retry in that case,
reapplying the exact same piece of WAL surely won't work.
On Thu, 2010-06-10 at 19:01 +0300, Heikki Linnakangas wrote:
What warning message are we talking about? All the error cases I can
think of in WAL-application are ERROR, or likely even PANIC.
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors
On Fri, Jun 11, 2010 at 9:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
On Thu, 2010-06-10 at 19:01 +0300, Heikki Linnakangas wrote:
What warning message are we talking about? All the error cases I can
think of in WAL-application are ERROR, or likely even PANIC.
We're talking about a
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao masao.fu...@gmail.com wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao masao.fu...@gmail.com wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would
On 10/06/10 17:38, Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao masao.fu...@gmail.com wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes
On Thu, Jun 10, 2010 at 12:01 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a corrupt
record means you've reached end of WAL. In standby
On Thu, Jun 10, 2010 at 5:13 PM, Robert Haas robertmh...@gmail.com wrote:
At this point you should have a working HS/SR setup. Now:
8. shut the slave down
9. move recovery.conf out of the way
10. restart the slave - it will do recovery and enter normal running
11. make some database changes
On Thu, Jun 10, 2010 at 12:49 PM, Greg Stark gsst...@mit.edu wrote:
On Thu, Jun 10, 2010 at 5:13 PM, Robert Haas robertmh...@gmail.com wrote:
At this point you should have a working HS/SR setup. Now:
8. shut the slave down
9. move recovery.conf out of the way
10. restart the slave - it will
On Fri, Jun 11, 2010 at 1:01 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a corrupt
record means you've reached end of WAL. In standby
Hi,
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or
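
The idea debated throughout this thread (pause between retries instead of spinning, and give up after a bounded number of attempts rather than looping forever) can be sketched roughly as below. This is only an illustration in Python, not PostgreSQL code: `CorruptRecord`, `replay_with_backoff`, `read_record`, and the default values are all hypothetical names and numbers chosen for the example.

```python
import time


class CorruptRecord(Exception):
    """Hypothetical error for a WAL record that fails validation."""


def replay_with_backoff(read_record, max_attempts=3, delay_s=5.0, sleep=time.sleep):
    """Call read_record(); on CorruptRecord, log, sleep, and retry.

    After max_attempts failures the error is re-raised so the caller can
    stop instead of retrying forever. All names here are illustrative.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_record()
        except CorruptRecord as err:
            # Log once per attempt rather than in a tight loop.
            print(f"LOG: attempt {attempt}/{max_attempts} failed: {err}")
            if attempt == max_attempts:
                raise  # bounded retries: escalate to the caller
            sleep(delay_s)  # pause so the retry loop does not burn CPU
```

Passing `sleep` as a parameter is just a convenience so the delay can be replaced in tests; whether to retry at all for a record that is corrupt on the master is exactly the question the thread leaves open.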