Heikki Linnakangas wrote:
> Robert Treat wrote:
>> 2) Can be graphed over time (using rrdtool and others) for trending
>> checkpoint activity
>
> Hmm. You'd need the historical data to do that properly. In particular,
> if two checkpoints happen between the polling interval, you'd miss that.
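
[Illustration of the polling concern quoted above; a minimal sketch, not part of the patch. The function name and the timestamps are hypothetical. It assumes a monitor that can only read the most recent checkpoint time at each poll.]

```python
from bisect import bisect_right


def observed_checkpoints(checkpoint_times, poll_times):
    """Return the checkpoints a poller sees if each poll only reports
    the most recent checkpoint time (hypothetical helper)."""
    seen = set()
    for t in poll_times:
        # Index of the first checkpoint strictly after the poll time.
        i = bisect_right(checkpoint_times, t)
        if i:
            seen.add(checkpoint_times[i - 1])
    return sorted(seen)


# Made-up timestamps in seconds; both lists must be sorted.
checkpoints = [10, 70, 95, 160]
polls = [60, 120, 180]

seen = observed_checkpoints(checkpoints, polls)
missed = [c for c in checkpoints if c not in seen]
print("seen:", seen)
print("missed:", missed)
```

[With these made-up values, the checkpoints at 70 and 95 both fall between the polls at 60 and 120, so only the later one is ever observed.]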
On Sat, 05 Apr 2008 16:37:15 +0100
Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
May I just say that every person that is currently talking on this
thread is offtopic? Move it to -hackers please.
Joshua D. Drake
--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Co
Robert Treat wrote:
1) Alert if checkpointing stops occurring within a reasonable time frame (note
there are failure cases and normal use cases where this might occur) (also
note I'll agree, this isn't common, but the results are pretty disastrous if
it does happen)
What are the normal use cas
On Fri, 2008-04-04 at 02:21 -0400, Greg Smith wrote:
> Database stops checkpointing. WAL files pile up. In the middle of
> backup, system finally dies, and when it starts recovery there's a bad
> record in the WAL files--which there are now thousands of to apply, and
> the bad one is 4 hours
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> These kinds of things can be monitored externally very easily, say by
> Nagios, when the values are available via the database. If you have to
> troll the logs, it's quite a bit harder to do it.
> I'm not sure about the right values to export -- last ch
"Alvaro Herrera" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Greg Smith <[EMAIL PROTECTED]> writes:
>> > ... If they'd have noticed it while the server was up, perhaps because the
>> > "last checkpoint" value hadn't changed in a long time (which seems like it
>> > might be available via sta
On Friday 04 April 2008 03:14, Tom Lane wrote:
> Greg Smith <[EMAIL PROTECTED]> writes:
> > ... If they'd have noticed it while the server was up, perhaps because
> > the "last checkpoint" value hadn't changed in a long time (which seems
> > like it might be available via stats even if, as you say,
On Friday 04 April 2008 01:59, Tom Lane wrote:
> Greg Smith <[EMAIL PROTECTED]> writes:
> > On Thu, 3 Apr 2008, Tom Lane wrote:
> >> I'd much rather be spending our time and effort on understanding what
> >> broke for you, and fixing the code so it doesn't happen again.
> >
> > [ shit happens... ]
Tom Lane wrote:
> Greg Smith <[EMAIL PROTECTED]> writes:
> > ... If they'd have noticed it while the server was up, perhaps because the
> > "last checkpoint" value hadn't changed in a long time (which seems like it
> > might be available via stats even if, as you say, the background writer is
>
Greg Smith <[EMAIL PROTECTED]> writes:
> ... If they'd have noticed it while the server was up, perhaps because the
> "last checkpoint" value hadn't changed in a long time (which seems like it
> might be available via stats even if, as you say, the background writer is
> out of its mind at that
On Fri, 4 Apr 2008, Tom Lane wrote:
The actual advice I'd give to a DBA faced with such a case is to
kill -ABRT the bgwriter and send the stack trace to -hackers.
And that's a perfect example of where they're trying to get to. They
didn't notice the problem until after the crash. The server
Greg Smith <[EMAIL PROTECTED]> writes:
> On Fri, 4 Apr 2008, Tom Lane wrote:
>> (And you still didn't tell me what the actual failure case was.)
> Database stops checkpointing. WAL files pile up. In the middle of
> backup, system finally dies, and when it starts recovery there's a bad
> record
On Fri, 4 Apr 2008, Tom Lane wrote:
(And you still didn't tell me what the actual failure case was.)
Database stops checkpointing. WAL files pile up. In the middle of
backup, system finally dies, and when it starts recovery there's a bad
record in the WAL files--which there are now thousan
Greg Smith <[EMAIL PROTECTED]> writes:
> On Thu, 3 Apr 2008, Tom Lane wrote:
>> I'd much rather be spending our time and effort on understanding what
>> broke for you, and fixing the code so it doesn't happen again.
> [ shit happens... ]
Completely fair, but I still don't see how this particular
On Thu, 3 Apr 2008, Tom Lane wrote:
"the system stopped checkpointing" does not strike me as a routine
occurrence that we should be making provisions for DBAs to watch for.
What, pray tell, is the DBA supposed to do when and if he notices that?
Schedule downtime rather than wait for it to hap
Robert Treat <[EMAIL PROTECTED]> writes:
> I have to add, given that we already provide the time of last checkpoint
> information via pg_controldata, I don't understand why people are against
> making that information accessible to remote clients.
So, I can expect to see a patch next week that i
On Friday 04 April 2008 00:09, Greg Smith wrote:
> On Thu, 3 Apr 2008, Robert Treat wrote:
> > You can plug a single item graphed over time into things like rrdtool to
> > get good trending information. And it's often easier to do this using
> > sql interfaces to get the data than pulling it out of
On Thu, 3 Apr 2008, Tom Lane wrote:
As of PG 8.3, the bgwriter tries very hard to make the elapsed time of a
checkpoint be just about checkpoint_timeout *
checkpoint_completion_target, regardless of load factors.
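
[Illustration of the arithmetic in the sentence quoted above; a minimal sketch. The function name is hypothetical, and the example values (300 seconds, 0.5) are assumed settings chosen for illustration.]

```python
def expected_checkpoint_duration(checkpoint_timeout_s, completion_target):
    """Approximate elapsed time of a timeout-driven checkpoint, in seconds,
    as checkpoint_timeout * checkpoint_completion_target."""
    if not 0.0 < completion_target <= 1.0:
        raise ValueError("completion_target must be in (0, 1]")
    return checkpoint_timeout_s * completion_target


# Assumed settings: checkpoint_timeout = 300 s, completion target = 0.5.
print(expected_checkpoint_duration(300, 0.5))  # 150.0
```

[This covers only the timeout-driven case described in the quoted sentence.]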
In the cases where the timing on checkpoint writes is timeout driven.
When the
On Thu, 3 Apr 2008, Joshua D. Drake wrote:
For knowing how long checkpoints are taking. If they are taking too
long you may need to adjust your bgwriter settings, and it is a
serious drag to parse postgresql logs for this info.
There's some disconnect here between what I think you want here an
Greg Smith wrote:
> On Thu, 3 Apr 2008, Robert Treat wrote:
>
> > You can plug a single item graphed over time into things like rrdtool to
> > get good trending information. And it's often easier to do this using
> > sql interfaces to get the data than pulling it out of log files (almost
> > li
On Thu, 3 Apr 2008, Robert Treat wrote:
You can plug a single item graphed over time into things like rrdtool to
get good trending information. And it's often easier to do this using
sql interfaces to get the data than pulling it out of log files (almost
like the db was designed for that :-)
Theo Schlossnagle <[EMAIL PROTECTED]> writes:
> On Apr 3, 2008, at 10:33 PM, Tom Lane wrote:
>> Theo claimed he had a reason for wanting to know the latest checkpoint
>> time, *without* any intention of time-extended tracking of that; but
>> he didn't say what it was.
> We had a recent event where
On Apr 3, 2008, at 10:33 PM, Tom Lane wrote:
Theo claimed he had a reason for wanting to know the latest checkpoint
time, *without* any intention of time-extended tracking of that; but
he didn't say what it was. If there is a credible reason for that
then it might justify a patch of this nature
On Thu, 03 Apr 2008 22:33:15 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:
> JD seems to be on record that the existing logging mechanism sucks
> and he needs something else. That's fine, but I think it means that
> we need to improve logging in general,
Robert Treat <[EMAIL PROTECTED]> writes:
>> Tom Lane <[EMAIL PROTECTED]> wrote:
> 3. As of PG 8.3, the bgwriter tries very hard to make the elapsed time
> of a checkpoint be just about checkpoint_timeout *
> checkpoint_completion_target, regardless of load factors. So unless
> your settings are co
On Thursday 03 April 2008 21:14, Joshua D. Drake wrote:
> On Thu, 03 Apr 2008 20:29:18 -0400
>
> Tom Lane <[EMAIL PROTECTED]> wrote:
> > "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> > > Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> > >> Why is that useful?
> > >
> > > For knowing how long chec
Theo Schlossnagle wrote:
Has this feature been discussed on -hackers? I don't recall it (and
my memory has plenty of holes in it), but I'm sure that after
attending my talk last Sunday Theo hasn't sent in a patch for an
undiscussed feature ;-)
Andrew: I don't think this feature has be
Theo Schlossnagle <[EMAIL PROTECTED]> writes:
> Heikki: It is useful for knowing when the last checkpoint occurred.
I guess I'm wondering why that's important. In the current bgwriter
design, the system spends half its time checkpointing (or in general
checkpoint_completion_target % of the tim
On Thu, 03 Apr 2008 21:44:00 -0400
Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> I think there is quite possibly a good case for keeping some
> diagnostics in a table or tables, on a rolling basis, maybe. But then
> that's a facility that needs to be p
Joshua D. Drake wrote:
Exposing everything into the log files isn't always sufficient
(says the guy who maintains a remote admin tool)
It should be now that you can have machine readable logs (says the
guy who literally spent weeks making that happen) ;-)
And how does the pers
On Apr 3, 2008, at 7:08 PM, Andrew Dunstan wrote:
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are taking. If they are taking too
long you may need to adjust
On Thu, 03 Apr 2008 21:26:46 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:
> "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> > I would agree with this. We would need a history of checkpoints that
> > didn't reset until we told it to.
>
> Indeed, but the
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> I would agree with this. We would need a history of checkpoints that
> didn't reset until we told it to.
Indeed, but the submitted patch has nought whatsoever to do with that.
It exposes some instantaneous state.
You could perhaps *build* a log faci
On Thu, 03 Apr 2008 20:45:37 -0400
Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> > Exposing everything into the log files isn't always sufficient
> > (says the guy who maintains a remote admin tool)
> >
>
> It should be now that you can have machine
On Thu, 03 Apr 2008 20:29:18 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:
> "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> > Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> >> Why is that useful?
>
> > For knowing how long checkpoints are taking. If th
Robert Treat wrote:
On Thursday 03 April 2008 19:08, Andrew Dunstan wrote:
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are t
On Thursday 03 April 2008 19:08, Andrew Dunstan wrote:
> Joshua D. Drake wrote:
> >> Theo Schlossnagle wrote:
> >>> First whack at exposing the start and finish checkpoint times into
> >>> SQL.
> >>
> >> Why is that useful?
> >
> > For knowing how long checkpoints are taking. If they are taking too
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
>> Why is that useful?
> For knowing how long checkpoints are taking. If they are taking too
> long you may need to adjust your bgwriter settings, and it is a
> serious drag to parse postgresql logs for t
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are taking. If they are taking too
long you may need to adjust your bgwriter settings, and it is a
se
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Theo Schlossnagle wrote:
>> First whack at exposing the start and finish checkpoint times into SQL.
> Why is that useful?
Does this implementation even work? It looks to me like the
globalStats.last_checkpoint_start/done fields will go back to zer
On Thu, 03 Apr 2008 23:21:49 +0100
Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> Theo Schlossnagle wrote:
> > First whack at exposing the start and finish checkpoint times into
> > SQL.
>
> Why is that useful?
For knowing how long checkpoints are
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into SQL.
Why is that useful?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
Theo Schlossnagle wrote:
> First whack at exposing the start and finish checkpoint times into SQL.
I suggest using GetCurrentTimestamp() directly instead of time_t and
converting.
--
Alvaro Herrera  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, C
First whack at exposing the start and finish checkpoint times into SQL.
--
Theo Schlossnagle
Esoteric Curio -- http://lethargy.org/
OmniTI Computer Consulting, Inc. -- http://omniti.com/
checkpoint_exposed.patch
Description: Binary data