Re: [HACKERS] log_duration is redundant, no?

2006-09-16 Thread Guillaume Smet

On 9/16/06, Alvaro Herrera [EMAIL PROTECTED] wrote:

It might make sense to log _what_ is going on, without telling all the
little details, for example

LOG:  parse duration: 0.250 ms
LOG:  bind duration: 0.057 ms
LOG:  execute my_query: SELECT * FROM shop WHERE $1 = $2
DETAIL:  parameters: $1 = 'Clothes Clothes Clothes', $2 = 'Joe''s Widgets'


It's not really the idea when you use log_statement and log_duration.
Lines are completely different semantically speaking.
So you should have:
LOG:  parse (log_statement)
LOG:  duration: 0.250 ms (log_duration)

--
Guillaume

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] log_duration is redundant, no?

2006-09-16 Thread Guillaume Smet

On 9/16/06, Tom Lane [EMAIL PROTECTED] wrote:

The only asymmetry in the thing is that if log_statement fired then
we suppress duplicate printing of the query in the later duration log
message (if any) for that query.  But that seems like the right thing
if you're at all concerned about log volume.


Perhaps I'm not representative of the users of these settings but when
I used log_statement='all', I didn't really care about the log volume.
I knew it really generates a lot of log lines and it slows down my
database.

My only concern was that we now have less information with
log_statement='all' than with log_min_duration_statement.

That said, I don't use it myself now: I use exclusively
log_min_duration_statement and log_duration. So if you think it's
better like that, it's ok for me.

Does anyone else have an opinion about this?

--
Guillaume

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] log_duration is redundant, no?

2006-09-16 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 My only concern was that we now have less information with
 log_statement='all' than with log_min_duration_statement.

Well, you don't have the durations, but log_statement isn't supposed to
tell you that.  So I'm still quite confused about what it is that you
want to do differently.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] log_duration is redundant, no?

2006-09-15 Thread Guillaume Smet

On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:

It's done already ...


(Working on implementing the last changes you made in formatting in pgFouine)

Is it normal that when I set log_duration to on and log_statement to
all, I have the following output when I prepare/bind/execute a
prepared statement using the protocol:
LOG:  duration: 0.250 ms
LOG:  duration: 0.057 ms
LOG:  execute my_query: SELECT * FROM shop WHERE $1 = $2
DETAIL:  parameters: $1 = 'Clothes Clothes Clothes', $2 = 'Joe''s Widgets'

I suppose the first line is the prepare and the second line is the
bind but I'm not sure it's the desired behaviour.

Any comment?

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-15 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 Is it normal that when I set log_duration to on and log_statement to
 all, I have the following output when I prepare/bind/execute a
 prepared statement using the protocol:
 LOG:  duration: 0.250 ms
 LOG:  duration: 0.057 ms
 LOG:  execute my_query: SELECT * FROM shop WHERE $1 = $2
 DETAIL:  parameters: $1 = 'Clothes Clothes Clothes', $2 = 'Joe''s Widgets'

 I suppose the first line is the prepare and the second line is the
 bind but I'm not sure it's the desired behaviour.

Well, considering that the parse and bind may take longer than the
execute, I hardly think we want to ignore them for log_duration
purposes.  And we agreed that if log_duration is on and
log_min_duration_statement is not triggered, log_duration should print
*only* duration.  So I'm not sure what else you expected.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-15 Thread Guillaume Smet

On 9/16/06, Tom Lane [EMAIL PROTECTED] wrote:

Well, considering that the parse and bind may take longer than the
execute, I hardly think we want to ignore them for log_duration
purposes.  And we agreed that if log_duration is on and
log_min_duration_statement is not triggered, log_duration should print
*only* duration.  So I'm not sure what else you expected.


I don't know exactly what I expected. I'm just surprised to have only
the duration when log_statement is set to all.
If we consider that the prepare and the bind operations are important
(and I agree they can be), I wonder why do we remove the output we
have when log_min_duration_statement is set to 0 (I'm thinking of the
parse: and bind: lines)?

(sorry for the double post, I forgot to cc: the list)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-15 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 If we consider that the prepare and the bind operations are important
 (and I agree they can be), I wonder why do we remove the output we
 have when log_min_duration_statement is set to 0 (I'm thinking of the
 parse: and bind: lines)?

Well, we remove it for the execute: too if you have only log_duration
on.  My view of this is that log_duration is meant to find out the total
amount of time spent doing stuff, and you set log_min_duration_statement
to whatever your threshold of pain is for the amount of time spent doing
a single thing.  If it's less than log_min_duration_statement then
you're saying you don't care about the details of that individual step,
only the aggregate runtime.

log_statement has another goal entirely, which is to record *what* is
being done in a semantic sense, and so for that I think it makes sense
to log only actual executions (and not parse/bind leading up to 'em).

The only asymmetry in the thing is that if log_statement fired then
we suppress duplicate printing of the query in the later duration log
message (if any) for that query.  But that seems like the right thing
if you're at all concerned about log volume.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-15 Thread Alvaro Herrera
Tom Lane wrote:
 Guillaume Smet [EMAIL PROTECTED] writes:
  If we consider that the prepare and the bind operations are important
  (and I agree they can be), I wonder why do we remove the output we
  have when log_min_duration_statement is set to 0 (I'm thinking of the
  parse: and bind: lines)?
 
 Well, we remove it for the execute: too if you have only log_duration
 on.  My view of this is that log_duration is meant to find out the total
 amount of time spent doing stuff, and you set log_min_duration_statement
 to whatever your threshold of pain is for the amount of time spent doing
 a single thing.  If it's less than log_min_duration_statement then
 you're saying you don't care about the details of that individual step,
 only the aggregate runtime.

It might make sense to log _what_ is going on, without telling all the
little details, for example

LOG:  parse duration: 0.250 ms
LOG:  bind duration: 0.057 ms
LOG:  execute my_query: SELECT * FROM shop WHERE $1 = $2
DETAIL:  parameters: $1 = 'Clothes Clothes Clothes', $2 = 'Joe''s Widgets'

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] log_duration is redundant, no?

2006-09-09 Thread Peter Eisentraut
Alvaro Herrera wrote:
 Bruce Momjian wrote:
  #2, I think, but I am confused if you don't know the query, how
  valuable is the log_duration.

 Statistics?

I doubt that there is a statistical merit to calculating aggregate 
values over the duration of an anonymous set of queries.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] log_duration is redundant, no?

2006-09-09 Thread David Fetter
On Sat, Sep 09, 2006 at 06:33:10PM +0200, Peter Eisentraut wrote:
 Alvaro Herrera wrote:
  Bruce Momjian wrote:
   #2, I think, but I am confused if you don't know the query, how
   valuable is the log_duration.
 
  Statistics?
 
 I doubt that there is a statistical merit to calculating aggregate
 values over the duration of an anonymous set of queries.

How heavily loaded is the server is a perfectly legitimate metric to
have available, especially when more detailed, i.e. more invasive
probes could bring it down.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] log_duration is redundant, no?

2006-09-09 Thread Martijn van Oosterhout
On Sat, Sep 09, 2006 at 06:33:10PM +0200, Peter Eisentraut wrote:
 Alvaro Herrera wrote:
  Bruce Momjian wrote:
   #2, I think, but I am confused if you don't know the query, how
   valuable is the log_duration.
 
  Statistics?
 
 I doubt that there is a statistical merit to calculating aggregate 
 values over the duration of an anonymous set of queries.

Eh? Sure there is, if you want totals per database, number of queries
and where they were spread over the day. A perfectly reasonable example
was given in this thread.

If you're a service provider you don't care about what queries the
users are doing, just how many resources they're taking.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] log_duration is redundant, no?

2006-09-08 Thread Guillaume Smet

On 9/8/06, Bruce Momjian [EMAIL PROTECTED] wrote:

Alvaro Herrera wrote:
 Statistics?

Oh, interesting.


We build this type of report for our customers:
http://pgfouine.projects.postgresql.org/reports/sample_hourly.html
This one is a real one.

As you can see, we cannot tell the type of every query as we don't log
the text of all of them (we identified 138,788 of the 12,358,514
queries executed this day).

It helped to detect there was a weird behaviour at 11am which was not
a good idea as we have a lot of visits at 11am. We asked our customers
if there was any cronjob running at 11am we could move to another time
when the load was less intensive as it really slowed down the
database.

Sometimes we also detect that we suddenly have a big up in the number
of queries executed (and it can be small queries we cannot detect with
log_min_duration_statement = 500) without the same increase on the
HTTP side. We contact our customer to ask him what happens and if it's
normal. If it's not, they take a look at the changes they made to  the
code the previous day. Sometimes, we can even give them a hint because
it's at a specific time and we can suspect it's a cronjob running.
If they find the problem, it's directly corrected and we can check the
next day if it's okay. If they don't find the problem, then we enable
full query logging for a short time and we analyze them to find the
offending queries.

And sometimes, in the case we just host the database without
developing the application, we just simply need it for contractual
reasons to demonstrate why the server is suddenly becoming slow.

I hope it helps to clarify how we use it.

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-08 Thread Guillaume Smet

Tom,

On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:

It seems like we should either remove the separate log_duration boolean
or make it work as he suggests.  I'm leaning to the second answer now.


Do you want me to propose a patch or do you prefer to work on it
yourself? If so, do we keep the log_duration name or do we change it
to log_all_duration or another more appropriate name?

I attached the little patch I use to apply on our packages. I can work
on it to make it apply to HEAD and update the documentation.

I suppose we should also change the FE/BE protocol logging accordingly
but ISTM you already planned to change it for other reasons.

--
Guillaume
Index: src/backend/tcop/postgres.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.483
diff -u -r1.483 postgres.c
--- src/backend/tcop/postgres.c	4 Apr 2006 19:35:35 -	1.483
+++ src/backend/tcop/postgres.c	6 Apr 2006 12:12:00 -
@@ -1092,14 +1092,6 @@
 		usecs = (long) (stop_t.tv_sec - start_t.tv_sec) * 100 +
 			(long) (stop_t.tv_usec - start_t.tv_usec);
 
-		/* Only print duration if we previously printed the statement. */
-		if (was_logged  save_log_duration)
-			ereport(LOG,
-	(errmsg(duration: %ld.%03ld ms,
-			(long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 +
-  (stop_t.tv_usec - start_t.tv_usec) / 1000),
-		 (long) (stop_t.tv_usec - start_t.tv_usec) % 1000)));
-
 		/*
 		 * Output a duration_statement to the log if the query has exceeded
 		 * the min duration, or if we are to print all durations.
@@ -1107,6 +1099,7 @@
 		if (save_log_min_duration_statement == 0 ||
 			(save_log_min_duration_statement  0 
 			 usecs = save_log_min_duration_statement * 1000))
+		{
 			ereport(LOG,
 	(errmsg(duration: %ld.%03ld ms  statement: %s%s,
 			(long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 +
@@ -1114,6 +1107,17 @@
 			(long) (stop_t.tv_usec - start_t.tv_usec) % 1000,
 			query_string,
 			prepare_string ? prepare_string : )));
+		}
+		else
+		{
+			/* Print duration if we did not print it before. */
+			if (save_log_duration)
+ereport(LOG,
+		(errmsg(duration: %ld.%03ld ms,
+(long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 +
+	  (stop_t.tv_usec - start_t.tv_usec) / 1000),
+			 (long) (stop_t.tv_usec - start_t.tv_usec) % 1000)));
+		}
 	}
 
 	if (save_log_statement_stats)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-08 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 Do you want me to propose a patch or do you prefer to work on it
 yourself?

It's done already ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
AFAICS, there is absolutely no difference anymore between turning
log_duration ON and setting log_min_duration_statement to zero.
ISTM that having the two redundant GUC settings is just confusing,
and we should remove log_duration to simplify things.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

Tom,

On 9/7/06, Tom Lane [EMAIL PROTECTED] wrote:

AFAICS, there is absolutely no difference anymore between turning
log_duration ON and setting log_min_duration_statement to zero.
ISTM that having the two redundant GUC settings is just confusing,
and we should remove log_duration to simplify things.


log_duration can be useful if we restore a behaviour similar to what
it did in 7.4 ie when you can log duration only for every query. This
way you can have a global overview of your database activity.
I mean:
log_duration = on
log_min_duration_statement = 500
would log only duration for queries faster than 500 ms and duration +
query text for queries slower than 500ms (we can easily avoid
redundancy).

This allows you to have all query statistics without too much I/O. It
doesn't give you all the information as you don't have the query text
but it can give you useful statistics (to explain why the database
server is suddenly slower - we executed 2 millions queries last month
- now we execute 8 millions queries. We cannot easily have this sort
of information as a log_min_duration_statement = 500 won't give it and
a log_min_duration_statement = 0 generates far too I/O).

For another real life example, we are switching from a manual vacuum
to autovacuum for one of our customers and we want to know the
consequences. The query text is useless for a first analysis (and will
slow down the database too much) as we just want to detect if it's
slower or not.

Could we consider reintroduce the old behaviour rather than removing
this setting (we can rename it to a better name if needed)? I already
have a patch for that as we run a patched version of 8.1.4 here to
have this very behaviour.

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 I mean:
 log_duration = on
 log_min_duration_statement = 500
 would log only duration for queries faster than 500 ms and duration +
 query text for queries slower than 500ms (we can easily avoid
 redundancy).

I don't find this very persuasive --- it sounds awfully messy, and in
fact isn't that exactly the old behavior we got rid of because no one
could understand it?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread David Fetter
On Thu, Sep 07, 2006 at 06:06:51PM -0400, Tom Lane wrote:
 Guillaume Smet [EMAIL PROTECTED] writes:
  I mean:
  log_duration = on
  log_min_duration_statement = 500
  would log only duration for queries faster than 500 ms and
  duration + query text for queries slower than 500ms (we can easily
  avoid redundancy).
 
 I don't find this very persuasive --- it sounds awfully messy, and
 in fact isn't that exactly the old behavior we got rid of because no
 one could understand it?

Guillaume's the author of pgfouine, which understands it and helps
others to do same.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Josh Berkus
Tom,

 I don't find this very persuasive --- it sounds awfully messy, and in
 fact isn't that exactly the old behavior we got rid of because no one
 could understand it?

Well, we want analogous functionality.   We could stand to have it 
named/organized differently.   But maybe we should hold those chages for 
8.3, so that they can be tested properly?

I am finding that the log format prior to Bruce's change, which we were 
using for TPCE, makes it very hard to do log digest analysis if you use 
SPs or prepared queries at all.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:

I don't find this very persuasive --- it sounds awfully messy, and in
fact isn't that exactly the old behavior we got rid of because no one
could understand it?


I gave real use cases and we use it every day. It really helps us as a
PostgreSQL hosting company.

The fact is that no tool could really exploit this behaviour before. I
agree it's a totally useless information if you don't have a tool to
analyze the logs. This is no longer the case as pgFouine can extract
this information and make it useful by aggregating it.

Perhaps we could rename it to log_all_duration (my english is not that
good so I'm not sure it's a good name) and explain how it can be
useful in the documentation.

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Guillaume Smet wrote:
 On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:
  I don't find this very persuasive --- it sounds awfully messy, and in
  fact isn't that exactly the old behavior we got rid of because no one
  could understand it?
 
 I gave real use cases and we use it every day. It really helps us as a
 PostgreSQL hosting company.
 
 The fact is that no tool could really exploit this behaviour before. I
 agree it's a totally useless information if you don't have a tool to
 analyze the logs. This is no longer the case as pgFouine can extract
 this information and make it useful by aggregating it.
 
 Perhaps we could rename it to log_all_duration (my english is not that
 good so I'm not sure it's a good name) and explain how it can be
 useful in the documentation.

If you are using an external tool, can't you just restrict what you
display based on the logged duration?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 If you are using an external tool, can't you just restrict what you
 display based on the logged duration?

I think his basic complaint is that doing the full logging pushup for
even short-duration queries is too expensive, and that logging only the
duration and not the query text or parameters makes a significant speed
difference.  I'm not at all sure that I buy that, but if it's true then
subsequent filtering obviously doesn't help.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

On 9/8/06, Bruce Momjian [EMAIL PROTECTED] wrote:

If you are using an external tool, can't you just restrict what you
display based on the logged duration?


It's not a matter of having too much information in our reports (the
more information I have, the happier I am :)). It's a matter of
slowing down too much the server with too much I/O.
We can afford to log every duration and queries slower than 500ms
nearly without any overhead. We can't afford to log every query, it
generates too much I/O - note that we tried to do it and it was really
too slow.
With the former configuration we log 1.2 GB/day, with the latter I
suspect it will be far more than 60 GB/day (I don't have the exact
number as we can't do it for real but queries slower than 500 ms
represents 1/100 of the total amount of queries).

Query logging is really a nice way to monitor the activity of a
PostgreSQL server and the overhead is not that high if logging I/O are
not too intensive.

--
Guillaume

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  If you are using an external tool, can't you just restrict what you
  display based on the logged duration?
 
 I think his basic complaint is that doing the full logging pushup for
 even short-duration queries is too expensive, and that logging only the
 duration and not the query text or parameters makes a significant speed
 difference.  I'm not at all sure that I buy that, but if it's true then
 subsequent filtering obviously doesn't help.

Well, except for bind, all the log output display is zero cost, just a
printf(), as I remember.  The only cost that is significant, I think, is
the timing of the query, and that is happening for all the setttings
discussed.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Well, except for bind, all the log output display is zero cost, just a
 printf(), as I remember.  The only cost that is significant, I think, is
 the timing of the query, and that is happening for all the setttings
 discussed.

On a machine with slow gettimeofday(), measuring duration at all is
going to hurt, but apparently that is not Guillaume's situation ---
what's costing him is sheer log volume.  And remember that the
slow-gettimeofday problem exists mainly on cheap PCs, not server-grade
hardware.  Based on his experience I'm prepared to believe that there
is a use-case for logging just the duration for short queries.

It seems like we should either remove the separate log_duration boolean
or make it work as he suggests.  I'm leaning to the second answer now.
What's your vote?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Guillaume Smet

On 9/8/06, Tom Lane [EMAIL PROTECTED] wrote:

I think his basic complaint is that doing the full logging pushup for
even short-duration queries is too expensive, and that logging only the
duration and not the query text or parameters makes a significant speed
difference.  I'm not at all sure that I buy that, but if it's true then
subsequent filtering obviously doesn't help.


That's exactly my point. And on our highly loaded servers, the
log_duration behaviour makes the difference between:
- we have a clear overview of the server activity and
- we don't have any idea of what happens on this server (apart that
there are queries slower than X ms).

Regards,

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Well, except for bind, all the log output display is zero cost, just a
  printf(), as I remember.  The only cost that is significant, I think, is
  the timing of the query, and that is happening for all the setttings
  discussed.
 
 On a machine with slow gettimeofday(), measuring duration at all is
 going to hurt, but apparently that is not Guillaume's situation ---
 what's costing him is sheer log volume.  And remember that the
 slow-gettimeofday problem exists mainly on cheap PCs, not server-grade
 hardware.  Based on his experience I'm prepared to believe that there
 is a use-case for logging just the duration for short queries.
 
 It seems like we should either remove the separate log_duration boolean
 or make it work as he suggests.  I'm leaning to the second answer now.
 What's your vote?

#2, I think, but I am confused if you don't know the query, how valuable
is the log_duration.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Alvaro Herrera
Bruce Momjian wrote:

 #2, I think, but I am confused if you don't know the query, how valuable
 is the log_duration.

Statistics?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] log_duration is redundant, no?

2006-09-07 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian wrote:
 
  #2, I think, but I am confused if you don't know the query, how valuable
  is the log_duration.
 
 Statistics?

Oh, interesting.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] log_duration and log_statement

2006-03-14 Thread Simon Riggs
On Mon, 2006-03-13 at 23:40 +0100, Guillaume Smet wrote:

 Here are some background information to explain our issue and request.
 We are currently planning a migration from PostgreSQL 7.4 to
 PostgreSQL 8.1. We work on a medium sized database (2GB) with a rather
 important activity (12 millions queries a day with peaks up to 1000
 queries/s).
 We are analyzing the logs with a tool we developed (namely pgFouine
 available on pgFoundry). 

Guillaume,

Thanks very much for writing pgFouine. We've been doing our best to
support generation of useful logs for performance analysis, so please
feel free to ask for anything you see a need for.

 We currently use the following configuration
 for logging:
 - log_min_duration_statement = 500 to log the slowest queries
 - log_duration to log every query duration and have a global overview
 of our database activity (used to generate this sort of graphs:
 http://people.openwide.fr/~gsmet/postgresql/graphs.html ).

Interesting results and good graphics too.

I note your graphs don't show missing values: on the bottom graph there
is no data for 7pm and 2am, yet the graph passes directly from 6 to 8pm
as if the figure for 7pm was mid-way between them, rather than zero.

 We cannot log every query as we already generate 1.2GB of logs a day
 while only logging the text of one hundredth of the queries so we log
 only the duration for the not so slow queries.

How do you tell the difference between a SELECT and a Write query when
you do not see the text of the query?

 I didn't notice the log_duration behaviour has changed starting from
 8.0 (thanks to oicu for pointing me the 8.0 release notes on
 #postgresql) and what we did is not possible anymore with 8.x as
 log_duration now only logs the duration for queries logged with
 log_statement.
 
 I think the former behaviour can be interesting in our case and
 probably for many other people out there who use log analysis tools as
 logging only slow queries is not enough to have an overview of the
 database activity.
 I was thinking about something like log_duration = 'none|logged|all'
 which will allow us to switch between:
 - none: we don't log the duration (=log_duration=off);
 - logged: we log the duration only for logged queries (depending on
 log_statement as for 8.0);
 - all: we log every duration as 7.4 did before when log_duration was on.
 
 Any comment on this?

I think I need more persuasion before I see the value of this, but I'm
not going to immediately disregard this either.

Collecting information is interesting, but it must in some way lead to a
rational corrective action. Logging the duration of statements without
recording what they are would tell you there was a problem but make it
difficult to move towards an action rationally. Perhaps I'm
misunderstanding this.

Is the issue that the log volume is too high? We might be able to look
at ways to reduce/compress the log volume for duration/statement
logging.

Another thought might be to provide a random sample of queries. A 10%
sample would be sufficient for your case here and yet would provide a
possibility of deeper analysis also.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] log_duration and log_statement

2006-03-14 Thread Qingqing Zhou

Guillaume Smet [EMAIL PROTECTED] wrote

 Here are some background information to explain our issue and request.

On a separate issue, seems in pgfouine homepage a typo is there for a while:

What's New
2005-02-11 - pgFouine 0.4.99 released ...
2005-01-10 - pgFouine 0.2.1 released ...

I guess they should read as 2006 :-)

Regards,
Qingqing



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration and log_statement

2006-03-14 Thread Guillaume Smet
On 3/14/06, Qingqing Zhou [EMAIL PROTECTED] wrote:
 I guess they should read as 2006 :-)

Sure. Will fix it this evening. Thanks.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] log_duration and log_statement

2006-03-14 Thread Guillaume Smet
Simon,

On 3/14/06, Simon Riggs [EMAIL PROTECTED] wrote:
 Thanks very much for writing pgFouine. We've been doing our best to
 support generation of useful logs for performance analysis, so please
 feel free to ask for anything you see a need for.

OK. Thanks. I should write something clear about what we need one day
or another.

The two problems I have in mind currently are:
- problem to isolate a transaction as a whole. I mean I'd like to have
something to aggregate the BEGIN; QUERY1; QUERY2; COMMIT; as a whole ;
- problem with the way queries are logged when using PG 8.1 and the
8.1 jdbc driver. All our queries are logged as PREPARE and EXECUTE and
we don't have the variables passed in the log (so no value for $1,
$2... and we have the same problem to aggregate the real time of a
query) but I saw a few threads on -hackers talking about this problem.

 I note your graphs don't show missing values: on the bottom graph there
 is no data for 7pm and 2am, yet the graph passes directly from 6 to 8pm
 as if the figure for 7pm was mid-way between them, rather than zero.

Which is obviouly wrong. I missed it as it's not a common case on our
db. Thanks for the report.

 How do you tell the difference between a SELECT and a Write query when
 you do not see the text of the query?

A SELECT is a query beginning with SELECT. It's far from being perfect
but it works most of the time for our projects.
Our main problem is for queries like SELECT update_tables(); which are
inserts/delete in a stored procedure.
If you know any way to do it better, I'm open to any suggestion.
I don't know if it's possible for the backend to log this sort of information.

 I think I need more persuasion before I see the value of this, but I'm
 not going to immediately disregard this either.

 Collecting information is interesting, but it must in some way lead to a
 rational corrective action. Logging the duration of statements without
 recording what they are would tell you there was a problem but make it
 difficult to move towards an action rationally. Perhaps I'm
 misunderstanding this.

IMHO, the duration information are interesting as I can see when my
database is heavily loaded which is not correlated with the http
requests (two levels of cache) or with the server load on this
particular website.
Let's admit we have a slow down between 1am and 2pm. There can be a
lot of reasons for that:
- a lot of selects at this time because we are under heavy load from
our website;
- a lot of updates (cronjobs for example);
- another unknow reason we should try to find.
Having duration for all my queries can help me to understand the
problem as I will know:
- how many queries the database really executed at this particular time;
- how slow they are.
If I don't have a global slow down, perhaps, at this time I have a
specific table locked for an unknown reason.

I agree with you that it won't give me all the information to solve my
problem but _it allows me to detect the problem_ and it can at least
give me guidelines to how I can detect it more accurately.

It also gives us a way to see the evolution from one day to another as
we just operates the database, we don't develop the website.
If I have a boost from 1 million queries a day to 20 millions queries
a day, I won't know it with log_min_duration_statement and I can
detect it if log_duration logs every query.

 Is the issue that the log volume is too high? We might be able to look
 at ways to reduce/compress the log volume for duration/statement
 logging.

Well, I'm not sure we can reduce the size of a syslog log but perhaps
I'm wrong. Our problem is both the volume and the performance drop
introduced by logging.

 Another thought might be to provide a random sample of queries. A 10%
 sample would be sufficient for your case here and yet would provide a
 possibility of deeper analysis also.

Why not but I won't have a global overview just a statistical
information and I'd really like to have this overview without logging
every query with log_statement (which we do sometimes for a few
minutes when we really need more information about a specific
problem).

--
Guillaume

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] log_duration and log_statement

2006-03-14 Thread Jim C. Nasby
On Tue, Mar 14, 2006 at 08:50:22AM +, Simon Riggs wrote:
 Is the issue that the log volume is too high? We might be able to look
 at ways to reduce/compress the log volume for duration/statement
 logging.

ISTM that for performance analysis there's probably a better alternative
than just dumping query statements to a logfile. The information we
really want is stuff like:

WAL sync's per second
Read-only queries per second
Queries that modify data per second
Transactions per second
Blacks written per second (maybe broken down by WAL, heap and index)
etc...

Doesn't really have anything to do with logging query strings passed
into the parser.

I know dtrace could be used to provide this kind of info, but it's only
on Solaris (and eventually FreeBSD). But it should be possible to
provide our own version of that for platforms that don't have something
like dtrace.

Or perhaps there's some entirely different statistics collection method
we could come up with...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] log_duration and log_statement

2006-03-13 Thread Guillaume Smet
Hello,

Here are some background information to explain our issue and request.
We are currently planning a migration from PostgreSQL 7.4 to
PostgreSQL 8.1. We work on a medium sized database (2GB) with a rather
important activity (12 millions queries a day with peaks up to 1000
queries/s).
We are analyzing the logs with a tool we developed (namely pgFouine
available on pgFoundry). We currently use the following configuration
for logging:
- log_min_duration_statement = 500 to log the slowest queries
- log_duration to log every query duration and have a global overview
of our database activity (used to generate this sort of graphs:
http://people.openwide.fr/~gsmet/postgresql/graphs.html ).
We cannot log every query as we already generate 1.2GB of logs a day
while only logging the text of one hundredth of the queries so we log
only the duration for the not so slow queries.

I didn't notice the log_duration behaviour has changed starting from
8.0 (thanks to oicu for pointing me the 8.0 release notes on
#postgresql) and what we did is not possible anymore with 8.x as
log_duration now only logs the duration for queries logged with
log_statement.

I think the former behaviour can be interesting in our case and
probably for many other people out there who use log analysis tools as
logging only slow queries is not enough to have an overview of the
database activity.
I was thinking about something like log_duration = 'none|logged|all'
which will allow us to switch between:
- none: we don't log the duration (=log_duration=off);
- logged: we log the duration only for logged queries (depending on
log_statement as for 8.0);
- all: we log every duration as 7.4 did before when log_duration was on.

Any comment on this?

Thanks in advance for considering my request.

Regards,

--
Guillaume

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] log_duration

2003-03-14 Thread Bruce Momjian

Added to TODO:

* Add GUC log_statement_duration to print statement and = min duration

---

Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   One nice thing is that each element is orthoginal.  But, for the
   functionality desired, we have to merge log_statement and log_duration
   and have it print for statements taking over X milliseconds.  I have no
   problem adding it, but it has to be clear it isn't orthoginal but is a
   conditional combination of two other parameters.
  
  Actually, I was wondering if we shouldn't *replace* the current
  log_duration with a combined form (that specifies a minimum interesting
  duration).  I can't quite see the need for orthogonality here.  The
  only reason you'd care about query duration is that you're looking for
  the slow ones, no?  So why bother logging the fast ones?  Besides, you
  can specify min-duration zero if you really want 'em all.
 
 We did talk about this a while ago, and folks wanted the query printed
 _before_ it was executed, so they could see the query in the logs at the
 time it was issued, both for monitoring and for showing the time the
 query started when log_timestamp is enabled.
 
 Seems the clearest option would be for log_duration to print the query
 string too, and convert it to an integer field.  I can see zero meaning
 print all queries and durations.  What value do we use to turn it off? 
 -1?
 
 This would give us log_statement that prints at query start, and
 log_duration that prints query and duration at query end.  How is that?
 
 Maybe we should rename them as log_statement_start and
 log_statement_duration.
 
 -- 
   Bruce Momjian|  http://candle.pha.pa.us
   [EMAIL PROTECTED]   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] log_duration

2003-02-12 Thread Greg Stark

 Christopher Kings-Lynne wrote:
   Someone asked about this at FOSDEM. The only way I know to do it is look
   in the pgsql_temp directory, but they disappear pretty quickly.  Folks,
   do we need something to report sort file usage?

Fwiw here's the perl one-liner I used to tune sort_mem recently, 
(run in the $PGDATA/base directory):

perl -e 'while (sleep(1)) {if ($s = -s pgsql_tmp/*) { if ($s  $m) { $m = $s; print 
$s\n; } } }'

When doing this I had precisely the same thought about having Postgres print
out the disk space usage for sorts. 

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] log_duration

2003-02-12 Thread Kevin Brown
Greg Stark wrote:
 Not a big deal though, since I doubt anyone's actually parsing
 postgres logs.  Hm, brings up an interesting idea though, I wonder
 if it would be useful to log directly into postgres tables.

I was wondering roughly the same thing.  If you ran an external
program to process the logs and put them into a PostgreSQL database,
you'd have problems with the transactions of the log processor landing
in the logs as well, at least if all transactions were logged.  The
logging process would have to filter out its own transactions, which
might not be all that easy.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] log_duration

2003-02-12 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 One nice thing is that each element is orthoginal.  But, for the
 functionality desired, we have to merge log_statement and log_duration
 and have it print for statements taking over X milliseconds.  I have no
 problem adding it, but it has to be clear it isn't orthoginal but is a
 conditional combination of two other parameters.

Actually, I was wondering if we shouldn't *replace* the current
log_duration with a combined form (that specifies a minimum interesting
duration).  I can't quite see the need for orthogonality here.  The
only reason you'd care about query duration is that you're looking for
the slow ones, no?  So why bother logging the fast ones?  Besides, you
can specify min-duration zero if you really want 'em all.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] log_duration

2003-02-12 Thread Greg Stark

  One option is to have log_query output an identifier with the query such as a
  hash of the query or the pointer value for the plan, suppressing duplicates.
  Then log_duration prints the identifier with the duration. 

 Actually, log_pid is the proper way to do this.  You can then add log
 connections, and get a full snapshot of what is happening for that
 session.

Personally I would prefer a unique identifier. I guess the best way of
illustrating my intuition would be: Imagine loading all this data into a
relational database, what would you need to full normalize it?. 

Parsing log files programmatically is much easier if you have unique
identifiers instead of having to rely on the relative relationships of entries
in the log. 

Not a big deal though, since I doubt anyone's actually parsing postgres logs.
Hm, brings up an interesting idea though, I wonder if it would be useful to
log directly into postgres tables.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] log_duration

2003-02-12 Thread Bruce Momjian
Greg Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  Christopher Kings-Lynne [EMAIL PROTECTED] writes:
   Looking at the log_duration postgresql.conf option.  How about adding an
   option log_duration_min which is a value in milliseconds that is the minimum
   time a query must run for before being logged.
  
  Fine with me --- but you'll need to add more logic than that.  Right
  now, log_duration *only* causes the query duration to be printed out;
  if you ain't got log_statement on, you're in the dark as to what the
  query itself was.  You'll need to add some code to print the query
  (the log_min_error_statement logic might be a useful source of
  inspiration).  Not sure how this should interact with the case where
  log_duration is set and the min-duration isn't.  But maybe that case
  is silly, and we should just redefine log_duration as a minimum runtime
  that causes the query *and* its runtime to be printed to the log.

Tom is right here.  log_duration _just_ prints the duration, so we would
need to basically create a merged param that does log_duration and
log_statement and have it activate only if the statement takes more than
X milliseconds, something like log_long_statement, or something like
that.

Here are the log_* params we have:

log_connections = false
log_hostname = false
log_source_port = false
log_pid = false
log_statement = false
log_duration = false
log_timestamp = false

Basically, log_pid pulls them all together.  Without that, you don't
have any way to pull together individual lines in the log, and with pid
wraparound, you can't even do that 100%.  I wonder if we should put a
number before the pid and increment it on every pid wraparound. 

One nice thing is that each element is orthoginal.  But, for the
functionality desired, we have to merge log_statement and log_duration
and have it print for statements taking over X milliseconds.  I have no
problem adding it, but it has to be clear it isn't orthoginal but is a
conditional combination of two other parameters.

 Is it even guaranteed to be properly ordered on a busy server with multiple
 processors anyways?
 
 One option is to have log_query output an identifier with the query such as a
 hash of the query or the pointer value for the plan, suppressing duplicates.
 Then log_duration prints the identifier with the duration. 
 
 This means on a busy server running lots of prepared queries you would see a
 whole bunch of queries on startup, then hopefully no durations. Any durations
 printed could cause alarms to go off. To find the query you grep the logs for
 the identifier in the duration message.

Actually, log_pid is the proper way to do this.  You can then add log
connections, and get a full snapshot of what is happening for that
session.

 This only really works if you're using prepared queries everywhere. But in the
 long run that will be the case for OLTP systems, which is where log_duration
 is really useful.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] log_duration

2003-02-12 Thread Christopher Kings-Lynne
 Tom is right here.  log_duration _just_ prints the duration, so we would
 need to basically create a merged param that does log_duration and
 log_statement and have it activate only if the statement takes more than
 X milliseconds, something like log_long_statement, or something like
 that.

 Here are the log_* params we have:

   log_connections = false
   log_hostname = false
   log_source_port = false
   log_pid = false
   log_statement = false
   log_duration = false
   log_timestamp = false

OK, while I'm doing all this benchmarking and stuff - is there any sort of
option where I can see it logged when a sort doesn't have enought sort
memory and hence hits the disk?  eg. an elog(LOG) is emitted?

Chris


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] log_duration

2003-02-12 Thread Bruce Momjian
Christopher Kings-Lynne wrote:
  Tom is right here.  log_duration _just_ prints the duration, so we would
  need to basically create a merged param that does log_duration and
  log_statement and have it activate only if the statement takes more than
  X milliseconds, something like log_long_statement, or something like
  that.
 
  Here are the log_* params we have:
 
  log_connections = false
  log_hostname = false
  log_source_port = false
  log_pid = false
  log_statement = false
  log_duration = false
  log_timestamp = false
 
 OK, while I'm doing all this benchmarking and stuff - is there any sort of
 option where I can see it logged when a sort doesn't have enought sort
 memory and hence hits the disk?  eg. an elog(LOG) is emitted?

Someone asked about this at FOSDEM. The only way I know to do it is look
in the pgsql_temp directory, but they disappear pretty quickly.  Folks,
do we need something to report sort file usage?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] log_duration

2003-02-12 Thread Christopher Kings-Lynne
 Someone asked about this at FOSDEM. The only way I know to do it is look
 in the pgsql_temp directory, but they disappear pretty quickly.  Folks,
 do we need something to report sort file usage?

How about a new GUC variable: log_sort_tempfiles

And in the code that creates the temp file, if the GUC variable is true,
then do:

elog(LOG, Sort needed temp file.  Sort required 2456K.  Try increasing
sort_mem.);

Or something?

Chris


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] log_duration

2003-02-12 Thread Bruce Momjian

Well, part of the issue here is that it isn't always bad to use sort
file;  certainly it is better to use them than to swap.

We have a checkpoint_warning in 7.4 that will warn about excessive
checkpointing.  What would our criteria be for warning about sort_mem? 
Seems we would have to know how much free memory there is available, and
in fact, if there is lots of free memory, the sort files will just sit
in the kernel disk cache anyway.

I am not saying this is a bad idea --- we just need to define it clearer.

---

Christopher Kings-Lynne wrote:
  Someone asked about this at FOSDEM. The only way I know to do it is look
  in the pgsql_temp directory, but they disappear pretty quickly.  Folks,
  do we need something to report sort file usage?
 
 How about a new GUC variable: log_sort_tempfiles
 
 And in the code that creates the temp file, if the GUC variable is true,
 then do:
 
 elog(LOG, Sort needed temp file.  Sort required 2456K.  Try increasing
 sort_mem.);
 
 Or something?
 
 Chris
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] log_duration

2003-02-12 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  One nice thing is that each element is orthoginal.  But, for the
  functionality desired, we have to merge log_statement and log_duration
  and have it print for statements taking over X milliseconds.  I have no
  problem adding it, but it has to be clear it isn't orthoginal but is a
  conditional combination of two other parameters.
 
 Actually, I was wondering if we shouldn't *replace* the current
 log_duration with a combined form (that specifies a minimum interesting
 duration).  I can't quite see the need for orthogonality here.  The
 only reason you'd care about query duration is that you're looking for
 the slow ones, no?  So why bother logging the fast ones?  Besides, you
 can specify min-duration zero if you really want 'em all.

We did talk about this a while ago, and folks wanted the query printed
_before_ it was executed, so they could see the query in the logs at the
time it was issued, both for monitoring and for showing the time the
query started when log_timestamp is enabled.

Seems the clearest option would be for log_duration to print the query
string too, and convert it to an integer field.  I can see zero meaning
print all queries and durations.  What value do we use to turn it off? 
-1?

This would give us log_statement that prints at query start, and
log_duration that prints query and duration at query end.  How is that?

Maybe we should rename them as log_statement_start and
log_statement_duration.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] log_duration

2003-02-11 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
  Looking at the log_duration postgresql.conf option.  How about adding an
  option log_duration_min which is a value in milliseconds that is the minimum
  time a query must run for before being logged.
 
 Fine with me --- but you'll need to add more logic than that.  Right
 now, log_duration *only* causes the query duration to be printed out;
 if you ain't got log_statement on, you're in the dark as to what the
 query itself was.  You'll need to add some code to print the query
 (the log_min_error_statement logic might be a useful source of
 inspiration).  Not sure how this should interact with the case where
 log_duration is set and the min-duration isn't.  But maybe that case
 is silly, and we should just redefine log_duration as a minimum runtime
 that causes the query *and* its runtime to be printed to the log.

Is it even guaranteed to be properly ordered on a busy server with multiple
processors anyways?

One option is to have log_query output an identifier with the query such as a
hash of the query or the pointer value for the plan, suppressing duplicates.
Then log_duration prints the identifier with the duration. 

This means on a busy server running lots of prepared queries you would see a
whole bunch of queries on startup, then hopefully no durations. Any durations
printed could cause alarms to go off. To find the query you grep the logs for
the identifier in the duration message.

This only really works if you're using prepared queries everywhere. But in the
long run that will be the case for OLTP systems, which is where log_duration
is really useful.

--
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] log_duration

2003-02-10 Thread Christopher Kings-Lynne
Hi guys,

Looking at the log_duration postgresql.conf option.  How about adding an
option log_duration_min which is a value in milliseconds that is the minimum
time a query must run for before being logged.  Basically, what I'd be
interested in is please log the SQL query and duration of all queries that
last longer than 500ms.

That way I can quickly find badly formed queries, index them, etc...

Idea?

Chris


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] log_duration

2003-02-10 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Looking at the log_duration postgresql.conf option.  How about adding an
 option log_duration_min which is a value in milliseconds that is the minimum
 time a query must run for before being logged.

Fine with me --- but you'll need to add more logic than that.  Right
now, log_duration *only* causes the query duration to be printed out;
if you ain't got log_statement on, you're in the dark as to what the
query itself was.  You'll need to add some code to print the query
(the log_min_error_statement logic might be a useful source of
inspiration).  Not sure how this should interact with the case where
log_duration is set and the min-duration isn't.  But maybe that case
is silly, and we should just redefine log_duration as a minimum runtime
that causes the query *and* its runtime to be printed to the log.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html