Re: [HACKERS] [PATCHES] updated patch for selecting large results

2006-08-30 Thread Chris Mair
On Tue, 2006-08-29 at 18:31 -0400, Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  here comes the latest version (version 7) of the patch to handle large
  result sets with psql.  As previously discussed, a cursor is used
  for SELECT queries when \set FETCH_COUNT some_value  0
 
 Applied with revisions ... I didn't like the fact that the code was
 restricted to handle only unaligned output format, so I fixed print.c
 to be able to deal with emitting output in sections.  This is not
 ideal for aligned output mode, because we compute column widths
 separately for each FETCH group, but all the other output modes work
 nicely.  I also did a little hacking to make \timing and pager output
 work as expected.
 
   regards, tom lane

Cool!
I specially like that as a side effect of your work for applying this,
psql is faster now.

Thanks to all people that helped with this (lots...:)

Bye, Chris.



-- 

Chris Mair
http://www.1006.org


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors

2006-08-29 Thread Tom Lane
[EMAIL PROTECTED] writes:
 here comes the latest version (version 7) of the patch to handle large
 result sets with psql.  As previously discussed, a cursor is used
 for SELECT queries when \set FETCH_COUNT some_value  0

Applied with revisions ... I didn't like the fact that the code was
restricted to handle only unaligned output format, so I fixed print.c
to be able to deal with emitting output in sections.  This is not
ideal for aligned output mode, because we compute column widths
separately for each FETCH group, but all the other output modes work
nicely.  I also did a little hacking to make \timing and pager output
work as expected.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors

2006-08-28 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Wait a minute.  What I thought we had agreed to was a patch to make
 commands sent with \g use a cursor.  This patch changes SendQuery
 so that *every* command executed via psql is treated this way.

 That's what I remembered.  I don't think we want to introduce a 
 difference between ; and \g.

Have we measured the performance impact, then?  The last time I profiled
psql, GetVariable was already a hotspot, and this introduces another
call of it into the basic query loop whether you use the feature or not.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors

2006-08-28 Thread Tom Lane
[EMAIL PROTECTED] writes:
 here comes the latest version (version 7) of the patch to handle large
 result sets with psql.  As previously discussed, a cursor is used
 for SELECT queries when \set FETCH_COUNT some_value  0

Wait a minute.  What I thought we had agreed to was a patch to make
commands sent with \g use a cursor.  This patch changes SendQuery
so that *every* command executed via psql is treated this way.
That's a whole lot bigger behavioral change than I think is warranted.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors

2006-08-28 Thread Peter Eisentraut
Tom Lane wrote:
 Wait a minute.  What I thought we had agreed to was a patch to make
 commands sent with \g use a cursor.  This patch changes SendQuery
 so that *every* command executed via psql is treated this way.

That's what I remembered.  I don't think we want to introduce a 
difference between ; and \g.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in

2006-08-28 Thread Chris Mair
On Mon, 2006-08-28 at 13:45 -0400, Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Tom Lane wrote:
   Wait a minute.  What I thought we had agreed to was a patch to make
   commands sent with \g use a cursor.  This patch changes SendQuery
   so that *every* command executed via psql is treated this way.
 
  That's what I remembered.  I don't think we want to introduce a 
  difference between ; and \g.
 
 Have we measured the performance impact, then?  The last time I profiled
 psql, GetVariable was already a hotspot, and this introduces another
 call of it into the basic query loop whether you use the feature or not.
 
   regards, tom lane

Hi,

after agreeing on using a \set variable, I proposed to have it influence
\g as well as ;, because I thought that would be the most expected
behaviour. IMHO I'm with Peter, that introducing a difference between
\g and ; would go against the principle of least surprise.

Performance-wise I took for granted without checking that GetVariable's
running time would be negligible.

[looks at the code]

I see it's it's basically two function calls with a loop over a linked
list of values (in the order of 10) doing strcmps and one strtol.
It is not quite clear to me what the impact of this is. I could
imagine it would show up only if you perform lots of trivial queries
through psql. I'm going to benchmark something now and report back.

Anyway, regardless the benchmark, I feel it's somehow not clean to have
a variable introduce a difference between \g and ;.

[goes benchmarking...]

Bye, Chris.


-- 

Chris Mair
http://www.1006.org



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] updated patch for selecting large results

2006-08-28 Thread Chris Mair

 Performance-wise I took for granted without checking that GetVariable's
 running time would be negligible.
 
 [looks at the code]
 
 I see it's it's basically two function calls with a loop over a linked
 list of values (in the order of 10) doing strcmps and one strtol.
 It is not quite clear to me what the impact of this is. I could
 imagine it would show up only if you perform lots of trivial queries
 through psql. I'm going to benchmark something now and report back.
 
 Anyway, regardless the benchmark, I feel it's somehow not clean to have
 a variable introduce a difference between \g and ;.
 
 [goes benchmarking...]

Ok,
so I ran a file containing 1 million lines of select 1; through
psql (discarding the output). 5 runs each with the patch and with the
patch removed (the if() in SendQuery commented).

These are the results in seconds user time of psql on a Pentium M 2.0
GHz (real time was longer, since the postmaster process was on the same
machine).

 patch | count |   avg   |  stddev
---+---+-+---
 f | 5 | 16.6722 | 0.359759919946455
 t | 5 | 17.2762 | 0.259528803796329

The conclusion is that, yes, the overhead is measurable, albeit with
a very synthetic benchmark (of the type MySQL wins ;).

Basically I'm loosing 0.6 usec on each query line (when FETCH_COUNT
is not there and therefore psql need to scan the whole variables list
in GetVariable() for nothing).

Not sure if that's acceptable (I'd say yes, but then again, I'm
not a cycle counter type of programmer *cough* Java *cough* ;)...

Opinions?

Bye, Chris.


-- 

Chris Mair
http://www.1006.org



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in

2006-08-28 Thread Tom Lane
Chris Mair [EMAIL PROTECTED] writes:
 The conclusion is that, yes, the overhead is measurable, albeit with
 a very synthetic benchmark (of the type MySQL wins ;).

OK, so about 3 or 4% overhead added to extremely short queries.
That's not enough to kill this patch, but it's still annoying ...
and as I mentioned, there are some existing calls of GetVariable
that are executed often enough to be a problem too.

It strikes me that having to do GetVariable *and* strtol and so on
for these special variables is pretty silly; the work should be done
during the infrequent times they are set, rather than the frequent
times they are read.  I propose that we add the equivalent of a GUC
assign_hook to psql's variable facility, attach an assign hook function
to each special variable, and have the assign hook transpose the
value into an internal variable that can be read with no overhead.
If we do that then the cost of the FETCH_COUNT patch will be
unmeasurable, and I think we'll see a few percent savings overall in
psql runtime from improving the existing hotspot uses of GetVariable.

Barring objections, I'll hack on this this evening ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] updated patch for selecting large results

2006-08-28 Thread Chris Mair

  The conclusion is that, yes, the overhead is measurable, albeit with
  a very synthetic benchmark (of the type MySQL wins ;).
 
 OK, so about 3 or 4% overhead added to extremely short queries.

More accurately, that 3 or 4% overhead is added to about all queries
(we're talking *psql*-only running time).

It's just that for anything but short queries, psql running time
totally dwarfs regarding to postmaster running time anyway.

 That's not enough to kill this patch, but it's still annoying ...
 and as I mentioned, there are some existing calls of GetVariable
 that are executed often enough to be a problem too.
 
 It strikes me that having to do GetVariable *and* strtol and so on
 for these special variables is pretty silly; the work should be done
 during the infrequent times they are set, rather than the frequent
 times they are read.  I propose that we add the equivalent of a GUC
 assign_hook to psql's variable facility, attach an assign hook function
 to each special variable, and have the assign hook transpose the
 value into an internal variable that can be read with no overhead.
 If we do that then the cost of the FETCH_COUNT patch will be
 unmeasurable, and I think we'll see a few percent savings overall in
 psql runtime from improving the existing hotspot uses of GetVariable.
 
 Barring objections, I'll hack on this this evening ...

Might help.

Take into account the strtol is not critical at all for FETCH_COUNT,
since when it's actually set, we're supposed to retrieving big data
where a strtol doesn't matter anyway. The overhead comes from scanning
the linked list for nothing in the normal case (when it's not set).

I don't know how the other uses factor in here, but I see it's called
at least twice more on average calls to SendQuery.

Bye,
Chris.



-- 

Chris Mair
http://www.1006.org



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets

2006-08-28 Thread Bruce Momjian
Peter Eisentraut wrote:
 Tom Lane wrote:
  Wait a minute.  What I thought we had agreed to was a patch to make
  commands sent with \g use a cursor.  This patch changes SendQuery
  so that *every* command executed via psql is treated this way.
 
 That's what I remembered.  I don't think we want to introduce a 
 difference between ; and \g.

I am confused.  I assume \g and ; should be affected, like Peter says. 
Tom, what *every* command are you talking about?  You mean \d?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors

2006-08-28 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Peter Eisentraut wrote:
 Tom Lane wrote:
 Wait a minute.  What I thought we had agreed to was a patch to make
 commands sent with \g use a cursor.

 I am confused.  I assume \g and ; should be affected, like Peter says. 
 Tom, what *every* command are you talking about?  You mean \d?

Like I said, I thought we were intending to modify \g's behavior only;
that was certainly the implication of the discussion of \gc.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets

2006-08-28 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Peter Eisentraut wrote:
  Tom Lane wrote:
  Wait a minute.  What I thought we had agreed to was a patch to make
  commands sent with \g use a cursor.
 
  I am confused.  I assume \g and ; should be affected, like Peter says. 
  Tom, what *every* command are you talking about?  You mean \d?
 
 Like I said, I thought we were intending to modify \g's behavior only;
 that was certainly the implication of the discussion of \gc.

OK, got it.  I just don't see the value to doing \g and not ;. I think
the \gc case was a hack when he didn't have \set.  Now that we have
\set, we should be consistent.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets

2006-08-28 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 OK, got it.  I just don't see the value to doing \g and not ;. I think
 the \gc case was a hack when he didn't have \set.  Now that we have
 \set, we should be consistent.

I'm willing to accept this if we can make sure we aren't adding any
overhead --- see my proposal elsewhere in the thread for fixing that.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in

2006-08-28 Thread Chris Mair

   I am confused.  I assume \g and ; should be affected, like Peter says. 
   Tom, what *every* command are you talking about?  You mean \d?
  
  Like I said, I thought we were intending to modify \g's behavior only;
  that was certainly the implication of the discussion of \gc.

At some point you OKed the \g and ; proposal.
I admit I was quick adding the and ; part, but it seemed so natural
once we agreed on using a variable.


 OK, got it.  I just don't see the value to doing \g and not ;. I think
 the \gc case was a hack when he didn't have \set.  Now that we have
 \set, we should be consistent.

I agree with this statement.

If we have a variable that switches just between two versions of \g,
we could have gone with using \u (or whatever) in the first place.

In the mean time I have been converted by the variable camp, and
I think the variable should change \g and ; together, consistently.

If we find we can't live with the performance overhead of that
if(FETCH_COUNT), it is still not clear why we would be better
off moving it into the \g code path only.

Is it because presumably \g is used less often in existing psql scripts?

Bye, Chris.



-- 

Chris Mair
http://www.1006.org



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] updated patch for selecting large results sets

2006-08-28 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  OK, got it.  I just don't see the value to doing \g and not ;. I think
  the \gc case was a hack when he didn't have \set.  Now that we have
  \set, we should be consistent.
 
 I'm willing to accept this if we can make sure we aren't adding any
 overhead --- see my proposal elsewhere in the thread for fixing that.

Right, if \g has overhead, I don't want people to start using ; because
it is faster.  That is the kind of behavior that makes us look sloppy.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org