Re: [HACKERS] [PATCHES] updated patch for selecting large results
On Tue, 2006-08-29 at 18:31 -0400, Tom Lane wrote: [EMAIL PROTECTED] writes: here comes the latest version (version 7) of the patch to handle large result sets with psql. As previously discussed, a cursor is used for SELECT queries when \set FETCH_COUNT some_value 0 Applied with revisions ... I didn't like the fact that the code was restricted to handle only unaligned output format, so I fixed print.c to be able to deal with emitting output in sections. This is not ideal for aligned output mode, because we compute column widths separately for each FETCH group, but all the other output modes work nicely. I also did a little hacking to make \timing and pager output work as expected. regards, tom lane Cool! I specially like that as a side effect of your work for applying this, psql is faster now. Thanks to all people that helped with this (lots...:) Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
[EMAIL PROTECTED] writes: here comes the latest version (version 7) of the patch to handle large result sets with psql. As previously discussed, a cursor is used for SELECT queries when \set FETCH_COUNT some_value 0 Applied with revisions ... I didn't like the fact that the code was restricted to handle only unaligned output format, so I fixed print.c to be able to deal with emitting output in sections. This is not ideal for aligned output mode, because we compute column widths separately for each FETCH group, but all the other output modes work nicely. I also did a little hacking to make \timing and pager output work as expected. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. Have we measured the performance impact, then? The last time I profiled psql, GetVariable was already a hotspot, and this introduces another call of it into the basic query loop whether you use the feature or not. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
[EMAIL PROTECTED] writes: here comes the latest version (version 7) of the patch to handle large result sets with psql. As previously discussed, a cursor is used for SELECT queries when \set FETCH_COUNT some_value 0 Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's a whole lot bigger behavioral change than I think is warranted. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in
On Mon, 2006-08-28 at 13:45 -0400, Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. Have we measured the performance impact, then? The last time I profiled psql, GetVariable was already a hotspot, and this introduces another call of it into the basic query loop whether you use the feature or not. regards, tom lane Hi, after agreeing on using a \set variable, I proposed to have it influence \g as well as ;, because I thought that would be the most expected behaviour. IMHO I'm with Peter, that introducing a difference between \g and ; would go against the principle of least surprise. Performance-wise I took for granted without checking that GetVariable's running time would be negligible. [looks at the code] I see it's it's basically two function calls with a loop over a linked list of values (in the order of 10) doing strcmps and one strtol. It is not quite clear to me what the impact of this is. I could imagine it would show up only if you perform lots of trivial queries through psql. I'm going to benchmark something now and report back. Anyway, regardless the benchmark, I feel it's somehow not clean to have a variable introduce a difference between \g and ;. [goes benchmarking...] Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] updated patch for selecting large results
Performance-wise I took for granted without checking that GetVariable's running time would be negligible. [looks at the code] I see it's it's basically two function calls with a loop over a linked list of values (in the order of 10) doing strcmps and one strtol. It is not quite clear to me what the impact of this is. I could imagine it would show up only if you perform lots of trivial queries through psql. I'm going to benchmark something now and report back. Anyway, regardless the benchmark, I feel it's somehow not clean to have a variable introduce a difference between \g and ;. [goes benchmarking...] Ok, so I ran a file containing 1 million lines of select 1; through psql (discarding the output). 5 runs each with the patch and with the patch removed (the if() in SendQuery commented). These are the results in seconds user time of psql on a Pentium M 2.0 GHz (real time was longer, since the postmaster process was on the same machine). patch | count | avg | stddev ---+---+-+--- f | 5 | 16.6722 | 0.359759919946455 t | 5 | 17.2762 | 0.259528803796329 The conclusion is that, yes, the overhead is measurable, albeit with a very synthetic benchmark (of the type MySQL wins ;). Basically I'm loosing 0.6 usec on each query line (when FETCH_COUNT is not there and therefore psql need to scan the whole variables list in GetVariable() for nothing). Not sure if that's acceptable (I'd say yes, but then again, I'm not a cycle counter type of programmer *cough* Java *cough* ;)... Opinions? Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in
Chris Mair [EMAIL PROTECTED] writes: The conclusion is that, yes, the overhead is measurable, albeit with a very synthetic benchmark (of the type MySQL wins ;). OK, so about 3 or 4% overhead added to extremely short queries. That's not enough to kill this patch, but it's still annoying ... and as I mentioned, there are some existing calls of GetVariable that are executed often enough to be a problem too. It strikes me that having to do GetVariable *and* strtol and so on for these special variables is pretty silly; the work should be done during the infrequent times they are set, rather than the frequent times they are read. I propose that we add the equivalent of a GUC assign_hook to psql's variable facility, attach an assign hook function to each special variable, and have the assign hook transpose the value into an internal variable that can be read with no overhead. If we do that then the cost of the FETCH_COUNT patch will be unmeasurable, and I think we'll see a few percent savings overall in psql runtime from improving the existing hotspot uses of GetVariable. Barring objections, I'll hack on this this evening ... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] updated patch for selecting large results
The conclusion is that, yes, the overhead is measurable, albeit with a very synthetic benchmark (of the type MySQL wins ;). OK, so about 3 or 4% overhead added to extremely short queries. More accurately, that 3 or 4% overhead is added to about all queries (we're talking *psql*-only running time). It's just that for anything but short queries, psql running time totally dwarfs regarding to postmaster running time anyway. That's not enough to kill this patch, but it's still annoying ... and as I mentioned, there are some existing calls of GetVariable that are executed often enough to be a problem too. It strikes me that having to do GetVariable *and* strtol and so on for these special variables is pretty silly; the work should be done during the infrequent times they are set, rather than the frequent times they are read. I propose that we add the equivalent of a GUC assign_hook to psql's variable facility, attach an assign hook function to each special variable, and have the assign hook transpose the value into an internal variable that can be read with no overhead. If we do that then the cost of the FETCH_COUNT patch will be unmeasurable, and I think we'll see a few percent savings overall in psql runtime from improving the existing hotspot uses of GetVariable. Barring objections, I'll hack on this this evening ... Might help. Take into account the strtol is not critical at all for FETCH_COUNT, since when it's actually set, we're supposed to retrieving big data where a strtol doesn't matter anyway. The overhead comes from scanning the linked list for nothing in the normal case (when it's not set). I don't know how the other uses factor in here, but I see it's called at least twice more on average calls to SendQuery. Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Peter Eisentraut wrote: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
Bruce Momjian [EMAIL PROTECTED] writes: Peter Eisentraut wrote: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? Like I said, I thought we were intending to modify \g's behavior only; that was certainly the implication of the discussion of \gc. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Peter Eisentraut wrote: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? Like I said, I thought we were intending to modify \g's behavior only; that was certainly the implication of the discussion of \gc. OK, got it. I just don't see the value to doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Bruce Momjian [EMAIL PROTECTED] writes: OK, got it. I just don't see the value to doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. I'm willing to accept this if we can make sure we aren't adding any overhead --- see my proposal elsewhere in the thread for fixing that. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in
I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? Like I said, I thought we were intending to modify \g's behavior only; that was certainly the implication of the discussion of \gc. At some point you OKed the \g and ; proposal. I admit I was quick adding the and ; part, but it seemed so natural once we agreed on using a variable. OK, got it. I just don't see the value to doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. I agree with this statement. If we have a variable that switches just between two versions of \g, we could have gone with using \u (or whatever) in the first place. In the mean time I have been converted by the variable camp, and I think the variable should change \g and ; together, consistently. If we find we can't live with the performance overhead of that if(FETCH_COUNT), it is still not clear why we would be better off moving it into the \g code path only. Is it because presumably \g is used less often in existing psql scripts? Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: OK, got it. I just don't see the value to doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. I'm willing to accept this if we can make sure we aren't adding any overhead --- see my proposal elsewhere in the thread for fixing that. Right, if \g has overhead, I don't want people to start using ; because it is faster. That is the kind of behavior that makes us look sloppy. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org