Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Simon Riggs
On Sun, 2007-11-25 at 19:35 -0500, Tom Lane wrote:

 The cost of resolving ambiguous operators has been an issue for a long
 time, of course, but it seems particularly bad in this case --- gprof
 blames 37% of the runtime on oper_select_candidate().  It might be time
 to think about caching the results of operator searches somehow.  Too
 late for 8.3 though.

Wow: 37%. 

varchar_column = const is a very, very common predicate. 37% is enough
to still be visible for a wide range of queries, not just the very
simple. I think queries with WHERE clauses like 
(int4_column = int4_const AND varchar_column = const)
will also be noticeably affected this. So even when we have integer
keys, we will still get slowed down by an checks to an additional status
column.

Caching is the right way around this, though as you point out, that is
not an option for 8.3.

But I think there must be an action that we can take for 8.3 and that
much runtime should not be given away easily. ISTM that we can win back
the losses Guillaume has identified, plus gain a little more even.

Can we just hard-code the varchar lookup? Ugly, but it will add almost
nothing to non-varchar paths and yet speed-up the varchar lookup
dramatically. I guess the objection to that will be that it prevents
people from overloading the = operator for varchars to change the
selectivity functions etc.

So how about we have a cache-of-one: we store the best varchar =
operator after the first lookup, then document that if people overload
this then they must reconnect. That's an acceptable pain for the few
people affected and a great benefit for the most people.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] plpgsql: another new reserved word

2007-11-26 Thread Marko Kreen
On 11/10/07, Tom Lane [EMAIL PROTECTED] wrote:
 The current plpgsql code seems to be designed to force a qualifier to be
 interpreted as a block label if at all possible, even if there are
 more-closely-nested alternative interpretations; so in the above example
 it would assign to the outer variable bar.  This seems a tad bogus
 to me.  Can anyone comment on how Oracle handles cases like this?

Some googling brought following link:

 http://download-uk.oracle.com/docs/cd/B14117_01/appdev.101/b10807/d_names.htm

I have not parsed it completely, but rule seems simple - inner
scope overrides outer one and no magic on unqualified idents,
if ident is unqualified, it wont be matched to schema, block
or some other qualifier.  (well, at least no such magic behaviour
is mentioned.)

-- 
marko

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Autovacuum and OldestXmin

2007-11-26 Thread Simon Riggs
On Thu, 2007-11-22 at 21:59 -0500, Christopher Browne wrote:
 The world rejoiced as [EMAIL PROTECTED] (Alvaro Herrera) wrote:
  Simon Riggs wrote:
  I notice that slony records the oldestxmin that was running when it last
  ran a VACUUM on its tables. This allows slony to avoid running a VACUUM
  when it would be clearly pointless to do so.
  
  AFAICS autovacuum does not do this, or did I miss that?
 
  Hmm, I think it's just because nobody suggested it and I didn't came up
  with the idea.
 
  Whether it's a useful thing to do is a different matter.  Why store it
  per table and not more widely?  Perhaps per database would be just as
  useful; and maybe it would allow us to skip running autovac workers
  when there is no point in doing so.
 
 I think I need to take blame for that feature in Slony-I ;-).

Good thinking.

 I imagine it might be useful to add it to autovac, too.  I thought it
 was pretty neat that this could be successfully handled by comparison
 with a single value (e.g. - eldest xmin), and I expect that using a
 single quasi-global value should be good enough for autovac.

I've just looked at that to see if it is that easy; I don't think it is.

That works for slony currently because we vacuum all of the slony tables
at once. Autovacuum does individual tables so we'd need to store the
individual values otherwise we might skip doing a VACUUM when it could
have done some useful work.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Peter Eisentraut
Am Samstag, 24. November 2007 schrieb Simon Riggs:
 In many cases, 100% of queries are unprepared.

I have seen many applications where prepared queries caused stale plans and 
poor performance.  We have in many cases achieved great performance gains by 
turning off prepared queries globally, for example in the driver layer.  It 
had once gotten to a point where uses prepared statements was on par 
with never touched postgresql.conf and never heard of VACUUM as the worst 
performance sins.  I will gladly revisit this with 8.3, but I am not very 
optimistic.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Guillaume Smet
On Nov 26, 2007 11:59 AM, Peter Eisentraut [EMAIL PROTECTED] wrote:
 I have seen many applications where prepared queries caused stale plans and
 poor performance.  We have in many cases achieved great performance gains by
 turning off prepared queries globally, for example in the driver layer.  It
 had once gotten to a point where uses prepared statements was on par
 with never touched postgresql.conf and never heard of VACUUM as the worst
 performance sins.  I will gladly revisit this with 8.3, but I am not very
 optimistic.

+1 on this point as I explained it earlier.

--
Guillaume

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Autovacuum and OldestXmin

2007-11-26 Thread Alvaro Herrera
Simon Riggs wrote:
 On Thu, 2007-11-22 at 21:59 -0500, Christopher Browne wrote:

  I imagine it might be useful to add it to autovac, too.  I thought it
  was pretty neat that this could be successfully handled by comparison
  with a single value (e.g. - eldest xmin), and I expect that using a
  single quasi-global value should be good enough for autovac.
 
 I've just looked at that to see if it is that easy; I don't think it is.
 
 That works for slony currently because we vacuum all of the slony tables
 at once. Autovacuum does individual tables so we'd need to store the
 individual values otherwise we might skip doing a VACUUM when it could
 have done some useful work.

Yeah, that was my conclusion too.

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Voy a acabar con todos los humanos / con los humanos yo acabaré
voy a acabar con todos / con todos los humanos acabaré (Bender)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Replacement Selection

2007-11-26 Thread mac_man2005

Hi to all.

I'm new. I'd like to integrate my code into PostgreSQL. It's the 
implementation of some refinements of Replacement Selection algorithm used 
for External Sorting.
I have got some issue and preferibly I'd like to be supported by some 
developers that have something to do with it.


Who can I talk to?

Thanks for your attentions.
Good Luck!

Manolo. 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [COMMITTERS] pgsql: update files for beta3

2007-11-26 Thread Peter Eisentraut
Am Samstag, 24. November 2007 schrieb Bruce Momjian:
 Peter, were you going to address this?

It's done now.

  diff -ur ../cvs-pgsql/configure.in ./configure.in
  --- ../cvs-pgsql/configure.in   2007-11-16 21:25:10.0 +0100
  +++ ./configure.in  2007-11-16 22:27:36.0 +0100
  @@ -19,7 +19,7 @@
 
   AC_INIT([PostgreSQL], [8.3beta3], [EMAIL PROTECTED])
 
  -AC_PREREQ(2.59)
  +m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.59], [], [m4_fatal([Autoconf
  version 2.59 is required])]) AC_COPYRIGHT([Copyright (c) 1996-2007,
  PostgreSQL Global Development Group])
  AC_CONFIG_SRCDIR([src/backend/access/common/heaptuple.c])
   AC_CONFIG_AUX_DIR(config)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Heikki Linnakangas

[EMAIL PROTECTED] wrote:
I'm new. I'd like to integrate my code into PostgreSQL. It's the 
implementation of some refinements of Replacement Selection algorithm 
used for External Sorting.
I have got some issue and preferibly I'd like to be supported by some 
developers that have something to do with it.


Who can I talk to?


This mailing list is the right place to discuss that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [GENERAL] possible to create multivalued index from xpath() results in 8.3?

2007-11-26 Thread Peter Eisentraut
Am Freitag, 23. November 2007 schrieb Tom Lane:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Should we force initdb to correct these pg_proc entries, or just quietly
  change pg_proc.h?
 
  Considering the extent of the changes, I'd be in favor of forcing an
  initdb.

 Well, if you're going to change the contents of pg_cast then there is
 little choice.  I was considering something less invasive ...

I will hang on to this patch for a few more days to see if any invasive 
catalog changes come out of the quote_literal/set_config discussion.  If not, 
I'll consider a less invasive solution.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-26 Thread mac_man2005

Thanks for your support.

I downloaded the source code of the last stable version of PostgreSQL. Where 
can I find the part related to the External Sorting algorithm (supposed to 
be Replacement Selection)?

I mean, which is the file to be studied and/or modified and/or substituted?

Thanks for your attention.

--
From: Heikki Linnakangas [EMAIL PROTECTED]
Sent: Monday, November 26, 2007 1:35 PM
To: [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Replacement Selection


[EMAIL PROTECTED] wrote:
I'm new. I'd like to integrate my code into PostgreSQL. It's the 
implementation of some refinements of Replacement Selection algorithm 
used for External Sorting.
I have got some issue and preferibly I'd like to be supported by some 
developers that have something to do with it.


Who can I talk to?


This mailing list is the right place to discuss that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:
 Thanks for your support.

 I downloaded the source code of the last stable version of PostgreSQL. 
 Where can I find the part related to the External Sorting algorithm 
 (supposed to be Replacement Selection)?
 I mean, which is the file to be studied and/or modified and/or substituted?

src/backend/utils/sort/tuplesort.c

-- 
Alvaro Herrera  Developer, http://www.PostgreSQL.org/
I would rather have GNU than GNOT.  (ccchips, lwn.net/Articles/37595/)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Heikki Linnakangas

[EMAIL PROTECTED] wrote:
I downloaded the source code of the last stable version of PostgreSQL. 
Where can I find the part related to the External Sorting algorithm 
(supposed to be Replacement Selection)?

I mean, which is the file to be studied and/or modified and/or substituted?


In src/backend/utils/sort/tuplesort.c. The comments at the top of that 
file is a good place to start.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Mark Cave-Ayland
Hi everyone,

I'm working on a piece of code for PostGIS to allow the loading of
projection configuration files from the share/postgresql directory, but
I can't find a way of getting this to work under Win32.

AIUI the way to do this would be to use a combination of my_exec_path
and get_share_path in order to find the directory, but MingW refuses to
find the my_exec_path variable during linking. Unfortunately I suspect
that this because my_exec_path is not declared as DLLIMPORT in
backend/utils/init/globals.c :(

I really need to find a solution to this that can work all the way back
to PostgreSQL 8.0 - can anyone think of any better ideas?


Many thanks,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 But I think there must be an action that we can take for 8.3 and that
 much runtime should not be given away easily. ISTM that we can win back
 the losses Guillaume has identified, plus gain a little more even.

Perhaps some sanity could be restored to this discussion by pointing out
that the 2007-01-01 code *also* clocks in at 37% spent in
oper_select_candidate.  IOW it's been like this for a very long time.
I'm not interested in destabilizing 8.3 with panicky last-minute patches.

 So how about we have a cache-of-one:

Cache-of-one has exactly the same difficulty as cache-of-many, other
than the table lookup itself, which is a solved problem (hashtable).
You still have to determine how you identify the cached value and what
events require a cache flush.  Nor do I see any particular reason to
assume that a cache of only one operator would be of any use for
real-world apps, as opposed to toy examples.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


initdb for RC1 (was Re: [HACKERS] [GENERAL] possible to create multivalued index from xpath() results in 8.3?)

2007-11-26 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Peter Eisentraut [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Well, if you're going to change the contents of pg_cast then there is
 little choice.  I was considering something less invasive ...

 I will hang on to this patch for a few more days to see if any invasive 
 catalog changes come out of the quote_literal/set_config discussion.  If not,
 I'll consider a less invasive solution.

ATM it seems that consensus is to change quote_literal, so we may as
well adopt the cleaner solution for fixing the xml functions too.

Anyone out there who wants to argue against forcing initdb for RC1?
If so, better speak up now.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] plpgsql: another new reserved word

2007-11-26 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 On 11/10/07, Tom Lane [EMAIL PROTECTED] wrote:
 Can anyone comment on how Oracle handles cases like this?

 Some googling brought following link:

  http://download-uk.oracle.com/docs/cd/B14117_01/appdev.101/b10807/d_names.htm

Hmm, interesting document.  I think the bit that is relevant for us is
the statement

: An outer capture occurs when a name in an inner scope, which once
: resolved to an entity in an inner scope, is resolved to an entity in an
: outer scope. SQL and PL/SQL are designed to prevent outer captures. You
: do not need to take any action to avoid this condition.

AFAICT this means that if there is any ambiguity, the most closely
nested possible interpretation will always win.  Therefore the current
behavior of plpgsql is indeed wrong, because it searches up the
namespace stack for block labels too soon.  I'll go see about changing
that.

The subsection Qualifying References to Object Attributes and Methods
describes some truly bletcherous behavior --- you can't reference a
complex-column component unless you start from a table alias?  How
non-orthogonal.  I prefer our current solution of making you
parenthesize the column reference before you access its component ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] proposal, plpgsql, 8.4, for record in cursor

2007-11-26 Thread Pavel Stehule
Hello

I propose new kind of FOR statement .. iteration over cursor, There
are two reasons:

a) better readability of procedure; - SQL statement is outside of statement,

b) better conformance with PL/SQL.

Sample:
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS $$
DECLARE
  c CURSOR(p integer)
FOR SELECT *
   FROM foot
  WHERE a = p;
BEGIN
  FOR r IN c(10) LOOP
RAISE NOTICE '%', r.a;
  END LOOP;
  RETURN;
END;
$$ FUNCTION plpgsql;

Syntax:

FOR recordvar IN cursor[ ( params ) ] LOOP
  ..
END LOOP;

Notes:
recordvar is local in statement body (like fori). Used cursor is
automatically opened and closed.

Lot of code can be shared with fors and dynafors (some with stmt_open).

Regards
Pavel Stehule

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Replacement Selection

2007-11-26 Thread mac_man2005

Ok guys!
Thanks for your help.

Unfortunately I'm lost into the code... any good soul helping me to 
understand what should be the precise part to be modified?


Thanks for your time!

--
From: Heikki Linnakangas [EMAIL PROTECTED]
Sent: Monday, November 26, 2007 2:34 PM
To: [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Replacement Selection


[EMAIL PROTECTED] wrote:
I downloaded the source code of the last stable version of PostgreSQL. 
Where can I find the part related to the External Sorting algorithm 
(supposed to be Replacement Selection)?
I mean, which is the file to be studied and/or modified and/or 
substituted?


In src/backend/utils/sort/tuplesort.c. The comments at the top of that 
file is a good place to start.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor

2007-11-26 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 I propose new kind of FOR statement .. iteration over cursor,

This seems useless and probably syntactically ambiguous.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor

2007-11-26 Thread Peter Eisentraut
Am Montag, 26. November 2007 schrieb Tom Lane:
 Pavel Stehule [EMAIL PROTECTED] writes:
  I propose new kind of FOR statement .. iteration over cursor,

 This seems useless and probably syntactically ambiguous.

I think that is isomorphic to what he mentioned as better conformance with 
PL/SQL.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Peter Eisentraut
Am Montag, 26. November 2007 schrieb Mark Cave-Ayland:
 I'm working on a piece of code for PostGIS to allow the loading of
 projection configuration files from the share/postgresql directory, but

The share directory is the wrong place for configuration files anyway.  And 
moreover, non-PostgreSQL packages have no business putting files into 
PostgreSQL's private directories.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Heikki Linnakangas

[EMAIL PROTECTED] wrote:
Unfortunately I'm lost into the code... any good soul helping me to 
understand what should be the precise part to be modified?


You haven't given any details on what you're trying to do. What are you 
trying to do?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:
 Ok guys!
 Thanks for your help.

 Unfortunately I'm lost into the code... any good soul helping me to 
 understand what should be the precise part to be modified?

I think you should print the file and read it several times until you
understand what's going on.  Then you can start thinking where and how
to modify it.

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible (Calvin a la TV)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Am Montag, 26. November 2007 schrieb Mark Cave-Ayland:
 I'm working on a piece of code for PostGIS to allow the loading of
 projection configuration files from the share/postgresql directory, but

 The share directory is the wrong place for configuration files anyway.  And 
 moreover, non-PostgreSQL packages have no business putting files into 
 PostgreSQL's private directories.

That doesn't seem logical to me.  We expect tsearch configuration files
to be put into $sharedir/tsearch_data, so why shouldn't PostGIS use a
similar approach to dealing with external configuration data?

I'm not sure why Mark's having a problem accessing my_exec_path ---
it *is* declared DLLIMPORT in miscadmin.h (which is where it counts,
AIUI) clear back to 8.0.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Simon Riggs
On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  But I think there must be an action that we can take for 8.3 and that
  much runtime should not be given away easily. ISTM that we can win back
  the losses Guillaume has identified, plus gain a little more even.
 
 Perhaps some sanity could be restored to this discussion by pointing out
 that the 2007-01-01 code *also* clocks in at 37% spent in
 oper_select_candidate.  IOW it's been like this for a very long time.
 I'm not interested in destabilizing 8.3 with panicky last-minute patches.

I think this is worth taking action on, IMHO.

I've written up my suggestion as a 5 line patch, with measured
performance improvement of about 40% for the varchar case. 

It isn't a great long term fix, but I don't want to delay 8.3 either
with the full fix.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 [EMAIL PROTECTED] wrote:
 Unfortunately I'm lost into the code... any good soul helping me to 
 understand what should be the precise part to be modified?

 I think you should print the file and read it several times until you
 understand what's going on.  Then you can start thinking where and how
 to modify it.

Also, go find a copy of Knuth volume 3, because a whole lot of the
comments assume you've read Knuth's discussion of external sorting.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Dave Page
Peter Eisentraut wrote:
 Am Montag, 26. November 2007 schrieb Mark Cave-Ayland:
 I'm working on a piece of code for PostGIS to allow the loading of
 projection configuration files from the share/postgresql directory, but
 
 The share directory is the wrong place for configuration files anyway.  And 
 moreover, non-PostgreSQL packages have no business putting files into 
 PostgreSQL's private directories.
 

How does that work with the stuff that goes into directories relative to
$libdir (per documented recommendations), without creating easy-to-break
paths like $libdir/../../MyAddon/8.3/MyAddon.dll?

Regards, Dave

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor

2007-11-26 Thread Pavel Stehule
On 26/11/2007, Tom Lane [EMAIL PROTECTED] wrote:
 Pavel Stehule [EMAIL PROTECTED] writes:
  I propose new kind of FOR statement .. iteration over cursor,

 This seems useless and probably syntactically ambiguous.


I don't see any syntactically problem and I have working prototype.
This case is simply identified by cursor variable after IN keyword.

It's possibility little bit clean open, fors, dynafors code.

Regards
Pavel Stehule

 regards, tom lane


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Guillaume Smet
On Nov 26, 2007 5:58 PM, Simon Riggs [EMAIL PROTECTED] wrote:
 I've written up my suggestion as a 5 line patch, with measured
 performance improvement of about 40% for the varchar case.

 It isn't a great long term fix, but I don't want to delay 8.3 either
 with the full fix.

Can we see the patch?

Thanks.

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Fw: [HACKERS] Replacement Selection

2007-11-26 Thread mac_man2005

Thanks for your advice.

The developement of this integration is part of my final project. And fo
course my initial bibliografy includes the Knuth reference as you can see

1. Vladimir Estivill-Castro and Derick Wood. A survey of adaptive sorting
algorithms. ACM Computing Surveys, 24(4):441{476, 1992.

2. Donald E. Knuth. The art of computer programming, volume 3: sorting and
 searching. Addison-Wesley, Reading, 2nd edition, 1998.

3. P. Larson and G. Graefe. Memory management during run generation in
external sorting. In ACM, editor, SIGMOD98, pages 472{483, 1998.

4. Per-Ake Larson. External sorting: Run formation revisited. IEEE
Transactions on Knowledge and Data Engineering, 15(4):961{972, 2003.

5. Je®rey Scott Vitter and David A. Hutchinson. Distribution sort with
randomized cycling. pages 77-86.


--
From: Tom Lane [EMAIL PROTECTED]
Sent: Monday, November 26, 2007 6:00 PM
To: Alvaro Herrera [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Replacement Selection


Alvaro Herrera [EMAIL PROTECTED] writes:

[EMAIL PROTECTED] wrote:

Unfortunately I'm lost into the code... any good soul helping me to
understand what should be the precise part to be modified?



I think you should print the file and read it several times until you
understand what's going on.  Then you can start thinking where and how
to modify it.


Also, go find a copy of Knuth volume 3, because a whole lot of the
comments assume you've read Knuth's discussion of external sorting.

regards, tom lane



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] maintenance_work_mem memory constraint?

2007-11-26 Thread Bernd Helmle
While supporting a customer to increase recovery performance from its 
backups i just realized that PostgreSQL never uses big maintenance_work_mem 
settings. Even giving 10GB of RAM to maintenance_work_mem results in using 
a fraction of memory (it switches to external sort after using around 2 
GB). I think the culprit ist the following code in tuplesort.c, 
grow_memtuples(), as the comments there let assume already:


   /* 

* On a 64-bit machine, allowedMem could be high enough to get us 
into
* trouble with MaxAllocSize, too. 


*/
   if ((Size) (state-memtupsize * 2) = MaxAllocSize / 
sizeof(SortTuple))

   return false;

While i understand, that doubling the memtuples array is more efficient 
than increasing the array in smaller steps, i think we give away usable 
memory, because we never consider using memory up to the upper limit given 
by MaxAllocSize. Modifying the code in that way results in a sightly better 
memory usage, but far away from what the system is able to use on such a 
machine (see the diff attached, a very crude experimental code).


I've played around with increasing the MaxAllocSize as well and got the 
backend to use up to 6GB maintenance_work_mem during creating an index with 
80.000.000 integer tuples. That way the backend was able to sort the tuples 
entirely in memory, speeding up the creation of the index from 200s to 80s. 
I understand that we have to handle MaxAllocSize very carefully, since it's 
involved in many cases in the code. But isn't it worth to special case the 
code in grow_memtuples() (and maybe other places where sort is likely to 
use more RAM), so that we can remove this constraint on 64-Bit systems with 
many RAM built in? Or am I missing something very important?.


--
 Thanks

   Bernd*** tuplesort.c	2007-11-26 18:30:21.0 +0100
--- tuplesort.c.new	2007-11-26 18:30:11.0 +0100
***
*** 810,820 
  	 * On a 64-bit machine, allowedMem could be high enough to get us into
  	 * trouble with MaxAllocSize, too.
  	 */
  	if ((Size) (state-memtupsize * 2) = MaxAllocSize / sizeof(SortTuple))
! 		return false;
  
  	FREEMEM(state, GetMemoryChunkSpace(state-memtuples));
- 	state-memtupsize *= 2;
  	state-memtuples = (SortTuple *)
  		repalloc(state-memtuples,
   state-memtupsize * sizeof(SortTuple));
--- 810,837 
  	 * On a 64-bit machine, allowedMem could be high enough to get us into
  	 * trouble with MaxAllocSize, too.
  	 */
+ 
+ 	if ((Size) (state-memtupsize)  + 1 = MaxAllocSize / sizeof(SortTuple))
+ 	{
+ 	  elog(DEBUG1, memtupsize %d reached MaxAllocSize %lu, 
+ 	   state-memtupsize, MaxAllocSize);
+ 	  /* max alloc size already reached */
+ 	  return false;
+ 	}
+ 
  	if ((Size) (state-memtupsize * 2) = MaxAllocSize / sizeof(SortTuple))
! 	{
! 	  /* use the maximum allowed alloc size */
! 	  state-memtupsize = (MaxAllocSize / sizeof(SortTuple) - 1);
! 	  elog(DEBUG1, memtupsize %d exceeds MaxAllocSize %lu, readjusted, 
! 	   state-memtupsize * 2, MaxAllocSize);
! 	}
! 	else
! 	{
! 	  state-memtupsize *= 2;
! 	}
  
  	FREEMEM(state, GetMemoryChunkSpace(state-memtuples));
  	state-memtuples = (SortTuple *)
  		repalloc(state-memtuples,
   state-memtupsize * sizeof(SortTuple));

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Gregory Stark

Simon Riggs [EMAIL PROTECTED] writes:

 I've written up my suggestion as a 5 line patch, with measured
 performance improvement of about 40% for the varchar case. 

A 5-line patch which improves performance by 40% for any case sounds amazing,
but how fragile is that gain? The kind of thing which would be worryign is if
runing a query which uses both varchar and some other ambiguous operator
causes it to lose all its gain.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 A 5-line patch which improves performance by 40% for any case sounds amazing,
 but how fragile is that gain? The kind of thing which would be worryign is if
 runing a query which uses both varchar and some other ambiguous operator
 causes it to lose all its gain.

Yeah, exactly.  If we're going to risk anything like this at all, the
cache-of-one restriction is simply not acceptable (especially given
that the part of the coding it would eliminate is the simplest and
easiest-to-get-right part).

In the test case Guillame provided, every single WHERE clause happens
to be of the form
varchar_column = 'unknown-type literal'
and there are no other operators used in the SELECT lists; but I can
hardly believe that this is representative of any significant number
of real-world applications.  Even pgbench uses more than one operator.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] maintenance_work_mem memory constraint?

2007-11-26 Thread Tom Lane
Bernd Helmle [EMAIL PROTECTED] writes:
 ... But isn't it worth to special case the 
 code in grow_memtuples() (and maybe other places where sort is likely to 
 use more RAM), so that we can remove this constraint on 64-Bit systems with 
 many RAM built in? Or am I missing something very important?.

AFAICS this patch can increase the number of sortable tuples by at most 2X
(less one).  That doesn't seem worth getting very worked up about ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-26 Thread mac_man2005

Sorry.

I'm trying to integrate my code into PostgreSQL. At the moment I have got my 
working code, with my own main() etc etc.
The code is supposed to perform run generation during external sorting. 
That's all, my code won't do any mergesort. Just run generation.


I'm studing the code and I don't know where to put my code into. Which part 
I need to substitute and which other are absolutely untouchables.
I admit I'm not an excellent programmer. I've always been writing my own 
codes, simple codes. Now I have got some ideas that can possibly help 
postgreSQL to get better. And for the first time I'm to integrate code into 
others code. I say it just to apologize in case some things that could be 
obvious for someone else, maybe are not for me.


Anyway... back to work.
My code has the following structure.

1) Generates a random input stream to sort.
As for this part, i just generate an integer input stream, not a stream of 
db records. I talk about stream because I'm in a general case in which the 
input source can be unknown and we cannot even know how much elements to 
sort


2)Fill the available memory with the first M elements from stream. They will 
be arranged into an heap structure.


3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth 
algorithm) marks elements belonging to runs in otder to know which run they 
belong to and to know when the current heap has finished building the 
current run. I don't memorize this kind of info. I just output from heap to 
run all of the elements going into the current run. The elements supposed to 
go into the next run (I call them dead records) are still stored into main 
memory, but as leaves of the heap. This implies reducing the heap size and 
so heapifying a smaller number of elements each time I get a dead record 
(it's not necessary to sort dead records). When the heap size is zero a new 
run is created heapifying all the dead records currently present into main 
memory.


I haven't seen something similar into tuplesort.c, apparently no heapify is 
called no new run created and stuff like this.
Do you see any parallelism between PostgreSQL code with what I said in the 
previous points?


Thanks for your attention.

--
From: Heikki Linnakangas [EMAIL PROTECTED]
Sent: Monday, November 26, 2007 5:42 PM
To: [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Replacement Selection


[EMAIL PROTECTED] wrote:
Unfortunately I'm lost into the code... any good soul helping me to 
understand what should be the precise part to be modified?


You haven't given any details on what you're trying to do. What are you 
trying to do?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor

2007-11-26 Thread Chris Browne
[EMAIL PROTECTED] (Peter Eisentraut) writes:
 Am Montag, 26. November 2007 schrieb Tom Lane:
 Pavel Stehule [EMAIL PROTECTED] writes:
  I propose new kind of FOR statement .. iteration over cursor,

 This seems useless and probably syntactically ambiguous.

 I think that is isomorphic to what he mentioned as better conformance with 
 PL/SQL.

Hmm.  So better conformance with PL/SQL amounts to being useless and
syntactically ambiguous...

;-)
-- 
let name=cbbrowne and tld=linuxfinances.info in name ^ @ ^ tld;;
http://linuxfinances.info/info/sgml.html
HE'S AHAB, boy; and Ahab of old, thou knowest, was a crowned king!
-- /Moby-Dick/, Ch 16 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Tom Lane
[EMAIL PROTECTED] writes:
 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth 
 algorithm) marks elements belonging to runs in otder to know which run they 
 belong to and to know when the current heap has finished building the 
 current run. I don't memorize this kind of info. I just output from heap to 
 run all of the elements going into the current run. The elements supposed to 
 go into the next run (I call them dead records) are still stored into main 
 memory, but as leaves of the heap. This implies reducing the heap size and 
 so heapifying a smaller number of elements each time I get a dead record 
 (it's not necessary to sort dead records). When the heap size is zero a new 
 run is created heapifying all the dead records currently present into main 
 memory.

Why would this be an improvement over Knuth?  AFAICS you can't generate
longer runs this way, and it's not saving any time --- in fact it's
costing time, because re-heapifying adds a lot of new comparisons.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-26 Thread mac_man2005
I must precise that it's not the improvement. Other more complex algorithms 
correspond to the refinements, but at the moment I just want to know which 
part of PostgreSQL code does what. I also implemented Replacement Selection 
(RS) so if I'm able to integrate my RS I hope I would be able to integrate 
the others too.


Anyway, even in my RS implementation a longer run is created. The first M 
initialization elements will surely form part of the current run. M is the 
memory size so at least a run sized M will be created. After initialization, 
the elements are not suddenly output, but an element from heap is output 
into run as soon as I get an element from stream. In other words, for each 
element from stream, the root element of the heap is output, and the input 
element takes the root place into the heap. If that element is a good 
record I just heapify (since the element will be placed at the now free 
root place). If that input element is a dead record I swap it with the last 
leaf and reduce the heap size.




--
From: Tom Lane [EMAIL PROTECTED]
Sent: Monday, November 26, 2007 7:31 PM
To: [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Replacement Selection


[EMAIL PROTECTED] writes:
3) Start run generation. As for this phase, I see PostgreSQL code (as 
Knuth
algorithm) marks elements belonging to runs in otder to know which run 
they

belong to and to know when the current heap has finished building the
current run. I don't memorize this kind of info. I just output from heap 
to
run all of the elements going into the current run. The elements supposed 
to
go into the next run (I call them dead records) are still stored into 
main
memory, but as leaves of the heap. This implies reducing the heap size 
and

so heapifying a smaller number of elements each time I get a dead record
(it's not necessary to sort dead records). When the heap size is zero a 
new
run is created heapifying all the dead records currently present into 
main

memory.


Why would this be an improvement over Knuth?  AFAICS you can't generate
longer runs this way, and it's not saving any time --- in fact it's
costing time, because re-heapifying adds a lot of new comparisons.

regards, tom lane



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Guillaume Smet
On Nov 26, 2007 6:48 PM, Tom Lane [EMAIL PROTECTED] wrote:
 In the test case Guillame provided, every single WHERE clause happens
 to be of the form
 varchar_column = 'unknown-type literal'
 and there are no other operators used in the SELECT lists; but I can
 hardly believe that this is representative of any significant number
 of real-world applications.  Even pgbench uses more than one operator.

Sure. The application uses a lot of other operators (timestamp
comparison, cube operators, LIKE, boolean comparison...). But as the
primary keys are all varchar in this application, the operator varchar
= unknown is by far the most used.

If we can't find a solution for upstream 8.3, I'll be happy to apply a
local patch for this customer (if I'm sure it doesn't do any harm in
their case).

Usually, people use integer flavoured primary keys so they probably
won't be hit by this problem as strong as we are.

--
Guillaume

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Timothy J. Kordas

[EMAIL PROTECTED] wrote:
I also implemented 
Replacement Selection (RS) so if I'm able to integrate my RS I hope I 
would be able to integrate the others too.


The existing code implements RS. Tom asked you to describe what improvements 
you hope to make; I'm confident that he already understands how to implement 
RS. :-)


**

Why don't you compile with TRACE_SORT enabled and watch the log output.

The function in tuplesort.c that you should start with is puttuple_common().

in puttuple_common(), the transition from an internal to external sort is 
performed at the bottom of the TSS_INITIAL case in the main switch 
statement. The function dumptuples() heapifies the in-core tuples (divides 
the in-core tuples into initial runs and then advances the state to 
TSS_BUILDRUNS). All subsequent tuples will hit the TSS_BUILDRUNS case and 
will insert tuples into the heap; emitting tuples for the current run as it 
goes.


I recommend you run the code in the debugger on a external-sorting query: 
watch two or three tuples go into the heap and you'll get the idea.


The top of the heap is at state-memtuples[0] the heap goes down from there. 
New tuples are added there and the heap is adjusted (Using the 
tuplesort_heap_siftup() function).


-Tim

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Kevin Grittner
 On Mon, Nov 26, 2007 at  1:04 PM, in message
[EMAIL PROTECTED], Guillaume Smet
[EMAIL PROTECTED] wrote: 
 On Nov 26, 2007 6:48 PM, Tom Lane [EMAIL PROTECTED] wrote:
 In the test case Guillame provided, every single WHERE clause happens
 to be of the form
 varchar_column = 'unknown-type literal'
 and there are no other operators used in the SELECT lists; but I can
 hardly believe that this is representative of any significant number
 of real-world applications.  Even pgbench uses more than one operator.
 
 Sure. The application uses a lot of other operators (timestamp
 comparison, cube operators, LIKE, boolean comparison...). But as the
 primary keys are all varchar in this application, the operator varchar
 = unknown is by far the most used.
 
 If we can't find a solution for upstream 8.3, I'll be happy to apply a
 local patch for this customer (if I'm sure it doesn't do any harm in
 their case).
 
 Usually, people use integer flavoured primary keys so they probably
 won't be hit by this problem as strong as we are.
 
I've been watching this with some concern, as we only use synthetic
keys when there is no naturally occurring set of columns which
uniquely identify a row.  In our circuit court software (where we
have the most users), there are 377 permanent tables, mostly with
compound keys:
 
keycolcount | count
-+---
   1 |91
   2 |   129
   3 |88
   4 |46
   5 |20
   6 | 3
(6 rows)
 
We have a lot of varchar columns within those keys, although almost
all are actually declared via a domain:
 
 basetype | count
--+---
 date |34
 int  |   146
 numeric  | 8
 smallint |   365
 varchar  |   362
(5 rows)
 
Complex queries usually join on at least one varchar column and one
smallint column.  Selection criteria usually involve comparing to a
character string literal, sometimes a date literal, and almost
always a smallint.
 
Are we likely to see the 3% or the 7% performance degradation with
version 8.3?  Is the small patch likely to help with this usage
pattern?
 
I'm still trying to get 8.3 performance testing onto our schedule
before the end (fast approaching, I know) of the beta.  Is there
anything in particular that would be helpful for me to look at
regarding this particular issue in those tests?
 
-Kevin
 


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Peter Eisentraut
Dave Page wrote:
 How does that work with the stuff that goes into directories relative to
 $libdir (per documented recommendations), without creating easy-to-break
 paths like $libdir/../../MyAddon/8.3/MyAddon.dll?

Which documented recommendation do you speak of?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Peter Eisentraut
Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Am Montag, 26. November 2007 schrieb Mark Cave-Ayland:
  I'm working on a piece of code for PostGIS to allow the loading of
  projection configuration files from the share/postgresql directory, but
 
  The share directory is the wrong place for configuration files anyway. 
  And moreover, non-PostgreSQL packages have no business putting files into
  PostgreSQL's private directories.

 That doesn't seem logical to me.  We expect tsearch configuration files
 to be put into $sharedir/tsearch_data, so why shouldn't PostGIS use a
 similar approach to dealing with external configuration data?

Well, PostGIS is not PostgreSQL.  And those files are not really configuration 
files, in the sense that the user configures them, but data files (or else 
they are in the wrong place).

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] maintenance_work_mem memory constraint?

2007-11-26 Thread Bernd Helmle
--On Montag, November 26, 2007 13:02:14 -0500 Tom Lane [EMAIL PROTECTED] 
wrote:



Bernd Helmle [EMAIL PROTECTED] writes:

... But isn't it worth to special case the
code in grow_memtuples() (and maybe other places where sort is likely to
use more RAM), so that we can remove this constraint on 64-Bit systems
with  many RAM built in? Or am I missing something very important?.


AFAICS this patch can increase the number of sortable tuples by at most 2X
(less one).  That doesn't seem worth getting very worked up about ...

regards, tom lane


That's true.

Well, i haven't meant the diff as a discussable patch at all. It's just 
what i've done to understand why we have this limit for tuplesort. afaics, 
the main constraint here is MaxAllocSize, and i just wonder if that doesn't 
introduce unnecessary limits on systems which can use many RAM for index 
creation and wether we can be more generous here. So one idea could be to 
allow larger allocation requests during sorting on systems where we know 
that this is likely to work.


--
 Thanks

   Bernd

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-26 Thread Brendan Jurd
Quoting Tom, from the previous thread linked by Martijn:

 It could be pretty ugly, because type assignment normally proceeds
 bottom-up :-(.  What you might have to do is make the raw grammar
 representation of ARRAY[] work like A_Const does, ie, there's a
 slot to plug in a typecast.  That's pretty much vestigial now for
 A_Const, if memory serves, but it'd be needful if ARRAY[] has to
 be able to see the typecast that would otherwise be above it in
 the parse tree.

This approach is making sense to me, but I've run into a bit of a
dependency issue.  A_Const does indeed have a slot for typecasts by
way of a TypeName member.  A_Const and TypeName are both defined in
parsenodes.h, whereas ArrayExpr is defined in primnodes.h.  So
unfortunately I can't just add a TypeName member to ArrayExpr.

I'm new to this area of the codebase (and parsers generally), so I'm
treading carefully.  What would be the best way to resolve this?
Would moving TypeName into primnodes.h be acceptable?

Thanks for your time,
BJ

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Dave Page


 --- Original Message ---
 From: Peter Eisentraut [EMAIL PROTECTED]
 To: Dave Page [EMAIL PROTECTED]
 Sent: 26/11/07, 20:14:25
 Subject: Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
 
 Dave Page wrote:
  How does that work with the stuff that goes into directories relative to
  $libdir (per documented recommendations), without creating easy-to-break
  paths like $libdir/../../MyAddon/8.3/MyAddon.dll?
 
 Which documented recommendation do you speak of?
 

http://www.postgresql.org/docs/8.3/static/xfunc-c.html states:

==
It is recommended to locate shared libraries either relative to $libdir or 
through the dynamic library path. This simplifies version upgrades if the new 
installation is at a different location. The actual directory that $libdir 
stands for can be found out with the command pg_config --pkglibdir.
==

That seems to have been there since at least 7.2.

/D

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-26 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 This approach is making sense to me, but I've run into a bit of a
 dependency issue.  A_Const does indeed have a slot for typecasts by
 way of a TypeName member.  A_Const and TypeName are both defined in
 parsenodes.h, whereas ArrayExpr is defined in primnodes.h.  So
 unfortunately I can't just add a TypeName member to ArrayExpr.

That would be quite the wrong thing to do anyway, since ArrayExpr is
a run-time representation and shouldn't have any such thing attached
to it.  What you probably need is a separate parse-time representation
of ARRAY[], a la the difference between A_Const and Const.

Another possibility is to just hack up a private communication path
between transformExpr and transformArrayExpr, ie when you see TypeCast
check to see if its argument is ArrayExpr and do something different.
This would be a mite klugy but it'd be a much smaller patch that way.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Tom Lane
Dave Page [EMAIL PROTECTED] writes:
 From: Peter Eisentraut [EMAIL PROTECTED]
 Which documented recommendation do you speak of?

 http://www.postgresql.org/docs/8.3/static/xfunc-c.html states:
 It is recommended to locate shared libraries either relative to $libdir or 
 through the dynamic library path. This simplifies version upgrades if the new 
 installation is at a different location. The actual directory that $libdir 
 stands for can be found out with the command pg_config --pkglibdir.

I believe that that is talking specifically about shared libraries (or
DLLs in Windows-speak), and not about configuration or data files.
In particular, nothing under libdir would be a candidate to go under
sharedir, nor vice versa, since the former is supposed to hold
architecture-dependent files and the latter architecture-independent
files.

Mark hasn't been very clear about whether he wants to store static data
files or installation-changeable configuration info, so it's not clear
to me whether Peter's objection to using sharedir is appropriate or not.
But unless the files are architecture-sensitive (which they might be!),
libdir doesn't sound right in either case.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Tom Lane
Kevin Grittner [EMAIL PROTECTED] writes:
 Are we likely to see the 3% or the 7% performance degradation with
 version 8.3?

Probably not, since it sounds like your queries are typically not as
trivial as the ones in Guillame's test case.  IOW there will be some
slowdown but it's likely to be in the noise for you, and also very
likely made up by improvements elsewhere.  The test case is in the
unfortunate position of not being helped materially by *any* of the work
we've done for 8.3.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Tom Lane
[EMAIL PROTECTED] writes:
 Anyway, even in my RS implementation a longer run is created. The first M 
 initialization elements will surely form part of the current run. M is the 
 memory size so at least a run sized M will be created. After initialization, 
 the elements are not suddenly output, but an element from heap is output 
 into run as soon as I get an element from stream. In other words, for each 
 element from stream, the root element of the heap is output, and the input 
 element takes the root place into the heap. If that element is a good 
 record I just heapify (since the element will be placed at the now free 
 root place). If that input element is a dead record I swap it with the last 
 leaf and reduce the heap size.

AFAICS that produces runs that are *exactly* the same length as Knuth's
method --- you're just using a different technique for detecting when
the run is over, to wit record is not in heap vs record is in heap
but with a higher run number.  I guess you would save some comparisons
while the heap is shrinking, but it's not at all clear that you'd save
more than what it will cost you to re-heapify all the dead records once
the run is over.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Mark Cave-Ayland
On Mon, 2007-11-26 at 11:55 -0500, Tom Lane wrote:

 I'm not sure why Mark's having a problem accessing my_exec_path ---
 it *is* declared DLLIMPORT in miscadmin.h (which is where it counts,
 AIUI) clear back to 8.0.

Bah, I think that is the source of the problem. Having grepped the
source for my_exec_path, I found the reference in globals.c and hence
the code worked on Linux (my main development environment) with a simple
extern declaration which was what was confusing me. Adding the #include
miscadmin.h solves this problem and I can now access the variable on
Windows aswell, so I put this down to user error - thanks for the
advice Tom.


Many thanks,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Dave Page


 --- Original Message ---
 From: Tom Lane [EMAIL PROTECTED]
 To: Dave Page [EMAIL PROTECTED]
 Sent: 26/11/07, 22:02:09
 Subject: Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
 

 I believe that that is talking specifically about shared libraries (or
 DLLs in Windows-speak), and not about configuration or data files.
 In particular, nothing under libdir would be a candidate to go under
 sharedir, nor vice versa, since the former is supposed to hold
 architecture-dependent files and the latter architecture-independent
 files.

Yes, I know. Peter seemed to be saying that nothing except postgres itself 
should be in *any* of the installation directories (he called them 'private' 
directories by which I imagine he meant an install in /usr/local/pgsql/ vs 
/usr/local, or more specifically the case when $libdir is something like 
/usr/lib/postgresql)

/D


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Tom Lane
Dave Page [EMAIL PROTECTED] writes:
 Yes, I know. Peter seemed to be saying that nothing except postgres
 itself should be in *any* of the installation directories

Yeah, that's what I think he said too, but it strikes me as a completely
bogus policy --- what about contrib modules or stuff from pgfoundry
or any random user-written module that was built with PGXS?  All that
stuff happily drops files under $libdir and $sharedir, and I see no good
argument why it shouldn't.

There is fair room for argument about whether $sharedir is the
appropriate place for installation-specific config files though.
Those should probably live in the same place as postgresql.conf
and friends, ie, typically but not necessarily $PGDATA.  It doesn't
look like SelectConfigFiles() saves the config directory name anyplace
where an add-on module could get at it.  We could perhaps fix that for
8.3, but it won't help Mark if he insists on a backwards-compatible
solution.  I suppose he could copy ConfigFileName and strip the last
component from it?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-26 Thread Brendan Jurd
On Nov 27, 2007 8:04 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Brendan Jurd [EMAIL PROTECTED] writes:
  ... So
  unfortunately I can't just add a TypeName member to ArrayExpr.

 That would be quite the wrong thing to do anyway, since ArrayExpr is
 a run-time representation and shouldn't have any such thing attached
 to it.  What you probably need is a separate parse-time representation
 of ARRAY[], a la the difference between A_Const and Const.


Ah.  I wasn't aware of the distinction; I started by looking in gram.y
and saw that the ARRAY parse path creates an ArrayExpr node, whilst
the constant parse paths create A_Const nodes.  I didn't realise that
ArrayExpr was skipping ahead and creating the same kind of object
that the transform produces.

Glad I stopped and asked for directions then. =)

I'm not 100% clear on what the A_ prefix signifies ... is A_ArrayExpr
a good name for the parse-time structure?

Thanks for your time,
BJ

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-26 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 I'm not 100% clear on what the A_ prefix signifies ... is A_ArrayExpr
 a good name for the parse-time structure?

Yeah, might as well use that for consistency.  The A_ doesn't seem
very meaningful to me either, but I don't want to rename the existing
examples ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 AFAICS that produces runs that are *exactly* the same length as Knuth's
 method --- you're just using a different technique for detecting when
 the run is over, to wit record is not in heap vs record is in heap
 but with a higher run number.  I guess you would save some comparisons
 while the heap is shrinking, but it's not at all clear that you'd save
 more than what it will cost you to re-heapify all the dead records once
 the run is over.

This sounded familiar... It sounds a lot like what this CVS log message is
describing as a mistaken idea:

  revision 1.2
  date: 1999-10-30 18:27:15 +0100;  author: tgl;  state: Exp;  lines: +423 -191;

  Further performance improvements in sorting: reduce number of comparisons
  during initial run formation by keeping both current run and next-run tuples
  in the same heap (yup, Knuth is smarter than I am). And, during merge
  passes, make use of available sort memory to load multiple tuples from any
  one input 'tape' at a time, thereby improving locality of access to the temp
  file.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Table inheritance, unique constraints and foreign key problem

2007-11-26 Thread Jacob Rief
Hello,
this issue has been requested and its on the TODO-list. Since I really
need foreign key constraints on inherited tables, I have two solutions:
Adding some hackish RULES/TRIGGERS to my tables or implementing it
myself. It think the latter is better. However, I have no experience in
implementing such a feature in Postgres. I have written some triggers in
C (actually C++) using the SPI_-functions, but that's it. I have a
running 8.3beta, checked out from the repository, and I have read the
Ottawa slides. Can someone tell me in a few lines, where to start with
such a feature.
Jacob



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Mark Cave-Ayland
On Mon, 2007-11-26 at 17:02 -0500, Tom Lane wrote:

 I believe that that is talking specifically about shared libraries (or
 DLLs in Windows-speak), and not about configuration or data files.
 In particular, nothing under libdir would be a candidate to go under
 sharedir, nor vice versa, since the former is supposed to hold
 architecture-dependent files and the latter architecture-independent
 files.
 
 Mark hasn't been very clear about whether he wants to store static data
 files or installation-changeable configuration info, so it's not clear
 to me whether Peter's objection to using sharedir is appropriate or not.
 But unless the files are architecture-sensitive (which they might be!),
 libdir doesn't sound right in either case.

Okay, I'll try and expand on this a bit. In order to convert coordinates
between different coordinate systems, PostGIS uses the external PROJ.4
library. Now in order to support a certain category of conversion,
PROJ.4 requires access to a set of library grid reference files which
are effectively compiled from source files into a set of data files as
part of the build process. The path to this directory of files is then
built into the DLL at compile time, although it can be overriden with an
API call.

Under Linux, this is fairly easy as the files are normally installed
somewhere under /usr/share/proj, and hence the directory exists at both
compile-time and run-time. Windows is trickier because drive letters and
mappings can change - the default of C:\PROJ\NAD may or may not exist,
or can change depending upon the current drive configuration. I can also
see issues arising if the PostgreSQL installation is moved from the C:\
drive to another.

Hence my idea was to create a directory under $sharedir such as
$sharedir/postgresql/contrib/postgis/nad and install the files there.
Then regardless of the location of the PostgreSQL installation or the
current drive setup, I can use get_share_path() with the PROJ.4 API to
set the new library path the first time the function is called, and
everything will just work.

I can see Peter's argument about not putting files directly in
$sharedir, but I feel the usage suggested above falls under a similar
use case to the tsearch2 data files (which is mostly where I looked for
inspiration).

Hopefully this will help make things a bit clearer - please let me know
if any more information is needed.


Many thanks,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Dave Page


 --- Original Message ---
 From: Tom Lane [EMAIL PROTECTED]
 To: Dave Page [EMAIL PROTECTED]
 Sent: 26/11/07, 22:30:17
 Subject: Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
 
 Dave Page [EMAIL PROTECTED] writes:
  Yes, I know. Peter seemed to be saying that nothing except postgres
  itself should be in *any* of the installation directories
 
 Yeah, that's what I think he said too, but it strikes me as a completely
 bogus policy --- what about contrib modules or stuff from pgfoundry
 or any random user-written module that was built with PGXS?  All that
 stuff happily drops files under $libdir and $sharedir, and I see no good
 argument why it shouldn't.

Yup, exactly.

 There is fair room for argument about whether $sharedir is the
 appropriate place for installation-specific config files though.
 Those should probably live in the same place as postgresql.conf
 and friends, ie, typically but not necessarily $PGDATA.  It doesn't
 look like SelectConfigFiles() saves the config directory name anyplace
 where an add-on module could get at it.  We could perhaps fix that for
 8.3, but it won't help Mark if he insists on a backwards-compatible
 solution.  I suppose he could copy ConfigFileName and strip the last
 component from it?

Wouldn't be difficult, but I suspect Mark is probably using the path for the 
PostGIS SQL scripts lwgeom.sql, spatial_ref_sys.sql etc. rather than config 
files. But, I see he's found his bug now anyway.

/D

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-26 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I guess you would save some comparisons
 while the heap is shrinking, but it's not at all clear that you'd save
 more than what it will cost you to re-heapify all the dead records once
 the run is over.

 This sounded familiar... It sounds a lot like what this CVS log message is
 describing as a mistaken idea:

Wow, I had forgotten all about that; but yeah this sounds exactly like
my first-cut rewrite of PG's sorting back in 1999.  I have some vague
memory of having dismissed Knuth's approach as being silly because of
the extra space and (small number of) cycles needed to compare run
numbers in the heap.  I hadn't realized that there was an impact on
total number of comparisons required :-(

The discussion from that time period in pgsql-hackers makes it sound
like you need a large test case to notice the problem, though.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes:
 Okay, I'll try and expand on this a bit. In order to convert coordinates
 between different coordinate systems, PostGIS uses the external PROJ.4
 library. Now in order to support a certain category of conversion,
 PROJ.4 requires access to a set of library grid reference files which
 are effectively compiled from source files into a set of data files as
 part of the build process. The path to this directory of files is then
 built into the DLL at compile time, although it can be overriden with an
 API call.

 Under Linux, this is fairly easy as the files are normally installed
 somewhere under /usr/share/proj, and hence the directory exists at both
 compile-time and run-time.

OK, if the files are normally under /usr/share then it's presumably
kosher to put them under our $sharedir.  I guess the only question is
whether you are worried about having two copies in a machine where
PROJ.4 is also installed natively.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Simon Riggs
On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  But I think there must be an action that we can take for 8.3 and that
  much runtime should not be given away easily. ISTM that we can win back
  the losses Guillaume has identified, plus gain a little more even.
 
 Perhaps some sanity could be restored to this discussion by pointing out
 that the 2007-01-01 code *also* clocks in at 37% spent in
 oper_select_candidate.  IOW it's been like this for a very long time.

[I'm replying to the wrong message, I know.]

Here's where I am:

Basic test was to replace call to oper_select_candidate with a single
item that was fed by a hardcoded value for varchar equality operator.
This is the oper_cache.v1.patch enclosed; the 5 line patch.

Test results were
- w/o patch ~10,500 tps with pgbench_varchar.sql
- with patch~15,500 tps with pgbench_varchar.sql (**big gain**)
- w/o patch ~16,250 tps with pgbench_integer.sql
- with patch~16,250 tps with pgbench_integer.sql

Tables are standard pgbench, varchar test table created using:
create table av as select aid::varchar, bid, abalance, filler from
accounts;
create unique index av_pkey on av (aid);

The impact of calling oper_select_candidate() is big enough that it will
affect any query that is read only and has 1 or 2 predicates when at
least one of them is a VARCHAR_col = const query.

What I'm actually proposing is a patch implementing a oper_select_hook
function pointer, which allows the user to do anything they want. I'm
just re-writing that as a plugin now, but the backend patch is included
here for discussion. oper_select_hook.v1.patch

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
Index: src/backend/parser/parse_oper.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/parser/parse_oper.c,v
retrieving revision 1.98
diff -c -r1.98 parse_oper.c
*** src/backend/parser/parse_oper.c	22 Nov 2007 19:40:25 -	1.98
--- src/backend/parser/parse_oper.c	26 Nov 2007 16:28:42 -
***
*** 28,33 
--- 28,34 
  #include utils/syscache.h
  #include utils/typcache.h
  
+ static Oid operOidCache = InvalidOid;
  
  static Oid	binary_oper_exact(List *opname, Oid arg1, Oid arg2);
  static FuncDetailCode oper_select_candidate(int nargs,
***
*** 516,534 
  		/* No operators found? Then fail... */
  		if (clist != NULL)
  		{
! 			/*
! 			 * Unspecified type for one of the arguments? then use the other
! 			 * (XXX this is probably dead code?)
! 			 */
! 			Oid			inputOids[2];
  
! 			if (rtypeId == InvalidOid)
! rtypeId = ltypeId;
! 			else if (ltypeId == InvalidOid)
! ltypeId = rtypeId;
! 			inputOids[0] = ltypeId;
! 			inputOids[1] = rtypeId;
! 			fdresult = oper_select_candidate(2, inputOids, clist, operOid);
  		}
  	}
  
--- 517,543 
  		/* No operators found? Then fail... */
  		if (clist != NULL)
  		{
! 			if (OidIsValid(operOidCache))
! operOid = operOidCache;
! 			else
! 			{
! /*
!  * Unspecified type for one of the arguments? then use the other
!  * (XXX this is probably dead code?)
!  */
! Oid			inputOids[2];
! 
! if (rtypeId == InvalidOid)
! 	rtypeId = ltypeId;
! else if (ltypeId == InvalidOid)
! 	ltypeId = rtypeId;
! inputOids[0] = ltypeId;
! inputOids[1] = rtypeId;
! fdresult = oper_select_candidate(2, inputOids, clist, operOid);
  
! if (ltypeId == 1043  rtypeId == 705)
! 	operOidCache = operOid;
! 			}
  		}
  	}
  
\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;

\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM av WHERE aid = ':aid';

Index: src/backend/parser/parse_oper.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/parser/parse_oper.c,v
retrieving revision 1.98
diff -c -r1.98 parse_oper.c
*** src/backend/parser/parse_oper.c	22 Nov 2007 19:40:25 -	1.98
--- src/backend/parser/parse_oper.c	26 Nov 2007 23:08:09 -
***
*** 28,33 
--- 28,34 
  #include utils/syscache.h
  #include utils/typcache.h
  
+ oper_select_hook_type oper_select_hook = NULL;
  
  static Oid	binary_oper_exact(List *opname, Oid arg1, Oid arg2);
  static FuncDetailCode oper_select_candidate(int nargs,
***
*** 528,534 
  ltypeId = rtypeId;
  			inputOids[0] = ltypeId;
  			inputOids[1] = rtypeId;
! 			fdresult = oper_select_candidate(2, inputOids, clist, operOid);
  		}
  	}
  
--- 529,544 
  ltypeId = rtypeId;
  			inputOids[0] = ltypeId;
  			inputOids[1] = rtypeId;
! 
! 			/*
! 			 * Allow user defined operator selection, allowing hard-coding
! 			 * cacheing or other mechanisms of operator selection for
! 			 * improved performance in certain circumstances
! 			 */
! 			if (oper_select_hook)
! fdresult = (* oper_select_hook) (2, 

Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Here's where I am:

 Basic test was to replace call to oper_select_candidate with a single
 item that was fed by a hardcoded value for varchar equality operator.

Well, that confirms what we knew from gprof, but surely you aren't
proposing that as a usable patch.

 What I'm actually proposing is a patch implementing a oper_select_hook
 function pointer, which allows the user to do anything they want.

Why in the world would that be a good idea?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Bruce Momjian
Bruce Momjian wrote:
 D'Arcy J.M. Cain wrote:
  On Sat, 24 Nov 2007 11:27:38 -0500 (EST)
  Bruce Momjian [EMAIL PROTECTED] wrote:
   I am confused about two other items with MONEY.  First, why can't
   anything but a string be cast to this type?
   
 test= select 871234872319489323::money;
 ERROR:  cannot cast type bigint to money
 LINE 1: select 871234872319489323::money;
^
 test= select 871234872::money;
 ERROR:  cannot cast type integer to money
 LINE 1: select 871234872::money;
   ^
 test= select 87123487231.3::money;
 ERROR:  cannot cast type numeric to money
 LINE 1: select 87123487231.3::money;
   ^
  
  I agree.  I wasn't the one that added the meta information.
 
 OK, so the big question then is if we are un-depricating this data type,
 does it have the behavior we want?  (And the regression addition will be
 helpful too.)

Added to TODO list:

* Allow MONEY to be cast to/from other numeric data types

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Simon Riggs
On Mon, 2007-11-26 at 18:18 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Here's where I am:
 
  Basic test was to replace call to oper_select_candidate with a single
  item that was fed by a hardcoded value for varchar equality operator.
 
 Well, that confirms what we knew from gprof, but surely you aren't
 proposing that as a usable patch.

gprof might not have translated into a usable gain, but clearly it can.

That's not a proposed patch, just showing my results.

  What I'm actually proposing is a patch implementing a oper_select_hook
  function pointer, which allows the user to do anything they want.
 
 Why in the world would that be a good idea?

Short answer: it makes it go faster? You asked. ;-)

Long answer: We all agree the operator cache is the best answer, yet
don't wish to delay the project or make it less robust. The best answer
is a plugin approach that lets users take the risk and make the gain. 

We can't hardcode it for everybody because that runs completely against
the grain of Postgres. Including this as a plugin allows people to make
their own decisions about cacheing/hardcoding. If you are the unlucky
owner of a database with a heavy read workload and lots of VARCHAR keys
then you're going to want this. 

The plugin allows writing a one-slot cache that is never flushed. If you
choose to override the operators then you'd need to reconnect. It also
allows some performance tuning in other cases too, so having it as a
general case makes sense.

The overhead of implementing it this way is very close to zero and the
code path doesn't even get called in the integers-as-keys cases.

I don't really like all of this, but that much gain is too much for me
to ignore. Better ideas eagerly accepted, and encouraged.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-26 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  But I think there must be an action that we can take for 8.3 and that
  much runtime should not be given away easily. ISTM that we can win back
  the losses Guillaume has identified, plus gain a little more even.
 
 Perhaps some sanity could be restored to this discussion by pointing out
 that the 2007-01-01 code *also* clocks in at 37% spent in
 oper_select_candidate.  IOW it's been like this for a very long time.
 I'm not interested in destabilizing 8.3 with panicky last-minute patches.
 
  So how about we have a cache-of-one:
 
 Cache-of-one has exactly the same difficulty as cache-of-many, other
 than the table lookup itself, which is a solved problem (hashtable).
 You still have to determine how you identify the cached value and what
 events require a cache flush.  Nor do I see any particular reason to
 assume that a cache of only one operator would be of any use for
 real-world apps, as opposed to toy examples.

Seems like anytime a function like that takes 37%, there is something
wrong.  Are we sure there isn't a bug in there somewhere?

As far as a cache, could we create a simple cache that remembered the
last X lookups and cleared the cache anytime a cache invalidation
message came in?

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Table inheritance, unique constraints and foreign key problem

2007-11-26 Thread Gregory Stark
Jacob Rief [EMAIL PROTECTED] writes:

 this issue has been requested and its on the TODO-list. Since I really
 need foreign key constraints on inherited tables, I have two solutions:
 Adding some hackish RULES/TRIGGERS to my tables or implementing it
 myself. It think the latter is better. However, I have no experience in
 implementing such a feature in Postgres. I have written some triggers in
 C (actually C++) using the SPI_-functions, but that's it. I have a
 running 8.3beta, checked out from the repository, and I have read the
 Ottawa slides. Can someone tell me in a few lines, where to start with
 such a feature.

This is the tip of an iceberg. As you dig you find out it's caused by deeper
and deeper limitations until you're pretty much all of the executor.

The RI trigger code explicitly uses ONLY for the integrity checks. But if you
remove that you find it breaks because you get this message:

ERROR:  SELECT FOR UPDATE/SHARE is not supported for inheritance queries
 
Look at src/backend/optimizer/path/allpaths.c:287 for a comment about this.

I'm a bit puzzled myself why this affects SELECT FOR UPDATE/SHARE but not
straight UPDATES and DELETES.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Added to TODO list:
 * Allow MONEY to be cast to/from other numeric data types

So in other words, that's been added to the TODO list *purely* on your
own say-so, and not because any users asked for it or anyone else thinks
it's a good idea.

Since MONEY really ought to be considered a tagged type, I'm not at all
impressed with the idea that there should be default casts between it
and plain numerics.  There's a fundamental semantic gap there.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Table inheritance, unique constraints and foreign key problem

2007-11-26 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I'm a bit puzzled myself why this affects SELECT FOR UPDATE/SHARE but not
 straight UPDATES and DELETES.

In straight UPDATE/DELETE we have enough structure in the query to know
how to associate each tuple returned to the executor top level with
exactly one tuple in exactly one target table (which is where to apply
the tuple lock operation).  We don't have that much structure in general
SELECT --- for example, what to do with null-filled rows in a LEFT JOIN,
or cases where one row gives rise to more than one joined row, or
aggregation or UNION?  Some of these cases can probably be rejected as
unsupportable, but it'll still take a lot of work.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Mon, 26 Nov 2007 21:19:48 -0500
Tom Lane [EMAIL PROTECTED] wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Added to TODO list:
  * Allow MONEY to be cast to/from other numeric data types
 
 So in other words, that's been added to the TODO list *purely* on your
 own say-so, and not because any users asked for it or anyone else
 thinks it's a good idea.
 

Well if we are going to continue to support money (which I am against)
we should support the casting to numeric as that is by far a more
common implementation of money and we will have mixed environments.

 Since MONEY really ought to be considered a tagged type, I'm not at
 all impressed with the idea that there should be default casts
 between it and plain numerics.  There's a fundamental semantic gap
 there.

Perhaps but the practical gap is much narrower.

Sincerely,

Joshua D. Drake

 
   regards, tom lane
 
 ---(end of
 broadcast)--- TIP 2: Don't 'kill -9' the
 postmaster
 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHS4QhATb/zqfZUUQRAraiAJ9HLeqG7nRbblPvJhu/JQwhOrmzxQCgpWGE
1JShnZ4xwM1+lQzTKCkGVcw=
=sErU
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 Well if we are going to continue to support money (which I am against)
 we should support the casting to numeric as that is by far a more
 common implementation of money and we will have mixed environments.

So, you don't use MONEY, and you don't want to, but nonetheless you
know better than the people who do use MONEY what they need.

Aside from the semantic-gap issue, there is the point that providing
a cast might actually mask application errors.  I can well imagine
cases where one of the reasons for using MONEY is *exactly* that it's
not a plain number or easily convertible to one.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching

2007-11-26 Thread Bruce Momjian
Tom Lane wrote:
 Since Simon seems intent on hacking something in there, here is a patch
 that I think is actually sane for improving operator lookup speed.
 This patch caches all lookups, exact or ambiguous (since even the exact
 ones require multiple cache searches in common cases); and behaves sanely
 in the presence of search_path, pg_operator, or pg_cast changes.
 
 I see about a 45% speedup (2110 vs 1445 tps) on Guillame Smet's test case.
 On straight pgbench --- which has no ambiguous operators, plus it's not
 read-only --- it's hard to measure any consistent speedup, but I can say
 that it's not slower.  Some other test cases would be nice.
 
 I went through the code that's being bypassed in some detail, to see what
 dependencies were being skipped over.  I think that as long as we assume
 that no *existing* type changes its domain base type, typtype, array
 status, type category, or preferred-type status, we don't need to flush
 the cache on pg_type changes.  This is a good thing since pg_type changes
 frequently (eg, at temp table create or drop).
 
 The only case that I believe to be unhandled is that the cache doesn't pay
 attention to ALTER TABLE ... INHERIT / NO INHERIT events.  This means it
 is theoretically possible to return the wrong operator if an operator
 takes a complex type as input and the calling situation involves another
 complex type whose inheritance relationship to that one changes.  That's
 sufficiently far out of the normal case that I'm not very worried about it
 (in fact, we probably have bugs in that area even without this patch,
 since for instance cached plans don't respond to such changes either).
 We could plug the hole by forcing a system-wide cache reset during ALTER
 TABLE ... INHERIT / NO INHERIT, if anyone insists.
 
 I'm not entirely happy about applying a patch like this so late in
 the beta cycle, but I'd much rather do this than than any of the
 less-than-half-baked ideas that have been floated in the discussion
 so far.

Thanks for the patch.  I can see it is clearly of significant size.

I also noted that you found that the case of:

SELECT col FROM tab WHERE text_col = 'ABC';

also took 37% of CPU in January, I think meaning we had this problem in
8.2.

On the one hand we have a pretty significant patch that we might apply. 
It gives us a major speedup (+30%) for a common query type.  I assume
8.3 was slightly slower than 8.2 only because we have a few more
pg_catalog entries in 8.3 than 8.2.  (I am still baffled how a lookup
function could take so much CPU compared to what else is done for a
query.)

We are also talking about catlog changes for 8.3.  Are we comfortable
doing catalog changes between the beta and RC?  I am wondering if the
right plan is to have someone else review your patch, apply it, make the
catalog changes, and release another beta this weekend. Give the beta
one week of testing and go for RC.  That gives us testing of the patch,
and testing of the catalog changes before going to RC1.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching

2007-11-26 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 We are also talking about catlog changes for 8.3.  Are we comfortable
 doing catalog changes between the beta and RC?

The catalog changes in question seem entirely safe ... certainly much
more so than this patch ...

I do see your point that another beta might be prudent, but on the other
hand I'm not sure it's really needed.  The only difference between a
beta and an RC is that we try not to change the code anymore after RC.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Bruce Momjian
Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
  Well if we are going to continue to support money (which I am against)
  we should support the casting to numeric as that is by far a more
  common implementation of money and we will have mixed environments.
 
 So, you don't use MONEY, and you don't want to, but nonetheless you
 know better than the people who do use MONEY what they need.

I found out you can cast numerics to MONEY by using two casts:

test= SELECT 12321.12::text::money;
   money

 $12,321.12
(1 row)

For some reason this doesn't work in 8.2 but does in 8.3.  The reverse
doesn't work:

test= SELECT '12321.12'::money::text::numeric;
ERROR:  invalid input syntax for type numeric: $12,321.12

The big problem is that MONEY is a string so the dollar sign and commas
are a problem.

 Aside from the semantic-gap issue, there is the point that providing
 a cast might actually mask application errors.  I can well imagine
 cases where one of the reasons for using MONEY is *exactly* that it's
 not a plain number or easily convertible to one.

Perhaps all we need is a way to accomplish the casting so it isn't
automatic.  This works:

test= SELECT regexp_replace('2343'::money::text, '[^$,]*', '', 
'g')::numeric;
 regexp_replace

2343.00
(1 row)

but the '$' and ',' are locale-specific and I can't think of a way to do
this generically.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Mon, 26 Nov 2007 23:47:04 -0500 (EST)
Bruce Momjian [EMAIL PROTECTED] wrote:

 Perhaps all we need is a way to accomplish the casting so it isn't
 automatic.  This works:
 
   test= SELECT regexp_replace('2343'::money::text, '[^$,]*',
 '', 'g')::numeric; regexp_replace
   
   2343.00
   (1 row)
 
 but the '$' and ',' are locale-specific and I can't think of a way to
 do this generically.

With a regex?

Joshua D. Drake 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHS6RjATb/zqfZUUQRAtZyAJ9VZeCzjX+RSGr3A9eWXTcbHwnc4gCgoZcH
LcO9KnjcycRo5YjCektSJXg=
=HZpl
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Bruce Momjian
Joshua D. Drake wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On Mon, 26 Nov 2007 23:47:04 -0500 (EST)
 Bruce Momjian [EMAIL PROTECTED] wrote:
 
  Perhaps all we need is a way to accomplish the casting so it isn't
  automatic.  This works:
  
  test= SELECT regexp_replace('2343'::money::text, '[^$,]*',
  '', 'g')::numeric; regexp_replace
  
  2343.00
  (1 row)
  
  but the '$' and ',' are locale-specific and I can't think of a way to
  do this generically.
 
 With a regex?

The problem is there is no locale-independent way to determine if
'123.456' is ~123k or ~123.  I think we are going to need a way to
output the MONEY value without a currency and thousands symbols.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching

2007-11-26 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  We are also talking about catlog changes for 8.3.  Are we comfortable
  doing catalog changes between the beta and RC?
 
 The catalog changes in question seem entirely safe ... certainly much
 more so than this patch ...
 
 I do see your point that another beta might be prudent, but on the other
 hand I'm not sure it's really needed.  The only difference between a
 beta and an RC is that we try not to change the code anymore after RC.

To me RC means we think this might be the release candidate and I would
like to get some testing in of this in beta before hitting that point.

And an additional beta might encourage more testing too.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching

2007-11-26 Thread Michael Paesold

Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjian [EMAIL PROTECTED] writes:

We are also talking about catlog changes for 8.3.  Are we comfortable
doing catalog changes between the beta and RC?

The catalog changes in question seem entirely safe ... certainly much
more so than this patch ...

I do see your point that another beta might be prudent, but on the other
hand I'm not sure it's really needed.  The only difference between a
beta and an RC is that we try not to change the code anymore after RC.


To me RC means we think this might be the release candidate and I would
like to get some testing in of this in beta before hitting that point.

And an additional beta might encourage more testing too.


I agree with Bruce here. If you want to apply that operator lookup cache 
patch, I would have another beta. (And I am not personally against it, 
because I feel major performance fixes may sometimes slip in as bug fixes.)
If you all decide against that patch, we might as well just go for RC1. The 
catalog changes seem rather trivial, and just a required initdb is no 
reason for calling it another beta, IMHO.


Great work on that patch, btw.!

Best Regards
Michael Paesold

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Locating sharedir in PostgreSQL on Windows

2007-11-26 Thread Peter Eisentraut
Tom Lane wrote:
 Yeah, that's what I think he said too, but it strikes me as a completely
 bogus policy --- what about contrib modules or stuff from pgfoundry
 or any random user-written module that was built with PGXS?  All that
 stuff happily drops files under $libdir and $sharedir, and I see no good
 argument why it shouldn't.

We have an implicit policy for those two specific cases, which are OK with me.  
I guess it's better to look at specific use cases instead of interpreting 
other people's abstract directory descriptions. ;-)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale

2007-11-26 Thread Peter Eisentraut
Tom Lane wrote:
 Aside from the semantic-gap issue, there is the point that providing
 a cast might actually mask application errors.  I can well imagine
 cases where one of the reasons for using MONEY is *exactly* that it's
 not a plain number or easily convertible to one.

I'm always against casts, but I can hardly object to an explicit cast between 
money and numeric.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] developing for psqlodbc

2007-11-26 Thread John Walker
i'd like to try to develop some improvements to psqlodbc. is there
source for a test suite available? could anyone point me towards it,
or offer testing source code they've written personally?

thanks,
john

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster