Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Tom Lane
Kate F [EMAIL PROTECTED] writes:
 The difference between OF and this function is that this function is
 pulling the type from the datum, rather than explicitly testing it
 against types the user suggests. If I wanted to find the type of x
 using OF, I would have to check it for all types which interest me:

 IF a IS OF (INTEGER) THEN
   t := 'INTEGER';
 ELSE IF a IS OF (TEXT) THEN
   t := 'TEXT';
 ELSE IF a IS OF (REAL) THEN
   t := 'REAL';
 ...
 and so on. Versus:

 t := pg_type_of(a);

Well, but what are you going to do with t after that?  AFAICS the
next step is going to be something like

IF t = 'integer'::regtype THEN
  ...
ELSE IF t = 'text'::regtype THEN
  ...
etc etc

So it seems to me that this is functionally about the same, except that
it exposes two implementation-dependent concepts (pg_type OIDs and
regtype) where the first exposes neither.

Your approach would help if there were a reason to pass t as a
variable to someplace not having access to a, but I don't see a
very compelling use-case for that.

 Secondly, the semantics are different: OF yields the type the datum
 currently is; pg_type_of() (perhaps it should be renamed?) returns the
 most appropriate type to which the datum may be cast, if I understand
 get_fn_expr_argtype() correctly.

You don't, I think --- there's really no such thing as a datum of type
ANYELEMENT, real datums always have some more-specific type.  But my
question upthread was directed exactly to the point of how we should
interpret IS OF applied to a polymorphic function argument.  It's at
least possible to argue that it's OK to interpret it the way you need.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] writing new regexp functions

2007-02-02 Thread Jeremy Drake
On Thu, 1 Feb 2007, David Fetter wrote:

 Yes, although it might have the same name, as in regex_match(pattern
 TEXT, string TEXT, return_pre_and_post BOOL).

 The data structure could be something like

 TYPE matches (
     prematch  TEXT,
     match     TEXT[],
     postmatch TEXT
 )

I just coded this up:

CREATE FUNCTION regexp_matches(IN str text, IN pattern text) RETURNS
text[]
AS 'MODULE_PATHNAME', 'regexp_matches'
LANGUAGE C IMMUTABLE STRICT;

CREATE FUNCTION regexp_matches(
IN str text, IN pattern text, IN return_pre_and_post bool,
OUT prematch text, OUT fullmatch text, OUT matches text[], OUT
postmatch text) RETURNS record
AS 'MODULE_PATHNAME', 'regexp_matches'
LANGUAGE C IMMUTABLE STRICT;


Which works like this:

jeremyd=# \pset null '\\N'
Null display is "\N".
jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(beque)$re$);
 regexp_matches
----------------
 {bar,beque}
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(beque)$re$, false);
 prematch | fullmatch |   matches   | postmatch
----------+-----------+-------------+-----------
 \N       | \N        | {bar,beque} | \N
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(beque)$re$, true);
 prematch | fullmatch |   matches   | postmatch
----------+-----------+-------------+-----------
 foo      | barbeque  | {bar,beque} | baz
(1 row)


And then you also have this behavior in the matches array:

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(.*)(beque)$re$);
 regexp_matches
----------------
 {bar,,beque}
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(.+)(beque)$re$);
 regexp_matches
----------------
 \N
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(.+)?(beque)$re$);
  regexp_matches
------------------
 {bar,NULL,beque}
(1 row)

Reasonable?

-- 
A.A.A.A.A.:
An organization for drunks who drive



[HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread NikhilS

Hi,

configure with --enable-debug does not seem to add -g to CFLAGS while
compiling with gcc. Guess we will need to change configure.in as below:

***

 # supply -g if --enable-debug
! if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
   CFLAGS="$CFLAGS -g"
 fi
--- 300,315 ----

 # supply -g if --enable-debug
! if test "$enable_debug" = yes && (test "$ac_cv_prog_cc_g" = yes ||
!   test "$ac_cv_prog_gcc_g" = yes); then
   CFLAGS="$CFLAGS -g"
 fi


Should I submit a patch for this?

Regards,
Nikhils
--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread Gavin Sherry
On Fri, 2 Feb 2007, NikhilS wrote:

 Hi,

 configure with --enable-debug does not seem to add -g to CFLAGS while
 compiling with gcc. Guess we will need to change configure.in as below:

Erm... works for me and everyone else... AFAIK.

Thanks,

Gavin



Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 03:06:19AM -0500, Tom Lane wrote:
 Kate F [EMAIL PROTECTED] writes:
  The difference between OF and this function is that this function is
  pulling the type from the datum, rather than explicitly testing it
  against types the user suggests. If I wanted to find the type of x
  using OF, I would have to check it for all types which interest me:
 
  IF a IS OF (INTEGER) THEN
  t := 'INTEGER';
  ELSE IF a IS OF (TEXT) THEN
  t := 'TEXT';
  ELSE IF a IS OF (REAL) THEN
  t := 'REAL';
  ...
  and so on. Versus:
 
  t := pg_type_of(a);
 
 Well, but what are you going to do with t after that?  AFAICS the
 next step is going to be something like
 
 IF t = 'integer'::regtype THEN
   ...
 ELSE IF t = 'text'::regtype THEN
   ...
 etc etc

I don't follow that cast at all, I'm afraid. I wasn't intending to
have a set of IF..ELSE IF statements like that, though - see below.


 So it seems to me that this is functionally about the same, except that
 it exposes two implementation-dependent concepts (pg_type OIDs and
 regtype) where the first exposes neither.
 
 Your approach would help if there were a reason to pass t as a
 variable to someplace not having access to a, but I don't see a
 very compelling use-case for that.

In my case, I am constructing a query (to be executed dynamically)
wherein I pass along some of the arguments I am given. This query calls
a function specified by an argument passed to me. If that function is
overloaded, I need to be able to cast its arguments to appropriate
types so that PostgreSQL may decide which function of that name to
call. I'm sure there must be other uses (or is this an unnecessary
feature?).

For the moment, I'm only using this information to see if I need to
quote a parameter or not, but I suspect my function will trip up when
told to execute something that is overloaded in a more complex way.

Of course, I don't want to expose anything unnecessarily.
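
A sketch of the kind of dynamic call described above, in PL/pgSQL (the
function and argument names are hypothetical, and pg_type_of() is the
function proposed in this thread, not something that exists today):

```sql
-- Hypothetical: dispatch to an overloaded function named by the caller,
-- casting the argument to its detected type so PostgreSQL can resolve
-- which overload to invoke.
CREATE FUNCTION dispatch(fn text, a anyelement) RETURNS void AS $$
DECLARE
    atext text;
    q text;
BEGIN
    atext := a;  -- plpgsql casts anything to text via textual intermediate form
    q := 'SELECT ' || quote_ident(fn) || '('
         || quote_literal(atext) || '::' || pg_type_of(a) || ')';
    EXECUTE q;
END;
$$ LANGUAGE plpgsql;
```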


  Secondly, the semantics are different: OF yields the type the datum
  currently is; pg_type_of() (perhaps it should be renamed?) returns the
  most appropriate type to which the datum may be cast, if I understand
  get_fn_expr_argtype() correctly.
 
 You don't, I think --- there's really no such thing as a datum of type
 ANYELEMENT, real datums always have some more-specific type.  But my
 question upthread was directed exactly to the point of how we should
 interpret IS OF applied to a polymorphic function argument.  It's at
 least possible to argue that it's OK to interpret it the way you need.

I've no suggestion to make on whether IS OF should look inside
ANYELEMENT or not.

This is quite past my knowledge of PostgreSQL... If I understand you
correctly, ANYELEMENT is unrelated to my suggestion.


I see my misunderstanding: '2' IS OF (INTEGER) yields false: fine.
However I was expecting that pg_type_of('2') would return 'INTEGER': it
wouldn't, of course. So, I understand you here: there would be no
difference between this and IS OF in the way I had imagined.


That still leaves the difference in usage I mention above. Does that
sound sane?

Thank you,

-- 
Kate



Re: [HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread NikhilS

Hi,

Indeed it does, apologies for not doing the entire groundwork. But what it
also does is add -O2 by default for gcc even when --enable-debug is
specified. gdb is not able to navigate the stack traces properly with this
optimization in place. Especially tracing of static functions becomes
difficult. Has this issue been faced by anybody else? If so, I can try out a
patch to avoid using -O2 with --enable-debug.

Regards,
Nikhils

On 2/2/07, Gavin Sherry [EMAIL PROTECTED] wrote:


On Fri, 2 Feb 2007, NikhilS wrote:

 Hi,

 configure with --enable-debug does not seem to add -g to CFLAGS while
 compiling with gcc. Guess we will need to change configure.in as below:

Erm... works for me and everyone else... AFAIK.

Thanks,

Gavin





--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] writing new regexp functions

2007-02-02 Thread Jeremy Drake
On Fri, 2 Feb 2007, Jeremy Drake wrote:

 jeremyd=# select * from regexp_matches('foobarbequebaz',
 $re$(bar)(beque)$re$, false);
  prematch | fullmatch |   matches   | postmatch
 ----------+-----------+-------------+-----------
  \N       | \N        | {bar,beque} | \N
 (1 row)

I just changed this to fill in fullmatch when the bool is false, so this
one would look like:
 prematch | fullmatch |   matches   | postmatch
----------+-----------+-------------+-----------
 \N       | barbeque  | {bar,beque} | \N
(1 row)

I also removed my check for capture groups, since in this setup you could
get useful output without any.  I am still trying to decide whether or not
to add back an error if you called the no-bool version which just returns
the array, and you do not have any capture groups.  ISTM this is likely an
oversight on the query author's part, and it would be helpful to alert him
to this.

If you have no capture groups, the matches array is empty (not null).  If
the match happened at the start of the string, the prematch is an empty
string, and if the match happened at the end of the string, the postmatch
is an empty string.
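
The rules above could be exercised like this (a sketch against the patch
as described in this thread; expected outputs omitted since the behavior
is still being settled):

```sql
-- No capture groups: matches should come back as an empty array, not NULL
select * from regexp_matches('foobarbequebaz', $re$barbeque$re$, true);

-- Match at the very start of the string: prematch should be the empty string
select * from regexp_matches('barbequebaz', $re$(bar)(beque)$re$, true);

-- Match at the very end of the string: postmatch should be the empty string
select * from regexp_matches('foobarbeque', $re$(bar)(beque)$re$, true);
```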


 Reasonable?

-- 
It's odd, and a little unsettling, to reflect upon the fact that
English is the only major language in which I is capitalized; in many
other languages You is capitalized and the i is lower case.
-- Sydney J. Harris



Re: [HACKERS] Why is ecpg segfaulting on buildfarm member clownfish?

2007-02-02 Thread Michael Meskes
On Thu, Feb 01, 2007 at 06:25:50PM +0100, Stefan Kaltenbrunner wrote:
 BTW, this is a perfect example of why it's not a good idea to allow
 minor regression failures to go unfixed --- people become desensitized.
 I know I've been completely ignoring ECPG-Check buildfarm results
 for awhile now.

I was aware of this, Tom, but didn't find the time to dig into it yet.

 there was also some discussion off-list last week with Michael - I have 
 arranged for an account on that box for him but I'm not sure if he 
 already found time to investigate.

I did today. This seemed like a strange one, but apparently a gcc
ABI-bug workaround interfered with the compiler used here because the
workaround wasn't covering all positions.

make check now gives a full list of OKs on Stefan's machine. Thanks for
the account.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread Gavin Sherry
On Fri, 2 Feb 2007, NikhilS wrote:

 Hi,

 Indeed it does, apologies for not doing the entire groundwork. But what it
 also does is add -O2 by default for gcc even when --enable-debug is
 specified. gdb is not able to navigate the stack traces properly with this
 optimization in place. Especially tracing of static functions becomes
 difficult. Has this issue been faced by anybody else? If so, I can try out a
 patch to avoid using -O2 with --enable-debug.

Yes, this is known. The thing with gcc is, it only emits some warnings at
-O2. I'm not sure that this is why we do not set optimisation to 0, but I
have long assumed it to be the case. I imagine that it's fairly standard
practice for people doing debugging to pass CFLAGS=-O0 as an argument to
configure.
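
For reference, the workaround described above is just (a sketch; any
optimisation level could be substituted):

```
# --enable-debug adds -g; passing CFLAGS explicitly overrides the
# default -O2 that gcc builds otherwise get.
./configure --enable-debug CFLAGS=-O0
make
```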




Re: [HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread NikhilS

Hi,

On 2/2/07, Gavin Sherry [EMAIL PROTECTED] wrote:


On Fri, 2 Feb 2007, NikhilS wrote:

 Hi,

 Indeed it does, apologies for not doing the entire groundwork. But what it
 also does is add -O2 by default for gcc even when --enable-debug is
 specified. gdb is not able to navigate the stack traces properly with this
 optimization in place. Especially tracing of static functions becomes
 difficult. Has this issue been faced by anybody else? If so, I can try out a
 patch to avoid using -O2 with --enable-debug.

Yes, this is known. The thing with gcc is, it only emits some warnings at
-O2. I'm not sure that this is why we do not set optimisation to 0, but I
have long assumed it to be the case. I imagine that it's fairly standard
practice for people doing debugging to pass CFLAGS=-O0 as an argument to
configure.

True, this is how I myself circumvent this problem too. But IMHO,
explicitly passing CFLAGS when we are invoking --enable-debug (which does
add -g, but leaves some optimization flag around which deters debugging)
does not seem correct?

Regards,
Nikhils


--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] [GENERAL] 8.2.1 Compiling Error

2007-02-02 Thread Michael Meskes
On Wed, Jan 31, 2007 at 03:24:24PM -0800, elein wrote:
 Pretty darn vanilla, except for source packages from postgres.

Which Debian version?

I take it you got this message on a full rebuild from clean sources,
right?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



[HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Simon Riggs
I'm reading the SQL Standard and I can't find anywhere that says that we
need to place SHARE locks on rows in the referenced table.
RI_FKey_check() clearly does that.

What I do see is this:
4. For each row of the referenced table, its matching rows, unique
matching rows, and non-unique matching rows are determined immediately
prior to the execution of any SQL procedure statement. No new
matching rows are added during the execution of that SQL procedure
statement.

The association between a referenced row and a non-unique matching row
is dropped during the execution of that SQL-statement if the referenced
row is either marked for deletion or updated to a distinct value on
any referenced column that corresponds to a non-null referencing column.
This occurs immediately after such a mark for deletion or update of the
referenced row. Unique matching rows and non-unique matching rows for a
referenced row are evaluated immediately after dropping the association
between that referenced row and a non-unique matching row.

under General Rules for referential constraint definition

That explicitly says "determined immediately prior to the
execution". To me, that implies that a Read Committed snapshot is
sufficient to read referenced rows and that no lock is required.

Why do we need a SHARE lock at all, on the **referenc(ed)** table?

It sounds like if we don't put a SHARE lock on the referenced table then
we can end the transaction in an inconsistent state if the referenced
table has concurrent UPDATEs or DELETEs. BUT those operations do impose
locking rules back onto the referencing tables that would not be granted
until after any changes to the referencing table complete, whereupon
they would restrict or cascade. So an inconsistent state doesn't seem
possible to me.

What am I missing?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Richard Huxton

Kate F wrote:

I see my misunderstanding: '2' IS OF (INTEGER) yields false: fine.
However I was expecting that pg_type_of('2') would return 'INTEGER': it
wouldn't, of course. So, I understand you here: there would be no
difference between this and IS OF in the way I had imagined.


It's not even possible to have a function that determines the type of a
value given that we have overlapping types. How do you know that '2'
isn't an int8 rather than int4, or numeric, or just text? What about
'now'? That's a valid timestamp as well as text.


Now, if we had a can_be_cast_to(TEXT-VAL, TYPE) that would at least let
you check against a pre-determined list of types. The only way I know of
at present is to trap an exception if it fails.
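
The exception-trapping approach can be sketched in PL/pgSQL roughly as
follows (can_be_cast_to is the hypothetical function named above, not a
built-in; the type name is interpolated directly, so trusted input is
assumed):

```sql
-- Try the cast inside an EXCEPTION block and report whether it succeeded.
-- Requires PL/pgSQL exception trapping (PostgreSQL 8.0 or later).
CREATE OR REPLACE FUNCTION can_be_cast_to(val text, typ text)
RETURNS boolean AS $$
BEGIN
    EXECUTE 'SELECT ' || quote_literal(val) || '::' || typ;
    RETURN true;
EXCEPTION WHEN others THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql;

-- SELECT can_be_cast_to('2', 'integer');      -- true
-- SELECT can_be_cast_to('now', 'timestamp');  -- true
-- SELECT can_be_cast_to('abc', 'integer');    -- false
```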


I think you're going to have to store your arguments with their types.

--
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Csaba Nagy
On Fri, 2007-02-02 at 10:51, Simon Riggs wrote:
[snip]
 Why do we need a SHARE lock at all, on the **referenc(ed)** table?
 
 It sounds like if we don't put a SHARE lock on the referenced table then
 we can end the transaction in an inconsistent state if the referenced
 table has concurrent UPDATEs or DELETEs. BUT those operations do impose
 locking rules back onto the referencing tables that would not be granted
 until after any changes to the referencing table complete, whereupon
 they would restrict or cascade. So an inconsistent state doesn't seem
 possible to me.
 
 What am I missing?

Well, here we do have a patch (deployed on production servers) which
does not put the shared lock on the referenced table, and it
occasionally lets rows into the referencing tables which do not have
parent rows in the referenced table. I'm not sure what the mechanism is,
but it does happen, I can assure you. It happens rarely enough that it
is not disturbing for us, compared to the deadlocks which happen without
the patch - but that's another matter...

In our application we never update any key ids, so only deletes/inserts
come into play, and I guess it happens when a referenced row is deleted
just between the moment a newly inserted child row checks that the
parent row exists and the moment the row is really inserted. Or
something like that...

Cheers,
Csaba.





Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Richard Huxton

Csaba Nagy wrote:

On Fri, 2007-02-02 at 10:51, Simon Riggs wrote:
[snip]

Why do we need a SHARE lock at all, on the **referenc(ed)** table?



Well, here we do have a patch (deployed on production servers) which
does not put the shared lock on the referenced table, and it
occasionally lets rows into the referencing tables which do not have
parent rows in the referenced table. I'm not sure what the mechanism is,
but it does happen, I can assure you. It happens rarely enough that it
is not disturbing for us, compared to the deadlocks which happen without
the patch - but that's another matter...


You say below the cut that you're not updating keys, so presumably it's
other columns. Which leads me to something I've wondered for a while -
why do we lock the whole row? Is it just a matter of not having
optimised that yet, or is there a good reason why locking just some
columns isn't practical?


--
  Richard Huxton
  Archonet Ltd



Re: A more general approach (Re: [HACKERS] Data archiving/warehousing idea)

2007-02-02 Thread Hannu Krosing
On Thu, 2007-02-01 at 12:31, Tom Lane wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
  A more radical variation of the restricted-use archive table approach
  is storing all tuple visibility info in a separate file.
  At first it seems to just add overhead, but for lots (most ? ) usecases
  the separately stored visibility should be highly compressible, so for
  example for bulk-loaded tables you could end up with one bit per page
  saying that all tuples on this page are visible.
 
 The more you compress, the slower and more complicated it will be to
 access the information.  I'd put my money on this being a net loss in
 the majority of scenarios.

define majority :)

In real life it is often faster to access compressed information,
especially if it is stored in something like a trie, where compression
and indexed access are the same thing.

the most gain will of course come from bulk-loaded data, where the
compressed representation can just say something like "pages 1 to
20 are all visible starting from transaction 5000, and so is the first
half of page 21; the second half of 21 and pages up to 25 are
visible from trx 6000".

In this case the visibility info will always stay in L1 cache and thus
be really fast, maybe even free if we account for cache reloads and
such.

But it may be better to still have a bitmap there for sake of simplicity
and have some of it be accessible from L2 cache (20 pages times say
2 bits is still only 100kB bitmap for 1.6GB of data).

Of course there are cases where this approach is worse, sometimes much
worse, than the current one, but the possibility of independently
compressing visibility info and making some types of VACUUM vastly
cheaper may make it a net win in several cases.

Also, for highly dynamic tables a separate visibility heap might also
speed up bitmap scans, as access to the heap happens only for visible
tuples.

This can also be one way to get rid of need to write full data tuples
thrice - first the original write, then commit bits and then deleted
bits. instead we can just write the bits once for a whole set of tuples.

Usually most of the visibility info can be thrown out quite soon, as the
active transaction window advances, so the first level of compression
is just throwing out cmin/cmax and setting the commit bit, then setting
the tuple to just "visible".

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com





Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Csaba Nagy
 You say below the cut that you're not updating keys, so presumably it's
 other columns. Which leads me to something I've wondered for a while -
 why do we lock the whole row? Is it just a matter of not having
 optimised that yet, or is there a good reason why locking just some
 columns isn't practical?

For the conditions of generating the deadlock, see:

http://archives.postgresql.org/pgsql-general/2006-12/msg00029.php

The reason for the occasional orphan rows is not completely clear to me,
but it must be some kind of race condition while concurrently
inserting/deleting/?updating the parent/child tables.

Cheers,
Csaba.





Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Florian G. Pflug

Csaba Nagy wrote:

The reason for the occasional orphan rows is not completely clear to me,
but it must be some kind of race condition while concurrently
inserting/deleting/?updating the parent/child tables.


I guess the following sequence would generate an orphaned row:
A: executes insert into table_child parent_id=1
B: executes delete from table_parent where id=1
A: RI trigger checks for matching row in table_parent
B: The row with id=1 is marked as deleted in table_parent
A: The new row with parent_id=1 is inserted into table_child
B: The delete is committed
A: The insert is committed.

Any ordering that marks the row as deleted between the execution
of the RI trigger and the insertion of the new row would lead
to the same result.
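
With a hypothetical parent/child schema, that interleaving corresponds
to something like:

```sql
-- Hypothetical schema for illustration:
CREATE TABLE table_parent (id int PRIMARY KEY);
CREATE TABLE table_child  (parent_id int REFERENCES table_parent (id));

-- Session A:  BEGIN;
-- Session A:  INSERT INTO table_child VALUES (1);
--             (the RI trigger sees the parent row; without the patch it
--              would also take a SHARE lock on it here)
-- Session B:  BEGIN;
-- Session B:  DELETE FROM table_parent WHERE id = 1;  -- not blocked
-- Session B:  COMMIT;
-- Session A:  COMMIT;
-- table_child now holds an orphaned row with parent_id = 1
```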

greetings, Florian Pflug



Re: [HACKERS] Bitmap index thoughts

2007-02-02 Thread Gavin Sherry
On Thu, 1 Feb 2007, Bruce Momjian wrote:


 Where are we on this patch?  Does it have performance tests to show
 where it is beneficial?  Is it ready to be reviewed?

Here's an updated patch:

http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch

In this patch, I rewrote the index build system. It was fast before for
well clustered data but for poorly clustered data, it was very slow. Now,
it is pretty good for each distribution type.

I have various test cases but the one which showed bitmap in a poor
light was a table of 600M rows. The key to the table had a cardinality
of 100,000. When the table was loaded with keys clustered, the build
time was 1000 seconds with bitmap (2200 with btree). With poorly
clustered data (e.g., the index key was (1, 2, 3, ..., 6000, 1, 2, 3,
...)), the build time for bitmap was 14000 seconds!

So, I rewrote this to compress data using HRL encoding (the same scheme we
use in the bitmap AM itself). Now, clustered data is just as fast and
unclustered data is 2000 seconds.

The select performance at a cardinality of 100,000 is similar to btree but
faster with lower cardinalities.

Jie also contributed a rewrite of the WAL code to this patch. Not only is
the code faster now, but it handles the notion of incomplete actions --
like btree and friends do. The executor code still needs some work from me
-- Jie and I have dirtied things up while experimenting -- but we would
really like some review of the code so that this can get squared away
well before the approach of 8.3 feature freeze.

One of the major deficiencies remaining is the lack of VACUUM support.
Heikki put his hand up for this and I'm holding him to it! ;-)

I will update the code tomorrow. The focus will be cleaning up the
executor modifications. Please look elsewhere for now.

Thanks,

Gavin



Re: [HACKERS] Data archiving/warehousing idea

2007-02-02 Thread Florian G. Pflug

Jochem van Dieten wrote:

On 2/1/07, Chris Dunlop wrote:

In maillist.postgres.dev, you wrote:
Rather than writing in-place, perhaps the SET ARCHIVE would
create an on-disk copy of the table.


Just like CLUSTER does now: create an on-disk copy first and swap the
relfilenodes of the files and flush the relcache.


IIRC, cluster currently needs to take an exclusive lock on the table,
thus preventing any concurrent selects. I assume it would be the same
for alter table ... set archive.
For a large readonly table - the ones that set archive would be used for -
rewriting the whole table might easily take a few hours, if not days.
Blocking reads for such a long time might be unacceptable in a lot of
environments, severely limiting the use-cases for alter table ... set
archive.

I think that both cluster and truncate should in theory only need to
prevent concurrent updates, not concurrent selects. AFAIK, the reason
they need to take an exclusive lock is that there is no way to let other
backends see the old relfilenode entry in pg_class until the
cluster/truncate commits. So I believe that this limitation would first
have to be removed before alter table ... set archive would become
really useful...

Just my 0.02 eurocents.
greetings, Florian Pflug




Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Tom Lane
Kate F [EMAIL PROTECTED] writes:
 In my case, I am constructing a query (to be executed dynamically)
 wherein I pass along some of the arguments I am given. This query calls
 a function specified by an argument passed to me. If that function is
 overloaded, I need to be able to cast its arguments to appropriate
 types so that PostgreSQL may decide which function of that name to
 call. I'm sure there must be other uses (or is this an unnecessary
 feature?).
 For the moment, I'm only using this information to see if I need to
 quote a parameter or not, but I suspect my function will trip up when
 told to execute something that is overloaded in a more complex way.

Hmmm.  Actually, I think you'd be best off not to think in terms of
"quote or not", but instead "always quote and cast".  You're going to be
building up strings to EXECUTE, right?  ISTM what you want is something
like

... || quote_literal(aparam::text) || '::' || type_name_of(aparam) || ...

where type_name_of is something that produces the type name as a string,
not directly its OID.  So one way to counter the "it's exposing internal
concepts" gripe is to not expose the OID at all, just the type name.
Even if the raw function did return the OID you'd need a wrapper to
convert to a string name.

The other problem here is that I've blithely assumed that you can cast
anything to text; you can't.  Now in plpgsql you can work around that
because plpgsql will cast anything to anything via textual intermediate
form, so you could hack it with

texttmp := aparam;
... || quote_literal(texttmp) || '::' || type_name_of(aparam) || ...

There's been talk off and on of allowing an explicit cast to and from
text throughout the system rather than just in plpgsql, but I dunno if
you want to fight that battle today.
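
Put together, the idiom might look like this in PL/pgSQL (a sketch:
type_name_of() is the function being discussed, not an existing one, and
the unary calling convention is a hypothetical simplification):

```sql
CREATE FUNCTION call_unary(fn text, aparam anyelement) RETURNS void AS $$
DECLARE
    texttmp text;
BEGIN
    texttmp := aparam;  -- plpgsql casts anything to text via I/O routines
    EXECUTE 'SELECT ' || quote_ident(fn) || '('
            || quote_literal(texttmp) || '::'
            || type_name_of(aparam) || ')';
END;
$$ LANGUAGE plpgsql;
```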

regards, tom lane



Re: [HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread Tom Lane
NikhilS [EMAIL PROTECTED] writes:
 True, this is how I myself circumvent this problem too. But IMHO,
 explicitly passing CFLAGS when we are invoking --enable-debug (which does
 add -g, but leaves some optimization flag around which deters debugging)
 does not seem correct?

If we did what you suggest, then --enable-debug would cause performance
degradation, which would cause people to not use it, which would result
in most binaries being completely undebuggable rather than only partially.
Doesn't sound like a good tradeoff to me.

Personally, in my development tree I use a Makefile.custom containing

# back off optimization unless profiling
ifeq ($(PROFILE),)
   CFLAGS:= $(patsubst -O2,-O1,$(CFLAGS))
endif

-O1 still generates uninitialized variable warnings but the code is a
lot saner to step through ... not perfect, but saner.  It's been a
workable compromise for a long time.  I don't recommend developing with
-O0, exactly because it disables some mighty valuable warnings.

regards, tom lane



Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 10:09:24AM +, Richard Huxton wrote:
 Kate F wrote:
 I see my misunderstanding: '2' IS OF (INTEGER) yields false: fine.
 However I was expecting that pg_type_of('2') would return 'INTEGER': it
 wouldn't, of course. So, I understand you here: there would be no
 difference between this and IS OF in the way I had imagined.
 
 It's not even possible to have a function that determines the type of a
 value given that we have overlapping types. How do you know that 2
 isn't an int8 rather than int4, or numeric, or just text? What about
 'now'? That's a valid timestamp as well as text.

Yes, quite. I understand this; I'd just misunderstood what
get_fn_expr_argtype() did.



 Now, if we had an can_be_cast_to(TEXT-VAL,TYPE) that would at least let 
 you check against a pre-determined list of types. The only way I know of 
 at present is to trap an exception if it fails.

That's exactly what I'm doing, currently.
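For the archives, the exception-trapping approach Richard describes can be sketched like this. The helper is illustrative, not a built-in; note the type name is interpolated unchecked, so don't feed it untrusted input:

```sql
-- Sketch of a can_be_cast_to(value, type) helper: attempt the cast in a
-- dynamic statement and trap the failure.
CREATE OR REPLACE FUNCTION can_be_cast_to(val text, typ text)
RETURNS boolean AS $$
BEGIN
    EXECUTE 'SELECT ' || quote_literal(val) || '::' || typ;
    RETURN true;
EXCEPTION WHEN others THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql;

-- SELECT can_be_cast_to('2', 'integer');    -- true
-- SELECT can_be_cast_to('now', 'integer');  -- false
```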


 I think you're going to have to store your arguments with their types.

I may do!

Regards,

-- 
Kate



Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 09:52:08AM -0500, Tom Lane wrote:
 Kate F [EMAIL PROTECTED] writes:
  In my case, I am constructing a query (to be executed dynamically)
  wherein I pass along some of the arguments I am given. This query calls
  a function specified by an argument passed to me. If that function is
  overloaded, I need to be able to cast its arguments to appropriate
  types so that PostgreSQL may decide which function of that name to
  call. I'm sure there must be other uses (or is this an unnecessary
  feature?).
  For the moment, I'm only using this information to see if I need to
  quote a parameter or not, but I suspect my function will trip up when
  told to execute something that is overloaded in a more complex way.
 
 Hmmm.  Actually, I think you'd be best off not to think in terms of
 quote or not, but instead always quote and cast.  You're going to be
 building up strings to EXECUTE, right?  ISTM what you want is something
 like
 
 ... || quote_literal(aparam::text) || '::' || type_name_of(aparam) || ...
 
 where type_name_of is something that produces the type name as a string,
 not directly its OID.  So one way to counter the it's exposing internal
 concepts gripe is to not expose the OID at all just the type name.

That's precisely what I suggested a moment ago! This is what I'm
proposing be added.

(And whatever the decision regarding ANYELEMENT, I believe this
should behave the same as IS OF.)


 Even if the raw function did return the OID you'd need a wrapper to
 convert to a string name.

This is what David did in his article.


 The other problem here is that I've blithely assumed that you can cast
 anything to text; you can't.  Now in plpgsql you can work around that
 because plpgsql will cast anything to anything via textual intermediate
 form, so you could hack it with
 
 texttmp := aparam;
 ... || quote_literal(texttmp) || '::' || type_name_of(aparam) || ...

That's interesting - I didn't realise that not everything could be cast
to text.


 There's been talk off and on of allowing an explicit cast to and from
 text throughout the system rather than just in plpgsql, but I dunno if
 you want to fight that battle today.

I'm sticking to things I could possibly implement :)

Thank you,

-- 
Kate



Re: [HACKERS] --enable-debug does not work with gcc

2007-02-02 Thread Bruce Momjian
Tom Lane wrote:
 NikhilS [EMAIL PROTECTED] writes:
  True, this is how I myself circumvent this problem too. But IMHO,
  explicitly passing CFLAGS when we are invoking --enable-debug (which does
  add -g, but leaves an optimization flag around that deters debugging)
  does not seem correct?
 
 If we did what you suggest, then --enable-debug would cause performance
 degradation, which would cause people to not use it, which would result
 in most binaries being completely undebuggable rather than only partially.
 Doesn't sound like a good tradeoff to me.
 
 Personally, in my development tree I use a Makefile.custom containing
 
 # back off optimization unless profiling
 ifeq ($(PROFILE),)
CFLAGS:= $(patsubst -O2,-O1,$(CFLAGS))
 endif
 
 -O1 still generates uninitialized variable warnings but the code is a
 lot saner to step through ... not perfect, but saner.  It's been a
 workable compromise for a long time.  I don't recommend developing with
 -O0, exactly because it disables some mighty valuable warnings.

Agreed.  I use -O1 by default myself, unless I am doing performance testing.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Tom Lane
Kate F [EMAIL PROTECTED] writes:
 (And whatever the decision regarding ANYELEMENT, I believe this
 should behave the same as IS OF.)

In the light of morning I think it may be a non-problem.  The way that a
plpgsql function with an ANYELEMENT parameter really works is that on
first invocation with a parameter of a specific type, we generate a new
parse-tree on the fly with the parameter being taken as of that type.
So the IS OF or equivalent operation would never see ANYELEMENT, and
there's nothing to look through.  (You might check this by seeing if
IS OF behaves sanely, before you go and spend time on a type_of function
...)
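A quick test along the lines Tom suggests might look like this (hypothetical function; the point is that by the time the body runs, the parameter has a concrete type):

```sql
-- If plpgsql specializes the parse tree per concrete argument type,
-- IS OF inside the body never sees ANYELEMENT.
CREATE OR REPLACE FUNCTION what_is(a anyelement) RETURNS text AS $$
BEGIN
    IF a IS OF (integer) THEN
        RETURN 'integer';
    ELSIF a IS OF (text) THEN
        RETURN 'text';
    ELSE
        RETURN 'something else';
    END IF;
END;
$$ LANGUAGE plpgsql;

-- SELECT what_is(42);         -- should say 'integer' if the above holds
-- SELECT what_is('x'::text);  -- should say 'text'
```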

regards, tom lane



Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 11:17:46AM -0500, Tom Lane wrote:
 Kate F [EMAIL PROTECTED] writes:
  (And whatever the decision regarding ANYELEMENT, I believe this
  should behave the same as IS OF.)
 
 In the light of morning I think it may be a non-problem.  The way that a
 plpgsql function with an ANYELEMENT parameter really works is that on
 first invocation with a parameter of a specific type, we generate a new
 parse-tree on the fly with the parameter being taken as of that type.
 So the IS OF or equivalent operation would never see ANYELEMENT, and
 there's nothing to look through.  (You might check this by seeing if
 IS OF behaves sanely, before you go and spend time on a type_of function
 ...)

I have checked this - I mentioned it earlier, when I spoke about my
discussion on IRC with Pavel, but had since forgotten! IS OF for an
array of TEXT yields TEXT. I think this is convenient behaviour
(likewise for the function I'm proposing).

So, to conclude, we still have a valid use-case (which you explained a
little more explicitly than I did). Shall I attempt to implement it?
(that is, type_name_of() which returns TEXT)

Regards,

-- 
Kate



[HACKERS] proposal: only superuser can change customized_options

2007-02-02 Thread Pavel Stehule

Hello

I want to use a customized option for the security configuration of one
contrib library. Currently, customized options are usable only for default
configuration, because everybody can change them. They act as a substitute
for global variables.


Whether an option is protected or not can be decided based on its name. Like:

customized_option = (utl_file)

utl_file.protected.dir = '/aaa:/bbb'  .. can be modified only by a superuser
utl_file.readonly.dir = '/aaa:/mm' .. nobody can modify it

Regards
Pavel Stehule

_
Najdete si svou lasku a nove pratele na Match.com. http://www.msn.cz/




Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Tom Lane
Kate F [EMAIL PROTECTED] writes:
 So, to conclude, we still have a valid use-case (which you explained a
 little more explicitly than I did). Shall I attempt to implement it?
 (that is, type_name_of() which returns TEXT)

I think I'd suggest pg_type_name ... or maybe pg_type_name_of ...
also, the code you need to convert OID to name is already there,
see regtypeout.
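Concretely, regtype's output function is what turns a type OID into its name, so the mapping is already visible at the SQL level (in later releases this thread's idea surfaced as the built-in pg_typeof(), which returns a regtype):

```sql
-- regtypeout renders a type OID as its name; the cast goes both ways.
SELECT 23::oid::regtype;         -- 'integer' (23 is int4's pg_type OID)
SELECT 'integer'::regtype::oid;  -- 23
```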

regards, tom lane



Re: [HACKERS] proposal: only superuser can change customized_options

2007-02-02 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 I want to use a customized option for the security configuration of one
 contrib library. Currently, customized options are usable only for default
 configuration, because everybody can change them. They act as a substitute
 for global variables.
 Whether an option is protected or not can be decided based on its name.

I dislike making it depend on spelling.  There was discussion of this
problem before, and we had a much saner answer: when the module that
defines the variable gets loaded, discard any local setting if the
correct protection level of the variable is SUSET or higher.  See the
archives.

regards, tom lane



Re: [HACKERS] Proposal: Snapshot cloning

2007-02-02 Thread Jim Nasby

On Jan 29, 2007, at 11:28 PM, Tom Lane wrote:
 Jim Nasby [EMAIL PROTECTED] writes:
  On Jan 26, 2007, at 4:48 PM, Tom Lane wrote:
   I don't actually see that it buys you a darn thing ... you still won't
   be able to delete dead updated tuples because of the possibility of
   the LRT deciding to chase ctid chains up from the tuples it can see.

  Well, Simon was talking about a serialized LRT, which ISTM shouldn't
  be hunting down ctid chains past the point it serialized at.

 How you figure that?  If the LRT wants to update a tuple, it's got to
 chase the ctid chain to see whether the head update committed or not.
 It's not an error for a serializable transaction to update a tuple that
 was tentatively updated by a transaction that rolled back.

Nuts. :(

  Even if that's not the case, there is also the possibility of a LRT
  publishing information about what tables it will hit.

 I think we already bought 99% of the possible win there by fixing
 vacuum.  Most ordinary transactions aren't going to be able to predict
 which other tables the user might try to touch.

Presumably a single-statement transaction could do that in most (if
not all) cases.


But even if we didn't support automatically detecting what tables a
transaction was hitting, we could allow the user to specify it and
then bomb out if the transaction tried to hit anything that wasn't in
that list. That would allow users who are creating LRTs to limit
their impact on vacuum. The safe way to perform that check would be
to check each buffer before accessing it, but I'm unsure how large a
performance impact that would entail; I don't know how much code we
run through to pull a tuple out of a page and do something with it,
compared to the cost of checking whether that buffer belongs to a
relation/file that's in the approved list.

Perhaps a better way would be to allow users to mark vacuum-critical
tables for restricted access. To access a restricted table, the user
would need to provide a list of restricted tables that a transaction
is going to hit (or maybe just lump all restricted tables into one
group), and that transaction would log its XID somewhere that vacuum
can look at. If a transaction that hasn't specified it will touch the
restricted tables tries to do so, it errors out. We might want some
way to flag buffers as belonging to a restricted table (or one of
its indexes) so that transactions that aren't hitting restricted
tables wouldn't have to pay a large performance penalty to figure
that out. But you'd only have to mark those buffers when they're read
in from the OS, and presumably a restricted table will be small
enough that its buffers should stay put. Logging the XID could prove
to be a serialization point, but we could possibly avoid that by
using per-relation locks.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





[HACKERS] Performance penalty of visibility info in indexes?

2007-02-02 Thread Jim Nasby
Has anyone actually measured the performance overhead of storing  
visibility info in indexes? I know the space overhead sounds  
daunting, but even if it doubled the size of the index in many cases  
that'd still be a huge win over having to scan the heap as well as  
the index (esp. for things like count(*)). There would also be  
overhead from having to update the old index tuple, but for the case  
of updates you're likely to need that page for the new index tuple  
anyway.


I know this wouldn't work for all cases, but ISTM there are many  
cases where it would be a win.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] Archive log compression keeping physical log available in the crash recovery

2007-02-02 Thread Jim Nasby
I thought the drive behind full_page_writes = off was to reduce the
amount of data being written to pg_xlog, not to shrink the size of a
PITR log archive.

ISTM that if you want to shrink a PITR log archive you'd be able to
get good results by (b|g)zip'ing the WAL files in the archive. A
quick test on my laptop shows over a 4x reduction in size. Presumably
that'd be even larger if you increased the size of WAL segments.


On Jan 29, 2007, at 2:15 AM, Koichi Suzuki wrote:
 This is a proposal for archive log compression keeping physical log
 in WAL.

 In PostgreSQL 8.2, the full_page_writes option came back to cut physical
 log out of both WAL and the archive log.  To deal with partial writes
 during the online backup, physical log is written only during the online
 backup.

 Although this dramatically reduces the log size, it can put crash
 recovery at risk.  If any page is inconsistent because of a fault, crash
 recovery doesn't work, because full page images are necessary to recover
 the page in such a case.  For critical use, especially commercial use,
 we don't want to risk the chance of crash recovery, while reducing the
 archive log size will be crucial too for larger databases.  WAL size
 itself may be less critical, because WAL segments are reused cyclically.

 Here is a simple idea to reduce archive log size while keeping
 physical log in xlog:

 1. Create a new GUC: full_page_compress.

 2. Turn on both full_page_writes and full_page_compress: physical
 log will be written to WAL at the first write to a page after a
 checkpoint, just as with conventional full_page_writes ON.

 3. Unless physical log is written during the online backup, it can be
 removed from the archive log.  One bit in XLR_BKP_BLOCK_MASK
 (XLR_BKP_REMOVABLE) is available to indicate this (out of four, only
 three are in use) and this mark can be set in XLogInsert().
 With both full_page_writes and full_page_compress on, both logical
 log and physical log will be written to WAL with the XLR_BKP_REMOVABLE
 flag on.  Having both physical and logical log in the same WAL is not
 harmful to crash recovery.  In crash recovery, physical log is
 used if it's available.  Logical log is used in archive recovery, as
 the corresponding physical log will have been removed.

 4. The archive command (a separate binary) removes physical logs if the
 XLR_BKP_REMOVABLE flag is on.  Physical logs will be replaced by a
 minimum of information of very small size, which is used to restore the
 physical log and keep other log records' LSNs consistent.

 5. The restore command (a separate binary) restores removed physical log
 using the dummy record and restores the LSNs of other log records.

 6. We need to rewrite the redo functions so that they ignore the dummy
 record inserted in 5.  The amount of code modification will be very
 small.

 As a result, the size of the archive log becomes as small as with
 full_page_writes off, while the physical log is still available for
 crash recovery, maintaining the crash recovery chance.

 Comments, questions and any input are welcome.

 -
 Koichi Suzuki, NTT Open Source Center




--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] Logging Lock Waits

2007-02-02 Thread Jim Nasby

On Jan 30, 2007, at 6:32 PM, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  I'm thinking to write an INFO message, so that people can choose to log
  this and/or the SQL statement if they choose.
  e.g. INFO: lock wait time of XXX secs has been exceeded

 The available timer resources are already overloaded; adding an
 independent timeout for this will complicate the code more than seems
 justified.  Perhaps you could add a LOG message whenever the
 deadlock-check code runs (and doesn't detect an error, so is about to go
 back to sleep).  This would take almost no effort, and the granularity
 could still be adjusted via the deadlock check timeout.


Sybase collected performance information for a server by periodically  
checking the status of all backend processes (ie: waiting on a user  
lock, waiting on IO, internal lock, etc) and keeping counters. Having  
that information available was useful for debugging performance  
issues (unfortunately I can't provide any specific examples since it  
was years ago I played with it). I'm thinking backends could set  
flags in shared memory to indicate what they're doing and a process  
could poll that every X milliseconds and keep stats on what's going on.


But maybe that's more along the lines of the rewrite of the stats  
system that's been talked about...

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] Estimation error in n_dead_tuples

2007-02-02 Thread Jim Nasby

On Feb 1, 2007, at 10:57 AM, Tom Lane wrote:

 ITAGAKI Takahiro [EMAIL PROTECTED] writes:
  I'm thinking to add the number of vacuumed tuples to the message from
  vacuum. The stats collector will subtract the value from n_dead_tuples
  instead of setting it to zero. This is also needed if we want to make
  some kinds of partial vacuum methods.

 This seems awfully dangerous to me, because then you are operating on
 dead reckoning forever: there will be nothing that can correct an
 inaccurate rowcount estimate, and in practice that means it will diverge
 arbitrarily far from reality :-(, because of the inherent inaccuracies
 of the stats system.  I think the risk of that is far worse than the
 relatively small (or at least bounded) error arising from tuples not
 seen by vacuum.

Yeah, it'd be better for vacuum to send a message stating how many
dead rows it couldn't remove, a la:

DETAIL:  0 dead row versions cannot be removed yet.

Granted, not perfect, but better than what we have now.
--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] RI checks during UPDATEs

2007-02-02 Thread Jim Nasby

On Jan 30, 2007, at 1:17 PM, Simon Riggs wrote:
 It would be even better if there was some way of not executing the
 trigger at all if we knew that the UPDATE statement doesn't SET the FK
 columns.

Other databases allow you to put a WHERE or CHECK clause on triggers,
so that they will only fire if that evaluates to true. That would
allow FK triggers to be defined as

CREATE TRIGGER ... WHERE NEW.fk != OLD.fk

and would obviously have other uses as well. Of course the question
is how much time that would save vs just doing the same check in the
trigger function itself.
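Absent trigger-level conditions (PostgreSQL later grew CREATE TRIGGER ... WHEN, in 9.0), the same skip can be done at the top of the trigger function. The table and column names here are hypothetical:

```sql
-- Skip the referential check when the FK column wasn't actually changed.
CREATE OR REPLACE FUNCTION fk_check() RETURNS trigger AS $$
BEGIN
    -- IS NOT DISTINCT FROM treats two NULLs as equal, unlike =
    IF NEW.fk IS NOT DISTINCT FROM OLD.fk THEN
        RETURN NEW;  -- FK untouched: no need to consult the parent
    END IF;
    PERFORM 1 FROM parent WHERE pk = NEW.fk FOR SHARE;
    IF NOT FOUND THEN
        RAISE EXCEPTION 'fk value % not present in parent', NEW.fk;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```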

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] Talks for OSCON? Only 5 days left!

2007-02-02 Thread Jim Nasby
Would it be safe/appropriate to do a What's new in 8.3 talk? (I went
to OSCON last year, but didn't make it to any talks.)


On Jan 30, 2007, at 4:22 PM, Josh Berkus wrote:
 All,

 We only have five days left to submit talks for OSCON (Portland, last week
 of July): http://conferences.oreillynet.com/cs/os2007/create/e_sess

 I'd like to check and coordinate what people are submitting from PostgreSQL
 to make sure we have the strongest possible PostgreSQL content.  So far we
 have:

 Performance Whack-a-Mole - Josh Berkus
 Joy of Index - Josh Berkus
 PostgreSQL analytics with DTrace - Theo Schlossnagle

 ... this isn't nearly enough.  We need some cool talks on building
 applications with pgcrypto, PostGIS, TSearch2 and more!

 --
 --Josh

 Josh Berkus
 PostgreSQL @ Sun
 San Francisco



--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] PL/pgSQL RENAME functionality in TODOs

2007-02-02 Thread Jim Nasby

On Feb 1, 2007, at 5:08 AM, Pavel Stehule wrote:
 std. use rename only for triggers and variables new and old. It has
 sense. I don't see sense for rename in classic plpgsql functions.
 There was one reason, renaming unnamed $params. But currently plpgsql
 supports named params, so this reason is obsolete.

Unless things have changed, it can be a real PITA to deal with plpgsql
variables that share the same name as a field in a table. IIRC
there are some cases where it's not even possible to unambiguously
refer to the plpgsql variable instead of the field.

For internal variables there's a decent work-around: just prefix
all variables with something like v_. But that's pretty ugly for
parameters... get_user(user_id int) is certainly a nicer interface
than get_user(p_user_id int).

But I think a way to get around that would be to RENAME the arguments
in the DECLARE section, so user_id could become p_user_id under the
covers.

So perhaps there is still a point to RENAME after all, at least for
parameters.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] May, can, might

2007-02-02 Thread Andrew Sullivan
On Tue, Jan 30, 2007 at 12:39:26PM -0500, Bruce Momjian wrote:
 I would like to clean up our documentation to consistently use these
 words.  Objections?

None here, but if you're going to go to the trouble, you might want
to have a look at how others have faced this problem too.

In my line of work, we've taken to adopting the RFC 2119 words for
cases where we want to be super-clear and unambiguous.  I don't think
those formulations would be much use for user manuals, but it's nice
to see that another group of people who work by converging on
consensus can still do that by (for example) agreeing that MAY and
may are not the same word.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell



Re: [HACKERS] Proposal: Commit timestamp

2007-02-02 Thread Jim Nasby

On Jan 25, 2007, at 6:16 PM, Jan Wieck wrote:
 If a per-database configurable tslog_priority is given, the timestamp
 will be truncated to milliseconds and the increment logic is done on
 milliseconds. The priority is added to the timestamp. This guarantees
 that no two timestamps for commits will ever be exactly identical,
 even across different servers.

Wouldn't it be better to just store that information separately,
rather than mucking with the timestamp?

Though, there's another issue here... I don't think NTP is good for
any better than a few milliseconds, even on a local network.

How exact does the conflict resolution need to be, anyway? Would it
really be a problem if transaction B committed 0.1 seconds after
transaction A yet the cluster thought it was the other way around?

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] proposal: only superuser can change customized_options

2007-02-02 Thread Pavel Stehule





From: Tom Lane [EMAIL PROTECTED]
To: Pavel Stehule [EMAIL PROTECTED]
CC: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] proposal: only superuser can change customized_options
Date: Fri, 02 Feb 2007 11:40:10 -0500


Pavel Stehule [EMAIL PROTECTED] writes:
 I want to use a customized option for the security configuration of one
 contrib library. Currently, customized options are usable only for default
 configuration, because everybody can change them. They act as a substitute
 for global variables.
 Whether an option is protected or not can be decided based on its name.

I dislike making it depend on spelling.  There was discussion of this
problem before, and we had a much saner answer: when the module that
defines the variable gets loaded, discard any local setting if the
correct protection level of the variable is SUSET or higher.  See the
archives.

regards, tom lane


I am searching for it.

Thank You

Pavel Stehule

_
Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com. 
http://www.msn.cz/





Re: [HACKERS] PL/pgSQL RENAME functionality in TODOs

2007-02-02 Thread Pavel Stehule


But I think a way to get around that would be to RENAME the arguments in
the DECLARE section, so user_id could become p_user_id under the covers.




That's one case, but I don't believe the result will be more readable. A
better solution is using names qualified by the function name. I am not
sure if plpgsql supports it. I think Oracle supports it, and SQL/PSM
supports it too.


Like:
 create or replace function fx(a integer, b integer)
 returns void as $$
 declare
   la integer;
   lb integer;
 begin
   select into la, lb tab.a, tab.b
     from tab
    where tab.a = fx.a and tab.b = fx.b;
 end;
 $$ language plpgsql;

I am sorry, but I don't believe using RENAME is better.

So perhaps there is still a point to RENAME after all, at least for
parameters.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Pavel Stehule

_
Najdete si svou lasku a nove pratele na Match.com. http://www.msn.cz/




Re: [HACKERS] writing new regexp functions

2007-02-02 Thread David Fetter
On Fri, Feb 02, 2007 at 12:54:30AM -0800, Jeremy Drake wrote:
 On Fri, 2 Feb 2007, Jeremy Drake wrote:
 
  jeremyd=# select * from regexp_matches('foobarbequebaz',
  $re$(bar)(beque)$re$, false);
   prematch | fullmatch |   matches   | postmatch
  --+---+-+---
   \N   | \N| {bar,beque} | \N
  (1 row)
 
 I just changed this to fill in fullmatch when the bool is false, so this
 one would look like:
  prematch | fullmatch |   matches   | postmatch
 --+---+-+---
  \N   | barbeque  | {bar,beque} | \N
 (1 row)
 
 I also removed my check for capture groups, since in this setup you could
 get useful output without any.  I am still trying to decide whether or not
 to add back an error if you called the no-bool version which just returns
 the array, and you do not have any capture groups.  ISTM this is likely an
 oversight on the query author's part, and it would be helpful to alert him
 to this.
 
 If you have no capture groups, the matches array is empty (not null).  If
 the match happened at the start of the string, the prematch is an empty
 string, and if the match happened at the end of the string, the postmatch
 is an empty string.
 
  Reasonable?

This is great :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!



Re: [HACKERS] Function proposal to find the type of a datum

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 11:37:13AM -0500, Tom Lane wrote:
 Kate F [EMAIL PROTECTED] writes:
  So, to conclude, we still have a valid use-case (which you explained a
  little more explicitly than I did). Shall I attempt to implement it?
  (that is, type_name_of() which returns TEXT)
 
 I think I'd suggest pg_type_name ... or maybe pg_type_name_of ...
 also, the code you need to convert OID to name is already there,
 see regtypeout.

Fantastic! Thank you for the interesting discussion,

-- 
Kate



Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Stephan Szabo
On Fri, 2 Feb 2007, Simon Riggs wrote:

 It sounds like if we don't put a SHARE lock on the referenced table then
 we can end the transaction in an inconsistent state if the referenced
 table has concurrent UPDATEs or DELETEs. BUT those operations do impose
 locking rules back onto the referencing tables that would not be granted
 until after any changes to the referencing table complete, whereupon
 they would restrict or cascade. So an inconsistent state doesn't seem
 possible to me.

What locking back to the referencing table are you thinking about? The row
locks are insufficient because they don't prevent an insert of a
new row that matches the criteria previously locked against, AFAIK.



Re: [HACKERS] PL/pgSQL RENAME functionality in TODOs

2007-02-02 Thread imad

On 2/2/07, Jim Nasby [EMAIL PROTECTED] wrote:

On Feb 1, 2007, at 5:08 AM, Pavel Stehule wrote:
 std. use rename only for triggers and variables new and old. It has
 sense. I don't see sense for rename in classic plpgsql functions.
 There was one reason, renaming unnamed $params. But currently plpgsql
 supports named params, so this reason is obsolete.

Unless things have changed it can be a real PITA to deal with plpgsql
variables that share the same name as a field in a table. IIRC
there's some cases where it's not even possible to unambiguously
refer to the plpgsql variable instead of the field.

For internal variables there's a decent work-around... just prefix
all variables with something like v_. But that's pretty ugly for
parameters... get_user(user_id int) is certainly a nicer interface
than get_user(p_user_id int).

But I think a way to get around that would be to RENAME the arguments
in the DECLARE section, so user_id could become p_user_id under the
covers.

So perhaps there is still a point to RENAME after all, at least for
parameters.
--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


Parameters can be renamed in 8.2.
The only thing which does not work is renaming a variable immediately after
its declaration, which is useless functionality.

So, should we still consider it a ToDo?

-- Imad
www.EnterpriseDB.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] problem of geometric operator in v8.2.1

2007-02-02 Thread Ioseph Kim

= select @-@ lseg '((0,0),(1,0))';
?column?
--
   1
(1 row)

= select @-@ path '((0,0),(1,0))';
?column?
--
   2
(1 row)

--

This may be a bug in v8.2.1.



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Simon Riggs
On Fri, 2007-02-02 at 12:01 +0100, Csaba Nagy wrote: 
  You say below the cut that you're not updating keys, so presumably it's
  other columns. Which leads me to something I've wondered for a while -
  why do we lock the whole row? Is it just that this hasn't been optimised
  yet, or is there a good reason why locking just some columns isn't
  practical?
 
 For the conditions of generating the deadlock, see:
 
 http://archives.postgresql.org/pgsql-general/2006-12/msg00029.php
 
 The reason for the occasional orphan rows is not completely clear to me,
 but it must be some kind of race condition while concurrently
 inserting/deleting/?updating the parent/child tables.

Thanks very much to Csaba, Richard and Florian for insight on this.

As you might have guessed across my recent posts, I'm coping with a
locking problem that is occurring on RI checks. Similarly to Csaba's
example, this is a port from Oracle.

My earlier thinking was that Oracle appears to be able to avoid locking
and my thought was that this was simply a rather dodgy interpretation of
the SQL Standard. Anyway, I'm not happy with simply forgetting the SHARE
lock; that clearly leads to invalid states in some cases, even if I need
to have a strong cup of coffee in the morning before I see them.

Using SELECT ... FOR SHARE in RI checks is better than using FOR UPDATE,
but it's still too heavy a lock for many scenarios.

In particular, an INSERT on the referencing table can be blocked because
of an RI check against the referenced table, when there is a concurrent
UPDATE (on referenced table). So RI works well when the referenced
tables are mostly read-only, but we see lots of problems when we perform
regular updates against both referencing and referenced tables.

When we perform an INSERT into the referencing table, we want to be
certain that the FK value we are inserting still exists within the
referenced table. A DELETE on the referenced table should prevent the
INSERT, as should an UPDATE which changes the Primary Key. However, an
UPDATE which simply UPDATEs a non-PK column on the referenced table
should neither block nor prevent the INSERT on the referencing table.

We might think of using a SELECT ... FOR SHARE NOWAIT but that will
return an ERROR. Even if we wrap the request in a subtransaction the
query will still fail even when a permissible non-PK UPDATE is taking
place, so that alone is not good enough.

Various other thoughts about new lock modes yield nothing useful either,
after close analysis. So changing the lock *strength* is not the right
thing to do, but changing the lock footprint does seem worthwhile.

My initial proposal is to change the way write locking works, so as to
implement simplified column-level locking. Rather than locking each
column individually, we can lock the columns in one of two groups, plus
the full row. Thus we have three types of write lock:

1. full row write lock

as well as two mutually exclusive groups of columns:

2.a) PK cols
2.b) all columns apart from the PK cols

So type (1) conflicts with either (2a) or (2b). (2a) and (2b) do not
conflict with one another. Shared and Write locks conflict as before at
the various levels.

INSERT, DELETE - full row write lock

UPDATE - will place a write lock on either: full row or all-non-PK-cols,
depending upon whether the SET clause touches the PK columns.  (So you
cannot UPDATE the PK while the non-PK cols are being UPDATEd...) If no
FK references this table, we will always take a full row write lock.

SELECT FOR UPDATE - full row write lock

SELECT FOR SHARE - will place full row lock by default, but in cases
where the SELECT doesn't reference anything apart from the PK columns,
we would place the lock only on the PK-cols. (Might be easier to do this
only for RI check queries.)

With this model, an INSERT or FK UPDATE on the referencing table that
generates a SELECT FOR SHARE onto the referenced table will conflict
with a DELETE or a PK UPDATE, but won't conflict at all with a non-PK
UPDATE.
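A minimal two-session sketch of the behavior being proposed; the schema is made up for illustration, and under today's FOR SHARE locking session 2 would block until session 1 commits:

```sql
-- Hypothetical schema:
CREATE TABLE parent (id int PRIMARY KEY, note text);
CREATE TABLE child  (id int PRIMARY KEY, parent_id int REFERENCES parent (id));

-- Session 1: a non-PK UPDATE would take only the "all-non-PK-cols" write lock
BEGIN;
UPDATE parent SET note = 'touched' WHERE id = 1;

-- Session 2: the RI check behind this INSERT takes a share lock on parent's
-- PK cols only, so under the proposal it would not conflict with session 1
BEGIN;
INSERT INTO child VALUES (10, 1);
```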

This would be possible by using 2 additional tuple info bits to flag
that the lock held was either HEAP_LOCKED_PK_ONLY or
HEAP_LOCKED_NON_PK_COLS. When both lock types are held simultaneously we
will replace xmax with a multitransaction id, where we hold only two
transactionids at most - the first Xid is the holder of the PK cols
lock, the second Xid is the holder of the non-PK cols lock.

As a non-PK UPDATE is carried out, the details of any share locks on the
PK cols are carried forward onto the new row version.

So all of the machinery we need is already in place, we just need to
extend it somewhat.

Clearly an 8.4+ item, but seems worth getting onto the TODO list if we
agree to it:

TODO:

Develop non-conflicting locking scheme to allow RI checks to co-exist
peacefully with non-PK UPDATEs on the referenced table.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---

Re: [HACKERS] Performance penalty of visibility info in indexes?

2007-02-02 Thread Simon Riggs
On Thu, 2007-02-01 at 23:57 -0600, Jim Nasby wrote:
 Has anyone actually measured the performance overhead of storing  
 visibility info in indexes? I know the space overhead sounds  
 daunting, but even if it doubled the size of the index in many cases  
 that'd still be a huge win over having to scan the heap as well as  
 the index (esp. for things like count(*)). There would also be  
 overhead from having to update the old index tuple, but for the case  
 of updates you're likely to need that page for the new index tuple  
 anyway.
 
 I know this wouldn't work for all cases, but ISTM there are many  
 cases where it would be a win.

It would prevent any optimization that sought to avoid inserting rows
into the index each time we perform an UPDATE. Improving UPDATE
performance seems more important than improving count(*), IMHO.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Proposed adjustments in MaxTupleSize and toast thresholds

2007-02-02 Thread Tom Lane
I've been looking into Pavan Deolasee's recent discovery that when
storing a maximum-length toast tuple, heap_insert uselessly recurses
to toast_insert_or_update, wasting a nontrivial number of cycles.
It turns out there are several interrelated mistakes here, which are
wasting space as well as cycles.

First off, as to the exact nature of what's happening: the toast code
is designed so that when breaking down a large datum, it's divided into
rows with data payloads of exactly TOAST_MAX_CHUNK_SIZE bytes each.
On a 4-byte-MAXALIGN machine, this means the rows have total t_len of
exactly TOAST_TUPLE_THRESHOLD, which is what was intended.  However,
that value is not a multiple of 4.  Hence when heapam.c compares
MAXALIGN(tup->t_len) > TOAST_TUPLE_THRESHOLD
it decides the tuple needs re-toasting.

I noted before that this does not happen on an 8-byte-MAXALIGN machine,
but had not understood exactly why.  The reason is the outer MAXALIGN
call in the definition

#define TOAST_MAX_CHUNK_SIZE  (TOAST_TUPLE_THRESHOLD -                  \
                               MAXALIGN(                                \
                                   MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)) + \
                                   sizeof(Oid) +                        \
                                   sizeof(int32) +                      \
                                   VARHDRSZ))

On a 4-byte machine that call doesn't do anything, but on an 8-byte
machine it causes the value of TOAST_MAX_CHUNK_SIZE to be reduced by
4, which means that t_len of a toast row comes out 4 bytes smaller
than on a 4-byte machine, which makes it smaller than
TOAST_TUPLE_THRESHOLD even after maxalign'ing.  Hence no recursion.

That MAXALIGN is actually *wrong* now that I look at it: it's
effectively supposing that there is padding alignment after the varlena
length word for the chunk data, which of course there is not.  But we
can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb.

Instead we can fix the recursion by removing the MAXALIGN() operations in
heapam.c and tuptoaster.c that compare tuple lengths to the thresholds.
This effectively moves the threshold for tuple compression up a couple
bytes, which is a safe change to make, and makes the comparisons
slightly cheaper to boot.  I propose doing that in 8.2 (and maybe older
branches after we get a bit more testing of it).

But the real problem is that we've got sloppy choices of the thresholds
and sizes.  In the first place, TOAST_MAX_CHUNK_SIZE is being set at a
value that makes every toast row have two wasted padding bytes after it
(turns out it's the same on both 4- and 8-byte machines, though the
specific size of the rows differs).  This is silly, we should be using a
TOAST_MAX_CHUNK_SIZE that makes the actual row length come out at
exactly a MAXALIGN multiple.  In the second place, examination of toast
tables will show you that on a page with four maximum-length toast rows,
there are 12 free bytes on a 4-byte machine and 28 free on an 8-byte
machine (not counting the aforementioned padding bytes after each row).
That's fine at first glance; because of alignment considerations it's
actually the best we can do.  The trouble is that TOAST_TUPLE_THRESHOLD
is derived from MaxTupleSize, which is derived on the assumption that we
should leave 32 bytes for special space on heap pages.  If we actually
had such special space, it wouldn't fit.  This happens because the
threshold calculation is just

#define TOAST_TUPLE_THRESHOLD   (MaxTupleSize / 4)

which fails to account for the line pointers needed for all but the
first tuple.  These errors cancel out at the moment, but wouldn't if we
changed anything about the page header or special space layout.

What I suggest we do about this in HEAD is:

1. Rename MaxTupleSize to MaxHeapTupleSize, and get rid of the
MaxSpecialSpace allotment in its calculation.  We don't use special
space on heap pages and we shouldn't be artificially restricting tuple
length to allow for something that's unlikely to appear in the future.
(Note: yes, I know it's been suggested to keep free-space maps in some
heap pages, but that need not factor into a MaxHeapTupleSize limit: big
tuples can simply go into a page without any free-space map.)

2. Fix TOAST_TUPLE_THRESHOLD and TOAST_TUPLE_TARGET to be correctly
calculated (properly allowing for line pointers) and to be MAXALIGN
multiples.  The threshold value should be exactly the size of the
largest tuple that you can put four of onto one page.  Fix
TOAST_MAX_CHUNK_SIZE so that it is *not* necessarily a MAXALIGN
multiple, but rather causes the total length of a toast tuple to come
out that way.  This guarantees minimum space wastage on toast pages.

This will force initdb due to changing chunk sizes in toast tables, but
unless we're going to reject Heikki's patch to merge cmin/cmax, there is
no hope of an in-place upgrade for 8.3 anyway.

BTW, while I was looking at this I noticed that BTMaxItemSize is
incorrectly calculated as well: it's coming out a 

Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Simon Riggs
On Fri, 2007-02-02 at 10:35 -0800, Stephan Szabo wrote:
 On Fri, 2 Feb 2007, Simon Riggs wrote:
 
  It sounds like if we don't put a SHARE lock on the referenced table then
  we can end the transaction in an inconsistent state if the referenced
  table has concurrent UPDATEs or DELETEs. BUT those operations do impose
  locking rules back onto the referencing tables that would not be granted
  until after any changes to the referencing table complete, whereupon
  they would restrict or cascade. So an inconsistent state doesn't seem
  possible to me.
 
 What locking back to the referencing table are you thinking about? The row
 locks are insufficient because that doesn't prevent an insert of a
 new row that matches the criteria previously locked against AFAIK.

Probably best to read the later posts; this one was at the beginning of
my thought train, so is slightly off track, as later posters remind me.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] problem of geometric operator in v8.2.1

2007-02-02 Thread Tom Lane
Ioseph Kim [EMAIL PROTECTED] writes:
 = select @-@ lseg '((0,0),(1,0))';
  ?column?
 --
 1
 (1 row)

 = select @-@ path '((0,0),(1,0))';
  ?column?
 --
 2
 (1 row)

 This may be a bug in v8.2.1.

What do you think is wrong with those answers?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Thus we have three types of write lock:
 1. full row write lock
 as well as two mutually exclusive groups of columns:
 2.a) PK cols
 2.b) all columns apart from the PK cols

This appears to require that we add enough fields to row headers to
represent *three* locking transactions instead of the current one.
Given the amount of griping about row header overhead that normally
flows across this list, I can't see such a proposal getting accepted.

 This would be possible by using 2 additional tuple info bits to flag
 that the lock held was either HEAP_LOCKED_PK_ONLY or
 HEAP_LOCKED_NON_PK_COLS. When both lock types are held simultaneously we
 will replace xmax with a multitransaction id, where we hold only two
 transactionids at most - the first Xid is the holder of the PK cols
 lock, the second Xid is the holder of the non-PK cols lock.

You haven't thought that through: it fails to distinguish who holds
which lock (mxact membership is not ordered), and it doesn't scale to
more than two holders, and I don't think it works for combinations of
share and exclusive lock.  Also, what happened to the third type of lock?

Implementation details aside, I'm a tad concerned about introducing
deadlock failures that did not happen before, in scenarios where
transactions touch the row multiple times and end up needing to
acquire one of these locks while already holding another.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] problem of geometric operator in v8.2.1

2007-02-02 Thread Ioseph Kim

When the @-@ operator is used on the path type,
maybe the query below should return 1,
because this path is just a line.


- Original Message - 
From: Tom Lane [EMAIL PROTECTED]

To: Ioseph Kim [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Sent: Saturday, February 03, 2007 5:36 AM
Subject: Re: [HACKERS] problem of geometric operator in v8.2.1 




Ioseph Kim [EMAIL PROTECTED] writes:

= select @-@ lseg '((0,0),(1,0))';
 ?column?
--
1
(1 row)



= select @-@ path '((0,0),(1,0))';
 ?column?
--
2
(1 row)



This may be a bug in v8.2.1.


What do you think is wrong with those answers?

regards, tom lane




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] PL/pgSQL RENAME functionality in TODOs

2007-02-02 Thread Tom Lane
imad [EMAIL PROTECTED] writes:
 So, should we still consider it a ToDo?

Whatever you think about the rename-in-same-block-as-declared case,
it's still broken, as per my example showing that the effects are not
limited to the containing block.  However, considering that no one
has taken an interest in fixing it since 7.2, it's obviously not a
high-priority feature.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Simon Riggs
On Fri, 2007-02-02 at 15:57 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Thus we have three types of write lock:
  1. full row write lock
  as well as two mutually exclusive groups of columns:
  2.a) PK cols
  2.b) all columns apart from the PK cols
 
 This appears to require that we add enough fields to row headers to
 represent *three* locking transactions instead of the current one.
 Given the amount of griping about row header overhead that normally
 flows across this list, I can't see such a proposal getting accepted.

Yeh, I said ouch myself.

  This would be possible by using 2 additional tuple info bits to flag
  that the lock held was either HEAP_LOCKED_PK_ONLY or
  HEAP_LOCKED_NON_PK_COLS. When both lock types are held simultaneously we
  will replace xmax with a multitransaction id, where we hold only two
  transactionids at most - the first Xid is the holder of the PK cols
  lock, the second Xid is the holder of the non-PK cols lock.
 
 You haven't thought that through: it fails to distinguish who holds
 which lock (mxact membership is not ordered)

OK, sorry, I thought there was some ordering possible.

 , and it doesn't scale to
 more than two holders, and I don't think it works for combinations of
 share and exclusive lock.  Also, what happened to the third type of lock?

Well, we only need to record at most two lock holders (given the
semantics described). The third lock type is both locks at once.

 Implementation details aside, I'm a tad concerned about introducing
 deadlock failures that did not happen before, in scenarios where
 transactions touch the row multiple times and end up needing to
 acquire one of these locks while already holding another.

Well, right now we have deadlocks and lots of locking. Updating PKs
isn't a regular occurrence, so I'd rather swap a common deadlock for an
uncommon one, any day.

Anyway, implementation aside, I wanted to agree the overall TODO, so we
can think through the best way over a long period, if you agree in
general.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] problem of geometric operator in v8.2.1

2007-02-02 Thread Ioseph Kim

I misunderstood. :)

path '((0,0),(1,0))' is a 'closed' path.

In this case, maybe the operator calculated the length of the return segment too.

- Original Message - 
From: Ioseph Kim [EMAIL PROTECTED]

To: pgsql-hackers@postgresql.org
Sent: Saturday, February 03, 2007 6:00 AM
Subject: Re: [HACKERS] problem of geometric operator in v8.2.1 




when @-@ operator used at path type,
below query maybe returns 1.
because this path is just line.


- Original Message - 
From: Tom Lane [EMAIL PROTECTED]

To: Ioseph Kim [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Sent: Saturday, February 03, 2007 5:36 AM
Subject: Re: [HACKERS] problem of geometric operator in v8.2.1 




Ioseph Kim [EMAIL PROTECTED] writes:

= select @-@ lseg '((0,0),(1,0))';
 ?column?
--
1
(1 row)



= select @-@ path '((0,0),(1,0))';
 ?column?
--
2
(1 row)



It's maybe bug in v8.2.1


What do you think is wrong with those answers?

regards, tom lane







---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] problem of geometric operator in v8.2.1

2007-02-02 Thread Tom Lane
Ioseph Kim [EMAIL PROTECTED] writes:
 when @-@ operator used at path type,
 below query maybe returns 1.
 because this path is just line.

No, because it's a closed path, so it's a loop from (0,0) to (1,0)
and back again.  If you don't want to count the return segment,
use an open path.
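The distinction can be seen directly; square brackets are the syntax for an open path:

```sql
SELECT @-@ path '((0,0),(1,0))';  -- closed path: loop out and back
SELECT @-@ path '[(0,0),(1,0)]';  -- open path: a single segment
```

The closed form counts both the outgoing and returning segments, so it reports twice the length of the open form here.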

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Fri, 2007-02-02 at 15:57 -0500, Tom Lane wrote:
 , and it doesn't scale to
 more than two holders, and I don't think it works for combinations of
 share and exclusive lock.  Also, what happened to the third type of lock?

 Well, we just need to record the maximum two lock holders (given the
 semantics described). The third lock type is both locks at once.

You're not going to support shared locks?  That will be a big step
backwards ...

 Anyway, implementation aside, I wanted to agree the overall TODO, so we
 can think through the best way over a long period, if you agree in
 general.

No, I don't.  I think knowledge of which columns are in a PK is quite a
few levels away from the semantics of row locking.  To point out just
one problem, what happens when you add or drop a PK?  Or drop and
replace with a different column set?  Yes, I know dropping one requires
exclusive lock on the table, but the transaction doing it could hold row
locks within the table, and now it's very unclear what they mean.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Load distributed checkpoint

2007-02-02 Thread Bruce Momjian

Thread added to TODO list:

* Reduce checkpoint performance degradation by forcing data to disk
  more evenly

  http://archives.postgresql.org/pgsql-hackers/2006-12/msg00337.php
  http://archives.postgresql.org/pgsql-hackers/2007-01/msg00079.php


---

ITAGAKI Takahiro wrote:
 This is a proposal for load distributed checkpoint.
 (It is presented on postgresql anniversary summit in last summer.)
 
 
 We often encounter a performance gap during checkpoint. The reason is write
 bursts: storage devices are too overworked during checkpoint, so they cannot
 sustain usual transaction processing.
 
 Checkpoint consists of the following four steps, and the major performance
 problem is the 2nd step, in which all dirty buffers are written without pause.
 
  1. Query information (REDO pointer, next XID etc.)
  2. Write dirty pages in buffer pool
  3. Flush all modified files
  4. Update control file
 
 I suggested writing pages with sleeps in the 2nd step, using the normal
 activity of the background writer. It is something like cost-based vacuum
 delay. The background writer has two pointers, 'ALL' and 'LRU', indicating
 where to write out in the buffer pool. We can wait for the ALL clock-hand
 to go all the way around to guarantee that all pages are written.
 
 Here is pseudo-code for the proposed method. The internal loop is just the
 same as bgwriter's activity.
 
   PrepareCheckPoint();  -- do step 1
   Reset num_of_scanned_pages by ALL activity;
   do {
   BgBufferSync();   -- do a part of step 2
   sleep(bgwriter_delay);
   } while (num_of_scanned_pages < shared_buffers);
   CreateCheckPoint();   -- do step 3 and 4
 
 
 We could instead accelerate the background writer to reduce the work done at
 checkpoint, but that introduces another performance problem: extra pressure
 is always put on the storage devices to keep the number of dirty pages low.
 
 
 I'm working on automatically adjusting the progress of the checkpoint to the
 checkpoint timeout and the WAL segment limit, to avoid overlap of two
 checkpoints. I'll post a patch sometime soon.
 
 Comments and suggestions welcome.
 
 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center
 
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Sync Scan update

2007-02-02 Thread Bruce Momjian

Thread added to TODO for item:

* Allow sequential scans to take advantage of other concurrent
  sequential scans, also called Synchronised Scanning


---

Jeff Davis wrote:
 I have updated my Synchronized Scan patch and have had more time for
 testing.
 
 Go to http://j-davis.com/postgresql/syncscan-results10.html
 where you can download the patch, and see the benchmarks that I've run.
 
 The results are very promising. I did not see any significant slowdown
 for non-concurrent scans or for scans that fit into memory, although I
 do need more testing in this area.
 
 The benchmarks that I ran tested the concurrent performance, and the
 results were excellent.
 
 I also added two new simple features to the patch (they're just
 #define'd tunables in heapam.h):
 (1) If the table is smaller than
 effective_cache_size*SYNC_SCAN_THRESHOLD then the patch doesn't do
 anything different from current behavior.
 (2) The scans can start earlier than the hint implies by setting
 SYNC_SCAN_START_OFFSET between 0 and 1. This is helpful because it makes
 the scan start in a place where the cache trail is likely to be
 continuous between the starting point and the location of an existing scan.
 
 I'd like any feedback, particularly any results that show a slowdown
 from current behavior. I think I fixed Luke's problem (actually, it was
 a fluke that it was even working at all), but I haven't heard back. Some
 new feedback would be very helpful.
 
 Thanks.
 
 Regards,
   Jeff Davis
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] writing new regexp functions

2007-02-02 Thread Jeremy Drake
On Fri, 2 Feb 2007, Jeremy Drake wrote:

 I just coded up for this:

 CREATE FUNCTION regexp_matches(IN str text, IN pattern text) RETURNS
 text[]
 AS 'MODULE_PATHNAME', 'regexp_matches'
 LANGUAGE C IMMUTABLE STRICT;

 CREATE FUNCTION regexp_matches(
 IN str text, IN pattern text, IN return_pre_and_post bool,
 OUT prematch text, OUT fullmatch text, OUT matches text[], OUT
 postmatch text) RETURNS record
 AS 'MODULE_PATHNAME', 'regexp_matches'
 LANGUAGE C IMMUTABLE STRICT;


I wanted to put out there the question of what order the parameters to
these regex functions should go in.  ISTM most people expect them to go
(pattern, string), but I made these functions consistent with
substring(text,text), which takes (string, pattern).  Now I have been
working on a regexp_split function, which takes (pattern, string), which
is what someone familiar with the function from perl would expect, but is
not consistent with substring or now with my regexp_matches function.

I want to ask, should I break with following substring's precedent, and
put the pattern first (as most people probably would expect), or should I
break with perl's precedent and put the pattern second (to behave like
substring)?
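For comparison, a sketch of how the two orders line up; the regexp_matches call uses the signature proposed upthread, and the regexp_split spelling is hypothetical:

```sql
SELECT substring('foobar' from 'o(.)b');   -- existing precedent: (string, pattern)
SELECT regexp_matches('foobar', 'o(.)b');  -- proposed, same order as substring
-- the perl-style alternative would instead read: regexp_split('pattern', 'string')
```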


-- 
We cannot put the face of a person on a stamp unless said person is
deceased.  My suggestion, therefore, is that you drop dead.
-- James E. Day, Postmaster General

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Referential Integrity and SHARE locks

2007-02-02 Thread Florian G. Pflug

Simon Riggs wrote:

My earlier thinking was that Oracle appears to be able to avoid locking
and my thought was that this was simply a rather dodgy interpretation of
the SQL Standard. Anyway, I'm not happy with simply forgetting the SHARE
lock; that clearly leads to invalid states in some cases, even if I need
to have a strong cup of coffee in the morning before I see them.

I think Oracle is in a completely different situation here - Oracle imposes
limits on the maximum size of a transaction due to various reasons,
I believe - one being the size of the rollback segment. AFAIK, postgres
doesn't impose any such limits (apart from problems with long-running
transactions and vacuums, and possibly if you do set constraints all deferred)
 - which is why row locks have to be stored on-disk in the tuple header,
and not in some shared-memory segment.

Now, _if_ you're already imposing limits on transaction size, then it becomes
quite feasible IMHO to also limit the number of row locks a transaction can
take - and to just store them in memory. This again makes column-level locking
much easier, I'd think.


Using SELECT ... FOR SHARE in RI checks is better than using FOR UPDATE,
but its still too heavy a lock for many scenarios.

I think it's not too heavy, but it's actually the wrong kind of lock for RI
checks. Both SHARE and EXCLUSIVE row locks are, well, _row_ locks - they
lock a specific tuple. This is fine if you want to guarantee that
a certain tuple stays as it is as long as you still need it. But it's not really
what an RI constraint wants to ensure. An RI constraint actually wants to
force a specific _condition_ (namely, the existence of a row with a certain
value in a certain column) to be true, not prevent a specific physical tuple
from being modified.

Now, a generic mechanism for condition locking is probably impossible to
implement without large performance sacrifices - but AFAICS the cases needed
for RI checks are always of the form "Is there a row where
(field1, field2, ...) = (a, b, c, ...)?". And - at least for RI checks done
when updating the _referencing_ table - postgres already forces an index to
exist on (field1, field2, ...), I think.

The condition "There is a row where (field1, field2, ...) = (a, b, c, ...)"
is the same as saying "There is an index entry for (a, b, c) that points to a
live row". _If_ it was possible to somehow take a lock on an index key (not a
certain index entry, but rather _all_ entries with a given key), then that
could maybe be used for more efficient RI locks. I guess this would need some
sort of tuple-visibility-in-index entries, but it seems that there are a few
people interested in making that happen.

greetings, Florian Pflug

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] writing new regexp functions

2007-02-02 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 I want to ask, should I break with following substring's precedent, and
 put the pattern first (as most people probably would expect), or should I
 break with perl's precedent and put the pattern second (to behave like
 substring)?

All of SQL's pattern match operators have the pattern on the right, so
my advice is to stick with that and try not to think about Perl ;-)
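Some examples of that convention in existing operators (values chosen for illustration):

```sql
SELECT 'abcdef' LIKE 'abc%';           -- pattern on the right
SELECT 'abcdef' SIMILAR TO 'abc%';     -- likewise
SELECT 'abcdef' ~ '^abc';              -- likewise, POSIX regex
SELECT substring('abcdef' from 'b.');  -- likewise: string first, pattern second
```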

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] writing new regexp functions

2007-02-02 Thread David Fetter
On Fri, Feb 02, 2007 at 08:56:31PM -0500, Tom Lane wrote:
 Jeremy Drake [EMAIL PROTECTED] writes:
  I want to ask, should I break with following substring's
  precedent, and put the pattern first (as most people probably
  would expect), or should I break with perl's precedent and put the
  pattern second (to behave like substring)?
 
 All of SQL's pattern match operators have the pattern on the right,
 so my advice is to stick with that and try not to think about Perl
 ;-)

Perl provides inspiration, but here, consistency would help more than
orderly imitation of how it does what it does.   And besides, when
people really need Perl, they can pull it in as a PL :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Dirty pages in freelist cause WAL stuck

2007-02-02 Thread Bruce Momjian

Is this a TODO item?

---

ITAGAKI Takahiro wrote:
 Simon Riggs [EMAIL PROTECTED] wrote:
 
  I think what you are saying is: VACUUM places blocks so that they are
  immediately reused. This stops shared_buffers from being polluted by
  vacuumed-blocks, but it also means that almost every write becomes a
  backend dirty write when VACUUM is working, bgwriter or not. That also
  means that we flush WAL more often than we otherwise would.
 
 That's right. I think it's acceptable that the vacuuming process writes dirty
 buffers made by itself, because only that process slows down; other backends
 can run undisturbed. However, frequent WAL flushing should be avoided.
 
 I found the problem when I ran VACUUM FREEZE separately. But if there were
 some backends, dirty buffers made by VACUUM would be reused by those backends,
 not by the vacuuming process.
 
  From above my thinking would be to have a more general implementation:
  Each backend keeps a list of cache buffers to reuse in its local loop,
  rather than using the freelist as a global list. That way the technique
  would work even when we have multiple Vacuums working concurrently. It
  would also then be possible to use this for the SeqScan case as well.
 
 Great idea! The trouble lies in how SeqScan and VACUUM use buffers:
 the former uses too many and the latter too few. Your cache-looping
 approach would work around both cases.
 
  Another connected thought is the idea of a having a FullBufferList - the
  opposite of a free buffer list. When VACUUM/INSERT/COPY fills a block we
  notify the buffer manager that this block needs writing ahead of other
  buffers, so that the bgwriter can work more effectively. That seems like
  it would help with both this current patch and the additional thoughts
  above.
 
 Do you mean that the bgwriter should take care of buffers in the freelist, not
 only ones at the tail of the LRU? We might need activity control for the
 bgwriter. Buffers are reused rapidly during VACUUM or bulk inserts, so the
 bgwriter is not sufficient if its settings are the same as usual.
 
 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center
 
 
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] unixware and --with-ldap

2007-02-02 Thread Bruce Momjian

Clarification, this is the email used to make the patch that was
backpatched.

---

Albe Laurenz wrote:
  I have tried --with-thread-safety and configure fails on ldap check
  because for some reason CTHREAD_FLAGS (-Kpthread for UW) is missing
  on compile command, although present before that. I can't find why
 
  You mean PTHREAD_FLAGS, right?
 
  Nope, I mean PTHREAD_CFLAGS which is defined in src/templates/unixware
 
 PTHREAD_CFLAGS, of course :^)
 
 Oops, this seems to be an oversight in (my) original code.
 
 I'd say we should change the lines
 
   AC_CHECK_LIB(ldap_r, ldap_simple_bind, [],
   [AC_MSG_ERROR([library 'ldap_r' is required for LDAP])],
   [$PTHREAD_LIBS $EXTRA_LDAP_LIBS])
 
 to
 
   AC_CHECK_LIB(ldap_r, ldap_simple_bind, [],
   [AC_MSG_ERROR([library 'ldap_r' is required for LDAP])],
   [$PTHREAD_CFLAGS $PTHREAD_LIBS $EXTRA_LDAP_LIBS])
 
 I know that this is abuse of AC_CHECK_LIB, but it is only a test
 and LIBS is discarded later.
 
 Yours,
 Laurenz Albe
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] snprintf()

2007-02-02 Thread Kate F
Hello,

I've been implementing a type I needed, and happened to be using
snprintf(), since I have C99 available.

ereport(NOTICE,
        (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
         errmsg("%d", snprintf(NULL, 0, "abc"))));

For me, this reports 0. I believe it should report 3. My system's
snprintf() returns 3. I'm using NetBSD. By including postgres.h and
fmgr.h, does PostgreSQL replace my system's snprintf() prototype with
its own implementation's?

Placing stdio.h above those includes appears to have no effect.

For reference, the relevant part of C99:

  7.19.6.5 2 If n is zero, nothing is written, and s may be a null
  pointer.

  7.19.6.5 3 The snprintf function returns the number of characters
  that would have been written had n been sufficiently large, not
  counting the terminating null character, or a negative value if an
  encoding error occurred.

Regards,

-- 
Kate

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] snprintf()

2007-02-02 Thread Tom Lane
Kate F [EMAIL PROTECTED] writes:
 ... does PostgreSQL replace my system's snprintf() prototype with
 its own implementation's?

We do on some platforms where configure decides the system version
is deficient ... I don't recall the exact conditions at the moment.
I wouldn't really have expected that to happen on any *BSD, but you
could look into the generated Makefile.global to find out.

 For reference, the relevant part of C99:
   7.19.6.5 2 If n is zero, nothing is written, and s may be a null
   pointer.

For reference, the relevant part of the Single Unix Spec:

If the value of n is zero on a call to snprintf(), an
unspecified value less than 1 is returned.

So the behavior you'd like to depend on is unportable anyway, and
that coding will get rejected if submitted as a Postgres patch.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] snprintf()

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 10:52:28PM -0500, Tom Lane wrote:
 Kate F [EMAIL PROTECTED] writes:
  ... does PostgreSQL replace my system's snprintf() prototype with
  its own implementation's?
 
 We do on some platforms where configure decides the system version
 is deficient ... I don't recall the exact conditions at the moment.
 I wouldn't really have expected that to happen on any *BSD, but you
 could look into the generated Makefile.global to find out.

I don't see anything that looks relevant in my Makefile.global; I
would be surprised if NetBSD's were overridden too!


  For reference, the relevant part of C99:
7.19.6.5 2 If n is zero, nothing is written, and s may be a null
pointer.
 
 For reference, the relevant part of the Single Unix Spec:
 
   If the value of n is zero on a call to snprintf(), an
   unspecified value less than 1 is returned.

Aha! I do recall that either POSIX or SUS defers to C on conflicts... I
can't find which, though. If this snprintf() is following SUS
behaviour, that's fine. Thank you!


 So the behavior you'd like to depend on is unportable anyway, and
 that coding will get rejected if submitted as a Postgres patch.

Absolutely (and I assume you target C89, too, which does not provide
snprintf()). This was just something personal where I happened to use
it for convenience.

Thank you for checking that - and apologies for posting to the wrong
list; that should have been to -bugs!

Regards,

-- 
Kate

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] snprintf()

2007-02-02 Thread Tom Lane
Kate F [EMAIL PROTECTED] writes:
 On Fri, Feb/ 2/07 10:52:28PM -0500, Tom Lane wrote:
 I wouldn't really have expected that to happen on any *BSD, but you
 could look into the generated Makefile.global to find out.

 I don't see anything that looks relevant in my Makefile.global; I
 would be surprised if NetBSD's were overridden too!

Sorry for not being specific: the thing to check is whether that
file's definition for LIBOBJS includes snprintf.o.  If it does,
the code in src/port/snprintf.c would get sucked in.

If it doesn't, then I confess bafflement as to why snprintf isn't
acting as you'd expect on your machine.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] snprintf()

2007-02-02 Thread Kate F
On Fri, Feb/ 2/07 11:20:07PM -0500, Tom Lane wrote:
 Kate F [EMAIL PROTECTED] writes:
  On Fri, Feb/ 2/07 10:52:28PM -0500, Tom Lane wrote:
  I wouldn't really have expected that to happen on any *BSD, but you
  could look into the generated Makefile.global to find out.
 
  I don't see anything that looks relevant for my Makefile.global; I
  would be surprised if NetBSD's were overridden too!
 
 Sorry for not being specific: the thing to check is whether that
 file's definition for LIBOBJS includes snprintf.o.  If it does,
 the code in src/port/snprintf.c would get sucked in.
 
 If it doesn't, then I confess bafflement as to why snprintf isn't
 acting as you'd expect on your machine.

Just these:

LIBOBJS =  copydir.o dirmod.o exec.o noblock.o path.o pipe.o pgsleep.o
pgstrcasecmp.o qsort.o qsort_arg.o sprompt.o thread.o

(More than I expected, actually)

I am imagining the compiler (gcc, here) has some flags to explicitly
select whether C99 (which is what I tested my stand-alone example with)
or SUS behaviour is to be used. I'm not really sure how I'd set that,
though - I imagine I'd need to recompile the backend with -std=c99?

Regards,

-- 
Kate

---(end of broadcast)---
TIP 6: explain analyze is your friend