[HACKERS] PL/Perl translation, croak

2009-02-17 Thread Peter Eisentraut

plperl's nls.mk contains

GETTEXT_TRIGGERS:= _ errmsg errdetail errdetail_log errhint errcontext 
write_stderr croak Perl_croak


As far as I can tell, croak() and Perl_croak() are provided by the Perl 
library.  So it is quite unclear to me how we expect their argument 
strings to be translated using our message catalogs.  Is this unfinished 
business or is there some mechanism here that I'm not seeing?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions about parsing boolean and casting to anyelement

2009-02-17 Thread Sam Mason
On Mon, Feb 16, 2009 at 08:03:33PM -0500, Tom Lane wrote:
 ITAGAKI Takahiro itagaki.takah...@oss.ntt.co.jp writes:
  I hope anyelement could be used in casts because casts are supported by
  almost all programming languages where templates or generics are available.

Programming languages with generics (a.k.a. parametric polymorphism in
the literature) should mean that you need *fewer* casts, because the type
system is expressive enough that you don't need to escape through a
cast.

 I think what you're suggesting is that inside a polymorphic function,
 anyelement would somehow be a macro for the type that the function's
 current anyelement parameter(s) have.  It's an interesting idea but
 it's just fantasy at the moment; I don't even have an idea of how we
 might implement that.

A couple of solutions immediately present themselves: making functions
first-class objects, and introducing something called type classes
(please note these bear little resemblance to classes in
object-oriented programming).

If functions were first-class objects, you could pass the input
function (e.g. boolin or numeric_in) to the array_find function
directly and call it in place of the magic cast syntax (magic because it
has to figure out the type of the LHS, whereas if it were a function with
a known type then it wouldn't need to infer the source type).
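For illustration only, that first suggestion can be sketched in C with a function pointer standing in for a first-class input function (array_find, bool_in and int_in here are toy stand-ins, not PostgreSQL's actual functions):

```c
#include <stdlib.h>
#include <string.h>

/* A per-type "input function" parses a text literal into the element type.
 * Passing it explicitly means the finder never has to infer a source type,
 * which is exactly what the magic cast syntax would have to do. */
typedef int (*input_fn)(const char *);

static int bool_in(const char *s) { return strcmp(s, "t") == 0; } /* toy boolin */
static int int_in(const char *s)  { return atoi(s); }             /* toy int4in */

/* Return the index of the first element equal to the parsed literal, or -1. */
static int array_find(const int *arr, int n, const char *literal, input_fn in)
{
    int needle = in(literal);   /* caller-supplied parser; no type inference */

    for (int i = 0; i < n; i++)
        if (arr[i] == needle)
            return i;
    return -1;
}
```

The caller chooses the parser, so no type has to be inferred from context.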

Type classes[1][2] are a general mechanism for making the magic above
tractable and sound.  The cast above would be exactly analogous to the
read function in Haskell, which is used very regularly in most code.

 In the meantime I'm more convinced than ever that we should throw an
 error for attempting such a cast.  If people are imagining that it will
 do something like that, we need to disillusion them.

Yes, sounds sensible at the moment.

-- 
  Sam  http://samason.me.uk/
 
 [1] http://portal.acm.org/citation.cfm?id=75277.75283
 is the original paper
 [2] http://portal.acm.org/citation.cfm?id=141536
 extends them to have multiple type parameters, not for PG but nice
 to know it's been done before and isn't new ground



Re: [HACKERS] [BUGS] BUG #4660: float functions return -0

2009-02-17 Thread Bruce Momjian
Tom Lane wrote:
 ITAGAKI Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 We already have some code to avoid -0 in float8um (unary minus),
 but there is no protection in trunc(), ceil() and round(), at least.
 
 I looked into the CVS history to find out when the anti-minus-zero code
 got put into float8um.  It seems to have been done by Tom Lockhart here:
 
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/float.c.diff?r1=1.13;r2=1.14
 
 The CVS commit message says
   Check for zero in unary minus floating point code (IEEE allows an
   explicit negative zero which looks ugly in a query result!).
 along with some other unrelated changes.  I can find no evidence in the
 mailing list archives that there was any discussion about the point,
 so I think Tom did that on his own authority.
 
 I'm of the opinion that minus zero was put into the IEEE floating point
 standard by people who know a great deal more about the topic than
 anyone on this list does, and that we do not have the expertise to be
 second-guessing how it should work.  Not long ago we took out code that
 was interfering with spec-compliant treatment of IEEE infinity; I think
 we should take out this code too.
 
 Yes, it will be platform dependent, because various platforms get the
 IEEE spec wrong to some degree, but so what?  This is hardly the only
 platform dependence of that kind.

Agreed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [BUGS] BUG #4660: float functions return -0

2009-02-17 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 The CVS commit message says
   Check for zero in unary minus floating point code (IEEE allows an
   explicit negative zero which looks ugly in a query result!).

 I'm of the opinion that minus zero was put into the IEEE floating point
 standard by people who know a great deal more about the topic than
 anyone on this list does, and that we do not have the expertise to be
 second-guessing how it should work.  Not long ago we took out code that
 was interfering with spec-compliant treatment of IEEE infinity; I think
 we should take out this code too.

If the original complaint was that it looked ugly in query results, then the
right way to fix it would surely be in float4out and float8out. Interfering with
IEEE floating point may be a bad idea, but surely it's up to us how we want to
represent those values in text.

But the lack of a convenient and widely used binary dump format rather restricts
our options: if we squash -0 in float[48]out then dumps will lose information.
So I guess there's nothing we can do about it now. I wonder if we're going to
find users complaining about things like -0 showing up in displayed results,
though...
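To make the round-trip point concrete, here is a toy C sketch (not PostgreSQL's float8out, just text output and re-parse via the C library) showing that squashing -0 in the output function loses the sign permanently:

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Send a double through its text form and back, the way a dump/restore does.
 * If squash is set, a negative zero is replaced by +0 before printing
 * (note -0.0 == 0.0 under IEEE comparison, so the test catches both zeros). */
static double roundtrip(double x, int squash)
{
    char buf[64];

    if (squash && x == 0.0)
        x = 0.0;                      /* drops the sign bit */
    snprintf(buf, sizeof buf, "%g", x);
    return strtod(buf, NULL);
}
```

With squashing disabled, `%g` prints `-0` and strtod restores the negative zero; with squashing enabled, the sign is gone for good.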


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] difference from ansi sql standard - unicode strings?

2009-02-17 Thread Peter Eisentraut

Pavel Stehule wrote:

I found a BNF for SQL:2003 and noticed a small difference there.
The standard uses the keyword ESCAPE, but PostgreSQL uses the keyword UESCAPE.

Does anybody know the reason?

  <Unicode character string literal> ::=
     [ <introducer> <character set specification> ]
     <U> <ampersand> <quote> [ <Unicode representation> ... ] <quote>
     [ { <separator> <quote> [ <Unicode representation> ... ] <quote> }... ]
     [ ESCAPE <escape character> ]


My copy has

<Unicode character string literal> ::=
    [ <introducer><character set specification> ]
    <U><ampersand><quote> [ <Unicode representation>... ] <quote>
    [ { <separator> <quote> [ <Unicode representation>... ] <quote> }... ]
    <Unicode escape specifier>

<Unicode escape specifier> ::= [ UESCAPE <quote><Unicode escape character><quote> ]




Re: [HACKERS] difference from ansi sql standard - unicode strings?

2009-02-17 Thread Pavel Stehule
Hi,

It's possible - I used unofficial BNF graph from internet.

so problem is on my side.

thank you
Pavel


2009/2/17 Peter Eisentraut pete...@gmx.net:
 Pavel Stehule wrote:

 I found a BNF for SQL:2003 and noticed a small difference there.
 The standard uses the keyword ESCAPE, but PostgreSQL uses the keyword UESCAPE.

 Does anybody know the reason?

  <Unicode character string literal> ::=
     [ <introducer> <character set specification> ]
     <U> <ampersand> <quote> [ <Unicode representation> ... ] <quote>
     [ { <separator> <quote> [ <Unicode representation> ... ] <quote>
     }... ]
     [ ESCAPE <escape character> ]

 My copy has

 <Unicode character string literal> ::=
     [ <introducer><character set specification> ]
     <U><ampersand><quote> [ <Unicode representation>... ] <quote>
     [ { <separator> <quote> [ <Unicode representation>... ] <quote> }... ]
     <Unicode escape specifier>

 <Unicode escape specifier> ::= [ UESCAPE <quote><Unicode escape character><quote> ]




Re: [HACKERS] Questions about parsing boolean and casting to anyelement

2009-02-17 Thread Tom Lane
I wrote:
 ITAGAKI Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 I hope anyelement could be used in casts because casts are supported by
 almost all programming languages where templates or generics are available.

 I think what you're suggesting is that inside a polymorphic function,
 anyelement would somehow be a macro for the type that the function's
 current anyelement parameter(s) have.  It's an interesting idea but
 it's just fantasy at the moment; I don't even have an idea of how we
 might implement that.

After thinking about it for awhile, I don't like the notation anyway
--- it's not immediately obvious that a cast to anyelement should mean
something like that.  What seems more sensible to me is to introduce
a function to get the type of an expression, so that you could write
something like

cast(expression as typeof(expression))

This special function would act like C's sizeof and similar constructs
in that its argument would never be evaluated, only inspected at parse
time to determine its type.  (There are already precedents for this in
SQL; see the IS OF construct.)  So the original requirement would be
met with something like expression::typeof($1).

A small disadvantage of this approach is that it's notationally a bit
uglier for anyelement/anyarray pairs.  For example, consider a function
foo(anyelement) returns anyarray.  To get at the element type you just
say typeof($1), but if you have to name the array type you need a hack
like typeof(array[$1]).  In the other direction (name the element type
of a parameter array) something like typeof($1[1]) would work.

The countervailing advantage is that this solves a lot of problems that
overloading anyelement wouldn't ever solve, since you can get at the
type of any expression not just a bare parameter.

Also I think it'd be relatively easy to stick into the parser; it
wouldn't require introduction of any new parse-time context information.

Anyway, none of this is material for 8.4, just a possible TODO item.
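For comparison, C's sizeof really does inspect its operand only for its type; the operand is never evaluated (variable-length arrays excepted), which is the behavior proposed for typeof():

```c
#include <stddef.h>

/* The increment inside sizeof never executes: the compiler only needs the
 * type of the expression, not its value. */
static int sizeof_operand_is_not_evaluated(void)
{
    int i = 0;
    size_t s = sizeof(i++);   /* i is NOT incremented */

    (void) s;
    return i;                 /* still 0 */
}
```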

regards, tom lane



Re: [HACKERS] pg_migrator and handling dropped columns

2009-02-17 Thread Bruce Momjian

Applied.

---

Bruce Momjian wrote:
 Peter Eisentraut wrote:
  Tom Lane wrote:
   Is this acceptable to everyone?  We could name the option
   -u/--upgrade-compatible.
   
   If the switch is specifically for pg_upgrade support (enabling this as
   well as any other hacks we find necessary), which seems like a good
   idea, then don't chew up a short option letter for it.  There should be
   a long form only.
  
  Note that pg_dump's output is already upgrade compatible.  That's what 
  pg_dump is often used for after all.  I believe what we are after here 
  is something like "in-place upgrade compatible" or "upgrade binary 
  compatible".
  
   And probably not even list it in the user documentation.
  
  I think we should still list it somewhere and say it is for use by 
  in-place upgrade utilities.  It will only confuse people if it is not 
  documented at all.
 
 OK, I have completed the patch;  attached.
 
 I ran into a little problem, as documented by this comment in
 catalog/heap.c:
 
 /*
  * Set the type OID to invalid.  A dropped attribute's type link
  * cannot be relied on (once the attribute is dropped, the type might
  * be too). Fortunately we do not need the type row --- the only
  * really essential information is the type's typlen and typalign,
  * which are preserved in the attribute's attlen and attalign.  We set
  * atttypid to zero here as a means of catching code that incorrectly
  * expects it to be valid.
  */
 
 Basically, drop column zeros pg_attribute.atttypid, and there doesn't
 seem to be enough information left in pg_attribute to guess the typid
 that, combined with atttypmod, would restore the proper values for
 pg_attribute.atttypid and pg_attribute.attalign.  Therefore, I just
 brute-forced an UPDATE into the dump output to set the values properly after
 dropping the fake TEXT column.
 
 I did a minimal documentation addition by adding something to the
 Notes section of the manual pages.
 
 Here is what a dump of a table with dropped columns looks like:
 
   --
   -- Name: test; Type: TABLE; Schema: public; Owner: postgres; Tablespace:
   --
   
   CREATE TABLE test (
   x integer,
   "pg.dropped.2" TEXT
   );
   ALTER TABLE ONLY test DROP COLUMN "pg.dropped.2";
   
   -- For binary upgrade, recreate dropped column's length and alignment.
   UPDATE pg_attribute
   SET attlen = -1, attalign = 'i'
   WHERE   attname = 'pg.dropped.2'
   AND attrelid =
   (
   SELECT oid
   FROM pg_class
   WHERE   relnamespace = (SELECT oid FROM pg_namespace 
 WHERE nspname = CURRENT_SCHEMA)
   AND relname = 'test'
   );
   
   ALTER TABLE public.test OWNER TO postgres;
 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + If your life is a hard drive, Christ can be your backup. +

[ text/x-diff is unsupported, treating like TEXT/PLAIN ]

 Index: doc/src/sgml/ref/pg_dump.sgml
 ===
 RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
 retrieving revision 1.109
 diff -c -c -r1.109 pg_dump.sgml
 *** doc/src/sgml/ref/pg_dump.sgml 10 Feb 2009 00:55:21 -  1.109
 --- doc/src/sgml/ref/pg_dump.sgml 17 Feb 2009 01:57:10 -
 ***
 *** 827,832 
 --- 827,837 
  editing of the dump file might be required.
 /para
   
 +   <para>
 +    <application>pg_dump</application> also supports a
 +    <literal>--binary-upgrade</literal> option for upgrade utility usage.
 +   </para>
 + 
</refsect1>
  
<refsect1 id="pg-dump-examples">
 Index: doc/src/sgml/ref/pg_dumpall.sgml
 ===
 RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dumpall.sgml,v
 retrieving revision 1.75
 diff -c -c -r1.75 pg_dumpall.sgml
 *** doc/src/sgml/ref/pg_dumpall.sgml  7 Feb 2009 14:31:30 -   1.75
 --- doc/src/sgml/ref/pg_dumpall.sgml  17 Feb 2009 01:57:10 -
 ***
 *** 489,494 
 --- 489,499 
  locations.
 /para
   
 +   <para>
 +    <application>pg_dump</application> also supports a
 +    <literal>--binary-upgrade</literal> option for upgrade utility usage.
 +   </para>
 + 
</refsect1>
   
   
 Index: src/bin/pg_dump/pg_dump.c
 ===
 RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
 retrieving revision 1.521
 diff -c -c -r1.521 pg_dump.c
 *** src/bin/pg_dump/pg_dump.c 16 Feb 2009 23:06:55 -  1.521
 --- src/bin/pg_dump/pg_dump.c 17 Feb 2009 01:57:10 -
 ***
 *** 99,104 
 --- 99,106 
   /* default, if no inclusion switches 

Re: [HACKERS] [BUGS] BUG #4660: float functions return -0

2009-02-17 Thread Tom Lane
Gregory Stark st...@enterprisedb.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 I'm of the opinion that minus zero was put into the IEEE floating point
 standard by people who know a great deal more about the topic than
 anyone on this list does, and that we do not have the expertise to be
 second-guessing how it should work.  Not long ago we took out code that
 was interfering with spec-compliant treatment of IEEE infinity; I think
 we should take out this code too.

 If the original complaint was that it looked ugly in query results then the
 right way to fix it would surely in float4out and float8out. Interfering with
 IEEE floating points may be a bad idea but surely it's up to us how we want to
 represent those values in text.

 But without a convenient and widely used binary format that kind of restricts
 our options. If we squash -0 on float[48]out then dumps will lose information.

The point I'm trying to make is that we should deliver IEEE-compliant
results if we are on a platform that complies with the spec.  Right down
to the minus sign.  If that surprises people who are unfamiliar with the
spec, well, there are a lot of things about floating point arithmetic
that surprise people who aren't familiar with it.
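A minimal C illustration of the point, assuming an IEEE 754 platform: minus zero compares equal to plus zero, yet the sign is real and observable downstream:

```c
#include <math.h>

/* What float8um does without the old anti-minus-zero check: plain negation.
 * Under IEEE 754, -(0.0) is negative zero. */
static double unary_minus(double x)
{
    return -x;
}
```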

regards, tom lane



Re: [HACKERS] PL/Perl translation, croak

2009-02-17 Thread Alvaro Herrera
Peter Eisentraut wrote:
 plperl's nls.mk contains

 GETTEXT_TRIGGERS:= _ errmsg errdetail errdetail_log errhint errcontext  
 write_stderr croak Perl_croak

 As far as I can tell, croak() and Perl_croak() are provided by the Perl  
 library.  So it is quite unclear to me how we expect their argument  
 strings to be translated using our message catalogs.  Is this unfinished  
 business or is there some mechanism here that I'm not seeing?

IIRC I just added them because we have a couple of error messages that
are output with Perl_croak and one with croak.  Does it fail to work?  I
don't remember if I actually tested it.



Re: [HACKERS] PL/Perl translation, croak

2009-02-17 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Peter Eisentraut wrote:
 plperl's nls.mk contains
 
 GETTEXT_TRIGGERS:= _ errmsg errdetail errdetail_log errhint errcontext  
 write_stderr croak Perl_croak
 
 As far as I can tell, croak() and Perl_croak() are provided by the Perl  
 library.  So it is quite unclear to me how we expect their argument  
 strings to be translated using our message catalogs.  Is this unfinished  
 business or is there some mechanism here that I'm not seeing?

 IIRC I just added them because we have a couple of error messages that
 are output with Perl_croak and one with croak.  Does it fail to work?  I
 don't remember if I actually tested it.

ISTM it would only work if the message were fed through our gettext
before being passed to croak().  So you'd need to write croak(_(msg)),
and marking croak itself as a gettext trigger is wrong.
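A sketch of that in C, with a stub standing in for the real gettext machinery (lookup_translation and its toy catalog are inventions for illustration; plperl would really use dgettext):

```c
#include <string.h>

/* Toy catalog lookup standing in for gettext()/dgettext(). */
static const char *lookup_translation(const char *msg)
{
    if (strcmp(msg, "out of memory") == 0)
        return "memoire epuisee";          /* pretend catalog entry */
    return msg;                            /* untranslated fallback */
}

#define _(msg) lookup_translation(msg)

/* croak() only ever sees the string it is handed, so translation must
 * happen at the call site: croak(_("...")), not croak("..."). */
static const char *arg_with_wrapper(void)    { return _("out of memory"); }
static const char *arg_without_wrapper(void) { return "out of memory"; }
```

Marking croak as a trigger only teaches xgettext to collect the strings; it cannot make croak translate them at runtime.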

regards, tom lane



[HACKERS] vacuumdb --freeze

2009-02-17 Thread Bruce Momjian
I would like to add a --freeze parameter to vacuumdb for use by the
binary upgrade utility, and for symmetry with the existing VACUUM
options;  patch attached.

I could also accomplish this with PGOPTIONS, but this seems like a cleaner
solution.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/ref/vacuumdb.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/vacuumdb.sgml,v
retrieving revision 1.42
diff -c -c -r1.42 vacuumdb.sgml
*** doc/src/sgml/ref/vacuumdb.sgml  11 Dec 2007 19:57:32 -  1.42
--- doc/src/sgml/ref/vacuumdb.sgml  17 Feb 2009 16:24:39 -
***
*** 26,31 
--- 26,32 
 <group><arg>--full</arg><arg>-f</arg></group>
 <group><arg>--verbose</arg><arg>-v</arg></group>
 <group><arg>--analyze</arg><arg>-z</arg></group>
+<group><arg>--freeze</arg><arg>-F</arg></group>
 <arg>--table | -t <replaceable>table</replaceable>
  <arg>( <replaceable class="parameter">column</replaceable> [,...] )</arg>
 </arg>
***
*** 37,42 
--- 38,44 
 <group><arg>--full</arg><arg>-f</arg></group>
 <group><arg>--verbose</arg><arg>-v</arg></group>
 <group><arg>--analyze</arg><arg>-z</arg></group>
+<group><arg>--freeze</arg><arg>-F</arg></group>
</cmdsynopsis>
  </refsynopsisdiv>
   
***
*** 161,166 
--- 163,178 
 </para>
</listitem>
  </varlistentry>
+ 
+  <varlistentry>
+   <term><option>-F</option></term>
+   <term><option>--freeze</option></term>
+   <listitem>
+    <para>
+     Aggressively <quote>freeze</quote> tuples.
+    </para>
+   </listitem>
+  </varlistentry>
  </variablelist>
 </para>
  
Index: src/bin/scripts/vacuumdb.c
===
RCS file: /cvsroot/pgsql/src/bin/scripts/vacuumdb.c,v
retrieving revision 1.22
diff -c -c -r1.22 vacuumdb.c
*** src/bin/scripts/vacuumdb.c  1 Jan 2009 17:23:55 -   1.22
--- src/bin/scripts/vacuumdb.c  17 Feb 2009 16:24:39 -
***
*** 15,25 
  
  
  static void vacuum_one_database(const char *dbname, bool full, bool verbose, 
bool analyze,
!   const char *table,
const char *host, const char *port,
const char *username, bool password,
const char *progname, bool echo);
! static void vacuum_all_databases(bool full, bool verbose, bool analyze,
 const char *host, const char *port,
 const char *username, bool password,
 const char *progname, bool echo, bool 
quiet);
--- 15,25 
  
  
  static void vacuum_one_database(const char *dbname, bool full, bool verbose, 
bool analyze,
!   bool freeze, const char *table,
const char *host, const char *port,
const char *username, bool password,
const char *progname, bool echo);
! static void vacuum_all_databases(bool full, bool verbose, bool analyze, bool 
freeze,
 const char *host, const char *port,
 const char *username, bool password,
 const char *progname, bool echo, bool 
quiet);
***
*** 39,44 
--- 39,45 
{"quiet", no_argument, NULL, 'q'},
{"dbname", required_argument, NULL, 'd'},
{"analyze", no_argument, NULL, 'z'},
+   {"freeze", no_argument, NULL, 'F'},
{"all", no_argument, NULL, 'a'},
{"table", required_argument, NULL, 't'},
{"full", no_argument, NULL, 'f'},
***
*** 58,63 
--- 59,65 
bool    echo = false;
bool    quiet = false;
bool    analyze = false;
+   bool    freeze = false;
bool    alldb = false;
char   *table = NULL;
bool    full = false;
***
*** 68,74 
  
handle_help_version_opts(argc, argv, "vacuumdb", help);
  
!   while ((c = getopt_long(argc, argv, "h:p:U:Weqd:zat:fv", long_options, 
&optindex)) != -1)
{
switch (c)
{
--- 70,76 
  
handle_help_version_opts(argc, argv, "vacuumdb", help);
  
!   while ((c = getopt_long(argc, argv, "h:p:U:Weqd:zaFt:fv", long_options, 
&optindex)) != -1)
{
switch (c)
{
***
*** 96,101 
--- 98,106 
case 'z':
analyze = true;
break;
+   case 'F':
+   

Re: [HACKERS] Synch Replication

2009-02-17 Thread Czichy, Thoralf (NSN - FI/Helsinki)

hi,

[I am working in the same team as Niranjan]

Niranjan wrote:
  3) Do you have test programs that can be used
  for synchronous replication testing?
 
 No, I don't have an automated test program yet. Since it would be
 very useful, I'll make one before long.
 
  4) I'm thinking of trying load/performance tests as well. 
  What do you feel? Will it be too early to do this test?

 Any kinds of testing welcome!

Actually, this is just to let you know that for _stability_ and 
performance tests we use the Network Database Benchmark which
we open-sourced (GPLv2) in 2006. Just recently one of our
colleagues wrote a _small_ patch that makes it work out of the
box with _PostgreSQL_/UnixODBC. The patch is now also available.

The main project page(s):
  http://hoslab.cs.helsinki.fi/savane/projects/ndbbenchmark/
  http://hoslab.cs.helsinki.fi/homepages/ndbbenchmark/

The patch:
 
http://hoslab.cs.helsinki.fi/savane/cookbook/?func=detailitem&item_id=141

The benchmark models a Telco home location register (HLR)
application with lots of short read/write transactions whose
ratio can be adjusted on the command line, e.g. to model read
or write heavy transaction loads. We'll re-use this benchmark
as we have lots of existing measurements for other databases.
Also we have a pretty good understanding of what to expect
performance-wise with the different transaction mixes. The
actual benchmark specification is available at the link below.

The benchmark spec:
 
http://hoslab.cs.helsinki.fi/downloads/ndbbenchmark/Network_Database_Benchmark_Definition_2006-02-01.pdf

Thoralf 



[HACKERS] Re: [COMMITTERS] pgsql: Redefine _() to dgettext() instead of gettext() so that it uses

2009-02-17 Thread Alvaro Herrera
Peter Eisentraut wrote:
 Log Message:
 ---
 Redefine _() to dgettext() instead of gettext() so that it uses the plpgsql
 text domain, instead of the postgres one (or whatever the default may be).

Hmm, so is this needed on all other PLs too?



Re: [HACKERS] [GENERAL] Good Delimiter for copy command

2009-02-17 Thread David Fetter
On Tue, Feb 17, 2009 at 05:17:40PM +0100, Marco Colombo wrote:
 Tom Lane wrote:
  Andrew Gould andrewlylego...@gmail.com writes:
  To the list:  Does pg_dump escape characters that are the same as the
  delimiter?
  
  Yes.  The OP has not actually explained why he needs to pick a
  nondefault delimiter, unless maybe it is that he wants to feed the
  dump to some program that is too dumb to deal with escaping.
  
  regards, tom lane
  
 
 Which makes me wonder, does copy accept UTF-8 input?

Yes, but...

 Is it possible to use some Unicode character which is unlikely to
 appear in the data set as the delimiter? Something like U+FFFC.

No.  The delimiter needs to be one byte long at the moment.  The error
message you're getting probably needs an update.  Bug?

 $ psql -c \copy test from '2.txt' delimiter ●
 ERROR:  COPY delimiter must be a single character
 \copy: ERROR:  COPY delimiter must be a single character

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] [BUGS] BUG #4660: float functions return -0

2009-02-17 Thread Brendan Jurd
On Wed, Feb 18, 2009 at 2:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The point I'm trying to make is that we should deliver IEEE-compliant
 results if we are on a platform that complies with the spec.  Right down
 to the minus sign.  If that surprises people who are unfamiliar with the
 spec, well, there are a lot of things about floating point arithmetic
 that surprise people who aren't familiar with it.

Agreed.  There are plenty of things about floats that are downright
wonky, and when people start seeing minus zero in their float
computations it might prompt them into doing some reading, and
figuring out that what they really wanted was numeric.

(not saying that floats are without application, but I've often
encountered them in places they ought not to be)

Cheers,
BJ



Re: [HACKERS] Questions about parsing boolean and casting to anyelement

2009-02-17 Thread Brendan Jurd
On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 After thinking about it for awhile, I don't like the notation anyway
 --- it's not immediately obvious that a cast to anyelement should mean
 something like that.  What seems more sensible to me is to introduce
 a function to get the type of an expression, so that you could write
 something like

We already have such a function, pg_typeof().  I submitted a patch for
it in the November commitfest, and you committed it. [1]

Or is that not the sort of function you were thinking of?

Cheers,
BJ

[1] 
http://git.postgresql.org/?p=postgresql.git;a=commit;h=1a850edf036a1c7dbb9f4fcfeae1e5f2c68cf049



Re: [HACKERS] Questions about parsing boolean and casting to anyelement

2009-02-17 Thread Tom Lane
Brendan Jurd dire...@gmail.com writes:
 On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 After thinking about it for awhile, I don't like the notation anyway
 --- it's not immediately obvious that a cast to anyelement should mean
 something like that.  What seems more sensible to me is to introduce
 a function to get the type of an expression, so that you could write
 something like

 We already have such a function, pg_typeof().

No, pg_typeof is a more-or-less ordinary function that delivers an OID
at runtime.  What we need here is something that will work as a CAST
target, ie, it has to be treated as a type name at parse time.

regards, tom lane



Re: [HACKERS] [GENERAL] Good Delimiter for copy command

2009-02-17 Thread Tom Lane
David Fetter da...@fetter.org writes:
 On Tue, Feb 17, 2009 at 05:17:40PM +0100, Marco Colombo wrote:
 Is it possible to use some Unicode character which is unlikely to
 appear in the data set as the delimiter? Something like U+FFFC.

 No.  The delimiter needs to be one byte long at the moment.  The error
 message you're getting probably needs an update.  Bug?

 $ psql -c \copy test from '2.txt' delimiter ●
 ERROR:  COPY delimiter must be a single character
 \copy: ERROR:  COPY delimiter must be a single character

In 8.3 and CVS HEAD these messages are phrased as must be a single
ASCII character which I suppose is someone's attempt at improving the
situation; but IMHO this has replaced an imprecision with a lie.  It
works fine with non-ASCII characters, if you're working in a single-byte
encoding (e.g. LATIN-n).

I think it should say must be a single one-byte character and not try
to talk down to the reader.
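The byte/character distinction is easy to see in C (a toy validity check mirroring the restriction, not the actual COPY code):

```c
#include <string.h>

/* COPY's delimiter restriction is on bytes, not characters.  In UTF-8 a
 * non-ASCII character occupies several bytes; in a single-byte encoding
 * such as LATIN1, every character is exactly one byte, so non-ASCII works. */
static int is_single_byte_delim(const char *d)
{
    return strlen(d) == 1;
}
```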

regards, tom lane



Re: [HACKERS] Questions about parsing boolean and casting to anyelement

2009-02-17 Thread Pavel Stehule
2009/2/17 Tom Lane t...@sss.pgh.pa.us:
 I wrote:
 ITAGAKI Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 I hope anyelement could be used in casts because casts are supported by
 almost all programming languages where templates or generics are available.

 I think what you're suggesting is that inside a polymorphic function,
 anyelement would somehow be a macro for the type that the function's
 current anyelement parameter(s) have.  It's an interesting idea but
 it's just fantasy at the moment; I don't even have an idea of how we
 might implement that.

 After thinking about it for awhile, I don't like the notation anyway
 --- it's not immediately obvious that a cast to anyelement should mean
 something like that.  What seems more sensible to me is to introduce
 a function to get the type of an expression, so that you could write
 something like

cast(expression as typeof(expression))

 This special function would act like C's sizeof and similar constructs
 in that its argument would never be evaluated, only inspected at parse
 time to determine its type.  (There are already precedents for this in
 SQL; see the IS OF construct.)  So the original requirement would be
 met with something like expression::typeof($1).

 A small disadvantage of this approach is that it's notationally a bit
 uglier for anyelement/anyarray pairs.  For example, consider a function
 foo(anyelement) returns anyarray.  To get at the element type you just
 say typeof($1), but if you have to name the array type you need a hack
 like typeof(array[$1]).  In the other direction (name the element type
 of a parameter array) something like typeof($1[1]) would work.

 The countervailing advantage is that this solves a lot of problems that
 overloading anyelement wouldn't ever solve, since you can get at the
 type of any expression not just a bare parameter.

 Also I think it'd be relatively easy to stick into the parser; it
 wouldn't require introduction of any new parse-time context information.

 Anyway, none of this is material for 8.4, just a possible TODO item.

it looks like a good idea

regards
Pavel Stehule


regards, tom lane



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SE-PostgreSQL and row level security

2009-02-17 Thread Josh Berkus

All,

I thought we'd agreed to compromise on having SE without row-level in 
8.4, and working on SE with row-level in 8.5.  Why are we revisiting 
this argument?  8.4 is *already* late; arguing further about the terms 
of SE simply risks us being forced to reject it entirely.


--Josh

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SE-PostgreSQL and row level security

2009-02-17 Thread Pavel Stehule
2009/2/17 Josh Berkus j...@agliodbs.com:
 All,

 I thought we'd agreed to compromise on having SE without row-level in 8.4,
 and working on SE with row-level in 8.5.  Why are we revisiting this
 argument?  8.4 is *already* late; arguing further about the terms of SE
 simply risks us being forced to reject it entirely.


I absolutely agree. It's nonsense to reopen a closed question again and again.

regards
Pavel

 --Josh



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Restore frozen xids for binary upgrades

2009-02-17 Thread Bruce Momjian
The attached patch adds to pg_dumpall --binary-upgrade by restoring
information about frozen xids for relations and databases.  I think this
is the last patch I need to complete my TODO items for the pg_migrator
binary upgrade utility.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/bin/pg_dump/pg_dump.c
===
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.523
diff -c -c -r1.523 pg_dump.c
*** src/bin/pg_dump/pg_dump.c	17 Feb 2009 22:32:54 -	1.523
--- src/bin/pg_dump/pg_dump.c	17 Feb 2009 22:34:08 -
***
*** 1585,1590 
--- 1585,1591 
  i_encoding,
  i_collate,
  i_ctype,
+ i_frozenxid,
  i_tablespace;
  	CatalogId	dbCatId;
  	DumpId		dbDumpId;
***
*** 1594,1599 
--- 1595,1601 
  			   *collate,
  			   *ctype,
  			   *tablespace;
+ 	uint32		frozenxid;
  
  	datname = PQdb(g_conn);
  
***
*** 1609,1615 
  		appendPQExpBuffer(dbQry, SELECT tableoid, oid, 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
! 		  datcollate, datctype, 
  		  (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, 
  	  shobj_description(oid, 'pg_database') AS description 
  
--- 1611,1617 
  		appendPQExpBuffer(dbQry, SELECT tableoid, oid, 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
! 		  datcollate, datctype, datfrozenxid, 
  		  (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, 
  	  shobj_description(oid, 'pg_database') AS description 
  
***
*** 1623,1629 
  		appendPQExpBuffer(dbQry, SELECT tableoid, oid, 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
! 		  NULL AS datcollate, NULL AS datctype, 
  		  (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, 
  	  shobj_description(oid, 'pg_database') AS description 
  
--- 1625,1631 
  		appendPQExpBuffer(dbQry, SELECT tableoid, oid, 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
! 		  NULL AS datcollate, NULL AS datctype, datfrozenxid, 
  		  (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, 
  	  shobj_description(oid, 'pg_database') AS description 
  
***
*** 1637,1643 
  		appendPQExpBuffer(dbQry, SELECT tableoid, oid, 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
! 		  NULL AS datcollate, NULL AS datctype, 
  		  (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace 
  		  FROM pg_database 
  		  WHERE datname = ,
--- 1639,1645 
  		appendPQExpBuffer(dbQry, SELECT tableoid, oid, 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
! 		  NULL AS datcollate, NULL AS datctype, datfrozenxid, 
  		  (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace 
  		  FROM pg_database 
  		  WHERE datname = ,
***
*** 1650,1655 
--- 1652,1658 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
  		  NULL AS datcollate, NULL AS datctype, 
+ 		  0 AS datfrozenxid, 
  		  NULL AS tablespace 
  		  FROM pg_database 
  		  WHERE datname = ,
***
*** 1664,1669 
--- 1667,1673 
  		  (%s datdba) AS dba, 
  		  pg_encoding_to_char(encoding) AS encoding, 
  		  NULL AS datcollate, NULL AS datctype, 
+ 		  0 AS datfrozenxid, 
  		  NULL AS tablespace 
  		  FROM pg_database 
  		  WHERE datname = ,
***
*** 1696,1701 
--- 1700,1706 
  	i_encoding = PQfnumber(res, encoding);
  	i_collate = PQfnumber(res, datcollate);
  	i_ctype = PQfnumber(res, datctype);
+ 	i_frozenxid = PQfnumber(res, datfrozenxid);
  	i_tablespace = PQfnumber(res, tablespace);
  
  	dbCatId.tableoid = atooid(PQgetvalue(res, 0, i_tableoid));
***
*** 1704,1709 
--- 1709,1715 
  	encoding = PQgetvalue(res, 0, i_encoding);
  	collate = PQgetvalue(res, 0, i_collate);
  	ctype = PQgetvalue(res, 0, i_ctype);
+ 	frozenxid = atooid(PQgetvalue(res, 0, i_frozenxid));
  	tablespace = PQgetvalue(res, 0, i_tablespace);
  
  	appendPQExpBuffer(creaQry, CREATE DATABASE %s WITH TEMPLATE = template0,
***
*** 1728,1733 
--- 1734,1748 
  		  fmtId(tablespace));
  	appendPQExpBuffer(creaQry, ;\n);
  
+ 	if (binary_upgrade)
+ 	{
+ 		appendPQExpBuffer(creaQry, \n-- For binary upgrade, set datfrozenxid.\n);
+ 		appendPQExpBuffer(creaQry, UPDATE pg_database\n
+ 			 SET datfrozenxid = '%u'\n
+ 			 WHERE	datname = '%s';\n,
+ 			 

Re: [HACKERS] Questions about parsing boolean and casting to anyelement

2009-02-17 Thread Tom Lane
I wrote:
 In the meantime I'm more convinced than ever that we should throw an
 error for attempting such a cast.  If people are imagining that it will
 do something like that, we need to disillusion them.

BTW, I wrote up what I thought was a trivial patch to make this happen,
and promptly got a regression test failure:

  CREATE TABLE enumtest_child (parent rainbow REFERENCES enumtest_parent);
  INSERT INTO enumtest_parent VALUES ('red');
  INSERT INTO enumtest_child VALUES ('red');
+ ERROR:  casting to a polymorphic type such as anyenum is meaningless
+ LINE 1: ... FROM ONLY public.enumtest_parent x WHERE id::pg_catal...
+  ^
+ QUERY:  SELECT 1 FROM ONLY public.enumtest_parent x WHERE 
id::pg_catalog.anyenum OPERATOR(pg_catalog.=) $1::pg_catalog.anyenum FOR 
SHARE OF x
  INSERT INTO enumtest_child VALUES ('blue');  -- fail

What is happening is that the code to generate RI check queries is
blindly casting to the declared input type of the operator it's
selected, which here is anyenum = anyenum.  We could easily prevent
it from doing that for polymorphic input types; but since I tripped over
this case almost immediately, I'm wondering what other cases might be
out there that would get broken by throwing this error.

Seeing that this type of confusion hasn't come up before, I think it
might be better to leave things alone here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] vacuumdb --freeze

2009-02-17 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I would like to add a --freeze parameter to vacuumdb for use by the
 binary upgrade utility, and for symmetry with the existing VACUUM
 options;  patch attached.

Exactly what do you think the upgrade utility is going to do with it?
Surely not a database-wide VACUUM FREEZE, if we are hoping that upgrade
is going to be fast.

As far as I can see this is a solution looking for a problem.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] vacuumdb --freeze

2009-02-17 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I would like to add a --freeze parameter to vacuumdb for use by the
  binary upgrade utility, and for symmetry with the existing VACUUM
  options;  patch attached.
 
 Exactly what do you think the upgrade utility is going to do with it?
 Surely not a database-wide VACUUM FREEZE, if we are hoping that upgrade
 is going to be fast.
 
 As far as I can see this is a solution looking for a problem.

I didn't go into the use-case.  The way pg_migrator works is to copy the
_schema_ from the old database and load it into the new database.  We
then need to run vacuum freeze on the schema-only databases because we
then move pg_clog from the old database to the new one; so, it is
needed, and it will not take long to run.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] regression test crashes at tsearch

2009-02-17 Thread Hiroshi Inoue
Hi,

I see a regression test failure in my mingw-vista port
when I invoke the command
  make check MULTIBYTE=euc_jp NO_LOCALE=yes
It causes a crash at tsearch.
The crash seems to occur when the server encoding isn't
UTF-8 and no locale is in use.
The attached is a patch to avoid the crash.

regards,
Hiroshi Inoue


Index: backend/utils/mb/mbutils.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/mb/mbutils.c,v
retrieving revision 1.78
diff -c -r1.78 mbutils.c
*** backend/utils/mb/mbutils.c  22 Jan 2009 10:09:48 -  1.78
--- backend/utils/mb/mbutils.c  17 Feb 2009 21:59:26 -
***
*** 575,580 
--- 575,584 
  wchar2char(char *to, const wchar_t *from, size_t tolen)
  {
size_t result;
+ #ifdef WIN32
+   int encoding = GetDatabaseEncoding();
+   bool useWcstombs = !(encoding == PG_UTF8 || lc_ctype_is_c());
+ #endif

if (tolen == 0)
return 0;
***
*** 584,602 
 * On Windows, the Unicode locales assume UTF16 not UTF8 encoding,
 * and for some reason mbstowcs and wcstombs won't do this for us,
 * so we use MultiByteToWideChar().
 */
!   if (GetDatabaseEncoding() == PG_UTF8)
{
!   result = WideCharToMultiByte(CP_UTF8, 0, from, -1, to, tolen,
NULL, NULL);
/* A zero return is failure */
!   if (result <= 0)
result = -1;
else
{
-   Assert(result <= tolen);
/* Microsoft counts the zero terminator in the result */
!   result--;
}
}
else
--- 588,624 
 * On Windows, the Unicode locales assume UTF16 not UTF8 encoding,
 * and for some reason mbstowcs and wcstombs won't do this for us,
 * so we use MultiByteToWideChar().
+* Also note wcstombs/mbstowcs is unavailable when LC_CTYPE is C.
 */
!   if (!useWcstombs)
{
!   int utf8len = tolen;
!   char *utf8str = to;
!   
!   if (encoding != PG_UTF8)
!   {
!   utf8len = pg_encoding_max_length(PG_UTF8) * tolen;
!   utf8str = palloc(utf8len + 1);
!   }
!   utf8len = WideCharToMultiByte(CP_UTF8, 0, from, -1, utf8str, 
utf8len,
NULL, NULL);
/* A zero return is failure */
!   if (utf8len <= 0)
result = -1;
else
{
/* Microsoft counts the zero terminator in the result */
!   result = utf8len - 1;
!   if (encoding != PG_UTF8)
!   {
!   char *mbstr = 
pg_do_encoding_conversion((unsigned char *) utf8str, result, PG_UTF8, encoding);
!   result = strlcpy(to, mbstr, tolen);
!   if (utf8str != to)
!   pfree(utf8str);
!   if (mbstr != utf8str)
!   pfree(mbstr);
!   }
!   Assert(result <= tolen);
}
}
else
***
*** 618,637 
  char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
  {
size_t  result;
  
if (tolen == 0)
return 0;
  
  #ifdef WIN32
!   /* See WIN32 Unicode comment above */
!   if (GetDatabaseEncoding() == PG_UTF8)
{
/* Win32 API does not work for zero-length input */
!   if (fromlen == 0)
result = 0;
else
{
!   result = MultiByteToWideChar(CP_UTF8, 0, from, fromlen, 
to, tolen - 1);
/* A zero return is failure */
if (result == 0)
result = -1;
--- 640,672 
  char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
  {
size_t  result;
+ #ifdef WIN32
+   int encoding = GetDatabaseEncoding();
+   bool useMbstowcs = !(encoding == PG_UTF8 || lc_ctype_is_c());
+ #endif
  
if (tolen == 0)
return 0;
  
  #ifdef WIN32
!   if (!useMbstowcs)
{
+   int utf8len = fromlen;
+   unsigned char *utf8str = (unsigned char *) from;
+   
+   if (encoding != PG_UTF8)
+   {
+   utf8str = pg_do_encoding_conversion(from, fromlen, 
encoding, PG_UTF8);
+   if (utf8str != from)
+   utf8len = 

Re: [HACKERS] SE-PostgreSQL and row level security

2009-02-17 Thread KaiGai Kohei

Pavel Stehule wrote:

2009/2/17 Josh Berkus j...@agliodbs.com:

All,

I thought we'd agreed to compromise on having SE without row-level in 8.4,
and working on SE with row-level in 8.5.  Why are we revisiting this
argument?  8.4 is *already* late; arguing further about the terms of SE
simply risks us being forced to reject it entirely.



I absolutely agree. It's nonsense to reopen a closed question again and again.


I also agree. What we should do now is to make progress on the proposed
feature for v8.4, not on an upcoming one.

BogDan,
As I noted before, I can understand your requirement, but we already
decided to postpone part of the originally proposed features, because
we are focusing our effort on delivering v8.4 on a reasonable schedule.
I'd like you to consider who wanted the row-level stuff merged most.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #4660: float functions return -0

2009-02-17 Thread Tom Lane
Brendan Jurd dire...@gmail.com writes:
 On Wed, Feb 18, 2009 at 2:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The point I'm trying to make is that we should deliver IEEE-compliant
 results if we are on a platform that complies with the spec.  Right down
 to the minus sign.  If that surprises people who are unfamiliar with the
 spec, well, there are a lot of things about floating point arithmetic
 that surprise people who aren't familiar with it.

 Agreed.  There are plenty of things about floats that are downright
 wonky, and when people start seeing minus zero in their float
 computations it might prompt them into doing some reading, and
 figuring out that what they really wanted was numeric.

I pulled the special code out of float8um/float4um and got the following
two changes in the regression tests:

*** src/test/regress/expected/numerology.outMon Aug  4 22:43:18 2008
--- src/test/regress/results/numerology.out Tue Feb 17 20:05:01 2009
***
*** 92,98 
ORDER BY two, max_float, min_float;
   two |  max_float   |   min_float   
  -+--+---
!1 | 1.2345678901234e+200 | 0
 2 |0 | -1.2345678901234e+200
  (2 rows)
  
--- 92,98 
ORDER BY two, max_float, min_float;
   two |  max_float   |   min_float   
  -+--+---
!1 | 1.2345678901234e+200 |-0
 2 |0 | -1.2345678901234e+200
  (2 rows)
  
***
*** 104,110 
ORDER BY two, max_float, min_float;
   two |  max_float   |   min_float   
  -+--+---
!1 | 1.2345678901234e+200 | 0
 2 |0 | -1.2345678901234e+200
  (2 rows)
  
--- 104,110 
ORDER BY two, max_float, min_float;
   two |  max_float   |   min_float   
  -+--+---
!1 | 1.2345678901234e+200 |-0
 2 |0 | -1.2345678901234e+200
  (2 rows)
  

==

This is on a minus-zero-clean platform of course (same results on Fedora
9 and current Mac OS X).  My HP box still produces the old results,
so we will need two variants of this expected-result file.  Other
platforms might show yet other diffs of course, but we'll have to wait
for buildfarm results to know more.

Last call for objections ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The science of optimization in practical terms?

2009-02-17 Thread decibel

On Feb 15, 2009, at 9:54 PM, Robert Haas wrote:
On Sun, Feb 15, 2009 at 1:16 PM, Greg Smith gsm...@gregsmith.com  
wrote:

On Fri, 13 Feb 2009, Robert Haas wrote:

This seems plausible, but I'm not totally sold: predicting the
contents of the operating system buffer cache sounds like it might be
pretty touch.  And do we even need to go that far?   I'm kind of
wondering whether we might be able to leverage the information that
the statistics collector already gathers for this purpose - in
particular, the information on blocks fetched and read.  That might
not exactly model the current contents of the buffer cache, but it's
certainly a measure of popularity, and that may be all we really need.
 We're not going to invalidate every plan in the system on every
buffer eviction, so plans have to be based not so much on what is in
the buffer cache right now but on what we have a reasonable
expectation of finding there in the typical case.

Consider, for example, the degenerate (but not necessarily uncommon)
case where the entire database can fit within shared_buffers, or
perhaps shared_buffers + OS cache.  ISTM we're going to want to plan
as if the entire database is in cache all the time, even though that
might not always be true - right after restart, for example.


The shared_buffers + OS cache example is a reason why simply  
examining shared_buffers isn't likely to work well; in that case it  
definitely would not reflect reality. Though, really in that case we  
should be able to simply look at eff_cache_size as well as the size  
of the database and understand everything should be in memory.


Actually, a simple algorithm that might work really well would be to  
calculate relation cache odds as ( number of page accesses for  
relation / number of page accesses for all relations ) * ( sum 
(relpages)*BLKSZ / eff_cache_size ), where number of page accesses  
would be both from relcache and not. One thing this doesn't address  
though is the report from a few months ago that accessing small  
tables is still faster with an index scan, even if we know the whole  
thing is in cache (I don't remember if that was ever resolved...)


Another idea would be to look at an efficient way to measure how long  
it actually takes to pull data from the OS. This has been suggested  
in the past, but the idea there was to measure every block access,  
and the concern was the overhead of the timing calls. But what if we  
sampled instead? Or, what if we read multiple blocks at one time in  
the cases where we knew we'd need to (seqscan and an index scan  
needing more than one tuple). Another option would by an async IO  
process that is responsible for measuring this stuff; I believe some  
people have played with async IO and gotten good results.


Of course, on dtrace platforms we could just plug into dtrace...


You might also run into
problems with relations that have hot segments that are accessed
frequently and stay cached, and cold segments that are never
touched: if 20% of the relation is in cache, but that's the only 20%
of the relation we ever access, then our hit rate will be 100% rather
than 20%.


Yes, but that would be accurate :)

In reality, I think we need to re-visit the idea of evaluating how  
close a chosen query plan is matching reality as we're running. If we  
thought we'd be seeing a 100% hit rate but in reality it's much lower  
we could re-plan (of course this probably only makes sense for  
queries that take many seconds).



But even a primitive algorithm would probably be a lot better than
what we have now. I'm guessing that there are a lot of databases where
either the whole database fits in cache, or a decent chunk of
relatively small core relations fit in cache and then there are some
big or infrequently-used ones that don't.


--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The science of optimization in practical terms?

2009-02-17 Thread Robert Haas
 Actually, a simple algorithm that might work really well would be to
 calculate relation cache odds as ( number of page accesses for relation /
 number of page accesses for all relations ) * ( sum(relpages)*BLKSZ /
 eff_cache_size ), where number of page accesses would be both from relcache
 and not.

I don't think that formula makes any sense.  If effective_cache_size
is in the denominator, then increasing it will make the odds of
finding the page in cache go down.

 One thing this doesn't address though is the report from a few
 months ago that accessing small tables is still faster with an index scan,
 even if we know the whole thing is in cache (I don't remember if that was
 ever resolved...)

I'm not sure if this is what you're referring to, but there was a
relatively recent post on, I believe, -performance, where a bitmap
index scan that hit almost the entire table beat out a seqscan.  I
don't think there was any further discussion and I'm still mystified
as to how it's possible.

 Another idea would be to look at an efficient way to measure how long it
 actually takes to pull data from the OS. This has been suggested in the
 past, but the idea there was to measure every block access, and the concern
 was the overhead of the timing calls. But what if we sampled instead? Or,
 what if we read multiple blocks at one time in the cases where we knew we'd
 need to (seqscan and an index scan needing more than one tuple). Another
 option would by an async IO process that is responsible for measuring this
 stuff; I believe some people have played with async IO and gotten good
 results.

 Of course, on dtrace platforms we could just plug into dtrace...

 You might also run into
 problems with relations that have hot segments that are accessed
 frequently and stay cached, and cold segments that are never
 touched: if 20% of the relation is in cache, but that's the only 20%
 of the relation we ever access, then our hit rate will be 100% rather
 than 20%.

 Yes, but that would be accurate :)

No, we'd predict the hit rate to be 20%, but the real hit rate would be 100%.

 In reality, I think we need to re-visit the idea of evaluating how close a
 chosen query plan is matching reality as we're running. If we thought we'd
 be seeing a 100% hit rate but in reality it's much lower we could re-plan
 (of course this probably only makes sense for queries that take many
 seconds).

I don't think it's going to be very practical to re-plan the query in
its entirety, because then you'd have to somehow undo all of the work
you'd done thus far (including side effects, if any), which might not
be possible and certainly isn't easy.  What might be practical is to
bail out of a nested loop that turns out to iterate more times than
expected and hash the inner rel, or even sort the remaining portion of
the outer rel and the entire inner rel and then merge-join them.  The
problem is that these sorts of changes can alter the order in which
results are generated, and if the parent node is something like a
merge-join that needs the results to be ordered in a particular way,
then you've got a problem.  Furthermore, it's not easy to decide when
to switch strategies.  If you decide to switch to a hash join just
prior to what would have been the last iteration of the nested loop,
you lose.

I'm interested to know whether anyone else shares my belief that
nested loops are the cause of most really bad plans.  What usually
happens to me is that the planner develops some unwarranted optimism
about the number of rows likely to be generated by the outer side of
the join and decides that it's not worth sorting the inner side or
building a hash table or using an index, and that the right thing to
do is just rescan the inner node on every pass.  When the outer side
returns three or four orders of magnitude more results than expected,
ka-pow!

Another approach to this problem might be to try to make the planner a
little more cautious about choosing nested loops in the first place.
Given a column a with the values 1 .. 10^6, the planner estimates the
number of rows for a = X as 1, a in (X1..Xn) as n, a not in (X1..Xn)
AS 10^6-n, and a < X for all X < 100 as 100.  These are all pretty
reasonable estimates (one could hope to get a different result for a <
5 than a < 100).  But as soon as you use some operation that the
planner knows nothing about, the guesses get really bad:

CREATE TABLE big (id serial, x text);
INSERT INTO big (x) SELECT random() FROM generate_series(1,100);
ANALYZE;
EXPLAIN SELECT * FROM big WHERE id % 2 = 0 AND (id + 0) % 2 = 0 AND
(id - 0) % 2 = 0;

  QUERY PLAN
--
 Seq Scan on big  (cost=0.00..36375.00 rows=1 width=22)
   Filter: (((id % 2) = 0) AND (((id + 0) % 2) = 0) AND (((id - 0) % 2) = 0))

The fact that the selectivity of an unknown expression is arbitrarily
set to 0.005 is a defensible choice that doesn't happen to 

Re: [Pljava-dev] [HACKERS] Re: Should creating a new base type require superuser status?

2009-02-17 Thread Kris Jurka



On Sat, 2 Aug 2008, Tom Lane wrote:


So what exactly does happen when the user deliberately specifies wrong
typlen/typbyval/typalign info when creating a type based on PL/Java
functions?


I have reviewed pljava's handling of misrepresented alignment, length,
and by-value parameters:


1) Alignment: pljava reads and writes data a byte at a time, so all types 
effectively have char alignment.  Reading an integer will read 
four bytes out of memory and then put those together.  Therefore the 
alignment cannot be misspecified.


2) Length: For fixed length types, pljava correctly detects trying to 
read or write too much data and not supplying enough data on write. 
Pljava does not correctly handle variable length types.  It should be 
setting and reading the length header itself rather than leaving that up 
to the user, but it is not.


3) By value: pljava does not handle passed-by-value types correctly, 
allowing access to random memory.


So yes, pljava has a security problem, but I still object to the statement 
that no PL can do this securely.  I will work on fixing pljava, but I 
request the change for superuser requirement for type creation be 
reverted.  The fact that no PL currently does it correctly is not a reason 
to prohibit a PL from doing it correctly.


Kris Jurka

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Pljava-dev] [HACKERS] Re: Should creating a new base type require superuser status?

2009-02-17 Thread Thomas Hallgren

Kris Jurka wrote:


3) By value: pljava does not handle passed-by-value types correctly, 
allowing access to random memory.


This is simply not true. There's no way a Java developer can access 
random memory through PL/Java.


- thomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The science of optimization in practical terms?

2009-02-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I'm interested to know whether anyone else shares my belief that
 nested loops are the cause of most really bad plans.  What usually
 happens to me is that the planner develops some unwarranted optimism
 about the number of rows likely to be generated by the outer side of
 the join and decides that it's not worth sorting the inner side or
 building a hash table or using an index, and that the right thing to
 do is just rescan the inner node on every pass.  When the outer side
 returns three or four orders of magnitude more results than expected,
 ka-pow!

And then there is the other half of the world, who complain because it
*didn't* pick a nestloop for some query that would have run in much less
time if it had.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Pljava-dev] [HACKERS] Re: Should creating a new base type require superuser status?

2009-02-17 Thread Kris Jurka

Thomas Hallgren wrote:

Kris Jurka wrote:


3) By value: pljava does not handle passed-by-value types correctly, 
allowing access to random memory.


This is simply not true. There's no way a Java developer can access 
random memory through PL/Java.


No, the point is that the Java developer can provide some data which can 
convince postgresql to fetch random data for the user.


Consider the attached type which is simply an int4 equivalent. 
Depending on how you define it as passed by value or passed by reference 
it will or will not work (attached).


This looks like it works:


jurka=# select '1'::intbyref, '2'::intbyval;
 intbyref | intbyval
----------+----------
 1| 2
(1 row)

But it doesn't really:

jurka=# create table inttest (a intbyref, b intbyval);
CREATE TABLE
jurka=# insert into inttest values ('1', '2');
INSERT 0 1
jurka=# select * from inttest;
 a | b
---+------------
 1 | 2139062143
(1 row)

You can also get:

jurka=# select * from inttest;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.


Kris Jurka
package types;

import java.io.IOException;
import java.sql.SQLData;
import java.sql.SQLException;
import java.sql.SQLInput;
import java.sql.SQLOutput;
import java.util.logging.Logger;

public class Int implements SQLData
{
	private static Logger s_logger = Logger.getAnonymousLogger();

	private int m_i;
	private String m_typeName;

	public static Int parse(String input, String typeName) throws SQLException
	{
		try
		{
			int i = Integer.parseInt(input);
			return new Int(i, typeName);
		}
		catch(NumberFormatException e)
		{
			throw new SQLException(e.getMessage());
		}
	}

	public Int()
	{
	}

	public Int(int i, String typeName)
	{
		m_i = i;
		m_typeName = typeName;
	}

	public String getSQLTypeName()
	{
		return m_typeName;
	}

	public void readSQL(SQLInput stream, String typeName) throws SQLException
	{
		s_logger.info(typeName + " from SQLInput");
		m_i = stream.readInt();
		m_typeName = typeName;
	}

	public void writeSQL(SQLOutput stream) throws SQLException
	{
		s_logger.info(m_typeName + " to SQLOutput");
		stream.writeInt(m_i);
	}

	public String toString()
	{
		s_logger.info(m_typeName + " toString");
		return Integer.toString(m_i);
	}

}
CREATE TYPE intbyval;

CREATE FUNCTION intbyval_in(cstring)
	RETURNS intbyval
	AS 'UDT[types.Int] input'
	LANGUAGE java IMMUTABLE STRICT;

CREATE FUNCTION intbyval_out(intbyval)
	RETURNS cstring
	AS 'UDT[types.Int] output'
	LANGUAGE java IMMUTABLE STRICT;

CREATE FUNCTION intbyval_recv(internal)
	RETURNS intbyval
	AS 'UDT[types.Int] receive'
	LANGUAGE java IMMUTABLE STRICT;

CREATE FUNCTION intbyval_send(intbyval)
	RETURNS bytea
	AS 'UDT[types.Int] send'
	LANGUAGE java IMMUTABLE STRICT;

CREATE TYPE intbyval (
	internallength = 4,
	input = intbyval_in,
	output = intbyval_out,
	receive = intbyval_recv,
	send = intbyval_send,
	passedbyvalue
	);

CREATE TYPE intbyref;

CREATE FUNCTION intbyref_in(cstring)
	RETURNS intbyref
	AS 'UDT[types.Int] input'
	LANGUAGE java IMMUTABLE STRICT;

CREATE FUNCTION intbyref_out(intbyref)
	RETURNS cstring
	AS 'UDT[types.Int] output'
	LANGUAGE java IMMUTABLE STRICT;

CREATE FUNCTION intbyref_recv(internal)
	RETURNS intbyref
	AS 'UDT[types.Int] receive'
	LANGUAGE java IMMUTABLE STRICT;

CREATE FUNCTION intbyref_send(intbyref)
	RETURNS bytea
	AS 'UDT[types.Int] send'
	LANGUAGE java IMMUTABLE STRICT;

CREATE TYPE intbyref (
	internallength = 4,
	input = intbyref_in,
	output = intbyref_out,
	receive = intbyref_recv,
	send = intbyref_send
	);


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers