date:20100928

[HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

Hello

I looked on some constructs that helps with iteration over array in
plpgsql. I propose a following syntax:

FOR var IN [array variable | array expression]
LOOP
  ..
END LOOP

var - declared variable - theoretically we can a detect var type from
array type, but it needs a early expression an analyze (not used in
PL/pgSQL), so var should be declared before. This construct ensure
iteration over all items of array. When somebody needs a subscripts
from some dimension, then he can use a proposed function subscripts.

so iteration over two dimensional array can be written:

DECLARE
  i integer;
  j integer;
BEGIN
  FOR i IN subscripts(myarray, 1) LOOP
FOR j IN subscripts(myarray, 2) LOOP
  RAISE NOTICE 'myarray[%,%] = %', i, j, myarray[i,j];
END LOOP;
  END LOOP;

When input array is multidimensional, then this array is flattened -
order of iteration is specified by physical store of items inside the
array. This construct iterate over all items of input array - it has a
same behave as unnest function.

some examples:

DECLARE
  v integer;
  a int[] := ARRAY[2,3.4,5];
BEGIN
  FOR val IN a
  LOOP
RAISE NOTICE '%', val; -- produce 2,3,4,5
  END LOOP;

  FOR val IN subscripts(a, 1)
  LOOP
RAISE NOTICE '%', val; -- produce 1,2,3,4
  END LOOP;

  FOR val IN subscripts(a,1)
  LOOP
RAISE NOTICE '%', a[val]; -- produce 2,3,4,5
  END LOOP;
END;

Comments, ideas?

Regards

Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Help with User-defined function in PostgreSQL with Visual C++

2010-09-28 Thread Tom Lane

Itagaki Takahiro itagaki.takah...@gmail.com writes:
 On Tue, Sep 28, 2010 at 12:12 PM, Robert Haas robertmh...@gmail.com wrote:
 Oh - I didn't realize this meant marking lots of things in contrib
 that didn't otherwise need to be marked. Â Why do other people need
 this if we don't?

 As I mentioned, we don't need the marks in our build environment at all.

In that case, anybody who does need it should fix their build
environment.

I grow really weary of the idea that we should submit to arbitrary
amounts of uglification of our source code so that it will deal with
this week's Windows oddness.  The Windows folk need to be willing to
do a bit of work on their end.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Parallel Query Execution Project

2010-09-28 Thread Li Jie


Hi all,

I'm interested in this parallel project, 
http://wiki.postgresql.org/wiki/Parallel_Query_Execution


But I can't find any discussion and current progress in the website, it 
seems to stop for nearly a year?


Thanks,
Li Jie

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Help with User-defined function in PostgreSQL with Visual C++

2010-09-28 Thread Itagaki Takahiro

On Tue, Sep 28, 2010 at 3:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 As I mentioned, we don't need the marks in our build environment at all.

 In that case, anybody who does need it should fix their build
 environment.

 I grow really weary of the idea that we should submit to arbitrary
 amounts of uglification of our source code so that it will deal with
 this week's Windows oddness.  The Windows folk need to be willing to
 do a bit of work on their end.

Windows has 3 levels of function visibilities in a DLL:
  1. exported from the DLL
  2. module global, but not exported
  3. static (private in file), of course not exported

On UNIXes, 2 is always 1. So we don't have to distinguish between
global and exported functions. But standard DLL projects on Windows
require marking which functions should be exported.

I'm not sure why we can build modules without any EXPORT marks,
though we can do it actually... It is very unnatural on Windows.


If we want to avoid adding PGDLLEXPORTs, we could have sample MSVC
project with proper settings in tutorial or documentation instead.
It should be opened with VC++ GUI (not from command line!) and can
be generate DLLs in the same way we're using to build the core.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Shigeru HANADA

On Mon, 27 Sep 2010 21:07:33 -0400
Robert Haas robertmh...@gmail.com wrote:
 I found and fixed a few more issues and committed this.  The pg_dump
 support had a few escaping bugs, and I added tab completion support
 for psql.  Considering the size of the patch, it seems likely that
 there are some issues we both overlooked, but this is as solid as I
 can make it for right now.
Some OIDs used in SECURITY LABEL patch have already been used for
some functions such as pg_stat_get_xact_numscans().

The src/include/catalog/duplicate_oids script reports that 3037 ~
3040 are used two or more times.

Though all regression tests finish successfully, should this be
fixed ?

regards,
--
Shigeru Hanada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Query Execution Project

2010-09-28 Thread Markus Wanner

Hi,

On 09/28/2010 07:24 AM, Li Jie wrote:
 I'm interested in this parallel project,
 http://wiki.postgresql.org/wiki/Parallel_Query_Execution
 
 But I can't find any discussion and current progress in the website, it
 seems to stop for nearly a year?

Yeah, I don't know of anybody really working on it ATM.

If you are interested in a process based design, please have a look at
the bgworker infrastructure stuff. It could be of help for a
process-based implementation.

Regards

Markus Wanner

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Using streaming replication as log archiving

2010-09-28 Thread Magnus Hagander

On Tue, Sep 28, 2010 at 06:25, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Sep 27, 2010 at 9:07 PM, Magnus Hagander mag...@hagander.net wrote:
 As has been previously mentioned a couple of times, it should be
 perfectly possible to use streaming replication to get around the
 limitations of archive_command/archive_timeout to do log archiving for
 PITR (being that you either keep archive_timeout high and risk data
 loss or you set it very low and generate a huge log archive without
 need).

 I've put together a tool to do this. The basic idea is to just stream
 down replication and write it to regular WAL files, which can then be
 used for recovery. You'll still need to use archive_command together
 with it to ensure that the backups are complete. Streaming replication
 doesn't guarantee that - in fact, regular replication will fallback to
 using whatever archive_command created when wal_keep_segments isn't
 enough.

 I've put up an early version of the tool at
 http://github.com/mhagander/pg_streamrecv

 Great! This also might be useful for users who want something like
 Oracle redo log mirroring.

Thanks.

 Comments and contributions are most welcome. And frankly, a good
 review is very much required before I'd trust it ;) Hopefully, I
 didn't overlook something critical :D

 When I ran that, the size of the WAL file in inprogress directory
 became more than 16MB. Obviously something isn't right.

Wow, that's weird. I'm unable to reproduce that here - can you try to
figure out why that happened?


 When I requested immediate shutdown to the master, segmentation
 fault occurred in pg_streamrecv. I guess that the return value 0
 of PQgetCopyData would not be handled correctly.

Almost right - it actually returns -2 - which isn't handled. I've
added a fix for that - and while att it, covering anything that's so
small it doesn't contain the streaming replication header.


 After I repeated Ctrl+C and start of pg_streamrecv some times,
 I encountered the following error and pg_streamrecv was never up.
 Is this intentional?

    In progress directory contains more than one file!

    $ ls foo/inprogress/
    0001000D  0001000D.save

Yes, that is actually intentional.

When it finds the ..0D file there the first time, it gets renamed to
.save, and it retries the transmission from the beginning of that
segment. as soon as the retransmission has passed the point that 0D
was at, the .save file is removed. If you Ctrl-C the process again
*before* it has reached that point, it will leave both files around -
it's up to you to clean them up. This is to make sure we don't
overwrite a file that contains more log data than is currently
available on the master.


 When there is inprogress or archived WAL file, pg_streamrecv should
 not execute pg_current_xlog_location because that result is not used?

Yeah, that's just a bit of lazy programming that I should fix :-)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Help with User-defined function in PostgreSQL with Visual C++

2010-09-28 Thread Craig Ringer

On 28/09/10 11:09, Itagaki Takahiro wrote:
 On Tue, Sep 28, 2010 at 9:51 AM, Robert Haas robertmh...@gmail.com wrote:
 Since we have PGDLLEXPORT in 9.0, we can mark some of exported
 functions with it in tutorial codes and maybe contrib modules.

 If that (a) works and (b) reduces user confusion, +1 from me.  We've
 gotten this question a few times lately.
 
 If we do so, many PGDLLEXPORT will be added:
   *  17 in src/tutorial
   * 507 in contrib
 for each exported PGFunction, _PG_init, and _PG_fini.
 
 Any objections? Am I missing something?

For what it's worth, these macros are useful for more than Windows.

They can be used with gcc's visibility features to reduce the size of
the symbol table and therefore speed linking and backend startup. This
isn't as important in a C-only program as it is in a C++ program (which
has huge collections of private symbols) but it still won't hurt. If
building with gcc4 on a non-Windows platform, the PGDLLEXPORT macro can
be defined as:

  #define PGDLLEXPORT __attribute__ ((visibility(default)))

and gcc can be invoked with -fvisibility=hidden to make symbols not
explicitly exported hidden by default. A handy advantage of this is that
it'll cause code that would compile and run correctly on *nix and fail
on Windows to also fail on *nix, making it easier for *nix based
developers (ie sane, happy people) to catch issues that'll break Windows.

Such macros also serve as documentation markers indicating public API.
They're ugly, but they're not purely Windows ugly.

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Help with User-defined function in PostgreSQL with Visual C++

2010-09-28 Thread Magnus Hagander

On Tue, Sep 28, 2010 at 09:26, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Tue, Sep 28, 2010 at 3:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 As I mentioned, we don't need the marks in our build environment at all.

 In that case, anybody who does need it should fix their build
 environment.

 I grow really weary of the idea that we should submit to arbitrary
 amounts of uglification of our source code so that it will deal with
 this week's Windows oddness.  The Windows folk need to be willing to
 do a bit of work on their end.

 Windows has 3 levels of function visibilities in a DLL:
  1. exported from the DLL
  2. module global, but not exported
  3. static (private in file), of course not exported

 On UNIXes, 2 is always 1. So we don't have to distinguish between
 global and exported functions. But standard DLL projects on Windows
 require marking which functions should be exported.

 I'm not sure why we can build modules without any EXPORT marks,
 though we can do it actually... It is very unnatural on Windows.


 If we want to avoid adding PGDLLEXPORTs, we could have sample MSVC
 project with proper settings in tutorial or documentation instead.
 It should be opened with VC++ GUI (not from command line!) and can
 be generate DLLs in the same way we're using to build the core.

We're talking about the export all symbols thing, right? I *don't*
think we want to recommend people to do that - it creates bloated DLL
files, for no really good reason. Also, it's not just a matter of a
msvc project - we do that with a perl hack
(http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/tools/msvc/gendef.pl;h=b8538dd79b8baf21ede87b2ec1aba0276fd3b3d9;hb=62b6aaa40b2abb26edf18d1cd00dffcac090f67a).
It's not a good way.

We might, however, want to add a specific section to the
*documentation* about building extensions on Windows. We have section
35.9.6 which lists a bunch of OSes, where windows is clearly missing.
But perhaps a complete section of it's own, like the one for pgxs in
35.9.6, would be even better?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Itagaki Takahiro

On Tue, Sep 28, 2010 at 3:24 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I looked on some constructs that helps with iteration over array in
 plpgsql. I propose a following syntax:

 FOR var IN [array variable | array expression]

What is the benefits compared with
FOR ... IN SELECT unnest(array) or generate_subscripts(array) ?

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Itagaki Takahiro itagaki.takah...@gmail.com:
 On Tue, Sep 28, 2010 at 3:24 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I looked on some constructs that helps with iteration over array in
 plpgsql. I propose a following syntax:

 FOR var IN [array variable | array expression]

 What is the benefits compared with
 FOR ... IN SELECT unnest(array) or generate_subscripts(array) ?


the speed

SELECT unnest() is full query, but array_expression is just simple
query and can be evaluated by
exec_eval_simple_expr - it can be significantly times faster.
CREATE OR REPLACE FUNCTION f1()
RETURNS void AS $$
DECLARE a int[] := ARRAY[1,2,3,4];
s int;
BEGIN
  FOR i IN 1..1 LOOP
s := 0;
FOR j IN array_lower(a,1)..array_upper(a,1)
LOOP
  s := s + a[j];
END LOOP;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

take about 255ms

CREATE OR REPLACE FUNCTION f1()
RETURNS void AS $$
DECLARE a int[] := ARRAY[1,2,3,4]; j int;
s int;
BEGIN
  FOR i IN 1..1 LOOP
s := 0;
FOR j IN SELECT unnest(a)
LOOP
  s := s + j;
END LOOP;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

it takes abou 1000ms

Regards

Pavel Stehule


 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Query Execution Project

2010-09-28 Thread Hans-Jürgen Schönig

On Sep 28, 2010, at 10:15 AM, Markus Wanner wrote:

 Hi,
 
 On 09/28/2010 07:24 AM, Li Jie wrote:
 I'm interested in this parallel project,
 http://wiki.postgresql.org/wiki/Parallel_Query_Execution
 
 But I can't find any discussion and current progress in the website, it
 seems to stop for nearly a year?
 
 Yeah, I don't know of anybody really working on it ATM.
 
 If you are interested in a process based design, please have a look at
 the bgworker infrastructure stuff. It could be of help for a
 process-based implementation.
 
 Regards
 
 Markus Wanner



yes, i don't know of anybody either.
in addition to that it is more than a giant task. it means working on more than 
just one isolated part.
practically i cannot think of any stage of query execution which would not need 
some changes.
i don't see a feature like that within a realistic timeframe.

regards,

hans

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Still more tweaking of git_changelog.

2010-09-28 Thread Dimitri Fontaine

Dimitri Fontaine dfonta...@hi-media.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 Author: Tom Lane t...@sss.pgh.pa.us
 Branch: master Release: REL8_1 [872c1497f] 2005-05-24 18:02:31 +
 Branch: REL8_0_STABLE Release: REL8_0_4 [a94ace079] 2005-05-24 18:02:55 +
 Branch: REL7_4_STABLE Release: REL7_4_9 [0a7b3a364] 2005-05-24 18:03:24 +

   http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=872c1497f
   
 http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=40608e7f949fb7e4025c0ddd5be01939adc79eec

 Can we do something about the gitweb interface to include such a feature?

Please?
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 3:57 AM, Shigeru HANADA
han...@metrosystems.co.jp wrote:
 On Mon, 27 Sep 2010 21:07:33 -0400
 Robert Haas robertmh...@gmail.com wrote:
 I found and fixed a few more issues and committed this.  The pg_dump
 support had a few escaping bugs, and I added tab completion support
 for psql.  Considering the size of the patch, it seems likely that
 there are some issues we both overlooked, but this is as solid as I
 can make it for right now.
 Some OIDs used in SECURITY LABEL patch have already been used for
 some functions such as pg_stat_get_xact_numscans().

 The src/include/catalog/duplicate_oids script reports that 3037 ~
 3040 are used two or more times.

 Though all regression tests finish successfully, should this be
 fixed ?

Woops.  Thanks for the report, fixed.  I wish we had a regression test
that would catch these mistakes.  It's easy to forget to run this
script.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Magnus Hagander

On Tue, Sep 28, 2010 at 13:50, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Sep 28, 2010 at 3:57 AM, Shigeru HANADA
 han...@metrosystems.co.jp wrote:
 On Mon, 27 Sep 2010 21:07:33 -0400
 Robert Haas robertmh...@gmail.com wrote:
 I found and fixed a few more issues and committed this.  The pg_dump
 support had a few escaping bugs, and I added tab completion support
 for psql.  Considering the size of the patch, it seems likely that
 there are some issues we both overlooked, but this is as solid as I
 can make it for right now.
 Some OIDs used in SECURITY LABEL patch have already been used for
 some functions such as pg_stat_get_xact_numscans().

 The src/include/catalog/duplicate_oids script reports that 3037 ~
 3040 are used two or more times.

 Though all regression tests finish successfully, should this be
 fixed ?

 Woops.  Thanks for the report, fixed.  I wish we had a regression test
 that would catch these mistakes.  It's easy to forget to run this
 script.

Could we run the script as part of the regression tests? :-)

Or if not, could we have the buildfarm run it?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Peter Eisentraut

On tis, 2010-09-28 at 13:53 +0200, Magnus Hagander wrote:
 On Tue, Sep 28, 2010 at 13:50, Robert Haas robertmh...@gmail.com wrote:
  On Tue, Sep 28, 2010 at 3:57 AM, Shigeru HANADA
  The src/include/catalog/duplicate_oids script reports that 3037 ~
  3040 are used two or more times.
 
  Though all regression tests finish successfully, should this be
  fixed ?
 
  Woops.  Thanks for the report, fixed.  I wish we had a regression test
  that would catch these mistakes.  It's easy to forget to run this
  script.
 
 Could we run the script as part of the regression tests? :-)
 
 Or if not, could we have the buildfarm run it?

I think it should actually be run as part of the regular build.  Ever
since I started using git and developing like 10 features at once, and
other people doing the same, the chances of (hidden) OID conflicts is
growing immensely.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Peter Geoghegan

 2010/9/28 Itagaki Takahiro itagaki.takah...@gmail.com:
 On Tue, Sep 28, 2010 at 3:24 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I looked on some constructs that helps with iteration over array in
 plpgsql. I propose a following syntax:

 FOR var IN [array variable | array expression]

 What is the benefits compared with
 FOR ... IN SELECT unnest(array) or generate_subscripts(array) ?


 the speed


Not to mention that it's far more aesthetically pleasing, and makes
the statement immediately understandable to people unfamiliar with the
plpgsql idioms you describe.

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 8:03 AM, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-09-28 at 13:53 +0200, Magnus Hagander wrote:
 On Tue, Sep 28, 2010 at 13:50, Robert Haas robertmh...@gmail.com wrote:
  On Tue, Sep 28, 2010 at 3:57 AM, Shigeru HANADA
  The src/include/catalog/duplicate_oids script reports that 3037 ~
  3040 are used two or more times.
 
  Though all regression tests finish successfully, should this be
  fixed ?
 
  Woops.  Thanks for the report, fixed.  I wish we had a regression test
  that would catch these mistakes.  It's easy to forget to run this
  script.

 Could we run the script as part of the regression tests? :-)

 Or if not, could we have the buildfarm run it?

 I think it should actually be run as part of the regular build.  Ever
 since I started using git and developing like 10 features at once, and
 other people doing the same, the chances of (hidden) OID conflicts is
 growing immensely.

Injecting nonessential checks into the build process doesn't seem like
a good idea to me.  Typing 'make' should just do the build.  If you
want to check for breakage, well, that's what 'make check' is for.

Another angle on this problem is that, at least AFAICT, the duplicate
OIDs are completely harmless so long as they are in different
catalogs.  And if they are in the same catalog, then initdb will fail
(and shame on you if you don't notice that).  Long, long ago
pg_description was indexed just by object-OID, so duplicates would be
a problem, but that hasn't been the case since 2001, and I'm not aware
of anything else that relies on OIDs being globally unique either.  So
maybe we should decide that we just don't care about this any more.
It seems a little silly since we're in no danger of running out of
OIDs any time soon, but if there's no practical reason to do it...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] small fix to possible null pointer dereference in byteaout() varlena.c

2010-09-28 Thread Grzegorz Jaśkiewicz

It would crash if input is of unrecognized format. Probably than
there's going to be more problems to be concerned with, but just in
case, don't crash in

-- 
GJ
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 363fd3c..48ee55d 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -351,6 +351,7 @@ byteaout(PG_FUNCTION_ARGS)
 			else
 *rp++ = *vp;
 		}
+*rp = '\0';
 	}
 	else
 	{
@@ -358,7 +359,7 @@ byteaout(PG_FUNCTION_ARGS)
 			 bytea_output);
 		rp = result = NULL;		/* keep compiler quiet */
 	}
-	*rp = '\0';
+
 	PG_RETURN_CSTRING(result);
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery.conf location

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 12:46 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Sep 27, 2010 at 5:02 PM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Sep 27, 2010 at 08:34, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Sep 27, 2010 at 9:35 AM, Jaime Casanova ja...@2ndquadrant.com 
 wrote:
 Maybe i'm missing something but this would be a problem if we put a
 trigger file and the recovery.conf gets renamed to recovery.done, no?
 at least that would be a problem for the standbys that still need to
 be standbys

 That's not problem unless more than one standbys become master at the
 same time. Since recovery.conf is read by standby only at the start unless
 it's triggered, already-started standbys can work even if recovery.conf is
 renamed to recovery.done. Though it's somewhat tricky..

 Uh, what if the slave is restarted for one reason or another? That
 seems like it would be really fragile..

 Agreed ;)

 So, even if we move primary_conninfo and trigger_file to postgresql.conf,
 we would need to still leave standby_mode in recovery.conf.

The idea of relying on the existence of recovery.conf to determine
whether we should continue recovery forever or switch to normal
running seems somewhat klunky to me.  It mixes up settings with
control information.  Maybe the control information should move to
pg_control, and the settings to postgresql.conf.  *waves hands*

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Help with User-defined function in PostgreSQL with Visual C++

2010-09-28 Thread Euler Taveira de Oliveira

Magnus Hagander escreveu:
 We might, however, want to add a specific section to the
 *documentation* about building extensions on Windows.
 
+1.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: SQL/MED(FDW) DDL

2010-09-28 Thread Robert Haas

On Mon, Sep 27, 2010 at 2:50 AM, SAKAMOTO Masahiko
sakamoto.masah...@oss.ntt.co.jp wrote:
  http://wiki.postgresql.org/wiki/SQL/MED

With regard to what is written here, it strikes me that it would be an
extremely bad idea to try to mix reloptions or attoptions with
fdwoptions.  fdwoptions are options to be passed transparently to the
fdw to handle as it likes; rel/attoptions affect the behavior of PG.

I think the section about WHERE clause push-down is way off base.
First, it seems totally wrong to assume that the same functions and
operators will be defined on the remote side as you have locally;
indeed, for CSV files, you won't have anything defined on the remote
side at all.  You need some kind of a discovery mechanism here to
figure out which quals are push-downable.  And it should probably be
something generic, not a bunch of hard-wired rules that may or may not
be correct in any particular case.  What if the remote side is a
competing database product that doesn't understand X = ANY(Y)?
Second, even if a functions or operators does exist on both sides of
the link, how do you know whether they have compatible semantics?
Short of solving the entscheidungsproblem, you're not going to be able
to determine that algorithmically, so you need some kind of mechanism
for controlling what assumptions get made.  Otherwise, you'll end up
with queries that don't work and no way for the user to fix it.

It seems to me that the API should allow PG to ask the FDW questions like this:

- How many tuples are there on the remote side?
- Here is a qual.  Are you able to evaluate this qual remotely?
- What are the startup and total costs of a sequential scan of the
remote side with the following set of remotely executable quals?
- Are there any indices available on the remote side, and if so what
are there names and which columns do they index in which order
(asc/desc, nulls first/last)?
- What are the startup and total costs of an index scan of the remote
side using the index called $NAME given the following set of remotely
executable quals?

and, as you mentIon:

- Please update pg_statistic for this foreign table, if you have that
capability.

Then:

- Begin a sequential scan with the following set of quals.
- Begin an index scan using the index called X with the following set of quals.
- Fetch next tuple.
- End scan.

Maybe that's too much for a first version but if we're not going to
deal with the problems in a general way, then we ought to not deal
with them at all, rather than having hacky rules that will work if
your environment is set up in exactly the way the code expects and
otherwise break horribly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: SQL/MED(FDW) DDL

2010-09-28 Thread Robert Haas

On Mon, Sep 27, 2010 at 2:50 AM, SAKAMOTO Masahiko
sakamoto.masah...@oss.ntt.co.jp wrote:
  http://wiki.postgresql.org/wiki/SQL/MED

With regard to what is written here, it strikes me that it would be an
extremely bad idea to try to mix reloptions or attoptions with
fdwoptions.  fdwoptions are options to be passed transparently to the
fdw to handle as it likes; rel/attoptions affect the behavior of PG.

I think the section about WHERE clause push-down is way off base.
First, it seems totally wrong to assume that the same functions and
operators will be defined on the remote side as you have locally;
indeed, for CSV files, you won't have anything defined on the remote
side at all.  You need some kind of a discovery mechanism here to
figure out which quals are push-downable.  And it should probably be
something generic, not a bunch of hard-wired rules that may or may not
be correct in any particular case.  What if the remote side is a
competing database product that doesn't understand X = ANY(Y)?
Second, even if a functions or operators does exist on both sides of
the link, how do you know whether they have compatible semantics?
Short of solving the entscheidungsproblem, you're not going to be able
to determine that algorithmically, so you need some kind of mechanism
for controlling what assumptions get made.  Otherwise, you'll end up
with queries that don't work and no way for the user to fix it.

It seems to me that the API should allow PG to ask the FDW questions like this:

- How many tuples are there on the remote side?
- Here is a qual.  Are you able to evaluate this qual remotely?
- What are the startup and total costs of a sequential scan of the
remote side with the following set of remotely executable quals?
- Are there any indices available on the remote side, and if so what
are there names and which columns do they index in which order
(asc/desc, nulls first/last)?
- What are the startup and total costs of an index scan of the remote
side using the index called $NAME given the following set of remotely
executable quals?

and, as you mentIon:

- Please update pg_statistic for this foreign table, if you have that
capability.

Then:

- Begin a sequential scan with the following set of quals.
- Begin an index scan using the index called X with the following set of quals.
- Fetch next tuple.
- End scan.

Maybe that's too much for a first version but if we're not going to
deal with the problems in a general way, then we ought to not deal
with them at all, rather than having hacky rules that will work if
your environment is set up in exactly the way the code expects and
otherwise break horribly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Pavel Stehule pavel.steh...@gmail.com writes:
 I looked on some constructs that helps with iteration over array in
 plpgsql. I propose a following syntax:

 FOR var IN [array variable | array expression]
 LOOP

I don't have any opinion about whether the functionality proposed here
is worth the trouble, but I do have an opinion about that syntax: it's
an awful choice.  plpgsql has enough trouble already distinguishing
between integer for-loops and query for-loops, not to mention trouble
in producing a helpful error message when somebody gets either of those
constructs slightly wrong.  Providing a variant where a single
expression can follow IN will make both of those problems an order of
magnitude worse.  As an example, is this a for-in-query or a
for-in-array?

FOR v IN (SELECT arraycol FROM tab) LOOP ...

Either answer is plausible depending on whether you assume the
parentheses make it a subquery.

Pick something less easily confusable with the existing constructs.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 10:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavel Stehule pavel.steh...@gmail.com writes:
 I looked on some constructs that helps with iteration over array in
 plpgsql. I propose a following syntax:

 FOR var IN [array variable | array expression]
 LOOP

 I don't have any opinion about whether the functionality proposed here
 is worth the trouble, but I do have an opinion about that syntax: it's
 an awful choice.

I agree, on both points.

It's nice to try to reduce the excess verbosity that is IMHO the
biggest usability issue with PL/pgsql, but I can't help wondering
whether we're trying to dam the Nile with chicken wire.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Peter Eisentraut

On tis, 2010-09-28 at 09:22 -0400, Robert Haas wrote:
  I think it should actually be run as part of the regular build.
  Ever
  since I started using git and developing like 10 features at once,
 and
  other people doing the same, the chances of (hidden) OID conflicts
 is
  growing immensely.
 
 Injecting nonessential checks into the build process doesn't seem like
 a good idea to me.  Typing 'make' should just do the build.  If you
 want to check for breakage, well, that's what 'make check' is for.

I don't feel strongly either way, but if we want to philosophize for a
minute, all these -W option on the compiler command line are
nonessential checks. ;-)  I suppose 'make check' should test whether the
code behaves correctly rather than checking whether the code was
structured consistently to begin with.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] small fix to possible null pointer dereference in byteaout() varlena.c

2010-09-28 Thread Tom Lane

=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 It would crash if input is of unrecognized format. Probably than
 there's going to be more problems to be concerned with, but just in
 case, don't crash in

I'm not sure why you think this is a good change, but it would break
things: in particular, the code would fail to null-terminate the string
in the hex-output case.  Also, the case that you seem to be trying to
defend against can't happen because elog(ERROR) doesn't return.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 Another angle on this problem is that, at least AFAICT, the duplicate
 OIDs are completely harmless so long as they are in different
 catalogs.  And if they are in the same catalog, then initdb will fail
 (and shame on you if you don't notice that).  Long, long ago
 pg_description was indexed just by object-OID, so duplicates would be
 a problem, but that hasn't been the case since 2001, and I'm not aware
 of anything else that relies on OIDs being globally unique either.  So
 maybe we should decide that we just don't care about this any more.

No, we shouldn't.  The reason we still have the policy of global
uniqueness of manually-assigned OIDs is that those OIDs frequently
need to be copied in multiple places (eg, operators may need to be
entered in pg_amop).  It gets a lot easier to make mistakes, and
harder to check for mistakes, if the OIDs aren't unique.

The duplicate_oids script is just something that committers are supposed
to know to run when applying a patch that messes with the catalogs.
That's a sufficiently small minority of patches that I don't see the
attraction of trying to wire it into every build, nor every regression
test.  Maybe the landscape is changing to the point where we can't trust
committers to get this right, but I haven't seen evidence of that.  You
certainly won't forget it again soon ;-)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] small fix to possible null pointer dereference in byteaout() varlena.c

2010-09-28 Thread Grzegorz Jaśkiewicz

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 =?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 It would crash if input is of unrecognized format. Probably than
 there's going to be more problems to be concerned with, but just in
 case, don't crash in

 I'm not sure why you think this is a good change, but it would break
 things: in particular, the code would fail to null-terminate the string
 in the hex-output case.  Also, the case that you seem to be trying to
 defend against can't happen because elog(ERROR) doesn't return.


...
rp = result = NULL; /* keep compiler quiet */
}
*rp = '\0';


this strikes me as a clear case of possible null pointer dereference,
wouldn't you agree ?
I know the case is very corner-ish, but still valid imo.




-- 
GJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 11:14 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Another angle on this problem is that, at least AFAICT, the duplicate
 OIDs are completely harmless so long as they are in different
 catalogs.  And if they are in the same catalog, then initdb will fail
 (and shame on you if you don't notice that).  Long, long ago
 pg_description was indexed just by object-OID, so duplicates would be
 a problem, but that hasn't been the case since 2001, and I'm not aware
 of anything else that relies on OIDs being globally unique either.  So
 maybe we should decide that we just don't care about this any more.

 No, we shouldn't.  The reason we still have the policy of global
 uniqueness of manually-assigned OIDs is that those OIDs frequently
 need to be copied in multiple places (eg, operators may need to be
 entered in pg_amop).  It gets a lot easier to make mistakes, and
 harder to check for mistakes, if the OIDs aren't unique.

Yeah, I guess that's true.

 The duplicate_oids script is just something that committers are supposed
 to know to run when applying a patch that messes with the catalogs.
 That's a sufficiently small minority of patches that I don't see the
 attraction of trying to wire it into every build, nor every regression
 test.  Maybe the landscape is changing to the point where we can't trust
 committers to get this right, but I haven't seen evidence of that.  You
 certainly won't forget it again soon ;-)

Maybe so, but the more steps there are that have to be manually
remembered, the harder it gets.  Run the regression tests, check for
duplicate OIDs, bump catversion, bump WAL version, bump pg_dump
archive version, bump pg_control version, check for unnecessary header
includes, audit trailing whitespace, look for leftovers that don't
need to be committed.  If we can combine a step or two, it just makes
life easier.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] small fix to possible null pointer dereference in byteaout() varlena.c

2010-09-28 Thread Tom Lane

=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 ...
 rp = result = NULL; /* keep compiler quiet */
 }
 *rp = '\0';
 

 this strikes me as a clear case of possible null pointer dereference,
 wouldn't you agree ?

No, I wouldn't.  You need to enlarge your peephole by one line:

else
{
elog(ERROR, unrecognized bytea_output setting: %d,
 bytea_output);
rp = result = NULL;/* keep compiler quiet */
}
*rp = '\0';

The keep compiler quiet line is unreachable code (and that comment is
supposed to remind you of that).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] small fix to possible null pointer dereference in byteaout() varlena.c

2010-09-28 Thread Robert Haas

2010/9/28 Grzegorz Jaśkiewicz gryz...@gmail.com:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 =?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 It would crash if input is of unrecognized format. Probably than
 there's going to be more problems to be concerned with, but just in
 case, don't crash in

 I'm not sure why you think this is a good change, but it would break
 things: in particular, the code would fail to null-terminate the string
 in the hex-output case.  Also, the case that you seem to be trying to
 defend against can't happen because elog(ERROR) doesn't return.


 ...
                rp = result = NULL;             /* keep compiler quiet */
        }
        *rp = '\0';
 

 this strikes me as a clear case of possible null pointer dereference,
 wouldn't you agree ?
 I know the case is very corner-ish, but still valid imo.

That comment that says keep compiler quiet is there because we
expect that line of code never to get executed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] small fix to possible null pointer dereference in byteaout() varlena.c

2010-09-28 Thread Grzegorz Jaśkiewicz

got it folks. Forgot that elog doesn't return with ERROR

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, revised

2010-09-28 Thread Gurjeet Singh

On Tue, Sep 28, 2010 at 3:22 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Sep 28, 2010 at 8:03 AM, Peter Eisentraut pete...@gmx.net wrote:
  On tis, 2010-09-28 at 13:53 +0200, Magnus Hagander wrote:
  On Tue, Sep 28, 2010 at 13:50, Robert Haas robertmh...@gmail.com
 wrote:
   On Tue, Sep 28, 2010 at 3:57 AM, Shigeru HANADA
   The src/include/catalog/duplicate_oids script reports that 3037 ~
   3040 are used two or more times.
  
   Though all regression tests finish successfully, should this be
   fixed ?
  
   Woops.  Thanks for the report, fixed.  I wish we had a regression test
   that would catch these mistakes.  It's easy to forget to run this
   script.
 
  Could we run the script as part of the regression tests? :-)
 
  Or if not, could we have the buildfarm run it?
 
  I think it should actually be run as part of the regular build.  Ever
  since I started using git and developing like 10 features at once, and
  other people doing the same, the chances of (hidden) OID conflicts is
  growing immensely.

 Injecting nonessential checks into the build process doesn't seem like
 a good idea to me.  Typing 'make' should just do the build.  If you
 want to check for breakage, well, that's what 'make check' is for.


How about having git-hooks execute the script on specific action of our
choice? pre-commit hook sounds like a good place to check for this.

Regards,

(I don't have the complete context of this thread, so please ignore my
ignorance)
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread David E. Wheeler

On Sep 28, 2010, at 7:41 AM, Robert Haas wrote:

 I don't have any opinion about whether the functionality proposed here
 is worth the trouble, but I do have an opinion about that syntax: it's
 an awful choice.
 
 I agree, on both points.
 
 It's nice to try to reduce the excess verbosity that is IMHO the
 biggest usability issue with PL/pgsql, but I can't help wondering
 whether we're trying to dam the Nile with chicken wire.

I would really like to have this functionality. Iterating over arrays is 
something I do in PL/pgSQL a *lot*.

I see two ways to handle syntax:

1. A new keyword. Some options:
   + FOREACH (too close to FOR?)
   + ITERATE (ew)

2. Require the subscripts() function as syntax.

Thoughts?

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trailing whitespace in psql table output

2010-09-28 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 27, 2010 at 2:09 PM, David Fetter da...@fetter.org wrote:
 I must be missing something pretty crucial here as far as the
 complexity of changing all the regression tests.  Wouldn't trimming
 all trailing whitespace do the trick?

 Sure.  But everyone using pg_regress will have to update their
 regression test expected outputs.

I'm inclined to think that that's not a fatal objection; it's not like
we haven't felt free to change psql's output format before.  As long as
we don't back-patch this change, it should be no worse than other things
we've done to third-party code without a backwards glance.

It would be good to get rid of this whitespace because (I believe) it is
one of very few reasons for needing to have any trailing whitespace in
git-controlled files.  If we could get to a point where trailing
whitespace in patches could be rejected automatically, it'd eliminate
one small pet peeve.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trailing whitespace in psql table output

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 12:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 27, 2010 at 2:09 PM, David Fetter da...@fetter.org wrote:
 I must be missing something pretty crucial here as far as the
 complexity of changing all the regression tests.  Wouldn't trimming
 all trailing whitespace do the trick?

 Sure.  But everyone using pg_regress will have to update their
 regression test expected outputs.

 I'm inclined to think that that's not a fatal objection; it's not like
 we haven't felt free to change psql's output format before.  As long as
 we don't back-patch this change, it should be no worse than other things
 we've done to third-party code without a backwards glance.

 It would be good to get rid of this whitespace because (I believe) it is
 one of very few reasons for needing to have any trailing whitespace in
 git-controlled files.  If we could get to a point where trailing
 whitespace in patches could be rejected automatically, it'd eliminate
 one small pet peeve.

I have to agree that would be nice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 FOR var IN [array variable | array expression]
 LOOP

 I don't have any opinion about whether the functionality proposed
 here is worth the trouble, but I do have an opinion about that
 syntax: it's an awful choice.
 
 I agree, on both points.
 
How about distinguishing it this way:?
 
FOR var IN ARRAY array_expression LOOP
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 FOR var IN [array variable | array expression]
 LOOP
 
 How about distinguishing it this way:?
 
 FOR var IN ARRAY array_expression LOOP

That occurred to me too, but it's got a small problem: it's not
impossible for ARRAY to be the first token of a valid scalar expression.

But I guess you could get around that if you had to by putting the ARRAY
expression inside parens, and it would be a pretty darn unusual case
anyway.  So this is probably the best choice.  I'm not thrilled with
David's suggestions of using FOREACH or ITERATE --- using a different
initial keyword makes it awkward to make generic statements about all
types of FOR loop.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trailing whitespace in psql table output

2010-09-28 Thread Tom Lane

David Fetter da...@fetter.org writes:
 On Mon, Sep 27, 2010 at 03:11:07PM -0400, Robert Haas wrote:
 Sure.  But everyone using pg_regress will have to update their
 regression test expected outputs.

 Again, I must be missing something super important.  What is it that
 prevents people from doing
 find . -type f |xargs perl -pi.bak -e 's/\s+$//g'

I think the concern isn't about the change being hard to make, it's
about the fallout from having to have different expected-result files
for 9.1 versus previous releases.  A lot of third-party modules try
to maintain source code compatibility across all PG releases they
support, and this would break that.

That's not necessarily a sufficient reason for us not to do it ---
but it's definitely not a negligible issue, either.

It should be noted that this won't be zero-cost for the core project
either.  For example, any back-patched fix that adjusts regression test
outputs will have to deal with the incompatibility.  And if we do put in
some git-level check on whitespace, it'll have to be made to only apply
to master not back branches.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trailing whitespace in psql table output

2010-09-28 Thread Peter Eisentraut

On mån, 2010-09-27 at 11:09 -0700, David Fetter wrote:
 I must be missing something pretty crucial here as far as the
 complexity of changing all the regression tests.  Wouldn't trimming
 all trailing whitespace do the trick?

No, there is trailing whitespace that is significant.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trailing whitespace in psql table output

2010-09-28 Thread Peter Eisentraut

On tis, 2010-09-28 at 12:18 -0400, Tom Lane wrote:
 I'm inclined to think that that's not a fatal objection; it's not like
 we haven't felt free to change psql's output format before.  As long as
 we don't back-patch this change, it should be no worse than other things
 we've done to third-party code without a backwards glance.

In the past, pg_regress used diff -b or -w, so making whitespace changes
in psql was not a problem.

 It would be good to get rid of this whitespace because (I believe) it is
 one of very few reasons for needing to have any trailing whitespace in
 git-controlled files.  If we could get to a point where trailing
 whitespace in patches could be rejected automatically, it'd eliminate
 one small pet peeve.

You won't be able to programmatically forbid all trailing whitespace (at
least without additional arrangements) because of:

psql -c 'select 1 as a, null as b' | cat -A
 a | b$
---+---$
 1 | $===

Plus, there might be tests that check trailing space behavior or some
such, but I haven't looked for those.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trailing whitespace in psql table output

2010-09-28 Thread Tom Lane

Peter Eisentraut pete...@gmx.net writes:
 On tis, 2010-09-28 at 12:18 -0400, Tom Lane wrote:
 It would be good to get rid of this whitespace because (I believe) it is
 one of very few reasons for needing to have any trailing whitespace in
 git-controlled files.  If we could get to a point where trailing
 whitespace in patches could be rejected automatically, it'd eliminate
 one small pet peeve.

 You won't be able to programmatically forbid all trailing whitespace (at
 least without additional arrangements) because of:

Yeah, if we can't get all the way there easily, the above argument isn't
worth much.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] documentation udpates to pgupgrade.html

2010-09-28 Thread Bruce Momjian

Massa, Harald Armin wrote:
 Hello,
 
 just doing an upgrade form PostgreSQL 8.4.4 on Windows 2007 64bit to
 PostgreSQL 9.0 64bit on the same system.
 
 I am working along
 http://developer.postgresql.org/pgdocs/postgres/pgupgrade.html
 
 There is written:
 
 NET STOP postgresql-8.4
 NET STOP postgresql-9.0
 or
 
 NET STOP pgsql-8.3  (PostgreSQL 8.3 and older used a different service name)
 
 which should be extended by
 
 net stop postgresql-x64-9.0
 
 for Windows 64 bit.
 

Interesting.  What I have added to HEAD and 9.0 docs is the attached
patch that explains the proper service name should be used.  I don't
think I want to mention 64-bit explicitly, and in fact this section is
part of an 'e.g.' block, meaning they are just examples.  If I get more
64-bit reports I can add an example for that too.

 -
 
 It would also be helpfull, if there was a reamrk that pg_upgrade will
 try to write to the current directory - as runas /user:postgres
 cmd.exe usually starts within c:\windows\system; so a cd %TEMP% after
 runas would be valuable

Agreed.  Second patch attached that mentions this.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
index 7bc939c..47c7bf6 100644
--- a/doc/src/sgml/pgupgrade.sgml
+++ b/doc/src/sgml/pgupgrade.sgml
@@ -246,14 +246,14 @@ gmake prefix=/usr/local/pgsql.new install
 titleStop both servers/title
  
 para
- Make sure both database servers are stopped using on Unix, e.g.:
+ Make sure both database servers are stopped using, on Unix, e.g.:
  
 programlisting
 pg_ctl -D /opt/PostgreSQL/8.4 stop
 pg_ctl -D /opt/PostgreSQL/9.0 stop
 /programlisting
  
- or on Windows
+ or on Windows, using the proper service names:
  
 programlisting
 NET STOP postgresql-8.4
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
index 47c7bf6..6d2cdaa 100644
--- a/doc/src/sgml/pgupgrade.sgml
+++ b/doc/src/sgml/pgupgrade.sgml
@@ -272,7 +272,7 @@ NET STOP pgsql-8.3  (productnamePostgreSQL/ 8.3 and older used a different s
 titleRun applicationpg_upgrade//title
  
 para
- Always run the applicationpg_upgrade/ binary in the new server, not the old one.
+ Always run the applicationpg_upgrade/ binary of the new server, not the old one.
  applicationpg_upgrade/ requires the specification of the old and new cluster's
  data and executable (filenamebin/) directories. You can also specify separate
  user and port values, and whether you want the data linked instead of
@@ -306,6 +306,7 @@ pg_upgrade.exe
  to perform only the checks, even if the old server is still
  running. commandpg_upgrade --check/ will also outline any
  manual adjustments you will need to make after the migration.
+ commandpg_upgrade/ requires write permission in the current directory.
 /para
  
 para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: SQL/MED(FDW) DDL

2010-09-28 Thread Heikki Linnakangas


On 09/28/10 17:26, Robert Haas wrote:

First, it seems totally wrong to assume that the same functions and
operators will be defined on the remote side as you have locally;
indeed, for CSV files, you won't have anything defined on the remote
side at all.  You need some kind of a discovery mechanism here to
figure out which quals are push-downable.  And it should probably be
something generic, not a bunch of hard-wired rules that may or may not
be correct in any particular case.  What if the remote side is a
competing database product that doesn't understand X = ANY(Y)?
Second, even if a functions or operators does exist on both sides of
the link, how do you know whether they have compatible semantics?


Or side-effects.

The SQL/MED specification has routine mappings for this purpose. We 
will need that or something similar.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 I looked on some constructs that helps with iteration over array in
 plpgsql. I propose a following syntax:

 FOR var IN [array variable | array expression]
 LOOP

 I don't have any opinion about whether the functionality proposed here
 is worth the trouble, but I do have an opinion about that syntax: it's
 an awful choice.  plpgsql has enough trouble already distinguishing
 between integer for-loops and query for-loops, not to mention trouble
 in producing a helpful error message when somebody gets either of those
 constructs slightly wrong.  Providing a variant where a single
 expression can follow IN will make both of those problems an order of
 magnitude worse.  As an example, is this a for-in-query or a
 for-in-array?

        FOR v IN (SELECT arraycol FROM tab) LOOP ...


This is a subquery - so it is a for-in-array - should return one row
with one column. Similar construct is in SQL/PSM

where you can to write SET var = (SELECT ...)

You cannot to write just (SELECT ...) anywhere

 Either answer is plausible depending on whether you assume the
 parentheses make it a subquery.

 Pick something less easily confusable with the existing constructs.

It's not simple - FOR i IN array is natural - Original ADA use a very
similar construct.

FOR i IN ARRAY has problem with constant array - FOR i IN ARRAY ARRAY[1,2,3,]
and FOREACH is used in Oracle for absolutely different task.



we have now a for-in-cursor, so there is a precedent.

regards

Pavel

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 FOR var IN [array variable | array expression]
 LOOP

 How about distinguishing it this way:?

 FOR var IN ARRAY array_expression LOOP

I see one problem - when you can use a constant array, then you will
write two keywords ARRAY

FOR var IN ARRAY ARRAY[...] LOOP

iteration over cursor is supported now, and you don't write

FOR var IN CURSOR cursorvar


 That occurred to me too, but it's got a small problem: it's not
 impossible for ARRAY to be the first token of a valid scalar expression.


yes

 But I guess you could get around that if you had to by putting the ARRAY
 expression inside parens, and it would be a pretty darn unusual case
 anyway.  So this is probably the best choice.

I don't agree - There isn't reason for complicating proposed syntax.
The situation is relative simple and we are able to print a correct
messages.

regards

Pavel Stehule

I'm not thrilled with
 David's suggestions of using FOREACH or ITERATE --- using a different
 initial keyword makes it awkward to make generic statements about all
 types of FOR loop.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] documentation udpates to pgupgrade.html

2010-09-28 Thread Massa, Harald Armin

Bruce,


  NET STOP postgresql-8.4
  NET STOP postgresql-9.0



  which should be extended by
 
  net stop postgresql-x64-9.0
 
  for Windows 64 bit.
 

 Interesting.  What I have added to HEAD and 9.0 docs is the attached
 patch that explains the proper service name should be used.  I don't
 think I want to mention 64-bit explicitly, and in fact this section is
 part of an 'e.g.' block, meaning they are just examples.


yes, they are only examples ... just saying that with one additional example
you could have all the fitting PostgreSQLs on Windows covered  :) ...

especially as learning about the correct service name is quite a nuisance
(it gets named magically in a sensible fassion via the one click installer;
it at least requires bringing up the service control panel and search for
sth looking like PostgreSQL)

Harald

-- 
GHUM GmbH
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607

Amtsgericht Stuttgart, HRB 734971
-
persuadere.
et programmare

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 But I guess you could get around that if you had to by putting the ARRAY
 expression inside parens, and it would be a pretty darn unusual case
 anyway. Â So this is probably the best choice.

 I don't agree - There isn't reason for complicating proposed syntax.

Yes, there is.  The syntax you propose is flat out ambiguous: there are
two possible legal interpretations of some commands.  That's not
acceptable, especially not when it's so easily fixed.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 But I guess you could get around that if you had to by putting the ARRAY
 expression inside parens, and it would be a pretty darn unusual case
 anyway.  So this is probably the best choice.

 I don't agree - There isn't reason for complicating proposed syntax.

 Yes, there is.  The syntax you propose is flat out ambiguous: there are
 two possible legal interpretations of some commands.  That's not
 acceptable, especially not when it's so easily fixed.


what are you thinking? The subquery cannot be interpreted different.
There are not possible use a isolated subquery as query. And subquery
have to return one row, one column.

Pavel


                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 As an example, is this a for-in-query or a
 for-in-array?
 
 Â  Â  Â  Â FOR v IN (SELECT arraycol FROM tab) LOOP ...

 This is a subquery - so it is a for-in-array - should return one row
 with one column.

That's not obvious at all.  It's legal right now to write that, and it
will be interpreted as for-in-query.  Furthermore, there are cases where
it's essential to be able to write a left paren before SELECT, so that
you can control the precedence of UNION/INTERSECT/EXCEPT constructs.
So you're proposing to remove functionality and break existing code in
order to have a simple syntax for for-in-array.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Yes, there is. Â The syntax you propose is flat out ambiguous: there are
 two possible legal interpretations of some commands.

 what are you thinking? The subquery cannot be interpreted different.

Sure it can: it could be a parenthesized top-level query.  In fact,
that's what plpgsql will assume if you feed it that syntax today.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 As an example, is this a for-in-query or a
 for-in-array?

        FOR v IN (SELECT arraycol FROM tab) LOOP ...

 This is a subquery - so it is a for-in-array - should return one row
 with one column.

 That's not obvious at all.  It's legal right now to write that, and it
 will be interpreted as for-in-query.

but it has not a sense. It's based on implementation and I am sure, so
this isn't documented. Yes, we are able to write

a := 10 FROM tab WHERE y = 10

but it is just more bug then required feature.

FOR v IN (SELECT FROM) when select returns more than one row is big
inconsistency - and this is bug, when this is allowed

Regards

Pavel

 Furthermore, there are cases where
 it's essential to be able to write a left paren before SELECT, so that
 you can control the precedence of UNION/INTERSECT/EXCEPT constructs.
 So you're proposing to remove functionality and break existing code in
 order to have a simple syntax for for-in-array.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Yes, there is.  The syntax you propose is flat out ambiguous: there are
 two possible legal interpretations of some commands.

 what are you thinking? The subquery cannot be interpreted different.

 Sure it can: it could be a parenthesized top-level query.  In fact,
 that's what plpgsql will assume if you feed it that syntax today.

no - there are not any legal construct FOR r IN (..)

I believe so we can find more than one similar undocumented features,
like this - so it means so plpgsql will be a buggy?


                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 4:39 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 As an example, is this a for-in-query or a
 for-in-array?

        FOR v IN (SELECT arraycol FROM tab) LOOP ...

 This is a subquery - so it is a for-in-array - should return one row
 with one column.

 That's not obvious at all.  It's legal right now to write that, and it
 will be interpreted as for-in-query.

 but it has not a sense.

It has a very fine sense.  It's completely obvious to me what that
means, and you're proposing to break it.  In a word: no.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Commitfest: The Good, The Bad, and the Ugly

2010-09-28 Thread David Fetter

Folks,

We're almost half way through the commitfest, and so I'll start with:

The Good:

- Most patches still in play have a reviewer.

- It's possible for one person to post 5 reviews in a day.  Robert
  Haas actually did this on his own time yesterday.

- New people have been reviewing patches, at least up to the
  Submission criteria.

The Bad:

- There is 1 (one) patch marked Committed or Ready for Committer,
  where neither the author nor reviewer is a committer.  This
  basically means we have approximately one RRReviewer.

The Ugly:

- Patches are not getting even basic QA.

The Bad and the Ugly are fixable, and here's how.

At the moment, we've got 7 basic review criteria
http://wiki.postgresql.org/wiki/Reviewing_a_Patch, 5 of which can be
accomplished with C skills somewhere between 0 and tiny.  These are:

  1. Submission review (skills needed: patch, English comprehension)
  2. Usability review (skills needed: test-fu, ability to find and read spec)
  3. Feature test (skills needed: patch, configure, make, pipe errors to log)
  4. Performance review (skills needed: ability to time performance)
  5. Coding review (skills needed: guideline comparison, experience with 
portability issues, minor C-reading skills)

I'd like to set as a goal that every patch in this commitfest get
those levels of review.  You do not need C skills[1].  You do not need
to be a genius database engine hacker[2].  You just need to be
diligent and want to move the project ahead.

If you haven't yet, get signed in and start reviewing patches.  Sign
in with your community login, and let's get going :)
https://commitfest.postgresql.org/action/commitfest_view?id=7

In case you were wondering, what I'm doing here is part of step 7.

If you think that getting all outstanding patches through step 5 is
not doable, let me know.  If you think it is, this is your chance to
help make it happen.  Write back either way.

Cheers,
David.

[1] If you do have them, help out with step 6, too.
[2] If you are one, help out with step 6, too.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Sure it can: it could be a parenthesized top-level query. Â In fact,
 that's what plpgsql will assume if you feed it that syntax today.

 no - there are not any legal construct FOR r IN (..)

You are simply wrong, sir, and I suggest that you go read the SQL
standard until you realize that.  Consider for example

for r in (SELECT ... FROM a UNION SELECT ... FROM b) INTERSECT (SELECT 
... FROM c) LOOP ...

The parentheses here are not merely legal, they are *necessary*, else
the semantics of the UNION/INTERSECT operations change.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] UTF8 regexp and char classes still does not work

2010-09-28 Thread Sergey Burladyan

I see this in 9.0 Release note:
- Support locale-specific regular expression processing with UTF-8
  server encoding (Tom Lane)
Locale-specific regular expression functionality includes
case-insensitive matching and locale-specific character classes.

But character classes still does not work, example (git REL9_0_STABLE c767c3bd):
select version();
version 


 PostgreSQL 9.0.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Debian 
4.4.4-8) 4.4.5 20100728 (prerelease), 64-bit

--- CYRILLIC SMALL LETTER ZHE ~* CYRILLIC CAPITAL LETTER ZHE
select E'\320\266' ~* E'\320\226', E'\320\266' ~ '[[:alpha:]]+', 'g' ~ 
'[[:alpha:]]+';
 ?column? | ?column? | ?column? 
--+--+--
 t| f| t

all must be true, like below:

create database koi8 template template0 encoding 'koi8r' lc_collate 
'ru_RU.KOI8-R' lc_ctype 'ru_RU.KOI8-R';
\c koi8
set client_encoding TO utf8;
select E'\326' ~* E'\366', E'\326' ~ '[[:alpha:]]+', 'g' ~ '[[:alpha:]]+';
 ?column? | ?column? | ?column? 
--+--+--
 t| t| t

As i can see in Tom's patch 0d323425 only functions like pg_wc_isalpha is 
changed, but
this pg_wc_isalpha is called from
static struct cvec *
cclass(struct vars * v,/* context */
   const chr *startp,  /* where the name starts */
   const chr *endp,/* just past the end of the name */
   int cases)  /* case-independent? */
function, and this function have comment For the moment, assume that only char 
codes  256 can be in these classes and it call pg_wc_isalpha like this:
for (i = 0; i = UCHAR_MAX; i++)
{
if (pg_wc_isalpha((chr) i))
addchr(cv, (chr) i);
}
UCHAR_MAX is 255

I do not understand fully this algorithm of regular expressions, but i think 
cclass function also need fix.

-- 
Sergey Burladyan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] documentation udpates to pgupgrade.html

2010-09-28 Thread Bruce Momjian

Massa, Harald Armin wrote:
 Bruce,
 
 
   NET STOP postgresql-8.4
   NET STOP postgresql-9.0
 
 
 
   which should be extended by
  
   net stop postgresql-x64-9.0
  
   for Windows 64 bit.
  
 
  Interesting.  What I have added to HEAD and 9.0 docs is the attached
  patch that explains the proper service name should be used.  I don't
  think I want to mention 64-bit explicitly, and in fact this section is
  part of an 'e.g.' block, meaning they are just examples.
 
 
 yes, they are only examples ... just saying that with one additional example
 you could have all the fitting PostgreSQLs on Windows covered  :) ...
 
 especially as learning about the correct service name is quite a nuisance
 (it gets named magically in a sensible fassion via the one click installer;
 it at least requires bringing up the service control panel and search for
 sth looking like PostgreSQL)

Can we document an easy way to find the service names?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] UTF8 regexp and char classes still does not work

2010-09-28 Thread Tom Lane

Sergey Burladyan eshkin...@gmail.com writes:
 As i can see in Tom's patch 0d323425 only functions like pg_wc_isalpha is 
 changed, but
 this pg_wc_isalpha is called from
 static struct cvec *
 cclass(struct vars * v,/* context */
const chr *startp,  /* where the name starts */
const chr *endp,/* just past the end of the name */
int cases)  /* case-independent? */
 function, and this function have comment For the moment, assume that only 
 char codes  256 can be in these classes and it call pg_wc_isalpha like this:
 for (i = 0; i = UCHAR_MAX; i++)
 {
 if (pg_wc_isalpha((chr) i))
 addchr(cv, (chr) i);
 }
 UCHAR_MAX is 255

Hmm, you're right.  I only tested that on Latin1 characters, for which
it does work because those have Unicode points below 256.  I'm not
sure of a reasonable solution for the general case --- we certainly
don't want this function iterating up to 2^21 or thereabouts.

Your test case seems to be using KOI8 encoding, though, which doesn't
have anything to do with UTF8 behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Andrew Dunstan





On 09/28/2010 03:41 PM, Pavel Stehule wrote:



It's not simple - FOR i IN array is natural - Original ADA use a very
similar construct.



No it doesn't. In Ada (Note: not ADA) FOR can only iterate over one 
thing: a discrete subtype (e.g. an integer or enumeration type, or a 
range of it)[1]. You can say:


   for i in my_array'range loop ...

but that iterates over the array's index, not over its values.And there 
is no ambiguity with other things you might loop over because there 
aren't any.


cheers

andrew

[1]http://www.adaic.org/standards/05rm/html/RM-5-5.html

Re: [HACKERS] UTF8 regexp and char classes still does not work

2010-09-28 Thread Sergey Burladyan

Tom Lane t...@sss.pgh.pa.us writes:

 Hmm, you're right.  I only tested that on Latin1 characters, for which
 it does work because those have Unicode points below 256.  I'm not
 sure of a reasonable solution for the general case --- we certainly
 don't want this function iterating up to 2^21 or thereabouts.

Yes, i understand this problem. How perl do this? May be this Unicode table can
be precomputed or linked to postgres binary from external source?

 Your test case seems to be using KOI8 encoding, though, which doesn't
 have anything to do with UTF8 behavior.

It's just for example of expected result. See first test, it is UTF8, two bytes 
per character:
  --- CYRILLIC SMALL LETTER ZHE ~* CYRILLIC CAPITAL LETTER ZHE
  select E'\320\266' ~* E'\320\226', E'\320\266' ~ '[[:alpha:]]+', 'g' ~ 
  '[[:alpha:]]+';
   ?column? | ?column? | ?column? 
  --+--+--
   t| f| t


-- 
Sergey Burladyan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Help with User-defined function in PostgreSQL with Visual C++

2010-09-28 Thread Itagaki Takahiro

On Tue, Sep 28, 2010 at 6:16 PM, Magnus Hagander mag...@hagander.net wrote:
 We're talking about the export all symbols thing, right? I *don't*
 think we want to recommend people to do that - it creates bloated DLL
 files, for no really good reason. Also, it's not just a matter of a
 msvc project - we do that with a perl hack
 (http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/tools/msvc/gendef.pl;h=b8538dd79b8baf21ede87b2ec1aba0276fd3b3d9;hb=62b6aaa40b2abb26edf18d1cd00dffcac090f67a).
 It's not a good way.

What a great hacking!  I agree that it is not recommendable, but users
need to build their codes in different build environment from ours if so.

 We might, however, want to add a specific section to the
 *documentation* about building extensions on Windows

+1. It will be a longer section than ones for other platforms.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Commitfest: The Good, The Bad, and the Ugly

2010-09-28 Thread Itagaki Takahiro

On Wed, Sep 29, 2010 at 6:03 AM, David Fetter da...@fetter.org wrote:
 The Good:
 - Most patches still in play have a reviewer.

As far as I remember, there were discussions about the issue
A patch has a reviewer, but in Needs Review state for several weeks 
in 9.0 development.

Do we have any plans for it? According to the commitfest app, one patch
has only one reviewer at once. A new reviewer might avoid reviewing
a patch that have another reviewer already.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mar sep 28 12:28:12 -0400 2010:

 How about distinguishing it this way:?
  
 FOR var IN ARRAY array_expression LOOP

What about

FOR EACH var IN array_expr LOOP ...

I think this requires reserving EACH, which could cause a regression for
working code.  Maybe there's a way to make it work?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [RRR] [HACKERS] Commitfest: The Good, The Bad, and the Ugly

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 9:11 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Wed, Sep 29, 2010 at 6:03 AM, David Fetter da...@fetter.org wrote:
 The Good:
 - Most patches still in play have a reviewer.

 As far as I remember, there were discussions about the issue
 A patch has a reviewer, but in Needs Review state for several weeks 
 in 9.0 development.

 Do we have any plans for it? According to the commitfest app, one patch
 has only one reviewer at once. A new reviewer might avoid reviewing
 a patch that have another reviewer already.

No, the column is very clearly labelled Reviewers, not Reviewer.
And we have certainly had patches with more than one person's name in
that field in the past.  The issue is rather that we don't have enough
people reviewing.  We haven't had enough people volunteer to do
reviews to even assign ONE person to each patch, let alone two.  There
are, as of this writing, SEVEN patches that have no reviewer at all.

Of course, several of the committers, including you, me, and Tom, have
been working our way through the patches.  And that is great.  But the
point of the CommitFest process is that everyone is supposed to pitch
in and help out, not just the committers.  That is not happening, and
it's a problem.  This process does not work and will not scale if the
committers are responsible for doing all the work on every patch from
beginning to end.  That has never worked, and the fact that we have a
few more committers now doesn't change that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Path question

2010-09-28 Thread Robert Haas

2010/9/23 Robert Haas robertmh...@gmail.com:
 All of this leaves me wondering why Greg ended up ifdefing out this
 code in the first place.  There's probably something I'm missing
 here...  but for now I can't think of a better idea than just removing
 the #ifdefs and hoping that whatever problem they were causing was
 limited to an earlier version of the code that no longer exists.

...and FAIL.  I missed the blindingly obvious here, which is that
without these tests commented out, while it passes regression tests,
the merge append stuff stops working.  I think for some reason without
this stuff in there the appropriate index paths fail to get generated
for the child rels.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Darren Duncan


Alvaro Herrera wrote:

What about

FOR EACH var IN array_expr LOOP ...

I think this requires reserving EACH, which could cause a regression for
working code.  Maybe there's a way to make it work?


Code that quotes all of its identifiers, such as with:

  FOR EACH var IN array_expr LOOP ...

... would also gain a significant amount of future-proofing since then the 
language can add keywords at will, without there being conflicts with 
user-defined identifiers.


Similarly, quoting identifiers also carries present-day advantages as then one 
can name identifiers whatever is most suitably descriptive for them without 
worrying whether the language has a pre-defined meaning for the used words.


The quoting also has the nice bonus of making them case-sensitive.

-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [RRR] [HACKERS] Commitfest: The Good, The Bad, and the Ugly

2010-09-28 Thread Itagaki Takahiro

On Wed, Sep 29, 2010 at 10:18 AM, Robert Haas robertmh...@gmail.com wrote:
 No, the column is very clearly labelled Reviewers, not Reviewer.
 And we have certainly had patches with more than one person's name in
 that field in the past.  The issue is rather that we don't have enough
 people reviewing.  We haven't had enough people volunteer to do
 reviews to even assign ONE person to each patch, let alone two.  There
 are, as of this writing, SEVEN patches that have no reviewer at all.

Some of them might be too difficult to review. For example, replication
or snapshot management requires special skills to review.

I'm worrying about new reviewers hesitate to review a patch that has
a previous reviewer, and then, if they think the remaining patches are
too difficult for them, they would just leave the commitfest page.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Darren Duncan


Alvaro Herrera wrote:

What about

FOR EACH var IN array_expr LOOP ...

I think this requires reserving EACH, which could cause a regression for
working code.  Maybe there's a way to make it work?


What about saying FOR-EACH instead?

A good general solution that I'd expect to not cause regressions is to separate 
multiple-word keywords with dashes rather than spaces.


Since unquoted identifiers don't have dashes, I think, and moreover because the 
whole FOR-EACH would occupy the first position of the statement rather than the 
first two, there should be no ambiguity.


Parsing should be easier, too, because keywords formatted like this would just 
be a single term rather than having infinite variations due to embedded whitespace.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Kevin Grittner's message of mar sep 28 12:28:12 -0400 2010:
 How about distinguishing it this way:?
 FOR var IN ARRAY array_expression LOOP

 What about
 FOR EACH var IN array_expr LOOP ...

That doesn't seem to have any obvious connection to looping over array
elements, as opposed to some other kind of iteration.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Path question

2010-09-28 Thread Robert Haas

2010/9/28 Robert Haas robertmh...@gmail.com:
 2010/9/23 Robert Haas robertmh...@gmail.com:
 All of this leaves me wondering why Greg ended up ifdefing out this
 code in the first place.  There's probably something I'm missing
 here...  but for now I can't think of a better idea than just removing
 the #ifdefs and hoping that whatever problem they were causing was
 limited to an earlier version of the code that no longer exists.

 ...and FAIL.  I missed the blindingly obvious here, which is that
 without these tests commented out, while it passes regression tests,
 the merge append stuff stops working.  I think for some reason without
 this stuff in there the appropriate index paths fail to get generated
 for the child rels.

Yep, that's the problem, all right.  find_usable_indexes() calls
build_index_pathkeys() to generate pathkeys for each available index
on the child relation, and then calls truncate_useless_pathkeys() to
get rid of any that aren't useful.  In a query like this:

explain select * from ma_parent order by name limit 10;

...what makes the pathkeys potentially useful is that they match the
root pathkeys for the query.  However, without Greg's hacks, the
transposed child equivalence classes don't show up in the equivalence
member lists, so get_eclass_for_sort_expr() generates new equivalence
classes for them.  Subsequently, when truncate_useless_pathkeys() is
called, those new equivalence classes are found not to be equal to the
overall ordering desired for the query, so it truncates them away.

I'm not presently sure quite what the best way to fix this is... any ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [RRR] [HACKERS] Commitfest: The Good, The Bad, and the Ugly

2010-09-28 Thread Robert Haas

On Tue, Sep 28, 2010 at 9:33 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Wed, Sep 29, 2010 at 10:18 AM, Robert Haas robertmh...@gmail.com wrote:
 No, the column is very clearly labelled Reviewers, not Reviewer.
 And we have certainly had patches with more than one person's name in
 that field in the past.  The issue is rather that we don't have enough
 people reviewing.  We haven't had enough people volunteer to do
 reviews to even assign ONE person to each patch, let alone two.  There
 are, as of this writing, SEVEN patches that have no reviewer at all.

 Some of them might be too difficult to review. For example, replication
 or snapshot management requires special skills to review.

 I'm worrying about new reviewers hesitate to review a patch that has
 a previous reviewer, and then, if they think the remaining patches are
 too difficult for them, they would just leave the commitfest page.

That's a legitimate concern, but I am not sure how much of a problem
it is in practice.  Most people who become round-robin reviewers are
getting pulled into the process a little more than just stumbling
across the CF page by happenstance, or at least I hope they are.  Not
all patches can benefit from multiple reviewers, but CF managers can
and should encourage multiple reviews of those that can.  However, at
the moment, the problem is that regardless of who is assigned to do
what, we're not getting enough reviews done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Andrew Dunstan




On 09/28/2010 09:31 PM, Darren Duncan wrote:

Alvaro Herrera wrote:

What about

FOR EACH var IN array_expr LOOP ...

I think this requires reserving EACH, which could cause a regression for
working code.  Maybe there's a way to make it work?


Code that quotes all of its identifiers, such as with:

  FOR EACH var IN array_expr LOOP ...

... would also gain a significant amount of future-proofing since then 
the language can add keywords at will, without there being conflicts 
with user-defined identifiers.


Similarly, quoting identifiers also carries present-day advantages as 
then one can name identifiers whatever is most suitably descriptive 
for them without worrying whether the language has a pre-defined 
meaning for the used words.


The quoting also has the nice bonus of making them case-sensitive.




This doesn't help in the least if the array is an expression rather than 
simply a variable - we're not going to start quoting expressions.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Andrew Dunstan




On 09/28/2010 09:43 PM, Darren Duncan wrote:

Alvaro Herrera wrote:

What about

FOR EACH var IN array_expr LOOP ...

I think this requires reserving EACH, which could cause a regression for
working code.  Maybe there's a way to make it work?


What about saying FOR-EACH instead?

A good general solution that I'd expect to not cause regressions is to 
separate multiple-word keywords with dashes rather than spaces.


Since unquoted identifiers don't have dashes, I think, and moreover 
because the whole FOR-EACH would occupy the first position of the 
statement rather than the first two, there should be no ambiguity.


Parsing should be easier, too, because keywords formatted like this 
would just be a single term rather than having infinite variations due 
to embedded whitespace.


This would actually make the parsing infinitely more ugly, not less. And 
we are not gong to start introducing non-alphabetic characters into 
keywords. It is also, as Tom noted about the earlier version, without 
any obvious connection to array processing.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] is sync rep stalled?

2010-09-28 Thread Robert Haas

So we've got two patches that implement synchronous replication, and
no agreement on which one, if either, should be committed.  We have no
agreement on how synchronous replication should be configured, and at
most a tenuous agreement that it should involve standby registration.

This is bad.

This feature is important, and we need to get it done.  How do we get
the ball rolling again?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] string function - format function proposal

2010-09-28 Thread Itagaki Takahiro

On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I am sending a updated version.

 changes:
 * tag %v removed from format function,
 * proprietary tags %lq a iq removed from sprintf
 * code cleaned

 patch divided to two parts - format function and stringfunc (contains
 sprintf function and substitute function)

=== Discussions about the spec ===
Two patches add format() into the core, and substitute() and sprintf() into
stringfunc contrib module. But will we have 3 versions of string formatters?

IMHO, substitute() is the best choice that we will have in the core because
functionalities in format() and sprintf() can be achieved by combination of
substitute() and quote_nullable(), quote_ident(), or to_char(). I think the
core will provide only simple and non-overlapped features. Users can write
wrapper functions by themselves if they think the description is redundant.

=== format.diff ===
* It has a reject in doc, but the hunk can be fixed easily.
1 out of 2 hunks FAILED -- saving rejects to file doc/src/sgml/func.sgml.rej
  COMMENT: We have the function list in alphabetical order,
  so format() should be inserted after encode().
* It can be built without compile warnings.
* Enough documentation and regression tests are included.

=== stringfunc.diff ===
* It can be applied cleanly and built without compile warnings.
* Documentation is included, but not enough.
  COMMENT: According to existing docs, function list are described with
  variablelist or table.
* Enough regression tests are included.
* COMMENT: stringfunc directory should be added to contrib/Makefile.

* BUG: stringfunc_substitute_nv() calls text_format().
  I think we don't need stringfunc_substitute_nv at all.
  It can be replaced by stringfunc_substitute(). _nv version is only
  required if it is in the core because of sanity regression test.

* BUG?: The doc says sprintf() doesn't support length modifiers,
  but it is actually supported in broken state:
postgres=# SELECT sprintf('%*s', 2, 'ABC');
 sprintf
-
 ABC  = should be ERROR if unsupported, or AB if supported.
(1 row)

* BUG?: ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  errmsg(unsupported tag \%%%c\, tag)));
 Is the code ok if the tag (a char) is a partial byte of multi-byte character?
 My machine prints ? in the case, but it might be platform-dependent.

=== Both patches ===
* Performance: I don't think those functions are not performance-critical,
  but we could cache typoutput functions in fn_extra if needed.
  record_out would be a reference.

* Coding: Whitespace and tabs are mixed in some places. They are not so
  important because we will run pgindent, but careful choice will be
  preferred even of a patch.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Path question

2010-09-28 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 ...what makes the pathkeys potentially useful is that they match the
 root pathkeys for the query.  However, without Greg's hacks, the
 transposed child equivalence classes don't show up in the equivalence
 member lists, so get_eclass_for_sort_expr() generates new equivalence
 classes for them.  Subsequently, when truncate_useless_pathkeys() is
 called, those new equivalence classes are found not to be equal to the
 overall ordering desired for the query, so it truncates them away.

 I'm not presently sure quite what the best way to fix this is... any ideas?

Probably what's needed is to extend the child eclass member stuff to
cover sort-key eclasses.  Per relation.h:

 * em_is_child signifies that this element was built by transposing a member
 * for an inheritance parent relation to represent the corresponding expression
 * on an inheritance child.  The element should be ignored for all purposes
 * except constructing inner-indexscan paths for the child relation.

Likely these need to be ignored a bit less.  A couple of Greg's
undocumented hacks seem to be related to this point, but not all of
them.  In any case you'd need some careful thought about when to
ignore child EMs and when not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Using streaming replication as log archiving

2010-09-28 Thread Fujii Masao

On Tue, Sep 28, 2010 at 5:23 PM, Magnus Hagander mag...@hagander.net wrote:
 When I ran that, the size of the WAL file in inprogress directory
 became more than 16MB. Obviously something isn't right.

 Wow, that's weird. I'm unable to reproduce that here - can you try to
 figure out why that happened?

Sorry, I overlooked the single-digit figure in the result of ls -l.
To be exact, the size of the WAL file in inprogress directory can be
less than 16MB. Here is the result of ls -l inprogress.

$ ls -l inprogress/
total 1724
-rw-rw-r-- 1 postgres postgres 1757352 Sep 29 12:03 00010003

This also would be problem since the WAL file smaller than 16MB cannot
be used for recovery. I think that pg_streamrecv should create 16MB
file with zero at first, and write the received WAL records in that, as
walreceiver does.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: I: [HACKERS] About Our CLUSTER implementation is pessimal patch

2010-09-28 Thread Josh Kupershmidt

On Mon, Sep 27, 2010 at 10:05 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 I re-ordered some description in the doc. Does it look better?
 Comments and suggestions welcome.

I thought this paragraph was a little confusing:

! In the second case, a full table scan is followed by a sort operation.
! The method is faster than the first one when the table is highly
fragmented.
! You need twice disk space of the sum in the case. In addition to the free
! space needed by the previous case, this approach may also need a temporary
! disk sort file which can be as big as the original table.

I think the worst-case disk space could be made a little more clear
here, and maybe some general wordsmithing as well. I wasn't sure what
twice disk space of the sum was in this description -- sum of what
(table and all indexes?). And does twice disk space include the
temporary disk sort file? Here's an idea of how I think this paragraph
could be cleaned up a bit, if my understanding of the disk space
required is about right:

! In the second case, a full table scan is followed by a sort operation.
! This method is faster than when the table is highly fragmented.
! However, commandCLUSTER/command may require available disk space of
! up to twice the sum of the size of the table and its indexes, if
it uses a temporary
! disk sort file, which can be as big as the original table.

Also, AIUI, this second clustering method is similar to the older
idiom of CREATE TABLE new AS SELECT * FROM old ORDER BY col; Since the
paragraph describing this older idiom is being removed, perhaps a
brief mention in the documentation could be made of this similarity.

Some more wordsmithing: change
!  The planner tries to choose a faster method in them base on the
information
to:
!  The planner tries to choose the fastest method based on the information

I started looking at the performance impact of this patch based on
Leonardo's SQL file. On the 2 million row table, I see a consistent
~10% advantage for the sequential scan clusters. I'm going to try to
run the bigger tests a few times and post results from there when I
get a chance.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: I: [HACKERS] About Our CLUSTER implementation is pessimal patch

2010-09-28 Thread Alvaro Herrera

Excerpts from Josh Kupershmidt's message of mar sep 28 23:53:33 -0400 2010:

 I started looking at the performance impact of this patch based on
 Leonardo's SQL file. On the 2 million row table, I see a consistent
 ~10% advantage for the sequential scan clusters. I'm going to try to
 run the bigger tests a few times and post results from there when I
 get a chance.

10% is nothing.  I was expecting this patch would give an order of
magnitude of improvement or somethine like that in the worst cases of
the current code (highly unsorted input)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Pavel Stehule

2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2010/9/28 Tom Lane t...@sss.pgh.pa.us:
 Sure it can: it could be a parenthesized top-level query.  In fact,
 that's what plpgsql will assume if you feed it that syntax today.

 no - there are not any legal construct FOR r IN (..)

 You are simply wrong, sir, and I suggest that you go read the SQL
 standard until you realize that.  Consider for example

        for r in (SELECT ... FROM a UNION SELECT ... FROM b) INTERSECT (SELECT 
 ... FROM c) LOOP ...

 The parentheses here are not merely legal, they are *necessary*, else
 the semantics of the UNION/INTERSECT operations change.


ok, then probably one variant is for-in-array array_expr. Is there agreement?

Regards

Pavel Stehule




                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: I: [HACKERS] About Our CLUSTER implementation is pessimal patch

2010-09-28 Thread Itagaki Takahiro

On Wed, Sep 29, 2010 at 1:27 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 I see a consistent
 ~10% advantage for the sequential scan clusters.

 10% is nothing.  I was expecting this patch would give an order of
 magnitude of improvement or somethine like that in the worst cases of
 the current code (highly unsorted input)

Yes. It should be x10 faster than ordinary method in the worst cases.

I have a performance result of pg_reorg, that performs as same as
the patch. It shows 16 times faster than the old CLUSTER. In addition,
it was slow if not fragmented. (So, it should not be consistent.)
http://reorg.projects.postgresql.org/

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: plpgsql - for in array statement

2010-09-28 Thread Darren Duncan


Andrew Dunstan wrote:

On 09/28/2010 09:31 PM, Darren Duncan wrote:


Code that quotes all of its identifiers, such as with:

  FOR EACH var IN array_expr LOOP ...


This doesn't help in the least if the array is an expression rather than 
simply a variable - we're not going to start quoting expressions.


I wrote that wrong.  I should have said array_var not array_expr.  I am 
certainly not advocating quoting expressions, and didn't mean to imply that 
here.  My point was that if a token is always interpreted as a keyword rather 
than an identifier when there is ambiguity, then quoting would let users name an 
identifier each or EACH.  In any event, I will not push this since it 
doesn't address the real issue of language changes not breaking the general case 
of legacy code; it only says how users can insulate themselves. -- Darren Duncan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: I: [HACKERS] About Our CLUSTER implementation is pessimal patch

2010-09-28 Thread Itagaki Takahiro

On Wed, Sep 29, 2010 at 12:53 PM, Josh Kupershmidt schmi...@gmail.com wrote:
 I thought this paragraph was a little confusing:

Thanks for checking.

 !     In the second case, a full table scan is followed by a sort operation.
 !     The method is faster than the first one when the table is highly
 fragmented.
 !     You need twice disk space of the sum in the case. In addition to the 
 free
 !     space needed by the previous case, this approach may also need a 
 temporary
 !     disk sort file which can be as big as the original table.

 I think the worst-case disk space could be made a little more clear
 here, and maybe some general wordsmithing as well. I wasn't sure what
 twice disk space of the sum was in this description -- sum of what
 (table and all indexes?).

To be exact, It's very complex.
During reconstructing tables, it requires about twice disk space of
the old table (for sort tapes and the new table).
After sorting the table, CLUSTER performs REINDEX. We need
{same size of the new table} + {twice disk space of the new indexes}.
Also, new relations will be the same sizes of old relations if they
have no free spaces.

So, I think twice disk space of the sum of table and indexes would be
the simplest explanation for safe margin.

 Also, AIUI, this second clustering method is similar to the older
 idiom of CREATE TABLE new AS SELECT * FROM old ORDER BY col; Since the
 paragraph describing this older idiom is being removed, perhaps a
 brief mention in the documentation could be made of this similarity.

Good idea.

 Some more wordsmithing: change
 !      The planner tries to choose a faster method in them base on the
 information
 to:
 !      The planner tries to choose the fastest method based on the information

Thanks.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-09-28 Thread Sushant Sinha

Any updates on this?


On Tue, Sep 21, 2010 at 10:47 PM, Sushant Sinha sushant...@gmail.comwrote:

  I looked at this patch a bit.  I'm fairly unhappy that it seems to be
  inventing a brand new mechanism to do something the ts parser can
  already do.  Why didn't you code the url-part mechanism using the
  existing support for compound words?

 I am not familiar with compound word implementation and so I am not sure
 how to split a url with compound word support. I looked into the
 documentation for compound words and that does not say much about how to
 identify components of a token. Does a compound word split by matching
 with a list of words? If yes, then we will not be able to use that as we
 do not know all the words that can appear in a url/host/email/file.

 I think another approach can be to use the dict_regex dictionary
 support. However, we will have to match the regex with something that
 parser is doing.

 The current patch is not inventing any new mechanism. It uses the
 special handler mechanism already present in the parser. For example,
 when the current parser finds a URL it runs a special handler called
 SpecialFURL which resets the parser position to the start of token to
 find hostname. After finding the host it moves to finding the path. So
 you first get the URL and then the host and finally the path.

 Similarly, we are resetting the parser to the start of the token on
 finding a url to output url parts. Then before entering the state that
 can lead to a url we output the url part. The state machine modification
 is similar for other tokens like file/email/host.


  The changes made to parsetext()
  seem particularly scary: it's not clear at all that that's not breaking
  unrelated behaviors.  In fact, the changes in the regression test
  results suggest strongly to me that it *is* breaking things.  Why are
  there so many diffs in examples that include no URLs at all?
 

 I think some of the difference is coming from the fact that now pos
 starts with 0 and it used to be 1 earlier. That is easily fixable
 though.

  An issue that's nearly as bad is the 100% lack of documentation,
  which makes the patch difficult to review because it's hard to tell
  what it intends to accomplish or whether it's met the intent.
  The patch is not committable without documentation anyway, but right
  now I'm not sure it's even usefully reviewable.

 I did not provide any explanation as I could not find any place in the
 code to provide the documentation (that was just a modification of state
 machine). Should I do a separate write-up to explain the desired output
 and the changes to achieve it?

 
  In line with the lack of documentation, I would say that the choice of
  the name parttoken for the new token type is not helpful.  Part of
  what?  And none of the other token type names include the word token,
  so that's not a good decision either.  Possibly url_part would be a
  suitable name.
 

 I can modify it to output url-part/host-part/email-part/file-part if
 there is an agreement over the rest of the issues. So let me know if I
 should go ahead with this.

 -Sushant.

86 matches

Mail list logo