Re: [HACKERS] updated hstore patch

2009-09-20 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

  David E. Wheeler da...@kineticode.com writes:
  ... I think that this patch is ready for committer review and, perhaps,  
  committing. The code looks clean (though mainly over my head) and the  
  functionality is top-notch.

 Tom Given the number of questions in your review, I don't think this is
 Tom actually ready to commit.  I'm marking it waiting on author instead.

I will resubmit (hopefully in a day or two) with the following changes:

 - removal of the \set schema variable stuff from the .sql script
 - documentation fixes as mentioned in my previous email
 - additional documentation for some of the newer features
 - more internal code documentation covering the handling of old formats

I'd appreciate public feedback on:

 - whether conversions to/from a {key,val,key,val,...} array are needed
   (and if there's strong opinions in favour of making them casts; in the
   absence of strong support for that, I'll stick to functions)

 - what to do when installing the new version's .sql into an existing db;
   the send/recv support and some of the index support doesn't get installed
   if the hstore type and opclasses already exist. In the case of an upgrade
   (or a dump/restore from an earlier version) it would be nice to make all
   the functionality available; but there's no CREATE OR REPLACE for types
   or operator classes.

If there are any more potential showstoppers I'd appreciate hearing about
them now rather than later.

-- 
Andrew (irc:RhodiumToad)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PGCluster-II Progress

2009-09-20 Thread Markus Wanner
Mitani-San,

thank you for this heads up on PGCluster-II. The partial data idea
sounds very interesting and I'm looking forward to an inspiring meeting
in Tokyo.

Kind Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba.conf: samehost and samenet [REVIEW]

2009-09-20 Thread Magnus Hagander
On Sun, Sep 20, 2009 at 05:59, Abhijit Menon-Sen a...@toroid.org wrote:
 I think the patch is more or less ready, but I have a few minor
 comments:

 First, it needs to be reformatted to not use a space before the opening
 parentheses in (some) function calls and definitions.

 *** a/doc/src/sgml/client-auth.sgml
 --- b/doc/src/sgml/client-auth.sgml
 [...]

 +       paraInstead of an replaceableCIDR-address/replaceable, you can 
 specify
 +        the values literalsamehost/literal or 
 literalsamenet/literal. To
 +        match any address on the subnets connected to the local machine, 
 specify
 +        literalsamenet/literal. By specifying 
 literalsamehost/literal, any
 +        addresses present on the network interfaces of local machine will 
 match.
 +       /para
 +

 I'd suggest something like the following instead:

    paraInstead of a replaceableCIDR-address/replaceable, you can
    specify literalsamehost/literal to match any of the server's own
    IP addresses, or literalsamenet/literal to match any address in
    a subnet that the server belongs to.

 *** a/src/backend/libpq/hba.c
 --- b/src/backend/libpq/hba.c
 [...]

 +     else if (addr-sa_family == AF_INET 
 +                      raddr-addr.ss_family == AF_INET6)
 +     {
 +             /*
 +              * Wrong address family.  We allow only one case: if the file
 +              * has IPv4 and the port is IPv6, promote the file address to
 +              * IPv6 and try to match that way.
 +              */

 How about this instead:

    If we're listening on IPv6 but the file specifies an IPv4 address to
    match against, we promote the latter also to an IPv6 address before
    trying to match the client's address.

 (The comment is repeated elsewhere.)

That's actually a copy/paste from the code that's in hba.c now. That's
not reason not to fix it of course :-)


 + int
 + pg_foreach_ifaddr(PgIfAddrCallback callback, void * cb_data)
 + {
 + #ifdef WIN32
 +     return foreach_ifaddr_win32(callback, cb_data);
 + #else /* !WIN32 */
 + #ifdef HAVE_GETIFADDRS
 +     return foreach_ifaddr_getifaddrs(callback, cb_data);
 + #else /* !HAVE_GETIFADDRS */
 +     return foreach_ifaddr_ifconf(callback, cb_data);
 + #endif
 + #endif /* !WIN32 */
 + }

 First, writing it this way is less noisy:

    #ifdef WIN32
        return foreach_ifaddr_win32(callback, cb_data);
    #elif defined(HAVE_GETIFADDRS)
        return foreach_ifaddr_getifaddrs(callback, cb_data);
    #else
        return foreach_ifaddr_ifconf(callback, cb_data);
    #endif

 Second, I can't see that it makes very much difference, but why do it
 this way at all? You could just have each of the three #ifdef blocks
 define a function named pg_foreach_ifaddr() and be done with it. No
 need for a fourth function.

That was my thought as well when I looked at the patch. It'd be
clearer and I think more in line with what we do at other places.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Anonymous code blocks

2009-09-20 Thread Dimitri Fontaine
Hi,

Tom Lane t...@sss.pgh.pa.us writes:
 Andrew Dunstan and...@dunslane.net writes:
 Really? That wasn't my expectation at all. I expected that the code 
 would in effect be always returning void. I think you're moving the 
 goalposts a bit here. I don't think we need a RETURNS clause on it for 
 it to be useful.

Yes it'd be useful without it too, and this can come in later in the
game, granted too. Will continue review without it.

 Yeah.  The presented examples aren't tremendously convincing, as they
 both beg the question why not just do a SELECT?.  

Where I want to go from those example are for the first case being able
to run SQL commands on a bunch of tables and return a 3 setof
(schemaname, tablename, check bool). The second one (server_version) is
intended to go in extension install script, in order to branch on the
server_version and behave accordingly, returning what it is that have
been done or sth useful.

But I didn't give it much time yet, and I'll confess it was the first
ideas coming in while playing with the feature. Proper ways to do things
with what's currently in the patch may come after some more playing.

 It's also not
 exactly apparent to me why redefining the behavior of SELECT in a
 plpgsql function would be a better thing than having RETURN do it.

Well convenience for lazy DBA only, skipping the declaring of vars in a DO
command... it's following the idea that we might want not exactly
plpgsql in there, which I can see as a very bad one too, wrt code
maintenance and all. Maybe offering later a new PL intended at
interactive processing would better fit this line of thoughs.

 I think adding onto DO capabilities is something we could do later
 if demand warrants.  I'd prefer to underdesign it for starters than
 to encrust it with features that might not be needed.

 BTW, what happens with the current patch if you try to do a RETURN?

dim=# do $$begin return; end;$$;
ERROR:  invalid typLen: 0
CONTEXT:  PL/pgSQL function inline while casting return value to function's 
return type

dim=# do $$begin return 1; end;$$;  


ERROR:  RETURN cannot have a parameter in function returning void at or near 1
LINE 1: do $$begin return 1; end;$$;
  ^
Regards,
-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Anonymous code blocks

2009-09-20 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 That doesn't seem appropriate.  Returned With Feedback means that the
 patch is dead as far as this CommitFest goes, which isn't what you
 seem to be saying at all.  I think this should stay Needs Review until
 it's had a full review, and then we can decide where it goes from
 there after that.

Yeah, don't want to close it already. Bitten by the labels once more,
but it'll come :)

-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread Ron Mayer
Andrew Gierth wrote:
 I'd appreciate public feedback on:
  - whether conversions to/from a {key,val,key,val,...} array are needed
(and if there's strong opinions in favour of making them casts; in the
absence of strong support for that, I'll stick to functions)

Strikes me as an independent separate patch.  It seems totally
orthogonal to the features in the patch as submitted, no?

  - what to do when installing the new version's .sql into an existing db;
the send/recv support and some of the index support doesn't get installed
if the hstore type and opclasses already exist. In the case of an upgrade
(or a dump/restore from an earlier version) it would be nice to make all
the functionality available; but there's no CREATE OR REPLACE for types
or operator classes.

It seems similar in ways to the PostGIS upgrade issues when their
types and operators change:
 http://postgis.refractions.net/docs/ch02.html#upgrading
It seems they've settled on a script which processes the dump file
to exclude the parts that would conflict?

If the perfect solution is too complex, I'd also kinda hope this isn't a
show-stopper for this patch, but rather a TODO for the future modules feature.


 If there are any more potential showstoppers I'd appreciate hearing about
 them now rather than later.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GRANT ON ALL IN schema

2009-09-20 Thread Abhijit Menon-Sen
(This is a partial review of the grantonall-20090810v2.diff patch posted
by Petr Jelinek on 2009-08-10 (hi PJMODOS!). See
http://archives.postgresql.org/message-id/4a7f5853.5010...@pjmodos.net
for the original message.)

I have not yet been able to do a complete review of this patch, but I am
posting this because I'll be travelling for a week starting tomorrow. My
comments are based mostly on reading the patch, and not on any intensive
testing of the feature. I have left the patch status unchanged at needs
review, although I think it's close to ready for committer.

I really like this patch. It's easy to understand and written in a very
straightforward way, and addresses a real need that comes up time and
again on various support fora. I have only a couple of minor comments.

1. The patch did apply to HEAD and build cleanly, but there are now a
   couple of minor (documentation) conflicts. (Sorry, I would have fixed
   them and reposted a patch, but I'm running out of time right now.)

 *** a/doc/src/sgml/ref/grant.sgml
 --- b/doc/src/sgml/ref/grant.sgml
 [...]
 
 para
 +There is also the possibility of granting permissions to all objects of
 +given type inside one or multiple schemas. This functionality is 
 supported
 +for tables, views, sequences and functions and can done by using
 +ALL {TABLES|SEQUENCES|FUNCTIONS} IN SCHEMA schemaname syntax in place
 +of object name.
 +   /para
 + 
 +   para

2. Here I suggest the following wording:

para
You can also grant permissions on all tables, sequences, or
functions that currently exist within a given schema by specifying
ALL {TABLES|SEQUENCES|FUNCTIONS} IN SCHEMA schemaname in place of
an object name.
/para

3. I believe MySQL's grant all privileges on foo.* to someone grants
   privileges on all existing objects in foo _but also_ on any objects
   that may be created later. This patch only gives you a way to grant
   privileges only on the objects currently within a schema. I strongly
   prefer this behaviour myself, but I do think the documentation needs
   a brief mention of this fact, to avoid surprising people. That's why
   I added that currently exist to (2), above. Maybe another sentence
   that specifically says that objects created later are unaffected is
   in order. I'm not sure.

-- ams

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Anonymous code blocks

2009-09-20 Thread Pavel Stehule
2009/9/20 Dimitri Fontaine dfonta...@hi-media.com:
 Hi,

 Tom Lane t...@sss.pgh.pa.us writes:
 Andrew Dunstan and...@dunslane.net writes:
 Really? That wasn't my expectation at all. I expected that the code
 would in effect be always returning void. I think you're moving the
 goalposts a bit here. I don't think we need a RETURNS clause on it for
 it to be useful.

 Yes it'd be useful without it too, and this can come in later in the
 game, granted too. Will continue review without it.

Hello

I think so RETURN there has some sense. It should be optional, and
have to specify exit status. So return non zero value means exit psql
with returned value as exit status.

It would be interesting, if we could access from anonymous block some
psql internal variables.

regards
Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread Tom Lane
Ron Mayer rm...@cheapcomplexdevices.com writes:
 Andrew Gierth wrote:
 - what to do when installing the new version's .sql into an existing db;
 the send/recv support and some of the index support doesn't get installed
 if the hstore type and opclasses already exist. In the case of an upgrade
 (or a dump/restore from an earlier version) it would be nice to make all
 the functionality available; but there's no CREATE OR REPLACE for types
 or operator classes.

 If the perfect solution is too complex, I'd also kinda hope this isn't a
 show-stopper for this patch, but rather a TODO for the future modules feature.

Yeah, this is a long-standing generic issue, and not really hstore's
problem to fix.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GRANT ON ALL IN schema

2009-09-20 Thread Abhijit Menon-Sen
At 2009-09-20 20:20:11 +0530, a...@toroid.org wrote:

 1. The patch did apply to HEAD and build cleanly, but there are now a
couple of minor (documentation) conflicts.

To be more clear, what I meant is that it did apply and build cleanly
when it was posted, but things have drifted enough now that applying
it causes conflicts in some sgml files.

-- ams

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ...  A related question is whether we
 should replace the CSV option with a FORMAT option for which one
 of the possible choices is CSV (much as we did with EXPLAIN).

That might be a good idea --- otherwise we'd need some ad-hoc code
to check that only one format option has been selected.  On the other
hand, it wouldn't be all that much code; and it would be a rather larger
change from previous behavior than we were contemplating to start with.
Comments anyone?

 One other minor point is that the patch introduces an empty-list
 syntax for individual option values, but then treats it the same
 as specifying nothing:

 It seemed like a good idea because if you can do force_quote (a, b, c)
 and force_quote (a, b) you might think that you could also do
 force_quote ().  Particularly for the sake of people generating SQL
 automatically by some means, it seems like this might simplify life.
 But possibly it shouldn't evaluate to the same value, so that you
 can't write OIDS () to mean the same thing as OIDS ON.

Yeah, that type of scenario was why I didn't like it.  I am not
impressed by the empty-list argument, either.  We don't support empty
lists with that sort of syntax in most other places, so why here?
There are counter-precedents even in the syntax of COPY itself:
you can't write () for an empty column name list, and you can't
write () for an empty copy_generic_option_list.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Emmanuel Cecchet

Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
  

...  A related question is whether we
should replace the CSV option with a FORMAT option for which one
of the possible choices is CSV (much as we did with EXPLAIN).



That might be a good idea --- otherwise we'd need some ad-hoc code
to check that only one format option has been selected.  On the other
hand, it wouldn't be all that much code; and it would be a rather larger
change from previous behavior than we were contemplating to start with.
Comments anyone?
  
That would assume that the semantic of the other options (header, quote, 
espace, ...) is exactly the same for each format. Otherwise this will be 
a nightmare to document.
If we don't prefix with CSV, I guess that some users will spend some 
time to figure out that NULL is not a format option but FORCE NOT NULL 
is. If an option is only supported by one format (let's say XML) we will 
have to document every time which options are supported by which format 
which would be much easier and intuitive is options were readily 
prefixed by the format name.
If you look at the COPY documentation in the error logging patch, if we 
strip the error_logging prefix, it is going to be very confusing.
But I am willing to let Tom choose whatever he prefers as his birthday 
gift ;-)



One other minor point is that the patch introduces an empty-list
syntax for individual option values, but then treats it the same
as specifying nothing:
  


  

It seemed like a good idea because if you can do force_quote (a, b, c)
and force_quote (a, b) you might think that you could also do
force_quote ().  Particularly for the sake of people generating SQL
automatically by some means, it seems like this might simplify life.
But possibly it shouldn't evaluate to the same value, so that you
can't write OIDS () to mean the same thing as OIDS ON.



Yeah, that type of scenario was why I didn't like it.  I am not
impressed by the empty-list argument, either.  We don't support empty
lists with that sort of syntax in most other places, so why here?
There are counter-precedents even in the syntax of COPY itself:
you can't write () for an empty column name list, and you can't
write () for an empty copy_generic_option_list.
  
Well this one was in Robert's initial patch and I left it as is. I don't 
have any strong opinion for or against it.


Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-09-20 Thread Jeff Davis
On Sat, 2009-09-19 at 23:15 -0400, Robert Haas wrote:
 I was wondering if we couldn't introduce a dummy tuple name similar to
 OLD and NEW, called, say, OTHER.  Then instead of writing a =, you
 could write a = OTHER.a ... or perhaps a = OTHER.b ... although that
 might also open the door to more things than you want to support at
 this point.

Interesting idea. At this point though, there is enough disagreement
over the language that I just want to take the least-invasive route that
has the lowest chance of causing a problem. It looks like ALTER INDEX
might be the path of least resistance.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Tom Lane
Emmanuel Cecchet m...@asterdata.com writes:
 That would assume that the semantic of the other options (header, quote, 
 espace, ...) is exactly the same for each format. Otherwise this will be 
 a nightmare to document.

Well, sure, we should pick a different name for an option that means
something significantly different.  But for options that do mean
approximately the same thing in different formats, it doesn't seem
helpful to require different names to be used.

 If we don't prefix with CSV, I guess that some users will spend some 
 time to figure out that NULL is not a format option but FORCE NOT NULL 
 is. If an option is only supported by one format (let's say XML) we will 
 have to document every time which options are supported by which format 
 which would be much easier and intuitive is options were readily 
 prefixed by the format name.

No, I don't think so.  Suppose I write

COPY ... (xml_header on)

If HEADER isn't actually an option supported by XML format, what I will
get here is an unknown option error, which conveys just about nothing
--- is it really an unsupported combination, or did I just misspell the
option name?  If we go with the other way then I would expect

COPY ... (xml, header on)

to draw a specific HEADER is not supported in XML format error.
Of course, that will require some extra code to make it happen.
So you could argue that format-specific option names are easier
from the lazy programmer's viewpoint.  But I don't believe the
argument that they're better from the user's viewpoint.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sat, 2009-09-19 at 19:23 -0700, David Fetter wrote:
 It just occurred to me that SQL:2008 ASSERTION might already fit this
 feature. :)

I think I would only be able to enforce very specific types of
assertions that match the template. As I said to Robert, I think I'm
going to use ALTER INDEX for the syntax because it appears to be the
path of least resistance.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Sat, 2009-09-19 at 18:00 -0400, Tom Lane wrote:
 Well, you can't do it *exactly* the same way btree does, but what
 I would envision is first insert the index tuple and then do a
 dirty-snapshot search for conflicting tuples.  The interlock against
 conflicting concurrent inserts doesn't need all this new infrastructure
 you propose; just wait to see if conflicting transactions commit, same
 as we do now.  And I do maintain that that sort of code has a high risk
 of undetected bugs.

 How do you prevent deadlocks in the following case?

 T1: inserts into index
 T2: inserts into index
 T1: checks index for conflicts, finds T2
 T2: checks index for conflicts, finds T1

You get a deadlock failure, because both transactions will wait for each
other.  So what?  It's an error in any case, and you can get a reported
deadlock in constraint-enforcement scenarios today (conflicting FK
changes, for instance).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 I would still really like to decouple this from CREATE INDEX because of
 two reasons:
   1. Cannot support multiple constraints per index very easily. I think 
  this is a significant feature.
   2. Must decide to make constraint at the same time as making the 
  index, and once it's there, you can't remove it without dropping 
  the index.

I don't actually find either of those arguments to be credible in the
least.  I don't think that people will find it useful to enforce
multiple constraints with one index, and I don't believe that they'll
design an index without knowledge of the constraint they will enforce
with it.  The closest precedent we have is the UNIQUE constraint.
How often have we had requests to add or drop UNIQUE in an existing
index?  Maybe there were more than zero, but not by a lot.

As an example of why I don't believe the first item, consider something
like
create index ... (a = , b = )
(or whatever the syntax is to exclude equality on each column
separately).  Yeah, it will work, but have you considered the efficiency
implications?  Searching such an index for b, independently of a, is
going to suck to such an extent that you'd be *far* better off building
two separate indexes.  We do not have, and are unlikely ever to have,
index types in which a search that doesn't constrain the first index
column is efficient.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 12:31 -0400, Tom Lane wrote:
  T1: inserts into index
  T2: inserts into index
  T1: checks index for conflicts, finds T2
  T2: checks index for conflicts, finds T1
 
 You get a deadlock failure, because both transactions will wait for each
 other.  So what?  It's an error in any case, and you can get a reported
 deadlock in constraint-enforcement scenarios today (conflicting FK
 changes, for instance).

Well, in theory, one of the transactions may have been destined to be
aborted later anyway, and the deadlock detector might kill the wrong
one. But I agree, perhaps I'm over-thinking this one.

Aside: I just realized that my solution to the deadlock problem won't
work with your simpler idea anyway. When reading the index we've long
since lost the information about the specific insert of the specific
command of the other transaction.

I'll make the change.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW-based dblink (WIP)

2009-09-20 Thread Peter Eisentraut
On Wed, 2009-09-16 at 13:47 +0900, Itagaki Takahiro wrote:
 Peter Eisentraut pete...@gmx.net wrote:
 
  This patch is listed in the commitfest, but I think the consensus was
  that it needed some rework.
 
 No doubt, but SQL/MED will require a lot of works. Can we split the work
 into small parts? I intended FDW-based dblink to be one of the split jobs.

Sure, but I think what you are doing here isn't on anyone's agenda.  I
would imagine that the next step would be to implement foreign tables
using something like dblink's interface as underlying interface.  What
you are proposing doesn't really move us closer to that, or I'm
misunderstanding what you are trying to achieve.  So what is the purpose
of this patch anyway?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: walreceiver settings Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-20 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Fri, Sep 18, 2009 at 7:34 PM, Fujii Masao masao.fu...@gmail.com wrote:
 This approach is OK if the stand-alone walreceiver is treated steadily
 by the startup process like a child process under postmaster:

 * Handling of some interrupts: SIGHUP, SIGTERM?, SIGINT, SIGQUIT...
   For example, the startup process would need to rethrow walreceiver
   the interrupt from postmaster.

 * Communication with other child processes: stats collector? syslogger?...
   For example, the log message generated by walreceiver should also
   be collected by syslogger if requested.
 
 Also we should consider how to give a GUC parameter to the stand-alone
 walreceiver. In the initial patch, since walreceiver was a child process of
 postmaster, it could easily get any GUC parameter. But it's not so easy
 to give a GUC parameter to a stand-alone program.

Yes, good point.

 * some parameters for logging
   I think that the log messages generated by walreceiver should also be
   treated as well as the other postgres messages. For example, I'd like
   to specify log_line_prefix also for walreceiver.

Hmm, agreed, although we already have the same problem with
archive_command, and pg_standby in particular. I could live with that
for now.

The startup process could capture stderr from walreceiver and forward it
with elog(LOG).

 There are some approaches to give a GUC parameter to walreceiver.
 Which is the best?
 
 1) Give a parameter as a command-line argument of the stand-alone
 walreceiver. This is a straightforward approach, but wouldn't cover
 a reload of parameter.

The startup process could kill and restart walreceiver to reload. If
reloading is really required, that is. Which GUC parameters are we
concerned about? The ones related to logging you mentioned, but if we
handle logging via a pipe to the startup process, that won't be an issue.

 2) Give a parameter via pipe between the startup process and walreceiver.
 
 3) Change walreceiver to read a configuration file. The problem is that
 the command-line argument of postmaster doesn't affect walreceiver.
 The combination of 1) and 3) might be required.

Sounds complicated..

One option that you might well want to change on the fly is the
connection info string in recovery.conf. Neither of the above really
cater for that, unless we make walreceiver read recovery.conf as well. I
think we should keep walreceiver quite dumb.

 4) Change walreceiver back to a child process of postmaster.

Yeah, that's not out of the question either.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-09-20 Thread Tom Lane
BTW, I just thought of an issue that might change some of these
conclusions: what about supporting deferred constraint checking,
as we just recently managed to do for straight UNIQUE constraints?
I don't say that this has to be in the first cut, but it's something
we ought to try to leave room for down the road.

The current infrastructure for deferred uniqueness requires that the
thing actually be a constraint, with an entry in pg_constraint that
can carry the deferrability options.  So unless we want to rethink
that, this might be a sufficient reason to override my arguments
about not wanting to use CONSTRAINT syntax.

As far as implementation goes, I think there would be very little
choice but to use the insert-the-index-entry-first, check-later
approach; so your ideas involving extra state in shared memory
seem to fall to the ground anyhow.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 12:45 -0400, Tom Lane wrote:
 How often have we had requests to add or drop UNIQUE in an existing
 index?  Maybe there were more than zero, but not by a lot.

Ok.

 As an example of why I don't believe the first item, consider something
 like
   create index ... (a = , b = )
 (or whatever the syntax is to exclude equality on each column
 separately).  Yeah, it will work, but have you considered the efficiency
 implications?  Searching such an index for b, independently of a, is
 going to suck to such an extent that you'd be *far* better off building
 two separate indexes.  We do not have, and are unlikely ever to have,
 index types in which a search that doesn't constrain the first index
 column is efficient.

My use case was something else:

An index on (a, b, c) enforcing the constraints UNIQUE(a, b) and
UNIQUE(a, c).

UNIQUE(a, b) can be enforced efficiently. UNIQUE(a, c) might be less
efficient depending on the selectivity of a, but as long as a is
selective I think it's useful. The alternative is updating two indices
on every insert.

You may still think this use case is too marginal to bother supporting,
but I never made an argument for the use case you described above.

If we move away from multiple constraints per index, are you suggesting
that I also move the constraints out of pg_constraint and back into
pg_index?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 13:01 -0400, Tom Lane wrote:
 The current infrastructure for deferred uniqueness requires that the
 thing actually be a constraint, with an entry in pg_constraint that
 can carry the deferrability options.  So unless we want to rethink
 that, this might be a sufficient reason to override my arguments
 about not wanting to use CONSTRAINT syntax.

Ok. Using the word EXCLUSION would hopefully guard us against future
changes to SQL, but you know more about the subtle dangers of language
changes than I do.

So, do I still omit it from information_schema?

 As far as implementation goes, I think there would be very little
 choice but to use the insert-the-index-entry-first, check-later
 approach; so your ideas involving extra state in shared memory
 seem to fall to the ground anyhow.

True.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 My use case was something else:

 An index on (a, b, c) enforcing the constraints UNIQUE(a, b) and
 UNIQUE(a, c).

 UNIQUE(a, b) can be enforced efficiently. UNIQUE(a, c) might be less
 efficient depending on the selectivity of a, but as long as a is
 selective I think it's useful. The alternative is updating two indices
 on every insert.

 You may still think this use case is too marginal to bother supporting,
 but I never made an argument for the use case you described above.

You're right, it still seems remarkably marginal.  I'm rethinking
my position on use of CONSTRAINT syntax because of the deferrability
issue, but I'm still unconvinced that we need to allow the constraints
to be decoupled from the indexes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 So, do I still omit it from information_schema?

My thought is yes --- any representation of it within information_schema
would be so inaccurate/incomplete as to be worse than useless, IMO.
Peter might have a different idea though ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 13:13 -0400, Tom Lane wrote:
 You're right, it still seems remarkably marginal.  I'm rethinking
 my position on use of CONSTRAINT syntax because of the deferrability
 issue, but I'm still unconvinced that we need to allow the constraints
 to be decoupled from the indexes.

Ok, should I explicitly disallow multiple constraints on one index then?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Sun, 2009-09-20 at 13:13 -0400, Tom Lane wrote:
 You're right, it still seems remarkably marginal.  I'm rethinking
 my position on use of CONSTRAINT syntax because of the deferrability
 issue, but I'm still unconvinced that we need to allow the constraints
 to be decoupled from the indexes.

 Ok, should I explicitly disallow multiple constraints on one index then?

What I'm arguing for is a syntax in which the question doesn't even
arise, ie, a CONSTRAINT doesn't reference an existing index at all.
If that's not possible for whatever reason, then I think that
disallowing multiple references isn't going to buy any simplicity.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 13:28 -0400, Tom Lane wrote:
 What I'm arguing for is a syntax in which the question doesn't even
 arise, ie, a CONSTRAINT doesn't reference an existing index at all.
 If that's not possible for whatever reason, then I think that
 disallowing multiple references isn't going to buy any simplicity.

I believe that syntax is possible by specifying the index access method,
e.g.:

  CONSTRAINT name EXCLUSION (a =, b ) USING gist;

versus:

  CONSTRAINT name EXCLUSION (a =, b ) INDEX indexname;

And the former could build the index implicitly. I haven't written the
code yet, but I don't see any major problems.

So, should I eliminate the latter syntax and only support the former, or
should I support both?

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 I believe that syntax is possible by specifying the index access method,
 e.g.:

   CONSTRAINT name EXCLUSION (a =, b ) USING gist;

 versus:

   CONSTRAINT name EXCLUSION (a =, b ) INDEX indexname;

 And the former could build the index implicitly. I haven't written the
 code yet, but I don't see any major problems.

 So, should I eliminate the latter syntax and only support the former, or
 should I support both?

I'd vote for only supporting the former.

What worries me more about that syntax is the postfix-operator ambiguity
--- I think it'll be hard to expand it to expressions.  It might be
better to put the operator at the front; or maybe you need an extra
keyword in there.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Emmanuel Cecchet

Tom Lane wrote:

No, I don't think so.  Suppose I write

COPY ... (xml_header on)

If HEADER isn't actually an option supported by XML format, what I will
get here is an unknown option error, which conveys just about nothing
--- is it really an unsupported combination, or did I just misspell the
option name?

Well, I don't see why you would write that if the option is not documented.
Usually as a user, when I need to use a command, I look at the doc/man 
page and use the options that are indicated, I don't try to invent new 
options. That should prevent the kind of scenario you describe here:

  If we go with the other way then I would expect

COPY ... (xml, header on)

to draw a specific HEADER is not supported in XML format error.
Of course, that will require some extra code to make it happen.
So you could argue that format-specific option names are easier
from the lazy programmer's viewpoint.  But I don't believe the
argument that they're better from the user's viewpoint.
  
Here you will force every format to use the same set of options and if 
someone introduces a new option, you will have to modify all other 
formats to make sure they throw an error telling the user that this 
option is not supported. I don't think this is a great design and that 
it will be easy to extend.


Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Tom Lane
Emmanuel Cecchet m...@asterdata.com writes:
 Here you will force every format to use the same set of options

How does this force any such thing?

 and if 
 someone introduces a new option, you will have to modify all other 
 formats to make sure they throw an error telling the user that this 
 option is not supported.

Well, if we do it your way then we will instead need a collection of
code to throw errors for combinations like (xml on, csv_header on).
I don't really see any improvement there.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Emmanuel Cecchet

Tom Lane wrote:

Emmanuel Cecchet m...@asterdata.com writes:
  

Here you will force every format to use the same set of options



How does this force any such thing?
  
As far as I understand it, every format will have to handle every format 
options that may exist so that they can either implement it or throw an 
error.
and if 
someone introduces a new option, you will have to modify all other 
formats to make sure they throw an error telling the user that this 
option is not supported.



Well, if we do it your way then we will instead need a collection of
code to throw errors for combinations like (xml on, csv_header on).
I don't really see any improvement there.
  
That would argue in favor of a format option that defines the format. 
Right now I find it bogus to have to say (csv on, csv_header on). If 
csv_header is on that should imply csv on.
The only problem I have is that it is not obvious what options are 
generic COPY options and what are options of an option (like format 
options).
So maybe a tradeoff is to differentiate format specific options like in: 
(delimiter '.', format csv, format_header, format_escape...)
This should also make clear if someone develops a new format what 
options need to be addressed.


Emmanuel
PS: I don't know why but as I write this message I already feel that Tom 
hates this new proposal :-D


--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Tom Lane
Emmanuel Cecchet m...@asterdata.com writes:
 So maybe a tradeoff is to differentiate format specific options like in: 
 (delimiter '.', format csv, format_header, format_escape...)
 This should also make clear if someone develops a new format what 
 options need to be addressed.

I think that distinction would exist internally.  What I'm not
clear on is why we would want it visible externally.

But anyhow I think we have run through all the arguments, and it's
time for some votes from other people.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread David E. Wheeler

On Sep 18, 2009, at 6:27 PM, Andrew Gierth wrote:


However, I would prefer to keep the ability to do this:

psql --set hstore_schema='myschema' -f hstore.sql dbname

The logic to do it is a bit ugly, but editing the file to set what  
schema to

use is even uglier...


Yes, this should perhaps be generalized into a separate patch. I  
completely agree that it'd be useful and desirable.


The only open question I can see is what delete(hs,$1) resolves to  
when $1 is
an unknown-type placeholder; this is probably an incompatibility  
with the old
version if anyone is relying on that (but I don't see why they would  
be).


Given your examples, I think it probably should resolve to text as it  
does, as deleting a single key is likely to be a common case. It  
should otherwise be cast.



Because record doesn't express the fact that the return type of
populate_record is the SAME record type as its parameter type, whereas
anyelement does.


The lack of expressivity in records is beginning to annoy me.


David * I'd like to see more examples of the new functionality in
David the documentation. I'd be happy to contribute them once the
David main patch is committed.

Thanks. I'll do some (especially for populate_record, which really  
needs

them), but more can be added later.


Agreed. As I said, once this is committed I'll likely go over the docs  
and submit a patch myself.



Old values are converted to new values in core, but they can't be
changed on-disk unless something actually updates them.


Right, of course, thanks.


The overhead is possibly non-negligible for reading old values, but
old values can be promoted to new ones fairly simply (e.g. using
ALTER TABLE).


So then it's negligible for new values?

David * Regarding the bug you found in the old hstore format  
(mentioned
David [here](http://archives.postgresql.org/pgsql-hackers/2009-07/msg01422.php 
)

David ), has that been fixed? Is it a regression that should be
David back-patched?

That has not been fixed.


Should it be? I realize that it's a separate issue from this patch,  
and maybe it's too edge-case to bother with, given that no one has  
complained and it obviously won't exist once your patch is applied.  
Right?



Davidtry=# select 'a = 1, b = 2'::hstore::text[];
David   array
David---
David {a,1,b,2}

DavidI'd find this especially useful in my Perl applications, as
Davidthen I could easily fetch hstores as arrays and turn them
Davidinto hashes in my Perl code by simply doing `my %h = @{
David$row-{hstore} };`. Of course, the inverse would be useful
Davidas well:

Davidtry=# select ARRAY[ 'a', '1', 'b', '2']::hstore;
David   hstore
David
David a=1, b=2

DavidA function would work as well, failing community interest
Davidin an explicit cast.

I'm not sure I like these as casts but I'm open to opinions. Having  
them

as functions is certainly doable.


I think a function would be great here. A cast is something we can  
decide to add later, or one can add manually using the function.


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread David E. Wheeler

On Sep 20, 2009, at 8:43 AM, Tom Lane wrote:

If the perfect solution is too complex, I'd also kinda hope this  
isn't a
show-stopper for this patch, but rather a TODO for the future  
modules feature.


Yeah, this is a long-standing generic issue, and not really hstore's
problem to fix.


So then does there need to be some documentation for how to deal with  
this, for those doing an in-place upgrade from an existing hstore data  
type? Or would that discussion be in Bruce's tool's docs?


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread David E. Wheeler

On Sep 20, 2009, at 1:03 AM, Andrew Gierth wrote:

I will resubmit (hopefully in a day or two) with the following  
changes:


- removal of the \set schema variable stuff from the .sql script
- documentation fixes as mentioned in my previous email
- additional documentation for some of the newer features
- more internal code documentation covering the handling of old  
formats


+1. And maybe a discussion about upgrading old hstore values?


I'd appreciate public feedback on:

- whether conversions to/from a {key,val,key,val,...} array are needed
  (and if there's strong opinions in favour of making them casts; in  
the

  absence of strong support for that, I'll stick to functions)


Definitely functions. I'm strongly in favor of an explicit cast, as  
well, but I'm spoiled by Perl, so I may be overly biased. Functions  
will do to start.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Sep 20, 2009, at 8:43 AM, Tom Lane wrote:
 Yeah, this is a long-standing generic issue, and not really hstore's
 problem to fix.

 So then does there need to be some documentation for how to deal with  
 this, for those doing an in-place upgrade from an existing hstore data  
 type? Or would that discussion be in Bruce's tool's docs?

I'm inclined to correct the existing documentation, which says at the
bottom of http://developer.postgresql.org/pgdocs/postgres/contrib.html

  After a major-version upgrade of PostgreSQL, run the installation script
  again, even though the module's objects might have been brought forward
  from the old installation by dump and restore. This ensures that any new
  functions will be available and any needed corrections will be applied.

That recipe doesn't actually work for cases like this.  What *would*
work is loading the module *before* restoring from your old dump,
then relying on the CREATEs from the incoming dump to fail.

I believe we have already discussed the necessity for pg_upgrade to
support this type of subterfuge.  A module facility would be a lot
better of course, but we still need something for upgrading existing
databases that don't contain the module structure.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 13:49 -0400, Tom Lane wrote:
 I'd vote for only supporting the former.

Ok.

I just did some brief non-scientific in-memory benchmarks. I think it
has promise, but for now I think it can safely be set aside.

Results appended.

 What worries me more about that syntax is the postfix-operator ambiguity
 --- I think it'll be hard to expand it to expressions.  It might be
 better to put the operator at the front; or maybe you need an extra
 keyword in there.

How about OPERATOR, like:

  CONSTRAINT name
EXCLUSION (expr OPERATOR op, ...)
USING method;

I like it because it connects back to the name operator exclusion
constraint.

Regards,
Jeff Davis

---
Results (oversimplified benchmark):

As a control, two unique btrees (using old uniqueness mechanism) takes
37s.

DDL (old syntax, haven't changed it yet):

  create table one(a int, b int, c int);
  create index one_a_b_c_idx on one(a,b,c);
  alter table one add constraint one_a_b_constr
exclusion (a =, b =) using one_a_b_c_idx;
  alter table one add constraint one_a_c_constr
exclusion (a =, c =) index one_a_b_c_idx;

  create table two(a int, b int, c int);
  create index two_a_b_idx on two(a,b);
  create index two_a_c_idx on two(a,c);
  alter table two add constraint two_a_c_constr
exclusion (a =, c =) index two_a_c_idx;
  alter table two add constraint two_a_b_constr
exclusion (a =, b =) index two_a_b_idx;

Tests are of the form:

  -- test inserting into table with one big index with 10 b 
  -- values per a value
  insert into one select g1, g2, g2 
from generate_series(1,10) g1, generate_series(1,10) g2;

n: number of a values per b value
t1: results for one-index solution
t2: results for two-index solution

n t1 t2
---+--+---
  1000 | 105s | 57s
   100 |  47s | 54s
10 |  44s | 53s
 1 |  42s | 56s

So, the one-index solution shows about 10-20% benefit over the two-index
solution when the number of b values per a value drops to around
100. Not bad, but nothing to write home about, because it's still
outperformed by the existing btree enforcement mechinism. I think it has
promise for some situations though; such as larger key size, leaf pages
not in memory, etc.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 But anyhow I think we have run through all the arguments, and it's
 time for some votes from other people.

Same option names whatever the format please. We know the context is
important and people will be able to understand that header does not
refer exactly to the same processing when applied to XML or CSV. But
still refers to if those first lines are imported too...

And I think I prefer 'format csv' rather than 'csv on'.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Upgrading towards managed extensions (was Re: [HACKERS] updated hstore patch)

2009-09-20 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I believe we have already discussed the necessity for pg_upgrade to
 support this type of subterfuge.  A module facility would be a lot
 better of course, but we still need something for upgrading existing
 databases that don't contain the module structure.

An idea would be to have an external tool to prepare the transition. The
tool would have to be able to build the pg_depend entries for a given
database and a given extension. It could process this way:

 - create a new empty database, pg_dump -Ft  empty.dump
 - install the given extension, pg_dump -Ft  extended.dump
 - compare empty and extended dumps catalogs (pg_restore -l)
 - diffs are from the extension, look them up in given database
 - for each extension's object, try drop cascade it then rollback
 - parse error messages

Now the diff lookup gives a first set of pg_depend entries, which is to
be completed in the DROP CASCASE error parsing step.

Given this we yet have to prepare the database so that pg_dump from
extensions aware major version will be able to dump CREATE and INSTALL
extension commands rather than the extension.sql install file. This can
be done by installing the newer extension on the target database and
point the tool to this, in order to drain the needed catalog entries.

It'll be slow and will take AccessExclusive locks, but you can do it on
a staging server. The output should be a SQL script filling pg_extension
and pg_depend on the existing database.

So user steps are:
 - pg_addextension olddb newdb extname install.sql [...]  exts.sql
 - psql -f exts.sql olddb

From there pg_dump from new version is happy.

Regards,
-- 
dim

PS: once more, devil is the details, and the extension code is to be
written. Hope doing so for 11/15 commitfest, over free time.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
Update on operator exclusion constraints (OXC for short):

After a lot of discussion, I think a lot of progress has been made. Here
is my summary, please let me know if I've left anything out or not
addressed some concern.

1. Constraint syntax, part of CREATE/ALTER TABLE:

  [CONSTRAINT name] EXCLUSION (expr OPERATOR op, ...)
USING method
[ DEFERRABLE | NOT DEFERRABLE ]
[ INITIALLY DEFERRED | INITIALLY IMMEDIATE ];

Table constraint syntax was chosen because of the ability to support
DEFERRABLE[1] and the interest in a more declarative syntax[2].

We omitted the [INDEX indexname] clause because the usefulness of
defining multiple constraints using one index or defining the constraint
separately from the index was judged to be too marginal[3][4][5]. Some
brief benchmarks showed some promise[6], perhaps interesting to explore
later.

Also, we introduce the OPERATOR keyword in between the expression and
the operator to disambiguate the syntax[5]. Nobody has affirmed the use
of OPERATOR for the disambiguation, but it seems like the obvious choice
to me.

2. information_schema

We omit operator exclusion constraints from the information schema, on
the grounds that there is no way to represent them usefully there[7][8].

3. Simplify the constraint checking procedure itself

Tom suggested a simpler constraint-checking procedure[9]. It introduces
the rare possibility of deadlocks[10], but that possibility exists for
other constraints anyway[11]. My scheme for avoiding deadlocks was
significantly more complex, and would become even more complex for
deferrable constraints.

4. expr is an expression over the table's attributes and will be used
to generate a functional index with the same expression to enforce the
constraint.

5. We reject non-symmetric operators[12], like , but allow
non-reflexive operators[13] like .

6. Semantics of constraint[14] are such that for any two tuples A and B,
and for a constraint:

  EXCLUSION (e1 OPERATOR op1, ..., eN OPERATOR opN)

the constraint is violated if:

  A.e1 op1 B.e1 AND
  ... AND
  A.eN opN B.eN

7. LIKE is still unresolved. I don't have a strong opinion here.

When INCLUDING CONSTRAINTS and INCLUDING INDEXES are both specified:
  a. copy all OXCs and indexes
  b. copy no OXCs or indexes
When INCLUDING CONSTRAINTS is specified but not INCLUDING INDEXES:
  a. copy all OXCs and indexes
  b. copy no OXCs or indexes
When INCLUDING INDEXES is specified but not INCLUDING CONSTRAINTS:
  a. copy all OXCs, including indexes
  b. copy all indexes created implicitly for OXCs, but not the 
 constraints themselves
  c. copy no OXCs or indexes

We can also emit various types of messages if we think the user is
making a mistake.

UNIQUE behavior here doesn't provide a good cue, because the constraint
is implemented inside the index, so copying the index does copy the
constraint. Brendan made a strong argument[15] that the behavior of LIKE
with UNIQUE is wrong, but I don't know if we want to try to fix that
now. I'd like some more input before I actually take care of this item.



The rest of the issues were mostly non-controversial. I will start
making some of these changes and post an updated patch and TODO list.

Regards,
Jeff Davis




[1] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01352.php
[2] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01018.php
[3] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01348.php
[4] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01355.php
[5] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01360.php
[6] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01369.php
[7] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01310.php
[8] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01356.php
[9] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01315.php
[10] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01317.php
[11] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01347.php
[12] http://archives.postgresql.org/pgsql-hackers/2009-09/msg00977.php
[13] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01039.php
[14] http://archives.postgresql.org/pgsql-hackers/2009-09/msg00971.php
[15] http://archives.postgresql.org/pgsql-hackers/2009-09/msg00755.php


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 1. Constraint syntax, part of CREATE/ALTER TABLE:

   [CONSTRAINT name] EXCLUSION (expr OPERATOR op, ...)

Have you actually built this grammar?  I don't think it avoids the
problem, because OPERATOR is possible within a_expr.

Also, don't forget the possibility of wanting a nondefault opclass.
(I'm wondering a bit if anyone will want a WHERE clause, too, though
adding that later shouldn't pose any big syntactic obstacles.)

 Brendan made a strong argument[15] that the behavior of LIKE
 with UNIQUE is wrong, but I don't know if we want to try to fix that
 now. I'd like some more input before I actually take care of this item.

That's really a separate issue, but I think we need to do something to
make it more consistent.  My first thought is that anything made
via CONSTRAINT syntax ought to be copied by LIKE INCLUDING CONSTRAINTS,
while LIKE INCLUDING INDEXES should copy anything you made via CREATE
INDEX.  But note this assumes that there is a clear distinction between
the two.  The constraint-depending-on-index design that you started
with would not permit such a rule, or at least it would mean that
INCLUDING CONSTRAINTS EXCLUDING INDEXES would have failure cases.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Robert Haas
On Sun, Sep 20, 2009 at 2:25 PM, Emmanuel Cecchet m...@asterdata.com wrote:
 Tom Lane wrote:
 Emmanuel Cecchet m...@asterdata.com writes:

 Here you will force every format to use the same set of options

 How does this force any such thing?


 As far as I understand it, every format will have to handle every format
 options that may exist so that they can either implement it or throw an
 error.

I don't think this is really true.  To be honest with you, I think
it's exactly backwards.  The way the option-parsing logic works, we
parse each option individually FIRST.  Then at the end we do
cross-checks to see whether there is an incompatibility in the
combination specified.  So if two different formats support the same
option, we just change the cross-check to say that foo is OK with
either format bar or format baz.  On the other hand, if we split the
option into bar_foo and baz_foo, then the first loop that does the
initial parsing has to support both cases, and then you still need a
separate cross-check for each one.

 That would argue in favor of a format option that defines the format. Right
 now I find it bogus to have to say (csv on, csv_header on). If csv_header is
 on that should imply csv on.
 The only problem I have is that it is not obvious what options are generic
 COPY options and what are options of an option (like format options).
 So maybe a tradeoff is to differentiate format specific options like in:
 (delimiter '.', format csv, format_header, format_escape...)
 This should also make clear if someone develops a new format what options
 need to be addressed.

I think this is a false dichotomy.  It isn't necessarily the case that
every format will support a delimiter option either.  For example, if
we were to add an XML or JSON format (which I'm not at all convinced
is a good idea, but I'm sure someone is going to propose it!) it
certainly won't support specifying an arbitrary delimiter.

IOW, *every* format will have different needs and we can't necessarily
know which options will be applicable to those needs.  But as long as
we agree that we won't use the same option for two different
format-specific options with wildly different semantics, I don't think
that undecorated names are going to cause us much trouble.  It's also
less typing.

 PS: I don't know why but as I write this message I already feel that Tom
 hates this new proposal :-D

I get those feeling sometimes myself.  :-)  Anyway, FWIW, I think Tom
has analyzed this one correctly...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated hstore patch

2009-09-20 Thread Andrew Gierth
 David == David E Wheeler da...@kineticode.com writes:

  The only open question I can see is what delete(hs,$1) resolves to
  when $1 is an unknown-type placeholder; this is probably an
  incompatibility with the old version if anyone is relying on that
  (but I don't see why they would be).

 David Given your examples, I think it probably should resolve to
 David text as it does, as deleting a single key is likely to be a
 David common case. It should otherwise be cast.

I think you're missing the point here; I can't control what it resolves
to, since that's the job of the function overload resolution code.

But I checked, and delete(hstore,$1) still resolves to
delete(hstore,text) when the type of $1 is not specified, so there's
no compatibility issue there that I can see. (I'm not sure I
understand _why_ it resolves to that rather than being ambiguous...)

  The overhead is possibly non-negligible for reading old values,
  but old values can be promoted to new ones fairly simply
  (e.g. using ALTER TABLE).

 David So then it's negligible for new values?

Yes. (One bit test, done inline)

-- 
Andrew.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 17:54 -0400, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
  1. Constraint syntax, part of CREATE/ALTER TABLE:
 
[CONSTRAINT name] EXCLUSION (expr OPERATOR op, ...)
 
 Have you actually built this grammar?  I don't think it avoids the
 problem, because OPERATOR is possible within a_expr.
 
 Also, don't forget the possibility of wanting a nondefault opclass.
 (I'm wondering a bit if anyone will want a WHERE clause, too, though
 adding that later shouldn't pose any big syntactic obstacles.)

I suppose I should just allow any index_elem. The only way I was able to
make the grammar for that work is by using a reserved keyword. The
possibilities that make the most sense to me are:

  index_elem WITH any_operator
  index_elem WITH OPERATOR any_operator
  index_elem CHECK any_operator
  index_elem CHECK OPERATOR any_operator

Do any of these look acceptable?

Also, I should allow for a tablespace, as well. Because it's specified
with UNIQUE as USING INDEX TABLESPACE foo, to be consistent I need to
move the USING method ahead like so:

  CONSTRAINT name EXCLUSION [USING method]
(index_elem CHECK op, ...)
[USING INDEX TABLESPACE tablespacename]
[DEFERRABLE | NOT DEFERRABLE ]
[ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]

Having the method before the attribute list makes it more consistent
with CREATE INDEX, as well.

 That's really a separate issue, but I think we need to do something to
 make it more consistent.  My first thought is that anything made
 via CONSTRAINT syntax ought to be copied by LIKE INCLUDING CONSTRAINTS,
 while LIKE INCLUDING INDEXES should copy anything you made via CREATE
 INDEX.

Works for me.

 But note this assumes that there is a clear distinction between
 the two.  The constraint-depending-on-index design that you started
 with would not permit such a rule, or at least it would mean that
 INCLUDING CONSTRAINTS EXCLUDING INDEXES would have failure cases.

Sounds reasonable. If we decide to support that kind of thing in the
future, we can handle that case somehow (an error seems reasonable to
me).

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 I suppose I should just allow any index_elem. The only way I was able to
 make the grammar for that work is by using a reserved keyword. The
 possibilities that make the most sense to me are:

   index_elem WITH any_operator
   index_elem WITH OPERATOR any_operator
   index_elem CHECK any_operator
   index_elem CHECK OPERATOR any_operator

 Do any of these look acceptable?

I'd vote for CHECK, out of that list.  WITH has no mnemonic value
whatever.

I'm not that thrilled with CHECK either, mainly because it seems like
it ought to check that the operator condition *does* hold, whereas
you're going to check that it *doesn't* hold.  But perhaps the EXCLUSION
up front will be enough to set people straight.

BTW, are you sure EXCLUSION doesn't have to become a reserved word for
this?  I notice that FOREIGN, CHECK, and UNIQUE all are, which makes me
suspicious ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]

2009-09-20 Thread Jeff Davis
On Sun, 2009-09-20 at 19:42 -0400, Tom Lane wrote:
 BTW, are you sure EXCLUSION doesn't have to become a reserved word for
 this?  I notice that FOREIGN, CHECK, and UNIQUE all are, which makes me
 suspicious ...

All of those (except FOREIGN) can be used as a column constraint as
well, and that might be necessary for a reason similar to the reason I
need to use a reserved word (i.e. they can come after a DEFAULT
expression). Is it possible that FOREIGN doesn't really have to be a
reserved word, but was just included because the others were?

I'm not an expert on the matter, but it does appear to compile and
recognize the grammar with EXCLUSION as an unreserved keyword. I'm in
the middle of changing a lot of things around, so I can't say that it
works beyond that.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Resjunk sort columns, Heikki's index-only quals patch, and bug #5000

2009-09-20 Thread Robert Haas
On Tue, Sep 15, 2009 at 7:53 AM, Robert Haas robertmh...@gmail.com wrote:
  Consider A IJ B, with
 the scan over A implemented as an index scan.  It seems to me that if
 the join selectivity is  1, then assuming there's a choice, we
 probably want to join A to B and then do the heap fetches against A
 afterwards.  But if the join selectivity is  1 (consider, for
 example, a cross join), we probably want to do the heap fetches first.

 Hmm, good point. I didn't consider that join selectivity can be  1.

 A more common scenario is that there's an additional filter condition on
 the HeapFetch, with a selectivity  1. It can then cheaper to perform
 the heap fetches first and only join the remaining rows that satisfy the
 filter condition.

 Well, again, it seems to me that it entirely depends on whether the IJ
 increases or decreases the number of rows.  You want  to do the heap
 fetches at the point where there are the fewest of them to do, and you
 can't know that a priori.  When you start talking about more common
 scenarios, what you really mean is more common in the queries I
 normally do, and that's not the same as what other people's queries
 do.  (See, for example, previous discussions on -performance, where it
 turns out that my suggested fix for a particular kind of planner
 problem is the exact opposite of Kevin Grittner's fix for a problem
 with the same code; the existing code bounds a certain value from
 below at 1 - I suggested raising it to 2, he suggested lowering it to
 0.)

I've been mulling over this problem all week.  I haven't gotten all
that far, but here are my thoughts.

Suppose we're planning some joinrel within a large join nest.  We have
two partial paths A and B with identical pathkeys.  Path A does not
involve an index scan, so we know its exact cost.  Path B involves an
index scan, so a heap fetch will have to be done at some point, but
since we can't yet know where the best place to do that heap fetch is,
so we can't know the exact cost of B.  Basically, the decision we face
here is whether to keep both plans A and B or to discard one of them
as clearly inferior to the other.  As a preliminary observation, this
is the logic that is performed by add_path(), which gets called A LOT
in planning large join nests, so the performance impact of what
happens there needs to be measured carefully.

We can, however, put some bounds on the cost of B.  We know what the
cost of B is disregarding the heap fetch (or heap fetches) that will
eventually need to be performed.  This cost is also the minimum total
cost of B, in the case where some yet-to-be-performed join is
estimated to return no rows, and thus the heap fetch is estimated to
not actually need to fetch anything.  We can also bound the maximum
cost of B: it certainly can't be any higher than the cost of doing all
the heap fetches immediately above the index fetches.  We could
probably compute a tighter upper bound by considering various possible
positions for the heap fetches at or below the level of the joinrel,
but I doubt that it's worth the effort.

Instead, what we can do is compare the low-estimate for B to the
estimate for A.  If it's higher, discard B.  Otherwise, compare the
high-estimate for B to the estimate for A.  If it's lower, discard A.
Otherwise, keep both paths.  More generally, we can conceive of each
path as having low and high estimates, and we can say that for two
paths with identical pathkeys, P dominates P' if the high-estimate for
P is less than the low-estimate for P'.

In this view of the world, we don't actually need to represent the
heap-fetches in the path trees during joinrel planning.  Instead, we
build up a set of possible paths for the whole joinrel, and then at
the end, we go back and look at any remaining paths that require heap
fetches to be inserted and figure out the best place to put them.  The
major problem I see with this approach is that it may slow down
add_path() too much, but if that turns out to be the case I don't have
another idea short of attacking the problem using some kind of
heuristics that will probably be less accurate than an analysis of the
type described above.

Thoughts?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Resjunk sort columns, Heikki's index-only quals patch, and bug #5000

2009-09-20 Thread Robert Haas
On Mon, Sep 14, 2009 at 5:41 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Heikki Linnakangas wrote:
 Tom Lane wrote:
 It strikes me that in the cases where it wouldn't be necessary to
 compute junk sort-key columns, it would be because we were scanning an
 index that includes those values.  So if the plan were set up to pull
 those values from the index and return them, then we'd not have to add
 this extra complexity to grouping_planner --- the argument that it's not
 worth it to get rid of the junk columns comes back into play.  Moreover,
 such an ability would also mean that if the user *does* ask for the
 sort column value as output (ie it's not resjunk), we can still satisfy
 the query from the index without recomputing the expensive function.

 So this is where we come to the connection to Heikki's index-only-quals
 patch.  As submitted, that code is only able to use an index column in
 a scan qual, it's not able to return it as part of the scan result.
 This example makes it clear that that definition is missing a large
 part of the potential benefit of an index value extraction capability.

 To be able to do anything along that line would require some more work
 in the executor and a *lot* more work in the planner, and I'm honestly
 not sure what the planner part of it would look like.

 I think we should separate the Heap Fetch operation from the IndexScan.

 I've been hacking on that approach. It's quite unfinished, but before I
 spend any more time on it, I'd like to get some feedback on the overall
 design.

 The attached patch can create plans where quals are checked and joins
 are performed using values from indexes only, and the heap tuples are
 fetched only for matching rows. Passes regression tests, but code is
 quite ugly at points. Cost estimation is bogus. The patch builds on the
 indexam-api-changes patch I posted earlier, which is also attached. I
 haven't yet done the changes to that patch that were discussed.

 I haven't done any performance testing. The overhead of an extra
 executor node for each index scan could slow down simple queries, we
 might need to compensate that somehow, maybe reintroduce a fastpath
 combined IndexScan+HeapFetch node.  I'm also afraid the extra work I've
 pushed to the stage where Paths are constructed could slow down planning
 quite a bit if you have a lot of indexes.


 Path nodes now carry a targetlist. That's because when you have a path like:

  HeapFetch
   - Join
     ...

 You won't have all the columns of the join rel available at the join
 node yet, because they will be fetched in the HeapFetch node above. The
 targetlist in Path nodes reflect that, and the targetlist of the final
 Plan nodes are created from the targetlists in the Path nodes instead of
 the ones in RelOptInfos.

 Per earlier discussion, I changed the way index tuple fetching works in
 B-tree, so that it can now be relied on. Matching index tuples are
 copied to backend-local memory when the scan steps on a page.

 Var nodes that refer to index columns (indexquals and the new index-only
 filters) now have a new field, varindexno, set. While we could've
 continued with the old representation, now that we have more expressions
 that refer to index vars instead of heap vars, this makes debugging easier.

I've taken a quick look through the rest of this patch and there's
obviously some hackery that needs to be cleaned up before this is
committable, but you knew that already.  I don't see a lot to object
to at a design level, but OTOH I'm not terribly familiar with the
executor, which is where most of the non-planner changes are to be
found, so I can't really offer an opinion with any certainty.

One other random thought: I notice that you copied a (completely
inscrutable, to me) comment beginning with Check if we are evaluating
PlanQual... from BitmapHeapNext to HeapFetchNext, so maybe this
hasn't escaped you either - but couldn't the heap fetches for a bitmap
index scan be postponed until higher up in the join tree just as
you're proposing to do for a regular index scan?  It even seems
conceivable that it could be right to do a regular index scan (on,
say, the inner side of a nested loop), and then accumulate all the
results and do a single bitmap heap scan on the result.  This might be
too marginal a case to worry about... especially for version one...
just throwing it out there.

I wonder if HeapFetchNext should be called just HeapNext for parity
with BitmapHeapNext, and the Heap Fetch node called a Heap Scan for
parity with Bitmap Heap Scan.

Since you previously stated that you were going to put this patch
aside to work on HS and SR[1], I'm going to move this to Returned with
Feedback for now.  Hope that's OK, and that the feedback is sufficient
and useful.

...Robert

[1] http://archives.postgresql.org/message-id/4ab1dd0b.1080...@enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your 

Re: [HACKERS] generic copy options

2009-09-20 Thread Emmanuel Cecchet
The easiest for both implementation and documentation might just be to 
have a matrix of options.
Each option has a row and a column in the matrix. The intersection of a 
row and a column is set to 0 if options are not compatible and set to 1 
if it is. This way we are sure to capture all possible combinations.
This way, each time we find a new option, we just have to check in the 
matrix if it is compatible with the already existing options. Note that 
we can also replace the 0 with an index in an error message array.


I can provide an implementation of that if this looks interesting to anyone.
Emmanuel

Robert Haas wrote:

On Sun, Sep 20, 2009 at 2:25 PM, Emmanuel Cecchet m...@asterdata.com wrote:
  

Tom Lane wrote:


Emmanuel Cecchet m...@asterdata.com writes:
  

Here you will force every format to use the same set of options


How does this force any such thing?

  

As far as I understand it, every format will have to handle every format
options that may exist so that they can either implement it or throw an
error.



I don't think this is really true.  To be honest with you, I think
it's exactly backwards.  The way the option-parsing logic works, we
parse each option individually FIRST.  Then at the end we do
cross-checks to see whether there is an incompatibility in the
combination specified.  So if two different formats support the same
option, we just change the cross-check to say that foo is OK with
either format bar or format baz.  On the other hand, if we split the
option into bar_foo and baz_foo, then the first loop that does the
initial parsing has to support both cases, and then you still need a
separate cross-check for each one.

  

That would argue in favor of a format option that defines the format. Right
now I find it bogus to have to say (csv on, csv_header on). If csv_header is
on that should imply csv on.
The only problem I have is that it is not obvious what options are generic
COPY options and what are options of an option (like format options).
So maybe a tradeoff is to differentiate format specific options like in:
(delimiter '.', format csv, format_header, format_escape...)
This should also make clear if someone develops a new format what options
need to be addressed.



I think this is a false dichotomy.  It isn't necessarily the case that
every format will support a delimiter option either.  For example, if
we were to add an XML or JSON format (which I'm not at all convinced
is a good idea, but I'm sure someone is going to propose it!) it
certainly won't support specifying an arbitrary delimiter.

IOW, *every* format will have different needs and we can't necessarily
know which options will be applicable to those needs.  But as long as
we agree that we won't use the same option for two different
format-specific options with wildly different semantics, I don't think
that undecorated names are going to cause us much trouble.  It's also
less typing.

  

PS: I don't know why but as I write this message I already feel that Tom
hates this new proposal :-D



I get those feeling sometimes myself.  :-)  Anyway, FWIW, I think Tom
has analyzed this one correctly...

...Robert

  



--
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development  Consulting

--
Web: http://www.frogthinker.org
email: m...@frogthinker.org
Skype: emmanuel_cecchet


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Linux LSB init script

2009-09-20 Thread Robert Haas
On Thu, Sep 17, 2009 at 4:52 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 17, 2009 at 4:18 PM, Peter Eisentraut pete...@gmx.net wrote:
 On tor, 2009-09-17 at 11:59 -0500, Kevin Grittner wrote:
  Well, in such cases it may be useful to add an option such as
  --oknodo to select the idempotent behavior.

 I found that confusing (as did Robert); how about --lsm-conforming?

 s/lsm/lsb/

 I'm not so sure that I would label it as LSB, because that is too broad,
 and not very descriptive.

 I think this option should only control whether start and stop are
 idempotent.  Other LSB issues such as exit codes ought to become the
 default, or possibly a different option if necessary.

 Maybe we should just call it --idempotent.

 Or, they could be additional actions, like ensure-start/ensure-stop.

It seems like there is some support for what this patch is trying to
do, but much disagreement about the details of how to get there.
Where do we go from here?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO item: Allow more complex user/database default GUC settings

2009-09-20 Thread Robert Haas
On Fri, Sep 18, 2009 at 4:03 PM, Bernd Helmle maili...@oopsware.de wrote:


 --On 25. August 2009 22:17:38 -0400 Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 I'm just posting in case somebody has thoughts on the UI part of it.

 Other things that need fixed:

 - need to figure out locking for roles; this stuff must be synchronized
  with role drop
 - pg_shadow and pg_roles need a join to obtain settings
 - two regression tests need their expected file updated
 - catalog version bump

 Here's a first shot on this for my current review round. Patch needed to
 re-merged into current CVS HEAD due to some merge conflicts, i've also
 adjusted the regression tests (minor). On a first look, i like the patch
 (especially the code for the utility commands accessing the settings is
 better modularized now, which looks much nicer).

So is this ready to commit, or what?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic copy options

2009-09-20 Thread Marcos Luis Ortiz Valmaseda
I think that it is a good idea, but do you can show to us what do you have in 
mind with a example?

Regards

The hurry is enemy of the success: for that reason...Be patient

Ing. Marcos L. Ortiz Valmaseda
Línea Soporte y Despliegue
Centro de Tecnologías de Almacenamiento y Análisis de Datos (CENTALAD)

Linux User # 418229
PostgreSQL User
http://www.postgresql.org
http://www.planetpostgresql.org/
http://www.postgresql-es.org/


- Mensaje original -
De: Emmanuel Cecchet m...@frogthinker.org
Para: Robert Haas robertmh...@gmail.com
CC: Emmanuel Cecchet m...@asterdata.com, Tom Lane t...@sss.pgh.pa.us, 
Emmanuel Cecchet emmanuel.cecc...@asterdata.com, Josh Berkus 
j...@agliodbs.com, PostgreSQL-development pgsql-hackers@postgresql.org
Enviados: Domingo, 20 de Septiembre 2009 16:24:28 GMT -10:00 Hawai
Asunto: Re: [HACKERS] generic copy options

The easiest for both implementation and documentation might just be to 
have a matrix of options.
Each option has a row and a column in the matrix. The intersection of a 
row and a column is set to 0 if options are not compatible and set to 1 
if it is. This way we are sure to capture all possible combinations.
This way, each time we find a new option, we just have to check in the 
matrix if it is compatible with the already existing options. Note that 
we can also replace the 0 with an index in an error message array.

I can provide an implementation of that if this looks interesting to anyone.
Emmanuel

Robert Haas wrote:
 On Sun, Sep 20, 2009 at 2:25 PM, Emmanuel Cecchet m...@asterdata.com wrote:
   
 Tom Lane wrote:
 
 Emmanuel Cecchet m...@asterdata.com writes:
   
 Here you will force every format to use the same set of options
 
 How does this force any such thing?

   
 As far as I understand it, every format will have to handle every format
 options that may exist so that they can either implement it or throw an
 error.
 

 I don't think this is really true.  To be honest with you, I think
 it's exactly backwards.  The way the option-parsing logic works, we
 parse each option individually FIRST.  Then at the end we do
 cross-checks to see whether there is an incompatibility in the
 combination specified.  So if two different formats support the same
 option, we just change the cross-check to say that foo is OK with
 either format bar or format baz.  On the other hand, if we split the
 option into bar_foo and baz_foo, then the first loop that does the
 initial parsing has to support both cases, and then you still need a
 separate cross-check for each one.

   
 That would argue in favor of a format option that defines the format. Right
 now I find it bogus to have to say (csv on, csv_header on). If csv_header is
 on that should imply csv on.
 The only problem I have is that it is not obvious what options are generic
 COPY options and what are options of an option (like format options).
 So maybe a tradeoff is to differentiate format specific options like in:
 (delimiter '.', format csv, format_header, format_escape...)
 This should also make clear if someone develops a new format what options
 need to be addressed.
 

 I think this is a false dichotomy.  It isn't necessarily the case that
 every format will support a delimiter option either.  For example, if
 we were to add an XML or JSON format (which I'm not at all convinced
 is a good idea, but I'm sure someone is going to propose it!) it
 certainly won't support specifying an arbitrary delimiter.

 IOW, *every* format will have different needs and we can't necessarily
 know which options will be applicable to those needs.  But as long as
 we agree that we won't use the same option for two different
 format-specific options with wildly different semantics, I don't think
 that undecorated names are going to cause us much trouble.  It's also
 less typing.

   
 PS: I don't know why but as I write this message I already feel that Tom
 hates this new proposal :-D
 

 I get those feeling sometimes myself.  :-)  Anyway, FWIW, I think Tom
 has analyzed this one correctly...

 ...Robert

   


-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development  Consulting
--
Web: http://www.frogthinker.org
email: m...@frogthinker.org
Skype: emmanuel_cecchet


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Encoding issues in console and eventlog on win32

2009-09-20 Thread Josh Williams
On Tue, 2009-09-15 at 12:49 +0900, Itagaki Takahiro wrote:
 Here is an updated version of the patch.

This is a review of the Eventlog encoding on Windows patch:
http://archives.postgresql.org/message-id/20090915123243.9c59.52131...@oss.ntt.co.jp

Purpose  Format

This patch is designed to coerce log messages to a specific encoding.
It's currently only targeted at the win32 port, where the logs are
written in UTF-16.

The patch applies cleanly.  It doesn't include any documentation updates
or additional regression tests.  A comment in the documentation that
logs on Windows will go through an encoding conversion if appropriate
might be nice, though.

Initial Run
===
To (hopefully) properly test I initdb'd a couple directories under
different locales.  I then ran a few statements designed to generate
event log messages showing characters in a different encoding:
SELECT E'\xF0'::int;

The unpatched backend generated event log message showing only the byte
value interpreted as the same character each time in the system default
encoding.

With the patch in place the event log message showed the character
correctly for each of the different encodings.

I haven't tried any performance testing against it.

Concurrent Development Issues
=
On a hunch, tried applying the syslogger infrastructure changes at the
same time.  They conflict on elog.c.  Not sure if we're supposed to
check for that, but thought I'd point it out. :)

Editorial
=
The problem seems to stem from PG and Windows each having a few
encodings the other won't understand, or at least don't immediately
support.  So log messages back to the system from its perspective
contain incorrect or broken characters.  I'm not sure this is as much of
a problem on other platforms, though, where the database encoding
typically doesn't have any trouble matching the system's; would it be
worth pursuing beyond the win32 port?

I'm not too familiar with alternate character sets...  I would assume if
there's a code page supported on win32 it'll naturally support
conversion to UTF-16 on the platform, but is there any time this could
fail?  What about the few encodings that it doesn't directly support,
which need a conversion to UTF-8 first?

Maybe someone with more familiarity with encoding conversion issues
could comment on that?  Otherwise I think this is ready to be bumped up
for committer review.

- Josh Williams



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1)

2009-09-20 Thread Steve Prentice

Thank you!

-Steve

On Sep 19, 2009, at 6:55 PM, Tom Lane wrote:

Steve Prentice prent...@cisco.com writes:

This patch changes plpgsql IN parameters so they are mutable.


I've applied this, since the consensus seemed to be in favor of it.


I decided not to update the docs for this change because the docs
don't currently indicate that an IN parameter is constant and I didn't
want to encourage it because it isn't universally considered good
programming practice to assign to an IN parameter. If others think we
need a doc change for this, I'll update the patch.


I agree, no need to say anything one way or the other in the plpgsql  
docs.

We'll want to mention it in the release notes of course.

regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO item: Allow more complex user/database default GUC settings

2009-09-20 Thread Alvaro Herrera
Robert Haas escribió:

  Here's a first shot on this for my current review round. Patch needed to
  re-merged into current CVS HEAD due to some merge conflicts, i've also
  adjusted the regression tests (minor). On a first look, i like the patch
  (especially the code for the utility commands accessing the settings is
  better modularized now, which looks much nicer).
 
 So is this ready to commit, or what?

Not really :-(  It needs some minor tweaks to qualify as a cleanup
patch, and further extra coding for there to be an actual new feature.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers