Re: [HACKERS] [PATCH] DefaultACLs

2009-07-26 Thread Petr Jelinek

Joshua Tolley wrote:

On Sat, Jul 25, 2009 at 08:41:12PM -0400, Robert Haas wrote:
  

On Sat, Jul 25, 2009 at 8:39 PM, Joshua Tolley <eggyk...@gmail.com> wrote:


Immediately after concluding I was done with my review, I realized I'd
completely forgotten to look at the docs. I've made a few changes based solely
on my opinions of what sounds better and what's more consistent with the
existing documentation. Do with them as you see fit. :)
  

Applied with minor adjustments, attached updated patch.

--
Regards
Petr Jelinek (PJMODOS)



defaultacls-2009-07-26.diff.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread Sam Mason
On Sun, Jul 26, 2009 at 01:42:32PM +0900, KaiGai Kohei wrote:
 Robert Haas wrote:
 Sam Mason wrote:
 The traditional approach would be to maintain multiple physically
 separate databases; in this setup it's obvious that when you perform a
 backup of one of these databases you're only seeing a subset of all of
 the objects.  Isn't SE-PG just allowing you to do this within a single
 PG database?
 
 Partly.  There's also a concept called read down, which is
 important.  It allows you to have, say, secret and classified data in
 the same database, and let the secret users see both types but the
 classified users see only the classified stuff, not the secret stuff.
 If you want to store intelligence data about the war in Iraq and
 intelligence data about the war in Afghanistan, it might not be too
 bad to store them in separate databases, though storing them in the
 same database might also make things simpler for users who have access
 to both sets of data.  But if you have higher and lower
 classifications of data it's pretty handy (AIUI) to be able to let the
 higher-secrecy users read the lower-secrecy data - if you used
 separate databases to simulate read-down, you'd have to replicate data
 between them, and also have some manual mechanism for tracking which
 level of secrecy applied to which data.
 
 It seems a correct description.
 
 In addition, we also need to prevent higher-secrecy users from writing
 anything to lower-secrecy objects, to prevent information leaks.

OK, so to bulk out this physical analogy we'd have two physical servers:
one that stores higher-secrecy stuff and one for lower-secrecy
stuff.  Users with higher clearance are able to read/write the higher
secrecy database but only read the lower secrecy database.  Users with
lower clearance can only read/write the lower secrecy database, ideally
they aren't even aware of the existence of the higher secrecy one.
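
The access rules in this two-server analogy can be written down as a minimal sketch. It assumes just two hypothetical levels and is not how SE-PG or SELinux actually represents labels; the names Level, can_read and can_write are made up for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical two-level ordering, for illustration only.
    LOWER = 1
    HIGHER = 2


def can_read(subject: Level, obj: Level) -> bool:
    # Read-down: a subject may read objects at or below its own clearance.
    return subject >= obj


def can_write(subject: Level, obj: Level) -> bool:
    # As in the analogy, writes are allowed only at the subject's own level,
    # so a higher-clearance user cannot leak data into the lower database.
    return subject == obj
```

With these rules a HIGHER user can read both databases but write only the higher one, and a LOWER user can neither read nor write the higher one.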

 In some cases, the clearance of information may be changed. We often
 have some more complex requirements also.

OK, so there is some other trusted entity that has unfettered access to
both databases and its job is to manage these requirements.

 Thus, a capability to store and manage data objects with different
 security labels in a single database instance is necessary here.
 (If we don't want to use commercial solutions instead.)

SE-PG is about doing the above in one database and allowing more
rigorous checks to be done?

-- 
  Sam  http://samason.me.uk/



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread Sam Mason
On Sun, Jul 26, 2009 at 12:27:12PM +0900, KaiGai Kohei wrote:
 Indeed, the draft used the term "security context" with minimal
 introduction, which is not friendly enough for database folks.
 
 The purpose of a security context is to identify any subject or object
 so that the security policy can describe it. Because the security
 policy is common to the operating system, databases, X Window and others,
 every managed database object needs its own security context.
 
 Anyway, I need to introduce them in the security model section.

I'm coming to the conclusion that you really need to link to external
material here; there must be good (and canonical) definitions of these
things outside and because SE-PG isn't self contained I really think you
need to link to them.

This will be somewhat of a break from normal PG documentation because
so far everything has been self contained, it's chosen its own
interpretation of the SQL standard and it needs to document that.  SE-PG
will be interacting with much more code from outside and showing which
parts of these are PG specific vs. which parts are common to all SELinux
seems important.

If you try to document *everything* you're going to be writing for years
and give the impression that everything is implemented in SE-PG.  A
dividing line needs to be drawn between what is PG specific and what is
SELinux (why not SEL?).

 For the security policy, I introduce it at the security model section:
 
 | Access control is conceptually to decide a set of allowed (or denied)
 | actions between a certain subject (such as a database client) and an
 | object (such as a table), and to apply the decision on user's requests.
 | At the database privilege system, ACL stored in database objects itself
 | holds a list of allowed actions to certain database roles, and it is
 | applied on the user's request.
 | SELinux also holds massive sets of allowed actions between a certain
 | subject and a certain object, we call them security policy.
 
 Is it obscure?

I find that somewhat opaque as well, sorry!

 At this point, the SELinux user's guide in Fedora is the most comprehensive
 documentation. It is described from the viewpoint of SELinux users, not
 experts or developers.
 
   http://docs.fedoraproject.org/selinux-user-guide/

OK, thanks!

-- 
  Sam  http://samason.me.uk/



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread Andrew Dunstan



KaiGai Kohei wrote:


SELinux provides a certain process privilege to make backups and
restore them. In the (current) default policy, it is called unconfined.

However, it is also *possible* to define a new special process privilege
for backup and restore tools. For example, it can access all the database
objects and can make backups, but any other process cannot touch the
backup files. It means that a DBA can launch a backup tool and it creates
a black-boxed file, then he can also launch a restore tool to restore
the black-boxed backup, but he cannot see the contents of the backup.
(The idea might be similar to the sudo mechanism.)




Really? How do you enforce this black box rule for a backup made across the 
network? From the server's POV there is no such thing as a backup. All 
it sees is a set of SQL statements all of which it might see in some 
other context.


cheers

andrew



Re: [HACKERS] Patch for 8.5, transformationHook

2009-07-26 Thread Pavel Stehule
Hello

The new patch adds a new contrib, transformations, with three modules:
anotation, decode and json.

These modules are ported from my older work.

Before applying this patch, please apply the named-fixed patch too. The hook
doesn't need it, but the anotation and json modules depend on it.

Regards
Pavel Stehule

2009/7/26 Robert Haas robertmh...@gmail.com:
 On Sat, Jul 25, 2009 at 11:38 PM, Pavel Stehule <pavel.steh...@gmail.com>
 wrote:
 Hello

 2009/7/25 Robert Haas robertmh...@gmail.com:
 On Mon, Apr 20, 2009 at 8:45 AM, Pavel Stehule <pavel.steh...@gmail.com>
 wrote:
 2009/4/18 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2009/4/11 Tom Lane t...@sss.pgh.pa.us:
 No, I was complaining that a hook right there is useless and expensive.
 transformExpr() is executed multiple times per query, potentially a very
 large number of times per query; so even testing to see if a hook exists
 is not a negligible cost.

 I did some tests based on pgbench.

 The queries done by pgbench are completely trivial and do not stress
 parser performance.  Even if they did (consider cases like an IN with a
 few thousand list items), the parser is normally not a bottleneck
 compared to transaction overhead, network round trips, and pgbench
 itself.

 I thought about a different position for the hook, but only in this place is
 the hook useful (because expressions are recursive).

 As I keep saying, a hook there is useless, at least by itself.  You
 have no control over the grammar and no ability to modify what the
 rest of the system understands.  The only application I can think of is
 to fool with the transformation of FuncCall nodes, which you could do in
 a much lower-overhead way by hooking into transformFuncCall.  Even that
 seems pretty darn marginal for real-world problems.


 I am sending a modified patch; it hooks the parser via transformFuncCall

 I am reviewing this patch.  It seems to me upon rereading the thread
 that the objections Tom and Peter had to inserting a hook into
 transformExpr() mostly still apply to a hook in transformFuncCall():
 namely, that there's no proof that putting a hook here is actually
 useful.  I think we should apply the same criteria to this that we
 have to some other patches that have been rejected (like the
 extensible-rmgr patch Simon submitted for CommitFest 2008-11), namely,
 requiring that the extension mechanism be submitted together with at
 least two examples of how it can be used to do interesting and useful
 things, bundled as one or more contrib modules.

 I plan to add JSON support to contrib, similar to Bauman's design:

 http://www.mysqludf.org/lib_mysqludf_json/index.php

 It will be a sample of smart functions. Because this needs more rather
 than less work, I am waiting for the commit.

 Another simple introductory contrib module should be a real Oracle decode
 function - I sent the source code some time ago. But this code needs some
 modification. I can send this code if you need it.

 Sure, post it and let's discuss.

 ...Robert



thook.diff.gz
Description: GNU Zip compressed data


named-fixed.diff.gz
Description: GNU Zip compressed data



Re: [HACKERS] Patch for 8.5, transformationHook

2009-07-26 Thread Pavel Stehule
Hello

note about SQL:201x
http://blogs.mysql.com/peterg/2009/06/07/soothsaying-sql-standardization-stuff/

regards
Pavel Stehule



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread KaiGai Kohei

Sam Mason wrote:

On Sun, Jul 26, 2009 at 12:27:12PM +0900, KaiGai Kohei wrote:

Indeed, the draft used the term "security context" with minimal
introduction, which is not friendly enough for database folks.

The purpose of a security context is to identify any subject or object
so that the security policy can describe it. Because the security
policy is common to the operating system, databases, X Window and others,
every managed database object needs its own security context.

Anyway, I need to introduce them in the security model section.


I'm coming to the conclusion that you really need to link to external
material here; there must be good (and canonical) definitions of these
things outside and because SE-PG isn't self contained I really think you
need to link to them.

This will be somewhat of a break from normal PG documentation because
so far everything has been self contained, it's chosen its own
interpretation of the SQL standard and it needs to document that.  SE-PG
will be interacting with much more code from outside and showing which
parts of these are PG specific vs. which parts are common to all SELinux
seems important.

If you try to document *everything* you're going to be writing for years
and give the impression that everything is implemented in SE-PG.  A
dividing line needs to be drawn between what is PG specific and what is
SELinux (why not SEL?).


That also seems to me a reasonable suggestion.

However, a reasonable amount of description (to be adjusted through
discussion) should be self-contained.
For example, saying only that a security context is a short formatted string
is not enough to explain why it is necessary and what its purpose is.

As Robert suggested, a few examples and definitions of technical terms
will help database folks to understand what it is, even if the self-contained
explanation is not comprehensive from the viewpoint of security folks.

Thanks,
--
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread KaiGai Kohei

Andrew Dunstan wrote:



KaiGai Kohei wrote:


SELinux provides a certain process privilege to make backups and
restore them. In the (current) default policy, it is called unconfined.

However, it is also *possible* to define a new special process privilege
for backup and restore tools. For example, it can access all the database
objects and can make backups, but any other process cannot touch the
backup files. It means that a DBA can launch a backup tool and it creates
a black-boxed file, then he can also launch a restore tool to restore
the black-boxed backup, but he cannot see the contents of the backup.
(The idea might be similar to the sudo mechanism.)




Really? How do you enforce this black box rule for a backup made across the 
network? From the server's POV there is no such thing as a backup. All 
it sees is a set of SQL statements all of which it might see in some 
other context.


Recent SELinux provides a feature to exchange the security context of
the peer process over a network connection.
It allows restricting a given process to send/receive packets to/from
only certain other processes, even if they communicate over a remote
connection.

This feature is named Labeled IPsec. The key exchange daemon (racoon)
was enhanced to also exchange the security context of peer processes,
prior to the actual communications.

Thanks,
--
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread Andrew Dunstan



KaiGai Kohei wrote:

Andrew Dunstan wrote:



KaiGai Kohei wrote:


SELinux provides a certain process privilege to make backups and
restore them. In the (current) default policy, it is called unconfined.

However, it is also *possible* to define a new special process privilege
for backup and restore tools. For example, it can access all the database
objects and can make backups, but any other process cannot touch the
backup files. It means that a DBA can launch a backup tool and it creates
a black-boxed file, then he can also launch a restore tool to restore
the black-boxed backup, but he cannot see the contents of the backup.
(The idea might be similar to the sudo mechanism.)




Really? How do you enforce this black box rule for a backup made across 
the network? From the server's POV there is no such thing as a 
backup. All it sees is a set of SQL statements all of which it might 
see in some other context.


Recent SELinux provides a feature to exchange the security context of
the peer process over a network connection.
It allows restricting a given process to send/receive packets to/from
only certain other processes, even if they communicate over a remote
connection.

This feature is named Labeled IPsec. The key exchange daemon (racoon)
was enhanced to also exchange the security context of peer processes,
prior to the actual communications.




Interesting, I can see this having some use in quite a number of areas. 
Of course, in the end, it still comes down to this issue, which is as 
old as Plato: Quis custodiet ipsos custodes? (see 
http://en.wikipedia.org/wiki/Quis_custodiet_ipsos_custodes%3F )


cheers

andrew



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Jul 25, 2009 at 6:40 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 I'm not nearly as excited about migrating all or even most of, say,
 the pg_proc DATA lines into SQL.

 I think it would actually buy you quite a bit to migrate them to SQL,
 because in SQL, default properties can generally be omitted, which
 means that a patch which adds a new property to pg_proc that takes the
 same value for every row doesn't actually need to touch the SQL at
 all.

[ shrug... ]  If you think default values would buy something in
maintainability, we could revise the BKI notation to support them,
with a lot less work and risk than what you're proposing.  Perhaps
something like

DATA_DEFAULTS( pronamespace=PGNSP proowner=PGUID prolang=12 ... );

DATA( oid=1242 proname=boolin pronargs=2 ... );
DATA( oid=1243 proname=boolout pronargs=2 ... );

with the convention that any field not specified in either the
DATA macro or the current defaults would go to NULL, except OID
which would retain its current special treatment.  (Hmm, I wonder
if we'd even still need the _null_ hack anymore?)
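
A minimal sketch of the proposed convention, written in Python purely for illustration: each row takes its explicit DATA value if given, else the current default, else NULL. The function name expand_row and the column subset are hypothetical, the trailing "..." in the DATA lines above is not filled in, and the special treatment of OID is not modeled.

```python
def expand_row(columns, defaults, data):
    """Build one catalog row: DATA value, else current default, else NULL (None)."""
    unknown = (set(defaults) | set(data)) - set(columns)
    if unknown:
        raise KeyError(f"unknown column(s): {sorted(unknown)}")
    return {col: data.get(col, defaults.get(col)) for col in columns}


# Hypothetical subset of pg_proc columns; "prosrc" is listed only to show
# the fall-through to NULL when neither DATA nor the defaults mention it.
columns = ["oid", "proname", "pronamespace", "proowner", "prolang",
           "pronargs", "prosrc"]
defaults = {"pronamespace": "PGNSP", "proowner": "PGUID", "prolang": 12}

row = expand_row(columns, defaults,
                 {"oid": 1242, "proname": "boolin", "pronargs": 2})
```

Under this model, adding a new column with one value for every row would mean touching only the defaults, not each DATA entry.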

I remain unexcited about inventing contraptions that solve limited
special cases.  It's just not that hard to maintain those cases
the way we're doing now, and every added processing step introduces
its own comprehension and maintenance overheads.

regards, tom lane



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Andrew Dunstan



Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
  

On Sat, Jul 25, 2009 at 6:40 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:


I'm not nearly as excited about migrating all or even most of, say,
the pg_proc DATA lines into SQL.
  


  

I think it would actually buy you quite a bit to migrate them to SQL,
because in SQL, default properties can generally be omitted, which
means that a patch which adds a new property to pg_proc that takes the
same value for every row doesn't actually need to touch the SQL at
all.



[ shrug... ]  If you think default values would buy something in
maintainability, we could revise the BKI notation to support them,
with a lot less work and risk than what you're proposing.  Perhaps
something like

DATA_DEFAULTS( pronamespace=PGNSP proowner=PGUID prolang=12 ... );

DATA( oid=1242 proname=boolin pronargs=2 ... );
DATA( oid=1243 proname=boolout pronargs=2 ... );

with the convention that any field not specified in either the
DATA macro or the current defaults would go to NULL, except OID
which would retain its current special treatment.  (Hmm, I wonder
if we'd even still need the _null_ hack anymore?)
  


I kinda like this. It will make it easier not only to make catalog 
changes but to add entries to things like pg_proc (which is surely the 
biggest piece of the headache).



I remain unexcited about inventing contraptions that solve limited
special cases.  It's just not that hard to maintain those cases
the way we're doing now, and every added processing step introduces
its own comprehension and maintenance overheads.


  

Agreed.

cheers

andrew



Re: [HACKERS] CommitFest Status Summary - 2009-07-25

2009-07-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ... One thing I have belatedly realized about this
 CommitFest is that we (or at least, I) did not think about asking the
 committers about their schedules, and it turns out that three of them
 - Heikki, Michael Meskes, Joe Conway - are away at the moment.  About
 25% of the remaining patches are waiting for one of those three people
 to take the next step (as either patch author, or reviewer, or
 committer).

Well, any commitfest is going to have some issues of that sort,
especially one scheduled during the summer.  If we get to the point
where those patches are the only ones left, and the relevant people
still aren't back, I think we can just push them all to the next fest.
But I doubt we are moving fast enough to make that happen.

 Specific Committers (13)
 - generic explain options v3 (needs further review by Tom Lane)

Actually I was waiting for the other EXPLAIN patch to come ready
before looking at this, because I thought they were intertwined.
Do you want this committed before that?

regards, tom lane



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Tom Lane
I wrote:
 ... So maybe we could split the current bootstrap phase
 into three phases:
   * create core catalogs and load DATA commands, using bki
   * create operator classes, using sql script
   * create indexes, using bki
   * proceed on as before

I experimented with that a little bit and found it doesn't seem to be
tremendously easy.  A non-bootstrap-mode backend will PANIC immediately
on startup if it doesn't find the critical system indexes, so the second
step has issues.  Also, there is no provision for resuming bootstrap
mode in an already-existing database, so the third step doesn't work
either.  We could hack up solutions to those things, but it's not clear
that it's worth it.  What seems more profitable is just to allow CREATE
OPERATOR CLASS/FAMILY to be executed while still in bootstrap mode.
There will still be some obstacles to be surmounted, no doubt (in
particular persuading these commands to run without system indexes
present) ... but we'd have to surmount those anyway.

In the spirit of not inventing single-purpose solutions, I suggest
that the BKI representation for this might be something like

BKI_EXECUTE( any old SQL command );

where the bootstrap.c code just passes the given string to the main SQL
parser, and whether it works or not is dependent on whether the
specified command has been bootstrap-mode-proofed.  For the moment we'd
only bother to fix CREATE OPERATOR CLASS/FAMILY to work that way, but
the door would be open for other things if it seemed worthwhile.
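
As a rough sketch of that dispatch idea (in Python, for illustration only, not bootstrap.c): the string is handed to a parser callback, and only commands known to be bootstrap-mode-proofed are let through. The explicit allowlist, the names BOOTSTRAP_SAFE and bki_execute, and the run_sql callback are all assumptions; in the proposal above an unproofed command would simply fail on its own.

```python
# Commands assumed to have been made safe for bootstrap mode (hypothetical list).
BOOTSTRAP_SAFE = ("CREATE OPERATOR CLASS", "CREATE OPERATOR FAMILY")


def bki_execute(sql, run_sql):
    """Hand sql to run_sql if it begins with a bootstrap-proofed command."""
    normalized = " ".join(sql.split()).upper()
    for cmd in BOOTSTRAP_SAFE:
        if normalized == cmd or normalized.startswith(cmd + " "):
            # Stand-in for passing the string to the main SQL parser.
            return run_sql(sql)
    raise RuntimeError(f"not bootstrap-mode-proofed: {sql!r}")
```

Opening the door to another command would then amount to proofing it and extending the list.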

regards, tom lane



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread Ron Mayer
Robert Haas wrote:
 If you want to store intelligence data about the war in Iraq and
 intelligence data about the war in Afghanistan, it might not be too
 bad to store them in separate databases, though storing them in the
 same database might also make things simpler for users who have access
 to both sets of data.  But if you have higher and lower
 classifications of data it's pretty handy (AIUI) to be able to let the
 higher-secrecy users read the lower-secrecy data 

Nice example.

Is this system being designed flexibly enough so that one user may
have access to the higher-secrecy data of the Iraq dataset but only
the lower-secrecy Afghanistan dataset; while a different user may have
access to the higher-secrecy Afghanistan data but only the lower-secrecy
Iraq data?

I imagine it's not uncommon for organizations to want to have total
access to their data, but expose more limited access to other
organizations they communicate with.




Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Greg Stark
On Sun, Jul 26, 2009 at 5:48 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

 In the spirit of not inventing single-purpose solutions, I suggest
 that the BKI representation for this might be something like

 BKI_EXECUTE( any old SQL command );

 where the bootstrap.c code just passes the given string to the main SQL
 parser, and whether it works or not is dependent on whether the
 specified command has been bootstrap-mode-proofed.  For the moment we'd
 only bother to fix CREATE OPERATOR CLASS/FAMILY to work that way, but
 the door would be open for other things if it seemed worthwhile.


I have nothing against a BKI_EXECUTE() like you propose.

But my instinct is still to go the other way: determining which
parts are actually necessary for bootstrapping and which parts really
aren't. I think it's valuable to have those two classes separated so
we understand when we're introducing new dependencies and when we're
varying from the well-trodden standard approaches.

It would also be valuable if we ever want to move some of these things
out to contrib modules or move other modules into the core. We might
even envision having optional components which the user could
decide at initdb time whether to include.

AFAICT the only opclasses that need to be in the bootstrap set are
int2_ops, int4_ops, name_ops, oid_ops, and oidvector_ops.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] Review: support for multiplexing SIGUSR1

2009-07-26 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 I updated the patch to solve the two problems which you pointed out.

 Here are the changes:

 * Prevented an obsolete flag from being set on a new process, by using
a newly introduced spinlock.

 * Used the index of AuxiliaryProcs instead of auxType, to assign
backend ID to an auxiliary process.

Neither of these changes seems like a good idea to me.  The use of a
spinlock renders the mechanism unsafe for use from the postmaster ---
we do not wish the postmaster to risk getting stuck if the contents of
shared memory have become corrupted, eg, there is a spinlock that looks
taken.  And you've completely mangled the concept of BackendId.
MyBackendId is by definition the same as the index of the process's
entry in the sinval ProcState array.  This means that (1) storing it in
the ProcState entry is silly, and (2) assigning values that don't
correspond to an actual ProcState entry is dangerous.

If we want to be able to signal processes that don't have a ProcState
entry, it would be better to assign an independent index instead of
overloading BackendId like this.  I'm not sure though whether we care
about being able to signal such processes.  It's certainly not necessary
for catchup interrupts, but it might be for future applications of the
mechanism.  Do we have a clear idea of what the future applications are?

As for the locking issue, I'm inclined to just specify that uses of the
mechanism must be such that receiving a signal that wasn't meant for you
isn't fatal.  In the case of catchup interrupts the worst that can
happen is you do a little bit of useless work.  Are there any intended
uses where it would be seriously bad to get a misdirected signal?
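
To make the "a misdirected signal must not be fatal" rule concrete, here is a minimal single-process sketch in Python. It uses no real signals, shared memory or locking; the sender sets a per-process reason flag and then "signals", and the handler acts only on flags it finds set. The names Reason, ProcSlot and send, and the second reason NOTIFY, are hypothetical.

```python
from enum import IntFlag


class Reason(IntFlag):
    CATCHUP = 1
    NOTIFY = 2  # hypothetical second use of the multiplexed signal


class ProcSlot:
    """Stand-in for one process's entry holding its pending-reason flags."""

    def __init__(self):
        self.pending = Reason(0)
        self.log = []

    def handle_sigusr1(self):
        # Take and clear whatever is pending. If nothing is pending the
        # signal was not meant for us, and we just do nothing.
        reasons, self.pending = self.pending, Reason(0)
        if Reason.CATCHUP in reasons:
            self.log.append("catchup")
        if Reason.NOTIFY in reasons:
            self.log.append("notify")


def send(slot, reason):
    slot.pending |= reason  # set the reason flag first
    slot.handle_sigusr1()   # stands in for kill(pid, SIGUSR1)
```

In this model a stray signal costs one flag check, which matches the "little bit of useless work" case; a use where a stray wakeup did harm would need something stronger.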

I agree with Jaime that the patch would be more credible if it covered
more than one signal usage at the start --- these questions make it
clear that the design can't happen in a vacuum without intended usages
in mind.

regards, tom lane



Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Kevin Grittner
Sam Mason s...@samason.me.uk wrote: 
 
 I've not read much of his writings, any canonical references for
 this sort of discussion?
 
I think this is the one, although it's been a while since I read it,
and I might be getting it confused with something else he wrote:
 
Codd, E.F. (1990). The Relational Model for Database Management
(Version 2 ed.). Addison Wesley Publishing Company.
ISBN 0-201-14192-2.
 
I believe that he puts forward a list of about 200 things he feels
should be true of a database in order for him to consider it a
relational database.  Since he was first and foremost a mathematician,
and was something of a perfectionist, I don't think some of these are
achievable (at least in the foreseeable future) without tanking
performance, but it makes for an interesting read.  I find most of it
to be on target, and it gives a unique chance to see things from the
perspective of the inventor of the relational model for database
management.
 
I don't, of course, agree with him on everything.  If you think that
the SQL standard date handling is weird, wait until you see how a
perfectionist mathematician attempts to deal with it.  :-)  Also, I find
the requirement that, to be considered a relational database, it must be
impossible to write two queries which can be shown to be logically
equivalent but which optimize to different access plans to be, well, a
bit ivory tower.
 
It appears that the "no duplicate rows in a relation" rule is to
Codd's relational theory what the speed of light is to relativity.  I
think it is basically a corollary to the rule that each datum must be
addressable by specifying its table name, column name, and some set of
key values which uniquely identify the row.
 
-Kevin



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 AFAICT the only opclasses that need to be in the bootstrap set are
 int2_ops, int4_ops, name_ops, oid_ops, and oidvector_ops.

Maybe so, but the first two are part of the integer_ops family.  If
we have to continue implementing all of that through DATA statements
then we haven't done much towards making things more maintainable
or less fragile.  I think we need to try to get *all* of the operator
classes out of the hand-maintained-DATA-entries collection.

The argument about optional stuff doesn't impress me.  I would think
that something that's going to be optionally loaded doesn't need to be
considered during bootstrap mode at all.  We can just have initdb run
some SQL scripts (or not) during its post-bootstrap phase.

regards, tom lane



[HACKERS] query - change in gistentryinit between 8.1 and 8.2

2009-07-26 Thread Pavel Stehule
Hello,

I am trying to write GiST support for a custom type and I am not sure
about the compress function. I don't understand where I can specify the
size of the returned compressed key. In 8.1 and older, gistentryinit had
a sixth parameter for the size, but this parameter is missing in newer
versions.

Can somebody explain where PostgreSQL gets the size of the compressed key?

Thank You
Pavel Stehule



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Robert Haas
On Sun, Jul 26, 2009 at 11:31 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Jul 25, 2009 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not nearly as excited about migrating all or even most of, say,
 the pg_proc DATA lines into SQL.

 I think it would actually buy you quite a bit to migrate them to SQL,
 because in SQL, default properties can generally be omitted, which
 means that a patch which adds a new property to pg_proc that takes the
 same value for every row doesn't actually need to touch the SQL at
 all.

 [ shrug... ]  If you think default values would buy something in
 maintainability, we could revise the BKI notation to support them,
 with a lot less work and risk than what you're proposing.

Really?  I thought about that too, but concluded that it would be
easier to verify that a change to the BKI-generation stuff was correct
(by just diffing the generated files).  I don't know how to verify
that two versions of initdb do the same thing - I assume the databases
won't be byte-for-byte identical.  But that was my only concern about
it: I like the idea of expanding what can be done in BKI mode, if we
can figure out how to do it.

 Perhaps
 something like

 DATA_DEFAULTS( pronamespace=PGNSP proowner=PGUID prolang=12 ... );

 DATA( oid=1242 proname=boolin pronargs=2 ... );
 DATA( oid=1243 proname=boolout pronargs=2 ... );

 with the convention that any field not specified in either the
 DATA macro or the current defaults would go to NULL, except OID
 which would retain its current special treatment.  (Hmm, I wonder
 if we'd even still need the _null_ hack anymore?)

 I remain unexcited about inventing contraptions that solve limited
 special cases.  It's just not that hard to maintain those cases
 the way we're doing now, and every added processing step introduces
 its own comprehension and maintenance overheads.

If you think that the current system is anywhere close to ideal, I
give up.  To do so much as add a single line to pg_proc requires all
sorts of useless manual work, like translating type names to OIDs, and
making sure that pronargs contains the correct value when the same
information is already encapsulated in both proargtypes and
proargmodes.

Introducing defaults for DATA() would bring some benefits because it
would mostly avoid the need to change every row in the file when
adding a new column.  But a preprocessing script can do much more
sophisticated transformations, like computing a value for a column, or
looking up type names in another file and translating them into OIDs.
It's not even hard; it's probably a 100-line patch on top of what I
already submitted.
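
To illustrate the kind of transformation meant here, the following is a rough sketch only, not the submitted script. It assumes a hypothetical row format in which each entry is a dict, an "argtypes" field holds type names, and TYPE_OIDS stands in for a lookup built from another catalog file. The function demo_add and its OID 9001 are made up for the example.

```python
# Illustrative lookup table; a real script would build this from pg_type's data.
TYPE_OIDS = {"bool": 16, "int4": 23, "cstring": 2275}

# Defaults applied to every row unless the row overrides them.
DEFAULTS = {"pronamespace": "PGNSP", "proowner": "PGUID", "prolang": "12"}


def expand(row, defaults=DEFAULTS, type_oids=TYPE_OIDS):
    """Fill in defaults, translate type names to OIDs, and derive pronargs."""
    out = dict(defaults)
    out.update(row)
    argnames = out.pop("argtypes").split()
    out["proargtypes"] = " ".join(str(type_oids[name]) for name in argnames)
    out["pronargs"] = str(len(argnames))  # computed, never typed by hand
    return out


row = expand({"oid": "9001", "proname": "demo_add", "argtypes": "int4 int4"})
```

Adding a new column with a uniform value then means touching DEFAULTS only, and pronargs can never disagree with the argument list because it is computed from it.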

...Robert



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Robert Haas
On Sun, Jul 26, 2009 at 1:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 AFAICT the only opclasses that need to be in the bootstrap set are
 int2_ops, int4_ops, name_ops, oid_ops, and oidvector_ops.

 Maybe so, but the first two are part of the integer_ops family.  If
 we have to continue implementing all of that through DATA statements
 then we haven't done much towards making things more maintainable
 or less fragile.  I think we need to try to get *all* of the operator
 classes out of the hand-maintained-DATA-entries collection.

Is this mostly a forward-reference problem?

...Robert



Re: [HACKERS] CommitFest Status Summary - 2009-07-25

2009-07-26 Thread Robert Haas
On Sun, Jul 26, 2009 at 12:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 ... One thing I have belatedly realized about this
 CommitFest is that we (or at least, I) did not think about asking the
 committers about their schedules, and it turns out that three of them
 - Heikki, Michael Meskes, Joe Conway - are away at the moment.  About
 25% of the remaining patches are waiting for one of those three people
 to take the next step (as either patch author, or reviewer, or
 committer).

 Well, any commitfest is going to have some issues of that sort,
 especially one scheduled during the summer.  If we get to the point
 where those patches are the only ones left, and the relevant people
 still aren't back, I think we can just push them all to the next fest.
 But I doubt we are moving fast enough to make that happen.

I think Joe is back in the next week, but I'm not sure about Michael or Heikki.

 Specific Committers (13)
 - generic explain options v3 (needs further review by Tom Lane)

 Actually I was waiting for the other EXPLAIN patch to come ready
 before looking at this, because I thought they were intertwined.
 Do you want this committed before that?

Well, if it's OK with you, yes.  I have been maintaining these as a
series of stacked patches, and the latest round of refactoring on the
explain-options patch has broken the machine-readable explain output
patch beyond all recognition.  So it costs me nothing to have you
whack it around some more before committing it at this point.  I think
it's an independent feature: it does a bunch of refactoring, adds new
syntax, and adds an actual option which makes use of that syntax.  Not
the most interesting option, to be sure, but one that's been requested
more than once.

The other alternative is to merge the two patches together and then
commit the whole thing in one go.  I think I like this option a little
less because it means that if there turn out to be additional issues
with the machine-readable explain-patch, I might end up getting
nothing that actually does anything committed this CommitFest, and it
will also delay the process by several days while I rework and merge
the patches.  But I'm willing to do it if it's the only path to
getting this done.  What I'm LEAST enthusiastic about is fixing the
machine-readable explain output patch, then having you make some more
changes to the explain-options patch, then having to fix machine-readable
explain output again.

...Robert



Re: [HACKERS] CommitFest Status Summary - 2009-07-25

2009-07-26 Thread Heikki Linnakangas
Robert Haas wrote:
 I think Joe is back in the next week, but I'm not sure about Michael or 
 Heikki.

I'll be back on Monday 3rd of August.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] SE-PostgreSQL Specifications

2009-07-26 Thread KaiGai Kohei

Andrew Dunstan wrote:



KaiGai Kohei wrote:

Andrew Dunstan wrote:



KaiGai Kohei wrote:


The SELinux provides a certain process privilege to make backups and
restore them. In the (current) default policy, it is called
unconfined.

However, it is also *possible* to define a new special process
privilege for backup and restore tools. For example, it can access all
the database objects and can make backups, but any other process cannot
touch the backup files. It means that the DBA can launch a backup tool,
which creates a black-boxed file, and can then launch a restore tool to
restore the black-boxed backup, but cannot see the contents of the backup.
(It might be a similar idea to the sudo mechanism.)




Really? How do you enforce this black box rule for a backup made across
the network? From the server's POV there is no such thing as a
backup. All it sees is a set of SQL statements, all of which it might
see in some other context.


Recent SELinux provides a feature to exchange the security context of
the peer process over a network connection.
It allows restricting a process to send/receive packets to/from
only a certain process, even if they communicate over a remote connection.

This feature is named Labeled IPsec. The key exchange daemon (racoon)
was enhanced to also exchange the security context of the peer processes
prior to the actual communications.




Interesting, I can see this having some use in quite a number of areas. 
Of course, in the end, it still comes down to this issue, which is as 
old as Plato: Quis custodiet ipsos custodes? (see 
http://en.wikipedia.org/wiki/Quis_custodiet_ipsos_custodes%3F )


The administrator needs to set up and manage both of the hosts to
keep a consistent security policy, but that is not a technical issue.

We have security issues broader than what technical features can solve,
and technical security features are one piece of the solution.
(Needless to say, it is an important piece.)

For example, any kind of access control is ineffective against physical
attacks, so we need to place the server in a data center with physical
controls on entering or leaving a room.
Looking at any ISO/IEC 15408 certification, it defines a certain
environment in which the certified product is to be used.
It means the certification is valid in the required environment.

The technical security feature is an important piece, but not all.

Thanks,
--
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Greg Stark
On Sun, Jul 26, 2009 at 6:49 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Also, the
 requirement that, to be considered a relational database, it must be
 impossible to write two queries which can be shown to be logically
 equivalent but which optimize to different access plans to be, well, a
 bit ivory tower.

Personally I think that's a fine goal to aim for. I'm not sure what
"to be considered a relational database" means but I consider it a bug
whenever there's a case where this isn't true. It may be a bug that we
don't have a good solution for or a bug that's too minor for the
amount of effort it would require but it's still not right and if we
found a solution that we were happy with we would definitely want to
fix it.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Sam Mason
On Sun, Jul 26, 2009 at 12:49:32PM -0500, Kevin Grittner wrote:
 Codd, E.F. (1990). The Relational Model for Database Management
 (Version 2 ed.). Addison Wesley Publishing Company.
 ISBN 0-201-14192-2.

Looks as though I've got some reading to do then--somewhat annoying that
only second hand copies are available from the US, but never mind!

 I believe that he puts forward a list of about 200 things he feels
 should be true of a database in order for him to consider it a
 relational database.  Since he was first and foremost a mathematician,
 and was something of a perfectionist, I don't think some of these are
 achievable (at least in the foreseeable future) without tanking
 performance, but it makes for an interesting read.  I find most of it
 to be on target, and it gives a unique chance to see things from the
 perspective of the inventor of relational model for database
 management.

Yup, I've heard lots and read a few smaller articles but don't think
I've got around to any of his books.

 I don't, of course, agree with him on everything.  If you think that
 the SQL standard date handling is weird, wait until you see how a
 perfectionist mathematician attempts to deal with it.  :-)  Also, the
 requirement that, to be considered a relational database, it must be
 impossible to write two queries which can be shown to be logically
 equivalent but which optimize to different access plans to be, well, a
 bit ivory tower.

Sounds as though he's using a different definition than what I would
use, but I'm sure I'll find out.

-- 
  Sam  http://samason.me.uk/



Re: [HACKERS] When is a record NULL?

2009-07-26 Thread David E. Wheeler

On Jul 25, 2009, at 4:41 PM, David E. Wheeler wrote:

Useless perhaps, but it's gonna happen, and someone may even have a  
reason for it. Until such time as NULLs are killed off, we need to  
be able to deal with SQL's pathologies.


And something I'd like to be able to handle in a while loop, as I'm  
actually fetching one row at a time from two cursors and need to be  
able to tell when I've reached the end of a cursor. This example  
highlights the issue:


\set QUIET 1
SET client_min_messages = warning;
BEGIN;

CREATE TABLE peeps (
    name   TEXT    NOT NULL,
    dob    date,
    ssn    text,
    active boolean NOT NULL DEFAULT true
);

INSERT INTO peeps
VALUES ('Tom',    '1963-03-23', '123-45-6789', true),
       ('Damian', NULL,         NULL,          true),
       ('Larry',  NULL,         '932-45-3456', true),
       ('Bruce',  '1965-12-31', NULL,          true);

CREATE TYPE dobssn AS ( dob date, ssn text );

CREATE FUNCTION using_loop() RETURNS SETOF dobssn LANGUAGE plpgsql AS $$
DECLARE
    stuff CURSOR FOR SELECT dob, ssn FROM peeps WHERE active ORDER BY name;
BEGIN
    FOR rec IN stuff LOOP
        RETURN NEXT rec;
    END LOOP;
END;
$$;

CREATE FUNCTION using_while() RETURNS SETOF dobssn LANGUAGE plpgsql AS $$
DECLARE
    stuff CURSOR FOR SELECT dob, ssn FROM peeps WHERE active ORDER BY name;
    rec dobssn;
BEGIN
    OPEN stuff;
    FETCH stuff INTO rec;
    WHILE NOT rec IS NULL LOOP
        RETURN NEXT rec;
        FETCH stuff INTO rec;
    END LOOP;
END;
$$;

SELECT * FROM using_loop();
SELECT * FROM using_while();

ROLLBACK;

Output:

    dob     |     ssn
------------+-------------
 1965-12-31 |
            |
            | 932-45-3456
 1963-03-23 | 123-45-6789
(4 rows)

    dob     | ssn
------------+-----
 1965-12-31 |
(1 row)

So somehow the use of the loop to go right through the cursor can tell  
the difference between a record that's all nulls and when the end  
of the cursor has been reached. My use of the while loop, however,  
cannot tell the difference, and AFAICT, there is no way to detect the  
difference in SQL. Is that correct? Is there some way to get  
using_while() to properly return all the records?


FYI, using:

WHILE rec IS DISTINCT FROM NULL LOOP

Results in an infinite loop. So does:

WHILE NOT rec IS NOT DISTINCT FROM NULL LOOP

And this, of course:

WHILE rec IS NOT NULL LOOP

Returns no rows at all.

Surely someone has run into this before, no?
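
For readers following along, the two finite results above match the SQL rule that a row value IS NULL when every field is null and IS NOT NULL only when every field is non-null. The following is a small Python model of that rule, offered as a sketch rather than a description of the server's implementation. It assumes, as the PL/pgSQL documentation describes, that a FETCH past the end leaves the target set to nulls; the helper names are made up.

```python
def row_is_null(row):
    """SQL 'row IS NULL': true when every field is null."""
    return row is None or all(value is None for value in row)


def row_is_not_null(row):
    """SQL 'row IS NOT NULL': true only when every field is non-null."""
    return row is not None and all(value is not None for value in row)


# The peeps rows in ORDER BY name order: Bruce, Damian, Larry, Tom.
ROWS = [
    ("1965-12-31", None),
    (None, None),
    (None, "932-45-3456"),
    ("1963-03-23", "123-45-6789"),
]


def fetch_while(rows, keep_going):
    """Mimic FETCH ... INTO rec inside a WHILE loop with the given test."""
    cursor = iter(rows)
    out = []
    rec = next(cursor, (None, None))  # past the end: record of nulls
    while keep_going(rec):
        out.append(rec)
        rec = next(cursor, (None, None))
    return out


with_not_is_null = fetch_while(ROWS, lambda rec: not row_is_null(rec))
with_is_not_null = fetch_while(ROWS, row_is_not_null)
```

In this model the first test stops at Damian's all-null row after returning one record, and the second stops immediately because Bruce's record has a null ssn. Neither test can separate a genuine all-null row from the end of the cursor.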

Thanks,

David




Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Sam Mason
On Sun, Jul 26, 2009 at 03:46:19PM -0700, David E. Wheeler wrote:
 And something I'd like to be able to handle in a while loop, as I'm  
 actually fetching one row at a time from two cursors and need to be  
 able to tell when I've reached the end of a cursor.

I'm sure I'm missing something obvious, but why doesn't the FOUND
magic variable tell you what you want?

-- 
  Sam  http://samason.me.uk/



Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Eric B. Ridge

On Jul 26, 2009, at 6:46 PM, David E. Wheeler wrote:

Is there some way to get using_while() to properly return all the  
records?


I'm just a random lurker, but FOUND seems to work just fine (I suppose  
it's PG-specific?).


http://www.postgresql.org/docs/8.1/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-DIAGNOSTICS

BEGIN
   OPEN stuff;
   FETCH stuff INTO rec;
   WHILE FOUND LOOP
  RETURN NEXT rec;
  FETCH stuff INTO rec;
   END LOOP;
END;
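
The reason FOUND works is that it reports end-of-cursor out of band instead of encoding it in the record's contents. A Python analogue, as a sketch with hypothetical names, where fetch returns the record together with a found flag:

```python
def fetch(cursor):
    """Return (record, found); found is False once the cursor is exhausted."""
    try:
        return next(cursor), True
    except StopIteration:
        return (None, None), False


def using_while_found(rows):
    """Loop on the found flag, so all-null records are returned like any other."""
    cursor = iter(rows)
    out = []
    rec, found = fetch(cursor)
    while found:
        out.append(rec)
        rec, found = fetch(cursor)
    return out


rows = [
    ("1965-12-31", None),
    (None, None),
    (None, "932-45-3456"),
    ("1963-03-23", "123-45-6789"),
]
result = using_while_found(rows)
```

Because the loop condition never inspects the record, the all-null row in the middle does not end the loop early.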

HTH,

eric





Re: [HACKERS] generic explain options v3

2009-07-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Here's the update.  There are a few things that I'm not entirely happy
 with here, but not quite sure what to do about either.

Committed with a few editorializations.

 - ExplainPrintPlan() is now almost trivial.  It seems like there
 should be some way to get rid of this altogether, but I'm not quite
 sure how.  I thought about ripping pstmt and rtable out of
 ExplainState and just storing queryDesc there.  But that involves
 changing a lot of code, and while it makes some things simpler, it
 makes other parts more complex.  I'm not sure whether it's a win or
 not; I'm also not sure how much brainpower it's worth spending on
 this.

I think the problem here is that you chose to treat ExplainState.pstmt
as a parameter, when it's better considered as an internal field.
I changed it to the latter approach.

 - It's becoming increasingly evident to me that the explain stuff in
 prepare.c has no business being there and should be moved to
 explain.c.  I haven't done that here, but it's worth thinking about.

I'm unconvinced.  The reason that code is that way is that the
alternative would require explain.c to know quite a lot about prepared
plans, which does not seem like an improvement.

 - The hack needed in ExplainLogLevel is just that.

Yeah, I thought that was okay.  We could alternatively refactor the
code so that the parameter analysis code is a separate function that
utility.c could call, but it's unclear that it's worth the trouble.

regards, tom lane



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Introducing defaults for DATA() would bring some benefits because it
 would mostly avoid the need to change every row in the file when
 adding a new column.  But a preprocessing script can do much more
 sophisticated transformations, like computing a value for a column, or
 looking up type names in another file and translating them into OIDs.

Hmm.  A preprocessing script that produces DATA commands might in fact
be a reasonable proposal, but it was not what I understood you to be
suggesting before.

regards, tom lane



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jul 26, 2009 at 1:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think we need to try to get *all* of the operator
 classes out of the hand-maintained-DATA-entries collection.

 Is this mostly a forward-reference problem?

No, I don't see that as particularly the issue.  What I'm concerned
about is the prospect of different parts of the same opfamily being
represented in different notations --- that sounds pretty error-prone
to me.  Greg is arguing that special-casing some minimum subset of the
opclasses is a good idea, but I disagree.  I think if we can make the
idea work at all, we can migrate *all* the built-in opclasses into the
higher-level notation, and that's how I want to approach it.

regards, tom lane



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Alvaro Herrera
Tom Lane escribió:

 I experimented with that a little bit and found it doesn't seem to be
 tremendously easy.  A non-bootstrap-mode backend will PANIC immediately
 on startup if it doesn't find the critical system indexes, so the second
 step has issues.  Also, there is no provision for resuming bootstrap
 mode in an already-existing database, so the third step doesn't work
 either.

FWIW we hacked up a sort-of-bootstrap mode in Mammoth Replicator to be
able to create our own catalogs and stuff.  It's not particularly
hard nor large:

 bootstrap.c |   31 ++!
 1 file changed, 6 insertions(+), 25 modifications(!)


(This is BSD code so feel free to use it if you find it useful)

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
*** 83_rel/src/backend/bootstrap/bootstrap.c	2008-01-09 13:04:32.0 -0300
--- 23trunk/src/backend/bootstrap/bootstrap.c	2009-07-26 21:12:48.0 -0400
***************
*** 27,36 ****
--- 27,39 ----
  #include "catalog/index.h"
  #include "catalog/pg_type.h"
  #include "libpq/pqsignal.h"
+ #include "mammoth_r/mcp_queue.h"
+ #include "mammoth_r/txlog.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
  #include "postmaster/bgwriter.h"
  #include "postmaster/walwriter.h"
+ #include "postmaster/replication.h"
  #include "storage/freespace.h"
  #include "storage/ipc.h"
  #include "storage/proc.h"
***************
*** 48,54 ****
  #define ALLOC(t, c)		((t *) calloc((unsigned)(c), sizeof(t)))
  
  static void CheckerModeMain(void);
! static void BootstrapModeMain(void);
  static void bootstrap_signals(void);
  static void ShutdownAuxiliaryProcess(int code, Datum arg);
  static hashnode *AddStr(char *str, int strlength, int mderef);
--- 51,57 ----
  #define ALLOC(t, c)		((t *) calloc((unsigned)(c), sizeof(t)))
  
  static void CheckerModeMain(void);
! static void BootstrapModeMain(char *dbname);
  static void bootstrap_signals(void);
  static void ShutdownAuxiliaryProcess(int code, Datum arg);
  static hashnode *AddStr(char *str, int strlength, int mderef);
***************
*** 207,212 ****
--- 210,216 ----
  	int			flag;
  	AuxProcType auxType = CheckerProcess;
  	char	   *userDoption = NULL;
+ 	char   *dbname = NULL;
  
  	/*
  	 * initialize globals
***************
*** 313,319 ****
  		}
  	}
  
! 	if (argc != optind)
  	{
  		write_stderr("%s: invalid command-line arguments\n", progname);
  		proc_exit(1);
--- 317,325 ----
  		}
  	}
  
! 	if (auxType == MammothBootstrapProcess && argc - optind + 1)
! 		dbname = argv[optind++]; 
! 	else if (argc != optind || auxType == MammothBootstrapProcess)
  	{
  		write_stderr("%s: invalid command-line arguments\n", progname);
  		proc_exit(1);
***************
*** 337,342 ****
--- 343,350 ----
  			case WalWriterProcess:
  				statmsg = "wal writer process";
  				break;
+ 			case MammothBootstrapProcess:
+ 				statmsg = "mammoth bootstrap process";
  			default:
  				statmsg = "??? process";
  				break;
***************
*** 410,416 ****
  			bootstrap_signals();
  			BootStrapXLOG();
  			StartupXLOG();
! 			BootstrapModeMain();
  			proc_exit(1);		/* should never return */
  
  		case StartupProcess:
--- 418,432 ----
  			bootstrap_signals();
  			BootStrapXLOG();
  			StartupXLOG();
! 			BootstrapModeMain(NULL);
! 			proc_exit(1);		/* should never return */
! 
! 		case MammothBootstrapProcess:
! 			bootstrap_signals();
! 			BootstrapTXLOG();
! 			BootStrapMCPQueue();
! 			StartupXLOG();
! 			BootstrapModeMain(dbname);
  			proc_exit(1);		/* should never return */
  
  		case StartupProcess:
***************
*** 469,487 ****
   *	 commands in a special bootstrap language.
   */
  static void
! BootstrapModeMain(void)
  {
  	int			i;
  
  	Assert(!IsUnderPostmaster);
  
! 	SetProcessingMode(BootstrapProcessing);
  
  	/*
  	 * Do backend-like initialization for bootstrap mode
  	 */
  	InitProcess();
! 	InitPostgres(NULL, InvalidOid, NULL, NULL);
  
  	/* Initialize stuff for bootstrap-file processing */
  	for (i = 0; i < MAXATTR; i++)
--- 485,506 ----
   *	 commands in a special bootstrap language.
   */
  static void
! BootstrapModeMain(char *dbname)
  {
  	int			i;
  
  	Assert(!IsUnderPostmaster);
  
! 	if (dbname == NULL)
! 		SetProcessingMode(BootstrapProcessing);
! 	else
! 		SetProcessingMode(MammothBootstrapProcessing);
  
  	/*
  	 * Do backend-like initialization for bootstrap mode
  	 */
  	InitProcess();
! 	InitPostgres(dbname, InvalidOid, NULL, NULL);
  
  	/* Initialize stuff for bootstrap-file processing */
  	for (i = 0; i < MAXATTR; i++)
*** 83_rel/src/include/bootstrap/bootstrap.h	2008-01-09 13:04:49.0 -0300
--- 23trunk/src/include/bootstrap/bootstrap.h	2008-09-12 16:36:43.0 -0400
***************
*** 70,76 ****
  	BootstrapProcess,
  	StartupProcess,
  	BgWriterProcess,
! 	WalWriterProcess
  } AuxProcType;
  
  #endif   /* BOOTSTRAP_H */
--- 70,77 ----
  	BootstrapProcess,
  	StartupProcess,
  	BgWriterProcess,
! 	

Re: [HACKERS] generic explain options v3

2009-07-26 Thread Robert Haas
On Sun, Jul 26, 2009 at 7:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Here's the update.  There are a few things that I'm not entirely happy
 with here, but not quite sure what to do about either.

 Committed with a few editorializations.

Thanks.

 - ExplainPrintPlan() is now almost trivial.  It seems like there
 should be some way to get rid of this altogether, but I'm not quite
 sure how.  I thought about ripping pstmt and rtable out of
 ExplainState and just storing queryDesc there.  But that involves
 changing a lot of code, and while it makes some things simpler, it
 makes other parts more complex.  I'm not sure whether it's a win or
 not; I'm also not sure how much brainpower it's worth spending on
 this.

 I think the problem here is that you chose to treat ExplainState.pstmt
 as a parameter, when it's better considered as an internal field.
 I changed it to the latter approach.

Sounds fine.

 - It's becoming increasingly evident to me that the explain stuff in
 prepare.c has no business being there and should be moved to
 explain.c.  I haven't done that here, but it's worth thinking about.

 I'm unconvinced.  The reason that code is that way is that the
 alternative would require explain.c to know quite a lot about prepared
 plans, which does not seem like an improvement.

I didn't consider that.  As it is, prepare.c has to know quite a lot
about explaining, so it may be six of one, half a dozen of the other.

 - The hack needed in ExplainLogLevel is just that.

 Yeah, I thought that was okay.  We could alternatively refactor the
 code so that the parameter analysis code is a separate function that
 utility.c could call, but it's unclear that it's worth the trouble.

OK.

It seems I have quite a bit of work in front of me unbreaking the
machine-readable explain patch.  I started grinding through it, but
it's not pretty.  I'll post an updated version when I have it.

...Robert



Re: [HACKERS] [BUGS] BUG #4941: pg_stat_statements crash

2009-07-26 Thread Itagaki Takahiro

Tom Lane t...@sss.pgh.pa.us wrote:

  We should call [Read dumpfile] routine only once even on Windows.
 Seems to me that you should simply do the load only when found is false.

Here is a patch to fix pg_stat_statements on Windows.

I see we don't need any locks because initialization is done in the postmaster;
there is no chance of seeing an uninitialized state of 'pgss' after releasing
AddinShmemInitLock and before loading the dump file into it.

I also checked pgss_shmem_shutdown and found no problem.
It is called only once from the postmaster on shutdown.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



pg_stat_statements.patch
Description: Binary data



Re: [HACKERS] autogenerating headers bki stuff

2009-07-26 Thread Robert Haas
On Sun, Jul 26, 2009 at 8:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Introducing defaults for DATA() would bring some benefits because it
 would mostly avoid the need to change every row in the file when
 adding a new column.  But a preprocessing script can do much more
 sophisticated transformations, like computing a value for a column, or
 looking up type names in another file and translating them into OIDs.

 Hmm.  A preprocessing script that produces DATA commands might in fact
 be a reasonable proposal, but it was not what I understood you to be
 suggesting before.

OK, sorry if I was unclear.  I'm not sure exactly what you mean by
producing DATA() commands; I think the output should be BKI directly.
One of the things this patch does that I think is good (however flawed
it may be otherwise) is unifies all of the stuff that needs to parse
the DATA() statements into a single script.  I think this is something
we should pursue, because I think it will simplify the introduction of
any other notation we want to consider in this area (regardless of
whether it's DATA_DEFAULTS or EXEC_BKI or what have you).

Maybe I should rip out all the anum.h stuff (sniff, I'm sad, I liked
that design...) and resubmit.

...Robert



Re: [HACKERS] Merge Append Patch merged up to 85devel

2009-07-26 Thread Tom Lane
Gregory Stark st...@enterprisedb.com writes:
 Here's a copy of the merge-append patch that I sent months ago merged up to
 head. I haven't really added any additional functionality since then.

I looked at the planner part of this a little bit.  I think that it's
confusing an append that produces an ordered result with an append
that's doing a merge.  Those concepts need to be kept separate, because
as soon as we have a real concept of partitioned tables it will be
possible to have known-ordered output from a simple append of indexscans
on the partition key.  So you need an explicit flag to say we'll do
a merge, not just rely on whether the path has pathkeys.
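
To make the distinction concrete, here is a toy Python sketch.  It is
not planner code; the function name and the do_merge flag are invented
for illustration.  It only shows that "output is ordered" and "a merge
is performed" are separate properties.

```python
import heapq
from itertools import chain


def append_children(children, do_merge):
    """Combine already-sorted child streams.

    do_merge=True: interleave the children with a heap-based merge.
    do_merge=False: plain concatenation, which is ordered only if the
    children cover disjoint, ascending key ranges.
    """
    if do_merge:
        return list(heapq.merge(*children))
    return list(chain.from_iterable(children))
```

For children [1, 4, 7] and [2, 5, 8] only the merge gives sorted
output; for range-partitioned children such as [1, 2, 3] and [10, 11]
the plain append is already ordered and the merge is unnecessary work.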

As your comments note, the current approach to deciding which ordered
paths to generate is pretty unworkable --- it won't scale nicely at all
for large numbers of child tables.  One random idea is to take some
specific one of the children (the largest, probably) as the leader
and consider only ordered-appends generating the same pathkeys as are
available for the leader.  I approve of the fact that the code will
consider force-sorting children that are missing a way to match the
pathkeys, but presumably we don't want to do that on any but small
tables, so this seems like a possibly usable approach.
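
A minimal Python sketch of that leader idea, under heavy assumptions:
children are plain dicts with hypothetical 'name', 'rows' and
'pathkeys' fields, and pathkeys are reduced to simple strings.  Nothing
here reflects the real planner data structures.

```python
def candidate_orderings(children):
    """Pick the largest child as leader and consider only its orderings.

    Returns a dict mapping each of the leader's orderings to the names
    of the children that lack it and would need an explicit sort.
    """
    leader = max(children, key=lambda c: c["rows"])
    plans = {}
    for key in sorted(leader["pathkeys"]):
        plans[key] = [c["name"] for c in children
                      if key not in c["pathkeys"]]
    return plans
```

The number of orderings considered is then bounded by what the leader
offers, independent of the number of children; a cost check on the size
of the children that need sorting would still have to be layered on
top.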

Speaking of sorting, it's not entirely clear to me how the patch ensures
that all the child plans produce the necessary sort keys as output
columns, and especially not how it ensures that they all get produced in
the *same* output columns.  This might accidentally manage to work
because of the throwaway call to make_sort_from_pathkeys(), but at the
very least that's a misleading comment.

 The other pending question is the same I had back when I originally submitted
 it. I don't really understand what's going on with eclasses and what
 invariants we're aiming to maintain with them.

For non-equivalence-class qualifications, we translate the parent rel's
quals to match the child's columns using adjust_appendrel_attrs().
(This isn't necessarily a no-op because child rels might have different
physical numbers for inherited columns.)  The child-eclass stuff is just
a mechanism to be able to generate suitably translated quals for the
cases where quals are being deduced from eclasses instead of presented
directly.  I don't remember offhand whether there are any special
considerations for the associated pathkeys, but it seems possible that
it would Just Work --- especially if the patch did anything useful for
you at all ;-).  In any case, I'm amazed that it's not failing
regression tests all over the place with those critical tests in
make_sort_from_pathkeys lobotomized by random #ifdef FIXMEs.  Perhaps
we need some more regression tests...

In the same vein, the hack to short circuit the append stuff for
a single child node is simply wrong, because it doesn't allow for column
number variances.  Please remove it.
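
To illustrate why column numbers can differ between parent and child,
here is a toy Python sketch of the translation step.  It is only an
analogy for what adjust_appendrel_attrs() has to account for; the
function name and the name-based matching are assumptions made for the
example.

```python
def translate_attnums(qual_attnums, parent_cols, child_cols):
    """Map 1-based parent attribute numbers to the child's numbers by
    matching column names."""
    child_pos = {name: i + 1 for i, name in enumerate(child_cols)}
    return [child_pos[parent_cols[a - 1]] for a in qual_attnums]
```

With parent columns (id, ts, val) and a child laid out as (ts, extra,
id, val), parent attributes 1 and 3 become child attributes 3 and 4.
Skipping the translation for a single child would silently read the
wrong columns.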

regards, tom lane



Re: [HACKERS] [BUGS] BUG #4941: pg_stat_statements crash

2009-07-26 Thread Alvaro Herrera
Itagaki Takahiro escribió:
 
 Tom Lane t...@sss.pgh.pa.us wrote:
 
   We should call [Read dumpfile] routine only once even on Windows.
  Seems to me that you should simply do the load only when found is false.
 
 Here is a patch to fix pg_stat_statements on Windows.

Hmm, it seems the comment just above the patched line needs to be fixed.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [BUGS] BUG #4941: pg_stat_statements crash

2009-07-26 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 Here is a patch to fix pg_stat_statements on Windows.

Yeah, that looks about right to me.  Committed.

regards, tom lane



Re: [HACKERS] [BUGS] BUG #4941: pg_stat_statements crash

2009-07-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Itagaki Takahiro escribió:
 Here is a patch to fix pg_stat_statements on Windows.

 Hmm, it seems the comment just above the patched line needs to be fixed.

I looked at that and decided it was OK as-is.  How do you want to
change it?

regards, tom lane



Re: [HACKERS] [BUGS] BUG #4941: pg_stat_statements crash

2009-07-26 Thread Alvaro Herrera
Tom Lane escribió:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Itagaki Takahiro escribió:
  Here is a patch to fix pg_stat_statements on Windows.
 
  Hmm, it seems the comment just above the patched line needs to be fixed.
 
 I looked at that and decided it was OK as-is.  How do you want to
 change it?

The reason that it doesn't need locks is not that there's no other
process running, but that it was already initialized, in the case when
found is false.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [BUGS] BUG #4941: pg_stat_statements crash

2009-07-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane escribió:
 I looked at that and decided it was OK as-is.  How do you want to
 change it?

 The reason that it doesn't need locks is not that there's no other
 process running, but that it was already initialized, in the case when
 found is false.

Mph.  The comment is correct, I think, but it applies to the situation
after we pass the !found test, rather than where the comment is.  Maybe
we should just move it down one statement?
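
For readers following along, the control flow under discussion can be
sketched roughly as below.  This is a Python stand-in, not the actual
pg_stat_statements code: a dict plays the role of shared memory, the
function and argument names are hypothetical, and no locking is
modelled.

```python
def shmem_startup(shmem, name, load_dump):
    """Attach to the named segment, creating it if it does not exist.

    Returns (segment, found).  The dump is loaded only by the caller
    that created the segment, i.e. when found is False.
    """
    found = name in shmem
    if found:
        # Already initialized by an earlier caller: nothing to load.
        return shmem[name], True
    segment = {"entries": []}
    shmem[name] = segment
    # First-time initialization only: read the saved statistics.
    segment["entries"].extend(load_dump())
    return segment, False
```

Calling it twice against the same dict invokes load_dump only on the
first call, which is the "load only when found is false" behavior the
fix is after.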

regards, tom lane
