Re: [HACKERS] Postgres-R: primary key patches

2008-07-21 Thread Markus Wanner

Hi,

Alvaro Herrera wrote:

Markus Wanner wrote:
(Although, I'm still less than thrilled about the internal storage  
format of these tuple collections. That can certainly be improved and  
simplified.)


Care to expand more on what it is?


Well, what I really dislike is the code overhead of first transforming 
tuples into a string-based internal change-set representation, which 
then gets serialized again. That looks like two conversion steps, both 
prone to error.


I'm about to merge those into a simpler tuple serializer, which shares 
code with the initializer (or recovery provider/subscriber) part. This 
is where I'd like to know what requirements Jan or others have. I will 
try to outline the current implementation and requirements of Postgres-R 
in a new thread soon.



On Replicator we're using the binary
send/recv routines to transmit tuples.  (Obviously this fails when the
master and slave have differing binary output, but currently we just
punt on this point).


Yeah, that's another point. I'm currently using the textual input/output 
functions, but would like to switch to the binary ones as well. However, 
that's an optimization, whereas the above is a simplification of the 
code, which is more important to me at the moment.


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Default of max_stack_depth and getrlimit

2008-07-21 Thread Heikki Linnakangas
In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set 
max_stack_depth automatically, to a max of 2MB:


http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php

However, it's not behaving as I expected when the stack limit is set to 
"unlimited". I would expect max_stack_depth to be set to the max of 2MB, 
but instead it gets set to 100kB.


I don't normally run without a limit, but it looks like the regression 
tests run like that with make check, at least on my platform. I bumped 
into this while running a custom regression test with very deep nesting.


I think we should differentiate between "infinite" and "unknown" in the 
return value of get_stack_depth_limit(), use a max_stack_depth of 2MB in 
the infinite case, and fall back to 100kB only in the unknown case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the

2008-07-21 Thread Markus Wanner

Hi,

Tom Lane wrote:

Adjust things so that the query_string of a cached plan and the sourceText of
a portal are never NULL, but reliably provide the source text of the query.
It turns out that there was only one place that was really taking a short-cut,
which was the 'EXECUTE' utility statement.  That doesn't seem like a
sufficiently critical performance hotspot to justify not offering a guarantee
of validity of the portal source text.


This commit added a variable 'query_string' to the function 
ExecuteQuery() in src/backend/commands/prepare.c, but that function 
already takes an argument named 'queryString'. What's the difference? 
Which is which? Do we need both?


It looks like the second is the query string of the PREPARE statement, 
while the string passed as an argument contains the EXECUTE command. I 
propose renaming the variable (as in the attached patch) or at least 
explaining it better in additional comments.


Sorry if this is nitpicking. I just happened to stumble over it and 
thought I'd better tell you.


Regards

Markus

*** src/backend/commands/prepare.c	1f53747076d3cb8d83832179c2e8a0ee3d8f2d37
--- src/backend/commands/prepare.c	0b2ffacdca58b5a073fe6a57b3aa2c7d61d317a4
*** ExecuteQuery(ExecuteStmt *stmt, const ch
*** 174,180 
  	ParamListInfo paramLI = NULL;
  	EState	   *estate = NULL;
  	Portal		portal;
! 	char	   *query_string;
  
  	/* Look it up in the hash table */
  	entry = FetchPreparedStatement(stmt->name, true);
--- 174,180 
  	ParamListInfo paramLI = NULL;
  	EState	   *estate = NULL;
  	Portal		portal;
! 	char	   *prepared_qs;
  
  	/* Look it up in the hash table */
  	entry = FetchPreparedStatement(stmt->name, true);
*** ExecuteQuery(ExecuteStmt *stmt, const ch
*** 205,212 
  	portal->visible = false;
  
  	/* Copy the plan's saved query string into the portal's memory */
! 	query_string = MemoryContextStrdup(PortalGetHeapMemory(portal),
! 	   entry->plansource->query_string);
  
  	/*
  	 * For CREATE TABLE / AS EXECUTE, we must make a copy of the stored query
--- 205,212 
  	portal->visible = false;
  
  	/* Copy the plan's saved query string into the portal's memory */
! 	prepared_qs = MemoryContextStrdup(PortalGetHeapMemory(portal),
! 	  entry->plansource->query_string);
  
  	/*
  	 * For CREATE TABLE / AS EXECUTE, we must make a copy of the stored query
*** ExecuteQuery(ExecuteStmt *stmt, const ch
*** 256,262 
  
  	PortalDefineQuery(portal,
  	  NULL,
! 	  query_string,
  	  entry->plansource->commandTag,
  	  plan_list,
  	  cplan);
--- 256,262 
  
  	PortalDefineQuery(portal,
  	  NULL,
! 	  prepared_qs,
  	  entry->plansource->commandTag,
  	  plan_list,
  	  cplan);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] overlaps performance

2008-07-21 Thread Grzegorz Jaśkiewicz

Hey guys,

I'm asking here because I know there's a bunch of people here who know 
the topic very well. I need to use a few OVERLAPS tests on date/time 
values in my query, and I suspect it is quite inefficient there. So my 
question would be: why isn't PostgreSQL using indexes for OVERLAPS, and 
why doesn't the optimizer substitute it with something like:


(c <= a AND d > a) OR (c >= a AND c < b)

instead of

(a,b) overlaps (c,d)

Any corner cases, or particular reasons?




(source of example) 
http://www.depesz.com/index.php/2007/11/21/find-overlapping-time-ranges/



thanks.



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] overlaps performance

2008-07-21 Thread Gregory Stark
Grzegorz Jaśkiewicz <[EMAIL PROTECTED]> writes:

 So my question would be: why isn't PostgreSQL using indexes for OVERLAPS,
 and why doesn't the optimizer substitute it with something like:

 (c <= a AND d > a) OR (c >= a AND c < b)

How would you use an index for that?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WITH RECUSIVE patches 0721

2008-07-21 Thread Tatsuo Ishii
Hi,

Here are the latest WITH RECURSIVE patches against 2008/07/17 CVS (CVS
HEAD won't compile for me).

This version includes regression tests and is almost ready for commit
IMO.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


recursive_query.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH RECUSIVE patches 0721

2008-07-21 Thread Markus Wanner

Hi,

Tatsuo Ishii wrote:

CVS HEAD won't compile for me


Did you try 'make clean' and rebuild? (Or even distclean.) There were 
some changes to pg_proc. At least, that problem bit me this morning, but 
after that, I could compile HEAD just fine.


Regards

Markus


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH RECUSIVE patches 0721

2008-07-21 Thread Tatsuo Ishii
 Hi,
 
 Tatsuo Ishii wrote:
  CVS HEAD won't compile for me
 
 Did you try 'make clean' and rebuild? (Or even distclean).

Actually, what I did was a cvs update on the 2008/07/17 CVS HEAD (I kept
the freshly fetched CVS work files as a tarball).

 There were 
 some changes to pg_proc. At least, that problem bit me this 
 morning, but after that, I could compile HEAD just fine.

I got the following:

functioncmds.c:232: error: `FUNC_PARAM_TABLE' undeclared (first use in this 
function)
functioncmds.c:232: error: (Each undeclared identifier is reported only once
functioncmds.c:232: error: for each function it appears in.)
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] overlaps performance

2008-07-21 Thread Grzegorz Jaśkiewicz

Gregory Stark writes:

Grzegorz Jaśkiewicz <[EMAIL PROTECTED]> writes:


So my question would be: why isn't PostgreSQL using indexes for OVERLAPS,
and why doesn't the optimizer substitute it with something like:

(c <= a AND d > a) OR (c >= a AND c < b)


How would you use an index for that?


Check Depesz's article for that; I included it at the bottom of my email.




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the

2008-07-21 Thread Tom Lane
Markus Wanner <[EMAIL PROTECTED]> writes:
 This commit added a variable 'query_string' to the function 
 ExecuteQuery() in src/backend/commands/prepare.c, but that function 
 already takes an argument named 'queryString'. What's the difference? 
 Which is which? Do we need both?

The query_string variable is the original PREPARE's query_string copied
into the portal's context, which we do to ensure that it lives as long
as the portal does.  There's no guarantee that the CachedPlanSource
will survive that long (there could be a DEALLOCATE while the query
executes).

The one passed in is the query string for the EXECUTE statement.
I think it's just used for error reporting in EvaluateParams.

 I propose renaming the variable (as in the attached patch) or at least 
 explaining it better in additional comments.

This seems like a bad idea, because it makes the code gratuitously
different from the names used for this purpose everywhere else.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH RECUSIVE patches 0721

2008-07-21 Thread Pavel Stehule
2008/7/21 Tatsuo Ishii <[EMAIL PROTECTED]>:
 Hi,

 Tatsuo Ishii wrote:
  CVS HEAD won't compile for me

 Did you try 'make clean' and rebuild? (Or even distclean).

 Actually, what I did was a cvs update on the 2008/07/17 CVS HEAD (I kept
 the freshly fetched CVS work files as a tarball).

 There were
 some changes to pg_proc. At least, that problem bit me this
 morning, but after that, I could compile HEAD just fine.

 I got following:

 functioncmds.c:232: error: `FUNC_PARAM_TABLE' undeclared (first use in this 
 function)
 functioncmds.c:232: error: (Each undeclared identifier is reported only once
 functioncmds.c:232: error: for each function it appears in.)


It is a new symbol from the table function patch.

regards
Pavel Stehule
 --
 Tatsuo Ishii
 SRA OSS, Inc. Japan

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] overlaps performance

2008-07-21 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
 Grzegorz Jaśkiewicz <[EMAIL PROTECTED]> writes:
 So my question would be: why isn't PostgreSQL using indexes for OVERLAPS,
 and why doesn't the optimizer substitute it with something like:
 
 (c <= a AND d > a) OR (c >= a AND c < b)

 How would you use an index for that?

I believe you can index overlaps-like tests using GIST on an
interval-like data type --- look at contrib/seg for an example.

The reason we don't automatically translate OVERLAPS is that the spec's
definition of OVERLAPS is too weird for that to work; in particular
it demands a true result for some cases in which one of the four
endpoints is NULL, which'd be pretty hard to do with an interval-style
index.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the

2008-07-21 Thread Markus Wanner

Hi,

Tom Lane wrote:

This seems like a bad idea, because it makes the code gratuitously
different from the names used for this purpose everywhere else.


I find that a pretty dubious reason for having 'query_string' and 
'queryString' in the same function. In fact, having both in the same 
code base seems strange. It makes me wish we had (better!) naming 
conventions... something I've stumbled over often enough during my work 
with Postgres. What was it again: 'query_string' (87 occurrences), 
'queryString' (77), or 'querystr' (42)?


However, what about at least adding a comment, so fellow hackers have a 
chance of understanding the subtle difference there?


Regards

Markus


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default of max_stack_depth and getrlimit

2008-07-21 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
 In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set 
 max_stack_depth automatically, to a max of 2MB:

 http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php

 However, it's not behaving as I expected when the stack limit is set to 
 unlimited.

Is there really any such thing as unlimited stack depth?  I think that
treating that as "unknown" is a good conservative thing to do, because
it is surely a lie.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] overlaps performance

2008-07-21 Thread Grzegorz Jaśkiewicz

Tom Lane writes:

The reason we don't automatically translate OVERLAPS is that the spec's
definition of OVERLAPS is too weird for that to work; in particular
it demands a true result for some cases in which one of the four
endpoints is NULL, which'd be pretty hard to do with an interval-style
index.

A shame; I'm working on something that would benefit from an index that 
could be used for OVERLAPS. I don't know the PostgreSQL internals, 
except for how GiST works, hence my question.


thanks for the answer.



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Default of max_stack_depth and getrlimit

2008-07-21 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set 
max_stack_depth automatically, to a max of 2MB:



http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php


However, it's not behaving as I expected when the stack limit is set to 
unlimited.


Is there really any such thing as unlimited stack depth? 


No, but I would think it's safe to assume that "unlimited" is greater 
than 2MB.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Adjust things so that the query_string of a cached plan and the

2008-07-21 Thread Tom Lane
Markus Wanner <[EMAIL PROTECTED]> writes:
 However, what about at least adding a comment, so fellow hackers have a 
 chance of understanding the subtle difference there?

Sure, done.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default of max_stack_depth and getrlimit

2008-07-21 Thread Cédric Villemain
On Monday 21 July 2008, Heikki Linnakangas wrote:
 In 8.2, we started to use getrlimit(RLIMIT_STACK, ...) to set
 max_stack_depth automatically, to a max of 2MB:

 http://archives.postgresql.org/pgsql-hackers/2006-10/msg00389.php

 However, it's not behaving as I expected when the stack limit is set to
 unlimited. I would expect max_stack_depth to be set to the max of 2MB,
 but instead it gets set to 100kB.

 I don't normally run without a limit, but it looks like the regression
 tests run like that with make check, at least on my platform. I bumped
 into this while running a custom regression test with very deep nesting.

 I think we should differentiate between infinite and unknown in the
 return value of get_stack_depth_limit(), and use max_stack_depth of 2MB
 in case of infinite, and fall back to the 100kB only in the unknown case.

Why 2MB? I believed that 3.5MB was the effective maximum; is that too 
much?


 --
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com



-- 
Cédric Villemain
Administrateur de Base de Données
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Default of max_stack_depth and getrlimit

2008-07-21 Thread Heikki Linnakangas

Cédric Villemain wrote:

On Monday 21 July 2008, Heikki Linnakangas wrote:

I think we should differentiate between infinite and unknown in the
return value of get_stack_depth_limit(), and use max_stack_depth of 2MB
in case of infinite, and fall back to the 100kB only in the unknown case.


Why 2MB? I believed that 3.5MB was the effective maximum; is that too 
much?


2MB is the value we set max_stack_depth to, unless getrlimit() says that 
the actual stack limit is lower than that.


I believe the 2MB figure is just an arbitrary value, thought to be quite 
safe, but also high enough that most people won't need to raise it. 
Before we started to use getrlimit(), we simply defaulted to 
max_stack_depth=2MB.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default of max_stack_depth and getrlimit

2008-07-21 Thread Cédric Villemain
On Monday 21 July 2008, Heikki Linnakangas wrote:
 Cédric Villemain wrote:
  On Monday 21 July 2008, Heikki Linnakangas wrote:
  I think we should differentiate between infinite and unknown in the
  return value of get_stack_depth_limit(), and use max_stack_depth of 2MB
  in case of infinite, and fall back to the 100kB only in the unknown
  case.
 
  Why 2MB? I believed that 3.5MB was the effective maximum; is that too
  much?

 2MB is the value we set max_stack_depth to, unless getrlimit() says that
 the actual stack limit is lower than that.

 I believe the 2MB figure is just an arbitrary value, thought to be quite
 safe, but also high enough that most people won't need to raise it.
 Before we started to use getrlimit(), we used to just default
 max_stack_depth=2MB.

Ok, thank you for the explanation.


 --
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com



-- 
Cédric Villemain
Administrateur de Base de Données
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] PATCH: CITEXT 2.0 v4

2008-07-21 Thread David E. Wheeler

On Jul 18, 2008, at 09:53, David E. Wheeler wrote:

However, if someone with a lot more C and Pg core knowledge wanted to 
sit down with me for a couple of hours next week and help me bang out 
these functions, that would be great. I'd love for the implementation to 
be that much more complete.


I've implemented fixes for the regexp_* functions and strpos() in pure 
SQL, like so:


CREATE OR REPLACE FUNCTION regexp_matches( citext, citext ) RETURNS TEXT[] AS '
    SELECT regexp_matches( $1::text, $2::text, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_matches( citext, citext, text ) RETURNS TEXT[] AS '
    SELECT regexp_matches( $1::text, $2::text, CASE WHEN strpos($3, ''c'') = 0 THEN $3 || ''i'' ELSE $3 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_replace( citext, citext, text ) RETURNS TEXT AS '
    SELECT regexp_replace( $1::text, $2::text, $3, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_replace( citext, citext, text, text ) RETURNS TEXT AS '
    SELECT regexp_replace( $1::text, $2::text, $3, CASE WHEN strpos($4, ''c'') = 0 THEN $4 || ''i'' ELSE $4 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_array( citext, citext ) RETURNS TEXT[] AS '
    SELECT regexp_split_to_array( $1::text, $2::text, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_array( citext, citext, text ) RETURNS TEXT[] AS '
    SELECT regexp_split_to_array( $1::text, $2::text, CASE WHEN strpos($3, ''c'') = 0 THEN $3 || ''i'' ELSE $3 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_table( citext, citext ) RETURNS SETOF TEXT AS '
    SELECT regexp_split_to_table( $1::text, $2::text, ''i'' );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION regexp_split_to_table( citext, citext, text ) RETURNS SETOF TEXT AS '
    SELECT regexp_split_to_table( $1::text, $2::text, CASE WHEN strpos($3, ''c'') = 0 THEN $3 || ''i'' ELSE $3 END );
' LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION strpos( citext, citext ) RETURNS INT AS '
    SELECT strpos( LOWER( $1::text ), LOWER( $2::text ) );
' LANGUAGE SQL IMMUTABLE STRICT;

Not so bad, though it'd be nice to have C functions that just did these 
things. Still not case-insensitive are:


-- replace()
-- split_part()
-- translate()

So, anyone at OSCON this week want to help me with these? Or to convert 
the above functions to C? Greg? Bruce?


Thanks,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Tom Lane
The current commitfest queue has two entries that propose to migrate
existing pgfoundry projects (or improved versions thereof) into our
core distribution.  The more I think about this the less happy I am
with it.  From a maintenance point of view there seems little need
for either project to get integrated: they don't appear to have much
of any code that is tightly tied to backend innards.  From a features
point of view, yeah they're cool, but there are scads of cool things
out there.  From a project-management point of view, it's insanity
to set a presumption that pgfoundry is just a proving ground for code
that should eventually get into core once it's mature enough or popular
enough or whatever.  We *have to* encourage the development of a cloud
of subprojects around the core, or core will eventually collapse of
its own weight.  We have not got the manpower to deal with an
ever-inflating collection of allegedly core code.  If anything,
we ought to be working to push more stuff out of the core distro so
that we can focus on the functionality that has to be there.

So my feeling is that we should not accept either of these patches.

Now, there is some value in submitting the code for review --- certainly
citext is a whole lot better than it was a few weeks ago.  I think it
would be a good idea to be open to reviewing pgfoundry code with the
same standards we'd use if we were going to integrate it.  Perhaps
commitfest is not the right venue for that, though, if only because
of the possibility of confusion over what's supposed to happen.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Josh Berkus

Tom,


Comments?


Well, in the *general* case, I think if we're going to have first-class 
pgfoundry projects, then having a unified official Kitchen Sink Package 
with all of these add-ins becomes an imperative priority for 8.4. EDB's 
recent open-sourcing of their installer might help with this.


Further, we would need to come up with some organized way to subject 
pgFoundry projects to the same level of general scrutiny that core code 
gets. Or at least close to it.


In the specific cases of pl/proxy and citext, they are very much in line 
with what we already package with the core code, including things like 
dblink, ISN, and CIDR. citext in particular would eliminate a long-time 
newbie complaint about Postgres, but not if it lives in an add-on 
package the user can't find binaries for.


So I would argue maybe on pl/proxy, but that citext does belong in core.

--Josh Berkus




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Andrew Sullivan
On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote:

 Will try this option, at least in the next schema upgrade or when setting up
 Slony.

As I've already suggested, however, if you try to set up Slony on a
loaded database, you're going to see all manner of problems.  Slony
takes some heavy-duty locks when it does its setup work.  It's designed
on the assumption that you'll have an application outage for this sort
of work.  Please see previous discussion on the Slony mailing list.

A
-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-21 Thread Robert Lor

Apologies for the delayed response - vacation, travel, etc got in the way!


Zdenek Kotala wrote:
I performed a review of the merged patch from Robert Treat. As a first 
point, the patch does not work (SunOS 5.11 snv_86 sun4u sparc 
SUNW,Sun-Fire-V240).


The attached patch fixed the regression test errors.



However I went through a code and I have following comments:

1) Naming convention:

 - Some probes use *end, some *done. It would be good to settle on 
one name.
Noted. Will use name-done per the convention. This change will be 
included in an updated patch later since I think there are a number of 
other changes that need to be made.
 - I prefer to use clog instead of slru in probe names; clog is widely 
known.

I think slru- is okay per your subsequent emails.
 - It seems to me better to have checkpoint-clog..., 
checkpoint-subtrans instead of clog-checkpoint.
Yes, I was thinking about this too, but the current names are more 
consistent with the others. For example:


buffer-checkpoint, buffer-*
xlog-checkpoint, xlog-*
 - buffer-flush was originally dirty-buffer-write-start. I prefer 
Robert Lor's naming.
Actually, I made this change so the name is consistent with the other 
buffer-* probes.


2) storage read write probes

The smgr-read*, smgr-write* probes are in md.c. I personally think it 
makes more sense to put them into smgr.c. The only advantage of having 
them in md.c is that the actual number of bytes can be monitored.

The current probes return the number of bytes, which is why they are in md.c.


3) query rewrite probe

There are several probes for query measurement, but the query rewrite 
phase is missing. See analyze_and_rewrite or pg_parse_and_rewrite.
I believe the rewrite time is accounted for in the query-plan probe. 
Need to double-check on this.


4) autovacuum_start

The autovacuum-start probe is alone. I propose the following probes for 
completeness:


proc-autovacuum-start
proc-autovacuum-stop
proc-bgwriter-start
proc-bgwriter-stop
proc-backend-start
proc-backend-stop
proc-master-start
proc-master-stop

Saw a number of emails on this. Will get back on this.


5) Need explain of usage:

I have some doubts about the following probes. Could you please explain 
their usage? An example dtrace script is welcome.


 - all exec-* probes
 - mark-dirty, local-mark-dirty


Theo/Robert, do you guys have any sample scripts for the probes you added?

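For what it's worth, a sample script for the checkpoint-style probes in the patch below could look roughly like this (probe names assumed from the TRACE_POSTGRESQL_* macros; requires a DTrace-enabled build):

```d
/* Hypothetical example: measure CLOG checkpoint flush time using the
 * probes added in this patch. */
postgresql*:::clog-checkpoint-start
{
	self->ts = timestamp;
}

postgresql*:::clog-checkpoint-done
/self->ts/
{
	@["clog checkpoint (ns)"] = quantize(timestamp - self->ts);
	self->ts = 0;
}
```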

6) several comments about placement:

I published the patch on http://reviewdemo.postgresql.org/r/25/ and added 
several comments there.


7) SLRU/CLOG

SLRU probes could return more info, for example whether the page was in 
a buffer or whether a physical write was unnecessary, and so on.
Yes, more info could be returned if we can identify specific use cases 
that the new data will enable.


--
Robert Lor   Sun Microsystems
Austin, USA  http://sun.com/postgresql

Index: src/backend/access/transam/clog.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/clog.c,v
retrieving revision 1.46
diff -u -3 -p -r1.46 clog.c
--- src/backend/access/transam/clog.c   1 Jan 2008 19:45:46 -   1.46
+++ src/backend/access/transam/clog.c   21 Jul 2008 18:14:48 -
@@ -36,6 +36,7 @@
 #include "access/slru.h"
 #include "access/transam.h"
 #include "postmaster/bgwriter.h"
+#include "pg_trace.h"
 
 /*
  * Defines for CLOG page sizes.  A page is the same BLCKSZ as is used
@@ -323,7 +324,9 @@ void
 CheckPointCLOG(void)
 {
/* Flush dirty CLOG pages to disk */
+   TRACE_POSTGRESQL_CLOG_CHECKPOINT_START();
SimpleLruFlush(ClogCtl, true);
+   TRACE_POSTGRESQL_CLOG_CHECKPOINT_DONE();
 }
 
 
Index: src/backend/access/transam/multixact.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.27
diff -u -3 -p -r1.27 multixact.c
--- src/backend/access/transam/multixact.c  1 Jan 2008 19:45:46 -   
1.27
+++ src/backend/access/transam/multixact.c  21 Jul 2008 18:21:58 -
@@ -57,6 +57,7 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 #include "storage/procarray.h"
+#include "pg_trace.h"
 
 
 /*
@@ -1526,6 +1527,8 @@ MultiXactGetCheckptMulti(bool is_shutdow
 void
 CheckPointMultiXact(void)
 {
+   TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_START();
+
/* Flush dirty MultiXact pages to disk */
SimpleLruFlush(MultiXactOffsetCtl, true);
SimpleLruFlush(MultiXactMemberCtl, true);
@@ -1540,6 +1543,8 @@ CheckPointMultiXact(void)
 */
if (!InRecovery)
TruncateMultiXact();
+
+   TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE();
 }
 
 /*
Index: src/backend/access/transam/slru.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/slru.c,v
retrieving revision 1.44
diff -u -3 -p -r1.44 slru.c
--- src/backend/access/transam/slru.c   1 Jan 2008 19:45:48 -   1.44
+++ 

Re: [HACKERS] [patch] plproxy v2

2008-07-21 Thread Tom Lane
Marko Kreen <[EMAIL PROTECTED]> writes:
 [ plproxy ]

I looked through this a bit, and my principal reaction was "what are
the security implications?"  It seems like it'd be very easy to create
functions that allow untrusted users to execute arbitrary SQL on
other databases in the plproxy cluster.  As far as I saw there was
no privilege-checking within plproxy itself, you were just relying
on SQL-level permissions checking --- so even though plproxy functions
can only be *created* by superusers, by default they can be *used* by
anybody.  So it'd take a great deal of care to avoid making
unintentional security holes.

I'm not sure about a good solution to this problem, but I think it needs
to be solved before plproxy can be recommended for widespread use.
The first thought that comes to mind is to somehow propagate the userid
on the calling server to the execution on the remote, so that a user
can't get more privilege on the remote than if he were executing there
directly.  I'm imagining that the definition of a cluster would include
a map from local to remote userids (and thereby imply that anyone not
explicitly listed can't do remote access at all).
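As a sketch only of what such a map might look like (no such catalog
exists in the patch; the table and column names below are invented):

```
-- Hypothetical local-to-remote userid map for a cluster; anyone not
-- listed would be refused remote access entirely.
CREATE TABLE plproxy_user_map (
    cluster_name text NOT NULL,
    local_user   name NOT NULL,
    remote_user  name NOT NULL,
    PRIMARY KEY (cluster_name, local_user)
);
```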

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread David E. Wheeler

On Jul 21, 2008, at 12:43, Tom Lane wrote:


From a maintenance point of view there seems little need
for either project to get integrated: they don't appear to have much
of any code that is tightly tied to backend innards.


Well, citext against CVS HEAD is quite different from the other  
version I maintain for 8.3. The latter copies the str_tolower() code
out of formatting.c from CVS and adds a number of includes in order to  
get things to work the same as against HEAD. I could probably work  
around this, though, if there was a macro with the version number in it.


Now, there is some value in submitting the code for review ---  
certainly

citext is a whole lot better than it was a few weeks ago.


Absolutely. I really appreciate the feedback and comments I've  
received. Thank you!



I think it
would be a good idea to be open to reviewing pgfoundry code with the
same standards we'd use if we were going to integrate it.  Perhaps
commitfest is not the right venue for that, though, if only because
of the possibility of confusion over what's supposed to happen.

Comments?


I think that this is a very good idea. But you might have trouble  
motivating people to review code that won't be in core unless it's  
managed very diligently. An official extended library distribution, as  
Josh suggests, would probably help with this, as it then becomes a  
project alongside PostgreSQL that bundles a lot of great add-ons,  
rather than just leaving all the add-ons to themselves on pgFoundry.


Best,

David




[HACKERS] Re: Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Marc Munro
On Mon, 2008-07-21 at 17:03 -0300, Tom Lane wrote:
 [. . .]  I think it
 would be a good idea to be open to reviewing pgfoundry code with the
 same standards we'd use if we were going to integrate it.  Perhaps
 commitfest is not the right venue for that, though, if only because
 of the possibility of confusion over what's supposed to happen.

I think this would be a great idea.  I would be overjoyed to have veil
http://pgfoundry.org/projects/veil/ reviewed by postgres developers.


__
Marc




Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread David E. Wheeler

On Jul 21, 2008, at 12:53, Josh Berkus wrote:

In the specific cases of pl/proxy and citext, they are very much in  
line with what we already package with the core code, including  
things like dblink, ISN, and CIDR.  citext in particular would  
eliminate a long-time newbie complaint about Postgres, but not if  
it's in an add-in package which the user can't find binaries for.


So I would argue maybe on pl/proxy, but that citext does belong in  
core.


This is my view, as well. If it was in contrib, it'd go a long way  
toward addressing a commonly-requested feature, whereas things are  
much more difficult to find on pgFoundry. pgFoundry ain't the CPAN,  
alas. Even if users do find it in pgFoundry, the fact that it isn't in  
core is more likely to be seen as a red flag at this point. One might  
ask, why isn't it in core? What's wrong with it? Why is something that  
seems so useful relegated to pgFoundry? What's the usual quality of  
code on pgFoundry?


Thanks for your consideration!

Best,

David



Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 So I would argue maybe on pl/proxy, but that citext does belong in core.

Well, at least citext is pretty tiny ...

regards, tom lane



Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Andrew Dunstan



Tom Lane wrote:

The current commitfest queue has two entries that propose to migrate
existing pgfoundry projects (or improved versions thereof) into our
core distribution.  The more I think about this the less happy I am
with it.  From a maintenance point of view there seems little need
for either project to get integrated: they don't appear to have much
of any code that is tightly tied to backend innards.  From a features
point of view, yeah they're cool, but there are scads of cool things
out there.  From a project-management point of view, it's insanity
to set a presumption that pgfoundry is just a proving ground for code
that should eventually get into core once it's mature enough or popular
enough or whatever. 



I think there is a case for sufficiently popular modules to be in core.


That's not necessarily a large number, but there might well be a case
for citext at least to be among them at some stage. Surely a
case-insensitive text type has more general use than, say, the seg module.





 We *have to* encourage the development of a cloud
of subprojects around the core, or core will eventually collapse of
its own weight.  We have not got the manpower to deal with an
ever-inflating collection of allegedly core code.  If anything,
we ought to be working to push more stuff out of the core distro so
that we can focus on the functionality that has to be there.
  



When we can get the much discussed module infrastructure that will make 
more sense. We will still need to keep enough modules to make sure that 
the infrastructure is working. In general I feel that the number of 
modules we have in core is about right. Maybe a small number should be 
pushed out.



So my feeling is that we should not accept either of these patches.

Now, there is some value in submitting the code for review --- certainly
citext is a whole lot better than it was a few weeks ago.  I think it
would be a good idea to be open to reviewing pgfoundry code with the
same standards we'd use if we were going to integrate it.  Perhaps
commitfest is not the right venue for that, though, if only because
of the possibility of confusion over what's supposed to happen.

Comments?





If we don't have enough resources to maintain them, do we have enough to review
them?



I was going to write some stuff about citext anyway. Quite apart from 
the above considerations I'm still a bit concerned about its performance 
characteristics. And I'm not sure we really want all the baggage that 
David is proposing to bring along with it. Is it an advance to force the 
regex_replace 'i' flag for such a type? I can imagine cases where I 
might want it to sort insensitively, but be able to do case-sensitive 
regex ops on it. It's not as if the user can't supply the flag. So right 
now I don't think citext should be included, because there are still 
issues to sort out, if for no other reason.



cheers

andrew



Re: [HACKERS] [patch] plproxy v2

2008-07-21 Thread Marko Kreen
On 7/21/08, Tom Lane [EMAIL PROTECTED] wrote:
 Marko Kreen [EMAIL PROTECTED] writes:
   [ plproxy ]

  I looked through this a bit, and my principal reaction was what are
  the security implications?  It seems like it'd be very easy to create
  functions that allow untrusted users to execute arbitrary SQL on
  other databases in the plproxy cluster.  As far as I saw there was
  no privilege-checking within plproxy itself, you were just relying
  on SQL-level permissions checking --- so even though plproxy functions
  can only be *created* by superusers, by default they can be *used* by
  anybody.  So it'd take a great deal of care to avoid making
  unintentional security holes.

  I'm not sure about a good solution to this problem, but I think it needs
  to be solved before plproxy can be recommended for widespread use.
  The first thought that comes to mind is to somehow propagate the userid
  on the calling server to the execution on the remote, so that a user
  can't get more privilege on the remote than if he were executing there
  directly.  I'm imagining that the definition of a cluster would include
  a map from local to remote userids (and thereby imply that anyone not
  explicitly listed can't do remote access at all).

There are 2 aspects to it:

1.  Function can be created only by superuser.
2.  If cluster connection strings do not have a 'user=' key,
    ' user=' || current_username() is appended.  Note that
    connections are per-backend, not shared.  Also, plproxy does
    _nothing_ with passwords.  That means the password for the remote
    connection must be in the postgres user's .pgpass, or there
    is a pooler between plproxy and the remote database that handles
    passwords.
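The rule in point 2 can be sketched in a few lines of shell (a
paraphrase of the behavior described above, not the actual C code; note
the naive substring check is part of the sketch, not a claim about the
real implementation):

```shell
# Sketch: append the calling backend's username to the cluster
# connection string unless it already carries an explicit user= key.
effective_connstr() {
  connstr=$1; current_username=$2
  case "$connstr" in
    *user=*) ;;                                  # explicit key: leave as-is
    *) connstr="$connstr user=$current_username" ;;
  esac
  printf '%s\n' "$connstr"
}

effective_connstr 'host=db1 dbname=shop' alice
# -> host=db1 dbname=shop user=alice
```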

What else do you see as needed?  I'm not sure a map is a good idea;
it seems to create unnecessary complexity.  Of course, it can be done.

But I don't think plproxy can and should protect dumb admins who
create remote_exec(sql) function and allow untrusted users to
execute it.

-- 
marko



Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Andrew Sullivan
On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote:
  pgFoundry ain't the CPAN, alas.

Maybe that's the problem that really needs solving?

One of the big Postgres features is its extensibility.  I agree that
the extensions can sometimes be hard to find, but surely the answer to
that is not an infinitely large source tarball?

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/



Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread David E. Wheeler

On Jul 21, 2008, at 13:19, Andrew Dunstan wrote:

I was going to write some stuff about citext anyway. Quite apart  
from the above considerations I'm still a bit concerned about its  
performance characteristics. And I'm not sure we really want all the  
baggage that David is proposing to bring along with it. Is it an  
advance to force the regex_replace 'i' flag for such a type? I can
imagine cases where I might want it to sort insensitively, but be
able to do case-sensitive regex ops on it. It's not as if the user
can't supply the flag. So right now I don't think citext should be  
included, because there are still issues to sort out, if for no  
other reason.


I'm happy to work with folks to get them figured out, but at the end,  
there may be some differing opinions. If there's a reference  
implementation that could be checked (how does a case-insensitive  
collation work in another database?), that would be fine.


You can use the 'c' flag to get case-sensitive comparison with the
regex functions, though not with the operators. (Maybe this should be  
moved to a separate thread?)


Best,

David




Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread David E. Wheeler

On Jul 21, 2008, at 13:28, Andrew Sullivan wrote:


Maybe that's the problem that really needs solving?

One of the big Postgres features is its extensibility.  I agree that
the extensions can sometimes be hard to find, but surely the answer to
that is not an infinitely large source tarball?


Oh, of course. But one thing at a time. I'm in complete agreement that  
what goes into core should be pretty conservative, and that the module  
problem should be addressed. But even given that, I think that there  
is a strong case to be made that citext should be in contrib.


Best,

David



Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Dave Cramer


On 21-Jul-08, at 4:28 PM, Andrew Sullivan wrote:


On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote:

pgFoundry ain't the CPAN, alas.


Maybe that's the problem that really needs solving?

One of the big Postgres features is its extensibility.  I agree that
the extensions can sometimes be hard to find, but surely the answer to
that is not an infinitely large source tarball?

A

I'd have to agree with Andrew here. Making it easy to get extensions  
would solve lots of problems.


Dave



Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Decibel!
On Fri, Jul 18, 2008 at 02:23:43AM -0400, Andrew Sullivan wrote:
 On Fri, Jul 18, 2008 at 10:41:36AM +0530, Gurjeet Singh wrote:
  
  Just started INIT cluster Slonik command and that spiked too.. for more than
  10 minutes now!!
 
 Are you attempting to do Slony changes (such as install Slony) on an
 active database?  I strongly encourage you to read the Slony manual.
 Slony, frankly, sucks for this use case.  The manual says as much,
 although in more orotund phrases than that.

FWIW, I've had few problems getting londiste up and running on a heavily
loaded database. You might need to be a bit careful about when you add
very large tables due to the copy overhead, but other than that I
haven't had issues.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828



Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Gurjeet Singh
On Tue, Jul 22, 2008 at 1:29 AM, Andrew Sullivan [EMAIL PROTECTED]
wrote:

 On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote:

  Will try this option, at least in the next schema upgrade or when setting
 up
  Slony.

 As I've already suggested, however, if you try to set up slony on a
 loaded database, you're going to see all manner of problems.  Slony
 takes some heavy-duty locks when it does its setup work.  It's
 designed that you should have an application outage for this sort of
 work.  Please see previous discussion on the Slony mailing list.


Well, a very low activity period of the application (after 11 PM EST) is
chosen as the maintenance window. The application is not down, but has just
the connections open, and almost all of them sitting IDLE.

I am aware of the heavy locking involved with Slony, which should mean that
it blocks the application connections; that'd be completely acceptable,
given all the warnings in the Slony docs. But what I am concerned about and
trying to hunt down is why IDLE backend processes are all consuming up all
of CPU (!!!) so much so that I am unable to fire up any new process!

Another possible cause we are looking at now is the role Xeon hyperthreading
can play here. Will keep you all updated.

Thanks and best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] typedefs for indent

2008-07-21 Thread Andrew Dunstan



Bruce Momjian wrote:

Andrew Dunstan wrote:
  

Alvaro Herrera wrote:


Andrew Dunstan wrote:
  
  
OK, I have spent some time generating and filtering typdefs via objdump  
on various platforms. I filtered them and Bruce's list to eliminate  
items not actually found in the sources thus:



Did this go anywhere?
  
  
I'm still  trying to get a working objdump for OSX. Automating this is 
difficult because we need to make sure we get all (or pretty close to 
all) the typedefs we can get on each platform for various build 
configurations.



At this point I would like to get a typedef list into CVS, even if it is
not perfect --- it is better than what we have now.

  


Well, you can start with this one:

http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=dungbeetle&dt=2008-07-21%20204856&stg=typedefs

After I have a number of buildfarm machines producing them, I'll work on 
a stored proc to consolidate them and make them available, probably via 
a SOAP call (c.f. 
http://people.planetpostgresql.org/andrew/index.php?/archives/14-SOAP-server-for-Buildfarm-dashboard.html 
)
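For the curious, the extraction step amounts to scanning DWARF debug
output for typedef entries and keeping the name that follows each one.
A rough sketch, run here against a hand-written stand-in for
`objdump -W` output (the real buildfarm script is more careful than
this):

```shell
# Keep the DW_AT_name that immediately follows each DW_TAG_typedef
# entry; ignore names attached to other DWARF tags.
extract_typedefs() {
  awk '/DW_TAG_typedef/ {t=1; next} t && /DW_AT_name/ {print $NF; t=0}'
}

# Hand-written sample standing in for real `objdump -W postgres` output.
sample='<1><2d>: Abbrev Number: 3 (DW_TAG_typedef)
    <2e>   DW_AT_name        : Oid
<1><3f>: Abbrev Number: 4 (DW_TAG_base_type)
    <40>   DW_AT_name        : int'

printf '%s\n' "$sample" | extract_typedefs
# -> Oid
```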



cheers

andrew



[HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Jonah H. Harris
Currently, one cannot perform a concurrent VACUUM and ANALYZE.  This
is a significant problem for tables which are not only large and have
designated cost-delays, but which are also heavily inserted into and
deleted from.  After performing a cursory investigation on this,
it doesn't appear to be difficult to change.  Other than the
ShareUpdateExclusiveLock, is there anything technically preventing us
from performing both concurrently?

Because we wouldn't want multiple ANALYZEs running on the same table,
changing the lock back to an AccessShareLock doesn't sound like a
solution.  However, what are the thoughts around creating another,
more-specific lock?  Perhaps something like ShareUpdateAnalysisLock?
Any other suggestions?
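To illustrate the current behavior the question is about: both commands
take ShareUpdateExclusiveLock, which conflicts with itself, so the
second session waits (the table name is illustrative):

```
-- session 1: long-running lazy vacuum on a large table
VACUUM big_table;

-- session 2: blocks until session 1 releases its
-- ShareUpdateExclusiveLock on the same table
ANALYZE big_table;
```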

-Jonah



Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Andrew Sullivan
On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote:

 I am aware of the heavy locking involved with Slony, which should mean that
 it blocks the application connections; that'd be completely acceptable,
 given all the warnings in the Slony docs. But what I am concerned about and
 trying to hunt down is why IDLE backend processes are all consuming up all
 of CPU (!!!) so much so that I am unable to fire up any new process!

Ah, well, then, yes, the spinlock improvements probably will help
you.  But you should disabuse yourself of the idea that IDLE
processes have no cost.  You still have to talk to all those
connections when doing schema changes.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Alvaro Herrera
Jonah H. Harris wrote:
 Currently, one cannot perform a concurrent VACUUM and ANALYZE.  This
 is a significant problem for tables which are not only large and have
 designated cost-delays, but which are also heavily inserted into and
 deleted from.  After performing a quick cursory investigation on this,
 it doesn't appear to be difficult to change.  Other than the
 ShareUpdateExclusiveLock, is there anything technically preventing us
 from performing both concurrently?

This means that VACUUM ANALYZE must grab both locks ... is there a
gotcha here?

The main problem I see with this idea is that the dead and total tuple
count computed by ANALYZE would be immediately out of date, and if it
happens to finish after VACUUM then it'll overwrite the values the
latter just wrote, which are more correct.  Not sure how serious a
problem this is.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes:
 On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote:
 I am aware of the heavy locking involved with Slony, which should mean that
 it blocks the application connections; that'd be completely acceptable,
 given all the warnings in the Slony docs. But what I am concerned about and
 trying to hunt down is why IDLE backend processes are all consuming up all
 of CPU (!!!) so much so that I am unable to fire up any new process!

 Ah, well, then, yes, the spinlock improvements probably will help
 you.  But you should disabuse yourself of the idea that IDLE
 processes have no cost.  You still have to talk to all those
 connections when doing schema changes.

Yeah.  In fact this is sounding more and more like the known problem
with sinval message response causing a thundering herd effect: the
idle processes all sit idle until the sinval message queue gets long
enough to rouse alarm bells, and then they all get signaled at once
and all try to clean the message queue at once, leading to very
heavy contention for the SInvalLock.  That code's been rewritten in
CVS HEAD to try to alleviate the problem, but no existing release
has the fix.

See thread here for prior report:
http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php

regards, tom lane



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Tom Lane
Jonah H. Harris [EMAIL PROTECTED] writes:
 Because we wouldn't want multiple ANALYZEs running on the same table,
 changing the lock back to an AccessShareLock doesn't sound like a
 solution.

It flat will not work.  We used to do it that way, and it didn't
(search for "tuple concurrently updated" in the archives).

 However, what are the thoughts around creating another,
 more-specific lock?  Perhaps something like ShareUpdateAnalysisLock?

The general overhead involved in a whole new lock type is high enough
that I would resist taking this path.  (It's certainly a lot more than
adding an entry to one enum someplace --- offhand I can name docs and
grammar as important issues.  And no, you don't get to have a hidden lock
type that no one can see.)


Also, as Alvaro points out, it's far from clear that concurrent VACUUM
and ANALYZE is as safe as you think --- they both want to write the same
fields in pg_class.

regards, tom lane



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Jonah H. Harris
On Mon, Jul 21, 2008 at 6:15 PM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 This means that VACUUM ANALYZE must grab both locks ... is there a
 gotcha here?

Agreed.

 The main problem I see with this idea is that the dead and total tuple
 count computed by ANALYZE would be immediately out of date, and if it
 happens to finish after VACUUM then it'll overwrite the values the
 latter just wrote, which are more correct.  Not sure how serious a
 problem this is.

Agreed, but in the worst case, it's no different than running ANALYZE
immediately following a VACUUM.

-Jonah



Re: [HACKERS] [patch] plproxy v2

2008-07-21 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 On 7/21/08, Tom Lane [EMAIL PROTECTED] wrote:
 I looked through this a bit, and my principal reaction was what are
 the security implications?

 There are 2 aspects to it:

 1.  Function can be created only by superuser.

What I'm concerned about is who they can be *called* by.  I'd be happier
if the default behavior was that there was no public execute privilege
for plproxy functions.

I think right now that could be enforced by having plproxy's validator
procedure replace any null proacl entry with something that explicitly
refuses public execute.  That's a bit of a hack though.  Maybe it'd be
worth inventing per-PL default ACLs, instead of having a
one-size-fits-all policy?
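Concretely, the secure-by-default behavior suggested here would amount
to the validator doing the equivalent of the following for each new
function (the function and role names are hypothetical):

```
-- What "no public execute by default" would look like if applied
-- manually to a plproxy function today:
REVOKE EXECUTE ON FUNCTION get_user_email(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION get_user_email(text) TO trusted_role;
```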

 2.  If cluster connection strings do not have 'user=' key,
 ' user=' || current_username() is appended to it.

Cool, I missed that.  At minimum the documentation has to explain this
point and emphasize the security implications.  Is it a good idea
to allow user= in the cluster strings at all?

 Also, plproxy does
 _nothing_ with passwords.  That means the password for remote
 connection must be in postgres user's .pgpass,

That seems *exactly* backwards, because putting the password in postgres
user's .pgpass is as good as disabling password auth altogether.
Consider that it would also hand all the keys to the kingdom over to
someone who had access to dblink on the same machine (not even the same
cluster, so long as it was run by the same postgres user!).
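For reference, each .pgpass line uses the documented colon-separated
field order, and a wildcard catch-all like the second line below is
what hands over "all the keys" (an annotated sketch; the passwords are
made up):

```
# documented field order; '*' matches anything in that field
hostname:port:database:username:password
# a catch-all entry like this effectively disables password auth
# for anything running as the postgres OS user:
*:*:*:*:s3cret
```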

 But I don't think plproxy can and should protect dumb admins who
 create remote_exec(sql) function and allow untrusted users to
 execute it.

We regularly get beat up about any aspect of our security apparatus
that isn't secure by default.  This definitely isn't, and from
a PR point of view (if nothing else) that doesn't seem a good idea.

I repeat that I don't feel comfortable in the least with plproxy's
security model.

regards, tom lane



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Jonah H. Harris
On Mon, Jul 21, 2008 at 7:59 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Jonah H. Harris [EMAIL PROTECTED] writes:
 Because we wouldn't want multiple ANALYZEs running on the same table,
 changing the lock back to an AccessShareLock doesn't sound like a
 solution.

 It flat will not work.  We used to do it that way, and it didn't
 (search for "tuple concurrently updated" in the archives).

Agreed.

 However, what are the thoughts around creating another,
 more-specific lock?  Perhaps something like ShareUpdateAnalysisLock?

 The general overhead involved in a whole new lock type is high enough
 that I would resist taking this path.  (It's certainly a lot more than
 adding an entry to one enum someplace --- offhand I can name docs and
 grammar as important issues.  And no you don't get to have a hidden lock
 type that no one can see.)

Any other suggestions?

 Also, as Alvaro points out, it's far from clear that concurrent VACUUM
 and ANALYZE is as safe as you think --- they both want to write the same
 fields in pg_class.

AFAICS, Alvaro didn't say that at all.  At worst, if ANALYZE completed
after VACUUM, its stats wouldn't be as good as those set by VACUUM.
But, as I said in response to Alvaro, that's no different than running
ANALYZE immediately following VACUUM.

The case I'm looking at is a large table which requires a lazy vacuum,
and a zero vacuum cost delay would cause too much I/O.  Yet, this
table has enough insert/delete activity during a vacuum, that it
requires a fairly frequent analysis to maintain proper plans.  I
patched as mentioned above and didn't run across any unexpected
issues; the only one expected was that mentioned by Alvaro.

-Jonah



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Tom Lane
Jonah H. Harris [EMAIL PROTECTED] writes:
 The case I'm looking at is a large table which requires a lazy vacuum,
 and a zero vacuum cost delay would cause too much I/O.  Yet, this
 table has enough insert/delete activity during a vacuum, that it
 requires a fairly frequent analysis to maintain proper plans.  I
 patched as mentioned above and didn't run across any unexpected
 issues; the only one expected was that mentioned by Alvaro.

I don't find this a compelling argument, at least not without proof that
the various vacuum-improvement projects already on the radar screen
(DSM-driven vacuum, etc) aren't going to fix your problem.

regards, tom lane



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Jonah H. Harris
On Mon, Jul 21, 2008 at 10:19 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Jonah H. Harris [EMAIL PROTECTED] writes:
 The case I'm looking at is a large table which requires a lazy vacuum,
 and a zero vacuum cost delay would cause too much I/O.  Yet, this
 table has enough insert/delete activity during a vacuum, that it
 requires a fairly frequent analysis to maintain proper plans.  I
 patched as mentioned above and didn't run across any unexpected
 issues; the only one expected was that mentioned by Alvaro.

 I don't find this a compelling argument, at least not without proof that
 the various vacuum-improvement projects already on the radar screen
 (DSM-driven vacuum, etc) aren't going to fix your problem.

Is DSM going to be in 8.4?  The last I had heard, DSM+related
improvements weren't close to being guaranteed for this release.  If
it doesn't make it, waiting another year and a half for something
easily fixed would be fairly unacceptable.  Should I provide a patch
in the event that DSM doesn't make it?

-Jonah



Re: [HACKERS] Schema-qualified statements in pg_dump output

2008-07-21 Thread Owen Hartnett

At 8:34 AM +0100 7/11/08, Simon Riggs wrote:

On Mon, 2008-07-07 at 15:46 +0200, Bernd Helmle wrote:

 There's a behavior in pg_dump that annoyed me a little bit, the last few
 times I had to deal with it:

 Consider you have to dump a specific namespace only, you are going to use

 pg_dump -n your_schema [-t tables].

 I found it a common use case to restore this dump into a different schema
 by simply changing the search_path. With included ownerships this doesn't
 work, since pg_dump always outputs the necessary DDL as follows:

 ALTER TABLE bernd.foo OWNER TO bernd;

 Okay, it isn't too hard to use sed to replace the necessary statements to
 use the correct schema, but I think it would be much nicer if pg_dump would
 omit the schema-qualified table name here. I'd like to create a patch for
 this, if we agree on changing this behavior?


The use case you mention is something that would be of value to many
people, and I support your efforts to add a new option for this.

No useful workarounds exist without flaws: i) editing with sed might
well end up editing character data in the table(s) at the same time and
you may never even notice. ii) reloading to the same schema (renaming
etc) is not acceptable if the target has a production schema of that
name already. iii) manually editing a large file is problematic.
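The false-positive risk in workaround i) is easy to demonstrate on a tiny invented dump fragment (the schema, table, and file names here are made up): a global sed rewrite of the schema name also rewrites matching strings inside COPY data.

```shell
# A miniature dump fragment: one DDL line, plus COPY data that happens
# to contain the string "bernd." inside a user data row.
cat > /tmp/mini.sql <<'EOF'
ALTER TABLE bernd.foo OWNER TO bernd;
COPY bernd.foo (note) FROM stdin;
see bernd.foo for details
\.
EOF

# The naive global replace rewrites the data row as well as the DDL.
sed 's/bernd\./fy2008./g' /tmp/mini.sql > /tmp/mini-edited.sql

grep -c 'fy2008\.' /tmp/mini-edited.sql   # prints 3: one hit is corrupted user data
```

The third line of the edited file now reads "see fy2008.foo for details" even though it was table data, not DDL, which is exactly the silent corruption described above.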

Tom has posted comments that you need to look at all of the places the
schemaname is used to see what we will need/not need to change. It's
more than just altering the owner, but that doesn't mean we don't want
it or that it's impossible.

Please pursue this further.


I've been looking into this matter, although I'm a noob at 
PostgreSQL hacking.  What I thought would be a better way was to alter 
pg_dump to accept a flag -m masquerade_name.  It would require the 
-n schema_name option, and fail otherwise.


It would generate a schema dump in which all references to 
schema_name are replaced by masquerade_name.


This would allow you to easily make a copy of a schema into a new schema.

My need is this: my production database lives in the public schema, 
and each year I want to archive an fy2007, fy2008, etc. schema 
holding the final information for that year.  So at the end of this 
year, I want to duplicate the public schema into the fy2008 schema 
and continue working in public.


I could pg_dump public, rename public to fy2008, and then restore 
public, but that requires being without public for a short 
interval.  It would be better for me to simply run:


pg_dump -c -n public -m fy2008 database | psql database

That would give me a completely mechanical way to duplicate a 
schema, which means I could put it in a script that users could call.


From what I've seen, it would mean finding where the schema is 
currently accessed in the code, then substituting the -m name there.


Having already done this by manually editing the files, I can say it 
really cries out for a better procedure.
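Until pg_dump grows such an option, a somewhat less error-prone mechanical procedure than a blind global replace is to rewrite the schema name only outside COPY data blocks. This is only a sketch on an invented dump fragment (the names are made up, and it still misses schema names embedded in view definitions or function bodies):

```shell
# An invented dump fragment with one data row that mentions "public."
cat > /tmp/dump.sql <<'EOF'
ALTER TABLE public.foo OWNER TO owen;
COPY public.foo (note) FROM stdin;
keep public.foo in this data row
\.
EOF

# Rewrite "public." everywhere except inside COPY ... FROM stdin; blocks,
# which end with a lone "\." line.  The COPY statement itself is still
# rewritten, because the gsub runs before the in-data flag is set.
awk '
  /^\\\.$/                { indata = 0 }
  indata == 0             { gsub(/public\./, "fy2008.") }
  /^COPY .* FROM stdin;$/ { indata = 1 }
  { print }
' /tmp/dump.sql > /tmp/dump-fy2008.sql
```

The DDL and COPY target lines come out as fy2008.foo, while the data row still reads "keep public.foo in this data row".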


Perhaps my solution is excessive compared to the other one offered, 
but it has the benefit that the user knows precisely what he is 
doing from the flag setting.


-Owen



Re: [HACKERS] Concurrent VACUUM and ANALYZE

2008-07-21 Thread Matthew T. O'Connor

Jonah H. Harris wrote:

On Mon, Jul 21, 2008 at 10:19 PM, Tom Lane [EMAIL PROTECTED] wrote:
  

I don't find this a compelling argument, at least not without proof that
the various vacuum-improvement projects already on the radar screen
(DSM-driven vacuum, etc) aren't going to fix your problem.



Is DSM going to be in 8.4?  The last I had heard, DSM+related
improvements weren't close to being guaranteed for this release.  If
it doesn't make it, waiting another year and a half for something
easily fixed would be fairly unacceptable.  Should I provide a patch
in the event that DSM doesn't make it?


Can't hurt to submit a patch.  Also, could you do something to help 
mitigate the worst case, something like not updating the stats in 
pg_class when a vacuum has finished since the current analyze started?
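Stated concretely, that guard is just a timestamp comparison. A minimal sketch with invented names and values (not PostgreSQL code): skip the pg_class update whenever the most recent vacuum finished after the current analyze began.

```shell
# Timestamps are epoch seconds; the values below are invented.
should_update_stats() {
  local analyze_started=$1 last_vacuum_finished=$2
  # If a vacuum finished after this analyze began, its counts are
  # fresher than ours, and writing our stats would clobber them.
  [ "$last_vacuum_finished" -lt "$analyze_started" ]
}

should_update_stats 100 90  && echo "update pg_class"   # vacuum ended before analyze began
should_update_stats 100 150 || echo "skip update"       # vacuum finished mid-analyze
```

Running this prints "update pg_class" for the first case and "skip update" for the second.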


Matt

