date:20081023

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Greg Stark

For making psql fully scriptable I would think it would be smarter to  
embed a full well-supported existing language rather than trying to  
invent a new one.


If we do want to to that I would suggest looking at lua which is  
intended to be used this way.


greg

On 23 Oct 2008, at 05:45 AM, Tom Lane <[EMAIL PROTECTED]> wrote:


Joshua Drake <[EMAIL PROTECTED]> writes:

"Robert Haas" <[EMAIL PROTECTED]> wrote:

"We can't do anything unless we do everything" is a recipe for
failure.



True enough, but a car doesn't roll without at least four wheels.


Perhaps more to the point: you don't have to implement every part of a
set of features at the same time, but it's a good idea to design them
all together.  Else you risk writing something that doesn't play nice
with the other parts; and by the time you find out, it's too late to
change without creating a backwards-compatibility problem.

   regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Magnus Hagander

On 23 okt 2008, at 00.48, Alvaro Herrera <[EMAIL PROTECTED]>  
wrote:



Josh Berkus wrote:

Hackers,

Just had a feature request from Wheeler, and I don't see why it  
shouldn't

go on our todo list.


I think you're asking for more scriptability in psql.  Personally I
think that would be a great idea, but we need a lot more than what's
being proposed here.  We'll also need loops, conditionals, etc.  We've
had patches for those submitted over the years, but one at a time they
are easily rejected because they're so


If you want full scripting, look at pgscript, one of this summers gsoc  
projects. It'll ship in the next pgadmin and is also available as a  
standalone executable.


/Magnus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Pavel Stehule

2008/10/23 Greg Stark <[EMAIL PROTECTED]>:
> For making psql fully scriptable I would think it would be smarter to embed
> a full well-supported existing language rather than trying to invent a new
> one.

plpgsql?

regards
Pavel

>
> If we do want to to that I would suggest looking at lua which is intended to
> be used this way.
>
> greg
>
> On 23 Oct 2008, at 05:45 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>
>> Joshua Drake <[EMAIL PROTECTED]> writes:
>>>
>>> "Robert Haas" <[EMAIL PROTECTED]> wrote:

 "We can't do anything unless we do everything" is a recipe for
 failure.
>>
>>> True enough, but a car doesn't roll without at least four wheels.
>>
>> Perhaps more to the point: you don't have to implement every part of a
>> set of features at the same time, but it's a good idea to design them
>> all together.  Else you risk writing something that doesn't play nice
>> with the other parts; and by the time you find out, it's too late to
>> change without creating a backwards-compatibility problem.
>>
>>   regards, tom lane
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Peter Eisentraut


Pavel Stehule wrote:

2008/10/23 Greg Stark <[EMAIL PROTECTED]>:

For making psql fully scriptable I would think it would be smarter to embed
a full well-supported existing language rather than trying to invent a new
one.


plpgsql?


I think plpgsql is a pretty good example of the sort of disaster you get 
when you invent a language from nearly scratch.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Peter Eisentraut


Magnus Hagander wrote:
If you want full scripting, look at pgscript, one of this summers gsoc 
projects. It'll ship in the next pgadmin and is also available as a 
standalone executable.


Yet another language?

What people asking for psql scriptability really want, in my estimate, 
is the ability to write "SQL plus some control structures" anywhere, in 
the server, in the client, or so that they don't have to know where. 
Commercial vendors have that: Oracle has PL/SQL as server-side language 
and client-side language.  Microsoft has T-SQL.  The others have similar 
offerings.


What PostgreSQL has in one kind of extended SQL language in the 
terminal, another kind of extended SQL language in the server (but 
really two separate languages, one for functions, one outside of 
functions), and now yet another kind of extended SQL language in the 
GUI, each mutually wildly incompatible.  And neither is based on a standard.


Something to think about ...

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Pavel Stehule

2008/10/23 Peter Eisentraut <[EMAIL PROTECTED]>:
> Magnus Hagander wrote:
>>
>> If you want full scripting, look at pgscript, one of this summers gsoc
>> projects. It'll ship in the next pgadmin and is also available as a
>> standalone executable.
>
> Yet another language?

it is more crazy - pgscript is based on t-sql :).

I missing in pgsql mainly #ifdef and #endif - conditional execution.

Pavel

>
> What people asking for psql scriptability really want, in my estimate, is
> the ability to write "SQL plus some control structures" anywhere, in the
> server, in the client, or so that they don't have to know where. Commercial
> vendors have that: Oracle has PL/SQL as server-side language and client-side
> language.  Microsoft has T-SQL.  The others have similar offerings.
>
> What PostgreSQL has in one kind of extended SQL language in the terminal,
> another kind of extended SQL language in the server (but really two separate
> languages, one for functions, one outside of functions), and now yet another
> kind of extended SQL language in the GUI, each mutually wildly incompatible.
>  And neither is based on a standard.
>
> Something to think about ...
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-23 Thread Simon Riggs


On Thu, 2008-10-23 at 08:40 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2008-10-22 at 21:47 +0100, Simon Riggs wrote:
> > 
> >> But once you reach 64 transactions, you'll need to write an extra WAL
> >> record for every subtransaction, which currently I've managed to avoid.
> > 
> > Yes, I've managed to avoid it, but it will simplify the patch if you
> > think its not worth bothering with. This won't really effect anybody
> > I've met running straight Postgres, but it may effect EDB. It's not a
> > problem for me, but I was second guessing objections.
> > 
> > If I do that then I can just pass the slotId in full on every WAL
> > record, which simplifies a couple of other things also.
> > 
> > So, does everybody accept that we will write a WAL record for every
> > subtransaction assigned, once we hit the size limit of the subxid cache?
> > i.e. currently 65th subxid  and beyond.
> 
> Would have to see the patch to understand what the code simplicity vs. 
> extra WAL logging tradeoff really is.

Well, if your not certain now, then my initial feeling was correct. I
don't think everybody would agree to that. The code simplification would
be real, but I don't think it's that hard now.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Unicode escapes in literals

2008-10-23 Thread Peter Eisentraut

I would like to add an escape mechanism to PostgreSQL for entering 
arbitrary Unicode characters into string literals.  We currently only 
have the option of entering the character directly via the keyboard or 
cut-and-paste, which is difficult for a number of reasons, such as when 
the font doesn't have the character, and entering the UTF8-encoded bytes 
using the E'...' strings, which is hardly usable.


SQL has the following escape syntax for it:

   U&'special character: \' [ UESCAPE '\' ]

where  is the hexadecimal Unicode codepoint.  So this is pretty much 
just another variant on what the E'...' syntax does.


The trick is that since we have user-definable encoding conversion 
routines, we can't convert the Unicode codepoint to the server encoding 
in the scanner stage.  I imagine there are two ways to address this:


1. Only support this syntax when the server encoding is UTF8.  This 
would probably cover most use cases anyway.  We could have limited 
support for characters in the ASCII range for all server encodings.


2. Convert this syntax to a function call.  But that would then create a 
lot of inconsistencies, such as needing functional indexes for matches 
against what should really be a literal.


I'd be happy to start with UTF8 support only.  Other ideas?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block level concurrency during recovery

2008-10-23 Thread Simon Riggs

On Thu, 2008-10-23 at 09:09 +0300, Heikki Linnakangas wrote:

> However, we require that in b-tree vacuum, you take a cleanup lock on 
> *every* leaf page of the index, not only those that you modify. That's a 
> problem, because there's no trace of such pages in the WAL.

OK, good. Thanks for the second opinion. I'm glad you said that, cos I
felt sure anybody reading the patch would say "what the hell does this
bit do?". Now I can add it.

My solution is fairly simple:

As we pass through the table we keep track of which blocks need
visiting, then append that information onto the next WAL record. If the
last block doesn't contain removed rows, then we send a no-op message
saying which blocks to visit.

I'd already invented the XLOG_BTREE_VACUUM record, so now we just need
to augment it further with two fields: ordered array of blocks to visit,
and a doit flag.

Say we have a 10 block table, with rows to be removed on blocks 3,4,8. 
As we visit all 10 in sequence we would issue WAL records:

XLOG_BTREE_VACUUM block 3 visitFirst {1, 2} doit = true
XLOG_BTREE_VACUUM block 4 visitFirst {} doit = true
XLOG_BTREE_VACUUM block 8 visitFirst {5,6,7} doit = true
XLOG_BTREE_VACUUM block 10 visitFirst {9} doit = false

So that allows us to issue the same number of WAL messages yet include
all the required information to repeat the process correctly.

(The blocks can be visited out of sequence in some cases, hence the
ordered array of blocks to visit rather than just a first block value).

It would also be possible to introduce a special tweak there which is
that if the block is not in cache, don't read it in at all. If its not
in cache we know that nobody has a pin on it, so don't need to read it
in just to say "got the lock". That icing for later.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-23 Thread Heikki Linnakangas


Simon Riggs wrote:

The patch isn't ready to apply standalone because we need to include the
changes to XidInMVCCSnapshot() also, which would take a little while to
extract. Let me know if that is worth producing a standalone patch for.


FWIW, this patch becomes a lot simpler if you don't change the function 
signature, and don't move the SubtransSetParent() call.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** src/backend/access/transam/xact.c
--- src/backend/access/transam/xact.c
***
*** 404,419  AssignTransactionId(TransactionState s)
  		AssignTransactionId(s->parent);
  
  	/*
! 	 * Generate a new Xid and record it in PG_PROC and pg_subtrans.
! 	 *
! 	 * NB: we must make the subtrans entry BEFORE the Xid appears anywhere in
! 	 * shared storage other than PG_PROC; because if there's no room for it in
! 	 * PG_PROC, the subtrans entry is needed to ensure that other backends see
! 	 * the Xid as "running".  See GetNewTransactionId.
  	 */
  	s->transactionId = GetNewTransactionId(isSubXact);
  
! 	if (isSubXact)
  		SubTransSetParent(s->transactionId, s->parent->transactionId);
  
  	/*
--- 404,418 
  		AssignTransactionId(s->parent);
  
  	/*
! 	 * Generate a new Xid and record it in PG_PROC. If there's no room
! 	 * in MyProc-> in MyProc->subxids, we must make the pg_subtrans
! 	 * entry BEFORE the Xid appears anywhere in shared storage other than
! 	 * PG_PROC, because the subtrans entry is needed to ensure that other
! 	 * backends see the Xid as "running".
  	 */
  	s->transactionId = GetNewTransactionId(isSubXact);
  
! 	if (isSubXact && MyProc->subxids.overflowed)
  		SubTransSetParent(s->transactionId, s->parent->transactionId);
  
  	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Peter Eisentraut

It turns out you cannot EXPLAIN on CREATE TABLE AS, but it seems to work 
fine if I extend the grammar as below:


  ExplainableStmt:
SelectStmt
| InsertStmt
| UpdateStmt
| DeleteStmt
| DeclareCursorStmt
+   | CreateAsStmt
| ExecuteStmt   /* by default all are $$=$1 */
  ;


It's the same as SELECT INTO after all, which already works.  Should 
this be added?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-23 Thread Simon Riggs


On Thu, 2008-10-23 at 13:40 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > The patch isn't ready to apply standalone because we need to include the
> > changes to XidInMVCCSnapshot() also, which would take a little while to
> > extract. Let me know if that is worth producing a standalone patch for.
> 
> FWIW, this patch becomes a lot simpler if you don't change the function 
> signature, and don't move the SubtransSetParent() call.

Yeh, I'm trying to judge between trying to be simple and trying to be
neat. Whichever one I pick, I seem to be wrong. :-) But I begin, perhaps
too slowly, to understand that this is a natural part of review itself,
not really a problem.

Thanks for your help.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Simon Riggs


On Thu, 2008-10-23 at 14:29 +0300, Peter Eisentraut wrote:
> It turns out you cannot EXPLAIN on CREATE TABLE AS, but it seems to work 
> fine if I extend the grammar as below:

Yes please.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

2008-10-23 Thread Nikhil Sontakke

Hi,

On Wed, Oct 22, 2008 at 8:14 PM, Jaime Casanova <
[EMAIL PROTECTED]> wrote:

> just remembering that -patches is a dead list, so i'm sending this to
> -hackers where it will have more visibility...
>
> -- Forwarded message --
> On 10/22/08, Nikhil Sontakke <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > > >
> > > >> Thanks for taking a look. But if I am not mistaken Gavin and co. are
> > working
> > > >> on a much exhaustive proposal. In light of that maybe this patch
> might
> > not
> > > >> be needed in the first place?
> > > >>
> > > >> I will wait for discussion and a subsequent collective consensus
> here,
> > > >> before deciding the further course of actions.
> > > >
> > > > I think it is unwise to wait on Gavin for a more complex implemention
> > > > ---  we might end up with nothing for 8.4.  As long as your syntax is
> > > > compatible with whatever Gavin proposed Gavin can add on to your
> patch
> > > > once it is applied.
> > > >
> > >
> > > seems like you're a prophet... or i miss something?
> > >
> >
> > :)
> >
> > Maybe I will try to summarize the functionality of this patch, rebase it
> > against latest CVS head and try to get it on the commitfest queue atleast
> > for further feedback to keep the ball rolling on auto-partitioning...
> >
>
> yeah! i was thinking on doing that but still have no time... and
> frankly you're the best man for the job ;)
>
> one thing i was thinking of is to use triggers instead of rules just
> as our current docs recommends
> http://www.postgresql.org/docs/8.3/static/ddl-partitioning.html
>
> with the benefit that a trigger can check if the child table exists
> for the range being inserted and if not it can create it first...
> haven't looked at the code in the detail but seems that your patch is
> still missing the "create rule" part so we are in time to change
> that... no?
>

Yes triggers should be used instead of rules. Automatic generation of
rules/triggers would be kind of hard and needs some looking into. Also there
are issues like checking mutual exclusivity of the partition clauses
specified too (I have been maintaining that the onus of ensuring sane
partition ranges/clauses should rest with the users atleast initially..).

I will take a stab at this again whenever I get some free cycles.

Regards,
Nikhils
-- 
http://www.enterprisedb.com

Re: [HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> It turns out you cannot EXPLAIN on CREATE TABLE AS, but it seems to work 
> fine if I extend the grammar as below:

This seems to me to be something that will look like a wart, not a
feature, from the user's point of view.  You can't explain CREATE TABLE,
but you can explain CREATE TABLE AS?  It's just weird.  It feels to me
like exposing an implementation artifact.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Andrew Sullivan

On Wed, Oct 22, 2008 at 04:14:11PM -0700, Joshua Drake wrote:
> 
> True enough, but a car doesn't roll without at least four wheels.

I'm not sure I agree: http://en.wikipedia.org/wiki/Image:1885Benz.jpg

(Sorry, I couldn't resist.)

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_hba options parsing

2008-10-23 Thread Magnus Hagander

Magnus Hagander wrote:
> This patch changes the options field of pg_hba.conf to take name/value
> pairs instead of a fixed string. This makes it a lot nicer to deal with
> auth methods that need more than one parameter, such as LDAP.
> 
> While at it, it also adds map support to kerberos, gssapi and sspi and
> not just ident - basically all methods where the username comes from an
> outside source (lmk if I missed one).
> 
> Also in passing, changes the methods in auth.c to deal with "unsupported
> auth method on this platform" errors the same way for all authentication
> methods.

Applied with suggested changes.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Dimitri Fontaine

Le jeudi 23 octobre 2008, Tom Lane a écrit :
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > It turns out you cannot EXPLAIN on CREATE TABLE AS, but it seems to work
> > fine if I extend the grammar as below:
>
> This seems to me to be something that will look like a wart, not a
> feature, from the user's point of view.  You can't explain CREATE TABLE,
> but you can explain CREATE TABLE AS?  It's just weird.  It feels to me
> like exposing an implementation artifact.

As a user, I'm disagreeing about the weirdness of it. +1 for Peter's 
initiative.

Regards,
-- 
dim


signature.asc
Description: This is a digitally signed message part.

Re: [HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Peter Eisentraut


Tom Lane wrote:

Peter Eisentraut <[EMAIL PROTECTED]> writes:
It turns out you cannot EXPLAIN on CREATE TABLE AS, but it seems to work 
fine if I extend the grammar as below:


This seems to me to be something that will look like a wart, not a
feature, from the user's point of view.  You can't explain CREATE TABLE,
but you can explain CREATE TABLE AS?  It's just weird.  It feels to me
like exposing an implementation artifact.


Isn't the whole point of EXPLAIN to expose implementation artifacts in 
the first place?  It shows the plan associated with a plannable 
statement.  Why are some statements plannable and some not?  Why can you 
explain DELETE and not TRUNCATE?  Why can you explain INSERT, what is 
plannable about that?  Ah, there is INSERT ... SELECT.  If you can plan 
INSERT, can you plan SELECT INTO, right?  Yes, that works.  But CREATE 
TABLE AS is the same, why won't that work?  Should we prohibit 
explaining SELECT INTO?  So you can't explain SELECT INTO, but you can 
explain SELECT?  It's all very weird.


So it's not like this is not already quite implementation-dependent.  We 
expose the information that we have and let the user process it. 
Someone who has advanced to the level of using EXPLAIN should be able to 
tell the difference, and those who are maximally confused won't be 
harmed by this addition any more than they already are.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Unicode escapes in literals

2008-10-23 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> SQL has the following escape syntax for it:
> U&'special character: \' [ UESCAPE '\' ]

Man that's ugly.  Why the ampersand?  How do you propose to distinguish
this from a perfectly legitimate use of the & operator?

> 2. Convert this syntax to a function call.  But that would then create a 
> lot of inconsistencies, such as needing functional indexes for matches 
> against what should really be a literal.

Uh, why do you think that?  The function could surely be stable, even
immutable if you grant that a database's encoding can't change.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Isn't the whole point of EXPLAIN to expose implementation artifacts in 
> the first place?

Touche ... I withdraw the complaint.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Zdenek Kotala

Currently in heaptuple.c we have duplicated code.  heap_deformtuple and 
heap_formtuple are mark as a obsolete interface. Is any reason to have still 
them? I know that they are still used on many places, but is there any stopper 
to keep these function alive?


Zdenek



--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] A small performance bug in BTree Infrastructure

2008-10-23 Thread Gokulakannan Somasundaram

Hi All,
   BTree Insert requires a data-structure called BTStack to report the
page splits that have happened in the leaf pages to non-leaf pages upwards.
It basically goes back and updates those non-leaf pages. But this will never
happen in a search operation. You never need to climb upwards for a  Select
statement. Actually we can clearly see that in the _bt_first function, as
soon as we complete calling _bt_search, we call _bt_freestack to free it. So
unnecessarily a BTStack structure is getting formed and deleted.
   Inside _bt_search function, if we just add a if condition to check
whether access is for BT_READ, then we can avoid the creation of stack and
also remove the _bt_freestack  in the _bt_first function. I just implemented
the change and tested it and i am not seeing any performance difference(as
expected).

Still i thought of reporting it, so that whoever is working on it,
can incorporate this change also.

Thanks,
Gokul.

Re: [HACKERS] Unicode escapes in literals

2008-10-23 Thread Peter Eisentraut


Tom Lane wrote:

Peter Eisentraut <[EMAIL PROTECTED]> writes:

SQL has the following escape syntax for it:
U&'special character: \' [ UESCAPE '\' ]


Man that's ugly.  Why the ampersand?


Yeah, excellent question.  It seems completely unnecessary, but it is 
surely there in the syntax diagram.



How do you propose to distinguish
this from a perfectly legitimate use of the & operator?


Well, technically, there is going to be some conflict, but the practical 
impact should be minimal because:


- There are no spaces allowed between U&' .  We typically suggest spaces 
around binary operators.


- Naming a column "u" might not be terribly common.

- Binary-and with an undecorated string literal is not very common.

Of course, I have no data for these assertions.  An inquiry on -general 
might give more insight.


2. Convert this syntax to a function call.  But that would then create a 
lot of inconsistencies, such as needing functional indexes for matches 
against what should really be a literal.


Uh, why do you think that?  The function could surely be stable, even
immutable if you grant that a database's encoding can't change.


Yeah, true, that would work.

There are some other disadvantages for making a function call.  You 
couldn't use that kind of literal in any other place where the parser 
calls for a string constant: role names, tablespace locations, 
passwords, copy delimiters, enum values, function body, file names.


There is also a related feature for Unicode escapes in identifiers, and 
it might be good to keep the door open on that.


We could to a dual approach: Convert in the scanner when server encoding 
 is UTF8, and pass on as function call otherwise.  Surely ugly though.


Or pass it on as a separate token type to the analyze phase, but that is 
a lot more work.



Others: What use cases do you envision, and what requirements would they 
create for this feature?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Tom Lane

Zdenek Kotala <[EMAIL PROTECTED]> writes:
> Currently in heaptuple.c we have duplicated code.  heap_deformtuple and 
> heap_formtuple are mark as a obsolete interface. Is any reason to have still 
> them? I know that they are still used on many places, but is there any 
> stopper 
> to keep these function alive?

Well, aside from the gruntwork needed to convert all the core code that
still uses the old APIs, there's the prospect of breaking extension
modules that still use the old APIs.  It's kind of annoying to have two
copies of that code, but less annoying than removing it would be ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Unicode escapes in literals

2008-10-23 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> There are some other disadvantages for making a function call.  You 
> couldn't use that kind of literal in any other place where the parser 
> calls for a string constant: role names, tablespace locations, 
> passwords, copy delimiters, enum values, function body, file names.

Good point.  I'm okay with supporting the feature only when database
encoding is UTF8.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Zdenek Kotala


Tom Lane napsal(a):

Zdenek Kotala <[EMAIL PROTECTED]> writes:
Currently in heaptuple.c we have duplicated code.  heap_deformtuple and 
heap_formtuple are mark as a obsolete interface. Is any reason to have still 
them? I know that they are still used on many places, but is there any stopper 
to keep these function alive?


Well, aside from the gruntwork needed to convert all the core code that
still uses the old APIs, there's the prospect of breaking extension
modules that still use the old APIs.  It's kind of annoying to have two
copies of that code, but less annoying than removing it would be ...



What's about convert null array to boolean and call heap_form_tuple?

Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Tom Lane

Zdenek Kotala <[EMAIL PROTECTED]> writes:
> Tom Lane napsal(a):
>> Well, aside from the gruntwork needed to convert all the core code that
>> still uses the old APIs, there's the prospect of breaking extension
>> modules that still use the old APIs.  It's kind of annoying to have two
>> copies of that code, but less annoying than removing it would be ...

> What's about convert null array to boolean and call heap_form_tuple?

Yeah, that's a thought.  We'd want to be sure we'd converted any call
sites that are performance-critical, but surely the vast majority are
not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Alvaro Herrera

Zdenek Kotala wrote:
> Tom Lane napsal(a):

>> Well, aside from the gruntwork needed to convert all the core code that
>> still uses the old APIs, there's the prospect of breaking extension
>> modules that still use the old APIs.  It's kind of annoying to have two
>> copies of that code, but less annoying than removing it would be ...
>
> What's about convert null array to boolean and call heap_form_tuple?

Agreed, I started doing that some time ago ... it doesn't seem all that
complicated.  We could try to add a #warning if an external module uses
the deprecated interface, for a couple of releases, and then perhaps
drop it.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Zdenek Kotala


Alvaro Herrera napsal(a):

Zdenek Kotala wrote:

Tom Lane napsal(a):



Well, aside from the gruntwork needed to convert all the core code that
still uses the old APIs, there's the prospect of breaking extension
modules that still use the old APIs.  It's kind of annoying to have two
copies of that code, but less annoying than removing it would be ...

What's about convert null array to boolean and call heap_form_tuple?


Agreed, I started doing that some time ago ... it doesn't seem all that
complicated.  


Do you have any half patch?


We could try to add a #warning if an external module uses
the deprecated interface, for a couple of releases, and then perhaps
drop it.


+1

Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] patch: Client certificate requirements

2008-10-23 Thread Magnus Hagander

This patch adds a configuration option to pg_hba.conf for "clientcert".
This makes it possible to have different client certificate requirements
on different connections. It also makes sure that if you specify that
you want client cert verification and the root store isn't there, we
give an error instead of silently allowing the user in (like we do now).

This still does not implement actual client certificate validation -
that's for a later step. It just cleans up the handling we have now.

//Magnus
*** a/src/backend/libpq/auth.c
--- b/src/backend/libpq/auth.c
***
*** 273,278  ClientAuthentication(Port *port)
--- 273,304 
   errmsg("missing or erroneous pg_hba.conf file"),
   errhint("See server log for details.")));
  
+ 	/*
+ 	 * This is the first point where we have access to the hba record for
+ 	 * the current connection, so perform any verifications based on the
+ 	 * hba options field that should be done *before* the authentication
+ 	 * here.
+ 	 */
+ 	if (port->hba->clientcert)
+ 	{
+ 		/*
+ 		 * When we parse pg_hba.conf, we have already made sure that we have
+ 		 * been able to load a certificate store. Thus, if a certificate is
+ 		 * present on the client, it has been verified against our root
+ 		 * certificate store, and the connection would have been aborted
+ 		 * already if it didn't verify ok.
+ 		 */
+ 		if (!port->peer)
+ 		{
+ 			ereport(FATAL,
+ 	(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
+ 	 errmsg("connection requires a valid client certificate")));
+ 		}
+ 	}
+ 
+ 	/*
+ 	 * Now proceed to do the actual authentication check
+ 	 */
  	switch (port->hba->auth_method)
  	{
  		case uaReject:
*** a/src/backend/libpq/be-secure.c
--- b/src/backend/libpq/be-secure.c
***
*** 128,133  static const char *SSLerrmessage(void);
--- 128,134 
  #define RENEGOTIATION_LIMIT (512 * 1024 * 1024)
  
  static SSL_CTX *SSL_context = NULL;
+ static bool ssl_loaded_verify_locations = false;
  
  /* GUC variable controlling SSL cipher list */
  char	   *SSLCipherSuites = NULL;
***
*** 229,234  secure_destroy(void)
--- 230,248 
  }
  
  /*
+  * Indicate if we have loaded the root CA store to verify certificates
+  */
+ bool
+ secure_loaded_verify_locations(void)
+ {
+ #ifdef USE_SSL
+ 	return ssl_loaded_verify_locations;
+ #endif
+ 
+ 	return false;
+ }
+ 
+ /*
   *	Attempt to negotiate secure session.
   */
  int
***
*** 780,794  initialize_SSL(void)
  		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
  
  	/*
! 	 * Require and check client certificates only if we have a root.crt file.
  	 */
! 	if (!SSL_CTX_load_verify_locations(SSL_context, ROOT_CERT_FILE, NULL))
  	{
! 		/* Not fatal - we do not require client certificates */
  		ereport(LOG,
  (errmsg("could not load root certificate file \"%s\": %s",
  		ROOT_CERT_FILE, SSLerrmessage()),
!  errdetail("Will not verify client certificates.")));
  	}
  	else
  	{
--- 794,821 
  		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
  
  	/*
! 	 * Attempt to load CA store, so we can verify client certificates if needed.
  	 */
! 	if (access(ROOT_CERT_FILE, R_OK))
  	{
! 		/*
! 		 * Root certificate file simply not found. Don't log an error here, because
! 		 * it's quite likely the user isn't planning on using client certificates.
! 		 */
! 		ssl_loaded_verify_locations = false;
! 	}
! 	else if (!SSL_CTX_load_verify_locations(SSL_context, ROOT_CERT_FILE, NULL))
! 	{
! 		/*
! 		 * File was there, but we could not load it. This means the file is somehow
! 		 * broken, and we should log this. Don't log it as a fatal error, because
! 		 * there is still a chance that the user isn't going to use client certs.
! 		 */
! 		ssl_loaded_verify_locations = false;
  		ereport(LOG,
  (errmsg("could not load root certificate file \"%s\": %s",
  		ROOT_CERT_FILE, SSLerrmessage()),
!  errdetail("Will not be able to verify client certificates.")));
  	}
  	else
  	{
***
*** 821,833  initialize_SSL(void)
  ROOT_CRL_FILE, SSLerrmessage()),
  		 errdetail("Certificates will not be checked against revocation list.")));
  			}
- 		}
  
! 		SSL_CTX_set_verify(SSL_context,
! 		   (SSL_VERIFY_PEER |
! 			SSL_VERIFY_FAIL_IF_NO_PEER_CERT |
! 			SSL_VERIFY_CLIENT_ONCE),
! 		   verify_cb);
  	}
  }
  
--- 848,865 
  ROOT_CRL_FILE, SSLerrmessage()),
  		 errdetail("Certificates will not be checked against revocation list.")));
  			}
  
! 			/*
! 			 * Always ask for SSL client cert, but don't fail if it's not presented. We'll fail later in this case,
! 			 * based on what we find in pg_hba.conf.
! 			 */
! 			SSL_CTX_set_verify(SSL_context,
! 			   (SSL_VERIFY_PEER |
! SSL_VERIFY_CLIENT_ONCE),
! 			   verify_cb);
! 
! 			ssl_loaded_verify_locations = true;
! 		}
  	}
  }
  
*** a/src/backend/libpq/hba.c
--- b/src/backend/libpq/hb

Re: [HACKERS] Making pg_standby compression-friendly

2008-10-23 Thread Koichi Suzuki

In terms of compress/decompress WAL in archive/restore, please take a
look at my project pglesslog,
http://pgfoundry.org/projects/pglesslog/

This project compresses WAL segment by replacing full page writes with
corresponding incremental logs.   When restored, it inserts dummy WAL
record to maintain LSN and file size.

This can be applied to log-shipping mechanism, asynchronous or synchronous.


2008/10/23 Charles Duffy <[EMAIL PROTECTED]>:
> Howdy, all.
>
> I'm interested in compressing archived WAL segments in an environment
> set up for PITR in the interests of reducing both network traffic and
> storage requirements. However, pg_standby presently checks file sizes,
> requiring that an archive segment be exactly the right size to be
> considered valid. The idea of compressing log segments is not new --
> the clearxlogtail project in pgfoundry provides a tool to make such
> compression more effective, and is explicitly intended for said
> purpose -- but as of 8.3.4, pg_standby appears not to support such
> environments; I propose adding such support.
>
> To allow pg_standby to operate in an environment where archive
> segments are compressed, two behaviors are necessary:
>
>  - suppressing the file-size checks. This puts the onus on the user to
> create these files via an atomic mechanism, but is necessary to allow
> compressed files to be considered.
>  - allowing a custom restore command to be provided. This permits the
> user to specify the mechanism to be used to decompress the segment.
> One bikeshed is determining whether the user should pass in a command
> suitable for use in a pipeline or a command which accepts input and
> output as arguments.
>
> A sample implementation is attached, intended only to kickstart
> discussion; I'm not attached to either its implementation or its
> proposed command-line syntax.
>
> Thoughts?
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>



-- 
--
Koichi Suzuki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Making pg_standby compression-friendly

2008-10-23 Thread Heikki Linnakangas


Koichi Suzuki wrote:

In terms of compress/decompress WAL in archive/restore, please take a
look at my project pglesslog,
http://pgfoundry.org/projects/pglesslog/

This project compresses WAL segment by replacing full page writes with
corresponding incremental logs.   When restored, it inserts dummy WAL
record to maintain LSN and file size.

This can be applied to log-shipping mechanism, asynchronous or synchronous.


I believe Charles' question was: how do you hook that decompression into 
pg_standby? I suggested that whatever script is run on the standby 
server to copy xlog files to the archive location, should also call the 
decompression program, like pglesslog, but apparently there is no such 
script in his setup. How would you set up a standby server, using 
pg_lesslog?



2008/10/23 Charles Duffy <[EMAIL PROTECTED]>:

Howdy, all.

I'm interested in compressing archived WAL segments in an environment
set up for PITR in the interests of reducing both network traffic and
storage requirements. However, pg_standby presently checks file sizes,
requiring that an archive segment be exactly the right size to be
considered valid. The idea of compressing log segments is not new --
the clearxlogtail project in pgfoundry provides a tool to make such
compression more effective, and is explicitly intended for said
purpose -- but as of 8.3.4, pg_standby appears not to support such
environments; I propose adding such support.

To allow pg_standby to operate in an environment where archive
segments are compressed, two behaviors are necessary:

 - suppressing the file-size checks. This puts the onus on the user to
create these files via an atomic mechanism, but is necessary to allow
compressed files to be considered.
 - allowing a custom restore command to be provided. This permits the
user to specify the mechanism to be used to decompress the segment.
One bikeshed is determining whether the user should pass in a command
suitable for use in a pipeline or a command which accepts input and
output as arguments.

A sample implementation is attached, intended only to kickstart
discussion; I'm not attached to either its implementation or its
proposed command-line syntax.

Thoughts?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers









--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Alvaro Herrera

Zdenek Kotala wrote:
> Alvaro Herrera napsal(a):

>> Agreed, I started doing that some time ago ... it doesn't seem all that
>> complicated.  
>
> Do you have any half patch?

Couldn't find it here, sorry :-(

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: Client certificate requirements

2008-10-23 Thread Magnus Hagander

Magnus Hagander wrote:
> This patch adds a configuration option to pg_hba.conf for "clientcert".
> This makes it possible to have different client certificate requirements
> on different connections. It also makes sure that if you specify that
> you want client cert verification and the root store isn't there, we
> give an error instead of silently allowing the user in (like we do now).
> 
> This still does not implement actual client certificate validation -
> that's for a later step. It just cleans up the handling we have now.

Uh, with docs.

//Magnus

*** a/doc/src/sgml/runtime.sgml
--- b/doc/src/sgml/runtime.sgml
***
*** 1646,1658  $ kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`

!   
 To require the client to supply a trusted certificate, place
 certificates of the certificate authorities (CA)
 you trust in the file root.crt in the data
!directory.  A certificate will then be requested from the client during
 SSL connection startup.  (See  for a
!description of how to set up client certificates.)  The server will
 verify that the client's certificate is signed by one of the trusted
 certificate authorities.  Certificate Revocation List (CRL) entries
 are also checked if the file root.crl exists.
--- 1646,1662 
 been entered.

!   
!Using client certificates
!
 To require the client to supply a trusted certificate, place
 certificates of the certificate authorities (CA)
 you trust in the file root.crt in the data
!directory, and set the clientcert parameter
!to 1 on the appropriate line(s) in pg_hba.conf.
!A certificate will then be requested from the client during
 SSL connection startup.  (See  for a
!description of how to set up certificates on the client.)  The server will
 verify that the client's certificate is signed by one of the trusted
 certificate authorities.  Certificate Revocation List (CRL) entries
 are also checked if the file root.crl exists.
***
*** 1663,1673  $ kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`

!If the root.crt file is not present, client
!certificates will not be requested or checked.  In this mode, SSL
!provides encrypted communication but not authentication.

 The files server.key, server.crt,
 root.crt, and root.crl
--- 1667,1689 

!The clientcert option in pg_hba.conf
!is available for all authentication methods, but only for rows
!specified as hostssl. Unless specified, the default is
!not to verify the client certificate.
!   
! 
!   
!PostgreSQL currently does not support authentication
!using client certificates, since it cannot differentiate between
!different users. As long as the user holds any certificate issued
!by a trusted CA it will be accepted, regardless of what account the
!user is trying to connect with.

+   

+   
+SSL Server File Usage

 The files server.key, server.crt,
 root.crt, and root.crl
***
*** 1704,1710  $ kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`
root.crt
trusted certificate authorities
!   requests client certificate; checks certificate is
signed by a trusted certificate authority

--- 1720,1726 

root.crt
trusted certificate authorities
!   checks that client certificate is
signed by a trusted certificate authority

***
*** 1717,1722  $ kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`

+   

 Creating a Self-Signed Certificate
*** a/src/backend/libpq/auth.c
--- b/src/backend/libpq/auth.c
***
*** 273,278  ClientAuthentication(Port *port)
--- 273,304 
   errmsg("missing or erroneous pg_hba.conf file"),
   errhint("See server log for details.")));

+ 	/*
+ 	 * This is the first point where we have access to the hba record for
+ 	 * the current connection, so perform any verifications based on the
+ 	 * hba options field that should be done *before* the authentication
+ 	 * here.
+ 	 */
+ 	if (port->hba->clientcert)
+ 	{
+ 		/*
+ 		 * When we parse pg_hba.conf, we have already made sure that we have
+ 		 * been able to load a certificate store. Thus, if a certificate is
+ 		 * present on the client, it has been verified against our root
+ 		 * certificate store, and the connection would have been aborted
+ 		 * already if it didn't verify ok.
+ 		 */
+ 		if (!port->peer)
+ 		{
+ 			ereport(FATAL,
+ 	(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
+ 	 errmsg("connection requires a valid client certificate")));
+ 		}
+ 	}
+ 
+ 	/*
+ 	 * Now proceed to do the actual authentication check
+ 	 */
  	switch (port->hba->auth_method)
  	{
  		case uaReject:
*** a/src/backend/libpq/be-secure.c
--- b/src/backend/libpq/be-secure.c
***
*** 128,133  static const cha

Re: [HACKERS] double-buffering page writes

2008-10-23 Thread Alvaro Herrera

ITAGAKI Takahiro wrote:

> I have some comments about the double-buffering:

Since posting this patch I have realized that this implementation is
bogus.  I'm now playing with WAL-logging hint bits though.  As to your
questions:

> - Are there any performance degradation because of addtional memcpy?
>   8kB of memcpy seems not to be free.

Of course, it is not free.  However it comes with the benefit that we
can release the io_in_progress lock earlier for the block -- we lock,
copy, unlock; whereas the old code did lock, write(), unlock.  Avoding a
system call in the locked area could be a win.  Whether this is a net
benefit is something that I have not measured.

> - Is it ok to allocale dblbuf[BLCKSZ] as local variable?
>   It might be unaligned. AFAICS we avoid such usages in other places.

I thought about that too.  I admit I am not sure if this really works
portably; however I don't want to add a palloc() to that routine.

> - It is the best if we can delay double-buffering until locks are
>   conflicted actually. But we might need to allocale shadow buffers
>   from shared buffers instead of local memory.

The point of double-buffering is that the potential writer (a process
doing concurrent hint-bit setting) is not going to grab any locks.

> - Are there any other modules that can share in the benefits of
>   double-buffering? For example, we could avoid avoid waiting for
>   LockBufferForCleanup(). It is cool if the double-buffering can
>   be used for multiple purposes.

Not sure on this.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL cleanups/hostname verification

2008-10-23 Thread Magnus Hagander

Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:

> The patch seems pretty far short of sufficient as far as supporting a
> new conninfo option goes --- for instance it appears to leak the string
> at disconnect.  Check through all the references to some existing option
> field to see if you missed anything else.

Looking over it again, that's pretty much the only one I find. I checked
against sslmode. Did you spot something else?

Found a bug in the GSSAPI code though, which also does not free :) Will fix.

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block level concurrency during recovery

2008-10-23 Thread Heikki Linnakangas


Simon Riggs wrote:

It would also be possible to introduce a special tweak there which is
that if the block is not in cache, don't read it in at all. If its not
in cache we know that nobody has a pin on it, so don't need to read it
in just to say "got the lock". That icing for later.


Heh, that's clever :-). We could actually use a method like that to 
solve the whole problem. After the (replay of the) b-tree vacuum is 
finished, scan through all shared buffers, and get a vacuum lock on 
every page of that index that's in cache. Groveling through all shared 
buffers would be slower for small indexes, though.


I believe the "vacuum lock every leaf page" behavior is only needed for 
system catalogs. You have other issues with those still, namely that 
table locks are not yet taken appropriately, so I'm not sure if this is 
worth worrying about until that's done.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Unicode escapes in literals

2008-10-23 Thread Andrew Sullivan

On Thu, Oct 23, 2008 at 06:04:43PM +0300, Peter Eisentraut wrote:
>> Man that's ugly.  Why the ampersand?
>
> Yeah, excellent question.  It seems completely unnecessary, but it is 
> surely there in the syntax diagram.

Probably because many Unicode representations are done with "U+"
followed by 4-6 hexadecimal units, but "+" is problematic for other
reasons (in some vendor's implementation)?

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] double-buffering page writes

2008-10-23 Thread Heikki Linnakangas


Alvaro Herrera wrote:

ITAGAKI Takahiro wrote:


I have some comments about the double-buffering:


Since posting this patch I have realized that this implementation is
bogus.  I'm now playing with WAL-logging hint bits though.  


Yeah, the torn page + hint bit updates problem is the tough question.


- Is it ok to allocale dblbuf[BLCKSZ] as local variable?
  It might be unaligned. AFAICS we avoid such usages in other places.


I thought about that too.  I admit I am not sure if this really works
portably; however I don't want to add a palloc() to that routine.


It should work, AFAIK, but unaligned memcpy()s and write()s can be a 
significantly slower. There can be only one write() happening at any 
time, so you could just palloc() a single 8k buffer in TopMemoryContext 
in backend startup, and always use that.



- Are there any other modules that can share in the benefits of
  double-buffering? For example, we could avoid avoid waiting for
  LockBufferForCleanup(). It is cool if the double-buffering can
  be used for multiple purposes.


Not sure on this.


You'd need to keep both versions of the buffer simultaneously in the 
buffer cache for that. When we talked about the various designs for HOT, 
that was one of the ideas I had to enable more aggressive pruning: if 
you can't immediately get a vacuum lock, allocate a new buffer in the 
buffer cache for the same block, copy the page to the new buffer, and do 
the pruning, including moving tuples around, there. Any new ReadBuffer 
calls would return the new page version, but old readers would keep 
referencing the old one. The intrusive part of that approach, in 
addition to the obvious changes required in the buffer manager to keep 
around multiple copies of the same block, is that all modifications must 
be done on the new version, so anyone who needs to lock the page for 
modification would need to switch to the new page version at the 
LockBuffer call.


As discussed in the other thread with Simon, we also use vacuum locks in 
b-tree for waiting out index scans, so avoiding the waiting there would 
be just wrong.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL cleanups/hostname verification

2008-10-23 Thread Tom Lane

Magnus Hagander <[EMAIL PROTECTED]> writes:
> Looking over it again, that's pretty much the only one I find. I checked
> against sslmode. Did you spot something else?

No, I just thought that it looked like too small a patch for adding
a new conn option, and complained as soon as I found something missing.
If you've got it working the same as sslmode then it should be okay.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Unicode escapes in literals

2008-10-23 Thread Tom Lane

Andrew Sullivan <[EMAIL PROTECTED]> writes:
> On Thu, Oct 23, 2008 at 06:04:43PM +0300, Peter Eisentraut wrote:
>> Yeah, excellent question.  It seems completely unnecessary, but it is 
>> surely there in the syntax diagram.

> Probably because many Unicode representations are done with "U+"
> followed by 4-6 hexadecimal units, but "+" is problematic for other
> reasons (in some vendor's implementation)?

They could hardly ignore the conflict with the operator interpretation
for +.  The committee has now cut themselves off from ever having a
standard operator named &, but I suppose they didn't think ahead to that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block level concurrency during recovery

2008-10-23 Thread Simon Riggs


On Thu, 2008-10-23 at 19:21 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > It would also be possible to introduce a special tweak there which is
> > that if the block is not in cache, don't read it in at all. If its not
> > in cache we know that nobody has a pin on it, so don't need to read it
> > in just to say "got the lock". That icing for later.
> 
> Heh, that's clever :-). We could actually use a method like that to 
> solve the whole problem. After the (replay of the) b-tree vacuum is 
> finished, scan through all shared buffers, and get a vacuum lock on 
> every page of that index that's in cache. Groveling through all shared 
> buffers would be slower for small indexes, though.

Well, re-examining all of the assumptions in the code seems to have been
worthwhile so far. I think that makes 4 significant tweaks that have
come out of the Search For Hot Standby.

> I believe the "vacuum lock every leaf page" behavior is only needed for 
> system catalogs. 

So we will still need it then. Which is good 'cos I just wrote it.

> You have other issues with those still, namely that 
> table locks are not yet taken appropriately, so I'm not sure if this is 
> worth worrying about until that's done.

Please explain. If you know of a correctness issue, please say.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] double-buffering page writes

2008-10-23 Thread Tom Lane

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> I thought about that too.  I admit I am not sure if this really works
>> portably; however I don't want to add a palloc() to that routine.

> It should work, AFAIK, but unaligned memcpy()s and write()s can be a 
> significantly slower. There can be only one write() happening at any 
> time, so you could just palloc() a single 8k buffer in TopMemoryContext 
> in backend startup, and always use that.

Some time ago, we arranged for shared buffers to be aligned on *more*
than maxalign boundaries (cf BUFFERALIGN) because on at least some
platforms this makes a very significant difference in the speed of
copying to/from kernel space.  If you are going to double-buffer it
is going to be important to have the intermediate buffer just as well
aligned.  A local char array won't be acceptable, and even for a
palloc'd one you'll need to take some extra steps (like wasting
ALIGNOF_BUFFER extra bytes so you can align the pointer palloc gives).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block level concurrency during recovery

2008-10-23 Thread Heikki Linnakangas


Simon Riggs wrote:

On Thu, 2008-10-23 at 19:21 +0300, Heikki Linnakangas wrote:
You have other issues with those still, namely that 
table locks are not yet taken appropriately, so I'm not sure if this is 
worth worrying about until that's done.


Please explain. If you know of a correctness issue, please say.


Things like DROP TABLE need to take an AccessExclusiveLock, and you 
don't have that yet, no?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql Feature request \set query

2008-10-23 Thread Josh Berkus


Everyone,

What people asking for psql scriptability really want, in my estimate, 
is the ability to write "SQL plus some control structures" anywhere, in 
the server, in the client, or so that they don't have to know where. 
Commercial vendors have that: Oracle has PL/SQL as server-side language 
and client-side language.  Microsoft has T-SQL.  The others have similar 
offerings.


I'm *not* proposing to implement full scriptability in psql.  Just one 
obvious extension of the existing pgsql variable feature.


I agree that giving psql its own unique scripting language would be 
stupid.  We should use something existing.  Ideally, we should make psql 
pluggable with the Perl & Python debuggers.



--Josh


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block level concurrency during recovery

2008-10-23 Thread Simon Riggs


On Thu, 2008-10-23 at 20:24 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2008-10-23 at 19:21 +0300, Heikki Linnakangas wrote:
> >> You have other issues with those still, namely that 
> >> table locks are not yet taken appropriately, so I'm not sure if this is 
> >> worth worrying about until that's done.
> > 
> > Please explain. If you know of a correctness issue, please say.
> 
> Things like DROP TABLE need to take an AccessExclusiveLock, and you 
> don't have that yet, no?

Oh right, I thought you meant something not already in the plan.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Rework subtransaction commit protocol for hot standby.

2008-10-23 Thread Simon Riggs


On Thu, 2008-10-23 at 04:38 +0100, Simon Riggs wrote:

> > That means that out of the four state transitions that are
> > disallowed by the original coding of that Assert, you are now having to
> > consider two as legal.  I don't like that, and I like even less that
> > it's not even trying to determine whether this is a replay-driven
> > change.

Possible state changes

TRANSACTION_STATUS_IN_PROGRESS to 
  TRANSACTION_STATUS_IN_PROGRESS is allowed 
  TRANSACTION_STATUS_COMMITTED is allowed
  TRANSACTION_STATUS_ABORTED is allowed
  TRANSACTION_STATUS_SUB_COMMITTED is allowed

TRANSACTION_STATUS_SUB_COMMITTED to
  TRANSACTION_STATUS_IN_PROGRESS is allowed (but should not be)
  TRANSACTION_STATUS_COMMITTED is allowed
  TRANSACTION_STATUS_ABORTED is allowed
  TRANSACTION_STATUS_SUB_COMMITTED is allowed

TRANSACTION_STATUS_COMMITTED to
  TRANSACTION_STATUS_IN_PROGRESS is disallowed
  TRANSACTION_STATUS_COMMITTED is allowed
  TRANSACTION_STATUS_ABORTED is disallowed
  TRANSACTION_STATUS_SUB_COMMITTED is ignored in redo only

TRANSACTION_STATUS_ABORTED to
  TRANSACTION_STATUS_IN_PROGRESS is disallowed 
  TRANSACTION_STATUS_COMMITTED is disallowed
  TRANSACTION_STATUS_ABORTED is allowed
  TRANSACTION_STATUS_SUB_COMMITTED is disallowed

So out of 16 possible state change requests 10 were previously allowed,
one of which was allowed but should not have been.

This patch allows 1 additional legal state change request, now in redo
only.

There are still 5 disallowed state changes, plus another one disallowed
in normal running. That seems fine.

> Presumably you would like to see an additional parameter to allow that
> test to be more strictly determined? 
> 
> Bug fix v2 patch enclosed, mostly API changes.

I suggest a third version with these changes:

* Write the SUBCOMMITTED to COMMIT transition as a no-op during redo
rather than as an Assert. This prevents a transition from COMMIT to
SUBCOMMIT to ABORT. By making it a no-op the attempt to set COMMIT to
SUBCOMMIT never causes a failure, but it doesn't take place either.

* Disallow SUBCOMMITTED to IN_PROGRESS transition via an Assert.

What do you think?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Regression in IN( field, field, field ) performance

2008-10-23 Thread Tom Lane

Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> Works fine for me, eg

> I think he's looking for something like:
>  5 IN (col1,col2,col3)
> resulting in a bitmap or of three index scans of three different indexes on
> col1, col2, and col3.

Ah, I see.  It would be easy to make transformAExprIn() generate an OR
tree instead of = ANY(ARRAY[]), if we could figure out the conditions
where an OR tree is superior.  I'm not sure it's easy to tell though.
Is it sufficient to do this when there are Vars on the right side and
none on the left?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Kris Jurka




On Thu, 23 Oct 2008, Tom Lane wrote:

Well, aside from the gruntwork needed to convert all the core code that 
still uses the old APIs, there's the prospect of breaking extension 
modules that still use the old APIs.  It's kind of annoying to have two 
copies of that code, but less annoying than removing it would be ...




The problem with trying to deprecate it is that the vast majority of the 
backend is still using the old interfaces, so people looking for 
inspiration for their external modules will likely end up using the old 
interface.  Like Alvaro I started this conversion a while ago, got bored, 
and forgot about it.  If people do want this conversion done while keeping 
the old interface around, I can track down that patch, update it and 
finish it up for the next CommitFest.


Kris Jurka

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hot Standby utility and administrator functions

2008-10-23 Thread Simon Riggs

On Mon, 2008-10-20 at 10:25 +0100, Simon Riggs wrote:

> What else do we need?

> * pg_freeze_recovery() 
> * pg_unfreeze_recovery() 

Two more functions

pg_freeze_recovery_cleanup()
pg_unfreeze_recovery_cleanup()

These would allow recovery to continue normally, except for row removal
operations which would halt the progress of recovery.

It would eventually be possible to have a function that halts recovery
whenever row removal takes place for a list of tables. Not planning on
implementing that initially though.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL cleanups/hostname verification

2008-10-23 Thread Bruce Momjian

Magnus Hagander wrote:
> Attached patch cleans up the certificate verification in libpq, and adds
> a configuration paraqmeter to control it. The new parameter is
> "sslverify", and can be set to:

Because SSL offers both encryption and authentication, I wonder if we
should call this "sslauthenticate".

> * cn = default = will validate that the certificate chains to a trusted
> root, *and* that the cn on the certificate matches the hostname
> specificed in the connection. This is the only option that prevents
> man-in-the-middle attacks completely, and therefor is the default.

Should this be "commonname"?

> * cert = what we had before if there was a root certificate file = will
> validate that the certificate chains to a trusted root, but ignore the cn.

Should this be "chain"?

> * none = will disable certificate validation completely
> 
> 
> This means that the connection string is now in charge of the security
> policy, and not just the "if file exists or not". IMHO this is the only
> proper way to do it. Now, if you for some reason loose the root

Agreed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Any reason to have heap_(de)formtuple?

2008-10-23 Thread Greg Stark

The sad thing us that I also did a patch for this and lost it. I think  
it wouldn't be too hard to convert all the call sites in the backend  
and provide a wrapper for other users.


greg

On 23 Oct 2008, at 08:59 PM, Kris Jurka <[EMAIL PROTECTED]> wrote:




On Thu, 23 Oct 2008, Tom Lane wrote:

Well, aside from the gruntwork needed to convert all the core code  
that still uses the old APIs, there's the prospect of breaking  
extension modules that still use the old APIs.  It's kind of  
annoying to have two copies of that code, but less annoying than  
removing it would be ...




The problem with trying to deprecate it is that the vast majority of  
the backend is still using the old interfaces, so people looking for  
inspiration for their external modules will likely end up using the  
old interface.  Like Alvaro I started this conversion a while ago,  
got bored, and forgot about it.  If people do want this conversion  
done while keeping the old interface around, I can track down that  
patch, update it and finish it up for the next CommitFest.


Kris Jurka

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Bruce Momjian

Julius Stroffek wrote:
> Hi All,
> 
> we would like to start some work on improving the performance of
> PostgreSQL in a multi-CPU environment. Dano Vojtek is student at the
> Faculty of Mathematics and Physics of Charles university in Prague
> (http://www.mff.cuni.cz) and he is going to cover this topic in his
> master thesis. He is going to do some investigation in the methods and
> write down the possibilities and then he is going to implement something
> from that for PostgreSQL.
> 
> We want to come out with a serious proposal for this work after
> collecting the feedback/opinions and doing the more serious investigation.

Exciting stuff, and clearly a direction we need to explore.

> Topics that seem to be of interest and most of them were already
> discussed at developers meeting in Ottawa are
> 1.) parallel sorts
> 2.) parallel query execution
> 3.) asynchronous I/O

I think the current plan is to use posix_advise() to allow parallel I/O,
rather than async I/O becuase posix_advise() will require fewer code
changes.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Smith


On Thu, 23 Oct 2008, Bruce Momjian wrote:


I think the current plan is to use posix_advise() to allow parallel I/O,
rather than async I/O becuase posix_advise() will require fewer code
changes.


These are not necessarily mutually exclusive designs.  fadvise works fine 
on Linux, but as far as I know only async I/O works on Solaris.  Linux 
also has an async I/O library, and it's not clear to me yet whether that 
might work even better than the fadvise approach.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL cleanups/hostname verification

2008-10-23 Thread Magnus Hagander

Bruce Momjian wrote:
> Magnus Hagander wrote:
>> Attached patch cleans up the certificate verification in libpq, and adds
>> a configuration paraqmeter to control it. The new parameter is
>> "sslverify", and can be set to:
> 
> Because SSL offers both encryption and authentication, I wonder if we
> should call this "sslauthenticate".

I think that would confuse people with actual SSL certificate based
authentication, which we do not (yet) have.


>> * cn = default = will validate that the certificate chains to a trusted
>> root, *and* that the cn on the certificate matches the hostname
>> specificed in the connection. This is the only option that prevents
>> man-in-the-middle attacks completely, and therefor is the default.
> 
> Should this be "commonname"?

"cn" is a fairly standard way to refer to it, but if people think the
longer name is better, I'm ok with changing it.


>> * cert = what we had before if there was a root certificate file = will
>> validate that the certificate chains to a trusted root, but ignore the cn.
> 
> Should this be "chain"?

Could be, not sure.


//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] SSL README

2008-10-23 Thread Magnus Hagander

I notice that there is a src/backend/libpq/README.SSL that contains a
couple of copied emails in it (it even contains our end-of-broadcast
message string!) And it contains a reply to the mail, with quoted stuff!

In general, this seem to be some hints about how to use things. Things
that if we need them at all, should go in the main manual. It's
certainly not code documentation, so it should not go in a README file
in the sourcetree.

My suggestion is to just get rid of any enduser docs from that one, and
possibly add stuff to the main manual once we feel it's needed. The part
that documents the SSL "negotiation protocol inside libpq" should stay,
of course.

Any objections?

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL README

2008-10-23 Thread Alvaro Herrera

While you're cleaning up SSL, how about the thread with this email on
it:

19212172.post%40talk.nabble.com

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
>> I think the current plan is to use posix_advise() to allow parallel I/O,
>> rather than async I/O becuase posix_advise() will require fewer code
>> changes.
>
> These are not necessarily mutually exclusive designs.  fadvise works fine on
> Linux, but as far as I know only async I/O works on Solaris.  Linux also has
> an async I/O library, and it's not clear to me yet whether that might work
> even better than the fadvise approach.

fadvise is a kludge.  While it will help, it still makes us completely
reliant on the OS.  For performance reasons, we should be supporting a
multi-block read directly into shared buffers.  IIRC, we currently
have support for rings in the buffer pool, which we could read
directly into.  Though, an LRU-based buffer manager design would be
more optimal in this case.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL README

2008-10-23 Thread Bruce Momjian

Magnus Hagander wrote:
> I notice that there is a src/backend/libpq/README.SSL that contains a
> couple of copied emails in it (it even contains our end-of-broadcast
> message string!) And it contains a reply to the mail, with quoted stuff!
> 
> In general, this seem to be some hints about how to use things. Things
> that if we need them at all, should go in the main manual. It's
> certainly not code documentation, so it should not go in a README file
> in the sourcetree.
> 
> My suggestion is to just get rid of any enduser docs from that one, and
> possibly add stuff to the main manual once we feel it's needed. The part
> that documents the SSL "negotiation protocol inside libpq" should stay,
> of course.
> 
> Any objections?

Agreed, remove it.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Bruce Momjian

Jonah H. Harris wrote:
> On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
> >> I think the current plan is to use posix_advise() to allow parallel I/O,
> >> rather than async I/O becuase posix_advise() will require fewer code
> >> changes.
> >
> > These are not necessarily mutually exclusive designs.  fadvise works fine on
> > Linux, but as far as I know only async I/O works on Solaris.  Linux also has
> > an async I/O library, and it's not clear to me yet whether that might work
> > even better than the fadvise approach.
> 
> fadvise is a kludge.  While it will help, it still makes us completely
> reliant on the OS.  For performance reasons, we should be supporting a
> multi-block read directly into shared buffers.  IIRC, we currently
> have support for rings in the buffer pool, which we could read
> directly into.  Though, an LRU-based buffer manager design would be
> more optimal in this case.

True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL README

2008-10-23 Thread Russell Smith

Alvaro Herrera wrote:
> While you're cleaning up SSL, how about the thread with this email on
> it:
>
> 19212172.post%40talk.nabble.com
>
>   
Yeah, I mentioned this to Magnus this morning (my time) and he said
Bruce was compiling a patch in time for the next commit fest.

I'm not sure where it's all at, but an "official" update would be helpful.

Russell.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL README

2008-10-23 Thread Bruce Momjian

Russell Smith wrote:
> Alvaro Herrera wrote:
> > While you're cleaning up SSL, how about the thread with this email on
> > it:
> >
> > 19212172.post%40talk.nabble.com
> >
> >   
> Yeah, I mentioned this to Magnus this morning (my time) and he said
> Bruce was compiling a patch in time for the next commit fest.
> 
> I'm not sure where it's all at, but an "official" update would be helpful.

Yep, got the patch done, actually;  will post it now.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Reducing the memory footprint of large sets of pending triggers

2008-10-23 Thread Tom Lane

We've occasionally talked about allowing pending-trigger-event lists to
spill to disk when there get to be so many events that it's a memory
problem.  I'm not especially interested in doing that right now, but
I noticed recently that we could alleviate the problem a lot by adopting
a more compact representation.

Currently, each event is a separately palloc'd instance of struct
AfterTriggerEventData.  On a 32-bit machine that struct takes 32 bytes,
plus 8 bytes palloc overhead = 40 bytes.  On a 64-bit machine the struct
takes 36 bytes, but palloc rounds that up to 64 bytes, plus there's 16
bytes palloc overhead = 80 bytes :-(.

I see several things we could do here:

* Allocate the event structs in reasonably-large arrays instead of
separate palloc chunks.  This would require some data copying where we
now get away with pointer-swinging --- but on the other hand per-event
palloc'ing isn't exactly free either, so I suspect that this would net
out to a wash if not an actual speedup.

* Don't store the ate_tgoid and ate_relid fields in each individual
event struct.  Instead keep a separate array with one instance of these
values for each distinct trigger that's been fired in the current
transaction (in most cases this list should be pretty short, even if
there are many events).  We can commandeer the high order bits of
ate_event to store an index into that array.  Currently only 8 bits
of ate_event are actually used, so we'd have room for 16 million
distinct triggers fired in a transaction.  Even if we need a few more
ate_event flag bits later, I don't see a problem there.

* Don't store two ItemPointers in insert or delete events.  This would
make the array element stride variable, but since we don't need random
access into the arrays AFAICS, that doesn't seem to be a problem.

In combination these changes would get us down to 16 bytes per
insert/delete and 20 per update event, which represents a factor of 2
or 2.5 savings on a 32-bit machine and a factor of 4 or 5 on a 64-bit
machine.  Seems worth doing to me, especially since it looks like
only about a 1-day project touching only a single source file.

It might be possible to go further and move the event status bits and
firing_id into the separate array, which would save a further four bytes
per event in the typical situation that a lot of events of the same
trigger are queued by a single command.  I think I'd want to tackle that
as a follow-on patch though, because it would be a change in the data
structure semantics not just rearranging the representation a tad.

Comments, better ideas?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark

I couldn't get async I/O to work on Linux. That is it "worked" but  
performed the same as reading one block at a time. On solaris the  
situation is reversed.


In what way is fadvise a kludge?

greg

On 24 Oct 2008, at 01:44 AM, Bruce Momjian <[EMAIL PROTECTED]> wrote:


Jonah H. Harris wrote:
On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]>  
wrote:
I think the current plan is to use posix_advise() to allow  
parallel I/O,
rather than async I/O becuase posix_advise() will require fewer  
code

changes.


These are not necessarily mutually exclusive designs.  fadvise  
works fine on
Linux, but as far as I know only async I/O works on Solaris.   
Linux also has
an async I/O library, and it's not clear to me yet whether that  
might work

even better than the fadvise approach.


fadvise is a kludge.  While it will help, it still makes us  
completely
reliant on the OS.  For performance reasons, we should be  
supporting a

multi-block read directly into shared buffers.  IIRC, we currently
have support for rings in the buffer pool, which we could read
directly into.  Though, an LRU-based buffer manager design would be
more optimal in this case.


True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.

--
 Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
 EnterpriseDB http://enterprisedb.com

 + If your life is a hard drive, Christ can be your backup. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Bruce Momjian

Greg Stark wrote:
> I couldn't get async I/O to work on Linux. That is it "worked" but  
> performed the same as reading one block at a time. On solaris the  
> situation is reversed.
> 
> In what way is fadvise a kludge?

I think he is saying AIO gives us more flexibility, but I am unsure we
need it.

---


> 
> greg
> 
> On 24 Oct 2008, at 01:44 AM, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> 
> > Jonah H. Harris wrote:
> >> On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]>  
> >> wrote:
>  I think the current plan is to use posix_advise() to allow  
>  parallel I/O,
>  rather than async I/O becuase posix_advise() will require fewer  
>  code
>  changes.
> >>>
> >>> These are not necessarily mutually exclusive designs.  fadvise  
> >>> works fine on
> >>> Linux, but as far as I know only async I/O works on Solaris.   
> >>> Linux also has
> >>> an async I/O library, and it's not clear to me yet whether that  
> >>> might work
> >>> even better than the fadvise approach.
> >>
> >> fadvise is a kludge.  While it will help, it still makes us  
> >> completely
> >> reliant on the OS.  For performance reasons, we should be  
> >> supporting a
> >> multi-block read directly into shared buffers.  IIRC, we currently
> >> have support for rings in the buffer pool, which we could read
> >> directly into.  Though, an LRU-based buffer manager design would be
> >> more optimal in this case.
> >
> > True, it is a kludge but if it gives us 95% of the benfit with 10% of
> > the code, it is a win.
> >
> > -- 
> >  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
> >  EnterpriseDB http://enterprisedb.com
> >
> >  + If your life is a hard drive, Christ can be your backup. +
> >
> > -- 
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> True, it is a kludge but if it gives us 95% of the benfit with 10% of
> the code, it is a win.

I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
<[EMAIL PROTECTED]> wrote:
> I couldn't get async I/O to work on Linux. That is it "worked" but performed
> the same as reading one block at a time. On solaris the situation is
> reversed.

Hmm, then obviously you did something wrong, because my tests showed
it quite well.  Pull the source to iozone or fio.

> In what way is fadvise a kludge?

non-portable, requires more user-to-system CPU, ... need I go on?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Tom Lane

"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
>> In what way is fadvise a kludge?

> non-portable, requires more user-to-system CPU, ... need I go on?

I'd be interested to know which of these proposals you claim *is*
portable.  The single biggest reason to reject 'em all is that
they aren't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Fri, Oct 24, 2008 at 12:42 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> non-portable, requires more user-to-system CPU, ... need I go on?
>
> I'd be interested to know which of these proposals you claim *is*
> portable.  The single biggest reason to reject 'em all is that
> they aren't.

Yes, that was bad wording on my part.  What I mean to say was
unpredictable.  Different OSes and filesystems handle fadvise
differently (or not at all), which makes any claim to performance gain
configuration-dependent.  My preferred method, using O_DIRECT and
fetching directly into shared buffers, is not without its issues or
challenges as well.  However, by abstracting the multi-block read
interface, we could use more optimal calls depending on the OS.

Having done a bit of research and testing in this area (AIO and buffer
management), I don't see any easy solution.  fadvise will work on some
systems and will likely give some gain on them, but won't work for
everyone.  The alternative is to abstract prefetching and allow
platform-specific code, which we rarely do.  While we could build an
abstract prefetch interface and simply use fadvise for it now (rather
than OS-specific code), I don't see an easy win in any case.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark




On 24 Oct 2008, at 04:31 AM, "Jonah H. Harris"  
<[EMAIL PROTECTED]> wrote:



On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
<[EMAIL PROTECTED]> wrote:
I couldn't get async I/O to work on Linux. That is it "worked" but  
performed

the same as reading one block at a time. On solaris the situation is
reversed.


Hmm, then obviously you did something wrong, because my tests showed
it quite well.  Pull the source to iozone or fio.


I posted the source, feel free to point out what I did wrong. It did  
work on solaris with and without o_direct so I didn't think it was a  
bug in my code.




In what way is fadvise a kludge?


non-portable, requires more user-to-system CPU, ... need I go on?


Well it's just as portable, they're both specified by posix. Actually  
async I/o is in the real-time extensions so one could argue it's less  
portable. Also before posix_fadvise there was plain old fadvise so  
it's portable to older platforms too whereas async I/o isn't.


Posix_fadvise does require two syscalls and two trips to the buffer  
manager. But that doesn't really make it a kludge if the resulting  
code is cleaner than the async I/o code would be. To use async I/o we  
would have to pin all the buffers we're reading which would be quite a  
lot of code changes.


I did ask for feedback on precisely this point of whether two trips to  
the buffer manager was a problem. It would have been nice to get the  
feedback 6 months ago when I posted it instead of now two weeks before  
feature freeze. 
   


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark

Based on what? I did test this and posted the data. The results I  
posted showed that posix_fadvise on Linux performed nearly as well on  
Linux as async I/O on Solaris on identical hardware.


More importantly it scaled with the number if drives. A 15 drive array  
gets about 15x the performance of a 1 drive array if enough read-ahead  
is done. Plus an extra boost if the input wasn't already sorted which  
presumably reflects the better i/o ordering.





--
greg

On 24 Oct 2008, at 04:29 AM, "Jonah H. Harris"  
<[EMAIL PROTECTED]> wrote:


On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian <[EMAIL PROTECTED]>  
wrote:

True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.


I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.

--
Jonah H. Harris, Senior DBA
myYearbook.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark

We did discuss this in Ottawa and I beleive your comment was "the hint  
is in the name" referring to posix_fadvise.


In any case both aio and posix_fadvise are specified by posix so I  
don't see either as a problem on that front.


I don't think we can ignore any longer that we effectively can't use  
raid arrays with postgres. If you have many concurrent queries or  
restrict yourself to sequential scans you're ok but if you're doing  
data warehousing you're going to be pretty disappointed to see your  
shiny raid array performing like a single drive.



greg

On 24 Oct 2008, at 05:42 AM, Tom Lane <[EMAIL PROTECTED]> wrote:


"Jonah H. Harris" <[EMAIL PROTECTED]> writes:

On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark

In what way is fadvise a kludge?



non-portable, requires more user-to-system CPU, ... need I go on?


I'd be interested to know which of these proposals you claim *is*
portable.  The single biggest reason to reject 'em all is that
they aren't.

   regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Heikki Linnakangas


Jonah H. Harris wrote:

fadvise is a kludge.


I don't think it's a kludge at all. posix_fadvise() is a pretty nice and 
clean interface to hint the kernel what pages you're going to access in 
the near future. I can't immediately come up with a cleaner interface to 
do that.


Compared to async I/O, it's helluva lot simpler to add a few 
posix_fadvise() calls to an application, than switch to a completely 
different paradigm. And while posix_fadvise() is just a hint, allowing 
the OS to prioritize accordingly, all async I/O requests look the same.



 While it will help, it still makes us completely
reliant on the OS.


That's not a bad thing in my opinion. The OS knows the I/O hardware, 
disk layout, utilization, and so forth, and is in a much better position 
to do I/O scheduling than a user process. The only advantage a user 
process has is that it knows better what pages it's going to need, and 
posix_fadvise() is a good interface to let the user process tell the 
kernel that.



IIRC, we currently have support for rings in the buffer pool, which we could 
read
directly into.


The rings won't help you a bit. It's just a different way to choose 
victim buffers.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Heikki Linnakangas


Jonah H. Harris wrote:

On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:

True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.


I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.


Let's try to focus. We're not talking about using O_DIRECT, we're 
talking about using asynchronous I/O or posix_fadvise(). And without 
more details on what you mean by benefit, under what circumstances, any 
numbers like that is just unmeasurable handwaving.


In terms of getting the RAID array busy, in Greg's tests posix_fadvise() 
on Linux worked just as well as async I/O works on Solaris. So it 
doesn't seem like there's any inherent performance advantage in the 
async I/O interface over posix_fadvise() + read(),


There is differences between different OS implementations of the 
interfaces. But we're developing software for the future, and for a wide 
range of platforms, and I'm sure operating systems will develop as well. 
The decision should not be made on what is the fastest interface on a 
given operating system in 2008.


Async I/O might have a small potential edge on CPU usage, because less 
system calls are needed. However, let me remind you all that we're 
talking about how to utilize RAID array to do physical, random, I/O as 
fast as possible. IOW, the bottleneck is I/O, by definition. The CPU 
efficiency of the kernel interface to initiate the I/O is insignificant, 
until we reach a large enough random read throughput to saturate the 
CPU, and even then there's probably more significant CPU savings to be 
made elsewhere.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

75 matches

Mail list logo