Re: [HACKERS] Doubts about EvalPlanQual

2009-02-20 Thread Jacky Leng

 Asserts on data-consistency checks aren't really a good idea.

 (IOW this is "can't happen" only as long as your database isn't
 corrupt...)


Then why not change this to an ereport(PANIC ...)? 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow on/off as input texts for boolean.

2009-02-20 Thread ITAGAKI Takahiro

Peter Eisentraut pete...@gmx.net wrote:

 ITAGAKI Takahiro wrote:
  Here is a patch to allow 'on' and 'off' as input texts for boolean.
 
 Regarding your FIXME comment, I think parse_bool* should be in bool.c
 and declared in builtins.h, which guc.c already includes.
 (Conceptually, the valid format of a bool should be derived from the
 boolean type, not the GUC system, I think.)

Here is an updated patch to move parse_bool* into bool.c.
I also added tests of on/off values to the regression test.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



boolin_accepts_onoff.2.patch
Description: Binary data



Re: [HACKERS] Doubts about EvalPlanQual

2009-02-20 Thread Heikki Linnakangas

Jacky Leng wrote:

Asserts on data-consistency checks aren't really a good idea.

(IOW this is "can't happen" only as long as your database isn't
corrupt...)



Then why not change this to an ereport(PANIC ...)? 


If you have a corrupted database, you want to be able to read it, not
panic. If anything, we could put a WARNING there, but I'm not 100% sure
it really is a "can't happen" case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] GIN fast insert

2009-02-20 Thread Teodor Sigaev

Right, can't do that on a hot standby server.


Is there an applicable hot standby patch anywhere? The last version on the
wiki is 9g and it can't be applied cleanly.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Peter Eisentraut

Andrew Dunstan wrote:

Cédric Villemain wrote:


 -j [jobs], --jobs[=jobs]
     Specifies the number of jobs (pg_restore) to run simultaneously. If the -j
     option is given without an argument, pg_restore will not limit the number
     of jobs that can run simultaneously.


Quite apart from anything else, this description is almost 100% dead 
wrong.  The argument is not optional at all, and there is no unlimited 
parallelism. If you want to know how it actually works look at the dev 
docs.


What I'm still missing here is a piece of documentation or a guideline 
that says when a given number of threads/jobs/workers would be 
appropriate.  For make -j, this is pretty clear: If you have N CPUs to 
spare, use -j N.  For pg_restore, this is not made clear:  Is it the 
number of CPUs on the client or the server or the number of disks on the 
client or the server or perhaps a combination of this or something else?




Re: [HACKERS] Re: [COMMITTERS] pgsql: Start background writer during archive recovery.

2009-02-20 Thread Heikki Linnakangas

Tom Lane wrote:

Couldn't you get rid of PMSIGNAL_RECOVERY_COMPLETED altogether?  If the
startup process exits with code 0, recovery is complete, else there
was trouble.  I find this SetPostmasterSignal bit quite ugly anyway.


Right now, the startup process exits with code 0 also when it's told to
exit with SIGTERM, i.e. a fast shutdown request, and the recovery-completed
signal is used to differentiate those cases. But yeah, we can use
another exit code for that. I'll look into that approach.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] GIN fast insert

2009-02-20 Thread Heikki Linnakangas

Teodor Sigaev wrote:

Right, can't do that on a hot standby server.


Is there an applicable hot standby patch anywhere? The last version on the
wiki is 9g and it can't be applied cleanly.


The latest version is in Simon's git repository at:

http://git.postgresql.org/?p=~sriggs/simon.git;a=shortlog;h=refs/heads/dev_hot_standby

in the dev_hot_standby branch. I don't think he's posted an updated 
patch based on that work.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Password prompting business

2009-02-20 Thread Peter Eisentraut
I was revisiting the issue of how to prevent psql from prompting for a 
password, for batch operations, previously discussion here:


http://www.mail-archive.com/pgsql-b...@postgresql.org/msg18440.html
http://www.mail-archive.com/pgsql-b...@postgresql.org/msg20002.html

I noticed that there appear to be remnants of actually supporting that 
in the code that are now nonsense, e.g., in psql/startup.c:


if (PQstatus(pset.db) == CONNECTION_BAD &&
    PQconnectionNeedsPassword(pset.db) &&
    password == NULL &&
    !feof(stdin))
{
    PQfinish(pset.db);
    password = simple_prompt(password_prompt, 100, false);
    new_pass = true;
}

and similarly in scripts/common.c and in pg_dump(all).

The feof(stdin) test is there from a time when the prompt went to stdout
and the input came from stdin.  Now it would usually not have any effect
unless the program reads from stdin before connecting to the database,
which doesn't happen, as far as I can tell.


Unless I'm reading this wrong, we should probably remove this.

Anyway, a way to prevent a password prompt would still be useful.  This 
should then be applicable to all client tools.  I can imagine that we 
either add a command-line option (-w would mirror -W well) or do this 
automatically when stdin and/or stdout is not a terminal.  I'm not sure 
if the latter can be made to work safely, though.




Re: [HACKERS] WIP: hooking parser

2009-02-20 Thread Peter Eisentraut

Tom Lane wrote:

How about introducing a varchar2 type as in Oracle?


Maybe.  I think right now we don't allow input functions to decide
that a non-null input string should be converted to a NULL, but
that might be fixable.  It'd still be an ugly mess though, since
I suspect you'd have to introduce a whole structure of varchar2
functions/operators paralleling text.  For example, what is Oracle's
handling of || ?  AFAICS they can't be standards compliant there,
which means you need a varchar2-specific nonstrict implementation
of ||, and then to make that work the way Oracle users would expect,
varchar2-ness rather than text-ness would have to propagate through
anything else that might be done to a column before it reaches the ||.


Curiously enough, Oracle has it so that || of null arguments treats the 
arguments as empty string.


It's beyond comprehension.

But yeah, a varchar2 type with a full set of functions and operators 
could work.  If you choose not to bother with supporting the char type.




Re: [HACKERS] pg_restore new option -m

2009-02-20 Thread ohp

On Thu, 19 Feb 2009, Andrew Dunstan wrote:


Date: Thu, 19 Feb 2009 21:58:18 -0500
From: Andrew Dunstan and...@dunslane.net
To: Tom Lane t...@sss.pgh.pa.us
Cc: o...@pyrenet.fr, pgsql-hackers list pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] pg_restore new option -m



Tom Lane wrote:

o...@pyrenet.fr writes:


pg_restore -C -m 4 -d template1 db.dmp
gives numerous errors, mostly no such relation at index creation time.



You sure you don't get exactly the same without -m?





Yeah, I have reproduced this. It's because we reconnect to the wrong db in 
this case (i.e. to the -d database, not the created one) in the workers and 
subsequent connects.


I've applied a trivial patch to fix it.


works like a  charm!

cheers

andrew





thanks
--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges+33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: o...@pyrenet.fr
--
Make your life a dream, make your dream a reality. (St Exupery)



Re: [HACKERS] WIP: hooking parser

2009-02-20 Thread Pavel Stehule
 Curiously enough, Oracle has it so that || of null arguments treats the
 arguments as empty string.

 It's beyond comprehension.

what is the result of '' || '' ?

Pavel

 But yeah, a varchar2 type with a full set of functions and operators could
 work.  If you choose not to bother with supporting the char type.



Re: [HACKERS] WIP: hooking parser

2009-02-20 Thread Gregory Stark
Pavel Stehule pavel.steh...@gmail.com writes:

 Curiously enough, Oracle has it so that || of null arguments treats the
 arguments as empty string.

 It's beyond comprehension.

 what is result of '' || '' ?

Well, the result of this is NULL of course (which is the same as '')

What's more puzzling is what the answer to 'foo' || NULL is...
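The two concatenation semantics under discussion can be modeled directly. A minimal sketch, with None standing in for SQL NULL; the function names `pg_concat` and `ora_concat` are mine, and the Oracle behavior shown follows from Oracle treating '' and NULL as the same value:

```python
# Model of || under the two semantics discussed in this thread.
# None stands in for SQL NULL.

def pg_concat(a, b):
    """SQL-standard (PostgreSQL) ||: strict, NULL if either input is NULL."""
    if a is None or b is None:
        return None
    return a + b

def ora_concat(a, b):
    """Oracle-style ||: NULL inputs are treated as empty strings,
    but an empty result is itself NULL (Oracle: '' IS NULL)."""
    a = '' if a is None else a
    b = '' if b is None else b
    result = a + b
    return None if result == '' else result

print(pg_concat('foo', None))   # None: standard-conforming NULL
print(ora_concat('foo', None))  # 'foo': the non-strict Oracle answer
print(ora_concat(None, None))   # None: '' || '' is NULL, as noted above
```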


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Andrew Dunstan



Peter Eisentraut wrote:

Andrew Dunstan wrote:

Cédric Villemain wrote:


 -j [jobs], --jobs[=jobs]
     Specifies the number of jobs (pg_restore) to run simultaneously. If the -j
     option is given without an argument, pg_restore will not limit the number
     of jobs that can run simultaneously.


Quite apart from anything else, this description is almost 100% dead 
wrong.  The argument is not optional at all, and there is no 
unlimited parallelism. If you want to know how it actually works look 
at the dev docs.


What I'm still missing here is a piece of documentation or a guideline 
that says when a given number of threads/jobs/workers would be 
appropriate.  For make -j, this is pretty clear: If you have N CPUs to 
spare, use -j N.  For pg_restore, this is not made clear:  Is it the 
number of CPUs on the client or the server or the number of disks on 
the client or the server or perhaps a combination of this or something 
else?


The short answer is that we don't know yet. There is anecdotal evidence 
that the number of CPUs on the server is a good place to start, but we 
should be honest enough to say that this is a new feature and we are 
still gathering information about its performance.  If you want to give 
some advice, then I think the best advice is to try a variety of 
settings to see what works best for you, and if you have a good set of 
figures report it back to us.


cheers

andrew



Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-20 Thread Tom Lane
I wrote:
 Hence semijoins can be rearranged just as freely as inner joins.

I guess nobody checked my work, because that claim is bogus.  Consider

A semijoin (B innerjoin C on (Pbc)) on (Pab)
=? (A semijoin B on (Pab)) innerjoin C on (Pbc)

In the second form the inner join is now using indeterminate B values.
What's more, if there are multiple C rows joining to some B, we could
get duplicated A rows, which can never happen in the first form.  So
semijoins do not commute with inner joins in their RHS.  A more accurate
statement seems to be that semijoins can be treated like innerjoins
for the purposes of rearrangement of other special joins.
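The counterexample above can be checked mechanically. A sketch over lists of tuples; the tiny relations and always-true predicates are made up purely to trigger the duplicate-A-rows case (note that in form 2 the Pbc predicate cannot really be evaluated, since the semijoin output carries no B columns — which is exactly the problem):

```python
# A semijoin (B innerjoin C on Pbc) on Pab  vs
# (A semijoin B on Pab) innerjoin C on Pbc

def innerjoin(r, s, pred):
    return [rt + st for rt in r for st in s if pred(rt, st)]

def semijoin(r, s, pred):
    # Each row of r appears at most once, if any row of s matches.
    return [rt for rt in r if any(pred(rt, st) for st in s)]

A = [('a1',)]
B = [('b1',)]
C = [('c1',), ('c2',)]           # two C rows join to the same B row

Pab = lambda at, bt: True         # a1 matches b1
Pbc = lambda bt, ct: True         # b1 matches both C rows

# Form 1: the semijoin emits each A row at most once.
form1 = semijoin(A, innerjoin(B, C, Pbc), Pab)

# Form 2: B values are gone after the semijoin, so the inner join can
# only match blindly, and each matching C row duplicates the A row.
form2 = innerjoin(semijoin(A, B, Pab), C, lambda at, ct: True)

print(len(form1))  # 1
print(len(form2))  # 2: duplicated A rows, which form 1 can never produce
```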

regards, tom lane



Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-20 Thread Robert Haas
On Fri, Feb 20, 2009 at 10:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Hence semijoins can be rearranged just as freely as inner joins.

 I guess nobody checked my work, because that claim is bogus.

I spent some time reading your email and thinking through the cases,
but I completely failed to notice this.

Sorry,

...Robert



Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Joshua D. Drake
On Fri, 2009-02-20 at 09:33 -0500, Andrew Dunstan wrote:

 The short answer is that we don't know yet. There is anecdotal evidence 
 that the number of CPUs on the server is a good place to start, but we 
 should be honest enough to say that this is a new feature and we are 
 still gathering information about its performance.  If you want to give 
 some advice, then I think the best advice is to try a variety of 
 settings to see what works best for you, and if you have a good set of 
 figures report it back to us.

There has been some fairly heavy testing and research that caused the
patch in the first place. The thread is here:

http://archives.postgresql.org/pgsql-hackers/2008-02/msg00695.php

It is a long thread. The end result was that the fastest restore time for
220G was achieved with 24 threads on an 8-core box. It came in at 3.5
hours.

http://archives.postgresql.org/pgsql-hackers/2008-02/msg01092.php

It is important to point out that this was a machine with 50 spindles.
Which is where your bottleneck is going to be immediately after solving
the CPU bound nature of the problem.

So although the CPU question is easily answered, the IO is not. IO is
extremely variable in its performance.

Sincerely,

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Andrew Dunstan



Joshua D. Drake wrote:

On Fri, 2009-02-20 at 09:33 -0500, Andrew Dunstan wrote:

  
The short answer is that we don't know yet. There is anecdotal evidence 
that the number of CPUs on the server is a good place to start, but we 
should be honest enough to say that this is a new feature and we are 
still gathering information about its performance.  If you want to give 
some advice, then I think the best advice is to try a variety of 
settings to see what works best for you, and if you have a good set of 
figures report it back to us.



There has been some fairly heavy testing and research that caused the
patch in the first place. The thread is here:

http://archives.postgresql.org/pgsql-hackers/2008-02/msg00695.php

It is a long thread. The end result was that the fastest restore time for
220G was achieved with 24 threads on an 8-core box. It came in at 3.5
hours.

http://archives.postgresql.org/pgsql-hackers/2008-02/msg01092.php

It is important to point out that this was a machine with 50 spindles.
Which is where your bottleneck is going to be immediately after solving
the CPU bound nature of the problem.

So although the CPU question is easily answered, the IO is not. IO is
extremely variable in its performance.


  


Yes, quite true. But parallel restore doesn't work quite the same way
your original shell scripts did. It tries harder to keep the job pool
continuously occupied, and so its best number of jobs is likely to be a
bit lower than yours.


But you are right that there isn't a simple formula.

cheers

andrew



Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Kenneth Marshall
On Fri, Feb 20, 2009 at 09:22:58AM -0800, Joshua D. Drake wrote:
 On Fri, 2009-02-20 at 09:33 -0500, Andrew Dunstan wrote:
 
  The short answer is that we don't know yet. There is anecdotal evidence 
  that the number of CPUs on the server is a good place to start, but we 
  should be honest enough to say that this is a new feature and we are 
  still gathering information about its performance.  If you want to give 
  some advice, then I think the best advice is to try a variety of 
  settings to see what works best for you, and if you have a good set of 
  figures report it back to us.
 
 There has been some fairly heavy testing and research that caused the
 patch in the first place. The thread is here:
 
 http://archives.postgresql.org/pgsql-hackers/2008-02/msg00695.php
 
 It is a long thread. The end result was that the fastest restore time for
 220G was achieved with 24 threads on an 8-core box. It came in at 3.5
 hours.
 
 http://archives.postgresql.org/pgsql-hackers/2008-02/msg01092.php
 
 It is important to point out that this was a machine with 50 spindles.
 Which is where your bottleneck is going to be immediately after solving
 the CPU bound nature of the problem.
 
 So although the CPU question is easily answered, the IO is not. IO is
 extremely variable in its performance.
 
 Sincerely,
 
 Joshua D. Drake
 
I also ran some tests against a more modest system that was still
showing a performance improvement at (number-of-cores * 2):

http://archives.postgresql.org/pgsql-hackers/2008-11/msg01399.php

I think that a good starting point for any use should be the number
of cores given these two data points.

Cheers,
Ken




Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Kevin Grittner
 Andrew Dunstan and...@dunslane.net wrote: 
 Joshua D. Drake wrote:
 the fastest restore time for
 220G was performed with 24 threads with an 8 core box.
 
 It is important to point out that this was a machine with 50 spindles.
 Which is where your bottleneck is going to be immediately after solving
 the CPU bound nature of the problem.

 But you are right that there isn't a simple formula.

Perhaps the greater of the number of CPUs or effective spindles?

(24 sounds suspiciously close to the effective spindles on a 50-spindle
box with RAID 10.)
 
-Kevin



Re: [HACKERS] pg_restore --multi-thread

2009-02-20 Thread Joshua D. Drake
On Fri, 2009-02-20 at 11:57 -0600, Kevin Grittner wrote:
  
  But you are right that there isn't a simple formula.
  
 Perhaps the greater of the number of CPUs or effective spindles?
  
 (24 sounds suspiciously close to effective spindles on a 50 spindle box
 with RAID 10.)

It does, except that you aren't accounting for 7200RPM vs 10k vs 15k vs
iSCSI vs FibreChannel etc...

You would have to literally do the math to figure it all out. Those 50
spindles were DAS. You go iSCSI and all of a sudden you have turned
those 50 spindles into an effective 8 DAS spindles. Not to mention if
you only have a single path for your FibreChannel etc...

Joshua D. Drake


  
 -Kevin
 
-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




[HACKERS] return query/composite types

2009-02-20 Thread Merlin Moncure
create table foo(a int, b int);
postgres=# create function rfoo() returns setof foo as $$ begin return
query select foo from foo; end; $$ language plpgsql;
CREATE FUNCTION
Time: 25.606 ms
postgres=#
postgres=#
postgres=# select rfoo();
ERROR:  structure of query does not match function result type
DETAIL:  Number of returned columns (1) does not match expected column
count (2).


Any reason why the above shouldn't work?  Why does 'return query'
assume that returned composite types are expanded?

merlin



Re: [HACKERS] return query/composite types

2009-02-20 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 create table foo(a int, b int);
 postgres=# create function rfoo() returns setof foo as $$ begin return
 query select foo from foo; end; $$ language plpgsql;

Use select * from ... instead.

regards, tom lane



Re: [HACKERS] return query/composite types

2009-02-20 Thread Merlin Moncure
On Fri, Feb 20, 2009 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 create table foo(a int, b int);
 postgres=# create function rfoo() returns setof foo as $$ begin return
 query select foo from foo; end; $$ language plpgsql;

 Use select * from ... instead.

Yeah...I was thinking maybe that shouldn't be required:
1. it's illogical and conflicts with regular non 'returns query'
semantics (declare foo, assign, return)
2. if 'foo' is result of set returning function (like unnest), you
need to make extra subquery to prevent that function from executing
lots of extra times.
e.g.
select unnest(foo) from something will unnest the set six times if
foo has six fields.  This is a bit of a landmine since type returning
functions are _fairly_ common use for composite types.

These aren't really complaints since the workarounds are trivial, just
casual wondering if the behavior is correct.

merlin



Re: [HACKERS] return query/composite types

2009-02-20 Thread Merlin Moncure
On Fri, Feb 20, 2009 at 3:25 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Fri, Feb 20, 2009 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 create table foo(a int, b int);
 postgres=# create function rfoo() returns setof foo as $$ begin return
 query select foo from foo; end; $$ language plpgsql;

 Use select * from ... instead.

 Yeah...I was thinking maybe that shouldn't be required:
 1. it's illogical and conflicts with regular non 'returns query'
 semantics (declare foo, assign, return)
 2. if 'foo' is result of set returning function (like unnest), you
 need to make extra subquery to prevent that function from executing
 lots of extra times.
 e.g.
 select unnest(foo) from something will unnest the set six times if

er,
select (unnest(foo)).* from something will unnest the set six times if
 ^^^



Re: [HACKERS] return query/composite types

2009-02-20 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Fri, Feb 20, 2009 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Use select * from ... instead.

 Yeah...I was thinking maybe that shouldn't be required:

If you allow both interpretations then you create a syntactic ambiguity
(at least for the case of single-column composite types).  We have
historically allowed both in SQL functions, but I think that's a design
error that shouldn't be repeated in other PLs.

regards, tom lane



Re: [Pljava-dev] [HACKERS] Re: Should creating a new base type require superuser status?

2009-02-20 Thread Kris Jurka



On Wed, 18 Feb 2009, Kris Jurka wrote:

I have reviewed pljava's handling of misrepresented alignment, length, and by 
value parameters [and it doesn't all work.]




I have fixed pljava to now correctly handle all of these being defined 
incorrectly.  So a trusted language can be used to create type input and 
output functions safely.  I think the restriction that only superusers can 
create types should be reverted.


Kris Jurka



Re: [HACKERS] The science of optimization in practical terms?

2009-02-20 Thread decibel

On Feb 17, 2009, at 11:23 PM, Robert Haas wrote:

Actually, a simple algorithm that might work really well would be to
calculate relation cache odds as ( number of page accesses for relation /
number of page accesses for all relations ) * ( sum(relpages)*BLKSZ /
eff_cache_size ), where number of page accesses would be both from relcache
and not.


I don't think that formula makes any sense.  If effective_cache_size
is in the denominator, then increasing it will make the odds of
finding the page in cache go down.


Yes, sorry... I got that part of the equation upside-down. It should be:

( number of page accesses for relation / number of page accesses for all
relations ) * ( eff_cache_size / sum(relpages)*BLKSZ )



One thing this doesn't address though is the report from a few
months ago that accessing small tables is still faster with an index scan,
even if we know the whole thing is in cache (I don't remember if that was
ever resolved...)


I'm not sure if this is what you're referring to, but there was a
relatively recent post on, I believe, -performance, where a bitmap
index scan that hit almost the entire table beat out a seqscan.  I
don't think there was any further discussion and I'm still mystified
as to how it's possible.



What I was thinking of was that when dealing with a very small table
(one or maybe a few pages), the planner thinks that a seqscan is the
fastest way to get a single row, but it's actually faster to use an
index. The bitmap case is even more interesting. Something is
seriously screwy with small seqscans, it seems.

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828





Re: [HACKERS] parallel restore

2009-02-20 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:
  

Tom Lane wrote:


There is an unfinished TODO item here: we really ought to make it work
for tar-format archives.  That's probably not hugely difficult, but
I didn't look into it, and don't think we should hold up applying the
existing patch for it.
  


  

Right. Were you thinking this should be done for 8.4?



If you have time to look into it, sure.  Otherwise we should just put it
on the TODO list.


  


I've had a look at this. If our tar code supported out-of-order
restoration (using fseeko) I'd be done. But it doesn't, and I won't get
that done for 8.4, if at all. I'm not sure what would be involved in
making it work.


cheers

andrew



Re: [HACKERS] The science of optimization in practical terms?

2009-02-20 Thread Robert Haas
On Fri, Feb 20, 2009 at 7:25 PM, decibel deci...@decibel.org wrote:
 On Feb 17, 2009, at 11:23 PM, Robert Haas wrote:

 Actually, a simple algorithm that might work really well would be to
 calculate relation cache odds as ( number of page accesses for relation /
 number of page accesses for all relations ) * ( sum(relpages)*BLKSZ /
 eff_cache_size ), where number of page accesses would be both from
 relcache
 and not.

 I don't think that formula makes any sense.  If effective_cache_size
 is in the denominator, then increasing it will make the odds of
 finding the page in cache go down.

 Yes, sorry... I got that part of the equation upside-down. It should be:

 ( number of page accesses for relation / number of page accesses for all
 relations ) * ( eff_cache_size / sum(relpages)*BLKSZ )

Well, that makes more sense, but it's still not right.  Suppose I have
ten equal-sized relations whose total size is equal to
effective_cache_size.  Relations 1-5 each get 15% of the page accesses
and relations 6-10 each get 5% of the page accesses.  Under your
formula, relations 1-5 will be 150% in cache and relations 6-10 will
be 50% in cache.  In reality, assuming sufficient frequency of access,
100% of all ten relations will be in cache.

I don't think there's any way to do this that doesn't involve some
sort of iterative process.  What you need to do is compute (# of page
accesses for this relation / number of page accesses for all
relations) * effective_cache_size, dole out that amount of cache to it
(capped at 100% of the relation size), and then decrement
effective_cache_size by the amount of cache you doled out and the
number of page accesses by the number for that relation, and then
rerun for the second-most-popular relation.

For example, suppose (in the example above) that effective_cache_size
= 1GB and there are 10K page accesses total.

Relation 1: MIN(1.5K/10K * 1GB, 100MB) = MIN(150MB, 100MB) = 100MB
Relation 2: MIN(1.5K/8.5K * 900MB, 100MB) = MIN(159MB, 100MB) = 100MB
Relation 3: MIN(1.5K/7K * 800MB, 100MB) = MIN(171MB, 100MB) = 100MB
Relation 4: MIN(1.5K/5.5K * 700MB, 100MB) = MIN(190MB, 100MB) = 100MB
Relation 5: MIN(1.5K/4K * 600MB, 100MB)  = MIN(225MB, 100MB) = 100MB
Relation 6: MIN(0.5K/2.5K * 500MB, 100MB) = MIN(100MB, 100MB) = 100MB
Relation 7: MIN(0.5K/2.0K * 400MB, 100MB) = MIN(100MB, 100MB) = 100MB
Relation 8: MIN(0.5K/1.5K * 300MB, 100MB) = MIN(100MB, 100MB) = 100MB
Relation 9: MIN(0.5K/1.0K * 200MB, 100MB) = MIN(100MB, 100MB) = 100MB
Relation 10: MIN(0.5K/0.5K * 100MB, 100MB) = MIN(100MB, 100MB) = 100MB
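The iterative allocation described above can be sketched as follows. This is only an illustration of the proposal, not code from PostgreSQL; the function and variable names are mine:

```python
# Iterative cache-share estimate: dole out effective_cache_size to relations
# in descending order of page accesses, capping each at its full size, then
# shrink the remaining cache and access counts before the next relation.

def cache_shares(relations, eff_cache_size):
    """relations: list of (name, size, page_accesses). Returns {name: size}."""
    remaining_cache = eff_cache_size
    remaining_accesses = sum(acc for _, _, acc in relations)
    shares = {}
    # Most-accessed relations claim their cache share first.
    for name, size, acc in sorted(relations, key=lambda r: -r[2]):
        if remaining_accesses <= 0 or remaining_cache <= 0:
            shares[name] = 0
            continue
        share = min(acc / remaining_accesses * remaining_cache, size)
        shares[name] = share
        remaining_cache -= share
        remaining_accesses -= acc
    return shares

# The worked example above: ten 100MB relations, eff_cache_size = 1GB,
# 10K accesses total (1.5K each for relations 1-5, 0.5K each for 6-10).
MB = 1
rels = [(f'rel{i}', 100 * MB, 1500 if i <= 5 else 500) for i in range(1, 11)]
shares = cache_shares(rels, 1000 * MB)
print(shares)  # each relation ends up (essentially) fully cached, ~100MB
```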

 One thing this doesn't address though is the report from a few
 months ago that accessing small tables is still faster with an index scan,
 even if we know the whole thing is in cache (I don't remember if that was
 ever resolved...)

 I'm not sure if this is what you're referring to, but there was a
 relatively recent post on, I believe, -performance, where a bitmap
 index scan that hit almost the entire table beat out a seqscan.  I
 don't think there was any further discussion and I'm still mystified
 as to how it's possible.

 What I was thinking of was that when dealing with a very small table (one or
 maybe a few pages), the planner thinks that a seqscan is the fastest way to
 get a single row, but it's actually faster to use an index. The bitmap case
 is even more interesting. Something is seriously screwy with small seqscans
 it seems.

Do you have a good test case for this?  I'd like to poke at it.  It's
really just because the planner thinks that accessing the index pages
will cost a disk read, which is often false in practice.  Does it help
if you set random_page_cost = seq_page_cost = 0.2?

The case I mentioned is qualitatively different because not only is
the planner wrong, but the observed behavior is somewhat inexplicable.
 I have a feeling this may have to do with the fact that bitmap
indices can identify individual tuples on the page when the tbm is
non-lossy.  Consulting the index (which is almost free if the page is
already in shared_buffers) not only finds the right page, but lets you
skip the CPU overhead of testing the quals against the irrelevant
tuples on that page.  But we need to do some better testing here to
figure out what is really going on.

...Robert
