[HACKERS] WIP: default values for function parameters

2008-10-26 Thread Pavel Stehule
Hello

I have a problem sending the patch, so I am sending a link:
http://www.pgsql.cz/patches/defaults.diff.gz

Example:
postgres=# create function fx(a int, b int default 30, c int default 40)
postgres-# returns int as $$ select $1 + $2 + $3; $$
postgres-# language sql;
CREATE FUNCTION
postgres=# select fx();
ERROR:  function fx() does not exist
LINE 1: select fx();
  ^
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.
postgres=# select fx(10);
 fx
----
 80
(1 row)

postgres=# select fx(10,11);
 fx
----
 61
(1 row)

postgres=# select fx(10,11,12);
 fx
----
 33
(1 row)

Known bugs:
blind ambiguous call detection (a hypothetical example is sketched below)
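For illustration only -- the fy functions here are made up, not part of the patch -- this
is the kind of overlap that the detection should eventually report:

postgres=# create function fy(a int) returns int as $$ select $1; $$ language sql;
postgres=# create function fy(a int, b int default 20) returns int as $$ select $1 + $2; $$ language sql;
postgres=# select fy(1);  -- could match either fy(int) or fy(int, int default 20)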

comments, ideas?

regards
Pavel Stehule



[HACKERS] new correlation metric

2008-10-26 Thread Jeff Davis
Currently, we use correlation to estimate the I/O costs of an index
scan. However, this has some problems:

http://archives.postgresql.org/pgsql-hackers/2008-01/msg01084.php

I worked with Nathan Boley to come up with what we think is a better
metric for measuring this cost. It is based on the number of times in
the ordered sample that you have to physically backtrack (i.e. the data
value increases, but the physical position is earlier).

For example, if the table's physical order is

6 7 8 9 10 1 2 3 4 5

then the new metric will see one backtrack -- after reading 5 it
backtracks to the beginning to read 6. The old code would have produced
a correlation near -0.5. The new code will yield a correlation of 8/9.
The old code would have produced a run_cost of 1/4*min_IO_cost +
3/4*max_IO_cost, while the new code produces a run_cost of
8/9*min_IO_cost + 1/9*max_IO_cost. 
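As a toy illustration of the counting (not how the patch implements it internally):
assuming the sample is kept as (ord, pos) pairs, where ord is the rank by data value and
pos is the physical position, the backtracks could be counted with

select count(*) as backtracks
from sample s
where exists (select 1 from sample p
              where p.ord = s.ord - 1 and p.pos > s.pos);

which returns 1 for the ten-row example above, giving 1 - 1/9 = 8/9.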

Attached is a patch. We replaced correlation, but left the name the
same (even though that's now a misnomer). We can change it if needed.

Comments?

Regards,
Jeff Davis


correlation.diff.gz
Description: GNU Zip compressed data



Re: [HACKERS] new correlation metric

2008-10-26 Thread Martijn van Oosterhout
On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
> I worked with Nathan Boley to come up with what we think is a better
> metric for measuring this cost. It is based on the number of times in
> the ordered sample that you have to physically backtrack (i.e. the data
> value increases, but the physical position is earlier).
>
> For example, if the table's physical order is
>
> 6 7 8 9 10 1 2 3 4 5

How does it deal with a case like the following:

1 6 2 7 3 8 4 9 5 10  (interleaving)

ISTM that your code will overestimate the cost whereas the old code
wouldn't have done too badly.

I think the code is in the right direction, but I think what you want
is some kind of estimate of "given I've looked for tuple X, how many
tuples in the next k pages are near this one". Unfortunately I don't see
a way of calculating it other than a full simulation.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] new correlation metric

2008-10-26 Thread Tom Lane
Martijn van Oosterhout [EMAIL PROTECTED] writes:
> On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
>> I worked with Nathan Boley to come up with what we think is a better
>> metric for measuring this cost.

> I think the code is in the right direction, but I think what you want
> is some kind of estimate of "given I've looked for tuple X, how many
> tuples in the next k pages are near this one".

ISTM that some experimental studies would be required to justify any
proposal in this area.

Having said that ... one of the things I never liked about the existing
code is that it pays no attention to block-boundary effects.  It doesn't
matter to an indexscan how badly tuples within a single block are
misordered --- what matters is how many block reads you have to do.
So I think that any changes here ought to include measuring the
correlation on the basis of block numbers not tuple numbers.  But what's
not too clear to me is whether our sampling methods would mess that up.
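For reference, the per-column correlation being discussed is the one ANALYZE already
computes and exposes in the pg_stats view; 'mytab' below is just a placeholder:

select tablename, attname, correlation from pg_stats where tablename = 'mytab';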

regards, tom lane



Re: [HACKERS] new correlation metric

2008-10-26 Thread Heikki Linnakangas

Martijn van Oosterhout wrote:

> On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
>> I worked with Nathan Boley to come up with what we think is a better
>> metric for measuring this cost. It is based on the number of times in
>> the ordered sample that you have to physically backtrack (i.e. the data
>> value increases, but the physical position is earlier).
>>
>> For example, if the table's physical order is
>>
>> 6 7 8 9 10 1 2 3 4 5
>
> How does it deal with a case like the following:
>
> 1 6 2 7 3 8 4 9 5 10  (interleaving)
>
> ISTM that your code will overestimate the cost whereas the old code
> wouldn't have done too badly.


Another similar case is where the table is almost in order, but not
quite. For example, take an ordered table, but swap every other value:

2 1 4 3 6 5 7 8 10 9

Distributions like that would happen naturally if you have a logging
application that receives timestamped events from elsewhere. The events
would come in roughly in timestamp order, but not quite, because of
varying delays, for example. For all practical purposes, that is just as
good as a perfectly sorted table.
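A rough way to build such a nearly-in-order test table, purely for illustration (the
table name and the sizes are made up):

create table events as
select now() + (i * interval '1 second')
             + (random() * interval '10 seconds') as ts
from generate_series(1, 1000000) as g(i);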


However, the patch actually operates on the *sample*, not the real
physical order. So I think it would actually work well for my example,
because if you take just a few values from that sequence at random, the
sample is likely to be in good order. Not sure about the interleaved case.


I wonder how the proposed measurement behaves wrt. the sample size vs. 
table size ratio. Intuitively, it feels like it becomes less sensitive 
to disorder, as the table size grows and (sample size)/(table size) 
becomes smaller. Which is not good, I believe.



> I think the code is in the right direction, but I think what you want
> is some kind of estimate of "given I've looked for tuple X, how many
> tuples in the next k pages are near this one". Unfortunately I don't see
> a way of calculating it other than a full simulation.


Yeah, it would be nice to figure out something, as the current method 
sure isn't perfect.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] new correlation metric

2008-10-26 Thread Greg Stark
I haven't looked at the patch yet -- I'm actually on a train now. I'm
sorry if these questions are answered in the patch.


I think there are three questions here:
A) what metric are we going to use
B) how do we estimate the metric given a sample
C) how do we draw the conclusions we need based on this metric

I looked into this topic recently and found papers on estimating two
metrics based on samples: longest sorted subsequence and number of
sorted subsequences. The latter sounds like it might be equivalent to
what you have.


I didn't find any papers on drawing conclusions from either of these
metrics, though.


I think we can just look at block number to avoid intra-block disorder  
confusing us. It does mean we have effectively a much smaller sample  
though.


I don't think the other objection is worth worrying about though. If
our sample has 1 5 2 6 3 7 4 8 then it may be that readahead will save
us in that particular instance, but it's clear the table isn't well
clustered and it's just luck that those blocks were intermixed in that
very particular way. The rest of the table is unlikely to be so lucky.



greg

On 26 Oct 2008, at 03:51 PM, Tom Lane [EMAIL PROTECTED] wrote:


> Martijn van Oosterhout [EMAIL PROTECTED] writes:
>> On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
>>> I worked with Nathan Boley to come up with what we think is a better
>>> metric for measuring this cost.
>
>> I think the code is in the right direction, but I think what you want
>> is some kind of estimate of "given I've looked for tuple X, how many
>> tuples in the next k pages are near this one".
>
> ISTM that some experimental studies would be required to justify any
> proposal in this area.
>
> Having said that ... one of the things I never liked about the existing
> code is that it pays no attention to block-boundary effects.  It doesn't
> matter to an indexscan how badly tuples within a single block are
> misordered --- what matters is how many block reads you have to do.
> So I think that any changes here ought to include measuring the
> correlation on the basis of block numbers not tuple numbers.  But what's
> not too clear to me is whether our sampling methods would mess that up.
>
> regards, tom lane





Re: [HACKERS] new correlation metric

2008-10-26 Thread Tom Lane
Martijn van Oosterhout [EMAIL PROTECTED] writes:
> I think the code is in the right direction, but I think what you want
> is some kind of estimate of "given I've looked for tuple X, how many
> tuples in the next k pages are near this one". Unfortunately I don't see
> a way of calculating it other than a full simulation.

I wonder whether we ought to rethink the problem entirely.  In
particular, the notion of associating correlation stats with particular
columns doesn't seem amazingly useful when you get right down to it.
We're wasting our time calculating the correlation of a column that has
no index; and if the column is part of a multicolumn index it's not that
easy to figure out what the index correlation is from the per-column
numbers.  Not to mention functional and partial indexes.

So it occurs to me that maybe we should forget about per-column
correlations altogether, and instead try directly to calculate *per
index* correlations.  You could do this now by doing an index-only scan
and looking at the series of tuple block numbers that come back.
However, there's no obvious way to do any sampling in that approach
--- you can't read random subsets of the index without cooperation from
the index AM.

So the direction I'm headed in is to imagine that we should add an
analyze entry point to the index AM API, with the definition that it
will be called once per index per ANALYZE, and is in charge of putting
useful stats into someplace or other.  We might need to invent some
other catalog besides pg_statistic if we want to represent per-index
properties like correlation.  A minimal solution would be to add a
correlation column to pg_class or pg_index, but that doesn't scale well
if you imagine that different index AMs might want different sorts of
stats.

regards, tom lane



[HACKERS] array_agg and array_accum (patch)

2008-10-26 Thread Jeff Davis
Here is a patch to support two aggregate functions:

1) ARRAY_AGG() -- SQL 2008 standard behavior, returns NULL on no input, 
and skips NULL inputs. 

2) ARRAY_ACCUM() -- Returns empty array on no input, and includes NULL 
inputs. 

These accumulate the result in a memory context that lives across calls
to the state function, so it's reasonably efficient. On my old laptop it
takes about 5s to generate an array of 1M elements -- not great, but at
least it's linear.

Although array_agg is the standard behavior, array_accum is important
because otherwise you always lose the NULLs, and that's difficult to
work around even with COALESCE. 
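Assuming exactly the semantics described above, the difference would look roughly like
this (the output shown is what I would expect, not copied from the patch):

select array_agg(x), array_accum(x)
from (values (1), (null), (3)) as t(x);
-- expected: array_agg => {1,3}, array_accum => {1,NULL,3}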

I added them as new native functions because ARRAY_AGG is in the
standard, but if others think they should live elsewhere that's fine. I
think that they are generally pretty useful functions for people using
arrays.

This patch is contributed by Truviso.

Regards,
Jeff Davis



array_agg.diff.gz
Description: GNU Zip compressed data



Re: [HACKERS] BufferAccessStrategy for bulk insert

2008-10-26 Thread Tom Lane
Robert Haas [EMAIL PROTECTED] writes:
> I am kind of inclined to define flags like this:

> #define HEAP_INSERT_SKIP_WAL 0x0001
> #define HEAP_INSERT_SKIP_FSM 0x0002
> #define HEAP_INSERT_BULK 0x0004 /* do we even need this one? */

> And then:

> Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid,
> unsigned options, BulkInsertState *bistate);
> BulkInsertState *GetBulkInsertState(void);
> void FreeBulkInsertState(BulkInsertState *);

Seems sane to me.  I don't see the point of the HEAP_INSERT_BULK flag
bit --- providing or not providing bistate would cover that, and if
you have a bit as well then you have to define what the inconsistent
combinations mean.  I concur with making all-zeroes be the typical
state of the flag bits, too.

FWIW, we generally declare bitmask flag variables as int, unless
there's some really good reason to do otherwise.

regards, tom lane



Re: [HACKERS] new correlation metric

2008-10-26 Thread Jeff Davis
On Sun, 2008-10-26 at 12:44 -0400, Tom Lane wrote:
> I wonder whether we ought to rethink the problem entirely.
> [...]

What you say makes a lot of sense.

We would have to take a sample of index leaf pages, but I think we
could get a really useful number from it. For BTree, we can just read
the values in order, and maintain 3 counts:

* same_count: this TID points to the same page as the last one did
* near_count: points to a page that is close enough that readahead or
caching will help
* far_count: points to a page where we don't have a reason to think
readahead or caching will be a major factor

Then, we could use a formula like:

run_cost = min_IO_cost + (max_IO_cost - min_IO_cost)*(far_count/total_count)
  + small_factor*(max_IO_cost - min_IO_cost)*(near_count/total_count)

small_factor should be one minus the fraction of near_count that we
expect to find in cache (because it was recently read or due to
readahead).
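Plugging in some made-up numbers just to show how the pieces combine (all of the counts
and costs below are invented):

-- same_count = 700, near_count = 200, far_count = 100 (total_count = 1000)
-- min_IO_cost = 100, max_IO_cost = 4000, small_factor = 0.2
select 100 + (4000 - 100) * (100 / 1000.0)
           + 0.2 * (4000 - 100) * (200 / 1000.0) as run_cost;
-- => 100 + 390 + 156 = 646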

I think from these numbers most of the concerns are taken into account.
Heikki's almost-in-order case is solved because we could recognize that
the pages are close enough to be in cache still.

Interleaving might still cause a lower estimate, but it seems like any
optimistic estimate we could make for that case is pretty risky. If the
caching effects or readahead aren't as helpful as we expected, it would
mean very bad performance.

Of course, I agree that we need some empirical testing.

> useful stats into someplace or other.  We might need to invent some
> other catalog besides pg_statistic if we want to represent per-index
> properties like correlation.  A minimal solution would be to add a
> correlation column to pg_class or pg_index, but that doesn't scale well
> if you imagine that different index AMs might want different sorts of
> stats.

Why can't we just use pg_statistic with the starelid set to the index
oid?

Regards,
Jeff Davis




Re: [HACKERS] array_agg and array_accum (patch)

2008-10-26 Thread Ian Caulfield
I think array_agg also keeps nulls - although the draft standard I
have seems to contradict itself about this...

Ian

2008/10/26 Jeff Davis [EMAIL PROTECTED]:
> Here is a patch to support two aggregate functions:
>
> 1) ARRAY_AGG() -- SQL 2008 standard behavior, returns NULL on no input,
> and skips NULL inputs.
>
> 2) ARRAY_ACCUM() -- Returns empty array on no input, and includes NULL
> inputs.
>
> These accumulate the result in a memory context that lives across calls
> to the state function, so it's reasonably efficient. On my old laptop it
> takes about 5s to generate an array of 1M elements -- not great, but at
> least it's linear.
>
> Although array_agg is the standard behavior, array_accum is important
> because otherwise you always lose the NULLs, and that's difficult to
> work around even with COALESCE.
>
> I added them as new native functions because ARRAY_AGG is in the
> standard, but if others think they should live elsewhere that's fine. I
> think that they are generally pretty useful functions for people using
> arrays.
>
> This patch is contributed by Truviso.
>
> Regards,
> Jeff Davis








Re: [HACKERS] new correlation metric

2008-10-26 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
> On Sun, 2008-10-26 at 12:44 -0400, Tom Lane wrote:
>> We might need to invent some
>> other catalog besides pg_statistic if we want to represent per-index
>> properties like correlation.

> Why can't we just use pg_statistic with the starelid set to the index
> oid?

Well, because pg_statistic is built for per-column stats.  You'd have to
invent some value for staattnum, which would be problematic for views
like pg_stats that expect it to join to a valid pg_attribute row;
and you'd have useless columns like stanullfrac and stadistinct.

There's no problem with using pg_statistic for stats that correspond to
individual index columns (and in fact we do that already); but ISTM
the point here is that correlation/ordering is about the index as a
whole, not any one column of it.

regards, tom lane



Re: [HACKERS] array_agg and array_accum (patch)

2008-10-26 Thread Stephen Frost
Jeff,

* Jeff Davis ([EMAIL PROTECTED]) wrote:
> 2) ARRAY_ACCUM() -- Returns empty array on no input, and includes NULL
> inputs.

Excellent.  I added it the easy way (from the online docs), but that's
clearly not at all efficient, and I was going to try to fix it, for psql
to use with the column-level privs patch.  It'd be great to use a more
efficient mechanism like this, and to remove adding it from my patch
(erm, it's only one line currently, but it would have been a lot more
eventually :).
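(The "easy way" from the online docs is presumably the array_append-based aggregate in
the manual's user-defined aggregates example, something like

create aggregate array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);

which copies the whole array on every call, hence the poor performance.)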

I haven't actually reviewed the code at all, but +1 in general to adding
this to core.

Thanks,

Stephen




Re: [HACKERS] BufferAccessStrategy for bulk insert

2008-10-26 Thread Robert Haas
> Seems sane to me.  I don't see the point of the HEAP_INSERT_BULK flag
> bit --- providing or not providing bistate would cover that, and if
> you have a bit as well then you have to define what the inconsistent
> combinations mean.  I concur with making all-zeroes be the typical
> state of the flag bits, too.

Thanks for the design review.  I had thought to make the inconsistent
combinations fail an assertion, but I'm just as happy to leave it out
altogether.

> FWIW, we generally declare bitmask flag variables as int, unless
> there's some really good reason to do otherwise.

OK, thanks for the tip.

...Robert



[HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-26 Thread Tom Lane
We have an open TODO item to support SQL-language functions that return
the output of a RETURNING clause attached to an INSERT/UPDATE/DELETE query
within the function.  This is something that was left undone in the 8.2
development cycle after this thread analyzed the problem:
http://archives.postgresql.org/pgsql-hackers/2006-10/msg00665.php
The basic conclusion was that the only sane way to do it is to have the
SQL function execute the DML command to completion and then return the
emitted rows in a tuplestore.  Which is fine, but it would be pretty messy
to do unless we make set-returning SQL functions return tuplestores all
the time, and there was worry that that might lose performance compared to
the existing value-per-call protocol.

Attached is a draft patch that converts set-returning SQL functions to
return tuplestores.  It's pretty incomplete --- it doesn't actually add
the RETURNING feature, and there's a lot of ugly stuff to clean up ---
but it passes regression tests and it's close enough for performance
testing.  What I find is that the performance hit doesn't seem too bad.
The test case I'm using looks like this:

regression=# create function foo(int) returns setof int as
'select generate_series(1,$1)' language sql;
CREATE FUNCTION
regression=# select count(*) from (select foo(NNN)) ss;

This example is chosen with malice aforethought to stress the tuplestore
performance as much as possible.  The internal generate_series() call is
about as cheap a set-generating function as possible, and it returns
through value-per-call mechanism so there is no added tuplestore in the
way.  In the outer query we again avoid the tuplestore that would be
created by nodeFunctionscan.c, and we use an upper count(*) to avoid
shipping all the rows to the client.  I should note also that the function
is intentionally declared VOLATILE to prevent its being inlined into the
calling query.

What I find on my development machine is that CVS HEAD processes this
query at about 1.33 microsec/row.  With the attached patch, the speed is
about 1.0 usec/row if the tuplestore stays within work_mem; about 1.3
usec/row if it spills to disk but doesn't overflow available kernel disk
cache; and about 1.56 usec/row in cases considerably larger than available
RAM, when we actually have to write the data to disk and read it back.
This is on my development workstation, which is a dual 2.8GHz Xeon EM64T
with your typical junk consumer-grade single ATA disk drive, running
Fedora 9.  (BTW, the test seems to be mostly CPU-bound even when spilling
to disk.)

So I'm concluding that we can easily afford to switch to tuplestore-always
operation, especially if we are willing to put any effort into tuplestore
optimization.  (I note that the current tuplestore code writes 24 bytes
per row for this example, which is a shade on the high side for only 4 bytes
payload.  It looks like it would be pretty easy to knock 10 bytes off that
for a 40% savings in I/O volume.)

I'm putting up this patch mostly so that anyone who's worried about the
performance issue can do their own tests.  It's definitely not meant for
style or completeness critiques ;-)

BTW, the patch also removes the existing limitation of not being able
to call set-returning plpgsql functions in a SELECT targetlist...

regards, tom lane



binqMn7ZJD0iv.bin
Description: functions-via-tuplestore-1.patch.gz



Re: [HACKERS] array_agg and array_accum (patch)

2008-10-26 Thread Tom Lane
Ian Caulfield [EMAIL PROTECTED] writes:
> I think array_agg also keeps nulls - although the draft standard I
> have seems to contradict itself about this...

The SQL:2008 draft I have says, in 10.9 <aggregate function> general
rule 8g:

NOTE 267 - Null values are not eliminated when computing <array
aggregate function>. This, plus the optional <sort specification
list>, sets <array aggregate function> apart from general set
functions.

So that seems to make it perfectly clear that nulls aren't eliminated,
and furthermore to be an intentional override of any other part of the
spec that you might think says nulls should be eliminated.  If you have
an argument to read it otherwise, please say exactly what.

A larger objection to Jeff's draft patch is that it doesn't implement
the sort specification list.  I'm entirely happy about not doing that
--- the current SQL committee's willingness to invent random new syntax
and nonorthogonal behavior for every function they can think of will be
the death of SQL yet --- but it's something that we at least need to
document the workaround for.
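The workaround is presumably to force the order in a subquery before aggregating
('mytab' here is just a placeholder):

select array_agg(x)
from (select x from mytab order by x) as sorted;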

regards, tom lane
