[HACKERS] window function v03 against HEAD

2008-07-28 Thread H . Harada
I happily announce that the first design of window function was
finished and the patch against HEAD is released online. See
http://umitanuki.net/pgsql/wfv03/design.html

The window functions such like row_number(), rank(), dense_rank(),
percent_rank() are experimentally defined within the patch as
 * It is formed as an aggregate function, seen in pg_aggregate
 * It does or does not have trans function.
 * It has a final function that is declared as VOLATILE.
 * Its final function is called multiple times during returning values.

And ranking system is a bit ugly. The problem was how you know the
ranking boundary. Inside rank_final(), you see WindowState node is
pulled out from fcinfo-context and TupleTableSlot is used. Actually I
am not sure if this is appropriate but have tried it, which did work.
So as the first release, I guess it is better that the window
functions are formulated but not open as user-defined functions.
Also, since current design of the window functions is by aggregate,
some of the SQL spec functions such like lag()/lead() was dropped.
These functions need to reach for random rows from current row. For
this needs, we might have to provide something like Window Object
mechanism that allows random access to the current frame.

There's no additional docs and tests, as well as lack of comments in
source code. The make check will fail because some of the window
functions do not have trans function.

Reviews and comments are welcome. I'm eager to refine it.

Regards,



-- 
Hitoshi Harada


window_function_v03_patch.tgz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-17 Thread H . Harada
2008/7/17 David Fetter [EMAIL PROTECTED]:
 On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:
 Hi,

 As I proposed a month before, I am working on window function.
 Although this work is at quite early step, I would like to introduce
 it since a part of it have been finished. If you can afford and are
 interested in it, please review the document and patch, or compile the
 applied source to execute an attached sample SQL.

 http://umitanuki.net/pgsql/wfv01/design.html

 Currently, only aggregation over window does work. I am planning to
 work for the combination of window and normal aggregation from now on,
 which I guess I can manage to do.

 The problem is, as written in the Things to discussed section of the
 document, how you define window functions (e.g. RANK()). My idea is to
 treat them as specialized functions such as SET OF functions and mark
 it in pg_proc. But this doesn't resolve RANK() boundary problem.

 I am so happy with any kind of comments, reviews or critiques.

 Regards,

 Many people have been waiting for years for this functionality.
 Thanks so much for doing this.

 When will you have another patch that applies against CVS HEAD?

 Cheers,
 David.
 --
 David Fetter [EMAIL PROTECTED] http://fetter.org/
 Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
 Skype: davidfetter  XMPP: [EMAIL PROTECTED]

 Remember to vote!
 Consider donating to Postgres: http://www.postgresql.org/about/donate


As you see the patch is not completed yet, lacking RANK() etc. I am
now struggling for this design which I assume will finish by the end
of this month. Since then, I will apply it against HEAD. So another
patch to -hackers will be around first half of August, or by the next
Commit Fest at worst.

There must be more discussion about the RANK() design after the post.
But I hope it will be in 8.4.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-06 Thread H . Harada
2008/7/6 Simon Riggs [EMAIL PROTECTED]:
 I think there are two types of functions for windowed mode.
 - windowed aggregate
 this type of function is exactly same as normal aggregate. So we use
 functions that have been in pgsql already. Actually in my patch above,
 I didn't introduce any new function. This type of function includes
 simply sum(), avg(), etc. which returns same values on a partition or
 a window frame.

 - windowed function
 this is the NEW type of function. I guess we should add a new function
 type to pgsql. This type of function includes rank(), rank_dense(),
 row_number(), etc. Windowed functions returns different values per
 tuple.

 The difference between two types is if the function returns the same
 value during a partition or different values.

 So, windowed aggregate and normal aggregate overlap each other. How
 you know which one is that you see OVER clause in SQL just after the
 function call. When you see OVER after func(), and pg_proc says it's
 an aggregate, it's a windowed aggregate. Otherwise, it's a windowed
 function.

 If I misunderstood about those definitions please correct me.

 Yes, I understand that and I think Martijn does also.

 I've done some thinking and rooting around on this and I think I have a
 different proposal for you, different to what we just discussed.

 SQL2008 specifies window functions as

 * rank functions
 * distribution functions: percent_rank() and cume_dist()
 * rownumber()
 * ntile()
 * lead() and lag()
 * first, last and n-th value functions
 * inverse distribution functions (similar to n-th value, based upon
 distribution function results)

 plus window aggregate functions (the normal aggregates COUNT, SUM etc)

 Now looking through all of those, I don't see *any* window functions
 that need access to different datatypes, or actually need to see the
 values of the attributes.

 The normal aggregates work with windows identically to the way they do
 without windows, so no change needed there.

 AFAICS we could define all of the non-aggregate window functions on the
 above list *without* defining them as functions in pg_proc. That would
 be a benefit because the window functions are very powerful and we'd
 need to give them access to any/all tuples in the window.

 So that would mean we don't provide a mechanism for user-defined
 windowed aggregate functions at all. Which solves the discussion about
 how to pass generic info through to them (at least long enough to get
 the first implementation done).

 We do already have such functions in code, e.g. greatest(). Sure they
 need to be defined in code, but we don't need to come up with a generic
 API for them.

 If you disagree, think about how we'd implement lag() or ntile() and
 what info we'd need to pass them.

Well, your idea is one of considerable choices. But I like pgsql's
extensibility that enables pgsql more powerful DBMS. So, I design it
as you propsed though trying to unify the function form somehow.

Just idea, how about pass window object to a function? We'll provide
window operation API then in the function you take window object
through fcinfo:

Datum func(PG_FUNCTION_ARGS)
{
  Datum v;
  WindowObject w = get_window(fcinfo);
  HeapTuple htup_current = window_current_row(w);
  HeapTuple htup_prev = window_preceding(w, 1);
  /* do something */
  PG_RETURN_DATUM(v);
}

so that a function access whole the window. APIs include
- current row
- preceding row
- following row
- current key
- preceding key
- following key
- iterate for the window
where key means ORDER BY values in OVER clause. Fortunately, my
patch uses tuplestore/tuplesort to create window, which allows random
access operation such above. Is there security/performance issue about
this?

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-06 Thread H . Harada
2008/7/6 Simon Riggs [EMAIL PROTECTED]:

 On Sun, 2008-07-06 at 17:39 +0900, H.Harada wrote:

 Is there security/performance issue about this?

 Performance, yes.

 If we need access to more rows than will fit within work_mem we have a
 problem and will need to restart sort. Giving random access to all
 tuples in the current window, whatever its size would be very costly,
 which is why we have optimized that access for merge joins. So we need
 to know how far back access is required, if any - think of that as an
 access window definition.

Is this about tuplesort, not tuplestore? At a glance, tuplestore seems
to be able to access tuples randomly without any performance problem,
just fitting its pointer. So I thought planner should always insert
Sort node before Window node to let Window not to sort, as I explained
in the document. But anyways, I understand some kind of optimization
mechanism to scroll in/out window is needed.

 I know I rattle on about performance, but with window functions it will
 be critical to their usability to have them perform well. We can already
 do the types of analysis that window functions allow, it just requires
 hand written procedures to do it. So the window functions must perform
 acceptably well against very large tables (i.e. much bigger than
 memory).

I know, the probable use case is against large data set. There's no
reason to add this feature if it is slower than self joins or other
kludge methods.


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] introduction of WIP window function patch

2008-07-05 Thread H . Harada
Hi,

As I proposed a month before, I am working on window function.
Although this work is at quite early step, I would like to introduce
it since a part of it have been finished. If you can afford and are
interested in it, please review the document and patch, or compile the
applied source to execute an attached sample SQL.

http://umitanuki.net/pgsql/wfv01/design.html

Currently, only aggregation over window does work. I am planning to
work for the combination of window and normal aggregation from now on,
which I guess I can manage to do.

The problem is, as written in the Things to discussed section of the
document, how you define window functions (e.g. RANK()). My idea is to
treat them as specialized functions such as SET OF functions and mark
it in pg_proc. But this doesn't resolve RANK() boundary problem.

I am so happy with any kind of comments, reviews or critiques.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread H . Harada
Hi,

2008/7/6 Simon Riggs [EMAIL PROTECTED]:

 On Sat, 2008-07-05 at 16:20 +0200, Martijn van Oosterhout wrote:
 On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:

  http://umitanuki.net/pgsql/wfv01/design.html
 
  The problem is, as written in the Things to discussed section of the
  document, how you define window functions (e.g. RANK()). My idea is to
  treat them as specialized functions such as SET OF functions and mark
  it in pg_proc. But this doesn't resolve RANK() boundary problem.

 Actually, I would make RANK() and ROW_NUMBER() act more like
 aggregates. ISTM you have two kinds of window functions:

 - aggregation: a result is calculated over a set and the result copied
   across all the rows.
 - order depenadant: same as above, but the result is different for each
   row.

 I think you could make the latter work using the current aggregation
 setup, just by calling the final_func for each row rather than just
 once at the end.

 AFAICS there's no overlap between windowed aggregates and normal
 aggregates, so we can different infrastructure for each. I like the
 suggestion of doing it very similarly to current aggregates, but I would
 introduce a new function hook for windowed aggregates, wfunc.

I think there are two types of functions for windowed mode.
- windowed aggregate
this type of function is exactly same as normal aggregate. So we use
functions that have been in pgsql already. Actually in my patch above,
I didn't introduce any new function. This type of function includes
simply sum(), avg(), etc. which returns same values on a partition or
a window frame.

- windowed function
this is the NEW type of function. I guess we should add a new function
type to pgsql. This type of function includes rank(), rank_dense(),
row_number(), etc. Windowed functions returns different values per
tuple.

The difference between two types is if the function returns the same
value during a partition or different values.

So, windowed aggregate and normal aggregate overlap each other. How
you know which one is that you see OVER clause in SQL just after the
function call. When you see OVER after func(), and pg_proc says it's
an aggregate, it's a windowed aggregate. Otherwise, it's a windowed
function.

If I misunderstood about those definitions please correct me.

 Denserank is fairly simple

 CREATE AGGREGATE denserank()
 (
sfunc = increment
stype = bigint
initcond = 0
 )

I think this is ROW_NUMBER(), not DENSE_RANK(), isn't it?

 rank() is fairly complex because the state data must track 3 things:
 * the number of tuples seen so far (bigint)
 * the value of the last tuple seen (anyelement)
 * the rank of the last tuple seen (bigint)

 sfunc would compare the new value with the last value, if they match
 then we return the rank of the last tuple. If they don't match then we
 set the stored value and rank, then return the number of tuples seen so
 far as the rank.

Yeah, I think RANK() needs to know some or all of tuple of current
row. But it seems not to take any argument, which is the boundary
problem I said. Definitely RANK() or windowed function should know
about current tuple so that when to increment rank. But how? As
explicit argument? or specialized method?


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread H . Harada
2008/7/5 Martijn van Oosterhout [EMAIL PROTECTED]:
 On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:
 Hi,

 As I proposed a month before, I am working on window function.

 Very nice!

 http://umitanuki.net/pgsql/wfv01/design.html

 The problem is, as written in the Things to discussed section of the
 document, how you define window functions (e.g. RANK()). My idea is to
 treat them as specialized functions such as SET OF functions and mark
 it in pg_proc. But this doesn't resolve RANK() boundary problem.

 Actually, I would make RANK() and ROW_NUMBER() act more like
 aggregates. ISTM you have two kinds of window functions:

 - aggregation: a result is calculated over a set and the result copied
  across all the rows.
 - order depenadant: same as above, but the result is different for each
  row.

So I agree the definition of these two types.

 I think you could make the latter work using the current aggregation
 setup, just by calling the final_func for each row rather than just
 once at the end.

How do you know which type of two above is used in the same SQL syntax?


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] proposal: add window function to 8.4

2008-06-09 Thread H . Harada
This topic has been discussed on this list and many user expect that
PostgreSQL implements it.
I'd like to work on this feature and hope that we can include it on 8.4.

Former discussions are here:

http://archives.postgresql.org/pgsql-hackers/2004-11/msg01093.php

http://archives.postgresql.org/pgsql-hackers/2007-01/msg00861.php


How it works and examples:
 SELECT dept, empno,
 RANK() OVER(PARTITION BY dept ORDER BY age) as age_rank,
 RANK() OVER(PARTITION BY dept ORDER BY salary) as salary_rank,
 SUM(salary) OVER(PARTITION BY dept ORDER BY age) as run_total
 FROM employees ORDER BY 1, 3, 4;

 dept empno age_rank salary_rank run_total
 ENG  2 12   4
 ENG  1 21   9
 QA   3 12   3
 QA   4 21   65000
 (ref.: http://www.gavinsherry.org/talks/window_talk.pdf)


My current idea and concept:
- add window function and treat it specially such like aggregate
function and setof function.
- some features may be dropped at the first release, considering to
support them later.
- to formulate and to let it work properly are primary, performance
optimization is secondary.


From my survey around web and list archive, the points are:
- what is window function rank(), rank_dense(), lead() and others?
  First of all, we should define the window function such like
aggregate function. In my opinion, there are two types of functions in
OVER () call context. One of them is aggregate, and the other is
window (ranking) function. Sum() in a query like

  SELECT empno, sum(salary) OVER (PARTITION BY depno) FROM empsalary;

 is obviously aggregate function. This type of function can be used as
it is now. Only executer will change its behavior.
 Current pgsql feature sets lack window function like rank(). This
type of function must 1) have a context such as SETOF functions, 2)
return values until executor says DONE, rather than function itself
says DONE as in SETOF function, and 3) know about other tuples
(mainly ORDER BY clause), as rank() doesn't take any arguments but
knows the ranking boundary.  I suppose that pgsql have new function
system for these types called window function.
 Once we can define window function, users have extensibility to this
type of function.

- do we really need FRAME clause?
 From my survey, Oracle and DB2 have FRAME clause

  SELECT empno, sum(salary) OVER (ORDER BY empno ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW) FROM empsalary;

 while MS SQL Server doesn't (note that the literal from ROWS to
CURRENT ROW is called FRAME clause). To implement FRAME clause is
much more complicated than only PARTITION and ORDER clause support
because we need random access to result tuples. Though we will be
expected to support FAME clause in long-term, for the first release it
might be able to be put off. Even though rank() doesn't support FRAME
clause (only PARTITION and ORDER) it is so useful, more than now at
least.

- execution order
 In SQL:2003, the execution order is defined as

 where - group by - having - (windowing) * N - order by (outer,
currently existing one)
 where windowing is
 partition by - order by - (framing) * N

 But Oracle seems that it has another order

 (windowing) * N - where - group by ... and so on.

 which is better for us? With Oracle's one you can say
  SELECT empno, rank() OVER (PARTITION BY depno ORDER BY saraly) AS
topsalary FROM empsalary
  WHERE topsaraly  3
 to get the top 3 people taking heighest salary. In the SQL standard,
you should the nest query.
 I insist the first (standard) one is better because we may want use
the result of normal aggregate in OVER clause.

- plan and node
 Currently in my mind the execution strategy could be:

 1. Where  GroupBy  Having
 |
 2. SortBy partitionClause, orderByClause
 |
 3. Window
   foreach partition:
 if not there_is_frame():
   aggvalue = null
   foreach row in partition:
 aggvalue = agg_trans_func(aggvalue)
   aggvalue = agg_final_func(aggvalue)

 foreach row in partition:
   if has frame clause:
 aggvalue = null
 frame = make_frame()
 foreach row_in_frame:
   aggvalue = aggregate_trans_func(aggvalue)
 aggvalue = aggregate_final_func(aggvalue)

   set aggvalue to row
   val = window_func()
   set val to row
   goto 2. if another window remained
 |
 4. SortBy ORDER BY clause (outer)  Limit
 |
 5. Output

 This pseudo code is quite simple and stupid. We may optimize it by
splitting tasks with MergeJoin, etc. or think about process 2. that
collect same PARTITION clauses to reduce sort operation. Or to use
Hash Agg to create PARTITION. But let's be stupid at first.
Optimization is secondary.


References:
 description by Stefan DeBloch
http://wwwdvs.informatik.uni-kl.de/courses/NEDM/WS0607/Vorlesungsunterlagen/NEDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf
 via Wikipedia[Select (SQL)] http://en.wikipedia.org/wiki/Select_(SQL)


 BTW, what about Bizgres