On Sun, 2008-07-06 at 17:39 +0900, H.Harada wrote: > Is there security/performance issue about this?
Performance, yes. If we need access to more rows than will fit within work_mem we have a problem and will need to restart sort. Giving random access to all tuples in the current window, whatever its size would be very costly, which is why we have optimized that access for merge joins. So we need to know how far back access is required, if any - think of that as an "access window" definition. For example, rownumber() doesn't need access to prior tuples at all. lag(col, 1) requires access only to the prior row of the current window ntile() needs to know the size of the window before we begin processing In some cases the window itself is redefined for each tuple, e.g. avg() over (order by ... range between 5 preceeding and current row) In that case, we want the tuples no longer in the window to scroll out of memory. We already have the mechanism for this: a dynamic tuplestore (materialize node) in front of the tuplesort (sort node). Most of that tuning can be done after the initial implementation, but my point here is this: there needs to be a mechanism by which the window access requirements can be specified for a function so the executor can understand how to optimise access. So if you go the route of defining an extensible API then you must include this also. I know I rattle on about performance, but with window functions it will be critical to their usability to have them perform well. We can already do the types of analysis that window functions allow, it just requires hand written procedures to do it. So the window functions must perform acceptably well against very large tables (i.e. much bigger than memory). -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers