Re: [GENERAL] Cumulative count (running total) window fn

2010-04-30 Thread Alban Hertroys
On 29 Apr 2010, at 19:21, Oliver Kohll - Mailing Lists wrote:

> The two plans (note I've been rewriting the field names for readability until 
> now but haven't here):
> 
> explain analyze SELECT year, sum(c) over (order by year)
> FROM (  
>   SELECT extract(year from a56b7a8d6de03f67b) AS year, 
> count(a10e4ab8863c199f1) AS c
>   FROM a2e9a7e9e257153de
>   GROUP BY extract(year from a56b7a8d6de03f67b)
> ) as subq;

Oh my, how can you work with such column and table names? You and any 
colleagues you may have will probably appreciate having a few views over those 
tables that translate that gibberish to human readable stuff. You could go 
further and make those views updatable (by means of a few rules), but then you 
run the risk that colleagues start to hug you...

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


!DSPAM:737,4bdaabce10411378620886!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cumulative count (running total) window fn

2010-04-29 Thread Oliver Kohll - Mailing Lists
> 
> Curious note - how does the non-subselect version and the subselect
> version compare performance-wise?

Magnus,

On a test table with 12,000 rows there's not much in it, the subselect has a 
simpler plan but they both take practically the same time.

The two plans (note I've been rewriting the field names for readability until 
now but haven't here):

explain analyze SELECT year, sum(c) over (order by year)
FROM (  
  SELECT extract(year from a56b7a8d6de03f67b) AS year, count(a10e4ab8863c199f1) 
AS c
  FROM a2e9a7e9e257153de
  GROUP BY extract(year from a56b7a8d6de03f67b)
) as subq;
   QUERY PLAN   
 
-
 WindowAgg  (cost=851.49..874.06 rows=1290 width=16) (actual 
time=43.369..43.394 rows=5 loops=1)
   ->  Sort  (cost=851.49..854.71 rows=1290 width=16) (actual 
time=43.340..43.342 rows=5 loops=1)
 Sort Key: (date_part('year'::text, 
a2e9a7e9e257153de.a56b7a8d6de03f67b))
 Sort Method:  quicksort  Memory: 25kB
 ->  HashAggregate  (cost=752.59..771.94 rows=1290 width=26) (actual 
time=43.300..43.317 rows=5 loops=1)
   ->  Seq Scan on a2e9a7e9e257153de  (cost=0.00..689.56 rows=12605 
width=26) (actual time=0.031..26.723 rows=12605 loops=1)
 Total runtime: 43.549 ms

explain analyze SELECT extract(year from a56b7a8d6de03f67b), 
count(a10e4ab8863c199f1), sum(count(a10e4ab8863c199f1)) OVER (ORDER BY 
count(a10e4ab8863c199f1)) FROM a2e9a7e9e257153de GROUP BY 1 ORDER BY 1;
  QUERY PLAN
   
---
 Sort  (cost=1382.39..1388.52 rows=2451 width=32) (actual time=44.229..44.230 
rows=5 loops=1)
   Sort Key: (date_part('year'::text, a56b7a8d6de03f67b))
   Sort Method:  quicksort  Memory: 25kB
   ->  WindowAgg  (cost=1195.39..1244.41 rows=2451 width=32) (actual 
time=44.171..44.208 rows=5 loops=1)
 ->  Sort  (cost=1195.39..1201.52 rows=2451 width=32) (actual 
time=44.125..44.127 rows=5 loops=1)
   Sort Key: (count(a10e4ab8863c199f1))
   Sort Method:  quicksort  Memory: 25kB
   ->  HashAggregate  (cost=1014.52..1057.41 rows=2451 width=32) 
(actual time=44.071..44.099 rows=5 loops=1)
 ->  Seq Scan on a2e9a7e9e257153de  (cost=0.00..833.58 
rows=24126 width=32) (actual time=0.032..26.683 rows=12605 loops=1)
 Total runtime: 44.396 ms

Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
www.agilebase.co.uk - software
www.gtwm.co.uk - company





Re: [GENERAL] Cumulative count (running total) window fn

2010-04-29 Thread Magnus Hagander
On Thu, Apr 29, 2010 at 13:43, Oliver Kohll - Mailing Lists
 wrote:
>>
>> Aren't you looking for something along the line of:
>>
>> SELECT year, sum(c) over (order by year)
>> FROM (
>>  SELECT extract(year from signup_date) AS year, count(email_address) AS c
>>  FROM email_list
>>  GROUP BY extract(year from signup_date)
>> )
>>
>> (adjust for typos, I didn't test it)
>>
>> Yes that does work thanks, if you give the subquery a name. I'd still like
>> to know if it's possible to do with a window function rather than a
>> subquery.
>> Oliver Kohll
>
> Like this?:
>
> SELECT extract(year from signup_date), count(email_address),
> sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM
> email_list GROUP BY 1 ORDER BY 1;
>
> Thom
>
> Almost, but put me on the right track! This one is exactly what I'm looking
> for:
> SELECT extract(year from signup_date), count(email_address),
> sum(count(email_address)) OVER (ORDER BY extract(year from signup_date))
> FROM email_list GROUP BY 1 ORDER BY 1;
> The ORDER BY count(email_address) did give the same results for my data but
> only because the count values just happen to give the same ordering as the
> years - I tested by changing some dates.
> Many thanks all.

Curious note - how does the non-subselect version and the subselect
version compare performance-wise?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cumulative count (running total) window fn

2010-04-29 Thread Oliver Kohll - Mailing Lists
>> 
>> Aren't you looking for something along the line of:
>> 
>> SELECT year, sum(c) over (order by year)
>> FROM (
>>  SELECT extract(year from signup_date) AS year, count(email_address) AS c
>>  FROM email_list
>>  GROUP BY extract(year from signup_date)
>> )
>> 
>> (adjust for typos, I didn't test it)
> 
> Yes that does work thanks, if you give the subquery a name. I'd still like to 
> know if it's possible to do with a window function rather than a subquery.
> 
> Oliver Kohll
> 
> 
> Like this?:
> 
> SELECT extract(year from signup_date), count(email_address), 
> sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM 
> email_list GROUP BY 1 ORDER BY 1;
> 
> Thom

Almost, but put me on the right track! This one is exactly what I'm looking for:

SELECT extract(year from signup_date), count(email_address), 
sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) FROM 
email_list GROUP BY 1 ORDER BY 1;

The ORDER BY count(email_address) did give the same results for my data but 
only because the count values just happen to give the same ordering as the 
years - I tested by changing some dates.

Many thanks all.
Oliver

Re: [GENERAL] Cumulative count (running total) window fn

2010-04-29 Thread Thom Brown
On 29 April 2010 11:39, Oliver Kohll - Mailing Lists <
oliver.li...@gtwm.co.uk> wrote:

>
> On 29 Apr 2010, at 10:01, Magnus Hagander wrote:
>
>
> select extract(year from signup_date),
>
>  count(email_address),
>
>  sum(count(email_address)) over (partition by 1 order by 1 asc rows
> unbounded preceding)
>
> from email_list group by 1 order by 1;
>
>
> Does anyone have any other ideas?
>
>
> Aren't you looking for something along the line of:
>
> SELECT year, sum(c) over (order by year)
> FROM (
>  SELECT extract(year from signup_date) AS year, count(email_address) AS c
>  FROM email_list
>  GROUP BY extract(year from signup_date)
> )
>
> (adjust for typos, I didn't test it)
>
>
> Yes that does work thanks, if you give the subquery a name. I'd still like
> to know if it's possible to do with a window function rather than a
> subquery.
>
> Oliver Kohll
>


Like this?:

SELECT extract(year from signup_date), count(email_address),
sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM
email_list GROUP BY 1 ORDER BY 1;

Thom


Re: [GENERAL] Cumulative count (running total) window fn

2010-04-29 Thread Oliver Kohll - Mailing Lists

On 29 Apr 2010, at 10:01, Magnus Hagander wrote:
>> 
>> select extract(year from signup_date),
>>  count(email_address),
>>  sum(count(email_address)) over (partition by 1 order by 1 asc rows 
>> unbounded preceding)
>> from email_list group by 1 order by 1;
>> 
>> Does anyone have any other ideas?
> 
> Aren't you looking for something along the line of:
> 
> SELECT year, sum(c) over (order by year)
> FROM (
>  SELECT extract(year from signup_date) AS year, count(email_address) AS c
>  FROM email_list
>  GROUP BY extract(year from signup_date)
> )
> 
> (adjust for typos, I didn't test it)

Yes that does work thanks, if you give the subquery a name. I'd still like to 
know if it's possible to do with a window function rather than a subquery.

Oliver Kohll

Re: [GENERAL] Cumulative count (running total) window fn

2010-04-29 Thread Magnus Hagander
On Thu, Apr 29, 2010 at 10:52, Oliver Kohll - Mailing Lists
 wrote:
> Hello,
>
> Many thanks to andreas.kretschmer for this helpful reply about how to set up 
> a window function to perform a running total:
> http://archives.postgresql.org/pgsql-general/2010-03/msg01122.php
>
> It works perfectly with the simple test data but I've just got back to work, 
> tried implementing it on my live data and the results are slightly different. 
> My query is almost exactly the same - I've simplified by grouping by year 
> only rather than year and month:
>
> select extract(year from signup_date),
>  count(email_address),
>  sum(count(email_address)) over (rows unbounded preceding)
> from email_list group by 1 order by 1;
>
>  date_part | count | sum
> ---+---+--
>      2007 |   501 | 1374
>      2008 |   491 |  491
>      2009 |   382 |  873
>      2010 |    66 | 1440
> (4 rows)
>
> What I'm looking for is
>  date_part | count | sum
> ---+---+--
>      2007 |   501 | 501
>      2008 |   491 |  992
>      2009 |   382 |  1374
>      2010 |    66 | 1440
>
> It seems to be adding up the counts but not in the right order.
>
> I've also tried an explicit ORDER BY inside the partition with no difference:
>
> select extract(year from signup_date),
>  count(email_address),
>  sum(count(email_address)) over (partition by 1 order by 1 asc rows unbounded 
> preceding)
> from email_list group by 1 order by 1;
>
> Does anyone have any other ideas?

Aren't you looking for something along the line of:

SELECT year, sum(c) over (order by year)
FROM (
  SELECT extract(year from signup_date) AS year, count(email_address) AS c
  FROM email_list
  GROUP BY extract(year from signup_date)
)

(adjust for typos, I didn't test it)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general