Re: [GENERAL] Cumulative count (running total) window fn
On 29 Apr 2010, at 19:21, Oliver Kohll - Mailing Lists wrote: > The two plans (note I've been rewriting the field names for readability until > now but haven't here): > > explain analyze SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from a56b7a8d6de03f67b) AS year, > count(a10e4ab8863c199f1) AS c > FROM a2e9a7e9e257153de > GROUP BY extract(year from a56b7a8d6de03f67b) > ) as subq; Oh my, how can you work with such column and table names? You and any colleagues you may have will probably appreciate having a few views over those tables that translate that gibberish to human readable stuff. You could go further and make those views updatable (by means of a few rules), but then you run the risk that colleagues start to hug you... Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. !DSPAM:737,4bdaabce10411378620886! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cumulative count (running total) window fn
> > Curious note - how does the non-subselect version and the subselect > version compare performance-wise? Magnus, On a test table with 12,000 rows there's not much in it, the subselect has a simpler plan but they both take practically the same time. The two plans (note I've been rewriting the field names for readability until now but haven't here): explain analyze SELECT year, sum(c) over (order by year) FROM ( SELECT extract(year from a56b7a8d6de03f67b) AS year, count(a10e4ab8863c199f1) AS c FROM a2e9a7e9e257153de GROUP BY extract(year from a56b7a8d6de03f67b) ) as subq; QUERY PLAN - WindowAgg (cost=851.49..874.06 rows=1290 width=16) (actual time=43.369..43.394 rows=5 loops=1) -> Sort (cost=851.49..854.71 rows=1290 width=16) (actual time=43.340..43.342 rows=5 loops=1) Sort Key: (date_part('year'::text, a2e9a7e9e257153de.a56b7a8d6de03f67b)) Sort Method: quicksort Memory: 25kB -> HashAggregate (cost=752.59..771.94 rows=1290 width=26) (actual time=43.300..43.317 rows=5 loops=1) -> Seq Scan on a2e9a7e9e257153de (cost=0.00..689.56 rows=12605 width=26) (actual time=0.031..26.723 rows=12605 loops=1) Total runtime: 43.549 ms explain analyze SELECT extract(year from a56b7a8d6de03f67b), count(a10e4ab8863c199f1), sum(count(a10e4ab8863c199f1)) OVER (ORDER BY count(a10e4ab8863c199f1)) FROM a2e9a7e9e257153de GROUP BY 1 ORDER BY 1; QUERY PLAN --- Sort (cost=1382.39..1388.52 rows=2451 width=32) (actual time=44.229..44.230 rows=5 loops=1) Sort Key: (date_part('year'::text, a56b7a8d6de03f67b)) Sort Method: quicksort Memory: 25kB -> WindowAgg (cost=1195.39..1244.41 rows=2451 width=32) (actual time=44.171..44.208 rows=5 loops=1) -> Sort (cost=1195.39..1201.52 rows=2451 width=32) (actual time=44.125..44.127 rows=5 loops=1) Sort Key: (count(a10e4ab8863c199f1)) Sort Method: quicksort Memory: 25kB -> HashAggregate (cost=1014.52..1057.41 rows=2451 width=32) (actual time=44.071..44.099 rows=5 loops=1) -> Seq Scan on a2e9a7e9e257153de (cost=0.00..833.58 rows=24126 width=32) (actual time=0.032..26.683 rows=12605 loops=1) Total runtime: 44.396 ms Regards Oliver Kohll oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll www.agilebase.co.uk - software www.gtwm.co.uk - company
Re: [GENERAL] Cumulative count (running total) window fn
On Thu, Apr 29, 2010 at 13:43, Oliver Kohll - Mailing Lists wrote: >> >> Aren't you looking for something along the line of: >> >> SELECT year, sum(c) over (order by year) >> FROM ( >> SELECT extract(year from signup_date) AS year, count(email_address) AS c >> FROM email_list >> GROUP BY extract(year from signup_date) >> ) >> >> (adjust for typos, I didn't test it) >> >> Yes that does work thanks, if you give the subquery a name. I'd still like >> to know if it's possible to do with a window function rather than a >> subquery. >> Oliver Kohll > > Like this?: > > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM > email_list GROUP BY 1 ORDER BY 1; > > Thom > > Almost, but put me on the right track! This one is exactly what I'm looking > for: > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) > FROM email_list GROUP BY 1 ORDER BY 1; > The ORDER BY count(email_address) did give the same results for my data but > only because the count values just happen to give the same ordering as the > years - I tested by changing some dates. > Many thanks all. Curious note - how does the non-subselect version and the subselect version compare performance-wise? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cumulative count (running total) window fn
>> >> Aren't you looking for something along the line of: >> >> SELECT year, sum(c) over (order by year) >> FROM ( >> SELECT extract(year from signup_date) AS year, count(email_address) AS c >> FROM email_list >> GROUP BY extract(year from signup_date) >> ) >> >> (adjust for typos, I didn't test it) > > Yes that does work thanks, if you give the subquery a name. I'd still like to > know if it's possible to do with a window function rather than a subquery. > > Oliver Kohll > > > Like this?: > > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM > email_list GROUP BY 1 ORDER BY 1; > > Thom Almost, but put me on the right track! This one is exactly what I'm looking for: SELECT extract(year from signup_date), count(email_address), sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) FROM email_list GROUP BY 1 ORDER BY 1; The ORDER BY count(email_address) did give the same results for my data but only because the count values just happen to give the same ordering as the years - I tested by changing some dates. Many thanks all. Oliver
Re: [GENERAL] Cumulative count (running total) window fn
On 29 April 2010 11:39, Oliver Kohll - Mailing Lists < oliver.li...@gtwm.co.uk> wrote: > > On 29 Apr 2010, at 10:01, Magnus Hagander wrote: > > > select extract(year from signup_date), > > count(email_address), > > sum(count(email_address)) over (partition by 1 order by 1 asc rows > unbounded preceding) > > from email_list group by 1 order by 1; > > > Does anyone have any other ideas? > > > Aren't you looking for something along the line of: > > SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from signup_date) AS year, count(email_address) AS c > FROM email_list > GROUP BY extract(year from signup_date) > ) > > (adjust for typos, I didn't test it) > > > Yes that does work thanks, if you give the subquery a name. I'd still like > to know if it's possible to do with a window function rather than a > subquery. > > Oliver Kohll > Like this?: SELECT extract(year from signup_date), count(email_address), sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM email_list GROUP BY 1 ORDER BY 1; Thom
Re: [GENERAL] Cumulative count (running total) window fn
On 29 Apr 2010, at 10:01, Magnus Hagander wrote: >> >> select extract(year from signup_date), >> count(email_address), >> sum(count(email_address)) over (partition by 1 order by 1 asc rows >> unbounded preceding) >> from email_list group by 1 order by 1; >> >> Does anyone have any other ideas? > > Aren't you looking for something along the line of: > > SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from signup_date) AS year, count(email_address) AS c > FROM email_list > GROUP BY extract(year from signup_date) > ) > > (adjust for typos, I didn't test it) Yes that does work thanks, if you give the subquery a name. I'd still like to know if it's possible to do with a window function rather than a subquery. Oliver Kohll
Re: [GENERAL] Cumulative count (running total) window fn
On Thu, Apr 29, 2010 at 10:52, Oliver Kohll - Mailing Lists wrote: > Hello, > > Many thanks to andreas.kretschmer for this helpful reply about how to set up > a window function to perform a running total: > http://archives.postgresql.org/pgsql-general/2010-03/msg01122.php > > It works perfectly with the simple test data but I've just got back to work, > tried implementing it on my live data and the results are slightly different. > My query is almost exactly the same - I've simplified by grouping by year > only rather than year and month: > > select extract(year from signup_date), > count(email_address), > sum(count(email_address)) over (rows unbounded preceding) > from email_list group by 1 order by 1; > > date_part | count | sum > ---+---+-- > 2007 | 501 | 1374 > 2008 | 491 | 491 > 2009 | 382 | 873 > 2010 | 66 | 1440 > (4 rows) > > What I'm looking for is > date_part | count | sum > ---+---+-- > 2007 | 501 | 501 > 2008 | 491 | 992 > 2009 | 382 | 1374 > 2010 | 66 | 1440 > > It seems to be adding up the counts but not in the right order. > > I've also tried an explicit ORDER BY inside the partition with no difference: > > select extract(year from signup_date), > count(email_address), > sum(count(email_address)) over (partition by 1 order by 1 asc rows unbounded > preceding) > from email_list group by 1 order by 1; > > Does anyone have any other ideas? Aren't you looking for something along the line of: SELECT year, sum(c) over (order by year) FROM ( SELECT extract(year from signup_date) AS year, count(email_address) AS c FROM email_list GROUP BY extract(year from signup_date) ) (adjust for typos, I didn't test it) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general