Re: [PERFORM] Extracting superlatives - SQL design philosophy

Mose Wed, 24 Feb 2010 13:50:53 -0800

Can you try using window functions?

Something like this:


select distinct
  city,
  first_value(temp) over w as max_temp,
  first_value(date) over w as max_temp_date
 from
  cities
  window w as (partition by city order by temp desc)

http://www.postgresql.org/docs/current/static/tutorial-window.html

- Mose

On Wed, Feb 24, 2010 at 1:31 PM, Dave Crooke <[email protected]> wrote:

> This is a generic SQL issue and not PG specific, but I'd like to get
> an opinion from this list.
>
> Consider the following data:
>
> # \d bar
>                 Table "public.bar"
>  Column |            Type             | Modifiers
> --------+-----------------------------+-----------
>  city   | character varying(255)      |
>  temp   | integer                     |
>  date   | timestamp without time zone |
>
> # select * from bar order by city, date;
>   city    | temp |        date
> -----------+------+---------------------
>  Austin    |   75 | 2010-02-21 15:00:00
>  Austin    |   35 | 2010-02-23 15:00:00
>  Edinburgh |   42 | 2010-02-23 15:00:00
>  New York  |   56 | 2010-02-23 15:00:00
>  New York  |   78 | 2010-06-23 15:00:00
> (5 rows)
>
> If you want the highest recorded temperature for a city, that's easy
> to do, since the selection criteria works on the same column that we
> are extracing:
>
> # select city, max(temp) from bar group by city order by 1;
>   city    | max
> -----------+-----
>  Austin    |  75
>  Edinburgh |  42
>  New York  |  78
> (3 rows)
>
>
> However there is (AFAIK) no simple way in plain SQL to write a query
> that performs such an aggregation where the aggregation criteria is on
> one column and you want to return another, e.g. adding the the *date
> of* that highest temperature to the output above, or doing a query to
> get the most recent temperature reading for each city.
>
> What I'd like to do is something like the below (and I'm inventing
> mock syntax here, the following is not valid SQL):
>
> -- Ugly implicit syntax but no worse than an Oracle outer join ;-)
> select city, temp, date from bar where date=max(date) group by city,
> temp order by city;
>
> or perhaps
>
> -- More explicit
> select aggregate_using(max(date), city, temp, date) from bar group by
> city, temp order by city;
>
> Both of the above, if they existed, would be a single data access
> followed by and sort-merge.
>
> The only way I know how to do it involves doing two accesses to the data,
> e.g.
>
> # select city, temp, date from bar a where date=(select max(b.date)
> from bar b where a.city=b.city) order by 1;
>   city    | temp |        date
> -----------+------+---------------------
>  Austin    |   35 | 2010-02-23 15:00:00
>  Edinburgh |   42 | 2010-02-23 15:00:00
>  New York  |   78 | 2010-06-23 15:00:00
> (3 rows)
>
>
> # explain select * from bar a where date=(select max(b.date) from bar
> b where a.city=b.city) order by 1;
>                                QUERY PLAN
> --------------------------------------------------------------------------
>  Sort  (cost=1658.86..1658.87 rows=1 width=528)
>   Sort Key: a.city
>   ->  Seq Scan on bar a  (cost=0.00..1658.85 rows=1 width=528)
>         Filter: (date = (subplan))
>         SubPlan
>           ->  Aggregate  (cost=11.76..11.77 rows=1 width=8)
>                 ->  Seq Scan on bar b  (cost=0.00..11.75 rows=1
> width=8)     -- would be an index lookup in a real scenario
>                       Filter: (($0)::text = (city)::text)
> (8 rows)
>
> --
> Sent via pgsql-performance mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

Re: [PERFORM] Extracting superlatives - SQL design philosophy

Reply via email to