On Thu, Jun 20, 2013 at 9:13 PM, bricklen <brick...@gmail.com> wrote:
> > On Thu, Jun 20, 2013 at 6:24 PM, Maciek Sakrejda <m.sakre...@gmail.com>wrote: > >> SELECT >> DISTINCT ON (type) ts, type, details >> FROM >> observations >> WHERE >> subject = '...' >> ORDER BY >> type, ts DESC; >> > > First thing: What is your "work_mem" set to, and how much RAM is in the > machine? If you look at the plan, you'll immediately notice the "external > merge Disk" line where it spills to disk on the sort. Try setting your > work_mem to 120MB or so (depending on how much RAM you have, # concurrent > sessions, complexity of queries etc) > Good call, thanks, although the in-mem quicksort is not much faster: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Unique (cost=471248.30..489392.67 rows=3 width=47) (actual time=32002.133..32817.474 rows=3 loops=1) Buffers: shared read=30264 -> Sort (cost=471248.30..480320.48 rows=3628873 width=47) (actual time=32002.128..32455.950 rows=3628803 loops=1) Sort Key: public.observations.type, public.observations.ts Sort Method: quicksort Memory: 381805kB Buffers: shared read=30264 -> Result (cost=0.00..75862.81 rows=3628873 width=47) (actual time=0.026..1323.317 rows=3628803 loops=1) Buffers: shared read=30264 -> Append (cost=0.00..75862.81 rows=3628873 width=47) (actual time=0.026..978.477 rows=3628803 loops=1) Buffers: shared read=30264 ... the machine is not nailed down, but I think I'd need to find a way to drastically improve the plan to keep this in Postgres. The alternative is probably caching the results somewhere else: for any given subject, I only need the latest observation of each type 99.9+% of the time.