On Thu, Jun 20, 2013 at 9:13 PM, bricklen <brick...@gmail.com> wrote:

>
> On Thu, Jun 20, 2013 at 6:24 PM, Maciek Sakrejda <m.sakre...@gmail.com>wrote:
>
>> SELECT
>>   DISTINCT ON (type) ts, type, details
>> FROM
>>   observations
>> WHERE
>>   subject = '...'
>> ORDER BY
>>   type, ts DESC;
>>
>
> First thing: What is your "work_mem" set to, and how much RAM is in the
> machine? If you look at the plan, you'll immediately notice the "external
> merge Disk" line where it spills to disk on the sort. Try setting your
> work_mem to 120MB or so (depending on how much RAM you have, # concurrent
> sessions, complexity of queries etc)
>

Good call, thanks, although the in-mem quicksort is not much faster:


      QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=471248.30..489392.67 rows=3 width=47) (actual
time=32002.133..32817.474 rows=3 loops=1)
   Buffers: shared read=30264
   ->  Sort  (cost=471248.30..480320.48 rows=3628873 width=47) (actual
time=32002.128..32455.950 rows=3628803 loops=1)
         Sort Key: public.observations.type, public.observations.ts
         Sort Method: quicksort  Memory: 381805kB
         Buffers: shared read=30264
         ->  Result  (cost=0.00..75862.81 rows=3628873 width=47) (actual
time=0.026..1323.317 rows=3628803 loops=1)
               Buffers: shared read=30264
               ->  Append  (cost=0.00..75862.81 rows=3628873 width=47)
(actual time=0.026..978.477 rows=3628803 loops=1)
                     Buffers: shared read=30264
...

the machine is not nailed down, but I think I'd need to find a way to
drastically improve the plan to keep this in Postgres. The alternative is
probably caching the results somewhere else: for any given subject, I only
need the latest observation of each type 99.9+% of the time.

Reply via email to