Re: [PERFORM] Any way to optimize GROUP BY queries?

2005-12-20 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Mon, Dec 19, 2005 at 03:47:35PM -0500, Greg Stark wrote:
>> Increase your work_mem (or sort_mem in older postgres versions). You can do
>> this for the server as a whole or just for this one session, and set it back
>> after this one query. You can increase it up until it starts causing swapping,
>> at which point it would be counterproductive.

> Just remember that work_mem is per-operation, so it's easy to push
> the box into swapping if the workload increases. You didn't say how much
> memory you have, but I'd be careful if work_mem * max_connections
> gets very much larger than your total memory.

It's considered good practice to have a relatively small default
work_mem setting (in postgresql.conf), and then let individual sessions
push up the value locally with "SET work_mem" if they are going to
execute queries that need it.  This works well as long as you only have
one or a few such "heavy" sessions at a time.
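
A minimal sketch of that pattern for one heavy session, using the query from the original post. The 256MB figure is only an illustrative assumption, not a recommendation. Releases from 8.2 on accept a unit suffix; older ones take a plain integer in kilobytes (e.g. SET work_mem = 262144).

```sql
-- Raise work_mem for this session only; postgresql.conf keeps its small default.
SET work_mem = '256MB';   -- assumed value, size it to your own RAM

SELECT adv, pub, web, country, date_trunc('hour', tiempo), sum(num)
FROM mytmp
GROUP BY adv, pub, web, country, date_trunc('hour', tiempo);

-- Return to the server default once the heavy query is done.
RESET work_mem;
```

Inside a transaction, SET LOCAL work_mem does the same thing but reverts automatically at COMMIT or ROLLBACK.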

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Any way to optimize GROUP BY queries?

2005-12-20 Thread Jim C. Nasby
On Mon, Dec 19, 2005 at 03:47:35PM -0500, Greg Stark wrote:
> Increase your work_mem (or sort_mem in older postgres versions). You can do
> this for the server as a whole or just for this one session, and set it back
> after this one query. You can increase it up until it starts causing swapping,
> at which point it would be counterproductive.

Just remember that work_mem is per-operation, so it's easy to push
the box into swapping if the workload increases. You didn't say how much
memory you have, but I'd be careful if work_mem * max_connections
gets very much larger than your total memory.
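
To put a rough number on that product, the two settings can be read straight out of pg_settings. This is only a sketch: it assumes one work_mem-sized operation per connection, and since a single query can run several sorts or hashes at once, the real ceiling can be higher.

```sql
-- work_mem is reported in kB in pg_settings, hence the * 1024.
SELECT pg_size_pretty(w.setting::bigint * 1024 * c.setting::bigint)
       AS work_mem_times_max_connections
FROM pg_settings AS w, pg_settings AS c
WHERE w.name = 'work_mem'
  AND c.name = 'max_connections';
```

Compare the result with the machine's physical RAM.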
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Any way to optimize GROUP BY queries?

2005-12-19 Thread Greg Stark
"Cristian Prieto" <[EMAIL PROTECTED]> writes:

> SELECT adv, pub, web, country, date_trunc('hour', tiempo), sum(num)
> FROM mytmp GROUP BY adv, pub, web, country, date_trunc('hour', tiempo)
> 
> I've tried to create indexes on different columns, but it seems that the
> GROUP BY clause doesn't use the index in any way.

If you had an index on <adv, pub, web, country, date_trunc('hour', tiempo)> then
it would be capable of using the index. However, it would choose not to unless
you forced it to: using the index would be slower.
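
For illustration, such an index could be declared as below. The index name is made up, and the enable_* settings are only a way to look at the plan the planner would otherwise reject in a test session, not something to leave switched off.

```sql
-- Expression index matching the GROUP BY list (hypothetical name).
CREATE INDEX idx_mytmp_group
    ON mytmp (adv, pub, web, country, date_trunc('hour', tiempo));

-- Discourage hash aggregation and explicit sorts so the index-based plan shows up.
SET enable_hashagg = off;
SET enable_sort = off;

EXPLAIN
SELECT adv, pub, web, country, date_trunc('hour', tiempo), sum(num)
FROM mytmp
GROUP BY adv, pub, web, country, date_trunc('hour', tiempo);

-- Put the planner settings back to their defaults.
RESET enable_hashagg;
RESET enable_sort;
```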

> Is there anything that can speed up this kind of GROUP BY clause?

Increase your work_mem (or sort_mem in older postgres versions). You can do
this for the server as a whole or just for this one session, and set it back
after this one query. You can increase it up until it starts causing swapping,
at which point it would be counterproductive.

If increasing work_mem doesn't allow a hash aggregate, or at least an in-memory
sort, to handle it, then putting the pgsql_tmp directory on a separate spindle
might help, if you have one available.
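
One way to check which case you are in is to look at the plan. A sketch, assuming the table from the original post: a HashAggregate node means the planner expects the groups to fit in work_mem, while a GroupAggregate on top of a Sort means it is sorting the rows first. Note that EXPLAIN ANALYZE actually runs the query.

```sql
EXPLAIN ANALYZE
SELECT adv, pub, web, country, date_trunc('hour', tiempo), sum(num)
FROM mytmp
GROUP BY adv, pub, web, country, date_trunc('hour', tiempo);
```

In later releases the Sort node in this output also reports a sort method, which shows whether the sort stayed in memory or spilled to disk.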

-- 
greg




[PERFORM] Any way to optimize GROUP BY queries?

2005-12-19 Thread Cristian Prieto
I have the following table:

CREATE TABLE mytmp (
    Adv     integer,
    Pub     integer,
    Web     integer,
    Tiempo  timestamp,
    Num     integer,
    Country varchar(2)
);

CREATE INDEX idx_mytmp ON mytmp(adv, pub, web);

And with 16M rows this query:

SELECT adv, pub, web, country, date_trunc('hour', tiempo), sum(num)
FROM mytmp GROUP BY adv, pub, web, country, date_trunc('hour', tiempo)

I've tried to create indexes on different columns, but it seems that the
GROUP BY clause doesn't use the index in any way.

Is there anything that can speed up this kind of GROUP BY clause?

Thanks a lot...