Christopher Kings-Lynne wrote:
>
>I reckon that sort_mem is the hardest thing to optimise!
>

Agreed... in part because it depends a lot on the query.

Also, if I understand correctly, sort_mem affects not only sorts
but hash tables as well, right?  If that's true for the new hash
aggregates, I think large sort_mem settings will become even more
useful for data-warehouse-type applications.
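
For example (table and column names here are invented, just for
illustration), an aggregate like this builds an in-memory hash
table with one entry per group, and as far as I know the planner
will only pick the hash plan when its estimate fits in sort_mem:

  -- invented table/column names, just for illustration
  SELECT customer_id, sum(amount)
    FROM sales_fact
   GROUP BY customer_id;
  -- the hash-aggregate plan keeps one hash entry per group in
  -- memory, so it only wins when the group table fits in sort_mem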


One thing I've been wondering is whether "sort_mem" could be
per connection/backend-process instead of per sort operation,
so that sort_mem could be set more aggressively without running
out of memory so easily on large queries.

If that's not possible (i.e. one can't know beforehand how many
simultaneous sorts a query will need), how about letting only the
first one or two get the full amount and making the rest use a
smaller allocation?  The sketch below shows why this matters.
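
To illustrate the concern (invented tables, and the plan shape is
of course up to the optimizer): a single query can contain several
sort nodes, each entitled to its own sort_mem:

  -- invented tables; a merge join may sort both of its inputs,
  -- and the ORDER BY adds a third sort on the join result, so
  -- one backend could grab roughly 3 x sort_mem for this query
  SELECT a.id, b.val
    FROM big_a a
    JOIN big_b b ON a.id = b.id
   ORDER BY b.val;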


Anecdote:
I have a reasonably large data warehouse (1e6 to 1e8 rows
in various tables) with quite a bit of data (500k rows) added
each day.  A lot of processing time (7 hours/day) is spent
loading data and generating various aggregates.  In a couple of
places in the ETL part of the data warehouse code I have:

  set sort_mem = 250000;
  -- something that only needs a single sort
  set sort_mem = 65536;
  ...
  set sort_mem = 4096;
  -- some ugly aggregate-creating join generated by a reporting tool
  set sort_mem = 65536;
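
A related trick, assuming you're on a version with SET LOCAL
(7.3 and up, if I remember right): scope the override to a single
transaction so it reverts by itself and you can't forget to
restore the old value:

  BEGIN;
  SET LOCAL sort_mem = 250000;  -- reverts on COMMIT or ROLLBACK
  -- the one query that needs the big sort goes here
  COMMIT;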

Ron
