Re: [PERFORM] Performance of CLUSTER

2012-06-11 Thread Jeff Davis
On Mon, 2012-06-11 at 08:42 -0500, Shaun Thomas wrote: > On 06/10/2012 03:20 AM, Mark Thornton wrote: > > > 4. 3980922 rows, 1167MB, 276s > > 5. 31843368 rows, 9709MB, ~ 10 hours > > Just judging based on the difference between these two, it would appear > to be from a lot of temp space thrashin

Re: [PERFORM] Performance of CLUSTER

2012-06-11 Thread Shaun Thomas
On 06/11/2012 09:25 AM, Mark Thornton wrote: Certainly not --- the server only has 5GB of memory. Nevertheless I don't expect quadratic behaviour for CLUSTER (n log n would be my expected time). And there it is. :) Since that's the case, *DO NOT* create the symlink from pgsql_tmp to /dev/shm

Re: [PERFORM] Performance of CLUSTER

2012-06-11 Thread Shaun Thomas
On 06/11/2012 09:02 AM, Mark Thornton wrote: I didn't think the process was using even the 500m it ought to have had available, whereas creating an index did appear to use that much. Note though that I didn't stay up all night watching it! You'd be surprised. If you look in your base/pgsql_tmp

Re: [PERFORM] Performance of CLUSTER

2012-06-11 Thread Mark Thornton
On 11/06/12 14:52, Shaun Thomas wrote: On 06/11/2012 08:46 AM, Mark Thornton wrote: 500m --- though isn't clear if cluster uses maintenance memory or the regular work memory. I could readily use a higher value for maintenance_work_mem. For an operation like that, having a full GB wouldn't hur

Re: [PERFORM] Performance of CLUSTER

2012-06-11 Thread Shaun Thomas
On 06/10/2012 03:20 AM, Mark Thornton wrote: 4. 3980922 rows, 1167MB, 276s 5. 31843368 rows, 9709MB, ~ 10 hours Just judging based on the difference between these two, it would appear to be from a lot of temp space thrashing. An order of magnitude more rows shouldn't take over 100x longer to

[PERFORM] Performance of CLUSTER

2012-06-10 Thread Mark Thornton
What is the expected performance of cluster and what tuning parameters have most effect? I have a set of 5 tables with identical structure (all inherit a common table). The sizes given are total relation size. The clustering index is a gist index on a (non null) geographic(linestring) column