[GENERAL] Clustering with enough work_mem: copy heap in mem?

2009-08-20 Thread Scara Maccai
Hi, I have a table with 15M rows. Table is around 5GB on disk. Clustering the table takes 5 minutes. A seq scan takes 20 seconds. I guess clustering is done using a seq scan on the index and then fetching the proper rows in the heap. If that's the case, fetching random rows on disk is the

Re: [GENERAL] Clustering with enough work_mem: copy heap in mem?

2009-08-20 Thread Scott Marlowe
On Thu, Aug 20, 2009 at 8:28 AM, Scara Maccaim_li...@yahoo.it wrote: Hi, I have a table with 15M rows. Table is around 5GB on disk. Clustering the table takes 5 minutes. A seq scan takes 20 seconds. I guess clustering is done using a seq scan on the index and then fetching the proper

Re: [GENERAL] Clustering with enough work_mem: copy heap in mem?

2009-08-20 Thread Scara Maccai
I've found it easier to select everything into another table, truncate the original table, then insert the rows as: that takes 50 seconds of pure sorting and 8GB of ram to sort; my method doesn't require more memory than the size of the heap table, and no sorting, since the index is already

Re: [GENERAL] Clustering with enough work_mem: copy heap in mem?

2009-08-20 Thread Alvaro Herrera
Scara Maccai wrote: I mean: there's access exclusive lock on the table while clustering, so I don't see any problem in doing it... this way you could - avoid sorting (which is what is used in the method create newtable as select * from oldtable order by mycol, and can be slow with 15M

Re: [GENERAL] Clustering with enough work_mem: copy heap in mem?

2009-08-20 Thread Scara Maccai
There was an attempt to fix it (for example so that it could try to do a seqscan+sort instead of indexscan), but it stalled. Actually I read that, but it's complicated... it involves planning and a lot of other stuff I don't even know about... My solution I guess would be easier (but, of