> It sounds like the costing model might need a bit more work before we
> commit this.

I tried the simple SQL tests I posted a while ago again, and I still get
the same ratios. I've tested with the patch applied on a dual-Opteron
Solaris machine with a disk array. I really don't see how a laptop hard
drive can be faster at reading data using random seeks (which the
original cluster method requires) than at a seq scan + sort for the
5M-row test case. Same thing for the "cluster vs bloat" test: the seq
scan + sort is faster on my machine.

I've just noticed that Josh used shared_buffers = 16MB for the "cluster
vs bloat" test. I'm using a much higher shared_buffers (around 200MB, I
think), since with tables this big I thought a larger value would be
more appropriate. Maybe that's what makes the difference? Can someone
else test the patch?

Also: I don't have deep knowledge of how PostgreSQL deletes rows, but I
thought that something like

  DELETE FROM mybloat WHERE RANDOM() < 0.9;

would only delete the table data, not the index entries; so the patch
should perform even better in this case (as it does, in fact, on my test
machine), because:

- the original cluster method would read the whole index and fetch only
  the "still alive" rows
- the new method would read the table using a seq scan and sort the few
  surviving rows in memory

But, as I said, maybe I'm getting this part wrong...
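
For anyone who wants to try the patch, here's roughly the shape of the
"cluster vs bloat" test. This is only a sketch: the exact schema and row
counts are in my earlier post, and the filler column and index name here
are made up for illustration.

  -- Build a 5M-row table with an index to cluster on.
  CREATE TABLE mybloat (id integer, filler text);
  INSERT INTO mybloat
      SELECT i, repeat('x', 100)
      FROM generate_series(1, 5000000) AS i;
  CREATE INDEX mybloat_id_idx ON mybloat (id);
  ANALYZE mybloat;

  -- Bloat the table: kill ~90% of the rows. The deleted heap tuples
  -- become dead, but their index entries stay until VACUUM (or a
  -- rebuild) removes them.
  DELETE FROM mybloat WHERE RANDOM() < 0.9;

  -- Time this (e.g. with \timing in psql), patched vs. unpatched:
  CLUSTER mybloat USING mybloat_id_idx;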
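
And here's a quick way to check the "indexes keep their entries" part,
in case I'm getting it wrong; pg_relation_size() is in core, and the
actual numbers will of course vary:

  -- Right after the DELETE the index should be the same size as
  -- before, since entries pointing at dead tuples are only reclaimed
  -- by VACUUM (or by CLUSTER itself, which rebuilds everything).
  SELECT pg_size_pretty(pg_relation_size('mybloat_id_idx'));
  DELETE FROM mybloat WHERE RANDOM() < 0.9;
  SELECT pg_size_pretty(pg_relation_size('mybloat_id_idx'));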
I tried again the simple sql tests I posted a while ago, and I still get the same ratios. I've tested the applied patch on a dual opteron + disk array Solaris machine. I really don't get how a laptop hard drive can be faster at reading data using random seeks (required by the original cluster method) than seq scan + sort for the 5M rows test case. Same thing for the "cluster vs bloat" test: the seq scan + sort is faster on my machine. I've just noticed that Josh used shared_buffers = 16MB for the "cluster vs bloat" test: I'm using a much higher shared_buffers (I think something like 200MB), since if you're working with tables this big I thought it could be a more appropriate value. Maybe that's the thing that makes the difference??? Can someone else test the patch? And: I don't have that deep knowledge of how postgresql deletes rows; but I thought that something like: DELETE FROM mybloat WHERE RANDOM() < 0.9; would only delete data, not indexes; so the patch should perform even better in this case (as it does, in fact, on my test machine), as: - the original cluster method would read the whole index, and fetch only the "still alive" rows - the new method would read the table using a seq scan, and sort in memory the few rows still alive But, as I said, maybe I'm getting this part wrong... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers