Robert Haas <robertmh...@gmail.com> writes:
> The problem here, as I see it, is that we're flying blind. If there's
> just one spindle, I think it's got to be right to read the relation
> sequentially. But if there are multiple spindles, it might not be,
> but it seems hard to predict what we should do. We don't know what
> the RAID chunk size is or how many spindles there are, so any guess as
> to how to chunk up the relation and divide up the work between workers
> is just a shot in the dark.

I thought the proposal to chunk on the basis of "each worker processes
one 1GB-sized segment" should work all right. The kernel should see that
as sequential reads of different files, issued by different processes;
and if it can't figure out how to process that efficiently then it's a
very sad excuse for a kernel.

You are right that trying to do any detailed I/O scheduling by ourselves
is a doomed exercise. For better or worse, we have kept ourselves at
sufficient remove from the hardware that we can't possibly do that
successfully.

			regards, tom lane
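
For illustration only, the "each worker processes one 1GB-sized segment" split discussed above might be sketched roughly as below. This is not code from any patch in this thread. The names `segment_ranges` and `assign_segments` are hypothetical, the 8192-byte block size is assumed to be the default, and the static round-robin hand-out is just one possible policy (workers could equally claim the next unread segment on demand).

```python
# Hypothetical sketch: split a relation into 1GB segments and give whole
# segments to workers, so each worker reads one file sequentially.

BLOCK_SIZE = 8192                  # assumed default block size, in bytes
SEGMENT_BYTES = 1024 ** 3          # 1GB segment, as in the proposal
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE


def segment_ranges(total_blocks):
    """Return (segment_no, first_block, end_block_exclusive) per segment."""
    ranges = []
    start = 0
    segno = 0
    while start < total_blocks:
        # The last segment may be shorter than a full 1GB.
        end = min(start + BLOCKS_PER_SEGMENT, total_blocks)
        ranges.append((segno, start, end))
        start = end
        segno += 1
    return ranges


def assign_segments(total_blocks, n_workers):
    """Assign whole segments to workers round-robin (assumed policy)."""
    if n_workers < 1:
        raise ValueError("need at least one worker")
    plan = {worker: [] for worker in range(n_workers)}
    for segno, first, end in segment_ranges(total_blocks):
        plan[segno % n_workers].append((segno, first, end))
    return plan


if __name__ == "__main__":
    # A relation of 300000 blocks (a bit over 2GB) scanned by two workers.
    for worker, segments in assign_segments(300000, 2).items():
        print(worker, segments)
```

Nothing here attempts any I/O scheduling; each worker simply issues sequential reads against its own segment file, and what happens underneath is left to the kernel.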