On Sun, Jun 10, 2007 at 08:49:24PM +0100, Heikki Linnakangas wrote:
> Jim C. Nasby wrote:
> >On Thu, Jun 07, 2007 at 10:16:25AM -0400, Tom Lane wrote:
> >>Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> >>>Thinking about this whole idea a bit more, it occurred to me that the
> >>>current approach to write all, then fsync all is really a historical
> >>>artifact of the fact that we used to use the system-wide sync call
> >>>instead of fsyncs to flush the pages to disk. That might not be the best
> >>>way to do things in the new load-distributed-checkpoint world.
> >>>How about interleaving the writes with the fsyncs?
> >>I don't think it's a historical artifact at all: it's a valid reflection
> >>of the fact that we don't know enough about disk layout to do low-level
> >>I/O scheduling. Issuing more fsyncs than necessary will do little
> >>except guarantee a less-than-optimal scheduling of the writes.
> >
> >If we extended relations by more than 8k at a time, we would know a lot
> >more about disk layout, at least on filesystems with a decent amount of
> >free space.
>
> I doubt it makes that much difference. If there was a significant amount
> of fragmentation, we'd hear more complaints about seq scan performance.
>
> The issue here is that we don't know which relations are on which drives
> and controllers, how they're striped, mirrored etc.

Actually, isn't pre-allocation one of the tricks that Greenplum uses to
get its seqscan performance?
--
Jim Nasby                                      [EMAIL PROTECTED]
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)