Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-07 Thread Zeugswetter Andreas SB SD
> > Keep in mind that we support platforms without O_DSYNC. I am not
> > sure whether there are any that don't have O_SYNC either, but I am
> > fairly sure that we measured O_SYNC to be slower than fsync()s on
> > some platforms.

This measurement is quite understandable, since the current softw

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes:
> Or its solution ;) as instead of the predicting we just write all data
> in log that is ready to be written. If we postpone writing, there will
> be hiccups when we suddenly discover that we need to write a whole lot
> of pages (fsync()) after idling the

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote:
> > No question about that! The sooner we can get stuff to the WAL buffers,
> > the more likely we will get some other transaction to do our fsync work.
> > Any ideas on how we can do that?
>
> More like the sooner we get stuff out of the WAL buffers and into the
> disk's buf

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Curtis Faith
> No question about that! The sooner we can get stuff to the WAL buffers,
> the more likely we will get some other transaction to do our fsync work.
> Any ideas on how we can do that?

More like the sooner we get stuff out of the WAL buffers and into the disk's buffers whether by write or aio_wri

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote:
> > So, you are saying that we may get back aio confirmation quicker than if
> > we issued our own write/fsync because the OS was able to slip our flush
> > to disk in as part of someone else's or a general fsync?
> >
> > I don't buy that because it is possible our write() get

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Curtis Faith
> So, you are saying that we may get back aio confirmation quicker than if
> we issued our own write/fsync because the OS was able to slip our flush
> to disk in as part of someone else's or a general fsync?
>
> I don't buy that because it is possible our write() gets in as part of
> someone else

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote:
> The advantage to aio_write in this scenario is when writes cross track
> boundaries or when the head is in the wrong spot. If we write
> in reasonable blocks with aio_write the write might get to the disk
> before the head passes the location for the write.
>
> Consider a sc

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Curtis Faith
> In particular, it would seriously degrade performance if the WAL file
> isn't on its own spindle but has to share bandwidth with
> data file access.

If the OS is stupid I could see this happening. But if there are buffers and some sort of elevator algorithm the I/O won't happen at bad times. I

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-05 Thread Curtis Faith
> You are confusing WALWriteLock with WALInsertLock. A
> transaction-committing flush operation only holds the former.
> XLogInsert only needs the latter --- at least as long as it
> doesn't need to write.

Well, that makes things better than I thought. We still end up with a disk write for each tr

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes:
> The writer process should just issue a continuous stream of
> aio_write()'s while there are any waiters and keep track which waiters
> are safe to continue - thus no guessing of who's gonna commit.

This recipe sounds like "eat I/O bandwidth whether we n

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-05 Thread Doug McNaught
Tom Lane <[EMAIL PROTECTED]> writes:
> "Curtis Faith" <[EMAIL PROTECTED]> writes:
> > The log file would be opened O_DSYNC, O_APPEND every time.
>
> Keep in mind that we support platforms without O_DSYNC. I am not
> sure whether there are any that don't have O_SYNC either, but I am
> fairly su
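The portability concern in this exchange can be made concrete with a rough sketch. It assumes a POSIX system and picks the sync mechanism at compile time: O_DSYNC where the platform defines it, O_SYNC otherwise, and an explicit fsync() after each write as the last resort. The names open_log, append_log, LOG_SYNC_FLAG and LOG_NEEDS_FSYNC are hypothetical and are not taken from the PostgreSQL source; which option is actually fastest on a given platform is exactly the open question in the thread.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical compile-time choice of sync mechanism (not PostgreSQL code). */
#if defined(O_DSYNC)
#define LOG_SYNC_FLAG   O_DSYNC   /* each write() returns after data is on disk */
#define LOG_NEEDS_FSYNC 0
#elif defined(O_SYNC)
#define LOG_SYNC_FLAG   O_SYNC    /* like O_DSYNC, but also syncs file metadata */
#define LOG_NEEDS_FSYNC 0
#else
#define LOG_SYNC_FLAG   0         /* no sync-on-write flag: fsync() explicitly */
#define LOG_NEEDS_FSYNC 1
#endif

/* Open (or create) the log file for synchronous appends. */
int open_log(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND | LOG_SYNC_FLAG, 0600);
}

/*
 * Append len bytes and make them durable before returning.
 * Returns the number of bytes written (which may be short), or -1 on error.
 */
ssize_t append_log(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);
#if LOG_NEEDS_FSYNC
    if (n >= 0 && fsync(fd) != 0)
        return -1;
#endif
    return n;
}
```

A caller would open the file once with open_log() and call append_log() per record; a production version would also loop on short writes.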

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-05 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes:
> Assume Transaction A which writes a lot of buffers and XLog entries,
> so the Commit forces a relatively lengthy fsync.
> Transactions B - E block not on the kernel lock from fsync but on
> the WALWriteLock.

You are confusing WALWriteLock with WALIn

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
pgman wrote:
> Curtis Faith wrote:
> > Back-end servers would not issue fsync calls. They would simply block
> > waiting until the LogWriter had written their record to the disk, i.e.
> > until the sync'd block # was greater than the block that contained the
> > XLOG_XACT_COMMIT record. The LogWri

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote:
> Back-end servers would not issue fsync calls. They would simply block
> waiting until the LogWriter had written their record to the disk, i.e.
> until the sync'd block # was greater than the block that contained the
> XLOG_XACT_COMMIT record. The LogWriter could wake up commi
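The blocking rule quoted here (a committing backend waits until the sync'd block number is greater than the block holding its commit record) might be sketched as follows. This is only an illustration under loud assumptions: POSIX threads stand in for backend processes, a process-local mutex and condition variable stand in for whatever shared-memory primitive PostgreSQL would really use, and every name (wait_for_commit, logwriter_advance, current_synced_block, synced_block) is hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical shared state; names are illustrative, not PostgreSQL's. */
static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sync_advanced = PTHREAD_COND_INITIALIZER;
static uint64_t synced_block = 0;   /* the "sync'd block #" from the proposal */

/* Backend side: block until synced_block exceeds our commit record's block. */
void wait_for_commit(uint64_t commit_block)
{
    pthread_mutex_lock(&sync_lock);
    while (synced_block <= commit_block)
        pthread_cond_wait(&sync_advanced, &sync_lock);
    pthread_mutex_unlock(&sync_lock);
}

/* LogWriter side: call after a sync completes; wakes every waiting backend. */
void logwriter_advance(uint64_t new_synced_block)
{
    pthread_mutex_lock(&sync_lock);
    if (new_synced_block > synced_block)   /* the counter never moves backwards */
        synced_block = new_synced_block;
    pthread_cond_broadcast(&sync_advanced);
    pthread_mutex_unlock(&sync_lock);
}

/* Read the current sync'd block number under the lock. */
uint64_t current_synced_block(void)
{
    pthread_mutex_lock(&sync_lock);
    uint64_t value = synced_block;
    pthread_mutex_unlock(&sync_lock);
    return value;
}
```

Because the broadcast wakes all waiters and each one rechecks its own block number, a single sync can release several committing backends at once, which is the grouping effect the proposal is after.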

[HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-04 Thread Curtis Faith
It appears the fsync problem is pervasive. Here's Linux 2.4.19's version from fs/buffer.c:

lock->  down(&inode->i_sem);
        ret = filemap_fdatasync(inode->i_mapping);
        err = file->f_op->fsync(file, dentry, 1);
        if (err && !ret)
                ret = err;