On Sat, 2002-05-04 at 10:59, Hans Reiser wrote:
>
> So how about if you revise fsync so that it always sends data blocks to
> the journal not to the main disk?

This gets a little sticky. Once you log a block, it might be replayed
after a crash. So, you have to protect against corner cases like this:

    write(file)
    fsync(file) ;  /* logs modified data blocks */
    write(file) ;  /* write the same blocks without fsync */
    sync ;         /* user expects new version of the blocks on disk */
    <crash>

During replay, the logged data blocks overwrite the blocks sent to disk
via sync().

This isn't hard to correct for: every time a buffer is marked dirty, you
check the journal hash tables to see if it is replayable, and if so you
log it instead (the 2.2.x code did this due to tails). This translates
to increased CPU usage for every write.

I'd rather not put it back in, because it adds yet another corner case to
maintain for all time. Most of the fsync/O_SYNC bound applications are
just given their own partition anyway, so most users that need data
logging need it for every write.

-chris
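
The corner case above, and the dirty-time check that corrects it, can be modeled in a few lines of user-space C. This is only a toy sketch under loud assumptions: every name here (toy_fs, toy_write, toy_fsync, toy_sync, toy_replay, toy_scenario) is hypothetical, a block is a single char, the "journal hash table" is a plain array, and logged data reaches the journal immediately. None of it is reiserfs code.

```c
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8

/* Toy model of a journaled filesystem; all names are hypothetical. */
struct toy_fs {
    char disk[NBLOCKS];     /* main on-disk copy of each block */
    char cache[NBLOCKS];    /* in-memory copy of each block */
    bool dirty[NBLOCKS];    /* waiting for ordinary writeback */
    char journal[NBLOCKS];  /* logged copy of each block */
    bool logged[NBLOCKS];   /* journal holds a replayable copy */
    bool check_on_dirty;    /* enable the check described above */
};

void toy_init(struct toy_fs *fs, bool check_on_dirty)
{
    memset(fs, 0, sizeof(*fs));
    fs->check_on_dirty = check_on_dirty;
}

/* write(): update the cached block. With the check enabled, a block that
 * already has a replayable copy in the journal is logged again instead of
 * being marked dirty for ordinary writeback. */
void toy_write(struct toy_fs *fs, int blk, char data)
{
    fs->cache[blk] = data;
    if (fs->check_on_dirty && fs->logged[blk]) {
        fs->journal[blk] = data;    /* newest version is what replay sees */
        return;
    }
    fs->dirty[blk] = true;
}

/* fsync(): send dirty data blocks to the journal, not to the main disk. */
void toy_fsync(struct toy_fs *fs)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (fs->dirty[i]) {
            fs->journal[i] = fs->cache[i];
            fs->logged[i] = true;
            fs->dirty[i] = false;
        }
    }
}

/* sync: write dirty blocks straight to their main disk location. */
void toy_sync(struct toy_fs *fs)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (fs->dirty[i]) {
            fs->disk[i] = fs->cache[i];
            fs->dirty[i] = false;
        }
    }
}

/* Crash followed by replay: every logged block overwrites the disk copy. */
void toy_replay(struct toy_fs *fs)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (fs->logged[i])
            fs->disk[i] = fs->journal[i];
    }
}

/* Runs write, fsync, write, sync, <crash>, replay on block 0 and returns
 * what ends up on the main disk. */
char toy_scenario(bool check_on_dirty)
{
    struct toy_fs fs;

    toy_init(&fs, check_on_dirty);
    toy_write(&fs, 0, 'A');
    toy_fsync(&fs);
    toy_write(&fs, 0, 'B');
    toy_sync(&fs);
    toy_replay(&fs);
    return fs.disk[0];
}
```

In this model toy_scenario(false) leaves the stale 'A' on disk after replay, while toy_scenario(true) leaves 'B'. The price is the extra lookup in toy_write on every write, which is the CPU cost mentioned above.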