Re: We have a problem with O_SYNC

1999-05-30 Thread Stephen C. Tweedie

Hi,

On Thu, 27 May 1999 22:15:29 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]> said:

> On Fri, 28 May 1999, Stephen C. Tweedie wrote:

>> I have a patch I've been trying out to improve fsync performance by
>> maintaining per-inode dirty buffer lists, and to implement fdatasync
>> by tracking "significant" and "insignificant" (ie.  timestamp) dirty
>> flags in the inode separately.

> Don't bother. This is one of the issues that is just going to go away
> when we do dirty blocks correctly (ie the patches that Ingo is working
> on).

The per-inode lists will go away, but the bulk of the patch is the VFS
extension necessary to distinguish between fsync() and fdatasync() (the
VFS methods currently lack any way of making that distinction in an
fsync call).  The (minor) inode changes required to support the
split-personality dirty bits will also be needed: I'll strip those out
from the buffer cache changes.
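
To illustrate the split dirty bits described above, here is a minimal
userspace sketch in C. It is not code from the patch: the flag names,
struct toy_inode and both functions are hypothetical, and it assumes
that "significant" covers everything fdatasync() has to flush while
"insignificant" covers timestamps only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag names, invented for this sketch. */
enum {
    DIRTY_DATA = 1 << 0,  /* "significant": needed to read the data back */
    DIRTY_TIME = 1 << 1   /* "insignificant": timestamp updates only */
};

/* Toy stand-in for an in-core inode; only the dirty state is modelled. */
struct toy_inode {
    unsigned dirty;       /* bitmask of the flags above */
};

/*
 * datasync != 0 models fdatasync(), datasync == 0 models fsync().
 * fdatasync() may skip the inode when only timestamps are dirty.
 */
static bool inode_needs_writeback(const struct toy_inode *inode, int datasync)
{
    assert(inode != NULL);
    if (datasync)
        return (inode->dirty & DIRTY_DATA) != 0;
    return inode->dirty != 0;
}

/* Clears the dirty state only if the inode was actually "written". */
static void toy_sync(struct toy_inode *inode, int datasync)
{
    if (!inode_needs_writeback(inode, datasync))
        return;           /* timestamps stay dirty for a later fsync() */
    /* A real filesystem would write the inode to disk here. */
    inode->dirty = 0;
}
```

With only DIRTY_TIME set, toy_sync(&inode, 1) leaves the inode dirty and
toy_sync(&inode, 0) cleans it, which is the distinction a single
flagless fsync method cannot express.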

>> Fixing O_SYNC will ruin the performance of such applications. 

> I disagree - I don't think that O_SYNC should imply writing back access
> and mtimes. If the file size really changes, then we definitely should
> write the inode back; I don't think we can honestly say that anything else
> would make sense.

According to the Single UNIX Specification, we have no option: the
entire point of having a separate O_DSYNC is that O_SYNC clearly
specifies semantics which are too expensive for most applications to use.

--Stephen



Re: RFC on raw IO implementation

1999-05-30 Thread Stephen C. Tweedie

Hi,

On Thu, 27 May 1999 22:18:50 -0700 (PDT), Linus Torvalds
<[EMAIL PROTECTED]> said:

> I care not one whit what the interface is on a /dev level

Fine, I can live with that!

> the only thing I care about is that the internal interfaces make sense
> (ie are purely based on kernel physical addresses, and have nothing at
> all to do with user virtual addresses).

> Having a simple translation layer to old-fashioned UNIX semantics makes
> sense, but doesn't matter from a kernel internals standpoint, so I don't
> find it all that interesting. 

Agreed, and the current code keeps that distinction clear.  The raw
device code generates kiobufs from user space but there is no trace of
the origin of those pages when we pass them to the IO layers: all
passing is done by containers of arbitrary physical pages.
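
As a rough picture of what "containers of arbitrary physical pages"
means, here is a userspace sketch in C. It does not reproduce the real
kiobuf layout: struct toy_kiobuf, TOY_PAGE_SIZE and toy_io_gather() are
hypothetical, and plain memory buffers stand in for physical pages. The
point is only that the consumer sees pages, an offset and a length, and
nothing about where the pages came from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Hypothetical container: a list of pages plus the byte range in use. */
struct toy_kiobuf {
    void  **pages;      /* one pointer per page, in transfer order */
    size_t  nr_pages;   /* number of entries in pages[] */
    size_t  offset;     /* byte offset into the first page */
    size_t  length;     /* total bytes covered by the transfer */
};

/*
 * Stand-in for an IO layer: gathers the described bytes into `out`,
 * which must hold at least kio->length bytes.
 * Returns the number of bytes copied.
 */
static size_t toy_io_gather(const struct toy_kiobuf *kio, unsigned char *out)
{
    assert(kio != NULL && out != NULL);
    assert(kio->offset < TOY_PAGE_SIZE);

    size_t done = 0;
    size_t off = kio->offset;

    for (size_t i = 0; i < kio->nr_pages && done < kio->length; i++) {
        size_t chunk = TOY_PAGE_SIZE - off;     /* bytes left in this page */
        if (chunk > kio->length - done)
            chunk = kio->length - done;         /* clamp to the request */
        memcpy(out + done, (const unsigned char *)kio->pages[i] + off, chunk);
        done += chunk;
        off = 0;                                /* later pages start at 0 */
    }
    return done;
}
```

Whether the pages were pinned from a user buffer or allocated by the
kernel itself makes no difference to toy_io_gather(), which mirrors the
separation described above.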

> IF you think the translation layer matters to the internal implementation,
> then I can only say that something else is broken, and no, the patches
> wouldn't get accepted. 

That's fine --- it is purely the top-level API for the
backwards-compatible raw character devices I was worried about, since that's
the only part of the code which may give rise to compatibility issues if
people start using the raw diffs against 2.2.

--Stephen



Re: RFC on raw IO implementation

1999-05-30 Thread Linus Torvalds



On Sat, 29 May 1999, Stephen C. Tweedie wrote:
> 
> Agreed, and the current code keeps that distinction clear.  The raw
> device code generates kiobufs from user space but there is no trace of
> the origin of those pages when we pass them to the IO layers: all
> passing is done by containers of arbitrary physical pages.

Cool. That way we can use it for other things. Who knows, maybe we'll need
to have kernel support for web-serving to beat the benchmark crap of
certain systems that shall remain unnamed ;)

> That's fine --- it is purely the top-level API for backwards
> compatibility raw character devices I was worried about, since that's
> the only part of the code which may give rise to compatibility issues if
> people start using the raw diffs against 2.2.

Happily you still have to switch the devices around (b->c) in /dev, so I
guess an old installation wouldn't ever really notice, while a new
installation can take advantage of the true raw devices if it wants to,
which it probably does.

Linus