On Tue, Feb 06 2001, [EMAIL PROTECTED] wrote:
> >It depends on the device driver. Different controllers will have
> >different maximum transfer size. For IDE, for example, we get wakeups
> >all over the place. For SCSI, it depends on how many scatter-gather
> >entries the driver can push into a
>Hi,
>
>On Mon, Feb 05, 2001 at 08:01:45PM +0530, [EMAIL PROTECTED] wrote:
>>
>> >It's the very essence of readahead that we wake up the earlier buffers
>> >as soon as they become available, without waiting for the later ones
>> >to complete, so we _need_ this multiple completion concept.
>>
>> I
"Stephen C. Tweedie" wrote:
>
> The original multi-page buffers came from the map_user_kiobuf
> interface: they represented a user data buffer. I'm not wedded to
> that format --- we can happily replace it with a fine-grained sg list
>
Could you change that interface?
<<< from Linus mail:
Hi,
On Mon, Feb 05, 2001 at 11:06:48PM +0000, Alan Cox wrote:
> > do you then tell the application _above_ raid0 if one of the
> > underlying IOs succeeds and the other fails halfway through?
>
> struct
> {
> u32 flags; /* because everything needs flags */
> struct io_completio
> do you then tell the application _above_ raid0 if one of the
> underlying IOs succeeds and the other fails halfway through?
struct
{
u32 flags; /* because everything needs flags */
struct io_completion *completions;
kiovec_t sglist[0];
} thingy;
now kmalloc one ob
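What that allocation might look like, as a rough sketch (this assumes the
struct above is given the tag "struct thingy"; kiovec_t and io_completion are
just the names used in the fragment, not an existing kernel interface):

/* sketch only: one kmalloc holding the header plus nr_segs trailing
 * scatter-gather entries */
static struct thingy *alloc_thingy(int nr_segs)
{
	struct thingy *t;

	t = kmalloc(sizeof(*t) + nr_segs * sizeof(kiovec_t), GFP_KERNEL);
	if (!t)
		return NULL;

	t->flags = 0;
	t->completions = NULL;	/* completion chain attached later */
	/* t->sglist[0..nr_segs-1] describes the data segments */
	return t;
}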
Hi,
On Mon, Feb 05, 2001 at 10:28:37PM +0100, Ingo Molnar wrote:
>
> On Mon, 5 Feb 2001, Stephen C. Tweedie wrote:
>
> it's exactly these 'compound' structures i'm vehemently against. I do
> think it's a design nightmare. I can picture these monster kiobufs
> complicating the whole code for no
On Mon, 5 Feb 2001, Stephen C. Tweedie wrote:
> > Obviously the disk access itself must be sector aligned and the total
> > length must be a multiple of the sector length, but there shouldn't be
> > any restrictions on the data buffers.
>
> But there are. Many controllers just break down and cor
On Mon, 5 Feb 2001, Stephen C. Tweedie wrote:
> And no, the IO success is *not* necessarily sequential from the start
> of the IO: if you are doing IO to raid0, for example, and the IO gets
> striped across two disks, you might find that the first disk gets an
> error so the start of the IO fail
Hi,
On Mon, Feb 05, 2001 at 08:36:31AM -0800, Linus Torvalds wrote:
> Have you ever thought about other things, like networking, special
> devices, stuff like that? They can (and do) have packet boundaries that
> have nothing to do with pages whatsoever. They can have such notions as
> packets
On Mon, 5 Feb 2001, Stephen C. Tweedie wrote:
> > That's true for _block_ disk devices but if we want a generic kiovec then
> > if I am going from video capture to network I don't need to force anything more
> > than 4 byte align
>
> Kiobufs have never, ever required the IO to be aligned on any
> particular boundary. They simply make the assumption that the
> underlying buffered object can be described in terms of pages with
> some arbitrary (non-aligned) start/offset. Every video framebuffer
start/length per page ?
> I'
Hi,
On Mon, Feb 05, 2001 at 05:29:47PM +0000, Alan Cox wrote:
> >
> > _All_ drivers would have to do that in the degenerate case, because
> > none of our drivers can deal with a dma boundary in the middle of a
> > sector, and even in those places where the hardware supports it in
> > theory, you
> > kiovec_align(kiovec, 512);
> > and have it do the bounce buffers ?
>
> _All_ drivers would have to do that in the degenerate case, because
> none of our drivers can deal with a dma boundary in the middle of a
> sector, and even in those places where the hardware supports it in
> theory, y
Hi,
On Mon, Feb 05, 2001 at 03:19:09PM +0000, Alan Cox wrote:
> > Yes, it's the sort of thing that you would hope should work, but in
> > practice it's not reliable.
>
> So the less smart devices need to call something like
>
> kiovec_align(kiovec, 512);
>
> and have it do the bounce buf
On Mon, 5 Feb 2001, Manfred Spraul wrote:
> "Stephen C. Tweedie" wrote:
> >
> > You simply cannot do physical disk IO on
> > non-sector-aligned memory or in chunks which aren't a multiple of
> > sector size.
>
> Why not?
>
> Obviously the disk access itself must be sector aligned and the tota
On Mon, 5 Feb 2001, Stephen C. Tweedie wrote:
>
> On Sat, Feb 03, 2001 at 12:28:47PM -0800, Linus Torvalds wrote:
> >
> > Neither the read nor the write are page-aligned. I don't know where you
> > got that idea. It's obviously not true even in the common case: it depends
> > _entirely_ on wha
> Yes, it's the sort of thing that you would hope should work, but in
> practice it's not reliable.
So the less smart devices need to call something like
kiovec_align(kiovec, 512);
and have it do the bounce buffers ?
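Roughly what such a helper might do, sketched with a simplified segment type
(this is not the real kiobuf layout, it assumes each segment fits in one page,
and a read would additionally need the data copied back out of the bounce page
on completion):

struct kio_seg {
	void		*addr;		/* caller's memory */
	unsigned int	len;
	void		*bounce;	/* non-NULL if we had to copy */
};

static int kiovec_align(struct kio_seg *seg, int nr_segs, unsigned int align)
{
	int i;

	for (i = 0; i < nr_segs; i++) {
		if (((unsigned long)seg[i].addr & (align - 1)) == 0)
			continue;	/* already aligned, use as-is */

		seg[i].bounce = (void *)__get_free_page(GFP_KERNEL);
		if (!seg[i].bounce)
			return -ENOMEM;
		memcpy(seg[i].bounce, seg[i].addr, seg[i].len);	/* write case */
		seg[i].addr = seg[i].bounce;	/* page-aligned, so sector-aligned */
	}
	return 0;
}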
Hi,
On Mon, Feb 05, 2001 at 01:00:51PM +0100, Manfred Spraul wrote:
> "Stephen C. Tweedie" wrote:
> >
> > You simply cannot do physical disk IO on
> > non-sector-aligned memory or in chunks which aren't a multiple of
> > sector size.
>
> Why not?
>
> Obviously the disk access itself must be se
Hi,
On Mon, Feb 05, 2001 at 08:01:45PM +0530, [EMAIL PROTECTED] wrote:
>
> >It's the very essence of readahead that we wake up the earlier buffers
> >as soon as they become available, without waiting for the later ones
> >to complete, so we _need_ this multiple completion concept.
>
> I can und
>Hi,
>
>On Sun, Feb 04, 2001 at 06:54:58PM +0530, [EMAIL PROTECTED] wrote:
>>
>> Can't we define a kiobuf structure as just this ? A combination of a
>> frag_list and a page_list ?
>
>Then all code which needs to accept an arbitrary kiobuf needs to be
>able to parse both --- ugh.
>
Making this
Hi,
On Fri, Feb 02, 2001 at 01:02:28PM +0100, Christoph Hellwig wrote:
>
> > I may still be persuaded that we need the full scatter-gather list
> > fields throughout, but for now I tend to think that, at least in the
> > disk layers, we may get cleaner results by allow linked lists of
> > page-a
Hi,
On Sun, Feb 04, 2001 at 06:54:58PM +0530, [EMAIL PROTECTED] wrote:
>
> Can't we define a kiobuf structure as just this ? A combination of a
> frag_list and a page_list ?
Then all code which needs to accept an arbitrary kiobuf needs to be
able to parse both --- ugh.
> BTW, We could have a h
"Stephen C. Tweedie" wrote:
>
> You simply cannot do physical disk IO on
> non-sector-aligned memory or in chunks which aren't a multiple of
> sector size.
Why not?
Obviously the disk access itself must be sector aligned and the total
length must be a multiple of the sector length, but there sh
Hi,
On Sat, Feb 03, 2001 at 12:28:47PM -0800, Linus Torvalds wrote:
>
> On Thu, 1 Feb 2001, Stephen C. Tweedie wrote:
> >
> Neither the read nor the write are page-aligned. I don't know where you
> got that idea. It's obviously not true even in the common case: it depends
> _entirely_ on what t
>Hi,
>
>On Fri, Feb 02, 2001 at 12:51:35PM +0100, Christoph Hellwig wrote:
>> >
>> > If I have a page vector with a single offset/length pair, I can build
>> > a new header with the same vector and modified offset/length to split
>> > the vector in two without copying it.
>>
>> You just say in th
On Thu, 1 Feb 2001, Stephen C. Tweedie wrote:
>
> On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote:
>
> > I think you want the whole kio concept only for disk-like IO.
>
> No. I want something good for zero-copy IO in general, but a lot of
> that concerns the problem of in
>Hi,
>
>On Thu, Feb 01, 2001 at 01:28:33PM +0530, [EMAIL PROTECTED] wrote:
>>
>> Here's a second pass attempt, based on Ben's wait queue extensions:
> Does this sound any better ?
>
>It's a mechanism, all right, but you haven't described what problems
>it is trying to solve, and where it is likel
Hi,
On Fri, Feb 02, 2001 at 12:51:35PM +0100, Christoph Hellwig wrote:
> >
> > If I have a page vector with a single offset/length pair, I can build
> > a new header with the same vector and modified offset/length to split
> > the vector in two without copying it.
>
> You just say in the higher
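The splitting trick quoted above, in outline (kio_hdr is an invented name,
purely for illustration; the point is that both halves share the one page
vector and only the offset/length pairs differ):

struct kio_hdr {
	struct page	**pagelist;	/* shared, never duplicated */
	unsigned int	offset;		/* byte offset into the first page */
	unsigned int	length;		/* total bytes covered */
};

/* carve src into front (first 'at' bytes) and back (the rest) */
static void kio_split(const struct kio_hdr *src, unsigned int at,
		      struct kio_hdr *front, struct kio_hdr *back)
{
	front->pagelist = src->pagelist;
	front->offset   = src->offset;
	front->length   = at;

	back->pagelist  = src->pagelist;
	back->offset    = src->offset + at;	/* consumer walks the pages from here */
	back->length    = src->length - at;
}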
On Thu, Feb 01, 2001 at 11:18:56PM -0500, [EMAIL PROTECTED] wrote:
> On Thu, 1 Feb 2001, Christoph Hellwig wrote:
>
> > A kiobuf is 124 bytes, a buffer_head 96. And a buffer_head is additionally
> > used for caching data, a kiobuf not.
>
> Go measure the cost of a distant cache miss, then compl
On Thu, Feb 01, 2001 at 10:07:44PM +0000, Stephen C. Tweedie wrote:
> No. I want something good for zero-copy IO in general, but a lot of
> that concerns the problem of interacting with the user, and the basic
> center of that interaction in 99% of the interesting cases is either a
> user VM buff
On Thu, Feb 01, 2001 at 09:25:08PM +0000, Stephen C. Tweedie wrote:
> > No. Just allow passing the multiple of the devices blocksize over
> > ll_rw_block.
>
> That was just one example: you need the sub-ios just as much when
> you split up an IO over stripe boundaries in LVM or raid0, for
> exam
On Thu, 1 Feb 2001, Christoph Hellwig wrote:
> A kiobuf is 124 bytes, a buffer_head 96. And a buffer_head is additionally
> used for caching data, a kiobuf not.
Go measure the cost of a distant cache miss, then complain about having
everything in one structure. Also, 1 kiobuf maps 16-128 times
On Thu, 1 Feb 2001, Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, Feb 01, 2001 at 06:05:15PM +0100, Christoph Hellwig wrote:
> > > On Thu, Feb 01, 2001 at 04:16:15PM +0000, Stephen C. Tweedie wrote:
> > > >
> > > > No, and with the current kiobufs it would not make sense, because they
> > > > are to
Hi,
On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote:
> I think you want the whole kio concept only for disk-like IO.
No. I want something good for zero-copy IO in general, but a lot of
that concerns the problem of interacting with the user, and the basic
center of that inte
Hi,
On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote:
>
> > On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> > In the disk IO case, you basically don't get that (the only thing
> > which comes close is raid5 parity blocks). The data which the user
> > started with is
Hi,
On Thu, Feb 01, 2001 at 09:46:27PM +0100, Christoph Hellwig wrote:
> > Right now we can take a kiobuf and turn it into a bunch of
> > buffer_heads for IO. The io_count lets us track all of those sub-IOs
> > so that we know when all submitted IO has completed, so that we can
> > pass the com
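The io_count accounting being described there, reduced to its skeleton (the
real kiobuf does keep an io_count for exactly this purpose; the names, locking
and error handling here are simplified for illustration):

struct kio {
	atomic_t	io_count;	/* sub-ios still outstanding */
	int		errno;		/* first error seen, if any */
	void		(*end_io)(struct kio *kio);
};

/* called from each sub-io's completion, e.g. a buffer_head's end_io */
static void kio_sub_io_done(struct kio *kio, int err)
{
	if (err && !kio->errno)
		kio->errno = err;	/* remember the first failure */

	if (atomic_dec_and_test(&kio->io_count))
		kio->end_io(kio);	/* last one out completes the whole kiobuf */
}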
> On Thu, Feb 01, 2001 at 02:56:47PM -0600, Steve Lord wrote:
> > And if you are writing to a striped volume via a filesystem which can do
> > its own I/O clustering, e.g. I throw 500 pages at LVM in one go and LVM
> > is striped on 64K boundaries.
>
> But usually I want to have pages 0-63, 128-
On Thu, Feb 01, 2001 at 02:56:47PM -0600, Steve Lord wrote:
> And if you are writing to a striped volume via a filesystem which can do
> its own I/O clustering, e.g. I throw 500 pages at LVM in one go and LVM
> is striped on 64K boundaries.
But usually I want to have pages 0-63, 128-191, etc tog
> In article <[EMAIL PROTECTED]> you wrote:
> > Hi,
>
> > On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> > In the disk IO case, you basically don't get that (the only thing
> > which comes close is raid5 parity blocks). The data which the user
> > started with is the data sent out o
In article <[EMAIL PROTECTED]> you wrote:
> Buffer_heads are _sometimes_ used for caching data.
Actually they are mostly used, but that shouldn't have any bearing on the
discussion...
> That's one of the
> big problems with them, they are too overloaded, being both IO
> descriptors _and_ cache descr
In article <[EMAIL PROTECTED]> you wrote:
> Hi,
> On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> In the disk IO case, you basically don't get that (the only thing
> which comes close is raid5 parity blocks). The data which the user
> started with is the data sent out on the wire. Y
On Thu, 1 Feb 2001, Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> > >
> > > I don't see any real advantage for disk IO. The real advantage is that
> > we can have a generic structure that is also useful in e.g. networking
> > > and can lead t
On Thu, Feb 01, 2001 at 04:32:48PM -0200, Rik van Riel wrote:
> On Thu, 1 Feb 2001, Alan Cox wrote:
>
> > > Sure. But Linus saying that he doesn't want more of that (shit, crap,
> > > I don't remember what he said exactly) in the kernel is a very good reason
> > > for thinking a little more about it
Hi,
On Thu, Feb 01, 2001 at 07:14:03PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 01, 2001 at 05:41:20PM +0000, Stephen C. Tweedie wrote:
> > >
> > > We can't allocate a huge kiobuf structure just for requesting one page of
> > > IO. It might get better with VM-level IO clustering though.
>
Hi,
On Thu, Feb 01, 2001 at 06:49:50PM +0100, Christoph Hellwig wrote:
>
> > Adding tons of base/limit pairs to kiobufs makes it worse not better
>
> For disk I/O it makes the handling a little easier for the cost of the
> additional offset/length fields.
Umm, actually, no, it makes it much wo
On Thu, 1 Feb 2001, Alan Cox wrote:
> Linus list of reasons like the amount of state are more interesting
The state is required, not optional, if we are to have a decent basis for
building asynchronous io into the kernel.
> Networking wants something lighter rather than heavier. Adding tons of
>
On Thu, 1 Feb 2001, Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, Feb 01, 2001 at 10:25:22AM +0530, [EMAIL PROTECTED] wrote:
> >
> > Being able to track the children of a kiobuf would help with I/O
> > cancellation (e.g. to pull sub-ios off their request queues if I/O
> > cancellation for the pare
On Thu, Feb 01, 2001 at 06:25:16PM +0000, Alan Cox wrote:
> > array_len, io_count, the presence of wait_queue AND end_io, and the lack of
> > scatter gather in one kiobuf struct (you always need an array), and AFAICS
> > that is what the networking guys dislike.
>
> You need a completion pointer.
On Thu, Feb 01, 2001 at 06:57:41PM +0000, Alan Cox wrote:
> Not for raw I/O. Although for the drivers that can't cope then going via
> the page cache is certainly the next best alternative
True - but raw-io has its own alignment issues anyway.
> Yes. You also need a way to describe it in terms
> It doesn't really matter that much, because we write to the pagecache
> first anyway.
Not for raw I/O. Although for the drivers that can't cope then going via
the page cache is certainly the next best alternative
> The real thing is that we want to have some common data structure for
> describ
On Thu, 1 Feb 2001, Alan Cox wrote:
> > Now one could say: just let the networkers use their own kind of buffers
> > (and that's exactly what is done in the zerocopy patches), but that again leads
> > to inefficient buffer passing and ungeneric IO handling.
[snip]
> It is quite possible t
On Thu, 1 Feb 2001, Alan Cox wrote:
> > Sure. But Linus saying that he doesn't want more of that (shit, crap,
> > I don't remember what he said exactly) in the kernel is a very good reason
> > for thinking a little more about it.
>
> No. Linus is not a God, Linus is fallible, regularly makes mista
> array_len, io_count, the presence of wait_queue AND end_io, and the lack of
> scatter gather in one kiobuf struct (you always need an array), and AFAICS
> that is what the networking guys dislike.
You need a completion pointer. Its arguable whether you want the wait_queue
in the default structu
On Thu, Feb 01, 2001 at 05:41:20PM +0000, Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, Feb 01, 2001 at 06:05:15PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 01, 2001 at 04:16:15PM +0000, Stephen C. Tweedie wrote:
> > > >
> > > > No, and with the current kiobufs it would not make sense, becau
> > Linus basically designed the original kiobuf scheme of course so I guess
> > he's allowed to dislike it. Linus disliking something however doesn't mean
> > its wrong. Its not a technically valid basis for argument.
>
> Sure. But Linus saying that he doesn't want more of that (shit, crap,
> I
Hi,
On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> >
> > I don't see any real advantage for disk IO. The real advantage is that
> > we can have a generic structure that is also useful in e.g. networking
> > and can lead to a unified IO buffering scheme (a little like IO-Lite).
>
On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> > > I'm in the middle of some parts of it, and am actively soliciting
> > > feedback on what cleanups are required.
> >
> > The real issue is that Linus dislikes the current kiobuf scheme.
> > I do not like everything he proposes, but
Hi,
On Thu, Feb 01, 2001 at 06:05:15PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 01, 2001 at 04:16:15PM +0000, Stephen C. Tweedie wrote:
> > >
> > > No, and with the current kiobufs it would not make sense, because they
> > > are too heavy-weight.
> >
> > Really? In what way?
>
> We can'
> > I'm in the middle of some parts of it, and am actively soliciting
> > feedback on what cleanups are required.
>
> The real issue is that Linus dislikes the current kiobuf scheme.
> I do not like everything he proposes, but lots of things makes sense.
Linus basically designed the original k
Christoph Hellwig wrote:
> On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote:
> >
> > That would require the vfs interfaces themselves (address space
> > readpage/writepage ops) to take kiobufs as arguments, instead of struct
> > page * . That's not the case right now, is it ?
>
On Thu, Feb 01, 2001 at 04:49:58PM +0000, Stephen C. Tweedie wrote:
> > Enquiring minds would like to know if you are working towards this
> > revamp of the kiobuf structure at the moment, you have been very quiet
> > recently.
>
> I'm in the middle of some parts of it, and am actively soliciti
Hi,
On Thu, Feb 01, 2001 at 04:09:53PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote:
> >
> > That would require the vfs interfaces themselves (address space
> > readpage/writepage ops) to take kiobufs as arguments, instead of struct
> > page
Hi,
On Thu, Feb 01, 2001 at 10:08:45AM -0600, Steve Lord wrote:
> Christoph Hellwig wrote:
> > On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote:
> > >
> > > That would require the vfs interfaces themselves (address space
> > > readpage/writepage ops) to take kiobufs as arguments
On Thu, Feb 01, 2001 at 06:05:15PM +0100, Christoph Hellwig wrote:
> > What, you mean adding *extra* stuff to the heavyweight kiobuf makes it
> > lean enough to do the job??
>
> No. I was speaking about the light-weight kiobuf Linus & me discussed on
On Thu, Feb 01, 2001 at 04:16:15PM +0000, Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, Feb 01, 2001 at 04:09:53PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote:
> > >
> > > That would require the vfs interfaces themselves (address space
> >
On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote:
>
> >Hi,
> >
> >On Thu, Feb 01, 2001 at 10:25:22AM +0530, [EMAIL PROTECTED] wrote:
> >>
> >> >We _do_ need the ability to stack completion events, but as far as the
> >> >kiobuf work goes, my current thoughts are to do that by sta
>Hi,
>
>On Thu, Feb 01, 2001 at 10:25:22AM +0530, [EMAIL PROTECTED] wrote:
>>
>> >We _do_ need the ability to stack completion events, but as far as the
>> >kiobuf work goes, my current thoughts are to do that by stacking
>> >lightweight "clone" kiobufs.
>>
>> Would that work with stackable files
sct wrote:
>> >
>> > Thanks for mentioning this. I didn't know about it earlier. I've been
>> > going through the 4/00 kqueue patch on freebsd ...
>>
>> Linus has already denounced them as massively over-engineered...
>
>That shouldn't stop anyone from looking at them and learning, though.
>There
Hi,
On Thu, Feb 01, 2001 at 01:28:33PM +0530, [EMAIL PROTECTED] wrote:
>
> Here's a second pass attempt, based on Ben's wait queue extensions:
> Does this sound any better ?
It's a mechanism, all right, but you haven't described what problems
it is trying to solve, and where it is likely to be
Hi,
On Thu, Feb 01, 2001 at 10:25:22AM +0530, [EMAIL PROTECTED] wrote:
>
> >We _do_ need the ability to stack completion events, but as far as the
> >kiobuf work goes, my current thoughts are to do that by stacking
> >lightweight "clone" kiobufs.
>
> Would that work with stackable filesystems ?
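One way the "clone" stacking could look, purely as a sketch (the names are
invented; the only point is that each layer's clone covers a sub-range of its
parent and folds its completion back into the parent when it finishes):

struct kio_parent {
	atomic_t	pending;	/* clones still in flight */
	int		errno;
	void		(*end_io)(struct kio_parent *p);
};

struct kio_clone {
	struct kio_parent *parent;
	unsigned int	offset, length;	/* sub-range of the parent's buffer */
	int		errno;
};

/* a lower layer (raid0, LVM, a stacked fs) finishes its piece */
static void kio_clone_end_io(struct kio_clone *c)
{
	struct kio_parent *p = c->parent;

	if (c->errno && !p->errno)
		p->errno = c->errno;		/* keep the first error */
	if (atomic_dec_and_test(&p->pending))
		p->end_io(p);			/* the whole stacked io is done */
}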
Here's a second pass attempt, based on Ben's wait queue extensions:
Does this sound any better ?
[This doesn't require any changes to the existing wait_queue_head based i/o
structures or to existing drivers, and the constructs mentioned come into
the picture only when compound events are actuall
>My first comment is that this looks very heavyweight indeed. Isn't it
>just over-engineered?
Yes, I know it is, in its current form (sigh !).
But at the same time, I do not want to give up (not yet, at least) on
trying to arrive at something that can serve the objectives, and yet be
simple i
>Hi,
>
>On Wed, Jan 31, 2001 at 07:28:01PM +0530, [EMAIL PROTECTED] wrote:
>>
>> Do the following modifications to your wait queue extension sound
>> reasonable ?
>>
>> 1. Change add_wait_queue to add elements to the end of queue (fifo, by
>> default) and instead have an add_wait_queue_lifo() ro
Hi,
On Wed, Jan 31, 2001 at 07:28:01PM +0530, [EMAIL PROTECTED] wrote:
>
> Do the following modifications to your wait queue extension sound
> reasonable ?
>
> 1. Change add_wait_queue to add elements to the end of queue (fifo, by
> default) and instead have an add_wait_queue_lifo() routine tha
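The first point, sketched against a bare list_head queue rather than the real
wait queue types (names invented; the only difference is tail versus head
insertion):

static inline void my_add_wait_queue(struct list_head *head,
				     struct list_head *entry)
{
	list_add_tail(entry, head);	/* FIFO: new waiters join the back */
}

static inline void my_add_wait_queue_lifo(struct list_head *head,
					  struct list_head *entry)
{
	list_add(entry, head);		/* LIFO: new waiter goes to the front */
}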
Hi,
On Tue, Jan 30, 2001 at 10:15:02AM +0530, [EMAIL PROTECTED] wrote:
>
Comments, suggestions, advice, feedback solicited !
My first comment is that this looks very heavyweight indeed. Isn't it
just over-engineered?
We _do_ need the ability to stack completion events, but as far as the
ki
>The waitqueue extension below is a minimalist approach for providing
>kernel support for fully asynchronous io. The basic idea is that a
>function pointer is added to the wait queue structure that is called
>during wake_up on a wait queue head. (The patch below also includes
>support for exclu
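The shape of that idea as a standalone sketch (this is not the actual patch;
the types are invented and the ordinary task-wakeup path is elided):

struct my_wait {
	struct list_head	list;
	void	(*func)(struct my_wait *w);	/* NULL for ordinary sleepers */
	void	*data;				/* caller's completion context */
};

static void my_wake_up_all(struct list_head *head)
{
	struct list_head *p, *n;

	list_for_each_safe(p, n, head) {
		struct my_wait *w = list_entry(p, struct my_wait, list);

		if (w->func)
			w->func(w);	/* asynchronous completion callback */
		/* else: wake the task sleeping on this entry */
	}
}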
To: Suparna Bhattacharya/India/IBM@IBMIN
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event
wait/notify + callback chains
On Tue, 30 Jan 2001 [EMAIL PROTECTED] wrote:
>
> Comments, suggestions, advice,