Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, 11 Jan 2007 08:43:36 +0530
Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> > The s/lock_page_slow/lock_page_blocking/ got lost. I redid it.
>
> I thought the lock_page_blocking was an alternative you had suggested
> to the __lock_page vs lock_page_async discussion which got resolved later.
> That is why I didn't make the change in this patchset.
> The call does not block in the async case, hence the choice of
> the _slow suffix (like in fs/buffer.c). But if lock_page_blocking()
> sounds more intuitive to you, that's OK.

I thought people didn't like the "lock_page_slow" name.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Wed, Jan 10, 2007 at 05:08:29PM -0800, Andrew Morton wrote:
> On Wed, 10 Jan 2007 11:14:19 +0530
> Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote:
> > > On Thu, 4 Jan 2007 10:26:21 +0530
> > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote:
> > >
> > > Patches against next -mm would be appreciated, please. Sorry about that.
> >
> > I have updated the patchset against 2620-rc3-mm1, incorporated various
> > cleanups suggested during last review. Please let me know if I have missed
> > anything:
>
> The s/lock_page_slow/lock_page_blocking/ got lost. I redid it.

I thought the lock_page_blocking was an alternative you had suggested
to the __lock_page vs lock_page_async discussion which got resolved later.
That is why I didn't make the change in this patchset.
The call does not block in the async case, hence the choice of
the _slow suffix (like in fs/buffer.c). But if lock_page_blocking()
sounds more intuitive to you, that's OK.

>
> For the record, patches-via-http are very painful. Please always always
> email them.
>
> As a result, these patches ended up with titles which are derived from their
> filenames, which are cryptic.

Sorry about that - I wanted to ask if you'd prefer my resending them to the
list, but missed doing so. Some people have found it easier to download the
series as a whole when they intend to apply it, so I ended up maintaining it
that way all this while.

Regards
Suparna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Wed, 10 Jan 2007 11:14:19 +0530
Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote:
> > On Thu, 4 Jan 2007 10:26:21 +0530
> > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> >
> > > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote:
> >
> > Patches against next -mm would be appreciated, please. Sorry about that.
>
> I have updated the patchset against 2620-rc3-mm1, incorporated various
> cleanups suggested during last review. Please let me know if I have missed
> anything:

The s/lock_page_slow/lock_page_blocking/ got lost. I redid it.

For the record, patches-via-http are very painful. Please always always
email them.

As a result, these patches ended up with titles which are derived from their
filenames, which are cryptic.
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote:
> On Thu, 4 Jan 2007 10:26:21 +0530
> Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
>
> > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote:
>
> Patches against next -mm would be appreciated, please. Sorry about that.

I have updated the patchset against 2620-rc3-mm1, incorporated various
cleanups suggested during last review. Please let me know if I have missed
anything:

It should show up at
www.kernel.org:/pub/linux/kernel/people/suparna/aio/2620-rc3-mm1

Brief changelog:
- Reworked against the block layer unplug changes
- Switched from defines to inlines for init_wait_bit* etc (per akpm)
- Better naming: __lock_page to lock_page_async (per hch, npiggin)
- Kill lock_page_slow wrapper and rename __lock_page_slow to lock_page_slow
  (per hch)
- Use a helper function aio_restarted() (per hch)
- Replace combined if/assignment (per hch)
- fix resetting of current->io_wait after ->retry in aio_run_iocb (per zab)

I have run my usual aio-stress variations script
(www.kernel.org:/pub/linux/kernel/people/suparna/aio/aio-results.sh)

Regards
Suparna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Fri, Jan 05 2007, Suparna Bhattacharya wrote: > On Fri, Jan 05, 2007 at 08:02:33AM +0100, Jens Axboe wrote: > > On Fri, Jan 05 2007, Suparna Bhattacharya wrote: > > > On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote: > > > > On Thu, 4 Jan 2007 10:26:21 +0530 > > > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > > > > > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote: > > > > > > On Thu, 28 Dec 2006 13:53:08 +0530 > > > > > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > > > > > > > > > This patchset implements changes to make filesystem AIO read > > > > > > > and write asynchronous for the non O_DIRECT case. > > > > > > > > > > > > Unfortunately the unplugging changes in Jen's block tree have > > > > > > trashed these > > > > > > patches to a degree that I'm not confident in my repair attempts. > > > > > > So I'll > > > > > > drop the fasio patches from -mm. > > > > > > > > > > I took a quick look and the conflicts seem pretty minor to me, the > > > > > unplugging > > > > > changes mostly touch nearby code. > > > > > > > > Well... the conflicts (both mechanical and conceptual) are such that a > > > > round of retesting is needed. > > > > > > > > > Please let know how you want this fixed up. > > > > > > > > > > >From what I can tell the comments in the unplug patches seem to say > > > > > >that > > > > > it needs more work and testing, so perhaps a separate fixup patch may > > > > > be > > > > > a better idea rather than make the fsaio patchset dependent on this. > > > > > > > > Patches against next -mm would be appreciated, please. Sorry about > > > > that. > > > > > > > > I _assume_ Jens is targetting 2.6.21? > > > > > > When is the next -mm likely to be out ? > > > > > > I was considering regenerating the blk unplug patches against the > > > fsaio changes instead of the other way around, if Jens were willing to > > > accept that. But if the next -mm is just around the corner then its > > > not an issue. 
> >
> > I don't really care much, but I work against mainline and anything but
> > occasional one-off generations of a patch against a different base is
> > not very likely.
> >
> > The -mm order should just reflect the merge order of the patches, what
> > is the fsaio target?
>
> 2.6.21 was what I had in mind, to enable the glibc folks to proceed with
> conversion to native AIO.
>
> Regenerating my patches against the unplug stuff is not a problem, I only
> worry about being queued up behind something that may take longer to
> stabilize and is likely to change ... If that is not the case, I don't
> mind.

Same here, hence the suggestion to base them in merging order. If your
target is 2.6.21, then I think fsaio should be first. While I think the
plug changes are safe and as such mergeable, we still need to see lots of
results and do more testing.

--
Jens Axboe
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Fri, Jan 05, 2007 at 08:02:33AM +0100, Jens Axboe wrote: > On Fri, Jan 05 2007, Suparna Bhattacharya wrote: > > On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote: > > > On Thu, 4 Jan 2007 10:26:21 +0530 > > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > > > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote: > > > > > On Thu, 28 Dec 2006 13:53:08 +0530 > > > > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > > > > > > > This patchset implements changes to make filesystem AIO read > > > > > > and write asynchronous for the non O_DIRECT case. > > > > > > > > > > Unfortunately the unplugging changes in Jen's block tree have trashed > > > > > these > > > > > patches to a degree that I'm not confident in my repair attempts. So > > > > > I'll > > > > > drop the fasio patches from -mm. > > > > > > > > I took a quick look and the conflicts seem pretty minor to me, the > > > > unplugging > > > > changes mostly touch nearby code. > > > > > > Well... the conflicts (both mechanical and conceptual) are such that a > > > round of retesting is needed. > > > > > > > Please let know how you want this fixed up. > > > > > > > > >From what I can tell the comments in the unplug patches seem to say > > > > >that > > > > it needs more work and testing, so perhaps a separate fixup patch may be > > > > a better idea rather than make the fsaio patchset dependent on this. > > > > > > Patches against next -mm would be appreciated, please. Sorry about that. > > > > > > I _assume_ Jens is targetting 2.6.21? > > > > When is the next -mm likely to be out ? > > > > I was considering regenerating the blk unplug patches against the > > fsaio changes instead of the other way around, if Jens were willing to > > accept that. But if the next -mm is just around the corner then its > > not an issue. 
>
> I don't really care much, but I work against mainline and anything but
> occasional one-off generations of a patch against a different base is
> not very likely.
>
> The -mm order should just reflect the merge order of the patches, what
> is the fsaio target?

2.6.21 was what I had in mind, to enable the glibc folks to proceed with
conversion to native AIO.

Regenerating my patches against the unplug stuff is not a problem, I only
worry about being queued up behind something that may take longer to
stabilize and is likely to change ... If that is not the case, I don't
mind.

Regards
Suparna

>
> --
> Jens Axboe

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Fri, Jan 05 2007, Suparna Bhattacharya wrote: > On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote: > > On Thu, 4 Jan 2007 10:26:21 +0530 > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote: > > > > On Thu, 28 Dec 2006 13:53:08 +0530 > > > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > > > > > This patchset implements changes to make filesystem AIO read > > > > > and write asynchronous for the non O_DIRECT case. > > > > > > > > Unfortunately the unplugging changes in Jen's block tree have trashed > > > > these > > > > patches to a degree that I'm not confident in my repair attempts. So > > > > I'll > > > > drop the fasio patches from -mm. > > > > > > I took a quick look and the conflicts seem pretty minor to me, the > > > unplugging > > > changes mostly touch nearby code. > > > > Well... the conflicts (both mechanical and conceptual) are such that a > > round of retesting is needed. > > > > > Please let know how you want this fixed up. > > > > > > >From what I can tell the comments in the unplug patches seem to say that > > > it needs more work and testing, so perhaps a separate fixup patch may be > > > a better idea rather than make the fsaio patchset dependent on this. > > > > Patches against next -mm would be appreciated, please. Sorry about that. > > > > I _assume_ Jens is targetting 2.6.21? > > When is the next -mm likely to be out ? > > I was considering regenerating the blk unplug patches against the > fsaio changes instead of the other way around, if Jens were willing to > accept that. But if the next -mm is just around the corner then its > not an issue. I don't really care much, but I work against mainline and anything but occasional one-off generations of a patch against a different base is not very likely. The -mm order should just reflect the merge order of the patches, what is the fsaio target? 
--
Jens Axboe
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, Jan 04, 2007 at 09:02:42AM -0800, Andrew Morton wrote: > On Thu, 4 Jan 2007 10:26:21 +0530 > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote: > > > On Thu, 28 Dec 2006 13:53:08 +0530 > > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote: > > > > > > > This patchset implements changes to make filesystem AIO read > > > > and write asynchronous for the non O_DIRECT case. > > > > > > Unfortunately the unplugging changes in Jen's block tree have trashed > > > these > > > patches to a degree that I'm not confident in my repair attempts. So I'll > > > drop the fasio patches from -mm. > > > > I took a quick look and the conflicts seem pretty minor to me, the > > unplugging > > changes mostly touch nearby code. > > Well... the conflicts (both mechanical and conceptual) are such that a > round of retesting is needed. > > > Please let know how you want this fixed up. > > > > >From what I can tell the comments in the unplug patches seem to say that > > it needs more work and testing, so perhaps a separate fixup patch may be > > a better idea rather than make the fsaio patchset dependent on this. > > Patches against next -mm would be appreciated, please. Sorry about that. > > I _assume_ Jens is targetting 2.6.21? When is the next -mm likely to be out ? I was considering regenerating the blk unplug patches against the fsaio changes instead of the other way around, if Jens were willing to accept that. But if the next -mm is just around the corner then its not an issue. 
Regards
Suparna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
Suparna Bhattacharya wrote: On Thu, Jan 04, 2007 at 05:50:11PM +1100, Nick Piggin wrote: OK, but I think that after IO submission, you do not run sync_page to unplug the block device, like the normal IO path would (via lock_page, before the explicit plug patches). In the buffered AIO case, we do run sync page like normal IO ... just that we don't block in io_schedule(), everything else is pretty much similar. You do? OK I must have misread it. Ignore that, then ;) I'm sure more merging or batching could be done, but also consider that most programs will not ever make use of any added complexity. I guess I didn't express myself well - by batching I meant being able to surround submission of a batch of iocbs with explicit plug/unplug instead of explicit plug/unplug for each iocb separately. Of course there is no easy way to do that, since at the io_submit() level we do not know about the block device (each iocb could be directed to a different fd and not just block devices). So it may not be worth thinking about. Well we currently _could_ do that, because the block device plugging code will detect if the request queue changes, and flush built up requests... However, I think we may want to make the plug operations a callback rather than hardcoded block device plugging, so that will make it harder... but you have a good point about increasing the scope of the plugging, it would be a win if we can do it. Regarding your patches, I've just had a quick look and have a question -- what do you do about blocking in page reclaim and dirty balancing? Aren't those major points of blocking with buffered IO? Did your test cases dirty enough to start writeout or cause a lot of reclaim? (admittedly, blocking in reclaim will now be much less common since the dirty mapping accounting). In my earlier versions of patches I actually had converted these waits to be async retriable, but then I came to the conclusion that the additional complexity wasn't worth it. 
For one it didn't seem to make a difference compared to the other bigger cases, and I was looking primarily at handling the gross blocking points (say to enable an application to keep device queues busy) and not making everything asynchronous; for another we had a long discussion thread waay back about not making AIO submitters exempt from throttling or memory availability waits.

OK, I was just curious. For keeping queues busy, your patchset should work well (sleeping for more memory should be pretty uncommon). But for overlapping computation with IO, it may not work so well if it encounters throttling.

--
SUSE Labs, Novell Inc.
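Nick's remark above, that the plugging code can detect a change of request queue and flush what has built up, can be illustrated with a small user-space toy. This is only a sketch under stated assumptions: `toy_queue`, `toy_plug`, `toy_plug_add()` and `toy_plug_finish()` are hypothetical names invented for illustration and are not taken from the unplug patches or from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one block device's request queue. */
struct toy_queue {
    int flushes;     /* how many batches were handed to this queue */
    int dispatched;  /* how many requests this queue has received */
};

/* Hypothetical plug spanning a whole batch of iocbs. */
struct toy_plug {
    struct toy_queue *cur;  /* queue the pending requests belong to */
    int nr_pending;         /* requests built up but not yet dispatched */
};

/* Hand everything built up so far to the queue it was meant for. */
static void toy_flush(struct toy_plug *plug)
{
    if (plug->cur != NULL && plug->nr_pending > 0) {
        plug->cur->flushes++;
        plug->cur->dispatched += plug->nr_pending;
    }
    plug->nr_pending = 0;
}

/* Add one request; a change of target queue flushes the old batch first. */
void toy_plug_add(struct toy_plug *plug, struct toy_queue *q)
{
    assert(q != NULL);
    if (plug->cur != q) {
        toy_flush(plug);
        plug->cur = q;
    }
    plug->nr_pending++;
}

/* End of the batch: flush whatever is still pending. */
void toy_plug_finish(struct toy_plug *plug)
{
    toy_flush(plug);
    plug->cur = NULL;
}
```

In this toy, a batch whose iocbs all hit the same queue is dispatched once at `toy_plug_finish()`, while a batch that alternates between queues degrades to one flush per switch, which is roughly the trade-off being discussed for plugging around io_submit().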
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, Jan 04 2007, Andrew Morton wrote:
> > Please let know how you want this fixed up.
> >
> > From what I can tell the comments in the unplug patches seem to say that
> > it needs more work and testing, so perhaps a separate fixup patch may be
> > a better idea rather than make the fsaio patchset dependent on this.
>
> Patches against next -mm would be appreciated, please. Sorry about that.
>
> I _assume_ Jens is targetting 2.6.21?

Only if everything works perfectly, 2.6.22 is also a viable target.

--
Jens Axboe
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, 4 Jan 2007 10:26:21 +0530
Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote:
> > On Thu, 28 Dec 2006 13:53:08 +0530
> > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> >
> > > This patchset implements changes to make filesystem AIO read
> > > and write asynchronous for the non O_DIRECT case.
> >
> > Unfortunately the unplugging changes in Jen's block tree have trashed these
> > patches to a degree that I'm not confident in my repair attempts. So I'll
> > drop the fasio patches from -mm.
>
> I took a quick look and the conflicts seem pretty minor to me, the unplugging
> changes mostly touch nearby code.

Well... the conflicts (both mechanical and conceptual) are such that a
round of retesting is needed.

> Please let know how you want this fixed up.
>
> From what I can tell the comments in the unplug patches seem to say that
> it needs more work and testing, so perhaps a separate fixup patch may be
> a better idea rather than make the fsaio patchset dependent on this.

Patches against next -mm would be appreciated, please. Sorry about that.

I _assume_ Jens is targetting 2.6.21?
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, Jan 04, 2007 at 05:50:11PM +1100, Nick Piggin wrote: > Suparna Bhattacharya wrote: > >On Thu, Jan 04, 2007 at 04:51:58PM +1100, Nick Piggin wrote: > > >>So long as AIO threads do the same, there would be no problem (plugging > >>is optional, of course). > > > > > >Yup, the AIO threads run the same code as for regular IO, i.e in the rare > >situations where they actually end up submitting IO, so there should > >be no problem. And you have already added plug/unplug at the appropriate > >places in those path, so things should just work. > > Yes I think it should. > > >>This (is supposed to) give a number of improvements over the traditional > >>plugging (although some downsides too). Most notably for me, the VM gets > >>cleaner ;) > >> > >>However AIO could be an interesting case to test for explicit plugging > >>because of the way they interact. What kind of improvements do you see > >>with samba and do you have any benchmark setups? > > > > > >I think aio-stress would be a good way to test/benchmark this sort of > >stuff, at least for a start. > >Samba (if I understand this correctly based on my discussions with Tridge) > >is less likely to generate the kind of io patterns that could benefit from > >explicit plugging (because the file server has no way to tell what the next > >request is going to be, it ends up submitting each independently instead of > >batching iocbs). > > OK, but I think that after IO submission, you do not run sync_page to > unplug the block device, like the normal IO path would (via lock_page, > before the explicit plug patches). In the buffered AIO case, we do run sync page like normal IO ... just that we don't block in io_schedule(), everything else is pretty much similar. In the case of AIO-DIO, the path is like the just like non-AIO DIO, there is a call to blk_run_address_space() after submission. > > However, with explicit plugging, AIO requests will be started immediately. 
> Maybe this won't be noticable if the device is always busy, but I would > like to know there isn't a regression. > > >In future there may be optimization possibilities to consider when > >submitting batches of iocbs, i.e. on the io submission path. Maybe > >AIO - O_DIRECT would be interesting to play with first in this regardi ? > > Well I've got some simple per-process batching in there now, each process > has a list of pending requests. Request merging is done locklessly against > the last request added; and submission at unplug time is batched under a > single block device lock. > > I'm sure more merging or batching could be done, but also consider that > most programs will not ever make use of any added complexity. I guess I didn't express myself well - by batching I meant being able to surround submission of a batch of iocbs with explicit plug/unplug instead of explicit plug/unplug for each iocb separately. Of course there is no easy way to do that, since at the io_submit() level we do not know about the block device (each iocb could be directed to a different fd and not just block devices). So it may not be worth thinking about. > > Regarding your patches, I've just had a quick look and have a question -- > what do you do about blocking in page reclaim and dirty balancing? Aren't > those major points of blocking with buffered IO? Did your test cases > dirty enough to start writeout or cause a lot of reclaim? (admittedly, > blocking in reclaim will now be much less common since the dirty mapping > accounting). In my earlier versions of patches I actually had converted these waits to be async retriable, but then I came to the conclusion that the additional complexity wasn't worth it. 
For one it didn't seem to make a difference compared to the other bigger cases, and I was looking primarily at handling the gross blocking points (say to enable an application to keep device queues busy) and not making everything asynchronous; for another we had a long discussion thread waay back about not making AIO submitters exempt from throttling or memory availability waits.

Regards
Suparna

>
> --
> SUSE Labs, Novell Inc.

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
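The "don't block, retry later" behaviour described in this message for the buffered AIO case can be modelled in a few lines of user-space C. This is a minimal sketch, assuming a single waiter per page and a direct callback in place of real wait queues; `toy_page`, `toy_iocb`, `toy_lock_page_async()`, `toy_unlock_page()`, `toy_read()` and the `TOY_RETRY` code are all hypothetical names for illustration, not the interfaces in the fsaio patches.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RETRY (-1)  /* hypothetical "queued, will be retried" result */

struct toy_iocb;
typedef int (*toy_retry_fn)(struct toy_iocb *);

struct toy_page {
    int locked;               /* nonzero while IO on the page is in flight */
    struct toy_iocb *waiter;  /* this toy supports one queued waiter only */
};

struct toy_iocb {
    struct toy_page *page;
    toy_retry_fn retry;  /* called again once the page is unlocked */
    int result;          /* bytes "read" once the operation completes */
    int done;
};

/* Async variant: queue the iocb and return instead of sleeping. */
int toy_lock_page_async(struct toy_page *pg, struct toy_iocb *iocb)
{
    if (!pg->locked) {
        pg->locked = 1;
        return 0;
    }
    assert(iocb->retry != NULL);
    pg->waiter = iocb;
    return TOY_RETRY;
}

/* Unlocking the page restarts any operation that was waiting on it. */
void toy_unlock_page(struct toy_page *pg)
{
    struct toy_iocb *w = pg->waiter;

    pg->locked = 0;
    pg->waiter = NULL;
    if (w != NULL)
        w->retry(w);
}

/* A read that returns TOY_RETRY instead of blocking on a locked page. */
int toy_read(struct toy_iocb *iocb)
{
    if (toy_lock_page_async(iocb->page, iocb) == TOY_RETRY)
        return TOY_RETRY;
    iocb->result = 4096;  /* pretend one page was copied to the user */
    iocb->done = 1;
    toy_unlock_page(iocb->page);
    return 0;
}
```

The submitter gets control back immediately when the page is busy, and the same `toy_read()` body runs again from the unlock path. Waits that were deliberately left synchronous, such as throttling or memory availability, are outside this model.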
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
Suparna Bhattacharya wrote: On Thu, Jan 04, 2007 at 04:51:58PM +1100, Nick Piggin wrote: So long as AIO threads do the same, there would be no problem (plugging is optional, of course). Yup, the AIO threads run the same code as for regular IO, i.e in the rare situations where they actually end up submitting IO, so there should be no problem. And you have already added plug/unplug at the appropriate places in those path, so things should just work. Yes I think it should. This (is supposed to) give a number of improvements over the traditional plugging (although some downsides too). Most notably for me, the VM gets cleaner ;) However AIO could be an interesting case to test for explicit plugging because of the way they interact. What kind of improvements do you see with samba and do you have any benchmark setups? I think aio-stress would be a good way to test/benchmark this sort of stuff, at least for a start. Samba (if I understand this correctly based on my discussions with Tridge) is less likely to generate the kind of io patterns that could benefit from explicit plugging (because the file server has no way to tell what the next request is going to be, it ends up submitting each independently instead of batching iocbs). OK, but I think that after IO submission, you do not run sync_page to unplug the block device, like the normal IO path would (via lock_page, before the explicit plug patches). However, with explicit plugging, AIO requests will be started immediately. Maybe this won't be noticable if the device is always busy, but I would like to know there isn't a regression. In future there may be optimization possibilities to consider when submitting batches of iocbs, i.e. on the io submission path. Maybe AIO - O_DIRECT would be interesting to play with first in this regardi ? Well I've got some simple per-process batching in there now, each process has a list of pending requests. 
Request merging is done locklessly against the last request added; and submission at unplug time is batched under a single block device lock.

I'm sure more merging or batching could be done, but also consider that most programs will not ever make use of any added complexity.

Regarding your patches, I've just had a quick look and have a question -- what do you do about blocking in page reclaim and dirty balancing? Aren't those major points of blocking with buffered IO? Did your test cases dirty enough to start writeout or cause a lot of reclaim? (admittedly, blocking in reclaim will now be much less common since the dirty mapping accounting).

--
SUSE Labs, Novell Inc.
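The per-process batching described in this message (a private list of pending requests, merging only against the last request added, one lock at unplug time) might look roughly like the following user-space toy. It is a sketch under assumptions: the names `toy_req`, `toy_pending`, `toy_add()` and `toy_unplug()` are hypothetical, requests are reduced to a sector range, and the "lock" is just a counter. The merge needs no lock here on the assumption that the list is private to the submitting task.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_REQS 8

/* Hypothetical request: a contiguous run of sectors. */
struct toy_req {
    unsigned long sector;
    unsigned long nr_sectors;
};

/* Hypothetical per-task list of requests not yet sent to the device. */
struct toy_pending {
    struct toy_req reqs[TOY_MAX_REQS];
    size_t nr;
};

/*
 * Add IO to the task's private list. Only the most recently added
 * request is considered for merging. Returns 1 if merged into it,
 * 0 if a new request was added, -1 if the list is full.
 */
int toy_add(struct toy_pending *p, unsigned long sector,
            unsigned long nr_sectors)
{
    assert(nr_sectors > 0);
    if (p->nr > 0) {
        struct toy_req *last = &p->reqs[p->nr - 1];

        if (last->sector + last->nr_sectors == sector) {
            last->nr_sectors += nr_sectors;  /* back merge */
            return 1;
        }
    }
    if (p->nr == TOY_MAX_REQS)
        return -1;
    p->reqs[p->nr].sector = sector;
    p->reqs[p->nr].nr_sectors = nr_sectors;
    p->nr++;
    return 0;
}

/*
 * Unplug: hand the whole list over in one go. *lock_count stands in
 * for acquisitions of the block device lock. Returns the number of
 * requests submitted.
 */
size_t toy_unplug(struct toy_pending *p, int *lock_count)
{
    size_t n = p->nr;

    if (n > 0)
        (*lock_count)++;  /* one acquisition covers the whole batch */
    p->nr = 0;
    return n;
}
```

Two sequential 8-sector writes followed by one far away end up as two requests, and `toy_unplug()` takes the stand-in lock once for both.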
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, Jan 04, 2007 at 04:51:58PM +1100, Nick Piggin wrote: > Suparna Bhattacharya wrote: > >On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote: > > >>Plus Jens's unplugging changes add more reliance upon context inside > >>*current, for the plugging and unplugging operations. I expect that the > >>fsaio patches will need to be aware of the protocol which those proposed > >>changes add. > > > > > >Whatever logic applies to background writeout etc should also just apply > >as is to aio worker threads, shouldn't it ? At least at a quick glance I > >don't see anything special that needs to be done for fsaio, but its good > >to be aware of this anyway, thanks ! > > The submitting process plugs itself, submits all its IO, then unplugs > itself (ie. so the plug is now on the process, rather than the block > device). > > So long as AIO threads do the same, there would be no problem (plugging > is optional, of course). Yup, the AIO threads run the same code as for regular IO, i.e in the rare situations where they actually end up submitting IO, so there should be no problem. And you have already added plug/unplug at the appropriate places in those path, so things should just work. > > This (is supposed to) give a number of improvements over the traditional > plugging (although some downsides too). Most notably for me, the VM gets > cleaner ;) > > However AIO could be an interesting case to test for explicit plugging > because of the way they interact. What kind of improvements do you see > with samba and do you have any benchmark setups? I think aio-stress would be a good way to test/benchmark this sort of stuff, at least for a start. Samba (if I understand this correctly based on my discussions with Tridge) is less likely to generate the kind of io patterns that could benefit from explicit plugging (because the file server has no way to tell what the next request is going to be, it ends up submitting each independently instead of batching iocbs). 
In future there may be optimization possibilities to consider when submitting batches of iocbs, i.e. on the io submission path. Maybe AIO - O_DIRECT would be interesting to play with first in this regard?

Regards
Suparna

>
> Thanks,
> Nick
>
> --
> SUSE Labs, Novell Inc.

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
Suparna Bhattacharya wrote:
> On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote:
>
>> Plus Jens's unplugging changes add more reliance upon context inside
>> *current, for the plugging and unplugging operations. I expect that the
>> fsaio patches will need to be aware of the protocol which those proposed
>> changes add.
>
> Whatever logic applies to background writeout etc should also just apply
> as is to aio worker threads, shouldn't it ? At least at a quick glance I
> don't see anything special that needs to be done for fsaio, but its good
> to be aware of this anyway, thanks !

The submitting process plugs itself, submits all its IO, then unplugs
itself (ie. so the plug is now on the process, rather than the block
device).

So long as AIO threads do the same, there would be no problem (plugging
is optional, of course).

This (is supposed to) give a number of improvements over the traditional
plugging (although some downsides too). Most notably for me, the VM gets
cleaner ;)

However AIO could be an interesting case to test for explicit plugging
because of the way they interact. What kind of improvements do you see
with samba and do you have any benchmark setups?

Thanks,
Nick

--
SUSE Labs, Novell Inc.
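The plug, submit, unplug protocol described in this message can be sketched as a user-space toy in which the pending IO sits on the task rather than on the device. Everything here is an assumption made for illustration: `toy_task`, `toy_queue`, `toy_plug()`, `toy_submit()` and `toy_unplug()` are hypothetical names, not the interfaces from the block tree.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PLUG_MAX 16

/* Hypothetical stand-in for a block device queue. */
struct toy_queue {
    int lock_acquisitions;  /* times the device lock was taken */
    int dispatched;         /* requests handed to the device */
};

/* Hypothetical task state: the plug lives here, not on the device. */
struct toy_task {
    struct toy_queue *queue;
    int plugged;
    int pending[TOY_PLUG_MAX];  /* request ids parked on the task */
    size_t nr_pending;
};

void toy_task_init(struct toy_task *t, struct toy_queue *q)
{
    t->queue = q;
    t->plugged = 0;
    t->nr_pending = 0;
}

/* Send all parked requests to the device under one lock acquisition. */
static void toy_flush(struct toy_task *t)
{
    if (t->nr_pending == 0)
        return;
    t->queue->lock_acquisitions++;
    t->queue->dispatched += (int)t->nr_pending;
    t->nr_pending = 0;
}

void toy_plug(struct toy_task *t)
{
    t->plugged = 1;
}

/* Plugging is optional: an unplugged task dispatches immediately. */
void toy_submit(struct toy_task *t, int id)
{
    if (!t->plugged) {
        t->queue->lock_acquisitions++;
        t->queue->dispatched++;
        return;
    }
    if (t->nr_pending == TOY_PLUG_MAX)
        toy_flush(t);  /* make room rather than drop the request */
    assert(t->nr_pending < TOY_PLUG_MAX);
    t->pending[t->nr_pending++] = id;
}

void toy_unplug(struct toy_task *t)
{
    toy_flush(t);
    t->plugged = 0;
}
```

In this model nothing reaches the device until the task unplugs itself, and a thread that never plugs simply dispatches request by request, which is what "plugging is optional" amounts to for the AIO worker threads.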
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote:
> On Thu, 28 Dec 2006 13:53:08 +0530
> Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
>
> > This patchset implements changes to make filesystem AIO read
> > and write asynchronous for the non O_DIRECT case.
>
> Unfortunately the unplugging changes in Jens's block tree have trashed these
> patches to a degree that I'm not confident in my repair attempts. So I'll
> drop the fsaio patches from -mm.

I took a quick look and the conflicts seem pretty minor to me, the
unplugging changes mostly touch nearby code. Please let me know how you
want this fixed up.

From what I can tell the comments in the unplug patches seem to say
that it needs more work and testing, so perhaps a separate fixup patch
may be a better idea rather than make the fsaio patchset dependent on
this.

> Zach's observations regarding this code's reliance upon things at *current
> sounded pretty serious, so I expect we'd be seeing changes for that anyway?

Not really, at least nothing that I can see needing a change.
As I mentioned there is no reliance on *current in the code that
runs in the aio threads that we need to worry about.

The generic_write_checks etc that Zach was referring to all happens in
the context of the submitting process, not in retry context. The model
is to perform all validation at the time of io submission. And of
course things like copy_to_user() are already taken care of by
use_mm().

Let's look at it this way - the kernel already has the ability to do
background writeout on behalf of a task from a kernel thread and
likewise read(ahead) pages that may be consumed by another task. There
is also the ability to operate on another task's address space (as
used by ptrace).

So there is nothing groundbreaking here.

In fact on most occasions, all the IO is initiated in the context of
the submitting task, so the aio threads mainly deal with checking for
completion and transferring completed data to user space.
> Plus Jens's unplugging changes add more reliance upon context inside
> *current, for the plugging and unplugging operations. I expect that the
> fsaio patches will need to be aware of the protocol which those proposed
> changes add.

Whatever logic applies to background writeout etc should also just
apply as is to aio worker threads, shouldn't it? At least at a quick
glance I don't see anything special that needs to be done for fsaio,
but it's good to be aware of this anyway, thanks!

Regards
Suparna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, 28 Dec 2006 13:53:08 +0530
Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> This patchset implements changes to make filesystem AIO read
> and write asynchronous for the non O_DIRECT case.

Unfortunately the unplugging changes in Jens's block tree have trashed
these patches to a degree that I'm not confident in my repair attempts.
So I'll drop the fsaio patches from -mm.

Zach's observations regarding this code's reliance upon things at
*current sounded pretty serious, so I expect we'd be seeing changes for
that anyway?

Plus Jens's unplugging changes add more reliance upon context inside
*current, for the plugging and unplugging operations. I expect that the
fsaio patches will need to be aware of the protocol which those
proposed changes add.
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
On Thu, 28 Dec 2006 13:53:08 +0530
Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> This patchset implements changes to make filesystem AIO read
> and write asynchronous for the non O_DIRECT case.

I did s/lock_page_slow/lock_page_blocking/g then merged all these
into -mm, thanks.
Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
* Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> The following is a sampling of comparative aio-stress results with the
> patches (each run starts with uncached files):
>
> aio-stress throughput comparisons (in MB/s):
>
> file size 1GB, record size 64KB, depth 64, ios per iteration 8
> max io_submit 8, buffer alignment set to 4KB
> 4 way Pentium III SMP box, Adaptec AIC-7896/7 Ultra2 SCSI, 40 MB/s
> Filesystem: ext2
>
>                          Buffered (non O_DIRECT)       O_DIRECT
>                          Vanilla        Patched        Vanilla  Patched
> Random-Read              10.08          23.91          18.91    18.98
> Random-O_SYNC-Write       8.86          15.84          16.51    16.53
> Sequential-Read          31.49          33.00          31.86    31.79
> Sequential-O_SYNC-Write   8.68          32.60          31.45    32.44
> Random-Write             31.09 (19.65)  30.90 (19.65)
> Sequential-Write         30.84 (28.94)  30.09 (28.39)

the numbers look very convincing to me!

	Ingo
[PATCHSET 1][PATCH 0/6] Filesystem AIO read/write
Currently native Linux AIO is properly supported (in the sense of
actually being asynchronous) only for files opened with O_DIRECT.
While this suffices for a major (and most visible) user of AIO, i.e.
databases, other types of users like Samba require AIO support for
regular file IO. Also, for glibc POSIX AIO to be able to switch to
using native AIO instead of the current simulation using threads, it
needs/expects asynchronous behaviour for both O_DIRECT and buffered
file AIO.

This patchset implements changes to make filesystem AIO read and
write asynchronous for the non O_DIRECT case. This is mainly relevant
in the case of reads of uncached or partially cached files, and
O_SYNC writes.

Instead of translating regular IO to [AIO + wait], it translates AIO
to [regular IO - blocking + retries]. The intent of implementing it
this way is to avoid modifying or slowing down normal usage, by
keeping it pretty much the way it is without AIO, while avoiding code
duplication. Instead we make AIO vs regular IO checks inside
io_schedule(), i.e. at the blocking points. The low-level unit of
distinction is a wait queue entry, which in the AIO case is contained
in an iocb and in the synchronous IO case is associated with the
calling task.

The core idea is that we complete as much IO as we can in a
non-blocking fashion, and then continue the remaining part of the
transfer again when woken up asynchronously via a wait queue callback
when pages are ready ... thus each iteration progresses through more
of the request until it is completed. The interesting part here is
that owing largely to the idempotence in the way radix-tree page cache
traversal happens, every iteration is simply a smaller read/write.
Almost all of the iocb manipulation and advancement in the AIO case
happens in the high level AIO code, rather than in regular
VFS/filesystem paths.
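The retry model described above can be illustrated with a small user-space simulation. This is a loose sketch, not the patchset's code: PageCache, Iocb, wait_on and complete_io are hypothetical names, the "page cache" is just a list of ready flags, and the wait queue is a dictionary of callbacks.

```python
# Toy model of "regular IO - blocking + retries". All names are hypothetical.

class PageCache:
    def __init__(self, npages):
        self.ready = [False] * npages
        self.waiters = {}          # page index -> list of callbacks

    def wait_on(self, index, callback):
        self.waiters.setdefault(index, []).append(callback)

    def complete_io(self, index):
        """Simulate the disk finishing a page: mark it ready, wake waiters."""
        self.ready[index] = True
        for callback in self.waiters.pop(index, []):
            callback()


class Iocb:
    def __init__(self, cache, start, count):
        self.cache = cache
        self.pos = start           # next page to transfer
        self.end = start + count
        self.done = False
        self.retries = 0

    def read(self):
        """Transfer every page that is ready; never block."""
        while self.pos < self.end:
            if not self.cache.ready[self.pos]:
                # A synchronous read would sleep here. Instead, queue a
                # wait entry tied to this iocb and return to the caller.
                self.cache.wait_on(self.pos, self.retry)
                return
            self.pos += 1          # "transfer" one page
        self.done = True

    def retry(self):
        self.retries += 1
        self.read()                # same code path, just a smaller request


cache = PageCache(4)
cache.ready[0] = True              # partially cached file
iocb = Iocb(cache, 0, 4)
iocb.read()                        # submission: page 0 done, queued on page 1
progress = [iocb.pos]
for page in (1, 2, 3):
    cache.complete_io(page)        # wakeup triggers a retry
    progress.append(iocb.pos)
print(progress, iocb.done, iocb.retries)   # [1, 2, 3, 4] True 3
```

Each retry re-enters the same read() path from the saved position, which is the sense in which every iteration is simply a smaller read.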
The following is a sampling of comparative aio-stress results with the
patches (each run starts with uncached files):

aio-stress throughput comparisons (in MB/s):

file size 1GB, record size 64KB, depth 64, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
4 way Pentium III SMP box, Adaptec AIC-7896/7 Ultra2 SCSI, 40 MB/s
Filesystem: ext2

                         Buffered (non O_DIRECT)       O_DIRECT
                         Vanilla        Patched        Vanilla  Patched
Random-Read              10.08          23.91          18.91    18.98
Random-O_SYNC-Write       8.86          15.84          16.51    16.53
Sequential-Read          31.49          33.00          31.86    31.79
Sequential-O_SYNC-Write   8.68          32.60          31.45    32.44
Random-Write             31.09 (19.65)  30.90 (19.65)
Sequential-Write         30.84 (28.94)  30.09 (28.39)

Regards
Suparna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
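To make the buffered results easier to compare, the short Python snippet below computes the patched/vanilla throughput ratio for the four rows that report both buffered and O_DIRECT figures. The numbers are copied from the results above; the Random-Write and Sequential-Write rows are left out because the posting does not say what the parenthesised values mean.

```python
# Buffered (non O_DIRECT) throughput in MB/s from the posted results,
# as (vanilla, patched) pairs.
buffered = {
    "Random-Read": (10.08, 23.91),
    "Random-O_SYNC-Write": (8.86, 15.84),
    "Sequential-Read": (31.49, 33.00),
    "Sequential-O_SYNC-Write": (8.68, 32.60),
}


def speedup(vanilla, patched):
    """Ratio of patched to vanilla throughput."""
    return patched / vanilla


ratios = {name: speedup(v, p) for name, (v, p) in buffered.items()}
for name, ratio in ratios.items():
    print(f"{name:<24} {ratio:.2f}x")
```

On these figures the largest gains are for sequential O_SYNC writes (about 3.8x) and random reads (about 2.4x), while sequential reads change by roughly 5%.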