On Mon, Oct 08, 2007 at 09:54:33AM +1000, David Chinner wrote:
> On Fri, Oct 05, 2007 at 08:30:28PM +0800, Fengguang Wu wrote:
> > The improvement could be:
> > - kswapd is now explicitly preferred to do the writeout;
>
> Careful. kswapd is much less efficient at writeout than pdflush
> because
On Fri, Oct 05, 2007 at 08:30:28PM +0800, Fengguang Wu wrote:
> The improvement could be:
> - kswapd is now explicitly preferred to do the writeout;
Careful. kswapd is much less efficient at writeout than pdflush
because it does not do low->high offset writeback per address space.
It just flushes
On Fri, Oct 05, 2007 at 08:30:28PM +0800, Fengguang Wu wrote:
The improvement could be:
- kswapd is now explicitly preferred to do the writeout;
Careful. kswapd is much less efficient at writeout than pdflush
because it does not do low->high offset writeback per address space.
It just flushes
On Mon, Oct 08, 2007 at 09:54:33AM +1000, David Chinner wrote:
On Fri, Oct 05, 2007 at 08:30:28PM +0800, Fengguang Wu wrote:
The improvement could be:
- kswapd is now explicitly preferred to do the writeout;
Careful. kswapd is much less efficient at writeout than pdflush
because it does
On Fri, Oct 05, 2007 at 10:20:05AM -0700, Andrew Morton wrote:
> On Fri, 5 Oct 2007 20:30:28 +0800
> Fengguang Wu <[EMAIL PROTECTED]> wrote:
>
> > > commit c4e2d7ddde9693a4c05da7afd485db02c27a7a09
> > > Author: akpm
> > > Date: Sun Dec 22 01:07:33 2002 +
> > >
> > > [PATCH] Give
On Fri, Oct 05, 2007 at 08:32:19PM +0200, Peter Zijlstra wrote:
>
> On Fri, 2007-10-05 at 13:50 -0400, Trond Myklebust wrote:
> > On Fri, 2007-10-05 at 12:57 +0200, Peter Zijlstra wrote:
> > > In this patch I totally ignored unstable, but I'm not sure that's the
> > > proper thing to do, I'd need
On Fri, 2007-10-05 at 15:23 -0400, Trond Myklebust wrote:
> On Fri, 2007-10-05 at 15:20 -0400, Trond Myklebust wrote:
> > On Fri, 2007-10-05 at 20:32 +0200, Peter Zijlstra wrote:
> > > Well, the thing is, we throttle pageout in throttle_vm_writeout(). As it
> > > stands we can deadlock there
On Fri, 05 Oct 2007 09:32:57 +0200
Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> I think just adding nr_cpus * ratelimit_pages to the dirty_thresh in
> throttle_vm_writeout() will also solve the problem
Agreed, that should fix the main latency issues.
--
All Rights Reversed
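The suggestion above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel's code: the struct and function names are hypothetical, and the 10% boost applied before the comparison is an assumption about how throttle_vm_writeout() computed its limit at the time. The per-CPU term models the proposed slack, on the assumption that each CPU may dirty up to ratelimit_pages pages before the dirty limits are rechecked.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these names are not kernel API. */
struct vm_counters {
    long nr_writeback;  /* pages currently under writeback */
    long nr_unstable;   /* pages written but not yet committed (NFS) */
};

/*
 * Return true when pageout may proceed without waiting.
 * nr_cpus * ratelimit_pages is the slack proposed in the thread.
 */
static bool writeout_may_proceed(const struct vm_counters *c,
                                 long dirty_thresh,
                                 long nr_cpus, long ratelimit_pages)
{
    long limit = dirty_thresh;

    limit += dirty_thresh / 10;          /* assumed ~10% boost */
    limit += nr_cpus * ratelimit_pages;  /* proposed per-CPU slack */

    return c->nr_writeback + c->nr_unstable <= limit;
}
```

With dirty_thresh = 1000 and 1150 pages under writeback, the model throttles when the slack is zero and lets pageout proceed once 4 CPUs with ratelimit_pages = 32 are accounted for.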
On Fri, 2007-10-05 at 15:20 -0400, Trond Myklebust wrote:
> On Fri, 2007-10-05 at 20:32 +0200, Peter Zijlstra wrote:
> > Well, the thing is, we throttle pageout in throttle_vm_writeout(). As it
> > stands we can deadlock there because it just waits for the numbers to
> > drop, and unstable pages
On Fri, 2007-10-05 at 20:32 +0200, Peter Zijlstra wrote:
> Well, the thing is, we throttle pageout in throttle_vm_writeout(). As it
> stands we can deadlock there because it just waits for the numbers to
> drop, and unstable pages don't automagically disappear. Only
> write_inodes() - normally
On Fri, 2007-10-05 at 13:50 -0400, Trond Myklebust wrote:
> On Fri, 2007-10-05 at 12:57 +0200, Peter Zijlstra wrote:
> > In this patch I totally ignored unstable, but I'm not sure that's the
> > proper thing to do, I'd need to figure out what happens to an unstable
> > page when passed into
On Fri, 2007-10-05 at 12:57 +0200, Peter Zijlstra wrote:
> In this patch I totally ignored unstable, but I'm not sure that's the
> proper thing to do, I'd need to figure out what happens to an unstable
> page when passed into pageout() - or if it's passed to pageout at all.
>
> If unstable pages
On Fri, 5 Oct 2007 20:30:28 +0800
Fengguang Wu <[EMAIL PROTECTED]> wrote:
> > commit c4e2d7ddde9693a4c05da7afd485db02c27a7a09
> > Author: akpm
> > Date: Sun Dec 22 01:07:33 2002 +
> >
> > [PATCH] Give kswapd writeback higher priority than pdflush
> >
> > The `low latency page
>> I think that's an improvement in all respects.
>>
>> However it still does not generally address the deadlock scenario: if
>> there's a small DMA zone, and fuse manages to put all of those pages
>> under writeout, then there's trouble.
Miklos> And the only way to solve that AFAICS, is to
On Thu, Oct 04, 2007 at 10:46:50AM -0700, Andrew Morton wrote:
> On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > > > > > But that said, there might be better ways to do that.
> > > > >
> > > > > Sure, if we do need to globally limit the number of
> Limiting FUSE to say 50% (suggestion from your other email) sounds like
> a horrible hack to me. - Need more time to think on this.
I don't really understand all that page balancing stuff, but I think
this will probably never or very rarely happen, because the allocator
will prefer the bigger
On Fri, 2007-10-05 at 12:27 +0200, Miklos Szeredi wrote:
> > diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> > index 4ef4d22..eff2438 100644
> > --- a/include/linux/writeback.h
> > +++ b/include/linux/writeback.h
> > @@ -88,7 +88,7 @@ static inline void wait_on_inode(struct
> I think that's an improvement in all respects.
>
> However it still does not generally address the deadlock scenario: if
> there's a small DMA zone, and fuse manages to put all of those pages
> under writeout, then there's trouble.
And the only way to solve that AFAICS, is to make sure fuse
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index 4ef4d22..eff2438 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -88,7 +88,7 @@ static inline void wait_on_inode(struct inode *inode)
> int wakeup_pdflush(long nr_pages);
> void
On Fri, 2007-10-05 at 11:22 +0200, Miklos Szeredi wrote:
> > So how do we end up with more writeback pages than that? should we teach
> > pdflush about these limits as well?
>
> Ugh.
>
> I think we should rather fix vmscan to not spin when all pages of a
> zone are already under writeout. Which
> So how do we end up with more writeback pages than that? should we teach
> pdflush about these limits as well?
Ugh.
I think we should rather fix vmscan to not spin when all pages of a
zone are already under writeout. Which is the _real_ problem,
according to Andrew.
Miklos
On Thu, 2007-10-04 at 17:48 -0700, Andrew Morton wrote:
> On Fri, 05 Oct 2007 02:12:30 +0200 Miklos Szeredi <[EMAIL PROTECTED]> wrote:
>
> > >
> > > I don't think I understand that. Sure, it _shouldn't_ be a problem. But it
> > > _is_. That's what we're trying to fix, isn't it?
> >
>
On Thu, 2007-10-04 at 16:09 -0700, Andrew Morton wrote:
> On Fri, 05 Oct 2007 00:39:16 +0200
> Miklos Szeredi <[EMAIL PROTECTED]> wrote:
>
> > > throttle_vm_writeout() should be a per-zone thing, I guess. Perhaps fixing
> > > that would fix your deadlock. That's doubtful, but I don't
On Thu, 2007-10-04 at 16:09 -0700, Andrew Morton wrote:
On Fri, 05 Oct 2007 00:39:16 +0200
Miklos Szeredi [EMAIL PROTECTED] wrote:
throttle_vm_writeout() should be a per-zone thing, I guess. Perhaps fixing
that would fix your deadlock. That's doubtful, but I don't know anything
On Thu, 2007-10-04 at 17:48 -0700, Andrew Morton wrote:
On Fri, 05 Oct 2007 02:12:30 +0200 Miklos Szeredi [EMAIL PROTECTED] wrote:
I don't think I understand that. Sure, it _shouldn't_ be a problem. But it
_is_. That's what we're trying to fix, isn't it?
The problem, I
So how do we end up with more writeback pages than that? should we teach
pdflush about these limits as well?
Ugh.
I think we should rather fix vmscan to not spin when all pages of a
zone are already under writeout. Which is the _real_ problem,
according to Andrew.
Miklos
On Fri, 2007-10-05 at 11:22 +0200, Miklos Szeredi wrote:
So how do we end up with more writeback pages than that? should we teach
pdflush about these limits as well?
Ugh.
I think we should rather fix vmscan to not spin when all pages of a
zone are already under writeout. Which is the
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 4ef4d22..eff2438 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -88,7 +88,7 @@ static inline void wait_on_inode(struct inode *inode)
int wakeup_pdflush(long nr_pages);
void
On Thu, Oct 04, 2007 at 10:46:50AM -0700, Andrew Morton wrote:
On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote:
But that said, there might be better ways to do that.
Sure, if we do need to globally limit the number of under-writeback
pages, then I
On Fri, 2007-10-05 at 12:27 +0200, Miklos Szeredi wrote:
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 4ef4d22..eff2438 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -88,7 +88,7 @@ static inline void wait_on_inode(struct inode
I think that's an improvement in all respects.
However it still does not generally address the deadlock scenario: if
there's a small DMA zone, and fuse manages to put all of those pages
under writeout, then there's trouble.
And the only way to solve that AFAICS, is to make sure fuse never
Limiting FUSE to say 50% (suggestion from your other email) sounds like
a horrible hack to me. - Need more time to think on this.
I don't really understand all that page balancing stuff, but I think
this will probably never or very rarely happen, because the allocator
will prefer the bigger
I think that's an improvement in all respects.
However it still does not generally address the deadlock scenario: if
there's a small DMA zone, and fuse manages to put all of those pages
under writeout, then there's trouble.
Miklos> And the only way to solve that AFAICS, is to make sure fuse
On Fri, 5 Oct 2007 20:30:28 +0800
Fengguang Wu [EMAIL PROTECTED] wrote:
commit c4e2d7ddde9693a4c05da7afd485db02c27a7a09
Author: akpm <akpm>
Date: Sun Dec 22 01:07:33 2002 +
[PATCH] Give kswapd writeback higher priority than pdflush
The `low latency page reclaim'
On Fri, 2007-10-05 at 12:57 +0200, Peter Zijlstra wrote:
In this patch I totally ignored unstable, but I'm not sure that's the
proper thing to do, I'd need to figure out what happens to an unstable
page when passed into pageout() - or if it's passed to pageout at all.
If unstable pages would
On Fri, 2007-10-05 at 13:50 -0400, Trond Myklebust wrote:
On Fri, 2007-10-05 at 12:57 +0200, Peter Zijlstra wrote:
In this patch I totally ignored unstable, but I'm not sure that's the
proper thing to do, I'd need to figure out what happens to an unstable
page when passed into pageout() -
On Fri, 2007-10-05 at 20:32 +0200, Peter Zijlstra wrote:
Well, the thing is, we throttle pageout in throttle_vm_writeout(). As it
stands we can deadlock there because it just waits for the numbers to
drop, and unstable pages don't automagically disappear. Only
write_inodes() - normally called
On Fri, 2007-10-05 at 15:20 -0400, Trond Myklebust wrote:
On Fri, 2007-10-05 at 20:32 +0200, Peter Zijlstra wrote:
Well, the thing is, we throttle pageout in throttle_vm_writeout(). As it
stands we can deadlock there because it just waits for the numbers to
drop, and unstable pages don't
On Fri, 05 Oct 2007 09:32:57 +0200
Peter Zijlstra [EMAIL PROTECTED] wrote:
I think just adding nr_cpus * ratelimit_pages to the dirty_thresh in
throttle_vm_writeout() will also solve the problem
Agreed, that should fix the main latency issues.
--
All Rights Reversed
On Fri, 2007-10-05 at 15:23 -0400, Trond Myklebust wrote:
On Fri, 2007-10-05 at 15:20 -0400, Trond Myklebust wrote:
On Fri, 2007-10-05 at 20:32 +0200, Peter Zijlstra wrote:
Well, the thing is, we throttle pageout in throttle_vm_writeout(). As it
stands we can deadlock there because it
On Fri, Oct 05, 2007 at 08:32:19PM +0200, Peter Zijlstra wrote:
On Fri, 2007-10-05 at 13:50 -0400, Trond Myklebust wrote:
On Fri, 2007-10-05 at 12:57 +0200, Peter Zijlstra wrote:
In this patch I totally ignored unstable, but I'm not sure that's the
proper thing to do, I'd need to figure
On Fri, Oct 05, 2007 at 10:20:05AM -0700, Andrew Morton wrote:
On Fri, 5 Oct 2007 20:30:28 +0800
Fengguang Wu [EMAIL PROTECTED] wrote:
commit c4e2d7ddde9693a4c05da7afd485db02c27a7a09
Author: akpm <akpm>
Date: Sun Dec 22 01:07:33 2002 +
[PATCH] Give kswapd writeback
On Fri, 05 Oct 2007 02:12:30 +0200 Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> >
> > I don't think I understand that. Sure, it _shouldn't_ be a problem. But it
> > _is_. That's what we're trying to fix, isn't it?
>
> The problem, I believe is in the memory allocation code, not in fuse.
fuse
> > > > This is a somewhat general problem: a userspace process is in the IO path.
> > > Userspace block drivers, for example - pretty much anything which involves
> > > kernel->userspace upcalls for storage applications.
> > >
> > > I solved it once in the past by marking the userspace
On Fri, 05 Oct 2007 01:26:12 +0200
Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> > This is a somewhat general problem: a userspace process is in the IO path.
> > Userspace block drivers, for example - pretty much anything which involves
> > kernel->userspace upcalls for storage applications.
> >
> This is a somewhat general problem: a userspace process is in the IO path.
> Userspace block drivers, for example - pretty much anything which involves
> kernel->userspace upcalls for storage applications.
>
> I solved it once in the past by marking the userspace process as
> PF_MEMALLOC and I
On Fri, 05 Oct 2007 00:39:16 +0200
Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> > throttle_vm_writeout() should be a per-zone thing, I guess. Perhaps fixing
> > that would fix your deadlock. That's doubtful, but I don't know anything
> > about your deadlock so I cannot say.
>
> No, doing the
> None of the above.
>
> [PATCH] vm: pageout throttling
>
> With silly pageout testcases it is possible to place huge amounts of memory
> under I/O. With a large request queue (CFQ uses 8192 requests) it is
> possible to place _all_ memory under I/O at the same time.
>
On Thu, 04 Oct 2007 14:25:22 +0200
Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> From: Miklos Szeredi <[EMAIL PROTECTED]>
>
> By relying on the global dirty limits, this can cause a deadlock when
> devices are stacked.
>
> If the stacking is done through a fuse filesystem, the __GFP_FS,
> __GFP_IO
> Yeah, I'm guestimating O on a per device basis, but I agree that the
> current ratio limiting is quite crude. I'm not at all sorry to see
> throttle_vm_writeback() go, I just wanted to make a point that what it
> does is not quite without merit - we agree that it can be done better
>
On Thu, 04 Oct 2007 20:10:10 +0200
Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> On Thu, 2007-10-04 at 10:46 -0700, Andrew Morton wrote:
> > On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > > static int may_write_to_queue(struct backing_dev_info *bdi)
> > > {
>
On Thu, 2007-10-04 at 10:46 -0700, Andrew Morton wrote:
> On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > static int may_write_to_queue(struct backing_dev_info *bdi)
> > {
> > 	if (current->flags & PF_SWAPWRITE)
> > 		return 1;
> > 	if
On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > > > > But that said, there might be better ways to do that.
> > > >
> > > > Sure, if we do need to globally limit the number of under-writeback
> > > > pages, then I think we need to do it independently of the dirty
On Thu, 2007-10-04 at 15:49 +0200, Miklos Szeredi wrote:
> > > > Which can only happen when it is larger than 10% of dirty_thresh.
> > > >
> > > > Which is even more unlikely since it doesn't account nr_dirty (as I
> > > > think it should).
> > >
> > > I think nr_dirty is totally irrelevant.
> > > Which can only happen when it is larger than 10% of dirty_thresh.
> > >
> > > Which is even more unlikely since it doesn't account nr_dirty (as I
> > > think it should).
> >
> > I think nr_dirty is totally irrelevant. Since we don't care about
> > case 1), and in case 2) nr_dirty doesn't
On Thu, 2007-10-04 at 15:00 +0200, Miklos Szeredi wrote:
> > > 1) File backed pages -> file
> > >
> > > dirty + writeback count remains constant
> > >
> > > 2) Anonymous pages -> swap
> > >
> > > writeback count increases, dirty balancing will hold back file
> > > writeback in favor of
> > 1) File backed pages -> file
> >
> > dirty + writeback count remains constant
> >
> > 2) Anonymous pages -> swap
> >
> > writeback count increases, dirty balancing will hold back file
> > writeback in favor of swap
> >
> > So the real question is: does case 2 need rate limiting, or
On Thu, 2007-10-04 at 14:25 +0200, Miklos Szeredi wrote:
> This is in preparation for the writable mmap patches for fuse. I know it
> conflicts with
>
> writeback-remove-unnecessary-wait-in-throttle_vm_writeout.patch
>
> but if this function is to be removed, it doesn't make much sense to
> fix
This is in preparation for the writable mmap patches for fuse. I know it
conflicts with
writeback-remove-unnecessary-wait-in-throttle_vm_writeout.patch
but if this function is to be removed, it doesn't make much sense to
fix it first ;)
---
From: Miklos Szeredi <[EMAIL PROTECTED]>
By relying
On Thu, 2007-10-04 at 14:25 +0200, Miklos Szeredi wrote:
This is in preparation for the writable mmap patches for fuse. I know it
conflicts with
writeback-remove-unnecessary-wait-in-throttle_vm_writeout.patch
but if this function is to be removed, it doesn't make much sense to
fix it
This is in preparation for the writable mmap patches for fuse. I know it
conflicts with
writeback-remove-unnecessary-wait-in-throttle_vm_writeout.patch
but if this function is to be removed, it doesn't make much sense to
fix it first ;)
---
From: Miklos Szeredi [EMAIL PROTECTED]
By relying on
1) File backed pages -> file
dirty + writeback count remains constant
2) Anonymous pages -> swap
writeback count increases, dirty balancing will hold back file
writeback in favor of swap
So the real question is: does case 2 need rate limiting, or is it OK
to let the
On Thu, 2007-10-04 at 15:00 +0200, Miklos Szeredi wrote:
1) File backed pages -> file
dirty + writeback count remains constant
2) Anonymous pages -> swap
writeback count increases, dirty balancing will hold back file
writeback in favor of swap
So the real
Which can only happen when it is larger than 10% of dirty_thresh.
Which is even more unlikely since it doesn't account nr_dirty (as I
think it should).
I think nr_dirty is totally irrelevant. Since we don't care about
case 1), and in case 2) nr_dirty doesn't play any role.
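The two cases referred to here can be illustrated with a toy counter model. This is a sketch only: the struct and function names are hypothetical, and it assumes that anonymous pages headed for swap are never counted in nr_dirty, which is consistent with the statement that nr_dirty plays no role in case 2.

```c
/* Hypothetical counters for illustration; not the kernel's accounting code. */
struct page_counts {
    long nr_dirty;      /* dirty file-backed pages */
    long nr_writeback;  /* pages currently under writeback */
};

/* Case 1: a dirty file-backed page is submitted for writeback. */
static void start_file_writeback(struct page_counts *pc)
{
    pc->nr_dirty--;      /* the page leaves the dirty state... */
    pc->nr_writeback++;  /* ...and enters writeback; the sum is unchanged */
}

/* Case 2: an anonymous page is written out to swap. */
static void start_swap_writeback(struct page_counts *pc)
{
    pc->nr_writeback++;  /* assumed: it was never counted in nr_dirty */
}

static long dirty_plus_writeback(const struct page_counts *pc)
{
    return pc->nr_dirty + pc->nr_writeback;
}
```

In this model only case 2 raises dirty + writeback, so a limit on that sum reacts to swap-out alone, whatever the value of nr_dirty.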
On Thu, 2007-10-04 at 15:49 +0200, Miklos Szeredi wrote:
Which can only happen when it is larger than 10% of dirty_thresh.
Which is even more unlikely since it doesn't account nr_dirty (as I
think it should).
I think nr_dirty is totally irrelevant. Since we don't care
On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote:
But that said, there might be better ways to do that.
Sure, if we do need to globally limit the number of under-writeback
pages, then I think we need to do it independently of the dirty
accounting.
On Thu, 2007-10-04 at 10:46 -0700, Andrew Morton wrote:
On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote:
static int may_write_to_queue(struct backing_dev_info *bdi)
{
	if (current->flags & PF_SWAPWRITE)
		return 1;
	if
On Thu, 04 Oct 2007 20:10:10 +0200
Peter Zijlstra [EMAIL PROTECTED] wrote:
On Thu, 2007-10-04 at 10:46 -0700, Andrew Morton wrote:
On Thu, 04 Oct 2007 18:47:07 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote:
static int may_write_to_queue(struct backing_dev_info *bdi)
{
if
Yeah, I'm guestimating O on a per device basis, but I agree that the
current ratio limiting is quite crude. I'm not at all sorry to see
throttle_vm_writeback() go, I just wanted to make a point that what it
does is not quite without merit - we agree that it can be done better
differently.
On Thu, 04 Oct 2007 14:25:22 +0200
Miklos Szeredi [EMAIL PROTECTED] wrote:
From: Miklos Szeredi [EMAIL PROTECTED]
By relying on the global dirty limits, this can cause a deadlock when
devices are stacked.
If the stacking is done through a fuse filesystem, the __GFP_FS,
__GFP_IO tests
None of the above.
[PATCH] vm: pageout throttling
With silly pageout testcases it is possible to place huge amounts of memory
under I/O. With a large request queue (CFQ uses 8192 requests) it is
possible to place _all_ memory under I/O at the same time.
On Fri, 05 Oct 2007 00:39:16 +0200
Miklos Szeredi [EMAIL PROTECTED] wrote:
throttle_vm_writeout() should be a per-zone thing, I guess. Perhaps fixing
that would fix your deadlock. That's doubtful, but I don't know anything
about your deadlock so I cannot say.
No, doing the throttling
This is a somewhat general problem: a userspace process is in the IO path.
Userspace block drivers, for example - pretty much anything which involves
kernel->userspace upcalls for storage applications.
I solved it once in the past by marking the userspace process as
PF_MEMALLOC and I
On Fri, 05 Oct 2007 01:26:12 +0200
Miklos Szeredi [EMAIL PROTECTED] wrote:
This is a somewhat general problem: a userspace process is in the IO path.
Userspace block drivers, for example - pretty much anything which involves
kernel->userspace upcalls for storage applications.
I solved
This is a somewhat general problem: a userspace process is in the IO path.
Userspace block drivers, for example - pretty much anything which involves
kernel->userspace upcalls for storage applications.
I solved it once in the past by marking the userspace process as
On Fri, 05 Oct 2007 02:12:30 +0200 Miklos Szeredi [EMAIL PROTECTED] wrote:
I don't think I understand that. Sure, it _shouldn't_ be a problem. But it
_is_. That's what we're trying to fix, isn't it?
The problem, I believe is in the memory allocation code, not in fuse.
fuse is trying
76 matches