Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-19 Thread Rik van Riel

Abhijit Bhopatkar wrote:


> In my mind I find it fundamentally wrong to separate anon pages from
> page cache. It should rather be a lot more dependent on which task
> accessed them last. Although it seems that, due to some twisted
> relationship between anon pages and interactive tasks, separating them
> improves it. Am I missing something here?


The IO cost for anonymous (and other swap backed) pages is
completely different from the IO cost of file system backed
pages.

On file systems, data is typically grouped together on disk
by related content.  Programs often access data linearly,
meaning that with readahead we can load a lot of pages into
memory with only a few disk seeks.

Anonymous memory does not have this benefit. For one, memory
tends to get written to swap in LRU order, not by related
content.  To make things worse, repeated malloc/free cycles
can cause memory regions adjacent to each other inside a
process to be completely unrelated, making virtual address
based swap clustering less useful.

The goal of page replacement is to minimize the total time
spent waiting on page faults.  This is not exactly the same
as minimizing the total number of page faults.
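
A minimal sketch of that distinction, assuming hypothetical device costs
(8 ms per seek, 50 us per page transferred). The function name and the
constants are illustrative only and do not come from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device costs in microseconds; illustrative only. */
#define SEEK_US      8000u
#define TRANSFER_US  50u

/*
 * Total time spent waiting for `faults` pages to come back from disk
 * when each seek brings in up to `pages_per_seek` useful pages.
 * Readahead on well-clustered file data gives a large pages_per_seek;
 * poorly clustered swap is closer to one page per seek.
 */
static uint64_t fault_wait_us(uint64_t faults, uint64_t pages_per_seek)
{
	assert(pages_per_seek > 0);

	/* Round up: a partially used readahead window still costs a seek. */
	uint64_t seeks = (faults + pages_per_seek - 1) / pages_per_seek;

	return seeks * SEEK_US + faults * TRANSFER_US;
}
```

Under these assumptions, fault_wait_us(1000, 32) is 306000 us for 1000
file pages read back with an assumed 32-page readahead window, while
fault_wait_us(600, 1) is 4830000 us for 600 swap pages at one seek each:
fewer pages faulted in, but far more time spent waiting.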


> Can you send me those patches, please, or point me to where I can find them?


You can get the latest one here:

http://surriel.com/patches/2.6/vm-split/linux-2.6-vm-split.patch

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Abhijit Bhopatkar

>> I just wanted to know whether it's worth going forward, or whether
>> we have better reasons to discount any such direction?
>
> The reason that the wrong pages get swapped out sometimes
> could be due to a side effect of the way the swappiness
> policy is implemented.
>
> While the VM only reclaims page cache pages, it will still
> rotate through the anonymous pages on the LRU list, which
> effectively randomizes the order of those pages on the list.


In my mind I find it fundamentally wrong to separate anon pages from
page cache. It should rather be a lot more dependent on which task
accessed them last. Although it seems that, due to some twisted
relationship between anon pages and interactive tasks, separating them
improves it. Am I missing something here?


> I need to get back to benchmarking my patch to split the
> lists - anonymous and other swap backed pages on one set
> of pageout lists, filesystem backed pages on another list.

<snip>

> Unfortunately my main desktop system at home depends on
> Xen, so it's not as easy to use that patch there :(



Can you send me those patches, please, or point me to where I can find them?

Abhijit


Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Rik van Riel

Abhijit Bhopatkar wrote:


> I just wanted to know whether it's worth going forward, or whether
> we have better reasons to discount any such direction?


The reason that the wrong pages get swapped out sometimes
could be due to a side effect of the way the swappiness
policy is implemented.

While the VM only reclaims page cache pages, it will still
rotate through the anonymous pages on the LRU list, which
effectively randomizes the order of those pages on the list.

I need to get back to benchmarking my patch to split the
lists - anonymous and other swap backed pages on one set
of pageout lists, filesystem backed pages on another list.
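
The rotation effect can be illustrated with a toy userspace model. This
is a sketch under simplifying assumptions, not kernel code: the LRU is a
small array, and the names toy_page and reclaim_file_single_list are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum page_kind { PAGE_ANON, PAGE_FILE };

struct toy_page {
	int id;
	enum page_kind kind;
};

/*
 * Toy single-list reclaim.  pages[0] is the tail (coldest) and
 * pages[n - 1] the head (hottest).  Only file pages are reclaimed; an
 * anonymous page found at the tail is rotated to the head instead.
 * Returns the new list length.
 */
static size_t reclaim_file_single_list(struct toy_page *pages, size_t n,
				       size_t want)
{
	size_t scanned = 0, budget = n;

	while (want > 0 && n > 0 && scanned < budget) {
		struct toy_page tail = pages[0];

		/* Take the tail entry off and shift the rest down by one. */
		memmove(&pages[0], &pages[1], (n - 1) * sizeof(pages[0]));
		scanned++;

		if (tail.kind == PAGE_FILE) {
			n--;			/* reclaimed */
			want--;
		} else {
			pages[n - 1] = tail;	/* rotated to the head */
		}
	}
	return n;
}
```

Starting from the list anon 0, file 1, anon 2, file 3, anon 4 (tail
first) and reclaiming one file page, the coldest anonymous page (id 0)
ends up at the head of the list, ahead of anonymous pages that were used
more recently. With a separate list for swap backed pages, reclaiming
file pages would not need to touch the anonymous ordering at all.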

One report I got was that the system is more interactive
under very heavy load, and my desktop system at the office
seems to behave better than it used to when I get back to
it after a few days.

Unfortunately my main desktop system at home depends on
Xen, so it's not as easy to use that patch there :(



Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Rik van Riel

अभिजित भोपटकर (Abhijit Bhopatkar) wrote:

> The mm structures of interactive tasks are marked, and
> the pages belonging to them are never moved to the inactive
> list by the LRU algorithm, thus keeping interactive tasks in
> memory as long as possible.
> Interactivity is already determined by the scheduler, so
> we reuse that knowledge to mark the mm structures.


Aside from the obvious question of whether the idea is good,
there are some practical problems with your patch:

1) the mm->interactive flag is never cleared, even if the
   task stops being interactive

2) what if the interactive tasks use up more memory than
   the system has?  Will you OOM kill instead of swapping
   out part of an interactive task?

3) the scheduler can change its idea about which task is
   interactive and which task isn't very rapidly, while
   disk IO is very slow - the scheduler's classification
   may not be useful on swap timescales

4) a currently completely idle task can still be marked
   interactive in the scheduler, even if it has been
   idle for days.  Such a task is an obvious good
   candidate for swapout, isn't it?
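
Point 1 can be shown with a toy userspace model. The structures and
function names below are hypothetical stand-ins, not kernel code, and
the second function is only one conceivable variant; it does nothing
about points 2 to 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_mm {
	bool interactive;
};

struct toy_task {
	struct toy_mm *mm;
	bool sched_interactive;	/* what the scheduler currently thinks */
};

/* Behaviour of the posted patch: the flag is only ever set. */
static void mark_sticky(struct toy_task *t)
{
	if (t->mm && t->sched_interactive)
		t->mm->interactive = true;
}

/* Variant that mirrors the scheduler's current view, so it can clear. */
static void mark_tracking(struct toy_task *t)
{
	if (t->mm)
		t->mm->interactive = t->sched_interactive;
}
```

With mark_sticky(), a task that was interactive once keeps its mm
flagged after the scheduler stops classifying it as interactive;
mark_tracking() clears the flag on the next update.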



Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Chris Snook

अभिजित भोपटकर (Abhijit Bhopatkar) wrote:

> The mm structures of interactive tasks are marked, and
> the pages belonging to them are never moved to the inactive
> list by the LRU algorithm, thus keeping interactive tasks in
> memory as long as possible.
> Interactivity is already determined by the scheduler, so
> we reuse that knowledge to mark the mm structures.
>
> Signed-off-by: Abhijit Bhopatkar <[EMAIL PROTECTED]>
> ---


Lying to the VM doesn't seem like the best way to handle this.  A lot of tasks,
including interactive ones, have some/many pages that they touch once during
startup, and don't touch again for a very long time, if ever.  We want these
pages swapped out long before the box swaps out the working set of our
non-interactive processes.
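
A toy userspace model of the effect, assuming a simplified reference
check loosely patterned on the mm/rmap.c hunk in the patch. The names
toy_mm, toy_page and toy_page_referenced are hypothetical, and the
model ignores the mm != current->mm condition:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mm {
	bool interactive;
};

struct toy_page {
	bool accessed;			/* stand-in for the accessed bit */
	const struct toy_mm *mm;	/* owning address space */
};

/*
 * Returns nonzero if reclaim should treat the page as recently used.
 * With honor_interactive set, the check behaves like the patch and
 * counts a reference for every page of a flagged mm.
 */
static int toy_page_referenced(struct toy_page *page, bool honor_interactive)
{
	int referenced = 0;

	/* Test and clear the accessed bit, as a real check would. */
	if (page->accessed) {
		page->accessed = false;
		referenced++;
	}

	/* The patch's addition: pretend the page was referenced. */
	if (honor_interactive && page->mm->interactive)
		referenced++;

	return referenced;
}
```

A page that was touched once at startup has its accessed bit clear, yet
with honor_interactive set it is still reported as referenced on every
scan, so in this model it never ages.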


I like the general idea of swap priority influenced by scheduler priority, but 
if we're going to do that, we should do it in a general way that's independent 
of scheduler implementation, so it'll be useful to soft real-time users and 
still relevant if (when?) we replace the current scheduler with something else 
lacking a special "interactive" flag.


-- Chris


[PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread अभिजित भोपटकर (Abhijit Bhopatkar)

The mm structures of interactive tasks are marked, and
the pages belonging to them are never moved to the inactive
list by the LRU algorithm, thus keeping interactive tasks in
memory as long as possible.
Interactivity is already determined by the scheduler, so
we reuse that knowledge to mark the mm structures.

Signed-off-by: Abhijit Bhopatkar <[EMAIL PROTECTED]>
---
 include/linux/init_task.h |    1 +
 include/linux/sched.h     |    4 ++++
 kernel/sched.c            |    4 ++++
 mm/rmap.c                 |    5 +++++
 4 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index a2d95ff..12ba3c1 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -54,6 +54,7 @@
   .page_table_lock =  __SPIN_LOCK_UNLOCKED(name.page_table_lock), \
   .mmlist = LIST_HEAD_INIT(name.mmlist),  \
   .cpu_vm_mask= CPU_MASK_ALL, \
+   .interactive= 0,\
}

#define INIT_SIGNALS(sig) {\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 49fe299..1f233e7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -373,6 +373,10 @@ struct mm_struct {
   /* aio bits */
   rwlock_t ioctx_list_lock;
   struct kioctx   *ioctx_list;
+
+   /* interactivity flag */
+   unsigned char interactive;
+
};

struct sighand_struct {
diff --git a/kernel/sched.c b/kernel/sched.c
index b9a6837..72caf94 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -746,6 +746,10 @@ static inline int __normal_prio(struct task_struct *p)
   prio = MAX_RT_PRIO;
   if (prio > MAX_PRIO-1)
   prio = MAX_PRIO-1;
+
+   /* Update interactivity flag in mm if interactive. */
+   if(p->mm && TASK_INTERACTIVE(p))
+       p->mm->interactive = 1;
   return prio;
}

diff --git a/mm/rmap.c b/mm/rmap.c
index b82146e..0735168 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -317,6 +317,11 @@ static int page_referenced_one(struct page *page,
   rwsem_is_locked(&mm->mmap_sem))
   referenced++;

+   /* Pretend the page is referenced if the task is
+  interactive. */
+   if (mm != current->mm && mm->interactive)
+   referenced++;
+
   (*mapcount)--;
   pte_unmap_unlock(pte, ptl);
out:
--
1.4.4.2

