Re: [Sugar-devel] OOM conditions

2009-11-08 Thread Martin Dengler
On Sun, Nov 08, 2009 at 12:07:58PM +, Lucian Branescu wrote:
> Slightly off-topic, has anyone tried compcache
> (http://code.google.com/p/compcache/) on an XO-1? I might if I can get
> it to work.

Yes.  It works very well.

http://www.google.com/search?q=compcache+xo

Martin


___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: [Sugar-devel] OOM conditions

2009-11-08 Thread Lucian Branescu
Slightly off-topic, has anyone tried compcache
(http://code.google.com/p/compcache/) on an XO-1? I might if I can get
it to work.



Re: OOM conditions

2009-11-08 Thread Tomeu Vizoso
On Sat, Nov 7, 2009 at 12:06, Martin Dengler  wrote:
> On Fri, Nov 06, 2009 at 04:50:53PM +, Tomeu Vizoso wrote:
>> On Wed, Nov 4, 2009 at 14:16, Martin Dengler  
>> wrote:
>> > On Fri, Oct 30, 2009 at 11:22:13PM +0100, Tomeu Vizoso wrote:
>> >> On Fri, Oct 30, 2009 at 16:58, Richard A. Smith  
>> >> wrote:
>> >> > Working the table at the Boston book festival I was reminded how
>> >> > painful the OOM stuff is on a gen 1. The demo machines were in
>> >> > this state a lot as each visitor would open up a new
>> >> > program.  Basically you have to just turn the unit off and restart
>> >> > as trying to recover is futile.
>> >>
>> >> What if activities had a higher oom_score? Would that protect enough
>> >> the processes that once killed require a system restart (X, shell,
>> >> etc)?
>> >
>> > See patch vs sugar-toolkit HEAD below[1] (I can backport to 0.82 if
>> > wanted).
>>
>> Maybe would be better to have the shell do that? So it works for
>> non-python activities.
>
> Patch inline below.

Looks great, thanks a lot. Have you seen fewer memory-induced lockups
on the XO-1?

Regards,

Tomeu

-- 
«Sugar Labs is anyone who participates in improving and using Sugar.
What Sugar Labs does is determined by the participants.» - David
Farning


Re: OOM conditions

2009-11-07 Thread Martin Dengler
On Fri, Nov 06, 2009 at 04:50:53PM +, Tomeu Vizoso wrote:
> On Wed, Nov 4, 2009 at 14:16, Martin Dengler  wrote:
> > On Fri, Oct 30, 2009 at 11:22:13PM +0100, Tomeu Vizoso wrote:
> >> On Fri, Oct 30, 2009 at 16:58, Richard A. Smith  wrote:
> >> > Working the table at the Boston book festival I was reminded how
> >> > painful the OOM stuff is on a gen 1. The demo machines were in
> >> > this state a lot as each visitor would open up a new
> >> > program.  Basically you have to just turn the unit off and restart
> >> > as trying to recover is futile.
> >>
> >> What if activities had a higher oom_score? Would that protect enough
> >> the processes that once killed require a system restart (X, shell,
> >> etc)?
> >
> > See patch vs sugar-toolkit HEAD below[1] (I can backport to 0.82 if
> > wanted).
> 
> Maybe would be better to have the shell do that? So it works for
> non-python activities.

Patch inline below.

> >> Regards,
> >>
> >> Tomeu
> >
> > Martin
> >
> Thanks,
> 
> Tomeu

Martin


(untested) patch against
http://cgit.sugarlabs.org/sugar-toolkit/mainline/tree/src/sugar/activity/activityfactory.py
:

From 4bd6fb9f7f245c2aed92d6964746627d0c96cbec Mon Sep 17 00:00:00 2001
From: Martin Dengler 
Date: Sat, 7 Nov 2009 10:55:16 +
Subject: [PATCH] sacrifice activities to the OOM killer first

change the OOM-killer score of launched activities to be the maximum.
See discussion at http://linux-mm.org/OOM_Killer
---
 src/sugar/activity/activityfactory.py |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/src/sugar/activity/activityfactory.py b/src/sugar/activity/activityfactory.py
index ee0fd92..5deee6e 100644
--- a/src/sugar/activity/activityfactory.py
+++ b/src/sugar/activity/activityfactory.py
@@ -65,6 +65,39 @@ def _close_fds():
             pass
 
 
+def _oom_adj_pid(pid, oom_adj_value=None):
+    """ Change a process' OOM likelihood to oom_adj_value.
+
+    By default, use the value of gconf path
+    "/desktop/sugar/performance/oom_adj_default"; if none exists, make
+    this process most likely to be killed (oom_adj_value=15).
+
+    Linux-specific.  See http://linux-mm.org/OOM_Killer for details.
+    """
+    oom_adj_fullpath = "/proc/%s/oom_adj" % pid
+    if os.path.exists(oom_adj_fullpath):
+        try:
+            # get values/defaults from gconf
+            import gconf
+            gconf_dir = "/desktop/sugar/performance"
+            gconf_key = "oom_adj_default"
+            client = gconf.client_get_default()
+            if not client.dir_exists(gconf_dir):
+                client.add_dir(gconf_dir, gconf.CLIENT_PRELOAD_NONE)
+            if oom_adj_value is None:
+                # get() returns None for an unset key; get_int() returns 0
+                value = client.get(gconf_dir + "/" + gconf_key)
+                if value is None:
+                    oom_adj_value = 15
+                    client.set_int(gconf_dir + "/" + gconf_key,
+                                   oom_adj_value)
+                else:
+                    oom_adj_value = value.get_int()
+            open(oom_adj_fullpath, "w").write(str(oom_adj_value))
+        except Exception:
+            pass
+
+
 def create_activity_id():
     """Generate a new, unique ID for this activity"""
     pservice = presenceservice.get_instance()
@@ -276,6 +309,8 @@ class ActivityCreationHandler(gobject.GObject):
             stdout=log_file.fileno(),
             stderr=log_file.fileno())
 
+        _oom_adj_pid(child.pid)
+
         gobject.child_watch_add(child.pid,
                                 _child_watch_cb,
                                 (environment_dir, log_file))
-- 
1.6.2.5





Re: OOM conditions

2009-11-04 Thread C. Scott Ananian
LWN has another OOM-related article today:
  http://lwn.net/SubscriberLink/359998/87f548d9c23995f2/
 --scott

-- 
 ( http://cscott.net/ )


Re: OOM conditions

2009-11-04 Thread Martin Dengler
On Fri, Oct 30, 2009 at 11:22:13PM +0100, Tomeu Vizoso wrote:
> On Fri, Oct 30, 2009 at 16:58, Richard A. Smith  wrote:
> > Working the table at the Boston book festival I was reminded how
> > painful the OOM stuff is on a gen 1. The demo machines were in
> > this state a lot as each visitor would open up a new
> > program.  Basically you have to just turn the unit off and restart
> > as trying to recover is futile.
> 
> What if activities had a higher oom_score? Would that protect enough
> the processes that once killed require a system restart (X, shell,
> etc)?

See patch vs sugar-toolkit HEAD below[1] (I can backport to 0.82 if
wanted).

> Maybe even have the background activities have a higher
> oom_score than the one in the foreground?

Interesting.  Another approach I'd love feedback on would be to set
the allowed number of simultaneous running activities to be 2 (1 +
Journal) and writing a simple control panel extension that would allow
tweaking that, this oom_adj, and other related gconf values.
 
> Regards,
> 
> Tomeu

Martin


1. patch against
http://cgit.sugarlabs.org/sugar-toolkit/mainline/tree/src/sugar/activity/main.py
 :

diff --git a/src/sugar/activity/main.py b/src/sugar/activity/main.py
index 93f34e6..c868e11 100644
--- a/src/sugar/activity/main.py
+++ b/src/sugar/activity/main.py
@@ -31,7 +31,40 @@ from sugar.bundle.activitybundle import ActivityBundle
 from sugar import logger
 
 
+def __oom_adj_self(oom_adj_value=None):
+    """ Change this process' OOM likelihood to oom_adj_value.
+
+    By default, use the value of gconf path
+    "/desktop/sugar/performance/oom_adj_default"; if none exists, make
+    this process most likely to be killed (oom_adj_value=15).
+
+    Linux-specific.  See http://linux-mm.org/OOM_Killer for details.
+    """
+    oom_adj_fullpath = "/proc/self/oom_adj"
+    if os.path.exists(oom_adj_fullpath):
+        try:
+            # get values/defaults from gconf
+            import gconf
+            gconf_dir = "/desktop/sugar/performance"
+            gconf_key = "oom_adj_default"
+            client = gconf.client_get_default()
+            if not client.dir_exists(gconf_dir):
+                client.add_dir(gconf_dir, gconf.CLIENT_PRELOAD_NONE)
+            if oom_adj_value is None:
+                # get() returns None for an unset key; get_int() returns 0
+                value = client.get(gconf_dir + "/" + gconf_key)
+                if value is None:
+                    oom_adj_value = 15
+                    client.set_int(gconf_dir + "/" + gconf_key,
+                                   oom_adj_value)
+                else:
+                    oom_adj_value = value.get_int()
+            open(oom_adj_fullpath, "w").write(str(oom_adj_value))
+        except Exception:
+            pass
+
 def create_activity_instance(constructor, handle):
+    __oom_adj_self()
     activity = constructor(handle)
     activity.show()
 




Re: OOM conditions

2009-11-02 Thread Martin Langhoff
On Fri, Oct 30, 2009 at 4:58 PM, Richard A. Smith  wrote:
> The OOM topic and what to do in that case has come up several times in the 
> past.  If Maemo thinks they have a reasonable solution to the problems then 
> someone should look at trying to add that to our kernel and user space

Last time I used a Nokia tablet, it had a "single-app-at-a-time" UI,
much like the iPhone. Or Sugar.

I do think that the shell can restrict the number of open apps, and
even actively close bg apps if their document state is saved. Between
such shell improvements and the OOM scores Tomeu proposes, the user
experience can be much better.
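
A minimal sketch of such a shell-side policy, under stated assumptions: the
Activity record and its fields (is_foreground, state_saved, last_used) are
hypothetical and are not how the Sugar shell tracks activities. It picks the
least recently used background activities whose state is saved, until at most
max_open remain:

```python
from collections import namedtuple

# Hypothetical record, for illustration only.
Activity = namedtuple("Activity", "name is_foreground state_saved last_used")


def activities_to_close(activities, max_open=2):
    """Pick background activities to close so at most max_open remain.

    Only background activities whose document state is saved are
    candidates; the least recently used ones are closed first.
    """
    excess = len(activities) - max_open
    if excess <= 0:
        return []
    candidates = [a for a in activities
                  if not a.is_foreground and a.state_saved]
    candidates.sort(key=lambda a: a.last_used)
    return [a.name for a in candidates[:excess]]
```

If too few activities have saved state, the list comes back shorter than the
excess, and the OOM score adjustment remains the fallback.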

cheers,



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: OOM conditions

2009-10-30 Thread Benjamin M. Schwartz
Richard A. Smith wrote:
> Working the table at the Boston book festival I was reminded how painful the 
> OOM stuff is on a gen 1.

The OOM problem on Gen 1 is a True Kernel Bug.  The problem is that the
OOM killer just isn't working.  Almost all the time, it fails to kill
_any_ process, and instead just locks up the machine.

I believe Andres was able to connect via a serial port during one of these
events, and observed the kswapd process in an "uninterruptible sleep"
(a.k.a. "state D").  This should never happen.

There has been a significant amount of churn in the OOM system over the
past few years, and a number of bugs are known to have been created and
resolved.  To the best of my knowledge, no one has ever precisely
identified whether the XO's problem is due to one of them.

Until recently, there was no newer XO kernel with which to test.  It would
be worthwhile to observe the F11-XO1 builds' behavior at OOM, to see if
there has been an improvement.
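
For anyone repeating that observation on a newer build, here is a rough
sketch of how to spot processes in state D without a full ps. It reads the
state field of /proc/<pid>/stat; the function names and the proc_root
parameter are my own, not part of any existing tool:

```python
import os


def parse_stat_state(stat_text):
    """Return (comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')'.
    """
    start = stat_text.index("(")
    end = stat_text.rindex(")")
    comm = stat_text[start + 1:end]
    state = stat_text[end + 1:].split()[0]
    return comm, state


def find_uninterruptible(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat_state(f.read())
        except (IOError, OSError, ValueError):
            continue  # process exited, or unreadable stat line
        if state == "D":
            stuck.append((int(entry), comm))
    return sorted(stuck)
```

Calling find_uninterruptible() in a loop over the serial console while the
machine runs out of memory should show whether kswapd is still the process
that gets stuck.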

--Ben


Re: OOM conditions

2009-10-30 Thread Tomeu Vizoso
On Fri, Oct 30, 2009 at 16:58, Richard A. Smith  wrote:
> In a LWN discussion thread on how google uses the kernel I found the 
> following:
>
> ==
> 2) Mike asked why the kernel tries so hard to allocate memory - why not just 
> fail to allocate if there is too much pressure. Why isn't disabling 
> overcommit enough?
>
> Posted Oct 24, 2009 1:26 UTC (Sat) by Tomasu (subscriber, #39889) [Link]
>
> 2) probably because they actually want some over-commit, but they don't want 
> the OOM thread to go wild killing everything, and definitely not the WRONG 
> thing.
>
> Posted Oct 25, 2009 19:24 UTC (Sun) by oak (subscriber, #2786) [Link]
>
> In the Maemo (at least Diablo release) kernel source there are
> configurable limits for when kernel starts to deny allocations and when to
> OOM-kill (besides notifying user-space about crossing of these and some
> earlier limits). If process is set as "OOM-protected", its allocations
> will also always succeed. If "OOM-protected" processes waste all memory
> in the system, then they can also get killed.
>
> ===
>
> Working the table at the Boston book festival I was reminded how painful the 
> OOM stuff is on a gen 1. The demo machines were in this state a lot as each 
> visitor would open up a new program.  Basically you have to just turn the 
> unit off and restart as trying to recover is futile.  The 1GiB of memory on 
> 1.5 will help with this somewhat but in most cases it just means that you 
> shift the problem.  Users being users will still open up too much and the 
> 1GiB isn't an option for Gen 1.0 users.
>
> The OOM topic and what to do in that case has come up several times in the 
> past.  If Maemo thinks they have a reasonable solution to the problems then 
> someone should look at trying to add that to our kernel and user space

What if activities had a higher oom_score? Would that protect enough
the processes that once killed require a system restart (X, shell,
etc)? Maybe even have the background activities have a higher
oom_score than the one in the foreground?
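
A rough sketch of the foreground/background idea. Note that oom_score itself
is read-only; the knob is /proc/<pid>/oom_adj, which ranges from -17 (never
kill) to 15 (kill first). The two constants and both function names below are
assumptions made up for illustration, not existing Sugar code:

```python
import os

# Hypothetical policy values, chosen only for illustration.
OOM_ADJ_FOREGROUND = 10
OOM_ADJ_BACKGROUND = 15  # 15 is the maximum oom_adj


def plan_oom_adj(activity_pids, foreground_pid):
    """Map each activity pid to the oom_adj value it should get."""
    plan = {}
    for pid in activity_pids:
        if pid == foreground_pid:
            plan[pid] = OOM_ADJ_FOREGROUND
        else:
            plan[pid] = OOM_ADJ_BACKGROUND
    return plan


def apply_oom_adj(plan, proc_root="/proc"):
    """Write the planned values; return the pids that were updated."""
    updated = []
    for pid, value in sorted(plan.items()):
        path = os.path.join(proc_root, str(pid), "oom_adj")
        try:
            with open(path, "w") as f:
                f.write(str(value))
        except (IOError, OSError):
            continue  # process gone, or no oom_adj on this kernel
        updated.append(pid)
    return updated
```

The shell would have to rerun this whenever the active activity changes. X
and the shell keep the default of 0, so every activity is preferred as a
victim over them.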

Regards,

Tomeu

-- 
«Sugar Labs is anyone who participates in improving and using Sugar.
What Sugar Labs does is determined by the participants.» - David
Farning


OOM conditions

2009-10-30 Thread Richard A. Smith
In a LWN discussion thread on how google uses the kernel I found the following:

==
2) Mike asked why the kernel tries so hard to allocate memory - why not just 
fail to allocate if there is too much pressure. Why isn't disabling overcommit 
enough?

Posted Oct 24, 2009 1:26 UTC (Sat) by Tomasu (subscriber, #39889) [Link]

2) probably because they actually want some over-commit, but they don't want 
the OOM thread to go wild killing everything, and definitely not the WRONG 
thing.

Posted Oct 25, 2009 19:24 UTC (Sun) by oak (subscriber, #2786) [Link]

In the Maemo (at least Diablo release) kernel source there are
configurable limits for when kernel starts to deny allocations and when to
OOM-kill (besides notifying user-space about crossing of these and some
earlier limits). If process is set as "OOM-protected", its allocations
will also always succeed. If "OOM-protected" processes waste all memory
in the system, then they can also get killed.

===
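
For context on the overcommit question in the quoted comments: mainline
kernels expose the policy through /proc/sys/vm/overcommit_memory (0 =
heuristic, 1 = always, 2 = strict) and overcommit_ratio. The sketch below
reads both and computes the strict-mode commit limit, ignoring huge pages.
The function names and the sys_vm parameter are assumptions of mine, and this
says nothing about the Maemo-specific limits, which I have not looked at:

```python
import os

MODES = {0: "heuristic", 1: "always", 2: "never (strict)"}


def read_overcommit(sys_vm="/proc/sys/vm"):
    """Return (mode, ratio) from overcommit_memory and overcommit_ratio."""
    def read_int(name):
        with open(os.path.join(sys_vm, name)) as f:
            return int(f.read().strip())
    return read_int("overcommit_memory"), read_int("overcommit_ratio")


def commit_limit_kb(mem_total_kb, swap_total_kb, ratio):
    """Commit limit in strict mode: swap plus ratio percent of RAM."""
    return swap_total_kb + mem_total_kb * ratio // 100
```

As an illustration, a machine with 256 MiB of RAM, no swap, and a ratio of 50
gets commit_limit_kb(262144, 0, 50) == 131072, i.e. allocations start failing
once 128 MiB has been committed.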

Working the table at the Boston book festival I was reminded how painful the 
OOM stuff is on a gen 1. The demo machines were in this state a lot as each 
visitor would open up a new program.  Basically you have to just turn the unit 
off and restart as trying to recover is futile.  The 1GiB of memory on 1.5 will 
help with this somewhat but in most cases it just means that you shift the 
problem.  Users being users will still open up too much and the 1GiB isn't an 
option for Gen 1.0 users.

The OOM topic and what to do in that case has come up several times in the 
past.  If Maemo thinks they have a reasonable solution to the problems then 
someone should look at trying to add that to our kernel and user space

-- 
Richard A. Smith  
One Laptop per Child