Re: [Sugar-devel] OOM conditions
On Sun, Nov 08, 2009 at 12:07:58PM, Lucian Branescu wrote:
> Slightly off-topic, has anyone tried compcache
> (http://code.google.com/p/compcache/) on an XO-1? I might if I can get
> it to work.

Yes. It works very well.

http://www.google.com/search?q=compcache+xo

Martin
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel
Re: [Sugar-devel] OOM conditions
Slightly off-topic, has anyone tried compcache
(http://code.google.com/p/compcache/) on an XO-1? I might if I can get
it to work.
Re: OOM conditions
On Sat, Nov 7, 2009 at 12:06, Martin Dengler wrote:
> Patch inline below.

Looks great, thanks a lot. Have you seen less memory-induced lockups
on the XO-1?

Regards,

Tomeu

--
«Sugar Labs is anyone who participates in improving and using Sugar.
What Sugar Labs does is determined by the participants.» - David Farning
Re: OOM conditions
On Fri, Nov 06, 2009 at 04:50:53PM, Tomeu Vizoso wrote:
> On Wed, Nov 4, 2009 at 14:16, Martin Dengler wrote:
> > On Fri, Oct 30, 2009 at 11:22:13PM +0100, Tomeu Vizoso wrote:
> >> On Fri, Oct 30, 2009 at 16:58, Richard A. Smith wrote:
> >> > Working the table at the Boston book festival I was reminded how
> >> > painful the OOM stuff is on a gen 1. The demo machines were in
> >> > this state a lot as each visitor would open up a new
> >> > program. Basically you have to just turn the unit off and restart
> >> > as trying to recover is futile.
> >>
> >> What if activities had a higher oom_score? Would that protect enough
> >> the processes that once killed require a system restart (X, shell,
> >> etc)?
> >
> > See patch vs sugar-toolkit HEAD below[1] (I can backport to 0.82 if
> > wanted).
>
> Maybe would be better to have the shell do that? So it works for
> non-python activities.

Patch inline below.

> >> Regards,
> >>
> >> Tomeu
> >
> > Martin
> >
> Thanks,
>
> Tomeu

Martin


(untested) patch against
http://cgit.sugarlabs.org/sugar-toolkit/mainline/tree/src/sugar/activity/activityfactory.py
:

From 4bd6fb9f7f245c2aed92d6964746627d0c96cbec Mon Sep 17 00:00:00 2001
From: Martin Dengler
Date: Sat, 7 Nov 2009 10:55:16
Subject: [PATCH] sacrifice activities to the OOM killer first

change the OOM-killer score of launched activities to be the maximum.
See discussion at http://linux-mm.org/OOM_Killer
---
 src/sugar/activity/activityfactory.py |   35 +
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/src/sugar/activity/activityfactory.py b/src/sugar/activity/activityfactory.py
index ee0fd92..5deee6e 100644
--- a/src/sugar/activity/activityfactory.py
+++ b/src/sugar/activity/activityfactory.py
@@ -65,6 +65,39 @@ def _close_fds():
             pass
 
 
+def __oom_adj_pid(pid, omm_adj_value=None):
+    """ Change a process' OOM likelihood to oom_adj_value.
+
+    By default, use the value of gconf path
+    "/desktop/sugar/performance/oom_adj_default"; if none exists, make
+    this process most likely to be killed (oom_adj_value=15).
+
+    Linux-specific. See http://linux-mm.org/OOM_Killer for details.
+    """
+    oom_adj_fullpath = "/proc/%s/oom_adj" % pid
+    if os.path.exists(oom_adj_fullpath):
+        try:
+
+            # get values/defaults from gconf
+            import gconf
+            gconf_dir = "/desktop/sugar/performance"
+            gconf_key = "oom_adj_default"
+            client = gconf.client_get_default()
+            if not client.dir_exists(gconf_dir):
+                client.add_dir(gconf_dir, gconf.CLIENT_PRELOAD_NONE)
+            if oom_adj_value is None:
+                oom_adj_value = client.get_int(gconf_dir + "/" + gconf_key)
+                if oom_adj_value is None:
+                    oom_adj_value = 15
+                    client.set_int(gconf_dir + "/" + gconf_key,
+                                   oom_adj_value)
+
+            file(oom_adj_fullpath).write(oom_adj_value)
+
+        except:
+            pass
+
+
 def create_activity_id():
     """Generate a new, unique ID for this activity"""
     pservice = presenceservice.get_instance()
@@ -276,6 +309,8 @@ class ActivityCreationHandler(gobject.GObject):
                                      stdout=log_file.fileno(),
                                      stderr=log_file.fileno())
 
+            __oom_adj_pid(child.pid)
+
             gobject.child_watch_add(child.pid,
                                     _child_watch_cb,
                                     (environment_dir, log_file))
--
1.6.2.5
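A few details in the patch as posted (it is marked untested) would probably keep it from taking effect: the parameter is spelled omm_adj_value while the body reads oom_adj_value, file() opens /proc/<pid>/oom_adj read-only and is handed an int rather than a string, and a double-underscore name called from inside a class body is subject to Python's name mangling. Below is a minimal Python 3 sketch of only the write step, with the gconf lookup left out. The name set_oom_adj and the proc_root parameter are hypothetical; proc_root exists only so the function can be tried against a scratch directory.

```python
import os

OOM_ADJ_MOST_KILLABLE = 15  # the "most likely to be killed" value used in the patch


def set_oom_adj(pid, value=OOM_ADJ_MOST_KILLABLE, proc_root="/proc"):
    """Write `value` to <proc_root>/<pid>/oom_adj and return True on success.

    Hypothetical helper for illustration; not part of sugar-toolkit.
    """
    path = os.path.join(proc_root, str(pid), "oom_adj")
    if not os.path.exists(path):
        # Kernel without oom_adj, or the process has already exited.
        return False
    try:
        # The file must be opened for writing, and /proc expects text.
        with open(path, "w") as f:
            f.write(str(value))
    except OSError:
        return False
    return True
```

Returning a boolean instead of swallowing every exception lets the caller log a failed adjustment, which a bare "except: pass" would hide.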
Re: OOM conditions
LWN has another OOM-related article today:
http://lwn.net/SubscriberLink/359998/87f548d9c23995f2/

 --scott

--
( http://cscott.net/ )
Re: OOM conditions
On Fri, Oct 30, 2009 at 11:22:13PM +0100, Tomeu Vizoso wrote:
> On Fri, Oct 30, 2009 at 16:58, Richard A. Smith wrote:
> > Working the table at the Boston book festival I was reminded how
> > painful the OOM stuff is on a gen 1. The demo machines were in
> > this state a lot as each visitor would open up a new
> > program. Basically you have to just turn the unit off and restart
> > as trying to recover is futile.
>
> What if activities had a higher oom_score? Would that protect enough
> the processes that once killed require a system restart (X, shell,
> etc)?

See patch vs sugar-toolkit HEAD below[1] (I can backport to 0.82 if
wanted).

> Maybe even have the background activities have a higher
> oom_score than the one in the foreground?

Interesting. Another approach I'd love feedback on would be to set
the allowed number of simultaneous running activities to be 2 (1 +
Journal) and writing a simple control panel extension that would allow
tweaking that, this oom_adj, and other related gconf values.

> Regards,
>
> Tomeu

Martin

1. patch against
http://cgit.sugarlabs.org/sugar-toolkit/mainline/tree/src/sugar/activity/main.py
:

diff --git a/src/sugar/activity/main.py b/src/sugar/activity/main.py
index 93f34e6..c868e11 100644
--- a/src/sugar/activity/main.py
+++ b/src/sugar/activity/main.py
@@ -31,7 +31,40 @@ from sugar.bundle.activitybundle import ActivityBundle
 from sugar import logger
 
+def __oom_adj_self(omm_adj_value=None):
+    """ Change this process' OOM likelihood to oom_adj_value.
+
+    By default, use the value of gconf path
+    "/desktop/sugar/performance/oom_adj_default"; if none exists, make
+    this process most likely to be killed (oom_adj_value=15).
+
+    Linux-specific. See http://linux-mm.org/OOM_Killer for details.
+    """
+    oom_adj_fullpath = "/proc/self/oom_adj"
+    if os.path.exists(oom_adj_fullpath):
+        try:
+
+            # get values/defaults from gconf
+            import gconf
+            gconf_dir = "/desktop/sugar/performance"
+            gconf_key = "oom_adj_default"
+            client = gconf.client_get_default()
+            if not client.dir_exists(gconf_dir):
+                client.add_dir(gconf_dir, gconf.CLIENT_PRELOAD_NONE)
+            if oom_adj_value is None:
+                oom_adj_value = client.get_int(gconf_dir + "/" + gconf_key)
+                if oom_adj_value is None:
+                    oom_adj_value = 15
+                    client.set_int(gconf_dir + "/" + gconf_key,
+                                   oom_adj_value)
+
+            file(oom_adj_fullpath).write(oom_adj_value)
+
+        except:
+            pass
+
 
 def create_activity_instance(constructor, handle):
+    __oom_adj_self()
     activity = constructor(handle)
     activity.show()
Re: OOM conditions
On Fri, Oct 30, 2009 at 4:58 PM, Richard A. Smith wrote:
> The OOM topic and what to do in that case has come up several times in the
> past. If Maemo thinks they have a reasonable solution to the problems then
> someone should look at trying to add that to our kernel and user space

Last time I used a Nokia tablet, it had a "single-app-at-a-time" UI,
much like the iPhone. Or Sugar.

I do think that the shell can restrict the number of open apps, and
even actively close bg apps if their document state is saved. Between
such shell improvements and the OOM scores Tomeu proposes, the user
experience can be much better.

cheers,

m
--
martin.langh...@gmail.com
mar...@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
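The shell-side policy described above (cap the number of open apps and close background ones whose state is already saved) could be sketched roughly as follows. This is only an illustration under stated assumptions: the Activity record, its fields, and the max_open default are hypothetical and do not come from the Sugar shell.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Hypothetical record of what the shell knows about a running activity."""
    name: str
    foreground: bool
    state_saved: bool
    last_used: float  # time the activity last had focus


def activities_to_close(activities, max_open=2):
    """Pick background activities with saved state to close.

    Least recently used candidates go first, and only as many are
    returned as are needed to get back down to max_open.
    """
    excess = len(activities) - max_open
    if excess <= 0:
        return []
    candidates = sorted(
        (a for a in activities if not a.foreground and a.state_saved),
        key=lambda a: a.last_used,
    )
    return candidates[:excess]
```

Activities with unsaved state are never selected, so the cap can be exceeded when nothing is safe to close; in that case the OOM score adjustment would remain the fallback.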
Re: OOM conditions
Richard A. Smith wrote:
> Working the table at the Boston book festival I was reminded how painful the
> OOM stuff is on a gen 1.

The OOM problem on Gen 1 is a True Kernel Bug. The problem is that the OOM
killer just isn't working. Almost all the time, it fails to kill _any_
process, and instead just locks up the machine. I believe Andres was able to
connect via a serial port during one of these events, and observed the kswapd
process in an "uninterruptible sleep" (a.k.a. "state D"). This should never
happen.

There has been a significant amount of churn in the OOM system over the past
few years, and a number of bugs are known to have been created and resolved.
To the best of my knowledge, no one has ever precisely identified whether the
XO's problem is due to one of them.

Until recently, there was no newer XO kernel with which to test. It would be
worthwhile to observe the F11-XO1 builds' behavior at OOM, to see if there has
been an improvement.

--Ben
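For anyone trying to reproduce the observation above on a test build, processes in uninterruptible sleep can be listed from the state field of /proc/<pid>/stat. The following is a minimal sketch, assuming a Linux /proc layout; the function names and the proc_root parameter are hypothetical. It would have to run in a loop before the machine locks up, for example over a serial console, because nothing new can be started once the hang occurs.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' rather than on whitespace.
    """
    left, _, rest = stat_line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    return int(pid_text), comm, rest.split()[0]


def uninterruptible(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the entry is not readable
        if state == "D":
            found.append((pid, comm))
    return found
```

A kswapd entry that stays in this list across many samples would match the behavior described in the message.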
Re: OOM conditions
On Fri, Oct 30, 2009 at 16:58, Richard A. Smith wrote:
> In a LWN discussion thread on how google uses the kernel I found the
> following:
>
> ==
> 2) Mike asked why the kernel tries so hard to allocate memory - why not just
> fail to allocate if there is too much pressure. Why isn't disabling
> overcommit enough?
>
> Posted Oct 24, 2009 1:26 UTC (Sat) by Tomasu (subscriber, #39889)
>
> 2) probably because they actually want some over-commit, but they don't want
> the OOM thread to go wild killing everything, and definitely not the WRONG
> thing.
>
> Posted Oct 25, 2009 19:24 UTC (Sun) by oak (subscriber, #2786)
>
> In the Maemo (at least Diablo release) kernel source there are
> configurable limits for when kernel starts to deny allocations and when to
> OOM-kill (besides notifying user-space about crossing of these and some
> earlier limits). If process is set as "OOM-protected", its allocations
> will also always succeed. If "OOM-protected" processes waste all memory
> in the system, then they can also get killed.
>
> ===
>
> Working the table at the Boston book festival I was reminded how painful the
> OOM stuff is on a gen 1. The demo machines were in this state a lot as each
> visitor would open up a new program. Basically you have to just turn the
> unit off and restart as trying to recover is futile. The 1GiB of memory on
> 1.5 will help with this somewhat but in most cases it just means that you
> shift the problem. Users being users will still open up too much and the
> 1GiB isn't an option for Gen 1.0 users.
>
> The OOM topic and what to do in that case has come up several times in the
> past. If Maemo thinks they have a reasonable solution to the problems then
> someone should look at trying to add that to our kernel and user space

What if activities had a higher oom_score? Would that protect enough
the processes that once killed require a system restart (X, shell,
etc)?

Maybe even have the background activities have a higher oom_score than
the one in the foreground?

Regards,

Tomeu
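The foreground/background idea above could look roughly like this: every activity gets a raised oom_adj, and background ones get a higher value than the one in focus. This is a sketch only; the two constants are assumed values chosen for illustration, and the function names and proc_root parameter are hypothetical.

```python
import os

# Assumed values, not taken from the thread. The thread only suggests that
# background activities should rank as better OOM victims than the
# foreground one, and that all activities should rank above X and the shell.
FOREGROUND_OOM_ADJ = 10
BACKGROUND_OOM_ADJ = 15


def oom_adj_plan(activity_pids, foreground_pid):
    """Map each activity pid to the oom_adj value it should receive."""
    return {
        pid: FOREGROUND_OOM_ADJ if pid == foreground_pid else BACKGROUND_OOM_ADJ
        for pid in activity_pids
    }


def apply_plan(plan, proc_root="/proc"):
    """Write each value to <proc_root>/<pid>/oom_adj.

    Returns the pids that could not be updated.
    """
    failed = []
    for pid, value in plan.items():
        try:
            with open(os.path.join(proc_root, str(pid), "oom_adj"), "w") as f:
                f.write(str(value))
        except OSError:
            failed.append(pid)
    return failed
```

The shell would recompute the plan on every focus change. Failures are reported rather than ignored because lowering a value again, when an activity returns to the foreground, may require extra privileges depending on the kernel.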
OOM conditions
In an LWN discussion thread on how Google uses the kernel I found the
following:

==
2) Mike asked why the kernel tries so hard to allocate memory - why not just
fail to allocate if there is too much pressure. Why isn't disabling
overcommit enough?

Posted Oct 24, 2009 1:26 UTC (Sat) by Tomasu (subscriber, #39889)

2) probably because they actually want some over-commit, but they don't want
the OOM thread to go wild killing everything, and definitely not the WRONG
thing.

Posted Oct 25, 2009 19:24 UTC (Sun) by oak (subscriber, #2786)

In the Maemo (at least Diablo release) kernel source there are
configurable limits for when kernel starts to deny allocations and when to
OOM-kill (besides notifying user-space about crossing of these and some
earlier limits). If process is set as "OOM-protected", its allocations
will also always succeed. If "OOM-protected" processes waste all memory
in the system, then they can also get killed.

===

Working the table at the Boston book festival I was reminded how painful the
OOM stuff is on a gen 1. The demo machines were in this state a lot as each
visitor would open up a new program. Basically you have to just turn the
unit off and restart as trying to recover is futile. The 1GiB of memory on
1.5 will help with this somewhat but in most cases it just means that you
shift the problem. Users being users will still open up too much and the
1GiB isn't an option for Gen 1.0 users.

The OOM topic and what to do in that case has come up several times in the
past. If Maemo thinks they have a reasonable solution to the problems then
someone should look at trying to add that to our kernel and user space.

--
Richard A. Smith
One Laptop per Child