Re: [Sugar-devel] [Dextrose] Stability stuff
On Fri, Sep 3, 2010 at 23:59, Bernie Innocenti wrote:
> On Fri, 03-09-2010 at 11:23 -0400, Martin Abente wrote:
>
>> Well, that's true in theory, assuming all the activities are properly
>> designed for Sugar. In the field you already know that's not the case.
>> Also... even when the activities are implemented in Python
>> through the Activity class, the read and write methods need to be
>> implemented by the programmer. That means it
>> depends on the activity specifics again.
>
> Yes, but if an activity fails to save when Sugar asks it to quit, then
> it's already buggy today: we also have a "Stop" item in the menu of the
> activity frame icon.
>
>> > This is also a very good suggestion. We could start by doing this, which
>> > is a lot easier and almost equally effective.
>>
>> I see it this way: why wait to get sick to do something about it?
>> Preventive medicine is always better. Why wait for the machine to freeze
>> (waiting 3 or more minutes until it's back to a usable state again) to do
>> something about it, also with potential data loss?
>>
>> Having a message telling kids that the machine is too overloaded should be
>> enough, with recommendations about saving any current work and closing
>> earlier activities.
>>
>> This kind of mechanism should help overall stability, and it makes
>> even more sense when you think about XO-1 scenarios.
>>
>> :)
>
> Yes, I already agreed with you on this. The hard part of this patch
> would be setting a threshold to disallow opening another activity.
> Memory footprint of activities varies wildly. Shall we take the worst
> case, pissing off users who knew what they were doing, or shall we be
> optimistic, risking the current behavior in some cases?
>
> If we also had the "graceful stop on oom" that I was thinking of,
> we could afford to be optimistic in the "oom prevention" code.
>
> Anyway, for now I'd vote for doing what you suggest in the easiest
> possible way even if it saves the system only 50% of the time. It would
> still be a huge improvement upon the current behavior.

Whatever we end up doing, we should not leave much chance of
underestimating the available memory, or we may render Sugar mostly useless.

This reminds me of when we deployed the free-space warning and users were
able to get into situations where they could not use Sugar because there
was not enough space, but also couldn't remove anything.

Regards,

Tomeu

___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel
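Tomeu's concern can be made concrete with a small sketch. It assumes the check is based on /proc/meminfo and that page cache and buffers count as memory the kernel can give back; counting only MemFree would underestimate what is available and block launches needlessly. The function names, the sample figures and the two reserve values are hypothetical, not anything from Sugar.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def available_kb(meminfo):
    """Rough estimate: free pages plus buffers and page cache.

    Assumption: "Cached" is treated as fully reclaimable, which is
    optimistic because part of it (e.g. tmpfs data) cannot be dropped.
    """
    return sum(meminfo.get(key, 0) for key in ("MemFree", "Buffers", "Cached"))


def may_launch(meminfo, reserve_kb):
    """Allow a new activity only if the estimate covers the reserve."""
    return available_kb(meminfo) >= reserve_kb


# Made-up sample, loosely sized like a 256 MB machine under load.
SAMPLE = """MemTotal:  236000 kB
MemFree:     6000 kB
Buffers:     4000 kB
Cached:     30000 kB
"""

info = parse_meminfo(SAMPLE)
print(info["MemFree"])           # what a naive check would look at
print(available_kb(info))        # estimate including reclaimable caches
print(may_launch(info, 30000))   # hypothetical optimistic reserve
print(may_launch(info, 60000))   # hypothetical worst-case reserve
```

With these sample numbers a check on MemFree alone would refuse even the optimistic reserve, which is the kind of false alarm that could leave Sugar unusable. On a real system the text would come from open("/proc/meminfo").read().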
Re: [Sugar-devel] [Dextrose] Stability stuff
On Fri, 03-09-2010 at 11:23 -0400, Martin Abente wrote:
> Well, that's true in theory, assuming all the activities are properly
> designed for Sugar. In the field you already know that's not the case.
> Also... even when the activities are implemented in Python
> through the Activity class, the read and write methods need to be
> implemented by the programmer. That means it
> depends on the activity specifics again.

Yes, but if an activity fails to save when Sugar asks it to quit, then
it's already buggy today: we also have a "Stop" item in the menu of the
activity frame icon.

> > This is also a very good suggestion. We could start by doing this, which
> > is a lot easier and almost equally effective.
>
> I see it this way: why wait to get sick to do something about it?
> Preventive medicine is always better. Why wait for the machine to freeze
> (waiting 3 or more minutes until it's back to a usable state again) to do
> something about it, also with potential data loss?
>
> Having a message telling kids that the machine is too overloaded should be
> enough, with recommendations about saving any current work and closing
> earlier activities.
>
> This kind of mechanism should help overall stability, and it makes
> even more sense when you think about XO-1 scenarios.
>
> :)

Yes, I already agreed with you on this. The hard part of this patch
would be setting a threshold to disallow opening another activity.
Memory footprint of activities varies wildly. Shall we take the worst
case, pissing off users who knew what they were doing, or shall we be
optimistic, risking the current behavior in some cases?

If we also had the "graceful stop on oom" that I was thinking of,
we could afford to be optimistic in the "oom prevention" code.

Anyway, for now I'd vote for doing what you suggest in the easiest
possible way even if it saves the system only 50% of the time. It would
still be a huge improvement upon the current behavior.

--
 // Bernie Innocenti - http://codewiz.org/
 \X/ Sugar Labs - http://sugarlabs.org/
Re: [Sugar-devel] [Dextrose] Stability stuff
On Fri, Sep 3, 2010 at 11:23 AM, Martin Abente wrote:
> for sugar. In the field you already know that's not the case. Also... even
> when the activities are implemented in Python through the Activity class,
> the read and write methods need to be implemented by the programmer. That
> means it depends on the activity specifics again.

Well, if the activity doesn't save on close, it won't save on close and
will be messing up user data left and right. We cannot design the system
for brokenness...

cheers,

m
--
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
Re: [Sugar-devel] [Dextrose] Stability stuff
On Fri, 03 Sep 2010 01:46:02 +0200, Bernie Innocenti wrote:
> On Thu, 02-09-2010 at 09:26 -0400, Martin Abente wrote:
>> Weird, I really tried to trigger it on our last Dextrose build and it
>> never happened.
>
> Perhaps it's gone, but I have not done anything to fix it. The bug seems
> to be in Python, dbus or their dependencies.
>
>> The whole idea of killing activities is a little bit controversial, I
>> think. You have to assume too many things about activities; so far just a
>> few activities in Sugar use all the proper mechanisms. I am afraid that
>> in most cases kids would just lose their current work.
>
> I thought almost all activities understood the protocol for quitting
> cleanly (probably a dbus message). You can test it by clicking Stop from
> the menu on the icons at the top of the frame. That wouldn't work without
> sending an IPC message of some kind (probably we use dbus because we
> can't stand to use established X11 standards to manage applications).

Well, that's true in theory, assuming all the activities are properly
designed for Sugar. In the field you already know that's not the case.
Also... even when the activities are implemented in Python through the
Activity class, the read and write methods need to be implemented by the
programmer. That means it depends on the activity specifics again.

>> What about... If the system load is already close to a "critical" point,
>> Sugar could just stop new activities from being executed with a proper
>> warning, and suggestions.
>
> This is also a very good suggestion. We could start by doing this, which
> is a lot easier and almost equally effective.

I see it this way: why wait to get sick to do something about it?
Preventive medicine is always better. Why wait for the machine to freeze
(waiting 3 or more minutes until it's back to a usable state again) to do
something about it, also with potential data loss?

Having a message telling kids that the machine is too overloaded should be
enough, with recommendations about saving any current work and closing
earlier activities.

This kind of mechanism should help overall stability, and it makes even
more sense when you think about XO-1 scenarios.

:)
Re: [Sugar-devel] [Dextrose] Stability stuff
Excerpts from Bernie Innocenti's message of Fri Sep 03 01:46:02 +0200 2010:
> I thought almost all activities understood the protocol for quitting
> cleanly (probably a dbus message). You can test it by clicking Stop from
> the menu on the icons top of the frame. That wouldn't work without
> sending an IPC message of some kind (probably we use dbus because we
> can't stand to use established X11 standards to manage applications).

We simply use X11 to close the window:

In jarabe/view/palettes.py:

    class CurrentActivityPalette(BasePalette):
        def __stop_activate_cb(self, menu_item):
            self._home_activity.get_window().close(1)

Sascha
--
http://sascha.silbe.org/
http://www.infra-silbe.de/
Re: [Sugar-devel] [Dextrose] Stability stuff
On Thu, 02-09-2010 at 09:26 -0400, Martin Abente wrote:
> Weird, I really tried to trigger it on our last Dextrose build and it
> never happened.

Perhaps it's gone, but I have not done anything to fix it. The bug seems
to be in Python, dbus or their dependencies.

> The whole idea of killing activities is a little bit controversial, I
> think. You have to assume too many things about activities; so far just a
> few activities in Sugar use all the proper mechanisms. I am afraid that in
> most cases kids would just lose their current work.

I thought almost all activities understood the protocol for quitting
cleanly (probably a dbus message). You can test it by clicking Stop from
the menu on the icons at the top of the frame. That wouldn't work without
sending an IPC message of some kind (probably we use dbus because we
can't stand to use established X11 standards to manage applications).

> What about... If the system load is already close to a "critical" point,
> Sugar could just stop new activities from being executed with a proper
> warning, and suggestions.

This is also a very good suggestion. We could start by doing this, which
is a lot easier and almost equally effective.

--
 // Bernie Innocenti - http://codewiz.org/
 \X/ Sugar Labs - http://sugarlabs.org/
Re: [Sugar-devel] [Dextrose] Stability stuff
Weird, I really tried to trigger it on our last Dextrose build and it
never happened.

The whole idea of killing activities is a little bit controversial, I
think. You have to assume too many things about activities; so far just a
few activities in Sugar use all the proper mechanisms. I am afraid that in
most cases kids would just lose their current work.

What about... If the system load is already close to a "critical" point,
Sugar could just stop new activities from being executed with a proper
warning, and suggestions.
Re: [Sugar-devel] [Dextrose] Stability stuff
On Tue, 31-08-2010 at 12:00 -0400, Martin Abente wrote:
> Hey guys,
>
> I have been testing our last dextrose build for the XO 1, and comparing it
> to the previous os179py (Sugar 0.84) version. I have noticed that the
> kernel included in the os179os provides a mechanism for killing activities
> when the laptop runs out of memory.

This is the kernel out-of-memory killer. It's been in the Linux kernel
since 2.4.x, so all versions of the XO software include it.

The OOM killer makes its guess using heuristics. Sometimes it kills the
wrong process, leaving the machine in an unusable state. Killing processes
should be seen as a last-resort action, to recover from a situation that
should never happen.

Unfortunately, Sugar does not have any mechanism to gracefully quit
activities when memory is tight. Technically, this is not a bug in Sugar
or in Dextrose. OOM killing is the normal behavior of Linux even on
servers. It's just that it's too easy to trigger on the XO.

If you grep the OLPC and Sugar development mailing lists, you'll find many
threads in which this topic was discussed and solutions were proposed. One
such thread happened recently in conjunction with the discussion of
Anish's CPU & Memory meter.

I liked the solution that was proposed last: when memory gets tight, Sugar
simply asks the least recently used activity to quit (and thus save to the
journal). Optionally, we could put up a notification to let the user know
what happened (after the fact).

> I have tried to trigger this mechanism on our last dextrose build, but
> with no results. Is it possible that our last kernel does not include this
> mechanism? And in that case is there any reason for not including it?

No, all kernels include it. There are a bunch of tunables in /proc/sys/vm
to make the oom killer behave differently. The most important one is
"swappiness", which seems to be set a little bit too high on the current
OLPC kernels (all of them, not just Dextrose).
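To check what a given build actually ships with, the tunables under /proc/sys/vm can be read as ordinary text files. The following is a minimal sketch: the helper name read_vm_tunables and its base parameter are made up for illustration, and apart from "swappiness" the file names are my own selection of VM settings that exist on recent kernels, not ones named in the message.

```python
import os


def read_vm_tunables(names, base="/proc/sys/vm"):
    """Return {name: value-as-string}, or None where a file is unreadable."""
    result = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as handle:
                result[name] = handle.read().strip()
        except (IOError, OSError):
            # Missing on this kernel, or not running on Linux at all.
            result[name] = None
    return result


# Prints whatever the running kernel reports; values differ per system.
print(read_vm_tunables(["swappiness", "overcommit_memory", "min_free_kbytes"]))
```

Changing a value is done by writing to the same file as root, or with the sysctl tool; the sketch only reads, so it is safe to run on a deployed machine.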
Note that we're using the very same kernel that OLPC uses on os852, so
bugs should be shared.

--
 // Bernie Innocenti - http://codewiz.org/
 \X/ Sugar Labs - http://sugarlabs.org/
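The "ask the least recently used activity to quit" idea from the message above could be sketched roughly as follows. This is only an illustration of the policy, not Sugar code: the class ActivityLru, the function relieve_pressure, the low-water mark and the ask_to_quit callback are all hypothetical names, and how the shell would really ask an activity to stop is left to the caller.

```python
from collections import OrderedDict


class ActivityLru(object):
    """Tracks activities in order of last use, oldest first."""

    def __init__(self):
        self._order = OrderedDict()

    def touch(self, activity_id):
        """Record that the user just switched to this activity."""
        self._order.pop(activity_id, None)
        self._order[activity_id] = True

    def remove(self, activity_id):
        self._order.pop(activity_id, None)

    def pick_victim(self, active_id=None):
        """Least recently used activity, never the one in the foreground."""
        for activity_id in self._order:
            if activity_id != active_id:
                return activity_id
        return None


def relieve_pressure(lru, available_kb, low_water_kb, ask_to_quit,
                     active_id=None):
    """If memory is below the mark, ask one activity to quit and return it."""
    if available_kb >= low_water_kb:
        return None
    victim = lru.pick_victim(active_id)
    if victim is not None:
        ask_to_quit(victim)   # caller decides how (and notifies the user)
        lru.remove(victim)
    return victim


lru = ActivityLru()
for name in ("write", "paint", "browse"):
    lru.touch(name)
lru.touch("write")            # user returns to Write; Paint is now oldest

asked = []
print(relieve_pressure(lru, 8000, 20000, asked.append, active_id="write"))
print(asked)
```

Asking one activity at a time and re-checking memory afterwards keeps the policy conservative; it still depends on each activity saving properly when asked to quit, which is the point debated elsewhere in this thread.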