Re: [sugar] Trial-2 & pushing out bugs...
> > 1) bugs that you believe absolutely must get fixed for Trial-2

I am not certain why, but http://dev.laptop.org/ticket/2319 was milestoned to B-Test-2. This is a really bad hardware bug. It's going to physically bust machines. We want to make sure this is fixed before anything goes out the door.

jp
TamTam
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel
Re: Trial-2 & pushing out bugs...
> 1) bugs that you believe absolutely must get fixed for Trial-2
> 2) bugs that you believe are very important, and for which you think we
> should fix if we can, for which you believe non-invasive fixes are
> doable or in hand.
> 3) bugs we should definitely push later.
> 4) "one liner" cosmetic fixes, with high bang for the buck. The risk
> had better be very low.

1 ---
null

2 ---
#2014: This should be a quick fix once I get Marco a screenshot. It will make sharing more intuitive, and collaboration is important to this trial.
#1129: This isn't as essential as the above.

3 ---
#1276: I thought that this was mostly handled in ticket #663, but in either case, we need full APIs and guidelines on this before it can be implemented correctly. Related: #1441, #1815. Jim, please help me out on the status of these. Thanks.
#1502: Design aside, this will require a lot of implementation, which will happen post T2.

4 ---
#2148: If #2014 is done (listed above), this only requires one new icon.
Re: Trial-2 & pushing out bugs...
On Wed, 2007-07-18 at 16:17 -0400, Jim Gettys wrote:
> Could each of you working on Trial-2 please go over your assigned track
> bugs and reply to this mail categorizing *all* of your bugs into four
> bins, by Friday?
>
> 1) bugs that you believe absolutely must get fixed for Trial-2
> 2) bugs that you believe are very important, and for which you think we
> should fix if we can, for which you believe non-invasive fixes are
> doable or in hand.
> 3) bugs we should definitely push later.
> 4) "one liner" cosmetic fixes, with high bang for the buck. The risk
> had better be very low.

1)
#2057, #2309: Progress on the datastore side. Will test in the next build.

2)
#2285: Shouldn't be too hard.

3)
#1281: Looks like too risky a change to make now, as it messes with the frame activation logic.
#2056: We can copy files between devices by drag and drop. Additional mechanisms can be implemented after Trial-2.
#2044: We are appending Today or Yesterday to the date. If Eben thinks it's important for the user experience to add some more complex support, he needs to comment on this.
#2245: Not easy; better to just update the preview on activity close for Trial-2.

4)
#2167: It should be easy to just add the icon to the left of the entry. Putting it inside the entry looks too risky to me.

Tomeu
Re: Activation and new builds.
On 7/19/07, Zephaniah E. Hull <[EMAIL PROTECTED]> wrote:
> How does that impact those of us wishing to boot from SD for development
> purposes?

Make sure you're using the jffs2 image (either directly, or unpacked from the tree.tar file). Copy the activation lease from /security/activate.key on NAND to the SD card if necessary, or ensure that the USB key that had the lease written during activation is present during first boot.

The activation ramdisk currently mounts root from mtd0; I'll update it to mount root from the source of the ramdisk, wherever that was. Watch the build changelog for this change.

--scott

--
( http://cscott.net/ )
Re: Dramatic image size reduction
On Fri, 2007-07-20 at 10:23 -0400, Bernardo Innocenti wrote:
> Build 511 was 290MB, build 513 dropped to 218MB, but
> I couldn't find anything in the ChangeLog to justify
> this dramatic improvement. Does anyone have a clue?

The library was broken and was removed. I should have noted that in the changelog.

Dan
Re: OOM manager project
Hello Jim,

We (PlanetLab) have found that OOM does some relatively bad things, causing a system to get into an unusable state. We replaced OOM with something that just panics and reboots rather than letting the system get into an unrecoverable state, which we need as many of our PL servers are in remote locations and unattended (kinda like your mini-servers will be, but unlike your laptops). We then introduced a user-level OOM governor, which is probably something far more rudimentary than what you are after.

Our governor, called pl_mom because "she cleans up your mess", assumes that separate applications/services are instantiated in separate vservers (slices). From what I gather, this is definitely the direction that OLPC is going for the laptop and mini-server-gateways, so our approach might be applicable, at least from a thought perspective.

What does pl_mom do? At the moment she kills the vserver with the largest aggregate VSZ (i.e., all processes within that vserver). This works for PlanetLab, but might not be the best approach for OLPC. We have found that most OOM scenarios are caused by a slow leaker that has its pages swapped out by kswapd (which happens on the order of a few hours and is hard to detect with the current vm metrics we peek at).

Since pl_mom does the trick for our usage scenario on PlanetLab, for now we have not had an incentive to improve it further. However, one should definitely look at better vm statistics to make a better choice than largest aggregate VSZ.

The code for pl_mom is available via anon cvs from:

    cvs -d :pserver:[EMAIL PROTECTED]:/cvs co pl_mom

Take a peek at swap_mom.py and its helper functions in pl_mom.py.

I'm cc'ing Faiyaz Ahmed, who is the person at Princeton who is currently maintaining pl_mom.

Best regards,
Marc

Jim Gettys wrote:
> OLPC needs an OOM governor, so that the "right" process gets shot when we
> run low on RAM, and that processes that might get shot know enough to
> save state for restart. As you know, various problems appear if the
> wrong process is killed, usually resulting in needing a restart.
>
> Note that the kernel has to be able to recover memory when it needs it,
> or it will deadlock: this is a situation where the kernel must be in
> control, but user space could cooperate much better than it does today,
> by providing appropriate hints. So don't say: "the kernel shouldn't
> kill processes: user space should"; that design doesn't fly.
>
> Here's Kimmo Hämäläinen's description of the (current) kernel OOM killer.
>
> The OOM killer selects a process to kill by assigning a score to each
> process; the process with the highest score is the lucky winner that
> will be killed. The current OOM score for
> a process is visible in proc. The entry is in /proc/PID/oom_score. The
> starting point of the score is the amount of memory consumed by the
> process and its children. This value is adjusted as follows:
> • It is set to zero if the process has no memory management or if the
> process has a negative
> nice value (this can be used for protecting processes from the killer).
> • Divided by the square root of the CPU time consumed by the process.
> • Divided by the square root of the square root of the run time of the
> process.
> • Multiplied by 2 if it is a process with a positive nice value.
> • Divided by 4 if it is a superuser process.
> • Divided by 4 if it is a process with direct hardware access.
> • Finally, the value is adjusted (shifted either left or right) by the
> oom_adj value. It is shifted left in case the value is positive and
> right in case the value is negative.
> This means that a negative oom_adj value will decrease the score and
> also decrease the risk that the particular process will be killed. A
> positive value will have the opposite effect. The value should be no
> smaller than -16 and no larger than 15.
>
> Please note that you can set the oom_adj value in the proc file system.
> It is located at /proc/PID/oom_adj. For more information about how the
> OOM killer behaves, see the Linux kernel source code, mm/oom_kill.c in
> particular.
>
> So we need an OOM killer helper.
>
> We have the ability to provide the kernel with much of the
> information it needs for much better behavior, if we choose.
>
> I see this project evolving through the following incremental
> improvements (and incremental difficulty) as set out below:
>
> 1) start by setting the oom_adj appropriately so that the processes we
> really care about don't get shot.
>
> 2) make this a window manager plug in (plug in, as people including us
> may end up using other window managers) that uses the stacking order on
> the screen to rank order the activities that are running.
>
> 3) provide a mechanism by which applications may be given a hint that
> they might find it good to save enough state for a checkpoint restart,
> because they are likely a good candidate for shooting.
>
> 4) use the XRes facilities in X (and/or m
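
To make the policy described in Marc's message concrete (kill the vserver with the largest aggregate VSZ), here is a minimal Python sketch. It is not taken from pl_mom: the function name `pick_victim_slice`, the `(slice_name, vsz_kb)` input format, and the idea of passing in pre-collected per-process figures are all assumptions made for illustration. How the per-process VSZ and slice membership are gathered is left out entirely.

```python
from collections import defaultdict


def pick_victim_slice(processes):
    """Return (slice_name, total_vsz_kb) for the slice with the largest
    aggregate VSZ, or None if no processes were given.

    `processes` is a hypothetical iterable of (slice_name, vsz_kb) pairs,
    one pair per process. This only illustrates the selection rule from
    the message; it does not kill anything.
    """
    totals = defaultdict(int)
    for slice_name, vsz_kb in processes:
        # Sum the virtual size of every process belonging to the slice.
        totals[slice_name] += vsz_kb
    if not totals:
        return None
    # The slice with the largest summed VSZ is the candidate to kill.
    return max(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    sample = [("slice_a", 100), ("slice_b", 250), ("slice_a", 200)]
    print(pick_victim_slice(sample))
```

As the message itself notes, aggregate VSZ is a crude metric; a slow leaker whose pages are swapped out may be a better target than the slice with the largest virtual size, so the `max` key is the part one would expect to change.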
Re: OOM manager project
On Fri, 2007-07-20 at 17:05 +0200, Carl-Daniel Hailfinger wrote:
> On 20.07.2007 16:37, Jim Gettys wrote:
> > Note that the kernel has to be able to recover memory when it needs it,
> > or it will deadlock: this is a situation where the kernel must be in
> > control, but user space could cooperate much better than it does today,
> > by providing appropriate hints. So don't say: "the kernel shouldn't
> > kill processes: user space should"; that design doesn't fly.
> > [...]
> > So we need an OOM killer helper.
> >
> > We have the ability to provide the kernel with much of the
> > information it needs for much better behavior, if we choose.
> > [...]
> > 5) see if there are better OOM algorithms than Linux presently has.
>
> How does the new proposal relate to the OOM discussion here:
> http://www.redhat.com/archives/olpc-software/2006-March/msg00197.html
>
> Have we simply given up trying to beat userspace into shape?

Of course not, at least for what we ship. But the set of stuff people will run will include stuff that is bloated garbage.

My real point is that there is lots of useful information in various parts of the system, and we're currently using almost *none* of it for choosing what to do.

Two comments:

1) don't prejudge what the long term solution should be, including my straw man, which is just to try to spur some real action.

2) the enemy of the good is the perfect; doing nothing, which is what we're doing now, isn't a great position. I suspect we can do lots better than now, without taking on the entire problem up front.

- Jim

--
Jim Gettys
One Laptop Per Child
Re: OOM manager project
On 20.07.2007 16:37, Jim Gettys wrote:
> Note that the kernel has to be able to recover memory when it needs it,
> or it will deadlock: this is a situation where the kernel must be in
> control, but user space could cooperate much better than it does today,
> by providing appropriate hints. So don't say: "the kernel shouldn't
> kill processes: user space should"; that design doesn't fly.
> [...]
> So we need an OOM killer helper.
>
> We have the ability to provide the kernel with much of the
> information it needs for much better behavior, if we choose.
> [...]
> 5) see if there are better OOM algorithms than Linux presently has.

How does the new proposal relate to the OOM discussion here:
http://www.redhat.com/archives/olpc-software/2006-March/msg00197.html

Have we simply given up trying to beat userspace into shape?

Regards,
Carl-Daniel
Re: Dramatic image size reduction
Hi,

> Build 511 was 290MB, build 513 dropped to 218MB, but I couldn't
> find anything in the ChangeLog to justify this dramatic
> improvement. Does anyone have a clue?

The difference is that the library (/home/olpc/Library/) got taken out. (Then it got put back in, and we went up past 300M, and then it got taken out again.)

- Chris.
--
Chris Ball <[EMAIL PROTECTED]>
OOM manager project
OLPC needs an OOM governor, so that the "right" process gets shot when we run low on RAM, and that processes that might get shot know enough to save state for restart. As you know, various problems appear if the wrong process is killed, usually resulting in needing a restart.

Note that the kernel has to be able to recover memory when it needs it, or it will deadlock: this is a situation where the kernel must be in control, but user space could cooperate much better than it does today, by providing appropriate hints. So don't say: "the kernel shouldn't kill processes: user space should"; that design doesn't fly.

Here's Kimmo Hämäläinen's description of the (current) kernel OOM killer.

The OOM killer selects a process to kill by assigning a score to each process; the process with the highest score is the lucky winner that will be killed. The current OOM score for a process is visible in proc. The entry is in /proc/PID/oom_score. The starting point of the score is the amount of memory consumed by the process and its children. This value is adjusted as follows:

• It is set to zero if the process has no memory management or if the process has a negative nice value (this can be used for protecting processes from the killer).
• Divided by the square root of the CPU time consumed by the process.
• Divided by the square root of the square root of the run time of the process.
• Multiplied by 2 if it is a process with a positive nice value.
• Divided by 4 if it is a superuser process.
• Divided by 4 if it is a process with direct hardware access.
• Finally, the value is adjusted (shifted either left or right) by the oom_adj value. It is shifted left in case the value is positive and right in case the value is negative.

This means that a negative oom_adj value will decrease the score and also decrease the risk that the particular process will be killed. A positive value will have the opposite effect. The value should be no smaller than -16 and no larger than 15.

Please note that you can set the oom_adj value in the proc file system. It is located at /proc/PID/oom_adj. For more information about how the OOM killer behaves, see the Linux kernel source code, mm/oom_kill.c in particular.

So we need an OOM killer helper.

We have the ability to provide the kernel with much of the information it needs for much better behavior, if we choose.

I see this project evolving through the following incremental improvements (and incremental difficulty) as set out below:

1) start by setting the oom_adj appropriately so that the processes we really care about don't get shot.

2) make this a window manager plug in (plug in, as people including us may end up using other window managers) that uses the stacking order on the screen to rank order the activities that are running.

3) provide a mechanism by which applications may be given a hint that they might find it good to save enough state for a checkpoint restart, because they are likely a good candidate for shooting.

4) use the XRes facilities in X (and/or modify X) to provide the kernel with the pixmap usage on a process ID basis, for local applications/activities.

5) see if there are better OOM algorithms than Linux presently has.

Discussion? Anyone want to take on this project, or parts of this project?

- Jim

--
Jim Gettys
One Laptop Per Child
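
As a rough illustration of the scoring heuristic described in the message above, here is a minimal Python sketch. It follows only the prose description quoted in the message, not the kernel source, and should be read as a way to reason about which processes a given oom_adj protects rather than as a reimplementation. The function name, the parameter names, the units of the two time values, the zero guards on the divisions, and the truncation to an integer before shifting are all assumptions made for illustration.

```python
import math


def oom_score_sketch(mem, cpu_time, run_time, nice=0, has_mm=True,
                     is_superuser=False, has_hw_access=False, oom_adj=0):
    """Approximate the score described in the message (hypothetical helper).

    mem       -- memory consumed by the process and its children
    cpu_time  -- CPU time consumed (units are an assumption)
    run_time  -- run time of the process (units are an assumption)
    oom_adj   -- adjustment; the message gives the range -16..15
    """
    if not -16 <= oom_adj <= 15:
        raise ValueError("oom_adj should be between -16 and 15")
    # No memory management, or a negative nice value: score is zero.
    if not has_mm or nice < 0:
        return 0
    score = float(mem)
    # Guard against division by zero for brand-new processes (assumption).
    if cpu_time > 0:
        score /= math.sqrt(cpu_time)
    if run_time > 0:
        score /= math.sqrt(math.sqrt(run_time))
    if nice > 0:
        score *= 2
    if is_superuser:
        score /= 4
    if has_hw_access:
        score /= 4
    score = int(score)
    # Positive oom_adj shifts left (more likely to be killed),
    # negative oom_adj shifts right (less likely to be killed).
    if oom_adj > 0:
        score <<= oom_adj
    elif oom_adj < 0:
        score >>= -oom_adj
    return score


if __name__ == "__main__":
    base = oom_score_sketch(1600, cpu_time=4, run_time=16)
    protected = oom_score_sketch(1600, cpu_time=4, run_time=16, oom_adj=-2)
    print(base, protected)
```

Because the adjustment is a bit shift, each step of oom_adj doubles or halves the score, so a small negative value already gives an important process substantial protection relative to its neighbours.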
Dramatic image size reduction
Build 511 was 290MB, build 513 dropped to 218MB, but I couldn't find anything in the ChangeLog to justify this dramatic improvement. Does anyone have a clue?

--
// Bernardo Innocenti
\X/ http://www.codewiz.org/
ChangeLog
How about automatically posting the ChangeLog files to the devel@ mailing list in the fashion of Fedora?

Changelog entries are sometimes the starting point for interesting discussion, and a form of post-mortem review process.

--
// Bernardo Innocenti
\X/ http://www.codewiz.org/