Re: No surprise on memory
Jim Gettys wrote:
>> Note that I'm not advocating in favor of soldered NAND - in fact I've
>> been one of the leading proponents of migrating to an SD-based storage
>> solution. I'm just pointing out that, if you're willing to buy an SD
>> card now (which is necessary for the SD-based swap solution), then you
>> are probably willing to buy one later.
>
> Soldered-down SD, however, may be an intermediate point; many fewer wires
> than a conventional chip.

There is eMMC for purposes like this, even.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel
Re: No surprise on memory
On Sat, Dec 20, 2008 at 23:57, Albert Cahalan wrote:
> [multiple people]
>
>> I recently learned a few very important things about Linux memory
>> management (I'm speaking about how it's supposed to work, irrespective
>> of any bugs). Operating systems experts already know all of this,
>> but I did not.
>
> This is a good reminder for those of us who tend to assume that
> anyone joining these discussions is an OS expert.
>
>> Conclusion: no magic get-out-of-jail-free card.
>
> There certainly isn't anything that can work with perfect
> reliability, even if policy was to disable overcommit and
> check malloc everywhere.
>
> Pay particular attention to how every proposed solution meets
> the real goals, remembering that nearly all activities save the
> user's data via a non-atomic process that requires memory.
> Simply put, it is never acceptable to force a well-behaved
> activity to die or live without the memory it demands.

It's great that you have such high quality standards, but I hope that if we fail to reach them, we find a way to at least improve on what we already have.

>>> It may be interesting to adjust the OOM score of some applications.
>>> This way it should be possible to protect the core applications
>>> (sugar-shell, journal, X, ...) from being killed in an OOM situation.
>>
>> I'm with Benjamin here, if the OOM killer kicked in soon enough and
>> activities were clearly marked as first candidates to be killed,
>> stability would be much much better.
>
> No way.
>
> The core applications only exist for the desired activity. If that
> desired activity must die, you might as well power off the laptop.
> The only processes slightly worth saving are klogd and syslogd,
> allowing developers to figure out what just happened.

I would agree with you if our users ran only one activity at a time, but I think that's not the case. If it were, we would rarely run out of memory.

What is happening right now is that the user launches several activities even if they may not need all of them, and when the system runs out of memory the whole system deadlocks, losing the data of the active activity and forcing a restart.

What I suggested would cause one of the background activities to die, and it would have already saved its state to the journal when it went to the background. The user would keep working, and to resume work on the killed activity would only need to go to the journal and click on one of the icons at the top of the list.

Powering off the system sounds less convenient for the user to me.

>> And if background activities were killed before the active one,
>> we would avoid data loss.
>
> Background activities can contain valuable unsaved state too.

All Python activities are requested to save their state when they go to the background. Non-Python activities could choose that event to trigger an autosave if they wanted.

> Of course, this is somewhat theoretical because kids do not
> intentionally have background activities. The ten activities running
> in the background on a typical kid's XO are a big contributor to
> these memory problems.

Then I don't understand why you said above that killing one of the background (probably non-intentional) activities is as bad as powering off the laptop.

>> Combine that with Mac OS (pre X) style "estimated memory allocation"
>> metadata for each activity and the user experience could perhaps even
>> work.
>
> This is key. Until the UI absolutely refuses to let the user start
> a set of 2+ activities that could run out of memory, memory problems
> are a given. For activities with unbounded memory usage, this means
> they get the machine exclusively.

I certainly cannot imagine activity authors filling in that field properly, given that most activities we have today have a wrong version field.

Also, as a child I used the old Mac OS quite a bit, and I remember having to mess with those fields quite often.

Regards,

Tomeu
Re: No surprise on memory
[multiple people]

> I recently learned a few very important things about Linux memory
> management (I'm speaking about how it's supposed to work, irrespective
> of any bugs). Operating systems experts already know all of this,
> but I did not.

This is a good reminder for those of us who tend to assume that anyone joining these discussions is an OS expert.

> Conclusion: no magic get-out-of-jail-free card.

There certainly isn't anything that can work with perfect reliability, even if policy was to disable overcommit and check malloc everywhere.

Pay particular attention to how every proposed solution meets the real goals, remembering that nearly all activities save the user's data via a non-atomic process that requires memory. Simply put, it is never acceptable to force a well-behaved activity to die or live without the memory it demands.

>> It may be interesting to adjust the OOM score of some applications.
>> This way it should be possible to protect the core applications
>> (sugar-shell, journal, X, ...) from being killed in an OOM situation.
>
> I'm with Benjamin here, if the OOM killer kicked in soon enough and
> activities were clearly marked as first candidates to be killed,
> stability would be much much better.

No way.

The core applications only exist for the desired activity. If that desired activity must die, you might as well power off the laptop. The only processes slightly worth saving are klogd and syslogd, allowing developers to figure out what just happened.

> And if background activities were killed before the active one,
> we would avoid data loss.

Background activities can contain valuable unsaved state too.

Of course, this is somewhat theoretical because kids do not intentionally have background activities. The ten activities running in the background on a typical kid's XO are a big contributor to these memory problems.

> Combine that with Mac OS (pre X) style "estimated memory allocation"
> metadata for each activity and the user experience could perhaps even
> work.

This is key. Until the UI absolutely refuses to let the user start a set of 2+ activities that could run out of memory, memory problems are a given. For activities with unbounded memory usage, this means they get the machine exclusively.
Re: No surprise on memory
This isn't directly related to swapping, but if anybody is curious about flash technology...

Al Fazio from Intel gave a good talk at Stanford EE380 last November. He had lots of details and numbers about flash technology. Good geek bait.

Intel is selling flash-based disks for laptops. They do the wear leveling and block rewriting in hardware. No changes to the OS required. Nice power savings. (Googling for "Intel SSD" gets lots of hits.)

Flash disks have really good seek times. If you pick the right benchmark, you can get some impressive numbers.

You can get the slides here:
http://www.stanford.edu/class/ee380/Abstracts/081112.html

You can get the video here (the talk was on November 12, 2008):
http://www.stanford.edu/class/ee380/
When the next quarter starts, this quarter will migrate to the archives. Look down below in the left column.

--
These are my opinions, not necessarily my employer's. I hate spam.
Re: No surprise on memory
On Thu, 2008-12-18 at 09:13 -1000, Mitch Bradley wrote:
> John Gilmore wrote:
> > Swapping to the soldered-in NAND chips is a very bad idea. It will
> > tend to wear them out rapidly. Even if you use load-leveling software
> > (e.g. swapping to a file in a jffs2 filesystem), the problem is that
> > if you do start wearing out serious numbers of flash blocks, the
> > laptop becomes toast; it requires a soldering iron and spare chips to
> > fix it.

John, do the math: for the current chips (single-level cells), life is of order 10^5 cycles. So you have 10^5 gigabytes of writing. This takes *a long* time.

Swapping is not an insane idea, once you have wear leveling. We don't do it now because JFFS2 cannot support swapping, and we don't have a wear-leveling layer beneath the file system. UBI and UBIFS fix this, and it is something we can consider.

> Well, maybe it's not as bad as all that. When the NAND wears out, then
> you can buy the SD card, thus deferring that purchase and taking
> advantage of Moore's law in the interim.

Lots of people tend to forget, however, that warm (and/or cold) salt air is a serious issue in many of the places we have to go. Any connector tends to die under these circumstances.

> Note that I'm not advocating in favor of soldered NAND - in fact I've
> been one of the leading proponents of migrating to an SD-based storage
> solution. I'm just pointing out that, if you're willing to buy an SD
> card now (which is necessary for the SD-based swap solution), then you
> are probably willing to buy one later.

Soldered-down SD, however, may be an intermediate point; many fewer wires than a conventional chip.

--
Jim Gettys
One Laptop Per Child
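The "do the math" step above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 1 GB capacity is inferred from the "10^5 gigabytes" figure, the daily swap write volume is a made-up number, and the function name is hypothetical. It also assumes ideal wear leveling and ignores write amplification.

```python
def flash_lifetime_years(capacity_gb, erase_cycles, write_gb_per_day):
    """Years until every block has spent its erase budget.

    Assumes perfectly even wear leveling and no write amplification,
    so the total writable volume is simply capacity * cycles.
    """
    total_writable_gb = capacity_gb * erase_cycles
    return total_writable_gb / write_gb_per_day / 365.0


# 1 GB of SLC NAND rated at 10^5 cycles, with a hypothetical
# 10 GB of swap traffic written per day.
years = flash_lifetime_years(1, 10**5, 10)
print(round(years, 1), "years")
```

Real lifetimes would be shorter, since wear leveling is never perfect and small writes cost whole erase blocks, but the order of magnitude is the point.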
Re: No surprise on memory
John Gilmore wrote:
> Swapping to the soldered-in NAND chips is a very bad idea. It will
> tend to wear them out rapidly. Even if you use load-leveling software
> (e.g. swapping to a file in a jffs2 filesystem), the problem is that
> if you do start wearing out serious numbers of flash blocks, the
> laptop becomes toast; it requires a soldering iron and spare chips to
> fix it.

Well, maybe it's not as bad as all that. When the NAND wears out, then you can buy the SD card, thus deferring that purchase and taking advantage of Moore's law in the interim.

Note that I'm not advocating in favor of soldered NAND - in fact I've been one of the leading proponents of migrating to an SD-based storage solution. I'm just pointing out that, if you're willing to buy an SD card now (which is necessary for the SD-based swap solution), then you are probably willing to buy one later.
Re: No surprise on memory
While the MTD layer does go to memory first, my thought about two swaps was slightly different. Depending on how they are managed, one of two things might happen (I assume the second swap isn't used until the first is full):

1. Less "busy" stuff gets migrated to the second swap, or
2. The second swap just gets the latest thing that needs swapping, even if it is busy.

(1) would be better if the second swap is on flash, to minimize wear ... this is like old-time hierarchical storage management on mainframes. Yes, I did roam the earth with the dinosaurs.

(2) would potentially allow for graceful OOM management by using something about second-swap usage as a signal to make the user pick an activity to close.

On Thu, Dec 18, 2008 at 9:55 AM, Martin Langhoff wrote:
> On Thu, Dec 18, 2008 at 1:43 PM, Carol Farlow Lerche wrote:
> > Since Linux allows multiple swap partitions, is there anything to be
> > gained by using two -- the first, a compcache swap file and the second
> > on flash, perhaps with Belyakov's MTD layer. First question is whether
> > Linux treats the two swap files in an order, such that it only uses the
> > flash swap if the compcache is exhausted. If so, instrument the I/O
> > rate to swap, first to study the relative sizing and second -- could
> > the I/O rate to the flash swap be a signal to prune activities?
>
> Both Belyakov and Richard Purdie seem to compress into a large-ish
> buffer in RAM first, and only past a certain threshold actually put it
> to NAND. This is to minimise number of erase ops, and improve
> performance, so the swaponflash driver Richard's posted does exactly
> what you describe.
>
> As to treating that as a signal, I dunno.
>
> I do think Sugar could monitor some mem stats and warn the user
> _early_ when under mem pressure. If more than one activity is open, it
> could suggest to close a bg activity...
>
> cheers,
>
> m

--
"Don't think for a minute that power concedes. We have to work like our future depends on it." -- Barack Obama
Re: No surprise on memory
On Thu, Dec 18, 2008 at 1:43 PM, Carol Farlow Lerche wrote:
> Since Linux allows multiple swap partitions, is there anything to be gained
> by using two -- the first, a compcache swap file and the second on flash,
> perhaps with Belyakov's MTD layer. First question is whether Linux treats
> the two swap files in an order, such that it only uses the flash swap if the
> compcache is exhausted. If so, instrument the I/O rate to swap, first to
> study the relative sizing and second -- could the I/O rate to the flash swap
> be a signal to prune activities?

Both Belyakov and Richard Purdie seem to compress into a large-ish buffer in RAM first, and only past a certain threshold actually put it to NAND. This is to minimise the number of erase ops and improve performance, so the swaponflash driver Richard has posted does exactly what you describe.

As to treating that as a signal, I dunno.

I do think Sugar could monitor some mem stats and warn the user _early_ when under mem pressure. If more than one activity is open, it could suggest closing a bg activity...

cheers,

m
--
martin.langh...@gmail.com
mar...@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
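The "monitor some mem stats and warn early" idea could look roughly like the following Python sketch. It is not Sugar code: the function names and the 10% threshold are assumptions made up for illustration, and the sample figures are invented. It only relies on the standard `/proc/meminfo` layout (`Key:   value kB`).

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def under_pressure(info, min_free_fraction=0.10):
    """True when free plus easily reclaimed memory drops below the threshold.

    The 10% default is an arbitrary illustrative value, not a tuned one.
    """
    available = info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
    return available < min_free_fraction * info["MemTotal"]


# Invented sample resembling a small-memory machine under load.
SAMPLE = """MemTotal:       237000 kB
MemFree:          4000 kB
Buffers:          1000 kB
Cached:          12000 kB
"""

if under_pressure(parse_meminfo(SAMPLE)):
    print("Low on memory: consider closing a background activity")
```

On a real system the text would come from reading `/proc/meminfo` periodically; what the shell then does with the warning is a UI decision outside this sketch.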
Re: No surprise on memory
The soldered in NAND is also 14 times slower on writes and half the speed of a good SD card. wad On Dec 18, 2008, at 5:51 AM, John Gilmore wrote: >> What about using a NAND partition as swap? Has this ever been done? >> Given that partition support is a recent development it seems >> unlikely. > > Swapping to the soldered-in NAND chips is a very bad idea. It will > tend to wear them out rapidly. Even if you use load-leveling software > (e.g. swapping to a file in a jfffs2 filesystem), the problem is that > if you do start wearing out serious numbers of flash blocks, the > laptop becomes toast; it requires a soldering iron and spare chips to > fix it. > > A much more reliable scheme would be to swap to an SD card, if one is > plugged in and contains a swap partition (or a file in its root called > SWAPFILE). See http://dev.laptop.org/ticket/8410. Even a small, > cheap SD card could double or triple the available virtual RAM space. > And if an SD card gets worn out, you merely pull it out of the laptop, > throw it away, and buy a new one (for a fraction of the original cost, > since Moore's Law has been working in your favor in the intervening > years). > > This doesn't solve the least-common-denominator problem of people > without > SD cards -- but it does offer a user, or a deployment, a very > simple and > relatively cheap way to solve most problems related to physical RAM > size. > > On the topic of memory overload in general: > > Older XO releases did much better things when they ran out of physical > memory: they tended to rapidly kill off some process, leaving the > system > largely functional. In 767, the system instead goes from usable to > molasses-like in a period of seconds, then freezes totally for minutes > or hours. As far as I know, nobody has debugged why that changed. > The > prior behavior was infinitely preferable. 
Re: No surprise on memory
After reading Belyakov's paper, a few questions for the experts occurred to me:

Since Linux allows multiple swap partitions, is there anything to be gained by using two -- the first, a compcache swap file and the second on flash, perhaps with Belyakov's MTD layer. First question is whether Linux treats the two swap files in an order, such that it only uses the flash swap if the compcache is exhausted. If so, instrument the I/O rate to swap, first to study the relative sizing and second -- could the I/O rate to the flash swap be a signal to prune activities?

On Thu, Dec 18, 2008 at 6:25 AM, Martin Langhoff wrote:
> On Thu, Dec 18, 2008 at 11:21 AM, Chris Ball wrote:
> > And see this old mail from Mitch:
> >
> > http://lists.laptop.org/pipermail/devel/2007-December/009030.html
> >
> > (Mitch doesn't mention how he feels about swapping in particular.
> > We could perhaps attempt to conduct some experimen
>
> Not sure if this particular patchseries ever made it, but Richard
> Purdie has been working on this area (marc lets you see all his posts
> to lkml, and it's all around mtd, compression, swap).
> http://marc.info/?l=linux-kernel&m=117285102508455&w=2
>
> Might also be worthwhile to ping him...
>
> cheers,
>
> martin-who's-reading-lkml-while-xs-reboots
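On the ordering question: Linux does give each swap area a priority (settable with `swapon -p`), fills higher-priority areas first, and lists the value in the last column of `/proc/swaps`. A small Python sketch that reads that table follows; the device names in the sample are purely illustrative, not taken from a real XO, and the function name is hypothetical.

```python
def swap_order(proc_swaps_text):
    """List swap areas in the order the kernel prefers them.

    Higher priority is used first, so sort on the Priority column
    (the last field of each row) in descending order.
    """
    areas = []
    for line in proc_swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore blank or malformed rows
        areas.append((fields[0], int(fields[-1])))
    areas.sort(key=lambda area: area[1], reverse=True)
    return [name for name, _priority in areas]


# Illustrative table: a compressed RAM swap ahead of a flash-backed one.
SAMPLE = """Filename Type Size Used Priority
/dev/mtdblock5 partition 65536 0 5
/dev/ramzswap0 partition 65536 1024 100
"""

print(swap_order(SAMPLE))
```

With priorities set this way, the flash-backed area would only see traffic once the compressed area is full, which is the precondition for using its I/O rate as a pressure signal.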
Re: No surprise on memory
On Thu, Dec 18, 2008 at 11:21 AM, Chris Ball wrote:
> And see this old mail from Mitch:
>
> http://lists.laptop.org/pipermail/devel/2007-December/009030.html
>
> (Mitch doesn't mention how he feels about swapping in particular.
> We could perhaps attempt to conduct some experimen

Not sure if this particular patchseries ever made it, but Richard Purdie has been working on this area (marc lets you see all his posts to lkml, and it's all around mtd, compression, swap).
http://marc.info/?l=linux-kernel&m=117285102508455&w=2

Might also be worthwhile to ping him...

cheers,

martin-who's-reading-lkml-while-xs-reboots
Re: No surprise on memory
Hi,

>> Swapping to the soldered-in NAND chips is a very bad idea. It
>> will tend to wear them out rapidly. Even if you use load-leveling
>> software (e.g. swapping to a file in a jffs2 filesystem), the
>> problem is that

> While I generally agree with you,
> www.celinux.org/elc08_presentations/belyakov_elc2008_compressed_swap_final_doc.pdf
> does seem to talk about the mtd driver having special handling for
> swap.

And see this old mail from Mitch:

http://lists.laptop.org/pipermail/devel/2007-December/009030.html

(Mitch doesn't mention how he feels about swapping in particular. We could perhaps attempt to conduct some experiments.)

- Chris.
--
Chris Ball
Re: No surprise on memory
On Thu, Dec 18, 2008 at 8:51 AM, John Gilmore wrote:
>> What about using a NAND partition as swap? Has this ever been done?
>> Given that partition support is a recent development it seems unlikely.
>
> Swapping to the soldered-in NAND chips is a very bad idea. It will
> tend to wear them out rapidly. Even if you use load-leveling software
> (e.g. swapping to a file in a jffs2 filesystem), the problem is that

While I generally agree with you,
www.celinux.org/elc08_presentations/belyakov_elc2008_compressed_swap_final_doc.pdf
does seem to talk about the mtd driver having special handling for swap.

cheers,

m
Re: No surprise on memory
> What about using a NAND partition as swap? Has this ever been done?
> Given that partition support is a recent development it seems unlikely.

Swapping to the soldered-in NAND chips is a very bad idea. It will tend to wear them out rapidly. Even if you use load-leveling software (e.g. swapping to a file in a jffs2 filesystem), the problem is that if you do start wearing out serious numbers of flash blocks, the laptop becomes toast; it requires a soldering iron and spare chips to fix it.

A much more reliable scheme would be to swap to an SD card, if one is plugged in and contains a swap partition (or a file in its root called SWAPFILE). See http://dev.laptop.org/ticket/8410. Even a small, cheap SD card could double or triple the available virtual RAM space. And if an SD card gets worn out, you merely pull it out of the laptop, throw it away, and buy a new one (for a fraction of the original cost, since Moore's Law has been working in your favor in the intervening years).

This doesn't solve the least-common-denominator problem of people without SD cards -- but it does offer a user, or a deployment, a very simple and relatively cheap way to solve most problems related to physical RAM size.

On the topic of memory overload in general:

Older XO releases did much better things when they ran out of physical memory: they tended to rapidly kill off some process, leaving the system largely functional. In 767, the system instead goes from usable to molasses-like in a period of seconds, then freezes totally for minutes or hours. As far as I know, nobody has debugged why that changed. The prior behavior was infinitely preferable.

John
Re: No surprise on memory
Erik, what is the latest status on Compcache? Obviously, this could relieve some of the pressure, but does not remove the need for an OOM strategy (or strategies).

Jameson

On Tue, Dec 16, 2008 at 11:38 AM, Martin Langhoff wrote:
> On Tue, Dec 16, 2008 at 1:40 PM, Erik Garrison wrote:
> > What about using a NAND partition as swap? Has this ever been done?
> > Given that partition support is a recent development it seems unlikely.
>
> There's been discussion on this list about it. I don't think the mtd
> driver does any wear-levelling, and the swap usage patterns are
> probably murder.
>
> googling about I landed this paper from an Intel guy -
> http://www.google.com.ar/search?q=linux+swap+mtd+nand
>
> www.celinux.org/elc08_presentations/belyakov_elc2008_compressed_swap_final_doc.pdf
>
> cheers,
>
> m
Re: No surprise on memory
On Tue, Dec 16, 2008 at 12:15:37PM -0600, Jameson Quinn wrote:
> Erik, what is the latest status on Compcache? Obviously, this could
> relieve some of the pressure, but does not remove the need for an OOM
> strategy (or strategies).

I gave our kernel devs a tested patch that applies it to a branch of our kernel git tree. I believe it has been pushed into one tree, but I am unsure.

Deepak: What next?

Erik

> On Tue, Dec 16, 2008 at 11:38 AM, Martin Langhoff wrote:
> > On Tue, Dec 16, 2008 at 1:40 PM, Erik Garrison wrote:
> > > What about using a NAND partition as swap? Has this ever been done?
> > > Given that partition support is a recent development it seems unlikely.
> >
> > There's been discussion on this list about it. I don't think the mtd
> > driver does any wear-levelling, and the swap usage patterns are
> > probably murder.
> >
> > googling about I landed this paper from an Intel guy -
> > http://www.google.com.ar/search?q=linux+swap+mtd+nand
> >
> > www.celinux.org/elc08_presentations/belyakov_elc2008_compressed_swap_final_doc.pdf
> >
> > cheers,
> >
> > m
Re: No surprise on memory
On Tue, Dec 16, 2008 at 1:40 PM, Erik Garrison wrote:
> What about using a NAND partition as swap? Has this ever been done?
> Given that partition support is a recent development it seems unlikely.

There's been discussion on this list about it. I don't think the mtd driver does any wear-levelling, and the swap usage patterns are probably murder.

Googling about, I landed on this paper from an Intel guy -
http://www.google.com.ar/search?q=linux+swap+mtd+nand

www.celinux.org/elc08_presentations/belyakov_elc2008_compressed_swap_final_doc.pdf

cheers,

m
Re: No surprise on memory
On Tue, Dec 16, 2008 at 12:45:52PM -0200, Martin Langhoff wrote:
> On Tue, Dec 16, 2008 at 12:34 PM, Tomeu Vizoso wrote:
> > Well, I wasn't trying to give a "solution", just suggested a "less
> > bad" way to fail. IMO, just trying to find the perfect solution while
> > not doing anything to improve what we have now is the worst of the
> > possibilities.
>
> Oh, sure. I just thought that your proposed enhancement combines well
> with the stuff we've been discussing before :-)
>
> One good trick plus another one...

What about using a NAND partition as swap? Has this ever been done? Given that partition support is a recent development it seems unlikely.

It could also (theoretically) allow us to do power-fully-off hibernation, a feature which seems very useful given the power usage patterns I've heard about from the field (laptop run until *dead*, suspend not used because of high power draw, hard power off).

Erik
Re: No surprise on memory
On Tue, Dec 16, 2008 at 12:34 PM, Tomeu Vizoso wrote:
> Well, I wasn't trying to give a "solution", just suggested a "less
> bad" way to fail. IMO, just trying to find the perfect solution while
> not doing anything to improve what we have now is the worst of the
> possibilities.

Oh, sure. I just thought that your proposed enhancement combines well with the stuff we've been discussing before :-)

One good trick plus another one...

m
Re: No surprise on memory
On Tue, Dec 16, 2008 at 15:29, Martin Langhoff wrote: > On Tue, Dec 16, 2008 at 12:19 PM, Tomeu Vizoso wrote: >> I'm with Benjamin here, if the OOM killer kicked in soon enough and >> activities were clearly marked as first candidates to be killed, >> stability would be much much better. > > Combine that with Mac OS (pre X) style "estimated memory allocation" > metadata for each activity and the user experience could perhaps even > work. > > In terms of Ben's original email, what happens is a social problem, > IMHO. Code that handles memory allocation failures is bloody hard to > write -- because whatever decent handling you might want apply to the > situation will also _need_ to allocate memory. So after many years of > trying, the solution was a combination of virtual memory and a lie: > memory will never run out. And if it does, the process will die badly > because there is no way to die a nice death at that point. > > So we have some 15 years or more of programming with this "soft > malloc" and "memory never ends" mantra. It works, and you can even > request a ton of memory that doesn't exist... as long as you don't try > to use it. > > Lots of nice tricks fall out of it - mmapping, etc - but again, the > moment you actually use up memory, ouch. > > So IME the solution is to use very little memory - regardless of > allocation. Malloc is just like a credit card. Well, I wasn't trying to give a "solution", just suggested a "less bad" way to fail. IMO, just trying to find the perfect solution while not doing anything to improve what we have now is the worst of the possibilities. Regards, Tomeu ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: No surprise on memory
On Tue, Dec 16, 2008 at 12:19 PM, Tomeu Vizoso wrote:
> I'm with Benjamin here, if the OOM killer kicked in soon enough and
> activities were clearly marked as first candidates to be killed,
> stability would be much much better.

Combine that with Mac OS (pre X) style "estimated memory allocation" metadata for each activity and the user experience could perhaps even work.

In terms of Ben's original email, what happens is a social problem, IMHO. Code that handles memory allocation failures is bloody hard to write -- because whatever decent handling you might want to apply to the situation will also _need_ to allocate memory. So after many years of trying, the solution was a combination of virtual memory and a lie: memory will never run out. And if it does, the process will die badly because there is no way to die a nice death at that point.

So we have some 15 years or more of programming with this "soft malloc" and "memory never ends" mantra. It works, and you can even request a ton of memory that doesn't exist... as long as you don't try to use it.

Lots of nice tricks fall out of it - mmapping, etc - but again, the moment you actually use up memory, ouch.

So IME the solution is to use very little memory - regardless of allocation. Malloc is just like a credit card.

cheers,

m
Re: No surprise on memory
2008/12/16 Benjamin Berg:
> On Mon, 2008-12-15 at 23:21 -0500, Benjamin M. Schwartz wrote:
>> I'm no expert, but making the system work well without overcommit would
>> probably require extensive modifications to the python interpreter, the
>> fd.o libraries (dbus, gstreamer, telepathy, etc.), gecko, and maybe even
>> X. All of these would need to allocate only as much memory as they need,
>> and react appropriately when malloc returns NULL. In other words, 'tain't
>> gonna happen.
>
> GLib will abort when g_malloc fails. This means that most libraries that
> use glib (GTK+) will not handle out of memory at all.
>
> It may be interesting to adjust the OOM score of some applications. This
> way it should be possible to protect the core applications (sugar-shell,
> journal, X, ...) from being killed in an OOM situation.

I'm with Benjamin here: if the OOM killer kicked in soon enough and activities were clearly marked as first candidates to be killed, stability would be much, much better.

And if background activities were killed before the active one, we would avoid data loss.

Regards,

Tomeu
Re: No surprise on memory
On Mon, 2008-12-15 at 23:21 -0500, Benjamin M. Schwartz wrote:
> I'm no expert, but making the system work well without overcommit would
> probably require extensive modifications to the python interpreter, the
> fd.o libraries (dbus, gstreamer, telepathy, etc.), gecko, and maybe even
> X. All of these would need to allocate only as much memory as they need,
> and react appropriately when malloc returns NULL. In other words, 'tain't
> gonna happen.

GLib will abort when g_malloc fails. This means that most libraries that
use glib (GTK+) will not handle out of memory at all.

It may be interesting to adjust the OOM score of some applications. This
way it should be possible to protect the core applications (sugar-shell,
journal, X, ...) from being killed in an OOM situation.

Benjamin
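For reference, adjusting a process's OOM score comes down to writing a number into a per-process file under /proc. The sketch below assumes a kernel that provides /proc/<pid>/oom_score_adj (range -1000 to 1000); kernels from the time of this thread provide /proc/<pid>/oom_adj (range -17 to 15) instead. The helper name set_oom_score_adj and the proc_root parameter are made up for the example.

```python
import os

# Range accepted by /proc/<pid>/oom_score_adj on kernels that have it.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write an OOM score adjustment for pid and return the path written.

    Negative values make the process a less likely OOM victim, positive
    values a more likely one. proc_root is a hypothetical parameter that
    exists only so the function can be tried against a scratch directory.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"adjustment {value} outside allowed range")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(value))
    return path
```

Lowering the value (protecting a process) normally requires root privileges, so something like the session launcher would have to do this on behalf of sugar-shell, the journal and X.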
Re: No surprise on memory
On Mon, Dec 15, 2008 at 11:21:18PM -0500, Benjamin M. Schwartz wrote:
> I'm no expert, but making the system work well without overcommit would
> probably require extensive modifications to the python interpreter, the
> fd.o libraries (dbus, gstreamer, telepathy, etc.), gecko, and maybe even
> X. All of these would need to allocate only as much memory as they need,
> and react appropriately when malloc returns NULL. In other words, 'tain't
> gonna happen.

Couldn't we instrument malloc to report when it returns NULL (into an
area of memory we have helpfully set aside for the purpose) and then
report those events during testing, in order to find out and fix those
instances of overallocation?

--
James Cameron    mailto:qu...@us.netrek.org    http://quozl.netrek.org/
No surprise on memory
I recently learned a few very important things about Linux memory
management (I'm speaking about how it's supposed to work, irrespective
of any bugs). Operating systems experts already know all of this, but I
did not.

1. Malloc lies. It will happily return a pointer to an allocation
   larger than the entire amount of physical memory, just hoping that
   you won't use it. This is called "overcommit".

2. Even without swap, the system will never actually run out of memory.
   Instead, as some program attempts to make use of the memory that it
   has already allocated, the kernel will start paging out all clean
   pages that are not currently in use. At this point, the system has
   so little remaining free memory that only the specific pages of
   binaries that are currently in use can be held in memory. The system
   is essentially running executables directly from disk, which is so
   slow that it would take ages to finally run out of memory. Bernie
   helpfully compared this type of thrashing to Zeno's paradox.

3. To avoid getting stuck in this situation, the kernel has an "OOM
   killer". This is a misnomer. The OOM killer picks a process to kill
   _before_ OOM is reached. It does this either because the system is
   already low on memory and is paging lots of stuff out to disk, or
   because the system is overcommitted by an unacceptably large ratio.

I found this very surprising, and in some ways I still do. I read many
justifications of these decisions, but I was curious to test it for
myself. I was happy, therefore, to learn about
/proc/sys/vm/overcommit_memory and /proc/sys/vm/overcommit_ratio. These
knobs control the memory system. By setting overcommit_memory to "2"
and overcommit_ratio to "95", it is possible to approximate the
behavior that a naive C programmer might expect from the kernel. In
this mode, malloc will only return a non-null pointer if the allocation
can actually be fulfilled in physical memory. Also, this setting of
overcommit_ratio ensures that 5% of memory is reserved to the kernel.

I tried running 767 on an XO in this mode, and the bottom line is that
the conventional wisdom is correct. I set the parameters and restarted
X, and Sugar came up fine. Every view displayed correctly, including
the Journal and the mesh view with buddies from Gabble. Some
activities, like Calculate, run fine, but the big ones, like Record and
Browse, are semi-functional at best, and at worst cause Sugar to lock
up entirely.

This is not too surprising. I'm no expert, but making the system work
well without overcommit would probably require extensive modifications
to the python interpreter, the fd.o libraries (dbus, gstreamer,
telepathy, etc.), gecko, and maybe even X. All of these would need to
allocate only as much memory as they need, and react appropriately when
malloc returns NULL. In other words, 'tain't gonna happen.

Conclusion: no magic get-out-of-jail-free card.

--Ben
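To put a number on the overcommit_memory=2 / overcommit_ratio=95 setting: in that mode the kernel refuses allocations once the total committed address space would exceed roughly swap plus overcommit_ratio percent of RAM (shown as CommitLimit in /proc/meminfo). The sketch below is a back-of-the-envelope version of that arithmetic. The function names are made up, huge pages are ignored, and the 256 MiB figure is used only as an example of a small machine with no swap.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate CommitLimit in kB under overcommit_memory=2.

    Simplified: swap plus overcommit_ratio percent of RAM, ignoring
    huge pages, which the kernel subtracts from RAM first.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def read_vm_knob(name, sysctl_root="/proc/sys/vm"):
    """Read an integer knob such as overcommit_memory or overcommit_ratio.

    sysctl_root is a hypothetical parameter so the function can be
    pointed at a scratch directory instead of the real /proc.
    """
    with open(f"{sysctl_root}/{name}") as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # Example figures only: 256 MiB of RAM, no swap, ratio of 95.
    print(commit_limit_kb(256 * 1024, 0, 95), "kB may be committed")
```

Under these example figures only about 243 MiB of address space can be promised to all processes combined, so a large activity can be refused memory it might never have touched.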