Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
Hi!

> > IMVHO every developer involved in memory-management (and indeed, any
> > software development; the authors of ntpd come to mind here) should
> > have a 386 with 4MB of RAM and some 16MB of swap. Nowadays I have the
> > luxury of a 486 with 8MB of RAM and 32MB of swap as a firewall, but it's
> > still a pain to work with.
>
> If you really want to have fun, remove all swap...

My handheld has 12MB of RAM and no swap ;-), and that's a pretty big
machine for a handheld.

								Pavel
PS: Swapping on a flash disk is a bad idea, right?
--
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
Hi!

> > IMVHO every developer involved in memory-management (and indeed, any
> > software development; the authors of ntpd come to mind here) should
> > have a 386 with 4MB of RAM and some 16MB of swap. Nowadays I have the
> > luxury of a 486 with 8MB of RAM and 32MB of swap as a firewall, but it's
> > still a pain to work with.
>
> You're absolutely right. The smallest thing I'm testing with
> on a regular basis is my dual pentium machine, booted with
> mem=8m or mem=16m.
>
> Time to hunt around for a 386 or 486 which is limited to such
> a small amount of RAM ;)

Buy an Agenda handheld: 16MB flash, 8MB RAM, X, the size of a Palm. It is
definitely a sexier machine than the average 486.

[Or get a Philips Velo 1, if you want a keyboard ;-)]
								Pavel
--
The best software in life is free (not shareware)!		Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Wed, May 23, 2001 at 05:51:50PM +, Scott Anderson wrote:
> David Weinehall wrote:
> > IMVHO every developer involved in memory-management (and indeed, any
> > software development; the authors of ntpd come to mind here) should
> > have a 386 with 4MB of RAM and some 16MB of swap. Nowadays I have the
> > luxury of a 486 with 8MB of RAM and 32MB of swap as a firewall, but it's
> > still a pain to work with.
>
> If you really want to have fun, remove all swap...

Oh, I've done some testing without swap too, mainly to test Rik's
oom-killer. Seemed to work pretty well. Can't say it was enjoyable,
though.

/David
  _                                                                 _
 // David Weinehall <[EMAIL PROTECTED]> /> Northern lights wander   \\
//  Project MCA Linux hacker        //  Dance across the winter sky //
\>  http://www.acc.umu.se/~tao/
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Thu, 24 May 2001, Rik van Riel wrote:

> > > > OK.. let's forget about throughput for a moment and consider
> > > > those annoying reports of 0 order allocations failing :)
> > >
> > > Those are ok.  All failing 0 order allocations are either
> > > atomic allocations or GFP_BUFFER allocations.  I guess we
> > > should just remove the printk()  ;)
> >
> > Hmm.  The guy whose box locks up on him after a burst of these
> > probably doesn't think these failures are very OK ;-)  I don't
> > think order 0 failing is cool at all.. ever.
>
> You may not think it's cool, but it's needed in order to
> prevent deadlocks. Just because an allocation cannot do
> disk IO or sleep, that's no reason to loop around like
> crazy in __alloc_pages() and hang the machine ... ;)

True, but if we have resources available there's no excuse for a
failure.  Well, yes there is.  If the cost of that resource is
higher than the value of letting the allocation succeed.  We have
no data on the value of success, but we do plan on consuming the
reclaimable pool and do that (must), so I still think turning
these resources loose at strategic moments is logically sound.
(doesn't mean there's not a better way.. it's just an easy way)

I'd really like someone who has this problem to try the patch to
see if it does help.  I don't have this darn problem myself, so
I'm left holding a bag of idle curiosity. ;-)

	Cheers,

	-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Thu, 24 May 2001, Mike Galbraith wrote:
> On Thu, 24 May 2001, Rik van Riel wrote:
> > On Thu, 24 May 2001, Mike Galbraith wrote:
> > > On Sun, 20 May 2001, Rik van Riel wrote:
> > >
> > > > Remember that inactive_clean pages are always immediately
> > > > reclaimable by __alloc_pages(), if you measured a performance
> > > > difference by freeing pages in a different way I'm pretty sure
> > > > it's a side effect of something else.  What that something
> > > > else is I'm curious to find out, but I'm pretty convinced that
> > > > throwing away data early isn't the way to go.
> > >
> > > OK.. let's forget about throughput for a moment and consider
> > > those annoying reports of 0 order allocations failing :)
> >
> > Those are ok.  All failing 0 order allocations are either
> > atomic allocations or GFP_BUFFER allocations.  I guess we
> > should just remove the printk()  ;)
>
> Hmm.  The guy whose box locks up on him after a burst of these
> probably doesn't think these failures are very OK ;-)  I don't
> think order 0 failing is cool at all.. ever.

You may not think it's cool, but it's needed in order to
prevent deadlocks. Just because an allocation cannot do
disk IO or sleep, that's no reason to loop around like
crazy in __alloc_pages() and hang the machine ... ;)

> A (long) while back, Linus specifically mentioned worrying
> about atomic allocation reliability.

That's a separate issue.  That was, IIRC, about the failure
of atomic allocations causing packet loss on Linux routers
and, because of that, poor performance.

This is something we still need to look into, but basically
this problem is about too high latency and NOT about "pre-freeing"
more pages (like your patch attempts).  If this problem is still
an issue, it's quite likely that the VM is holding locks for too
long so that it cannot react fast enough to free up some
inactive_clean pages.

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/
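
A minimal sketch of the point made above, that a restricted allocation should fail at once rather than spin inside the allocator. It is a user-space model, not kernel code: struct zone_sim, alloc_one_page() and the can_reclaim flag are hypothetical names, and the sketch assumes that a restricted caller (such as the atomic allocations discussed in the thread) may only take pages that are already free.

```c
#include <stdbool.h>

/* Hypothetical, simplified zone; not the kernel's zone_t. */
struct zone_sim {
	unsigned long free_pages;
	unsigned long inactive_clean_pages;
};

enum alloc_result { ALLOC_OK, ALLOC_FAILED };

/*
 * can_reclaim == false stands in for a restricted caller, for
 * example an atomic allocation. Which callers are restricted in
 * the real kernel is not modelled here.
 */
enum alloc_result alloc_one_page(struct zone_sim *z, bool can_reclaim)
{
	if (z->free_pages > 0) {
		z->free_pages--;
		return ALLOC_OK;
	}
	if (can_reclaim && z->inactive_clean_pages > 0) {
		/* Models reclaiming one clean page directly for the caller. */
		z->inactive_clean_pages--;
		return ALLOC_OK;
	}
	/* Fail right away: retrying without being able to sleep could spin forever. */
	return ALLOC_FAILED;
}
```

With free_pages at 0 and ten clean pages, the restricted call fails while the unrestricted one succeeds, which seems to be the state Mike is worried about: clean pages exist, but not everyone can use them.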
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Thu, 24 May 2001, Rik van Riel wrote:

> On Thu, 24 May 2001, Mike Galbraith wrote:
> > On Sun, 20 May 2001, Rik van Riel wrote:
> >
> > > Remember that inactive_clean pages are always immediately
> > > reclaimable by __alloc_pages(), if you measured a performance
> > > difference by freeing pages in a different way I'm pretty sure
> > > it's a side effect of something else.  What that something
> > > else is I'm curious to find out, but I'm pretty convinced that
> > > throwing away data early isn't the way to go.
> >
> > OK.. let's forget about throughput for a moment and consider
> > those annoying reports of 0 order allocations failing :)
>
> Those are ok.  All failing 0 order allocations are either
> atomic allocations or GFP_BUFFER allocations.  I guess we
> should just remove the printk()  ;)

Hmm.  The guy whose box locks up on him after a burst of these
probably doesn't think these failures are very OK ;-)  I don't
think order 0 failing is cool at all.. ever.

A (long) while back, Linus specifically mentioned worrying
about atomic allocation reliability.

> > What do you think of the below (ignore the refill_inactive bit)
> > wrt allocator reliability under heavy stress?  The thing does
> > kick in and pump up zones even if I set the 'blood donor' level
> > to pages_min.
>
> > -		unsigned long water_mark;
> > +		unsigned long water_mark = 1 << order;
>
> Makes no sense at all since water_mark gets assigned not 10
> lines below.  ;)

That assignment was supposed to turn into +=.

> > +		if (direct_reclaim) {
> > +			int count;
> > +
> > +			/* If we're in bad shape.. */
> > +			if (z->free_pages < z->pages_low && z->inactive_clean_pages) {
>
> I'm not sure if we want to fill up the free list all the way
> to z->pages_low all the time, since "free memory is wasted
> memory".

Yes.  I'm just thinking of the burst of allocations with no reclaim
possible.

> The reason the current scheme only triggers when we reach
> z->pages_min and then goes all the way up to z->pages_low
> is memory defragmentation. Since we'll be doing direct

Ah.

> reclaim for just about every allocation in the system, it
> only happens occasionally that we throw away all the
> inactive_clean pages between z->pages_min and z->pages_low.

This one has me puzzled.  We're reluctant to release cleaned pages,
but at the same time, we reclaim if possible as soon as all zones
are below pages_high.

> > +				count = 4 * (1 << page_cluster);
> > +				/* reclaim a page for ourselves if we can afford to.. */
> > +				if (z->inactive_clean_pages > count)
> > +					page = reclaim_page(z);
> > +				if (z->inactive_clean_pages < 2 * count)
> > +					count = z->inactive_clean_pages / 2;
> > +			} else count = 0;
>
> What exactly is the reasoning behind this complex  "count"
> stuff? Is there a good reason for not just refilling the
> free list up to the target or until the inactive_clean list
> is depleted ?

Well, yes.  You didn't like the 50/50 split thingy I did before, so
I connected zones to a tricklecharger instead.

> > +			/*
> > +			 * and make a small donation to the reclaim challenged.
> > +			 *
> > +			 * We don't ever want a zone to reach the state where we
> > +			 * have nothing except reclaimable pages left.. not if
> > +			 * we can possibly do something to help prevent it.
> > +			 */
>
> This comment makes little sense

If not, then none of it does.  This situation is the ONLY thing I
was worried about.  free_pages + inactive_clean_pages > pages_min
does nothing about free_pages for those who can't reclaim if most
of that is inactive_clean_pages.  IFF it's possible to be critical
on free_pages and still have clean pages, it does make sense.

> > +		if (z->inactive_clean_pages - z->free_pages > z->pages_low
> > +				&& waitqueue_active(&kreclaimd_wait))
> > +			wake_up_interruptible(&kreclaimd_wait);
>
> This doesn't make any sense to me at all.  Why wake up
> kreclaimd just because the difference between the number
> of inactive_clean pages and free pages is large ?

You had to get there with direct_reclaim not set was the thought.
Nobody gave the zone a transfusion, but there is a blood supply.
If nobody gets around to refilling the zone, kreclaimd will.

> Didn't we determine in our last exchange of email that
> it would be a good thing under most loads to keep as much
> inactive_clean memory around as possible and not waste^Wfree
> memory early ?

So why do we reclaim if we're just below pages_high?  The whole
point of this patch is to reclaim _less_ in the general case, but
to do so in a timely manner if we really need it.
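
The "count" arithmetic under discussion can be followed more easily in a small user-space model. This is only a sketch of the quoted hunk, not the patch itself: struct zone_sim and trickle_charge() are hypothetical names, reclaim_page() plus __free_page() are modelled as moving one page from the clean list to the free list, and the page_cluster value passed in is an assumption (in the kernel it is a tunable).

```c
#include <stdbool.h>

/* Hypothetical, simplified zone; not the kernel's zone_t. */
struct zone_sim {
	unsigned long free_pages;
	unsigned long inactive_clean_pages;
	unsigned long pages_low;
};

/*
 * Mirrors the quoted logic: when the zone is below pages_low and has
 * clean pages, take one page for the caller if that is affordable,
 * then donate a bounded number of clean pages to the free list.
 * Returns the number of pages donated.
 */
unsigned long trickle_charge(struct zone_sim *z, unsigned int page_cluster,
			     bool *took_page)
{
	unsigned long count = 0, donated = 0;

	*took_page = false;
	if (z->free_pages < z->pages_low && z->inactive_clean_pages > 0) {
		count = 4UL << page_cluster;	/* 4 * (1 << page_cluster) */
		if (z->inactive_clean_pages > count) {
			z->inactive_clean_pages--;	/* page kept by the caller */
			*took_page = true;
		}
		/* Never donate more than half of a small clean list. */
		if (z->inactive_clean_pages < 2 * count)
			count = z->inactive_clean_pages / 2;
	}
	for (; count > 0 && z->inactive_clean_pages > 0; count--) {
		z->inactive_clean_pages--;
		z->free_pages++;
		donated++;
	}
	return donated;
}
```

For example, with page_cluster = 4 the nominal batch is 64 pages; a zone holding 5 free and 100 clean pages gives one page to the caller and then donates 49 (half of the remaining 99), ending at 54 free and 50 clean.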
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Thu, 24 May 2001, Mike Galbraith wrote:
> On Sun, 20 May 2001, Rik van Riel wrote:
>
> > Remember that inactive_clean pages are always immediately
> > reclaimable by __alloc_pages(), if you measured a performance
> > difference by freeing pages in a different way I'm pretty sure
> > it's a side effect of something else.  What that something
> > else is I'm curious to find out, but I'm pretty convinced that
> > throwing away data early isn't the way to go.
>
> OK.. let's forget about throughput for a moment and consider
> those annoying reports of 0 order allocations failing :)

Those are ok.  All failing 0 order allocations are either
atomic allocations or GFP_BUFFER allocations.  I guess we
should just remove the printk()  ;)

> What do you think of the below (ignore the refill_inactive bit)
> wrt allocator reliability under heavy stress?  The thing does
> kick in and pump up zones even if I set the 'blood donor' level
> to pages_min.

> -		unsigned long water_mark;
> +		unsigned long water_mark = 1 << order;

Makes no sense at all since water_mark gets assigned not 10
lines below.  ;)

> +		if (direct_reclaim) {
> +			int count;
> +
> +			/* If we're in bad shape.. */
> +			if (z->free_pages < z->pages_low && z->inactive_clean_pages) {

I'm not sure if we want to fill up the free list all the way
to z->pages_low all the time, since "free memory is wasted
memory".

The reason the current scheme only triggers when we reach
z->pages_min and then goes all the way up to z->pages_low
is memory defragmentation. Since we'll be doing direct
reclaim for just about every allocation in the system, it
only happens occasionally that we throw away all the
inactive_clean pages between z->pages_min and z->pages_low.

> +				count = 4 * (1 << page_cluster);
> +				/* reclaim a page for ourselves if we can afford to.. */
> +				if (z->inactive_clean_pages > count)
> +					page = reclaim_page(z);
> +				if (z->inactive_clean_pages < 2 * count)
> +					count = z->inactive_clean_pages / 2;
> +			} else count = 0;

What exactly is the reasoning behind this complex  "count"
stuff? Is there a good reason for not just refilling the
free list up to the target or until the inactive_clean list
is depleted ?

> +			/*
> +			 * and make a small donation to the reclaim challenged.
> +			 *
> +			 * We don't ever want a zone to reach the state where we
> +			 * have nothing except reclaimable pages left.. not if
> +			 * we can possibly do something to help prevent it.
> +			 */

This comment makes little sense

> +		if (z->inactive_clean_pages - z->free_pages > z->pages_low
> +				&& waitqueue_active(&kreclaimd_wait))
> +			wake_up_interruptible(&kreclaimd_wait);

This doesn't make any sense to me at all.  Why wake up
kreclaimd just because the difference between the number
of inactive_clean pages and free pages is large ?

Didn't we determine in our last exchange of email that
it would be a good thing under most loads to keep as much
inactive_clean memory around as possible and not waste^Wfree
memory early ?

> -	/*
> -	 * First, see if we have any zones with lots of free memory.
> -	 *
> -	 * We allocate free memory first because it doesn't contain
> -	 * any data ... DUH!
> -	 */

We want to keep this.  Suppose we have one zone which is
half filled with inactive_clean pages and one zone which
has "too many" free pages.

Allocating from the first zone means we evict some piece
of, potentially useful, data from the cache; allocating
from the second zone means we can keep the data in memory
and only fill up a currently unused page.

> @@ -824,39 +824,17 @@
>  #define DEF_PRIORITY (6)
>  static int refill_inactive(unsigned int gfp_mask, int user)
>  {

I've heard all kinds of things about this part of the patch,
except an explanation of why and how it is supposed to work ;)

> @@ -976,8 +954,9 @@
>  		 * We go to sleep for one second, but if it's needed
>  		 * we'll be woken up earlier...
>  		 */
> -		if (!free_shortage() || !inactive_shortage()) {
> -			interruptible_sleep_on_timeout(&kswapd_wait, HZ);
> +		if (current->need_resched || !free_shortage() ||
> +				!inactive_shortage()) {
> +			interruptible_sleep_on_timeout(&kswapd_wait, HZ/10);

Makes sense.  Integrated in my tree ;)

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
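
Rik's argument for keeping the "free memory first" pass can be sketched as a two-pass zone choice. This is a simplified user-space model under stated assumptions: struct zone_sim and pick_zone() are hypothetical, and the second pass uses pages_low as its only watermark, whereas the real allocator walks several watermarks.

```c
#include <stddef.h>

/* Hypothetical, simplified zone; not the kernel's zone_t. */
struct zone_sim {
	unsigned long free_pages;
	unsigned long inactive_clean_pages;
	unsigned long pages_low;
};

/* Returns the index of the zone to allocate from, or -1 if none fits. */
int pick_zone(const struct zone_sim *zones, size_t n)
{
	size_t i;

	/* Pass 1: a zone with plenty of truly free pages costs no cached data. */
	for (i = 0; i < n; i++)
		if (zones[i].free_pages >= zones[i].pages_low)
			return (int)i;

	/* Pass 2: accept a zone whose free plus clean pages clear the mark,
	 * knowing the allocation may evict a clean cache page. */
	for (i = 0; i < n; i++)
		if (zones[i].free_pages + zones[i].inactive_clean_pages >
		    zones[i].pages_low)
			return (int)i;

	return -1;
}
```

Given one zone that is mostly clean cache and another with "too many" free pages, the first pass picks the second zone and the cache survives; only when no zone has spare free pages does the sketch fall back to a zone that must give up cached data.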
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Rik van Riel wrote:

> Remember that inactive_clean pages are always immediately
> reclaimable by __alloc_pages(), if you measured a performance
> difference by freeing pages in a different way I'm pretty sure
> it's a side effect of something else.  What that something
> else is I'm curious to find out, but I'm pretty convinced that
> throwing away data early isn't the way to go.

OK.. let's forget about throughput for a moment and consider
those annoying reports of 0 order allocations failing :)

What do you think of the below (ignore the refill_inactive bit)
wrt allocator reliability under heavy stress?  The thing does
kick in and pump up zones even if I set the 'blood donor' level
to pages_min.

	-Mike

--- linux-2.4.5-pre3/mm/page_alloc.c.org	Mon May 21 10:35:06 2001
+++ linux-2.4.5-pre3/mm/page_alloc.c	Thu May 24 08:18:36 2001
@@ -224,10 +224,11 @@
 		unsigned long order, int limit, int direct_reclaim)
 {
 	zone_t **zone = zonelist->zones;
+	struct page *page = NULL;
 
 	for (;;) {
 		zone_t *z = *(zone++);
-		unsigned long water_mark;
+		unsigned long water_mark = 1 << order;
 
 		if (!z)
 			break;
@@ -249,18 +250,44 @@
 			case PAGES_HIGH:
 				water_mark = z->pages_high;
 		}
+		if (z->free_pages + z->inactive_clean_pages < water_mark)
+			continue;
 
-		if (z->free_pages + z->inactive_clean_pages > water_mark) {
-			struct page *page = NULL;
-			/* If possible, reclaim a page directly. */
-			if (direct_reclaim && z->free_pages < z->pages_min + 8)
+		if (direct_reclaim) {
+			int count;
+
+			/* If we're in bad shape.. */
+			if (z->free_pages < z->pages_low && z->inactive_clean_pages) {
+				count = 4 * (1 << page_cluster);
+				/* reclaim a page for ourselves if we can afford to.. */
+				if (z->inactive_clean_pages > count)
+					page = reclaim_page(z);
+				if (z->inactive_clean_pages < 2 * count)
+					count = z->inactive_clean_pages / 2;
+			} else count = 0;
+
+			/*
+			 * and make a small donation to the reclaim challenged.
+			 *
+			 * We don't ever want a zone to reach the state where we
+			 * have nothing except reclaimable pages left.. not if
+			 * we can possibly do something to help prevent it.
+			 */
+			while (count--) {
+				struct page *page;
 				page = reclaim_page(z);
-			/* If that fails, fall back to rmqueue. */
-			if (!page)
-				page = rmqueue(z, order);
-			if (page)
-				return page;
+				if (!page)
+					break;
+				__free_page(page);
+			}
 		}
+		if (!page)
+			page = rmqueue(z, order);
+		if (page)
+			return page;
+		if (z->inactive_clean_pages - z->free_pages > z->pages_low
+				&& waitqueue_active(&kreclaimd_wait))
+			wake_up_interruptible(&kreclaimd_wait);
 	}
 
 	/* Found nothing. */
@@ -314,29 +341,6 @@
 	wakeup_bdflush(0);
 
 try_again:
-	/*
-	 * First, see if we have any zones with lots of free memory.
-	 *
-	 * We allocate free memory first because it doesn't contain
-	 * any data ... DUH!
-	 */
-	zone = zonelist->zones;
-	for (;;) {
-		zone_t *z = *(zone++);
-		if (!z)
-			break;
-		if (!z->size)
-			BUG();
-
-		if (z->free_pages >= z->pages_low) {
-			page = rmqueue(z, order);
-			if (page)
-				return page;
-		} else if (z->free_pages < z->pages_min &&
-					waitqueue_active(&kreclaimd_wait)) {
-				wake_up_interruptible(&kreclaimd_wait);
-		}
-	}
 
 	/*
 	 * Try to allocate a page from a zone with a HIGH
--- linux-2.4.5-pre3/mm/vmscan.c.org	Thu May 17 16:44:23 2001
+++ linux-2.4.5-pre3/mm/vmscan.c	Thu May 24 08:05:21 2001
@@ -824,39 +824,17 @@
 #define DEF_PRIORITY (6)
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
David Weinehall wrote:
> IMVHO every developer involved in memory-management (and indeed, any
> software development; the authors of ntpd come to mind here) should
> have a 386 with 4MB of RAM and some 16MB of swap. Nowadays I have the
> luxury of a 486 with 8MB of RAM and 32MB of swap as a firewall, but it's
> still a pain to work with.

If you really want to have fun, remove all swap...

    Scott Anderson
    [EMAIL PROTECTED]   MontaVista Software Inc.
    (408)328-9214       1237 East Arques Ave.
    http://www.mvista.com       Sunnyvale, CA  94085
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
>Time to hunt around for a 386 or 486 which is limited to such
>a small amount of RAM ;)

I've got an old knackered 486DX/33 with 8MB RAM (in 30-pin SIMMs, woohoo!),
a flat CMOS battery, a 2GB Maxtor HD that needs a low-level format every
year, and no case.  It isn't running anything right now...

--
from:     Jonathan "Chromatix" Morton
mail:     [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Mon, 21 May 2001, David Weinehall wrote:

> IMVHO every developer involved in memory-management (and indeed, any
> software development; the authors of ntpd come to mind here) should
> have a 386 with 4MB of RAM and some 16MB of swap. Nowadays I have the
> luxury of a 486 with 8MB of RAM and 32MB of swap as a firewall, but it's
> still a pain to work with.

You're absolutely right. The smallest thing I'm testing with
on a regular basis is my dual pentium machine, booted with
mem=8m or mem=16m.

Time to hunt around for a 386 or 486 which is limited to such
a small amount of RAM ;)

cheers,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, May 20, 2001 at 11:54:09PM +0200, Pavel Machek wrote:
> Hi!
>
> > > You're right.  It should never dump too much data at once.  OTOH, if
> > > those cleaned pages are really old (front of reclaim list), there's no
> > > value in keeping them either.  Maybe there should be a slow bleed for
> > > mostly idle or lightly loaded conditions.
> >
> > If you don't think it's worthwhile keeping the oldest pages
> > in memory around, please hand me your excess DIMMS ;)
>
> Sorry, Rik, you can't have that DIMM. You know, you are
> developing memory management, and we can't have you having too much
> memory available ;-).

IMVHO every developer involved in memory-management (and indeed, any
software development; the authors of ntpd come to mind here) should
have a 386 with 4MB of RAM and some 16MB of swap. Nowadays I have the
luxury of a 486 with 8MB of RAM and 32MB of swap as a firewall, but it's
still a pain to work with.

/David
  _                                                                 _
 // David Weinehall <[EMAIL PROTECTED]> /> Northern lights wander   \\
//  Project MCA Linux hacker        //  Dance across the winter sky //
\>  http://www.acc.umu.se/~tao/
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
Hi!

> > You're right.  It should never dump too much data at once.  OTOH, if
> > those cleaned pages are really old (front of reclaim list), there's no
> > value in keeping them either.  Maybe there should be a slow bleed for
> > mostly idle or lightly loaded conditions.
>
> If you don't think it's worthwhile keeping the oldest pages
> in memory around, please hand me your excess DIMMS ;)

Sorry, Rik, you can't have that DIMM. You know, you are
developing memory management, and we can't have you having too much
memory available ;-).
								Pavel
--
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
Hi,

On Sun, May 20, 2001 at 07:04:31AM -0300, Rik van Riel wrote:
> On Sun, 20 May 2001, Mike Galbraith wrote:
> >
> > Looking at the locking and trying to think SMP (grunt) though, I
> > don't like the thought of taking two locks for each page until > 100%.
> > The data in that block is toast anyway.  A big hairy SMP
> > box has to feel reclaim_page(). (they probably feel the zone lock
> > too.. probably would like to allocate blocks)
>
> Indeed, but this is a separate problem.  Doing per-CPU private
> (small, 8-32 page?) free lists is probably a good idea

Ingo already implemented that for Tux2.

Cheers,
 Stephen
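
The per-CPU free list idea mentioned above can be sketched as a batch refill from a locked global pool. This is a single-threaded user-space illustration, not the Tux2 code and not kernel code: struct global_pool, struct cpu_cache, cpu_alloc_page() and the BATCH size of 16 (picked from the "8-32 page" range suggested in the quote) are all assumptions, and the lock is represented only by a counter.

```c
/* Assumed batch size, chosen from the 8-32 page range quoted above. */
#define BATCH 16

/* Hypothetical global pool; lock_acquisitions counts simulated lock use. */
struct global_pool {
	unsigned long free_pages;
	unsigned long lock_acquisitions;
};

/* Hypothetical private list owned by one CPU; needs no lock. */
struct cpu_cache {
	unsigned long pages;
};

/* Take the (simulated) global lock once and move up to BATCH pages. */
static unsigned long refill(struct global_pool *g, struct cpu_cache *c)
{
	unsigned long n = g->free_pages < BATCH ? g->free_pages : BATCH;

	g->lock_acquisitions++;	/* stands for one lock/unlock pair */
	g->free_pages -= n;
	c->pages += n;
	return n;
}

/* Returns 1 on success, 0 when both the cache and the pool are empty. */
int cpu_alloc_page(struct global_pool *g, struct cpu_cache *c)
{
	if (c->pages == 0 && refill(g, c) == 0)
		return 0;
	c->pages--;
	return 1;
}
```

In this model 64 single-page allocations touch the global lock 4 times instead of 64, which is the kind of saving a big SMP box would be after.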
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Marcelo Tosatti wrote:

> On Sat, 19 May 2001, Mike Galbraith wrote:
>
> > @@ -1054,7 +1033,7 @@
> >  			if (!zone->size)
> >  				continue;
> >
> > -			while (zone->free_pages < zone->pages_low) {
> > +			while (zone->free_pages < zone->inactive_clean_pages) {
> >  				struct page * page;
> >  				page = reclaim_page(zone);
> >  				if (!page)
>
>
> What you're trying to do with this change ?

Just ensuring that I never had a large supply of cleaned pages
lying around at a time when folks are in distress.  It also
ensures that you never donate your last reclaimable pages, but
that wasn't the intent.  It was a stray thought that happened to
produce measurable improvement.

	-Mike
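
The effect of changing that loop condition can be compared in a small user-space model. This is a sketch only: struct zone_sim, reclaim_one() and the two refill functions are hypothetical, and reclaim_page() followed by freeing the page is modelled as moving one page from the clean list to the free list.

```c
/* Hypothetical, simplified zone; not the kernel's zone_t. */
struct zone_sim {
	unsigned long free_pages;
	unsigned long inactive_clean_pages;
	unsigned long pages_low;
};

/* Move one clean page to the free list; returns 0 if none is left. */
int reclaim_one(struct zone_sim *z)
{
	if (z->inactive_clean_pages == 0)
		return 0;
	z->inactive_clean_pages--;
	z->free_pages++;
	return 1;
}

/* Original condition: stop once free_pages reaches pages_low. */
void refill_to_pages_low(struct zone_sim *z)
{
	while (z->free_pages < z->pages_low)
		if (!reclaim_one(z))
			break;
}

/* Patched condition: stop once free pages catch up with clean pages. */
void refill_to_balance(struct zone_sim *z)
{
	while (z->free_pages < z->inactive_clean_pages)
		if (!reclaim_one(z))
			break;
}
```

Starting from 5 free and 100 clean pages with pages_low = 20, the original loop stops at 20 free and 85 clean, while the patched loop keeps going until 53 free and 52 clean. That is the "never more cleaned pages than free pages" state described in the reply, and it also shows how much more cache the patched loop gives up.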
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Rik van Riel wrote:

> On Sun, 20 May 2001, Mike Galbraith wrote:
> > On 20 May 2001, Zlatko Calusic wrote:
>
> > > Also in all recent kernels, if the machine is swapping, swap cache
> > > grows without limits and is hard to recycle, but then again that is
> > > a known problem.
> >
> > This one bugs me.  I do not see that and can't understand why.
>
> Could it be because we never free swap space and never
> delete pages from the swap cache ?

I sent a query to the list asking if a heavy load cleared it out,
but got no replies.  I figured about the only thing it could be is
that under light load, reclaim isn't needed to cure any shortage.

	-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sat, 19 May 2001, Mike Galbraith wrote:

> @@ -1054,7 +1033,7 @@
>  			if (!zone->size)
>  				continue;
>
> -			while (zone->free_pages < zone->pages_low) {
> +			while (zone->free_pages < zone->inactive_clean_pages) {
>  				struct page * page;
>  				page = reclaim_page(zone);
>  				if (!page)

What are you trying to do with this change?
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Mike Galbraith wrote:
> On 20 May 2001, Zlatko Calusic wrote:

> > Also in all recent kernels, if the machine is swapping, swap cache
> > grows without limits and is hard to recycle, but then again that is
> > a known problem.
>
> This one bugs me.  I do not see that and can't understand why.

Could it be because we never free swap space and never
delete pages from the swap cache ?

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Mike Galbraith wrote:

> > Also in all recent kernels, if the machine is swapping, swap cache
> > grows without limits and is hard to recycle, but then again that is
> > a known problem.
>
> This one bugs me.  I do not see that and can't understand why.

To throw away dirty and dead swapcache pages (it's done at swap
writepage()), page_launder() has to run into its second loop
(launder_loop = 1), meaning that a lot of clean cache has been
thrown out already.

We can "short circuit" these dead swapcache pages by cleaning them in
the first page_launder() loop.

Take a look at the writepage() patch I sent to Linus a few days ago.
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On 20 May 2001, Zlatko Calusic wrote:

> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > Hi,
> >
> > On Fri, 18 May 2001, Stephen C. Tweedie wrote:
> >
> > > That's the main problem with static parameters.  The problem you are
> > > trying to solve is fundamentally dynamic in most cases (which is also
> > > why magic numbers tend to suck in the VM.)
> >
> > Magic numbers might be sucking some performance right now ;-)
> >
> [snip]
>
> I like your patch, it improves performance somewhat and makes things
> more smooth and also code is simpler.

Thanks for the feedback.  Positive is nice.. as is negative.

> Anyway, 2.4.5-pre3 is quite debalanced and it has even broken some
> things that were working properly before. For instance, swapoff now
> deadlocks the machine (even with your patch applied).

I haven't run into that.

> Unfortunately, I have failed to pinpoint the exact problem, but I'm
> confident that kernel goes in some kind of loop (99% system time, just
> before deadlock). Anybody has some guidelines how to debug kernel if
> you're running X?

Serial console and kdb or kgdb if you have two machines.. or uml?

> Also in all recent kernels, if the machine is swapping, swap cache
> grows without limits and is hard to recycle, but then again that is
> a known problem.

This one bugs me.  I do not see that and can't understand why.

	-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Ingo Oeser wrote:

> On Sun, May 20, 2001 at 05:29:49AM +0200, Mike Galbraith wrote:
> > I'm not sure why that helps.  I didn't put it in as a trick or
> > anything though.  I put it in because it didn't seem like a
> > good idea to ever have more cleaned pages than free pages at a
> > time when we're yammering for help.. so I did that and it helped.
>
> The rationale for this is easy: free pages is wasted memory,
> clean pages is hot, clean cache. The best state a cache can be in.

Sure.  Under low load, cache is great.  Under stress, keeping it is
not an option though ;-)  We're at or beyond capacity and moving at
a high delta V (people yammering for help).  If you can recognize and
kill the delta rapidly by dumping that which you are going to have
to dump anyway, you save time getting back on your feet.  (my guess
as to why dumping clean pages does measurably help in this case)

	-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
Mike Galbraith <[EMAIL PROTECTED]> writes:

> Hi,
>
> On Fri, 18 May 2001, Stephen C. Tweedie wrote:
>
> > That's the main problem with static parameters. The problem you are
> > trying to solve is fundamentally dynamic in most cases (which is also
> > why magic numbers tend to suck in the VM.)
>
> Magic numbers might be sucking some performance right now ;-)
>
[snip]

I like your patch, it improves performance somewhat and makes things
smoother, and the code is simpler too.

Anyway, 2.4.5-pre3 is quite unbalanced and it has even broken some
things that were working properly before. For instance, swapoff now
deadlocks the machine (even with your patch applied).

Unfortunately, I have failed to pinpoint the exact problem, but I'm
confident that the kernel goes into some kind of loop (99% system time,
just before deadlock). Does anybody have guidelines on how to debug the
kernel if you're running X?

Also in all recent kernels, if the machine is swapping, swap cache
grows without limits and is hard to recycle, but then again that is
a known problem.
--
Zlatko
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, May 20, 2001 at 05:29:49AM +0200, Mike Galbraith wrote:
> I'm not sure why that helps. I didn't put it in as a trick or
> anything though. I put it in because it didn't seem like a
> good idea to ever have more cleaned pages than free pages at a
> time when we're yammering for help.. so I did that and it helped.

The rationale for this is easy: free pages are wasted memory,
clean pages are hot, clean cache. The best state a cache can be in.

Regards

Ingo Oeser
--
To the systems programmer,
users and applications serve only to provide a test load.
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Mike Galbraith wrote:

> but ;-)
>
> Looking at the locking and trying to think SMP (grunt) though, I
> don't like the thought of taking two locks for each page until
> kreclaimd gets a chance to run. One of those locks is the
> pagecache_lock, and that makes me think it'd be better to just
> reclaim a block if I have to reclaim at all. At that point, the
> chances of needing to lock the pagecache soon again are about
> 100%. The data in that block is toast anyway. A big hairy SMP
> box has to feel reclaim_page(). (they probably feel the zone lock
> too.. probably would like to allocate blocks)

Indeed, but this is a separate problem. Doing per-CPU private
(small, 8-32 page?) free lists is probably a good idea, but I
don't really think it's related to kreclaimd ;)

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
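The per-CPU free list idea Rik mentions can be sketched roughly as follows. This is only an illustration, not kernel code: `struct global_pool`, `struct cpu_cache`, `cpu_alloc_page()` and the batch size of 8 are all hypothetical names and values, and the "lock" is just a counter standing in for the zone lock. It shows why a small private list helps: the shared lock is taken once per batch instead of once per page.

```c
#include <assert.h>

#define PCP_BATCH 8	/* hypothetical refill size, within the "8-32 page" range */

/* Stand-in for a zone: a count of free pages plus a lock-usage counter. */
struct global_pool {
	int free_pages;
	unsigned int lock_acquisitions;
};

/* Stand-in for one CPU's private free list (only the count is tracked). */
struct cpu_cache {
	int count;
};

/* Returns 1 if a page was handed out, 0 if both lists are empty. */
static int cpu_alloc_page(struct cpu_cache *c, struct global_pool *g)
{
	assert(c && g);

	if (c->count == 0) {
		/* Refill in one batch; this is the only place the shared lock is taken. */
		int take = g->free_pages < PCP_BATCH ? g->free_pages : PCP_BATCH;

		g->lock_acquisitions++;
		g->free_pages -= take;
		c->count += take;
	}
	if (c->count == 0)
		return 0;

	c->count--;
	return 1;
}
```

With a batch of 8, sixteen allocations from one CPU touch the shared lock twice rather than sixteen times.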
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Rik van Riel wrote:

> On Sun, 20 May 2001, Mike Galbraith wrote:
>
> > You're right. It should never dump too much data at once. OTOH, if
> > those cleaned pages are really old (front of reclaim list), there's no
> > value in keeping them either. Maybe there should be a slow bleed for
> > mostly idle or lightly loaded conditions.
>
> If you don't think it's worthwhile keeping the oldest pages
> in memory around, please hand me your excess DIMMS ;)

You're welcome to the data in any of them :) The hardware I keep.

> Remember that inactive_clean pages are always immediately
> reclaimable by __alloc_pages(), if you measured a performance
> difference by freeing pages in a different way I'm pretty sure
> it's a side effect of something else. What that something
> else is I'm curious to find out, but I'm pretty convinced that
> throwing away data early isn't the way to go.

OK. I'm getting a little distracted by thinking about the locking
and some latency comments I've heard various gurus make. I should
probably stick to thinking about/measuring throughput.. much easier.

but ;-)

Looking at the locking and trying to think SMP (grunt) though, I
don't like the thought of taking two locks for each page until
kreclaimd gets a chance to run. One of those locks is the
pagecache_lock, and that makes me think it'd be better to just
reclaim a block if I have to reclaim at all. At that point, the
chances of needing to lock the pagecache soon again are about
100%. The data in that block is toast anyway. A big hairy SMP
box has to feel reclaim_page(). (they probably feel the zone lock
too.. probably would like to allocate blocks)

-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Mike Galbraith wrote:

> You're right. It should never dump too much data at once. OTOH, if
> those cleaned pages are really old (front of reclaim list), there's no
> value in keeping them either. Maybe there should be a slow bleed for
> mostly idle or lightly loaded conditions.

If you don't think it's worthwhile keeping the oldest pages
in memory around, please hand me your excess DIMMS ;)

Remember that inactive_clean pages are always immediately
reclaimable by __alloc_pages(), if you measured a performance
difference by freeing pages in a different way I'm pretty sure
it's a side effect of something else. What that something
else is I'm curious to find out, but I'm pretty convinced that
throwing away data early isn't the way to go.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Rik van Riel wrote:

> On Sun, 20 May 2001, Mike Galbraith wrote:
> >
> > I'm not sure why that helps. I didn't put it in as a trick or
> > anything though. I put it in because it didn't seem like a
> > good idea to ever have more cleaned pages than free pages at a
> > time when we're yammering for help.. so I did that and it helped.
> ^
>
> Note that this is not the normal situation. Now think
> about the amount of data you'd be blowing away from the
> inactive_clean pages after a bit of background aging
> has gone on on a lightly loaded system. Not Good(tm)

You're right. It should never dump too much data at once. OTOH, if
those cleaned pages are really old (front of reclaim list), there's no
value in keeping them either. Maybe there should be a slow bleed for
mostly idle or lightly loaded conditions.

-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Mike Galbraith wrote:

> On Sat, 19 May 2001, Rik van Riel wrote:
> > On Sat, 19 May 2001, Mike Galbraith wrote:
> > > On Fri, 18 May 2001, Stephen C. Tweedie wrote:
> > >
> > > > That's the main problem with static parameters. The problem you are
> > > > trying to solve is fundamentally dynamic in most cases (which is also
> > > > why magic numbers tend to suck in the VM.)
> > >
> > > Magic numbers might be sucking some performance right now ;-)
> >
> > ... so you replace them with some others ... ;)
>
> I reused one of our base numbers to classify the severity of the
> situation.. not the same as inventing new ones. (well, not quite
> the same anyway.. half did come from the south forty;)

*nod* ;)

(not that I'm saying this is bad ... it's just that I'd like
to know why things work before looking at applying them)

> > > (yes, the last hunk looks out of place wrt my text.
> >
> > It also looks kind of bogus and geared completely towards this
> > particular workload ;)
>
> I'm not sure why that helps. I didn't put it in as a trick or
> anything though. I put it in because it didn't seem like a
> good idea to ever have more cleaned pages than free pages at a
> time when we're yammering for help.. so I did that and it helped.
^

Note that this is not the normal situation. Now think
about the amount of data you'd be blowing away from the
inactive_clean pages after a bit of background aging
has gone on on a lightly loaded system. Not Good(tm)

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sun, 20 May 2001, Dieter Nützel wrote:

> > > Three back to back make -j 30 runs for three different kernels.
> > > Swap cache numbers are taken immediately after last completion.
> >
> > The performance increase is nice, though. Do you see similar
> > changes in different kinds of workloads ?
>
> If you have a patch against 2.4.4-ac11 I will do some tests with some
> (interactive) 3D apps.

I don't have an ac kernel resident atm, but since Alan merged here
very recently, it will probably go in ok. If not, just holler and
I'll download ac11 and make you a clean patch.

-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sat, 19 May 2001, Rik van Riel wrote:

> On Sat, 19 May 2001, Mike Galbraith wrote:
> > On Fri, 18 May 2001, Stephen C. Tweedie wrote:
> >
> > > That's the main problem with static parameters. The problem you are
> > > trying to solve is fundamentally dynamic in most cases (which is also
> > > why magic numbers tend to suck in the VM.)
> >
> > Magic numbers might be sucking some performance right now ;-)
>
> ... so you replace them with some others ... ;)

I reused one of our base numbers to classify the severity of the
situation.. not the same as inventing new ones. (well, not quite
the same anyway.. half did come from the south forty;)

> > Three back to back make -j 30 runs for three different kernels.
> > Swap cache numbers are taken immediately after last completion.
>
> The performance increase is nice, though. Do you see similar
> changes in different kinds of workloads ?

I don't have much to test with here, but I'll see if I can find
something. I'd rather see someone with a server load try it.

> > (yes, the last hunk looks out of place wrt my text.
>
> It also looks kind of bogus and geared completely towards this
> particular workload ;)

I'm not sure why that helps. I didn't put it in as a trick or
anything though. I put it in because it didn't seem like a
good idea to ever have more cleaned pages than free pages at a
time when we're yammering for help.. so I did that and it helped.

-Mike
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
> > Three back to back make -j 30 runs for three different kernels.
> > Swap cache numbers are taken immediately after last completion.
>
> The performance increase is nice, though. Do you see similar
> changes in different kinds of workloads ?

If you have a patch against 2.4.4-ac11 I will do some tests with some
(interactive) 3D apps.

-Dieter
Re: [RFC][PATCH] Re: Linux 2.4.4-ac10
On Sat, 19 May 2001, Mike Galbraith wrote:

> On Fri, 18 May 2001, Stephen C. Tweedie wrote:
>
> > That's the main problem with static parameters. The problem you are
> > trying to solve is fundamentally dynamic in most cases (which is also
> > why magic numbers tend to suck in the VM.)
>
> Magic numbers might be sucking some performance right now ;-)

... so you replace them with some others ... ;)

> Three back to back make -j 30 runs for three different kernels.
> Swap cache numbers are taken immediately after last completion.

The performance increase is nice, though. Do you see similar
changes in different kinds of workloads ?

> (yes, the last hunk looks out of place wrt my text.

It also looks kind of bogus and geared completely towards this
particular workload ;)

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
[RFC][PATCH] Re: Linux 2.4.4-ac10
Hi,

On Fri, 18 May 2001, Stephen C. Tweedie wrote:

> That's the main problem with static parameters. The problem you are
> trying to solve is fundamentally dynamic in most cases (which is also
> why magic numbers tend to suck in the VM.)

Magic numbers might be sucking some performance right now ;-)

Three back to back make -j 30 runs for three different kernels.
Swap cache numbers are taken immediately after last completion.

Reference runs (bad numbers. cache collapse hurts.. a lot)
real    12m8.157s   11m41.192s  11m36.069s  2.4.4.virgin
user    7m57.710s   7m57.820s   7m57.150s
sys     0m37.200s   0m37.070s   0m37.020s
Swap cache: add 785029, delete 781670, find 243396/1051626

oddball.. infrequent, but happens
real    10m30.470s  9m36.478s   9m50.512s   2.4.5-pre3.virgin
user    7m54.300s   7m53.430s   7m55.200s
sys     0m36.010s   0m36.850s   0m35.230s
Swap cache: add 1018892, delete 1007053, find 821456/1447811

real    9m9.679s    9m18.291s   8m55.981s   2.4.5-pre3.tweak
user    7m55.590s   7m57.060s   7m55.850s
sys     0m34.890s   0m34.370s   0m34.330s
Swap cache: add 656966, delete 646676, find 325186/865183

--- linux-2.4.5-pre3/mm/vmscan.c.org	Thu May 17 16:44:23 2001
+++ linux-2.4.5-pre3/mm/vmscan.c	Sat May 19 11:52:40 2001
@@ -824,39 +824,17 @@
 #define DEF_PRIORITY (6)
 static int refill_inactive(unsigned int gfp_mask, int user)
 {
-	int count, start_count, maxtry;
-
-	if (user) {
-		count = (1 << page_cluster);
-		maxtry = 6;
-	} else {
-		count = inactive_shortage();
-		maxtry = 1 << DEF_PRIORITY;
-	}
-
-	start_count = count;
-	do {
-		if (current->need_resched) {
-			__set_current_state(TASK_RUNNING);
-			schedule();
-			if (!inactive_shortage())
-				return 1;
-		}
-
-		count -= refill_inactive_scan(DEF_PRIORITY, count);
-		if (count <= 0)
-			goto done;
-
-		/* If refill_inactive_scan failed, try to page stuff out.. */
-		swap_out(DEF_PRIORITY, gfp_mask);
-
-		if (--maxtry <= 0)
-			return 0;
-
-	} while (inactive_shortage());
-
-done:
-	return (count < start_count);
+	int shortage = inactive_shortage();
+	int large = freepages.high/2;
+	int scale;
+
+	scale = shortage/large;
+	scale += free_shortage()/large;
+	if (scale > DEF_PRIORITY-1)
+		scale = DEF_PRIORITY-1;
+	if (refill_inactive_scan(DEF_PRIORITY-scale, shortage) < shortage)
+		return swap_out(DEF_PRIORITY, gfp_mask);
+	return 1;
 }
 
 static int do_try_to_free_pages(unsigned int gfp_mask, int user)
@@ -976,7 +954,8 @@
 		 * We go to sleep for one second, but if it's needed
 		 * we'll be woken up earlier...
 		 */
-		if (!free_shortage() || !inactive_shortage()) {
+		if (current->need_resched || !free_shortage() ||
+				!inactive_shortage()) {
 			interruptible_sleep_on_timeout(&kswapd_wait, HZ);
 			/*
 			 * If we couldn't free enough memory, we see if it was
@@ -1054,7 +1033,7 @@
 			if (!zone->size)
 				continue;
 
-			while (zone->free_pages < zone->pages_low) {
+			while (zone->free_pages < zone->inactive_clean_pages) {
 				struct page * page;
 				page = reclaim_page(zone);
 				if (!page)

Now, let's go back to the patch I posted which reduced context switches
under load by ~40% (of ~685000) for a moment. Kswapd is asleep while
awaiting IO completion. The guys who are pestering the sleeping kswapd
are going to be doing page_launder to fix the shortage they're yammering
at kswapd about. We're nibbling away at the free shortage.. and the
inactive_dirty list.

So now, we have an inactive shortage as well as a large free shortage
when we enter refill_inactive. (shortages became large because kswapd
is sleeping on the job) 6 * (1 << page_cluster) is larger than
MAX_LAUNDER, but I don't see any reason to sneak up on the shortage
instead of correcting it all at once. It takes too long to find out
it's going to fail. Why not just get it over with before every scrubber
in the system is sleeping on IO.. except the ones doing swap pagebuffer
allocations. They can swapout, but they can't help push swap, so it'll
all sit there until somebody wakes up no?

If I'm interpreting the results right, taking it all on at once is
saving a lot of what looks to me to be unnecessary swap. I can't see
those swap numbers as being anyth
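To make the arithmetic in the new refill_inactive() easier to follow, here is a minimal userspace sketch of just the scaling step. It is not the patch itself: `scan_priority()` is a hypothetical helper, and the two shortage values and `freepages_high` are passed in as plain integers (an assumption for illustration) instead of being read from kernel state via inactive_shortage(), free_shortage() and freepages.high. The clamp mirrors the patch, so the priority handed to refill_inactive_scan() runs from DEF_PRIORITY down to 1 as the combined shortage grows.

```c
#include <assert.h>

#define DEF_PRIORITY 6

/*
 * Hypothetical standalone version of the scaling in the patched
 * refill_inactive(): every multiple of freepages_high/2 in either
 * shortage lowers the scan priority by one, clamped so the result
 * never drops below 1.
 */
static int scan_priority(int inactive_short, int free_short, int freepages_high)
{
	int large = freepages_high / 2;
	int scale;

	assert(large > 0);	/* the sketch would divide by zero otherwise */

	scale = inactive_short / large;
	scale += free_short / large;
	if (scale > DEF_PRIORITY - 1)
		scale = DEF_PRIORITY - 1;

	return DEF_PRIORITY - scale;
}
```

For example, with freepages_high = 512 (so large = 256), shortages of 300 and 100 give a priority of 5, while shortages of 600 and 600 give 2. If memory serves, a smaller priority value in this code means a larger slice of the active list gets scanned, so a bigger shortage translates into a more aggressive scan.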
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Rik van Riel wrote:

> On Fri, 18 May 2001, Stephen C. Tweedie wrote:
> > On Fri, May 18, 2001 at 07:44:39PM -0300, Rik van Riel wrote:
> >
> > > This is the core of why we cannot (IMHO) have a discussion
> > > of whether a patch introducing new VM tunables can go in:
> > > there is no clear overview of exactly what would need to be
> > > tunable and how it would help.
> >
> > It's worse than that. The workload on most typical systems is not
> > static. The VM *must* be able to cope with dynamic workloads. You
> > might twiddle all the knobs on your system to make your database run
> > faster, but end up in such a situation that the next time a mail flood
> > arrives for sendmail, the whole box locks up because the VM can no
> > longer adapt.
>
> That's another problem, indeed ;)
>
> Ingo, Mike, please keep this in mind when designing
> tunables or deciding which test you want to run today
> in order to look at how the VM is performing.

I've bent your code up a bit. I've not yet been tempted to replace any
of it with a knob ;-) There is a little piece I'd like to see thrown
away though.. the loop in refill_inactive does nothing good.

The test I prefer is a good one for the area of vm performance I'm
most interested in. It doesn't cover the full vm spectrum by any
means. I don't have a setup (any) good for testing mondo network
or IO stuff. I test a simple 'job one size too large' scenario.
Yes, it's limited test coverage.. it's still legitimate.

Perhaps when you're evaluating vm performance, you should try my
simple test once in a while. :) I'll bet you a bogobeer right here
and now that when 2.4.5 hits the street you're going to be queried
by the big-busy-box folks wrt swap volume.

> Basic rule for VM: once you start swapping, you cannot
> win; All you can do is make sure no situation loses
> really badly and most situations perform reasonably.

I disagree with that. I've seen a heavily swapping box run like
a scalded ass ape many times.

Warsteiner,

-Mike
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Stephen C. Tweedie wrote:

> Hi,
>
> On Fri, May 18, 2001 at 07:44:39PM -0300, Rik van Riel wrote:
>
> > This is the core of why we cannot (IMHO) have a discussion
> > of whether a patch introducing new VM tunables can go in:
> > there is no clear overview of exactly what would need to be
> > tunable and how it would help.
>
> It's worse than that. The workload on most typical systems is not
> static. The VM *must* be able to cope with dynamic workloads. You
> might twiddle all the knobs on your system to make your database run
> faster, but end up in such a situation that the next time a mail flood
> arrives for sendmail, the whole box locks up because the VM can no
> longer adapt.
>
> That's the main problem with static parameters. The problem you are
> trying to solve is fundamentally dynamic in most cases (which is also
> why magic numbers tend to suck in the VM.)

Yup. The problems are dynamic even with my static test load.

Off the top of my head, if I could make a suggestion to the vm it
would be something like "don't let dirty pages lay idle any longer
than this" and maybe "reclaim cleaned pages older than that".

-Mike
Re: Linux 2.4.4-ac10
On Fri, May 18, 2001 at 11:12:32PM -0300, Rik van Riel wrote:
> Basic rule for VM: once you start swapping, you cannot
> win; All you can do is make sure no situation loses
> really badly and most situations perform reasonably.

Do you mean paging in general or thrashing?

I always thought: paging good, thrashing bad. A good efficient paging
system, always moving data between memory and disk, is great. It's when
you have the greater than physical memory working set that things go to
hell in a hand basket.

Did Linux ever do the old trick of "We've too much going on! You!
(randomly points to a process) take a seat! You're not running for a
while!" and the process gets totally swapped out for a "while," not
even scheduled?

mrc
--
       Mike Castle       Life is like a clock: You can work constantly
  [EMAIL PROTECTED]      and be right all the time, or not work at all
www.netcom.com/~dalgoda/ and be right at least twice a day. -- mrc
    We are all of us living in the shadow of Manhattan. -- Watchmen
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Stephen C. Tweedie wrote:

> On Fri, May 18, 2001 at 07:44:39PM -0300, Rik van Riel wrote:
>
> > This is the core of why we cannot (IMHO) have a discussion
> > of whether a patch introducing new VM tunables can go in:
> > there is no clear overview of exactly what would need to be
> > tunable and how it would help.
>
> It's worse than that. The workload on most typical systems is not
> static. The VM *must* be able to cope with dynamic workloads. You
> might twiddle all the knobs on your system to make your database run
> faster, but end up in such a situation that the next time a mail flood
> arrives for sendmail, the whole box locks up because the VM can no
> longer adapt.

That's another problem, indeed ;)

Ingo, Mike, please keep this in mind when designing
tunables or deciding which test you want to run today
in order to look at how the VM is performing.

Basic rule for VM: once you start swapping, you cannot
win; All you can do is make sure no situation loses
really badly and most situations perform reasonably.

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: Linux 2.4.4-ac10
Hi,

On Fri, May 18, 2001 at 07:44:39PM -0300, Rik van Riel wrote:

> This is the core of why we cannot (IMHO) have a discussion
> of whether a patch introducing new VM tunables can go in:
> there is no clear overview of exactly what would need to be
> tunable and how it would help.

It's worse than that. The workload on most typical systems is not
static. The VM *must* be able to cope with dynamic workloads. You
might twiddle all the knobs on your system to make your database run
faster, but end up in such a situation that the next time a mail flood
arrives for sendmail, the whole box locks up because the VM can no
longer adapt.

That's the main problem with static parameters. The problem you are
trying to solve is fundamentally dynamic in most cases (which is also
why magic numbers tend to suck in the VM.)

Cheers,
 Stephen
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Mike Galbraith wrote:

> While I'd love to have more control, I can't say I have a clear
> picture of exactly how I'd like those knobs to look. I always
> start out trying to get it to seek the right behavior.. :) and
> end up fighting so many different fires I get lost in the smoke.

This is the core of why we cannot (IMHO) have a discussion
of whether a patch introducing new VM tunables can go in:
there is no clear overview of exactly what would need to be
tunable and how it would help.

When you and Ingo have something more specific to talk
about, I guess we can decide on that; but deciding on
something like this isn't really possible without at
least knowing what should be tunable ;)

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: Linux 2.4.4-ac10: Oops -> 2.4.4
> Anyway, the bug is in 2.4.4, not in 2.4.4-ac10: I am really sorry for
> having wasted your time. With 2.4.4-ac9 with my fdomain, everything is
> also working great ;-)

Great. [Crosses another bug off]

Alan
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Ingo Oeser wrote:

> On Fri, May 18, 2001 at 03:23:03PM -0300, Rik van Riel wrote:
> > On Fri, 18 May 2001, Ingo Oeser wrote:
> >
> > > Rik: Would you take patches for such a tradeoff sysctl?
> >
> > "such a tradeoff" ?
> >
> > While this sounds reasonable, I have to point out that
> > up to now nobody has described exactly WHAT tradeoff
> > they'd like to make tunable and why...
>
> Amount of pages reclaimed from swapout_mm() versus amount of
> pages reclaimed from caches.

I don't know if this'll make sense, but I think this has to be a
~fuzzy suggestion to the kernel. There are so many variables that
you can't predict what the kernel will run into. For example, with
my favorite test, sometimes tasks do something nasty, like all
deciding to do the same things at once and thereby jerking a _knot_
in the vm's tail.

-Mike
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Ingo Oeser wrote:

> On Fri, May 18, 2001 at 03:23:03PM -0300, Rik van Riel wrote:
> > "such a tradeoff" ?
> >
> > While this sounds reasonable, I have to point out that
> > up to now nobody has described exactly WHAT tradeoff
> > they'd like to make tunable and why...
>
> Amount of pages reclaimed from swapout_mm() versus amount of
> pages reclaimed from caches.
>
> A value that says: "use XX% of my main memory for RSS of
> processes, even if I run heavy disk load now" would be nice.
>
> For general purpose machines, where I run several services but
> also play games, this would allow both to survive.
>
> The external services would go slower. Who cares, if some CVS
> updates or NFS services go slower, if I can play my favorite game
> at full speed? ;-)

Remember that the executable and data of that game reside
in the filesystem cache. This "double counting" makes it
quite a bit harder to actually implement what seems like
a simple tradeoff.

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Rik van Riel wrote:

> On Fri, 18 May 2001, Ingo Oeser wrote:
>
> > Rik: Would you take patches for such a tradeoff sysctl?
>
> "such a tradeoff" ?
>
> While this sounds reasonable, I have to point out that
> up to now nobody has described exactly WHAT tradeoff
> they'd like to make tunable and why...

While I'd love to have more control, I can't say I have a clear
picture of exactly how I'd like those knobs to look. I always
start out trying to get it to seek the right behavior.. :) and
end up fighting so many different fires I get lost in the smoke.

-Mike
Re: Linux 2.4.4-ac10: Oops -> 2.4.4
Thus spake Alan Cox ([EMAIL PROTECTED]):

> Can you boot a kernel without fdomain.c compiled in next

Yes, but I am too stupid: there was a failure in my
patch-2.4.4-ac10.bz2, which is 0 bits, so I ran
bunzip -c patch-2.4.4-ac10.bz2|patch -p1 -s
with an empty file :-((

That means I compiled a 2.4.4 kernel, and not a 2.4.4-ac10 one.

I know that because I removed my fdomain SCSI controller and put an
Adaptec 2940 at the same place, and it booted very well...

So the problem is ME: I am too stupid!!!

Anyway, the bug is in 2.4.4, not in 2.4.4-ac10: I am really sorry for
having wasted your time. With 2.4.4-ac9 with my fdomain, everything is
also working great ;-)

If I can do anything, just let me know ;-)

Thank you very much,

Greg

http://ulima.unil.ch/greg ICQ:16624071 mailto:[EMAIL PROTECTED]
Re: Linux 2.4.4-ac10
On Fri, May 18, 2001 at 03:23:03PM -0300, Rik van Riel wrote:
> On Fri, 18 May 2001, Ingo Oeser wrote:
>
> > Rik: Would you take patches for such a tradeoff sysctl?
>
> "such a tradeoff" ?
>
> While this sounds reasonable, I have to point out that
> up to now nobody has described exactly WHAT tradeoff
> they'd like to make tunable and why...

Amount of pages reclaimed from swapout_mm() versus amount of
pages reclaimed from caches.

A value that says: "use XX% of my main memory for RSS of
processes, even if I run heavy disk load now" would be nice.

For general purpose machines, where I run several services but
also play games, this would allow both to survive.

The external services would go slower. Who cares, if some CVS
updates or NFS services go slower, if I can play my favorite game
at full speed? ;-)

> I'm not against making things tunable, but I would like
> to at least see the proponents of tunable things explain
> WHAT they want tunable and exactly WHY.

Ideally: Every value that the kernel decides by heuristics,
because heuristics can fail to get even close to an optimal
result.

But this is too much. Some tunables from refill_inactive would be
nice. Also the patch for honouring the soft rss limit is good
(is it in?).

Regards

Ingo Oeser
--
To the systems programmer,
users and applications serve only to provide a test load.
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Ingo Oeser wrote: > Rik: Would you take patches for such a tradeoff sysctl? "such a tradeoff" ? While this sounds reasonable, I have to point out that up to now nobody has described exactly WHAT tradeoff they'd like to make tunable and why... I'm not against making things tunable, but I would like to at least see the proponents of tunable things explain WHAT they want tunable and exactly WHY. regards, Rik -- Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Fri, May 18, 2001 at 07:45:15PM +0200, Mike Galbraith wrote:
> Yes, ~exactly! I chose 30 tasks because they almost do (tool/userland
> dependent.. must recalibrate often) fit. The bitch is to get the vm
> to automagically detect the rss/cache munch tradeoff point without all
> the manual help.

What about a sysctl for that? Choose decent steps, let 0 (which is an insane value) mean "let the kernel decide", and make this the default.

In the past we could do this by adjusting some watermarks in /proc/sys/vm, but now we can't do anything but trust the genius kernel developers.

I doubt that we can test all kinds of workloads, or even imagine what perverse stuff some people do with their machines.

Tuning _is_ manual work. Always has been and always will be.

This continuous "I know it better than you" is what I hated about Windows, and now it comes more and more into Linux :-(

Rik: Would you take patches for such a tradeoff sysctl?

Regards

Ingo Oeser
--
To the systems programmer, users and applications serve only to provide a test load.
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Rik van Riel wrote:
> On Fri, 18 May 2001, Mike Galbraith wrote:
> > On Thu, 17 May 2001, Rik van Riel wrote:
> > > On Thu, 17 May 2001, Mike Galbraith wrote:
> > >
> > > > Only doing parallel kernel builds. Heavy load throughput is up,
> > > > but it swaps too heavily. It's a little too conservative about
> > > > releasing cache now imho. (keeping about double what it should be
> > > > with this load.. easily [thump] tweaked;)
> > >
> > > "about double what it should be" ?
> >
> > Do you think there's 60-80mb of good cachable data? ;-) The "double"
> > is based upon many hundreds of test runs. I "know" that performance
> > is best with this load when the cache stays around 25-35Mb. I know
> > this because I've done enough bend adjusting to get throughput to
> > within one minute of single task times to have absolutely no doubt.
> > I can get it to 30 seconds with much obscene tweaking, and have done
> > it with zero additional overhead for make -j 30 ten times in a row.
> > (that kernel was.. plain weird. perfect synchronization.. voodoo!)
>
> Ahhh, I see. Remember that the "cached" figure you are
> seeing also includes swap-cached data from the gccs, which
> results from kswapd scanning the processes, clearing the
> PTE and, a bit later, the process grabbing the page again.

Yes.

> I suspect that if the gccs _just_ fit in memory, you can
> get some extra performance by mercilessly eating from the
> cache and keeping the gccs in memory. However, I also have
> the sneaking suspicion that this is not the best tactic for
> all workloads ;)

Yes, ~exactly! I chose 30 tasks because they almost do (tool/userland dependent.. must recalibrate often) fit. The bitch is to get the vm to automagically detect the rss/cache munch tradeoff point without all the manual help.

	-Mike
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Mike Galbraith wrote:
> On Thu, 17 May 2001, Rik van Riel wrote:
> > On Thu, 17 May 2001, Mike Galbraith wrote:
> >
> > > Only doing parallel kernel builds. Heavy load throughput is up,
> > > but it swaps too heavily. It's a little too conservative about
> > > releasing cache now imho. (keeping about double what it should be
> > > with this load.. easily [thump] tweaked;)
> >
> > "about double what it should be" ?
>
> Do you think there's 60-80mb of good cachable data? ;-) The "double"
> is based upon many hundreds of test runs. I "know" that performance
> is best with this load when the cache stays around 25-35Mb. I know
> this because I've done enough bend adjusting to get throughput to
> within one minute of single task times to have absolutely no doubt.
> I can get it to 30 seconds with much obscene tweaking, and have done
> it with zero additional overhead for make -j 30 ten times in a row.
> (that kernel was.. plain weird. perfect synchronization.. voodoo!)

Ahhh, I see. Remember that the "cached" figure you are seeing also includes swap-cached data from the gccs, which results from kswapd scanning the processes, clearing the PTE and, a bit later, the process grabbing the page again.

I suspect that if the gccs _just_ fit in memory, you can get some extra performance by mercilessly eating from the cache and keeping the gccs in memory. However, I also have the sneaking suspicion that this is not the best tactic for all workloads ;)

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
http://www.surriel.com/ http://distro.conectiva.com/
Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: Linux 2.4.4-ac10: Oops
Thus spake Alan Cox ([EMAIL PROTECTED]): > > SCSI subsystem driver Revision: 1.00 > > PCI: Found IRQ 11 for device 00:0b.0 > > Unable to handle kernel NULL pointer dereference at virtual address 000 > > printing eip: > > What scsi drivers do you have and which are on IRQ 11 I have two: 00:0b.0 SCSI storage controller: Future Domain Corp. TMC-18C30 [36C70] Flags: medium devsel, IRQ 11 I/O ports at b400 [size=16] Expansion ROM at [disabled] [size=64K] 00:06.0 SCSI storage controller: Adaptec AHA-2940U2/W / 7890 Subsystem: Adaptec 2940U2W SCSI Controller Flags: bus master, medium devsel, latency 32, IRQ 5 BIST result: 00 I/O ports at d000 [disabled] [size=256] Memory at df80 (64-bit, non-prefetchable) [size=4K] Expansion ROM at [disabled] [size=128K] Capabilities: Thanks, Greg http://ulima.unil.ch/greg ICQ:16624071 mailto:[EMAIL PROTECTED] PGP signature
Re: Linux 2.4.4-ac10
David Balazic <[EMAIL PROTECTED]> wrote: > What old old binutils ? > Isn't there a clear requirement for a minimum binutils version in > Documentation/Changes ( or maybe it is README ... ) ? Yes there is. From the Changes file: o binutils 2.9.1.0.25 # ld -v -- André Dahlqvist <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
Alan Cox ([EMAIL PROTECTED]) wrote : > > > > gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 >-fomit-frame-pointer -fno-strict-aliasing -pipe > -mpreferred-stack-boundary=2 -march=i686 -malign-functions=4 -c -o apm.o apm.c > > {standard input}: Assembler messages: > > {standard input}:180: Warning: indirect lcall without `*' > > {standard input}:274: Warning: indirect lcall without `*' > > > > Does anyone know what's up with that? Kernel problem or binutils issue? > > binutils is issuing a correct warning but if we fix the warning old old binutils > will then refuse to assemble it right. What old old binutils ? Isn't there a clear requirement for a minimum binutils version in Documentation/Changes ( or maybe it is README ... ) ? -- David Balazic -- "Be excellent to each other." - Bill & Ted - - - - - - - - - - - - - - - - - - - - - - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Rik van Riel wrote: > On Thu, 17 May 2001, Mike Galbraith wrote: > > > > Has anyone benched 2.4.5pre3 vs 2.4.4 vs. ? > > > > Only doing parallel kernel builds. Heavy load throughput is up, > > but it swaps too heavily. It's a little too conservative about > > releasing cache now imho. (keeping about double what it should be > > with this load.. easily [thump] tweaked;) > > "about double what it should be" > > That's an interesting statement, unless you have some > arguments to define exactly how much cache the system > should keep. Do you think there's 60-80mb of good cachable data? ;-) The "double" is based upon many hundreds of test runs. I "know" that performance is best with this load when the cache stays around 25-35Mb. I know this because I've done enough bend adjusting to get throughput to within one minute of single task times to have absolutely no doubt. I can get it to 30 seconds with much obscene tweaking, and have done it with zero additional overhead for make -j 30 ten times in a row. (that kernel was.. plain weird. perfect synchronization.. voodoo!) > Or are you just comparing with 2.2 and you'd rather > have 2.2 performance? ;) Nope. I've bent this vm up a little and build kernels that kicked the snot out of the previous record holder (classzone). I know for a fact that it can kick major butt.. why I fiddle with it when it doesn't. -Mike - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Fri, 18 May 2001, Sasi Peter wrote:
> On Thu, 17 May 2001, Rik van Riel wrote:
>
> > Or are you just comparing with 2.2 and you'd rather
> > have 2.2 performance? ;)
>
> Actually, yes. Doing fileserving with Samba, and also using the box
> interactively feels better with 2.2, and also the average TCP throughput
> (measured by iptraf) seems higher.

This part is probably mostly due to the inode and dentry cache balancing being completely broken in current 2.4 kernels.

Expect a patch soon (I'm running something really ugly right now here at home, I'll make something cleaner).

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
http://www.surriel.com/ http://distro.conectiva.com/
Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Rik van Riel wrote:
> Or are you just comparing with 2.2 and you'd rather
> have 2.2 performance? ;)

Actually, yes. Doing fileserving with Samba, and also using the box interactively feels better with 2.2, and also the average TCP throughput (measured by iptraf) seems higher.

--
SaPE - Peter, Sasi - mailto:[EMAIL PROTECTED] - http://sape.iq.rulez.org/
Re: Linux 2.4.4-ac10: Oops
> SCSI subsystem driver Revision: 1.00 > PCI: Found IRQ 11 for device 00:0b.0 > Unable to handle kernel NULL pointer dereference at virtual address 000 > printing eip: What scsi drivers do you have and which are on IRQ 11 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Mike Galbraith wrote: > > Has anyone benched 2.4.5pre3 vs 2.4.4 vs. ? > > Only doing parallel kernel builds. Heavy load throughput is up, > but it swaps too heavily. It's a little too conservative about > releasing cache now imho. (keeping about double what it should be > with this load.. easily [thump] tweaked;) "about double what it should be" That's an interesting statement, unless you have some arguments to define exactly how much cache the system should keep. Or are you just comparing with 2.2 and you'd rather have 2.2 performance? ;) Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to [EMAIL PROTECTED] (spam digging piggy) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Linux 2.4.4-ac10: Oops
Hello,

I have just compiled 2.4.4-ac10 and got:

...
SCSI subsystem driver Revision: 1.00
PCI: Found IRQ 11 for device 00:0b.0
Unable to handle kernel NULL pointer dereference at virtual address 000
 printing eip:
c01d11d0
*pde =
Oops: 0002
CPU: 0
EIP: 0010:[]
EFLAGS: 00010282
eax: dfff1000 ...

Is the rest of the output (the last ...) needed? If yes, I'll copy it (is there an easy way to do that?). Just let me know about it ;-)

gcc -v gives me:
Reading specs from /usr/lib/gcc-lib/i586-mandrake-linux/2.96/specs
gcc version 2.96 2731 (Linux-Mandrake 8.1 2.96-0.50mdk)

I have put some files under http://ulima.unil.ch/greg/linux/: the config of my kernels (2.4.4-ac9 and ac10), the System.map, the output of dmesg from 2.4.4-ac9, the output of lspci -v and finally the list of the rpms that are installed on my system.

Thank you very much,

Greg
http://ulima.unil.ch/greg ICQ:16624071 mailto:[EMAIL PROTECTED]
Re: Linux 2.4.4-ac10
> And a pair more: No > --- linux-2.4.4-ac10/include/linux/raid/md_k.h.orig Thu May 17 19:35:41 > 2001 > +++ linux-2.4.4-ac10/include/linux/raid/md_k.hThu May 17 19:36:15 2001 > @@ -38,6 +38,8 @@ > case RAID5: return 5; > } > panic("pers_to_level()"); > + > + return 0; panic appears properly declared as __attribute(noreturn). This looks to me like a gcc bug - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
> > gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 >-fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2 >-march=i686 -malign-functions=4 -c -o apm.o apm.c > {standard input}: Assembler messages: > {standard input}:180: Warning: indirect lcall without `*' > {standard input}:274: Warning: indirect lcall without `*' > > Does anyone know what's up with that? Kernel problem or binutils issue? binutils is issuing a correct warning but if we fix the warning old old binutils will then refuse to assemble it right. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
"J . A . Magallon" wrote: > --- linux-2.4.4-ac10/include/linux/raid/md_k.h.orig Thu May 17 19:35:41 > 2001 > +++ linux-2.4.4-ac10/include/linux/raid/md_k.h Thu May 17 19:36:15 2001 > @@ -38,6 +38,8 @@ > case RAID5: return 5; > } > panic("pers_to_level()"); > + > + return 0; > } panic should be marked attribute(noreturn), so gcc is being silly here by warning at all. I do this too, because IMHO its inline and won't make things bigger just shut up the warning. But Alan will yell at you for fixing gcc bugs in the kernel source :) Also, add a comment "fixes gcc warning" next to the code, so people know why it's there. -- Jeff Garzik | Game called on account of naked chick Building 1024| MandrakeSoft | - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On 05.17 Ingo Oeser wrote:
> On Thu, May 17, 2001 at 05:45:38PM +0100, Alan Cox wrote:
> > 2.4.4-ac10
>
> I think someone forgot this little return. It removes the
> following warning:
>
> serial.c:4208: warning: control reaches end of non-void function
>
>
> --- linux-2.4.4-ac10/drivers/char/serial.c	Thu May 17 20:41:05 2001
> +++ linux-2.4.4-ac10-ioe/drivers/char/serial.c	Thu May 17 20:35:53 2001
> @@ -4205,6 +4205,7 @@
>  {
>  	__set_current_state(TASK_UNINTERRUPTIBLE);
>  	schedule_timeout(HZ/10);
> +	return 0;
>  }
>

And a pair more:

--- linux-2.4.4-ac10/include/linux/raid/md_k.h.orig	Thu May 17 19:35:41 2001
+++ linux-2.4.4-ac10/include/linux/raid/md_k.h	Thu May 17 19:36:15 2001
@@ -38,6 +38,8 @@
 		case RAID5:		return 5;
 	}
 	panic("pers_to_level()");
+
+	return 0;
 }
 
 extern inline int level_to_pers (int level)
--- linux-2.4.3-ac12/drivers/scsi/aic7xxx/aic7xxx_osm.h.orig	Sun Apr 22 10:21:55 2001
+++ linux-2.4.3-ac12/drivers/scsi/aic7xxx/aic7xxx_osm.h	Mon Apr 23 10:55:58 2001
@@ -843,10 +843,10 @@
 		pci_read_config_dword(pci, reg, &retval);
 		return (retval);
 	}
-	default:
-		panic("ahc_pci_read_config: Read size too big");
-		/* NOTREACHED */
 	}
+	panic("ahc_pci_read_config: Read size too big");
+	/* NOTREACHED */
+	return 0;
 }
 
 static __inline void ahc_pci_write_config(ahc_dev_softc_t pci,

--
J.A. Magallon # Let the source be with you...
mailto:[EMAIL PROTECTED]
Linux Mandrake release 8.1 (Cooker) for i586
Linux werewolf 2.4.4-ac9 #4 SMP Mon May 14 11:22:40 CEST 2001 i686
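The md_k.h and aic7xxx hunks above add "return 0;" after panic() only to quiet gcc. Replies in this thread point out that panic() is meant to be declared noreturn, in which case the compiler should not warn at all. A minimal user-space sketch of that idea follows; die() and the P_* constants are hypothetical stand-ins for illustration, not the kernel's definitions, and __attribute__((noreturn)) is the GCC-style spelling.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the kernel's panic(). The attribute tells
 * the compiler that control never comes back from this call.
 */
static void die(const char *msg) __attribute__((noreturn));

static void die(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	abort();
}

/* Hypothetical personality ids, for illustration only. */
enum { P_LINEAR = 1, P_RAID0, P_RAID1, P_RAID5 };

static int pers_to_level_sketch(int pers)
{
	switch (pers) {
	case P_LINEAR:	return -1;
	case P_RAID0:	return 0;
	case P_RAID1:	return 1;
	case P_RAID5:	return 5;
	}
	die("pers_to_level_sketch()");
	/*
	 * No trailing "return 0;" here: because die() is noreturn, a
	 * compiler that honours the attribute knows the end of the
	 * function is unreachable and has no reason to warn.
	 */
}
```

If a particular compiler still warns on code shaped like this, the extra return with a "fixes gcc warning" comment, as suggested in the thread, is the pragmatic workaround.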
Re: Linux 2.4.4-ac10
On Thu, May 17, 2001 at 09:40:39PM +0300, Matti Aarnio wrote: > On Thu, May 17, 2001 at 08:33:36PM +0200, Udo A. Steinberg wrote: > > With 2.4.4-ac10 and binutils 2.11 I get the following warnings: > > It is a warning about kernel code using assembler statements > which are not valid with some older assemblers. Naeh, I am confusing (you, and myself). Fixing those (adding the '*') would not work with some older assemblers. Claiming minimum level of 2.10/2.11 for assembler/binutils would certainly allow fixing things by adding the missing '*'. > > gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 >-fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2 >-march=i686 -malign-functions=4 -c -o pci-pc.o pci-pc.c > > pci-pc.c:964: warning: `pci_fixup_via691' defined but not used > > pci-pc.c:977: warning: `pci_fixup_via691_2' defined but not used > > {standard input}: Assembler messages: > > {standard input}:747: Warning: indirect lcall without `*' > > {standard input}:832: Warning: indirect lcall without `*' > > {standard input}:919: Warning: indirect lcall without `*' > > {standard input}:958: Warning: indirect lcall without `*' ... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Thu, May 17, 2001 at 05:45:38PM +0100, Alan Cox wrote:
> 2.4.4-ac10

I think someone forgot this little return. It removes the following warning:

serial.c:4208: warning: control reaches end of non-void function

--- linux-2.4.4-ac10/drivers/char/serial.c	Thu May 17 20:41:05 2001
+++ linux-2.4.4-ac10-ioe/drivers/char/serial.c	Thu May 17 20:35:53 2001
@@ -4205,6 +4205,7 @@
 {
 	__set_current_state(TASK_UNINTERRUPTIBLE);
 	schedule_timeout(HZ/10);
+	return 0;
 }
 
 /*

Regards

Ingo Oeser
--
To the systems programmer, users and applications serve only to provide a test load.
Re: Linux 2.4.4-ac10
On Thu, May 17, 2001 at 08:33:36PM +0200, Udo A. Steinberg wrote:
> With 2.4.4-ac10 and binutils 2.11 I get the following warnings:

It is a warning about kernel code using assembler statements which are not valid with some older assemblers.

> gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2 -march=i686 -malign-functions=4 -c -o pci-pc.o pci-pc.c
> pci-pc.c:964: warning: `pci_fixup_via691' defined but not used
> pci-pc.c:977: warning: `pci_fixup_via691_2' defined but not used
> {standard input}: Assembler messages:
> {standard input}:747: Warning: indirect lcall without `*'
> {standard input}:832: Warning: indirect lcall without `*'
> {standard input}:919: Warning: indirect lcall without `*'
> {standard input}:958: Warning: indirect lcall without `*'
> {standard input}:990: Warning: indirect lcall without `*'
> {standard input}:1022: Warning: indirect lcall without `*'
> {standard input}:1053: Warning: indirect lcall without `*'
> {standard input}:1082: Warning: indirect lcall without `*'
> {standard input}:: Warning: indirect lcall without `*'
> {standard input}:1392: Warning: indirect lcall without `*'
> {standard input}:1497: Warning: indirect lcall without `*'
>
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2 -march=i686 -malign-functions=4 -c -o apm.o apm.c
> {standard input}: Assembler messages:
> {standard input}:180: Warning: indirect lcall without `*'
> {standard input}:274: Warning: indirect lcall without `*'
>
>
> Does anyone know what's up with that? Kernel problem or binutils issue?
>
> Regards,
> Udo.
Re: Linux 2.4.4-ac10
Hi,

Alan Cox wrote:
>
> 2.4.4-ac10

With 2.4.4-ac10 and binutils 2.11 I get the following warnings:

gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2 -march=i686 -malign-functions=4 -c -o pci-pc.o pci-pc.c
pci-pc.c:964: warning: `pci_fixup_via691' defined but not used
pci-pc.c:977: warning: `pci_fixup_via691_2' defined but not used
{standard input}: Assembler messages:
{standard input}:747: Warning: indirect lcall without `*'
{standard input}:832: Warning: indirect lcall without `*'
{standard input}:919: Warning: indirect lcall without `*'
{standard input}:958: Warning: indirect lcall without `*'
{standard input}:990: Warning: indirect lcall without `*'
{standard input}:1022: Warning: indirect lcall without `*'
{standard input}:1053: Warning: indirect lcall without `*'
{standard input}:1082: Warning: indirect lcall without `*'
{standard input}:: Warning: indirect lcall without `*'
{standard input}:1392: Warning: indirect lcall without `*'
{standard input}:1497: Warning: indirect lcall without `*'

gcc -D__KERNEL__ -I/usr/src/linux-2.4.4-ac/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2 -march=i686 -malign-functions=4 -c -o apm.o apm.c
{standard input}: Assembler messages:
{standard input}:180: Warning: indirect lcall without `*'
{standard input}:274: Warning: indirect lcall without `*'

Does anyone know what's up with that? Kernel problem or binutils issue?

Regards,
Udo.
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Chris Evans wrote: > On Thu, 17 May 2001, Alan Cox wrote: > > > 2.4.4-ac10 > [...] > > - now 2.4.5pre vm seems sane dump other vmscan > > experiments > > Has anyone benched 2.4.5pre3 vs 2.4.4 vs. ? Only doing parallel kernel builds. Heavy load throughput is up, but it swaps too heavily. It's a little too conservative about releasing cache now imho. (keeping about double what it should be with this load.. easily [thump] tweaked;) -Mike - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Chris Evans wrote: > On Thu, 17 May 2001, Alan Cox wrote: > > > 2.4.4-ac10 > [...] > > - now 2.4.5pre vm seems sane dump other vmscan > > experiments > > Has anyone benched 2.4.5pre3 vs 2.4.4 vs. ? Marcelo saw a 30% speed increase from 2.4.4 to 2.4.5pre3 on several tests. At the moment the main issues left seem to be: - balancing the inode + dentry caches versus the rest of the memory users (I'm working on it now) - simplifying __alloc_pages() a bit (I've got some things ready but I'm waiting for some other people who say they also have some stuff to add) - making sure we get rid of all the highmem flushing deadlocks ... there may still be a few around (ben, marcelo?) regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to [EMAIL PROTECTED] (spam digging piggy) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
> > ftp://ftp.linux.org.uk/pub/linux/alan/2.4-ac/ > > > Can't find it there (neither -ac9), but on the other hand it > is on kernel.org... Guess who forgot to fix the URL;) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Alan Cox wrote: > 2.4.4-ac10 > [not merged; rage-xl code] I'll take care of that... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED] In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
On Thu, 17 May 2001, Alan Cox wrote: > 2.4.4-ac10 [...] > - now 2.4.5pre vm seems sane dump other vmscan > experiments Has anyone benched 2.4.5pre3 vs 2.4.4 vs. ? Cheers Chris - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.4-ac10
Hi Alan, In article <[EMAIL PROTECTED]> you wrote: > >ftp://ftp.linux.org.uk/pub/linux/alan/2.4-ac/ > Can't find it there (neither -ac9), but on the other hand it is on kernel.org... Christoph -- Of course it doesn't work. We've performed a software upgrade. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Linux 2.4.4-ac10
ftp://ftp.linux.org.uk/pub/linux/alan/2.4-ac/

Intermediate diffs are available from
http://www.bzimage.org

Ok we are back on kernel.org

2.4.4-ac10
o Move cs46xx docs into the right spot (Arjan van de Ven)
o Merge Linus 2.4.5pre3
  - switch to Linus page fault race fixes
  - switch to Linus arch/ppc
  - merged serial driver cli fixes but also added an extra missing moxa check
  - used -ac better version of comx fix
  - used -ac better version of scsi fix
  - now 2.4.5pre vm seems sane dump other vmscan experiments
  [not merged; rage-xl code]

2.4.4-ac9
o Clean up x86isms from the UML code (Chris Emerson)
o Remove un-needed UML flag,fix hang under load (Jeff Dike)
o Fix attach race in UML (Jeff Dike)
o Fix warnings, clean up cpp abuses in UML (Roman Zippel)
o Remove -D__KERNEL__ from user space of UML (Roman Zippel)
o Add NCR53c700 and 53c700/66 driver (James Bottomley)
  | For NCR Dual 700 microchannel card
o Alpha semaphore updates (Ivan Kokshaysky)
o Fix ibmtr build a bit (Andrzej Krzysztofowicz)
o Tidy sysrq-t output (Russell King)
o Fix miata halt to SRM (Tom Vier)
o Fix aging on buffer cache pages (Marcelo Tosatti)
o Fix looping behaviour on failing memory (Marcelo Tosatti)
  allocations
o Handle the PIIX4 on the new intel 82801BAM (Tim Raymond)
o Fix user visible -ENOIOCTLCMD returns (Shane Wegner)
o Fix startech uart detection problem (Val Henson)
o Further tulip updates (Jeff Garzik)
o Revert hpt366 patch

2.4.4-ac8
o Prefetch constant copy_to_user data (Arjan van de Ven)
o Update cpqarray driver - use pci dma api (Charles White)
o Update cciss driver - use pci dma api (Charles White)
o Enable compiled in synclink driver (Paul Fulghum)
o Fix plip section conflict (Keith Owens)
o Tulip driver updates (Jeff Garzik)
o Frame buffer logo updates (Geert Uytterhoeven)
o Update __initdata documentation (Ingo Oeser)
o Linearize sunrpc buffers using GFP_KERNEL (Trond Myklebust)
o C Scott Ananian has moved (C Scott Ananian)
o Update get_unaligned docs (John Levon)
o Fix pci pool handling on boxes that have non (Pete Zaitcev)
  irq safe map create/destroy
o Update m68k semaphores (Geert Uytterhoeven)
o Update NLS Configure.help (Nerijus Baliunas)
o Clean up cyclom driver (Arnaldo Carvalho de Melo)
o Further serial driver update (Jeff Garzik)
o Fix typo in sched.c (Jim Freeman)
o Do prefetches on wake_up_common walk (Arjan van de Ven)
o Fix bootmem init problems (Andrea Arcangeli)
o Fix pops on cs46xx power management (Thomas Woller)
o Fix reference of freed memory in cs46xx (Christopher Kanaan)
o Hopefully fix i2o scsi reset crash (me)

2.4.4-ac7
o Fix dasd off by one found by Al Viro (me)
o Fix copy under cli in moxa,mxser,pcxx,riscom8
o Cleaned up serial167 formatting (no code (me)
  changes this patch set)
o Fix missing length check in AGPgart (me)
  | Found by Al Viro
o Fix wrong kmalloc sizes in ixj/emu10k1 (David Chan)
o Fix make distclean on ramfs/tmpfs (Ingo Oeser)
o Update checkconfig, Changes (Niels Jensen)
o NFS mmap consistency on close fix (Andrea Arcangeli)
o Fix 10 bit decode causing APM hang on a laptop (Pete Zaitcev)
  when using ymfpci
o Reserve failure on vesa video ram is not fatal (Jordan Crouse)
o Update athlon mmx copier to not prefetch off (Arjan van de Ven)
  the end
o Fix scsi.c procfs zero termination checks (Al Viro)
o And fix -EFAULT returns from it (me)
o Update ibm token ring driver (Mike Phillips)
o Fix sockfilter maths overflow (Al Viro)
o Make dev name lookup robust to nonterminated (Al Viro)
  buffers
o Update config.h use (Niels Jensen)
o Fix xircom cardbus ethernet/modem support (Bill Nottingham)
o Fix off by one buffer checks in atm_poa and (Al Viro)
  dasd