RE: How to fix malloc.
On 23-Feb-02 Alfred Perlstein wrote:
> * Matthew Dillon [EMAIL PROTECTED] [020223 12:51] wrote:
> > :Here is the most up-to-date version of the pgrp/session lock (at Change 6700):
> > :
> > :http://people.FreeBSD.org/~tanimura/patches/pgrp10.diff.gz
> > :
> > :I would like to commit this next Sunday.  Otherwise, my patch
> > :would conflict with other patches, especially tty.
> > :
> > :--
> > :Seigo Tanimura [EMAIL PROTECTED] [EMAIL PROTECTED]
> >
> >     Do you have any plans to get pgdelete() out from under Giant?  That
> >     would allow leavepgrp(), doenterpgrp(), enterpgrp(), enterthispgrp(),
> >     and setsid() (mostly) to be taken out from under Giant, and perhaps a
> >     few others.
> >
> >     I was thinking of simply having a free list of sessions and process
> >     groups, locked by PGRPSESS_XLOCK().  pgdelete() would then not have
> >     to call FREE(), and setsid() would almost always be able to pull a
> >     new structure off the appropriate free list and thus not have to
> >     obtain Giant for the MALLOC.
>
> All these per-subsystem free lists are making me nervous in terms of both
> complexity and wasted code...
>
> Ok, instead of keeping all these per-subsystem free lists, here's what
> we do:
>
> In kern_malloc:free(), right at the point of
>     if (size > MAXALLOCSAVE)
> we check whether we have Giant or not.  If we do not, then we simply
> queue the memory; however, if we do, then we call into kmem_free() with
> all the queued memory.
>
> This ought to solve the issue without making us keep all these
> per-cpu caches.
>
> By the way, since MAXALLOCSAVE is some multiple of PAGE_SIZE, we
> really don't have to worry about it when freeing small structures,
> although that puts evilness into malloc(9) consumers.
>
> Can you please consider that instead of continuing down this path
> of per-subsystem caches?

Just as a general heads-up, the slab memory allocator Jeff Roberson is working on will not need Giant for the free path, so can we just live with Giant locking around free() for now rather than adding in a bunch of temporary hacks to reinvent the wheel that we have to go back and remove later?
--
John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
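Alfred's deferred-free scheme above (queue the memory when Giant is not held, push the whole queue through kmem_free() once a free happens with Giant held) can be sketched in userland C. This is only an illustration of the control flow: free() stands in for kmem_free(), a plain flag stands in for mtx_owned(&Giant), and none of these names come from an actual patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Each queued block's first word is reused as the link to the next one. */
struct freeq_item { struct freeq_item *next; };

static struct freeq_item *freeq_head;   /* pending, not-yet-released memory */
static int giant_held;                  /* stands in for mtx_owned(&Giant)  */
static int real_frees;                  /* blocks actually handed back      */

static void deferred_free(void *ptr)
{
    if (!giant_held) {
        /* Lock not held: just chain the block onto the pending queue. */
        struct freeq_item *it = ptr;
        it->next = freeq_head;
        freeq_head = it;
        return;
    }
    /* Lock held: release this block and drain everything queued earlier. */
    free(ptr);
    real_frees++;
    while (freeq_head != NULL) {
        struct freeq_item *it = freeq_head;
        freeq_head = it->next;
        free(it);
        real_frees++;
    }
}
```

The drawback Terry raises later in the thread is visible here: if no caller ever frees with the lock held, the queue grows without bound.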
Re: malloc_bucket() idea (was Re: How to fix malloc.)
Jake Burkholder wrote:
> Jeff Roberson (jeff@) has been working on a slab allocator that goes a
> long way to making malloc(), free() and the zone allocator not require
> Giant.  I've reviewed what he's got so far and it looks pretty damn good
> to me, I'll see about getting him to post it.  He's working on adding
> the per-cpu queues now.

A design like this resolves my objection to the pure SLAB allocator; Vahalia suggests this as a potential enhancement in his book, and the authors of the SLAB allocator mention it in their second paper (~1996/1997).

--
Terry
Re: malloc_bucket() idea (was Re: How to fix malloc.)
Alfred Perlstein wrote:
> * Matthew Dillon [EMAIL PROTECTED] [020223 14:43] wrote:
> >     This is approximately what I am thinking.  Note that this gives us
> >     the flexibility to create a larger infrastructure around the bucket
> >     cache, such as implement per-cpu caches and so on and so forth.
> >     What I have here is the minimal implementation.
>
> I strongly object to this implementation right now, please do not
> commit it.  I already explained to you how to make the problem go away
> but instead you insist on some complex api that pins memory down like
> the damn zone allocator.  It's not needed, so please don't do it.

Actually, the zone allocator is not far off, I think.

Imagine if the entire possible KVA space (real RAM + swap) had preallocated PTEs.  Allocations could treat it as anonymous memory, for which a mapping process was not required, and all allocations would be interrupt-safe by default, without having to special-case the code one way or the other.  This seems, to me, to be the right idea.  The only issue left is that the maps take real memory that is wired down.

This raises the possibility of adding to the swap where swap + RAM < KVA but swap + RAM + new_swap >= KVA, which would imply mappings being required on the adding of swap (via swapon).  Not that painful, but it does imply a 1:1000 limit ratio on real vs. virtual RAM to get to the page mapping overhead.

4M pages would cover some of that problem... but making allocations swappable is often desirable, even in the kernel, so you would need to special-case those mappings... and 4M and 4K pages play badly together, unless you know what you are doing, and you know the undocumented bugs (c.v. the recent AMD AGP thing).

--
Terry
Success! critical_enter()/critical_exit() revamp (was Re: malloc_bucket() idea (was Re: How to fix malloc.))
    I'm going to wait until tomorrow before posting it.  I have a bunch of cleanup to do.

    Bruce, your sys.dif was *invaluable*!  It would have taken me a lot longer to figure out the interrupt disablement requirement around the icu_lock, and the statclock and hardclock forwarding issues (which I got working, by the way!).

    It turns out that putting the pending masks in the per-cpu area was the right solution!  It made statclock and hardclock forwarding easy (I can see why you didn't even try in your patch set; doing it with a global mask would be nearly impossible).  In fact, it made everything unbelievably easy.

    Apart from all the assembly coding, there were two serious issues.  First, fork_exit() assumes that interrupts are hard-disabled on entry.  I readjusted the code such that the trampoline assembly initialized the td_critnest count so it could STI prior to calling fork_exit().

    The second issue is that cpu_switch() does not save/restore PSL_I (the interrupt disablement flag).  I added a PSL word to the PCB structure to take care of the problem.  Without this, if a thread switches away while holding interrupts hard-disabled, the next thread will inherit the hard disablement.  I saw the sti's you added in various places but I figured the best solution was to simply save/restore the state.  The original code didn't have this problem because interrupts were always hard-disabled while holding the sched_lock and, of course, cpu_switch() is always called while holding sched_lock.  (Similarly, icu_lock needed hard-disablements wrapped around it for the same reason, because a level interrupt still needs to be able to get icu_lock in order to defer itself.)

    In any case, I have successfully built the world on a -current SMP + apic_vector system.  Tomorrow I will clean up the UP icu_vector code to match the apic_vector code and post the results.  I also have a sysctl that allows developers to toggle the mode for testing purposes (old way or new way).
    Finally, I took your suggestion in regards to not trying to combine the masks together.  I have a 'some sort of interrupt is pending' variable, then I have fpending (for fast interrupts), ipending (for normal interrupt threads), and spending (which I use for the stat and hardclock forwarding).  They're all per-cpu entities, of course.  unpend() prioritizes them.

    In any case, I'll post it all tomorrow after I've got icu_vector cleaned up.  One of the best things about this patch set is that it is really flexible.  We should be able to really make interrupts fly.  In fact, it should even be possible for one cpu to steal another cpu's pending interrupt(s) fairly trivially, without having to resort to IPIs.

					-Matt
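The three per-cpu pending masks Matt describes, drained in priority order by unpend(), can be sketched roughly as follows. The mask names follow the post; the handler dispatch and the per-cpu plumbing are stubbed out here, and the real unpend() certainly differs in detail.

```c
#include <assert.h>

/* Illustrative sketch of per-CPU pending-interrupt bookkeeping: three
 * bitmasks drained in priority order.  In the real patch these live in
 * the per-cpu area; here they are simple globals. */
static unsigned fpending;       /* fast interrupts                     */
static unsigned ipending;       /* normal interrupt threads            */
static unsigned spending;       /* statclock/hardclock forwarding      */

static int run_order[32], run_count;

/* Stub dispatcher: record which class (0=fast, 1=ithread, 2=clock)
 * handled which irq, encoded as class*100 + irq. */
static void run_handler(int class, int irq)
{
    run_order[run_count++] = class * 100 + irq;
}

/* Drain all pending interrupts: fast ones first, then interrupt
 * threads, then the forwarded clock interrupts. */
static void unpend(void)
{
    int irq;

    while (fpending != 0) {
        irq = __builtin_ffs(fpending) - 1;  /* lowest pending irq */
        fpending &= ~(1u << irq);
        run_handler(0, irq);
    }
    while (ipending != 0) {
        irq = __builtin_ffs(ipending) - 1;
        ipending &= ~(1u << irq);
        run_handler(1, irq);
    }
    while (spending != 0) {
        irq = __builtin_ffs(spending) - 1;
        spending &= ~(1u << irq);
        run_handler(2, irq);
    }
}
```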
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, 23 Feb 2002, Alfred Perlstein wrote:
> Usually when I see diff(1) output from you I expect a commit within the
> next half hour or so; I just wanted to make myself clear on the issue.
> No worries. :)
>
> Yes, and hopefully JeffR's allocator will fix our problems, that is if
> it ever makes it out of p4. :)

That assumes that it was actually in P4 in the first place.  Jeffr's commit bit was approved in the last two days, and he's begun posting his patches to -arch.  I hope to see his memory allocator patch there soon :-).

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
[EMAIL PROTECTED]             NAI Labs, Safeport Network Services
Re: Success! critical_enter()/critical_exit() revamp (was Re:malloc_bucket() idea (was Re: How to fix malloc.))
On Sun, 24 Feb 2002, Matthew Dillon wrote:
>     Bruce, your sys.dif was *invaluable*!  It would have taken me a lot
>     longer to figure out the interrupt disablement requirement around
>     the icu_lock, and the statclock and hardclock forwarding issues
>     (which I got working by the way!).

Thanks.

>     It turns out that putting the pending masks in the per-cpu area was
>     the right solution!  It made statclock and hardclock forwarding easy
>     (I can see why you didn't even try in your patch set, doing it with
>     a global mask would be nearly impossible).  In fact, it made
>     everything unbelievably easy.
> ...
> [This paragraph reordered]
>     up.  One of the best things about this patch set is that it is really
>     flexible.  We should be able to really make interrupts fly.  In fact,
>     it should even be possible for one cpu to steal another cpu's pending
>     interrupt(s) fairly trivially, without having to resort to IPIs.

Good idea.  Stealing would be even easier if the mask were global :-).

>     The second issue is that cpu_switch() does not save/restore the
>     PSL_I (interrupt disablement mask).  I added a PSL word to the PCB
>     structure to take care of the problem.  Without this if you have
>     a thread switch away while holding interrupts hard-disabled, the
>     next thread will inherit the hard disablement.  I saw the sti's
>     you added in various places but I figured the best solution was
>     to simply save/restore the state.  The original code didn't have

cpu_switch() certainly needs to do this if it can be called with the interrupt enable flag[s] in different states.  I need the sti's (actually enable_intr()'s) because I don't want fast interrupts to be disabled during context switches.  This works because enabling interrupts is sure to be safe, since we might be switching to a thread that will enable them.  Some sort of lock is needed to prevent interrupts interfering with the switch.  I think soft-masking them in critical_enter() is sufficient in your version too.
Bruce
Re: Success! critical_enter()/critical_exit() revamp (was Re:malloc_bucket() idea (was Re: How to fix malloc.))
:     up.  One of the best things about this patch set is that it is really
:     flexible.  We should be able to really make interrupts fly.  In fact,
:     it should even be possible for one cpu to steal another cpu's pending
:     interrupt(s) fairly trivially, without having to resort to IPIs.
:
:Good idea.  Stealing would be even easier if the mask were global :-).

    Well, for the moment I am attempting to avoid having to use lock;'d bus cycles to keep things efficient.  When we get to the stealing part we might wind up using lock;, but I'll deal with that when the time comes.

:     The second issue is that cpu_switch() does not save/restore the
:     PSL_I (interrupt disablement mask).  I added a PSL word to the PCB
:     structure to take care of the problem.  Without this if you have
:     a thread switch away while holding interrupts hard-disabled, the
:     next thread will inherit the hard disablement.  I saw the sti's
:     you added in various places but I figured the best solution was
:     to simply save/restore the state.  The original code didn't have
:
:cpu_switch() certainly needs to do this if it can be called with the
:interrupt enable flag[s] in different states.  I need the sti's (actually
:enable_intr()'s) because I don't want fast interrupts to be disabled
:during context switches.  This works because enabling interrupts is sure
:to be safe, since we might be switching to a thread that will enable
:them.  Some sort of lock is needed to prevent interrupts interfering
:with the switch.  I think soft-masking them in critical_enter() is
:sufficient in your version too.
:
:Bruce

    I don't think we want to make sched_lock any more complex than it already is, so at least for the foreseeable future we are not going to be able to actually execute an interrupt handler until the sched_lock is released in (typically) msleep().
    I am rather annoyed that two levels of procedure have to be called with the sched_lock held (mi_switch() and cpu_switch()), leaving interrupts disabled for a fairly long period of time, but I don't see any way around it right now.  Eventually (presumably) we will have per-cpu run queues.  That combined with interrupt stealing may resolve the problem for us.

    I am still not convinced that making the various *pending* flags globals will be more efficient, because it introduces significant cache mastership issues.  It might be easier to do this:

	interrupt_vector {
		if (td->td_critnest) {
			... try to forward the interrupt to another cpu
			    that is not in a critical section ...
			... else set bit in local ipending mask.
		} else {
			... take or schedule interrupt ...
		}
	}

    Or possibly:

	interrupt_vector {
		if (td->td_critnest) {
			if (!mtx_owned(&sched_lock) && mtx_trylock(&sched_lock)) {
				... we can still schedule the interrupt
				    even though we are in a critical
				    section, we just can't switch to it
				    yet ...
			} else {
				... try to forward the interrupt to a cpu
				    that is not in a critical section ...
				... else set bit in local ipending mask.
			}
		} else {
			... take or schedule interrupt ...
		}
	}

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
Re: Success! critical_enter()/critical_exit() revamp (was Re:malloc_bucket() idea (was Re: How to fix malloc.))
On Sun, 24 Feb 2002, Matthew Dillon wrote:
> :cpu_switch() certainly needs to do this if it can be called with the
> :interrupt enable flag[s] in different states.  I need the sti's (actually
> :enable_intr()'s) because I don't want fast interrupts to be disabled
> :during context switches.  This works because enabling interrupts is sure
> :to be safe, since we might be switching to a thread that will enable
> :them.  Some sort of lock is needed to prevent interrupts interfering
> :with the switch.  I think soft-masking them in critical_enter() is
> :sufficient in your version too.
>
>     I don't think we want to make sched_lock any more complex than it
>     already is, so at least for the foreseeable future we are not going
>     to be able to actually execute an interrupt handler until the
>     sched_lock is released in (typically) msleep().  I am rather

Well, my kernel has been executing fast interrupt handlers while sched_lock is held for almost a year.  It's actually less complicated with respect to sched_lock but more complicated with respect to fast interrupt handlers.

>     annoyed that two levels of procedure have to be called with the
>     sched_lock held (mi_switch() and cpu_switch()), leaving interrupts
>     disabled for a fairly long period of time, but I don't see any way
>     around it right now.

The worst offenders for interrupt latency seemed to be calcru() and/or the sched_locking related to fork and/or exit.  Latency was many thousand instructions (reasonable only on 100+ MIPS machines).  sched_locking for calcru() is mostly bogus and should be easy to avoid, but not so for context switching.

>     Eventually (presumably) we will have per-cpu run queues.  That
>     combined with interrupt stealing may resolve the problem for us.
>
>     I am still not convinced that making the various *pending* flags
>     globals will be more efficient, because it introduces significant
>     cache mastership issues.  It might be easier to do this:

OK.  I don't care about this yet.
Bruce
Re: How to fix malloc.
Alfred Perlstein wrote:
> All these per-subsystem free-lists are making me nervous in both
> complexity and wasted code...

Me too.

> Ok, instead of keeping all these per-subsystem free-lists here's what
> we do:
>
> In kern_malloc:free(), right at the point of
>     if (size > MAXALLOCSAVE)
> we check if we have Giant or not.  If we do not, then we simply queue
> the memory; however, if we do, then we call into kmem_free with all the
> queued memory.
>
> This ought to solve the issue without making us keep all these
> per-cpu caches.

One modification: limit the number that are freed per invocation to some number small enough that there won't be a big latency.

Once everything gets to this point, though, there will be nothing to trigger a free with Giant held, and you'll just queue things up forever.  Really, we need counted queues -- queues that know the number of elements on them.  This is a requirement for RED-Queueing, and it will let us know when the queue gets deep... then you can grab Giant, and flush down the queue if it hits a high watermark.

Obviously, the correct way to handle this is per-CPU memory pools that don't have any need for lock contention at all on the real frees.

--
Terry
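Terry's "counted queue" refinement -- a queue that knows its own depth, with a high watermark to decide when a flush is warranted and a per-invocation cap to bound latency -- might look roughly like this in userland C. The watermark and batch constants are made up for illustration; free() again stands in for the real release path.

```c
#include <assert.h>
#include <stdlib.h>

/* A deferred-free queue that tracks its own length, so callers can
 * tell when it is deep enough to be worth grabbing the big lock. */
struct cq_item { struct cq_item *next; };

struct cfreeq {
    struct cq_item *head;
    int count;                  /* the queue knows its own depth */
};

#define CQ_HIWAT 8              /* flush trigger (illustrative)        */
#define CQ_BATCH 4              /* max frees per drain, bounds latency */

static void cq_enqueue(struct cfreeq *q, void *ptr)
{
    struct cq_item *it = ptr;
    it->next = q->head;
    q->head = it;
    q->count++;
}

/* Release at most CQ_BATCH queued blocks.  Meant to be called once the
 * queue is over CQ_HIWAT and the caller holds whatever lock the real
 * free path needs; returns how many blocks were released. */
static int cq_drain(struct cfreeq *q)
{
    int n = 0;

    while (q->head != NULL && n < CQ_BATCH) {
        struct cq_item *it = q->head;
        q->head = it->next;
        q->count--;
        free(it);
        n++;
    }
    return n;
}
```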
Re: How to fix malloc.
:All these per-subsystem free-lists are making me nervous in both
:complexity and wasted code...
:
:Ok, instead of keeping all these per-subsystem free-lists here's what
:we do:
:
:In kern_malloc:free(), right at the point of
:    if (size > MAXALLOCSAVE)
:we check if we have Giant or not.
:if we do not then we simply queue the memory
:however, if we do then we call into kmem_free with all the queued memory.
:
:This ought to solve the issue without making us keep all these
:per-cpu caches.

    Hmm.  I don't like generalizing it to that extent.  What if we simply assert that Giant is held if size > MAXALLOCSAVE in free()?  i.e. for the moment we require that any 'large allocation' use Giant.  Then we can freely use free() to free smaller allocations without Giant.

    There are three other problems: (1) malloc() itself will call kmem_malloc() even if M_NOWAIT is passed, so we can't optimize malloc()-without-Giant unless we add a new flag to deal with this situation.

    And, (2) we have a single mutex, malloc_mtx.  This is Bosko's code so I am adding him to the To: list.  Bosko, it looks like with a simple cleanup to the msleep() call we can use a pool lock instead of malloc_mtx, which would give us the ability to lock malloc() on a bucket-by-bucket basis.  The addition of another malloc flag will allow us to tell malloc() to return NULL if it can't do the operation with its own mutex.

    And, (3) we wind up getting and releasing two mutexes instead of one in the code I was originally referring to, because setsid() already gets an SX lock and I was just going to fold the free list into that.  Still, I don't mind doing it via malloc().

    What do people think?

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]

:By the way, since MAXALLOCSAVE is some multiple of PAGE_SIZE, we
:really don't have to worry about it when freeing small structures
:although that puts evilness into malloc(9) consumers.
:
:Can you please consider that instead of continuing down this path
:of per-subsystem caches?
:
:thanks,
:--
:-Alfred Perlstein [[EMAIL PROTECTED]]
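The pool-lock idea Matt raises in point (2) -- mtx_pool_find() -- amounts to hashing an object's address into a small fixed array of pre-existing locks, so that independent buckets usually end up on independent mutexes. A minimal sketch of just the address-to-lock mapping, with the mutex reduced to a plain struct since only the hashing behavior is of interest here; the pool size and shift amount are illustrative, not the kernel's values.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE 32            /* illustrative; must be a power of two
                                 * or a prime in a serious implementation */

struct fake_mtx { int locked; };        /* stands in for struct mtx */

static struct fake_mtx lock_pool[POOL_SIZE];

/* Map an address to one member of the fixed lock pool.  The low bits
 * are shifted off first because heap objects are aligned, so they
 * carry no entropy. */
static struct fake_mtx *pool_find(void *ptr)
{
    uintptr_t v = (uintptr_t)ptr >> 8;

    return &lock_pool[v % POOL_SIZE];
}
```

The mapping is deterministic: every lock/unlock pair on the same object finds the same mutex, without any per-object lock storage. Terry's later objection in this thread is that the collision domains are accidental, decided by where objects happen to land in memory.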
malloc_bucket() idea (was Re: How to fix malloc.)
    This is approximately what I am thinking.  Note that this gives us the flexibility to create a larger infrastructure around the bucket cache, such as implement per-cpu caches and so on and so forth.  What I have here is the minimal implementation.

					-Matt

Index: kern/kern_malloc.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_malloc.c,v
retrieving revision 1.93
diff -u -r1.93 kern_malloc.c
--- kern/kern_malloc.c	12 Sep 2001 08:37:44 -0000	1.93
+++ kern/kern_malloc.c	23 Feb 2002 22:41:56 -0000
@@ -116,6 +116,63 @@
 #endif /* INVARIANTS */
 
 /*
+ * init_malloc_bucket:
+ *
+ *	Initialize a malloc bucket.
+ */
+void
+init_malloc_bucket(struct malloc_bucket *bucket, struct malloc_type *type, int size)
+{
+	bzero(bucket, sizeof(struct malloc_bucket));
+	bucket->b_size = size;
+	bucket->b_type = type;
+	bucket->b_mtx = mtx_pool_find(bucket);
+}
+
+/*
+ * malloc_bucket:
+ *
+ *	Allocate a structure from the supplied low-latency cache.  NULL is
+ *	returned if the cache is empty.
+ */
+void *
+malloc_bucket(struct malloc_bucket *bucket)
+{
+	void *ptr = NULL;
+
+	if (bucket->b_next) {
+		mtx_lock(bucket->b_mtx);
+		if ((ptr = bucket->b_next) != NULL) {
+			bucket->b_next = *(caddr_t *)ptr;
+			KASSERT(bucket->b_count > 0, ("bucket count mismatch"));
+			--bucket->b_count;
+			*(caddr_t *)ptr = WEIRD_ADDR;
+		} else {
+			KASSERT(bucket->b_count == 0, ("bucket count mismatch"));
+		}
+		mtx_unlock(bucket->b_mtx);
+	}
+	return(ptr);
+}
+
+/*
+ * free_bucket:
+ *
+ *	Free a structure to the low-latency cache.
+ */
+void
+free_bucket(struct malloc_bucket *bucket, void *ptr)
+{
+	mtx_lock(bucket->b_mtx);
+	*(caddr_t *)ptr = bucket->b_next;
+	bucket->b_next = (caddr_t)ptr;
+	++bucket->b_count;
+	mtx_unlock(bucket->b_mtx);
+	/* XXX if b_count > something, wakeup our cleaner? */
+}
+
+/*
  * malloc:
  *
  *	Allocate a block of memory.
Index: sys/malloc.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/malloc.h,v
retrieving revision 1.54
diff -u -r1.54 malloc.h
--- sys/malloc.h	10 Aug 2001 06:37:04 -0000	1.54
+++ sys/malloc.h	23 Feb 2002 22:36:09 -0000
@@ -109,6 +109,18 @@
 	long	kb_couldfree;	/* over high water mark and could free */
 };
 
+/*
+ * malloc_bucket holder for fast low-latency malloc_bucket() and free_bucket()
+ * calls.
+ */
+struct malloc_bucket {
+	caddr_t	b_next;
+	int	b_count;
+	int	b_size;
+	struct mtx *b_mtx;
+	struct malloc_type *b_type;
+};
+
 #ifdef _KERNEL
 #define	MINALLOCSIZE	(1 << MINBUCKET)
 #define	BUCKETINDX(size) \
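To see what the patch's fast path buys, here is a userland restatement of the bucket cache: a LIFO freelist threaded through the first word of each cached block, so the cache needs no storage of its own. The kernel version additionally takes b_mtx around the list manipulation and poisons the link word with WEIRD_ADDR; both are omitted to keep the sketch single-threaded, and the function names are paraphrased, not the proposed API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userland restatement of the bucket cache: blocks of a fixed size,
 * cached on a LIFO whose links live inside the free blocks themselves. */
struct bucket {
    void *b_next;               /* head of the freelist              */
    int   b_count;              /* number of blocks currently cached */
    int   b_size;               /* fixed block size for this bucket  */
};

static void bucket_init(struct bucket *b, int size)
{
    memset(b, 0, sizeof(*b));
    b->b_size = size;
}

/* Pop a cached block, or NULL if the cache is empty (the caller then
 * falls back to the ordinary allocator). */
static void *bucket_alloc(struct bucket *b)
{
    void *ptr = b->b_next;

    if (ptr != NULL) {
        b->b_next = *(void **)ptr;  /* first word of the block is the link */
        b->b_count--;
    }
    return ptr;
}

/* Push a block back onto the cache; never touches the real allocator. */
static void bucket_free(struct bucket *b, void *ptr)
{
    *(void **)ptr = b->b_next;      /* store the link in the block itself */
    b->b_next = ptr;
    b->b_count++;
}
```

The NULL-on-empty return is the design point Matt mentions elsewhere in the thread: the fast path never blocks and never needs Giant, and only the fallback allocation does.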
Re: malloc_bucket() idea (was Re: How to fix malloc.)
Matthew Dillon wrote:
>     This is approximately what I am thinking.  Note that this gives us
>     the flexibility to create a larger infrastructure around the bucket
>     cache, such as implement per-cpu caches and so on and so forth.
>     What I have here is the minimal implementation.

Uh, probably freeing the mutex of the new bucket after holding the one on the next list is probably not what you wanted to happen. 8-).  I think you would need a separate head, and then use the head containing structure's mutex, instead.  I understand what you were trying to do, though... I guess you are relying on the use of different buckets to spread the pain?

Right now the pain is centered around one struct type anyway (the cred one that started this discussion), so you are still going to contend the same resource for each (just the bucket hash mtx instead of Giant, for the bucket type).  Also, you probably want it to be a STAILQ.

I'm not sure that this is actually a win?!?  I guess if there are a bunch of these buckets, you would end up with different hash values for the mutexes... though relying on a hash seems wrong, since you can't really control collision domains that way; eventually, you will get people with widely different performance based on where the linker linked their bucket list head to, and that's probably not good.

OTOH, a per-CPU bucket list means no bucket mtx would be necessary; without that, it's just replacing one type of contention for another, I think, until you start making mutex selection indeterminate. 8^(.

Really, this delayed-freeing idea is starting to get uglier than just going to per-CPU pools...

--
Terry
Re: malloc_bucket() idea (was Re: How to fix malloc.)
:OTOH, a per-CPU bucket list means no bucket mtx would be necessary;
:without that, it's just replacing one type of contention for another,
:I think, until you start making mutex selection indeterminate. 8^(.
:
:Really, this delayed-freeing idea is starting to get uglier than
:just going to per-CPU pools...
:
:-- Terry

    Well, if we want to rewrite malloc(), a per-cpu pool inside a critical section would not be that difficult.  The 'bucket' array would simply be placed in the per-cpu area.  However, with malloc() we still have a serious problem with the malloc_type structure statistics.  There would have to be per-cpu information for those as well.

    critical_enter() isn't much better than a mutex.  It can be optimized to get rid of the inline cli/sti, however.  Actually, it can be optimized trivially for i386: all we have to do is check the critical nesting count from the interrupt code and set the interrupts-disabled bit in the returned (supervisor mode) frame.  critical_enter() and critical_exit() would then be nearly perfect.

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
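The per-cpu pool Matt sketches in words -- one freelist per CPU guarded only by a critical section, with the malloc_type statistics consequently split per CPU and summed for reporting -- could look roughly like this. The explicit cpu argument stands in for PCPU_GET(cpuid) under critical_enter(); everything here is illustrative rather than a proposed kernel interface.

```c
#include <assert.h>
#include <stdlib.h>

#define NCPU 2

/* One pool per CPU: no mutex, because on the real system only code
 * running on that CPU inside a critical section touches it. */
struct pcpu_pool {
    void *head;                 /* per-CPU freelist                      */
    long  nfree;                /* per-CPU stat: blocks currently cached */
};

static struct pcpu_pool pools[NCPU];

static void pcpu_free(int cpu, void *ptr)
{
    /* critical_enter() would go here: no preemption on this CPU. */
    *(void **)ptr = pools[cpu].head;
    pools[cpu].head = ptr;
    pools[cpu].nfree++;
    /* critical_exit() */
}

static void *pcpu_alloc(int cpu)
{
    void *ptr = pools[cpu].head;

    if (ptr != NULL) {
        pools[cpu].head = *(void **)ptr;
        pools[cpu].nfree--;
    }
    return ptr;
}

/* The malloc_type statistics problem Matt mentions: a global total
 * only exists as a sum over the per-CPU counters. */
static long pcpu_total_cached(void)
{
    long total = 0;

    for (int i = 0; i < NCPU; i++)
        total += pools[i].nfree;
    return total;
}
```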
Re: malloc_bucket() idea (was Re: How to fix malloc.)
Apparently, On Sat, Feb 23, 2002 at 02:43:35PM -0800, Matthew Dillon said words to the effect of;
>     This is approximately what I am thinking.  Note that this gives us
>     the flexibility to create a larger infrastructure around the bucket
>     cache, such as implement per-cpu caches and so on and so forth.
>     What I have here is the minimal implementation.
>
> 					-Matt

Jeff Roberson (jeff@) has been working on a slab allocator that goes a long way to making malloc(), free() and the zone allocator not require Giant.  I've reviewed what he's got so far and it looks pretty damn good to me, I'll see about getting him to post it.  He's working on adding the per-cpu queues now.

Jake
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, 23 Feb 2002, Matthew Dillon wrote:
>     critical_enter() isn't much better than a mutex.  It can be
>     optimized to get rid of the inline cli/sti however.  Actually, it
>     can be optimized trivially for i386, all we have to do is check the
>     critical nesting count from the interrupt code and set the
>     interrupts-disabled bit in the returned (supervisor mode) frame.
>     critical_enter() and critical_exit() would then be nearly perfect.

My version of it does less than this.  I only use it to help implement spinlocks.

Index: kern_switch.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_switch.c,v
retrieving revision 1.20
diff -u -2 -r1.20 kern_switch.c
--- kern_switch.c	11 Feb 2002 20:37:51 -0000	1.20
+++ kern_switch.c	13 Feb 2002 05:34:20 -0000
@@ -70,14 +70,21 @@
 }
 
-/* Critical sections that prevent preemption. */
+/*-
+ * Critical section handling.
+ * XXX doesn't belong here.
+ *
+ * Entering a critical section only blocks non-fast interrupts.
+ * critical_enter() is similar to splhigh() in a 2-level spl setup under
+ * old versions of FreeBSD.
+ *
+ * Exiting from all critical sections unblocks non-fast interrupts and runs
+ * the handlers of any that were blocked.  critical_exit() is similar to
+ * spl(old_level) in a 2-level spl setup under old versions of FreeBSD.
+ */
 void
 critical_enter(void)
 {
-	struct thread *td;
-
-	td = curthread;
-	if (td->td_critnest == 0)
-		td->td_savecrit = cpu_critical_enter();
-	td->td_critnest++;
+	curthread->td_critnest++;
 }
 
@@ -85,12 +92,7 @@
 critical_exit(void)
 {
-	struct thread *td;
-
-	td = curthread;
-	if (td->td_critnest == 1) {
-		td->td_critnest = 0;
-		cpu_critical_exit(td->td_savecrit);
-	} else
-		td->td_critnest--;
+	if (--curthread->td_critnest == 0 && (ipending | spending) != 0)
+		unpend();
 }

Bruce
Re: malloc_bucket() idea (was Re: How to fix malloc.)
:My version of it does less than this.  I only use it to help implement
:spinlocks.

    You put together infrastructure to deal with pending pci interrupts?  If so, then why not commit it (or at least commit a version #ifdef'd for the i386 architecture)?

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]

:Index: kern_switch.c
:===================================================================
:RCS file: /home/ncvs/src/sys/kern/kern_switch.c,v
:retrieving revision 1.20
:diff -u -2 -r1.20 kern_switch.c
:--- kern_switch.c	11 Feb 2002 20:37:51 -0000	1.20
:+++ kern_switch.c	13 Feb 2002 05:34:20 -0000
:@@ -70,14 +70,21 @@
: }
:
:-/* Critical sections that prevent preemption. */
:+/*-
:+ * Critical section handling.
:+ * XXX doesn't belong here.
:+ *
:+ * Entering a critical section only blocks non-fast interrupts.
:+ * critical_enter() is similar to splhigh() in a 2-level spl setup under
:+ * old versions of FreeBSD.
:+ *
:+ * Exiting from all critical sections unblocks non-fast interrupts and runs
:+ * the handlers of any that were blocked.  critical_exit() is similar to
:+ * spl(old_level) in a 2-level spl setup under old versions of FreeBSD.
:+ */
: void
: critical_enter(void)
: {
:-	struct thread *td;
:
:-	td = curthread;
:-	if (td->td_critnest == 0)
:-		td->td_savecrit = cpu_critical_enter();
:-	td->td_critnest++;
:+	curthread->td_critnest++;
: }
:
:@@ -85,12 +92,7 @@
: critical_exit(void)
: {
:-	struct thread *td;
:
:-	td = curthread;
:-	if (td->td_critnest == 1) {
:-		td->td_critnest = 0;
:-		cpu_critical_exit(td->td_savecrit);
:-	} else
:-		td->td_critnest--;
:+	if (--curthread->td_critnest == 0 && (ipending | spending) != 0)
:+		unpend();
: }
:
:Bruce
Re: malloc_bucket() idea (was Re: How to fix malloc.)
* Matthew Dillon [EMAIL PROTECTED] [020223 14:43] wrote:
>     This is approximately what I am thinking.  Note that this gives us
>     the flexibility to create a larger infrastructure around the bucket
>     cache, such as implement per-cpu caches and so on and so forth.
>     What I have here is the minimal implementation.

I strongly object to this implementation right now, please do not commit it.  I already explained to you how to make the problem go away but instead you insist on some complex api that pins memory down like the damn zone allocator.  It's not needed, so please don't do it.

-Alfred
Re: malloc_bucket() idea (was Re: How to fix malloc.)
:
:* Matthew Dillon [EMAIL PROTECTED] [020223 14:43] wrote:
:     This is approximately what I am thinking.  Note that this gives us
:     the flexibility to create a larger infrastructure around the bucket
:     cache, such as implement per-cpu caches and so on and so forth.
:     What I have here is the minimal implementation.
:
:I strongly object to this implementation right now, please do not
:commit it.  I already explained to you how to make the problem
:go away but instead you insist on some complex api that pins
:memory down like the damn zone allocator.  It's not needed, so
:please don't do it.
:
:-Alfred

    Whoa!  Timeout!  I'm not planning on committing any sort of malloc thingy.  That was a 10-second thought experiment.

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
Re: malloc_bucket() idea (was Re: How to fix malloc.)
* Matthew Dillon [EMAIL PROTECTED] [020223 16:41] wrote:
> :
> :* Matthew Dillon [EMAIL PROTECTED] [020223 14:43] wrote:
> :     This is approximately what I am thinking.  Note that this gives us
> :     the flexibility to create a larger infrastructure around the bucket
> :     cache, such as implement per-cpu caches and so on and so forth.
> :     What I have here is the minimal implementation.
> :
> :I strongly object to this implementation right now, please do not
> :commit it.  I already explained to you how to make the problem
> :go away but instead you insist on some complex api that pins
> :memory down like the damn zone allocator.  It's not needed, so
> :please don't do it.
> :
> :-Alfred
>
>     Whoa!  Timeout!  I'm not planning on committing any sort of malloc
>     thingy.  That was a 10-second thought experiment.

Usually when I see diff(1) output from you I expect a commit within the next half hour or so; I just wanted to make myself clear on the issue.  No worries. :)

Yes, and hopefully JeffR's allocator will fix our problems, that is if it ever makes it out of p4. :)

--
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology, start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, 23 Feb 2002, Matthew Dillon wrote:

> :My version of it does less than this.  I only use it to help implement
> :spinlocks.
>
>     You put together infrastructure to deal with pending pci interrupts?
>     If so, then why not commit it (or at least commit a version #ifdef'd
>     for the i386 architecture).

It's too messy and unfinished (doesn't work right for SMP or irqs >= 16),
and difficult to untangle from my other patches.  I posted these partial
ones to attempt to inhibit() recomplication of the current critical*
functions in directions that I don't want to go :-).

> :Index: kern_switch.c
> :===================================================================
> :RCS file: /home/ncvs/src/sys/kern/kern_switch.c,v
> :retrieving revision 1.20
> :diff -u -2 -r1.20 kern_switch.c
> :--- kern_switch.c	11 Feb 2002 20:37:51 -0000	1.20
> :+++ kern_switch.c	13 Feb 2002 05:34:20 -0000
> :@@ -70,14 +70,21 @@
> : }
> :
> :-/* Critical sections that prevent preemption. */
> :+/*-
> :+ * Critical section handling.
> :+ * XXX doesn't belong here.
> :+ *
> :+ * Entering a critical section only blocks non-fast interrupts.
> :+ * critical_enter() is similar to splhigh() in a 2-level spl setup under
> :+ * old versions of FreeBSD.
> :+ *
> :+ * Exiting from all critical sections unblocks non-fast interrupts and runs
> :+ * the handlers of any that were blocked.  critical_exit() is similar to
> :+ * spl(old_level) in a 2-level spl setup under old versions of FreeBSD.
> :+ */
> : void
> : critical_enter(void)
> : {
> :-	struct thread *td;
> :
> :-	td = curthread;
> :-	if (td->td_critnest == 0)
> :-		td->td_savecrit = cpu_critical_enter();
> :-	td->td_critnest++;
> :+	curthread->td_critnest++;
> : }
> :
> :@@ -85,12 +92,7 @@
> : critical_exit(void)
> : {
> :-	struct thread *td;
> :
> :-	td = curthread;
> :-	if (td->td_critnest == 1) {
> :-		td->td_critnest = 0;
> :-		cpu_critical_exit(td->td_savecrit);
> :-	} else
> :-		td->td_critnest--;
> :+	if (--curthread->td_critnest == 0 && (ipending | spending) != 0)
> :+		unpend();
> : }

ipending here works much as in RELENG_4.  It is ORed into by sched_ithd()
if curthread->td_critnest != 0.  Nothing special is needed for pci
(the ICU masks pending interrupts).
Bruce
Re: malloc_bucket() idea (was Re: How to fix malloc.)
    Bruce, I've looked at the vector assembly and it looks like cleaning
    up critical_enter() and critical_exit() for i386 ought to be simple.

    If you have a complete patch set I would like to see it posted for
    review or committed, or if you don't want to deal with the commit I
    would love to have it so I can bang it into shape for commit.  Having
    a sleek critical_enter()/critical_exit() would double sched_lock's
    performance.

					-Matt
Re: malloc_bucket() idea (was Re: How to fix malloc.)
:
:On Sat, 23 Feb 2002, Matthew Dillon wrote:
:
: :My version of it does less than this.  I only use it to help implement
: :spinlocks.
:
:     You put together infrastructure to deal with pending pci interrupts?
:     If so, then why not commit it (or at least commit a version #ifdef'd
:     for the i386 architecture).
:
:It's too messy and unfinished (doesn't work right for SMP or irqs >= 16),
:and difficult to untangle from my other patches.  I posted these partial
:ones to attempt to inhibit() recomplication of the current critical*
:functions in directions that I don't want to go :-).

    Oh, ok.  Hmm.  Well, do you mind if I throw together an
    interrupt-set-cli-in-return-frame patch set?  I think it would wind
    up being just as fast.

					-Matt
Re: malloc_bucket() idea (was Re: How to fix malloc.)
:It's too messy and unfinished (doesn't work right for SMP or irqs >= 16),
:and difficult to untangle from my other patches.  I posted these partial
:ones to attempt to inhibit() recomplication of the current critical*
:functions in directions that I don't want to go :-).
:...
:
:ipending here works much as in RELENG_4.  It is ORed into by sched_ithd()
:if curthread->td_critnest != 0.  Nothing special is needed for pci
:(the ICU masks pending interrupts).
:
:Bruce

    Ok, I have most of it coded up.  Could you explain what 'spending'
    was for?

    I am doing it slightly differently than you.  To avoid having to use
    locked instructions I have placed ipending in the thread structure:

void
critical_enter(void)
{
	struct thread *td = curthread;

	++td->td_critnest;
}

void
critical_exit(void)
{
	struct thread *td = curthread;

	KASSERT(td->td_critnest > 0, ("bad td_critnest value!"));
	if (--td->td_critnest == 0) {
		if (td->td_ipending)
			unpend();
	}
}

    The apic and icu vector code will do a simple td_critnest test and
    OR the irq bit into td->td_ipending (it conveniently already has the
    struct thread in %ebx).  And the vector code will also check and
    handle any non-zero td_ipending if critnest is 0, handling the 1->0
    transition/preemption race.

    I'll post the patch when it gets a little further along.  How should
    I deal with fast interrupts?

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, 23 Feb 2002, Matthew Dillon wrote:

>     Bruce, I've looked at the vector assembly and it looks like cleaning
>     up critical_enter() and critical_exit() for i386 ought to be simple.
>
>     If you have a complete patch set I would like to see it posted for
>     review or committed, or if you don't want to deal with the commit I
>     would love to have it so I can bang it into shape for commit.

Patches for most of my sys tree are in sys.tar.gz in my home directory
on freefall.

>     Having a sleek critical_enter()/critical_exit() would double
>     sched_lock's performance.

I'm not sure about that.  Perhaps I've already optimized sched_lock
enough that I don't notice.  My changes were actually directed at
avoiding sched_lock's interrupt latency and it wasn't until the
critical_enter()/exit() calls pessimized mtx_{lock,unlock}_spin in
-current that I managed to share enough code (the critnest
increment/decrement) to avoid losing efficiency.

Bruce
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, 23 Feb 2002, Matthew Dillon wrote:

> :It's too messy and unfinished (doesn't work right for SMP or irqs >= 16),
> :and difficult to untangle from my other patches.  I posted these partial
> :ones to attempt to inhibit() recomplication of the current critical*
> :functions in directions that I don't want to go :-).
>
>     Oh, ok.  Hmm.  Well, do you mind if I throw together an
>     interrupt-set-cli-in-return-frame patch set?  I think it would wind
>     up being just as fast.

I don't see how that would work.  We need to call critical_enter() from
ordinary code, so there is no free frame pop.

Bruce
Re: malloc_bucket() idea (was Re: How to fix malloc.)
:
:On Sat, 23 Feb 2002, Matthew Dillon wrote:
:
: :It's too messy and unfinished (doesn't work right for SMP or irqs >= 16),
: :and difficult to untangle from my other patches.  I posted these partial
: :ones to attempt to inhibit() recomplication of the current critical*
: :functions in directions that I don't want to go :-).
:
: Oh, ok.  Hmm.  Well, do you mind if I throw together an
: interrupt-set-cli-in-return-frame patch set?  I think it would wind up
: being just as fast.
:
:I don't see how that would work.  We need to call critical_enter() from
:ordinary code, so there is no free frame pop.
:
:Bruce

    I'm trying it your way, with an ipending variable.  The
    set-cli-in-return-frame mechanism would only work well for level
    interrupts.  Basically the interrupt occurs and you clear PS_I in the
    return frame (that was pushed on entry to the interrupt), so the
    interrupted code resumes with interrupts disabled.  critical_exit()
    would then detect the 1->0 transition and simply STI.

    But since we have a combination of level and edge it gets too messy
    for my tastes, so I'm trying it with an ipending variable.

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, 23 Feb 2002, Matthew Dillon wrote:

> :ipending here works much as in RELENG_4.  It is ORed into by sched_ithd()
> :if curthread->td_critnest != 0.  Nothing special is needed for pci
> :(the ICU masks pending interrupts).
> :
> :Bruce
>
>     Ok, I have most of it coded up.  Could you explain what 'spending'
>     was for?

Like the SWI bits in ipending in RELENG_4.  In RELENG_4, we have to pass
around all the bits to spl*(), so we had to pack them in at least some
places (not required in ipending, but efficient).  In -current, packing
is not so important even if it is possible, so I didn't attempt it.

>     I am doing it slightly differently than you.  To avoid having to
>     use locked instructions I have placed ipending in the thread
>     structure:

Mine is global partly for historical reasons.  BTW, I've found that
variables in the thread struct are more fragile than per-cpu globals.
You have to remember to deal with them all before you switch threads.
I needed to clone td_critnest to the next thread in one or two places
(at least one related to the complications for forking).

> void
> critical_enter(void)
> {
> 	struct thread *td = curthread;
>
> 	++td->td_critnest;
> }
>
> void
> critical_exit(void)
> {
> 	struct thread *td = curthread;
>
> 	KASSERT(td->td_critnest > 0, ("bad td_critnest value!"));
> 	if (--td->td_critnest == 0) {
> 		if (td->td_ipending)
> 			unpend();
> 	}
> }
>
>     The apic and icu vector code will do a simple td_critnest test and
>     OR the irq bit into td->td_ipending (it conveniently already has
>     the struct thread in %ebx).  And the vector code will also check and

I do this at the start of sched_ithd().  This is efficient enough
because the nested case is rare.

>     handle any non-zero td_ipending if critnest is 0, handling the 1->0
>     transition/preemption race.
>
>     I'll post the patch when it gets a little further along.  How
>     should I deal with fast interrupts?

Hmm, this gets complicated.  First, you need the td_critnest test in
Xfastintr* (do it before Xfastintr* calls critical_enter()).  It is
important in -current (although bad for latency) that critical_enter()
keeps masking even fast interrupts (although it doesn't do it in
software).  Next, you need to mask the fast interrupt.  It probably
needs to be masked in the ICU/APIC on i386's, so it will become
not-so-fast.  Finally, unpend() needs to do more for fast interrupts:
- give them highest priority
- unmask them in the ICU/APIC/wherever, either before or after calling
  the handler.

Bruce
Re: malloc_bucket() idea (was Re: How to fix malloc.)
On Sat, Feb 23, 2002 at 03:12:36PM -0800, Matthew Dillon wrote:
> :OTOH, A per CPU bucket list means no bucket mtx would be necessary;
> :without that, it's just replacing one type of contention for another,
> :I think, until you start making mutex selection indeterminate. 8^(.
> :
> :Really, this delayed freeing idea is starting to get uglier than
> :just going to per CPU pools...
> :
> :-- Terry
>
>     Well, if we want to rewrite malloc(), a per-cpu pool inside a
>     critical section would not be that difficult.  The 'bucket' array
>     would simply be placed in the per-cpu area.  However, with malloc()
>     we still have a serious problem with the malloc_type structure
>     statistics.  There would have to be per-cpu information for those
>     as well.

Someone (jeffr) is already working on a new allocator to hopefully [at
least] eventually replace malloc(9) and various other kernel allocators.
It will support per CPU working-sets and some cache friendly goodies.

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]