Does anyone have Phong Vo <[email protected]>'s new email address? The following work (i.e. the per thread caching etc) in Illumos might be very interesting...
Forwarded conversation Subject: [developer] 4489-4491 ptcumem and other libumem enhancements ------------------------ From: Robert Mustacchi <[email protected]> Date: Thu, Jan 16, 2014 at 3:59 AM To: illumos Developer <[email protected]> This combines work done over the past few years at Joyent enhancing libumem. It's broken down into two commits. One which adds 128k slabs and makes VM_BESTFIT the default, the other which adds per thread caching umem. For more on umem, see http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/. https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html Robert ------------------------------------------- illumos-developer Archives: https://www.listbox.com/member/archive/182179/=now RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175106-197fca96 Modify Your Subscription: https://www.listbox.com/member/?member_id=21175106&id_secret=21175106-3d886b59 Powered by Listbox: http://www.listbox.com ---------- From: Dan McDonald <[email protected]> Date: Mon, Jan 20, 2014 at 4:33 AM To: Robert Mustacchi <[email protected]> Cc: illumos Developer <[email protected]> On Jan 15, 2014, at 9:59 PM, Robert Mustacchi <[email protected]> wrote: > This combines work done over the past few years at Joyent enhancing > libumem. It's broken down into two commits. One which adds 128k slabs > and makes VM_BESTFIT the default, the other which adds per thread > caching umem. For more on umem, see > http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/. > > https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html I mostly understood this except for one constant, and I suspect that's me being dense. I also didn't find much wrong, save for one possible attack-surface reduction. Dan ----------- exception_lists/check_rtime usr/src/cmd/mdb/common/modules/libc/libc.c usr/src/cmd/mdb/common/modules/libumem/libumem.c usr/src/cmd/mdb/common/modules/libumem/umem.c usr/src/cmd/mdb/intel/amd64/libumem/Makefile usr/src/cmd/mdb/intel/ia32/libumem/Makefile usr/src/cmd/mdb/sparc/v7/libumem/Makefile usr/src/cmd/mdb/sparc/v9/libumem/Makefile usr/src/lib/libc/amd64/Makefile usr/src/lib/libc/i386/Makefile.com usr/src/lib/libc/port/mapfile-vers usr/src/lib/libc/port/threads/thr.c usr/src/lib/libc/sparc/Makefile.com usr/src/lib/libc/sparcv9/Makefile.com usr/src/lib/libumem/Makefile.com usr/src/lib/libumem/common/envvar.c usr/src/lib/libumem/common/malloc.c usr/src/lib/libumem/common/stub_stand.c usr/src/lib/libumem/common/umem_base.h usr/src/lib/libumem/common/umem_impl.h usr/src/lib/libumem/common/vmem_base.c usr/src/lib/libumem/i386/asm_subr.s usr/src/lib/libc/port/threads/tmem.c usr/src/man/man3malloc/umem_alloc.3malloc * File look good, no comments needed. usr/src/lib/libc/inc/thr_uberdata.h * Why 16 for tm_roots? Ahhh, it's on line 405-408 of umem.c. Mention the big-theory statement, please? usr/src/lib/libumem/amd64/umem_genasm.c * Line 567: Grammar nit: "it's" ("it is", not "single-byte addl's MAX_UINT32") usr/src/lib/libumem/common/mapfile-vers * Have you thought through the security considerations of allowing asm_subr.o's text segment to be writable? Or am I worrying about nothing? Worst case, you create a new .o that JUST has these writeable malloc linkages in them, I think. usr/src/lib/libumem/common/umem.c * Sorry for being dense, but 16 for TMNUMROOTS still seems arbitrary. Please enlighten me, or tell me where in here or in umem_genasm.c I missed it?! usr/src/lib/libumem/i386/umem_genasm.c * Lines 32 & 557: it's/its grammar. usr/src/lib/libumem/sparc/umem_genasm.c * WHAT?!? NO SPARC?!? okay okay... put the baseball bat down... just kidding. :) ---------- From: Josef 'Jeff' Sipek <[email protected]> Date: Mon, Jan 20, 2014 at 4:30 PM To: Robert Mustacchi <[email protected]> Cc: illumos Developer <[email protected]> On Wed, Jan 15, 2014 at 06:59:29PM -0800, Robert Mustacchi wrote: > This combines work done over the past few years at Joyent enhancing > libumem. It's broken down into two commits. One which adds 128k slabs > and makes VM_BESTFIT the default, the other which adds per thread > caching umem. For more on umem, see > http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/. > > https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html Fascinating read in umem.c... I'm not exactly thrilled about some of the arch specifics in common code. Yes, I know we support only x86 and sparc (and no one actually uses sparc) and the ptcumem changes are supported only on x86. For example, line 489 mentions 5-byte jump. Should some of this architecture specific documentation move into the architecture specific files? Ah, I see that there's some there. I feel like having arch specific details in the common files is begging for contradicting comments down the line as one gets updated but the other doesn't. umem.c:552: is that supposed to be s/makefile/mapfile/? would it make sense to make umem_genasm_supported const? stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar with the requirements for stubs.) malloc.c: Is there any "config" in illumos so that places like this can check for a (compile time) feature instead of an architecture? Jeff. -- Once you have their hardware. Never give it back. (The First Rule of Hardware Acquisition) ---------- From: Robert Mustacchi <[email protected]> Date: Mon, Jan 20, 2014 at 5:42 PM To: Josef 'Jeff' Sipek <[email protected]> Cc: illumos Developer <[email protected]> On 01/20/2014 07:30 AM, Josef 'Jeff' Sipek wrote: > On Wed, Jan 15, 2014 at 06:59:29PM -0800, Robert Mustacchi wrote: >> This combines work done over the past few years at Joyent enhancing >> libumem. It's broken down into two commits. One which adds 128k slabs >> and makes VM_BESTFIT the default, the other which adds per thread >> caching umem. For more on umem, see >> http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/. >> >> https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html > > Fascinating read in umem.c... > > I'm not exactly thrilled about some of the arch specifics in common code. > Yes, I know we support only x86 and sparc (and no one actually uses sparc) > and the ptcumem changes are supported only on x86. For example, line 489 > mentions 5-byte jump. Should some of this architecture specific > documentation move into the architecture specific files? Ah, I see that > there's some there. I feel like having arch specific details in the common > files is begging for contradicting comments down the line as one gets > updated but the other doesn't. It's not a perfect world, and I don't think there's a great answer. The vast majority of that block of text that I added into the big theory statement is equally viable across all architectures that illumos does or may some day support. If someone wanted to port this to sparc, which I'd welcome, or we enter the mystical future where there are other architectures that illumos supports, then the contents there tell the writer how they should go about doing this, the parts about breaking down the generated asm into different portions are the same as what you'd do across any architecture. There are a few places where we delve into x86 specifics (8.2 and 8.3), which I've updated to try to generalize them. I'd really rather not break out the vast majority of this into the per-architecture implementations as otherwise I fear we'll hit your fears much sooner, as we'll actually be duplicating content. > umem.c:552: is that supposed to be s/makefile/mapfile/? Yeah, it is, thanks. > would it make sense to make umem_genasm_supported const? I don't think it really changes anything, but sure, I can do that. > stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar > with the requirements for stubs.) So, no, not really. These are basically just used for kmdb and aren't atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it will be more consistent, I could do the same rename tricks and move its stub implementation into there and do the same rename tricks. > malloc.c: Is there any "config" in illumos so that places like this can > check for a (compile time) feature instead of an architecture? We only have that in terms of features of the architecture and processor or langauge, see <sys/isa_defs.h> and <sys/feature_tests.h>. So, for better or worse, this is what we do in the rest of the system. I've updated the webrev in place. Robert ---------- From: Josef 'Jeff' Sipek <[email protected]> Date: Mon, Jan 20, 2014 at 8:16 PM To: Robert Mustacchi <[email protected]> Cc: illumos Developer <[email protected]> Generalizing is what I was looking for. > > umem.c:552: is that supposed to be s/makefile/mapfile/? > > Yeah, it is, thanks. > > > would it make sense to make umem_genasm_supported const? > > I don't think it really changes anything, but sure, I can do that. .data vs .rodata? Since it's not a tunable might as well have the compiler and kernel (assuming non-writable .rodata) tell us about any bad behavior. Should umem_base.h be updated too? > > stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar > > with the requirements for stubs.) > > So, no, not really. These are basically just used for kmdb and aren't > atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in > cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it will be more > consistent, I could do the same rename tricks and move its stub > implementation into there and do the same rename tricks. I don't think I quite have the big picture here to decide which way makes more sense, so I leave it up to you to decide. (I just made sure to mention it in case it did need to be atomic.) > > malloc.c: Is there any "config" in illumos so that places like this can > > check for a (compile time) feature instead of an architecture? > > We only have that in terms of features of the architecture and processor > or langauge, see <sys/isa_defs.h> and <sys/feature_tests.h>. So, for > better or worse, this is what we do in the rest of the system. Yeah, that's what I thought. Some more feedback: umem_genasm.c:116: 0x0 -> 0x00 Overall, LGTM. Jeff. -- Mankind invented the atomic bomb, but no mouse would ever construct a mousetrap. - Albert Einstein ---------- From: Robert Mustacchi <[email protected]> Date: Mon, Jan 20, 2014 at 9:08 PM To: Josef 'Jeff' Sipek <[email protected]> Cc: illumos Developer <[email protected]> On 1/20/14 11:16 , Josef 'Jeff' Sipek wrote: > Should umem_base.h be updated too? Yes, I forgot about that. Thanks. >>> stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar >>> with the requirements for stubs.) >> >> So, no, not really. These are basically just used for kmdb and aren't >> atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in >> cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it will be more >> consistent, I could do the same rename tricks and move its stub >> implementation into there and do the same rename tricks. > > I don't think I quite have the big picture here to decide which way makes > more sense, so I leave it up to you to decide. (I just made sure to mention > it in case it did need to be atomic.) I've opted to do the latter to be more consistent. > Some more feedback: > > umem_genasm.c:116: 0x0 -> 0x00 Fixed. Webrev updated in place. Robert ---------- From: Josef 'Jeff' Sipek <[email protected]> Date: Mon, Jan 20, 2014 at 9:24 PM To: Robert Mustacchi <[email protected]> Cc: illumos Developer <[email protected]> On Mon, Jan 20, 2014 at 12:08:49PM -0800, Robert Mustacchi wrote: > On 1/20/14 11:16 , Josef 'Jeff' Sipek wrote: > > Should umem_base.h be updated too? > > Yes, I forgot about that. Thanks. > > >>> stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not > >>> familiar > >>> with the requirements for stubs.) > >> > >> So, no, not really. These are basically just used for kmdb and aren't > >> atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in > >> cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it will be more > >> consistent, I could do the same rename tricks and move its stub > >> implementation into there and do the same rename tricks. > > > > I don't think I quite have the big picture here to decide which way makes > > more sense, so I leave it up to you to decide. (I just made sure to mention > > it in case it did need to be atomic.) > > I've opted to do the latter to be more consistent. Turns out that I have no idea how the rename tricks work. Assuming that it works, LGTM. Jeff. > > Some more feedback: > > > > umem_genasm.c:116: 0x0 -> 0x00 > > Fixed. > > Webrev updated in place. > > Robert -- *NOTE: This message is ROT-13 encrypted twice for extra protection* ---------- From: Robert Mustacchi <[email protected]> Date: Tue, Jan 21, 2014 at 12:21 AM To: Dan McDonald <[email protected]> Cc: illumos Developer <[email protected]> Hey Dan, Responses inline. On 1/19/14 19:33 , Dan McDonald wrote: > usr/src/lib/libc/inc/thr_uberdata.h > > * Why 16 for tm_roots? Ahhh, it's on line 405-408 of umem.c. Mention > the big-theory statement, please? I've added a comment to direct folks there. > usr/src/lib/libumem/common/mapfile-vers > > * Have you thought through the security considerations of allowing > asm_subr.o's text segment to be writable? Or am I worrying about nothing? > Worst case, you create a new .o that JUST has these writeable malloc > linkages in them, I think. If we wanted to do anything, we could mprotect it after the work by umem_genasm(). That'd actually solve the problems that this could cause. I think that's a reasonable low priority follow up RFE. > usr/src/lib/libumem/common/umem.c > > * Sorry for being dense, but 16 for TMNUMROOTS still seems arbitrary. Please > enlighten me, or tell me where in here or in umem_genasm.c I missed it?! Well, it isn't properly explained, so I'll try and add something to the comments. It is slightly arbitrary. But it was done based on the size of the allocations and what sizes would be covered by the per-thread cache and the likelihood we've seen for applications to allocate those sizes. This means <= 256 bytes and <=448 bytes will be cached for i386 and amd64 respectively. Basically, beyond those sizes you start getting to more serious sized objects being allocated which are much more noticeable in other systems. So beyond that size we basically felt that the efficacy of this was going to decrease and that folks who are doing larger sized allocations are going to be much more noticeable and less likely to just do it every time by accident. I've updated the webrev in place. Robert -- Irek _______________________________________________ ast-developers mailing list [email protected] http://lists.research.att.com/mailman/listinfo/ast-developers
