Hi, there's something terribly sad with the kmem API: kmem_free takes a size argument. It has I think two major drawbacks:
- the fact that the caller needs to know the allocated size often leads to strange hacks in structures, where fields are added just to hold allocated sizes. Accumulating hacks sometimes leads to great bugs - see NetBSD-SA2014-014 - and omissions - like in ffs_unmount(), currently. - the fact that the allocator does not "know" the allocated size is efficient from a memory pov - it doesn't need more memory space to hold the size -, but makes kmem_alloc(0) illegal. That is something that could be more or less fixed by re-indexing the kmem cache array, but then one more 8-byte cache would be necessary, and we probably end up losing the memory optimisation. We currently have malloc()/free(), which at a first glance would be the solution when too many hacks take place. Moreover malloc(0) is not illegal: in all cases a header is allocated. But malloc is deprecated. Let's give a glance at the man page and the code: - malloc_roundup() and malloc_type_setlimit() are documented in man while they don't even exist. - malloc_type_attach(), malloc_type_detach(), MALLOC_DEFINE_LIMIT(), MALLOC_JUSTDEFINE_LIMIT(), MALLOC_DEFINE(), MALLOC_JUSTDEFINE() and MALLOC_DECLARE() are dead ops. - <sys/mallocvar.h> and <sys/malloc.h> are still included in many places where no call is made to either malloc() or free(). - kern_malloc() allocates memory through kmem_alloc, therefore the M_CANFAIL flag is lost, which means that malloc() may behave unexpectedly. You can understand that the malloc API is dead, and that no attempt has been made to switch correctly to kmem_alloc, or to re-implement something clean. That's what motivates my proposal: creating kmem_valloc and kmem_vfree, to definitely get rid of malloc, and have another set of functions that can perform allocations/deallocations without needing a size argument. The 'v' stands for "variable". The implementation is simple: - allocate 8 more bytes to hold this structure: struct kmem_header { size_t size; } __aligned(KMEM_ALIGN); and put the allocated size in the field. It is KMEM_ALIGN-aligned. Then return (header + 1). - when freeing: do (given_pointer - 1), and call kmem_free() with the size in the field. That's mostly what kern_malloc() does, but it is consistent and sometimes consumes less memory - kern_malloc may allocate one more page (PAGE_SIZE). As some of you may have noticed, some recent Security Advisories were related to kmem. And now there's this issue in ffs_unmount(); and more bugs will come. Here is a patch which implements kmem_valloc. Comments? Index: kern/subr_kmem.c =================================================================== RCS file: /cvsroot/src/sys/kern/subr_kmem.c,v retrieving revision 1.60 diff -u -r1.60 subr_kmem.c --- kern/subr_kmem.c 22 Jul 2014 07:38:41 -0000 1.60 +++ kern/subr_kmem.c 8 Nov 2014 06:12:44 -0000 @@ -173,6 +173,10 @@ #define KMEM_BIG_MAXSIZE 16384 #define KMEM_CACHE_BIG_COUNT (KMEM_BIG_MAXSIZE >> KMEM_BIG_SHIFT) +struct kmem_header { + size_t size; +} __aligned(KMEM_ALIGN); + static pool_cache_t kmem_cache_big[KMEM_CACHE_BIG_COUNT] __cacheline_aligned; static size_t kmem_cache_big_maxidx __read_mostly; @@ -206,9 +210,6 @@ #endif /* defined(KMEM_REDZONE) */ #if defined(KMEM_SIZE) -struct kmem_header { - size_t size; -} __aligned(KMEM_ALIGN); #define SIZE_SIZE sizeof(struct kmem_header) static void kmem_size_set(void *, size_t); static void kmem_size_check(void *, size_t); @@ -406,6 +407,48 @@ kmem_intr_free(p, size); } +/* + * kmem_valloc: allocate wired memory. + * => must not be called from interrupt context. + */ + +void * +kmem_valloc(size_t size, km_flag_t kmflags) +{ + struct kmem_header *hd; + size_t allocsize; + void *p; + + KASSERTMSG((!cpu_intr_p() && !cpu_softintr_p()), + "kmem(9) should not be used from the interrupt context"); + + allocsize = size + sizeof(struct kmem_header); + if ((p = kmem_intr_alloc(allocsize, kmflags)) == NULL) + return NULL; + hd = (struct kmem_header *)p; + hd->size = allocsize; + + return hd + 1; +} + +/* + * kmem_vfree: free wired memory allocated by kmem_valloc. + * => must not be called from interrupt context. + */ + +void +kmem_vfree(void *p) +{ + struct kmem_header *hd; + + KASSERT(!cpu_intr_p()); + KASSERT(!cpu_softintr_p()); + + hd = (struct kmem_header *)p; + hd -= 1; + kmem_intr_free((uint8_t *)hd, hd->size); +} + static size_t kmem_create_caches(const struct kmem_cache_info *array, pool_cache_t alloc_table[], size_t maxsize, int shift, int ipl) Index: sys/kmem.h =================================================================== RCS file: /cvsroot/src/sys/sys/kmem.h,v retrieving revision 1.9 diff -u -r1.9 kmem.h --- sys/kmem.h 5 Feb 2012 03:40:08 -0000 1.9 +++ sys/kmem.h 8 Nov 2014 06:12:44 -0000 @@ -40,6 +40,9 @@ void * kmem_zalloc(size_t, km_flag_t); void kmem_free(void *, size_t); +void * kmem_valloc(size_t, km_flag_t); +void kmem_vfree(void *); + void * kmem_intr_alloc(size_t, km_flag_t); void * kmem_intr_zalloc(size_t, km_flag_t); void kmem_intr_free(void *, size_t);