Lars Heidieker <l...@heidieker.de> wrote:
> The patch provided includes the changed extent as well as a changed
> implementation of kmem, which provides page-aligned memory for
> allocations >= PAGE_SIZE.  The interface between the pool subsystem
> and uvm_km is changed by passing the pool_allocator's page size to
> the uvm_km functions; the idea behind this is to have multiple
> default pool_allocators with different pool page sizes to lower
> internal fragmentation within the pools.
> In order to support those different pool page sizes, the kernel_map
> and kmem_map gained caches for virtual addresses not only for
> PAGE_SIZE, but for low integer multiples of PAGE_SIZE.
> These larger-than-PAGE_SIZE caches are used by the kmem allocations
> that are larger than PAGE_SIZE.
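If I am reading this right, the larger-than-PAGE_SIZE path amounts to
picking a KVA cache by the rounded-up page count, along these lines
(a rough sketch only - the names below are mine, not the patch's):

	/*
	 * Illustrative only: map an allocation size to one of N KVA
	 * caches, where cache i serves requests of (i + 1) * PAGE_SIZE
	 * bytes.  VA_CACHE_COUNT and va_cache[] are made-up names.
	 */
	#define	VA_CACHE_COUNT	4	/* caches for 1..4 * PAGE_SIZE */

	static pool_cache_t va_cache[VA_CACHE_COUNT];

	static pool_cache_t
	va_cache_for_size(vsize_t size)
	{
		u_int idx;

		KASSERT(size > 0);
		idx = (round_page(size) >> PAGE_SHIFT) - 1;
		if (idx >= VA_CACHE_COUNT) {
			/* Too big: fall back to direct uvm_km_alloc(). */
			return NULL;
		}
		return va_cache[idx];
	}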
Concerns/questions:

- What are the effects on fragmentation of replacing vmem(9)?  Do you
  have some numbers showing that kmem_cache_big_sizes is doing a good
  job?
- The patch decreases the variety of small-sized caches (e.g. no more
  8, 24, 40, 56, etc).  On what basis?  Was it measured?  It might be
  sensitive!  See here:
  http://mail-index.netbsd.org/tech-kern/2009/01/11/msg003989.html
- Can you try regress/sys/kern/allocfree/allocfree.c with your patch?

If yamt@ says OK to replacing the use of vmem(9), as he probably has
the best understanding of the fragmentation issues, then you have a
green light for the patch! :)

Also, I suppose by mistake, the patch has removed kmem_poison_check()
and the following checks:

-	KASSERT(!cpu_intr_p());
-	KASSERT(!cpu_softintr_p());

These are must-have, critical mechanisms for catching bugs!

uvm_km_alloc_poolpage_cache() and uvm_km_alloc_poolpage() now take a
size argument.  Please KASSERT() that the size is divisible by
PAGE_SIZE.

Also, in uvm_km_free_poolpage_cache(), the following was removed:

-#if defined(DEBUG)
-	pmap_update(pmap_kernel());
-#endif

Since the interaction between the layers here is already a bit
confusing, this definitely needs a comment, i.e. why leaving a stale
entry is OK.  Which is because of the KVA reservation (as it is a KVA
cache), where the TLB flush would be performed in km_vacache_free().
(A case of KVA starvation, normally?)

> http://ftp.netbsd.org/pub/NetBSD/misc/para/kmem-pool-uvm-extent.patch

A few other comments on the patch.  Please use the diff -p option when
generating patches.  Also, splitting the mechanical and the functional
parts into separate diffs would ease review, and e.g. the malloc to
kmem conversions can/should be committed separately.

-	hd = malloc(sizeof(*hd), M_DEVBUF, M_NOWAIT);
+	hd = kmem_alloc(sizeof(struct hook_desc), KM_NOSLEEP);

All such M_NOWAIT cases should be audited for uses from interrupt
context, and if there are none - changed to KM_SLEEP (the check for
NULL can be removed too).  Generally, there should be no KM_NOSLEEP
uses without a very good reason.

-	t = malloc(len + sizeof(struct m_tag), M_PACKET_TAGS, wait);
+	t = kmem_alloc(len + sizeof(struct m_tag),
+	    wait ? KM_SLEEP : KM_NOSLEEP);

mbuf tags are created from interrupt context, so this is a bug (those
removed KASSERT()s would have caught it)!

-		free(corename, M_TEMP);
+		kmem_free(corename, strlen(corename) + 1);

Hmm, these are a bit inefficient.  I have been thinking about this..
For nearly all cases, the size is either constant, or can be/normally
is saved in some structure; strings are the only real, justified case
for storing the size within the allocation itself, e.g. in a byte or
word before the actual string.  So perhaps add kmem_stralloc() and
kmem_strfree()?  (A rough sketch follows below.)

+	size += REDZONE_SIZE + SIZE_SIZE;
+	if ((index = ((size - 1) >> KMEM_TINY_SHIFT)) < (KMEM_TINY_MAXSIZE >> KMEM_TINY_SHIFT)) {
+		pc = kmem_cache_tiny[index];
+	} else if ((index = ((size - 1) >> KMEM_BIG_SHIFT)) < (KMEM_BIG_MAXSIZE >> KMEM_BIG_SHIFT)) {
+		pc = kmem_cache_big[index];
+	} else {
+		uvm_km_free(kernel_map, (vaddr_t)p, round_page(size), UVM_KMF_WIRED);
+		return;
+	}

Please KNF and clean up the whole patch.  Lines should be no longer
than 80 characters.  When wrapping long lines, second-level indents
get four extra spaces (not a tab).  There are many cases in the patch
with trailing, added or missing whitespace, etc.

-	 * to only a part of an amap).  if the malloc of the array fails
+	 * to only a part of an amap).  if the kmem of the array fails

In such places use the word "allocation"; there is no need to be
allocator-specific. :)
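Here is the kmem_stralloc()/kmem_strfree() idea as a rough sketch (the
interface and the names are just a proposal, not existing code):

	/*
	 * Proposal sketch: store the length in a size_t in front of
	 * the string, so the caller does not have to carry the size
	 * around for the eventual kmem_free().
	 */
	char *
	kmem_stralloc(const char *str, km_flag_t kmflags)
	{
		size_t len = strlen(str) + 1;
		size_t *hdr;

		hdr = kmem_alloc(sizeof(size_t) + len, kmflags);
		if (hdr == NULL)
			return NULL;
		*hdr = len;
		memcpy(hdr + 1, str, len);
		return (char *)(hdr + 1);
	}

	void
	kmem_strfree(char *str)
	{
		size_t *hdr = (size_t *)str - 1;

		kmem_free(hdr, sizeof(size_t) + *hdr);
	}

With that, the coredump case above would become a plain
kmem_strfree(corename), with no strlen() at the call site.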
	rw_enter(&swap_syscall_lock, RW_WRITER);
-	userpath = malloc(SWAP_PATH_MAX, M_TEMP, M_WAITOK);
+	userpath = kmem_alloc(SWAP_PATH_MAX, KM_SLEEP);
..
-	free(userpath, M_TEMP);
+	kmem_free(userpath, SWAP_PATH_MAX);
	rw_exit(&swap_syscall_lock);

This is an opportunity to move the kmem_*() calls outside the lock, as
that is safe!  (See the sketch below.)

-	sdp = malloc(sizeof *sdp, M_VMSWAP, M_WAITOK);
-	spp = malloc(sizeof *spp, M_VMSWAP, M_WAITOK);
+	sdp = kmem_alloc(sizeof *sdp, KM_SLEEP);
+	spp = kmem_alloc(sizeof *spp, KM_SLEEP);
	memset(sdp, 0, sizeof(*sdp));

KNF: sizeof *sdp -> sizeof(*sdp).  Also: -memset(), +kmem_zalloc().
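For the swapctl case, a minimal sketch of what I mean (assuming the
buffer is not needed before rw_enter(), which appears to be the case):

	/* Allocate and free outside of swap_syscall_lock. */
	userpath = kmem_alloc(SWAP_PATH_MAX, KM_SLEEP);
	rw_enter(&swap_syscall_lock, RW_WRITER);
	...
	rw_exit(&swap_syscall_lock);
	kmem_free(userpath, SWAP_PATH_MAX);

--
Mindaugas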