Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/25/2013 11:18 PM, Seth Jennings wrote:
> On 02/23/2013 06:37 PM, Ric Mason wrote:
>> On 02/23/2013 05:02 AM, Seth Jennings wrote:
>>> On 02/21/2013 08:56 PM, Ric Mason wrote:
>>>> On 02/21/2013 11:50 PM, Seth Jennings wrote:
>>>>> On 02/21/2013 02:49 AM, Ric Mason wrote:
>>>>>> On 02/19/2013 03:16 AM, Seth Jennings wrote:
>>>>>>> On 02/16/2013 12:21 AM, Ric Mason wrote:
>>>>>>>> On 02/14/2013 02:38 AM, Seth Jennings wrote:
>>>>>>>>> This patch adds a documentation file for zsmalloc at
>>>>>>>>> Documentation/vm/zsmalloc.txt
>>>>>>>>>
>>>>>>>>> Signed-off-by: Seth Jennings
>>>>>>>>> ---
>>>>>>>>>  Documentation/vm/zsmalloc.txt | 68 +
>>>>>>>>>  1 file changed, 68 insertions(+)
>>>>>>>>>  create mode 100644 Documentation/vm/zsmalloc.txt
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
>>>>>>>>> new file mode 100644
>>>>>>>>> index 000..85aa617
>>>>>>>>> --- /dev/null
>>>>>>>>> +++ b/Documentation/vm/zsmalloc.txt
>>>>>>>>> @@ -0,0 +1,68 @@
>>>>>>>>> +zsmalloc Memory Allocator
>>>>>>>>> +
>>>>>>>>> +Overview
>>>>>>>>> +
>>>>>>>>> +zmalloc a new slab-based memory allocator,
>>>>>>>>> +zsmalloc, for storing compressed pages. It is designed for
>>>>>>>>> +low fragmentation and high allocation success rate on
>>>>>>>>> +large object, but <= PAGE_SIZE allocations.
>>>>>>>>> +
>>>>>>>>> +zsmalloc differs from the kernel slab allocator in two primary
>>>>>>>>> +ways to achieve these design goals.
>>>>>>>>> +
>>>>>>>>> +zsmalloc never requires high order page allocations to back
>>>>>>>>> +slabs, or "size classes" in zsmalloc terms. Instead it allows
>>>>>>>>> +multiple single-order pages to be stitched together into a
>>>>>>>>> +"zspage" which backs the slab. This allows for higher allocation
>>>>>>>>> +success rate under memory pressure.
>>>>>>>>> +
>>>>>>>>> +Also, zsmalloc allows objects to span page boundaries within the
>>>>>>>>> +zspage. This allows for lower fragmentation than could be had
>>>>>>>>> +with the kernel slab allocator for objects between PAGE_SIZE/2
>>>>>>>>> +and PAGE_SIZE. With the kernel slab allocator, if a page compresses
>>>>>>>>> +to 60% of it original size, the memory savings gained through
>>>>>>>>> +compression is lost in fragmentation because another object of
>>>>>>>>> +the same size can't be stored in the leftover space.
>>>>>>>>> +
>>>>>>>>> +This ability to span pages results in zsmalloc allocations not being
>>>>>>>>> +directly addressable by the user. The user is given an
>>>>>>>>> +non-dereferencable handle in response to an allocation request.
>>>>>>>>> +That handle must be mapped, using zs_map_object(), which returns
>>>>>>>>> +a pointer to the mapped region that can be used. The mapping is
>>>>>>>>> +necessary since the object data may reside in two different
>>>>>>>>> +noncontigious pages.
>>>>>>>>
>>>>>>>> Do you mean the reason of to use a zsmalloc object must map after
>>>>>>>> malloc is object data maybe reside in two different nocontiguous
>>>>>>>> pages?
>>>>>>>
>>>>>>> Yes, that is one reason for the mapping. The other reason (more of
>>>>>>> an added bonus) is below.
>>>>>>>
>>>>>>>>> +
>>>>>>>>> +For 32-bit systems, zsmalloc has the added benefit of being
>>>>>>>>> +able to back slabs with HIGHMEM pages, something not possible
>>>>>>>>
>>>>>>>> What's the meaning of "back slabs with HIGHMEM pages"?
>>>>>>>
>>>>>>> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit
>>>>>>> systems with larger that 1GB (actually a little less) of RAM. The
>>>>>>> upper 3GB of the 4GB address space, depending on kernel build
>>>>>>> options, is not directly addressable by the kernel, but can be
>>>>>>> mapped into the kernel address space with functions like kmap() or
>>>>>>> kmap_atomic().
>>>>>>>
>>>>>>> These pages can't be used by slab/slub because they are not
>>>>>>> continuously mapped into the kernel address space. However, since
>>>>>>> zsmalloc requires a mapping anyway to handle objects that span
>>>>>>> non-contiguous page boundaries, we do the kernel mapping as part of
>>>>>>> the process.
>>>>>>>
>>>>>>> So zspages, the conceptual slab in zsmalloc backed by single-order
>>>>>>> pages can include pages from the HIGHMEM zone as well.
>>>>>>
>>>>>> Thanks for your clarify,
>>>>>> http://lwn.net/Articles/537422/, your article about zswap in lwn.
>>>>>> "Additionally, the kernel slab allocator does not allow objects that
>>>>>> are less than a page in size to span a page boundary. This means that
>>>>>> if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively use an
>>>>>> entire page, resulting in ~50% waste. Hense there are *no kmalloc()
>>>>>> cache size* between PAGE_SIZE/2 and PAGE_SIZE."
>>>>>> Are your sure? It seems that kmalloc cache support big size, your can
>>>>>> check in include/linux/kmalloc_sizes.h
>>>>>
>>>>> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
>>>>> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
>>>>> system with 4k pages, there are no caches between kmalloc-2048 and
>>>>> kmalloc-4096.
>>>>
>>>> kmalloc object > PAGE_SIZE/2 or > PAGE_SIZE should also allocate from
>>>> slab cache, correct? Then how can alloc object w/o slab cache which
>>>> contains this object size objects?
>>>
>>> I have to admit, I didn't understand the question.
>>
>> object is allocated from slab cache, correct? There two kinds of slab
>> cache, one is for general purpose, eg. kmalloc slab cache, the other is
>> for special purpose, eg. mm_struct, task_struct. kmalloc object >
>> PAGE_SIZE/2 or > PAGE_SIZE should also allocated from slab cache,
>> correct? then why you said that there are no caches between
>> kmalloc-2048 and kmalloc-4096?
>
> Ok, now I get it.
>
> Yes, I guess I should qualified here that there are no _kmalloc_ caches
> between PAGE_SIZE/2 and
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/23/2013 06:37 PM, Ric Mason wrote:
> On 02/23/2013 05:02 AM, Seth Jennings wrote:
>> On 02/21/2013 08:56 PM, Ric Mason wrote:
>>> On 02/21/2013 11:50 PM, Seth Jennings wrote:
>>>> On 02/21/2013 02:49 AM, Ric Mason wrote:
>>>>> On 02/19/2013 03:16 AM, Seth Jennings wrote:
>>>>>> On 02/16/2013 12:21 AM, Ric Mason wrote:
>>>>>>> On 02/14/2013 02:38 AM, Seth Jennings wrote:
>>>>>>>> This patch adds a documentation file for zsmalloc at
>>>>>>>> Documentation/vm/zsmalloc.txt
>>>>>>>>
>>>>>>>> Signed-off-by: Seth Jennings
>>>>>>>> ---
>>>>>>>>  Documentation/vm/zsmalloc.txt | 68 +
>>>>>>>>  1 file changed, 68 insertions(+)
>>>>>>>>  create mode 100644 Documentation/vm/zsmalloc.txt
>>>>>>>>
>>>>>>>> diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
>>>>>>>> new file mode 100644
>>>>>>>> index 000..85aa617
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/Documentation/vm/zsmalloc.txt
>>>>>>>> @@ -0,0 +1,68 @@
>>>>>>>> +zsmalloc Memory Allocator
>>>>>>>> +
>>>>>>>> +Overview
>>>>>>>> +
>>>>>>>> +zmalloc a new slab-based memory allocator,
>>>>>>>> +zsmalloc, for storing compressed pages. It is designed for
>>>>>>>> +low fragmentation and high allocation success rate on
>>>>>>>> +large object, but <= PAGE_SIZE allocations.
>>>>>>>> +
>>>>>>>> +zsmalloc differs from the kernel slab allocator in two primary
>>>>>>>> +ways to achieve these design goals.
>>>>>>>> +
>>>>>>>> +zsmalloc never requires high order page allocations to back
>>>>>>>> +slabs, or "size classes" in zsmalloc terms. Instead it allows
>>>>>>>> +multiple single-order pages to be stitched together into a
>>>>>>>> +"zspage" which backs the slab. This allows for higher allocation
>>>>>>>> +success rate under memory pressure.
>>>>>>>> +
>>>>>>>> +Also, zsmalloc allows objects to span page boundaries within the
>>>>>>>> +zspage. This allows for lower fragmentation than could be had
>>>>>>>> +with the kernel slab allocator for objects between PAGE_SIZE/2
>>>>>>>> +and PAGE_SIZE. With the kernel slab allocator, if a page compresses
>>>>>>>> +to 60% of it original size, the memory savings gained through
>>>>>>>> +compression is lost in fragmentation because another object of
>>>>>>>> +the same size can't be stored in the leftover space.
>>>>>>>> +
>>>>>>>> +This ability to span pages results in zsmalloc allocations not being
>>>>>>>> +directly addressable by the user. The user is given an
>>>>>>>> +non-dereferencable handle in response to an allocation request.
>>>>>>>> +That handle must be mapped, using zs_map_object(), which returns
>>>>>>>> +a pointer to the mapped region that can be used. The mapping is
>>>>>>>> +necessary since the object data may reside in two different
>>>>>>>> +noncontigious pages.
>>>>>>>
>>>>>>> Do you mean the reason of to use a zsmalloc object must map after
>>>>>>> malloc is object data maybe reside in two different nocontiguous
>>>>>>> pages?
>>>>>>
>>>>>> Yes, that is one reason for the mapping. The other reason (more of
>>>>>> an added bonus) is below.
>>>>>>
>>>>>>>> +
>>>>>>>> +For 32-bit systems, zsmalloc has the added benefit of being
>>>>>>>> +able to back slabs with HIGHMEM pages, something not possible
>>>>>>>
>>>>>>> What's the meaning of "back slabs with HIGHMEM pages"?
>>>>>>
>>>>>> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit
>>>>>> systems with larger that 1GB (actually a little less) of RAM. The
>>>>>> upper 3GB of the 4GB address space, depending on kernel build
>>>>>> options, is not directly addressable by the kernel, but can be
>>>>>> mapped into the kernel address space with functions like kmap() or
>>>>>> kmap_atomic().
>>>>>>
>>>>>> These pages can't be used by slab/slub because they are not
>>>>>> continuously mapped into the kernel address space. However, since
>>>>>> zsmalloc requires a mapping anyway to handle objects that span
>>>>>> non-contiguous page boundaries, we do the kernel mapping as part of
>>>>>> the process.
>>>>>>
>>>>>> So zspages, the conceptual slab in zsmalloc backed by single-order
>>>>>> pages can include pages from the HIGHMEM zone as well.
>>>>>
>>>>> Thanks for your clarify,
>>>>> http://lwn.net/Articles/537422/, your article about zswap in lwn.
>>>>> "Additionally, the kernel slab allocator does not allow objects that
>>>>> are less than a page in size to span a page boundary. This means that
>>>>> if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively use an
>>>>> entire page, resulting in ~50% waste. Hense there are *no kmalloc()
>>>>> cache size* between PAGE_SIZE/2 and PAGE_SIZE."
>>>>> Are your sure? It seems that kmalloc cache support big size, your can
>>>>> check in include/linux/kmalloc_sizes.h
>>>>
>>>> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
>>>> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
>>>> system with 4k pages, there are no
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/23/2013 05:02 AM, Seth Jennings wrote:
> On 02/21/2013 08:56 PM, Ric Mason wrote:
>> On 02/21/2013 11:50 PM, Seth Jennings wrote:
>>> On 02/21/2013 02:49 AM, Ric Mason wrote:
>>>> On 02/19/2013 03:16 AM, Seth Jennings wrote:
>>>>> On 02/16/2013 12:21 AM, Ric Mason wrote:
>>>>>> On 02/14/2013 02:38 AM, Seth Jennings wrote:
>>>>>>> This patch adds a documentation file for zsmalloc at
>>>>>>> Documentation/vm/zsmalloc.txt
>>>>>>>
>>>>>>> Signed-off-by: Seth Jennings
>>>>>>> ---
>>>>>>>  Documentation/vm/zsmalloc.txt | 68 +
>>>>>>>  1 file changed, 68 insertions(+)
>>>>>>>  create mode 100644 Documentation/vm/zsmalloc.txt
>>>>>>>
>>>>>>> diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
>>>>>>> new file mode 100644
>>>>>>> index 000..85aa617
>>>>>>> --- /dev/null
>>>>>>> +++ b/Documentation/vm/zsmalloc.txt
>>>>>>> @@ -0,0 +1,68 @@
>>>>>>> +zsmalloc Memory Allocator
>>>>>>> +
>>>>>>> +Overview
>>>>>>> +
>>>>>>> +zmalloc a new slab-based memory allocator,
>>>>>>> +zsmalloc, for storing compressed pages. It is designed for
>>>>>>> +low fragmentation and high allocation success rate on
>>>>>>> +large object, but <= PAGE_SIZE allocations.
>>>>>>> +
>>>>>>> +zsmalloc differs from the kernel slab allocator in two primary
>>>>>>> +ways to achieve these design goals.
>>>>>>> +
>>>>>>> +zsmalloc never requires high order page allocations to back
>>>>>>> +slabs, or "size classes" in zsmalloc terms. Instead it allows
>>>>>>> +multiple single-order pages to be stitched together into a
>>>>>>> +"zspage" which backs the slab. This allows for higher allocation
>>>>>>> +success rate under memory pressure.
>>>>>>> +
>>>>>>> +Also, zsmalloc allows objects to span page boundaries within the
>>>>>>> +zspage. This allows for lower fragmentation than could be had
>>>>>>> +with the kernel slab allocator for objects between PAGE_SIZE/2
>>>>>>> +and PAGE_SIZE. With the kernel slab allocator, if a page compresses
>>>>>>> +to 60% of it original size, the memory savings gained through
>>>>>>> +compression is lost in fragmentation because another object of
>>>>>>> +the same size can't be stored in the leftover space.
>>>>>>> +
>>>>>>> +This ability to span pages results in zsmalloc allocations not being
>>>>>>> +directly addressable by the user. The user is given an
>>>>>>> +non-dereferencable handle in response to an allocation request.
>>>>>>> +That handle must be mapped, using zs_map_object(), which returns
>>>>>>> +a pointer to the mapped region that can be used. The mapping is
>>>>>>> +necessary since the object data may reside in two different
>>>>>>> +noncontigious pages.
>>>>>>
>>>>>> Do you mean the reason of to use a zsmalloc object must map after
>>>>>> malloc is object data maybe reside in two different nocontiguous
>>>>>> pages?
>>>>>
>>>>> Yes, that is one reason for the mapping. The other reason (more of
>>>>> an added bonus) is below.
>>>>>
>>>>>>> +
>>>>>>> +For 32-bit systems, zsmalloc has the added benefit of being
>>>>>>> +able to back slabs with HIGHMEM pages, something not possible
>>>>>>
>>>>>> What's the meaning of "back slabs with HIGHMEM pages"?
>>>>>
>>>>> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit
>>>>> systems with larger that 1GB (actually a little less) of RAM. The
>>>>> upper 3GB of the 4GB address space, depending on kernel build
>>>>> options, is not directly addressable by the kernel, but can be
>>>>> mapped into the kernel address space with functions like kmap() or
>>>>> kmap_atomic().
>>>>>
>>>>> These pages can't be used by slab/slub because they are not
>>>>> continuously mapped into the kernel address space. However, since
>>>>> zsmalloc requires a mapping anyway to handle objects that span
>>>>> non-contiguous page boundaries, we do the kernel mapping as part of
>>>>> the process.
>>>>>
>>>>> So zspages, the conceptual slab in zsmalloc backed by single-order
>>>>> pages can include pages from the HIGHMEM zone as well.
>>>>
>>>> Thanks for your clarify,
>>>> http://lwn.net/Articles/537422/, your article about zswap in lwn.
>>>> "Additionally, the kernel slab allocator does not allow objects that
>>>> are less than a page in size to span a page boundary. This means that
>>>> if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively use an
>>>> entire page, resulting in ~50% waste. Hense there are *no kmalloc()
>>>> cache size* between PAGE_SIZE/2 and PAGE_SIZE."
>>>> Are your sure? It seems that kmalloc cache support big size, your can
>>>> check in include/linux/kmalloc_sizes.h
>>>
>>> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
>>> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
>>> system with 4k pages, there are no caches between kmalloc-2048 and
>>> kmalloc-4096.
>>
>> kmalloc object > PAGE_SIZE/2 or > PAGE_SIZE should also allocate from
>> slab cache, correct? Then how can alloc object w/o slab cache which
>> contains this object size objects?
>
> I have to admit, I didn't understand the question.

object is allocated from slab cache, correct? There two kinds of slab
cache, one is for general purpose, eg. kmalloc slab cache, the other is
for special purpose, eg. mm_struct, task_struct. kmalloc object >
PAGE_SIZE/2 or > PAGE_SIZE should also allocated from slab cache,
correct? then why you said that there are no caches between
kmalloc-2048 and kmalloc-4096?

> Thanks,
> Seth
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 08:56 PM, Ric Mason wrote: > On 02/21/2013 11:50 PM, Seth Jennings wrote: >> On 02/21/2013 02:49 AM, Ric Mason wrote: >>> On 02/19/2013 03:16 AM, Seth Jennings wrote: On 02/16/2013 12:21 AM, Ric Mason wrote: > On 02/14/2013 02:38 AM, Seth Jennings wrote: >> This patch adds a documentation file for zsmalloc at >> Documentation/vm/zsmalloc.txt >> >> Signed-off-by: Seth Jennings >> --- >> Documentation/vm/zsmalloc.txt | 68 >> + >> 1 file changed, 68 insertions(+) >> create mode 100644 Documentation/vm/zsmalloc.txt >> >> diff --git a/Documentation/vm/zsmalloc.txt >> b/Documentation/vm/zsmalloc.txt >> new file mode 100644 >> index 000..85aa617 >> --- /dev/null >> +++ b/Documentation/vm/zsmalloc.txt >> @@ -0,0 +1,68 @@ >> +zsmalloc Memory Allocator >> + >> +Overview >> + >> +zmalloc a new slab-based memory allocator, >> +zsmalloc, for storing compressed pages. It is designed for >> +low fragmentation and high allocation success rate on >> +large object, but <= PAGE_SIZE allocations. >> + >> +zsmalloc differs from the kernel slab allocator in two primary >> +ways to achieve these design goals. >> + >> +zsmalloc never requires high order page allocations to back >> +slabs, or "size classes" in zsmalloc terms. Instead it allows >> +multiple single-order pages to be stitched together into a >> +"zspage" which backs the slab. This allows for higher allocation >> +success rate under memory pressure. >> + >> +Also, zsmalloc allows objects to span page boundaries within the >> +zspage. This allows for lower fragmentation than could be had >> +with the kernel slab allocator for objects between PAGE_SIZE/2 >> +and PAGE_SIZE. With the kernel slab allocator, if a page >> compresses >> +to 60% of it original size, the memory savings gained through >> +compression is lost in fragmentation because another object of >> +the same size can't be stored in the leftover space. 
>> + >> +This ability to span pages results in zsmalloc allocations not >> being >> +directly addressable by the user. The user is given an >> +non-dereferencable handle in response to an allocation request. >> +That handle must be mapped, using zs_map_object(), which returns >> +a pointer to the mapped region that can be used. The mapping is >> +necessary since the object data may reside in two different >> +noncontigious pages. > Do you mean the reason of to use a zsmalloc object must map after > malloc is object data maybe reside in two different nocontiguous > pages? Yes, that is one reason for the mapping. The other reason (more of an added bonus) is below. >> + >> +For 32-bit systems, zsmalloc has the added benefit of being >> +able to back slabs with HIGHMEM pages, something not possible > What's the meaning of "back slabs with HIGHMEM pages"? By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems with larger that 1GB (actually a little less) of RAM. The upper 3GB of the 4GB address space, depending on kernel build options, is not directly addressable by the kernel, but can be mapped into the kernel address space with functions like kmap() or kmap_atomic(). These pages can't be used by slab/slub because they are not continuously mapped into the kernel address space. However, since zsmalloc requires a mapping anyway to handle objects that span non-contiguous page boundaries, we do the kernel mapping as part of the process. So zspages, the conceptual slab in zsmalloc backed by single-order pages can include pages from the HIGHMEM zone as well. >>> Thanks for your clarify, >>> http://lwn.net/Articles/537422/, your article about zswap in lwn. >>> "Additionally, the kernel slab allocator does not allow objects that >>> are less >>> than a page in size to span a page boundary. This means that if an >>> object is >>> PAGE_SIZE/2 + 1 bytes in size, it effectively use an entire page, >>> resulting in >>> ~50% waste. 
Hense there are *no kmalloc() cache size* between PAGE_SIZE/2 and PAGE_SIZE."
>>> Are your sure? It seems that kmalloc cache support big size, your can check in include/linux/kmalloc_sizes.h
>> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a system with 4k pages, there are no caches between kmalloc-2048 and kmalloc-4096.
>
> kmalloc object > PAGE_SIZE/2 or > PAGE_SIZE should also allocate from slab cache, correct? Then how can alloc object w/o slab cache which contains this object size objects?

I have to admit, I didn't understand the question.

Thanks,
Seth
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 11:50 PM, Seth Jennings wrote: On 02/21/2013 02:49 AM, Ric Mason wrote: On 02/19/2013 03:16 AM, Seth Jennings wrote: On 02/16/2013 12:21 AM, Ric Mason wrote: On 02/14/2013 02:38 AM, Seth Jennings wrote: This patch adds a documentation file for zsmalloc at Documentation/vm/zsmalloc.txt Signed-off-by: Seth Jennings --- Documentation/vm/zsmalloc.txt | 68 + 1 file changed, 68 insertions(+) create mode 100644 Documentation/vm/zsmalloc.txt diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt new file mode 100644 index 000..85aa617 --- /dev/null +++ b/Documentation/vm/zsmalloc.txt @@ -0,0 +1,68 @@ +zsmalloc Memory Allocator + +Overview + +zmalloc a new slab-based memory allocator, +zsmalloc, for storing compressed pages. It is designed for +low fragmentation and high allocation success rate on +large object, but <= PAGE_SIZE allocations. + +zsmalloc differs from the kernel slab allocator in two primary +ways to achieve these design goals. + +zsmalloc never requires high order page allocations to back +slabs, or "size classes" in zsmalloc terms. Instead it allows +multiple single-order pages to be stitched together into a +"zspage" which backs the slab. This allows for higher allocation +success rate under memory pressure. + +Also, zsmalloc allows objects to span page boundaries within the +zspage. This allows for lower fragmentation than could be had +with the kernel slab allocator for objects between PAGE_SIZE/2 +and PAGE_SIZE. With the kernel slab allocator, if a page compresses +to 60% of it original size, the memory savings gained through +compression is lost in fragmentation because another object of +the same size can't be stored in the leftover space. + +This ability to span pages results in zsmalloc allocations not being +directly addressable by the user. The user is given an +non-dereferencable handle in response to an allocation request. 
+That handle must be mapped, using zs_map_object(), which returns +a pointer to the mapped region that can be used. The mapping is +necessary since the object data may reside in two different +noncontigious pages. Do you mean the reason of to use a zsmalloc object must map after malloc is object data maybe reside in two different nocontiguous pages? Yes, that is one reason for the mapping. The other reason (more of an added bonus) is below. + +For 32-bit systems, zsmalloc has the added benefit of being +able to back slabs with HIGHMEM pages, something not possible What's the meaning of "back slabs with HIGHMEM pages"? By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems with larger that 1GB (actually a little less) of RAM. The upper 3GB of the 4GB address space, depending on kernel build options, is not directly addressable by the kernel, but can be mapped into the kernel address space with functions like kmap() or kmap_atomic(). These pages can't be used by slab/slub because they are not continuously mapped into the kernel address space. However, since zsmalloc requires a mapping anyway to handle objects that span non-contiguous page boundaries, we do the kernel mapping as part of the process. So zspages, the conceptual slab in zsmalloc backed by single-order pages can include pages from the HIGHMEM zone as well. Thanks for your clarify, http://lwn.net/Articles/537422/, your article about zswap in lwn. "Additionally, the kernel slab allocator does not allow objects that are less than a page in size to span a page boundary. This means that if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively use an entire page, resulting in ~50% waste. Hense there are *no kmalloc() cache size* between PAGE_SIZE/2 and PAGE_SIZE." Are your sure? It seems that kmalloc cache support big size, your can check in include/linux/kmalloc_sizes.h Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. 
For example, on a system with 4k pages, there are no caches between kmalloc-2048 and kmalloc-4096.

Since SLUB can merge caches, is that the root reason?

Seth
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: em...@kvack.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 11:50 PM, Seth Jennings wrote: On 02/21/2013 02:49 AM, Ric Mason wrote: On 02/19/2013 03:16 AM, Seth Jennings wrote: On 02/16/2013 12:21 AM, Ric Mason wrote: On 02/14/2013 02:38 AM, Seth Jennings wrote: This patch adds a documentation file for zsmalloc at Documentation/vm/zsmalloc.txt Signed-off-by: Seth Jennings --- Documentation/vm/zsmalloc.txt | 68 + 1 file changed, 68 insertions(+) create mode 100644 Documentation/vm/zsmalloc.txt diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt new file mode 100644 index 000..85aa617 --- /dev/null +++ b/Documentation/vm/zsmalloc.txt @@ -0,0 +1,68 @@ +zsmalloc Memory Allocator + +Overview + +zmalloc a new slab-based memory allocator, +zsmalloc, for storing compressed pages. It is designed for +low fragmentation and high allocation success rate on +large object, but <= PAGE_SIZE allocations. + +zsmalloc differs from the kernel slab allocator in two primary +ways to achieve these design goals. + +zsmalloc never requires high order page allocations to back +slabs, or "size classes" in zsmalloc terms. Instead it allows +multiple single-order pages to be stitched together into a +"zspage" which backs the slab. This allows for higher allocation +success rate under memory pressure. + +Also, zsmalloc allows objects to span page boundaries within the +zspage. This allows for lower fragmentation than could be had +with the kernel slab allocator for objects between PAGE_SIZE/2 +and PAGE_SIZE. With the kernel slab allocator, if a page compresses +to 60% of it original size, the memory savings gained through +compression is lost in fragmentation because another object of +the same size can't be stored in the leftover space. + +This ability to span pages results in zsmalloc allocations not being +directly addressable by the user. The user is given an +non-dereferencable handle in response to an allocation request. 
+That handle must be mapped, using zs_map_object(), which returns +a pointer to the mapped region that can be used. The mapping is +necessary since the object data may reside in two different +noncontigious pages. Do you mean the reason of to use a zsmalloc object must map after malloc is object data maybe reside in two different nocontiguous pages? Yes, that is one reason for the mapping. The other reason (more of an added bonus) is below. + +For 32-bit systems, zsmalloc has the added benefit of being +able to back slabs with HIGHMEM pages, something not possible What's the meaning of "back slabs with HIGHMEM pages"? By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems with larger that 1GB (actually a little less) of RAM. The upper 3GB of the 4GB address space, depending on kernel build options, is not directly addressable by the kernel, but can be mapped into the kernel address space with functions like kmap() or kmap_atomic(). These pages can't be used by slab/slub because they are not continuously mapped into the kernel address space. However, since zsmalloc requires a mapping anyway to handle objects that span non-contiguous page boundaries, we do the kernel mapping as part of the process. So zspages, the conceptual slab in zsmalloc backed by single-order pages can include pages from the HIGHMEM zone as well. Thanks for your clarify, http://lwn.net/Articles/537422/, your article about zswap in lwn. "Additionally, the kernel slab allocator does not allow objects that are less than a page in size to span a page boundary. This means that if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively use an entire page, resulting in ~50% waste. Hense there are *no kmalloc() cache size* between PAGE_SIZE/2 and PAGE_SIZE." Are your sure? It seems that kmalloc cache support big size, your can check in include/linux/kmalloc_sizes.h Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. 
For example, on a system with 4k pages, there are no caches between kmalloc-2048 and kmalloc-4096. kmalloc object > PAGE_SIZE/2 or > PAGE_SIZE should also allocate from slab cache, correct? Then how can alloc object w/o slab cache which contains this object size objects? Seth
RE: [PATCHv5 2/8] zsmalloc: add documentation
> From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com] > Subject: Re: [PATCHv5 2/8] zsmalloc: add documentation > > On 02/21/2013 02:49 AM, Ric Mason wrote: > > On 02/19/2013 03:16 AM, Seth Jennings wrote: > >> On 02/16/2013 12:21 AM, Ric Mason wrote: > >>> On 02/14/2013 02:38 AM, Seth Jennings wrote: > >>>> This patch adds a documentation file for zsmalloc at > >>>> Documentation/vm/zsmalloc.txt > >>>> > >>>> Signed-off-by: Seth Jennings > >>>> --- > >>>>Documentation/vm/zsmalloc.txt | 68 > >>>> + > >>>>1 file changed, 68 insertions(+) > >>>>create mode 100644 Documentation/vm/zsmalloc.txt > >>>> > >>>> diff --git a/Documentation/vm/zsmalloc.txt > >>>> b/Documentation/vm/zsmalloc.txt > >>>> new file mode 100644 > >>>> index 000..85aa617 > >>>> --- /dev/null > >>>> +++ b/Documentation/vm/zsmalloc.txt > >>>> @@ -0,0 +1,68 @@ > >>>> +zsmalloc Memory Allocator > >>>> + > >>>> +Overview > >>>> + > >>>> +zmalloc a new slab-based memory allocator, > >>>> +zsmalloc, for storing compressed pages. It is designed for > >>>> +low fragmentation and high allocation success rate on > >>>> +large object, but <= PAGE_SIZE allocations. > >>>> + > >>>> +zsmalloc differs from the kernel slab allocator in two primary > >>>> +ways to achieve these design goals. > >>>> + > >>>> +zsmalloc never requires high order page allocations to back > >>>> +slabs, or "size classes" in zsmalloc terms. Instead it allows > >>>> +multiple single-order pages to be stitched together into a > >>>> +"zspage" which backs the slab. This allows for higher allocation > >>>> +success rate under memory pressure. > >>>> + > >>>> +Also, zsmalloc allows objects to span page boundaries within the > >>>> +zspage. This allows for lower fragmentation than could be had > >>>> +with the kernel slab allocator for objects between PAGE_SIZE/2 > >>>> +and PAGE_SIZE. 
With the kernel slab allocator, if a page compresses > >>>> +to 60% of it original size, the memory savings gained through > >>>> +compression is lost in fragmentation because another object of > >>>> +the same size can't be stored in the leftover space. > >>>> + > >>>> +This ability to span pages results in zsmalloc allocations not being > >>>> +directly addressable by the user. The user is given an > >>>> +non-dereferencable handle in response to an allocation request. > >>>> +That handle must be mapped, using zs_map_object(), which returns > >>>> +a pointer to the mapped region that can be used. The mapping is > >>>> +necessary since the object data may reside in two different > >>>> +noncontigious pages. > >>> Do you mean the reason of to use a zsmalloc object must map after > >>> malloc is object data maybe reside in two different nocontiguous pages? > >> Yes, that is one reason for the mapping. The other reason (more of an > >> added bonus) is below. > >> > >>>> + > >>>> +For 32-bit systems, zsmalloc has the added benefit of being > >>>> +able to back slabs with HIGHMEM pages, something not possible > >>> What's the meaning of "back slabs with HIGHMEM pages"? > >> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems > >> with larger that 1GB (actually a little less) of RAM. The upper 3GB > >> of the 4GB address space, depending on kernel build options, is not > >> directly addressable by the kernel, but can be mapped into the kernel > >> address space with functions like kmap() or kmap_atomic(). > >> > >> These pages can't be used by slab/slub because they are not > >> continuously mapped into the kernel address space. However, since > >> zsmalloc requires a mapping anyway to handle objects that span > >> non-contiguous page boundaries, we do the kernel mapping as part of > >> the process. > >> > >> So zspages, the conceptual slab in zsmalloc backed by single-order > >> pages can include pages from the HIGHMEM zone as well. 
> > > > Thanks for your clarify,
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 02:49 AM, Ric Mason wrote: > On 02/19/2013 03:16 AM, Seth Jennings wrote: >> On 02/16/2013 12:21 AM, Ric Mason wrote: >>> On 02/14/2013 02:38 AM, Seth Jennings wrote: This patch adds a documentation file for zsmalloc at Documentation/vm/zsmalloc.txt Signed-off-by: Seth Jennings --- Documentation/vm/zsmalloc.txt | 68 + 1 file changed, 68 insertions(+) create mode 100644 Documentation/vm/zsmalloc.txt diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt new file mode 100644 index 000..85aa617 --- /dev/null +++ b/Documentation/vm/zsmalloc.txt @@ -0,0 +1,68 @@ +zsmalloc Memory Allocator + +Overview + +zmalloc a new slab-based memory allocator, +zsmalloc, for storing compressed pages. It is designed for +low fragmentation and high allocation success rate on +large object, but <= PAGE_SIZE allocations. + +zsmalloc differs from the kernel slab allocator in two primary +ways to achieve these design goals. + +zsmalloc never requires high order page allocations to back +slabs, or "size classes" in zsmalloc terms. Instead it allows +multiple single-order pages to be stitched together into a +"zspage" which backs the slab. This allows for higher allocation +success rate under memory pressure. + +Also, zsmalloc allows objects to span page boundaries within the +zspage. This allows for lower fragmentation than could be had +with the kernel slab allocator for objects between PAGE_SIZE/2 +and PAGE_SIZE. With the kernel slab allocator, if a page compresses +to 60% of it original size, the memory savings gained through +compression is lost in fragmentation because another object of +the same size can't be stored in the leftover space. + +This ability to span pages results in zsmalloc allocations not being +directly addressable by the user. The user is given an +non-dereferencable handle in response to an allocation request. +That handle must be mapped, using zs_map_object(), which returns +a pointer to the mapped region that can be used. 
The mapping is +necessary since the object data may reside in two different +noncontigious pages. >>> Do you mean the reason of to use a zsmalloc object must map after >>> malloc is object data maybe reside in two different nocontiguous pages? >> Yes, that is one reason for the mapping. The other reason (more of an >> added bonus) is below. >> + +For 32-bit systems, zsmalloc has the added benefit of being +able to back slabs with HIGHMEM pages, something not possible >>> What's the meaning of "back slabs with HIGHMEM pages"? >> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems >> with larger that 1GB (actually a little less) of RAM. The upper 3GB >> of the 4GB address space, depending on kernel build options, is not >> directly addressable by the kernel, but can be mapped into the kernel >> address space with functions like kmap() or kmap_atomic(). >> >> These pages can't be used by slab/slub because they are not >> continuously mapped into the kernel address space. However, since >> zsmalloc requires a mapping anyway to handle objects that span >> non-contiguous page boundaries, we do the kernel mapping as part of >> the process. >> >> So zspages, the conceptual slab in zsmalloc backed by single-order >> pages can include pages from the HIGHMEM zone as well. > > Thanks for your clarify, > http://lwn.net/Articles/537422/, your article about zswap in lwn. > "Additionally, the kernel slab allocator does not allow objects that > are less > than a page in size to span a page boundary. This means that if an > object is > PAGE_SIZE/2 + 1 bytes in size, it effectively use an entire page, > resulting in > ~50% waste. Hense there are *no kmalloc() cache size* between > PAGE_SIZE/2 and > PAGE_SIZE." > Are your sure? It seems that kmalloc cache support big size, your can > check in > include/linux/kmalloc_sizes.h Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. 
For example, on a system with 4k pages, there are no caches between kmalloc-2048 and kmalloc-4096. Seth
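To make the rounding concrete, here is a minimal userspace sketch. It assumes a simplified model in which the general-purpose kmalloc caches are powers of two up to PAGE_SIZE with 4k pages; the real kernel also has a few non-power-of-two caches (such as kmalloc-96 and kmalloc-192) that this model ignores. The names kmalloc_class_size() and kmalloc_waste() are hypothetical helpers for illustration, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4k pages, as in the example above. */
#define MODEL_PAGE_SIZE 4096u
/* Assumption: smallest cache size in this simplified model. */
#define MODEL_MIN_CLASS 8u

/*
 * Hypothetical helper: round a request up to the next power-of-two
 * size class. Only models requests of 1..MODEL_PAGE_SIZE bytes.
 */
static size_t kmalloc_class_size(size_t size)
{
	size_t class_size = MODEL_MIN_CLASS;

	assert(size > 0 && size <= MODEL_PAGE_SIZE);
	while (class_size < size)
		class_size <<= 1;
	return class_size;
}

/* Bytes lost to rounding for a single object of the given size. */
static size_t kmalloc_waste(size_t size)
{
	return kmalloc_class_size(size) - size;
}
```

Under this model, a 2049-byte object (PAGE_SIZE/2 + 1) falls into the 4096-byte class and loses 2047 bytes to rounding, which is the roughly 50% waste the LWN article describes. A page that compresses to 60% of its size (about 2458 bytes) lands in the same class.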
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/19/2013 03:16 AM, Seth Jennings wrote: On 02/16/2013 12:21 AM, Ric Mason wrote: On 02/14/2013 02:38 AM, Seth Jennings wrote: This patch adds a documentation file for zsmalloc at Documentation/vm/zsmalloc.txt Signed-off-by: Seth Jennings --- Documentation/vm/zsmalloc.txt | 68 + 1 file changed, 68 insertions(+) create mode 100644 Documentation/vm/zsmalloc.txt diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt new file mode 100644 index 000..85aa617 --- /dev/null +++ b/Documentation/vm/zsmalloc.txt @@ -0,0 +1,68 @@ +zsmalloc Memory Allocator + +Overview + +zmalloc a new slab-based memory allocator, +zsmalloc, for storing compressed pages. It is designed for +low fragmentation and high allocation success rate on +large object, but <= PAGE_SIZE allocations. + +zsmalloc differs from the kernel slab allocator in two primary +ways to achieve these design goals. + +zsmalloc never requires high order page allocations to back +slabs, or "size classes" in zsmalloc terms. Instead it allows +multiple single-order pages to be stitched together into a +"zspage" which backs the slab. This allows for higher allocation +success rate under memory pressure. + +Also, zsmalloc allows objects to span page boundaries within the +zspage. This allows for lower fragmentation than could be had +with the kernel slab allocator for objects between PAGE_SIZE/2 +and PAGE_SIZE. With the kernel slab allocator, if a page compresses +to 60% of it original size, the memory savings gained through +compression is lost in fragmentation because another object of +the same size can't be stored in the leftover space. + +This ability to span pages results in zsmalloc allocations not being +directly addressable by the user. The user is given an +non-dereferencable handle in response to an allocation request. +That handle must be mapped, using zs_map_object(), which returns +a pointer to the mapped region that can be used. 
The mapping is +necessary since the object data may reside in two different +noncontigious pages. Do you mean the reason of to use a zsmalloc object must map after malloc is object data maybe reside in two different nocontiguous pages? Yes, that is one reason for the mapping. The other reason (more of an added bonus) is below. + +For 32-bit systems, zsmalloc has the added benefit of being +able to back slabs with HIGHMEM pages, something not possible What's the meaning of "back slabs with HIGHMEM pages"? By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems with larger that 1GB (actually a little less) of RAM. The upper 3GB of the 4GB address space, depending on kernel build options, is not directly addressable by the kernel, but can be mapped into the kernel address space with functions like kmap() or kmap_atomic(). These pages can't be used by slab/slub because they are not continuously mapped into the kernel address space. However, since zsmalloc requires a mapping anyway to handle objects that span non-contiguous page boundaries, we do the kernel mapping as part of the process. So zspages, the conceptual slab in zsmalloc backed by single-order pages can include pages from the HIGHMEM zone as well. Thanks for your clarify, http://lwn.net/Articles/537422/, your article about zswap in lwn. "Additionally, the kernel slab allocator does not allow objects that are less than a page in size to span a page boundary. This means that if an object is PAGE_SIZE/2 + 1 bytes in size, it effectively use an entire page, resulting in ~50% waste. Hense there are *no kmalloc() cache size* between PAGE_SIZE/2 and PAGE_SIZE." Are your sure? It seems that kmalloc cache support big size, your can check in include/linux/kmalloc_sizes.h Seth -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 02:49 AM, Ric Mason wrote:
> Thanks for your clarification, http://lwn.net/Articles/537422/, your
> article about zswap in lwn.
>
> Additionally, the kernel slab allocator does not allow objects that are
> less than a page in size to span a page boundary. This means that if an
> object is PAGE_SIZE/2 + 1 bytes in size, it effectively uses an entire
> page, resulting in ~50% waste. Hence there are *no kmalloc() cache
> size* between PAGE_SIZE/2 and PAGE_SIZE.
>
> Are you sure? It seems that kmalloc cache support big size, you can
> check in include/linux/kmalloc_sizes.h

Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
system with 4k pages, there are no caches between kmalloc-2048 and
kmalloc-4096.

Seth
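The gap between kmalloc-2048 and kmalloc-4096 can be illustrated with a small userspace sketch. It assumes 4k pages and a simplified model in which every request is rounded up to the next power-of-two size class (the real kmalloc caches also include non-power-of-two sizes such as 96 and 192 bytes, which this model ignores). The names `sketch_size_class` and `sketch_wasted_bytes` are hypothetical and not part of any kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4k pages, matching the example in the message above. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/*
 * Hypothetical helper: round a request up to the next power-of-two
 * size class. This is a simplified model of the general-purpose
 * kmalloc caches, not the kernel's implementation.
 */
size_t sketch_size_class(size_t size)
{
	size_t cls = 8; /* assumed smallest class */

	assert(size > 0 && size <= SKETCH_PAGE_SIZE);
	while (cls < size)
		cls <<= 1;
	return cls;
}

/* Bytes lost to internal fragmentation for one request. */
size_t sketch_wasted_bytes(size_t size)
{
	return sketch_size_class(size) - size;
}
```

Under these assumptions a 2048-byte request fits its class exactly, while a 2049-byte request lands in the 4096-byte class and wastes 2047 bytes, which is the roughly 50% loss discussed in the thread.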
RE: [PATCHv5 2/8] zsmalloc: add documentation
From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com]
Subject: Re: [PATCHv5 2/8] zsmalloc: add documentation

> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
> system with 4k pages, there are no caches between kmalloc-2048 and
> kmalloc-4096.

Important and left unsaid here is that, in many workloads, the
distribution of compressed pages ("zpages") will have as many as half or
more with compressed size ("zsize") between PAGE_SIZE/2 and PAGE_SIZE.
And, in many workloads, the majority of values for zsize will be much
closer to PAGE_SIZE/2 than PAGE_SIZE, which will result in a great deal
of wasted space if slab were used.

And, also very important, kmalloc requires page allocations with
order > 0 (2**n contiguous pages) to deal with big size objects.
In-kernel compression would need many of these and they are difficult
(often impossible) to allocate when the system is under memory
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 11:50 PM, Seth Jennings wrote:
> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
> system with 4k pages, there are no caches between kmalloc-2048 and
> kmalloc-4096.

kmalloc object PAGE_SIZE/2 or PAGE_SIZE should also allocate from slab
cache, correct? Then how can an object be allocated without a slab cache
that holds objects of that size?
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/21/2013 11:50 PM, Seth Jennings wrote:
> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE. For example, on a
> system with 4k pages, there are no caches between kmalloc-2048 and
> kmalloc-4096.

Since slub caches can be merged, is that the root reason?
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/16/2013 12:21 AM, Ric Mason wrote:
> On 02/14/2013 02:38 AM, Seth Jennings wrote:
>> This patch adds a documentation file for zsmalloc at
>> Documentation/vm/zsmalloc.txt
>>
>> Signed-off-by: Seth Jennings
>> ---
>> Documentation/vm/zsmalloc.txt | 68 +
>> 1 file changed, 68 insertions(+)
>> create mode 100644 Documentation/vm/zsmalloc.txt
>>
>> diff --git a/Documentation/vm/zsmalloc.txt
>> b/Documentation/vm/zsmalloc.txt
>> new file mode 100644
>> index 000..85aa617
>> --- /dev/null
>> +++ b/Documentation/vm/zsmalloc.txt
>> @@ -0,0 +1,68 @@
>> +zsmalloc Memory Allocator
>> +
>> +Overview
>> +
>> +zmalloc a new slab-based memory allocator,
>> +zsmalloc, for storing compressed pages. It is designed for
>> +low fragmentation and high allocation success rate on
>> +large object, but <= PAGE_SIZE allocations.
>> +
>> +zsmalloc differs from the kernel slab allocator in two primary
>> +ways to achieve these design goals.
>> +
>> +zsmalloc never requires high order page allocations to back
>> +slabs, or "size classes" in zsmalloc terms. Instead it allows
>> +multiple single-order pages to be stitched together into a
>> +"zspage" which backs the slab. This allows for higher allocation
>> +success rate under memory pressure.
>> +
>> +Also, zsmalloc allows objects to span page boundaries within the
>> +zspage. This allows for lower fragmentation than could be had
>> +with the kernel slab allocator for objects between PAGE_SIZE/2
>> +and PAGE_SIZE. With the kernel slab allocator, if a page compresses
>> +to 60% of it original size, the memory savings gained through
>> +compression is lost in fragmentation because another object of
>> +the same size can't be stored in the leftover space.
>> +
>> +This ability to span pages results in zsmalloc allocations not being
>> +directly addressable by the user. The user is given an
>> +non-dereferencable handle in response to an allocation request.
>> +That handle must be mapped, using zs_map_object(), which returns
>> +a pointer to the mapped region that can be used. The mapping is
>> +necessary since the object data may reside in two different
>> +noncontigious pages.
>
> Do you mean that a zsmalloc object must be mapped after allocation
> because its data may reside in two different noncontiguous pages?

Yes, that is one reason for the mapping. The other reason (more of an
added bonus) is below.

>
>> +
>> +For 32-bit systems, zsmalloc has the added benefit of being
>> +able to back slabs with HIGHMEM pages, something not possible
>
> What's the meaning of "back slabs with HIGHMEM pages"?

By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems
with more than 1GB (actually a little less) of RAM. The upper 3GB
of the 4GB address space, depending on kernel build options, is not
directly addressable by the kernel, but can be mapped into the kernel
address space with functions like kmap() or kmap_atomic().

These pages can't be used by slab/slub because they are not
continuously mapped into the kernel address space. However, since
zsmalloc requires a mapping anyway to handle objects that span
non-contiguous page boundaries, we do the kernel mapping as part of
the process.

So zspages, the conceptual slab in zsmalloc backed by single-order
pages, can include pages from the HIGHMEM zone as well.

Seth
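The reason a spanning object cannot be dereferenced directly can be shown with a userspace sketch. It assumes 4k pages and models the two backing pages as separate buffers; the helper `sketch_gather_object` is hypothetical and copies the two pieces into one contiguous buffer. This is only one way to present such an object contiguously, and it is not meant to mirror how zs_map_object() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 4k pages. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/*
 * Hypothetical helper: copy an object that starts at `offset` in
 * `first_page` and may continue at the start of `second_page` into the
 * contiguous buffer `dst`. The two pages need not be adjacent in memory.
 */
void sketch_gather_object(void *dst, const void *first_page,
			  const void *second_page, size_t offset, size_t size)
{
	size_t in_first;

	assert(offset < SKETCH_PAGE_SIZE && size <= SKETCH_PAGE_SIZE);

	/* Bytes that fit before the end of the first page. */
	in_first = SKETCH_PAGE_SIZE - offset;
	if (in_first > size)
		in_first = size;

	memcpy(dst, (const char *)first_page + offset, in_first);

	/* The remainder, if any, sits at the start of the second page. */
	if (size > in_first)
		memcpy((char *)dst + in_first, second_page, size - in_first);
}
```

An 8-byte object starting 3 bytes before the end of the first page is assembled from 3 bytes of that page and 5 bytes of the next one; no single pointer into either page would cover it.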
Re: [PATCHv5 2/8] zsmalloc: add documentation
On 02/14/2013 02:38 AM, Seth Jennings wrote:
> This patch adds a documentation file for zsmalloc at
> Documentation/vm/zsmalloc.txt
>
> Signed-off-by: Seth Jennings
> ---
> Documentation/vm/zsmalloc.txt | 68 +
> 1 file changed, 68 insertions(+)
> create mode 100644 Documentation/vm/zsmalloc.txt
>
> diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
> new file mode 100644
> index 000..85aa617
> --- /dev/null
> +++ b/Documentation/vm/zsmalloc.txt
> @@ -0,0 +1,68 @@
> +zsmalloc Memory Allocator
> +
> +Overview
> +
> +zmalloc a new slab-based memory allocator,
> +zsmalloc, for storing compressed pages. It is designed for
> +low fragmentation and high allocation success rate on
> +large object, but <= PAGE_SIZE allocations.
> +
> +zsmalloc differs from the kernel slab allocator in two primary
> +ways to achieve these design goals.
> +
> +zsmalloc never requires high order page allocations to back
> +slabs, or "size classes" in zsmalloc terms. Instead it allows
> +multiple single-order pages to be stitched together into a
> +"zspage" which backs the slab. This allows for higher allocation
> +success rate under memory pressure.
> +
> +Also, zsmalloc allows objects to span page boundaries within the
> +zspage. This allows for lower fragmentation than could be had
> +with the kernel slab allocator for objects between PAGE_SIZE/2
> +and PAGE_SIZE. With the kernel slab allocator, if a page compresses
> +to 60% of it original size, the memory savings gained through
> +compression is lost in fragmentation because another object of
> +the same size can't be stored in the leftover space.
> +
> +This ability to span pages results in zsmalloc allocations not being
> +directly addressable by the user. The user is given an
> +non-dereferencable handle in response to an allocation request.
> +That handle must be mapped, using zs_map_object(), which returns
> +a pointer to the mapped region that can be used. The mapping is
> +necessary since the object data may reside in two different
> +noncontigious pages.

Do you mean that a zsmalloc object must be mapped after allocation
because its data may reside in two different noncontiguous pages?

> +
> +For 32-bit systems, zsmalloc has the added benefit of being
> +able to back slabs with HIGHMEM pages, something not possible

What's the meaning of "back slabs with HIGHMEM pages"?

> +with the kernel slab allocators (SLAB or SLUB).
> +
> +Usage:
> +
> +#include <linux/zsmalloc.h>
> +
> +/* create a new pool */
> +struct zs_pool *pool = zs_create_pool("mypool", GFP_KERNEL);
> +
> +/* allocate a 256 byte object */
> +unsigned long handle = zs_malloc(pool, 256);
> +
> +/*
> + * Map the object to get a dereferenceable pointer in "read-write mode"
> + * (see zsmalloc.h for additional modes)
> + */
> +void *ptr = zs_map_object(pool, handle, ZS_MM_RW);
> +
> +/* do something with ptr */
> +
> +/*
> + * Unmap the object when done dealing with it. You should try to
> + * minimize the time for which the object is mapped since preemption
> + * is disabled during the mapped period.
> + */
> +zs_unmap_object(pool, handle);
> +
> +/* free the object */
> +zs_free(pool, handle);
> +
> +/* destroy the pool */
> +zs_destroy_pool(pool);
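The 60% example in the quoted documentation can be worked through with a small userspace calculation. It assumes 4k pages and a group of four pages standing in for a zspage (the page count is an illustrative assumption, not a value taken from the patch). The helper names `sketch_objs_no_span` and `sketch_objs_span` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4k pages. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Objects that fit when each object must stay inside a single page. */
size_t sketch_objs_no_span(size_t obj_size, size_t pages)
{
	assert(obj_size > 0 && obj_size <= SKETCH_PAGE_SIZE);
	return pages * (SKETCH_PAGE_SIZE / obj_size);
}

/* Objects that fit when objects may cross page boundaries. */
size_t sketch_objs_span(size_t obj_size, size_t pages)
{
	assert(obj_size > 0 && obj_size <= SKETCH_PAGE_SIZE);
	return (pages * SKETCH_PAGE_SIZE) / obj_size;
}
```

For a 2458-byte object (about 60% of a 4k page) and four pages, the first function gives 4 objects and the second gives 6, so allowing objects to span pages recovers space that would otherwise be left unused at the end of each page.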