Re: [PATCH] zswap: update docs for runtime-changeable attributes
On 25.8.2015 6:22, Sergey Senozhatsky wrote:
>>>> i'd argue that neither zbud nor zsmalloc are responsible for reacting
>>>> to memory pressure, they just store the pages. It's zswap that has to
>>>> limit its size, which it does with max_pool_percent.
>>>
>>> Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim
>>> requests from zswap when zswap hits the limit. Zswap could easily add a
>>> shrinker that would relay these requests in response to memory pressure
>>> as well. However, zsmalloc doesn't implement the reclaim, or LRU
>>> tracking.
>>
>> I wrote a patch for zsmalloc reclaim a while ago:
>>
>> https://lwn.net/Articles/611713/
>>
>> however it didn't make it in, due to the lack of zsmalloc LRU, or any
>> proven benefit to zsmalloc reclaim.
>>
>> It's not really possible to add LRU to zsmalloc, by the nature of its
>> design, using the struct page fields directly; there's no extra field
>> to use as a lru entry.
>
> Just for information, zsmalloc now registers shrinker callbacks
>
> https://lkml.org/lkml/2015/7/8/497

Yeah but that's just for compaction, not freeing.

I think that ideally zswap should track the LRU on the level of pages it
receives as input, and then just tell zsmalloc/zbud to free them. Then
zsmalloc would use its compaction to make sure that the reclaim results
in actual freeing of page frames. Zbud could re-pair the orphaned
half-pages to the same effect.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
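The idea in the message above — let zswap itself keep the LRU of the pages it receives, and have it tell the backend which compressed page to free — can be made concrete with a toy userspace model. This is a hedged Python sketch with invented names (`ZswapModel`, `store`, `load`); it is not kernel code, just the eviction policy the thread is arguing for:

```python
from collections import OrderedDict

class ZswapModel:
    """Toy model of zswap-level LRU tracking: eviction order is kept
    here, so the policy no longer depends on the backend (zbud or
    zsmalloc) implementing its own LRU. All names are hypothetical."""

    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.lru = OrderedDict()   # swap offset -> compressed data, oldest first

    def store(self, offset, data):
        # When full, evict the oldest entries (standing in for writeback
        # to the backing swap device) instead of rejecting the new page.
        evicted = []
        while len(self.lru) >= self.max_pages:
            old_off, _old_data = self.lru.popitem(last=False)
            evicted.append(old_off)    # backend is told to free this one
        self.lru[offset] = data
        return evicted

    def load(self, offset):
        # A hit refreshes the page's position in the LRU.
        data = self.lru.pop(offset)
        self.lru[offset] = data
        return data
```

A store into a full pool evicts the oldest page rather than rejecting the new one — which is exactly what a zsmalloc-backed zswap cannot do today, for lack of LRU tracking.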
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On (08/19/15 11:56), Dan Streetman wrote:
[..]
>> Ugh that's madness. Still, a documented madness is better than an
>> undocumented one.
>
> heh, i'm not sure why it's madness, the alternative of
> uncompressing/recompressing all pages into the new zpool and/or with
> the new compressor seems much worse ;-)

Well, I sort of still think that 'change compressor and reboot' is OK.
5 cents.

>>>>> The zsmalloc type zpool has a more
>>>>> +complex compressed page storage method, and it can achieve greater storage
>>>>> +densities. However, zsmalloc does not implement compressed page eviction, so
>>>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>>>
>>>> I still wonder why anyone would use zsmalloc with zswap given this
>>>> limitation. It seems only fine for zram which has no real swap as
>>>> fallback. And even zbud doesn't have any shrinker interface that
>>>> would react to memory pressure, so there's a possibility of
>>>> premature OOM... sigh.
>>>
>>> for situations where zswap isn't expected to ever fill up, zsmalloc
>>> will outperform zbud, since it has higher density.
>>
>> But then you could just use zram? :)
>
> well not *expected* to fill up doesn't mean it *won't* fill up :)
>
>>> i'd argue that neither zbud nor zsmalloc are responsible for reacting
>>> to memory pressure, they just store the pages. It's zswap that has to
>>> limit its size, which it does with max_pool_percent.
>>
>> Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim
>> requests from zswap when zswap hits the limit. Zswap could easily add a
>> shrinker that would relay these requests in response to memory pressure
>> as well. However, zsmalloc doesn't implement the reclaim, or LRU
>> tracking.
>
> I wrote a patch for zsmalloc reclaim a while ago:
>
> https://lwn.net/Articles/611713/
>
> however it didn't make it in, due to the lack of zsmalloc LRU, or any
> proven benefit to zsmalloc reclaim.
>
> It's not really possible to add LRU to zsmalloc, by the nature of its
> design, using the struct page fields directly; there's no extra field
> to use as a lru entry.

Just for information, zsmalloc now registers shrinker callbacks

https://lkml.org/lkml/2015/7/8/497

	-ss
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On Wed, Aug 19, 2015 at 11:02 AM, Vlastimil Babka wrote:
> On 08/19/2015 04:21 PM, Dan Streetman wrote:
>> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka wrote:
>>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>>> +pages are freed. The pool is not preallocated. By default, a zpool of type
>>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>>> +attribute, e.g. zswap.zpool=zbud. It can also be changed at runtime using the
>>>> +sysfs "zpool" attribute, e.g.
>>>> +
>>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>>
>>> What exactly happens if zswap is already being used and has allocated
>>> pages in one type of pool, and you're changing it to the other one?
>>
>> zswap has a rcu list where each entry contains a specific compressor
>> and zpool. When either the compressor or zpool is changed, a new
>> entry is created with a new compressor and pool and put at the front
>> of the list. New pages always use the "current" (first) entry. Any
>> old (unused) entries are freed whenever all the pages they contain are
>> removed.
>>
>> So when the compressor or zpool is changed, the only thing that
>> happens is zswap creates a new compressor and zpool and places it at
>> the front of the list, for new pages to use. No existing pages are
>> touched.
>
> Ugh that's madness. Still, a documented madness is better than an
> undocumented one.

heh, i'm not sure why it's madness, the alternative of
uncompressing/recompressing all pages into the new zpool and/or with
the new compressor seems much worse ;-)

>>>> The zsmalloc type zpool has a more
>>>> +complex compressed page storage method, and it can achieve greater storage
>>>> +densities. However, zsmalloc does not implement compressed page eviction, so
>>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>>
>>> I still wonder why anyone would use zsmalloc with zswap given this
>>> limitation. It seems only fine for zram which has no real swap as
>>> fallback. And even zbud doesn't have any shrinker interface that
>>> would react to memory pressure, so there's a possibility of
>>> premature OOM... sigh.
>>
>> for situations where zswap isn't expected to ever fill up, zsmalloc
>> will outperform zbud, since it has higher density.
>
> But then you could just use zram? :)

well not *expected* to fill up doesn't mean it *won't* fill up :)

>
>> i'd argue that neither zbud nor zsmalloc are responsible for reacting
>> to memory pressure, they just store the pages. It's zswap that has to
>> limit its size, which it does with max_pool_percent.
>
> Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim
> requests from zswap when zswap hits the limit. Zswap could easily add a
> shrinker that would relay these requests in response to memory pressure
> as well. However, zsmalloc doesn't implement the reclaim, or LRU
> tracking.

I wrote a patch for zsmalloc reclaim a while ago:

https://lwn.net/Articles/611713/

however it didn't make it in, due to the lack of zsmalloc LRU, or any
proven benefit to zsmalloc reclaim.

It's not really possible to add LRU to zsmalloc, by the nature of its
design, using the struct page fields directly; there's no extra field
to use as a lru entry.

>
> One could also argue that aging should be tracked in zswap, and it
> would just tell zbud/zsmalloc to drop a specific compressed page. But
> that wouldn't reliably translate into freeing of page frames...

Yep, that was Minchan's suggestion as well, which I agree with, although
that would also require a new api function to free the entire page that
a single compressed page is in.
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On 08/19/2015 04:21 PM, Dan Streetman wrote:
> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka wrote:
>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>> +pages are freed. The pool is not preallocated. By default, a zpool of type
>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>> +attribute, e.g. zswap.zpool=zbud. It can also be changed at runtime using the
>>> +sysfs "zpool" attribute, e.g.
>>> +
>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>
>> What exactly happens if zswap is already being used and has allocated
>> pages in one type of pool, and you're changing it to the other one?
>
> zswap has a rcu list where each entry contains a specific compressor
> and zpool. When either the compressor or zpool is changed, a new
> entry is created with a new compressor and pool and put at the front
> of the list. New pages always use the "current" (first) entry. Any
> old (unused) entries are freed whenever all the pages they contain are
> removed.
>
> So when the compressor or zpool is changed, the only thing that
> happens is zswap creates a new compressor and zpool and places it at
> the front of the list, for new pages to use. No existing pages are
> touched.

Ugh that's madness. Still, a documented madness is better than an
undocumented one.

>>
>>> The zsmalloc type zpool has a more
>>> +complex compressed page storage method, and it can achieve greater storage
>>> +densities. However, zsmalloc does not implement compressed page eviction, so
>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>
>> I still wonder why anyone would use zsmalloc with zswap given this
>> limitation. It seems only fine for zram which has no real swap as
>> fallback. And even zbud doesn't have any shrinker interface that
>> would react to memory pressure, so there's a possibility of
>> premature OOM... sigh.
>
> for situations where zswap isn't expected to ever fill up, zsmalloc
> will outperform zbud, since it has higher density.

But then you could just use zram? :)

> i'd argue that neither zbud nor zsmalloc are responsible for reacting
> to memory pressure, they just store the pages. It's zswap that has to
> limit its size, which it does with max_pool_percent.

Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim
requests from zswap when zswap hits the limit. Zswap could easily add a
shrinker that would relay these requests in response to memory pressure
as well. However, zsmalloc doesn't implement the reclaim, or LRU
tracking.

One could also argue that aging should be tracked in zswap, and it would
just tell zbud/zsmalloc to drop a specific compressed page. But that
wouldn't reliably translate into freeing of page frames...
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka wrote:
> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>> Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
>> and "compressor" params are now changeable at runtime.
>>
>> Signed-off-by: Dan Streetman
>> ---
>>  Documentation/vm/zswap.txt | 31 +++
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
>> index 8458c08..06f7ce2 100644
>> --- a/Documentation/vm/zswap.txt
>> +++ b/Documentation/vm/zswap.txt
>> @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs interface.
>>  An example command to enable zswap at runtime, assuming sysfs is mounted
>>  at /sys, is:
>>
>> -echo 1 > /sys/modules/zswap/parameters/enabled
>> +echo 1 > /sys/module/zswap/parameters/enabled
>>
>>  When zswap is disabled at runtime it will stop storing pages that are
>>  being swapped out. However, it will _not_ immediately write out or fault
>> @@ -49,14 +49,27 @@ Zswap receives pages for compression through the Frontswap API and is able to
>>  evict pages from its own compressed pool on an LRU basis and write them back to
>>  the backing swap device in the case that the compressed pool is full.
>>
>> -Zswap makes use of zbud for the managing the compressed memory pool. Each
>> -allocation in zbud is not directly accessible by address. Rather, a handle is
>> +Zswap makes use of zpool for the managing the compressed memory pool. Each
>> +allocation in zpool is not directly accessible by address. Rather, a handle is
>>  returned by the allocation routine and that handle must be mapped before being
>>  accessed. The compressed memory pool grows on demand and shrinks as compressed
>> -pages are freed. The pool is not preallocated.
>> +pages are freed. The pool is not preallocated. By default, a zpool of type
>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>> +attribute, e.g. zswap.zpool=zbud. It can also be changed at runtime using the
>> +sysfs "zpool" attribute, e.g.
>> +
>> +echo zbud > /sys/module/zswap/parameters/zpool
>
> What exactly happens if zswap is already being used and has allocated
> pages in one type of pool, and you're changing it to the other one?

zswap has a rcu list where each entry contains a specific compressor
and zpool. When either the compressor or zpool is changed, a new
entry is created with a new compressor and pool and put at the front
of the list. New pages always use the "current" (first) entry. Any
old (unused) entries are freed whenever all the pages they contain are
removed.

So when the compressor or zpool is changed, the only thing that
happens is zswap creates a new compressor and zpool and places it at
the front of the list, for new pages to use. No existing pages are
touched.

>
>> +
>> +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
>> +means the compression ratio will always be exactly 2:1 (not including half-full
>> +zbud pages), and any page that compresses to more than 1/2 page in size will be
>> +rejected (and written to the swap disk).
>
> Hm is this correct? I've been going through the zbud code briefly (as of
> Linus' tree) and it seems to me that it will accept pages larger than
> 1/2, but they will sit in the unbuddied list until a small enough
> "buddy" comes.

ha, yeah you're right. I didn't read zbud_alloc closely before, it
definitely takes compressed pages > 1/2 page. I'll update the doc.
thanks!

>
>> The zsmalloc type zpool has a more
>> +complex compressed page storage method, and it can achieve greater storage
>> +densities. However, zsmalloc does not implement compressed page eviction, so
>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>
> I still wonder why anyone would use zsmalloc with zswap given this
> limitation. It seems only fine for zram which has no real swap as
> fallback. And even zbud doesn't have any shrinker interface that would
> react to memory pressure, so there's a possibility of premature OOM...
> sigh.

for situations where zswap isn't expected to ever fill up, zsmalloc
will outperform zbud, since it has higher density.

i'd argue that neither zbud nor zsmalloc are responsible for reacting
to memory pressure, they just store the pages. It's zswap that has to
limit its size, which it does with max_pool_percent.

>
>> When a swap page is passed from frontswap to zswap, zswap maintains a mapping
>> -of the swap entry, a combination of the swap type and swap offset, to the zbud
>> +of the swap entry, a combination of the swap type and swap offset, to the zpool
>>  handle that references that compressed swap page. This mapping is achieved
>>  with a red-black tree per swap type. The swap offset is the search key for the
>>  tree nodes.
>> @@ -74,9 +87,11 @@ controlled policy:
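The list-of-pools behaviour Dan describes above — a parameter change pushes a new (compressor, zpool) entry to the front of a list, new pages always use the front entry, and an old entry disappears once its last page is freed — can be modelled in a few lines. This is a hedged sketch with invented names (`ZswapPools`, `PoolEntry`); in the kernel the list is RCU-protected and refcounted, none of which is shown here:

```python
class PoolEntry:
    """One (compressor, zpool) pair; dropped once it stores no pages."""
    def __init__(self, compressor, zpool_type):
        self.compressor = compressor
        self.zpool_type = zpool_type
        self.pages = 0          # stand-in for a refcount of stored pages

class ZswapPools:
    """Toy model: stores go to the current (front) entry, existing
    pages are never moved or recompressed."""
    def __init__(self, compressor, zpool_type):
        self.entries = [PoolEntry(compressor, zpool_type)]

    def change(self, compressor=None, zpool_type=None):
        # A parameter change just creates a new front entry; old pools
        # linger until their pages drain away.
        cur = self.entries[0]
        self.entries.insert(0, PoolEntry(compressor or cur.compressor,
                                         zpool_type or cur.zpool_type))

    def store_page(self):
        entry = self.entries[0]     # new pages use the "current" entry
        entry.pages += 1
        return entry

    def free_page(self, entry):
        entry.pages -= 1
        # An old, now-empty entry is freed; the current entry stays.
        if entry.pages == 0 and entry is not self.entries[0]:
            self.entries.remove(entry)
```

This makes visible why no existing page is ever touched by a parameter change: each stored page stays tied to the entry that compressed it.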
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On 08/18/2015 09:07 PM, Dan Streetman wrote:
> Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
> and "compressor" params are now changeable at runtime.
>
> Signed-off-by: Dan Streetman
> ---
>  Documentation/vm/zswap.txt | 31 +++
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
> index 8458c08..06f7ce2 100644
> --- a/Documentation/vm/zswap.txt
> +++ b/Documentation/vm/zswap.txt
> @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs interface.
>  An example command to enable zswap at runtime, assuming sysfs is mounted
>  at /sys, is:
>
> -echo 1 > /sys/modules/zswap/parameters/enabled
> +echo 1 > /sys/module/zswap/parameters/enabled
>
>  When zswap is disabled at runtime it will stop storing pages that are
>  being swapped out. However, it will _not_ immediately write out or fault
> @@ -49,14 +49,27 @@ Zswap receives pages for compression through the Frontswap API and is able to
>  evict pages from its own compressed pool on an LRU basis and write them back to
>  the backing swap device in the case that the compressed pool is full.
>
> -Zswap makes use of zbud for the managing the compressed memory pool. Each
> -allocation in zbud is not directly accessible by address. Rather, a handle is
> +Zswap makes use of zpool for the managing the compressed memory pool. Each
> +allocation in zpool is not directly accessible by address. Rather, a handle is
>  returned by the allocation routine and that handle must be mapped before being
>  accessed. The compressed memory pool grows on demand and shrinks as compressed
> -pages are freed. The pool is not preallocated.
> +pages are freed. The pool is not preallocated. By default, a zpool of type
> +zbud is created, but it can be selected at boot time by setting the "zpool"
> +attribute, e.g. zswap.zpool=zbud. It can also be changed at runtime using the
> +sysfs "zpool" attribute, e.g.
> +
> +echo zbud > /sys/module/zswap/parameters/zpool

What exactly happens if zswap is already being used and has allocated
pages in one type of pool, and you're changing it to the other one?

> +
> +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
> +means the compression ratio will always be exactly 2:1 (not including half-full
> +zbud pages), and any page that compresses to more than 1/2 page in size will be
> +rejected (and written to the swap disk).

Hm is this correct? I've been going through the zbud code briefly (as of
Linus' tree) and it seems to me that it will accept pages larger than
1/2, but they will sit in the unbuddied list until a small enough
"buddy" comes.

> The zsmalloc type zpool has a more
> +complex compressed page storage method, and it can achieve greater storage
> +densities. However, zsmalloc does not implement compressed page eviction, so
> +once zswap fills it cannot evict the oldest page, it can only reject new pages.

I still wonder why anyone would use zsmalloc with zswap given this
limitation. It seems only fine for zram which has no real swap as
fallback. And even zbud doesn't have any shrinker interface that would
react to memory pressure, so there's a possibility of premature OOM...
sigh.

> When a swap page is passed from frontswap to zswap, zswap maintains a mapping
> -of the swap entry, a combination of the swap type and swap offset, to the zbud
> +of the swap entry, a combination of the swap type and swap offset, to the zpool
>  handle that references that compressed swap page. This mapping is achieved
>  with a red-black tree per swap type. The swap offset is the search key for the
>  tree nodes.
> @@ -74,9 +87,11 @@ controlled policy:
>  * max_pool_percent - The maximum percentage of memory that the compressed
>    pool can occupy.
>
> -Zswap allows the compressor to be selected at kernel boot time by setting the
> -"compressor" attribute. The default compressor is lzo. e.g.
> -zswap.compressor=deflate
> +The default compressor is lzo, but it can be selected at boot time by setting
> +the "compressor" attribute, e.g. zswap.compressor=lzo. It can also be changed
> +at runtime using the sysfs "compressor" attribute, e.g.
> +
> +echo lzo > /sys/module/zswap/parameters/compressor

Again, what happens to pages already compressed? Are they freed?
Recompressed? Does zswap remember it has to decompress them differently
than the currently used compressor?

> A debugfs interface is provided for various statistic about pool size, number
> of pages stored, and various counters for the reasons pages are rejected.
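Vlastimil's observation about zbud_alloc — pages compressing to more than half a page are accepted, they just sit in an unbuddied list until a small enough buddy arrives — means the 2:1 figure is a ceiling rather than a constant. A rough model of the pairing cost makes this concrete. This is a deliberate simplification (real zbud keeps per-size unbuddied lists and does first-fit; this sketch just sorts, and the function name is hypothetical):

```python
def zbud_pages_needed(compressed_sizes, page_size=4096):
    """Estimate how many backing pages a zbud-like allocator needs.
    Each backing page holds at most two compressed 'buddies', so a
    large and a small chunk can share a page if their sizes fit."""
    pages = []  # list of (free_bytes, buddies) per backing page
    for size in sorted(compressed_sizes, reverse=True):
        for i, (free, buddies) in enumerate(pages):
            if buddies < 2 and size <= free:
                pages[i] = (free - size, buddies + 1)   # pair it up
                break
        else:
            pages.append((page_size - size, 1))         # new backing page
    return len(pages)
```

With chunks of 3000 and 1000 bytes a single backing page suffices (the over-half-page chunk found a small buddy), while two 3000-byte chunks need a page each — the achieved ratio depends on the size mix and only tops out at 2:1.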
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On Wed, Aug 19, 2015 at 11:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
> On 08/19/2015 04:21 PM, Dan Streetman wrote:
>> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
>>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
>>>> +sysfs "zpool" attribute, e.g.
>>>> +
>>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>>
>>> What exactly happens if zswap is already being used and has allocated
>>> pages in one type of pool, and you're changing it to the other one?
>>
>> zswap has a rcu list where each entry contains a specific compressor and
>> zpool.  When either the compressor or zpool is changed, a new entry is
>> created with a new compressor and pool and put at the front of the list.
>> New pages always use the current (first) entry.  Any old (unused) entries
>> are freed whenever all the pages they contain are removed.
>>
>> So when the compressor or zpool is changed, the only thing that happens is
>> zswap creates a new compressor and zpool and places it at the front of the
>> list, for new pages to use.  No existing pages are touched.
>
> Ugh, that's madness. Still, a documented madness is better than an
> undocumented one.

heh, i'm not sure why it's madness, the alternative of
uncompressing/recompressing all pages into the new zpool and/or with the
new compressor seems much worse ;-)

>>>> The zsmalloc type zpool has a more
>>>> +complex compressed page storage method, and it can achieve greater storage
>>>> +densities.  However, zsmalloc does not implement compressed page eviction, so
>>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>>
>>> I still wonder why anyone would use zsmalloc with zswap given this
>>> limitation.  It seems only fine for zram which has no real swap as
>>> fallback.  And even zbud doesn't have any shrinker interface that would
>>> react to memory pressure, so there's a possibility of premature OOM...
>>> sigh.
>>
>> for situations where zswap isn't expected to ever fill up, zsmalloc will
>> outperform zbud, since it has higher density.
>
> But then you could just use zram? :)

well, not *expected* to fill up doesn't mean it *won't* fill up :)

>> i'd argue that neither zbud nor zsmalloc are responsible for reacting to
>> memory pressure, they just store the pages.  It's zswap that has to limit
>> its size, which it does with max_pool_percent.
>
> Yeah, but it's zbud that tracks the aging via LRU and reacts to reclaim
> requests from zswap when zswap hits the limit.  Zswap could easily add a
> shrinker that would relay these requests in response to memory pressure as
> well.  However, zsmalloc doesn't implement the reclaim, or LRU tracking.

I wrote a patch for zsmalloc reclaim a while ago:

https://lwn.net/Articles/611713/

however it didn't make it in, due to the lack of zsmalloc LRU, or any
proven benefit to zsmalloc reclaim.

It's not really possible to add LRU to zsmalloc, by the nature of its
design, using the struct page fields directly; there's no extra field to
use as a lru entry.

> One could also argue that aging should be tracked in zswap, and it would
> just tell zbud/zsmalloc to drop a specific compressed page.  But that
> wouldn't reliably translate into freeing of page frames...

Yep, that was Minchan's suggestion as well, which I agree with, although
that would also require a new api function to free the entire page that a
single compressed page is in.
Re: [PATCH] zswap: update docs for runtime-changeable attributes
On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>> Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
>> and "compressor" params are now changeable at runtime.
>>
>> Signed-off-by: Dan Streetman <ddstr...@ieee.org>
>> ---
>>  Documentation/vm/zswap.txt | 31 +++++++++++++++++++++++--------
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
>> index 8458c08..06f7ce2 100644
>> --- a/Documentation/vm/zswap.txt
>> +++ b/Documentation/vm/zswap.txt
>> @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs interface.  An
>>  example command to enable zswap at runtime, assuming sysfs is mounted
>>  at /sys, is:
>>
>> -echo 1 > /sys/modules/zswap/parameters/enabled
>> +echo 1 > /sys/module/zswap/parameters/enabled
>>
>>  When zswap is disabled at runtime it will stop storing pages that are
>>  being swapped out.  However, it will _not_ immediately write out or fault
>> @@ -49,14 +49,27 @@ Zswap receives pages for compression through the Frontswap API and is able to
>>  evict pages from its own compressed pool on an LRU basis and write them back
>>  to the backing swap device in the case that the compressed pool is full.
>>
>> -Zswap makes use of zbud for the managing the compressed memory pool.  Each
>> -allocation in zbud is not directly accessible by address.  Rather, a handle is
>> +Zswap makes use of zpool for the managing the compressed memory pool.  Each
>> +allocation in zpool is not directly accessible by address.  Rather, a handle is
>>  returned by the allocation routine and that handle must be mapped before being
>>  accessed.  The compressed memory pool grows on demand and shrinks as compressed
>> -pages are freed.  The pool is not preallocated.
>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
>> +sysfs "zpool" attribute, e.g.
>> +
>> +echo zbud > /sys/module/zswap/parameters/zpool
>
> What exactly happens if zswap is already being used and has allocated
> pages in one type of pool, and you're changing it to the other one?

zswap has a rcu list where each entry contains a specific compressor and
zpool.  When either the compressor or zpool is changed, a new entry is
created with a new compressor and pool and put at the front of the list.
New pages always use the current (first) entry.  Any old (unused) entries
are freed whenever all the pages they contain are removed.

So when the compressor or zpool is changed, the only thing that happens is
zswap creates a new compressor and zpool and places it at the front of the
list, for new pages to use.  No existing pages are touched.

>> +
>> +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
>> +means the compression ratio will always be exactly 2:1 (not including half-full
>> +zbud pages), and any page that compresses to more than 1/2 page in size will be
>> +rejected (and written to the swap disk).
>
> Hm is this correct? I've been going through the zbud code briefly (as of
> Linus' tree) and it seems to me that it will accept pages larger than 1/2,
> but they will sit in the unbuddied list until a small enough buddy comes.

ha, yeah you're right.  I didn't read zbud_alloc closely before, it
definitely takes compressed pages > 1/2 page.  I'll update the doc.
thanks!

>> The zsmalloc type zpool has a more
>> +complex compressed page storage method, and it can achieve greater storage
>> +densities.  However, zsmalloc does not implement compressed page eviction, so
>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>
> I still wonder why anyone would use zsmalloc with zswap given this
> limitation.  It seems only fine for zram which has no real swap as
> fallback.  And even zbud doesn't have any shrinker interface that would
> react to memory pressure, so there's a possibility of premature OOM...
> sigh.

for situations where zswap isn't expected to ever fill up, zsmalloc will
outperform zbud, since it has higher density.

i'd argue that neither zbud nor zsmalloc are responsible for reacting to
memory pressure, they just store the pages.  It's zswap that has to limit
its size, which it does with max_pool_percent.

>>  When a swap page is passed from frontswap to zswap, zswap maintains a mapping
>> -of the swap entry, a combination of the swap type and swap offset, to the zbud
>> +of the swap entry, a combination of the swap type and swap offset, to the zpool
>>  handle that references that compressed swap page.  This mapping is achieved
>>  with a red-black tree per swap type.  The swap offset is the search key for the
>>  tree nodes.
>>
>> @@ -74,9 +87,11 @@ controlled policy:
>>  * max_pool_percent - The maximum percentage of memory that the compressed
>>      pool can occupy.
>>
>> -Zswap allows the compressor to be selected at
[PATCH] zswap: update docs for runtime-changeable attributes
Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
and "compressor" params are now changeable at runtime.

Signed-off-by: Dan Streetman <ddstr...@ieee.org>
---
 Documentation/vm/zswap.txt | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
index 8458c08..06f7ce2 100644
--- a/Documentation/vm/zswap.txt
+++ b/Documentation/vm/zswap.txt
@@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs interface.  An
 example command to enable zswap at runtime, assuming sysfs is mounted
 at /sys, is:
 
-echo 1 > /sys/modules/zswap/parameters/enabled
+echo 1 > /sys/module/zswap/parameters/enabled
 
 When zswap is disabled at runtime it will stop storing pages that are
 being swapped out.  However, it will _not_ immediately write out or fault
@@ -49,14 +49,27 @@ Zswap receives pages for compression through the Frontswap API and is able to
 evict pages from its own compressed pool on an LRU basis and write them back
 to the backing swap device in the case that the compressed pool is full.
 
-Zswap makes use of zbud for the managing the compressed memory pool.  Each
-allocation in zbud is not directly accessible by address.  Rather, a handle is
+Zswap makes use of zpool for the managing the compressed memory pool.  Each
+allocation in zpool is not directly accessible by address.  Rather, a handle is
 returned by the allocation routine and that handle must be mapped before being
 accessed.  The compressed memory pool grows on demand and shrinks as compressed
-pages are freed.  The pool is not preallocated.
+pages are freed.  The pool is not preallocated.  By default, a zpool of type
+zbud is created, but it can be selected at boot time by setting the "zpool"
+attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
+sysfs "zpool" attribute, e.g.
+
+echo zbud > /sys/module/zswap/parameters/zpool
+
+The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
+means the compression ratio will always be exactly 2:1 (not including half-full
+zbud pages), and any page that compresses to more than 1/2 page in size will be
+rejected (and written to the swap disk).  The zsmalloc type zpool has a more
+complex compressed page storage method, and it can achieve greater storage
+densities.  However, zsmalloc does not implement compressed page eviction, so
+once zswap fills it cannot evict the oldest page, it can only reject new pages.
 
 When a swap page is passed from frontswap to zswap, zswap maintains a mapping
-of the swap entry, a combination of the swap type and swap offset, to the zbud
+of the swap entry, a combination of the swap type and swap offset, to the zpool
 handle that references that compressed swap page.  This mapping is achieved
 with a red-black tree per swap type.  The swap offset is the search key for the
 tree nodes.
 
@@ -74,9 +87,11 @@ controlled policy:
 * max_pool_percent - The maximum percentage of memory that the compressed
     pool can occupy.
 
-Zswap allows the compressor to be selected at kernel boot time by setting the
-"compressor" attribute.  The default compressor is lzo.  e.g.
-zswap.compressor=deflate
+The default compressor is lzo, but it can be selected at boot time by setting
+the "compressor" attribute, e.g. zswap.compressor=lzo.  It can also be changed
+at runtime using the sysfs "compressor" attribute, e.g.
+
+echo lzo > /sys/module/zswap/parameters/compressor
 
 A debugfs interface is provided for various statistic about pool size, number
 of pages stored, and various counters for the reasons pages are rejected.
--
2.1.0