Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-25 Thread Vlastimil Babka
On 25.8.2015 6:22, Sergey Senozhatsky wrote:
>>>> i'd argue that neither zbud nor zsmalloc are responsible for reacting
>>>> to memory pressure, they just store the pages.  It's zswap that has to
>>>> limit its size, which it does with max_percent_pool.
>>>
>>> Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim 
>>> requests
>>> from zswap when zswap hits the limit. Zswap could easily add a shrinker that
>>> would relay this requests in response to memory pressure as well. However,
>>> zsmalloc doesn't implement the reclaim, or LRU tracking.
>>
>> I wrote a patch for zsmalloc reclaim a while ago:
>>
>> https://lwn.net/Articles/611713/
>>
>> however it didn't make it in, due to the lack of zsmalloc LRU, or any
>> proven benefit to zsmalloc reclaim.
>>
>> It's not really possible to add LRU to zsmalloc, by the nature of its
>> design, using the struct page fields directly; there's no extra field
>> to use as a lru entry.
> 
> Just for information, zsmalloc now registers shrinker callbacks
> 
> https://lkml.org/lkml/2015/7/8/497

Yeah but that's just for compaction, not freeing. I think that ideally zswap
should track the LRU on the level of pages it receives as input, and then just
tell zsmalloc/zbud to free them. Then zsmalloc would use its compaction to make
sure that the reclaim results in actual freeing of page frames. Zbud could
re-pair the orphaned half-pages to the same effect.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-24 Thread Sergey Senozhatsky
On (08/19/15 11:56), Dan Streetman wrote:
[..]
> > Ugh that's madness. Still, a documented madness is better than an 
> > undocumented one.
> 
> heh, i'm not sure why it's madness, the alternative of
> uncompressing/recompressing all pages into the new zpool and/or with
> the new compressor seems much worse ;-)
> 

Well, I sort of still think that 'change compressor and reboot' is OK. 5cents.

> >
> >>>
> >>>> The zsmalloc type zpool has a more
> >>>> +complex compressed page storage method, and it can achieve greater storage
> >>>> +densities.  However, zsmalloc does not implement compressed page eviction, so
> >>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
> >>>
> >>> I still wonder why anyone would use zsmalloc with zswap given this 
> >>> limitation.
> >>> It seems only fine for zram which has no real swap as fallback. And even 
> >>> zbud
> >>> doesn't have any shrinker interface that would react to memory pressure, 
> >>> so
> >>> there's a possibility of premature OOM... sigh.
> >>
> >> for situations where zswap isn't expected to ever fill up, zsmalloc
> >> will outperform zbud, since it has higher density.
> >
> > But then you could just use zram? :)
> 
> well not *expected* to fill up doesn't mean it *won't* fill up :)
> 
> >
> >> i'd argue that neither zbud nor zsmalloc are responsible for reacting
> >> to memory pressure, they just store the pages.  It's zswap that has to
> >> limit its size, which it does with max_percent_pool.
> >
> > Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim 
> > requests
> > from zswap when zswap hits the limit. Zswap could easily add a shrinker that
> > would relay this requests in response to memory pressure as well. However,
> > zsmalloc doesn't implement the reclaim, or LRU tracking.
> 
> I wrote a patch for zsmalloc reclaim a while ago:
> 
> https://lwn.net/Articles/611713/
> 
> however it didn't make it in, due to the lack of zsmalloc LRU, or any
> proven benefit to zsmalloc reclaim.
> 
> It's not really possible to add LRU to zsmalloc, by the nature of its
> design, using the struct page fields directly; there's no extra field
> to use as a lru entry.

Just for information, zsmalloc now registers shrinker callbacks

https://lkml.org/lkml/2015/7/8/497

-ss


Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Dan Streetman
On Wed, Aug 19, 2015 at 11:02 AM, Vlastimil Babka  wrote:
> On 08/19/2015 04:21 PM, Dan Streetman wrote:
>> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka  wrote:
>>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
>>>> +sysfs "zpool" attribute, e.g.
>>>> +
>>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>>
>>> What exactly happens if zswap is already being used and has allocated pages 
>>> in
>>> one type of pool, and you're changing it to the other one?
>>
>> zswap has a rcu list where each entry contains a specific compressor
>> and zpool.  When either the compressor or zpool is changed, a new
>> entry is created with a new compressor and pool and put at the front
>> of the list.  New pages always use the "current" (first) entry.  Any
>> old (unused) entries are freed whenever all the pages they contain are
>> removed.
>>
>> So when the compressor or zpool is changed, the only thing that
>> happens is zswap creates a new compressor and zpool and places it at
>> the front of the list, for new pages to use.  No existing pages are
>> touched.
>
> Ugh that's madness. Still, a documented madness is better than an 
> undocumented one.

heh, i'm not sure why it's madness, the alternative of
uncompressing/recompressing all pages into the new zpool and/or with
the new compressor seems much worse ;-)

>
>>>
>>>> The zsmalloc type zpool has a more
>>>> +complex compressed page storage method, and it can achieve greater storage
>>>> +densities.  However, zsmalloc does not implement compressed page eviction, so
>>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>>
>>> I still wonder why anyone would use zsmalloc with zswap given this 
>>> limitation.
>>> It seems only fine for zram which has no real swap as fallback. And even 
>>> zbud
>>> doesn't have any shrinker interface that would react to memory pressure, so
>>> there's a possibility of premature OOM... sigh.
>>
>> for situations where zswap isn't expected to ever fill up, zsmalloc
>> will outperform zbud, since it has higher density.
>
> But then you could just use zram? :)

well not *expected* to fill up doesn't mean it *won't* fill up :)

>
>> i'd argue that neither zbud nor zsmalloc are responsible for reacting
>> to memory pressure, they just store the pages.  It's zswap that has to
>> limit its size, which it does with max_percent_pool.
>
> Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim 
> requests
> from zswap when zswap hits the limit. Zswap could easily add a shrinker that
> would relay this requests in response to memory pressure as well. However,
> zsmalloc doesn't implement the reclaim, or LRU tracking.

I wrote a patch for zsmalloc reclaim a while ago:

https://lwn.net/Articles/611713/

however it didn't make it in, due to the lack of zsmalloc LRU, or any
proven benefit to zsmalloc reclaim.

It's not really possible to add LRU to zsmalloc, by the nature of its
design, using the struct page fields directly; there's no extra field
to use as a lru entry.


>
> One could also argue that aging should be tracked in zswap, and it would just
> tell zbud/zmalloc to drop a specific compressed page. But that wouldn't 
> reliably
> translate into freeing of page frames...
>

Yep, that was Minchan's suggestion as well, which I agree with,
although that would also require a new api function to free the entire
page that a single compressed page is in.


Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Vlastimil Babka
On 08/19/2015 04:21 PM, Dan Streetman wrote:
> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka  wrote:
>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>> +pages are freed.  The pool is not preallocated.  By default, a zpool of 
>>> type
>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using 
>>> the
>>> +sysfs "zpool" attribute, e.g.
>>> +
>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>
>> What exactly happens if zswap is already being used and has allocated pages 
>> in
>> one type of pool, and you're changing it to the other one?
> 
> zswap has a rcu list where each entry contains a specific compressor
> and zpool.  When either the compressor or zpool is changed, a new
> entry is created with a new compressor and pool and put at the front
> of the list.  New pages always use the "current" (first) entry.  Any
> old (unused) entries are freed whenever all the pages they contain are
> removed.
> 
> So when the compressor or zpool is changed, the only thing that
> happens is zswap creates a new compressor and zpool and places it at
> the front of the list, for new pages to use.  No existing pages are
> touched.

Ugh that's madness. Still, a documented madness is better than an undocumented 
one.

>>
>>> The zsmalloc type zpool has a more
>>> +complex compressed page storage method, and it can achieve greater storage
>>> +densities.  However, zsmalloc does not implement compressed page eviction, 
>>> so
>>> +once zswap fills it cannot evict the oldest page, it can only reject new 
>>> pages.
>>
>> I still wonder why anyone would use zsmalloc with zswap given this 
>> limitation.
>> It seems only fine for zram which has no real swap as fallback. And even zbud
>> doesn't have any shrinker interface that would react to memory pressure, so
>> there's a possibility of premature OOM... sigh.
> 
> for situations where zswap isn't expected to ever fill up, zsmalloc
> will outperform zbud, since it has higher density.

But then you could just use zram? :)

> i'd argue that neither zbud nor zsmalloc are responsible for reacting
> to memory pressure, they just store the pages.  It's zswap that has to
> limit its size, which it does with max_percent_pool.

Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim requests
from zswap when zswap hits the limit. Zswap could easily add a shrinker that
would relay these requests in response to memory pressure as well. However,
zsmalloc doesn't implement the reclaim, or LRU tracking.

One could also argue that aging should be tracked in zswap, and it would just
tell zbud/zsmalloc to drop a specific compressed page. But that wouldn't reliably
translate into freeing of page frames...
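
The last point can be illustrated with a toy model. This is only a sketch, assuming a zbud-style frame that holds up to two compressed pages; `struct frame` and `frame_drop()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one page frame holding up to two compressed pages. */
struct frame {
    bool used[2];
};

/*
 * Drop the compressed page in `slot` (0 or 1).
 * Returns true only if the frame holds nothing afterwards, i.e. only
 * then could the underlying page frame actually be given back.
 */
bool frame_drop(struct frame *f, int slot)
{
    assert(slot == 0 || slot == 1);
    f->used[slot] = false;
    return !f->used[0] && !f->used[1];
}
```

Dropping the oldest compressed page from a fully paired frame returns false here: the frame stays allocated until its other occupant is dropped too, which is why LRU tracking above the allocator does not by itself free page frames.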



Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Dan Streetman
On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka  wrote:
> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>> Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
>> and "compressor" params are now changeable at runtime.
>>
>> Signed-off-by: Dan Streetman 
>> ---
>>  Documentation/vm/zswap.txt | 31 +++
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
>> index 8458c08..06f7ce2 100644
>> --- a/Documentation/vm/zswap.txt
>> +++ b/Documentation/vm/zswap.txt
>> @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the 
>> sysfs interface.
>>  An example command to enable zswap at runtime, assuming sysfs is mounted
>>  at /sys, is:
>>
>> -echo 1 > /sys/modules/zswap/parameters/enabled
>> +echo 1 > /sys/module/zswap/parameters/enabled
>>
>>  When zswap is disabled at runtime it will stop storing pages that are
>>  being swapped out.  However, it will _not_ immediately write out or fault
>> @@ -49,14 +49,27 @@ Zswap receives pages for compression through the 
>> Frontswap API and is able to
>>  evict pages from its own compressed pool on an LRU basis and write them 
>> back to
>>  the backing swap device in the case that the compressed pool is full.
>>
>> -Zswap makes use of zbud for the managing the compressed memory pool.  Each
>> -allocation in zbud is not directly accessible by address.  Rather, a handle 
>> is
>> +Zswap makes use of zpool for the managing the compressed memory pool.  Each
>> +allocation in zpool is not directly accessible by address.  Rather, a 
>> handle is
>>  returned by the allocation routine and that handle must be mapped before 
>> being
>>  accessed.  The compressed memory pool grows on demand and shrinks as 
>> compressed
>> -pages are freed.  The pool is not preallocated.
>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using 
>> the
>> +sysfs "zpool" attribute, e.g.
>> +
>> +echo zbud > /sys/module/zswap/parameters/zpool
>
> What exactly happens if zswap is already being used and has allocated pages in
> one type of pool, and you're changing it to the other one?

zswap has a rcu list where each entry contains a specific compressor
and zpool.  When either the compressor or zpool is changed, a new
entry is created with a new compressor and pool and put at the front
of the list.  New pages always use the "current" (first) entry.  Any
old (unused) entries are freed whenever all the pages they contain are
removed.

So when the compressor or zpool is changed, the only thing that
happens is zswap creates a new compressor and zpool and places it at
the front of the list, for new pages to use.  No existing pages are
touched.
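
A minimal user-space sketch of the list behaviour described above may help. It assumes a plain singly linked list with no RCU or locking, and all names (`pool_entry`, `pool_list_switch()`, and so on) are hypothetical; they are not taken from mm/zswap.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one (compressor, zpool) pairing. */
struct pool_entry {
    char compressor[16];
    char zpool[16];
    unsigned long nr_pages;      /* pages still stored in this entry */
    struct pool_entry *next;
};

struct pool_list {
    struct pool_entry *head;     /* the head is the "current" entry */
};

/* Changing the compressor or zpool pushes a new entry at the front. */
struct pool_entry *pool_list_switch(struct pool_list *l,
                                    const char *compressor,
                                    const char *zpool)
{
    struct pool_entry *e = calloc(1, sizeof(*e));
    struct pool_entry *old = l->head;

    if (!e)
        return NULL;
    /* calloc() zeroed the buffers, so the copies stay NUL-terminated. */
    strncpy(e->compressor, compressor, sizeof(e->compressor) - 1);
    strncpy(e->zpool, zpool, sizeof(e->zpool) - 1);

    if (old && old->nr_pages == 0) {
        /* The old current entry holds nothing: drop it right away. */
        e->next = old->next;
        free(old);
    } else {
        e->next = old;
    }
    l->head = e;
    return e;
}

/* New pages always go to the current (first) entry. */
struct pool_entry *pool_list_store(struct pool_list *l)
{
    assert(l->head != NULL);
    l->head->nr_pages++;
    return l->head;
}

/* Removing a page frees an old entry once it is empty. */
void pool_list_remove(struct pool_list *l, struct pool_entry *e)
{
    struct pool_entry **pp = &l->head;

    assert(e->nr_pages > 0);
    e->nr_pages--;
    if (e->nr_pages > 0 || e == l->head)
        return;
    while (*pp != e)
        pp = &(*pp)->next;
    *pp = e->next;
    free(e);
}

size_t pool_list_len(const struct pool_list *l)
{
    size_t n = 0;

    for (const struct pool_entry *e = l->head; e; e = e->next)
        n++;
    return n;
}
```

In this model a page stored under lzo/zbud stays in its entry after a switch to deflate/zsmalloc, and that old entry disappears only when its last page is removed.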

>
>> +
>> +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, 
>> which
>> +means the compression ratio will always be exactly 2:1 (not including 
>> half-full
>> +zbud pages), and any page that compresses to more than 1/2 page in size 
>> will be
>> +rejected (and written to the swap disk).
>
> Hm is this correct? I've been going through the zbud code briefly (as of 
> Linus'
> tree) and it seems to me that it will accept pages larger than 1/2, but they
> will sit in the unbuddied list until a small enough "buddy" comes.

ha, yeah you're right.  I didn't read zbud_alloc closely before, it
definitely takes compressed pages > 1/2 page.  I'll update the doc.

thanks!
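
As a rough illustration of the behaviour discussed here, the sketch below models a pool in which each page frame holds at most two compressed pages. It is an assumption-laden toy: the 4096-byte page size, the eight-frame cap, the absence of any per-page header, and the names `model_pool` and `model_alloc()` are all made up for the example and do not reflect the actual zbud code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for the model */
#define MODEL_MAX_PAGES 8u      /* arbitrary cap to keep the model small */

/* One page frame with room for a first and a last "buddy". */
struct model_page {
    unsigned first;   /* bytes used by the first buddy */
    unsigned last;    /* bytes used by the second buddy, 0 if unbuddied */
};

struct model_pool {
    struct model_page pages[MODEL_MAX_PAGES];
    unsigned nr_pages;
};

/*
 * Store a compressed page of `size` bytes.
 * Returns the index of the frame used, or -1 if it cannot be stored.
 * A size above half a page is accepted; that frame simply stays
 * unbuddied until a small enough second allocation arrives.
 */
int model_alloc(struct model_pool *p, unsigned size)
{
    if (size == 0 || size > MODEL_PAGE_SIZE)
        return -1;

    /* First try to pair with an unbuddied frame that has enough room. */
    for (unsigned i = 0; i < p->nr_pages; i++) {
        struct model_page *pg = &p->pages[i];

        if (pg->last == 0 && pg->first + size <= MODEL_PAGE_SIZE) {
            pg->last = size;
            return (int)i;
        }
    }

    /* Otherwise open a new frame. */
    if (p->nr_pages == MODEL_MAX_PAGES)
        return -1;
    p->pages[p->nr_pages].first = size;
    p->pages[p->nr_pages].last = 0;
    return (int)p->nr_pages++;
}
```

With this model a 3000-byte allocation is accepted and later pairs with a 1000-byte one, so the effective ratio depends on what arrives rather than being fixed at exactly 2:1.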

>
>> The zsmalloc type zpool has a more
>> +complex compressed page storage method, and it can achieve greater storage
>> +densities.  However, zsmalloc does not implement compressed page eviction, 
>> so
>> +once zswap fills it cannot evict the oldest page, it can only reject new 
>> pages.
>
> I still wonder why anyone would use zsmalloc with zswap given this limitation.
> It seems only fine for zram which has no real swap as fallback. And even zbud
> doesn't have any shrinker interface that would react to memory pressure, so
> there's a possibility of premature OOM... sigh.

for situations where zswap isn't expected to ever fill up, zsmalloc
will outperform zbud, since it has higher density.

i'd argue that neither zbud nor zsmalloc are responsible for reacting
to memory pressure, they just store the pages.  It's zswap that has to
limit its size, which it does with max_pool_percent.
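
The size limit and the zbud/zsmalloc difference can be sketched together. This is a simplified model only: the parameter name max_pool_percent is the one used in the patch, but the function names, the enum, and the exact comparison are assumptions and are not copied from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for a page arriving at the compressed pool. */
enum store_action { STORE_ACCEPT, STORE_EVICT_OLDEST, STORE_REJECT };

/* The pool may occupy at most max_pool_percent of total RAM pages. */
unsigned long pool_limit_pages(unsigned long total_ram_pages,
                               unsigned max_pool_percent)
{
    assert(max_pool_percent <= 100);
    return total_ram_pages * max_pool_percent / 100;
}

/*
 * Decide what to do with a new page.  Below the limit the page is
 * stored.  At the limit, a pool type that implements eviction can
 * write back its oldest page to make room, while one without
 * eviction can only reject the new page.
 */
enum store_action store_decision(unsigned long pool_pages,
                                 unsigned long total_ram_pages,
                                 unsigned max_pool_percent,
                                 bool pool_can_evict)
{
    if (pool_pages < pool_limit_pages(total_ram_pages, max_pool_percent))
        return STORE_ACCEPT;
    return pool_can_evict ? STORE_EVICT_OLDEST : STORE_REJECT;
}
```

Note that nothing in this check looks at overall memory pressure; it only compares the pool size with a fixed share of RAM, which is the gap the shrinker discussion is about.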

>
>>  When a swap page is passed from frontswap to zswap, zswap maintains a 
>> mapping
>> -of the swap entry, a combination of the swap type and swap offset, to the 
>> zbud
>> +of the swap entry, a combination of the swap type and swap offset, to the 
>> zpool
>>  handle that references that compressed swap page.  This mapping is achieved
>>  with a red-black tree per swap type.  The swap offset is the search key for 
>> the
>>  tree nodes.
>> @@ -74,9 +87,11 @@ controlled policy:
>>  

Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Vlastimil Babka
On 08/18/2015 09:07 PM, Dan Streetman wrote:
> Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
> and "compressor" params are now changeable at runtime.
> 
> Signed-off-by: Dan Streetman 
> ---
>  Documentation/vm/zswap.txt | 31 +++
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
> index 8458c08..06f7ce2 100644
> --- a/Documentation/vm/zswap.txt
> +++ b/Documentation/vm/zswap.txt
> @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs 
> interface.
>  An example command to enable zswap at runtime, assuming sysfs is mounted
>  at /sys, is:
>  
> -echo 1 > /sys/modules/zswap/parameters/enabled
> +echo 1 > /sys/module/zswap/parameters/enabled
>  
>  When zswap is disabled at runtime it will stop storing pages that are
>  being swapped out.  However, it will _not_ immediately write out or fault
> @@ -49,14 +49,27 @@ Zswap receives pages for compression through the 
> Frontswap API and is able to
>  evict pages from its own compressed pool on an LRU basis and write them back 
> to
>  the backing swap device in the case that the compressed pool is full.
>  
> -Zswap makes use of zbud for the managing the compressed memory pool.  Each
> -allocation in zbud is not directly accessible by address.  Rather, a handle 
> is
> +Zswap makes use of zpool for the managing the compressed memory pool.  Each
> +allocation in zpool is not directly accessible by address.  Rather, a handle 
> is
>  returned by the allocation routine and that handle must be mapped before 
> being
>  accessed.  The compressed memory pool grows on demand and shrinks as 
> compressed
> -pages are freed.  The pool is not preallocated.
> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
> +zbud is created, but it can be selected at boot time by setting the "zpool"
> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using 
> the
> +sysfs "zpool" attribute, e.g.
> +
> +echo zbud > /sys/module/zswap/parameters/zpool

What exactly happens if zswap is already being used and has allocated pages in
one type of pool, and you're changing it to the other one?

> +
> +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, 
> which
> +means the compression ratio will always be exactly 2:1 (not including 
> half-full
> +zbud pages), and any page that compresses to more than 1/2 page in size will 
> be
> +rejected (and written to the swap disk).

Hm is this correct? I've been going through the zbud code briefly (as of Linus'
tree) and it seems to me that it will accept pages larger than 1/2, but they
will sit in the unbuddied list until a small enough "buddy" comes.

> The zsmalloc type zpool has a more
> +complex compressed page storage method, and it can achieve greater storage
> +densities.  However, zsmalloc does not implement compressed page eviction, so
> +once zswap fills it cannot evict the oldest page, it can only reject new 
> pages.

I still wonder why anyone would use zsmalloc with zswap given this limitation.
It seems only fine for zram which has no real swap as fallback. And even zbud
doesn't have any shrinker interface that would react to memory pressure, so
there's a possibility of premature OOM... sigh.

>  When a swap page is passed from frontswap to zswap, zswap maintains a mapping
> -of the swap entry, a combination of the swap type and swap offset, to the 
> zbud
> +of the swap entry, a combination of the swap type and swap offset, to the 
> zpool
>  handle that references that compressed swap page.  This mapping is achieved
>  with a red-black tree per swap type.  The swap offset is the search key for 
> the
>  tree nodes.
> @@ -74,9 +87,11 @@ controlled policy:
>  * max_pool_percent - The maximum percentage of memory that the compressed
>  pool can occupy.
>  
> -Zswap allows the compressor to be selected at kernel boot time by setting the
> -“compressor” attribute.  The default compressor is lzo.  e.g.
> -zswap.compressor=deflate
> +The default compressor is lzo, but it can be selected at boot time by setting
> +the “compressor” attribute, e.g. zswap.compressor=lzo.  It can also be 
> changed
> +at runtime using the sysfs "compressor" attribute, e.g.
> +
> +echo lzo > /sys/module/zswap/parameters/compressor

Again, what happens to pages already compressed? Are they freed? Recompressed?
Does zswap remember it has to decompress them differently than the currently
used compressor?

>  A debugfs interface is provided for various statistic about pool size, number
>  of pages stored, and various counters for the reasons pages are rejected.
> 



Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Vlastimil Babka
On 08/18/2015 09:07 PM, Dan Streetman wrote:
 Change the Documentation/vm/zswap.txt doc to indicate that the zpool
 and compressor params are now changeable at runtime.
 
 Signed-off-by: Dan Streetman ddstr...@ieee.org
 ---
  Documentation/vm/zswap.txt | 31 +++
  1 file changed, 23 insertions(+), 8 deletions(-)
 
 diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
 index 8458c08..06f7ce2 100644
 --- a/Documentation/vm/zswap.txt
 +++ b/Documentation/vm/zswap.txt
 @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs 
 interface.
  An example command to enable zswap at runtime, assuming sysfs is mounted
  at /sys, is:
  
 -echo 1  /sys/modules/zswap/parameters/enabled
 +echo 1  /sys/module/zswap/parameters/enabled
  
  When zswap is disabled at runtime it will stop storing pages that are
  being swapped out.  However, it will _not_ immediately write out or fault
 @@ -49,14 +49,27 @@ Zswap receives pages for compression through the 
 Frontswap API and is able to
  evict pages from its own compressed pool on an LRU basis and write them back 
 to
  the backing swap device in the case that the compressed pool is full.
  
 -Zswap makes use of zbud for the managing the compressed memory pool.  Each
 -allocation in zbud is not directly accessible by address.  Rather, a handle 
 is
 +Zswap makes use of zpool for the managing the compressed memory pool.  Each
 +allocation in zpool is not directly accessible by address.  Rather, a handle 
 is
  returned by the allocation routine and that handle must be mapped before 
 being
  accessed.  The compressed memory pool grows on demand and shrinks as 
 compressed
 -pages are freed.  The pool is not preallocated.
 +pages are freed.  The pool is not preallocated.  By default, a zpool of type
 +zbud is created, but it can be selected at boot time by setting the zpool
 +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using 
 the
 +sysfs zpool attribute, e.g.
 +
 +echo zbud  /sys/module/zswap/parameters/zpool

What exactly happens if zswap is already being used and has allocated pages in
one type of pool, and you're changing it to the other one?

 +
 +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, 
 which
 +means the compression ratio will always be exactly 2:1 (not including 
 half-full
 +zbud pages), and any page that compresses to more than 1/2 page in size will 
 be
 +rejected (and written to the swap disk).

Hm is this correct? I've been going through the zbud code briefly (as of Linus'
tree) and it seems to me that it will accept pages larger than 1/2, but they
will sit in the unbuddied list until a small enough buddy comes.

 The zsmalloc type zpool has a more
 +complex compressed page storage method, and it can achieve greater storage
 +densities.  However, zsmalloc does not implement compressed page eviction, so
 +once zswap fills it cannot evict the oldest page, it can only reject new 
 pages.

I still wonder why anyone would use zsmalloc with zswap given this limitation.
It seems only fine for zram which has no real swap as fallback. And even zbud
doesn't have any shrinker interface that would react to memory pressure, so
there's a possibility of premature OOM... sigh.

  When a swap page is passed from frontswap to zswap, zswap maintains a mapping
 -of the swap entry, a combination of the swap type and swap offset, to the 
 zbud
 +of the swap entry, a combination of the swap type and swap offset, to the 
 zpool
  handle that references that compressed swap page.  This mapping is achieved
  with a red-black tree per swap type.  The swap offset is the search key for 
 the
  tree nodes.
 @@ -74,9 +87,11 @@ controlled policy:
  * max_pool_percent - The maximum percentage of memory that the compressed
  pool can occupy.
  
 -Zswap allows the compressor to be selected at kernel boot time by setting the
 -“compressor” attribute.  The default compressor is lzo.  e.g.
 -zswap.compressor=deflate
 +The default compressor is lzo, but it can be selected at boot time by setting
 +the “compressor” attribute, e.g. zswap.compressor=lzo.  It can also be 
 changed
 +at runtime using the sysfs compressor attribute, e.g.
 +
 +echo lzo  /sys/module/zswap/parameters/compressor

Again, what happens to pages already compressed? Are they freed? Recompressed?
Does zswap remember it has to decompress them differently than the currently
used compressor?

  A debugfs interface is provided for various statistic about pool size, number
  of pages stored, and various counters for the reasons pages are rejected.
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Vlastimil Babka
On 08/19/2015 04:21 PM, Dan Streetman wrote:
> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
>>> +sysfs "zpool" attribute, e.g.
>>> +
>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>
>> What exactly happens if zswap is already being used and has allocated pages in
>> one type of pool, and you're changing it to the other one?
>
> zswap has a rcu list where each entry contains a specific compressor
> and zpool.  When either the compressor or zpool is changed, a new
> entry is created with a new compressor and pool and put at the front
> of the list.  New pages always use the current (first) entry.  Any
> old (unused) entries are freed whenever all the pages they contain are
> removed.
>
> So when the compressor or zpool is changed, the only thing that
> happens is zswap creates a new compressor and zpool and places it at
> the front of the list, for new pages to use.  No existing pages are
> touched.

Ugh that's madness. Still, a documented madness is better than an
undocumented one.


>>> The zsmalloc type zpool has a more
>>> +complex compressed page storage method, and it can achieve greater storage
>>> +densities.  However, zsmalloc does not implement compressed page eviction, so
>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>
>> I still wonder why anyone would use zsmalloc with zswap given this limitation.
>> It seems only fine for zram which has no real swap as fallback. And even zbud
>> doesn't have any shrinker interface that would react to memory pressure, so
>> there's a possibility of premature OOM... sigh.
>
> for situations where zswap isn't expected to ever fill up, zsmalloc
> will outperform zbud, since it has higher density.

But then you could just use zram? :)

> i'd argue that neither zbud nor zsmalloc are responsible for reacting
> to memory pressure, they just store the pages.  It's zswap that has to
> limit its size, which it does with max_pool_percent.

Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim requests
from zswap when zswap hits the limit. Zswap could easily add a shrinker that
would relay these requests in response to memory pressure as well. However,
zsmalloc doesn't implement the reclaim, or LRU tracking.

One could also argue that aging should be tracked in zswap, and it would just
tell zbud/zsmalloc to drop a specific compressed page. But that wouldn't reliably
translate into freeing of page frames...



Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Dan Streetman
On Wed, Aug 19, 2015 at 11:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
> On 08/19/2015 04:21 PM, Dan Streetman wrote:
>> On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
>>> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>>>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>>>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>>>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
>>>> +sysfs "zpool" attribute, e.g.
>>>> +
>>>> +echo zbud > /sys/module/zswap/parameters/zpool
>>>
>>> What exactly happens if zswap is already being used and has allocated pages in
>>> one type of pool, and you're changing it to the other one?
>>
>> zswap has a rcu list where each entry contains a specific compressor
>> and zpool.  When either the compressor or zpool is changed, a new
>> entry is created with a new compressor and pool and put at the front
>> of the list.  New pages always use the current (first) entry.  Any
>> old (unused) entries are freed whenever all the pages they contain are
>> removed.
>>
>> So when the compressor or zpool is changed, the only thing that
>> happens is zswap creates a new compressor and zpool and places it at
>> the front of the list, for new pages to use.  No existing pages are
>> touched.
>
> Ugh that's madness. Still, a documented madness is better than an
> undocumented one.

heh, i'm not sure why it's madness, the alternative of
uncompressing/recompressing all pages into the new zpool and/or with
the new compressor seems much worse ;-)



>>>> The zsmalloc type zpool has a more
>>>> +complex compressed page storage method, and it can achieve greater storage
>>>> +densities.  However, zsmalloc does not implement compressed page eviction, so
>>>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>>>
>>> I still wonder why anyone would use zsmalloc with zswap given this limitation.
>>> It seems only fine for zram which has no real swap as fallback. And even zbud
>>> doesn't have any shrinker interface that would react to memory pressure, so
>>> there's a possibility of premature OOM... sigh.
>>
>> for situations where zswap isn't expected to ever fill up, zsmalloc
>> will outperform zbud, since it has higher density.
>
> But then you could just use zram? :)

well not *expected* to fill up doesn't mean it *won't* fill up :)


>> i'd argue that neither zbud nor zsmalloc are responsible for reacting
>> to memory pressure, they just store the pages.  It's zswap that has to
>> limit its size, which it does with max_pool_percent.
>
> Yeah but it's zbud that tracks the aging via LRU and reacts to reclaim requests
> from zswap when zswap hits the limit. Zswap could easily add a shrinker that
> would relay these requests in response to memory pressure as well. However,
> zsmalloc doesn't implement the reclaim, or LRU tracking.

I wrote a patch for zsmalloc reclaim a while ago:

https://lwn.net/Articles/611713/

however it didn't make it in, due to the lack of zsmalloc LRU, or any
proven benefit to zsmalloc reclaim.

It's not really possible to add LRU to zsmalloc, by the nature of its
design, using the struct page fields directly; there's no extra field
to use as a lru entry.



> One could also argue that aging should be tracked in zswap, and it would just
> tell zbud/zsmalloc to drop a specific compressed page. But that wouldn't reliably
> translate into freeing of page frames...


Yep, that was Minchan's suggestion as well, which I agree with,
although that would also require a new api function to free the entire
page that a single compressed page is in.


Re: [PATCH] zswap: update docs for runtime-changeable attributes

2015-08-19 Thread Dan Streetman
On Wed, Aug 19, 2015 at 10:02 AM, Vlastimil Babka <vba...@suse.cz> wrote:
> On 08/18/2015 09:07 PM, Dan Streetman wrote:
>> Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
>> and "compressor" params are now changeable at runtime.
>>
>> Signed-off-by: Dan Streetman <ddstr...@ieee.org>
>> ---
>>  Documentation/vm/zswap.txt | 31 +++
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
>> index 8458c08..06f7ce2 100644
>> --- a/Documentation/vm/zswap.txt
>> +++ b/Documentation/vm/zswap.txt
>> @@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs interface.
>>  An example command to enable zswap at runtime, assuming sysfs is mounted
>>  at /sys, is:
>>
>> -echo 1 > /sys/modules/zswap/parameters/enabled
>> +echo 1 > /sys/module/zswap/parameters/enabled
>>
>>  When zswap is disabled at runtime it will stop storing pages that are
>>  being swapped out.  However, it will _not_ immediately write out or fault
>> @@ -49,14 +49,27 @@ Zswap receives pages for compression through the Frontswap API and is able to
>>  evict pages from its own compressed pool on an LRU basis and write them back to
>>  the backing swap device in the case that the compressed pool is full.
>>
>> -Zswap makes use of zbud for the managing the compressed memory pool.  Each
>> -allocation in zbud is not directly accessible by address.  Rather, a handle is
>> +Zswap makes use of zpool for the managing the compressed memory pool.  Each
>> +allocation in zpool is not directly accessible by address.  Rather, a handle is
>>  returned by the allocation routine and that handle must be mapped before being
>>  accessed.  The compressed memory pool grows on demand and shrinks as compressed
>> -pages are freed.  The pool is not preallocated.
>> +pages are freed.  The pool is not preallocated.  By default, a zpool of type
>> +zbud is created, but it can be selected at boot time by setting the "zpool"
>> +attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
>> +sysfs "zpool" attribute, e.g.
>> +
>> +echo zbud > /sys/module/zswap/parameters/zpool
>
> What exactly happens if zswap is already being used and has allocated pages in
> one type of pool, and you're changing it to the other one?

zswap has a rcu list where each entry contains a specific compressor
and zpool.  When either the compressor or zpool is changed, a new
entry is created with a new compressor and pool and put at the front
of the list.  New pages always use the current (first) entry.  Any
old (unused) entries are freed whenever all the pages they contain are
removed.

So when the compressor or zpool is changed, the only thing that
happens is zswap creates a new compressor and zpool and places it at
the front of the list, for new pages to use.  No existing pages are
touched.
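
The list behaviour described above can be modelled in a few lines of
userspace Python. This is only a sketch of the idea, not the kernel code:
the class and method names (Pool, ZswapModel, change_params, and so on) are
made up for illustration, and a plain Python list stands in for the rcu list.

```python
class Pool:
    """One (zpool type, compressor) pair and the swap offsets stored with it."""

    def __init__(self, zpool, compressor):
        self.zpool = zpool
        self.compressor = compressor
        self.offsets = set()


class ZswapModel:
    """Toy model: pools[0] is the current pool, older pools follow it."""

    def __init__(self, zpool="zbud", compressor="lzo"):
        self.pools = [Pool(zpool, compressor)]
        self.owner = {}  # swap offset -> Pool that holds the page

    def change_params(self, zpool=None, compressor=None):
        # A change never touches stored pages; it only adds a new head entry.
        cur = self.pools[0]
        self.pools.insert(0, Pool(zpool or cur.zpool, compressor or cur.compressor))
        self._free_unused()

    def store(self, offset):
        if offset in self.owner:
            self.invalidate(offset)  # a re-stored offset replaces the old copy
        pool = self.pools[0]  # new pages always use the current (first) entry
        pool.offsets.add(offset)
        self.owner[offset] = pool

    def compressor_for(self, offset):
        # Each page is decompressed with the compressor of the pool holding it.
        return self.owner[offset].compressor

    def invalidate(self, offset):
        self.owner.pop(offset).offsets.discard(offset)
        self._free_unused()

    def _free_unused(self):
        # Old entries are dropped once empty; the current entry always stays.
        self.pools = [p for i, p in enumerate(self.pools) if i == 0 or p.offsets]
```

In this model, pages stored before a change_params() call keep reporting
their original compressor through compressor_for(), and the old pool
disappears from the list only after its last page has been invalidated.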


>> +
>> +The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
>> +means the compression ratio will always be exactly 2:1 (not including half-full
>> +zbud pages), and any page that compresses to more than 1/2 page in size will be
>> +rejected (and written to the swap disk).
>
> Hm is this correct? I've been going through the zbud code briefly (as of Linus'
> tree) and it seems to me that it will accept pages larger than 1/2, but they
> will sit in the unbuddied list until a small enough buddy comes.

ha, yeah you're right.  I didn't read zbud_alloc closely before, it
definitely takes compressed pages > 1/2 page.  I'll update the doc.

thanks!
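
A rough userspace model of the pairing behaviour discussed above, for
illustration only. It assumes a 4096-byte page and ignores the per-page
header and chunk rounding that the real zbud code has; ZbudModel and its
methods are hypothetical names, not zbud API.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ZbudModel:
    """Toy zbud: each page holds at most two compressed objects."""

    def __init__(self):
        self.unbuddied = []  # free bytes left in each page holding one object
        self.buddied = 0     # number of pages holding two objects

    def alloc(self, size):
        if not 0 < size <= PAGE_SIZE:
            raise ValueError("object does not fit in a page")
        # First fit: pair with an unbuddied page that has enough room left.
        for i, free in enumerate(self.unbuddied):
            if size <= free:
                del self.unbuddied[i]
                self.buddied += 1
                return "buddied"
        # Otherwise take a fresh page, even if the object is > 1/2 page.
        self.unbuddied.append(PAGE_SIZE - size)
        return "unbuddied"

    def pages_used(self):
        return self.buddied + len(self.unbuddied)
```

With this model an object larger than half a page is accepted and simply
waits on the unbuddied list until a small enough object arrives to share
its page, which is why the ratio is at most 2:1 rather than exactly 2:1.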


>> The zsmalloc type zpool has a more
>> +complex compressed page storage method, and it can achieve greater storage
>> +densities.  However, zsmalloc does not implement compressed page eviction, so
>> +once zswap fills it cannot evict the oldest page, it can only reject new pages.
>
> I still wonder why anyone would use zsmalloc with zswap given this limitation.
> It seems only fine for zram which has no real swap as fallback. And even zbud
> doesn't have any shrinker interface that would react to memory pressure, so
> there's a possibility of premature OOM... sigh.

for situations where zswap isn't expected to ever fill up, zsmalloc
will outperform zbud, since it has higher density.

i'd argue that neither zbud nor zsmalloc are responsible for reacting
to memory pressure, they just store the pages.  It's zswap that has to
limit its size, which it does with max_pool_percent.
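
As a sketch of how that limit plays out with the two zpool types, assuming
the limit is a simple percentage of total RAM pages. The function names and
the returned strings are hypothetical; only max_pool_percent and the
evict-versus-reject behaviour come from the doc text in the patch.

```python
def pool_is_full(pool_pages, total_ram_pages, max_pool_percent):
    """True once the compressed pool exceeds its share of RAM."""
    return pool_pages > total_ram_pages * max_pool_percent // 100


def on_store(pool_pages, total_ram_pages, max_pool_percent, zpool):
    """What the model does with a page that is being swapped out."""
    if not pool_is_full(pool_pages, total_ram_pages, max_pool_percent):
        return "store"
    if zpool == "zbud":
        # zbud implements eviction, so the oldest page can be written back.
        return "evict-oldest-then-store"
    # zsmalloc has no eviction: the new page goes to the swap device instead.
    return "reject"
```

For example, with total_ram_pages=100 and max_pool_percent=20 (values picked
only for the example), a pool of 21 pages counts as full, so a zsmalloc
pool rejects the next page while a zbud pool evicts first.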


>>  When a swap page is passed from frontswap to zswap, zswap maintains a mapping
>> -of the swap entry, a combination of the swap type and swap offset, to the zbud
>> +of the swap entry, a combination of the swap type and swap offset, to the zpool
>>  handle that references that compressed swap page.  This mapping is achieved
>>  with a red-black tree per swap type.  The swap offset is the search key for the
>>  tree nodes.
>> @@ -74,9 +87,11 @@ controlled policy:
>>  * max_pool_percent - The maximum percentage of memory that the compressed
>>  pool can occupy.
>>
>> -Zswap allows the compressor to be selected at

[PATCH] zswap: update docs for runtime-changeable attributes

2015-08-18 Thread Dan Streetman
Change the Documentation/vm/zswap.txt doc to indicate that the "zpool"
and "compressor" params are now changeable at runtime.

Signed-off-by: Dan Streetman <ddstr...@ieee.org>
---
 Documentation/vm/zswap.txt | 31 +++
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
index 8458c08..06f7ce2 100644
--- a/Documentation/vm/zswap.txt
+++ b/Documentation/vm/zswap.txt
@@ -32,7 +32,7 @@ can also be enabled and disabled at runtime using the sysfs interface.
 An example command to enable zswap at runtime, assuming sysfs is mounted
 at /sys, is:
 
-echo 1 > /sys/modules/zswap/parameters/enabled
+echo 1 > /sys/module/zswap/parameters/enabled
 
 When zswap is disabled at runtime it will stop storing pages that are
 being swapped out.  However, it will _not_ immediately write out or fault
@@ -49,14 +49,27 @@ Zswap receives pages for compression through the Frontswap API and is able to
 evict pages from its own compressed pool on an LRU basis and write them back to
 the backing swap device in the case that the compressed pool is full.
 
-Zswap makes use of zbud for the managing the compressed memory pool.  Each
-allocation in zbud is not directly accessible by address.  Rather, a handle is
+Zswap makes use of zpool for the managing the compressed memory pool.  Each
+allocation in zpool is not directly accessible by address.  Rather, a handle is
 returned by the allocation routine and that handle must be mapped before being
 accessed.  The compressed memory pool grows on demand and shrinks as compressed
-pages are freed.  The pool is not preallocated.
+pages are freed.  The pool is not preallocated.  By default, a zpool of type
+zbud is created, but it can be selected at boot time by setting the "zpool"
+attribute, e.g. zswap.zpool=zbud.  It can also be changed at runtime using the
+sysfs "zpool" attribute, e.g.
+
+echo zbud > /sys/module/zswap/parameters/zpool
+
+The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
+means the compression ratio will always be exactly 2:1 (not including half-full
+zbud pages), and any page that compresses to more than 1/2 page in size will be
+rejected (and written to the swap disk).  The zsmalloc type zpool has a more
+complex compressed page storage method, and it can achieve greater storage
+densities.  However, zsmalloc does not implement compressed page eviction, so
+once zswap fills it cannot evict the oldest page, it can only reject new pages.
 
 When a swap page is passed from frontswap to zswap, zswap maintains a mapping
-of the swap entry, a combination of the swap type and swap offset, to the zbud
+of the swap entry, a combination of the swap type and swap offset, to the zpool
 handle that references that compressed swap page.  This mapping is achieved
 with a red-black tree per swap type.  The swap offset is the search key for the
 tree nodes.
@@ -74,9 +87,11 @@ controlled policy:
 * max_pool_percent - The maximum percentage of memory that the compressed
 pool can occupy.
 
-Zswap allows the compressor to be selected at kernel boot time by setting the
-“compressor” attribute.  The default compressor is lzo.  e.g.
-zswap.compressor=deflate
+The default compressor is lzo, but it can be selected at boot time by setting
+the “compressor” attribute, e.g. zswap.compressor=lzo.  It can also be changed
+at runtime using the sysfs "compressor" attribute, e.g.
+
+echo lzo > /sys/module/zswap/parameters/compressor
 
 A debugfs interface is provided for various statistic about pool size, number
 of pages stored, and various counters for the reasons pages are rejected.
-- 
2.1.0


