Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Vitaly Wool
On Fri, Sep 25, 2015 at 10:47 AM, Minchan Kim  wrote:
> On Fri, Sep 25, 2015 at 10:17:54AM +0200, Vitaly Wool wrote:
>> 
>> > I already said questions, opinion and concerns but anything is not clear
>> > until now. Only clear thing I could hear is just "compaction stats are
>> > better" which is not enough for me. Sorry.
>> >
>> > 1) https://lkml.org/lkml/2015/9/15/33
>> > 2) https://lkml.org/lkml/2015/9/21/2
>>
>> Could you please stop perverting the facts, I did answer to that:
>> https://lkml.org/lkml/2015/9/21/753.
>>
>> Apart from that, an opinion is not necessarily something I would
>> answer. Concerns about zsmalloc are not in the scope of this patch's
>> discussion. If you have any concerns regarding this particular patch,
>> please let us know.
>
> Yes, I don't want to interrupt zbud thing which is Seth should maintain
> and I respect his decision but the reason I nacked is you said this patch
> aims for supporing zbud into zsmalloc for determinism.
> For that, at least, you should discuss with me and Sergey but I feel
> you are ignoring our comments.
>
>>
>> > Vitally, Please say what's the root cause of your problem and if it
>> > is external fragmentation, what's the problem of my approach?
>> >
>> > 1) make non-LRU page migrate
>> > 2) provide zsmalloc's migratpage
>>
>> The problem with your approach is that in your world I need to prove
>> my right to use zbud. This is a very strange speculation.
>
> No. If you want to contribute something, you should prove why yours
> is better. I already said my concerns and my approach. It's your turn
> that you should explain why it's better.

In fact, I do not add any specific functionality, my patches just do
what should have already been done -- that is, zram should have been
converted to use zpool api long ago. Your opposing to that is counter
productive.

>> > We should provide it for CMA as well as external fragmentation.
>> > I think we could solve your issue with above approach and
>> > it fundamentally makes zsmalloc/zbud happy in future.
>>
>> I doubt that but I'll answer in this thread:
>> https://lkml.org/lkml/2015/9/15/33 as zsmalloc deficiencies do not
>> have direct relation to this particular patch.
>>
>> > Also, please keep it in mind that zram has been in linux kernel for
>> > memory efficiency for a long time and later zswap/zbud was born
>> > for *determinism* at the cost of memory efficiency.
>>
>> Yep, and determinism is more important to me than the memory
>> efficiency. Dropping the compression ration from 3.2x to 1.8x is okay
>> with me and stalls in UI are not.
>
> Then, you could use zswap which have aimed for it with small changes
> to prevent writeback.

Should i come with a patch to zram explicitly stating that it is not
meant to be used in any environment that is deterministic / worst case
critical? Is that what you are aiming for?

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Sergey Senozhatsky
On (09/25/15 10:27), Vitaly Wool wrote:
> > Have you seen those symptoms before? How did you come up to a conclusion
> > that zram->zbud will do the trick?
> 
> I have data from various tests (partially described here:
> https://lkml.org/lkml/2015/9/17/244) and once again, I'll post a reply

yeah, I guess I'm just not so bright to quickly understand what is wrong
with zsmalloc from those numbers.

> to https://lkml.org/lkml/2015/9/15/33 with more detailed test
> description and explanation why zsmalloc is not the right choice for
> me.

great, thanks.

> > If those symptoms are some sort of a recent addition, then does it help
> > when you disable zsmalloc compaction?
> 
> No it doesn't. OTOH enabled zsmalloc compaction doesn't seem to have a
> substantial effect either.

hm. ok, that was my quick guess.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Minchan Kim
On Fri, Sep 25, 2015 at 05:47:13PM +0900, Minchan Kim wrote:
> On Fri, Sep 25, 2015 at 10:17:54AM +0200, Vitaly Wool wrote:
> > 
> > > I already said questions, opinion and concerns but anything is not clear
> > > until now. Only clear thing I could hear is just "compaction stats are
> > > better" which is not enough for me. Sorry.
> > >
> > > 1) https://lkml.org/lkml/2015/9/15/33
> > > 2) https://lkml.org/lkml/2015/9/21/2
> > 
> > Could you please stop perverting the facts, I did answer to that:
> > https://lkml.org/lkml/2015/9/21/753.
> > 
> > Apart from that, an opinion is not necessarily something I would
> > answer. Concerns about zsmalloc are not in the scope of this patch's
> > discussion. If you have any concerns regarding this particular patch,
> > please let us know.
> 
> Yes, I don't want to interrupt zbud thing which is Seth should maintain
> and I respect his decision but the reason I nacked is you said this patch
> aims for supporing zbud into zsmalloc for determinism.
   zram

Sorry for the typo.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Minchan Kim
On Fri, Sep 25, 2015 at 10:17:54AM +0200, Vitaly Wool wrote:
> 
> > I already said questions, opinion and concerns but anything is not clear
> > until now. Only clear thing I could hear is just "compaction stats are
> > better" which is not enough for me. Sorry.
> >
> > 1) https://lkml.org/lkml/2015/9/15/33
> > 2) https://lkml.org/lkml/2015/9/21/2
> 
> Could you please stop perverting the facts, I did answer to that:
> https://lkml.org/lkml/2015/9/21/753.
> 
> Apart from that, an opinion is not necessarily something I would
> answer. Concerns about zsmalloc are not in the scope of this patch's
> discussion. If you have any concerns regarding this particular patch,
> please let us know.

Yes, I don't want to interrupt zbud thing which is Seth should maintain
and I respect his decision but the reason I nacked is you said this patch
aims for supporing zbud into zsmalloc for determinism.
For that, at least, you should discuss with me and Sergey but I feel
you are ignoring our comments.

> 
> > Vitally, Please say what's the root cause of your problem and if it
> > is external fragmentation, what's the problem of my approach?
> >
> > 1) make non-LRU page migrate
> > 2) provide zsmalloc's migratpage
> 
> The problem with your approach is that in your world I need to prove
> my right to use zbud. This is a very strange speculation.

No. If you want to contribute something, you should prove why yours
is better. I already said my concerns and my approach. It's your turn
that you should explain why it's better.

> 
> > We should provide it for CMA as well as external fragmentation.
> > I think we could solve your issue with above approach and
> > it fundamentally makes zsmalloc/zbud happy in future.
> 
> I doubt that but I'll answer in this thread:
> https://lkml.org/lkml/2015/9/15/33 as zsmalloc deficiencies do not
> have direct relation to this particular patch.
> 
> > Also, please keep it in mind that zram has been in linux kernel for
> > memory efficiency for a long time and later zswap/zbud was born
> > for *determinism* at the cost of memory efficiency.
> 
> Yep, and determinism is more important to me than the memory
> efficiency. Dropping the compression ration from 3.2x to 1.8x is okay
> with me and stalls in UI are not.

Then, you could use zswap which have aimed for it with small changes
to prevent writeback.


> 
> ~vitaly

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Vitaly Wool
> Have you seen those symptoms before? How did you come up to a conclusion
> that zram->zbud will do the trick?

I have data from various tests (partially described here:
https://lkml.org/lkml/2015/9/17/244) and once again, I'll post a reply
to https://lkml.org/lkml/2015/9/15/33 with more detailed test
description and explanation why zsmalloc is not the right choice for
me.

> If those symptoms are some sort of a recent addition, then does it help
> when you disable zsmalloc compaction?

No it doesn't. OTOH enabled zsmalloc compaction doesn't seem to have a
substantial effect either.

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Vitaly Wool

> I already said questions, opinion and concerns but anything is not clear
> until now. Only clear thing I could hear is just "compaction stats are
> better" which is not enough for me. Sorry.
>
> 1) https://lkml.org/lkml/2015/9/15/33
> 2) https://lkml.org/lkml/2015/9/21/2

Could you please stop perverting the facts, I did answer to that:
https://lkml.org/lkml/2015/9/21/753.

Apart from that, an opinion is not necessarily something I would
answer. Concerns about zsmalloc are not in the scope of this patch's
discussion. If you have any concerns regarding this particular patch,
please let us know.

> Vitally, Please say what's the root cause of your problem and if it
> is external fragmentation, what's the problem of my approach?
>
> 1) make non-LRU page migrate
> 2) provide zsmalloc's migratpage

The problem with your approach is that in your world I need to prove
my right to use zbud. This is a very strange speculation.

> We should provide it for CMA as well as external fragmentation.
> I think we could solve your issue with above approach and
> it fundamentally makes zsmalloc/zbud happy in future.

I doubt that but I'll answer in this thread:
https://lkml.org/lkml/2015/9/15/33 as zsmalloc deficiencies do not
have direct relation to this particular patch.

> Also, please keep it in mind that zram has been in linux kernel for
> memory efficiency for a long time and later zswap/zbud was born
> for *determinism* at the cost of memory efficiency.

Yep, and determinism is more important to me than the memory
efficiency. Dropping the compression ration from 3.2x to 1.8x is okay
with me and stalls in UI are not.

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Sergey Senozhatsky
On (09/25/15 11:13), Minchan Kim wrote:
> > Ok, I can see that having the allocator backends for zpool 
> > have the same set of constraints is nice.
> 
> Sorry for delay. I'm on vacation until next week.
> It seems Seth was missed in previous discusstion which was not the end.
> 
> I already said questions, opinion and concerns but anything is not clear
> until now. Only clear thing I could hear is just "compaction stats are
> better" which is not enough for me. Sorry.

Agree.

There weren't lots of answers, really.

Vitaly,

Have you seen those symptoms before? How did you come up to a conclusion
that zram->zbud will do the trick?

If those symptoms are some sort of a recent addition, then does it help
when you disable zsmalloc compaction?

---

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f59e8eb..b6c6a19 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1944,8 +1944,8 @@ struct zs_pool *zs_create_pool(const char *name, gfp_t 
flags)
 * Not critical, we still can use the pool
 * and user can trigger compaction manually.
 */
-   if (zs_register_shrinker(pool) == 0)
-   pool->shrinker_enabled = true;
+/* if (zs_register_shrinker(pool) == 0)
+   pool->shrinker_enabled = true;*/
return pool;
 
 err:

---


p.s. I'll be on vacation next week, so most likely will be quite slow
to answer.

-ss

> 
> 1) https://lkml.org/lkml/2015/9/15/33
> 2) https://lkml.org/lkml/2015/9/21/2
> 
> Vitally, Please say what's the root cause of your problem and if it
> is external fragmentation, what's the problem of my approach?
> 
> 1) make non-LRU page migrate
> 2) provide zsmalloc's migratpage
> 
> We should provide it for CMA as well as external fragmentation.
> I think we could solve your issue with above approach and
> it fundamentally makes zsmalloc/zbud happy in future.
> 
> Also, please keep it in mind that zram has been in linux kernel for
> memory efficiency for a long time and later zswap/zbud was born
> for *determinism* at the cost of memory efficiency.
> 
> Thanks.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Sergey Senozhatsky
On (09/25/15 11:13), Minchan Kim wrote:
> > Ok, I can see that having the allocator backends for zpool 
> > have the same set of constraints is nice.
> 
> Sorry for delay. I'm on vacation until next week.
> It seems Seth was missed in previous discusstion which was not the end.
> 
> I already said questions, opinion and concerns but anything is not clear
> until now. Only clear thing I could hear is just "compaction stats are
> better" which is not enough for me. Sorry.

Agree.

There weren't lots of answers, really.

Vitaly,

Have you seen those symptoms before? How did you come up to a conclusion
that zram->zbud will do the trick?

If those symptoms are some sort of a recent addition, then does it help
when you disable zsmalloc compaction?

---

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f59e8eb..b6c6a19 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1944,8 +1944,8 @@ struct zs_pool *zs_create_pool(const char *name, gfp_t 
flags)
 * Not critical, we still can use the pool
 * and user can trigger compaction manually.
 */
-   if (zs_register_shrinker(pool) == 0)
-   pool->shrinker_enabled = true;
+/* if (zs_register_shrinker(pool) == 0)
+   pool->shrinker_enabled = true;*/
return pool;
 
 err:

---


p.s. I'll be on vacation next week, so most likely will be quite slow
to answer.

-ss

> 
> 1) https://lkml.org/lkml/2015/9/15/33
> 2) https://lkml.org/lkml/2015/9/21/2
> 
> Vitally, Please say what's the root cause of your problem and if it
> is external fragmentation, what's the problem of my approach?
> 
> 1) make non-LRU page migrate
> 2) provide zsmalloc's migratpage
> 
> We should provide it for CMA as well as external fragmentation.
> I think we could solve your issue with above approach and
> it fundamentally makes zsmalloc/zbud happy in future.
> 
> Also, please keep it in mind that zram has been in linux kernel for
> memory efficiency for a long time and later zswap/zbud was born
> for *determinism* at the cost of memory efficiency.
> 
> Thanks.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Minchan Kim
On Fri, Sep 25, 2015 at 10:17:54AM +0200, Vitaly Wool wrote:
> 
> > I already said questions, opinion and concerns but anything is not clear
> > until now. Only clear thing I could hear is just "compaction stats are
> > better" which is not enough for me. Sorry.
> >
> > 1) https://lkml.org/lkml/2015/9/15/33
> > 2) https://lkml.org/lkml/2015/9/21/2
> 
> Could you please stop perverting the facts, I did answer to that:
> https://lkml.org/lkml/2015/9/21/753.
> 
> Apart from that, an opinion is not necessarily something I would
> answer. Concerns about zsmalloc are not in the scope of this patch's
> discussion. If you have any concerns regarding this particular patch,
> please let us know.

Yes, I don't want to interrupt zbud thing which is Seth should maintain
and I respect his decision but the reason I nacked is you said this patch
aims for supporing zbud into zsmalloc for determinism.
For that, at least, you should discuss with me and Sergey but I feel
you are ignoring our comments.

> 
> > Vitally, Please say what's the root cause of your problem and if it
> > is external fragmentation, what's the problem of my approach?
> >
> > 1) make non-LRU page migrate
> > 2) provide zsmalloc's migratpage
> 
> The problem with your approach is that in your world I need to prove
> my right to use zbud. This is a very strange speculation.

No. If you want to contribute something, you should prove why yours
is better. I already said my concerns and my approach. It's your turn
that you should explain why it's better.

> 
> > We should provide it for CMA as well as external fragmentation.
> > I think we could solve your issue with above approach and
> > it fundamentally makes zsmalloc/zbud happy in future.
> 
> I doubt that but I'll answer in this thread:
> https://lkml.org/lkml/2015/9/15/33 as zsmalloc deficiencies do not
> have direct relation to this particular patch.
> 
> > Also, please keep it in mind that zram has been in linux kernel for
> > memory efficiency for a long time and later zswap/zbud was born
> > for *determinism* at the cost of memory efficiency.
> 
> Yep, and determinism is more important to me than the memory
> efficiency. Dropping the compression ration from 3.2x to 1.8x is okay
> with me and stalls in UI are not.

Then, you could use zswap which have aimed for it with small changes
to prevent writeback.


> 
> ~vitaly

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Vitaly Wool

> I already said questions, opinion and concerns but anything is not clear
> until now. Only clear thing I could hear is just "compaction stats are
> better" which is not enough for me. Sorry.
>
> 1) https://lkml.org/lkml/2015/9/15/33
> 2) https://lkml.org/lkml/2015/9/21/2

Could you please stop perverting the facts, I did answer to that:
https://lkml.org/lkml/2015/9/21/753.

Apart from that, an opinion is not necessarily something I would
answer. Concerns about zsmalloc are not in the scope of this patch's
discussion. If you have any concerns regarding this particular patch,
please let us know.

> Vitally, Please say what's the root cause of your problem and if it
> is external fragmentation, what's the problem of my approach?
>
> 1) make non-LRU page migrate
> 2) provide zsmalloc's migratpage

The problem with your approach is that in your world I need to prove
my right to use zbud. This is a very strange speculation.

> We should provide it for CMA as well as external fragmentation.
> I think we could solve your issue with above approach and
> it fundamentally makes zsmalloc/zbud happy in future.

I doubt that but I'll answer in this thread:
https://lkml.org/lkml/2015/9/15/33 as zsmalloc deficiencies do not
have direct relation to this particular patch.

> Also, please keep it in mind that zram has been in linux kernel for
> memory efficiency for a long time and later zswap/zbud was born
> for *determinism* at the cost of memory efficiency.

Yep, and determinism is more important to me than the memory
efficiency. Dropping the compression ration from 3.2x to 1.8x is okay
with me and stalls in UI are not.

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Vitaly Wool
> Have you seen those symptoms before? How did you come up to a conclusion
> that zram->zbud will do the trick?

I have data from various tests (partially described here:
https://lkml.org/lkml/2015/9/17/244) and once again, I'll post a reply
to https://lkml.org/lkml/2015/9/15/33 with more detailed test
description and explanation why zsmalloc is not the right choice for
me.

> If those symptoms are some sort of a recent addition, then does it help
> when you disable zsmalloc compaction?

No it doesn't. OTOH enabled zsmalloc compaction doesn't seem to have a
substantial effect either.

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Minchan Kim
On Fri, Sep 25, 2015 at 05:47:13PM +0900, Minchan Kim wrote:
> On Fri, Sep 25, 2015 at 10:17:54AM +0200, Vitaly Wool wrote:
> > 
> > > I already said questions, opinion and concerns but anything is not clear
> > > until now. Only clear thing I could hear is just "compaction stats are
> > > better" which is not enough for me. Sorry.
> > >
> > > 1) https://lkml.org/lkml/2015/9/15/33
> > > 2) https://lkml.org/lkml/2015/9/21/2
> > 
> > Could you please stop perverting the facts, I did answer to that:
> > https://lkml.org/lkml/2015/9/21/753.
> > 
> > Apart from that, an opinion is not necessarily something I would
> > answer. Concerns about zsmalloc are not in the scope of this patch's
> > discussion. If you have any concerns regarding this particular patch,
> > please let us know.
> 
> Yes, I don't want to interrupt zbud thing which is Seth should maintain
> and I respect his decision but the reason I nacked is you said this patch
> aims for supporing zbud into zsmalloc for determinism.
   zram

Sorry for the typo.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Vitaly Wool
On Fri, Sep 25, 2015 at 10:47 AM, Minchan Kim  wrote:
> On Fri, Sep 25, 2015 at 10:17:54AM +0200, Vitaly Wool wrote:
>> 
>> > I already said questions, opinion and concerns but anything is not clear
>> > until now. Only clear thing I could hear is just "compaction stats are
>> > better" which is not enough for me. Sorry.
>> >
>> > 1) https://lkml.org/lkml/2015/9/15/33
>> > 2) https://lkml.org/lkml/2015/9/21/2
>>
>> Could you please stop perverting the facts, I did answer to that:
>> https://lkml.org/lkml/2015/9/21/753.
>>
>> Apart from that, an opinion is not necessarily something I would
>> answer. Concerns about zsmalloc are not in the scope of this patch's
>> discussion. If you have any concerns regarding this particular patch,
>> please let us know.
>
> Yes, I don't want to interrupt zbud thing which is Seth should maintain
> and I respect his decision but the reason I nacked is you said this patch
> aims for supporing zbud into zsmalloc for determinism.
> For that, at least, you should discuss with me and Sergey but I feel
> you are ignoring our comments.
>
>>
>> > Vitally, Please say what's the root cause of your problem and if it
>> > is external fragmentation, what's the problem of my approach?
>> >
>> > 1) make non-LRU page migrate
>> > 2) provide zsmalloc's migratpage
>>
>> The problem with your approach is that in your world I need to prove
>> my right to use zbud. This is a very strange speculation.
>
> No. If you want to contribute something, you should prove why yours
> is better. I already said my concerns and my approach. It's your turn
> that you should explain why it's better.

In fact, I do not add any specific functionality, my patches just do
what should have already been done -- that is, zram should have been
converted to use zpool api long ago. Your opposing to that is counter
productive.

>> > We should provide it for CMA as well as external fragmentation.
>> > I think we could solve your issue with above approach and
>> > it fundamentally makes zsmalloc/zbud happy in future.
>>
>> I doubt that but I'll answer in this thread:
>> https://lkml.org/lkml/2015/9/15/33 as zsmalloc deficiencies do not
>> have direct relation to this particular patch.
>>
>> > Also, please keep it in mind that zram has been in linux kernel for
>> > memory efficiency for a long time and later zswap/zbud was born
>> > for *determinism* at the cost of memory efficiency.
>>
>> Yep, and determinism is more important to me than the memory
>> efficiency. Dropping the compression ration from 3.2x to 1.8x is okay
>> with me and stalls in UI are not.
>
> Then, you could use zswap which have aimed for it with small changes
> to prevent writeback.

Should i come with a patch to zram explicitly stating that it is not
meant to be used in any environment that is deterministic / worst case
critical? Is that what you are aiming for?

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-25 Thread Sergey Senozhatsky
On (09/25/15 10:27), Vitaly Wool wrote:
> > Have you seen those symptoms before? How did you come up to a conclusion
> > that zram->zbud will do the trick?
> 
> I have data from various tests (partially described here:
> https://lkml.org/lkml/2015/9/17/244) and once again, I'll post a reply

yeah, I guess I'm just not so bright to quickly understand what is wrong
with zsmalloc from those numbers.

> to https://lkml.org/lkml/2015/9/15/33 with more detailed test
> description and explanation why zsmalloc is not the right choice for
> me.

great, thanks.

> > If those symptoms are some sort of a recent addition, then does it help
> > when you disable zsmalloc compaction?
> 
> No it doesn't. OTOH enabled zsmalloc compaction doesn't seem to have a
> substantial effect either.

hm. ok, that was my quick guess.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-24 Thread Vitaly Wool
Hello Seth,

On Thu, Sep 24, 2015 at 12:41 AM, Seth Jennings
 wrote:
> On Wed, Sep 23, 2015 at 10:59:00PM +0200, Vitaly Wool wrote:
>> Okay, how about this? It's gotten smaller BTW :)
>>
>> zbud: allow up to PAGE_SIZE allocations
>>
>> Currently zbud is only capable of allocating not more than
>> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
>> long as only zswap is using it, but other users of zbud may
>> (and likely will) want to allocate up to PAGE_SIZE. This patch
>> addresses that by skipping the creation of zbud internal
>> structure in the beginning of an allocated page. As a zbud page
>> is no longer guaranteed to contain zbud header, the following
>> changes have to be applied throughout the code:
>> * page->lru to be used for zbud page lists
>> * page->private to hold 'under_reclaim' flag
>>
>> page->private will also be used to indicate if this page contains
>> a zbud header in the beginning or not ('headless' flag).
>>
>> Signed-off-by: Vitaly Wool 
>> ---
>>  mm/zbud.c | 167 
>> ++
>>  1 file changed, 113 insertions(+), 54 deletions(-)
>>
>> diff --git a/mm/zbud.c b/mm/zbud.c
>> index fa48bcdf..3946fba 100644
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -105,18 +105,20 @@ struct zbud_pool {
>>
>>  /*
>>   * struct zbud_header - zbud page metadata occupying the first chunk of each
>> - *   zbud page.
>> + *   zbud page, except for HEADLESS pages
>>   * @buddy:   links the zbud page into the unbuddied/buddied lists in the 
>> pool
>> - * @lru: links the zbud page into the lru list in the pool
>>   * @first_chunks:the size of the first buddy in chunks, 0 if free
>>   * @last_chunks: the size of the last buddy in chunks, 0 if free
>>   */
>>  struct zbud_header {
>>   struct list_head buddy;
>> - struct list_head lru;
>>   unsigned int first_chunks;
>>   unsigned int last_chunks;
>> - bool under_reclaim;
>> +};
>> +
>> +enum zbud_page_flags {
>> + UNDER_RECLAIM = 0,
>
> Don't need the "= 0"
>
>> + PAGE_HEADLESS,
>
> Also I think we should prefix the enum values here. With ZPF_ ?
>
>>  };
>>
>>  /*
>> @@ -221,6 +223,7 @@ MODULE_ALIAS("zpool-zbud");
>>  */
>>  /* Just to make the code easier to read */
>>  enum buddy {
>> + HEADLESS,
>>   FIRST,
>>   LAST
>>  };
>> @@ -238,11 +241,14 @@ static int size_to_chunks(size_t size)
>>  static struct zbud_header *init_zbud_page(struct page *page)
>>  {
>>   struct zbud_header *zhdr = page_address(page);
>> +
>> + INIT_LIST_HEAD(>lru);
>> + clear_bit(UNDER_RECLAIM, >private);
>> + clear_bit(HEADLESS, >private);
>
> I know we are using private in a bitwise flags mode, but maybe we
> should just init with page->private = 0
>
>> +
>>   zhdr->first_chunks = 0;
>>   zhdr->last_chunks = 0;
>>   INIT_LIST_HEAD(>buddy);
>> - INIT_LIST_HEAD(>lru);
>> - zhdr->under_reclaim = 0;
>>   return zhdr;
>>  }
>>
>> @@ -267,11 +273,22 @@ static unsigned long encode_handle(struct zbud_header 
>> *zhdr, enum buddy bud)
>>* over the zbud header in the first chunk.
>>*/
>>   handle = (unsigned long)zhdr;
>> - if (bud == FIRST)
>> + switch (bud) {
>> + case FIRST:
>>   /* skip over zbud header */
>>   handle += ZHDR_SIZE_ALIGNED;
>> - else /* bud == LAST */
>> + break;
>> + case LAST:
>>   handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
>> + break;
>> + case HEADLESS:
>> + break;
>> + default:
>> + /* this should never happen */
>> + pr_err("zbud: invalid buddy value %d\n", bud);
>> + handle = 0;
>> + break;
>> + }
>
> Don't need this default case since we have a case for each valid value
> of the enum.
>
> Also, I think we want to add some code to free_zbud_page() to clear
> page->private and init page->lru so we don't leave dangling pointers.

Right, maybe it makes sense for free_zbud_page() to take struct page
pointer as an argument, too, to minimize back-and-forth conversions?

> Looks good though :)

Thanks, I'll come up with the new version shortly. :)

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-24 Thread Minchan Kim
Hello,

On Wed, Sep 23, 2015 at 04:57:26PM -0500, Seth Jennings wrote:
> On Wed, Sep 23, 2015 at 09:54:02AM +0200, Vitaly Wool wrote:
> > On Wed, Sep 23, 2015 at 5:18 AM, Seth Jennings  
> > wrote:
> > > On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
> > >> Currently zbud is only capable of allocating not more than
> > >> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> > >> long as only zswap is using it, but other users of zbud may
> > >> (and likely will) want to allocate up to PAGE_SIZE. This patch
> > >> addresses that by skipping the creation of zbud internal
> > >> structure in the beginning of an allocated page (such pages are
> > >> then called 'headless').
> > >
> > > I guess I'm having trouble with this.  If you store a PAGE_SIZE
> > > allocation in zbud, then the zpage can only have one allocation as there
> > > is no room for a buddy.  So... we have an allocator for that: the
> > > page allocator.
> > >
> > > zbud doesn't support this by design because, if you are only storing one
> > > allocation per page, you don't gain anything.
> > >
> > > This functionality creates many new edge cases for the code.
> > >
> > > What is this use case you envision?  I think we need to discuss
> > > whether the use case exists and if it justifies the added complexity.
> > 
> > The use case is to use zram with zbud as allocator via the common
> > zpool api. Sometimes determinism and better worst-case time are more
> > important than high compression ratio.
> > As far as I can see, I'm not the only one who wants this case
> > supported in mainline.
> 
> Ok, I can see that having the allocator backends for zpool 
> have the same set of constraints is nice.

Sorry for delay. I'm on vacation until next week.
It seems Seth was missed in previous discusstion which was not the end.

I already said questions, opinion and concerns but anything is not clear
until now. Only clear thing I could hear is just "compaction stats are
better" which is not enough for me. Sorry.

1) https://lkml.org/lkml/2015/9/15/33
2) https://lkml.org/lkml/2015/9/21/2

Vitally, Please say what's the root cause of your problem and if it
is external fragmentation, what's the problem of my approach?

1) make non-LRU page migrate
2) provide zsmalloc's migratpage

We should provide it for CMA as well as external fragmentation.
I think we could solve your issue with above approach and
it fundamentally makes zsmalloc/zbud happy in future.

Also, please keep it in mind that zram has been in linux kernel for
memory efficiency for a long time and later zswap/zbud was born
for *determinism* at the cost of memory efficiency.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-24 Thread Minchan Kim
Hello,

On Wed, Sep 23, 2015 at 04:57:26PM -0500, Seth Jennings wrote:
> On Wed, Sep 23, 2015 at 09:54:02AM +0200, Vitaly Wool wrote:
> > On Wed, Sep 23, 2015 at 5:18 AM, Seth Jennings  
> > wrote:
> > > On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
> > >> Currently zbud is only capable of allocating not more than
> > >> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> > >> long as only zswap is using it, but other users of zbud may
> > >> (and likely will) want to allocate up to PAGE_SIZE. This patch
> > >> addresses that by skipping the creation of zbud internal
> > >> structure in the beginning of an allocated page (such pages are
> > >> then called 'headless').
> > >
> > > I guess I'm having trouble with this.  If you store a PAGE_SIZE
> > > allocation in zbud, then the zpage can only have one allocation as there
> > > is no room for a buddy.  So... we have an allocator for that: the
> > > page allocator.
> > >
> > > zbud doesn't support this by design because, if you are only storing one
> > > allocation per page, you don't gain anything.
> > >
> > > This functionality creates many new edge cases for the code.
> > >
> > > What is this use case you envision?  I think we need to discuss
> > > whether the use case exists and if it justifies the added complexity.
> > 
> > The use case is to use zram with zbud as allocator via the common
> > zpool api. Sometimes determinism and better worst-case time are more
> > important than high compression ratio.
> > As far as I can see, I'm not the only one who wants this case
> > supported in mainline.
> 
> Ok, I can see that having the allocator backends for zpool 
> have the same set of constraints is nice.

Sorry for delay. I'm on vacation until next week.
It seems Seth was missed in previous discusstion which was not the end.

I already said questions, opinion and concerns but anything is not clear
until now. Only clear thing I could hear is just "compaction stats are
better" which is not enough for me. Sorry.

1) https://lkml.org/lkml/2015/9/15/33
2) https://lkml.org/lkml/2015/9/21/2

Vitally, Please say what's the root cause of your problem and if it
is external fragmentation, what's the problem of my approach?

1) make non-LRU page migrate
2) provide zsmalloc's migratpage

We should provide it for CMA as well as external fragmentation.
I think we could solve your issue with above approach and
it fundamentally makes zsmalloc/zbud happy in future.

Also, please keep it in mind that zram has been in linux kernel for
memory efficiency for a long time and later zswap/zbud was born
for *determinism* at the cost of memory efficiency.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-24 Thread Vitaly Wool
Hello Seth,

On Thu, Sep 24, 2015 at 12:41 AM, Seth Jennings
 wrote:
> On Wed, Sep 23, 2015 at 10:59:00PM +0200, Vitaly Wool wrote:
>> Okay, how about this? It's gotten smaller BTW :)
>>
>> zbud: allow up to PAGE_SIZE allocations
>>
>> Currently zbud is only capable of allocating not more than
>> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
>> long as only zswap is using it, but other users of zbud may
>> (and likely will) want to allocate up to PAGE_SIZE. This patch
>> addresses that by skipping the creation of zbud internal
>> structure in the beginning of an allocated page. As a zbud page
>> is no longer guaranteed to contain zbud header, the following
>> changes have to be applied throughout the code:
>> * page->lru to be used for zbud page lists
>> * page->private to hold 'under_reclaim' flag
>>
>> page->private will also be used to indicate if this page contains
>> a zbud header in the beginning or not ('headless' flag).
>>
>> Signed-off-by: Vitaly Wool 
>> ---
>>  mm/zbud.c | 167 
>> ++
>>  1 file changed, 113 insertions(+), 54 deletions(-)
>>
>> diff --git a/mm/zbud.c b/mm/zbud.c
>> index fa48bcdf..3946fba 100644
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -105,18 +105,20 @@ struct zbud_pool {
>>
>>  /*
>>   * struct zbud_header - zbud page metadata occupying the first chunk of each
>> - *   zbud page.
>> + *   zbud page, except for HEADLESS pages
>>   * @buddy:   links the zbud page into the unbuddied/buddied lists in the 
>> pool
>> - * @lru: links the zbud page into the lru list in the pool
>>   * @first_chunks:the size of the first buddy in chunks, 0 if free
>>   * @last_chunks: the size of the last buddy in chunks, 0 if free
>>   */
>>  struct zbud_header {
>>   struct list_head buddy;
>> - struct list_head lru;
>>   unsigned int first_chunks;
>>   unsigned int last_chunks;
>> - bool under_reclaim;
>> +};
>> +
>> +enum zbud_page_flags {
>> + UNDER_RECLAIM = 0,
>
> Don't need the "= 0"
>
>> + PAGE_HEADLESS,
>
> Also I think we should prefix the enum values here. With ZPF_ ?
>
>>  };
>>
>>  /*
>> @@ -221,6 +223,7 @@ MODULE_ALIAS("zpool-zbud");
>>  */
>>  /* Just to make the code easier to read */
>>  enum buddy {
>> + HEADLESS,
>>   FIRST,
>>   LAST
>>  };
>> @@ -238,11 +241,14 @@ static int size_to_chunks(size_t size)
>>  static struct zbud_header *init_zbud_page(struct page *page)
>>  {
>>   struct zbud_header *zhdr = page_address(page);
>> +
>> + INIT_LIST_HEAD(>lru);
>> + clear_bit(UNDER_RECLAIM, >private);
>> + clear_bit(HEADLESS, >private);
>
> I know we are using private in a bitwise flags mode, but maybe we
> should just init with page->private = 0
>
>> +
>>   zhdr->first_chunks = 0;
>>   zhdr->last_chunks = 0;
>>   INIT_LIST_HEAD(>buddy);
>> - INIT_LIST_HEAD(>lru);
>> - zhdr->under_reclaim = 0;
>>   return zhdr;
>>  }
>>
>> @@ -267,11 +273,22 @@ static unsigned long encode_handle(struct zbud_header 
>> *zhdr, enum buddy bud)
>>* over the zbud header in the first chunk.
>>*/
>>   handle = (unsigned long)zhdr;
>> - if (bud == FIRST)
>> + switch (bud) {
>> + case FIRST:
>>   /* skip over zbud header */
>>   handle += ZHDR_SIZE_ALIGNED;
>> - else /* bud == LAST */
>> + break;
>> + case LAST:
>>   handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
>> + break;
>> + case HEADLESS:
>> + break;
>> + default:
>> + /* this should never happen */
>> + pr_err("zbud: invalid buddy value %d\n", bud);
>> + handle = 0;
>> + break;
>> + }
>
> Don't need this default case since we have a case for each valid value
> of the enum.
>
> Also, I think we want to add some code to free_zbud_page() to clear
> page->private and init page->lru so we don't leave dangling pointers.

Right, maybe it makes sense for free_zbud_page() to take struct page
pointer as an argument, too, to minimize back-and-forth conversions?

> Looks good though :)

Thanks, I'll come up with the new version shortly. :)

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Seth Jennings
On Wed, Sep 23, 2015 at 10:59:00PM +0200, Vitaly Wool wrote:
> Okay, how about this? It's gotten smaller BTW :)
> 
> zbud: allow up to PAGE_SIZE allocations
> 
> Currently zbud is only capable of allocating not more than
> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> long as only zswap is using it, but other users of zbud may
> (and likely will) want to allocate up to PAGE_SIZE. This patch
> addresses that by skipping the creation of zbud internal
> structure in the beginning of an allocated page. As a zbud page
> is no longer guaranteed to contain zbud header, the following
> changes have to be applied throughout the code:
> * page->lru to be used for zbud page lists
> * page->private to hold 'under_reclaim' flag
> 
> page->private will also be used to indicate if this page contains
> a zbud header in the beginning or not ('headless' flag).
> 
> Signed-off-by: Vitaly Wool 
> ---
>  mm/zbud.c | 167 
> ++
>  1 file changed, 113 insertions(+), 54 deletions(-)
> 
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..3946fba 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -105,18 +105,20 @@ struct zbud_pool {
>  
>  /*
>   * struct zbud_header - zbud page metadata occupying the first chunk of each
> - *   zbud page.
> + *   zbud page, except for HEADLESS pages
>   * @buddy:   links the zbud page into the unbuddied/buddied lists in the pool
> - * @lru: links the zbud page into the lru list in the pool
>   * @first_chunks:the size of the first buddy in chunks, 0 if free
>   * @last_chunks: the size of the last buddy in chunks, 0 if free
>   */
>  struct zbud_header {
>   struct list_head buddy;
> - struct list_head lru;
>   unsigned int first_chunks;
>   unsigned int last_chunks;
> - bool under_reclaim;
> +};
> +
> +enum zbud_page_flags {
> + UNDER_RECLAIM = 0,

Don't need the "= 0"

> + PAGE_HEADLESS,

Also I think we should prefix the enum values here. With ZPF_ ?

>  };
>  
>  /*
> @@ -221,6 +223,7 @@ MODULE_ALIAS("zpool-zbud");
>  */
>  /* Just to make the code easier to read */
>  enum buddy {
> + HEADLESS,
>   FIRST,
>   LAST
>  };
> @@ -238,11 +241,14 @@ static int size_to_chunks(size_t size)
>  static struct zbud_header *init_zbud_page(struct page *page)
>  {
>   struct zbud_header *zhdr = page_address(page);
> +
> + INIT_LIST_HEAD(>lru);
> + clear_bit(UNDER_RECLAIM, >private);
> + clear_bit(HEADLESS, >private);

I know we are using private in a bitwise flags mode, but maybe we
should just init with page->private = 0

> +
>   zhdr->first_chunks = 0;
>   zhdr->last_chunks = 0;
>   INIT_LIST_HEAD(>buddy);
> - INIT_LIST_HEAD(>lru);
> - zhdr->under_reclaim = 0;
>   return zhdr;
>  }
>  
> @@ -267,11 +273,22 @@ static unsigned long encode_handle(struct zbud_header 
> *zhdr, enum buddy bud)
>* over the zbud header in the first chunk.
>*/
>   handle = (unsigned long)zhdr;
> - if (bud == FIRST)
> + switch (bud) {
> + case FIRST:
>   /* skip over zbud header */
>   handle += ZHDR_SIZE_ALIGNED;
> - else /* bud == LAST */
> + break;
> + case LAST:
>   handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
> + break;
> + case HEADLESS:
> + break;
> + default:
> + /* this should never happen */
> + pr_err("zbud: invalid buddy value %d\n", bud);
> + handle = 0;
> + break;
> + }

Don't need this default case since we have a case for each valid value
of the enum.

Also, I think we want to add some code to free_zbud_page() to clear
page->private and init page->lru so we don't leave dangling pointers.

Looks good though :)

Thanks,
Seth

>   return handle;
>  }
>  
> @@ -287,6 +304,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
>   /*
>* Rather than branch for different situations, just use the fact that
>* free buddies have a length of zero to simplify everything.
> +  * NB: can't be used with HEADLESS pages.
>*/
>   return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
>  }
> @@ -353,31 +371,39 @@ void zbud_destroy_pool(struct zbud_pool *pool)
>  int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
>   unsigned long *handle)
>  {
> - int chunks, i, freechunks;
> + int chunks = 0, i, freechunks;
>   struct zbud_header *zhdr = NULL;
>   enum buddy bud;
>   struct page *page;
>  
>   if (!size || (gfp & __GFP_HIGHMEM))
>   return -EINVAL;
> - if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
> +
> + if (size > PAGE_SIZE)
>   return -ENOSPC;
> - chunks = size_to_chunks(size);
> - spin_lock(>lock);
>  
> - /* First, try to find an unbuddied zbud page. */
> - zhdr 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Seth Jennings
On Wed, Sep 23, 2015 at 09:54:02AM +0200, Vitaly Wool wrote:
> On Wed, Sep 23, 2015 at 5:18 AM, Seth Jennings  
> wrote:
> > On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
> >> Currently zbud is only capable of allocating not more than
> >> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> >> long as only zswap is using it, but other users of zbud may
> >> (and likely will) want to allocate up to PAGE_SIZE. This patch
> >> addresses that by skipping the creation of zbud internal
> >> structure in the beginning of an allocated page (such pages are
> >> then called 'headless').
> >
> > I guess I'm having trouble with this.  If you store a PAGE_SIZE
> > allocation in zbud, then the zpage can only have one allocation as there
> > is no room for a buddy.  So... we have an allocator for that: the
> > page allocator.
> >
> > zbud doesn't support this by design because, if you are only storing one
> > allocation per page, you don't gain anything.
> >
> > This functionality creates many new edge cases for the code.
> >
> > What is this use case you envision?  I think we need to discuss
> > whether the use case exists and if it justifies the added complexity.
> 
> The use case is to use zram with zbud as allocator via the common
> zpool api. Sometimes determinism and better worst-case time are more
> important than high compression ratio.
> As far as I can see, I'm not the only one who wants this case
> supported in mainline.

Ok, I can see that having the allocator backends for zpool 
have the same set of constraints is nice.

I'll look at your latest patch.

Thanks,
Seth

> 
> > We are crossing a boundary into zsmalloc style complexity with storing
> > stuff in the struct page, something I really didn't want to do in zbud.
> 
> Well, the thing is we need PAGE_SIZE allocations supported to use zram
> with zbud. I can of course add the code handling this in zpool but I
> am quite sure doing that in zbud directly is a better idea. I'm very
> keen on keeping the complexity down as much as possible though.
> 
> > zbud is the simple one, zsmalloc is the complex one.  I'd hate to have
> > two complex ones :-/
> 
> Who am I to disagree :) Keeping zbud simple is my goal, too, but once
> again, I'd really like it to support PAGE_SIZE allocations. And if it
> doesn't, the whole zpool thing for it becomes useless, since there
> will hardly be any zbud users other than zswap.
> 
> ~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Vitaly Wool
Okay, how about this? It's gotten smaller BTW :)

zbud: allow up to PAGE_SIZE allocations

Currently zbud is only capable of allocating not more than
PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
long as only zswap is using it, but other users of zbud may
(and likely will) want to allocate up to PAGE_SIZE. This patch
addresses that by skipping the creation of zbud internal
structure in the beginning of an allocated page. As a zbud page
is no longer guaranteed to contain zbud header, the following
changes have to be applied throughout the code:
* page->lru to be used for zbud page lists
* page->private to hold 'under_reclaim' flag

page->private will also be used to indicate if this page contains
a zbud header in the beginning or not ('headless' flag).

Signed-off-by: Vitaly Wool 
---
 mm/zbud.c | 167 ++
 1 file changed, 113 insertions(+), 54 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index fa48bcdf..3946fba 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -105,18 +105,20 @@ struct zbud_pool {
 
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
- * zbud page.
+ * zbud page, except for HEADLESS pages
  * @buddy: links the zbud page into the unbuddied/buddied lists in the pool
- * @lru:   links the zbud page into the lru list in the pool
  * @first_chunks:  the size of the first buddy in chunks, 0 if free
  * @last_chunks:   the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
struct list_head buddy;
-   struct list_head lru;
unsigned int first_chunks;
unsigned int last_chunks;
-   bool under_reclaim;
+};
+
+enum zbud_page_flags {
+   UNDER_RECLAIM = 0,
+   PAGE_HEADLESS,
 };
 
 /*
@@ -221,6 +223,7 @@ MODULE_ALIAS("zpool-zbud");
 */
 /* Just to make the code easier to read */
 enum buddy {
+   HEADLESS,
FIRST,
LAST
 };
@@ -238,11 +241,14 @@ static int size_to_chunks(size_t size)
 static struct zbud_header *init_zbud_page(struct page *page)
 {
struct zbud_header *zhdr = page_address(page);
+
+   INIT_LIST_HEAD(>lru);
+   clear_bit(UNDER_RECLAIM, >private);
+   clear_bit(HEADLESS, >private);
+
zhdr->first_chunks = 0;
zhdr->last_chunks = 0;
INIT_LIST_HEAD(>buddy);
-   INIT_LIST_HEAD(>lru);
-   zhdr->under_reclaim = 0;
return zhdr;
 }
 
@@ -267,11 +273,22 @@ static unsigned long encode_handle(struct zbud_header 
*zhdr, enum buddy bud)
 * over the zbud header in the first chunk.
 */
handle = (unsigned long)zhdr;
-   if (bud == FIRST)
+   switch (bud) {
+   case FIRST:
/* skip over zbud header */
handle += ZHDR_SIZE_ALIGNED;
-   else /* bud == LAST */
+   break;
+   case LAST:
handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
+   break;
+   case HEADLESS:
+   break;
+   default:
+   /* this should never happen */
+   pr_err("zbud: invalid buddy value %d\n", bud);
+   handle = 0;
+   break;
+   }
return handle;
 }
 
@@ -287,6 +304,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
/*
 * Rather than branch for different situations, just use the fact that
 * free buddies have a length of zero to simplify everything.
+* NB: can't be used with HEADLESS pages.
 */
return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
 }
@@ -353,31 +371,39 @@ void zbud_destroy_pool(struct zbud_pool *pool)
 int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle)
 {
-   int chunks, i, freechunks;
+   int chunks = 0, i, freechunks;
struct zbud_header *zhdr = NULL;
enum buddy bud;
struct page *page;
 
if (!size || (gfp & __GFP_HIGHMEM))
return -EINVAL;
-   if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+
+   if (size > PAGE_SIZE)
return -ENOSPC;
-   chunks = size_to_chunks(size);
-   spin_lock(>lock);
 
-   /* First, try to find an unbuddied zbud page. */
-   zhdr = NULL;
-   for_each_unbuddied_list(i, chunks) {
-   if (!list_empty(>unbuddied[i])) {
-   zhdr = list_first_entry(>unbuddied[i],
-   struct zbud_header, buddy);
-   list_del(>buddy);
-   if (zhdr->first_chunks == 0)
-   bud = FIRST;
-   else
-   bud = LAST;
-   goto found;
+   if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+   bud = HEADLESS;
+   else {
+   chunks = size_to_chunks(size);
+

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Vitaly Wool
On Tue, Sep 22, 2015 at 11:49 PM, Dan Streetman  wrote:
> On Tue, Sep 22, 2015 at 8:17 AM, Vitaly Wool  wrote:
>> Currently zbud is only capable of allocating not more than
>> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
>> long as only zswap is using it, but other users of zbud may
>> (and likely will) want to allocate up to PAGE_SIZE. This patch
>> addresses that by skipping the creation of zbud internal
>> structure in the beginning of an allocated page (such pages are
>> then called 'headless').
>>
>> As a zbud page is no longer guaranteed to contain zbud header, the
>> following changes had to be applied throughout the code:
>> * page->lru to be used for zbud page lists
>> * page->private to hold 'under_reclaim' flag
>>
>> page->private will also be used to indicate if this page contains
>> a zbud header in the beginning or not ('headless' flag).
>>
>> Signed-off-by: Vitaly Wool 
>> ---
>>  mm/zbud.c | 194 
>> +-
>>  1 file changed, 128 insertions(+), 66 deletions(-)
>>
>> diff --git a/mm/zbud.c b/mm/zbud.c
>> index fa48bcdf..7b51eb6 100644
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -105,18 +105,25 @@ struct zbud_pool {
>>
>>  /*
>>   * struct zbud_header - zbud page metadata occupying the first chunk of each
>> - * zbud page.
>> + * zbud page, except for HEADLESS pages
>
> hmm, personally I like "FULL" better than HEADLESS...it's only
> headless because it's a full page.

Yep, but OTOH, a page with two buddies is also full, so I just wanted
to avoid ambiguity here. I have no personal naming preference TBO. :)

>>   * @buddy: links the zbud page into the unbuddied/buddied lists in the 
>> pool
>> - * @lru:   links the zbud page into the lru list in the pool
>>   * @first_chunks:  the size of the first buddy in chunks, 0 if free
>>   * @last_chunks:   the size of the last buddy in chunks, 0 if free
>>   */
>>  struct zbud_header {
>> struct list_head buddy;
>> -   struct list_head lru;
>> unsigned int first_chunks;
>> unsigned int last_chunks;
>> -   bool under_reclaim;
>> +};
>> +
>> +/*
>> + * struct zbud_page_priv - zbud flags to be stored in page->private
>> + * @under_reclaim: if a zbud page is under reclaim
>> + * @headless: indicates a page where zbud header didn't fit
>> + */
>> +struct zbud_page_priv {
>> +   bool under_reclaim:1;
>> +   bool headless:1;
>>  };
>
> Hmm, this is just my personal opinion, but I'm not a fan of casting
> ->private as a struct, if we're only using it as a bitmap.  I'd
> suggest just defining bits as an enum (like page flags), e.g.
>
> enum zbud_flags {
>   ZBUD_UNDER_RECLAIM,
>   ZBUD_FULL_PAGE,
> };

Agreed, will rework that.

> or some names similar to that.  then it can be checked with a simple
> test_bit() call, and set_bit()/clear_bit().
>
> alternately, there are the already-existing PG_private and
> PG_private_2 bits in the page flags...but unless we need ->private for
> something else, it probably makes more sense to just use it instead of
> the PG_private flags.

IIRC I used PG_private in my first patch but the thing also is, it's
safer to have under_reclaim flag in struct page, too, so page->private
seems to be a better fit.

>>
>>  /*
>> @@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
>>  */
>>  /* Just to make the code easier to read */
>>  enum buddy {
>> +   HEADLESS,
>> FIRST,
>> LAST
>>  };
>> @@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
>>  /* Initializes the zbud header of a newly allocated zbud page */
>>  static struct zbud_header *init_zbud_page(struct page *page)
>>  {
>> +   struct zbud_page_priv *ppriv = (struct zbud_page_priv 
>> *)page->private;
>> struct zbud_header *zhdr = page_address(page);
>> +
>> +   INIT_LIST_HEAD(>lru);
>> +   ppriv->under_reclaim = 0;
>
> don't forget to initialize the headless/full bit too.  (I'll mention
> that the page allocation code does clear ->private before handing it
> to us, so it should already be 0.  but let's not rely on that)
>
>> +
>> zhdr->first_chunks = 0;
>> zhdr->last_chunks = 0;
>> INIT_LIST_HEAD(>buddy);
>> -   INIT_LIST_HEAD(>lru);
>> -   zhdr->under_reclaim = 0;
>> return zhdr;
>>  }
>>
>> @@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
>> *zhdr, enum buddy bud)
>>  * over the zbud header in the first chunk.
>>  */
>> handle = (unsigned long)zhdr;
>> -   if (bud == FIRST)
>> +   switch (bud) {
>> +   case FIRST:
>> /* skip over zbud header */
>> handle += ZHDR_SIZE_ALIGNED;
>> -   else /* bud == LAST */
>> +   break;
>> +   case LAST:
>> handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
>> +   break;
>> +   case HEADLESS:
>> +   

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Vitaly Wool
On Wed, Sep 23, 2015 at 5:18 AM, Seth Jennings  wrote:
> On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
>> Currently zbud is only capable of allocating not more than
>> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
>> long as only zswap is using it, but other users of zbud may
>> (and likely will) want to allocate up to PAGE_SIZE. This patch
>> addresses that by skipping the creation of zbud internal
>> structure in the beginning of an allocated page (such pages are
>> then called 'headless').
>
> I guess I'm having trouble with this.  If you store a PAGE_SIZE
> allocation in zbud, then the zpage can only have one allocation as there
> is no room for a buddy.  So... we have an allocator for that: the
> page allocator.
>
> zbud doesn't support this by design because, if you are only storing one
> allocation per page, you don't gain anything.
>
> This functionality creates many new edge cases for the code.
>
> What is this use case you envision?  I think we need to discuss
> whether the use case exists and if it justifies the added complexity.

The use case is to use zram with zbud as allocator via the common
zpool api. Sometimes determinism and better worst-case time are more
important than high compression ratio.
As far as I can see, I'm not the only one who wants this case
supported in mainline.

> We are crossing a boundary into zsmalloc style complexity with storing
> stuff in the struct page, something I really didn't want to do in zbud.

Well, the thing is we need PAGE_SIZE allocations supported to use zram
with zbud. I can of course add the code handling this in zpool but I
am quite sure doing that in zbud directly is a better idea. I'm very
keen on keeping the complexity down as much as possible though.

> zbud is the simple one, zsmalloc is the complex one.  I'd hate to have
> two complex ones :-/

Who am I to disagree :) Keeping zbud simple is my goal, too, but once
again, I'd really like it to support PAGE_SIZE allocations. And if it
doesn't, the whole zpool thing for it becomes useless, since there
will hardly be any zbud users other than zswap.

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Vitaly Wool
On Tue, Sep 22, 2015 at 11:49 PM, Dan Streetman  wrote:
> On Tue, Sep 22, 2015 at 8:17 AM, Vitaly Wool  wrote:
>> Currently zbud is only capable of allocating not more than
>> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
>> long as only zswap is using it, but other users of zbud may
>> (and likely will) want to allocate up to PAGE_SIZE. This patch
>> addresses that by skipping the creation of zbud internal
>> structure in the beginning of an allocated page (such pages are
>> then called 'headless').
>>
>> As a zbud page is no longer guaranteed to contain zbud header, the
>> following changes had to be applied throughout the code:
>> * page->lru to be used for zbud page lists
>> * page->private to hold 'under_reclaim' flag
>>
>> page->private will also be used to indicate if this page contains
>> a zbud header in the beginning or not ('headless' flag).
>>
>> Signed-off-by: Vitaly Wool 
>> ---
>>  mm/zbud.c | 194 
>> +-
>>  1 file changed, 128 insertions(+), 66 deletions(-)
>>
>> diff --git a/mm/zbud.c b/mm/zbud.c
>> index fa48bcdf..7b51eb6 100644
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -105,18 +105,25 @@ struct zbud_pool {
>>
>>  /*
>>   * struct zbud_header - zbud page metadata occupying the first chunk of each
>> - * zbud page.
>> + * zbud page, except for HEADLESS pages
>
> hmm, personally I like "FULL" better than HEADLESS...it's only
> headless because it's a full page.

Yep, but OTOH, a page with two buddies is also full, so I just wanted
to avoid ambiguity here. I have no personal naming preference TBO. :)

>>   * @buddy: links the zbud page into the unbuddied/buddied lists in the 
>> pool
>> - * @lru:   links the zbud page into the lru list in the pool
>>   * @first_chunks:  the size of the first buddy in chunks, 0 if free
>>   * @last_chunks:   the size of the last buddy in chunks, 0 if free
>>   */
>>  struct zbud_header {
>> struct list_head buddy;
>> -   struct list_head lru;
>> unsigned int first_chunks;
>> unsigned int last_chunks;
>> -   bool under_reclaim;
>> +};
>> +
>> +/*
>> + * struct zbud_page_priv - zbud flags to be stored in page->private
>> + * @under_reclaim: if a zbud page is under reclaim
>> + * @headless: indicates a page where zbud header didn't fit
>> + */
>> +struct zbud_page_priv {
>> +   bool under_reclaim:1;
>> +   bool headless:1;
>>  };
>
> Hmm, this is just my personal opinion, but I'm not a fan of casting
> ->private as a struct, if we're only using it as a bitmap.  I'd
> suggest just defining bits as an enum (like page flags), e.g.
>
> enum zbud_flags {
>   ZBUD_UNDER_RECLAIM,
>   ZBUD_FULL_PAGE,
> };

Agreed, will rework that.

> or some names similar to that.  then it can be checked with a simple
> test_bit() call, and set_bit()/clear_bit().
>
> alternately, there are the already-existing PG_private and
> PG_private_2 bits in the page flags...but unless we need ->private for
> something else, it probably makes more sense to just use it instead of
> the PG_private flags.

IIRC I used PG_private in my first patch but the thing also is, it's
safer to have under_reclaim flag in struct page, too, so page->private
seems to be a better fit.

>>
>>  /*
>> @@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
>>  */
>>  /* Just to make the code easier to read */
>>  enum buddy {
>> +   HEADLESS,
>> FIRST,
>> LAST
>>  };
>> @@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
>>  /* Initializes the zbud header of a newly allocated zbud page */
>>  static struct zbud_header *init_zbud_page(struct page *page)
>>  {
>> +   struct zbud_page_priv *ppriv = (struct zbud_page_priv 
>> *)page->private;
>> struct zbud_header *zhdr = page_address(page);
>> +
>> +   INIT_LIST_HEAD(>lru);
>> +   ppriv->under_reclaim = 0;
>
> don't forget to initialize the headless/full bit too.  (I'll mention
> that the page allocation code does clear ->private before handing it
> to us, so it should already be 0.  but let's not rely on that)
>
>> +
>> zhdr->first_chunks = 0;
>> zhdr->last_chunks = 0;
>> INIT_LIST_HEAD(>buddy);
>> -   INIT_LIST_HEAD(>lru);
>> -   zhdr->under_reclaim = 0;
>> return zhdr;
>>  }
>>
>> @@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
>> *zhdr, enum buddy bud)
>>  * over the zbud header in the first chunk.
>>  */
>> handle = (unsigned long)zhdr;
>> -   if (bud == FIRST)
>> +   switch (bud) {
>> +   case FIRST:
>> /* skip over zbud header */
>> handle += ZHDR_SIZE_ALIGNED;
>> -   else /* bud == LAST */
>> +   break;
>> +   case LAST:
>> handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
>> + 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Vitaly Wool
On Wed, Sep 23, 2015 at 5:18 AM, Seth Jennings  wrote:
> On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
>> Currently zbud is only capable of allocating not more than
>> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
>> long as only zswap is using it, but other users of zbud may
>> (and likely will) want to allocate up to PAGE_SIZE. This patch
>> addresses that by skipping the creation of zbud internal
>> structure in the beginning of an allocated page (such pages are
>> then called 'headless').
>
> I guess I'm having trouble with this.  If you store a PAGE_SIZE
> allocation in zbud, then the zpage can only have one allocation as there
> is no room for a buddy.  So... we have an allocator for that: the
> page allocator.
>
> zbud doesn't support this by design because, if you are only storing one
> allocation per page, you don't gain anything.
>
> This functionality creates many new edge cases for the code.
>
> What is this use case you envision?  I think we need to discuss
> whether the use case exists and if it justifies the added complexity.

The use case is to use zram with zbud as allocator via the common
zpool api. Sometimes determinism and better worst-case time are more
important than high compression ratio.
As far as I can see, I'm not the only one who wants this case
supported in mainline.

> We are crossing a boundary into zsmalloc style complexity with storing
> stuff in the struct page, something I really didn't want to do in zbud.

Well, the thing is we need PAGE_SIZE allocations supported to use zram
with zbud. I can of course add the code handling this in zpool but I
am quite sure doing that in zbud directly is a better idea. I'm very
keen on keeping the complexity down as much as possible though.

> zbud is the simple one, zsmalloc is the complex one.  I'd hate to have
> two complex ones :-/

Who am I to disagree :) Keeping zbud simple is my goal, too, but once
again, I'd really like it to support PAGE_SIZE allocations. And if it
doesn't, the whole zpool thing for it becomes useless, since there
will hardly be any zbud users other than zswap.

~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Seth Jennings
On Wed, Sep 23, 2015 at 09:54:02AM +0200, Vitaly Wool wrote:
> On Wed, Sep 23, 2015 at 5:18 AM, Seth Jennings  
> wrote:
> > On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
> >> Currently zbud is only capable of allocating not more than
> >> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> >> long as only zswap is using it, but other users of zbud may
> >> (and likely will) want to allocate up to PAGE_SIZE. This patch
> >> addresses that by skipping the creation of zbud internal
> >> structure in the beginning of an allocated page (such pages are
> >> then called 'headless').
> >
> > I guess I'm having trouble with this.  If you store a PAGE_SIZE
> > allocation in zbud, then the zpage can only have one allocation as there
> > is no room for a buddy.  So... we have an allocator for that: the
> > page allocator.
> >
> > zbud doesn't support this by design because, if you are only storing one
> > allocation per page, you don't gain anything.
> >
> > This functionality creates many new edge cases for the code.
> >
> > What is this use case you envision?  I think we need to discuss
> > whether the use case exists and if it justifies the added complexity.
> 
> The use case is to use zram with zbud as allocator via the common
> zpool api. Sometimes determinism and better worst-case time are more
> important than high compression ratio.
> As far as I can see, I'm not the only one who wants this case
> supported in mainline.

Ok, I can see that having the allocator backends for zpool 
have the same set of constraints is nice.

I'll look at your latest patch.

Thanks,
Seth

> 
> > We are crossing a boundary into zsmalloc style complexity with storing
> > stuff in the struct page, something I really didn't want to do in zbud.
> 
> Well, the thing is we need PAGE_SIZE allocations supported to use zram
> with zbud. I can of course add the code handling this in zpool but I
> am quite sure doing that in zbud directly is a better idea. I'm very
> keen on keeping the complexity down as much as possible though.
> 
> > zbud is the simple one, zsmalloc is the complex one.  I'd hate to have
> > two complex ones :-/
> 
> Who am I to disagree :) Keeping zbud simple is my goal, too, but once
> again, I'd really like it to support PAGE_SIZE allocations. And if it
> doesn't, the whole zpool thing for it becomes useless, since there
> will hardly be any zbud users other than zswap.
> 
> ~vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Vitaly Wool
Okay, how about this? It's gotten smaller BTW :)

zbud: allow up to PAGE_SIZE allocations

Currently zbud is only capable of allocating not more than
PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
long as only zswap is using it, but other users of zbud may
(and likely will) want to allocate up to PAGE_SIZE. This patch
addresses that by skipping the creation of zbud internal
structure in the beginning of an allocated page. As a zbud page
is no longer guaranteed to contain zbud header, the following
changes have to be applied throughout the code:
* page->lru to be used for zbud page lists
* page->private to hold 'under_reclaim' flag

page->private will also be used to indicate if this page contains
a zbud header in the beginning or not ('headless' flag).

Signed-off-by: Vitaly Wool 
---
 mm/zbud.c | 167 ++
 1 file changed, 113 insertions(+), 54 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index fa48bcdf..3946fba 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -105,18 +105,20 @@ struct zbud_pool {
 
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
- * zbud page.
+ * zbud page, except for HEADLESS pages
  * @buddy: links the zbud page into the unbuddied/buddied lists in the pool
- * @lru:   links the zbud page into the lru list in the pool
  * @first_chunks:  the size of the first buddy in chunks, 0 if free
  * @last_chunks:   the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
struct list_head buddy;
-   struct list_head lru;
unsigned int first_chunks;
unsigned int last_chunks;
-   bool under_reclaim;
+};
+
+enum zbud_page_flags {
+   UNDER_RECLAIM = 0,
+   PAGE_HEADLESS,
 };
 
 /*
@@ -221,6 +223,7 @@ MODULE_ALIAS("zpool-zbud");
 */
 /* Just to make the code easier to read */
 enum buddy {
+   HEADLESS,
FIRST,
LAST
 };
@@ -238,11 +241,14 @@ static int size_to_chunks(size_t size)
 static struct zbud_header *init_zbud_page(struct page *page)
 {
struct zbud_header *zhdr = page_address(page);
+
+   INIT_LIST_HEAD(>lru);
+   clear_bit(UNDER_RECLAIM, >private);
+   clear_bit(HEADLESS, >private);
+
zhdr->first_chunks = 0;
zhdr->last_chunks = 0;
INIT_LIST_HEAD(>buddy);
-   INIT_LIST_HEAD(>lru);
-   zhdr->under_reclaim = 0;
return zhdr;
 }
 
@@ -267,11 +273,22 @@ static unsigned long encode_handle(struct zbud_header 
*zhdr, enum buddy bud)
 * over the zbud header in the first chunk.
 */
handle = (unsigned long)zhdr;
-   if (bud == FIRST)
+   switch (bud) {
+   case FIRST:
/* skip over zbud header */
handle += ZHDR_SIZE_ALIGNED;
-   else /* bud == LAST */
+   break;
+   case LAST:
handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
+   break;
+   case HEADLESS:
+   break;
+   default:
+   /* this should never happen */
+   pr_err("zbud: invalid buddy value %d\n", bud);
+   handle = 0;
+   break;
+   }
return handle;
 }
 
@@ -287,6 +304,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
/*
 * Rather than branch for different situations, just use the fact that
 * free buddies have a length of zero to simplify everything.
+* NB: can't be used with HEADLESS pages.
 */
return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
 }
@@ -353,31 +371,39 @@ void zbud_destroy_pool(struct zbud_pool *pool)
 int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle)
 {
-   int chunks, i, freechunks;
+   int chunks = 0, i, freechunks;
struct zbud_header *zhdr = NULL;
enum buddy bud;
struct page *page;
 
if (!size || (gfp & __GFP_HIGHMEM))
return -EINVAL;
-   if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+
+   if (size > PAGE_SIZE)
return -ENOSPC;
-   chunks = size_to_chunks(size);
-   spin_lock(>lock);
 
-   /* First, try to find an unbuddied zbud page. */
-   zhdr = NULL;
-   for_each_unbuddied_list(i, chunks) {
-   if (!list_empty(>unbuddied[i])) {
-   zhdr = list_first_entry(>unbuddied[i],
-   struct zbud_header, buddy);
-   list_del(>buddy);
-   if (zhdr->first_chunks == 0)
-   bud = FIRST;
-   else
-   bud = LAST;
-   goto found;
+   if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+   bud = HEADLESS;
+   else {
+   chunks = 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-23 Thread Seth Jennings
On Wed, Sep 23, 2015 at 10:59:00PM +0200, Vitaly Wool wrote:
> Okay, how about this? It's gotten smaller BTW :)
> 
> zbud: allow up to PAGE_SIZE allocations
> 
> Currently zbud is only capable of allocating not more than
> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> long as only zswap is using it, but other users of zbud may
> (and likely will) want to allocate up to PAGE_SIZE. This patch
> addresses that by skipping the creation of zbud internal
> structure in the beginning of an allocated page. As a zbud page
> is no longer guaranteed to contain zbud header, the following
> changes have to be applied throughout the code:
> * page->lru to be used for zbud page lists
> * page->private to hold 'under_reclaim' flag
> 
> page->private will also be used to indicate if this page contains
> a zbud header in the beginning or not ('headless' flag).
> 
> Signed-off-by: Vitaly Wool 
> ---
>  mm/zbud.c | 167 
> ++
>  1 file changed, 113 insertions(+), 54 deletions(-)
> 
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..3946fba 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -105,18 +105,20 @@ struct zbud_pool {
>  
>  /*
>   * struct zbud_header - zbud page metadata occupying the first chunk of each
> - *   zbud page.
> + *   zbud page, except for HEADLESS pages
>   * @buddy:   links the zbud page into the unbuddied/buddied lists in the pool
> - * @lru: links the zbud page into the lru list in the pool
>   * @first_chunks:the size of the first buddy in chunks, 0 if free
>   * @last_chunks: the size of the last buddy in chunks, 0 if free
>   */
>  struct zbud_header {
>   struct list_head buddy;
> - struct list_head lru;
>   unsigned int first_chunks;
>   unsigned int last_chunks;
> - bool under_reclaim;
> +};
> +
> +enum zbud_page_flags {
> + UNDER_RECLAIM = 0,

Don't need the "= 0"

> + PAGE_HEADLESS,

Also I think we should prefix the enum values here. With ZPF_ ?

>  };
>  
>  /*
> @@ -221,6 +223,7 @@ MODULE_ALIAS("zpool-zbud");
>  */
>  /* Just to make the code easier to read */
>  enum buddy {
> + HEADLESS,
>   FIRST,
>   LAST
>  };
> @@ -238,11 +241,14 @@ static int size_to_chunks(size_t size)
>  static struct zbud_header *init_zbud_page(struct page *page)
>  {
>   struct zbud_header *zhdr = page_address(page);
> +
> + INIT_LIST_HEAD(>lru);
> + clear_bit(UNDER_RECLAIM, >private);
> + clear_bit(HEADLESS, >private);

I know we are using private in a bitwise flags mode, but maybe we
should just init with page->private = 0

> +
>   zhdr->first_chunks = 0;
>   zhdr->last_chunks = 0;
>   INIT_LIST_HEAD(>buddy);
> - INIT_LIST_HEAD(>lru);
> - zhdr->under_reclaim = 0;
>   return zhdr;
>  }
>  
> @@ -267,11 +273,22 @@ static unsigned long encode_handle(struct zbud_header 
> *zhdr, enum buddy bud)
>* over the zbud header in the first chunk.
>*/
>   handle = (unsigned long)zhdr;
> - if (bud == FIRST)
> + switch (bud) {
> + case FIRST:
>   /* skip over zbud header */
>   handle += ZHDR_SIZE_ALIGNED;
> - else /* bud == LAST */
> + break;
> + case LAST:
>   handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
> + break;
> + case HEADLESS:
> + break;
> + default:
> + /* this should never happen */
> + pr_err("zbud: invalid buddy value %d\n", bud);
> + handle = 0;
> + break;
> + }

Don't need this default case since we have a case for each valid value
of the enum.

Also, I think we want to add some code to free_zbud_page() to clear
page->private and init page->lru so we don't leave dangling pointers.

Looks good though :)

Thanks,
Seth

>   return handle;
>  }
>  
> @@ -287,6 +304,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
>   /*
>* Rather than branch for different situations, just use the fact that
>* free buddies have a length of zero to simplify everything.
> +  * NB: can't be used with HEADLESS pages.
>*/
>   return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
>  }
> @@ -353,31 +371,39 @@ void zbud_destroy_pool(struct zbud_pool *pool)
>  int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
>   unsigned long *handle)
>  {
> - int chunks, i, freechunks;
> + int chunks = 0, i, freechunks;
>   struct zbud_header *zhdr = NULL;
>   enum buddy bud;
>   struct page *page;
>  
>   if (!size || (gfp & __GFP_HIGHMEM))
>   return -EINVAL;
> - if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
> +
> + if (size > PAGE_SIZE)
>   return -ENOSPC;
> - chunks = size_to_chunks(size);
> - spin_lock(>lock);
>  
> - /* First, try to find an unbuddied zbud 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-22 Thread Seth Jennings
On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
> Currently zbud is only capable of allocating not more than
> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> long as only zswap is using it, but other users of zbud may
> (and likely will) want to allocate up to PAGE_SIZE. This patch
> addresses that by skipping the creation of zbud internal
> structure in the beginning of an allocated page (such pages are
> then called 'headless').

I guess I'm having trouble with this.  If you store a PAGE_SIZE
allocation in zbud, then the zpage can only have one allocation as there
is no room for a buddy.  So... we have an allocator for that: the
page allocator.

zbud doesn't support this by design because, if you are only storing one
allocation per page, you don't gain anything.

This functionality creates many new edge cases for the code.

What is this use case you envision?  I think we need to discuss
whether the use case exists and if it justifies the added complexity.

We are crossing a boundary into zsmalloc style complexity with storing
stuff in the struct page, something I really didn't want to do in zbud.

zbud is the simple one, zsmalloc is the complex one.  I'd hate to have
two complex ones :-/

Seth

> 
> As a zbud page is no longer guaranteed to contain zbud header, the
> following changes had to be applied throughout the code:
> * page->lru to be used for zbud page lists
> * page->private to hold 'under_reclaim' flag
> 
> page->private will also be used to indicate if this page contains
> a zbud header in the beginning or not ('headless' flag).
> 
> Signed-off-by: Vitaly Wool 
> ---
>  mm/zbud.c | 194 
> +-
>  1 file changed, 128 insertions(+), 66 deletions(-)
> 
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..7b51eb6 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -105,18 +105,25 @@ struct zbud_pool {
>  
>  /*
>   * struct zbud_header - zbud page metadata occupying the first chunk of each
> - *   zbud page.
> + *   zbud page, except for HEADLESS pages
>   * @buddy:   links the zbud page into the unbuddied/buddied lists in the pool
> - * @lru: links the zbud page into the lru list in the pool
>   * @first_chunks:the size of the first buddy in chunks, 0 if free
>   * @last_chunks: the size of the last buddy in chunks, 0 if free
>   */
>  struct zbud_header {
>   struct list_head buddy;
> - struct list_head lru;
>   unsigned int first_chunks;
>   unsigned int last_chunks;
> - bool under_reclaim;
> +};
> +
> +/*
> + * struct zbud_page_priv - zbud flags to be stored in page->private
> + * @under_reclaim: if a zbud page is under reclaim
> + * @headless: indicates a page where zbud header didn't fit
> + */
> +struct zbud_page_priv {
> + bool under_reclaim:1;
> + bool headless:1;
>  };
>  
>  /*
> @@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
>  */
>  /* Just to make the code easier to read */
>  enum buddy {
> + HEADLESS,
>   FIRST,
>   LAST
>  };
> @@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
>  /* Initializes the zbud header of a newly allocated zbud page */
>  static struct zbud_header *init_zbud_page(struct page *page)
>  {
> + struct zbud_page_priv *ppriv = (struct zbud_page_priv *)page->private;
>   struct zbud_header *zhdr = page_address(page);
> +
> + INIT_LIST_HEAD(>lru);
> + ppriv->under_reclaim = 0;
> +
>   zhdr->first_chunks = 0;
>   zhdr->last_chunks = 0;
>   INIT_LIST_HEAD(>buddy);
> - INIT_LIST_HEAD(>lru);
> - zhdr->under_reclaim = 0;
>   return zhdr;
>  }
>  
> @@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
> *zhdr, enum buddy bud)
>* over the zbud header in the first chunk.
>*/
>   handle = (unsigned long)zhdr;
> - if (bud == FIRST)
> + switch (bud) {
> + case FIRST:
>   /* skip over zbud header */
>   handle += ZHDR_SIZE_ALIGNED;
> - else /* bud == LAST */
> + break;
> + case LAST:
>   handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
> + break;
> + case HEADLESS:
> + break;
> + default:
> + /* this should never happen */
> + pr_err("zbud: invalid buddy value %d\n", bud);
> + handle = 0;
> + break;
> + }
>   return handle;
>  }
>  
> @@ -287,6 +309,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
>   /*
>* Rather than branch for different situations, just use the fact that
>* free buddies have a length of zero to simplify everything.
> +  * NB: can't be used with HEADLESS pages.
>*/
>   return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
>  }
> @@ -353,31 +376,40 @@ void zbud_destroy_pool(struct zbud_pool *pool)
>  int zbud_alloc(struct zbud_pool *pool, size_t size, 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-22 Thread Dan Streetman
On Tue, Sep 22, 2015 at 8:17 AM, Vitaly Wool  wrote:
> Currently zbud is only capable of allocating not more than
> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> long as only zswap is using it, but other users of zbud may
> (and likely will) want to allocate up to PAGE_SIZE. This patch
> addresses that by skipping the creation of zbud internal
> structure in the beginning of an allocated page (such pages are
> then called 'headless').
>
> As a zbud page is no longer guaranteed to contain zbud header, the
> following changes had to be applied throughout the code:
> * page->lru to be used for zbud page lists
> * page->private to hold 'under_reclaim' flag
>
> page->private will also be used to indicate if this page contains
> a zbud header in the beginning or not ('headless' flag).
>
> Signed-off-by: Vitaly Wool 
> ---
>  mm/zbud.c | 194 
> +-
>  1 file changed, 128 insertions(+), 66 deletions(-)
>
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..7b51eb6 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -105,18 +105,25 @@ struct zbud_pool {
>
>  /*
>   * struct zbud_header - zbud page metadata occupying the first chunk of each
> - * zbud page.
> + * zbud page, except for HEADLESS pages

hmm, personally I like "FULL" better than HEADLESS...it's only
headless because it's a full page.

>   * @buddy: links the zbud page into the unbuddied/buddied lists in the 
> pool
> - * @lru:   links the zbud page into the lru list in the pool
>   * @first_chunks:  the size of the first buddy in chunks, 0 if free
>   * @last_chunks:   the size of the last buddy in chunks, 0 if free
>   */
>  struct zbud_header {
> struct list_head buddy;
> -   struct list_head lru;
> unsigned int first_chunks;
> unsigned int last_chunks;
> -   bool under_reclaim;
> +};
> +
> +/*
> + * struct zbud_page_priv - zbud flags to be stored in page->private
> + * @under_reclaim: if a zbud page is under reclaim
> + * @headless: indicates a page where zbud header didn't fit
> + */
> +struct zbud_page_priv {
> +   bool under_reclaim:1;
> +   bool headless:1;
>  };

Hmm, this is just my personal opinion, but I'm not a fan of casting
->private as a struct, if we're only using it as a bitmap.  I'd
suggest just defining bits as an enum (like page flags), e.g.

enum zbud_flags {
  ZBUD_UNDER_RECLAIM,
  ZBUD_FULL_PAGE,
};

or some names similar to that.  then it can be checked with a simple
test_bit() call, and set_bit()/clear_bit().

alternately, there are the already-existing PG_private and
PG_private_2 bits in the page flags...but unless we need ->private for
something else, it probably makes more sense to just use it instead of
the PG_private flags.

>
>  /*
> @@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
>  */
>  /* Just to make the code easier to read */
>  enum buddy {
> +   HEADLESS,
> FIRST,
> LAST
>  };
> @@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
>  /* Initializes the zbud header of a newly allocated zbud page */
>  static struct zbud_header *init_zbud_page(struct page *page)
>  {
> +   struct zbud_page_priv *ppriv = (struct zbud_page_priv *)page->private;
> struct zbud_header *zhdr = page_address(page);
> +
> +   INIT_LIST_HEAD(>lru);
> +   ppriv->under_reclaim = 0;

don't forget to initialize the headless/full bit too.  (I'll mention
that the page allocation code does clear ->private before handing it
to us, so it should already be 0.  but let's not rely on that)

> +
> zhdr->first_chunks = 0;
> zhdr->last_chunks = 0;
> INIT_LIST_HEAD(>buddy);
> -   INIT_LIST_HEAD(>lru);
> -   zhdr->under_reclaim = 0;
> return zhdr;
>  }
>
> @@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
> *zhdr, enum buddy bud)
>  * over the zbud header in the first chunk.
>  */
> handle = (unsigned long)zhdr;
> -   if (bud == FIRST)
> +   switch (bud) {
> +   case FIRST:
> /* skip over zbud header */
> handle += ZHDR_SIZE_ALIGNED;
> -   else /* bud == LAST */
> +   break;
> +   case LAST:
> handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
> +   break;
> +   case HEADLESS:
> +   break;
> +   default:
> +   /* this should never happen */
> +   pr_err("zbud: invalid buddy value %d\n", bud);
> +   handle = 0;
> +   break;
> +   }
> return handle;
>  }
>
> @@ -287,6 +309,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
> /*
>  * Rather than branch for different situations, just use the fact that
>  * free buddies have a length of zero to simplify everything.
> +* NB: can't be used with HEADLESS pages.
>   

[PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-22 Thread Vitaly Wool
Currently zbud is only capable of allocating not more than
PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
long as only zswap is using it, but other users of zbud may
(and likely will) want to allocate up to PAGE_SIZE. This patch
addresses that by skipping the creation of zbud internal
structure in the beginning of an allocated page (such pages are
then called 'headless').

As a zbud page is no longer guaranteed to contain zbud header, the
following changes had to be applied throughout the code:
* page->lru to be used for zbud page lists
* page->private to hold 'under_reclaim' flag

page->private will also be used to indicate if this page contains
a zbud header in the beginning or not ('headless' flag).

Signed-off-by: Vitaly Wool 
---
 mm/zbud.c | 194 +-
 1 file changed, 128 insertions(+), 66 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index fa48bcdf..7b51eb6 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -105,18 +105,25 @@ struct zbud_pool {
 
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
- * zbud page.
+ * zbud page, except for HEADLESS pages
  * @buddy: links the zbud page into the unbuddied/buddied lists in the pool
- * @lru:   links the zbud page into the lru list in the pool
  * @first_chunks:  the size of the first buddy in chunks, 0 if free
  * @last_chunks:   the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
struct list_head buddy;
-   struct list_head lru;
unsigned int first_chunks;
unsigned int last_chunks;
-   bool under_reclaim;
+};
+
+/*
+ * struct zbud_page_priv - zbud flags to be stored in page->private
+ * @under_reclaim: if a zbud page is under reclaim
+ * @headless: indicates a page where zbud header didn't fit
+ */
+struct zbud_page_priv {
+   bool under_reclaim:1;
+   bool headless:1;
 };
 
 /*
@@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
 */
 /* Just to make the code easier to read */
 enum buddy {
+   HEADLESS,
FIRST,
LAST
 };
@@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
 /* Initializes the zbud header of a newly allocated zbud page */
 static struct zbud_header *init_zbud_page(struct page *page)
 {
+   struct zbud_page_priv *ppriv = (struct zbud_page_priv *)page->private;
struct zbud_header *zhdr = page_address(page);
+
+   INIT_LIST_HEAD(>lru);
+   ppriv->under_reclaim = 0;
+
zhdr->first_chunks = 0;
zhdr->last_chunks = 0;
INIT_LIST_HEAD(>buddy);
-   INIT_LIST_HEAD(>lru);
-   zhdr->under_reclaim = 0;
return zhdr;
 }
 
@@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
*zhdr, enum buddy bud)
 * over the zbud header in the first chunk.
 */
handle = (unsigned long)zhdr;
-   if (bud == FIRST)
+   switch (bud) {
+   case FIRST:
/* skip over zbud header */
handle += ZHDR_SIZE_ALIGNED;
-   else /* bud == LAST */
+   break;
+   case LAST:
handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
+   break;
+   case HEADLESS:
+   break;
+   default:
+   /* this should never happen */
+   pr_err("zbud: invalid buddy value %d\n", bud);
+   handle = 0;
+   break;
+   }
return handle;
 }
 
@@ -287,6 +309,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
/*
 * Rather than branch for different situations, just use the fact that
 * free buddies have a length of zero to simplify everything.
+* NB: can't be used with HEADLESS pages.
 */
return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
 }
@@ -353,31 +376,40 @@ void zbud_destroy_pool(struct zbud_pool *pool)
 int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle)
 {
-   int chunks, i, freechunks;
+   int chunks = 0, i, freechunks;
struct zbud_header *zhdr = NULL;
+   struct zbud_page_priv *ppriv;
enum buddy bud;
struct page *page;
 
if (!size || (gfp & __GFP_HIGHMEM))
return -EINVAL;
-   if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+
+   if (size > PAGE_SIZE)
return -ENOSPC;
-   chunks = size_to_chunks(size);
-   spin_lock(>lock);
 
-   /* First, try to find an unbuddied zbud page. */
-   zhdr = NULL;
-   for_each_unbuddied_list(i, chunks) {
-   if (!list_empty(>unbuddied[i])) {
-   zhdr = list_first_entry(>unbuddied[i],
-   struct zbud_header, buddy);
-   list_del(>buddy);
-   if (zhdr->first_chunks == 0)
-   bud = 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-22 Thread Dan Streetman
On Tue, Sep 22, 2015 at 8:17 AM, Vitaly Wool  wrote:
> Currently zbud is only capable of allocating not more than
> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> long as only zswap is using it, but other users of zbud may
> (and likely will) want to allocate up to PAGE_SIZE. This patch
> addresses that by skipping the creation of zbud internal
> structure in the beginning of an allocated page (such pages are
> then called 'headless').
>
> As a zbud page is no longer guaranteed to contain zbud header, the
> following changes had to be applied throughout the code:
> * page->lru to be used for zbud page lists
> * page->private to hold 'under_reclaim' flag
>
> page->private will also be used to indicate if this page contains
> a zbud header in the beginning or not ('headless' flag).
>
> Signed-off-by: Vitaly Wool 
> ---
>  mm/zbud.c | 194 
> +-
>  1 file changed, 128 insertions(+), 66 deletions(-)
>
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..7b51eb6 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -105,18 +105,25 @@ struct zbud_pool {
>
>  /*
>   * struct zbud_header - zbud page metadata occupying the first chunk of each
> - * zbud page.
> + * zbud page, except for HEADLESS pages

hmm, personally I like "FULL" better than HEADLESS...it's only
headless because it's a full page.

>   * @buddy: links the zbud page into the unbuddied/buddied lists in the 
> pool
> - * @lru:   links the zbud page into the lru list in the pool
>   * @first_chunks:  the size of the first buddy in chunks, 0 if free
>   * @last_chunks:   the size of the last buddy in chunks, 0 if free
>   */
>  struct zbud_header {
> struct list_head buddy;
> -   struct list_head lru;
> unsigned int first_chunks;
> unsigned int last_chunks;
> -   bool under_reclaim;
> +};
> +
> +/*
> + * struct zbud_page_priv - zbud flags to be stored in page->private
> + * @under_reclaim: if a zbud page is under reclaim
> + * @headless: indicates a page where zbud header didn't fit
> + */
> +struct zbud_page_priv {
> +   bool under_reclaim:1;
> +   bool headless:1;
>  };

Hmm, this is just my personal opinion, but I'm not a fan of casting
->private as a struct, if we're only using it as a bitmap.  I'd
suggest just defining bits as an enum (like page flags), e.g.

enum zbud_flags {
  ZBUD_UNDER_RECLAIM,
  ZBUD_FULL_PAGE,
};

or some names similar to that.  then it can be checked with a simple
test_bit() call, and set_bit()/clear_bit().

alternately, there are the already-existing PG_private and
PG_private_2 bits in the page flags...but unless we need ->private for
something else, it probably makes more sense to just use it instead of
the PG_private flags.

>
>  /*
> @@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
>  */
>  /* Just to make the code easier to read */
>  enum buddy {
> +   HEADLESS,
> FIRST,
> LAST
>  };
> @@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
>  /* Initializes the zbud header of a newly allocated zbud page */
>  static struct zbud_header *init_zbud_page(struct page *page)
>  {
> +   struct zbud_page_priv *ppriv = (struct zbud_page_priv *)page->private;
> struct zbud_header *zhdr = page_address(page);
> +
> +   INIT_LIST_HEAD(>lru);
> +   ppriv->under_reclaim = 0;

don't forget to initialize the headless/full bit too.  (I'll mention
that the page allocation code does clear ->private before handing it
to us, so it should already be 0.  but let's not rely on that)

> +
> zhdr->first_chunks = 0;
> zhdr->last_chunks = 0;
> INIT_LIST_HEAD(>buddy);
> -   INIT_LIST_HEAD(>lru);
> -   zhdr->under_reclaim = 0;
> return zhdr;
>  }
>
> @@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
> *zhdr, enum buddy bud)
>  * over the zbud header in the first chunk.
>  */
> handle = (unsigned long)zhdr;
> -   if (bud == FIRST)
> +   switch (bud) {
> +   case FIRST:
> /* skip over zbud header */
> handle += ZHDR_SIZE_ALIGNED;
> -   else /* bud == LAST */
> +   break;
> +   case LAST:
> handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
> +   break;
> +   case HEADLESS:
> +   break;
> +   default:
> +   /* this should never happen */
> +   pr_err("zbud: invalid buddy value %d\n", bud);
> +   handle = 0;
> +   break;
> +   }
> return handle;
>  }
>
> @@ -287,6 +309,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
> /*
>  * Rather than branch for different situations, just use the fact that
>  * free buddies have a length of zero to simplify everything.
> +* 

Re: [PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-22 Thread Seth Jennings
On Tue, Sep 22, 2015 at 02:17:33PM +0200, Vitaly Wool wrote:
> Currently zbud is only capable of allocating not more than
> PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
> long as only zswap is using it, but other users of zbud may
> (and likely will) want to allocate up to PAGE_SIZE. This patch
> addresses that by skipping the creation of zbud internal
> structure in the beginning of an allocated page (such pages are
> then called 'headless').

I guess I'm having trouble with this.  If you store a PAGE_SIZE
allocation in zbud, then the zpage can only have one allocation as there
is no room for a buddy.  So... we have an allocator for that: the
page allocator.

zbud doesn't support this by design because, if you are only storing one
allocation per page, you don't gain anything.

This functionality creates many new edge cases for the code.

What is this use case you envision?  I think we need to discuss
whether the use case exists and if it justifies the added complexity.

We are crossing a boundary into zsmalloc style complexity with storing
stuff in the struct page, something I really didn't want to do in zbud.

zbud is the simple one, zsmalloc is the complex one.  I'd hate to have
two complex ones :-/

Seth

> 
> As a zbud page is no longer guaranteed to contain zbud header, the
> following changes had to be applied throughout the code:
> * page->lru to be used for zbud page lists
> * page->private to hold 'under_reclaim' flag
> 
> page->private will also be used to indicate if this page contains
> a zbud header in the beginning or not ('headless' flag).
> 
> Signed-off-by: Vitaly Wool 
> ---
>  mm/zbud.c | 194 
> +-
>  1 file changed, 128 insertions(+), 66 deletions(-)
> 
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..7b51eb6 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -105,18 +105,25 @@ struct zbud_pool {
>  
>  /*
>   * struct zbud_header - zbud page metadata occupying the first chunk of each
> - *   zbud page.
> + *   zbud page, except for HEADLESS pages
>   * @buddy:   links the zbud page into the unbuddied/buddied lists in the pool
> - * @lru: links the zbud page into the lru list in the pool
>   * @first_chunks:the size of the first buddy in chunks, 0 if free
>   * @last_chunks: the size of the last buddy in chunks, 0 if free
>   */
>  struct zbud_header {
>   struct list_head buddy;
> - struct list_head lru;
>   unsigned int first_chunks;
>   unsigned int last_chunks;
> - bool under_reclaim;
> +};
> +
> +/*
> + * struct zbud_page_priv - zbud flags to be stored in page->private
> + * @under_reclaim: if a zbud page is under reclaim
> + * @headless: indicates a page where zbud header didn't fit
> + */
> +struct zbud_page_priv {
> + bool under_reclaim:1;
> + bool headless:1;
>  };
>  
>  /*
> @@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
>  */
>  /* Just to make the code easier to read */
>  enum buddy {
> + HEADLESS,
>   FIRST,
>   LAST
>  };
> @@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
>  /* Initializes the zbud header of a newly allocated zbud page */
>  static struct zbud_header *init_zbud_page(struct page *page)
>  {
> + struct zbud_page_priv *ppriv = (struct zbud_page_priv *)page->private;
>   struct zbud_header *zhdr = page_address(page);
> +
> + INIT_LIST_HEAD(>lru);
> + ppriv->under_reclaim = 0;
> +
>   zhdr->first_chunks = 0;
>   zhdr->last_chunks = 0;
>   INIT_LIST_HEAD(>buddy);
> - INIT_LIST_HEAD(>lru);
> - zhdr->under_reclaim = 0;
>   return zhdr;
>  }
>  
> @@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
> *zhdr, enum buddy bud)
>* over the zbud header in the first chunk.
>*/
>   handle = (unsigned long)zhdr;
> - if (bud == FIRST)
> + switch (bud) {
> + case FIRST:
>   /* skip over zbud header */
>   handle += ZHDR_SIZE_ALIGNED;
> - else /* bud == LAST */
> + break;
> + case LAST:
>   handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
> + break;
> + case HEADLESS:
> + break;
> + default:
> + /* this should never happen */
> + pr_err("zbud: invalid buddy value %d\n", bud);
> + handle = 0;
> + break;
> + }
>   return handle;
>  }
>  
> @@ -287,6 +309,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
>   /*
>* Rather than branch for different situations, just use the fact that
>* free buddies have a length of zero to simplify everything.
> +  * NB: can't be used with HEADLESS pages.
>*/
>   return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
>  }
> @@ -353,31 +376,40 @@ void zbud_destroy_pool(struct zbud_pool *pool)
>  int zbud_alloc(struct 

[PATCH v2] zbud: allow up to PAGE_SIZE allocations

2015-09-22 Thread Vitaly Wool
Currently zbud is only capable of allocating not more than
PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE. This is okay as
long as only zswap is using it, but other users of zbud may
(and likely will) want to allocate up to PAGE_SIZE. This patch
addresses that by skipping the creation of zbud internal
structure in the beginning of an allocated page (such pages are
then called 'headless').

As a zbud page is no longer guaranteed to contain zbud header, the
following changes had to be applied throughout the code:
* page->lru to be used for zbud page lists
* page->private to hold 'under_reclaim' flag

page->private will also be used to indicate if this page contains
a zbud header in the beginning or not ('headless' flag).

Signed-off-by: Vitaly Wool 
---
 mm/zbud.c | 194 +-
 1 file changed, 128 insertions(+), 66 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index fa48bcdf..7b51eb6 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -105,18 +105,25 @@ struct zbud_pool {
 
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
- * zbud page.
+ * zbud page, except for HEADLESS pages
  * @buddy: links the zbud page into the unbuddied/buddied lists in the pool
- * @lru:   links the zbud page into the lru list in the pool
  * @first_chunks:  the size of the first buddy in chunks, 0 if free
  * @last_chunks:   the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
struct list_head buddy;
-   struct list_head lru;
unsigned int first_chunks;
unsigned int last_chunks;
-   bool under_reclaim;
+};
+
+/*
+ * struct zbud_page_priv - zbud flags to be stored in page->private
+ * @under_reclaim: if a zbud page is under reclaim
+ * @headless: indicates a page where zbud header didn't fit
+ */
+struct zbud_page_priv {
+   bool under_reclaim:1;
+   bool headless:1;
 };
 
 /*
@@ -221,6 +228,7 @@ MODULE_ALIAS("zpool-zbud");
 */
 /* Just to make the code easier to read */
 enum buddy {
+   HEADLESS,
FIRST,
LAST
 };
@@ -237,12 +245,15 @@ static int size_to_chunks(size_t size)
 /* Initializes the zbud header of a newly allocated zbud page */
 static struct zbud_header *init_zbud_page(struct page *page)
 {
+   struct zbud_page_priv *ppriv = (struct zbud_page_priv *)page->private;
struct zbud_header *zhdr = page_address(page);
+
+   INIT_LIST_HEAD(>lru);
+   ppriv->under_reclaim = 0;
+
zhdr->first_chunks = 0;
zhdr->last_chunks = 0;
INIT_LIST_HEAD(>buddy);
-   INIT_LIST_HEAD(>lru);
-   zhdr->under_reclaim = 0;
return zhdr;
 }
 
@@ -267,11 +278,22 @@ static unsigned long encode_handle(struct zbud_header 
*zhdr, enum buddy bud)
 * over the zbud header in the first chunk.
 */
handle = (unsigned long)zhdr;
-   if (bud == FIRST)
+   switch (bud) {
+   case FIRST:
/* skip over zbud header */
handle += ZHDR_SIZE_ALIGNED;
-   else /* bud == LAST */
+   break;
+   case LAST:
handle += PAGE_SIZE - (zhdr->last_chunks  << CHUNK_SHIFT);
+   break;
+   case HEADLESS:
+   break;
+   default:
+   /* this should never happen */
+   pr_err("zbud: invalid buddy value %d\n", bud);
+   handle = 0;
+   break;
+   }
return handle;
 }
 
@@ -287,6 +309,7 @@ static int num_free_chunks(struct zbud_header *zhdr)
/*
 * Rather than branch for different situations, just use the fact that
 * free buddies have a length of zero to simplify everything.
+* NB: can't be used with HEADLESS pages.
 */
return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
 }
@@ -353,31 +376,40 @@ void zbud_destroy_pool(struct zbud_pool *pool)
 int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle)
 {
-   int chunks, i, freechunks;
+   int chunks = 0, i, freechunks;
struct zbud_header *zhdr = NULL;
+   struct zbud_page_priv *ppriv;
enum buddy bud;
struct page *page;
 
if (!size || (gfp & __GFP_HIGHMEM))
return -EINVAL;
-   if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+
+   if (size > PAGE_SIZE)
return -ENOSPC;
-   chunks = size_to_chunks(size);
-   spin_lock(>lock);
 
-   /* First, try to find an unbuddied zbud page. */
-   zhdr = NULL;
-   for_each_unbuddied_list(i, chunks) {
-   if (!list_empty(>unbuddied[i])) {
-   zhdr = list_first_entry(>unbuddied[i],
-   struct zbud_header, buddy);
-   list_del(>buddy);
-   if (zhdr->first_chunks == 0)
-