Re: Generic approach to customizable zones - was: Re: [PATCH v7 0/7] Introduce ZONE_CMA

2017-04-28 Thread Igor Stoppa


On 28/04/17 11:36, Michal Hocko wrote:
> I didn't read this thoughly yet because I will be travelling shortly

ok, thanks for bearing with me =)

> but
> this point alone just made ask, because it seems there is some
> misunderstanding

It is possible, so far I did some changes, but I have not completed the
whole conversion.

> On Fri 28-04-17 11:04:27, Igor Stoppa wrote:
> [...]
>> * if one is happy to have a 64bits type, allow for as many zones as
>>   it's possible to fit, or anyway more than what is possible with
>>   the 32 bit mask.
> 
> zones are currently placed in struct page::flags. And that already is
> 64b size on 64b arches. 

Ok, the issues I had so fare were related to the enum for zones being
treated as 32b.

> And we do not really have any room spare there.
> We encode page flags, zone id, numa_nid/sparse section_nr there. How can
> you add more without enlarging the struct page itself or using external
> means to store the same information (page_ext comes to mind)?

Then I'll be conservative and assume I can't, unless I can prove otherwise.

There is still the possibility I mentioned of loosely coupling DMA,
DMA32 and HIGHMEM with the bits currently reserved for them, right?

If my system doesn't use those zones as such, because it doesn't
have/need them, those bits are wasted for me. Otoh someone else is
probably not interested in what I'm after but needs one or more of those
zones.

Making the meaning of the bits configurable should still be a viable
option. It's not altering their amount, just their purpose on a specific
build.

> Even if
> the later would be possible then note thatpage_zone() is used in many
> performance sensitive paths and making it perform well with special
> casing would be far from trivial.


If the solution I propose is acceptable, I'm willing to bite the bullet
and go for implementing the conversion.

In my case I really would like to be able to use kmalloc, because it
would provide an easy path to convert also other portions of the kernel,
besides SE Linux.

I suspect I would encounter overall far less resistance if the type of
change I propose is limited to:

s/GFP_KERNEL/GFP_LOCKABLE/

And if I can guarrantee that GFP_LOCKABLE falls back to GFP_KERNEL when
the "lockable" feature is not enabled.


--
thanks, igor


Re: Generic approach to customizable zones - was: Re: [PATCH v7 0/7] Introduce ZONE_CMA

2017-04-28 Thread Igor Stoppa


On 28/04/17 11:36, Michal Hocko wrote:
> I didn't read this thoughly yet because I will be travelling shortly

ok, thanks for bearing with me =)

> but
> this point alone just made ask, because it seems there is some
> misunderstanding

It is possible, so far I did some changes, but I have not completed the
whole conversion.

> On Fri 28-04-17 11:04:27, Igor Stoppa wrote:
> [...]
>> * if one is happy to have a 64bits type, allow for as many zones as
>>   it's possible to fit, or anyway more than what is possible with
>>   the 32 bit mask.
> 
> zones are currently placed in struct page::flags. And that already is
> 64b size on 64b arches. 

Ok, the issues I had so fare were related to the enum for zones being
treated as 32b.

> And we do not really have any room spare there.
> We encode page flags, zone id, numa_nid/sparse section_nr there. How can
> you add more without enlarging the struct page itself or using external
> means to store the same information (page_ext comes to mind)?

Then I'll be conservative and assume I can't, unless I can prove otherwise.

There is still the possibility I mentioned of loosely coupling DMA,
DMA32 and HIGHMEM with the bits currently reserved for them, right?

If my system doesn't use those zones as such, because it doesn't
have/need them, those bits are wasted for me. Otoh someone else is
probably not interested in what I'm after but needs one or more of those
zones.

Making the meaning of the bits configurable should still be a viable
option. It's not altering their amount, just their purpose on a specific
build.

> Even if
> the later would be possible then note thatpage_zone() is used in many
> performance sensitive paths and making it perform well with special
> casing would be far from trivial.


If the solution I propose is acceptable, I'm willing to bite the bullet
and go for implementing the conversion.

In my case I really would like to be able to use kmalloc, because it
would provide an easy path to convert also other portions of the kernel,
besides SE Linux.

I suspect I would encounter overall far less resistance if the type of
change I propose is limited to:

s/GFP_KERNEL/GFP_LOCKABLE/

And if I can guarrantee that GFP_LOCKABLE falls back to GFP_KERNEL when
the "lockable" feature is not enabled.


--
thanks, igor


Re: Generic approach to customizable zones - was: Re: [PATCH v7 0/7] Introduce ZONE_CMA

2017-04-28 Thread Michal Hocko
I didn't read this thoughly yet because I will be travelling shortly but
this point alone just made ask, because it seems there is some
misunderstanding

On Fri 28-04-17 11:04:27, Igor Stoppa wrote:
[...]
> * if one is happy to have a 64bits type, allow for as many zones as
>   it's possible to fit, or anyway more than what is possible with
>   the 32 bit mask.

zones are currently placed in struct page::flags. And that already is
64b size on 64b arches. And we do not really have any room spare there.
We encode page flags, zone id, numa_nid/sparse section_nr there. How can
you add more without enlarging the struct page itself or using external
means to store the same information (page_ext comes to mind)? Even if
the later would be possible then note thatpage_zone() is used in many
performance sensitive paths and making it perform well with special
casing would be far from trivial.
-- 
Michal Hocko
SUSE Labs


Re: Generic approach to customizable zones - was: Re: [PATCH v7 0/7] Introduce ZONE_CMA

2017-04-28 Thread Michal Hocko
I didn't read this thoughly yet because I will be travelling shortly but
this point alone just made ask, because it seems there is some
misunderstanding

On Fri 28-04-17 11:04:27, Igor Stoppa wrote:
[...]
> * if one is happy to have a 64bits type, allow for as many zones as
>   it's possible to fit, or anyway more than what is possible with
>   the 32 bit mask.

zones are currently placed in struct page::flags. And that already is
64b size on 64b arches. And we do not really have any room spare there.
We encode page flags, zone id, numa_nid/sparse section_nr there. How can
you add more without enlarging the struct page itself or using external
means to store the same information (page_ext comes to mind)? Even if
the later would be possible then note thatpage_zone() is used in many
performance sensitive paths and making it perform well with special
casing would be far from trivial.
-- 
Michal Hocko
SUSE Labs


Generic approach to customizable zones - was: Re: [PATCH v7 0/7] Introduce ZONE_CMA

2017-04-28 Thread Igor Stoppa
On 27/04/17 18:06, Michal Hocko wrote:
> On Tue 25-04-17 12:42:57, Joonsoo Kim wrote:

[...]

>> Yes, it requires one more bit for a new zone and it's handled by the patch.
> 
> I am pretty sure that you are aware that consuming new page flag bits
> is usually a no-go and something we try to avoid as much as possible
> because we are in a great shortage there. So there really have to be a
> _strong_ reason if we go that way. My current understanding that the
> whole zone concept is more about a more convenient implementation rather
> than a fundamental change which will solve unsolvable problems with the
> current approach. More on that below.

Since I am in a similar situation, I think it's better if I join this
conversation instead of going through the same in a separate thread.

In this regard, I have a few observations (are they correct?):

* not everyone seems to be interested in having all the current
  zones active simultaneously

* some zones are even not so meaningful on certain architectures or
  platforms

* some architectures/platforms that are 64 bits would have no penalty
  in dealing with a larger data type.

So I wonder, would anybody be against this:

* within the 32bits constraint, define some optional zones

* decouple the specific position of a bit from the zone it represents;
  iow: if the zone is enabled, ensure that it gets a bit in the mask,
  but do not make promises about which one it is, provided that the
  corresponding macros work properly

* ensure that if one selects more optional zones than there are bits
  available (in the case of a 32bits mask), an error is produced at
  compile time

* if one is happy to have a 64bits type, allow for as many zones as
  it's possible to fit, or anyway more than what is possible with
  the 32 bit mask.

I think I can re-factor the code so that there is no runtime performance
degradation, if there is no immediate objection to what I described. Or
maybe I failed to notice some obvious pitfall?

>From what I see, there seems to be a lot of interest in using functions
like Kmalloc / vmalloc, with the ability of specifying pseudo-custom
areas from where they should tap into.

Why not, as long as those who do not need it are not negatively impacted?

I understand that if the association between bits and zones is fixed,
then suddenly bits become very precious stuff, but if they could be used
in a more efficient way, then maybe they could be used more liberally.

The alternative is to keep getting requests about new zones and turning
them away because they do not pass the bar of being extremely critical,
even if indeed they would simplify people's life.


The change shouldn't be too ugly, if I do something along these lines of
the pseudo code below.
Note: the #ifdefs would be mainly concentrated in the declaration part.

enum gfp_zone_shift {
#if IS_ENABLED(CONFIG_ZONE_DMA)
/*I haven't checked if this is the correct name, but it gives the idea*/
ZONE_DMA_SHIFT = 0,
#endif
#if IS_ENABLED(CONFIG_ZONE_HIGHMEM)
ZONE_HIGHMEM_SHIFT,
#endif
#if IS_ENABLED(CONFIG_ZONE_DMA32)
ZONE_DMA32_SHIFT,
#endif
#if IS_ENABLED(CONFIG_ZONE_xxx)
ZONE_xxx,
#endif
   NON_OPTIONAL_ZONE_SHIFT,
   ...
   USED_ZONES_NUMBER,
   ZONE_MOVABLE_SHIFT = USED_ZONES_NUMBER,
   ...
};

#if USED_ZONES_NUMBER < MAX_ZONES_32BITS
typedef gfp_zones_t uint32_t
#elif IS_ENABLED(CONFIG_ZONES_64BITS
typedef gfp_zones_t uint64_t
#else
#error
#endif

The type should be adjusted in other places where it is used, but I
didn't find too many occurrences.

#define __ZONE_DMA \
  (((gfp_zones_t)IS_ENABLED(CONFIG_ZONE_DMA)) << \
   (ZONE_DMA_SHIFT - 0))

[rinse and repeat]

Code referring to these optional zones can be sandboxed in

#if IS_ENABLED(CONFIG_ZONE_DMA)

inline function do_something_dma() {
   
}

#else
#define do_something_dma()
#endif


Or equivalent, effectively removing many #ifdefs from the main code of
functions like those called by kmalloc.


So, would this approach stand a chance?


thanks, igor


Generic approach to customizable zones - was: Re: [PATCH v7 0/7] Introduce ZONE_CMA

2017-04-28 Thread Igor Stoppa
On 27/04/17 18:06, Michal Hocko wrote:
> On Tue 25-04-17 12:42:57, Joonsoo Kim wrote:

[...]

>> Yes, it requires one more bit for a new zone and it's handled by the patch.
> 
> I am pretty sure that you are aware that consuming new page flag bits
> is usually a no-go and something we try to avoid as much as possible
> because we are in a great shortage there. So there really have to be a
> _strong_ reason if we go that way. My current understanding that the
> whole zone concept is more about a more convenient implementation rather
> than a fundamental change which will solve unsolvable problems with the
> current approach. More on that below.

Since I am in a similar situation, I think it's better if I join this
conversation instead of going through the same in a separate thread.

In this regard, I have a few observations (are they correct?):

* not everyone seems to be interested in having all the current
  zones active simultaneously

* some zones are even not so meaningful on certain architectures or
  platforms

* some architectures/platforms that are 64 bits would have no penalty
  in dealing with a larger data type.

So I wonder, would anybody be against this:

* within the 32bits constraint, define some optional zones

* decouple the specific position of a bit from the zone it represents;
  iow: if the zone is enabled, ensure that it gets a bit in the mask,
  but do not make promises about which one it is, provided that the
  corresponding macros work properly

* ensure that if one selects more optional zones than there are bits
  available (in the case of a 32bits mask), an error is produced at
  compile time

* if one is happy to have a 64bits type, allow for as many zones as
  it's possible to fit, or anyway more than what is possible with
  the 32 bit mask.

I think I can re-factor the code so that there is no runtime performance
degradation, if there is no immediate objection to what I described. Or
maybe I failed to notice some obvious pitfall?

>From what I see, there seems to be a lot of interest in using functions
like Kmalloc / vmalloc, with the ability of specifying pseudo-custom
areas from where they should tap into.

Why not, as long as those who do not need it are not negatively impacted?

I understand that if the association between bits and zones is fixed,
then suddenly bits become very precious stuff, but if they could be used
in a more efficient way, then maybe they could be used more liberally.

The alternative is to keep getting requests about new zones and turning
them away because they do not pass the bar of being extremely critical,
even if indeed they would simplify people's life.


The change shouldn't be too ugly, if I do something along these lines of
the pseudo code below.
Note: the #ifdefs would be mainly concentrated in the declaration part.

enum gfp_zone_shift {
#if IS_ENABLED(CONFIG_ZONE_DMA)
/*I haven't checked if this is the correct name, but it gives the idea*/
ZONE_DMA_SHIFT = 0,
#endif
#if IS_ENABLED(CONFIG_ZONE_HIGHMEM)
ZONE_HIGHMEM_SHIFT,
#endif
#if IS_ENABLED(CONFIG_ZONE_DMA32)
ZONE_DMA32_SHIFT,
#endif
#if IS_ENABLED(CONFIG_ZONE_xxx)
ZONE_xxx,
#endif
   NON_OPTIONAL_ZONE_SHIFT,
   ...
   USED_ZONES_NUMBER,
   ZONE_MOVABLE_SHIFT = USED_ZONES_NUMBER,
   ...
};

#if USED_ZONES_NUMBER < MAX_ZONES_32BITS
typedef gfp_zones_t uint32_t
#elif IS_ENABLED(CONFIG_ZONES_64BITS
typedef gfp_zones_t uint64_t
#else
#error
#endif

The type should be adjusted in other places where it is used, but I
didn't find too many occurrences.

#define __ZONE_DMA \
  (((gfp_zones_t)IS_ENABLED(CONFIG_ZONE_DMA)) << \
   (ZONE_DMA_SHIFT - 0))

[rinse and repeat]

Code referring to these optional zones can be sandboxed in

#if IS_ENABLED(CONFIG_ZONE_DMA)

inline function do_something_dma() {
   
}

#else
#define do_something_dma()
#endif


Or equivalent, effectively removing many #ifdefs from the main code of
functions like those called by kmalloc.


So, would this approach stand a chance?


thanks, igor