Re: mainline build failure due to 322458c2bb1a ("drm/ttm: Reduce the number of used allocation orders for TTM pages")

2023-04-26 Thread Michael Ellerman
Linus Torvalds  writes:
> On Wed, Apr 26, 2023 at 10:44 AM Sudip Mukherjee (Codethink)
>  wrote:
>>
>> drivers/gpu/drm/ttm/ttm_pool.c:73:29: error: variably modified 
>> 'global_write_combined' at file scope
>>73 | static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER];
>>   | ^
>
> Ugh.
>
> This is because we have
>
>   #define TTM_DIM_ORDER (__TTM_DIM_ORDER <= MAX_ORDER ?
> __TTM_DIM_ORDER : MAX_ORDER)
>
> which looks perfectly fine as a constant ("pick the smaller of
> MAX_ORDER and __TTM_DIM_ORDER").
>
> But:
>
>   #define __TTM_DIM_ORDER (TTM_MAX_ORDER + 1)
>   #define TTM_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)
>
> which still _looks_ fine, but  on 64-bit powerpc, we then have
>
>#define PTE_INDEX_SIZE  __pte_index_size
>
> so that __TTM_DIM_ORDER constant isn't actually a constant at all.
..
>
> It's a bit sad how that macro that _looks_ like a constant (and is one
> pretty much everywhere else) isn't actually constant on powerpc, but
> looking around it looks fairly unavoidable.

Yeah, it allows us to build a single kernel that can choose at runtime
whether it uses the HPT or Radix MMU. The page table geometry is
different between the MMUs because they support a different sized huge
page for THP.

cheers


Re: mainline build failure due to 322458c2bb1a ("drm/ttm: Reduce the number of used allocation orders for TTM pages")

2023-04-26 Thread Christian König

Am 26.04.23 um 20:30 schrieb Linus Torvalds:

On Wed, Apr 26, 2023 at 10:44 AM Sudip Mukherjee (Codethink)
 wrote:

drivers/gpu/drm/ttm/ttm_pool.c:73:29: error: variably modified 
'global_write_combined' at file scope
73 | static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER];
   | ^

Ugh.


Revert of this is patch already on the way to you.



This is because we have

   #define TTM_DIM_ORDER (__TTM_DIM_ORDER <= MAX_ORDER ?
__TTM_DIM_ORDER : MAX_ORDER)

which looks perfectly fine as a constant ("pick the smaller of
MAX_ORDER and __TTM_DIM_ORDER").

But:

   #define __TTM_DIM_ORDER (TTM_MAX_ORDER + 1)
   #define TTM_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)

which still _looks_ fine, but  on 64-bit powerpc, we then have

#define PTE_INDEX_SIZE  __pte_index_size

so that __TTM_DIM_ORDER constant isn't actually a constant at all.

I would suggest that the DRM code just make those fixed-size arrays
use "MAX_ORDER", because even though it might be a bit bigger than
necessary, that's still not a very big constant (ie typically it's
"11", but it could be up to 64).

It's a bit sad how that macro that _looks_ like a constant (and is one
pretty much everywhere else) isn't actually constant on powerpc, but
looking around it looks fairly unavoidable.

Maybe the DRM code could try to avoid using things like PMD_SHIFT entirely?


Yeah we already try very hard to do so.

I quite often have to remind people that the CPU architecture doesn't 
affect how many bits the internal address space of a GPU has and they 
should always assume u64 for those instead of unsigned long.


But I have to agree that it's a bit unexpected that a constant 
everywhere else is not constant on this particular architecture.


Christian.




 Linus




Re: mainline build failure due to 322458c2bb1a ("drm/ttm: Reduce the number of used allocation orders for TTM pages")

2023-04-26 Thread Linus Torvalds
On Wed, Apr 26, 2023 at 10:44 AM Sudip Mukherjee (Codethink)
 wrote:
>
> drivers/gpu/drm/ttm/ttm_pool.c:73:29: error: variably modified 
> 'global_write_combined' at file scope
>73 | static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER];
>   | ^

Ugh.

This is because we have

  #define TTM_DIM_ORDER (__TTM_DIM_ORDER <= MAX_ORDER ?
__TTM_DIM_ORDER : MAX_ORDER)

which looks perfectly fine as a constant ("pick the smaller of
MAX_ORDER and __TTM_DIM_ORDER").

But:

  #define __TTM_DIM_ORDER (TTM_MAX_ORDER + 1)
  #define TTM_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)

which still _looks_ fine, but  on 64-bit powerpc, we then have

   #define PTE_INDEX_SIZE  __pte_index_size

so that __TTM_DIM_ORDER constant isn't actually a constant at all.

I would suggest that the DRM code just make those fixed-size arrays
use "MAX_ORDER", because even though it might be a bit bigger than
necessary, that's still not a very big constant (ie typically it's
"11", but it could be up to 64).

It's a bit sad how that macro that _looks_ like a constant (and is one
pretty much everywhere else) isn't actually constant on powerpc, but
looking around it looks fairly unavoidable.

Maybe the DRM code could try to avoid using things like PMD_SHIFT entirely?

Linus


mainline build failure due to 322458c2bb1a ("drm/ttm: Reduce the number of used allocation orders for TTM pages")

2023-04-26 Thread Sudip Mukherjee (Codethink)
Hi All,

The latest mainline kernel branch fails to build powerpc allmodconfig
with the error:


drivers/gpu/drm/ttm/ttm_pool.c:73:29: error: variably modified 
'global_write_combined' at file scope
   73 | static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER];
  | ^
drivers/gpu/drm/ttm/ttm_pool.c:74:29: error: variably modified 
'global_uncached' at file scope
   74 | static struct ttm_pool_type global_uncached[TTM_DIM_ORDER];
  | ^~~
drivers/gpu/drm/ttm/ttm_pool.c:76:29: error: variably modified 
'global_dma32_write_combined' at file scope
   76 | static struct ttm_pool_type global_dma32_write_combined[TTM_DIM_ORDER];
  | ^~~
drivers/gpu/drm/ttm/ttm_pool.c:77:29: error: variably modified 
'global_dma32_uncached' at file scope
   77 | static struct ttm_pool_type global_dma32_uncached[TTM_DIM_ORDER];
  | ^


git bisect pointed to 322458c2bb1a ("drm/ttm: Reduce the number of used 
allocation orders for TTM pages")

I will be happy to test any patch or provide any extra log if needed.


-- 
Regards
Sudip