RE: Bogus struct page layout on 32-bit

2021-04-17 Thread David Laight
From: Grygorii Strashko
> Sent: 16 April 2021 10:27
...
> Sry, for delayed reply.
> 
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA 
> even in case of LPAE
> (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected 
> for the LPAE case
> on TI platforms and the fact that it became set is the result of 
> multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
> 
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config 
> symbol in lib/Kconfig").
> 
> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y 
> by using things like
> (__force u32)
> for example.

Hmmm using (__force u32) is probably wrong.
If an address +length >= 2**32 can get passed then the IO request
needs to be errored (or a bounce buffer used).

Otherwise you can get particularly horrid corruptions.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: Bogus struct page layout on 32-bit

2021-04-16 Thread Arnd Bergmann
On Fri, Apr 16, 2021 at 11:27 AM 'Grygorii Strashko' via Clang Built
Linux  wrote:
> On 10/04/2021 11:52, Ilias Apalodimas wrote:
> > +CC Grygorii for the cpsw part as Ivan's email is not valid anymore
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA 
> even in case of LPAE (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected 
> for the LPAE case
> on TI platforms and the fact that it became set is the result of 
> multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
>
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config 
> symbol in lib/Kconfig").

I completely missed this change in the past, and I don't really agree
with it either.

Most 32-bit Arm platforms are in fact limited to 32-bit DMA, even when they have
MMIO or RAM areas above the 4GB boundary that require LPAE.

> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y 
> by using
> things like (__force u32) for example.
>
> Honestly, I've done sanity check of CPSW with LPAE=y 
> (ARCH_DMA_ADDR_T_64BIT=y) very long time ago.

This is of course a good idea, drivers should work with any
combination of 32-bit
or 64-bit phys_addr_t and dma_addr_t.

Arnd


Re: Bogus struct page layout on 32-bit

2021-04-16 Thread Grygorii Strashko

Hi Ilias, All,

On 10/04/2021 11:52, Ilias Apalodimas wrote:

+CC Grygorii for the cpsw part as Ivan's email is not valid anymore

Thanks for catching this. Interesting indeed...

On Sat, 10 Apr 2021 at 09:22, Jesper Dangaard Brouer  wrote:


On Sat, 10 Apr 2021 03:43:13 +0100
Matthew Wilcox  wrote:


On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:

include/linux/mm_types.h:274:1: error: static_assert failed due to requirement 
'__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' 
"offsetof(struct page, lru) == offsetof(struct folio, lru)"

FOLIO_MATCH(lru, lru);
include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
static_assert(offsetof(struct page, pg) == offsetof(struct folio, 
fl))


Well, this is interesting.  pahole reports:

struct page {
 long unsigned int  flags;/* 0 4 */
 /* XXX 4 bytes hole, try to pack */
 union {
 struct {
 struct list_head lru;/* 8 8 */
...
struct folio {
 union {
 struct {
 long unsigned int flags; /* 0 4 */
 struct list_head lru;/* 4 8 */

so this assert has absolutely done its job.

But why has this assert triggered?  Why is struct page layout not what
we thought it was?  Turns out it's the dma_addr added in 2019 by commit
c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
the whole union gets moved out by 4 bytes.


Argh, good that you are catching this!


Unfortunately, we can't just fix this by putting an 'unsigned long pad'
in front of it.  It still aligns the entire union to 8 bytes, and then
it skips another 4 bytes after the pad.

We can fix it like this ...

+++ b/include/linux/mm_types.h
@@ -96,11 +96,12 @@ struct page {
 unsigned long private;
 };
 struct {/* page_pool used by netstack */
+   unsigned long _page_pool_pad;


I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
value after @dma_addr, and he could use this area instead.

[1] 
https://lore.kernel.org/netdev/20210409223801.104657-3-mcr...@linux.microsoft.com/

When adding/changing this, we need to make sure that it doesn't overlap
member @index, because network stack use/check page_is_pfmemalloc().
As far as my calculations this is safe to add.  I always try to keep an
eye out for this, but I wonder if we could have a build check like yours.



 /**
  * @dma_addr: might require a 64-bit value even on
  * 32-bit architectures.
  */
-   dma_addr_t dma_addr;
+   dma_addr_t dma_addr __packed;
 };
 struct {/* slab, slob and slub */
 union {

but I don't know if GCC is smart enough to realise that dma_addr is now
on an 8 byte boundary and it can use a normal instruction to access it,
or whether it'll do something daft like use byte loads to access it.

We could also do:

+   dma_addr_t dma_addr __packed __aligned(sizeof(void *));

and I see pahole, at least sees this correctly:

 struct {
 long unsigned int _page_pool_pad; /* 4 4 */
 dma_addr_t dma_addr __attribute__((__aligned__(4))); 
/* 8 8 */
 } __attribute__((__packed__)) __attribute__((__aligned__(4)));

This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
/ dma_addr_t.  Advice, please?


I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.

I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
XDP+page_pool) ?


Sry, for delayed reply.

The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA 
even in case of LPAE (dma-ranges are used).
Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected 
for the LPAE case
on TI platforms and the fact that it became set is the result of 
multi-paltform/allXXXconfig/DMA
optimizations and unification.
(just checked - not set in 4.14)

Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol 
in lib/Kconfig").

The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by 
using things like (__force u32)
for example.

Honestly, I've done sanity check of CPSW with LPAE=y (ARCH_DMA_ADDR_T_64BIT=y) 
very long time ago.

--
Best regards,
grygorii


Re: Bogus struct page layout on 32-bit

2021-04-11 Thread Matthew Wilcox
On Sat, Apr 10, 2021 at 09:10:47PM +0200, Arnd Bergmann wrote:
> On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox  wrote:
> > +   dma_addr_t dma_addr __packed;
> > };
> > struct {/* slab, slob and slub */
> > union {
> >
> > but I don't know if GCC is smart enough to realise that dma_addr is now
> > on an 8 byte boundary and it can use a normal instruction to access it,
> > or whether it'll do something daft like use byte loads to access it.
> >
> > We could also do:
> >
> > +   dma_addr_t dma_addr __packed __aligned(sizeof(void 
> > *));
> >
> > and I see pahole, at least sees this correctly:
> >
> > struct {
> > long unsigned int _page_pool_pad; /* 4 4 */
> > dma_addr_t dma_addr 
> > __attribute__((__aligned__(4))); /* 8 8 */
> > } __attribute__((__packed__)) 
> > __attribute__((__aligned__(4)));
> >
> > This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> > / dma_addr_t.  Advice, please?
> 
> I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3
> 
> struct page {
> short a;
> struct {
> short b;
> long long c __attribute__((packed, aligned(2)));
> } __attribute__((packed));
> } __attribute__((aligned(8)));
> 
> In this structure, 'c' is clearly aligned to eight bytes, and gcc does
> realize that
> it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
> struct members with less than 4 byte alignment. However, it also complains
> that passing a pointer to 'c' into a function that expects a 'long long' is 
> not
> allowed because alignof(c) is only '2' here.
> 
> (I used 'short' here because I having a 64-bit member misaligned by four
> bytes wouldn't make a difference to the instructions on Arm, or any other
> 32-bit architecture I can think of, regardless of the ABI requirements).

So ... we could do this:

+++ b/include/linux/types.h
@@ -140,7 +140,7 @@ typedef u64 blkcnt_t;
  * so they don't care about the size of the actual bus addresses.
  */
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
-typedef u64 dma_addr_t;
+typedef u64 __attribute__((aligned(sizeof(void * dma_addr_t;
 #else
 typedef u32 dma_addr_t;
 #endif

but I'm a little scared that this might have unintended consequences.
And Jesper points out that a big-endian 64-bit dma_addr_t can impersonate
a PageTail page, and we should solve that problem while we're at it.
So I don't think we should do this, but thought I should mention it as
a possibility.


Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Arnd Bergmann
On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox  wrote:
> +   dma_addr_t dma_addr __packed;
> };
> struct {/* slab, slob and slub */
> union {
>
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
>
> We could also do:
>
> +   dma_addr_t dma_addr __packed __aligned(sizeof(void 
> *));
>
> and I see pahole, at least sees this correctly:
>
> struct {
> long unsigned int _page_pool_pad; /* 4 4 */
> dma_addr_t dma_addr __attribute__((__aligned__(4))); 
> /* 8 8 */
> } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3

struct page {
short a;
struct {
short b;
long long c __attribute__((packed, aligned(2)));
} __attribute__((packed));
} __attribute__((aligned(8)));

In this structure, 'c' is clearly aligned to eight bytes, and gcc does
realize that
it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
struct members with less than 4 byte alignment. However, it also complains
that passing a pointer to 'c' into a function that expects a 'long long' is not
allowed because alignof(c) is only '2' here.

(I used 'short' here because I having a 64-bit member misaligned by four
bytes wouldn't make a difference to the instructions on Arm, or any other
32-bit architecture I can think of, regardless of the ABI requirements).

  Arnd


Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Russell King - ARM Linux admin
On Sat, Apr 10, 2021 at 03:06:52PM +0100, Matthew Wilcox wrote:
> How about moving the flags into the union?  A bit messy, but we don't
> have to play games with __packed__.

Yes, that is probably the better solution, avoiding the games to try
and get the union appropriately placed on 32-bit systems.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


RE: Bogus struct page layout on 32-bit

2021-04-10 Thread David Laight
From: Matthew Wilcox
> Sent: 10 April 2021 03:43
> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > >> include/linux/mm_types.h:274:1: error: static_assert failed due to 
> > >> requirement
> '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, 
> lru)' "offsetof(struct page,
> lru) == offsetof(struct folio, lru)"
> >FOLIO_MATCH(lru, lru);
> >include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> >static_assert(offsetof(struct page, pg) == offsetof(struct 
> > folio, fl))
> 
> Well, this is interesting.  pahole reports:
> 
> struct page {
> long unsigned int  flags;/* 0 4 */
> /* XXX 4 bytes hole, try to pack */
> union {
> struct {
> struct list_head lru;/* 8 8 */
> ...
> struct folio {
> union {
> struct {
> long unsigned int flags; /* 0 4 */
> struct list_head lru;/* 4 8 */
> 
> so this assert has absolutely done its job.
> 
> But why has this assert triggered?  Why is struct page layout not what
> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> the whole union gets moved out by 4 bytes.
> 
> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> in front of it.  It still aligns the entire union to 8 bytes, and then
> it skips another 4 bytes after the pad.
> 
> We can fix it like this ...
> 
> +++ b/include/linux/mm_types.h
> @@ -96,11 +96,12 @@ struct page {
> unsigned long private;
> };
> struct {/* page_pool used by netstack */
> +   unsigned long _page_pool_pad;
> /**
>  * @dma_addr: might require a 64-bit value even on
>  * 32-bit architectures.
>  */
> -   dma_addr_t dma_addr;
> +   dma_addr_t dma_addr __packed;
> };
> struct {/* slab, slob and slub */
> union {
> 
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.

I'm betting it will use byte accesses.
Checked - it does seem to use 4-byte accesses.
(godbolt is rather short of 32 bit compilers...)

> 
> We could also do:
> 
> +   dma_addr_t dma_addr __packed __aligned(sizeof(void 
> *));

I wonder if __aligned(n) should be defined as
__attribute__((packed,aligned(n))
I don't think you ever want the 'unpacked' variant.

But explicitly reducing the alignment of single member is much
better than the habit of marking the structure 'packed'.

(Never mind the habit of adding __packed 'because we don't want
the compiler to add random padding.)

> 
> and I see pahole, at least sees this correctly:
> 
> struct {
> long unsigned int _page_pool_pad; /* 4 4 */
> dma_addr_t dma_addr __attribute__((__aligned__(4))); 
> /* 8 8 */
> } __attribute__((__packed__)) __attribute__((__aligned__(4)));

Is the attribute on the struct an artifact of pahole?
It should just have an alignment of 4 without anything special.

> 
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

Only those where a 64-bit value is 64-bit aligned.
So that excludes x86 (which can have 64-bit dma) but includes sparc32
(which probably doesn't).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Matthew Wilcox
How about moving the flags into the union?  A bit messy, but we don't
have to play games with __packed__.

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1210a8e41fad..f374d2f06255 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -68,16 +68,22 @@ struct mem_cgroup;
 #endif
 
 struct page {
-   unsigned long flags;/* Atomic flags, some possibly
-* updated asynchronously */
/*
-* Five words (20/40 bytes) are available in this union.
-* WARNING: bit 0 of the first word is used for PageTail(). That
-* means the other users of this union MUST NOT use the bit to
+* This union is six words (24 / 48 bytes) in size.
+* The first word is reserved for atomic flags, often updated
+* asynchronously.  Use the PageFoo() macros to access it.  Some
+* of the flags can be reused for your own purposes, but the
+* word as a whole often contains other information and overwriting
+* it will cause functions like page_zone() and page_node() to stop
+* working correctly.
+*
+* Bit 0 of the second word is used for PageTail(). That
+* means the other users of this union MUST leave the bit zero to
 * avoid collision and false-positive PageTail().
 */
union {
struct {/* Page cache and anonymous pages */
+   unsigned long flags;
/**
 * @lru: Pageout list, eg. active_list protected by
 * lruvec->lru_lock.  Sometimes used as a generic list
@@ -96,6 +102,8 @@ struct page {
unsigned long private;
};
struct {/* page_pool used by netstack */
+   unsigned long _pp_flags;
+   unsigned long _pp_pad;
/**
 * @dma_addr: might require a 64-bit value even on
 * 32-bit architectures.
@@ -103,6 +111,7 @@ struct page {
dma_addr_t dma_addr;
};
struct {/* slab, slob and slub */
+   unsigned long _slab_flags;
union {
struct list_head slab_list;
struct {/* Partial pages */
@@ -130,6 +139,7 @@ struct page {
};
};
struct {/* Tail pages of compound page */
+   unsigned long _tail1_flags;
unsigned long compound_head;/* Bit zero is set */
 
/* First tail page only */
@@ -139,12 +149,14 @@ struct page {
unsigned int compound_nr; /* 1 << compound_order */
};
struct {/* Second tail page of compound page */
+   unsigned long _tail2_flags;
unsigned long _compound_pad_1;  /* compound_head */
atomic_t hpage_pinned_refcount;
/* For both global and memcg */
struct list_head deferred_list;
};
struct {/* Page table pages */
+   unsigned long _pt_flags;
unsigned long _pt_pad_1;/* compound_head */
pgtable_t pmd_huge_pte; /* protected by page->ptl */
unsigned long _pt_pad_2;/* mapping */
@@ -159,6 +171,7 @@ struct page {
 #endif
};
struct {/* ZONE_DEVICE pages */
+   unsigned long _zd_flags;
/** @pgmap: Points to the hosting device page map. */
struct dev_pagemap *pgmap;
void *zone_device_data;
@@ -174,8 +187,11 @@ struct page {
 */
};
 
-   /** @rcu_head: You can use this to free a page by RCU. */
-   struct rcu_head rcu_head;
+   struct {
+   unsigned long _rcu_flags;
+   /** @rcu_head: You can use this to free a page by RCU. 
*/
+   struct rcu_head rcu_head;
+   };
};
 
union { /* This union is 4 bytes in size. */


Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Ilias Apalodimas
+CC Grygorii for the cpsw part as Ivan's email is not valid anymore

Thanks for catching this. Interesting indeed...

On Sat, 10 Apr 2021 at 09:22, Jesper Dangaard Brouer  wrote:
>
> On Sat, 10 Apr 2021 03:43:13 +0100
> Matthew Wilcox  wrote:
>
> > On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > > >> include/linux/mm_types.h:274:1: error: static_assert failed due to 
> > > >> requirement '__builtin_offsetof(struct page, lru) == 
> > > >> __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == 
> > > >> offsetof(struct folio, lru)"
> > >FOLIO_MATCH(lru, lru);
> > >include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> > >static_assert(offsetof(struct page, pg) == offsetof(struct 
> > > folio, fl))
> >
> > Well, this is interesting.  pahole reports:
> >
> > struct page {
> > long unsigned int  flags;/* 0 4 */
> > /* XXX 4 bytes hole, try to pack */
> > union {
> > struct {
> > struct list_head lru;/* 8 8 */
> > ...
> > struct folio {
> > union {
> > struct {
> > long unsigned int flags; /* 0 4 */
> > struct list_head lru;/* 4 8 */
> >
> > so this assert has absolutely done its job.
> >
> > But why has this assert triggered?  Why is struct page layout not what
> > we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> > c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> > config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> > the whole union gets moved out by 4 bytes.
>
> Argh, good that you are catching this!
>
> > Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> > in front of it.  It still aligns the entire union to 8 bytes, and then
> > it skips another 4 bytes after the pad.
> >
> > We can fix it like this ...
> >
> > +++ b/include/linux/mm_types.h
> > @@ -96,11 +96,12 @@ struct page {
> > unsigned long private;
> > };
> > struct {/* page_pool used by netstack */
> > +   unsigned long _page_pool_pad;
>
> I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
> value after @dma_addr, and he could use this area instead.
>
> [1] 
> https://lore.kernel.org/netdev/20210409223801.104657-3-mcr...@linux.microsoft.com/
>
> When adding/changing this, we need to make sure that it doesn't overlap
> member @index, because network stack use/check page_is_pfmemalloc().
> As far as my calculations this is safe to add.  I always try to keep an
> eye out for this, but I wonder if we could have a build check like yours.
>
>
> > /**
> >  * @dma_addr: might require a 64-bit value even on
> >  * 32-bit architectures.
> >  */
> > -   dma_addr_t dma_addr;
> > +   dma_addr_t dma_addr __packed;
> > };
> > struct {/* slab, slob and slub */
> > union {
> >
> > but I don't know if GCC is smart enough to realise that dma_addr is now
> > on an 8 byte boundary and it can use a normal instruction to access it,
> > or whether it'll do something daft like use byte loads to access it.
> >
> > We could also do:
> >
> > +   dma_addr_t dma_addr __packed __aligned(sizeof(void 
> > *));
> >
> > and I see pahole, at least sees this correctly:
> >
> > struct {
> > long unsigned int _page_pool_pad; /* 4 4 */
> > dma_addr_t dma_addr 
> > __attribute__((__aligned__(4))); /* 8 8 */
> > } __attribute__((__packed__)) 
> > __attribute__((__aligned__(4)));
> >
> > This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> > / dma_addr_t.  Advice, please?
>
> I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.
>
> I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
> board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
> XDP+page_pool) ?
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>


Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Jesper Dangaard Brouer
On Sat, 10 Apr 2021 03:43:13 +0100
Matthew Wilcox  wrote:

> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > >> include/linux/mm_types.h:274:1: error: static_assert failed due to 
> > >> requirement '__builtin_offsetof(struct page, lru) == 
> > >> __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == 
> > >> offsetof(struct folio, lru)"  
> >FOLIO_MATCH(lru, lru);
> >include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> >static_assert(offsetof(struct page, pg) == offsetof(struct 
> > folio, fl))  
> 
> Well, this is interesting.  pahole reports:
> 
> struct page {
> long unsigned int  flags;/* 0 4 */
> /* XXX 4 bytes hole, try to pack */
> union {
> struct {
> struct list_head lru;/* 8 8 */
> ...
> struct folio {
> union {
> struct {
> long unsigned int flags; /* 0 4 */
> struct list_head lru;/* 4 8 */
> 
> so this assert has absolutely done its job.
> 
> But why has this assert triggered?  Why is struct page layout not what
> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> the whole union gets moved out by 4 bytes.

Argh, good that you are catching this!

> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> in front of it.  It still aligns the entire union to 8 bytes, and then
> it skips another 4 bytes after the pad.
> 
> We can fix it like this ...
> 
> +++ b/include/linux/mm_types.h
> @@ -96,11 +96,12 @@ struct page {
> unsigned long private;
> };
> struct {/* page_pool used by netstack */
> +   unsigned long _page_pool_pad;

I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
value after @dma_addr, and he could use this area instead.

[1] 
https://lore.kernel.org/netdev/20210409223801.104657-3-mcr...@linux.microsoft.com/

When adding/changing this, we need to make sure that it doesn't overlap
member @index, because network stack use/check page_is_pfmemalloc().
As far as my calculations this is safe to add.  I always try to keep an
eye out for this, but I wonder if we could have a build check like yours.


> /**
>  * @dma_addr: might require a 64-bit value even on
>  * 32-bit architectures.
>  */
> -   dma_addr_t dma_addr;
> +   dma_addr_t dma_addr __packed;
> };
> struct {/* slab, slob and slub */
> union {
> 
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
> 
> We could also do:
> 
> +   dma_addr_t dma_addr __packed __aligned(sizeof(void 
> *));
> 
> and I see pahole, at least sees this correctly:
> 
> struct {
> long unsigned int _page_pool_pad; /* 4 4 */
> dma_addr_t dma_addr __attribute__((__aligned__(4))); 
> /* 8 8 */
> } __attribute__((__packed__)) 
> __attribute__((__aligned__(4)));  
> 
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.

I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
XDP+page_pool) ?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



Bogus struct page layout on 32-bit

2021-04-09 Thread Matthew Wilcox
On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> >> include/linux/mm_types.h:274:1: error: static_assert failed due to 
> >> requirement '__builtin_offsetof(struct page, lru) == 
> >> __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == 
> >> offsetof(struct folio, lru)"
>FOLIO_MATCH(lru, lru);
>include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
>static_assert(offsetof(struct page, pg) == offsetof(struct folio, 
> fl))

Well, this is interesting.  pahole reports:

struct page {
long unsigned int  flags;/* 0 4 */
/* XXX 4 bytes hole, try to pack */
union {
struct {
struct list_head lru;/* 8 8 */
...
struct folio {
union {
struct {
long unsigned int flags; /* 0 4 */
struct list_head lru;/* 4 8 */

so this assert has absolutely done its job.

But why has this assert triggered?  Why is struct page layout not what
we thought it was?  Turns out it's the dma_addr added in 2019 by commit
c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
the whole union gets moved out by 4 bytes.

Unfortunately, we can't just fix this by putting an 'unsigned long pad'
in front of it.  It still aligns the entire union to 8 bytes, and then
it skips another 4 bytes after the pad.

We can fix it like this ...

+++ b/include/linux/mm_types.h
@@ -96,11 +96,12 @@ struct page {
unsigned long private;
};
struct {/* page_pool used by netstack */
+   unsigned long _page_pool_pad;
/**
 * @dma_addr: might require a 64-bit value even on
 * 32-bit architectures.
 */
-   dma_addr_t dma_addr;
+   dma_addr_t dma_addr __packed;
};
struct {/* slab, slob and slub */
union {

but I don't know if GCC is smart enough to realise that dma_addr is now
on an 8 byte boundary and it can use a normal instruction to access it,
or whether it'll do something daft like use byte loads to access it.

We could also do:

+   dma_addr_t dma_addr __packed __aligned(sizeof(void *));

and I see pahole, at least sees this correctly:

struct {
long unsigned int _page_pool_pad; /* 4 4 */
dma_addr_t dma_addr __attribute__((__aligned__(4))); /* 
8 8 */
} __attribute__((__packed__)) __attribute__((__aligned__(4)));  

This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
/ dma_addr_t.  Advice, please?