Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On 2022/06/02 16:38, Arnd Bergmann wrote: I think that SmPL script worked great, almost every instance is something that ought to be changed, as long as it stops reporting those structures that are also __aligned(). I would extend it to also report structures with 'bool', 'enum', or any pointer, but that could give more false-positives. Maybe have a separate script for those instances embedding atomics or spinlocks (very broken) vs the other members (causes more harm than good or might need alignment). I extended my script to detect __packed struct or union without __aligned. It is split in two scripts. The first one is to search for problematic cases where __packed structs/unions have atomic types or spinlock types. In this version, types whose names contain "atomic" or "spinlock" are targeted. == Scripts == @r@ type T; identifier i; type b =~ ".*(atomic|spinlock).*"; position p; attribute name __packed, __aligned; attribute at; @@ T@p { ... b i; ... } at; @script:python@ p
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Thu, 2 Jun 2022 at 14:12, Arnd Bergmann wrote: > > On Thu, Jun 2, 2022 at 1:21 PM Tetsuo Handa > wrote: > > On 2022/06/02 16:38, Arnd Bergmann wrote: > > >> But let's cc the tomoyo and chelsio people. > > > > > > I think both of them work because the structures are always > > > embedded inside of larger structures that have at least word > > > alignment. This is the thing I was looking for, and the > > > __packed attribute was added in error, most likely copied > > > from somewhere else. > > > > The __packed in "struct tomoyo_shared_acl_head" is to embed next > > naturally-aligned member of a larger struct into the bytes that > > would have been wasted if __packed is not specified. For example, > > > > struct tomoyo_shared_acl_head { > > struct list_head list; > > atomic_t users; > > } __packed; > > > > struct tomoyo_condition { > > struct tomoyo_shared_acl_head head; > > u32 size; /* Memory size allocated for this entry. */ > > (...snipped...) > > }; > > > > saves 4 bytes on 64 bits build. > > > > If the next naturally-aligned member of a larger struct is larger than > > the bytes that was saved by __packed, the saved bytes will be unused. > > Ok, got it. I think as gcc should still be able to always figure out the > alignment when accessing the atomic, without ever falling back > to byte access on an atomic_get() or atomic_set(). > > To be on the safe side, I would still either move the __packed attribute > to the 'list' member, or make the structure '__aligned(4)'. > The tomoyo code generates lots of byte size accesses when built for ARMv5, but interestingly, the atomic_t accesses are emitted normally, probably due to the fact that the C api (atomic_read, atomic_set, etc) takes atomic_t pointers, and so GCC assumes natural alignment, even when inlining. But ordinary accesses of multi-byte fields in the structs in question are emitted as sequences of LDRB instructions. I understand the reason for these annotations, but I think we should drop them anyway, and just let the compiler organize the structs.
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On 2022/06/02 16:38, Arnd Bergmann wrote: >> But let's cc the tomoyo and chelsio people. > > I think both of them work because the structures are always > embedded inside of larger structures that have at least word > alignment. This is the thing I was looking for, and the > __packed attribute was added in error, most likely copied > from somewhere else. The __packed in "struct tomoyo_shared_acl_head" is to embed next naturally-aligned member of a larger struct into the bytes that would have been wasted if __packed is not specified. For example, struct tomoyo_shared_acl_head { struct list_head list; atomic_t users; } __packed; struct tomoyo_condition { struct tomoyo_shared_acl_head head; u32 size; /* Memory size allocated for this entry. */ (...snipped...) }; saves 4 bytes on 64 bits build. If the next naturally-aligned member of a larger struct is larger than the bytes that was saved by __packed, the saved bytes will be unused.
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Thu, Jun 2, 2022 at 1:21 PM Tetsuo Handa wrote: > On 2022/06/02 16:38, Arnd Bergmann wrote: > >> But let's cc the tomoyo and chelsio people. > > > > I think both of them work because the structures are always > > embedded inside of larger structures that have at least word > > alignment. This is the thing I was looking for, and the > > __packed attribute was added in error, most likely copied > > from somewhere else. > > The __packed in "struct tomoyo_shared_acl_head" is to embed next > naturally-aligned member of a larger struct into the bytes that > would have been wasted if __packed is not specified. For example, > > struct tomoyo_shared_acl_head { > struct list_head list; > atomic_t users; > } __packed; > > struct tomoyo_condition { > struct tomoyo_shared_acl_head head; > u32 size; /* Memory size allocated for this entry. */ > (...snipped...) > }; > > saves 4 bytes on 64 bits build. > > If the next naturally-aligned member of a larger struct is larger than > the bytes that was saved by __packed, the saved bytes will be unused. Ok, got it. I think as gcc should still be able to always figure out the alignment when accessing the atomic, without ever falling back to byte access on an atomic_get() or atomic_set(). To be on the safe side, I would still either move the __packed attribute to the 'list' member, or make the structure '__aligned(4)'. Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On 2022/06/01 1:41, Linus Torvalds wrote: On Tue, May 31, 2022 at 1:04 AM Arnd Bergmann wrote: As an experiment: what kind of results would we get when looking for packed structures and unions that contain any of these: I don't think we have that. Not only because it would already cause breakage, but simply because the kinds of structures that people pack aren't generally the kind that contain these kinds of things. That said, you might have a struct that is packed, but that intentionally aligns parts of itself, so it *could* be valid. But it would probably not be a bad idea to check that packed structures/unions don't have atomic types or locks in them. I _think_ we're all good, but who knows.. I am Julia's student at INRIA and I heard from her that there is an opportunity to use Coccinelle to find specific types in packed struct or enum. I found 13 definitions of packed structure that contains: > - spinlock_t > - atomic_t > - dma_addr_t > - phys_addr_t > - size_t > - struct mutex > - struct device - raw_spinlock_t == Results == security/tomoyo/common.h: atomic_t in tomoyo_shared_acl_head drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h: spinlock_t in key_map include/linux/ti-emif-sram.h: phys_addr_t in ti_emif_pm_data drivers/scsi/wd719x.h: dma_addr_t in wd719x_scb drivers/net/wireless/intel/ipw2x00/ipw2200.h: dma_addr_t in clx2_queue drivers/infiniband/hw/irdma/osdep.h: dma_addr_t in irdma_dma_mem drivers/infiniband/core/mad_priv.h: size_t in ib_mad_private drivers/crypto/qat/qat_common/qat_asym_algs.c: - dma_addr_t in qat_rsa_ctx - dma_addr_t in qat_dh_ctx drivers/atm/idt77252.h: dma_addr_t in idt77252_skb_prv arch/s390/include/asm/kvm_host.h: atomic_t in kvm_s390_sie_block drivers/net/wireless/ath/ath10k/core.h: dma_addr_t in ath10k_skb_cb drivers/net/wireless/ath/ath11k/core.h: dma_addr_t in ath10k_skb_cb drivers/crypto/ccp/ccp-dev.h: dma_addr_t in ccp_dma_info The last 3 structures have a dma_adddr_t member defined as the first member variable. Most of the others also seems valid. I used this SmPL script to find them: @e1@ type T; identifier i; position p; attribute name __packed; @@ T@p { ... ( atomic_t i; | raw_spinlock_t i; | struct mutex i; | spinlock_t i; | dma_addr_t i; | phys_addr_t i; | size_t i; | struct device i; ) ... } __packed; @e2@ type T; identifier i; position p; @@ T@p { ... ( atomic_t i; | raw_spinlock_t i; | struct mutex i; | spinlock_t i; | dma_addr_t i; | phys_addr_t i; | size_t i; | struct device i; ) ... } __attribute__(( ( pack | __pack__ ) ,... )); @script:python@ ps <
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Thu, Jun 2, 2022 at 3:08 AM Linus Torvalds wrote: > > On Wed, Jun 1, 2022 at 3:28 PM Keisuke Nishimura > wrote: > > > > > > I found 13 definitions of packed structure that contains: > > > - spinlock_t > > > - atomic_t > > > - dma_addr_t > > > - phys_addr_t > > > - size_t > > > - struct mutex > > > - struct device > > > > - raw_spinlock_t > > Ok, so I don't think dma_addr_t/phys_addr_t/size_t are problematic, > they are just regular integers. > > And 'struct device' is problematic only as it then contains any of the > atomic types (which it does) is I suggested this list because they are problematic for different reasons: - any atomics are clearly broken here - dma_addr_t/phys_addr_t are sometimes put into hardware data structures in coherent DMA allocations. This is broken because these types are variably-sized depending on the architecture, and annotating structures in uncached memory as __packed is also broken on architectures that have neither coherent DMA nor unaligned access (most 32-bit mips and armv5), where this will result in a series of expensive one-byte uncached load/store instructions. - having any complex kernel data structure embedded in a __packed struct is a red flag, because there should not be a need to mark it packed for compatibility with either hardware or user space. If the structure is actually misaligned, passing a pointer for the embedded struct into an interface that expects an aligned pointer is undefined behavior in C, and gcc may decide to do something bad here even on architectures that can access unaligned pointers. > > security/tomoyo/common.h: atomic_t in tomoyo_shared_acl_head > > drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h: spinlock_t in > > key_map > > arch/s390/include/asm/kvm_host.h: atomic_t in kvm_s390_sie_block > > So these do look problematic. > > I'm not actually clear on why tomoyo_shared_acl_head would be packed. > That makes no sense to me. > > Same goes for key_map, it's not clear what the reason for that > __packed is, and it's clearly bogus. It might work, almost by mistake, > but it's wrong to try to pack that spinlock_t. > > The s390 kvm use actually looks fine: the structure is packed, but > it's also aligned, and the spin-lock is at the beginning, so the > "packing" part is about the other members, not the first one. Right, I think the coccinelle script should nor report structures that are both packed and aligned. > The two that look wrong look like they will probably work anyway > (they'll presumably be effectively word-aligned, and that's sufficient > for spinlocks in practice). > > But let's cc the tomoyo and chelsio people. I think both of them work because the structures are always embedded inside of larger structures that have at least word alignment. This is the thing I was looking for, and the __packed attribute was added in error, most likely copied from somewhere else. Looking at the other ones: include/linux/ti-emif-sram.h: phys_addr_t in ti_emif_pm_data No misalignment because of the __aligned(8), but this might go wrong if the emif firmware relies on the structure layout to have a 32-bit phys_addr_t. drivers/scsi/wd719x.h: dma_addr_t in wd719x_scb This one is correct, as the structure has 64 bytes of hardware data and a few members that are only accessed by the kernel. There should still be an __aligned(8) for efficiency. drivers/net/wireless/intel/ipw2x00/ipw2200.h: dma_addr_t in clx2_queue Al marked the incorrect __packed annotations in 2008, see 83f7d57c37e8 ("ipw2200 annotations and fixes"). Mostly harmless but the __packed should just get removed here. > drivers/infiniband/hw/irdma/osdep.h: dma_addr_t in irdma_dma_mem > drivers/infiniband/core/mad_priv.h: size_t in ib_mad_private Same here: harmless but __packed should be removed, possibly while reordering members by size. > drivers/crypto/qat/qat_common/qat_asym_algs.c: > - dma_addr_t in qat_rsa_ctx > - dma_addr_t in qat_dh_ctx Probably harmless because the structure is __aligned(64), but I'm completely puzzled by what the author was actually trying to achieve here. There are also 'bool' members in the packed struct, which is probably something we want to look for as well. > drivers/atm/idt77252.h: dma_addr_t in idt77252_skb_prv This is a bug on architectures with 64-bit dma_addr_t, it should be an __le32, and the structure should be __aligned() as a DMA descriptor. > drivers/net/wireless/ath/ath10k/core.h: dma_addr_t in ath10k_skb_cb > drivers/net/wireless/ath/ath11k/core.h: dma_addr_t in ath10k_skb_cb Should almost certainly not be __packed, fixing these will likely improve performance on mips32 routers using ath10k > drivers/crypto/ccp/ccp-dev.h: dma_addr_t in ccp_dma_info This looks ok, the "__packed __aligned(4)" here can save a bit of stack space as intended. I think that SmPL script worked great, almost every instance is something that ought to be changed, as long as it stops reporting those structure
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Wed, Jun 1, 2022 at 3:28 PM Keisuke Nishimura wrote: > > > I found 13 definitions of packed structure that contains: > > - spinlock_t > > - atomic_t > > - dma_addr_t > > - phys_addr_t > > - size_t > > - struct mutex > > - struct device > > - raw_spinlock_t Ok, so I don't think dma_addr_t/phys_addr_t/size_t are problematic, they are just regular integers. And 'struct device' is problematic only as it then contains any of the atomic types (which it does) > security/tomoyo/common.h: atomic_t in tomoyo_shared_acl_head > drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h: spinlock_t in > key_map > arch/s390/include/asm/kvm_host.h: atomic_t in kvm_s390_sie_block So these do look problematic. I'm not actually clear on why tomoyo_shared_acl_head would be packed. That makes no sense to me. Same goes for key_map, it's not clear what the reason for that __packed is, and it's clearly bogus. It might work, almost by mistake, but it's wrong to try to pack that spinlock_t. The s390 kvm use actually looks fine: the structure is packed, but it's also aligned, and the spin-lock is at the beginning, so the "packing" part is about the other members, not the first one. The two that look wrong look like they will probably work anyway (they'll presumably be effectively word-aligned, and that's sufficient for spinlocks in practice). But let's cc the tomoyo and chelsio people. Linus
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Tue, May 31, 2022 at 1:04 AM Arnd Bergmann wrote: > > As an experiment: what kind of results would we get when looking > for packed structures and unions that contain any of these: Yeah, any atomics or locks should always be aligned, and won't even work (or might be *very* slow) on multiple architectures. Even x86 - which does very well on unaligned data - reacts badly to sufficiently unaligned atomics (ie cacheline crossing). I don't think we have that. Not only because it would already cause breakage, but simply because the kinds of structures that people pack aren't generally the kind that contain these kinds of things. That said, you might have a struct that is packed, but that intentionally aligns parts of itself, so it *could* be valid. But it would probably not be a bad idea to check that packed structures/unions don't have atomic types or locks in them. I _think_ we're all good, but who knows.. Linus
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Tue, May 31, 2022 at 8:26 AM Julia Lawall wrote: > > On 30 May 2022, at 15:27, Arnd Bergmann wrote: > > On Mon, May 30, 2022 at 4:08 PM Jani Nikula wrote: > >>> On Mon, 30 May 2022, Arnd Bergmann wrote: > >>> struct my_driver_priv { > >>> struct device dev; > >>> u8 causes_misalignment; > >>> spinlock_t lock; > >>> atomic_t counter; > >>> } __packed; /* this annotation is harmful because it breaks the atomics */ > >> > >> I wonder if this is something that could be caught with coccinelle. Or > >> sparse. Are there any cases where this combo is necessary? (I can't > >> think of any, but it's a low bar. ;) > >> ... > >>> or if the annotation does not change the layout like > >>> > >>> struct my_dma_descriptor { > >>> __le64 address; > >>> __le64 length; > >>> } __packed; /* does not change layout but makes access slow on some > >>> architectures */ > >> > >> Why is this the case, though? I'd imagine the compiler could figure this > >> out. > > > > When you annotate the entire structure as __packed without an > > extra __aligned() annotation, the compiler has to assume that the > > structure itself is unaligned as well. On many of the older architectures, > > this will result in accessing the values one byte at a time. Marking > > the structure as "__packed __aligned(8)" instead would be harmless. > > > > When I have a structure with a few misaligned members, I generally > > prefer to only annotate the members that are not naturally aligned, > > but this approach is not very common. > > Searching for specific types in a packed structure would be easy. As an experiment: what kind of results would we get when looking for packed structures and unions that contain any of these: - spinlock_t - atomic_t - dma_addr_t - phys_addr_t - size_t - any pointer - any enum - struct mutex - struct device This is just a list of common data types that are used in a lot of structures but that one should never find in hardware specific types. If the output from coccinelle is 90% actual bugs, this would be really helpful. OTOH if there is no output at all, or all false-positives, we don't need to look for additional types. > Coccinelle could duplicate the structure without the packed and see if > any offsets change, using build bug on, but that would be architecture > specific so maybe not useful. I would consider this a separate issue. The first one above is for identifying structures that are marked as packed but should not be packed at all, regardless of whether that changes the layout. Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
> On 30 May 2022, at 15:27, Arnd Bergmann wrote: > > On Mon, May 30, 2022 at 4:08 PM Jani Nikula wrote: >>> On Mon, 30 May 2022, Arnd Bergmann wrote: >>> struct my_driver_priv { >>> struct device dev; >>> u8 causes_misalignment; >>> spinlock_t lock; >>> atomic_t counter; >>> } __packed; /* this annotation is harmful because it breaks the atomics */ >> >> I wonder if this is something that could be caught with coccinelle. Or >> sparse. Are there any cases where this combo is necessary? (I can't >> think of any, but it's a low bar. ;) >> >> Cc: Julia. > > I think one would first have to make a list of data types that are not > meant to be in a packed structure. It could be a good start to > search for any packed aggregates with a pointer, atomic_t or spinlock_t > in them, but there are of course many more types that you won't > find in hardware structures. > >>> or if the annotation does not change the layout like >>> >>> struct my_dma_descriptor { >>> __le64 address; >>> __le64 length; >>> } __packed; /* does not change layout but makes access slow on some >>> architectures */ >> >> Why is this the case, though? I'd imagine the compiler could figure this >> out. > > When you annotate the entire structure as __packed without an > extra __aligned() annotation, the compiler has to assume that the > structure itself is unaligned as well. On many of the older architectures, > this will result in accessing the values one byte at a time. Marking > the structure as "__packed __aligned(8)" instead would be harmless. > > When I have a structure with a few misaligned members, I generally > prefer to only annotate the members that are not naturally aligned, > but this approach is not very common. Searching for specific types in a packed structure would be easy. Coccinelle could duplicate the structure without the packed and see if any offsets change, using build bug on, but that would be architecture specific so maybe not useful. Julia > Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, May 30, 2022 at 03:35:28PM +0200, Arnd Bergmann wrote: > The annotations for edid are completely correct and necessary. However > other driver authors just slap __packed annotations on any structure > even if the layout is not fixed at all like: > > struct my_driver_priv { >struct device dev; >u8 causes_misalignment; >spinlock_t lock; >atomic_t counter; > } __packed; /* this annotation is harmful because it breaks the atomics */ > > or if the annotation does not change the layout like > > struct my_dma_descriptor { > __le64 address; > __le64 length; > } __packed; /* does not change layout but makes access slow on some > architectures */ Sounds like we need a howto document for people to ignore and continue doing their own thing. :P -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, May 30, 2022 at 02:43:45PM +0200, Arnd Bergmann wrote: > Overall I'm not that worried because the only machines running OABI > kernels would be on really old hardware that runs a limited set of > driver code. ... and from what I remember, none of them care about EDID anyway. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, May 30, 2022 at 12:33:17PM +0300, Jani Nikula wrote: > On Mon, 30 May 2022, Jani Nikula wrote: > > On Sat, 28 May 2022, Linus Torvalds wrote: > >> On Sat, May 28, 2022 at 11:59 AM Arnd Bergmann wrote: > >>> > >>> It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this > >>> option, you the kernel is built for the old 'OABI' that forces all > >>> non-packed > >>> struct members to be at least 16-bit aligned. > >> > >> Looks like forced word (32 bit) alignment to me. > >> > >> I wonder how many other structures that messes up, but I committed the > >> EDID fix for now. > > > > Thanks for the fix, and the thorough commit message! > > > >> This has presumably been broken for a long time, but maybe the > >> affected targets don't typically use EDID and kernel modesetting, and > >> only use some fixed display setup instead. > >> > >> Those structure definitions go back a _loong_ time (from a quick 'git > >> blame' I see November 2008). > >> > >> But despite that, I did not mark my fix 'cc:stable' because I don't > >> know if any of those machines affected by this bad arm ABI issue could > >> possibly care. > >> > >> At least my tree hopefully now builds on them, with the BUILD_BUG_ON() > >> that uncovered this. > > > > Indeed the bug is ancient. I just threw in the BUILD_BUG_ON() on a whim > > as an extra sanity check when doing pointer arithmetics on struct edid > > *. > > > > If there are affected machines, buffer overflows are the real danger due > > to edid->extensions indicating the number of extensions. > > That is, for EDID. Makes you wonder about all the other packed structs > with enum members across the kernel. enum should not be used in structures if the layout of the struct matters. ISTR there was a proposal for EABI to make enums just about big enough to hold their enumerated constants - so you'd end up with 8-bit, 16-bit etc according to the largest enumerated value that the compiler thinks it could hold. That's a latent disaster when enums get used in structs where the layout matters, __packed or not. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, May 30, 2022 at 4:08 PM Jani Nikula wrote: > On Mon, 30 May 2022, Arnd Bergmann wrote: > > struct my_driver_priv { > >struct device dev; > >u8 causes_misalignment; > >spinlock_t lock; > >atomic_t counter; > > } __packed; /* this annotation is harmful because it breaks the atomics */ > > I wonder if this is something that could be caught with coccinelle. Or > sparse. Are there any cases where this combo is necessary? (I can't > think of any, but it's a low bar. ;) > > Cc: Julia. I think one would first have to make a list of data types that are not meant to be in a packed structure. It could be a good start to search for any packed aggregates with a pointer, atomic_t or spinlock_t in them, but there are of course many more types that you won't find in hardware structures. > > or if the annotation does not change the layout like > > > > struct my_dma_descriptor { > > __le64 address; > > __le64 length; > > } __packed; /* does not change layout but makes access slow on some > > architectures */ > > Why is this the case, though? I'd imagine the compiler could figure this > out. When you annotate the entire structure as __packed without an extra __aligned() annotation, the compiler has to assume that the structure itself is unaligned as well. On many of the older architectures, this will result in accessing the values one byte at a time. Marking the structure as "__packed __aligned(8)" instead would be harmless. When I have a structure with a few misaligned members, I generally prefer to only annotate the members that are not naturally aligned, but this approach is not very common. Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, 30 May 2022, Arnd Bergmann wrote: > On Mon, May 30, 2022 at 3:10 PM Jani Nikula wrote: >> > >> > I think in general, most __packed annotations we have in the kernel are >> > completely pointless because they do not change the structure layout on >> > any architecture but instead just make member access slower on >> >> Please explain. >> >> They are used quite a bit for parsing blob data, or >> serialization/deserialization, like in the EDID case at hand. Try >> removing __attribute__((packed)) from include/drm/drm_edid.h and see the >> sizeof(struct edid) on any architecture. > > The annotations for edid are completely correct and necessary. However > other driver authors just slap __packed annotations on any structure > even if the layout is not fixed at all like: Right. Thanks for the examples. > struct my_driver_priv { >struct device dev; >u8 causes_misalignment; >spinlock_t lock; >atomic_t counter; > } __packed; /* this annotation is harmful because it breaks the atomics */ I wonder if this is something that could be caught with coccinelle. Or sparse. Are there any cases where this combo is necessary? (I can't think of any, but it's a low bar. ;) Cc: Julia. > or if the annotation does not change the layout like > > struct my_dma_descriptor { > __le64 address; > __le64 length; > } __packed; /* does not change layout but makes access slow on some > architectures */ Why is this the case, though? I'd imagine the compiler could figure this out. BR, Jani. -- Jani Nikula, Intel Open Source Graphics Center
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, 30 May 2022, Arnd Bergmann wrote: > On Mon, May 30, 2022 at 11:33 AM Jani Nikula wrote: >> >> That is, for EDID. Makes you wonder about all the other packed structs >> with enum members across the kernel. > > It is not the 'enum' that is special here, it's the 'union' having > unpacked members, Obviously meant union not enum, that was just a -ENOCOFFEE on my part. > and the same thing happens when you have nested structs: both the inner > and the outer aggregate need to be packed, either with __packed at the > end, or on each individual member that is not fully aligned to > max(sizeof(member), 4)). > > I think in general, most __packed annotations we have in the kernel are > completely pointless because they do not change the structure layout on > any architecture but instead just make member access slower on Please explain. They are used quite a bit for parsing blob data, or serialization/deserialization, like in the EDID case at hand. Try removing __attribute__((packed)) from include/drm/drm_edid.h and see the sizeof(struct edid) on any architecture. BR, Jani. > architectures that lack unaligned load/store instructions. There have > definitely been other cases though where a __packed annotation is > not needed on any sane architecture but is needed for OABI ARM. > > Overall I'm not that worried because the only machines running OABI > kernels would be on really old hardware that runs a limited set of > driver code. > > A completely different matter are the extraneous __packed annotations > that lead to possible problems when accessed through a misaligned > pointer. We ignore -Waddress-of-packed-member and -Wcast-align > in the kernel, so these never get caught at build time, but we have > seen bugs from gcc making incorrect assumptions about alignment > even on architectures that have unaligned load/store instructions. > > Arnd -- Jani Nikula, Intel Open Source Graphics Center
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, May 30, 2022 at 11:33 AM Jani Nikula wrote: > > That is, for EDID. Makes you wonder about all the other packed structs > with enum members across the kernel. It is not the 'enum' that is special here, it's the 'union' having unpacked members, and the same thing happens when you have nested structs: both the inner and the outer aggregate need to be packed, either with __packed at the end, or on each individual member that is not fully aligned to max(sizeof(member), 4)). I think in general, most __packed annotations we have in the kernel are completely pointless because they do not change the structure layout on any architecture but instead just make member access slower on architectures that lack unaligned load/store instructions. There have definitely been other cases though where a __packed annotation is not needed on any sane architecture but is needed for OABI ARM. Overall I'm not that worried because the only machines running OABI kernels would be on really old hardware that runs a limited set of driver code. A completely different matter are the extraneous __packed annotations that lead to possible problems when accessed through a misaligned pointer. We ignore -Waddress-of-packed-member and -Wcast-align in the kernel, so these never get caught at build time, but we have seen bugs from gcc making incorrect assumptions about alignment even on architectures that have unaligned load/store instructions. Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Mon, 30 May 2022, Jani Nikula wrote: > On Sat, 28 May 2022, Linus Torvalds wrote: >> On Sat, May 28, 2022 at 11:59 AM Arnd Bergmann wrote: >>> >>> It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this >>> option, you the kernel is built for the old 'OABI' that forces all >>> non-packed >>> struct members to be at least 16-bit aligned. >> >> Looks like forced word (32 bit) alignment to me. >> >> I wonder how many other structures that messes up, but I committed the >> EDID fix for now. > > Thanks for the fix, and the thorough commit message! > >> This has presumably been broken for a long time, but maybe the >> affected targets don't typically use EDID and kernel modesetting, and >> only use some fixed display setup instead. >> >> Those structure definitions go back a _loong_ time (from a quick 'git >> blame' I see November 2008). >> >> But despite that, I did not mark my fix 'cc:stable' because I don't >> know if any of those machines affected by this bad arm ABI issue could >> possibly care. >> >> At least my tree hopefully now builds on them, with the BUILD_BUG_ON() >> that uncovered this. > > Indeed the bug is ancient. I just threw in the BUILD_BUG_ON() on a whim > as an extra sanity check when doing pointer arithmetics on struct edid > *. > > If there are affected machines, buffer overflows are the real danger due > to edid->extensions indicating the number of extensions. That is, for EDID. Makes you wonder about all the other packed structs with enum members across the kernel. BR, Jani. -- Jani Nikula, Intel Open Source Graphics Center
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, 28 May 2022, Linus Torvalds wrote: > On Sat, May 28, 2022 at 11:59 AM Arnd Bergmann wrote: >> >> It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this >> option, you the kernel is built for the old 'OABI' that forces all non-packed >> struct members to be at least 16-bit aligned. > > Looks like forced word (32 bit) alignment to me. > > I wonder how many other structures that messes up, but I committed the > EDID fix for now. Thanks for the fix, and the thorough commit message! > This has presumably been broken for a long time, but maybe the > affected targets don't typically use EDID and kernel modesetting, and > only use some fixed display setup instead. > > Those structure definitions go back a _loong_ time (from a quick 'git > blame' I see November 2008). > > But despite that, I did not mark my fix 'cc:stable' because I don't > know if any of those machines affected by this bad arm ABI issue could > possibly care. > > At least my tree hopefully now builds on them, with the BUILD_BUG_ON() > that uncovered this. Indeed the bug is ancient. I just threw in the BUILD_BUG_ON() on a whim as an extra sanity check when doing pointer arithmetics on struct edid *. If there are affected machines, buffer overflows are the real danger due to edid->extensions indicating the number of extensions. BR, Jani. -- Jani Nikula, Intel Open Source Graphics Center
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, May 28, 2022 at 10:31 PM Linus Torvalds wrote: > On Sat, May 28, 2022 at 11:59 AM Arnd Bergmann wrote: > > > > It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this > > option, you the kernel is built for the old 'OABI' that forces all > > non-packed > > struct members to be at least 16-bit aligned. > > Looks like forced word (32 bit) alignment to me. Ah, of course, I keep mixing it up with the odd structure alignment of m68k, which does the opposite and aligns struct members to no more than 16 bits. Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, May 28, 2022 at 11:08:48AM -0700, Linus Torvalds wrote: > This smells like a compiler bug triggered by "there's a 16-bit member > field in that gtf2 structure, and despite it being packed and aligned > to 1, we somehow still align the size to 2". It's an age old thing, it's no compiler bug, and it's completely compliant with the C standards. Implementations are permitted by the C standard to pad structures and unions how they see fit - and some do if it makes sense for performance. The mistake is that people forget this detail, and they expect structs and unions to be laid out a certain way - because it doesn't matter to the same extent on x86. However, as older ARM CPUs could not do unaligned loads, ensuring that things were naturally aligned made complete sense, even if it meant that people who assume the world is x86 got tripped up - the only way around that would be to make every load very expensive. It's not "align to size of 2" in OABI, it tends to be align to a multiple of 4, because the underlying architecture is 32-bit. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, May 28, 2022 at 11:59 AM Arnd Bergmann wrote: > > It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this > option, you the kernel is built for the old 'OABI' that forces all non-packed > struct members to be at least 16-bit aligned. Looks like forced word (32 bit) alignment to me. I wonder how many other structures that messes up, but I committed the EDID fix for now. This has presumably been broken for a long time, but maybe the affected targets don't typically use EDID and kernel modesetting, and only use some fixed display setup instead. Those structure definitions go back a _loong_ time (from a quick 'git blame' I see November 2008). But despite that, I did not mark my fix 'cc:stable' because I don't know if any of those machines affected by this bad arm ABI issue could possibly care. At least my tree hopefully now builds on them, with the BUILD_BUG_ON() that uncovered this. Linus
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, May 28, 2022 at 8:08 PM Linus Torvalds wrote: > > Not very much tested, and I have no idea what it is that triggers this > only on spear3xx_defconfig. > > Some ARM ABI issue that is triggered by some very particular ARM > compiler flag enabled by that config? It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this option, you the kernel is built for the old 'OABI' that forces all non-packed struct members to be at least 16-bit aligned. I think Russell still uses OABI kernels on his oldest machines, but it is incompatible with all modern user space and should probably not be in the defconfig. Your patch looks like the correct solution to me. Arnd
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, May 28, 2022 at 10:40 AM Linus Torvalds wrote: > > So digging a bit deeper - since I have am arm compiler after all - I > note that 'sizeof(detailed_timings)' is 88. Hmm. sizeof() both detailed_timings[0].data.other_data.data.range.formula.gtf2 and detailed_timings[0].data.other_data.data.range.formula.cvt is 7. But the *union* of those things is detailed_timings[0].data.other_data.data.range.formula and its size is 8 (despite having an alignment of just 1). The attached patch would seem to fix it for me. Not very much tested, and I have no idea what it is that triggers this only on spear3xx_defconfig. Some ARM ABI issue that is triggered by some very particular ARM compiler flag enabled by that config? Adding some ARM (and SPEAR, and SoC) people in case they have any idea. This smells like a compiler bug triggered by "there's a 16-bit member field in that gtf2 structure, and despite it being packed and aligned to 1, we somehow still align the size to 2". I dunno. But marking those unions packed too doesn't seem wrong, and does seem to fix it. Linus include/drm/drm_edid.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h index c3204a58fb09..b2756753370b 100644 --- a/include/drm/drm_edid.h +++ b/include/drm/drm_edid.h @@ -121,7 +121,7 @@ struct detailed_data_monitor_range { u8 supported_scalings; u8 preferred_refresh; } __attribute__((packed)) cvt; - } formula; + } __attribute__((packed)) formula; } __attribute__((packed)); struct detailed_data_wpindex { @@ -154,7 +154,7 @@ struct detailed_non_pixel { struct detailed_data_wpindex color; struct std_timing timings[6]; struct cvt_timing cvt[4]; - } data; + } __attribute__((packed)) data; } __attribute__((packed)); #define EDID_DETAIL_EST_TIMINGS 0xf7 @@ -172,7 +172,7 @@ struct detailed_timing { union { struct detailed_pixel_timing pixel_data; struct detailed_non_pixel other_data; - } data; + } __attribute__((packed)) data; } __attribute__((packed)); #define DRM_EDID_INPUT_SERRATION_VSYNC (1 << 0)
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Sat, May 28, 2022 at 5:13 AM Sudip Mukherjee wrote: > > just tried this with > make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- drivers/gpu/drm/drm_edid.s > > size_of_edid: > mov r0, #144@, > ldmfd sp, {fp, sp, pc}@ So digging a bit deeper - since I have am arm compiler after all - I note that 'sizeof(detailed_timings)' is 88. Which is completely wrong. It should be 72 bytes (an array of 4 structures, each 18 bytes in size). I have not dug deeper, but that is clearly the issue. Now, why that only happens on that spear3xx_defconfig, I have no idea. Linus
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Fri, May 27, 2022 at 06:04:14PM -0700, Linus Torvalds wrote: > On Fri, May 27, 2022 at 4:41 PM Sudip Mukherjee > wrote: > > > > I just tested with various values, sizeof(*edid) is 144 bytes at that place. > > Hmm. What compiler do you have? Because it seems very broken. > > You don't actually have to try with various sizes, you could have just > done something like > > int size_of_edid(const struct edid *edid) > { > return sizeof(*edid); > } > > and then "make drivers/gpu/drm/drm_edid.s" to generate assembly and > see what it looks like (obviously removing the BUG_ON() in order to > build). just tried this with make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- drivers/gpu/drm/drm_edid.s > > That obviously generates code like > > movl$128, %eax > ret and for me it looks like: .L1030: .word .LC40 .word .LC41 .word -1431655765 .word .LC39 .size drm_edid_to_sad, .-drm_edid_to_sad .align 2 .global size_of_edid .syntax unified .arm .type size_of_edid, %function size_of_edid: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 1, uses_anonymous_args = 0 mov ip, sp @, push{fp, ip, lr, pc}@ sub fp, ip, #4 @,, @ drivers/gpu/drm/drm_edid.c:1573: } mov r0, #144@, ldmfd sp, {fp, sp, pc}@ .size size_of_edid, .-size_of_edid -- Regards Sudip
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
Hi Linus, On Fri, May 27, 2022 at 06:04:14PM -0700, Linus Torvalds wrote: > On Fri, May 27, 2022 at 4:41 PM Sudip Mukherjee > wrote: > > > > I just tested with various values, sizeof(*edid) is 144 bytes at that place. > > Hmm. What compiler do you have? Because it seems very broken. I am using gcc version 11.3.1 20220517 (GCC). And I am not just building spear3xx_defconfig, I am building all the arm configs with the same compiler in the same setup and only spear3xx_defconfig started failing. I am attaching a build summary report generated on 26th May, all arm builds passed, even allmodconfig. > > > But it obviously doesn't happen for me or for most other people, so > it's something in your setup. Unusual compiler? And, just to verify its not my setup or the compiler I use, I took a fresh Debian Bullseye docker, installed 'gcc-arm-linux-gnueabi' from Debian and I see the same build failure with spear3xx_defconfig. This gcc from Debian Bullseye is: gcc version 10.2.1 20210110 (Debian 10.2.1-6). -- Regards Sudip HEAD -> babf0bb978e3 ("Merge tag 'xfs-5.19-for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux") allmodconfig -> pass am200epdkit_defconfig -> pass aspeed_g4_defconfig -> pass aspeed_g5_defconfig -> pass assabet_defconfig -> pass at91_dt_defconfig -> pass axm55xx_defconfig -> pass badge4_defconfig -> pass bcm2835_defconfig -> pass cerfcube_defconfig -> pass clps711x_defconfig -> pass cm_x300_defconfig -> pass cns3420vb_defconfig -> pass colibri_pxa270_defconfig -> pass colibri_pxa300_defconfig -> pass collie_defconfig -> pass corgi_defconfig -> pass davinci_all_defconfig -> pass dove_defconfig -> pass ep93xx_defconfig -> pass eseries_pxa_defconfig -> pass exynos_defconfig -> pass ezx_defconfig -> pass footbridge_defconfig -> pass gemini_defconfig -> pass h3600_defconfig -> pass h5000_defconfig -> pass hackkit_defconfig -> pass hisi_defconfig -> pass imx_v4_v5_defconfig -> pass imx_v6_v7_defconfig -> pass imxrt_defconfig -> pass integrator_defconfig -> pass iop32x_defconfig -> pass ixp4xx_defconfig -> pass jornada720_defconfig -> pass keystone_defconfig -> pass lart_defconfig -> pass lpc18xx_defconfig -> pass lpc32xx_defconfig -> pass lpd270_defconfig -> pass lubbock_defconfig -> pass magician_defconfig -> pass mainstone_defconfig -> pass milbeaut_m10v_defconfig -> pass mini2440_defconfig -> pass mmp2_defconfig -> pass moxart_defconfig -> pass mps2_defconfig -> pass multi_v4t_defconfig -> pass multi_v5_defconfig -> pass multi_v7_defconfig -> pass mv78xx0_defconfig -> pass mvebu_v5_defconfig -> pass mvebu_v7_defconfig -> pass mxs_defconfig -> pass neponset_defconfig -> pass netwinder_defconfig -> pass nhk8815_defconfig -> pass omap1_defconfig -> pass omap2plus_defconfig -> pass orion5x_defconfig -> pass oxnas_v6_defconfig -> pass palmz72_defconfig -> pass pcm027_defconfig -> pass pleb_defconfig -> pass pxa168_defconfig -> pass pxa255-idp_defconfig -> pass pxa3xx_defconfig -> pass pxa910_defconfig -> pass pxa_defconfig -> pass qcom_defconfig -> pass realview_defconfig -> pass rpc_defconfig -> pass s3c2410_defconfig -> pass s3c6400_defconfig -> pass s5pv210_defconfig -> pass sama5_defconfig -> pass sama7_defconfig -> pass shannon_defconfig -> pass shmobile_defconfig -> pass simpad_defconfig -> pass socfpga_defconfig -> pass spear13xx_defconfig -> pass spear3xx_defconfig -> failed spear6xx_defconfig -> pass spitz_defconfig -> pass stm32_defconfig -> pass sunxi_defconfig -> pass tct_hammer_defconfig -> pass tegra_defconfig -> pass trizeps4_defconfig -> pass u8500_defconfig -> pass versatile_defconfig -> pass vexpress_defconfig -> pass vf610m4_defconfig -> pass viper_defconfig -> pass vt8500_v6_v7_defconfig -> pass xcep_defconfig -> pass zeus_defconfig -> pass
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Fri, May 27, 2022 at 4:41 PM Sudip Mukherjee wrote: > > I just tested with various values, sizeof(*edid) is 144 bytes at that place. Hmm. What compiler do you have? Because it seems very broken. You don't actually have to try with various sizes, you could have just done something like int size_of_edid(const struct edid *edid) { return sizeof(*edid); } and then "make drivers/gpu/drm/drm_edid.s" to generate assembly and see what it looks like (obviously removing the BUG_ON() in order to build). That obviously generates code like movl$128, %eax ret for me, and looking at the definition of that type I really can't see how it would ever generate anything else. But it's apparently not even close for you. I suspect some of the structs inside of that 'struct edid' end up getting aligned, despite the '__attribute__((packed))'. For example, 'struct est_timings' is supposed to be just 3 bytes, and it's at an odd offset too (byte offset 35 in the 'struct edid' if I did the math correctly). But it obviously doesn't happen for me or for most other people, so it's something in your setup. Unusual compiler? Linus
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Fri, May 27, 2022 at 7:56 PM Linus Torvalds wrote: > > On Fri, May 27, 2022 at 2:07 AM Sudip Mukherjee > wrote: > > > > declared with attribute error: BUILD_BUG_ON failed: sizeof(*edid) > > != EDID_LENGTH > > > > And, reverting it on top of mainline branch has fixed the build failure. > > Hmm. That BUILD_BUG_ON() looks entirely correct, and if that sizeof() > doesn't work, then the code doesn't work. > > Very strange. It would be interesting to know where that sizeof goes > wrong, but it would seem to be something very peculiar to your build > environment (either that config, or your compiler). I just tested with various values, sizeof(*edid) is 144 bytes at that place. My last good build was with fdaf9a5840ac ("Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache") And my setup has not changed in anyway since then. Also verified the build failure on my laptop. -- Regards Sudip
Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")
On Fri, May 27, 2022 at 2:07 AM Sudip Mukherjee wrote: > > declared with attribute error: BUILD_BUG_ON failed: sizeof(*edid) != > EDID_LENGTH > > And, reverting it on top of mainline branch has fixed the build failure. Hmm. That BUILD_BUG_ON() looks entirely correct, and if that sizeof() doesn't work, then the code doesn't work. I'm not seeing what could go wrong in there, with all the structures I see being marked as __packed__. I wonder if the union in 'struct detailed_timing' also wants that "__attribute__((packed))" but I also wonder what it is that would make this fail on spear3xx but not elsewhere. Very strange. It would be interesting to know where that sizeof goes wrong, but it would seem to be something very peculiar to your build environment (either that config, or your compiler). Linus