Re: [PATCH 3/3] [RFC] Revert "drm/i915: use variadic macros and arrays to choose port/pipe based registers"
On Mon, 20 Mar 2017, Arnd Bergmannwrote: > I don't know how to generate a URL for it, but after adding this to the > command line for gcc-7, > > -fsanitize=kernel-address -fasan-shadow-offset=0xdfff9000 > --param asan-stack=1 --param asan-globals=1 --param > asan-instrumentation-with-call-threshold=1 > -fsanitize-address-use-after-scope > > the code turned from really nice into the log series of checks below. > Without -fsanitize-address-use-after-scope (which didn't exist before gcc-7), > it's less bad but still exceeds the (arbitrary) 1536 byte limit. It seems to be the combination of --param asan-stack=1 and -fsanitize-address-use-after-scope that really blows up the code [1]. I filed a GCC bug on it, mostly to see what they say [2]. I don't know, maybe they think it's expected. *shrug*. BR, Jani. [1] https://godbolt.org/g/hgS817 [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80114 -- Jani Nikula, Intel Open Source Technology Center ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 3/3] [RFC] Revert "drm/i915: use variadic macros and arrays to choose port/pipe based registers"
On Mon, Mar 20, 2017 at 11:08 AM, Jani Nikulawrote: > On Mon, 20 Mar 2017, Arnd Bergmann wrote: >> The varargs macro trick in _PIPE3/_PHY3/_PORT3 was meant as an optimization >> to shrink the i915 kernel module by around 1000 bytes. > > To be clear, it was not at all intended to be an optimization, nothing > of the sort. The intention was to make it easier and less error prone to > add more parameters to said macros. The text size shring was just a > bonus. > >> However, the >> downside is a size regression with CONFIG_KASAN, as I found from stack size >> warnings with gcc-7.0.1: > > In his review of the original change, Chris provided this comparison > https://godbolt.org/g/YCK1od > > How does CONFIG_KASAN change this? Would be nice to see how the > generated code blows up. > I don't know how to generate a URL for it, but after adding this to the command line for gcc-7, -fsanitize=kernel-address -fasan-shadow-offset=0xdfff9000 --param asan-stack=1 --param asan-globals=1 --param asan-instrumentation-with-call-threshold=1 -fsanitize-address-use-after-scope the code turned from really nice into the log series of checks below. Without -fsanitize-address-use-after-scope (which didn't exist before gcc-7), it's less bad but still exceeds the (arbitrary) 1536 byte limit. Arnd .LC0: .string "2 32 4 1 i 96 24 9 " main: .LASANPC0: pushq %r12 pushq %rbp movabsq $-2305966154516004864, %rdx pushq %rbx subq$160, %rsp movq%rsp, %rbp leaq96(%rsp), %rbx movq$1102416563, (%rsp) shrq$3, %rbp movq$.LC0, 8(%rsp) movq$.LASANPC0, 16(%rsp) leaq0(%rbp,%rdx), %rax movl$-235802127, (%rax) movl$-218959356, 4(%rax) movl$-218959118, 8(%rax) movl$-234881024, 12(%rax) movl$-202116109, 16(%rax) movq%rbx, %rax movl$1, 32(%rsp) shrq$3, %rax movzbl (%rax,%rdx), %eax testb %al, %al je .L2 cmpb$3, %al jle .L53 .L2: leaq4(%rbx), %rdi movabsq $-2305966154516004864, %rax movl$0, 96(%rsp) movq%rdi, %rdx shrq$3, %rdx movzbl (%rdx,%rax), %edx movq%rdi, %rax andl$7, %eax addl$3, %eax cmpb%dl, %al jl .L3 testb %dl, %dl jne .L54 .L3: leaq8(%rbx), %rdi movabsq $-2305966154516004864, %rax movl$1, 100(%rsp) movq%rdi, %rdx shrq$3, %rdx movzbl (%rdx,%rax), %eax testb %al, %al je .L4 cmpb$3, %al jle .L55 .L4: leaq12(%rbx), %rdi movabsq $-2305966154516004864, %rax movl$2, 104(%rsp) movq%rdi, %rdx shrq$3, %rdx movzbl (%rdx,%rax), %edx movq%rdi, %rax andl$7, %eax addl$3, %eax cmpb%dl, %al jl .L5 testb %dl, %dl jne .L56 .L5: leaq16(%rbx), %rdi movabsq $-2305966154516004864, %rax movl$3, 108(%rsp) movq%rdi, %rdx shrq$3, %rdx movzbl (%rdx,%rax), %eax testb %al, %al je .L6 cmpb$3, %al jle .L57 .L6: leaq20(%rbx), %rdi movabsq $-2305966154516004864, %rax movl$4, 112(%rsp) movq%rdi, %rdx shrq$3, %rdx movzbl (%rdx,%rax), %edx movq%rdi, %rax andl$7, %eax addl$3, %eax cmpb%dl, %al jl .L7 testb %dl, %dl jne .L58 .L7: movslq 32(%rsp), %r12 movabsq $-2305966154516004864, %rax movl$5, 116(%rsp) leaq(%rbx,%r12,4), %rdi movq%rdi, %rdx shrq$3, %rdx movzbl (%rdx,%rax), %edx movq%rdi, %rax andl$7, %eax addl$3, %eax cmpb%dl, %al jl .L8 testb %dl, %dl jne .L59 .L8: pxor%xmm0, %xmm0 movabsq $-2305966154516004864, %rdx movl96(%rsp,%r12,4), %eax movl$0, 16(%rdx,%rbp) movups %xmm0, 0(%rbp,%rdx) addq$160, %rsp popq%rbx popq%rbp popq%r12 ret .L59: call__asan_report_load4_noabort jmp .L8 .L58: call__asan_report_store4_noabort jmp .L7 .L57: call__asan_report_store4_noabort jmp .L6 .L56: call__asan_report_store4_noabort jmp .L5 .L55: call__asan_report_store4_noabort jmp .L4 .L54: call__asan_report_store4_noabort jmp .L3 .L53: movq%rbx, %rdi call
Re: [PATCH 3/3] [RFC] Revert "drm/i915: use variadic macros and arrays to choose port/pipe based registers"
On Mon, 20 Mar 2017, Arnd Bergmannwrote: > The varargs macro trick in _PIPE3/_PHY3/_PORT3 was meant as an optimization > to shrink the i915 kernel module by around 1000 bytes. To be clear, it was not at all intended to be an optimization, nothing of the sort. The intention was to make it easier and less error prone to add more parameters to said macros. The text size shring was just a bonus. > However, the > downside is a size regression with CONFIG_KASAN, as I found from stack size > warnings with gcc-7.0.1: In his review of the original change, Chris provided this comparison https://godbolt.org/g/YCK1od How does CONFIG_KASAN change this? Would be nice to see how the generated code blows up. BR, Jani. > > before: > drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_get_hw_state': > drivers/gpu/drm/i915/intel_dpll_mgr.c:1644:1: error: the frame size of 176 > bytes is larger than 100 bytes [-Werror=frame-larger-than=] > drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_enable': > drivers/gpu/drm/i915/intel_dpll_mgr.c:1548:1: error: the frame size of 224 > bytes is larger than 100 bytes [-Werror=frame-larger-than=] > > after: > drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_get_hw_state': > drivers/gpu/drm/i915/intel_dpll_mgr.c:1644:1: error: the frame size of 1016 > bytes is larger than 1000 bytes [-Werror=frame-larger-than=] > drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_enable': > drivers/gpu/drm/i915/intel_dpll_mgr.c:1548:1: error: the frame size of 1960 > bytes is larger than 1000 bytes [-Werror=frame-larger-than=] > > I also checked the module sizes and got > > before: >text data bss dec hex filename > 2704592412025 11104 3127721 2fb9a9 drivers/gpu/drm/i915/i915.o > > after: >text data bss dec hex filename > 2718538412025 11104 3141667 2ff023 drivers/gpu/drm/i915/i915.o > > showing a significant code size growth. This reverts the patch to avoid > the warning and get back to the better code with CONFIG_KASAN. If someone > has another idea to avoid the warning while keeping the more efficient > code for the non-KASAN case, that would obviously be better. > > Fixes: ce64645d86ac ("drm/i915: use variadic macros and arrays to choose > port/pipe based registers") > Signed-off-by: Arnd Bergmann > --- > drivers/gpu/drm/i915/i915_reg.h | 11 ++- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 04c8f69fcc62..aa732eccdeb5 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -48,8 +48,6 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) > return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG); > } > > -#define _PICK(__index, ...) (((const u32 []){ __VA_ARGS__ })[__index]) > - > #define _PIPE(pipe, a, b) ((a) + (pipe)*((b)-(a))) > #define _MMIO_PIPE(pipe, a, b) _MMIO(_PIPE(pipe, a, b)) > #define _PLANE(plane, a, b) _PIPE(plane, a, b) > @@ -58,11 +56,14 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) > #define _MMIO_TRANS(tran, a, b) _MMIO(_TRANS(tran, a, b)) > #define _PORT(port, a, b) ((a) + (port)*((b)-(a))) > #define _MMIO_PORT(port, a, b) _MMIO(_PORT(port, a, b)) > -#define _PIPE3(pipe, ...) _PICK(pipe, __VA_ARGS__) > +#define _PIPE3(pipe, a, b, c) ((pipe) == PIPE_A ? (a) : \ > +(pipe) == PIPE_B ? (b) : (c)) > #define _MMIO_PIPE3(pipe, a, b, c) _MMIO(_PIPE3(pipe, a, b, c)) > -#define _PORT3(port, ...) _PICK(port, __VA_ARGS__) > +#define _PORT3(port, a, b, c) ((port) == PORT_A ? (a) : \ > +(port) == PORT_B ? (b) : (c)) > #define _MMIO_PORT3(pipe, a, b, c) _MMIO(_PORT3(pipe, a, b, c)) > -#define _PHY3(phy, ...) _PICK(phy, __VA_ARGS__) > +#define _PHY3(phy, a, b, c) ((phy) == DPIO_PHY0 ? (a) : \ > + (phy) == DPIO_PHY1 ? (b) : (c)) > #define _MMIO_PHY3(phy, a, b, c) _MMIO(_PHY3(phy, a, b, c)) > > #define _MASKED_FIELD(mask, value) ({ >\ -- Jani Nikula, Intel Open Source Technology Center ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 3/3] [RFC] Revert "drm/i915: use variadic macros and arrays to choose port/pipe based registers"
The varargs macro trick in _PIPE3/_PHY3/_PORT3 was meant as an optimization to shrink the i915 kernel module by around 1000 bytes. However, the downside is a size regression with CONFIG_KASAN, as I found from stack size warnings with gcc-7.0.1: before: drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_get_hw_state': drivers/gpu/drm/i915/intel_dpll_mgr.c:1644:1: error: the frame size of 176 bytes is larger than 100 bytes [-Werror=frame-larger-than=] drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_enable': drivers/gpu/drm/i915/intel_dpll_mgr.c:1548:1: error: the frame size of 224 bytes is larger than 100 bytes [-Werror=frame-larger-than=] after: drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_get_hw_state': drivers/gpu/drm/i915/intel_dpll_mgr.c:1644:1: error: the frame size of 1016 bytes is larger than 1000 bytes [-Werror=frame-larger-than=] drivers/gpu/drm/i915/intel_dpll_mgr.c: In function 'bxt_ddi_pll_enable': drivers/gpu/drm/i915/intel_dpll_mgr.c:1548:1: error: the frame size of 1960 bytes is larger than 1000 bytes [-Werror=frame-larger-than=] I also checked the module sizes and got before: textdata bss dec hex filename 2704592 412025 11104 3127721 2fb9a9 drivers/gpu/drm/i915/i915.o after: textdata bss dec hex filename 2718538 412025 11104 3141667 2ff023 drivers/gpu/drm/i915/i915.o showing a significant code size growth. This reverts the patch to avoid the warning and get back to the better code with CONFIG_KASAN. If someone has another idea to avoid the warning while keeping the more efficient code for the non-KASAN case, that would obviously be better. Fixes: ce64645d86ac ("drm/i915: use variadic macros and arrays to choose port/pipe based registers") Signed-off-by: Arnd Bergmann--- drivers/gpu/drm/i915/i915_reg.h | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 04c8f69fcc62..aa732eccdeb5 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -48,8 +48,6 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG); } -#define _PICK(__index, ...) (((const u32 []){ __VA_ARGS__ })[__index]) - #define _PIPE(pipe, a, b) ((a) + (pipe)*((b)-(a))) #define _MMIO_PIPE(pipe, a, b) _MMIO(_PIPE(pipe, a, b)) #define _PLANE(plane, a, b) _PIPE(plane, a, b) @@ -58,11 +56,14 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define _MMIO_TRANS(tran, a, b) _MMIO(_TRANS(tran, a, b)) #define _PORT(port, a, b) ((a) + (port)*((b)-(a))) #define _MMIO_PORT(port, a, b) _MMIO(_PORT(port, a, b)) -#define _PIPE3(pipe, ...) _PICK(pipe, __VA_ARGS__) +#define _PIPE3(pipe, a, b, c) ((pipe) == PIPE_A ? (a) : \ + (pipe) == PIPE_B ? (b) : (c)) #define _MMIO_PIPE3(pipe, a, b, c) _MMIO(_PIPE3(pipe, a, b, c)) -#define _PORT3(port, ...) _PICK(port, __VA_ARGS__) +#define _PORT3(port, a, b, c) ((port) == PORT_A ? (a) : \ + (port) == PORT_B ? (b) : (c)) #define _MMIO_PORT3(pipe, a, b, c) _MMIO(_PORT3(pipe, a, b, c)) -#define _PHY3(phy, ...) _PICK(phy, __VA_ARGS__) +#define _PHY3(phy, a, b, c) ((phy) == DPIO_PHY0 ? (a) : \ +(phy) == DPIO_PHY1 ? (b) : (c)) #define _MMIO_PHY3(phy, a, b, c) _MMIO(_PHY3(phy, a, b, c)) #define _MASKED_FIELD(mask, value) ({ \ -- 2.9.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel