[PATCH] drm/radeon: refactor CIK tiling table initialization

2016-03-09 Thread Josh Poimboeuf
On Mon, Mar 07, 2016 at 11:45:36PM +, Deucher, Alexander wrote:
> > -----Original Message-----
> > From: Josh Poimboeuf [mailto:jpoimboe at redhat.com]
> > Sent: Monday, March 07, 2016 6:10 PM
> > To: Deucher, Alexander; Koenig, Christian
> > Cc: dri-devel at lists.freedesktop.org; linux-kernel at vger.kernel.org;
> > kbuild test robot; Ingo Molnar
> > Subject: [PATCH] drm/radeon: refactor CIK tiling table initialization
> > 
> > When compiling the radeon driver on x86_64 with CONFIG_STACK_VALIDATION
> > enabled, objtool gives the following warnings:
> > 
> >   drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
> >   drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup
> >   drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup
> >   ...
> > 
> > These are actually false positive warnings; there are no frame pointer
> > bugs.  Instead objtool gets confused by the jump tables created by all
> > the switch statements, combined with some other gcc optimizations.  It
> > tries to follow all possible code paths, but it fails to realize that
> > some of the paths aren't possible.  For example:
> > 
> >     4c97:   31 c0                   xor    %eax,%eax
> >     ...
> >     4ca2:   89 c1                   mov    %eax,%ecx
> >     4ca4:   ff 24 cd 00 00 00 00    jmpq   *0x0(,%rcx,8)
> >                     4ca7: R_X86_64_32S      .rodata+0x148
> > 
> > First eax is cleared to zero with the "xor %eax,%eax" instruction.
> > Later, it moves the value of eax (zero in this case) to ecx, and uses
> > that value to jump to the first entry in a jump table in .rodata.
> > 
> > Because objtool doesn't have an x86 emulator, it doesn't know that rcx
> > is zero.  So instead of following a single code path to the first jump
> > table entry, it follows all possible jump table entry paths in parallel.
> > 
> > Usually such overactive analysis isn't a problem.  In every other jump
> > table in the kernel, all the jump targets have the same frame pointer
> > state.  But in this exceedingly rare case, different targets have
> > different frame pointer states.  Objtool notices that and creates the
> > false positive warnings.
> > 
> > In theory we could use the STACK_FRAME_NON_STANDARD marker to tell
> > objtool to skip analysis of the function.  However, that's less than
> > ideal.
> > 
> > Looking at the cik_tiling_mode_table_init() code, it seems overly
> > complex with lots of repetition.  So let's simplify it.  All the switch
> > statements and conditionals can be replaced with much simpler logic by
> > generalizing the different behaviors and moving the initialization data
> > into data structures.
> > 
> > The change is a win-win: it's easier to parse for both humans and
> > machines.  It also reduces the binary size by about 2%:
> > 
> >      text    data     bss     dec     hex filename
> >    101011   30360       0  131371   2012b cik-before.o
> >     98699   30200       0  128899   1f783 cik-after.o
> > 
> > [ Note: Unfortunately I don't know how to test this code, so it's
> >   completely untested.  Any help or guidance with ensuring that the
> >   correct initialization is still being written would be greatly
> >   appreciated! ]
> 
> I think it would be clearer to rework it similarly to how it was
> reworked in amdgpu (see gfx_v8_0.c and gfx_v7_0.c in drm-next).  Also
> ideally you'd update the similar code in si.c as well for consistency.

Hi Alex,

Thanks for the pointers.  As it turns out, the false positive warning in
objtool was easier to fix than I originally thought, so this warning has
gone away.  But regardless I'll follow through and make a v2 patch based
on your suggestions.

-- 
Josh


[PATCH] drm/radeon: refactor CIK tiling table initialization

2016-03-07 Thread Deucher, Alexander
> -----Original Message-----
> From: Josh Poimboeuf [mailto:jpoimboe at redhat.com]
> Sent: Monday, March 07, 2016 6:10 PM
> To: Deucher, Alexander; Koenig, Christian
> Cc: dri-devel at lists.freedesktop.org; linux-kernel at vger.kernel.org;
> kbuild test robot; Ingo Molnar
> Subject: [PATCH] drm/radeon: refactor CIK tiling table initialization
> 
> When compiling the radeon driver on x86_64 with CONFIG_STACK_VALIDATION
> enabled, objtool gives the following warnings:
> 
>   drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
>   drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup
>   drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup
>   ...
> 
> These are actually false positive warnings; there are no frame pointer
> bugs.  Instead objtool gets confused by the jump tables created by all
> the switch statements, combined with some other gcc optimizations.  It
> tries to follow all possible code paths, but it fails to realize that
> some of the paths aren't possible.  For example:
> 
>     4c97:   31 c0                   xor    %eax,%eax
>     ...
>     4ca2:   89 c1                   mov    %eax,%ecx
>     4ca4:   ff 24 cd 00 00 00 00    jmpq   *0x0(,%rcx,8)
>                     4ca7: R_X86_64_32S      .rodata+0x148
> 
> First eax is cleared to zero with the "xor %eax,%eax" instruction.
> Later, it moves the value of eax (zero in this case) to ecx, and uses
> that value to jump to the first entry in a jump table in .rodata.
> 
> Because objtool doesn't have an x86 emulator, it doesn't know that rcx
> is zero.  So instead of following a single code path to the first jump
> table entry, it follows all possible jump table entry paths in parallel.
> 
> Usually such overactive analysis isn't a problem.  In every other jump
> table in the kernel, all the jump targets have the same frame pointer
> state.  But in this exceedingly rare case, different targets have
> different frame pointer states.  Objtool notices that and creates the
> false positive warnings.
> 
> In theory we could use the STACK_FRAME_NON_STANDARD marker to tell
> objtool to skip analysis of the function.  However, that's less than
> ideal.
> 
> Looking at the cik_tiling_mode_table_init() code, it seems overly
> complex with lots of repetition.  So let's simplify it.  All the switch
> statements and conditionals can be replaced with much simpler logic by
> generalizing the different behaviors and moving the initialization data
> into data structures.
> 
> The change is a win-win: it's easier to parse for both humans and
> machines.  It also reduces the binary size by about 2%:
> 
>      text    data     bss     dec     hex filename
>    101011   30360       0  131371   2012b cik-before.o
>     98699   30200       0  128899   1f783 cik-after.o
> 
> [ Note: Unfortunately I don't know how to test this code, so it's
>   completely untested.  Any help or guidance with ensuring that the
>   correct initialization is still being written would be greatly
>   appreciated! ]

I think it would be clearer to rework it similarly to how it was reworked in 
amdgpu (see gfx_v8_0.c and gfx_v7_0.c in drm-next).  Also ideally you'd update 
the similar code in si.c as well for consistency.

Alex




[PATCH] drm/radeon: refactor CIK tiling table initialization

2016-03-07 Thread Josh Poimboeuf
When compiling the radeon driver on x86_64 with CONFIG_STACK_VALIDATION
enabled, objtool gives the following warnings:

  drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
  drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup
  drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup
  ...

These are actually false positive warnings; there are no frame pointer
bugs.  Instead objtool gets confused by the jump tables created by all
the switch statements, combined with some other gcc optimizations.  It
tries to follow all possible code paths, but it fails to realize that
some of the paths aren't possible.  For example:

    4c97:   31 c0                   xor    %eax,%eax
    ...
    4ca2:   89 c1                   mov    %eax,%ecx
    4ca4:   ff 24 cd 00 00 00 00    jmpq   *0x0(,%rcx,8)
                    4ca7: R_X86_64_32S      .rodata+0x148

First eax is cleared to zero with the "xor %eax,%eax" instruction.
Later, it moves the value of eax (zero in this case) to ecx, and uses
that value to jump to the first entry in a jump table in .rodata.

Because objtool doesn't have an x86 emulator, it doesn't know that rcx
is zero.  So instead of following a single code path to the first jump
table entry, it follows all possible jump table entry paths in parallel.

Usually such overactive analysis isn't a problem.  In every other jump
table in the kernel, all the jump targets have the same frame pointer
state.  But in this exceedingly rare case, different targets have
different frame pointer states.  Objtool notices that and creates the
false positive warnings.
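
To make the failure mode concrete, here is a toy model in C.  It is not
objtool's implementation: struct jump_target, count_fp_mismatches() and
the single boolean frame pointer state are hypothetical names invented
for this sketch.  It only shows that a checker which cannot track the
index value has to visit every table entry, and so can report mismatches
on entries that are never reached at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: each jump target records whether the code there
 * assumes the frame pointer has already been saved and set up.
 */
struct jump_target {
	bool expects_fp_setup;
};

/*
 * Count the targets whose assumption conflicts with the frame pointer
 * state at the jump site.  If known_index is >= 0 the checker "knows"
 * the index register's value and follows only that entry; if it is
 * negative the checker follows every entry, as a tool without value
 * tracking has to.
 */
static size_t count_fp_mismatches(const struct jump_target *table, size_t n,
				  bool fp_setup_at_jump, long known_index)
{
	size_t i, mismatches = 0;

	assert(table != NULL);
	assert(known_index < (long)n);

	for (i = 0; i < n; i++) {
		if (known_index >= 0 && (size_t)known_index != i)
			continue;	/* entry treated as unreachable */
		if (table[i].expects_fp_setup != fp_setup_at_jump)
			mismatches++;
	}
	return mismatches;
}
```

With a table whose first entry matches the jump site's state and whose
other entries do not, passing known_index = 0 reports no mismatches,
while passing a negative index (value unknown) reports one per
conflicting entry.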

In theory we could use the STACK_FRAME_NON_STANDARD marker to tell
objtool to skip analysis of the function.  However, that's less than
ideal.

Looking at the cik_tiling_mode_table_init() code, it seems overly
complex with lots of repetition.  So let's simplify it.  All the switch
statements and conditionals can be replaced with much simpler logic by
generalizing the different behaviors and moving the initialization data
into data structures.
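
As a rough illustration of the idea (not the actual patch contents), the
sketch below replaces a nested switch with a lookup table.  All names
and values in it (MODE_*, CFG_*, NUM_STATES, array_mode_switch(),
array_mode_lookup()) are made up for the example and are not the
hardware encodings.  The patch uses the GNU range designator
"[0 ... 4]"; the sketch sticks to plain C99 designated initializers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values for illustration; not the hardware encodings. */
enum { MODE_LINEAR = 0, MODE_1D = 1, MODE_2D = 2 };
enum { CFG_2_PIPES, CFG_4_PIPES, NUM_CFGS };
#define NUM_STATES 4

/* Before: one switch per configuration, one case per state. */
static unsigned char array_mode_switch(int cfg, int state)
{
	switch (cfg) {
	case CFG_2_PIPES:
		switch (state) {
		case 0: return MODE_2D;
		case 1: return MODE_1D;
		default: return MODE_LINEAR;
		}
	case CFG_4_PIPES:
		switch (state) {
		case 0: return MODE_2D;
		case 1: return MODE_2D;
		case 2: return MODE_1D;
		default: return MODE_LINEAR;
		}
	}
	return MODE_LINEAR;
}

/* After: the same data in a table; unlisted entries default to 0. */
static const unsigned char array_mode_table[NUM_CFGS][NUM_STATES] = {
	[CFG_2_PIPES] = { [0] = MODE_2D, [1] = MODE_1D },
	[CFG_4_PIPES] = { [0] = MODE_2D, [1] = MODE_2D, [2] = MODE_1D },
};

static unsigned char array_mode_lookup(int cfg, int state)
{
	assert(cfg >= 0 && cfg < NUM_CFGS);
	assert(state >= 0 && state < NUM_STATES);
	return array_mode_table[cfg][state];
}
```

The table version has no data-dependent branches for the compiler to
turn into jump tables, and the per-configuration differences are visible
side by side.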

The change is a win-win: it's easier to parse for both humans and
machines.  It also reduces the binary size by about 2%:

     text    data     bss     dec     hex filename
   101011   30360       0  131371   2012b cik-before.o
    98699   30200       0  128899   1f783 cik-after.o
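
The "about 2%" figure can be checked against the size output above with
a few lines of C.  The helper name reduction_bp() is hypothetical; it
returns the relative reduction in hundredths of a percent, truncated
toward zero by integer division.

```c
#include <assert.h>

/* Relative reduction from before to after, in hundredths of a percent. */
static unsigned long reduction_bp(unsigned long before, unsigned long after)
{
	assert(before > 0 && after <= before);
	return (before - after) * 10000UL / before;
}
```

For the text column, reduction_bp(101011, 98699) gives 228, i.e. about
2.28%; for the dec total, reduction_bp(131371, 128899) gives 188, i.e.
about 1.88%.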

[ Note: Unfortunately I don't know how to test this code, so it's
  completely untested.  Any help or guidance with ensuring that the
  correct initialization is still being written would be greatly
  appreciated! ]

Reported-by: kbuild test robot 
Signed-off-by: Josh Poimboeuf 
---
Based on linux-next.

 drivers/gpu/drm/radeon/cik.c | 1352 ++
 1 file changed, 325 insertions(+), 1027 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 0600140..1a477e6 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -2327,6 +2327,290 @@ out:
 	return err;
 }

+#define PIPE_CONFIG_2	0
+#define PIPE_CONFIG_4	1
+#define PIPE_CONFIG_8	2
+#define PIPE_CONFIG_16	3
+
+#define PIPE_CONFIG_4_RBS4	4
+
+#define TILE_SPLIT_ROW_SIZE	((unsigned char)-1)
+
+#define NUM_TILE_MODE_STATES		32
+#define NUM_SECONDARY_TILE_MODE_STATES	16
+
+static unsigned char array_modes[][NUM_TILE_MODE_STATES] = {
+	{	/* PIPE_CONFIG_2 */
+		[0 ... 4]	= ARRAY_2D_TILED_THIN1,
+		[5]		= ARRAY_1D_TILED_THIN1,
+		[6 ... 7]	= ARRAY_PRT_2D_TILED_THIN1,
+		[8]		= ARRAY_LINEAR_ALIGNED,
+		[9]		= ARRAY_1D_TILED_THIN1,
+		[10]		= ARRAY_2D_TILED_THIN1,
+		[11]		= ARRAY_PRT_TILED_THIN1,
+		[12]		= ARRAY_PRT_2D_TILED_THIN1,
+		[13]		= ARRAY_1D_TILED_THIN1,
+		[14]		= ARRAY_2D_TILED_THIN1,
+		[16]		= ARRAY_PRT_TILED_THIN1,
+		[17]		= ARRAY_PRT_2D_TILED_THIN1,
+		[27]		= ARRAY_1D_TILED_THIN1,
+		[28]		= ARRAY_PRT_2D_TILED_THIN1,
+		[29]		= ARRAY_PRT_TILED_THIN1,
+		[30]		= ARRAY_PRT_2D_TILED_THIN1,
+	},
+	{	/* PIPE_CONFIG_4 */
+		[0 ... 4]	= ARRAY_2D_TILED_THIN1,
+		[5]		= ARRAY_1D_TILED_THIN1,
+		[6 ... 7]	= ARRAY_PRT_2D_TILED_THIN1,
+		[8]		= ARRAY_LINEAR_ALIGNED,
+		[9]		= ARRAY_1D_TILED_THIN1,
+		[10]		= ARRAY_2D_TILED_THIN1,
+		[11]		= ARRAY_PRT_TILED_THIN1,
+		[12]