[PATCH] drm/radeon: refactor CIK tiling table initialization
On Mon, Mar 07, 2016 at 11:45:36PM +, Deucher, Alexander wrote: > > -Original Message- > > From: Josh Poimboeuf [mailto:jpoimboe at redhat.com] > > Sent: Monday, March 07, 2016 6:10 PM > > To: Deucher, Alexander; Koenig, Christian > > Cc: dri-devel at lists.freedesktop.org; linux-kernel at vger.kernel.org; > > kbuild > > test robot; Ingo Molnar > > Subject: [PATCH] drm/radeon: refactor CIK tiling table initialization > > > > When compiling the radeon driver on x86_64 with > > CONFIG_STACK_VALIDATION > > enabled, objtool gives the following warnings: > > > > drivers/gpu/drm/radeon/cik.o: warning: objtool: > > cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup > > drivers/gpu/drm/radeon/cik.o: warning: objtool: > > cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup > > drivers/gpu/drm/radeon/cik.o: warning: objtool: > > cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup > > ... > > > > These are actually false positive warnings; there are no frame pointer > > bugs. Instead objtool gets confused by the jump tables created by all > > the switch statements, combined with some other gcc optimizations. It > > tries to follows all possible code paths, but it fails to realize that > > some of the paths aren't possible. For example: > > > > 4c97: 31 c0 xor%eax,%eax > > ... > > 4ca2: 89 c1 mov%eax,%ecx > > 4ca4: ff 24 cd 00 00 00 00jmpq *0x0(,%rcx,8) 4ca7: > > R_X86_64_32S > > .rodata+0x148 > > > > First eax is cleared to zero with the "xor %eax,%eax" instruction. > > Later, it moves the value of eax (zero in this case) to ecx, and uses > > that value to jump to the first entry in a jump table in .rodata. > > > > Because objtool doesn't have an x86 emulator, it doesn't know that rcx > > is zero. So instead of following a single code path to the first jump > > table entry, it follows all possible jump table entry paths in parallel. > > > > Usually such overactive analysis isn't a problem. In every other jump > > table in the kernel, all the jump targets have the same frame pointer > > state. But in this exceedingly rare case, different targets have > > different frame pointer states. Objtool notices that and creates the > > false positive warnings. > > > > In theory we could use the STACK_FRAME_NON_STANDARD marker to tell > > objtool to skip analysis of the function. However, that's less than > > ideal. > > > > Looking at the cik_tiling_mode_table_init() code, it seems overly > > complex with lots of repetition. So let's simplify it. All the switch > > statements and conditionals can be replaced with much simpler logic by > > generalizing the different behaviors and moving the initialization data > > into data structures. > > > > The change is a win-win: it's easier to parse for both humans and > > machines. It also reduces the binary size by about 2%: > > > > text data bss dec hex filename > >101011 30360 0 131371 2012b cik-before.o > > 98699 30200 0 128899 1f783 cik-after.o > > > > [ Note: Unfortunately I don't know how to test this code, so it's > > completely untested. Any help or guidance with ensuring that the > > correct initialization is still being written would be greatly > > appreciated! ] > > I think it would be clearer to rework it similarly to how it was > reworked in amdgpu (see gfx_v8_0.c and gfx_v7_0.c in drm-next). Also > ideally you'd update the similar code in si.c as well for consistency. Hi Alex, Thanks for the pointers. As it turns out, the false positive warning in objtool was easier to fix than I originally thought, so this warning has gone away. But regardless I'll follow through and make a v2 patch based on your suggestions. -- Josh
[PATCH] drm/radeon: refactor CIK tiling table initialization
> -Original Message- > From: Josh Poimboeuf [mailto:jpoimboe at redhat.com] > Sent: Monday, March 07, 2016 6:10 PM > To: Deucher, Alexander; Koenig, Christian > Cc: dri-devel at lists.freedesktop.org; linux-kernel at vger.kernel.org; > kbuild > test robot; Ingo Molnar > Subject: [PATCH] drm/radeon: refactor CIK tiling table initialization > > When compiling the radeon driver on x86_64 with > CONFIG_STACK_VALIDATION > enabled, objtool gives the following warnings: > > drivers/gpu/drm/radeon/cik.o: warning: objtool: > cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup > drivers/gpu/drm/radeon/cik.o: warning: objtool: > cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup > drivers/gpu/drm/radeon/cik.o: warning: objtool: > cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup > ... > > These are actually false positive warnings; there are no frame pointer > bugs. Instead objtool gets confused by the jump tables created by all > the switch statements, combined with some other gcc optimizations. It > tries to follows all possible code paths, but it fails to realize that > some of the paths aren't possible. For example: > > 4c97: 31 c0 xor%eax,%eax > ... > 4ca2: 89 c1 mov%eax,%ecx > 4ca4: ff 24 cd 00 00 00 00jmpq *0x0(,%rcx,8) 4ca7: > R_X86_64_32S > .rodata+0x148 > > First eax is cleared to zero with the "xor %eax,%eax" instruction. > Later, it moves the value of eax (zero in this case) to ecx, and uses > that value to jump to the first entry in a jump table in .rodata. > > Because objtool doesn't have an x86 emulator, it doesn't know that rcx > is zero. So instead of following a single code path to the first jump > table entry, it follows all possible jump table entry paths in parallel. > > Usually such overactive analysis isn't a problem. In every other jump > table in the kernel, all the jump targets have the same frame pointer > state. But in this exceedingly rare case, different targets have > different frame pointer states. Objtool notices that and creates the > false positive warnings. > > In theory we could use the STACK_FRAME_NON_STANDARD marker to tell > objtool to skip analysis of the function. However, that's less than > ideal. > > Looking at the cik_tiling_mode_table_init() code, it seems overly > complex with lots of repetition. So let's simplify it. All the switch > statements and conditionals can be replaced with much simpler logic by > generalizing the different behaviors and moving the initialization data > into data structures. > > The change is a win-win: it's easier to parse for both humans and > machines. It also reduces the binary size by about 2%: > > textdata bss dec hex filename >101011 30360 0 131371 2012b cik-before.o > 98699 30200 0 128899 1f783 cik-after.o > > [ Note: Unfortunately I don't know how to test this code, so it's > completely untested. Any help or guidance with ensuring that the > correct initialization is still being written would be greatly > appreciated! ] I think it would be clearer to rework it similarly to how it was reworked in amdgpu (see gfx_v8_0.c and gfx_v7_0.c in drm-next). Also ideally you'd update the similar code in si.c as well for consistency. Alex
[PATCH] drm/radeon: refactor CIK tiling table initialization
When compiling the radeon driver on x86_64 with CONFIG_STACK_VALIDATION enabled, objtool gives the following warnings: drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x464: call without frame pointer save/setup ... These are actually false positive warnings; there are no frame pointer bugs. Instead objtool gets confused by the jump tables created by all the switch statements, combined with some other gcc optimizations. It tries to follows all possible code paths, but it fails to realize that some of the paths aren't possible. For example: 4c97: 31 c0 xor%eax,%eax ... 4ca2: 89 c1 mov%eax,%ecx 4ca4: ff 24 cd 00 00 00 00jmpq *0x0(,%rcx,8) 4ca7: R_X86_64_32S .rodata+0x148 First eax is cleared to zero with the "xor %eax,%eax" instruction. Later, it moves the value of eax (zero in this case) to ecx, and uses that value to jump to the first entry in a jump table in .rodata. Because objtool doesn't have an x86 emulator, it doesn't know that rcx is zero. So instead of following a single code path to the first jump table entry, it follows all possible jump table entry paths in parallel. Usually such overactive analysis isn't a problem. In every other jump table in the kernel, all the jump targets have the same frame pointer state. But in this exceedingly rare case, different targets have different frame pointer states. Objtool notices that and creates the false positive warnings. In theory we could use the STACK_FRAME_NON_STANDARD marker to tell objtool to skip analysis of the function. However, that's less than ideal. Looking at the cik_tiling_mode_table_init() code, it seems overly complex with lots of repetition. So let's simplify it. All the switch statements and conditionals can be replaced with much simpler logic by generalizing the different behaviors and moving the initialization data into data structures. The change is a win-win: it's easier to parse for both humans and machines. It also reduces the binary size by about 2%: text data bss dec hex filename 101011 30360 0 131371 2012b cik-before.o 98699 30200 0 128899 1f783 cik-after.o [ Note: Unfortunately I don't know how to test this code, so it's completely untested. Any help or guidance with ensuring that the correct initialization is still being written would be greatly appreciated! ] Reported-by: kbuild test robot Signed-off-by: Josh Poimboeuf --- Based on linux-next. drivers/gpu/drm/radeon/cik.c | 1352 ++ 1 file changed, 325 insertions(+), 1027 deletions(-) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 0600140..1a477e6 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -2327,6 +2327,290 @@ out: return err; } +#define PIPE_CONFIG_2 0 +#define PIPE_CONFIG_4 1 +#define PIPE_CONFIG_8 2 +#define PIPE_CONFIG_16 3 + +#define PIPE_CONFIG_4_RBS4 4 + +#define TILE_SPLIT_ROW_SIZE((unsigned char)-1) + +#define NUM_TILE_MODE_STATES 32 +#define NUM_SECONDARY_TILE_MODE_STATES 16 + +static unsigned char array_modes[][NUM_TILE_MODE_STATES] = { + { /* PIPE_CONFIG_2 */ + [0 ... 4] = ARRAY_2D_TILED_THIN1, + [5] = ARRAY_1D_TILED_THIN1, + [6 ... 7] = ARRAY_PRT_2D_TILED_THIN1, + [8] = ARRAY_LINEAR_ALIGNED, + [9] = ARRAY_1D_TILED_THIN1, + [10]= ARRAY_2D_TILED_THIN1, + [11]= ARRAY_PRT_TILED_THIN1, + [12]= ARRAY_PRT_2D_TILED_THIN1, + [13]= ARRAY_1D_TILED_THIN1, + [14]= ARRAY_2D_TILED_THIN1, + [16]= ARRAY_PRT_TILED_THIN1, + [17]= ARRAY_PRT_2D_TILED_THIN1, + [27]= ARRAY_1D_TILED_THIN1, + [28]= ARRAY_PRT_2D_TILED_THIN1, + [29]= ARRAY_PRT_TILED_THIN1, + [30]= ARRAY_PRT_2D_TILED_THIN1, + }, + { /* PIPE_CONFIG_4 */ + [0 ... 4] = ARRAY_2D_TILED_THIN1, + [5] = ARRAY_1D_TILED_THIN1, + [6 ... 7] = ARRAY_PRT_2D_TILED_THIN1, + [8] = ARRAY_LINEAR_ALIGNED, + [9] = ARRAY_1D_TILED_THIN1, + [10]= ARRAY_2D_TILED_THIN1, + [11]= ARRAY_PRT_TILED_THIN1, + [12]