On Mon, Jan 7, 2019 at 11:12 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Mon, Jan 7, 2019 at 6:40 PM H.J. Lu <hongjiu...@intel.com> wrote:
> >
> > There is no need to generate vzeroupper if caller uses upper bits of
> > AVX/AVX512 registers.  We track caller's avx_u128_state and avoid
> > vzeroupper when caller's avx_u128_state is AVX_U128_DIRTY.
> >
> > Tested on i686 and x86-64 with and without --with-arch=native.
> >
> > OK for trunk?
>
> In principle OK, but I think we don't have to cache the result of
> ix86_avx_u128_mode_entry.  Simply call the function from
> ix86_avx_u128_mode_exit; it is a simple function, so I guess we can
> afford to re-call it one more time per function.

Do we really need ix86_avx_u128_mode_entry?  We can just set the entry
state to AVX_U128_CLEAN and set the exit state to AVX_U128_DIRTY if the
caller returns an AVX/AVX512 register or passes AVX/AVX512 registers to
the callee.  Does this patch look OK?

Thanks.

H.J.
--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d01278d866f..1ac89fd2eb5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19087,25 +19087,6 @@ ix86_dirflag_mode_entry (void)
   return X86_DIRFLAG_RESET;
 }
 
-static int
-ix86_avx_u128_mode_entry (void)
-{
-  tree arg;
-
-  /* Entry mode is set to AVX_U128_DIRTY if there are
-     256bit or 512bit modes used in function arguments.  */
-  for (arg = DECL_ARGUMENTS (current_function_decl); arg;
-       arg = TREE_CHAIN (arg))
-    {
-      rtx incoming = DECL_INCOMING_RTL (arg);
-
-      if (incoming && ix86_check_avx_upper_register (incoming))
-	return AVX_U128_DIRTY;
-    }
-
-  return AVX_U128_CLEAN;
-}
-
 /* Return a mode that ENTITY is assumed to be
    switched to at function entry.  */
 
@@ -19117,7 +19098,7 @@ ix86_mode_entry (int entity)
     case X86_DIRFLAG:
       return ix86_dirflag_mode_entry ();
     case AVX_U128:
-      return ix86_avx_u128_mode_entry ();
+      return AVX_U128_CLEAN;
     case I387_TRUNC:
     case I387_FLOOR:
     case I387_CEIL:
@@ -19130,13 +19111,24 @@ ix86_mode_entry (int entity)
 static int
 ix86_avx_u128_mode_exit (void)
 {
+  /* Exit mode is set to AVX_U128_DIRTY if there are 256bit or 512bit
+     modes used in function arguments or function return.  */
   rtx reg = crtl->return_rtx;
 
-  /* Exit mode is set to AVX_U128_DIRTY if there are 256bit
-     or 512 bit modes used in the function return register.  */
   if (reg && ix86_check_avx_upper_register (reg))
     return AVX_U128_DIRTY;
 
+  tree arg;
+
+  for (arg = DECL_ARGUMENTS (current_function_decl); arg;
+       arg = TREE_CHAIN (arg))
+    {
+      rtx incoming = DECL_INCOMING_RTL (arg);
+
+      if (incoming && ix86_check_avx_upper_register (incoming))
+	return AVX_U128_DIRTY;
+    }
+
   return AVX_U128_CLEAN;
 }
 
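
For reference, the decision the two hooks make after this patch can be
sketched as a small standalone model.  This is not GCC code: struct
fn_sig, model_mode_entry, model_mode_exit and model_self_check are
hypothetical names made up for illustration, and the per-argument
booleans stand in for what ix86_check_avx_upper_register reports on
the incoming RTL.  It only models the values the hooks return, not
where the mode-switching pass places vzeroupper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The same two states the patch returns for the AVX_U128 entity.  */
enum avx_u128_state { AVX_U128_CLEAN, AVX_U128_DIRTY };

/* Hypothetical summary of one function's interface: whether the return
   value and each incoming argument live in a 256bit or 512bit register.  */
struct fn_sig
{
  bool return_uses_upper;
  const bool *arg_uses_upper;
  size_t nargs;
};

/* Model of the entry hook after the patch: always CLEAN.  */
enum avx_u128_state
model_mode_entry (const struct fn_sig *sig)
{
  (void) sig;
  return AVX_U128_CLEAN;
}

/* Model of the exit hook after the patch: DIRTY if the return value or
   any incoming argument uses the upper bits, CLEAN otherwise.  */
enum avx_u128_state
model_mode_exit (const struct fn_sig *sig)
{
  if (sig->return_uses_upper)
    return AVX_U128_DIRTY;

  for (size_t i = 0; i < sig->nargs; i++)
    if (sig->arg_uses_upper[i])
      return AVX_U128_DIRTY;

  return AVX_U128_CLEAN;
}

/* Small self check covering the three interesting shapes.  */
void
model_self_check (void)
{
  static const bool args[] = { false, true };
  struct fn_sig takes_wide = { false, args, 2 };
  struct fn_sig returns_wide = { true, NULL, 0 };
  struct fn_sig scalar_only = { false, NULL, 0 };

  assert (model_mode_entry (&takes_wide) == AVX_U128_CLEAN);
  assert (model_mode_exit (&takes_wide) == AVX_U128_DIRTY);
  assert (model_mode_exit (&returns_wide) == AVX_U128_DIRTY);
  assert (model_mode_exit (&scalar_only) == AVX_U128_CLEAN);
}
```

In this model a function that takes or returns a wide vector reports
AVX_U128_DIRTY at exit, while a function with neither reports
AVX_U128_CLEAN; entry is AVX_U128_CLEAN in every case.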