On Sun, Dec 27, 2015 at 6:00 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
> Hi,
>
> POWER9 adds endian-neutral load and store vector instructions that
> support unaligned accesses. This allows more efficient code generation
> than POWER8. With these new instructions, we no longer generate the
> load-swap and swap-store sequences, and we no longer need to perform
> swap optimization to get rid of unnecessary swaps. We also need to make
> sure that we don't perform P8-specific vector load fusion sequences when
> the new instructions are available.
>
> This patch includes two tests that verify the correct instructions are
> generated with -mcpu=power9. One of these generates a pattern that
> causes P8-specific vector load fusion with -mcpu=power8, and verifies we
> don't generate it with -mcpu=power9.
>
> Besides these tests, I hand-tested all the swaps-p8* tests to verify
> correct generation of lxvx and stxvx rather than the old P8 sequences.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions. Ok for trunk, and then for backport to GCC 5?
>
> Thanks,
> Bill
>
>
> [gcc]
>
> 2015-12-27  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_emit_le_vsx_move): Verify that
>         this is never called when lxvx/stxvx are available.
>         (pass_analyze_swaps::gate): Don't perform swap optimization when
>         lxvx/stxvx are available.
>         * config/rs6000/vector.md (mov<mode>): Don't call
>         rs6000_emit_le_vsx_move when lxvx/stxvx are available.
>         * config/rs6000/vsx.md (*p9_vecload_<mode>): New define_insn.
>         (*p9_vecstore_<mode>): Likewise.
>         (*vsx_le_perm_load_<mode>:VSX_LE): Disable when lxvx/stxvx are
>         available.
>         (*vsx_le_perm_load_<mode>:VSX_W): Likewise.
>         (*vsx_le_perm_load_v8hi): Likewise.
>         (*vsx_le_perm_load_v16qi): Likewise.
>         (*vsx_le_perm_store_<mode>:VSX_LE): Likewise.
>         ([related define_splits]): Likewise.
>         (*vsx_le_perm_store_<mode>:VSX_W): Likewise.
>         ([related define_splits]): Likewise.
>         (*vsx_le_perm_store_v8hi): Likewise.
>         ([related define_splits]): Likewise.
>         (*vsx_le_perm_store_v16qi): Likewise.
>         ([related define_splits]): Likewise.
>         (*vsx_lxvd2x2_le_<mode>): Likewise.
>         (*vsx_lxvd2x4_le_<mode>): Likewise.
>         (*vsx_lxvd2x8_le_V8HI): Likewise.
>         (*vsx_lvxd2x16_le_V16QI): Likewise.
>         (*vsx_stxvd2x2_le_<mode>): Likewise.
>         (*vsx_stxvd2x4_le_<mode>): Likewise.
>         (*vsx_stxvd2x8_le_V8HI): Likewise.
>         (*vsx_stxvdx16_le_V16QI): Likewise.
>
> [gcc/testsuite]
>
> 2015-12-27  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
>
>         * gcc.target/powerpc/p9-lxvx-stxvx-1.c: New.
>         * gcc.target/powerpc/p9-lxvx-stxvx-2.c: New.

Okay for trunk. This isn't a bug fix, so GCC 5 will require more consideration.

Thanks, David