https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69693
--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to H.J. Lu from comment #4)
>
> > It looks that it is done on purpose.
>
> In this case, our planned transition to generic unaligned SSE loads should
> "fix" this issue. The realignment will be necessary only for performance
> reasons, not for the correctness. Maybe we should also look if
> SLOW_UNALIGNED_ACCESS affects moves that may result in movups insns.
>
> (However, since movups from unaligned address is slow, I still think that
> realignment should be fixed for STV pass).

STV paradoxical V2DI subregs only need 8-byte alignment. LRA generates
an aligned V2DI move on paradoxical V2DI subregs since we don't provide a
paradoxical V2DI subreg move and SLOW_UNALIGNED_ACCESS is 0. STV doesn't
need 16-byte alignment.

I bootstrapped 32-bit GCC using my patch with --with-arch=corei7
--with-cpu=corei7. There are no regressions.
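
To illustrate the alignment point at the source level, here is a minimal sketch using SSE2 intrinsics. It is not taken from the patch and does not reproduce what STV or LRA emit; it only shows that widening a 64-bit value in memory into a 128-bit vector needs nothing stricter than the 8-byte alignment of the value itself. The helper names (is_aligned, load_di_as_v2di, v2di_lanes) are hypothetical, and an x86 target with SSE2 is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86 target with SSE2 */

/* Hypothetical helper: nonzero if p is a multiple of `align` bytes. */
static int is_aligned(const void *p, uintptr_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical helper: widen a 64-bit value in memory to a 128-bit vector.
   _mm_loadl_epi64 reads only 8 bytes and zeroes the upper half of the
   result, so it does not require a 16-byte aligned address. The assert
   documents the 8-byte alignment this sketch expects from its caller. */
static __m128i load_di_as_v2di(const uint64_t *p)
{
    assert(is_aligned(p, 8));
    return _mm_loadl_epi64((const __m128i *)p);
}

/* Hypothetical helper: copy both 64-bit lanes out for inspection.
   The unaligned store is used so `out` needs no special alignment. */
static void v2di_lanes(__m128i v, uint64_t out[2])
{
    _mm_storeu_si128((__m128i *)out, v);
}
```

Passing the address of the second element of a 16-byte aligned uint64_t[2] array gives a pointer that is 8-byte aligned but not 16-byte aligned. The 8-byte load above is still valid there, whereas an aligned 16-byte load (_mm_load_si128) from the same address would not be.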