https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68233
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Can you try GCC 5 or the trunk? I think this was fixed recently. Also, it is not 1 cycle; it is 4 cycles (the latency to L1).