In most cases the heaviest instructions are loads and stores. So I believe that in this case it's better to try to hide the load latency than to enable macro-fusion. BTW, I'm not sure about SKL/SKX, but on previous generations macro-fusion depends on code alignment.
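To make the alignment point concrete, here is a minimal sketch of one documented restriction on earlier Intel cores: the pair is not fused when the first instruction ends on byte 63 of a cache line and the conditional branch starts at byte 0 of the next line. The function name, the addresses, and the 3-byte length assumed for `cmp %rbx, %rdx` are illustrative assumptions, and the sketch makes no claim about SKL/SKX.

```python
CACHE_LINE = 64  # bytes; cache line size assumed for the cores in question


def fusion_blocked_by_alignment(cmp_addr: int, cmp_len: int) -> bool:
    """Return True if the cmp ends on the last byte of a cache line,
    so that the following jcc starts at byte 0 of the next line.

    cmp_addr and cmp_len are hypothetical inputs: the address of the
    cmp instruction and its encoded length in bytes.
    """
    jcc_addr = cmp_addr + cmp_len  # the jcc immediately follows the cmp
    return jcc_addr % CACHE_LINE == 0


# Hypothetical placements of a 3-byte cmp followed directly by a jcc.
print(fusion_blocked_by_alignment(0x40003D, 3))  # jcc lands on a line start
print(fusion_blocked_by_alignment(0x400030, 3))  # pair sits inside one line
```

This only models the cache-line boundary case; it says nothing about which instruction pairs are fusible in the first place.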
--Sergey

On Wed, Jan 18, 2017 at 2:44 AM, Aleksey Shipilev <aleksey.shipi...@gmail.com> wrote:

> (triggered again)
>
> On 01/18/2017 12:33 AM, Sergey Melnikov wrote:
> > mov (%rax), %rbx
> > cmp %rbx, %rdx
> > jxx Lxxx
> >
> > But if you schedule them this way
> >
> > mov (%rax), %rbx
> > cmp %rbx, %rdx
> > ... few instructions
> > jxx Lxxx
>
> ...doesn't this give up on macro-fusion, and effectively sets up for a better
> chance of a "bottleneck in instruction fetch/decode phases"? :)
>
> -Aleksey
>
> --
> You received this message because you are subscribed to the Google Groups
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mechanical-sympathy+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.