RE: [patch, i386] false dependencies fix
Hello!

> So what I'm confused about is in the original output template operand
> 0 is duplicated. In the new template operand 1 is duplicated.
>
> Presumably what you're trying to accomplish is avoiding a false read
> on operand 0 (the destination)? Can you please confirm?
> Knowing that should also help me evaluate the changes to recp and
> rsqrt since they're being changed to the same style encoding when
> operating strictly on registers.

Yes, it's the same for all instructions in the patch. We are not just avoiding the read of the destination; we are also giving the CPU more opportunities to execute these instructions speculatively. After the patch the destination depends only on the source, so (thanks to CPU register renaming) the CPU can execute the instruction even if an earlier instruction that writes to the same destination has not finished yet.

--
Alexander Nesterovskiy
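To make the dependency argument above concrete, here is a minimal sketch of a toy timing model in C. It is not taken from the patch. The register numbers, the latencies (14 and 12 cycles) and the preceding vdivss instruction are assumptions chosen only for illustration, and the model assumes ideal register renaming, where an instruction waits only for the registers it reads.

```c
#include <assert.h>

#define NUM_REGS 16

/* One instruction in the toy model: a destination register, two source
   registers, and a latency in cycles (latencies are hypothetical). */
struct insn {
    int dest;
    int src[2];
    int latency;
};

/* Return the cycle at which the last instruction's result is ready.
   With ideal renaming an instruction waits only for its sources,
   never for earlier writers of its destination. */
int finish_cycle(const struct insn *code, int n)
{
    int ready[NUM_REGS] = {0};  /* cycle at which each register is ready */
    int done = 0;

    for (int i = 0; i < n; i++) {
        int start = 0;
        assert(code[i].dest >= 0 && code[i].dest < NUM_REGS);
        for (int s = 0; s < 2; s++) {
            int r = code[i].src[s];
            assert(r >= 0 && r < NUM_REGS);
            if (ready[r] > start)
                start = ready[r];
        }
        done = start + code[i].latency;
        ready[code[i].dest] = done;
    }
    return done;
}

/* Old-style encoding: operand 0 duplicated, so the destination is also read.
     vdivss  xmm0, xmm2, xmm3   ; slow earlier write to xmm0
     vsqrtss xmm0, xmm0, xmm1   ; reads xmm0: must wait for vdivss */
int old_template_finish(void)
{
    const struct insn code[] = {
        { 0, { 2, 3 }, 14 },
        { 0, { 0, 1 }, 12 },
    };
    return finish_cycle(code, 2);
}

/* New-style encoding: operand 1 duplicated, destination is write-only.
     vdivss  xmm0, xmm2, xmm3
     vsqrtss xmm0, xmm1, xmm1   ; reads only xmm1: can start at once */
int new_template_finish(void)
{
    const struct insn code[] = {
        { 0, { 2, 3 }, 14 },
        { 0, { 1, 1 }, 12 },
    };
    return finish_cycle(code, 2);
}
```

Under these assumed latencies, old_template_finish() returns 26 and new_template_finish() returns 12: in the old form vsqrtss cannot start until the earlier write to xmm0 completes, while in the new form it is independent of it. Real hardware has many more constraints (ports, retirement order, partial-register handling), so this only shows the shape of the effect.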
[PATCH, i386]: AVX false dependencies fix
This is the same patch I posted a few days ago, slightly modified according to Uros' recommendation.

The patch fixes false dependencies for the vmovss, vmovsd, vrcpss, vrsqrtss, vsqrtss and vsqrtsd instructions.
Tested on x86-64/Linux: no new test failures, and some performance gains on SPEC 2006/2017.

2018-05-04  Alexander Nesterovskiy

	* config/i386/i386.md (*movsf_internal): AVX falsedep fix.
	(*movdf_internal): Ditto.
	(*rcpsf2_sse): Ditto.
	(*rsqrtsf2_sse): Ditto.
	(*sqrt<mode>2_sse): Ditto.

--
Alexander Nesterovskiy

Attachment: avx_falsedep.patch
[patch, i386] false dependencies fix
This patch fixes false dependencies for the vmovss, vmovsd, vrcpss, vrsqrtss, vsqrtss and vsqrtsd instructions.
Tested on x86-64/Linux: no new test failures, and some performance gains on SPEC 2006/2017.

Please let me know if something is wrong here and should be changed.

--
Alexander Nesterovskiy

Attachment: falsedep.patch