[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

Gabor Buella via Phabricator via cfe-commits Wed, 09 May 2018 07:06:01 -0700

GBuella added a comment.

In https://reviews.llvm.org/D45616#1067492, @efriedma wrote:


> > The fcmp opcode has no defined behavior with NaN operands in the 
> > comparisions handled in this patch.
>
> Could you describe the problem here in a bit more detail?  As far as I know, 
> the LLVM IR fcmp should return the same result as the x86 CMPPD, even without 
> fast-math.


I'm still looking into this.
What I see so far is that, yes, fcmp happens to produce the same result as x86 CMPPD.
An example:

  fcmp olt <2 x double> %x, %y

becomes vcmpltpd.

But this only holds for condition codes 0 - 7.

Where LLVM IR has a single condition "olt" (ordered less-than), x86 cmppd has two 
corresponding condition codes: 0x01, "less-than (ordered, signaling)", which is 
"vcmpltpd", and 0x11, "less-than (ordered, nonsignaling)", which is "vcmplt_oqpd".

Now, if the builtin's CC argument is 1 (which refers to vcmpltps), we lower it 
to "fcmp olt", which then results in "vcmpltps", so we are OK.
But the IR carries no information about whether the user expected "vcmpltps" or 
"vcmplt_oqps".

Do I understand these tricks right?
If we are ok with this (hard to understand) approach, I can just lower these 
without fast-math as well, as long as CC < 8, by modifying this condition:

  if (CC < 8 && !UsesNonDefaultRounding && getLangOpts().FastMath) {

Although, I'm still looking at what happens with sNaN and qNaN constants 
once these comparisons are lowered to fcmp.
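
To make the signaling vs. nonsignaling distinction above concrete, here is a minimal sketch that classifies a compare immediate. It assumes the 5-bit AVX predicate encoding (0x00 to 0x1F), where bit 4 selects the variant with the opposite signaling behavior. Both helper names are hypothetical and do not exist in clang.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only (not clang code).
// Assumption: cc uses the 5-bit AVX compare-predicate encoding.

// True if the predicate raises the invalid exception on quiet NaN operands.
inline bool isSignalingPredicate(std::uint8_t cc) {
    // Predicates in 0x00..0x0F that signal on qNaN:
    // LT, LE, NLT, NLE, NGE, NGT, GE, GT.
    const std::uint16_t signalingLowHalf =
        (1u << 0x1) | (1u << 0x2) | (1u << 0x5) | (1u << 0x6) |
        (1u << 0x9) | (1u << 0xA) | (1u << 0xD) | (1u << 0xE);
    const bool lowHalfSignals = ((signalingLowHalf >> (cc & 0x0F)) & 1u) != 0;
    // Setting bit 4 (0x10) flips the signaling behavior.
    const bool flipped = (cc & 0x10) != 0;
    return lowHalfSignals != flipped;
}

// True if two immediates produce the same result mask, that is, if they
// differ at most in signaling behavior.
inline bool sameComparisonResult(std::uint8_t a, std::uint8_t b) {
    return (a & 0x0F) == (b & 0x0F);
}
```

Under this encoding 0x01 and 0x11 compare identically, and only isSignalingPredicate tells them apart, which is exactly the information a plain "fcmp olt" does not carry.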


Repository:
  rC Clang

https://reviews.llvm.org/D45616



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
