[PATCH] D28503: Documentation for the newly added x86 intrinsics.

Katya Romanova via Phabricator via cfe-commits Wed, 11 Jan 2017 15:01:51 -0800

kromanova added inline comments.


================
Comment at: emmintrin.h:1607
+///
+/// This intrinsic corresponds to the <c> VMOVSD / MOVSD </c> instruction.
+///
----------------
probinson wrote:
> should this be VMOVQ/MOVQ instead?
Probably yes. Let me know if you have a different opinion.
 
If I use this intrinsic by itself, clang generates VMOVSD instruction. 
It happens because the default domain is chooses to generate smaller 
instruction code. 
I got confused because I couldn't find Intel's documentation about 
_mm_loadu_si64, so I just wrote a test like the one below and looked what 
instructions got generated.

```
__m128i foo22 (void const * __a)
{
  return _mm_loadu_si64 (__a);
}
```

However, if I change the test and use an intrisic to add 2 64-bit integers 
after the load intrinsics, I can see that VMOVQ instruction gets generated.

```
__m128d foo44 (double const * __a)
{
  __m128i first  = _mm_loadu_si64 (__a);
  __m128i second = _mm_loadu_si64 (__a);
  return _mm_add_epi64(first, second);

}
```

So, as you see clang could generate either VMOVSD/MOVSD or VMOVSQ/MOVSQ. I 
think it makes sense to change the documentation as Paul suggested:

/// This intrinsic corresponds to the VMOVSQ/MOVSQ.

Or, alternatively, we could list all the instructions that correspond to this 
intrinsics:

/// This intrinsic corresponds to the VMOVSQ/MOVSQ/VMOVSD/MOVSD.

          


================
Comment at: emmintrin.h:1607
+///
+/// This intrinsic corresponds to the <c> VMOVSD / MOVSD </c> instruction.
+///
----------------
kromanova wrote:
> probinson wrote:
> > should this be VMOVQ/MOVQ instead?
> Probably yes. Let me know if you have a different opinion.
>  
> If I use this intrinsic by itself, clang generates VMOVSD instruction. 
> It happens because the default domain is chooses to generate smaller 
> instruction code. 
> I got confused because I couldn't find Intel's documentation about 
> _mm_loadu_si64, so I just wrote a test like the one below and looked what 
> instructions got generated.
> 
> ```
> __m128i foo22 (void const * __a)
> {
>   return _mm_loadu_si64 (__a);
> }
> ```
> 
> However, if I change the test and use an intrisic to add 2 64-bit integers 
> after the load intrinsics, I can see that VMOVQ instruction gets generated.
> 
> ```
> __m128d foo44 (double const * __a)
> {
>   __m128i first  = _mm_loadu_si64 (__a);
>   __m128i second = _mm_loadu_si64 (__a);
>   return _mm_add_epi64(first, second);
> 
> }
> ```
> 
> So, as you see clang could generate either VMOVSD/MOVSD or VMOVSQ/MOVSQ. I 
> think it makes sense to change the documentation as Paul suggested:
> 
> /// This intrinsic corresponds to the VMOVSQ/MOVSQ.
> 
> Or, alternatively, we could list all the instructions that correspond to this 
> intrinsics:
> 
> /// This intrinsic corresponds to the VMOVSQ/MOVSQ/VMOVSD/MOVSD.
> 
>           
It will be interesting to hear Asaf Badoug opinion, since he added this 
intrisic. He probably has access to Intel's documentation for this intrinsic 
too (which I wasn't able to find online).


================
Comment at: emmintrin.h:1607
+///
+/// This intrinsic corresponds to the <c> VMOVSD / MOVSD </c> instruction.
+///
----------------
kromanova wrote:
> kromanova wrote:
> > probinson wrote:
> > > should this be VMOVQ/MOVQ instead?
> > Probably yes. Let me know if you have a different opinion.
> >  
> > If I use this intrinsic by itself, clang generates VMOVSD instruction. 
> > It happens because the default domain is chooses to generate smaller 
> > instruction code. 
> > I got confused because I couldn't find Intel's documentation about 
> > _mm_loadu_si64, so I just wrote a test like the one below and looked what 
> > instructions got generated.
> > 
> > ```
> > __m128i foo22 (void const * __a)
> > {
> >   return _mm_loadu_si64 (__a);
> > }
> > ```
> > 
> > However, if I change the test and use an intrisic to add 2 64-bit integers 
> > after the load intrinsics, I can see that VMOVQ instruction gets generated.
> > 
> > ```
> > __m128d foo44 (double const * __a)
> > {
> >   __m128i first  = _mm_loadu_si64 (__a);
> >   __m128i second = _mm_loadu_si64 (__a);
> >   return _mm_add_epi64(first, second);
> > 
> > }
> > ```
> > 
> > So, as you see clang could generate either VMOVSD/MOVSD or VMOVSQ/MOVSQ. I 
> > think it makes sense to change the documentation as Paul suggested:
> > 
> > /// This intrinsic corresponds to the VMOVSQ/MOVSQ.
> > 
> > Or, alternatively, we could list all the instructions that correspond to 
> > this intrinsics:
> > 
> > /// This intrinsic corresponds to the VMOVSQ/MOVSQ/VMOVSD/MOVSD.
> > 
> >           
> It will be interesting to hear Asaf Badoug opinion, since he added this 
> intrisic. He probably has access to Intel's documentation for this intrinsic 
> too (which I wasn't able to find online).
There is a similar situation for one intrisic just a few lines above, namely 
_mm_loadu_pd. It could generate either VMOVUPD / MOVUPD or VMOVUPS/MOVUPS 
instructions. 
I have actually asked Simon question about it offline just a couple of days 
ago. 

I decided to kept referring to VMOVUPD / MOVUPD as a corresponding instruction 
for _mm_loadu_pd. However, if we end up doing things differently for 
_mm_loadu_si64, we need to do a similar change to _mm_loadu_pd (and probably to 
some other intrinsics).


Repository:
  rL LLVM

https://reviews.llvm.org/D28503



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D28503: Documentation for the newly added x86 intrinsics.

Reply via email to