[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-06-04 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

By the way, I went ahead and reverted this in r333958.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-06-04 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

In https://reviews.llvm.org/D46042#1121674, @rjmccall wrote:

> > I think we should revert this for now. Adding the alignment attribute to 
> > all Intel vector typedefs is a bigger change than it seems.
>
> Ugh.  That is just an awful language rule.   Would it be reasonable to 
> restrict it to only attributes spelled with `__declspec(align(N))` rather 
> than `__attribute__((aligned(N)))`, or is that too invasive in the alignment 
> computation?


When we were working on the record layout code, I didn't want to do that 
because users often structure their portability headers to check for 
`__clang__` first because clang also defines `_MSC_VER` and `__GNUC__`. I felt 
it would be best if the alignment attributes were as interchangeable as 
possible. They are very common.

Maybe checking the spelling of the packing attribute would work better. The GCC 
`__attribute__` spelling would ignore what we called "required alignment", 
meaning alignment required by explicit attributes and not the normal `alignof`.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-06-04 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In https://reviews.llvm.org/D46042#1121648, @rnk wrote:

> It's the typedef alignment changes that are causing problems for us, not the 
> MaxVectorAlign changes. That makes more sense. The new alignment attribute 
> breaks our implementation of `_mm256_loadu_ps`, because the packed struct 
> ends up with a 32-byte alignment. Here's the implementation:
>
>   static __inline __m256 __DEFAULT_FN_ATTRS
>   _mm256_loadu_ps(float const *__p)
>   {
> struct __loadu_ps {
>   __m256 __v;
> } __attribute__((__packed__, __may_alias__));
> return ((struct __loadu_ps*)__p)->__v;
>   }
>
>
> And clang's -fdump-record-layouts says:
>
>   *** Dumping AST Record Layout
>0 | struct __loadu_ps
>0 |   __m256 __v
>  | [sizeof=32, align=32]
>
>
> I think the problem is that `__attribute__((aligned(N)))` beats 
> `__attribute__((packed))` on Windows to match MSVC's behavior with 
> `__declspec(align(N))`.
>
> I think we should revert this for now. Adding the alignment attribute to all 
> Intel vector typedefs is a bigger change than it seems.


Ugh.  That is just an awful language rule.   Would it be reasonable to restrict 
it to only attributes spelled with `__declspec(align(N))` rather than 
`__attribute__((aligned(N)))`, or is that too invasive in the alignment 
computation?


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-06-04 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

It's the typedef alignment changes that are causing problems for us, not the 
MaxVectorAlign changes. That makes more sense. The new alignment attribute 
breaks our implementation of `_mm256_loadu_ps`, because the packed struct ends 
up with a 32-byte alignment. Here's the implementation:

  static __inline __m256 __DEFAULT_FN_ATTRS
  _mm256_loadu_ps(float const *__p)
  {
struct __loadu_ps {
  __m256 __v;
} __attribute__((__packed__, __may_alias__));
return ((struct __loadu_ps*)__p)->__v;
  }

And clang's -fdump-record-layouts says:

  *** Dumping AST Record Layout
   0 | struct __loadu_ps
   0 |   __m256 __v
 | [sizeof=32, align=32]

I think the problem is that `__attribute__((aligned(N)))` beats 
`__attribute__((packed))` on Windows to match MSVC's behavior with 
`__declspec(align(N))`.

I think we should revert this for now. Adding the alignment attribute to all 
Intel vector typedefs is a bigger change than it seems.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-06-04 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

This change appears to have caused some blink vector math unit tests to fail on 
Windows. We are tracking it at https://crbug.com/849251.

It has a pretty small reproducer:

  #include 
  __m256 loadit(__m256 *p) { return _mm256_loadu_ps((const float *)p); }

Compile for x86_64-windows-msvc with -mavx, and before this change we got this 
IR: `%0 = load <8 x float>, <8 x float>* %p, align 1`
After we get this IR: `%0 = load <8 x float>, <8 x float>* %p, align 32`

This is surprising. I'll keep debugging.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-06-01 Thread John McCall via Phabricator via cfe-commits
rjmccall closed this revision.
rjmccall added a comment.

Landed as r333791.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-04 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In https://reviews.llvm.org/D46042#1088049, @scanon wrote:

> In https://reviews.llvm.org/D46042#1088044, @ab wrote:
>
> > So, this makes sense to me, but on x86, should we also be worried about the 
> > fact that the calling convention is based on which features are available?  
> > (>128bit ext_vector_types are passed in AVX/AVX-512 registers, if 
> > available).  Presumably swift is also affected, no?
>
>
> Swift's calling conventions (will?) always divide larger vectors into 128b 
> pieces. When interacting with C conventions, yes, this is still an issue.


Right, this is just a C ABI issue.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-04 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In https://reviews.llvm.org/D46042#1088044, @ab wrote:

> So, this makes sense to me, but on x86, should we also be worried about the 
> fact that the calling convention is based on which features are available?  
> (>128bit ext_vector_types are passed in AVX/AVX-512 registers, if available). 
>  Presumably swift is also affected, no?


I'd forgotten about that.  I think there's a strong argument that we're 
required to pass at least the Intel intrinsic vector types that way, yeah.  But 
if we want a stable ABI for other vector types, we really can't.  The root 
problem here is that the Intel ABI seems to imagine that these vector types 
only exist when they're supported directly by hardware.  (And the Intel 
intrinsic headers do define those types even when AVX is disabled!)  So I don't 
know that we can make a good ABI story for that.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-04 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

In https://reviews.llvm.org/D46042#1088044, @ab wrote:

> So, this makes sense to me, but on x86, should we also be worried about the 
> fact that the calling convention is based on which features are available?  
> (>128bit ext_vector_types are passed in AVX/AVX-512 registers, if available). 
>  Presumably swift is also affected, no?


Swift's calling conventions (will?) always divide larger vectors into 128b 
pieces. When interacting with C conventions, yes, this is still an issue.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-04 Thread Ahmed Bougacha via Phabricator via cfe-commits
ab added a comment.

So, this makes sense to me, but on x86, should we also be worried about the 
fact that the calling convention is based on which features are available?  
(>128bit ext_vector_types are passed in AVX/AVX-512 registers, if available).  
Presumably swift is also affected, no?


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-03 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

I think we should seriously consider making alignment attributes on typedefs 
(and maybe some other attributes like may_alias) actual type qualifiers that 
are preserved in the canonical type, mangled, and so on.  It would be an ABI 
break, but it'd also solve a lot of problems.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-03 Thread Akira Hatanaka via Phabricator via cfe-commits
ahatanak added a comment.

> Note that this sort of attribute is stripped from template arguments in 
> template substitution, so there's a possibility that code templated over 
> vectors will produce inadequately-aligned objects.

I was wondering whether there is a warning clang issues when the aligned 
attribute is stripped. If it doesn't warn, should it? I recently came across a 
case where a 16-byte vector annotated with a 4-byte alignment was passed to 
std::swap, which caused a crash because the alignment was stripped and the x86 
backend decided to emit an 16-byte aligned load to load an unaligned vector.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-02 Thread Akira Hatanaka via Phabricator via cfe-commits
ahatanak accepted this revision.
ahatanak added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-04-24 Thread John McCall via Phabricator via cfe-commits
rjmccall created this revision.
Herald added a reviewer: javed.absar.
Herald added subscribers: cfe-commits, kristof.beyls.

This fixes two major problems:

- We were not capping vector alignment as desired on 32-bit ARM.
- We were using different alignments based on the AVX settings on Intel, so we 
did not have a consistent ABI.

This is an ABI break, but we think we can get away with it because vectors tend 
to be used mostly in inline code (which is why not having a consistent ABI has 
not proven disastrous on Intel).

Intel's AVX types are specified as having 32-byte / 64-byte alignment, so align 
them explicitly instead of relying on the base ABI rule.  Note that this sort 
of attribute is stripped from template arguments in template substitution, so 
there's a possibility that code templated over vectors will produce 
inadequately-aligned objects.


Repository:
  rC Clang

https://reviews.llvm.org/D46042

Files:
  lib/Basic/Targets/OSTargets.h
  lib/Basic/Targets/X86.h
  lib/CodeGen/CGBuiltin.cpp
  lib/Headers/avx512fintrin.h
  lib/Headers/avxintrin.h
  test/CodeGen/arm-swiftcall.c
  test/CodeGen/vector-alignment.c
  test/CodeGenCXX/align-avx-complete-objects.cpp

Index: test/CodeGenCXX/align-avx-complete-objects.cpp
===
--- test/CodeGenCXX/align-avx-complete-objects.cpp
+++ test/CodeGenCXX/align-avx-complete-objects.cpp
@@ -12,7 +12,7 @@
 return r[0];
 }
 
-// CHECK: [[R:%.*]] = alloca <8 x float>, align 32
+// CHECK: [[R:%.*]] = alloca <8 x float>, align 16
 // CHECK-NEXT:  [[CALL:%.*]] = call i8* @_Znwm(i64 32)
 // CHECK-NEXT:  [[ZERO:%.*]] = bitcast i8* [[CALL]] to <8 x float>*
 // CHECK-NEXT:  store <8 x float>* [[ZERO]], <8 x float>** [[P:%.*]], align 8
@@ -22,8 +22,8 @@
 // CHECK-NEXT:  store volatile <8 x float> [[TWO]], <8 x float>* [[THREE]], align 16
 // CHECK-NEXT:  [[FOUR:%.*]] = load <8 x float>*, <8 x float>** [[P]], align 8
 // CHECK-NEXT:  [[FIVE:%.*]] = load volatile <8 x float>, <8 x float>* [[FOUR]], align 16
-// CHECK-NEXT:  store <8 x float> [[FIVE]], <8 x float>* [[R]], align 32
-// CHECK-NEXT:  [[SIX:%.*]] = load <8 x float>, <8 x float>* [[R]], align 32
+// CHECK-NEXT:  store <8 x float> [[FIVE]], <8 x float>* [[R]], align 16
+// CHECK-NEXT:  [[SIX:%.*]] = load <8 x float>, <8 x float>* [[R]], align 16
 // CHECK-NEXT:  [[VECEXT:%.*]] = extractelement <8 x float> [[SIX]], i32 0
 // CHECK-NEXT:  ret float [[VECEXT]]
 
Index: test/CodeGen/vector-alignment.c
===
--- test/CodeGen/vector-alignment.c
+++ test/CodeGen/vector-alignment.c
@@ -1,38 +1,68 @@
 // RUN: %clang_cc1 -w -triple x86_64-apple-darwin10 \
-// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=SSE
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_SSE
 // RUN: %clang_cc1 -w -triple   i386-apple-darwin10 \
-// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=SSE
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_SSE
 // RUN: %clang_cc1 -w -triple x86_64-apple-darwin10 -target-feature +avx \
-// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=AVX
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_AVX
 // RUN: %clang_cc1 -w -triple   i386-apple-darwin10 -target-feature +avx \
-// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=AVX
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_AVX
 // RUN: %clang_cc1 -w -triple x86_64-apple-darwin10 -target-feature +avx512f \
-// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=AVX512
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_AVX512
 // RUN: %clang_cc1 -w -triple   i386-apple-darwin10 -target-feature +avx512f \
-// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=AVX512
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_AVX512
+// RUN: %clang_cc1 -w -triple armv7-apple-ios10 \
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_ARM32
+// RUN: %clang_cc1 -w -triple arm64-apple-ios10 \
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=DARWIN_ARM64
+
+// RUN: %clang_cc1 -w -triple x86_64-pc-linux \
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=GENERIC
+// RUN: %clang_cc1 -w -triple   i386-pc-linux \
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=GENERIC
+// RUN: %clang_cc1 -w -triple x86_64-pc-linux -target-feature +avx \
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=GENERIC
+// RUN: %clang_cc1 -w -triple   i386-pc-linux -target-feature +avx \
+// RUN:  -emit-llvm -o - %s | FileCheck %s --check-prefix=ALL --check-prefix=GENERIC
+// RUN: %cl