
[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

pengfei wrote:
> probinson wrote:
> > pengfei wrote:
> > > probinson wrote:
> > > > pengfei wrote:
> > > > > A more intrinsic guide format is `MEM[__X+j:j]`
> > > > LoadXX is the syntax in the gather intrinsics, e.g. 
> > > > _mm_mask_i32gather_pd. I'd prefer to be consistent.
> > > I think the problem here is that the units are easily confused.
> > > From a C point of view, `__X` is an `int` pointer, so we should add `i` 
> > > rather than `i * 4`.
> > > Elsewhere in the pseudo-code we measure in bits, but here `i * 4` is a 
> > > byte offset.
> > Well, the pseudo-code is clearly not C. If you look at the gather code, it 
> > computes a byte address using an offset multiplied by an explicit scale 
> > factor. I am doing exactly the same here.
> > 
> > The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, which 
> > I think is more confusing. To be fully consistent, using `[]` with bit 
> > offsets only, it should be
> > ```
> > k := __X*8 + i*32
> > result[j+31:j] := MEM[k+31:k]
> > ```
> > which I think obscures more than it explains.
> Yeah, it's not C code here. But it is easy to fall back on C concepts, e.g., 
> why assume `__X` is measured in bytes?
> That's why I think it's clearer to express both in bits.
> I made a mistake here; I meant to propose `MEM[__X+j+31: __X+j]`. It matches 
> the Intrinsic Guide:
> https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=4057,4058,4059,4053,665,3890,5959,5910,3870,4280&text=_mm256_maskload_epi32
> 
We assume `__X` is in bytes because that's how addresses work on X86. Adding a 
bit offset to a byte address makes no sense. I see that is how existing Intel 
documentation works, which does not make it correct.

To "make both in bits" means multiplying `__X` by 8, as in the example in my 
previous comment. Or coming up with a different syntax that makes the 
difference clear.
`MEM(__X)[j+31:j]` or even `MEM[__X][j+31:j]` would be preferable.
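
To make the byte-address reading concrete, here is a minimal scalar sketch in C of the masked load being documented. The names `load32` and `maskload_epi32_model` are hypothetical helpers invented for illustration, not part of the header; the sketch assumes that masked-off elements are zeroed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads 32 bits starting at a byte address,
   mirroring Load32() in the pseudo-code. */
static int32_t load32(const unsigned char *byte_addr) {
  int32_t v;
  memcpy(&v, byte_addr, sizeof v);
  return v;
}

/* Scalar model of an 8-element masked load. The base pointer is treated
   as a byte address, so element i lives at byte offset i*4. */
static void maskload_epi32_model(const int32_t *x, const int32_t mask[8],
                                 int32_t result[8]) {
  const unsigned char *base = (const unsigned char *)x;
  for (int i = 0; i < 8; ++i) {
    /* Bit 31 of mask element i (its sign bit) selects load versus zero. */
    if (mask[i] < 0)
      result[i] = load32(base + i * 4);
    else
      result[i] = 0;
  }
}
```

In C pointer arithmetic the same element would be `x[i]`; the explicit `i * 4` only appears because the model, like the pseudo-code, works in bytes.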


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1461fabfb141: [Headers][doc] Add load/store/cmp/cvt 
intrinsic descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -600,30 +600,130 @@
   ((__m256i)__builtin_ia32_pblendw256((__v16hi)(__m256i)(V1), \
   (__v16hi)(__m256i)(V2), (int)(M)))
 
+/// Compares corresponding bytes in the 256-bit integer vectors in \a __a and
+///\a __b for equality and returns the outcomes in the corresponding
+///bytes of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[j+7:j] := (__a[j+7:j] == __b[j+7:j]) ? 0xFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPCMPEQB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the inputs.
+/// \param __b
+///A 256-bit integer vector containing one of the inputs.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qi)__a == (__v32qi)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [16 x i16] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 15
+///   j := i*16
+///   result[j+15:j] := (__a[j+15:j] == __b[j+15:j]) ? 0xFFFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPCMPEQW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the inputs.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hi)__a == (__v16hi)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [8 x i32] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*32
+///   result[j+31:j] := (__a[j+31:j] == __b[j+31:j]) ? 0xFFFFFFFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPCMPEQD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the inputs.
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8si)__a == (__v8si)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [4 x i64] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*64
+///   result[j+63:j] := (__a[j+63:j] == __b[j+63:j]) ? 0xFFFFFFFFFFFFFFFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPCMPEQQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the inputs.
+/// \returns A 256-bit vector of [4 x i64] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4di)__a == (__v4di)__b);
 }
 
+/// Compares corresponding signed bytes in the 256-bit integer vectors in
+///\a __a and \a __b for greater-than and returns the outcomes in the
+///corresponding bytes of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[j+7:j] := (__a[j+7:j] > __b[j+7:j]) ? 0xFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPCMPGTB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the inputs.
+/// \param __b
+///A 256-bit integer vector containing one of the inputs.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpgt_epi8(__m256i __a, __m256i __b)
 {
@@ -632,18 +732,78 @@
   return (__m256i)((
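
The `\code{.operation}` loops in the (truncated) diff above can be read as ordinary per-element loops. The following is a scalar sketch in C, assuming a 256-bit vector is modeled as 32 bytes; the function names are made up for illustration and are not part of avx2intrin.h.

```c
#include <stdint.h>

/* Scalar model of the byte-equality compare: each result byte is
   all ones (0xFF) when the inputs match and 0 otherwise. */
static void cmpeq_epi8_model(const uint8_t a[32], const uint8_t b[32],
                             uint8_t result[32]) {
  for (int i = 0; i < 32; ++i)
    result[i] = (a[i] == b[i]) ? 0xFF : 0;
}

/* Scalar model of the signed greater-than compare. The inputs are
   signed bytes, so 0xFF (that is, -1) is not greater than 1. */
static void cmpgt_epi8_model(const int8_t a[32], const int8_t b[32],
                             uint8_t result[32]) {
  for (int i = 0; i < 32; ++i)
    result[i] = (a[i] > b[i]) ? 0xFF : 0;
}
```

The wider variants differ only in element width and in the all-ones constant.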

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993


[PATCH] D153681: [X86] Move back _mulx_u32 to 32-bit only

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I'm okay with putting this back if the codegen issue (D153620) can't be 
resolved. Honestly I didn't think it was a problem.
But I think Craig or Simon should sign off.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153681/new/

https://reviews.llvm.org/D153681


[PATCH] D141824: [clang-repl] Add a command to load dynamic libraries

2023-03-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D141824#4229953, @argentite wrote:

> Just to confirm, `UNSUPPORTED: target=x86_64-scei-ps4` should be enough, 
> right?

`UNSUPPORTED: target={{.*-(ps4|ps5)}}` please.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141824/new/

https://reviews.llvm.org/D141824


[PATCH] D141824: [clang-repl] Add a command to load dynamic libraries

2023-03-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D141824#4231372, @probinson wrote:

> In D141824#4229953, @argentite wrote:
>
>> Just to confirm, `UNSUPPORTED: target=x86_64-scei-ps4` should be enough, 
>> right?
>
> `UNSUPPORTED: target={{.*-(ps4|ps5)}}` please.

Oh wait, @dyung says this should work on PS5.  Yes, your original suggestion is 
good.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141824/new/

https://reviews.llvm.org/D141824


[PATCH] D147256: [DebugInfo] Fix file path separator when targeting windows.

2023-03-31 Thread Paul Robinson via Phabricator via cfe-commits

probinson added subscribers: debug-info, probinson.
probinson added a comment.

I think we cannot be 100% sure about source paths in a cross-compile situation. 
Cross-compiling on platform A targeting platform B does not mean your sources 
and debugger UI are on platform B. My users keep source and debugger UI on 
platform A, debugging target B remotely. We need to preserve the host 
pathnames. It is not clear to me that this patch does so.

Comment at: llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp:787

   StringRef PathRef(Asm->TM.Options.ObjectFilenameForDebug);
   llvm::SmallString<256> PathStore(PathRef);

zequanwu wrote:
> hans wrote:
> > This handles codeview. Does anything need to be done for dwarf on windows? 
> > mstorsjo might have input on that.
> It looks like `TM.Options.ObjectFilenameForDebug` is only used for codeview. 
> I guess dwarf doesn't store the object file path.
Right, DWARF only stores the source path.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147256/new/

https://reviews.llvm.org/D147256


[PATCH] D147461: [Headers] Add some intrinsic function descriptions to immintrin.h

2023-04-03 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D147461

Files:
  clang/lib/Headers/immintrin.h

Index: clang/lib/Headers/immintrin.h
===
--- clang/lib/Headers/immintrin.h
+++ clang/lib/Headers/immintrin.h
@@ -284,18 +284,45 @@
 
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__RDRND__)
+/// Returns a 16-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///Pointer to a 16-bit location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand16_step(unsigned short *__p)
 {
   return (int)__builtin_ia32_rdrand16_step(__p);
 }
 
+/// Returns a 32-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///Pointer to a 32-bit location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand32_step(unsigned int *__p)
 {
   return (int)__builtin_ia32_rdrand32_step(__p);
 }
 
+/// Returns a 64-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///Pointer to a 64-bit location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 #ifdef __x86_64__
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand64_step(unsigned long long *__p)
@@ -325,48 +352,108 @@
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__FSGSBASE__)
 #ifdef __x86_64__
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The lower 32 bits of the FS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u32(void)
 {
   return __builtin_ia32_rdfsbase32();
 }
 
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The contents of the FS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u64(void)
 {
   return __builtin_ia32_rdfsbase64();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The lower 32 bits of the GS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u32(void)
 {
   return __builtin_ia32_rdgsbase32();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The contents of the GS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u64(void)
 {
   return __builtin_ia32_rdgsbase64();
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u32(unsigned int __V)
 {
   __builtin_ia32_wrfsbase32(__V);
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u64(unsigned long long __V)
 {
   __builtin_ia32_wrfsbase64(__V);
 }
 
+/// Modifies the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRGSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the GS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writegsbase_u32(unsigned int __V)
 {
   __builtin_ia32_wrgsbase32(__V);
 }
 
+/// Modifies the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRGSBASE  instruction.
+///
+/// \param __V
+///Value to use for the GS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writegsb
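
The RDRAND descriptions earlier in this diff say the step functions return 1 on success and 0 otherwise, which implies callers should check the result. Below is a hedged sketch in C of a bounded retry loop built on that contract. `rand32_step_fn`, `rand32_with_retry`, and the stand-in `flaky_step` are all hypothetical names; the stand-in replaces `_rdrand32_step` so the sketch runs on machines without RDRAND.

```c
/* Hypothetical function type with the documented contract: store a
   value through the pointer and return 1 on success, or return 0. */
typedef int (*rand32_step_fn)(unsigned int *);

/* Calls the step function up to max_tries times.
   Returns 1 as soon as a value is produced, 0 if every attempt failed. */
static int rand32_with_retry(rand32_step_fn step, unsigned int *out,
                             int max_tries) {
  for (int i = 0; i < max_tries; ++i)
    if (step(out))
      return 1;
  return 0;
}

/* Stand-in for the hardware intrinsic, for illustration only:
   fails on its first two calls, then succeeds with a fixed value. */
static int flaky_calls = 0;
static int flaky_step(unsigned int *p) {
  if (++flaky_calls < 3)
    return 0;
  *p = 0x12345678u;
  return 1;
}
```

On a target built with RDRAND support, `_rdrand32_step` has a matching signature and could be passed in place of the stand-in.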

[PATCH] D147461: [Headers] Add some intrinsic function descriptions to immintrin.h

2023-04-03 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

FTR, I'll be working my way through a bunch of intrinsics over the next month 
or so, trying not to do too many at once. Mostly AVX2 but also some others.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147461/new/

https://reviews.llvm.org/D147461


[PATCH] D147461: [Headers] Add some intrinsic function descriptions to immintrin.h

2023-04-04 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa82170fa41ca: [Headers] Add some intrinsic function 
descriptions to immintrin.h. (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D147461?vs=510568&id=510778#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147461/new/

https://reviews.llvm.org/D147461

Files:
  clang/lib/Headers/immintrin.h

Index: clang/lib/Headers/immintrin.h
===
--- clang/lib/Headers/immintrin.h
+++ clang/lib/Headers/immintrin.h
@@ -284,18 +284,45 @@
 
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__RDRND__)
+/// Returns a 16-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///A pointer to a 16-bit memory location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand16_step(unsigned short *__p)
 {
   return (int)__builtin_ia32_rdrand16_step(__p);
 }
 
+/// Returns a 32-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///A pointer to a 32-bit memory location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand32_step(unsigned int *__p)
 {
   return (int)__builtin_ia32_rdrand32_step(__p);
 }
 
+/// Returns a 64-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///A pointer to a 64-bit memory location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 #ifdef __x86_64__
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand64_step(unsigned long long *__p)
@@ -325,48 +352,108 @@
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__FSGSBASE__)
 #ifdef __x86_64__
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The lower 32 bits of the FS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u32(void)
 {
   return __builtin_ia32_rdfsbase32();
 }
 
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The contents of the FS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u64(void)
 {
   return __builtin_ia32_rdfsbase64();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The lower 32 bits of the GS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u32(void)
 {
   return __builtin_ia32_rdgsbase32();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The contents of the GS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u64(void)
 {
   return __builtin_ia32_rdgsbase64();
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u32(unsigned int __V)
 {
   __builtin_ia32_wrfsbase32(__V);
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u64(unsigned long long __V)
 {
   __builtin_ia32_wrfsbase64(__V);
 }
 
+/// Modifies the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRGSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the GS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writegsbase_u32(unsig

[PATCH] D147256: [DebugInfo] Fix file path separator when targeting windows.

2023-04-05 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

An LLVM code change should be testable on its own; this one is tested only 
through Clang.
I think you need a new command-line option to set 
TargetOptions::UseTargetPathSeparator, e.g. via llvm-mc. Other TargetOptions 
are handled this way.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147256/new/

https://reviews.llvm.org/D147256


[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-11 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei, goldstein.w.n.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D148021

Files:
  clang/lib/Headers/fmaintrin.h

Index: clang/lib/Headers/fmaintrin.h
===
--- clang/lib/Headers/fmaintrin.h
+++ clang/lib/Headers/fmaintrin.h
@@ -18,192 +18,724 @@
 #define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(128)))
 #define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(256)))
 
+/// Computes a multiply-add of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213PS  instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend.
+/// \returns A 128-bit vector of [4 x float] containing the result.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ps(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddps((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a multiply-add of 128-bit vectors of [2 x double].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213PD  instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend.
+/// \returns A 128-bit [2 x double] vector containing the result.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_pd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddpd((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a scalar multiply-add of the single-precision values in the
+///low 32 bits of 128-bit vectors of [4 x float]. \n
+/// result[31:0] = (__A[31:0] * __B[31:0]) + __C[31:0]  \n
+/// result[127:32] = __A[127:32] 
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213SS  instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand in the low
+///32 bits.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier in the low
+///32 bits.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend in the low
+///32 bits.
+/// \returns A 128-bit vector of [4 x float] containing the result in the low
+///32 bits and a copy of \a __A[127:32] in the upper 96 bits.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ss(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddss3((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a scalar multiply-add of the double-precision values in the
+///low 64 bits of 128-bit vectors of [2 x double]. \n
+/// result[63:0] = (__A[63:0] * __B[63:0]) + __C[63:0]  \n
+/// result[127:64] = __A[127:64] 
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213SD  instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand in the low
+///64 bits.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier in the low
+///64 bits.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend in the low
+///64 bits.
+/// \returns A 128-bit vector of [2 x double] containing the result in the low
+///64 bits and a copy of \a __A[127:64] in the upper 64 bits.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_sd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddsd3((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a multiply-subtract of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) - __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMSUB213PS  instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the subtrahend.
+/// \returns A 128-bit vector of [4 x float] containing the result.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmsub_ps(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddps((__v4sf)__A, (__v4sf)__B, -(__v4sf)__C);
 }
 
+/// Computes a multiply-subtract of 128-bit vectors of [2 x double].
+///For each element, computes  (__A * __B) -

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-13 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/fmaintrin.h:22
+/// Computes a multiply-add of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) + __C .
+///

pengfei wrote:
> We use a special pseudo-code format to describe the function, shared with 
> the intrinsic guide, e.g.,
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/avx512fintrin.h#L9604-L9610
> https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm512_i32logather_pd&ig_expand=4077
> 
> There's no strong requirement to follow it, but it would be better to adopt a 
> uniform format.
Is a FOR loop with computed bit offsets really clearer than "For each element" 
? Is it valuable to repeat information that can be found in the instruction 
reference? 
I can accept answers of "yes" and "yes" because I am not someone who ever deals 
with vector data, but I would be a little surprised by those answers.
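
For comparison, this is roughly what the two styles describe, written as a scalar C sketch. The function names are hypothetical, a 128-bit vector of [4 x float] is modeled as a `float[4]`, and the model deliberately ignores the single-rounding property of a true fused multiply-add.

```c
/* "For each element, computes (__A * __B) + __C", written as the
   explicit loop that a FOR-style description would spell out.
   Element i occupies bits [i*32+31 : i*32] of the vector. */
static void fmadd_ps_model(const float a[4], const float b[4],
                           const float c[4], float result[4]) {
  for (int i = 0; i < 4; ++i)
    result[i] = a[i] * b[i] + c[i];
}

/* Scalar form: only the low element is computed; the upper three
   elements are copied from the first operand. */
static void fmadd_ss_model(const float a[4], const float b[4],
                           const float c[4], float result[4]) {
  result[0] = a[0] * b[0] + c[0];
  for (int i = 1; i < 4; ++i)
    result[i] = a[i];
}
```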




Comment at: clang/lib/Headers/fmaintrin.h:26
+///
+/// This intrinsic corresponds to the  VFMADD213PS  instruction.
+///

pengfei wrote:
> This may look odd to users, given that it is just one of the three 
> instructions the intrinsic may generate, but I don't have a better idea here.
I listed the 213 version because that's the one that multiplies the first two 
operands, and the intrinsic multiplies the first two operands. So it's the 
instruction that most closely corresponds to the intrinsic.
We don't guarantee that the "corresponding" instruction is what is actually 
generated, in general. I know this point has come up before regarding intrinsic 
descriptions. My thinking is that the "corresponding instruction" gives the 
reader a place to look in the instruction reference manual, so listing only one 
is again okay.
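
As a sketch of why the 213 form matches the intrinsic, the three instruction forms can be modeled as plain C functions over the instruction's operands in order (operand 1 is also the destination). The function names are hypothetical, and the digit-order reading below follows the usual convention in the instruction reference: the first two digits name the multiplied operands and the third names the addend.

```c
/* 132 form: operand1 * operand3 + operand2 */
static double fma132(double op1, double op2, double op3) {
  return op1 * op3 + op2;
}

/* 213 form: operand2 * operand1 + operand3.
   This multiplies the first two operands, like (__A * __B) + __C. */
static double fma213(double op1, double op2, double op3) {
  return op2 * op1 + op3;
}

/* 231 form: operand2 * operand3 + operand1 */
static double fma231(double op1, double op2, double op3) {
  return op2 * op3 + op1;
}
```

All three compute a multiply-add; they differ only in which register is overwritten, which is why the compiler is free to pick any of them.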


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148021/new/

https://reviews.llvm.org/D148021


[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-23 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:156
+///A 256-bit vector containing one of the source operands.
+/// \returns A 256-bit vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256

craig.topper wrote:
> Why do some return descriptions include the type like [4 x i64] but some 
> don't?
My policy has been to provide the type for element sizes other than byte. So, I 
haven't been saying [32 x i8] but I do say [4 x i64] or whatever.
Although, I have also tended to say "integer vector" when it's a byte vector, 
and I'll make that consistent here as well.



Comment at: clang/lib/Headers/avx2intrin.h:1043
+///corresponding byte of the 256-bit integer vector result (overflow is
+///ignored). For each byte, computes  result = __a - __b .
+///

pengfei wrote:
> It would be better to move it to `\code{.operation}` for consistency. Same 
> for the ones below.
Okay.



Comment at: clang/lib/Headers/avx2intrin.h:1050
+/// \param __a
+///A 256-bit vector containing the subtrahends.
+/// \param __b

craig.topper wrote:
> I think minuend and subtrahend are swapped here.
Thanks for catching that!
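
For reference, in `result = __a - __b` the minuend is `__a` and the subtrahend is `__b`. Here is a small scalar sketch in C of the byte-wise subtraction, with a hypothetical function name and a 256-bit vector modeled as 32 bytes; it also shows the "overflow is ignored" wraparound.

```c
#include <stdint.h>

/* Scalar model of the byte-wise subtract: a holds the minuends,
   b holds the subtrahends, and each difference keeps its low 8 bits. */
static void sub_epi8_model(const uint8_t a[32], const uint8_t b[32],
                           uint8_t result[32]) {
  for (int i = 0; i < 32; ++i)
    result[i] = (uint8_t)(a[i] - b[i]);
}
```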


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114


[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-23 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 524786.
probinson added a comment.

Address review comments


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using unsigned saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 525632.
probinson added a comment.

Correct order of horizontal operands


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using unsigned saturation, and returns each sum in the
+///corresponding byte of the 256-bit in

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:456
+///   j := i*128
+///   result[j+31:j] := __a[j+63:j+32] - __a[j+31:j]
+///   result[j+63:j+32] := __a[j+127:j+96] - __a[j+95:j+64]

craig.topper wrote:
> Intel intrinsics guide says
> 
> ```
> dst[31:0] := a[31:0] - a[63:32]
> dst[63:32] := a[95:64] - a[127:96]
> dst[95:64] := b[31:0] - b[63:32]
> dst[127:96] := b[95:64] - b[127:96]
> dst[159:128] := a[159:128] - a[191:160]
> dst[191:160] := a[223:192] - a[255:224]
> dst[223:192] := b[159:128] - b[191:160]
> dst[255:224] := b[223:192] - b[255:224]
> dst[MAX:256] := 0
> ```
> 
> So I think the operands are in the wrong order here?
Words fail me. Also diagrams. I wanted the add and sub descriptions to look 
similar, and copy-pasted from add to sub without verifying the order.
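
The operand order quoted from the Intel guide can be checked against a scalar model. The following is a sketch only, with a hypothetical helper name (`hsub_epi32_model`); it mirrors the quoted pseudo-code, where each difference is the lower-indexed element minus the higher-indexed one, and where results from `a` and `b` alternate within each 128-bit half.

```c
#include <stdint.h>

/* Hypothetical scalar model of the quoted horizontal-subtract pseudo-code.
   Elements are 32 bits wide; unsigned arithmetic gives the wrapping
   behavior (overflow ignored). */
static void hsub_epi32_model(const uint32_t a[8], const uint32_t b[8],
                             uint32_t dst[8]) {
  for (int half = 0; half < 2; ++half) {
    int base = half * 4; /* first element of this 128-bit half */
    dst[base + 0] = a[base + 0] - a[base + 1];
    dst[base + 1] = a[base + 2] - a[base + 3];
    dst[base + 2] = b[base + 0] - b[base + 1];
    dst[base + 3] = b[base + 2] - b[base + 3];
  }
}
```

Unlike the add case, the order within each pair matters here, so a description copied from the add intrinsic cannot simply have `+` replaced by `-`.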


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114


[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson marked an inline comment as done.
probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:412
+///vectors of [16 x i16] and returns the lower 16 bits of each difference
+///in an element of the [16 x i16] result (overflow is ignored).
+///Differences from \a __a are returned in the lower 64 bits of each

pengfei wrote:
> underflow?
I don't often see "underflow" applied to integer operations. Technically, any 
signed add or subtract could either overflow or underflow, depending on the 
sign and magnitude of the operands. I think just saying "overflow" is clear 
enough?
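
As a small illustration of that point (a sketch, not part of the patch): with wrapping 16-bit subtraction, the true difference can fall outside the representable range at either end. The helper name `wrap_sub_i16` is hypothetical, and the final narrowing conversion assumes a two's-complement target.

```c
#include <stdint.h>

/* Hypothetical helper: subtract two 16-bit values and keep the low 16 bits.
   The arithmetic is done on unsigned values so that wrapping is well
   defined; converting back to int16_t assumes two's complement. */
static int16_t wrap_sub_i16(int16_t a, int16_t b) {
  uint16_t diff = (uint16_t)((uint16_t)a - (uint16_t)b);
  return (int16_t)diff;
}
```

Here `wrap_sub_i16(INT16_MIN, 1)` wraps past the negative end and `wrap_sub_i16(INT16_MAX, -1)` wraps past the positive end; the description covers both with "overflow is ignored".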



Comment at: clang/lib/Headers/avx2intrin.h:448
+///vectors of [8 x i32] and returns the lower 32 bits of each difference in
+///an element of the [8 x i31] result (overflow is ignored). Differences
+///from \a __a are returned in the lower 64 bits of each 128-bit half of

pengfei wrote:
> pengfei wrote:
> > typo or intended?
> underflow.
The `i31` is a typo, fixed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114


[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
probinson marked an inline comment as done.
Closed by commit rGd8291908ef49: [Headers][doc] Add add/sub/mul intrinsic 
descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D150114?vs=525632&id=526700#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m25

[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

(Time to look for the next round of embarrassing mistakes...)
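
The lane-wise layout that the pack descriptions in this patch spell out in pseudo-code can also be modeled in scalar C. This is a sketch under assumptions, not code from the header: `clamp_i16_to_i8` and `packs_epi16_model` are hypothetical names, and the model follows the signed 16-bit to 8-bit case.

```c
#include <stdint.h>

/* Clamp a signed 16-bit value into the signed 8-bit range. */
static int8_t clamp_i16_to_i8(int16_t x) {
  if (x > INT8_MAX) return INT8_MAX;
  if (x < INT8_MIN) return INT8_MIN;
  return (int8_t)x;
}

/* Hypothetical scalar model of the documented layout: within each 128-bit
   lane of the result, the low 8 bytes come from a and the high 8 bytes
   come from b. */
static void packs_epi16_model(const int16_t a[16], const int16_t b[16],
                              int8_t result[32]) {
  for (int lane = 0; lane < 2; ++lane) {
    for (int i = 0; i < 8; ++i) {
      result[lane * 16 + i] = clamp_i16_to_i8(a[lane * 8 + i]);
      result[lane * 16 + 8 + i] = clamp_i16_to_i8(b[lane * 8 + i]);
    }
  }
}
```

The point of the model is the interleaving: the two sources are not simply concatenated, which is what the `\param` text about result[63:0], result[127:64], and so on is describing.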


https://reviews.llvm.org/D151749

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -41,24 +41,126 @@
   return (__m256i)__builtin_elementwise_abs((__v8si)__a);
 }
 
+/// Converts the elements of two 256-bit vectors of [16 x i16] to 8-bit
+///integers using signed saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKSSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packsswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts the elements of two 256-bit vectors of [8 x i32] to 16-bit
+///integers using signed saturation, and returns the resulting 256-bit
+///vector of [16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16(__a[31+j:j])
+///   result[79+k:64+k] := SATURATE16(__b[31+j:j])
+///   result[143+k:128+k] := SATURATE16(__a[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16(__b[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKSSDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packssdw256((__v8si)__a, (__v8si)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [16 x i16] to 8-bit integers
+///using unsigned saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKUSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packuswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [8 x i32] to 16-bit integers
+///using unsigned saturation, and returns the resulting 256-bit vector of
+///[16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16(__V2[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKUSDW instruction.
+///
+/// \param __V1
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __V2
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi32(__m256i __V1, __m256i __V2)
 {
@@ -215,6 +317,30 @@
   return (__m256i)__builtin_elementwise_add_sat((__v16hu)__a, (__v16hu)__b);
 }
 
+/// Uses the lower half of the 256-bit vector \a a as the upper half of a
+///temporary 256-bit value, and the lower half of the 256-bit vector \a

[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 526782.
probinson added a comment.

Update some SATURATEx to SATURATExU
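
The distinction behind this change can be sketched in scalar C. The names below are modeled on the pseudo-code functions but are hypothetical C helpers, not anything defined by the header: the signed form clamps a 16-bit input to [-128, 127], the unsigned form clamps it to [0, 255].

```c
#include <stdint.h>

/* Signed saturation: clamp a signed 16-bit input to the int8_t range. */
static int8_t saturate8(int16_t x) {
  if (x > INT8_MAX) return INT8_MAX;
  if (x < INT8_MIN) return INT8_MIN;
  return (int8_t)x;
}

/* Unsigned saturation: clamp a signed 16-bit input to the uint8_t range. */
static uint8_t saturate8u(int16_t x) {
  if (x > UINT8_MAX) return UINT8_MAX;
  if (x < 0) return 0;
  return (uint8_t)x;
}
```

For an input of 200 the two give different answers (127 versus 200), and for a negative input the unsigned form gives 0, so the pseudo-code for the unsigned pack intrinsics needs the `U` variants.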


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151749/new/

https://reviews.llvm.org/D151749

Files:
  clang/lib/Headers/avx2intrin.h


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -111,10 +111,10 @@
 /// FOR i := 0 TO 7
 ///   j := i*16
 ///   k := i*8
-///   result[7+k:k] := SATURATE8(__a[15+j:j])
-///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
-///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
-///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+///   result[7+k:k] := SATURATE8U(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8U(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8U(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8U(__b[143+j:128+j])
 /// ENDFOR
 /// \endcode
 ///
@@ -143,10 +143,10 @@
 /// FOR i := 0 TO 3
 ///   j := i*32
 ///   k := i*16
-///   result[15+k:k] := SATURATE16(__V1[31+j:j])
-///   result[79+k:64+k] := SATURATE16(__V2[31+j:j])
-///   result[143+k:128+k] := SATURATE16(__V1[159+j:128+j])
-///   result[207+k:192+k] := SATURATE16(__V2[159+j:128+j])
+///   result[15+k:k] := SATURATE16U(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16U(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16U(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16U(__V2[159+j:128+j])
 /// ENDFOR
 /// \endcode
 ///



[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-31 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe5399f1d7cab: [Headers][doc] Add shuffle-like intrinsic 
descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D151749?vs=526782&id=527015#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151749/new/

https://reviews.llvm.org/D151749

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -41,24 +41,126 @@
   return (__m256i)__builtin_elementwise_abs((__v8si)__a);
 }
 
+/// Converts the elements of two 256-bit vectors of [16 x i16] to 8-bit
+///integers using signed saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKSSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packsswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts the elements of two 256-bit vectors of [8 x i32] to 16-bit
+///integers using signed saturation, and returns the resulting 256-bit
+///vector of [16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16(__a[31+j:j])
+///   result[79+k:64+k] := SATURATE16(__b[31+j:j])
+///   result[143+k:128+k] := SATURATE16(__a[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16(__b[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKSSDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packssdw256((__v8si)__a, (__v8si)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [16 x i16] to 8-bit integers
+///using unsigned saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8U(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8U(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8U(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8U(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKUSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packuswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [8 x i32] to 16-bit integers
+///using unsigned saturation, and returns the resulting 256-bit vector of
+///[16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16U(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16U(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16U(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16U(__V2[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPACKUSDW instruction.
+///
+/// \param __V1
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __V2
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi32(__m256i __V1, __m256i __V2)
 {
@@ -215,6

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-02 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I'm traveling but will look at this on Monday.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017


[PATCH] D143745: Make section attribute and -ffunction-sections play nicely

2023-02-10 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: rjmccall, MaskRay.
Herald added a project: All.
probinson requested review of this revision.

People use -ffunction-sections to put each function into its own
object-file section; this makes linker garbage-collection simpler.
However, if there's an explicit __attribute__((section("name")))
on the function, all functions with that attribute end up in a
single section, defeating the linker GC.

Use section groups to make these things work together.
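
A minimal sketch of the situation described above, assuming an ELF target and a Clang- or GCC-compatible compiler; the function names and the section name `foo` are illustrative only.

```c
/* Both functions carry the same explicit section name. Built with
   -ffunction-sections but without section groups, they share a single
   "foo" section in the object file, so a linker doing section-level
   garbage collection can only keep or discard them together. */
__attribute__((section("foo"))) int kept_fn(int a) { return a + 1; }
__attribute__((section("foo"))) int droppable_fn(int a) { return a + 2; }
```

With the change proposed here, each such function would instead be placed in its own section group, so an unreferenced `droppable_fn` could be collected independently of `kept_fn`.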


https://reviews.llvm.org/D143745

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGen/section-attr-comdat.cpp


Index: clang/test/CodeGen/section-attr-comdat.cpp
===
--- /dev/null
+++ clang/test/CodeGen/section-attr-comdat.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -triple=x86_64-linux -emit-llvm  %s -o - | \
+// RUN: FileCheck %s --check-prefix=NOCOMDAT
+// RUN: %clang_cc1 -triple=x86_64-linux -emit-llvm -ffunction-sections %s -o - | \
+// RUN: FileCheck %s --check-prefix=COMDAT
+
+// template function = comdat always.
+template <typename T>
+__attribute__((section("foo"))) T ftemplate(T a) { return a + 1; }
+__attribute__((section("foo"))) int fglobal(int a) { return ftemplate(a) + 2; }
+
+// NOCOMDAT-DAG: ${{.*}}ftemplate{{.*}} = comdat any
+// NOCOMDAT-DAG: define {{.*}}ftemplate{{.*}} section "foo" comdat {
+// NOCOMDAT-DAG: define {{.*}}fglobal{{.*}} section "foo" {
+
+// COMDAT-DAG: ${{.*}}ftemplate{{.*}} = comdat any
+// COMDAT-DAG: ${{.*}}fglobal{{.*}} = comdat nodeduplicate
+// COMDAT-DAG: define {{.*}}ftemplate{{.*}} section "foo" comdat {
+// COMDAT-DAG: define {{.*}}fglobal{{.*}} section "foo" comdat {
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -2293,6 +2293,18 @@
   if (auto *SA = D->getAttr<PragmaClangTextSectionAttr>())
     if (!D->getAttr<SectionAttr>())
       F->addFnAttr("implicit-section-name", SA->getName());
+  // If we have -ffunction-sections, and also an explicit section name,
+  // put the function into a section group so that various linker GC
+  // operations will still work with this function.
+  if (CodeGenOpts.FunctionSections && getTriple().supportsCOMDAT() &&
+      D->hasAttr<SectionAttr>()) {
+    // Don't replace a real comdat.
+    if (!F->getComdat()) {
+      llvm::Comdat *C = TheModule.getOrInsertComdat(F->getName());
+      C->setSelectionKind(llvm::Comdat::NoDeduplicate);
+      F->setComdat(C);
+    }
+  }
 
   llvm::AttrBuilder Attrs(F->getContext());
   if (GetCPUAndFeaturesAttributes(GD, Attrs)) {


Index: clang/test/CodeGen/section-attr-comdat.cpp
===
--- /dev/null
+++ clang/test/CodeGen/section-attr-comdat.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -triple=x86_64-linux -emit-llvm  %s -o - | \
+// RUN: FileCheck %s --check-prefix=NOCOMDAT
+// RUN: %clang_cc1 -triple=x86_64-linux -emit-llvm -ffunction-sections %s -o - | \
+// RUN: FileCheck %s --check-prefix=COMDAT
+
+// template function = comdat always.
+template <typename T>
+__attribute__((section("foo"))) T ftemplate(T a) { return a + 1; }
+__attribute__((section("foo"))) int fglobal(int a) { return ftemplate(a) + 2; }
+
+// NOCOMDAT-DAG: ${{.*}}ftemplate{{.*}} = comdat any
+// NOCOMDAT-DAG: define {{.*}}ftemplate{{.*}} section "foo" comdat {
+// NOCOMDAT-DAG: define {{.*}}fglobal{{.*}} section "foo" {
+
+// COMDAT-DAG: ${{.*}}ftemplate{{.*}} = comdat any
+// COMDAT-DAG: ${{.*}}fglobal{{.*}} = comdat nodeduplicate
+// COMDAT-DAG: define {{.*}}ftemplate{{.*}} section "foo" comdat {
+// COMDAT-DAG: define {{.*}}fglobal{{.*}} section "foo" comdat {
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -2293,6 +2293,18 @@
   if (auto *SA = D->getAttr<PragmaClangTextSectionAttr>())
     if (!D->getAttr<SectionAttr>())
       F->addFnAttr("implicit-section-name", SA->getName());
+  // If we have -ffunction-sections, and also an explicit section name,
+  // put the function into a section group so that various linker GC
+  // operations will still work with this function.
+  if (CodeGenOpts.FunctionSections && getTriple().supportsCOMDAT() &&
+      D->hasAttr<SectionAttr>()) {
+    // Don't replace a real comdat.
+    if (!F->getComdat()) {
+      llvm::Comdat *C = TheModule.getOrInsertComdat(F->getName());
+      C->setSelectionKind(llvm::Comdat::NoDeduplicate);
+      F->setComdat(C);
+    }
+  }
 
   llvm::AttrBuilder Attrs(F->getContext());
   if (GetCPUAndFeaturesAttributes(GD, Attrs)) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D143745: Make section attribute and -ffunction-sections play nicely

2023-02-10 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

This works in my simple test cases, but I'm not 100% sure it's (a) the best 
place to do this, or (b) the only place that needs to do this.
Really we should guarantee that `comdat any` wins over `comdat nodeduplicate` 
in all cases.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143745/new/

https://reviews.llvm.org/D143745


[PATCH] D143745: Make section attribute and -ffunction-sections play nicely

2023-02-10 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I'm not sure I understand the linker's mechanics here. Let me say some things 
and you can describe my misunderstanding.

- If the linker was going to discard all of section foo (in the current 
scheme), that means it had no reason to retain either f() or g(). In the new 
scheme, the net result would be the same, both f() and g() are dead and would 
be discarded. So, no behavior change.
- If the linker wanted to retain f(), but had no reason to retain g(), then in 
the current scheme it would retain all of section foo. In the new scheme it 
would retain f() but discard g(). This is the desired behavior change.

Is that second point the "surprising" behavior? Note that this change applies 
only to functions, not variables.
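
The two cases above can be sketched with a toy model of linker GC. This is only a simulation under simplifying assumptions (the linker keeps a whole unit if any function in it is referenced); the type, the helper, and the group names such as "foo[f]" are made up for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model: a function, and the unit the linker keeps or discards as a whole.
struct Fn {
  std::string name;
  std::string gc_unit;
};

// Returns every function that survives when the linker retains each GC unit
// containing at least one referenced ("live") function.
std::set<std::string> retained(const std::vector<Fn> &fns,
                               const std::set<std::string> &live) {
  std::set<std::string> kept_units;
  for (const Fn &fn : fns)
    if (live.count(fn.name))
      kept_units.insert(fn.gc_unit);
  std::set<std::string> kept;
  for (const Fn &fn : fns)
    if (kept_units.count(fn.gc_unit))
      kept.insert(fn.name);
  return kept;
}

// Current scheme: f() and g() share the single section "foo".
const std::vector<Fn> current_scheme = {{"f", "foo"}, {"g", "foo"}};
// New scheme: each function is its own unit (group names are invented).
const std::vector<Fn> new_scheme = {{"f", "foo[f]"}, {"g", "foo[g]"}};
```

With nothing live, both schemes retain nothing; with only f() live, the current scheme retains f() and g() while the new scheme retains only f().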

Yes, more testing is definitely a good thing.

If we do have to have an option, I suppose it could be something like:
`-ffunction-sections[=(default,all)]` where `-ffunction-sections` or 
`-ffunction-sections=default` will apply only to `.text` (or other default text 
section), while `-ffunction-sections=all` does what my patch says.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143745/new/

https://reviews.llvm.org/D143745


[PATCH] D143745: Make section attribute and -ffunction-sections play nicely

2023-02-14 Thread Paul Robinson via Phabricator via cfe-commits

probinson abandoned this revision.
probinson added a comment.

Discussion on the GCC bug has persuaded me this is not a good idea.  I'll solve 
my user's problem a different way.
@MaskRay you can close the GCC bug, it looks like I can't do it myself.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143745/new/

https://reviews.llvm.org/D143745


[PATCH] D144188: Tighten up a modules test

2023-02-16 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: ChuanqiXu.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D144188

Files:
  clang/test/CXX/module/basic/basic.def.odr/p4.cppm


Index: clang/test/CXX/module/basic/basic.def.odr/p4.cppm
===
--- clang/test/CXX/module/basic/basic.def.odr/p4.cppm
+++ clang/test/CXX/module/basic/basic.def.odr/p4.cppm
@@ -2,7 +2,7 @@
 // RUN: mkdir %t
 // RUN: split-file %s %t
 //
-// RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-llvm -o - | FileCheck %t/Module.cppm --implicit-check-not unused_inline --implicit-check-not unused_stastic_global_module
+// RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-llvm -o - | FileCheck %t/Module.cppm --implicit-check-not unused
 //
 // RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-module-interface -o %t/Module.pcm
 // RUN: %clang_cc1 -std=c++20 %t/module.cpp -triple %itanium_abi_triple -fmodule-file=%t/Module.pcm -emit-llvm -o - | FileCheck %t/module.cpp --implicit-check-not=unused --implicit-check-not=global_module
@@ -33,8 +33,6 @@
 // CHECK-DAG: @_ZL24const_var_module_linkage = internal
 //
 // CHECK-DAG: @_ZW6Module25unused_var_module_linkage = {{(dso_local )?}}global i32 4
-// CHECK-NOT: @_ZW6Module32unused_static_var_module_linkage =
-// CHECK-NOT: @_ZW6Module31unused_const_var_module_linkage =
 
 module;
 
@@ -85,7 +83,6 @@
   }
 }
 
-// CHECK-NOT: define {{(dso_local )?}}void {{.*}}@_ZW6Module28unused_static_module_linkagev
 static void unused_static_module_linkage() {}
 
 static void used_static_module_linkage() {}

[PATCH] D144188: Tighten up a modules test

2023-02-16 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I looked at this test only because it caused a merge conflict downstream. While 
it does work, it will not catch some kinds of mistakes; by being less specific 
in the "not" checks, it will catch more potential problems.




Comment at: clang/test/CXX/module/basic/basic.def.odr/p4.cppm:5
 //
-// RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-llvm -o - | FileCheck %t/Module.cppm --implicit-check-not unused_inline --implicit-check-not unused_stastic_global_module
 //

Note the typo "unused_sta*s*tic_global_module"
Simpler to reject all "unused" strings.



Comment at: clang/test/CXX/module/basic/basic.def.odr/p4.cppm:88
 
-// CHECK-NOT: define {{(dso_local )?}}void {{.*}}@_ZW6Module28unused_static_module_linkagev
 static void unused_static_module_linkage() {}

This CHECK-NOT is overly specific. For example,
`define dso_local hidden void ...`
would not match, and therefore pass, when it should not.
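
A small sketch of why the narrow pattern is fragile, using std::regex as an approximation of FileCheck's matching (the helper names are hypothetical, and FileCheck's real behavior has more to it than this). A CHECK-NOT only fails the test when its pattern matches, so a line the pattern misses slips through.

```cpp
#include <regex>
#include <string>

// The removed CHECK-NOT pattern, with its {{...}} regex blocks written out
// as a single std::regex (an approximation of FileCheck's matching).
inline bool specific_check_matches(const std::string &ir_line) {
  static const std::regex pattern(
      R"(define (dso_local )?void .*@_ZW6Module28unused_static_module_linkagev)");
  return std::regex_search(ir_line, pattern);
}

// Roughly what "--implicit-check-not unused" looks for: a plain substring.
inline bool broad_check_matches(const std::string &ir_line) {
  return ir_line.find("unused") != std::string::npos;
}
```

For a line such as `define dso_local hidden void @_ZW6Module28unused_static_module_linkagev()`, the specific pattern does not match but the broad one does.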


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144188/new/

https://reviews.llvm.org/D144188


[PATCH] D144188: Tighten up a modules test

2023-02-17 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG17a90f1196c1: Tighten up a modules test (authored by 
probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144188/new/

https://reviews.llvm.org/D144188

Files:
  clang/test/CXX/module/basic/basic.def.odr/p4.cppm


Index: clang/test/CXX/module/basic/basic.def.odr/p4.cppm
===
--- clang/test/CXX/module/basic/basic.def.odr/p4.cppm
+++ clang/test/CXX/module/basic/basic.def.odr/p4.cppm
@@ -2,7 +2,7 @@
 // RUN: mkdir %t
 // RUN: split-file %s %t
 //
-// RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-llvm -o - | FileCheck %t/Module.cppm --implicit-check-not unused_inline --implicit-check-not unused_stastic_global_module
+// RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-llvm -o - | FileCheck %t/Module.cppm --implicit-check-not unused
 //
 // RUN: %clang_cc1 -std=c++20 %t/Module.cppm -triple %itanium_abi_triple -emit-module-interface -o %t/Module.pcm
 // RUN: %clang_cc1 -std=c++20 %t/module.cpp -triple %itanium_abi_triple -fmodule-file=%t/Module.pcm -emit-llvm -o - | FileCheck %t/module.cpp --implicit-check-not=unused --implicit-check-not=global_module
@@ -33,8 +33,6 @@
 // CHECK-DAG: @_ZL24const_var_module_linkage = internal
 //
 // CHECK-DAG: @_ZW6Module25unused_var_module_linkage = {{(dso_local )?}}global i32 4
-// CHECK-NOT: @_ZW6Module32unused_static_var_module_linkage =
-// CHECK-NOT: @_ZW6Module31unused_const_var_module_linkage =
 
 module;
 
@@ -85,7 +83,6 @@
   }
 }
 
-// CHECK-NOT: define {{(dso_local )?}}void {{.*}}@_ZW6Module28unused_static_module_linkagev
 static void unused_static_module_linkage() {}
 
 static void used_static_module_linkage() {}

[PATCH] D144586: [PS4/PS5] Specify no <complex.h> or <threads.h>

2023-02-22 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: aaron.ballman.
Herald added a project: All.
probinson requested review of this revision.

We've never provided these headers so set the preprocessor
toggles to reflect that.


https://reviews.llvm.org/D144586

Files:
  clang/lib/Basic/Targets/OSTargets.h
  clang/test/C/C11/n1460.c


Index: clang/test/C/C11/n1460.c
===
--- clang/test/C/C11/n1460.c
+++ clang/test/C/C11/n1460.c
@@ -7,9 +7,15 @@
 // If we claim to not support the feature then we expect diagnostics when
 // using that feature. Otherwise, we expect no diagnostics.
 #ifdef __STDC_NO_COMPLEX__
-  // We do not have any targets which do not support complex, so we don't
-  // expect to get into this block.
-  #error "it's unexpected that we don't support complex"
+  // PS4/PS5 set this to indicate no <complex.h> but still support the
+  // _Complex syntax.
+  #ifdef __SCE__
+    #define HAS_COMPLEX
+  #else
+    // We do not have any other targets which do not support complex, so we
+    // don't expect to get into this block.
+    #error "it's unexpected that we don't support complex"
+  #endif
   float _Complex fc;
   double _Complex dc;
   long double _Complex ldc;
Index: clang/lib/Basic/Targets/OSTargets.h
===
--- clang/lib/Basic/Targets/OSTargets.h
+++ clang/lib/Basic/Targets/OSTargets.h
@@ -535,6 +535,8 @@
 DefineStd(Builder, "unix", Opts);
 Builder.defineMacro("__ELF__");
 Builder.defineMacro("__SCE__");
+Builder.defineMacro("__STDC_NO_COMPLEX__");
+Builder.defineMacro("__STDC_NO_THREADS__");
   }
 
 public:

[PATCH] D144586: [PS4/PS5] Specify no <complex.h> or <threads.h>

2023-02-22 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

That's correct, I do see  in our SDK.

I don't see a need for a release note; we're not actually removing anything 
that we used to support.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144586/new/

https://reviews.llvm.org/D144586


[PATCH] D144586: [PS4/PS5] Specify no <complex.h> or <threads.h>

2023-02-23 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG32d441bfb4f3: [PS4/PS5] Specify no <complex.h> or
<threads.h> (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144586/new/

https://reviews.llvm.org/D144586

Files:
  clang/lib/Basic/Targets/OSTargets.h
  clang/test/C/C11/n1460.c


Index: clang/test/C/C11/n1460.c
===
--- clang/test/C/C11/n1460.c
+++ clang/test/C/C11/n1460.c
@@ -7,9 +7,15 @@
 // If we claim to not support the feature then we expect diagnostics when
 // using that feature. Otherwise, we expect no diagnostics.
 #ifdef __STDC_NO_COMPLEX__
-  // We do not have any targets which do not support complex, so we don't
-  // expect to get into this block.
-  #error "it's unexpected that we don't support complex"
+  // PS4/PS5 set this to indicate no <complex.h> but still support the
+  // _Complex syntax.
+  #ifdef __SCE__
+    #define HAS_COMPLEX
+  #else
+    // We do not have any other targets which do not support complex, so we
+    // don't expect to get into this block.
+    #error "it's unexpected that we don't support complex"
+  #endif
   float _Complex fc;
   double _Complex dc;
   long double _Complex ldc;
Index: clang/lib/Basic/Targets/OSTargets.h
===
--- clang/lib/Basic/Targets/OSTargets.h
+++ clang/lib/Basic/Targets/OSTargets.h
@@ -535,6 +535,8 @@
 DefineStd(Builder, "unix", Opts);
 Builder.defineMacro("__ELF__");
 Builder.defineMacro("__SCE__");
+Builder.defineMacro("__STDC_NO_COMPLEX__");
+Builder.defineMacro("__STDC_NO_THREADS__");
   }
 
 public:

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-08 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit vector containing one of the source operands.
+/// \param __b
+///A 256-bit vector containing one of the source operands.
+/// \returns A 256-bit vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit vector containing one of the source operands.
+/// \param __b
+///A 256-bit vector containing one of the source operands.
+/// \returns A 256-bit vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using unsigned saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile <immintrin.h>
+///
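
The difference between the plain and saturating adds documented above can be sketched per 8-bit lane in scalar code. This is only a model of the described semantics, not the intrinsics themselves; the helper names are hypothetical.

```cpp
#include <cstdint>

// One lane of a wrapping add: only the low 8 bits of the sum are kept,
// which is the "overflow is ignored" behavior in the comments.
inline std::uint8_t add_wrap_u8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a + b);
}

// One lane of a signed saturating add: the sum is clamped to [-128, 127].
inline std::int8_t add_sat_i8(std::int8_t a, std::int8_t b) {
  int sum = static_cast<int>(a) + static_cast<int>(b);
  if (sum > 127) sum = 127;
  if (sum < -128) sum = -128;
  return static_cast<std::int8_t>(sum);
}
```

For instance, 250 + 10 wraps around in the first helper, while 100 + 100 is clamped to 127 in the second.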

[PATCH] D150278: [Headers][doc] Add "shift" intrinsic descriptions to avx2intrin.h

2023-05-10 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D150278

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -493,108 +493,404 @@
 return (__m256i)__builtin_ia32_psignd256((__v8si)__a, (__v8si)__b);
 }
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// \code
+/// __m256i _mm256_slli_si256(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+/// An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_slli_si256(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// \code
+/// __m256i _mm256_bslli_epi128(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+///An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_bslli_epi128(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi16(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_psllwi256((__v16hi)__a, __count);
 }
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by the number of bits specified by the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 15, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///A 128-bit vector of [2 x i64] whose lower element gives the unsigned
+///shift count (in bits). The upper element is ignored.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_sll_epi16(__m256i __a, __m128i __count)
 {
   return (__m256i)__builtin_ia32_psllw256((__v16hi)__a, (__v8hi)__count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 31, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi32(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_pslldi256((__v8si)__a, __count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by the number of bits given in the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 31, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+/// \param __count
+///A 128-bit vector of [2 x i64] whose lower element gives the unsigned
+///shift count (in bits). The upper element is ignored.
+/// \returns A 256-bit vector of [8 x i32] containing th
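
The shift behavior these comments describe for a 16-bit element can be modeled in scalar code as follows. This is a sketch of the documented semantics for a single lane, with a hypothetical helper name; it does not call the intrinsics.

```cpp
#include <cstdint>

// One 16-bit lane of a logical left shift as the comments describe it:
// zero bits are shifted in, and a count above 15 clears the lane.
inline std::uint16_t shift_left_lane16(std::uint16_t value, unsigned count) {
  if (count > 15)
    return 0;
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(value) << count);
}
```

The explicit range check matters in a scalar model because an oversized shift count on a plain C++ integer is not guaranteed to produce zero.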

[PATCH] D150278: [Headers][doc] Add "shift" intrinsic descriptions to avx2intrin.h

2023-05-10 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG642bd1123d05: [Headers][doc] Add "shift" intrinsic 
descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150278/new/

https://reviews.llvm.org/D150278

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -493,108 +493,404 @@
 return (__m256i)__builtin_ia32_psignd256((__v8si)__a, (__v8si)__b);
 }
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// \code
+/// __m256i _mm256_slli_si256(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+/// An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_slli_si256(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile <immintrin.h>
+///
+/// \code
+/// __m256i _mm256_bslli_epi128(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+///An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_bslli_epi128(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi16(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_psllwi256((__v16hi)__a, __count);
 }
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by the number of bits specified by the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///A 128-bit vector of [2 x i64] whose lower element gives the unsigned
+///shift count (in bits). The upper element is ignored.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_sll_epi16(__m256i __a, __m128i __count)
 {
   return (__m256i)__builtin_ia32_psllw256((__v16hi)__a, (__v8hi)__count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 31, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi32(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_pslldi256((__v8si)__a, __count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by the number of bits given in the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 31, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+///

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-18 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0905c567f0c7: [Headers][doc] Add FMA intrinsic descriptions 
(authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D148021?vs=512461&id=514663#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148021/new/

https://reviews.llvm.org/D148021

Files:
  clang/lib/Headers/fmaintrin.h

Index: clang/lib/Headers/fmaintrin.h
===
--- clang/lib/Headers/fmaintrin.h
+++ clang/lib/Headers/fmaintrin.h
@@ -18,192 +18,756 @@
 #define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(128)))
 #define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(256)))
 
+/// Computes a multiply-add of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213PS instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend.
+/// \returns A 128-bit vector of [4 x float] containing the result.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ps(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddps((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a multiply-add of 128-bit vectors of [2 x double].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213PD instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend.
+/// \returns A 128-bit [2 x double] vector containing the result.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_pd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddpd((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a scalar multiply-add of the single-precision values in the
+///low 32 bits of 128-bit vectors of [4 x float].
+/// \code
+/// result[31:0] = (__A[31:0] * __B[31:0]) + __C[31:0]
+/// result[127:32] = __A[127:32]
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213SS instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand in the low
+///32 bits.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier in the low
+///32 bits.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend in the low
+///32 bits.
+/// \returns A 128-bit vector of [4 x float] containing the result in the low
+///32 bits and a copy of \a __A[127:32] in the upper 96 bits.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ss(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddss3((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a scalar multiply-add of the double-precision values in the
+///low 64 bits of 128-bit vectors of [2 x double].
+/// \code
+/// result[63:0] = (__A[63:0] * __B[63:0]) + __C[63:0]
+/// result[127:64] = __A[127:64]
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213SD instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand in the low
+///64 bits.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier in the low
+///64 bits.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend in the low
+///64 bits.
+/// \returns A 128-bit vector of [2 x double] containing the result in the low
+///64 bits and a copy of \a __A[127:64] in the upper 64 bits.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_sd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddsd3((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a multiply-subtract of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) - __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMSUB213PS instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the subtrahend.
+/// \returns A 128-bit vector of [4 x float] containing the resu

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-18 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I chose to leave the "for each element" cases as-is, but I will keep your 
comments in mind as I go through other intrinsics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148021/new/

https://reviews.llvm.org/D148021

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148653: [Header][doc] Add/revise MONITOR/MWAIT[X] descriptions

2023-04-18 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D148653

Files:
  clang/lib/Headers/mwaitxintrin.h
  clang/lib/Headers/pmmintrin.h


Index: clang/lib/Headers/pmmintrin.h
===
--- clang/lib/Headers/pmmintrin.h
+++ clang/lib/Headers/pmmintrin.h
@@ -253,9 +253,12 @@
 ///the processor in the monitor event pending state. Data stored in the
 ///monitored address range causes the processor to exit the pending state.
 ///
+/// The \c MONITOR instruction can be used in kernel mode, and in other modes
+/// if MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MONITOR  instruction.
+/// This intrinsic corresponds to the \c MONITOR instruction.
 ///
 /// \param __p
 ///The memory range to be monitored. The size of the range is determined by
@@ -270,19 +273,22 @@
   __builtin_ia32_monitor(__p, __extensions, __hints);
 }
 
-/// Used with the MONITOR instruction to wait while the processor is in
+/// Used with the \c MONITOR instruction to wait while the processor is in
 ///the monitor event pending state. Data stored in the monitored address
 ///range causes the processor to exit the pending state.
 ///
+/// The \c MWAIT instruction can be used in kernel mode, and in other modes if
+/// MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MWAIT  instruction.
+/// This intrinsic corresponds to the \c MWAIT instruction.
 ///
 /// \param __extensions
-///Optional extensions for the monitoring state, which may vary by
+///Optional extensions for the monitoring state, which can vary by
 ///processor.
 /// \param __hints
-///Optional hints for the monitoring state, which may vary by processor.
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwait(unsigned __extensions, unsigned __hints)
 {
Index: clang/lib/Headers/mwaitxintrin.h
===
--- clang/lib/Headers/mwaitxintrin.h
+++ clang/lib/Headers/mwaitxintrin.h
@@ -16,12 +16,41 @@
 
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  __target__("mwaitx")))
+
+/// Establishes a linear address memory range to be monitored and puts
+///the processor in the monitor event pending state. Data stored in the
+///monitored address range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MONITORX instruction.
+///
+/// \param __p
+///The memory range to be monitored. The size of the range is determined by
+///CPUID function _0005h.
+/// \param __extensions
+///Optional extensions for the monitoring state.
+/// \param __hints
+///Optional hints for the monitoring state.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_monitorx(void * __p, unsigned __extensions, unsigned __hints)
 {
   __builtin_ia32_monitorx(__p, __extensions, __hints);
 }
 
+/// Used with the \c MONITORX instruction to wait while the processor is in
+///the monitor event pending state. Data stored in the monitored address
+///range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MWAITX instruction.
+///
+/// \param __extensions
+///Optional extensions for the monitoring state, which can vary by
+///processor.
+/// \param __hints
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwaitx(unsigned __extensions, unsigned __hints, unsigned __clock)
 {



[PATCH] D148653: [Header][doc] Add/revise MONITOR/MWAIT[X] descriptions

2023-04-19 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5ddcef2ad3db: [Headers][doc] Add/revise MONITOR/MWAIT 
descriptions (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148653/new/

https://reviews.llvm.org/D148653

Files:
  clang/lib/Headers/mwaitxintrin.h
  clang/lib/Headers/pmmintrin.h


Index: clang/lib/Headers/pmmintrin.h
===
--- clang/lib/Headers/pmmintrin.h
+++ clang/lib/Headers/pmmintrin.h
@@ -253,9 +253,12 @@
 ///the processor in the monitor event pending state. Data stored in the
 ///monitored address range causes the processor to exit the pending state.
 ///
+/// The \c MONITOR instruction can be used in kernel mode, and in other modes
+/// if MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MONITOR  instruction.
+/// This intrinsic corresponds to the \c MONITOR instruction.
 ///
 /// \param __p
 ///The memory range to be monitored. The size of the range is determined by
@@ -270,19 +273,22 @@
   __builtin_ia32_monitor(__p, __extensions, __hints);
 }
 
-/// Used with the MONITOR instruction to wait while the processor is in
+/// Used with the \c MONITOR instruction to wait while the processor is in
 ///the monitor event pending state. Data stored in the monitored address
 ///range causes the processor to exit the pending state.
 ///
+/// The \c MWAIT instruction can be used in kernel mode, and in other modes if
+/// MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MWAIT  instruction.
+/// This intrinsic corresponds to the \c MWAIT instruction.
 ///
 /// \param __extensions
-///Optional extensions for the monitoring state, which may vary by
+///Optional extensions for the monitoring state, which can vary by
 ///processor.
 /// \param __hints
-///Optional hints for the monitoring state, which may vary by processor.
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwait(unsigned __extensions, unsigned __hints)
 {
Index: clang/lib/Headers/mwaitxintrin.h
===
--- clang/lib/Headers/mwaitxintrin.h
+++ clang/lib/Headers/mwaitxintrin.h
@@ -16,12 +16,41 @@
 
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  __target__("mwaitx")))
+
+/// Establishes a linear address memory range to be monitored and puts
+///the processor in the monitor event pending state. Data stored in the
+///monitored address range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MONITORX instruction.
+///
+/// \param __p
+///The memory range to be monitored. The size of the range is determined by
+///CPUID function _0005h.
+/// \param __extensions
+///Optional extensions for the monitoring state.
+/// \param __hints
+///Optional hints for the monitoring state.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_monitorx(void * __p, unsigned __extensions, unsigned __hints)
 {
   __builtin_ia32_monitorx(__p, __extensions, __hints);
 }
 
+/// Used with the \c MONITORX instruction to wait while the processor is in
+///the monitor event pending state. Data stored in the monitored address
+///range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MWAITX instruction.
+///
+/// \param __extensions
+///Optional extensions for the monitoring state, which can vary by
+///processor.
+/// \param __hints
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwaitx(unsigned __extensions, unsigned __hints, unsigned __clock)
 {



[PATCH] D148653: [Header][doc] Add/revise MONITOR/MWAIT[X] descriptions

2023-04-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/pmmintrin.h:278
 ///the monitor event pending state. Data stored in the monitored address
 ///range causes the processor to exit the pending state.
 ///

goldstein.w.n wrote:
> interrupts too. Might as well add that if updating the comments.
Ah, sorry, I missed this comment before I pushed. Updated in rG12426441.




Comment at: clang/lib/Headers/pmmintrin.h:291
 /// \param __hints
-///Optional hints for the monitoring state, which may vary by processor.
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS

goldstein.w.n wrote:
> out of curiosity, why "may" -> "can"?
That was on the advice of my tech writer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148653/new/

https://reviews.llvm.org/D148653


[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D149205

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -786,7 +786,6 @@
   return (__m128i)__builtin_shufflevector((__v8hi)__X, (__v8hi)__X, 0, 0, 0, 0, 0, 0, 0, 0);
 }
 
-
 static __inline__ __m128i __DEFAULT_FN_ATTRS128
 _mm_broadcastd_epi32(__m128i __X)
 {
@@ -935,102 +934,810 @@
   return (__m128i)__builtin_ia32_psrlv2di((__v2di)__X, (__v2di)__Y);
 }
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 128-bit vector
+///of [2 x double] in \a mask determines the source for each element.
+///
+/// \code
+/// FOR element := 0 to 1
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m128d _mm_mask_i32gather_pd(__m128d a, const double *m, __m128i i,
+///   __m128d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 128-bit vector of [2 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m. Only
+///the first two elements are used.
+/// \param mask
+///A 128-bit vector of [2 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 128-bit vector of [2 x double] containing the gathered values.
 #define _mm_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m128d)__builtin_ia32_gatherd_pd((__v2df)(__m128i)(a), \
   (double const *)(m), \
   (__v4si)(__m128i)(i), \
   (__v2df)(__m128d)(mask), (s)))
 
+/// Conditionally gathers four 64-bit floating-point values, either from the
+///256-bit vector of [4 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 256-bit vector
+///of [4 x double] in \a mask determines the source for each element.
+///
+/// \code
+/// FOR element := 0 to 3
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m256d _mm256_mask_i32gather_pd(__m256d a, const double *m, __m128i i,
+///  __m256d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 256-bit vector of [4 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m.
+/// \param mask
+///A 256-bit vector of [4 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 256-bit vector of [4 x double] containing the gathered values.
 #define _mm256_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m256d)__builtin_ia32_gatherd_pd256((__v4df)(__m256d)(a), \
  (double const *)(m), \
  (__v4si)(__m128i)(i), \
  (__v4df)(__m256d)(mask), (s)))
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [2 x i64] in \a i. The 128-bit vector
+///of [2 x double] in \a mask determines the source for each element.
+///
+/// \code
+/// FOR element := 0 t

[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-26 Thread Paul Robinson via Phabricator via cfe-commits

probinson marked 2 inline comments as done.
probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:942
+///
+/// \code
+/// FOR element := 0 to 1

pengfei wrote:
> Use `\code{.operation}` please, the same below. Our internal tool will 
> recognize this pattern.
Ok. I'll modify our tooling to ignore it.
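
For reference, the pseudo-code in this block can be read as the following scalar model. This is only a sketch: `Vec2d` and `mask_i32gather_pd_model` are hypothetical names, the mask is passed as raw 64-bit patterns rather than as doubles, and none of it is part of avx2intrin.h.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 128-bit vector of [2 x double].
struct Vec2d { double v[2]; };

// Scalar model of the two-element masked gather pseudo-code:
//   IF mask[j+63] == 0  result := a
//   ELSE                result := Load64(m + SignExtend(i[k+31:k])*s)
// Assumption: idx holds the [4 x i32] index vector; only the first two
// elements are read, as the parameter description states.
inline Vec2d mask_i32gather_pd_model(Vec2d a, const void *m,
                                     const std::int32_t idx[4],
                                     const std::uint64_t mask[2], int s) {
  // The scale must be 1, 2, 4, or 8.
  assert(s == 1 || s == 2 || s == 4 || s == 8);
  Vec2d result;
  const char *base = static_cast<const char *>(m);
  for (int element = 0; element < 2; ++element) {
    if ((mask[element] >> 63) == 0) {
      // Mask bit clear: take the element from a.
      result.v[element] = a.v[element];
    } else {
      // Mask bit set: sign-extend the index, scale it, and load 64 bits
      // from the resulting byte address.
      std::int64_t offset = static_cast<std::int64_t>(idx[element]) * s;
      std::memcpy(&result.v[element], base + offset, sizeof(double));
    }
  }
  return result;
}
```

Note that `m + index*s` is a byte address, so with `s` equal to 8 the indexes behave like ordinary `double` array subscripts.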


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149205/new/

https://reviews.llvm.org/D149205


[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-26 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
probinson marked an inline comment as done.
Closed by commit rG039ae62405b6: [Headers][doc] Add "gather" 
intrinsic descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D149205?vs=516917&id=517182#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149205/new/

https://reviews.llvm.org/D149205

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -935,102 +935,810 @@
   return (__m128i)__builtin_ia32_psrlv2di((__v2di)__X, (__v2di)__Y);
 }
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 128-bit vector
+///of [2 x double] in \a mask determines the source for each element.
+///
+/// \code{.operation}
+/// FOR element := 0 to 1
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m128d _mm_mask_i32gather_pd(__m128d a, const double *m, __m128i i,
+///   __m128d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 128-bit vector of [2 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m. Only
+///the first two elements are used.
+/// \param mask
+///A 128-bit vector of [2 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 128-bit vector of [2 x double] containing the gathered values.
 #define _mm_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m128d)__builtin_ia32_gatherd_pd((__v2df)(__m128i)(a), \
   (double const *)(m), \
   (__v4si)(__m128i)(i), \
   (__v2df)(__m128d)(mask), (s)))
 
+/// Conditionally gathers four 64-bit floating-point values, either from the
+///256-bit vector of [4 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 256-bit vector
+///of [4 x double] in \a mask determines the source for each element.
+///
+/// \code{.operation}
+/// FOR element := 0 to 3
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m256d _mm256_mask_i32gather_pd(__m256d a, const double *m, __m128i i,
+///  __m256d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 256-bit vector of [4 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m.
+/// \param mask
+///A 256-bit vector of [4 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 256-bit vector of [4 x double] containing the gathered values.
 #define _mm256_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m256d)__builtin_ia32_gatherd_pd256((__v4df)(__m256d)(a), \
  (double const *)(m), \
  (__v4si)(__m128i)(i), \
  (__v4df)(__m256d)(mask), (s)))
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vecto

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-05 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I experimented with this. It looks like it emits info only for //defined// 
methods, not for //used// methods. That is, if you change the test to say
`void t1::f1() { f2(); }`
you get DWARF for f1 but not f2. The way Sony does it, you get DWARF for both 
f1 and f2.
(In neither case is info emitted for f3, which is declared but not defined or 
used.)

Is that what you intended?
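
To make the distinction concrete, here is a minimal sketch of the kind of class involved. Only the names `t1`, `f1`, `f2` and `f3` come from the comment above; the `calls` counter and the function bodies are assumptions added so the example is self-contained, and the definition of `f2` stands in for one that would normally live in a different translation unit.

```cpp
// Hypothetical reduction; only t1, f1, f2 and f3 are names from the review.
struct t1 {
  int calls = 0;  // assumption: lets a caller observe which methods ran
  void f1();      // defined below, so it is a "defined" method
  void f2();      // called by f1, so it is a "used" method
  void f3();      // declared only: neither defined nor used
};

// Defined in this translation unit. Per the comment above, the patch under
// review emits DWARF for this method.
void t1::f1() {
  ++calls;
  f2();
}

// Assumption: in the scenario described, this definition would be in another
// translation unit. It is included here only so the sketch links. The Sony
// approach would describe f2 in the TU containing f1 because f1 calls it.
void t1::f2() { ++calls; }
```

With `f2` defined elsewhere, the two approaches differ only in whether the translation unit containing `f1` also describes `f2`; `f3` is omitted either way.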


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017


[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D152017#4397113, @dblaikie wrote:

> What's the particular goal/value in including called-but-not-defined 
> functions? Are your users generally building only parts of their program with 
> debug info & you want it to be complete-ish in the parts that do have debug 
> info? Because if they were building the whole program with debug info, /some/ 
> translation unit would have the definition of the function, and the debug 
> info for it.

Actually the goal was to match the behavior of proprietary compilers for 
previous consoles, knowing that the debugger would be fine with that. I think 
it's worth taking this idea (only defined methods) back to them, and see what 
they think. Because your patch is seriously simpler than ours!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017


[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

My debugger guy says "this shouldn't be a problem."

Given that, my request is that `-gincomplete-types` should be default-true for 
`DebuggerTuning == SCE` if you want to commit this; otherwise I'll redo our 
downstream patch to match yours.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017


[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I have to say I'm not super excited about "-gincomplete-types" given that 
"incomplete type" means something different to a C++ user.

"-gdefined-methods-only" ? reads awkwardly in the no- form.
"-gsuppress-undefined-methods" ? which is similar to what we called it.
"-gundefined-methods" ? you'd default-true in that case, the no form would 
suppress them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Oh, `-fstandalone-debug` should override this? In the Sony implementation, it 
does. (Sorry for not remembering that sooner.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-07 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Could always go with `-gsuppress-undefined-methods` if you're not happy about 
default-on options. I don't have a strong opinion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: uweigand, Kai.
Herald added a project: All.
probinson requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Part of the project to eliminate special handling for triples in lit
expressions.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D139444

Files:
  clang/test/Analysis/cfref_PR2519.c
  clang/test/CodeGen/cfstring2.c
  clang/test/Driver/as-version.s
  clang/test/Import/forward-declared-objc-class/test.m
  clang/test/Import/objc-arc/test-cleanup-object.m
  clang/test/Import/objc-autoreleasepool/test.m
  clang/test/Import/objc-definitions-in-expression/test.m
  clang/test/Import/objc-method/test.m
  clang/test/Import/objc-param-decl/test.m
  clang/test/Import/objc-try-catch/test.m
  clang/test/Modules/DebugInfoNamespace.cpp
  clang/test/Modules/DebugInfoTransitiveImport.m
  clang/test/Modules/ExtDebugInfo.cpp
  clang/test/Modules/ExtDebugInfo.m
  clang/test/Modules/ModuleDebugInfo.cpp
  clang/test/Modules/ModuleDebugInfo.m
  clang/test/Modules/ModuleDebugInfoDwoId.cpp
  clang/test/Modules/ModuleModuleDebugInfo.cpp
  clang/test/Modules/autolink.m
  clang/test/Modules/autolinkTBD.m
  clang/test/Modules/builtins.m
  clang/test/Modules/clang_module_file_info.m
  clang/test/Modules/cxx-irgen.cpp
  clang/test/Modules/debug-info-moduleimport-in-module.m
  clang/test/Modules/debug-info-moduleimport.m
  clang/test/Modules/direct-module-import.m
  clang/test/Modules/merge-anon-record-definition-in-objc.m
  clang/test/Modules/merge-extension-ivars.m
  clang/test/Modules/merge-objc-interface-visibility.m
  clang/test/Modules/merge-objc-interface.m
  clang/test/Modules/merge-record-definition-nonmodular.m
  clang/test/Modules/merge-record-definition-visibility.m
  clang/test/Modules/merge-record-definition.m
  clang/test/Modules/module-debuginfo-prefix.m
  clang/test/Modules/module-file-home-is-cwd.m
  clang/test/Modules/module_file_info.m
  clang/test/Modules/objc-initializer.m
  clang/test/Modules/pch-used.m
  clang/test/Modules/redecl-ivars.m
  clang/test/Modules/use-exportas-for-link.m
  clang/test/PCH/externally-retained.m
  clang/test/PCH/irgen-rdar13114142.mm
  clang/test/PCH/objc_container.m
  clang/test/PCH/objc_literals.m
  clang/test/PCH/objc_literals.mm
  clang/test/PCH/objcxx-ivar-class.mm
  clang/test/PCH/pending-ids.m
  llvm/test/MC/AsmParser/debug-no-source.s
  llvm/test/Support/encoding.ll
  llvm/test/tools/llvm-mc/no_warnings.test

Index: llvm/test/tools/llvm-mc/no_warnings.test
===
--- llvm/test/tools/llvm-mc/no_warnings.test
+++ llvm/test/tools/llvm-mc/no_warnings.test
@@ -1,4 +1,4 @@
-# UNSUPPORTED: -zos
+# UNSUPPORTED: target={{.*}}-zos
 # RUN: llvm-mc --no-warn %s 2>&1 | FileCheck %s
 
 # CHECK-NOT: warning:
Index: llvm/test/Support/encoding.ll
===
--- llvm/test/Support/encoding.ll
+++ llvm/test/Support/encoding.ll
@@ -1,7 +1,7 @@
 ; Checks if llc can deal with different char encodings.
 ; This is only required for z/OS.
 ;
-; UNSUPPORTED: !s390x-none-zos
+; REQUIRES: target=s390x-none-zos
 ;
 ; RUN: cat %s >%t && chtag -tc ISO8859-1 %t && llc %t -o - >/dev/null
 ; RUN: iconv -f ISO8859-1 -t IBM-1047 <%s >%t && chtag -tc IBM-1047 %t && llc %t -o - >/dev/null
Index: llvm/test/MC/AsmParser/debug-no-source.s
===
--- llvm/test/MC/AsmParser/debug-no-source.s
+++ llvm/test/MC/AsmParser/debug-no-source.s
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos
+// UNSUPPORTED: target={{.*}}-zos
 // REQUIRES: object-emission
 // RUN: llvm-mc %s | FileCheck %s
 
Index: clang/test/PCH/pending-ids.m
===
--- clang/test/PCH/pending-ids.m
+++ clang/test/PCH/pending-ids.m
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos, target={{.*}}-aix{{.*}}
 // Test for rdar://10278815
 
 // Without PCH
Index: clang/test/PCH/objcxx-ivar-class.mm
===
--- clang/test/PCH/objcxx-ivar-class.mm
+++ clang/test/PCH/objcxx-ivar-class.mm
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos, target={{.*}}-aix{{.*}}
 // Test this without pch.
 // RUN: %clang_cc1 -no-opaque-pointers -include %S/objcxx-ivar-class.h -triple %itanium_abi_triple %s -emit-llvm -o - | FileCheck %s
 
Index: clang/test/PCH/objc_literals.mm
===
--- clang/test/PCH/objc_literals.mm
+++ clang/test/PCH/objc_literals.mm
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos, target={{.*}}-aix{{.*}}
 // RUN: %clang_cc1 -triple %itanium_abi_triple -emit-pch -x objective-c++ -std=c++0x -o %t %s
 // RUN: %clang_cc1 -triple %itani

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

The changes in this patch assume that there aren't any possible suffixes after 
the `-zos` part of the triple (no version numbers, like you might find with 
darwin or macos, and nothing like `-elf` or `-eabi` like some targets have).  
If there are suffixes, I'll happily revise to put `{{.*}}` after everything.

The one test I could not verify is llvm/test/Support/encoding.ll, because I 
don't have the utilities that it needs.  But `UNSUPPORTED: !` was the 
workaround for not having triples allowed in `REQUIRES:` so I think it's the 
right change.
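
A rough Python sketch of why a suffix would matter, assuming lit requires the pattern to match the whole feature string (which is the assumption stated above). The helper names `feature_regex` and `matches` are made up for illustration, and the `-v2` suffix is a hypothetical triple, not a real one.

```python
import re


def feature_regex(pattern):
    """Turn a lit-style pattern into a compiled regex.

    Text inside {{...}} is treated as a regular expression; everything
    else is matched literally. Simplified model, not lit's actual code.
    """
    regex = ""
    for part in re.split(r"(\{\{.+?\}\})", pattern):
        if not part:
            continue
        if part.startswith("{{") and part.endswith("}}"):
            regex += "(?:" + part[2:-2] + ")"
        else:
            regex += re.escape(part)
    return re.compile(regex)


def matches(pattern, feature):
    """Return True if the pattern matches the entire feature string."""
    return feature_regex(pattern).fullmatch(feature) is not None


# No suffix after -zos: both forms of the pattern match.
print(matches("target={{.*}}-zos", "target=s390x-ibm-zos"))
# Hypothetical suffixed triple: only the form with a trailing {{.*}} matches.
print(matches("target={{.*}}-zos", "target=s390x-ibm-zos-v2"))
print(matches("target={{.*}}-zos{{.*}}", "target=s390x-ibm-zos-v2"))
```

Under that whole-string assumption, the pattern without the trailing `{{.*}}` silently stops applying as soon as a triple grows a suffix.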


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

[PATCH] D138597: DebugInfo: Add/support new DW_LANG codes for recent C and C++ versions

2022-12-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

LGTM.  I agree with the commentary in the test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138597/new/

https://reviews.llvm.org/D138597

[PATCH] D138597: DebugInfo: Add/support new DW_LANG codes for recent C and C++ versions

2022-12-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Oh, right, PS4 defaults to C99.  It's okay with me if you make those two 
unsupported for PS4.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138597/new/

https://reviews.llvm.org/D138597

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-08 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D139444#3978189, @uweigand wrote:

> In D139444#3975182, @probinson wrote:
>
>> The changes in this patch assume that there aren't any possible suffixes 
>> after the `-zos` part of the triple (no version numbers, like you might find 
>> with darwin or macos, and nothing like `-elf` or `-eabi` like some targets 
>> have).  If there are suffixes, I'll happily revise to put `{{.*}}` after 
>> everything.
>
> I think for consistency with other targets, and to be safe for future 
> extensions of the target triple, it would be better to add the `{{.*}}`

Okay.

> [for encoding.ll] To express that restriction on the *host* system, you 
> should be using a `REQUIRES: system-zos` line.   However, it looks like this 
> capability is not actually currently implemented - you'll have to add it to 
> the code in `utils/lit/lit/llvm/config.py` here:
>
>   [...]
>   elif platform.system() == 'NetBSD':
>       features.add('system-netbsd')
>   elif platform.system() == 'AIX':
>       features.add('system-aix')
>   elif platform.system() == 'SunOS':
>       features.add('system-solaris')
>
> (Note that you probably still should add the `-mtriple` because the test case 
> requires *both* running on a z/OS host *and* compiling for the z/OS target.)

If you can tell me the `platform.system()` value to look for to detect z/OS, I 
can do that.  Probably as a separate patch, as it would be going beyond the 
mechanical replacement that I'm doing for everything else.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-08 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 481421.
probinson added a comment.

Add trailing '{{.*}}' as requested.
Have not changed the encoding.ll test, waiting on @uweigand about correct value 
to test in Python.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

Files:
  clang/test/Analysis/cfref_PR2519.c
  clang/test/CodeGen/cfstring2.c
  clang/test/Driver/as-version.s
  clang/test/Import/forward-declared-objc-class/test.m
  clang/test/Import/objc-arc/test-cleanup-object.m
  clang/test/Import/objc-autoreleasepool/test.m
  clang/test/Import/objc-definitions-in-expression/test.m
  clang/test/Import/objc-method/test.m
  clang/test/Import/objc-param-decl/test.m
  clang/test/Import/objc-try-catch/test.m
  clang/test/Modules/DebugInfoNamespace.cpp
  clang/test/Modules/DebugInfoTransitiveImport.m
  clang/test/Modules/ExtDebugInfo.cpp
  clang/test/Modules/ExtDebugInfo.m
  clang/test/Modules/ModuleDebugInfo.cpp
  clang/test/Modules/ModuleDebugInfo.m
  clang/test/Modules/ModuleDebugInfoDwoId.cpp
  clang/test/Modules/ModuleModuleDebugInfo.cpp
  clang/test/Modules/autolink.m
  clang/test/Modules/autolinkTBD.m
  clang/test/Modules/builtins.m
  clang/test/Modules/clang_module_file_info.m
  clang/test/Modules/cxx-irgen.cpp
  clang/test/Modules/debug-info-moduleimport-in-module.m
  clang/test/Modules/debug-info-moduleimport.m
  clang/test/Modules/direct-module-import.m
  clang/test/Modules/merge-anon-record-definition-in-objc.m
  clang/test/Modules/merge-extension-ivars.m
  clang/test/Modules/merge-objc-interface-visibility.m
  clang/test/Modules/merge-objc-interface.m
  clang/test/Modules/merge-record-definition-nonmodular.m
  clang/test/Modules/merge-record-definition-visibility.m
  clang/test/Modules/merge-record-definition.m
  clang/test/Modules/module-debuginfo-prefix.m
  clang/test/Modules/module-file-home-is-cwd.m
  clang/test/Modules/module_file_info.m
  clang/test/Modules/objc-initializer.m
  clang/test/Modules/pch-used.m
  clang/test/Modules/redecl-ivars.m
  clang/test/Modules/use-exportas-for-link.m
  clang/test/PCH/externally-retained.m
  clang/test/PCH/irgen-rdar13114142.mm
  clang/test/PCH/objc_container.m
  clang/test/PCH/objc_literals.m
  clang/test/PCH/objc_literals.mm
  clang/test/PCH/objcxx-ivar-class.mm
  clang/test/PCH/pending-ids.m
  llvm/test/MC/AsmParser/debug-no-source.s
  llvm/test/Support/encoding.ll
  llvm/test/tools/llvm-mc/no_warnings.test

Index: llvm/test/tools/llvm-mc/no_warnings.test
===
--- llvm/test/tools/llvm-mc/no_warnings.test
+++ llvm/test/tools/llvm-mc/no_warnings.test
@@ -1,4 +1,4 @@
-# UNSUPPORTED: -zos
+# UNSUPPORTED: target={{.*}}-zos{{.*}}
 # RUN: llvm-mc --no-warn %s 2>&1 | FileCheck %s
 
 # CHECK-NOT: warning:
Index: llvm/test/Support/encoding.ll
===
--- llvm/test/Support/encoding.ll
+++ llvm/test/Support/encoding.ll
@@ -1,7 +1,7 @@
 ; Checks if llc can deal with different char encodings.
 ; This is only required for z/OS.
 ;
-; UNSUPPORTED: !s390x-none-zos
+; REQUIRES: target=s390x-none-zos
 ;
 ; RUN: cat %s >%t && chtag -tc ISO8859-1 %t && llc %t -o - >/dev/null
 ; RUN: iconv -f ISO8859-1 -t IBM-1047 <%s >%t && chtag -tc IBM-1047 %t && llc %t -o - >/dev/null
Index: llvm/test/MC/AsmParser/debug-no-source.s
===
--- llvm/test/MC/AsmParser/debug-no-source.s
+++ llvm/test/MC/AsmParser/debug-no-source.s
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos
+// UNSUPPORTED: target={{.*}}-zos{{.*}}
 // REQUIRES: object-emission
 // RUN: llvm-mc %s | FileCheck %s
 
Index: clang/test/PCH/pending-ids.m
===
--- clang/test/PCH/pending-ids.m
+++ clang/test/PCH/pending-ids.m
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos{{.*}}, target={{.*}}-aix{{.*}}
 // Test for rdar://10278815
 
 // Without PCH
Index: clang/test/PCH/objcxx-ivar-class.mm
===
--- clang/test/PCH/objcxx-ivar-class.mm
+++ clang/test/PCH/objcxx-ivar-class.mm
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos{{.*}}, target={{.*}}-aix{{.*}}
 // Test this without pch.
 // RUN: %clang_cc1 -no-opaque-pointers -include %S/objcxx-ivar-class.h -triple %itanium_abi_triple %s -emit-llvm -o - | FileCheck %s
 
Index: clang/test/PCH/objc_literals.mm
===
--- clang/test/PCH/objc_literals.mm
+++ clang/test/PCH/objc_literals.mm
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos{{.*}}, target={{.*}}-aix{{.*}}
 // RUN: %clang_cc1 -triple %itanium_abi_triple -emit-pch -x objective-c++ -std=c++0x -o %t %s
 // RUN: %clang_cc1 -triple %itanium_abi_triple -include-pch %t -x objective-c++ -std

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-12 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I suppose we could temporarily add a test that does

  ; REQUIRES: target={{.*}}-zos
  ; RUN: %python -c 'import platform; print(platform.system())' && false

and see what gets printed.

Searching the buildbot console page for 'zos' turns up nothing; 's390' turns up 
clang-s390x-linux, clang-s390x-linux-lnt, mlir-s390x-linux.  Are there any zos 
hosted bots?  I.e., is this test ever actually run?  Maybe there are only 
downstream zos bots, in which case this should be a downstream test instead?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-12 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 482180.
probinson added a comment.
Herald added a subscriber: delcypher.

Define 'system-zos' and use it in the one test that needs it.
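
For illustration only, the host check amounts to something like the following sketch. The table and the function name `host_system_feature` are hypothetical; the real `config.py` uses an if/elif chain rather than a dictionary. The `'OS/390'` value is the one used in the diff for this revision.

```python
import platform

# Host OS name (as reported by platform.system()) -> lit feature name.
# Hypothetical table form of the if/elif chain in config.py.
HOST_SYSTEM_FEATURES = {
    "NetBSD": "system-netbsd",
    "AIX": "system-aix",
    "SunOS": "system-solaris",
    "OS/390": "system-zos",
}


def host_system_feature(system_name=None):
    """Return the lit feature for the host OS, or None if there is none."""
    if system_name is None:
        system_name = platform.system()
    return HOST_SYSTEM_FEATURES.get(system_name)


# A test with "REQUIRES: system-zos" would run only where this yields
# 'system-zos', independent of the target triple being compiled for.
print(host_system_feature("OS/390"))
```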


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

Files:
  clang/test/Analysis/cfref_PR2519.c
  clang/test/CodeGen/cfstring2.c
  clang/test/Driver/as-version.s
  clang/test/Import/forward-declared-objc-class/test.m
  clang/test/Import/objc-arc/test-cleanup-object.m
  clang/test/Import/objc-autoreleasepool/test.m
  clang/test/Import/objc-definitions-in-expression/test.m
  clang/test/Import/objc-method/test.m
  clang/test/Import/objc-param-decl/test.m
  clang/test/Import/objc-try-catch/test.m
  clang/test/Modules/DebugInfoNamespace.cpp
  clang/test/Modules/DebugInfoTransitiveImport.m
  clang/test/Modules/ExtDebugInfo.cpp
  clang/test/Modules/ExtDebugInfo.m
  clang/test/Modules/ModuleDebugInfo.cpp
  clang/test/Modules/ModuleDebugInfo.m
  clang/test/Modules/ModuleDebugInfoDwoId.cpp
  clang/test/Modules/ModuleModuleDebugInfo.cpp
  clang/test/Modules/autolink.m
  clang/test/Modules/autolinkTBD.m
  clang/test/Modules/builtins.m
  clang/test/Modules/clang_module_file_info.m
  clang/test/Modules/cxx-irgen.cpp
  clang/test/Modules/debug-info-moduleimport-in-module.m
  clang/test/Modules/debug-info-moduleimport.m
  clang/test/Modules/direct-module-import.m
  clang/test/Modules/merge-anon-record-definition-in-objc.m
  clang/test/Modules/merge-extension-ivars.m
  clang/test/Modules/merge-objc-interface-visibility.m
  clang/test/Modules/merge-objc-interface.m
  clang/test/Modules/merge-record-definition-nonmodular.m
  clang/test/Modules/merge-record-definition-visibility.m
  clang/test/Modules/merge-record-definition.m
  clang/test/Modules/module-debuginfo-prefix.m
  clang/test/Modules/module-file-home-is-cwd.m
  clang/test/Modules/module_file_info.m
  clang/test/Modules/objc-initializer.m
  clang/test/Modules/pch-used.m
  clang/test/Modules/redecl-ivars.m
  clang/test/Modules/use-exportas-for-link.m
  clang/test/PCH/externally-retained.m
  clang/test/PCH/irgen-rdar13114142.mm
  clang/test/PCH/objc_container.m
  clang/test/PCH/objc_literals.m
  clang/test/PCH/objc_literals.mm
  clang/test/PCH/objcxx-ivar-class.mm
  clang/test/PCH/pending-ids.m
  llvm/test/MC/AsmParser/debug-no-source.s
  llvm/test/Support/encoding.ll
  llvm/test/tools/llvm-mc/no_warnings.test
  llvm/utils/lit/lit/llvm/config.py

Index: llvm/utils/lit/lit/llvm/config.py
===
--- llvm/utils/lit/lit/llvm/config.py
+++ llvm/utils/lit/lit/llvm/config.py
@@ -88,6 +88,8 @@
             features.add('system-aix')
         elif platform.system() == 'SunOS':
             features.add('system-solaris')
+        elif platform.system() == 'OS/390':
+            features.add('system-zos')
 
         # Native compilation: host arch == default triple arch
         # Both of these values should probably be in every site config (e.g. as
Index: llvm/test/tools/llvm-mc/no_warnings.test
===
--- llvm/test/tools/llvm-mc/no_warnings.test
+++ llvm/test/tools/llvm-mc/no_warnings.test
@@ -1,4 +1,4 @@
-# UNSUPPORTED: -zos
+# UNSUPPORTED: target={{.*}}-zos{{.*}}
 # RUN: llvm-mc --no-warn %s 2>&1 | FileCheck %s
 
 # CHECK-NOT: warning:
Index: llvm/test/Support/encoding.ll
===
--- llvm/test/Support/encoding.ll
+++ llvm/test/Support/encoding.ll
@@ -1,9 +1,9 @@
 ; Checks if llc can deal with different char encodings.
 ; This is only required for z/OS.
 ;
-; UNSUPPORTED: !s390x-none-zos
+; REQUIRES: system-zos, systemz-registered-target
 ;
-; RUN: cat %s >%t && chtag -tc ISO8859-1 %t && llc %t -o - >/dev/null
+; RUN: cat %s >%t && chtag -tc ISO8859-1 %t && llc -mtriple=s390x-ibm-zos %t -o - >/dev/null
 ; RUN: iconv -f ISO8859-1 -t IBM-1047 <%s >%t && chtag -tc IBM-1047 %t && llc %t -o - >/dev/null
 ; RUN: iconv -f ISO8859-1 -t IBM-1047 <%s >%t && chtag -r %t && llc %t -o - >/dev/null
 
Index: llvm/test/MC/AsmParser/debug-no-source.s
===
--- llvm/test/MC/AsmParser/debug-no-source.s
+++ llvm/test/MC/AsmParser/debug-no-source.s
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos
+// UNSUPPORTED: target={{.*}}-zos{{.*}}
 // REQUIRES: object-emission
 // RUN: llvm-mc %s | FileCheck %s
 
Index: clang/test/PCH/pending-ids.m
===
--- clang/test/PCH/pending-ids.m
+++ clang/test/PCH/pending-ids.m
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos{{.*}}, target={{.*}}-aix{{.*}}
 // Test for rdar://10278815
 
 // Without PCH
Index: clang/test/PCH/objcxx-ivar-class.mm
===
--- clang/test/PCH/objcxx-ivar-class.mm
+++ clang/test/PCH/objcxx-ivar-class.mm
@@ -1,4 +1,4 @@
-

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-12 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Thanks @Kai and @uweigand, good to know a z/OS bot is in the works. Hope this 
patch now meets your needs.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

[PATCH] D139444: [ZOS] Convert tests to check 'target={{.*}}-zos'

2022-12-12 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7793e676514b: [ZOS] Convert tests to check 
'target={{.*}}-zos{{.*}}' (authored by probinson).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139444/new/

https://reviews.llvm.org/D139444

Files:
  clang/test/Analysis/cfref_PR2519.c
  clang/test/CodeGen/cfstring2.c
  clang/test/Driver/as-version.s
  clang/test/Import/forward-declared-objc-class/test.m
  clang/test/Import/objc-arc/test-cleanup-object.m
  clang/test/Import/objc-autoreleasepool/test.m
  clang/test/Import/objc-definitions-in-expression/test.m
  clang/test/Import/objc-method/test.m
  clang/test/Import/objc-param-decl/test.m
  clang/test/Import/objc-try-catch/test.m
  clang/test/Modules/DebugInfoNamespace.cpp
  clang/test/Modules/DebugInfoTransitiveImport.m
  clang/test/Modules/ExtDebugInfo.cpp
  clang/test/Modules/ExtDebugInfo.m
  clang/test/Modules/ModuleDebugInfo.cpp
  clang/test/Modules/ModuleDebugInfo.m
  clang/test/Modules/ModuleDebugInfoDwoId.cpp
  clang/test/Modules/ModuleModuleDebugInfo.cpp
  clang/test/Modules/autolink.m
  clang/test/Modules/autolinkTBD.m
  clang/test/Modules/builtins.m
  clang/test/Modules/clang_module_file_info.m
  clang/test/Modules/cxx-irgen.cpp
  clang/test/Modules/debug-info-moduleimport-in-module.m
  clang/test/Modules/debug-info-moduleimport.m
  clang/test/Modules/direct-module-import.m
  clang/test/Modules/merge-anon-record-definition-in-objc.m
  clang/test/Modules/merge-extension-ivars.m
  clang/test/Modules/merge-objc-interface-visibility.m
  clang/test/Modules/merge-objc-interface.m
  clang/test/Modules/merge-record-definition-nonmodular.m
  clang/test/Modules/merge-record-definition-visibility.m
  clang/test/Modules/merge-record-definition.m
  clang/test/Modules/module-debuginfo-prefix.m
  clang/test/Modules/module-file-home-is-cwd.m
  clang/test/Modules/module_file_info.m
  clang/test/Modules/objc-initializer.m
  clang/test/Modules/pch-used.m
  clang/test/Modules/redecl-ivars.m
  clang/test/Modules/use-exportas-for-link.m
  clang/test/PCH/externally-retained.m
  clang/test/PCH/irgen-rdar13114142.mm
  clang/test/PCH/objc_container.m
  clang/test/PCH/objc_literals.m
  clang/test/PCH/objc_literals.mm
  clang/test/PCH/objcxx-ivar-class.mm
  clang/test/PCH/pending-ids.m
  llvm/test/MC/AsmParser/debug-no-source.s
  llvm/test/Support/encoding.ll
  llvm/test/tools/llvm-mc/no_warnings.test
  llvm/utils/lit/lit/llvm/config.py

Index: llvm/utils/lit/lit/llvm/config.py
===
--- llvm/utils/lit/lit/llvm/config.py
+++ llvm/utils/lit/lit/llvm/config.py
@@ -88,6 +88,8 @@
             features.add('system-aix')
         elif platform.system() == 'SunOS':
             features.add('system-solaris')
+        elif platform.system() == 'OS/390':
+            features.add('system-zos')
 
         # Native compilation: host arch == default triple arch
         # Both of these values should probably be in every site config (e.g. as
Index: llvm/test/tools/llvm-mc/no_warnings.test
===
--- llvm/test/tools/llvm-mc/no_warnings.test
+++ llvm/test/tools/llvm-mc/no_warnings.test
@@ -1,4 +1,4 @@
-# UNSUPPORTED: -zos
+# UNSUPPORTED: target={{.*}}-zos{{.*}}
 # RUN: llvm-mc --no-warn %s 2>&1 | FileCheck %s
 
 # CHECK-NOT: warning:
Index: llvm/test/Support/encoding.ll
===
--- llvm/test/Support/encoding.ll
+++ llvm/test/Support/encoding.ll
@@ -1,9 +1,9 @@
 ; Checks if llc can deal with different char encodings.
 ; This is only required for z/OS.
 ;
-; UNSUPPORTED: !s390x-none-zos
+; REQUIRES: system-zos, systemz-registered-target
 ;
-; RUN: cat %s >%t && chtag -tc ISO8859-1 %t && llc %t -o - >/dev/null
+; RUN: cat %s >%t && chtag -tc ISO8859-1 %t && llc -mtriple=s390x-ibm-zos %t -o - >/dev/null
 ; RUN: iconv -f ISO8859-1 -t IBM-1047 <%s >%t && chtag -tc IBM-1047 %t && llc %t -o - >/dev/null
 ; RUN: iconv -f ISO8859-1 -t IBM-1047 <%s >%t && chtag -r %t && llc %t -o - >/dev/null
 
Index: llvm/test/MC/AsmParser/debug-no-source.s
===
--- llvm/test/MC/AsmParser/debug-no-source.s
+++ llvm/test/MC/AsmParser/debug-no-source.s
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos
+// UNSUPPORTED: target={{.*}}-zos{{.*}}
 // REQUIRES: object-emission
 // RUN: llvm-mc %s | FileCheck %s
 
Index: clang/test/PCH/pending-ids.m
===
--- clang/test/PCH/pending-ids.m
+++ clang/test/PCH/pending-ids.m
@@ -1,4 +1,4 @@
-// UNSUPPORTED: -zos, target={{.*}}-aix{{.*}}
+// UNSUPPORTED: target={{.*}}-zos{{.*}}, target={{.*}}-aix{{.*}}
 // Test for rdar://10278815
 
 // Without PCH
Index: clang/test/PCH/objcxx-ivar-class.mm
===

[PATCH] D138675: [flang] Add -ffast-math and -Ofast

2022-12-14 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

See D139967 for why `UNSUPPORTED: powerpc` 
didn't work. That patch will put it back, and also update the lit config so the 
check will work now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138675/new/

https://reviews.llvm.org/D138675

[PATCH] D139953: [llvm][DebugInfo] Backport DW_AT_default_value for template args

2022-12-14 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

> IIUC the labels have to match a line in the file uniquely, which the DW_TAGs 
> wouldn't

CHECK-LABEL doesn't have to match a line in the file uniquely. What happens is 
that all the -LABEL directives are processed first, in order, subdividing the 
input text into regions. Then the non-LABEL directives are processed within 
their respective regions.

So you could have

  CHECK-LABEL: DW_TAG
  CHECK: DW_AT_name ("foo")
  CHECK-LABEL: DW_TAG
  CHECK: DW_AT_location
  CHECK-LABEL: DW_TAG

which would search only the first tag for "foo" and only the second tag for the 
location attribute.
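
To make the two-pass behavior concrete, here is a toy Python model of it. This is a sketch, not FileCheck's implementation: it uses plain substring matching instead of FileCheck patterns, and the names `run_checks`, `DIRECTIVES`, `good`, and `bad` are invented for the example.

```python
# The directives from the snippet above, as (kind, pattern) pairs.
DIRECTIVES = [
    ("LABEL", "DW_TAG"),
    ("CHECK", 'DW_AT_name ("foo")'),
    ("LABEL", "DW_TAG"),
    ("CHECK", "DW_AT_location"),
    ("LABEL", "DW_TAG"),
]


def run_checks(directives, text):
    """Toy model: match all LABELs first, then CHECKs within each region."""
    # Pass 1: match every LABEL in order to find the region boundaries.
    label_spans = []
    cursor = 0
    for kind, pattern in directives:
        if kind == "LABEL":
            start = text.find(pattern, cursor)
            if start < 0:
                return False
            cursor = start + len(pattern)
            label_spans.append((start, cursor))

    # Pass 2: each CHECK must match between its label and the next one.
    next_label = 0
    cursor = 0
    region_end = label_spans[0][0] if label_spans else len(text)
    for kind, pattern in directives:
        if kind == "LABEL":
            cursor = label_spans[next_label][1]
            next_label += 1
            if next_label < len(label_spans):
                region_end = label_spans[next_label][0]
            else:
                region_end = len(text)
        else:
            found = text.find(pattern, cursor, region_end)
            if found < 0:
                return False
            cursor = found + len(pattern)
    return True


good = 'DW_TAG_a\n  DW_AT_name ("foo")\nDW_TAG_b\n  DW_AT_location\nDW_TAG_c\n'
bad = 'DW_TAG_a\n  DW_AT_location\nDW_TAG_b\n  DW_AT_name ("foo")\nDW_TAG_c\n'
print(run_checks(DIRECTIVES, good))
print(run_checks(DIRECTIVES, bad))
```

In the `bad` input the name sits under the second tag, so the first CHECK fails even though the string appears later in the text; the label regions keep it from matching there.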


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139953/new/

https://reviews.llvm.org/D139953

[PATCH] D139953: [llvm][DebugInfo] Backport DW_AT_default_value for template args

2022-12-14 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

> CHECK-LABEL doesn't have to match a line in the file uniquely.

I mean, it's good practice if they do match uniquely; that way you don't get 
excessively confusing results when the output changes, and things start 
matching where you didn't expect.  But it's not a requirement.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139953/new/

https://reviews.llvm.org/D139953

[PATCH] D138954: [PPC] Convert tests to check 'target='

2022-12-15 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG948bb35d7474: [PPC] Convert tests to check 
'target=' (authored by probinson).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138954/new/

https://reviews.llvm.org/D138954

Files:
  clang/test/CodeGen/PowerPC/ppc-mm-malloc-le.c
  clang/test/CodeGen/PowerPC/ppc-mm-malloc.c
  clang/test/CodeGen/no-builtin.cpp
  clang/test/CodeGenCoroutines/pr56329.cpp
  clang/test/Sema/no-builtin.cpp
  llvm/test/DebugInfo/debuglineinfo-path.ll


Index: llvm/test/DebugInfo/debuglineinfo-path.ll
===
--- llvm/test/DebugInfo/debuglineinfo-path.ll
+++ llvm/test/DebugInfo/debuglineinfo-path.ll
@@ -2,7 +2,7 @@
 
 ; On powerpc llvm-nm describes win_func as a global variable, not a function. It breaks the test.
 ; It is not essential to DWARF path handling code we're testing here.
-; UNSUPPORTED: powerpc
+; UNSUPPORTED: target=powerpc{{.*}}
 ; REQUIRES: object-emission
 ; RUN: %llc_dwarf -O0 -filetype=obj -o %t < %s
 ; RUN: llvm-nm --radix=o %t | grep posix_absolute_func > %t.posix_absolute_func
Index: clang/test/Sema/no-builtin.cpp
===
--- clang/test/Sema/no-builtin.cpp
+++ clang/test/Sema/no-builtin.cpp
@@ -1,5 +1,4 @@
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -fsyntax-only -verify %s
-// UNSUPPORTED: ppc64be
 
 /// Prevent use of all builtins.
 void valid_attribute_all_1() __attribute__((no_builtin)) {}
Index: clang/test/CodeGenCoroutines/pr56329.cpp
===
--- clang/test/CodeGenCoroutines/pr56329.cpp
+++ clang/test/CodeGenCoroutines/pr56329.cpp
@@ -2,7 +2,7 @@
 //
 // RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %s -O3 -S -emit-llvm -o - | FileCheck %s
 // This test is expected to fail on PowerPC.
-// XFAIL: powerpc
+// XFAIL: target=powerpc{{.*}}
 
 #include "Inputs/coroutine.h"
 
Index: clang/test/CodeGen/no-builtin.cpp
===
--- clang/test/CodeGen/no-builtin.cpp
+++ clang/test/CodeGen/no-builtin.cpp
@@ -1,5 +1,4 @@
 // RUN: %clang_cc1 -no-opaque-pointers -triple x86_64-linux-gnu -S -emit-llvm -o - %s | FileCheck %s
-// UNSUPPORTED: ppc64be
 
 // CHECK-LABEL: define{{.*}} void @foo_no_mempcy() #0
 extern "C" void foo_no_mempcy() __attribute__((no_builtin("memcpy"))) {}
Index: clang/test/CodeGen/PowerPC/ppc-mm-malloc.c
===
--- clang/test/CodeGen/PowerPC/ppc-mm-malloc.c
+++ clang/test/CodeGen/PowerPC/ppc-mm-malloc.c
@@ -1,6 +1,5 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
-// REQUIRES: native, powerpc-registered-target
-// UNSUPPORTED: !powerpc64-
+// REQUIRES: native, target=powerpc64-{{.*}}
 // The stdlib.h included in mm_malloc.h references native system header
 // like: bits/libc-header-start.h or features.h, cross-compile it may
 // require installing target headers in build env, otherwise expecting
Index: clang/test/CodeGen/PowerPC/ppc-mm-malloc-le.c
===
--- clang/test/CodeGen/PowerPC/ppc-mm-malloc-le.c
+++ clang/test/CodeGen/PowerPC/ppc-mm-malloc-le.c
@@ -1,6 +1,5 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
-// REQUIRES: native, powerpc-registered-target
-// UNSUPPORTED: !powerpc64le-
+// REQUIRES: native, target=powerpc64le-{{.*}}
 // The stdlib.h included in mm_malloc.h references native system header
 // like: bits/libc-header-start.h or features.h, cross-compile it may
 // require installing target headers in build env, otherwise expecting


[PATCH] D27794: Make some diagnostic tests C++11 clean

2016-12-14 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: rsmith.
probinson added subscribers: cfe-commits, tigerleapgorge.

Another half-dozen test revisions in the ongoing campaign to make things ready 
for C++11 as Clang's default dialect.

Most of these are straightforward, but I am not entirely sure about a couple of 
things:

- In fixit.cpp, the place that now gets 'expected unqualified-id' seems funny, 
but maybe that's just the nature of things
- In copy-assignment.cpp, I am bemused by the whole thing but especially 
'passing argument to parameter here'
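
The fixit.cpp item comes down to how `>>` is tokenized inside a template argument list. A minimal sketch, assuming a stand-in template `CT` modeled on the one in the test (the `value` member and the names `ct_ok` and `ct_ok_value` are illustrative only, not part of the patch):

```cpp
// Hypothetical stand-in for the template used in test/FixIt/fixit.cpp.
template <int N> struct CT {
  static const int value = N;
};

// C++03 parses `CT<10 >> 2>` as CT<(10 >> 2)> (Clang warns that it
// requires parentheses). C++11 treats the first '>' of '>>' as the end
// of the argument list, so the unparenthesized form is rejected there.
// Parenthesizing the shift is accepted by both dialects.
CT<(10 >> 2)> ct_ok;

// 10 >> 2 == 2, so the instantiation above is CT<2>.
const int ct_ok_value = CT<(10 >> 2)>::value;
```

This is why the unparenthesized declaration can only be kept under a C++03 guard: in C++11 there is no shift expression left for a fix-it to wrap.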


https://reviews.llvm.org/D27794

Files:
  test/FixIt/fixit.cpp
  test/OpenMP/teams_distribute_collapse_messages.cpp
  test/OpenMP/teams_distribute_parallel_for_collapse_messages.cpp
  test/OpenMP/teams_distribute_parallel_for_simd_collapse_messages.cpp
  test/Parser/backtrack-off-by-one.cpp
  test/SemaCXX/copy-assignment.cpp

Index: test/SemaCXX/copy-assignment.cpp
===
--- test/SemaCXX/copy-assignment.cpp
+++ test/SemaCXX/copy-assignment.cpp
@@ -1,12 +1,22 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s 
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++98
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++11
+
+#if __cplusplus >= 201103L
+// expected-note@+3 2 {{candidate constructor}}
+// expected-note@+2 {{passing argument to parameter here}}
+#endif
 struct A {
 };
 
 struct ConvertibleToA {
   operator A();
 };
 
 struct ConvertibleToConstA {
+#if __cplusplus >= 201103L
+// expected-note@+2 {{candidate function}}
+#endif
   operator const A();
 };
 
@@ -69,6 +79,9 @@
   na = a;
   na = constA;
   na = convertibleToA;
+#if __cplusplus >= 201103L
+// expected-error@+2 {{no viable conversion}}
+#endif
   na = convertibleToConstA;
   na += a; // expected-error{{no viable overloaded '+='}}
 
Index: test/Parser/backtrack-off-by-one.cpp
===
--- test/Parser/backtrack-off-by-one.cpp
+++ test/Parser/backtrack-off-by-one.cpp
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 -verify %s
+// RUN: %clang_cc1 -verify %s -std=c++98
+// RUN: %clang_cc1 -verify %s -std=c++11
 
 // PR25946
 // We had an off-by-one error in an assertion when annotating A below.  Our
@@ -10,8 +12,10 @@
 
 // expected-error@+1 {{expected '{' after base class list}}
 template  class B : T // not ',' or '{'
-// expected-error@+3 {{C++ requires a type specifier for all declarations}}
-// expected-error@+2 {{expected ';' after top level declarator}}
+#if __cplusplus < 201103L
+// expected-error@+4 {{expected ';' after top level declarator}}
+#endif
+// expected-error@+2 {{C++ requires a type specifier for all declarations}}
 // expected-error@+1 {{expected ';' after class}}
 A {
 };
Index: test/OpenMP/teams_distribute_parallel_for_simd_collapse_messages.cpp
===
--- test/OpenMP/teams_distribute_parallel_for_simd_collapse_messages.cpp
+++ test/OpenMP/teams_distribute_parallel_for_simd_collapse_messages.cpp
@@ -1,8 +1,13 @@
 // RUN: %clang_cc1 -verify -fopenmp %s
+// RUN: %clang_cc1 -verify -fopenmp %s -std=c++98
+// RUN: %clang_cc1 -verify -fopenmp %s -std=c++11
 
 void foo() {
 }
 
+#if __cplusplus >= 201103L
+// expected-note@+2 4 {{declared here}}
+#endif
 bool foobool(int argc) {
   return argc;
 }
@@ -50,6 +55,9 @@
   for (int i = ST; i < N; i++)
 argv[0][i] = argv[0][i] - argv[0][i-ST]; // expected-error 2 {{expected 2 for loops after '#pragma omp teams distribute parallel for simd', but found only 1}}
 
+#if __cplusplus >= 201103L
+// expected-note@+6 2 {{non-constexpr function 'foobool' cannot be used}}
+#endif
 // expected-error@+4 2 {{directive '#pragma omp teams distribute parallel for simd' cannot contain more than one 'collapse' clause}}
 // expected-error@+3 2 {{argument to 'collapse' clause must be a strictly positive integer value}}
 // expected-error@+2 2 {{expression is not an integral constant expression}}
@@ -62,7 +70,11 @@
   for (int i = ST; i < N; i++)
 argv[0][i] = argv[0][i] - argv[0][i-ST];
 
-// expected-error@+2 2 {{expression is not an integral constant expression}}
+#if __cplusplus >= 201103L
+// expected-error@+5 2 {{integral constant expression must have integral or unscoped enumeration type}}
+#else
+// expected-error@+3 2 {{expression is not an integral constant expression}}
+#endif
 #pragma omp target
 #pragma omp teams distribute parallel for simd collapse (argv[1]=2) // expected-error {{expected ')'}} expected-note {{to match this '('}}
   for (int i = ST; i < N; i++)
@@ -110,11 +122,17 @@
   for (int i = 4; i < 12; i++)
 argv[0][i] = argv[0][i] - argv[0][i-4]; // expected-error {{expected 4 for loops after '#pragma omp teams distribute parallel for simd', but found only 1}}
 
+#if __cplusplus >= 201103L
+// expected-note@+3 {{non-constexpr function 'foobool' cannot be used}}
+#endif
 #pragma omp target
 #pragma

[PATCH] D27794: Make some diagnostic tests C++11 clean

2016-12-15 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a reviewer: ABataev.
probinson added a comment.

+abataev for OpenMP.


https://reviews.llvm.org/D27794



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D27794: Make some diagnostic tests C++11 clean

2016-12-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a reviewer: rnk.
probinson updated this revision to Diff 81977.
probinson added a comment.

Remove the OpenMP tests from this review (committed in r290128).

+rnk who added test/Parser/backtrack-off-by-one.cpp originally.


https://reviews.llvm.org/D27794

Files:
  test/FixIt/fixit.cpp
  test/Parser/backtrack-off-by-one.cpp
  test/SemaCXX/copy-assignment.cpp

Index: test/SemaCXX/copy-assignment.cpp
===
--- test/SemaCXX/copy-assignment.cpp
+++ test/SemaCXX/copy-assignment.cpp
@@ -1,12 +1,22 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s 
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++98
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++11
+
+#if __cplusplus >= 201103L
+// expected-note@+3 2 {{candidate constructor}}
+// expected-note@+2 {{passing argument to parameter here}}
+#endif
 struct A {
 };
 
 struct ConvertibleToA {
   operator A();
 };
 
 struct ConvertibleToConstA {
+#if __cplusplus >= 201103L
+// expected-note@+2 {{candidate function}}
+#endif
   operator const A();
 };
 
@@ -69,6 +79,9 @@
   na = a;
   na = constA;
   na = convertibleToA;
+#if __cplusplus >= 201103L
+// expected-error@+2 {{no viable conversion}}
+#endif
   na = convertibleToConstA;
   na += a; // expected-error{{no viable overloaded '+='}}
 
Index: test/Parser/backtrack-off-by-one.cpp
===
--- test/Parser/backtrack-off-by-one.cpp
+++ test/Parser/backtrack-off-by-one.cpp
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 -verify %s
+// RUN: %clang_cc1 -verify %s -std=c++98
+// RUN: %clang_cc1 -verify %s -std=c++11
 
 // PR25946
 // We had an off-by-one error in an assertion when annotating A below.  Our
@@ -10,8 +12,10 @@
 
 // expected-error@+1 {{expected '{' after base class list}}
 template  class B : T // not ',' or '{'
-// expected-error@+3 {{C++ requires a type specifier for all declarations}}
-// expected-error@+2 {{expected ';' after top level declarator}}
+#if __cplusplus < 201103L
+// expected-error@+4 {{expected ';' after top level declarator}}
+#endif
+// expected-error@+2 {{C++ requires a type specifier for all declarations}}
 // expected-error@+1 {{expected ';' after class}}
 A {
 };
Index: test/FixIt/fixit.cpp
===
--- test/FixIt/fixit.cpp
+++ test/FixIt/fixit.cpp
@@ -1,8 +1,12 @@
-// RUN: %clang_cc1 -pedantic -Wall -Wno-comment -verify -fcxx-exceptions -x c++ %s
+// RUN: %clang_cc1 -pedantic -Wall -Wno-comment -verify -fcxx-exceptions -x c++ -std=c++98 %s
+// RUN: cp %s %t-98
+// RUN: not %clang_cc1 -pedantic -Wall -Wno-comment -fcxx-exceptions -fixit -x c++ -std=c++98 %t-98
+// RUN: %clang_cc1 -fsyntax-only -pedantic -Wall -Werror -Wno-comment -fcxx-exceptions -x c++ -std=c++98 %t-98
 // RUN: not %clang_cc1 -fsyntax-only -fdiagnostics-parseable-fixits -x c++ -std=c++11 %s 2>&1 | FileCheck %s
-// RUN: cp %s %t
-// RUN: not %clang_cc1 -pedantic -Wall -Wno-comment -fcxx-exceptions -fixit -x c++ %t
-// RUN: %clang_cc1 -fsyntax-only -pedantic -Wall -Werror -Wno-comment -fcxx-exceptions -x c++ %t
+// RUN: %clang_cc1 -pedantic -Wall -Wno-comment -verify -fcxx-exceptions -x c++ -std=c++11 %s
+// RUN: cp %s %t-11
+// RUN: not %clang_cc1 -pedantic -Wall -Wno-comment -fcxx-exceptions -fixit -x c++ -std=c++11 %t-11
+// RUN: %clang_cc1 -fsyntax-only -pedantic -Wall -Werror -Wno-comment -fcxx-exceptions -x c++ -std=c++11 %t-11
 
 /* This is a test of the various code modification hints that are
provided as part of warning or extension diagnostics. All of the
@@ -21,7 +25,10 @@
 
 template struct CT { template struct Inner; }; // expected-note{{previous use is here}}
 
+// In C++11 this gets 'expected unqualified-id' which fixit can't fix.
+#if __cplusplus < 201103L
 CT<10 >> 2> ct; // expected-warning{{require parentheses}}
+#endif
 
 class C3 {
 public:
@@ -41,7 +48,11 @@
 };
 
 class B : public A {
+#if __cplusplus >= 201103L
+  A::foo; // expected-error{{ISO C++11 does not allow access declarations}}
+#else
   A::foo; // expected-warning{{access declarations are deprecated}}
+#endif
 };
 
 void f() throw(); // expected-note{{previous}}
@@ -285,8 +296,10 @@
 void (*p)() = &t;
 (void)(&t==p); // expected-error {{use '> ='}}
 (void)(&t>=p); // expected-error {{use '> >'}}
+#if __cplusplus < 201103L
 (void)(&t>>=p); // expected-error {{use '> >'}}
 (Shr)&t>>>=p; // expected-error {{use '> >'}}
+#endif
 
 // FIXME: We correct this to '&t > >= p;' not '&t >>= p;'
 //(Shr)&t>>=p;

[PATCH] D27936: C++11 test cleanup: nonthrowing destructors

2016-12-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: rsmith.
probinson added a subscriber: cfe-commits.

If a dtor has no interesting members, then it ends up being nothrow, which 
affects the generated IR.
Modify some tests to tolerate this difference between C++03 and C++11.

In C++11, a destructor without an explicit exception-spec gets an implicit 
exception-spec.
If the dtor has a body, the implicit exception-spec permits throwing exactly 
the set of types thrown by anything the dtor calls.  If the dtor doesn't have a 
body, what would be the default dtor's body determines the implicit 
exception-spec.  If there are no calls, the implicit exception-spec is nothrow.
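
The rule can be observed directly with a type trait. A minimal sketch, assuming C++11 or later; the struct names `Plain`, `MayThrow`, and `Holder` are hypothetical and do not come from the tests in this patch:

```cpp
#include <type_traits>

// No members or bases whose destructors may throw: in C++11 the
// destructor's implicit exception-spec is non-throwing.
struct Plain {
  ~Plain() {}
};

// Explicitly opts out of the non-throwing default.
struct MayThrow {
  ~MayThrow() noexcept(false) {}
};

// No explicit exception-spec here, but the member's destructor may
// throw, so this destructor is potentially throwing as well.
struct Holder {
  MayThrow member;
  ~Holder() {}
};

static_assert(std::is_nothrow_destructible<Plain>::value,
              "Plain's destructor is implicitly non-throwing");
static_assert(!std::is_nothrow_destructible<Holder>::value,
              "Holder's destructor picks up a throwing exception-spec");
```

A destructor like `Plain`'s is the case that turns `invoke` into `call` in the IR these tests check.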


https://reviews.llvm.org/D27936

Files:
  test/CodeGenCXX/destructors.cpp
  test/CodeGenCXX/nrvo.cpp
  test/CodeGenCXX/partial-destruction.cpp

Index: test/CodeGenCXX/partial-destruction.cpp
===
--- test/CodeGenCXX/partial-destruction.cpp
+++ test/CodeGenCXX/partial-destruction.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -emit-llvm -o - -fcxx-exceptions -fexceptions | FileCheck %s
+// RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -emit-llvm -o - -fcxx-exceptions -fexceptions -std=c++03 | FileCheck %s -check-prefixes=CHECK,CHECKv03
+// RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -emit-llvm -o - -fcxx-exceptions -fexceptions -std=c++11 | FileCheck %s -check-prefixes=CHECK,CHECKv11
 
 // Test IR generation for partial destruction of aggregates.
 
@@ -45,7 +46,8 @@
   // CHECK-NEXT: br label
   // CHECK:  [[ED_AFTER:%.*]] = phi [[A]]* [ [[ED_END]], {{%.*}} ], [ [[ED_CUR:%.*]], {{%.*}} ]
   // CHECK-NEXT: [[ED_CUR]] = getelementptr inbounds [[A]], [[A]]* [[ED_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[ED_CUR]])
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[ED_CUR]])
+  // CHECKv11-NEXT: call   void @_ZN5test01AD1Ev([[A]]* [[ED_CUR]])
   // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[ED_CUR]], [[ED_BEGIN]]
   // CHECK-NEXT: br i1 [[T0]],
   // CHECK:  ret void
@@ -58,7 +60,8 @@
   // CHECK-NEXT: br i1 [[T0]],
   // CHECK:  [[E_AFTER:%.*]] = phi [[A]]* [ [[PARTIAL_END]], {{%.*}} ], [ [[E_CUR:%.*]], {{%.*}} ]
   // CHECK-NEXT: [[E_CUR]] = getelementptr inbounds [[A]], [[A]]* [[E_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv11-NEXT: call   void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
   // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[E_CUR]], [[E_BEGIN]]
   // CHECK-NEXT: br i1 [[T0]],
 
@@ -73,20 +76,21 @@
   // FIXME: There's some really bad block ordering here which causes
   // the partial destroy for the primary normal destructor to fall
   // within the primary EH destructor.
-  // CHECK:  landingpad { i8*, i32 }
-  // CHECK-NEXT:   cleanup
-  // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[ED_BEGIN]], [[ED_CUR]]
-  // CHECK-NEXT: br i1 [[T0]]
-  // CHECK:  [[EDD_AFTER:%.*]] = phi [[A]]* [ [[ED_CUR]], {{%.*}} ], [ [[EDD_CUR:%.*]], {{%.*}} ]
-  // CHECK-NEXT: [[EDD_CUR]] = getelementptr inbounds [[A]], [[A]]* [[EDD_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[EDD_CUR]])
-  // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[EDD_CUR]], [[ED_BEGIN]]
-  // CHECK-NEXT: br i1 [[T0]]
+  // CHECKv03:  landingpad { i8*, i32 }
+  // CHECKv03-NEXT:   cleanup
+  // CHECKv03:  [[T0:%.*]] = icmp eq [[A]]* [[ED_BEGIN]], [[ED_CUR]]
+  // CHECKv03-NEXT: br i1 [[T0]]
+  // CHECKv03:  [[EDD_AFTER:%.*]] = phi [[A]]* [ [[ED_CUR]], {{%.*}} ], [ [[EDD_CUR:%.*]], {{%.*}} ]
+  // CHECKv03-NEXT: [[EDD_CUR]] = getelementptr inbounds [[A]], [[A]]* [[EDD_AFTER]], i64 -1
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[EDD_CUR]])
+  // CHECKv03:  [[T0:%.*]] = icmp eq [[A]]* [[EDD_CUR]], [[ED_BEGIN]]
+  // CHECKv03-NEXT: br i1 [[T0]]
 
   // Back to the primary EH destructor.
   // CHECK:  [[E_AFTER:%.*]] = phi [[A]]* [ [[E_END]], {{%.*}} ], [ [[E_CUR:%.*]], {{%.*}} ]
   // CHECK-NEXT: [[E_CUR]] = getelementptr inbounds [[A]], [[A]]* [[E_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv11-NEXT: call   void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
   // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[E_CUR]], [[E0]]
   // CHECK-NEXT: br i1 [[T0]],
 
@@ -120,8 +124,10 @@
   // CHECK-NEXT:   cleanup
   // CHECK:  landingpad { i8*, i32 }
   // CHECK-NEXT:   cleanup
-  // CHECK:  invoke void @_ZN5test11AD1Ev([[A]]* [[Y]])
-  // CHECK:  invoke void @_ZN5test11AD1Ev([[A]]* [[X]])
+  // CHECKv03:  invoke void @_ZN5test11AD1Ev([[A]]* [[Y]])
+  // CHECKv03:  invoke void @_ZN5test11AD1Ev([[A]]* [[X]])
+  // CHECKv11:  call   void @_ZN5test11AD1Ev([[A]]* [[Y]])
+  // CHECKv11:  call   void @_ZN5test11AD1Ev([[A]]* [[X]])
 }
 
 namespac

[PATCH] D27955: Make CodeGenCXX/arm-swiftcall.cpp tolerate C++11

2016-12-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: rjmccall.
probinson added a subscriber: cfe-commits.
Herald added subscribers: rengolin, aemerson.

The test conjures up and returns a temp which has a struct type, and the struct 
has some empty/padding bytes in the middle.  In C++03 these are handled as 
zero, so the code uses 'llvm.memset' to initialize the temp.
In C++11, the padding is handled as undef, so the code uses 'llvm.memcpy' 
instead, making the test fail.

I've made the test run twice, once per dialect, and check for the appropriate 
intrinsic.
It doesn't look like this is the point of the test, though, so maybe 
hard-coding the dialect would be preferable.
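
For readers unfamiliar with interior padding, here is a rough sketch of the kind of layout involved. The struct `Padded` and the constant `kInteriorPadding` are hypothetical, not the struct from the test, and the offsets in the comments assume a typical ABI with 4-byte, 4-byte-aligned `int` and `float`:

```cpp
#include <cstddef>

// Hypothetical struct with a gap in the middle (not the one in the test).
struct Padded {
  int x;    // bytes 0-3
  char c;   // byte 4
            // bytes 5-7: padding inserted so that f0 is 4-byte aligned
  float f0; // bytes 8-11
  float f1; // bytes 12-15
};

// Number of padding bytes between the end of 'c' and the start of 'f0'.
const std::size_t kInteriorPadding =
    offsetof(Padded, f0) - (offsetof(Padded, c) + sizeof(char));
```

Bytes like these, which belong to no member, are what C++03 treats as zero and C++11 as undef when the temporary is initialized.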


https://reviews.llvm.org/D27955

Files:
  test/CodeGenCXX/arm-swiftcall.cpp


Index: test/CodeGenCXX/arm-swiftcall.cpp
===
--- test/CodeGenCXX/arm-swiftcall.cpp
+++ test/CodeGenCXX/arm-swiftcall.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage | FileCheck %s
+// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage -std=c++03 | FileCheck %s -check-prefixes=CHECK,CHECKv03
+// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage -std=c++11 | FileCheck %s -check-prefixes=CHECK,CHECKv11
 
 // This isn't really testing anything ARM-specific; it's just a convenient
 // 32-bit platform.
@@ -48,7 +49,8 @@
 TEST(struct_1);
 // CHECK-LABEL: define {{.*}} @return_struct_1()
 // CHECK:   [[RET:%.*]] = alloca [[REC:%.*]], align 4
-// CHECK:   @llvm.memset
+// CHECKv03:   @llvm.memset
+// CHECKv11:   @llvm.memcpy
 // CHECK:   [[CAST_TMP:%.*]] = bitcast [[REC]]* [[RET]] to [[AGG:{ i32, \[2 x i8\], i8, \[1 x i8\], float, float }]]*
 // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG]], [[AGG]]* [[CAST_TMP]], i32 0, i32 0
 // CHECK:   [[FIRST:%.*]] = load i32, i32* [[T0]], align 4

[PATCH] D27956: Make CodeGenCXX/stack-reuse-miscompile.cpp tolerate C++11

2016-12-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: lenykholodov.
probinson added a subscriber: cfe-commits.

In this test, the allocas for the temps come out in a different order depending 
on whether the dialect is C++03 or C++11.  To avoid depending on the default 
dialect, I forced it to C++03.

I am concerned, though, because the commentary says there should be no lifetime 
intrinsics.  While that was true in Clang 3.8, it is no longer true in Clang 
3.9, regardless of dialect.  However, the test does not actually verify that 
there are no lifetime intrinsics.

Is it still true that there should be no lifetime intrinsics?  If so, then 
there is a bug that the test has failed to detect.  If not, then the comment 
should be updated.


https://reviews.llvm.org/D27956

Files:
  test/CodeGenCXX/stack-reuse-miscompile.cpp


Index: test/CodeGenCXX/stack-reuse-miscompile.cpp
===
--- test/CodeGenCXX/stack-reuse-miscompile.cpp
+++ test/CodeGenCXX/stack-reuse-miscompile.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang -S -target armv7l-unknown-linux-gnueabihf -emit-llvm -O1 -mllvm -disable-llvm-optzns -S %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple armv7l-unknown-linux-gnueabihf -emit-llvm -O1 -disable-llvm-optzns -std=c++03 %s -o - | FileCheck %s
 
 // This test should not to generate llvm.lifetime.start/llvm.lifetime.end for
 // f function because all temporary objects in this function are used for the

[PATCH] D27955: Make CodeGenCXX/arm-swiftcall.cpp tolerate C++11

2016-12-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 82021.
probinson added a comment.

Force C++03 on this test, to make it insensitive to future changes in the 
default dialect.


https://reviews.llvm.org/D27955

Files:
  test/CodeGenCXX/arm-swiftcall.cpp


Index: test/CodeGenCXX/arm-swiftcall.cpp
===
--- test/CodeGenCXX/arm-swiftcall.cpp
+++ test/CodeGenCXX/arm-swiftcall.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage | FileCheck %s
+// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage -std=c++03 | FileCheck %s -check-prefixes=CHECK
 
 // This isn't really testing anything ARM-specific; it's just a convenient
 // 32-bit platform.

[PATCH] D27955: Make CodeGenCXX/arm-swiftcall.cpp tolerate C++11

2016-12-19 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL290145: Make another test insensitive to the default C++ 
dialect. (authored by probinson).

Changed prior to commit:
  https://reviews.llvm.org/D27955?vs=82021&id=82028#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D27955

Files:
  cfe/trunk/test/CodeGenCXX/arm-swiftcall.cpp


Index: cfe/trunk/test/CodeGenCXX/arm-swiftcall.cpp
===
--- cfe/trunk/test/CodeGenCXX/arm-swiftcall.cpp
+++ cfe/trunk/test/CodeGenCXX/arm-swiftcall.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage | FileCheck %s
+// RUN: %clang_cc1 -triple armv7-apple-darwin9 -emit-llvm -o - %s -Wno-return-type-c-linkage -std=c++03 | FileCheck %s -check-prefixes=CHECK
 
 // This isn't really testing anything ARM-specific; it's just a convenient
 // 32-bit platform.

[PATCH] D27936: C++11 test cleanup: nonthrowing destructors

2016-12-20 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a reviewer: rjmccall.
probinson added a comment.

+rjmccall as IR Gen owner.


https://reviews.llvm.org/D27936




[PATCH] D27994: Make two vtable tests tolerate C++11

2016-12-20 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added a reviewer: rjmccall.
probinson added a subscriber: cfe-commits.

In C++11, we don't emit vtables as eagerly as we do for C++03, so fiddle the 
tests to emit them when the test expects them.

In the C++11 test cleanup project, we're commonly making the tests run in both 
dialects and sometimes with no dialect specified (as Clang's default will 
presumably advance to C++14/17 at some point).  I didn't do that for 
vtable-layout.cpp because it runs FileCheck 46 times, and replicating that 
really seemed like too much.
If it also seems like too much for vtable-linkage.cpp, the easy thing is to 
force it to C++03.
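
One common way to pin down where a vtable is emitted is the Itanium C++ ABI's key-function rule. A minimal sketch, assuming that ABI; the class `Base` and the helpers `call_f` and `check_dispatch` are hypothetical and only loosely follow the shape of the vtable-layout.cpp change:

```cpp
#include <cassert>

// Hypothetical class for illustration.
struct Base {
  virtual int f();              // first non-inline virtual: the key function
  virtual int g() { return 2; } // inline, so it cannot be the key function
  virtual ~Base() {}
};

// Defining the key function out of line makes this translation unit
// responsible for emitting Base's vtable under the Itanium C++ ABI.
int Base::f() { return 1; }

// Calls f() through a reference, i.e. via virtual dispatch.
int call_f(Base &b) { return b.f(); }

// Small self-check that the out-of-line definition is the one reached.
void check_dispatch() {
  Base b;
  assert(call_f(b) == 1);
  assert(b.g() == 2);
}
```

The vtable-linkage.cpp change uses a different lever: it adds a use of the object `e`, so that its vtable is needed in the translation unit.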


https://reviews.llvm.org/D27994

Files:
  test/CodeGenCXX/vtable-layout.cpp
  test/CodeGenCXX/vtable-linkage.cpp


Index: test/CodeGenCXX/vtable-linkage.cpp
===
--- test/CodeGenCXX/vtable-linkage.cpp
+++ test/CodeGenCXX/vtable-linkage.cpp
@@ -1,6 +1,10 @@
 // RUN: %clang_cc1 %s -triple=x86_64-pc-linux -emit-llvm -o %t
+// RUN: %clang_cc1 %s -triple=x86_64-pc-linux -emit-llvm -std=c++03 -o %t.03
+// RUN: %clang_cc1 %s -triple=x86_64-pc-linux -emit-llvm -std=c++11 -o %t.11
 // RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -disable-llvm-optzns -O3 -emit-llvm -o %t.opt
 // RUN: FileCheck %s < %t
+// RUN: FileCheck %s < %t.03
+// RUN: FileCheck %s < %t.11
 // RUN: FileCheck --check-prefix=CHECK-OPT %s < %t.opt
 
 namespace {
@@ -33,6 +37,11 @@
 
 static struct : D { } e;
 
+// Force 'e' to be constructed and therefore have a vtable defined.
+void use_e() {
+  e.f();
+}
+
 // The destructor is the key function.
 template
 struct E {
Index: test/CodeGenCXX/vtable-layout.cpp
===
--- test/CodeGenCXX/vtable-layout.cpp
+++ test/CodeGenCXX/vtable-layout.cpp
@@ -1919,6 +1919,8 @@
 virtual int i(int);
 virtual int i();
   };
+  // Force C's vtable to be generated.
+  int C::f() { return 1; }
 
   class D : C {};
 

[PATCH] D27994: Make two vtable tests tolerate C++11

2016-12-20 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL290205: Make two vtable tests tolerate C++11. (authored by 
probinson).

Changed prior to commit:
  https://reviews.llvm.org/D27994?vs=82126&id=82156#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D27994

Files:
  cfe/trunk/test/CodeGenCXX/vtable-layout.cpp
  cfe/trunk/test/CodeGenCXX/vtable-linkage.cpp


Index: cfe/trunk/test/CodeGenCXX/vtable-layout.cpp
===
--- cfe/trunk/test/CodeGenCXX/vtable-layout.cpp
+++ cfe/trunk/test/CodeGenCXX/vtable-layout.cpp
@@ -1919,6 +1919,8 @@
 virtual int i(int);
 virtual int i();
   };
+  // Force C's vtable to be generated.
+  int C::f() { return 1; }
 
   class D : C {};
 
Index: cfe/trunk/test/CodeGenCXX/vtable-linkage.cpp
===
--- cfe/trunk/test/CodeGenCXX/vtable-linkage.cpp
+++ cfe/trunk/test/CodeGenCXX/vtable-linkage.cpp
@@ -1,6 +1,10 @@
 // RUN: %clang_cc1 %s -triple=x86_64-pc-linux -emit-llvm -o %t
+// RUN: %clang_cc1 %s -triple=x86_64-pc-linux -emit-llvm -std=c++03 -o %t.03
+// RUN: %clang_cc1 %s -triple=x86_64-pc-linux -emit-llvm -std=c++11 -o %t.11
 // RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -disable-llvm-optzns -O3 -emit-llvm -o %t.opt
 // RUN: FileCheck %s < %t
+// RUN: FileCheck %s < %t.03
+// RUN: FileCheck %s < %t.11
 // RUN: FileCheck --check-prefix=CHECK-OPT %s < %t.opt
 
 namespace {
@@ -33,6 +37,11 @@
 
 static struct : D { } e;
 
+// Force 'e' to be constructed and therefore have a vtable defined.
+void use_e() {
+  e.f();
+}
+
 // The destructor is the key function.
 template
 struct E {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D27936: C++11 test cleanup: nonthrowing destructors

2016-12-20 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL290207: C++11 test cleanup: nonthrowing destructors 
(authored by probinson).

Changed prior to commit:
  https://reviews.llvm.org/D27936?vs=81982&id=82157#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D27936

Files:
  cfe/trunk/test/CodeGenCXX/destructors.cpp
  cfe/trunk/test/CodeGenCXX/nrvo.cpp
  cfe/trunk/test/CodeGenCXX/partial-destruction.cpp

Index: cfe/trunk/test/CodeGenCXX/partial-destruction.cpp
===
--- cfe/trunk/test/CodeGenCXX/partial-destruction.cpp
+++ cfe/trunk/test/CodeGenCXX/partial-destruction.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -emit-llvm -o - -fcxx-exceptions -fexceptions | FileCheck %s
+// RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -emit-llvm -o - -fcxx-exceptions -fexceptions -std=c++03 | FileCheck %s -check-prefixes=CHECK,CHECKv03
+// RUN: %clang_cc1 %s -triple=x86_64-apple-darwin10 -emit-llvm -o - -fcxx-exceptions -fexceptions -std=c++11 | FileCheck %s -check-prefixes=CHECK,CHECKv11
 
 // Test IR generation for partial destruction of aggregates.
 
@@ -45,7 +46,8 @@
   // CHECK-NEXT: br label
   // CHECK:  [[ED_AFTER:%.*]] = phi [[A]]* [ [[ED_END]], {{%.*}} ], [ [[ED_CUR:%.*]], {{%.*}} ]
   // CHECK-NEXT: [[ED_CUR]] = getelementptr inbounds [[A]], [[A]]* [[ED_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[ED_CUR]])
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[ED_CUR]])
+  // CHECKv11-NEXT: call   void @_ZN5test01AD1Ev([[A]]* [[ED_CUR]])
   // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[ED_CUR]], [[ED_BEGIN]]
   // CHECK-NEXT: br i1 [[T0]],
   // CHECK:  ret void
@@ -58,7 +60,8 @@
   // CHECK-NEXT: br i1 [[T0]],
   // CHECK:  [[E_AFTER:%.*]] = phi [[A]]* [ [[PARTIAL_END]], {{%.*}} ], [ [[E_CUR:%.*]], {{%.*}} ]
   // CHECK-NEXT: [[E_CUR]] = getelementptr inbounds [[A]], [[A]]* [[E_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv11-NEXT: call   void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
   // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[E_CUR]], [[E_BEGIN]]
   // CHECK-NEXT: br i1 [[T0]],
 
@@ -73,20 +76,21 @@
   // FIXME: There's some really bad block ordering here which causes
   // the partial destroy for the primary normal destructor to fall
   // within the primary EH destructor.
-  // CHECK:  landingpad { i8*, i32 }
-  // CHECK-NEXT:   cleanup
-  // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[ED_BEGIN]], [[ED_CUR]]
-  // CHECK-NEXT: br i1 [[T0]]
-  // CHECK:  [[EDD_AFTER:%.*]] = phi [[A]]* [ [[ED_CUR]], {{%.*}} ], [ [[EDD_CUR:%.*]], {{%.*}} ]
-  // CHECK-NEXT: [[EDD_CUR]] = getelementptr inbounds [[A]], [[A]]* [[EDD_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[EDD_CUR]])
-  // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[EDD_CUR]], [[ED_BEGIN]]
-  // CHECK-NEXT: br i1 [[T0]]
+  // CHECKv03:  landingpad { i8*, i32 }
+  // CHECKv03-NEXT:   cleanup
+  // CHECKv03:  [[T0:%.*]] = icmp eq [[A]]* [[ED_BEGIN]], [[ED_CUR]]
+  // CHECKv03-NEXT: br i1 [[T0]]
+  // CHECKv03:  [[EDD_AFTER:%.*]] = phi [[A]]* [ [[ED_CUR]], {{%.*}} ], [ [[EDD_CUR:%.*]], {{%.*}} ]
+  // CHECKv03-NEXT: [[EDD_CUR]] = getelementptr inbounds [[A]], [[A]]* [[EDD_AFTER]], i64 -1
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[EDD_CUR]])
+  // CHECKv03:  [[T0:%.*]] = icmp eq [[A]]* [[EDD_CUR]], [[ED_BEGIN]]
+  // CHECKv03-NEXT: br i1 [[T0]]
 
   // Back to the primary EH destructor.
   // CHECK:  [[E_AFTER:%.*]] = phi [[A]]* [ [[E_END]], {{%.*}} ], [ [[E_CUR:%.*]], {{%.*}} ]
   // CHECK-NEXT: [[E_CUR]] = getelementptr inbounds [[A]], [[A]]* [[E_AFTER]], i64 -1
-  // CHECK-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv03-NEXT: invoke void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
+  // CHECKv11-NEXT: call   void @_ZN5test01AD1Ev([[A]]* [[E_CUR]])
   // CHECK:  [[T0:%.*]] = icmp eq [[A]]* [[E_CUR]], [[E0]]
   // CHECK-NEXT: br i1 [[T0]],
 
@@ -120,8 +124,10 @@
   // CHECK-NEXT:   cleanup
   // CHECK:  landingpad { i8*, i32 }
   // CHECK-NEXT:   cleanup
-  // CHECK:  invoke void @_ZN5test11AD1Ev([[A]]* [[Y]])
-  // CHECK:  invoke void @_ZN5test11AD1Ev([[A]]* [[X]])
+  // CHECKv03:  invoke void @_ZN5test11AD1Ev([[A]]* [[Y]])
+  // CHECKv03:  invoke void @_ZN5test11AD1Ev([[A]]* [[X]])
+  // CHECKv11:  call   void @_ZN5test11AD1Ev([[A]]* [[Y]])
+  // CHECKv11:  call   void @_ZN5test11AD1Ev([[A]]* [[X]])
 }
 
 namespace test2 {
@@ -153,7 +159,8 @@
 // CHECK-NEXT: br i1 [[EMPTY]],
 // CHECK:  [[PAST:%.*]] = phi [[A]]* [ [[CUR]], {{%.*}} ], [ [[DEL:%.*]], {{%.*}} ]
 // CHECK-NEXT: [[DEL]] = getelementptr inbounds [[A]], [[A]]* [[PAST]], i64 -1
-// CHECK-NEXT: invoke void @_ZN5test21AD1Ev([[A]]* [[DEL]])
+// CHECKv03-NEXT: invoke void @_ZN5test21AD1

[PATCH] D27956: Make CodeGenCXX/stack-reuse-miscompile.cpp tolerate C++11

2016-12-20 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL290208: Make a test use a specific C++ dialect (authored by 
probinson).

Changed prior to commit:
  https://reviews.llvm.org/D27956?vs=82020&id=82160#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D27956

Files:
  cfe/trunk/test/CodeGenCXX/stack-reuse-miscompile.cpp


Index: cfe/trunk/test/CodeGenCXX/stack-reuse-miscompile.cpp
===
--- cfe/trunk/test/CodeGenCXX/stack-reuse-miscompile.cpp
+++ cfe/trunk/test/CodeGenCXX/stack-reuse-miscompile.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang -S -target armv7l-unknown-linux-gnueabihf -emit-llvm -O1 -mllvm -disable-llvm-optzns -S %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple armv7l-unknown-linux-gnueabihf -emit-llvm -O1 -disable-llvm-optzns -std=c++03 %s -o - | FileCheck %s
 
 // This test should not to generate llvm.lifetime.start/llvm.lifetime.end for
 // f function because all temporary objects in this function are used for the

[PATCH] D27794: Make some diagnostic tests C++11 clean

2016-12-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson marked an inline comment as done.
probinson added a comment.

FIXME added.


https://reviews.llvm.org/D27794




[PATCH] D27794: Make some diagnostic tests C++11 clean

2016-12-21 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL290262: Make some diagnostic tests C++11 clean. (authored by 
probinson).

Changed prior to commit:
  https://reviews.llvm.org/D27794?vs=81977&id=82248#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D27794

Files:
  cfe/trunk/test/FixIt/fixit.cpp
  cfe/trunk/test/Parser/backtrack-off-by-one.cpp
  cfe/trunk/test/SemaCXX/copy-assignment.cpp

Index: cfe/trunk/test/FixIt/fixit.cpp
===
--- cfe/trunk/test/FixIt/fixit.cpp
+++ cfe/trunk/test/FixIt/fixit.cpp
@@ -1,8 +1,12 @@
-// RUN: %clang_cc1 -pedantic -Wall -Wno-comment -verify -fcxx-exceptions -x c++ %s
+// RUN: %clang_cc1 -pedantic -Wall -Wno-comment -verify -fcxx-exceptions -x c++ -std=c++98 %s
+// RUN: cp %s %t-98
+// RUN: not %clang_cc1 -pedantic -Wall -Wno-comment -fcxx-exceptions -fixit -x c++ -std=c++98 %t-98
+// RUN: %clang_cc1 -fsyntax-only -pedantic -Wall -Werror -Wno-comment -fcxx-exceptions -x c++ -std=c++98 %t-98
 // RUN: not %clang_cc1 -fsyntax-only -fdiagnostics-parseable-fixits -x c++ -std=c++11 %s 2>&1 | FileCheck %s
-// RUN: cp %s %t
-// RUN: not %clang_cc1 -pedantic -Wall -Wno-comment -fcxx-exceptions -fixit -x c++ %t
-// RUN: %clang_cc1 -fsyntax-only -pedantic -Wall -Werror -Wno-comment -fcxx-exceptions -x c++ %t
+// RUN: %clang_cc1 -pedantic -Wall -Wno-comment -verify -fcxx-exceptions -x c++ -std=c++11 %s
+// RUN: cp %s %t-11
+// RUN: not %clang_cc1 -pedantic -Wall -Wno-comment -fcxx-exceptions -fixit -x c++ -std=c++11 %t-11
+// RUN: %clang_cc1 -fsyntax-only -pedantic -Wall -Werror -Wno-comment -fcxx-exceptions -x c++ -std=c++11 %t-11
 
 /* This is a test of the various code modification hints that are
provided as part of warning or extension diagnostics. All of the
@@ -21,7 +25,11 @@
 
 template struct CT { template struct Inner; }; // expected-note{{previous use is here}}
 
+// FIXME: In C++11 this gets 'expected unqualified-id' which fixit can't fix.
+// Probably parses as `CT<10> > 2 > ct;` rather than `CT<(10 >> 2)> ct;`.
+#if __cplusplus < 201103L
 CT<10 >> 2> ct; // expected-warning{{require parentheses}}
+#endif
 
 class C3 {
 public:
@@ -41,7 +49,11 @@
 };
 
 class B : public A {
+#if __cplusplus >= 201103L
+  A::foo; // expected-error{{ISO C++11 does not allow access declarations}}
+#else
   A::foo; // expected-warning{{access declarations are deprecated}}
+#endif
 };
 
 void f() throw(); // expected-note{{previous}}
@@ -285,8 +297,10 @@
 void (*p)() = &t;
 (void)(&t==p); // expected-error {{use '> ='}}
 (void)(&t>=p); // expected-error {{use '> >'}}
+#if __cplusplus < 201103L
 (void)(&t>>=p); // expected-error {{use '> >'}}
 (Shr)&t>>>=p; // expected-error {{use '> >'}}
+#endif
 
 // FIXME: We correct this to '&t > >= p;' not '&t >>= p;'
 //(Shr)&t>>=p;
Index: cfe/trunk/test/Parser/backtrack-off-by-one.cpp
===
--- cfe/trunk/test/Parser/backtrack-off-by-one.cpp
+++ cfe/trunk/test/Parser/backtrack-off-by-one.cpp
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 -verify %s
+// RUN: %clang_cc1 -verify %s -std=c++98
+// RUN: %clang_cc1 -verify %s -std=c++11
 
 // PR25946
 // We had an off-by-one error in an assertion when annotating A below.  Our
@@ -10,8 +12,10 @@
 
 // expected-error@+1 {{expected '{' after base class list}}
 template  class B : T // not ',' or '{'
-// expected-error@+3 {{C++ requires a type specifier for all declarations}}
-// expected-error@+2 {{expected ';' after top level declarator}}
+#if __cplusplus < 201103L
+// expected-error@+4 {{expected ';' after top level declarator}}
+#endif
+// expected-error@+2 {{C++ requires a type specifier for all declarations}}
 // expected-error@+1 {{expected ';' after class}}
 A {
 };
Index: cfe/trunk/test/SemaCXX/copy-assignment.cpp
===
--- cfe/trunk/test/SemaCXX/copy-assignment.cpp
+++ cfe/trunk/test/SemaCXX/copy-assignment.cpp
@@ -1,12 +1,22 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s 
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++98
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++11
+
+#if __cplusplus >= 201103L
+// expected-note@+3 2 {{candidate constructor}}
+// expected-note@+2 {{passing argument to parameter here}}
+#endif
 struct A {
 };
 
 struct ConvertibleToA {
   operator A();
 };
 
 struct ConvertibleToConstA {
+#if __cplusplus >= 201103L
+// expected-note@+2 {{candidate function}}
+#endif
   operator const A();
 };
 
@@ -69,6 +79,9 @@
   na = a;
   na = constA;
   na = convertibleToA;
+#if __cplusplus >= 201103L
+// expected-error@+2 {{no viable conversion}}
+#endif
   na = convertibleToConstA;
   na += a; // expected-error{{no viable overloaded '+='}}
 

[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Maybe instead, pass a flag to enable setting optnone on everything when the 
driver sees `-O0 -flto`?  The patch as-is obviously has a massive testing cost, 
and it's easy to imagine people being tripped up by this in the future.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#638221, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#638217, @probinson wrote:
>
> > The patch as-is obviously has a massive testing cost, and it's easy to 
> > imagine people being tripped up by this in the future.
>
>
> Can you clarify what massive testing cost you're referring to?

Well, you just had to modify around 50 tests, and I'd expect some future tests 
to have to deal with it too.  Maybe "massive" is overstating it but it seemed 
like an unusually large number.

I don't know that just slapping the option on all these tests is really the 
most appropriate fix, either, in some cases.  I'll look at it more.

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#638350, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#638299, @probinson wrote:
>
> > In https://reviews.llvm.org/D28404#638221, @mehdi_amini wrote:
> >
> > > In https://reviews.llvm.org/D28404#638217, @probinson wrote:
> > >
> > > > The patch as-is obviously has a massive testing cost, and it's easy to 
> > > > imagine people being tripped up by this in the future.
> > >
> > >
> > > Can you clarify what massive testing cost you're referring to?
> >
> >
> > Well, you just had to modify around 50 tests, and I'd expect some future 
> > tests to have to deal with it too.  Maybe "massive" is overstating it but 
> > it seemed like an unusually large number.
>
>
> There are two things:
>
> - tests are modified: when adding a new option, it does not seem unusual to 
> me


50 seems rather more than usual, but whatever.  Granted it's not hundreds.

> - what impact on future testing. I still don't see any of this future 
> "testing cost" you're referring to right now.

Maybe I worry too much.

I am getting a slightly different set of test failures than you did though.  I 
get these failures:
CodeGen/aarch64-neon-extract.c
CodeGen/aarch64-poly128.c
CodeGen/arm-neon-shifts.c
CodeGen/arm64-crc32.c

And I don't get these failures:
CodeGenCXX/apple-kext-indirect-virtual-dtor-call.cpp
CodeGenCXX/apple-kext-no-staticinit-section.cpp
CodeGenCXX/debug-info-global-ctor-dtor.cpp




Comment at: clang/lib/CodeGen/CodeGenModule.cpp:900
+// OptimizeNone implies noinline; we should not be inlining such
+// functions.
 B.addAttribute(llvm::Attribute::NoInline);

I'd set ShouldAddOptNone = false here, as it's already explicit.



Comment at: clang/test/CodeGen/aarch64-neon-2velem.c:1
-// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
+// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon -disable-O0-optnone -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
 

Option specified twice.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:962
+  ShouldAddOptNone &= !D->hasAttr();
+  ShouldAddOptNone &= !D->hasAttr();
+  ShouldAddOptNone &= !F->hasFnAttribute(llvm::Attribute::AlwaysInline);

chandlerc wrote:
> why is optnone incompatible with *cold*
Because cold implies OptimizeForSize (just above this).  I take no position on 
whether that is reasonable.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:896
+  !CodeGenOpts.DisableO0ImplyOptNone && CodeGenOpts.OptimizationLevel == 0;
+  // We can't add optnone in the following cases, it won't pass the verifier
+  ShouldAddOptNone &= !D->hasAttr();

Period at the end of a comment.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:900
+  ShouldAddOptNone &= !D->hasAttr();
+  if (ShouldAddOptNone) {
+B.addAttribute(llvm::Attribute::OptimizeNone);

This block is redundant now?  The same things are added in the next if block.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:910-912
 // OptimizeNone wins over OptimizeForSize and MinSize.
 F->removeFnAttr(llvm::Attribute::OptimizeForSize);
 F->removeFnAttr(llvm::Attribute::MinSize);

mehdi_amini wrote:
> chandlerc wrote:
> > Is this still at all correct? Why? it seems pretty confusing especially in 
> > conjunction with the code below.
> > 
> > 
> > I think this may force you to either:
> > a) stop early-marking of -Os and -Oz flags with these attributes (early: 
> > prior to calling this routine) and handling all of the -O flag synthesized 
> > attributes here, or
> > b) set optnone for -O0 where we set optsize for -Os and friends, and then 
> > remove it where necessary here.
> > 
> > I don't have any strong opinion about a vs. b.
> I believe it is still correct: during Os/Oz we reach this point and figure 
> that there is `__attribute__((optnone))` in the *source* (not `-O0`), we 
> remove the attributes, nothing changes. Did I miss something?
> 
Hmmm the Os/Oz attributes are added in CGCall.cpp, and are guarded with a check 
on the presence of the Optnone source attribute, so if the Optnone source 
attribute is present we should never see these.  And Os/Oz set 
OptimizationLevel to 2, which is not zero, so we won't come through here for 
ShouldAddOptNone reasons either.
Therefore these 'remove' calls should be no-ops and could be removed.  (For 
paranoia you could turn them into asserts, and do some experimenting to see 
whether I'm confused about how this all fits together.)
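
The reasoning above can be checked against a small standalone model. This is 
only a sketch: `FnModel`, `OptFlag`, `emitSizeAttrs`, `applyOptNone` and 
`build` are hypothetical stand-ins, not the actual CGCall.cpp or 
CodeGenModule.cpp code. It assumes, as described above, that -Os/-Oz report 
OptimizationLevel 2 and that the size attributes are skipped when the source 
carries optnone. The `cold` case is left out.

```cpp
#include <cassert>

// Hypothetical stand-in for the attributes on one llvm::Function.
struct FnModel {
  bool OptimizeForSize = false;
  bool MinSize = false;
  bool OptimizeNone = false;
  bool NoInline = false;
};

// Hypothetical summary of the -O flag given on the command line.
enum class OptFlag { O0, O2, Os, Oz };

// Models the CGCall.cpp step as described above: size attributes are only
// added when the declaration has no source-level optnone attribute.
void emitSizeAttrs(FnModel &F, bool OptSize, bool OptMinSize, bool SrcOptNone) {
  if (SrcOptNone)
    return;
  F.OptimizeForSize = OptSize;
  F.MinSize = OptMinSize;
}

// Models the later step: optnone comes from the source attribute, or from
// -O0 (OptimizationLevel == 0) when that behavior is not disabled.
void applyOptNone(FnModel &F, unsigned OptimizationLevel, bool SrcOptNone,
                  bool DisableO0ImplyOptNone) {
  bool ShouldAddOptNone = !DisableO0ImplyOptNone && OptimizationLevel == 0;
  if (SrcOptNone || ShouldAddOptNone) {
    // If the reasoning holds, there is nothing left to remove here, so the
    // 'remove' calls become an assertion.
    assert(!F.OptimizeForSize && !F.MinSize);
    F.OptimizeNone = true;
    F.NoInline = true; // OptimizeNone implies noinline.
  }
}

// Runs both steps for one function and returns the resulting attributes.
FnModel build(OptFlag Flag, bool SrcOptNone) {
  unsigned Level = (Flag == OptFlag::O0) ? 0 : 2; // assumed: Os/Oz map to 2
  bool OptSize = (Flag == OptFlag::Os || Flag == OptFlag::Oz);
  bool OptMinSize = (Flag == OptFlag::Oz);
  FnModel F;
  emitSizeAttrs(F, OptSize, OptMinSize, SrcOptNone);
  applyOptNone(F, Level, SrcOptNone, /*DisableO0ImplyOptNone=*/false);
  return F;
}
```

In this model, calling `build` with any of the four flags and either value of 
`SrcOptNone` never trips the assertion, which is the property the asserts in 
the real code would be probing for.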


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Over the weekend I had a thought:  Why is -O0 so special here?  That is, after 
going to all this trouble to propagate -O0 to LTO, how does this generalize to 
propagating -O1 or any other specific -O option?  (Maybe this question would be 
better dealt with on the dev list...)


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#639887, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#639874, @probinson wrote:
>
> > Over the weekend I had a thought:  Why is -O0 so special here?  That is, 
> > after going to all this trouble to propagate -O0 to LTO, how does this 
> > generalize to propagating -O1 or any other specific -O option?  (Maybe this 
> > question would be better dealt with on the dev list...)
>
>
> O0 is "special" like Os and Oz because we have an attribute for it and passes 
> "know" how to handle this attribute.
>  I guess no-one cares enough about O1/O2/O3 to find a solution for these (in 
> the context of LTO, I don't really care about O1/O2).
>  It is likely that Og would need a special treatment at some point, maybe 
> with a new attribute as well, to inhibit optimization that can't preserve 
> debug info properly.


"I don't care" doesn't seem like much of a principle.

Optnone does not equal -O0.  It is a debugging aid for the programmer, because 
debugging optimized code sucks.  If you have an LTO-built application and want 
to de-optimize parts of it to aid with debugging, then you can use the pragma, 
as originally intended.  I don't think `-c -O0 -flto` should get this 
not-entirely-O0-like behavior.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640090, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#640046, @probinson wrote:
>
> > "I don't care" doesn't seem like much of a principle.
>
>
> Long version is: "There is no use-case, no users, so I don't have much 
> motivation to push it forward for the only sake of completeness". Does it 
> sound enough of a principle like that?

No.  You still need to have adequate justification for your use case, which I 
think you do not.

>> Optnone does not equal -O0.  It is a debugging aid for the programmer, 
>> because debugging optimized code sucks.  If you have an LTO-built 
>> application and want to de-optimize parts of it to aid with debugging, then 
>> you can use the pragma, as originally intended.
> 
> Having to modify the source isn't friendly. Not being able to honor -O0 
> during LTO is not user-friendly.

IMO, '-O0' and '-flto' are conflicting options and therefore not deserving of 
special support.

In my experience, modifying source is by far simpler than hacking a build 
system to make a special case for compiler options for one module in an 
application.  (If you have a way to build Clang with everything done LTO except 
one module built with -O0, on Linux with ninja, I would be very curious to hear 
how you do that.)  But if your build system makes that easy, you can just as 
easily remove `-flto` as add `-O0` and thus get the result you want without 
trying to pass conflicting options to the compiler.  Or spending time 
implementing this patch.

>>   I don't think `-c -O0` should get this not-entirely-O0-like behavior.
> 
> What is "not-entirely"? And why do you think that?

"Not entirely" means that running the -O0 pipeline, and running an optimization 
pipeline but asking some subset of passes to turn themselves off, does not get 
you the same result.  And I think that because I'm the one who put 'optnone' 
upstream in the first place.  The case that particularly sticks in my memory is 
the register allocator, but I believe there are passes at every stage that do 
not turn themselves off for optnone.
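
As a toy illustration of that difference (this is not LLVM's pass manager; the 
`Pass` struct, the pass names, and the `HonorsOptNone` flags are all invented 
for the sketch), consider a pipeline in which only some passes check for 
optnone, mirroring the register allocator case mentioned above:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pass description: whether the pass skips optnone functions.
struct Pass {
  std::string Name;
  bool HonorsOptNone;
};

// Returns the names of the passes that would actually run on one function.
std::vector<std::string> passesRun(const std::vector<Pass> &Pipeline,
                                   bool FnIsOptNone) {
  std::vector<std::string> Ran;
  for (const Pass &P : Pipeline)
    if (!(FnIsOptNone && P.HonorsOptNone))
      Ran.push_back(P.Name);
  return Ran;
}

// Invented pipelines, for illustration only.
const std::vector<Pass> O0Pipeline = {{"fast-regalloc", false}};
const std::vector<Pass> OptPipeline = {
    {"inline", true}, {"gvn", true}, {"greedy-regalloc", false}};
```

With these invented pipelines, `passesRun(OptPipeline, true)` and 
`passesRun(O0Pipeline, true)` return different lists, so an optnone function 
sent through the optimizing pipeline is not processed the way it would be at 
-O0.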

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640170, @probinson wrote:

> In my experience, modifying source


Note that the source modification consists of adding `#pragma clang optimize 
off` to the top of the file.  It is not a complicated thing.
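
A minimal sketch of what that looks like, assuming a hypothetical function 
`sum_to` in the module being debugged. The closing `optimize on` is optional 
and only shown to mark where the de-optimized region ends:

```cpp
#include <cassert>

// Functions defined after this pragma are treated as if they carried
// __attribute__((optnone)). Compilers other than Clang ignore the unknown
// pragma, possibly with a warning.
#pragma clang optimize off

// Hypothetical function under investigation; with the pragma in effect it is
// not optimized, which makes stepping through it in a debugger easier.
int sum_to(int n) {
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}

// Optional: restore normal optimization for the rest of the file.
#pragma clang optimize on
```

The rest of the application can still be built with whatever options the 
build system already uses; only this file changes.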


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640178, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#640170, @probinson wrote:
>
> > In https://reviews.llvm.org/D28404#640090, @mehdi_amini wrote:
> >
> > > In https://reviews.llvm.org/D28404#640046, @probinson wrote:
> > >
> > > > "I don't care" doesn't seem like much of a principle.
> > >
> > >
> > > Long version is: "There is no use-case, no users, so I don't have much 
> > > motivation to push it forward for the only sake of completeness". Does it 
> > > sound enough of a principle like that?
> >
> >
> > No.  You still need to have adequate justification for your use case, which 
> > I think you do not.
>
>
> I don't follow your logic. 
>  IIUC, you asked "why not support `O1/O2/O3`?"; how is *not 
> supporting* these, because they're not useful / don't have a use case, related 
> to "supporting `O0` is useful"?


Upfront, it seemed peculiar to handle only one optimization level.  After more 
thought, the whole idea of mixing -O0 and LTO seems wrong.  Sorry, should have 
signaled that I had changed my mind about it.

>>>> Optnone does not equal -O0.  It is a debugging aid for the programmer, 
>>>> because debugging optimized code sucks.  If you have an LTO-built 
>>>> application and want to de-optimize parts of it to aid with debugging, 
>>>> then you can use the pragma, as originally intended.
>>> 
>>> Having to modify the source isn't friendly. Not being able to honor -O0 
>>> during LTO is not user-friendly.
>> 
>> IMO, '-O0' and '-flto' are conflicting options and therefore not deserving 
>> of special support.
> 
> You're advocating for *rejecting* O0 built module at link-time? We'd still 
> need to detect this though. Status-quo isn't acceptable.
>  Also, that's not practicable: what if I have an LTO static library for which 
> I don't have the source, now if I build my own file with -O0 -flto I can't 
> link anymore.

No, I'm saying they are conflicting options on the same Clang command line.
As long as your linker can handle foo.o and bar.bc on the same command line, 
not a problem.  (If your linker can't handle that, fix the linker first.)

>> In my experience, modifying source is by far simpler than hacking a build 
>> system to make a special case for compiler options for one module in an 
>> application.  (If you have a way to build Clang with everything done LTO 
>> except one module built with -O0, on Linux with ninja, I would be very 
>> curious to hear how you do that.)
> 
> Static library, separated projects, etc.
>  We have tons of users...

Still waiting.  Your up-front use case was about de-optimizing a module to 
assist debugging it within an LTO-built application, not building entire 
projects one way versus another.  If that is not actually your use case, you 
need to start over with the correct description.

>>>>   I don't think `-c -O0` should get this not-entirely-O0-like behavior.
>>> 
>>> What is "not-entirely"? And why do you think that?
>> 
>> "Not entirely" means that running the -O0 pipeline, and running an 
>> optimization pipeline but asking some subset of passes to turn themselves 
>> off, does not get you the same result.  And I think that because I'm the one 
>> who put 'optnone' upstream in the first place.  The case that particularly 
>> sticks in my memory is the register allocator, but I believe there are 
>> passes at every stage that do not turn themselves off for optnone.
> 
> That's orthogonal: you're saying we are not handling it correctly yet, I'm 
> just moving toward *fixing* all these.

It's not orthogonal; that's exactly how 'optnone' behaves today.  If you have 
proposed a redesign of how to mix optnone and non-optnone functions in the same 
compilation unit, in some way other than what's done today, I am not aware of 
it; can you point to your proposal?


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640182, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#640178, @mehdi_amini wrote:
>
> > Also, that's not practicable: what if I have an LTO static library for 
> > which I don't have the source, now if I build my own file with -O0 -flto I 
> > can't link anymore.
>
>
> Also: LTO is required for some features like CFI. There are users who want 
> CFI+O0 during development (possibly for debugging a subcomponent of the app).


Sorry, you lost me.  CFI is part of DWARF and we do DWARF perfectly well 
without LTO (and at O0).


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640314, @mehdi_amini wrote:

> I don't follow: IMO if I generate a module with optnone and pipe it to `opt 
> -O3` I expect no function IR to be touched. If it is not the case it is a bug.

Your opinion and expectation are not supported by the IR spec.  Optnone skips 
"most" optimization passes.  It is not practical (or was not, at the time) to 
make the -O3 pipeline behave exactly the same as the -O0 pipeline, and also not 
actually necessary to support the purpose for which 'optnone' was invented.

If you have a goal of making 'optnone' functions use the actual -O0 pipeline, 
while non-optnone functions use the optimizing pipeline, more power to you and 
you will need to take up that particular design challenge with Chandler first.

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640314, @mehdi_amini wrote:

> You just wrote above that " mixing -O0 and LTO " is wrong, *if* I were to 
> agree with you at some point, then I'd make it a hard error.

Yes, I was not clear that I meant that `-O0 -flto` on the same clang command 
line just seems nonsensical.  "Optimize my program without optimizing it" 
forsooth.

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640314, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#640284, @probinson wrote:
>
> > Upfront, it seemed peculiar to handle only one optimization level.  After 
> > more thought, the whole idea of mixing -O0 and LTO seems wrong.  Sorry, 
> > should have signaled that I had changed my mind about it.
>
>
> You just haven't articulated 1) why it is wrong and 2) what should we do 
> about it.

"Optimize without optimizing" really?  Does not sound confused to you?  
Persuade me why it makes sense.

If it doesn't make sense, then yes making the `-O0 -flto` combination an error 
would be the right path.

Unless you are taking the position that `-flto` doesn't mean "use LTO" and 
instead means something else, like "emit bitcode" in which case you should be 
advocating to change the name of the option to say what it means.

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640314, @mehdi_amini wrote:

> In https://reviews.llvm.org/D28404#640284, @probinson wrote:
>
> > In https://reviews.llvm.org/D28404#640178, @mehdi_amini wrote:
> >
> > > In https://reviews.llvm.org/D28404#640170, @probinson wrote:
> > >
> > > > In my experience, modifying source is by far simpler than hacking a 
> > > > build system to make a special case for compiler options for one module 
> > > > in an application.  (If you have a way to build Clang with everything 
> > > > done LTO except one module built with -O0, on Linux with ninja, I would 
> > > > be very curious to hear how you do that.)
> > >
> > >
> > > Static library, separated projects, etc.
> > >  We have tons of users...
> >
> >
> > Still waiting.
>
>
> Waiting for what?
>  We have use-cases, I gave you a few (vendor static libraries are one). 
> Again, if you think it is wrong to support O0 and LTO, then please elaborate.

Your original use-case described debugging a module in an application.  You 
claimed it was simpler to change the build options for a module than to change 
the source, and I am still waiting to hear how or why that is simpler.

Your subsequent use cases are about entire sub-projects, which is a different 
matter, orthogonal to where you started.  Please elaborate on the original use 
case.

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Basically, I don't see why having clang always emit a real .o at -O0 would be a 
problem.
I haven't gotten through the other-CFI documentation yet though.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640362, @probinson wrote:

> In https://reviews.llvm.org/D28404#640314, @mehdi_amini wrote:
>
> > I don't follow: IMO if I generate a module with optnone and pipe it to `opt 
> > -O3` I expect no function IR to be touched. If it is not the case it is a 
> > bug.
>
>
> Your opinion and expectation are not supported by the IR spec.  Optnone skips 
> "most" optimization passes.  It is not practical (or was not, at the time) to 
> make the -O3 pipeline behave exactly the same as the -O0 pipeline, and also 
> not actually necessary to support the purpose for which 'optnone' was 
> invented.
>
> If you have a goal of making 'optnone' functions use the actual -O0 pipeline, 
> while non-optnone functions use the optimizing pipeline, more power to you 
> and you will need to take up that particular design challenge with Chandler 
> first.


Oh, maybe you are thinking of eliminating the -O0 pipeline?  Because if -O0 
implies optnone then it's kinda-sorta the same thing as the optimization 
pipeline operating on nothing but optnone functions?  I'd think that would make 
-O0 compilations slow down, which would not be a feature.


https://reviews.llvm.org/D28404




[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640588, @mehdi_amini wrote:

> Actually, as mentioned before, I could be fine with making `O0` incompatible 
> with LTO, however security features like CFI (or other sort of whole-program 
> analyses/instrumentations) requires LTO.

Well, "requires LTO" is overstating the case, AFAICT from the link you gave me. 
It doesn't depend on //optimization// at all.  It depends on some 
interprocedural analyses given some particular scope/visibility boundary, which 
it is convenient to define as a set of linked bitcode modules, which by some 
happy chance is the same set of linked bitcode modules that LTO will operate on.

If it's important to support combining a bitcode version of my-application with 
your-bitcode-library for this CFI or whatever, and you also want to let me have 
my-application be unoptimized while your-bitcode-library gets optimized, NOW we 
have a use-case.  (Maybe that's what you had in mind earlier, but for some 
reason I wasn't able to extract that out of any prior comments.  No matter.)

I'm now thinking along the lines of a `-foptimize-off` flag (bikesheds welcome) 
which would set the default for the pragma to 'off'.  How is that different 
than what you wanted for `-O0`?  It is defined in terms of an existing pragma, 
which is WAY easier to explain and WAY easier to implement.  And, it still lets 
us say that `-c -O0 -flto` is a mistake, if that seems like a useful thing to 
say.

Does that seem reasonable?  Fit your understanding of the needs?
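
For reference, a minimal sketch of the existing per-function controls that such 
a flag would build on: `__attribute__((optnone))` on one function, and 
`#pragma clang optimize off` / `on` around a region. The `EXAMPLE_OPTNONE` 
macro and the function names are made up for illustration, and the 
Clang-specific pieces are guarded so the file still compiles elsewhere. A flag 
like `-foptimize-off` (a proposal here, not an existing option) would behave as 
if the pragma were in effect from the top of the file.

```cpp
#include <cassert>

// Clang-specific attribute, wrapped in a hypothetical macro for portability.
#if defined(__clang__)
#define EXAMPLE_OPTNONE __attribute__((optnone))
#else
#define EXAMPLE_OPTNONE
#endif

// Marked individually: Clang emits this function with the optnone attribute.
EXAMPLE_OPTNONE int sum_to(int n) {
  assert(n >= 0);
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}

#if defined(__clang__)
#pragma clang optimize off
#endif
// Defined inside the pragma region: treated as if it were marked optnone.
int square(int x) { return x * x; }
#if defined(__clang__)
#pragma clang optimize on
#endif

// Outside the region: optimized according to the command-line level.
int twice(int x) { return 2 * x; }
```

Neither marking changes what the functions compute; only how aggressively the 
compiler transforms them.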

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-09 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#640682, @mehdi_amini wrote:

> > I'm now thinking along the lines of a `-foptimize-off` flag (bikesheds 
> > welcome) which would set the default for the pragma to 'off'.  How is that 
> > different than what you wanted for `-O0`?  It is defined in terms of an 
> > existing pragma, which is WAY easier to explain and WAY easier to 
> > implement.  And, it still lets us say that `-c -O0 -flto` is a mistake, if 
> > that seems like a useful thing to say.
>
> Well -O0 being actually "disable optimization", I found "way easier" to 
> handle everything the same way (pragma, command line, etc.). I kind of find 
> it confusing for the user to differentiate `-O0` from `-foptimize=off`. What 
> is supposed to change between the two?

There is a pedantic difference, rooted in the still-true factoid that O0 != 
optnone.
If we redefine LTO as "Link Time Operation" (rather than Optimization; see my 
reply to Duncan), then `-O0 -flto` is no longer an oxymoron, but using the 
attribute to imply the optimization level is still not faithful to what the 
user asked for.

https://reviews.llvm.org/D28404


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-01-10 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In https://reviews.llvm.org/D28404#641078, @chandlerc wrote:

> For me, the arguments you're raising against -O0 and -flto don't hold up on 
> closer inspection:
>
> - O0 != optnone: correct. But this is only visible in LTO. And in LTO, Os != 
> optsize, and Oz != minsize. But we use optsize and minsize to communicate 
> between the compilation and the LTO step to the best of our ability the 
> intent of the programmer. It appears we can use optnone exactly the same way 
> here.

If the design decision is that relevant optimization controls are propagated 
into bitcode as function attributes, I grumble but concede it will do something 
similar to what was requested.
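
To make that design concrete, here is a toy model of the compile step stamping 
each function with attributes derived from the command-line level, so that a 
later LTO step can read the intent back out of the bitcode. `OptLevel` and 
`attributesFor` are hypothetical names, not Clang's data structures. Pairing 
`noinline` with `optnone` follows the IR rule that optnone requires noinline; 
treating -Oz as both `optsize` and `minsize` is my assumption about the current 
behavior.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of the levels discussed in this thread.
enum class OptLevel { O0, O2, Os, Oz };

// Returns the IR function attributes that would carry the programmer's
// intent from the compile step to the LTO step.
std::set<std::string> attributesFor(OptLevel level) {
  switch (level) {
  case OptLevel::O0:
    // What this patch proposes; optnone must be accompanied by noinline.
    return {"optnone", "noinline"};
  case OptLevel::Os:
    return {"optsize"};
  case OptLevel::Oz:
    // Assumption: -Oz implies optsize in addition to minsize.
    return {"optsize", "minsize"};
  case OptLevel::O2:
    // No attribute: the level is simply lost at link time.
    return {};
  }
  return {};
}
```

The empty result for `O2` is the gap mentioned next: any level without a 
dedicated attribute (-O1, -Og) has nothing to propagate.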

It does bother me that we keep finding things that LTO needs to know but which 
it does not know because it runs in a separate phase of the workflow.  I hope 
it is not a serious problem to ask "is there a more sensible way to fix this?"  
Maybe I'm not so good at expressing that so it comes out as a question rather 
than an objection, but that's basically what it is.

This design decision leaves -O1/-Og needing yet another attribute, when we get 
around to that, but I suppose Og would not have the 
interaction-with-other-attributes problems that optnone has.

> - optnone isn't *really* no optimizations: clearly this is true, but then 
> neither is -O0. We run the always inliner, a couple of other passes, and we 
> run several parts of the code generators optimizer. I understand why optnone 
> deficiencies (ie, too many optimizations) might be frustrating, but having 
> *more users* seems likely to make this *better*.

We have picked all the low-hanging fruit there, and probably some 
medium-hanging fruit.  Mehdi did have the misunderstanding that optnone == -O0, 
and I think that was worth correcting.

> - There is no use case for -O0 + -flto:

The email thread has an exchange between Duncan and me, where I accept the use 
case.

> But all of this seems like an attempt to argue "you are wrong to have your 
> use case". I personally find that an unproductive line of discussion.

I'm not saying it was *wrong*, just that the description did not convey 
adequate justification.  Listing a few project types does not constitute a use 
case.  We did get to one, eventually, and it even involved differences in 
optimization levels.

> For example, you might ask: could we find some other way to solve the problem 
> you are trying to solve here?

There is another way to make use of the attribute, which I think will be more 
robust:

Have Sema pretend the pragma is in effect at all times, at -O0.  Then all the 
existing conflict detection/resolution logic Just Works, and there's no need to 
spend 4 lines of code hoping to replicate the correct conditions in 
CodeGenModule.

Because Sema does not have a handle on CodeGenOptions and therefore does not 
a-priori know the optimization level, probably the right thing to do is move 
the flag to LangOpts and set it under the correct conditions in 
CompilerInvocation.  It wouldn't be the first codegen-like option in LangOpts.
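
A rough sketch of that shape, using simplified stand-ins rather than the real 
classes: `LangOpts`, `parseLangOpts`, `FunctionDecl`, `applyOptimizeOff` and the 
field `OptimizeOffByDefault` are all invented names for illustration. The 
conflict rule (skip the attribute when the function is already always_inline or 
minsize) is modeled on what the pragma handling does today, as I understand it.

```cpp
#include <cassert>

// Hypothetical stand-in for LangOptions; the field name is invented.
struct LangOpts {
  bool OptimizeOffByDefault = false;
};

// Plays the role of CompilerInvocation: the one place that sees the -O level.
LangOpts parseLangOpts(unsigned optLevel) {
  LangOpts opts;
  opts.OptimizeOffByDefault = (optLevel == 0);
  return opts;
}

// Hypothetical stand-in for a declaration and its attributes.
struct FunctionDecl {
  bool alwaysInline = false;
  bool minSize = false;
  bool optNone = false;
};

// Plays the role of Sema: a single conflict check serves both the pragma
// and the -O0 default, so nothing is replicated in CodeGenModule.
void applyOptimizeOff(const LangOpts &opts, bool pragmaOff, FunctionDecl &fn) {
  if (!pragmaOff && !opts.OptimizeOffByDefault)
    return; // optimization was not turned off by either route
  if (fn.alwaysInline || fn.minSize)
    return; // conflicting attribute already present; leave it alone
  fn.optNone = true;
}
```

With this arrangement the `-O0` case and the pragma case cannot drift apart, 
because they go through the same function.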

https://reviews.llvm.org/D28404

