[PATCH] D147497: [AArch64] Use fneg instead of fsub -0.0, X Cin IR expansion of __builtin_neon_vfmsh_f16.

2023-04-04 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147497/new/

https://reviews.llvm.org/D147497

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136784: [Clang] Improve diagnostic message for loop hint pragma

2022-10-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Parse/ParsePragma.cpp:1306
   StringRef Str = PragmaName.getIdentifierInfo()->getName();
+  if (Str == "loop")
+return (llvm::Twine("clang loop ") + 
Option.getIdentifierInfo()->getName()).str();

eopXD wrote:
> SjoerdMeijer wrote:
> > We've got some duplication now, as "loop" is also checked in line 1310. Can 
> > you see if you can merge these checks, simplify things here a bit?
> Yes, the case for `loop` in below can now be removed. Thank you.
I was hoping that modifying the "loop" case would work:

  .Case("loop", (llvm::Twine("clang loop ") + 
Option.getIdentifierInfo()->getName()).str());




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136784/new/

https://reviews.llvm.org/D136784

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136783: Pre-commit test case for D136784

2022-10-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Looks like a good set of extra tests to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136783/new/

https://reviews.llvm.org/D136783

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136784: [Clang] Improve diagnostic message for loop hint pragma

2022-10-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a reviewer: fhahn.
SjoerdMeijer added inline comments.



Comment at: clang/lib/Parse/ParsePragma.cpp:1306
   StringRef Str = PragmaName.getIdentifierInfo()->getName();
+  if (Str == "loop")
+return (llvm::Twine("clang loop ") + 
Option.getIdentifierInfo()->getName()).str();

We've got some duplication now, as "loop" is also checked in line 1310. Can you 
see if you can merge these checks, simplify things here a bit?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136784/new/

https://reviews.llvm.org/D136784

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136611: [Clang][AArch64] Add TargetParser support for defining CPU aliases

2022-10-25 Thread Sjoerd Meijer via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9bd273047d4f: [Clang][AArch64] Add TargetParser support for 
defining CPU aliases (authored by SjoerdMeijer).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136611/new/

https://reviews.llvm.org/D136611

Files:
  clang/test/Misc/target-invalid-cpu-note.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/AArch64TargetParser.h
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -1295,7 +1295,8 @@
  AArch64::AEK_LSE | AArch64::AEK_RDM,
  "8.2-A")));
 
-static constexpr unsigned NumAArch64CPUArchs = 58;
+// Note: number of CPUs includes aliases.
+static constexpr unsigned NumAArch64CPUArchs = 59;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Support/AArch64TargetParser.cpp
===
--- llvm/lib/Support/AArch64TargetParser.cpp
+++ llvm/lib/Support/AArch64TargetParser.cpp
@@ -85,6 +85,14 @@
   return true;
 }
 
+StringRef AArch64::resolveCPUAlias(StringRef CPU) {
+  return StringSwitch(CPU)
+#define AARCH64_CPU_ALIAS(ALIAS,NAME)  \
+  .Case(ALIAS, NAME)
+#include "../../include/llvm/Support/AArch64TargetParser.def"
+  .Default(CPU);
+}
+
 bool AArch64::getArchFeatures(AArch64::ArchKind AK,
   std::vector ) {
   if (AK == ArchKind::INVALID)
@@ -164,6 +172,9 @@
 if (Arch.ArchID != ArchKind::INVALID)
   Values.push_back(Arch.getName());
   }
+
+  for (const auto : AArch64CPUAliases)
+Values.push_back(Alias.getAlias());
 }
 
 bool AArch64::isX18ReservedByDefault(const Triple ) {
@@ -194,9 +205,17 @@
 }
 
 AArch64::ArchKind AArch64::parseCPUArch(StringRef CPU) {
-  for (const auto  : AArch64CPUNames) {
+  // Resolve aliases first.
+  for (const auto  : AArch64CPUAliases) {
+if (CPU == Alias.getAlias()) {
+  CPU = Alias.getName();
+  break;
+}
+  }
+  // Then find the CPU name.
+  for (const auto  : AArch64CPUNames)
 if (CPU == C.getName())
   return C.ArchID;
-  }
+
   return ArchKind::INVALID;
 }
Index: llvm/include/llvm/Support/AArch64TargetParser.h
===
--- llvm/include/llvm/Support/AArch64TargetParser.h
+++ llvm/include/llvm/Support/AArch64TargetParser.h
@@ -109,6 +109,20 @@
 #include "AArch64TargetParser.def"
 };
 
+const struct {
+  const char *Alias;
+  size_t AliasLength;
+  const char *Name;
+  size_t NameLength;
+
+  StringRef getAlias() const { return StringRef(Alias, AliasLength); }
+  StringRef getName() const { return StringRef(Name, NameLength); }
+} AArch64CPUAliases[] = {
+#define AARCH64_CPU_ALIAS(ALIAS,NAME)  \
+  {ALIAS, sizeof(ALIAS) - 1, NAME, sizeof(NAME) - 1},
+#include "AArch64TargetParser.def"
+};
+
 const ArchKind ArchKinds[] = {
 #define AARCH64_ARCH(NAME, ID, CPU_ATTR, SUB_ARCH, ARCH_ATTR, ARCH_FPU, ARCH_BASE_EXT) \
 ArchKind::ID,
@@ -138,13 +152,7 @@
 StringRef getArchExtName(unsigned ArchExtKind);
 StringRef getArchExtFeature(StringRef ArchExt);
 ArchKind convertV9toV8(ArchKind AK);
-
-// FIXME: We should be able to define CPU aliases in TargetParser.
-inline StringRef resolveCPUAlias(StringRef CPU) {
-  if (CPU == "grace")
-return "neoverse-v2";
-  return CPU;
-}
+StringRef resolveCPUAlias(StringRef CPU);
 
 // Information by Name
 unsigned getDefaultFPU(StringRef CPU, ArchKind AK);
Index: llvm/include/llvm/Support/AArch64TargetParser.def
===
--- llvm/include/llvm/Support/AArch64TargetParser.def
+++ llvm/include/llvm/Support/AArch64TargetParser.def
@@ -315,3 +315,9 @@
 // Invalid CPU
 AARCH64_CPU_NAME("invalid", INVALID, FK_INVALID, true, AArch64::AEK_INVALID)
 #undef AARCH64_CPU_NAME
+
+#ifndef AARCH64_CPU_ALIAS
+#define AARCH64_CPU_ALIAS(ALIAS,NAME)
+#endif
+AARCH64_CPU_ALIAS("grace", "neoverse-v2")
+#undef AARCH64_CPU_ALIAS
Index: clang/test/Misc/target-invalid-cpu-note.c
===
--- clang/test/Misc/target-invalid-cpu-note.c
+++ clang/test/Misc/target-invalid-cpu-note.c
@@ -5,11 +5,11 @@
 
 // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64
 // AARCH64: error: unknown target CPU 'not-a-cpu'
-// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, 

[PATCH] D136425: [Clang][AArch64] Add support for -mcpu=grace

2022-10-24 Thread Sjoerd Meijer via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9ec7448857c1: [Clang][AArch64] Add support for -mcpu=grace 
(authored by SjoerdMeijer).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136425/new/

https://reviews.llvm.org/D136425

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-mcpu.c
  llvm/include/llvm/Support/AArch64TargetParser.h


Index: llvm/include/llvm/Support/AArch64TargetParser.h
===
--- llvm/include/llvm/Support/AArch64TargetParser.h
+++ llvm/include/llvm/Support/AArch64TargetParser.h
@@ -139,6 +139,13 @@
 StringRef getArchExtFeature(StringRef ArchExt);
 ArchKind convertV9toV8(ArchKind AK);
 
+// FIXME: We should be able to define CPU aliases in TargetParser.
+inline StringRef resolveCPUAlias(StringRef CPU) {
+  if (CPU == "grace")
+return "neoverse-v2";
+  return CPU;
+}
+
 // Information by Name
 unsigned getDefaultFPU(StringRef CPU, ArchKind AK);
 uint64_t getDefaultExtensions(StringRef CPU, ArchKind AK);
Index: clang/test/Driver/aarch64-mcpu.c
===
--- clang/test/Driver/aarch64-mcpu.c
+++ clang/test/Driver/aarch64-mcpu.c
@@ -61,6 +61,9 @@
 // RUN: %clang -target aarch64 -mcpu=cortex-r82  -### -c %s 2>&1 | FileCheck 
-check-prefix=CORTEXR82 %s
 // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82"
 
+// RUN: %clang -target aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck 
-check-prefix=GRACE %s
+// GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2"
+
 // == Check whether -mcpu and -mtune accept mixed-case values.
 // RUN: %clang -target aarch64 -mcpu=Cortex-a53 -### -c %s 2>&1 | FileCheck 
-check-prefix=CASE-INSENSITIVE-CA53 %s
 // RUN: %clang -target aarch64 -mtune=Cortex-a53 -### -c %s 2>&1 | FileCheck 
-check-prefix=CASE-INSENSITIVE-CA53-TUNE %s
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -38,6 +38,8 @@
 CPU = Mcpu.split("+").first.lower();
   }
 
+  CPU = llvm::AArch64::resolveCPUAlias(CPU);
+
   // Handle CPU name is 'native'.
   if (CPU == "native")
 return std::string(llvm::sys::getHostCPUName());
@@ -121,6 +123,8 @@
   CPU = Split.first;
   llvm::AArch64::ArchKind ArchKind = llvm::AArch64::ArchKind::ARMV8A;
 
+  CPU = llvm::AArch64::resolveCPUAlias(CPU);
+
   if (CPU == "native")
 CPU = llvm::sys::getHostCPUName();
 


Index: llvm/include/llvm/Support/AArch64TargetParser.h
===
--- llvm/include/llvm/Support/AArch64TargetParser.h
+++ llvm/include/llvm/Support/AArch64TargetParser.h
@@ -139,6 +139,13 @@
 StringRef getArchExtFeature(StringRef ArchExt);
 ArchKind convertV9toV8(ArchKind AK);
 
+// FIXME: We should be able to define CPU aliases in TargetParser.
+inline StringRef resolveCPUAlias(StringRef CPU) {
+  if (CPU == "grace")
+return "neoverse-v2";
+  return CPU;
+}
+
 // Information by Name
 unsigned getDefaultFPU(StringRef CPU, ArchKind AK);
 uint64_t getDefaultExtensions(StringRef CPU, ArchKind AK);
Index: clang/test/Driver/aarch64-mcpu.c
===
--- clang/test/Driver/aarch64-mcpu.c
+++ clang/test/Driver/aarch64-mcpu.c
@@ -61,6 +61,9 @@
 // RUN: %clang -target aarch64 -mcpu=cortex-r82  -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s
 // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82"
 
+// RUN: %clang -target aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s
+// GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2"
+
 // == Check whether -mcpu and -mtune accept mixed-case values.
 // RUN: %clang -target aarch64 -mcpu=Cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CASE-INSENSITIVE-CA53 %s
 // RUN: %clang -target aarch64 -mtune=Cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CASE-INSENSITIVE-CA53-TUNE %s
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -38,6 +38,8 @@
 CPU = Mcpu.split("+").first.lower();
   }
 
+  CPU = llvm::AArch64::resolveCPUAlias(CPU);
+
   // Handle CPU name is 'native'.
   if (CPU == "native")
 return std::string(llvm::sys::getHostCPUName());
@@ -121,6 +123,8 @@
   CPU = Split.first;
   llvm::AArch64::ArchKind ArchKind = llvm::AArch64::ArchKind::ARMV8A;
 
+  CPU = llvm::AArch64::resolveCPUAlias(CPU);
+
   if (CPU == "native")
 CPU = llvm::sys::getHostCPUName();
 

[PATCH] D113779: [Clang] Add mfp16, mfp16fml and mdotprod flags for ARM target features.

2022-09-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D113779#3496589 , @fhahn wrote:

> In D113779#3207936 , @SjoerdMeijer 
> wrote:
>
>>> If anybody has contacts to GCC that would be very helpful. Unfortunately I 
>>> don't think I will be able to drive this.
>>
>> Ok, I will bring this up internally first with some folks that work on GCC 
>> and see what happens. To be continued...
>
> Hi, did you get an update by any chance?

Sorry for the delay. I have just left a message for our GNU toolchain guys with 
an invitation to add comments here.
I know there are strong opinions that -march should be the only way to set 
extensions, but personally I am open to use case that it might be difficult to 
override -march, so I am not blocking this.
I am still of the opinion that options are already an enormous mess, and 
introducing another way is not making things necessarily better. But if we 
document this, and add -m options for existing extensions, it may not be worse. 
So that will be my request for this patch, that we document this somewhere in 
the Clang docs, and then ideally we see the patches for existing extensions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113779/new/

https://reviews.llvm.org/D113779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks

2022-02-22 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, looks good!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116153/new/

https://reviews.llvm.org/D116153

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks

2022-02-21 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Basic/Targets/ARM.cpp:958
   case llvm::ARM::ArchKind::ARMV8_6A:
+  case llvm::ARM::ArchKind::ARMV8_7A:
   case llvm::ARM::ArchKind::ARMV8_8A:

tyb0807 wrote:
> SjoerdMeijer wrote:
> > tyb0807 wrote:
> > > SjoerdMeijer wrote:
> > > > I see tests for the crypto stuff, but is there or do we need a test for 
> > > > whatever `getTargetDefinesARMV83A` is setting?
> > > I'm not sure that we need a test for this, as none of the other 
> > > architectures really have this. What do you think @SjoerdMeijer 
> > > @vhscampos ?
> > I am expecting tests for the ACLE macros that this patch defines for v8.7 
> > to be added to `clang/test/Preprocessor/arm-target-features.c`.
> In that case, for consistency, I think we should also add tests for ACLE 
> macros for other versions (v8.4/5/6/8) as well. And for AArch64 too. It 
> sounds a bit out of scope, but it ensures consistency IMHO
The subject says 

> [ARM][AArch64] Add missing v8.x checks

so looks perfect in scope to me. :)




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116153/new/

https://reviews.llvm.org/D116153

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks

2022-02-21 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Basic/Targets/ARM.cpp:958
   case llvm::ARM::ArchKind::ARMV8_6A:
+  case llvm::ARM::ArchKind::ARMV8_7A:
   case llvm::ARM::ArchKind::ARMV8_8A:

tyb0807 wrote:
> SjoerdMeijer wrote:
> > I see tests for the crypto stuff, but is there or do we need a test for 
> > whatever `getTargetDefinesARMV83A` is setting?
> I'm not sure that we need a test for this, as none of the other architectures 
> really have this. What do you think @SjoerdMeijer @vhscampos ?
I am expecting tests for the ACLE macros that this patch defines for v8.7 to be 
added to `clang/test/Preprocessor/arm-target-features.c`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116153/new/

https://reviews.llvm.org/D116153

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks

2022-02-21 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Basic/Targets/ARM.cpp:958
   case llvm::ARM::ArchKind::ARMV8_6A:
+  case llvm::ARM::ArchKind::ARMV8_7A:
   case llvm::ARM::ArchKind::ARMV8_8A:

I see tests for the crypto stuff, but is there or do we need a test for 
whatever `getTargetDefinesARMV83A` is setting?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116153/new/

https://reviews.llvm.org/D116153

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks

2022-02-21 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Ah, sorry, forgot about this.  Can you upload the patch with some more context 
please so I can have a quick look again?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116153/new/

https://reviews.llvm.org/D116153

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D117753: [AArch64] Support for memset tagged intrinsic

2022-01-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Headers/arm_acle.h:734
+/* Memory Operations Intrinsics */
+#if __ARM_FEATURE_MOPS && __ARM_FEATURE_MEMORY_TAGGING
+#define __arm_mops_memset_tag(tagged_address, value, size) 
\

dmgreen wrote:
> tyb0807 wrote:
> > SjoerdMeijer wrote:
> > > Why does this also need MTE? I think the ACLE specifies this intrinsic to 
> > > be available when __ARM_FEATURE_MOPS is defined?
> > Yes you are right, thanks for spotting this.
> Hmm. These map to SETMG, and those instructions require MTE. It wouldn't make 
> sense to have an intrinsic that cannot be emitted to a valid instruction. I 
> think the spec might be wrong, to be honest.
Ah yeah, thanks Dave.  These are the tag setting variants, I didn't look 
careful enough, then probably got confused about the ACLE which I agree must be 
wrong. Would be good to double check that first.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117753/new/

https://reviews.llvm.org/D117753

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D117753: [AArch64] Support for memset tagged intrinsic

2022-01-26 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117753/new/

https://reviews.llvm.org/D117753

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D117753: [AArch64] Support for memset tagged intrinsic

2022-01-25 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/test/CodeGen/aarch64-mops.c:3
+
+// RUN: %clang_cc1 -triple aarch64-arm-unknown-eabi -target-feature +mops -S 
-emit-llvm -o - %s  | FileCheck %s
+

I forgot if we add negative tests for these things, i.e. check if we error if 
we don't have `+mops` or `__ARM_FEATURE_MOPS` set? I guess so?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117753/new/

https://reviews.llvm.org/D117753

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D117753: [AArch64] Support for memset tagged intrinsic

2022-01-20 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Headers/arm_acle.h:734
+/* Memory Operations Intrinsics */
+#if __ARM_FEATURE_MOPS && __ARM_FEATURE_MEMORY_TAGGING
+#define __arm_mops_memset_tag(tagged_address, value, size) 
\

Why does this also need MTE? I think the ACLE specifies this intrinsic to be 
available when __ARM_FEATURE_MOPS is defined?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117753/new/

https://reviews.llvm.org/D117753

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113779: [Clang] Add mfp16, mfp16fml and mdotprod flags for ARM target features.

2021-12-23 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

> If anybody has contacts to GCC that would be very helpful. Unfortunately I 
> don't think I will be able to drive this.

Ok, I will bring this up internally first with some folks that work on GCC and 
see what happens. To be continued...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113779/new/

https://reviews.llvm.org/D113779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks

2021-12-22 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added reviewers: dmgreen, SjoerdMeijer.
SjoerdMeijer added inline comments.



Comment at: clang/lib/Basic/Targets/ARM.cpp:937
   case llvm::ARM::ArchKind::ARMV9_2A:
 getTargetDefinesARMV83A(Opts, Builder);
 break;

Perhaps unrelated to this patch, but I am surprised to see that from v 8.3 and 
up we only include `getTargetDefinesARMV83A`, so no other target defines were 
introduced or are necessary? This is not rhetorical questionI haven't paid 
attention to this since v8.4.



Comment at: clang/test/Preprocessor/aarch64-target-features.c:296
 
 // RUN: %clang -target x86_64-apple-macosx -arch arm64 -### -c %s 2>&1 | 
FileCheck --check-prefix=CHECK-ARCH-ARM64 %s
+// CHECK-ARCH-ARM64: "-target-cpu" "apple-m1" "-target-feature" "+v8.5a" 
"-target-feature" "+fp-armv8" "-target-feature" "+neon" "-target-feature" 
"+crc" "-target-feature" "+crypto" "-target-feature" "+dotprod" 
"-target-feature" "+fp16fml" "-target-feature" "+ras" "-target-feature" "+lse" 
"-target-feature" "+rdm" "-target-feature" "+rcpc" "-target-feature" "+zcm" 
"-target-feature" "+zcz" "-target-feature" "+fullfp16" "-target-feature" "+sm4" 
"-target-feature" "+sha3" "-target-feature" "+sha2" "-target-feature" "+aes"

I don't pretend to understand the combo `x86_64-apple-macosx -arch arm64` 
here but is unrelated to this patch... 



Comment at: clang/test/Preprocessor/aarch64-target-features.c:298
+// CHECK-ARCH-ARM64: "-target-cpu" "apple-m1" "-target-feature" "+v8.5a" 
"-target-feature" "+fp-armv8" "-target-feature" "+neon" "-target-feature" 
"+crc" "-target-feature" "+crypto" "-target-feature" "+dotprod" 
"-target-feature" "+fp16fml" "-target-feature" "+ras" "-target-feature" "+lse" 
"-target-feature" "+rdm" "-target-feature" "+rcpc" "-target-feature" "+zcm" 
"-target-feature" "+zcz" "-target-feature" "+fullfp16" "-target-feature" "+sm4" 
"-target-feature" "+sha3" "-target-feature" "+sha2" "-target-feature" "+aes"
 
 // RUN: %clang -target x86_64-apple-macosx -arch arm64_32 -### -c %s 2>&1 | 
FileCheck --check-prefix=CHECK-ARCH-ARM64_32 %s

A test for v8.5 seems to be covered above somehow, but do we not need any tests 
for v8.6 and v 8.7?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116153/new/

https://reviews.llvm.org/D116153

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113779: [Clang] Add mfp16, mfp16fml and mdotprod flags for ARM target features.

2021-12-14 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Ok, fair enough, perhaps adding features is a valid use-case.

I will refrain from commenting on "things are terribly broken".  I agree it is 
broken, but in a different way than suggested in previous comments. 
If others also think this makes sense, then here a few follow up remarks from 
my side:

- First of all, this (really) sets precedent for setting options in a different 
way. This needs documentation and release noting.
- If we are going to do this, this should be the first patch in a series to fix 
this for all features. We can't just do a few of them.
- There was a suggestion to allow adding features with -march=+feature. Was 
this dismissed in favour of how things works for x86 and be consistent with 
that?
- It would be really good if we keep options consistent/compatible across the 
GCC and Clang toolchains. Any possibility to check with GCC community if they 
are open for this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113779/new/

https://reviews.llvm.org/D113779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113779: [Clang] Add mfp16, mfp16fml and mdotprod flags for ARM target features.

2021-11-15 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

This introduces another way of setting (optional) architecture extensions and 
having two ways to do the same is nearly always a bad thing, which is how one 
of my colleagues phrased it.

This was already a complex area and thus I don't think introducing another is 
going to really simplify things. This is the main problem of this proposal. 
Other things are:

- ideally we keep these flags consistent with GCC, but that seems to be a no-go.
- how do these new `-m` flags and `-march` interact?

More subjective: for most users this whole `-march` business is abstracted away 
in build systems, so they won't have to deal with this, that's why this isn't 
so much of an improvement.

If we want a better user experience set options, there are probably other 
things that are more important, like checking legal/illegal architecture 
combinations.

So, in summary, we prefer not to go ahead with this. And the precedent that was 
mentioned, `-mcrc`, should probably be deprecated.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113779/new/

https://reviews.llvm.org/D113779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A

2021-09-22 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM




Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:419
+
+  // Enable SVE2 by default on Armv9-A
+  // It can still be disabled if +nosve2 is present

Nit: some `.`s missing at the end of this sentence and the next below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109517/new/

https://reviews.llvm.org/D109517

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A

2021-09-17 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, looks reasonable to me.




Comment at: llvm/unittests/Support/TargetParserTest.cpp:495
   ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(
+  testARMArch("armv9-a", "generic", "v9a",

vhscampos wrote:
> SjoerdMeijer wrote:
> > I haven't looked, but in these target parser tests, do we also not need to 
> > check the architecture descriptions?
> > Copied this for example from the target parser def file:
> > 
> >  (ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
> >   ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_CRC | ARM::AEK_RAS |
> >   ARM::AEK_DOTPROD)
> If I understand it correctly, we only check architecture extensions for CPUs, 
> not for the architectures themselves.
Ah yeah, I confused it with that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109517/new/

https://reviews.llvm.org/D109517

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A

2021-09-09 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added reviewers: t.p.northover, ab.
SjoerdMeijer added a comment.

Some first comments after just looking at the plumbing for these new options. 
Didn't check yet the architecture extensions for the different version.




Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:82
 
 // +sve implies +f32mm if the base architecture is v8.6A or v8.7A
 // it isn't the case in general that sve implies both f64mm and f32mm

This comment needs to be updated for v9a?



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:413
 
-  auto V8_6Pos = llvm::find(Features, "+v8.6a");
-  if (V8_6Pos != std::end(Features))
-V8_6Pos = Features.insert(std::next(V8_6Pos), {"+i8mm", "+bf16"});
+  const char *Archs[] = {"+v8.6a", "+v8.7a", "+v9.1a", "+v9.2a"};
+  auto Pos = std::find_first_of(Features.begin(), Features.end(),

How about `+v9a`?



Comment at: clang/test/Preprocessor/aarch64-target-features.c:180
 
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies

Comment out of date?



Comment at: llvm/lib/Target/AArch64/AArch64.td:468
FeatureSSBS, FeatureSB, FeaturePredRes, FeatureCacheDeepPersist,
-   FeatureBranchTargetId]>;
+   FeatureBranchTargetId, FeatureRME]>;
 

Is this an unrelated change? Can this be done separately?



Comment at: llvm/unittests/Support/TargetParserTest.cpp:495
   ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(
+  testARMArch("armv9-a", "generic", "v9a",

I haven't looked, but in these target parser tests, do we also not need to 
check the architecture descriptions?
Copied this for example from the target parser def file:

 (ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
  ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_CRC | ARM::AEK_RAS |
  ARM::AEK_DOTPROD)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109517/new/

https://reviews.llvm.org/D109517

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-08 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll:6
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+
+%indirect = type { double, double, double, double, double, double, double, 
double }

As you're creating machine instructions, I think it is better to run this and 
the other test with `-verify-machineinstrs.`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

The driver part looks good to me. I will let Oli do the approval as he also 
looked at the codegen (and I didn't).




Comment at: clang/include/clang/Driver/Options.td:3274
+  Group,
+  HelpText<"Work around VLLDM erratum CVE-2021-35465 (Arm only)">;
+def mno_fix_cmse_cve_2021_35465 : Flag<["-"], "mno-fix-cmse-cve-2021-35465">,

Nit: I think `Arm` -> `ARM` is better as we we're talking about the ARM backend 
here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1666
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))
+  CmdArgs.push_back("-arm-fix-cmse-cve-2021-35465=1");

labrinea wrote:
> SjoerdMeijer wrote:
> > I am wondering if this should use `getLastArg` and what happens with test 
> > cases (which I guess need adding) that have both:
> > 
> >   -mno-fix-cmse-cve-2021-35465  -mfix-cmse-cve-2021-35465 
> > 
> > or
> > 
> >   -mfix-cmse-cve-2021-35465  -mno-fix-cmse-cve-2021-35465
> That's the whole point of `getLastArg` as far as I understand: for options 
> that can either enable or disable a feature, so that the last one wins. I'll 
> add more tests.
Ah, got confused, it's indeed right there!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1646
 
-  if (Args.getLastArg(options::OPT_mcmse))
+  bool fix_cve_2021_35465 = false;
+  if (Args.getLastArg(options::OPT_mcmse)) {

Nit: capital F in variable name.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1666
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))
+  CmdArgs.push_back("-arm-fix-cmse-cve-2021-35465=1");

I am wondering if this should use `getLastArg` and what happens with test cases 
(which I guess need adding) that have both:

  -mno-fix-cmse-cve-2021-35465  -mfix-cmse-cve-2021-35465 

or

  -mfix-cmse-cve-2021-35465  -mno-fix-cmse-cve-2021-35465


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-06 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1666
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))
+  CmdArgs.push_back("-arm-fix-cmse-cve-2021-35465=1");

If `-mcpu=cortex-[m33|m35|m55]` was provided, then 
`-arm-fix-cmse-cve-2021-35465=1` is already set and we are adding another 
option here? For example, for

  -mcpu=cortex-m33 -mcmse -mfix-cmse-cve-2021-35465

I am expecting:

  "-mllvm" "-arm-fix-cmse-cve-2021-35465=1"  "-mllvm" 
"-arm-fix-cmse-cve-2021-35465=1" 

and with `-mno-fix-cmse-cve-2021-35465`:

   "-mllvm" "-arm-fix-cmse-cve-2021-35465=1"  "-mllvm" 
"-arm-fix-cmse-cve-2021-35465=0" 

Probably it's nicer to just pass this once.

Also, in the tests, I think cases are missing for `-mcpu=...` and 
`-m[no-]fix-cmse-cve-2021-35465`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105331: [CFE][X86] Enable complex _Float16.

2021-07-02 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

The patch on itself looks reasonable to me. I was just wondering about the 
_Float16 support on X86 in general because the Clang docs says:

  _Float16 is currently only supported on the following targets, with further 
targets pending ABI standardization: 

Does X86 support _Float16, and how does that work for Complex _Float16?  
Either way, does this need a doc update too?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105331/new/

https://reviews.llvm.org/D105331

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D101696: [Matrix] Implement C-style explicit type conversions in CXX for matrix types

2021-06-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D101696#2800790 , @SaurabhJha 
wrote:

> In D101696#2798713 , @SjoerdMeijer 
> wrote:
>
>> Perhaps for bonus points, update the Clang documentation in 
>> https://clang.llvm.org/docs/LanguageExtensions.html#matrix-types with some 
>> examples?
>
> Can you please point me to how to submit patches to the documentation? I 
> could only find links like https://clang.llvm.org/hacking.html which talk 
> about code patches. Many thanks.

It works exactly the same as other patches. I think the file you're interested 
in is `llvm-project/clang/docs/LanguageExtensions.rst`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101696/new/

https://reviews.llvm.org/D101696

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D101696: [Matrix] Implement C-style explicit type conversions in CXX for matrix types

2021-06-04 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Perhaps for bonus points, update the Clang documentation in 
https://clang.llvm.org/docs/LanguageExtensions.html#matrix-types with some 
examples?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101696/new/

https://reviews.llvm.org/D101696

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-15 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, nice one, LGTM.




Comment at: llvm/test/CodeGen/AArch64/rand.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64 -mcpu=neoverse-v1 -mattr=+rand %s -o - | FileCheck 
%s
+

Nit: we don't need `-mcpu=neoverse-v1`, so just remove it?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-15 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/test/CodeGen/arm_acle.c:908
+#if __ARM_64BIT_STATE && defined(__ARM_FEATURE_RNG)
+// AArch64-v8_3-LABEL: @test_rndr(
+// AArch64-v8_3-NEXT:  entry:

stelios-arm wrote:
> SjoerdMeijer wrote:
> > Not sure if I am surprised that this works This is (re)using tag 
> > `AArch64-v8_3` and for the other tests that use this and don't have RNG 
> > set, I was expecting FileCheck to complain about this, not sure if I am 
> > missing something though.
> It turns out that this test is disabled 
> (https://github.com/llvm/llvm-project/commit/6fcd4e080f09c9765d6e0ea03b1da91669c8509a).
>  I am going to remove the changes in this file. 
But this seems to be a useful test to have. Are we testing this elsewhere now?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-12 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Basic/Targets/AArch64.cpp:364
+  if (HasRandGen)
+Builder.defineMacro("__ARM_FEATURE_RNG", "1");
+

This needs a clang preprocessor test: both a check for its presence and absence.
Have a look/grep where these others predefined macros are tested.



Comment at: clang/test/CodeGen/arm_acle.c:908
+#if __ARM_64BIT_STATE && defined(__ARM_FEATURE_RNG)
+// AArch64-v8_3-LABEL: @test_rndr(
+// AArch64-v8_3-NEXT:  entry:

Not sure if I am surprised that this works This is (re)using tag 
`AArch64-v8_3` and for the other tests that use this and don't have RNG set, I 
was expecting FileCheck to complain about this, not sure if I am missing 
something though.



Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:610
 def AArch64tbl : SDNode<"AArch64ISD::TBL", SDT_AArch64TBL>;
-
+// MRS from CodeGen.
+def AArch64mrs : SDNode<"AArch64ISD::MRS",

Nit: for this comment to be informative, please expand it, or just remove it if 
there's not much to say about it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-11 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64InstrFormats.td:1495
   let DecoderNamespace = "Fallback";
+  let Defs = [NZCV];
 }

dmgreen wrote:
> stelios-arm wrote:
> > SjoerdMeijer wrote:
> > > SjoerdMeijer wrote:
> > > > dmgreen wrote:
> > > > > SjoerdMeijer wrote:
> > > > > > dmgreen wrote:
> > > > > > > SjoerdMeijer wrote:
> > > > > > > > Do all MRS instructions do this?
> > > > > > > No, but some do and it's not obvious which ones do and don't. I 
> > > > > > > think it should be reasonable to always def NZCV here, even if 
> > > > > > > they are just dead. It should be very rare that it would be 
> > > > > > > beneficial for NZCV to be live across a MRS instruction.
> > > > > > True, but since creating another definition is cheap, what would 
> > > > > > the cons be of:
> > > > > > 
> > > > > >   class MRS_NZCV : MRSI {
> > > > > > ..
> > > > > > let Defs = [NZCV];
> > > > > >   }
> > > > > > 
> > > > > > The way I look at it is that the description would be more accurate?
> > > > > I believe that would have an over-lapping definition with the 
> > > > > existing MRS instruction?
> > > > > 
> > > > > It would need to be a pseudo I think, which would eventually be 
> > > > > turned into a MSR proper late enough on the pipeline for it to be 
> > > > > valid (or the other way around, the no-nzcv version gets turned into 
> > > > > a the nzcv version to keep the verifier happy).
> > > > > 
> > > > > It could also be a optional def, but those are only used in the arm 
> > > > > backend and I would not recommend using them anywhere else.  I would 
> > > > > probably suggest just setting MRS as a NZCV setting instruction, 
> > > > > unless we find a reason to need to implement it differently.
> > > > > I believe that would have an over-lapping definition with the 
> > > > > existing MRS instruction?
> > > > 
> > > > Ah yeah, that might be true.
> > > > 
> > > > > It would need to be a pseudo I think
> > > > 
> > > > How much work is adding a pseudo for this case? My first reaction would 
> > > > be just trying to model this correctly, then we potentially don't have 
> > > > to worry again later about this. 
> > > I just wanted to add that I don't have too strong opinions on this, but 
> > > what I suggested seemed more the correct, even though the consequence of 
> > > setting NCZV for all MRS is minimal. So I will leave this one up to you 
> > > @dmgreen and @stelios-arm .
> > I will talk with @dmgreen and if it is decided that it is needed to change, 
> > I will address it in a future revision. Please note that the next revision 
> > of patch, will not address this comment (temporarily). 
> Hmm. I feel like adding the pseudo is more trouble than it is worth - they do 
> not come for free. With the extra lowering and adding things like scheduling 
> info, etc. It feels simpler to me to set the flags, even if that's not always 
> exactly what the instruction would do.
Okidoki, then we at least need some comments here explaining this all.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-10 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64InstrFormats.td:1495
   let DecoderNamespace = "Fallback";
+  let Defs = [NZCV];
 }

SjoerdMeijer wrote:
> dmgreen wrote:
> > SjoerdMeijer wrote:
> > > dmgreen wrote:
> > > > SjoerdMeijer wrote:
> > > > > Do all MRS instructions do this?
> > > > No, but some do and it's not obvious which ones do and don't. I think 
> > > > it should be reasonable to always def NZCV here, even if they are just 
> > > > dead. It should be very rare that it would be beneficial for NZCV to be 
> > > > live across a MRS instruction.
> > > True, but since creating another definition is cheap, what would the cons 
> > > be of:
> > > 
> > >   class MRS_NZCV : MRSI {
> > > ..
> > > let Defs = [NZCV];
> > >   }
> > > 
> > > The way I look at it is that the description would be more accurate?
> > I believe that would have an over-lapping definition with the existing MRS 
> > instruction?
> > 
> > It would need to be a pseudo I think, which would eventually be turned into 
> > a MSR proper late enough on the pipeline for it to be valid (or the other 
> > way around, the no-nzcv version gets turned into a the nzcv version to keep 
> > the verifier happy).
> > 
> > It could also be a optional def, but those are only used in the arm backend 
> > and I would not recommend using them anywhere else.  I would probably 
> > suggest just setting MRS as a NZCV setting instruction, unless we find a 
> > reason to need to implement it differently.
> > I believe that would have an over-lapping definition with the existing MRS 
> > instruction?
> 
> Ah yeah, that might be true.
> 
> > It would need to be a pseudo I think
> 
> How much work is adding a pseudo for this case? My first reaction would be 
> just trying to model this correctly, then we potentially don't have to worry 
> again later about this. 
I just wanted to add that I don't have too strong opinions on this, but what I 
suggested seemed more the correct, even though the consequence of setting NCZV 
for all MRS is minimal. So I will leave this one up to you @dmgreen and 
@stelios-arm .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-09 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64InstrFormats.td:1495
   let DecoderNamespace = "Fallback";
+  let Defs = [NZCV];
 }

dmgreen wrote:
> SjoerdMeijer wrote:
> > dmgreen wrote:
> > > SjoerdMeijer wrote:
> > > > Do all MRS instructions do this?
> > > No, but some do and it's not obvious which ones do and don't. I think it 
> > > should be reasonable to always def NZCV here, even if they are just dead. 
> > > It should be very rare that it would be beneficial for NZCV to be live 
> > > across a MRS instruction.
> > True, but since creating another definition is cheap, what would the cons 
> > be of:
> > 
> >   class MRS_NZCV : MRSI {
> > ..
> > let Defs = [NZCV];
> >   }
> > 
> > The way I look at it is that the description would be more accurate?
> I believe that would have an over-lapping definition with the existing MRS 
> instruction?
> 
> It would need to be a pseudo I think, which would eventually be turned into a 
> MSR proper late enough on the pipeline for it to be valid (or the other way 
> around, the no-nzcv version gets turned into a the nzcv version to keep the 
> verifier happy).
> 
> It could also be a optional def, but those are only used in the arm backend 
> and I would not recommend using them anywhere else.  I would probably suggest 
> just setting MRS as a NZCV setting instruction, unless we find a reason to 
> need to implement it differently.
> I believe that would have an over-lapping definition with the existing MRS 
> instruction?

Ah yeah, that might be true.

> It would need to be a pseudo I think

How much work is adding a pseudo for this case? My first reaction would be just 
trying to model this correctly, then we potentially don't have to worry again 
later about this. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-09 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64InstrFormats.td:1495
   let DecoderNamespace = "Fallback";
+  let Defs = [NZCV];
 }

dmgreen wrote:
> SjoerdMeijer wrote:
> > Do all MRS instructions do this?
> No, but some do and it's not obvious which ones do and don't. I think it 
> should be reasonable to always def NZCV here, even if they are just dead. It 
> should be very rare that it would be beneficial for NZCV to be live across a 
> MRS instruction.
True, but since creating another definition is cheap, what would the cons be of:

  class MRS_NZCV : MRSI {
..
let Defs = [NZCV];
  }

The way I look at it is that the description would be more accurate?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98264: [AArch64] Implement __rndr, __rndrrs intrinsics

2021-03-09 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Basic/Targets/AArch64.cpp:363
 
+  if (HasRandGen)
+Builder.defineMacro("__ARM_FEATURE_RNG", "1");

Where/when is `HasRandGen` set?



Comment at: clang/test/Preprocessor/init-aarch64.c:28
 // AARCH64-NEXT: #define __ARM_FEATURE_NUMERIC_MAXMIN 1
+// AARCH64-NEXT: #define  __ARM_FEATURE_RNG 1
 // AARCH64-NEXT: #define __ARM_FEATURE_UNALIGNED 1

Why are we expecting this here? Are we not only expecting this for v8.5?

We also need a negative test and CHECK-NOT of this where we don't expect this.



Comment at: llvm/lib/Target/AArch64/AArch64InstrFormats.td:1495
   let DecoderNamespace = "Fallback";
+  let Defs = [NZCV];
 }

Do all MRS instructions do this?



Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:1274
+def : Pat<(AArch64mrs imm:$id),
+  (MRS imm:$id)>;
+

Nit: can be on the same line.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98264/new/

https://reviews.llvm.org/D98264

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98012: [RFC][doc] Document that RISC-V's __fp16 has different behavior

2021-03-05 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

> However we would like have slight different behavior for __fp16 other

than ACLE: The evaluation format of __fp16 set same as _Float16,
which means no promotion are performed if there is no hardware half-precision
supported.

Well, this is really problematic, because you're giving `__fp16` semantics, 
you're defining it to behave as `_Float16`.
Redefining an existing type, let it differentiate from other targets can never 
be a good thing.
I don't see how this is going to help anyone.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98012/new/

https://reviews.llvm.org/D98012

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96866: [AArch64] Add some missing Neoverse features

2021-02-19 Thread Sjoerd Meijer via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG260f90bb3d1a: [AArch64] Add some missing Neoverse features 
(authored by SjoerdMeijer).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changed prior to commit:
  https://reviews.llvm.org/D96866?vs=324615=324910#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96866/new/

https://reviews.llvm.org/D96866

Files:
  clang/test/Driver/aarch64-cpus.c
  llvm/lib/Target/AArch64/AArch64.td
  llvm/test/CodeGen/AArch64/misched-fusion-aes.ll

Index: llvm/test/CodeGen/AArch64/misched-fusion-aes.ll
===
--- llvm/test/CodeGen/AArch64/misched-fusion-aes.ll
+++ llvm/test/CodeGen/AArch64/misched-fusion-aes.ll
@@ -10,6 +10,10 @@
 ; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=cortex-a78 | FileCheck %s
 ; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=cortex-a78c| FileCheck %s
 ; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=cortex-x1  | FileCheck %s
+; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=neoverse-e1 | FileCheck %s
+; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=neoverse-n1 | FileCheck %s
+; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=neoverse-n2 | FileCheck %s
+; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=neoverse-v1 | FileCheck %s
 ; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=exynos-m3  | FileCheck %s
 ; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=exynos-m4  | FileCheck %s
 ; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=exynos-m5  | FileCheck %s
Index: llvm/lib/Target/AArch64/AArch64.td
===
--- llvm/lib/Target/AArch64/AArch64.td
+++ llvm/lib/Target/AArch64/AArch64.td
@@ -594,7 +594,9 @@
FeatureFullFP16,
FeatureDotProd,
FeatureRCPC,
-   FeaturePerfMon
+   FeaturePerfMon,
+   FeaturePostRAScheduler,
+   FeatureUseAA
]>;
 
 def ProcA57 : SubtargetFeature<"a57", "ARMProcFamily", "CortexA57",
@@ -728,7 +730,7 @@
"CortexR82",
"Cortex-R82 ARM Processors", [
FeaturePostRAScheduler,
-   // TODO: crypto and FuseAES
+   FeatureUseAA,
// All other features are implied by v8_0r ops:
HasV8_0rOps,
]>;
@@ -977,6 +979,9 @@
   FeatureNEON,
   FeatureRCPC,
   FeatureSSBS,
+  FeaturePostRAScheduler,
+  FeatureUseAA,
+  FeatureFuseAES,
   ]>;
 
 def ProcNeoverseN1 : SubtargetFeature<"neoversen1", "ARMProcFamily",
@@ -991,6 +996,9 @@
   FeatureRCPC,
   FeatureSPE,
   FeatureSSBS,
+  FeaturePostRAScheduler,
+  FeatureUseAA,
+  FeatureFuseAES,
   ]>;
 
 def ProcNeoverseN2 : SubtargetFeature<"neoversen2", "ARMProcFamily",
@@ -1003,7 +1011,12 @@
   FeatureMTE,
   FeatureSVE2,
   FeatureSVE2BitPerm,
-  FeatureTRBE]>;
+  FeatureTRBE,
+  FeaturePostRAScheduler,
+  FeatureUseAA,
+  FeatureCrypto,
+  FeatureFuseAES,
+  ]>;
 
 def ProcNeoverseV1 : SubtargetFeature<"neoversev1", "ARMProcFamily",
   "NeoverseV1",
@@ -1020,6 +1033,7 @@
   FeatureNEON,
   FeaturePerfMon,
   FeaturePostRAScheduler,
+  FeatureUseAA,
   FeatureRandGen,
   FeatureSPE,
   FeatureSSBS,
Index: clang/test/Driver/aarch64-cpus.c
===
--- clang/test/Driver/aarch64-cpus.c
+++ clang/test/Driver/aarch64-cpus.c
@@ -185,8 +185,14 @@
 // CORTEXA78: 

[PATCH] D93585: [AArch64][Clang][Linux] Enable out-of-line atomics by default.

2021-01-29 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, LGTM too.

One nit inlined that can be addressed before committing.




Comment at: clang/test/Driver/aarch64-features.c:47
+// CHECK-OUTLINE-ATOMICS-ON: "-target-feature" "+outline-atomics"
+// CHECK-OUTLINE-ATOMICS-OFF-NOT: "-target-feature" "+outline-atomics"

Probably better to just check for the OFF case:

  "-target-feature" "-outline-atomics"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93585/new/

https://reviews.llvm.org/D93585

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93585: [AArch64][Clang][Linux] Enable out-of-line atomics by default.

2021-01-29 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Sorry one last question: how about `-mno-outline-atomics` in combination with 
this? Think we need to add a test for that too?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93585/new/

https://reviews.llvm.org/D93585

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93585: [AArch64][Clang][Linux] Enable out-of-line atomics by default.

2021-01-28 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D93585#2526774 , @ilinpv wrote:

> Clang driver tests for outline atomics were added.

Thanks!

Is there a way we can test `-rtlib=libgcc`?




Comment at: clang/lib/Driver/ToolChains/Linux.cpp:855
+const GCCVersion  = GCCInstallation.getVersion();
+if (Ver.isOlderThan(9, 3, 1))
+  return false;

Nit: if we change this into:

  if (GCCInstallation.getVersion().isOlderThan(9, 3, 1))

we can get rid of the curly brackets.



Comment at: clang/lib/Driver/ToolChains/Linux.cpp:857
+  return false;
+  } else if (GetRuntimeLibType(Args) != ToolChain::RLT_CompilerRT) {
+return false;

This means we need one more test and RUN line with the rtlib not being 
compiler-rt or libgcc.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93585/new/

https://reviews.llvm.org/D93585

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93585: [AArch64][Clang][Linux] Enable out-of-line atomics by default.

2021-01-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6448
+  CmdArgs.push_back("-target-feature");
+  CmdArgs.push_back("+outline-atomics");
+}

This needs a Clang driver tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93585/new/

https://reviews.llvm.org/D93585

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1.

2021-01-22 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D94779#2515157 , @dmgreen wrote:

> Thanks. @fhahn @SjoerdMeijer what do we think about the edge case where the 
> width==1? As far as I understand (with this patch):
>
>   #pragma clang loop vectorize_predicate(disable) vectorize_width(4)
>   Gives llvm.loop.vectorize.predicate.enable=false, 
> llvm.loop.vectorize.width=4, llvm.loop.vectorize.scalable.enable=false, 
> llvm.loop.vectorize.enable=true
>   
>   #pragma clang loop vectorize_predicate(disable) vectorize_width(4)

I guess this is a typo, and should be `vectorize_predicate(enable)`

> Gives llvm.loop.vectorize.predicate.enable=true, llvm.loop.vectorize.width=4, 
> llvm.loop.vectorize.scalable.enable=false, llvm.loop.vectorize.enable=true
>
> #pragma clang loop vectorize_predicate(enable) vectorize_width(1)
> Gives llvm.loop.vectorize.width=1, llvm.loop.vectorize.scalable.enable=false
>
> #pragma clang loop vectorize_predicate(disable) vectorize_width(1)
> Gives llvm.loop.vectorize.width=1, llvm.loop.vectorize.scalable.enable=false
>
>   Should we be adding llvm.loop.vectorize.predicate.enable metadata even when 
> width=1? I would be inclined to emit the predication pragma anyway.

So yeah, I got confused about this edge case width=1 earlier, but have 
refreshed my memory.
I would like to ask the question if it matters? Vectorize.width = 1 disables 
vectorisation, so unless I miss something, I don't think it matters if we emit 
`llvm.loop.vectorize.predicate.enable=true` or 
`llvm.loop.vectorize.predicate.enable=false` because it is ignored anyway? But 
if we have a combination that doesn't make sense, like `width(1)` and 
`predicate(enable)`, should we emit a diagnostic if this is not already done?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94779/new/

https://reviews.llvm.org/D94779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1.

2021-01-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D94779#2504519 , @malharJ wrote:

> I had a look at the Clang Language Extension 
> 
>  ... and I saw this:
>
>> Specifying a width/count of 1 disables the optimization, and is equivalent 
>> to vectorize(disable) or interleave(disable).

Ah yes, thanks. I think we should improve the langref spec also, this should be 
described in the same paragraph, not floating below some example.

With this cleared up, I agree with Florian, ignoring things isn't great. 
But this brings me back to my question earlier: what is the problem with the 
current behaviour (i.e. not ignoring/handling it)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94779/new/

https://reviews.llvm.org/D94779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1.

2021-01-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

I am trying to remember details here, but first about this:

> vectorize_width(1) is also used, since that effectively disables 
> vectorization.

I am not sure this is true (i.e. effectively disabling auto-vec) since in 
LangRef we specify:

> The vector width is specified by vectorize_width(_value_[, fixed|scalable]), 
> where _value_ is a positive integer and the type of vectorization can be 
> specified with an optional second parameter.

So, I *think* `vectorize_width(1)` enables the vectoriser, just the width is 
set to 1. And this explains why I added that condition `VectorizeWidth < 1`.

This leads me to wonder what the actual problem is with this?




Comment at: clang/test/CodeGenCXX/pragma-loop-predicate.cpp:102
+
+// CHECK-NEXT: ![[LOOP7]] = distinct !{![[LOOP7]], [[MP]], [[GEN8]], 
[[GEN11:![0-9]+]], [[GEN3]]}
+// CHECK-NEXT: [[GEN11]] = !{!"llvm.loop.vectorize.width", i32 4}

malharJ wrote:
> SjoerdMeijer wrote:
> > Do we also expect:
> > 
> >!{!"llvm.loop.vectorize.predicate.enable", i1 false}
> > 
> > here since the test sets:
> > 
> >   vectorize_predicate(disable) vectorize_width(4)
> GEN8 covers checking for that ?
Ah yep, of course, I missed that, so ignore this!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94779/new/

https://reviews.llvm.org/D94779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1.

2021-01-15 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/test/CodeGenCXX/pragma-loop-predicate.cpp:102
+
+// CHECK-NEXT: ![[LOOP7]] = distinct !{![[LOOP7]], [[MP]], [[GEN8]], 
[[GEN11:![0-9]+]], [[GEN3]]}
+// CHECK-NEXT: [[GEN11]] = !{!"llvm.loop.vectorize.width", i32 4}

Do we also expect:

   !{!"llvm.loop.vectorize.predicate.enable", i1 false}

here since the test sets:

  vectorize_predicate(disable) vectorize_width(4)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94779/new/

https://reviews.llvm.org/D94779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93395: [clang][cli] Remove -f[no-]trapping-math from -cc1 command line

2021-01-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks for getting to the bottom of this. Agreed, and also LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93395/new/

https://reviews.llvm.org/D93395

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors

2021-01-06 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.

LGTM, perhaps wait a day with committing in case there are more comments.




Comment at: clang/include/clang/Basic/Attr.td:3356
   EnumArgument<"State", "LoopHintState",
-   ["enable", "disable", "numeric", "assume_safety", 
"full"],
-   ["Enable", "Disable", "Numeric", "AssumeSafety", 
"Full"]>,
+   ["enable", "disable", "numeric", "fixed_width", 
"scalable_width", "assume_safety", "full"],
+   ["Enable", "Disable", "Numeric", "FixedWidth", 
"ScalableWidth", "AssumeSafety", "Full"]>,

aaron.ballman wrote:
> david-arm wrote:
> > aaron.ballman wrote:
> > > Should the documentation in AttrDocs.td be updated for this change?
> > Hi @aaron.ballman I had a look at LoopHintDocs in AttrDocs.td and it didn't 
> > explicitly mention these states, i.e. "assume_safety", "numeric", etc., so 
> > I'm not sure if it's necessary to add anything there?
> Oh, I see now, we're deferring to the documentation in the language 
> extensions document. I suppose that's fine as far as this patch goes, sorry 
> for the noise.
Nit: formatting, exceeding 80 columns?



Comment at: clang/include/clang/Basic/Attr.td:3357
+   ["enable", "disable", "numeric", "fixed_width", 
"scalable_width", "assume_safety", "full"],
+   ["Enable", "Disable", "Numeric", "FixedWidth", 
"ScalableWidth", "AssumeSafety", "Full"]>,
   ExprArgument<"Value">];

same?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89031/new/

https://reviews.llvm.org/D89031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93014: [Clang] Add AArch64 VCMLA LANE variants.

2021-01-05 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added inline comments.
This revision is now accepted and ready to land.



Comment at: clang/include/clang/Basic/arm_neon.td:1911
+// vcmlaq{ROT}_lane
+def : SOpInst<"vcmla" # ROT # "_lane", "...qI", "Q" # type, Op<(call 
"vcmla" # ROT, $p0, $p1,
+   (bitcast $p0, (dup_typed laneqty , (call "vget_lane", (bitcast 
lanety, $p2), $p3>>;

fhahn wrote:
> SjoerdMeijer wrote:
> > I have looked only quickly at this, but I was expecting the `"q"` to appear 
> > here somewhere?
> Yes that is indeed a bit surprising. But the tablegen emitter has code that 
> inserts the `q` automatically, if required, which kicks in here. 
> (https://github.com/llvm/llvm-project/blob/main/clang/utils/TableGen/NeonEmitter.cpp#L1087)
Ok, thanks, LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93014/new/

https://reviews.llvm.org/D93014

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93014: [Clang] Add AArch64 VCMLA LANE variants.

2021-01-04 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1911
+// vcmlaq{ROT}_lane
+def : SOpInst<"vcmla" # ROT # "_lane", "...qI", "Q" # type, Op<(call 
"vcmla" # ROT, $p0, $p1,
+   (bitcast $p0, (dup_typed laneqty , (call "vget_lane", (bitcast 
lanety, $p2), $p3>>;

I have looked only quickly at this, but I was expecting the `"q"` to appear 
here somewhere?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93014/new/

https://reviews.llvm.org/D93014

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93395: [clang][cli] Remove -f[no-]trapping-math from -cc1 command line

2020-12-21 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Frontend/CompilerInvocation.cpp:3080
   }
   Opts.setFPExceptionMode(FPEB);
 

jansvoboda11 wrote:
> SjoerdMeijer wrote:
> > jansvoboda11 wrote:
> > > The parsing of `OPT_ftrapping_math` and `OPT_fno_trapping_math` is 
> > > immediately overwritten here.
> > Yeah, I did some work in this area some years ago, but that's a few years 
> > ago, and then in addition to that, we are talking about option handling in 
> > Clang which I always find very confusing... Just saying I can't remember 
> > too many details at this point.
> > 
> > But starting with a first observation, you're saying that things are 
> > overwritten here, but they are different options. I.e., the deleted part 
> > honours `OPT_ftrapping_math`, and here exception modes are set based on 
> > `OPT_ffp_exception_behavior`. So, looks like there is some interaction 
> > here... Do we know how this should work?
> I see. The thing is, before this patch, we call `Opts.setFPExceptionMode` 
> whenever we see either `OPT_ftrapping_math` or `OPT_fno_trapping_math`.
> 
> But then, we **unconditionally** call `Opts.setFPExceptionMode(FPEB)` again 
> here. That **always** overwrites what we previously did for 
> `OPT_f[no_]trapping_math`, making that a dead code.
> 
> `Opts.setFPExceptionMode()` doesn't do anything fancy, just assigns the 
> argument to the `Opts.FPExceptionModel` member.
Ah yes, I actually missed that we always set `Opts.setFPExceptionMode(FPEB)` 
here! So that's clear now.

But looks like my previous question is still relevant how these options should 
work together? I now see that 
`-ffp-exception-behavior` is a intel/microsoft special introduced in  D62731.

And while currently we ignore `-fftrapping-math` here, but that just seems like 
a bug, it looks like we actually need to handle it here too?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93395/new/

https://reviews.llvm.org/D93395

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93395: [clang][cli] Remove -f[no-]trapping-math from -cc1 command line

2020-12-21 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Frontend/CompilerInvocation.cpp:3080
   }
   Opts.setFPExceptionMode(FPEB);
 

jansvoboda11 wrote:
> The parsing of `OPT_ftrapping_math` and `OPT_fno_trapping_math` is 
> immediately overwritten here.
Yeah, I did some work in this area some years ago, but that's a few years ago, 
and then in addition to that, we are talking about option handling in Clang 
which I always find very confusing... Just saying I can't remember too many 
details at this point.

But starting with a first observation, you're saying that things are 
overwritten here, but they are different options. I.e., the deleted part 
honours `OPT_ftrapping_math`, and here exception modes are set based on 
`OPT_ffp_exception_behavior`. So, looks like there is some interaction here... 
Do we know how this should work?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93395/new/

https://reviews.llvm.org/D93395

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D91994: [AArch64] Cortex-R82: remove crypto

2020-12-10 Thread Sjoerd Meijer via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG99ad078b91ed: [AArch64] Cortex-R82: remove crypto (authored 
by SjoerdMeijer).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91994/new/

https://reviews.llvm.org/D91994

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -903,12 +903,11 @@
   "8.4-A"));
   EXPECT_TRUE(testAArch64CPU(
  "cortex-r82", "armv8-r", "crypto-neon-fp-armv8",
-  AArch64::AEK_CRC | AArch64::AEK_RDM  | AArch64::AEK_SSBS|
-  AArch64::AEK_CRYPTO  | AArch64::AEK_SM4  | AArch64::AEK_SHA3|
-  AArch64::AEK_SHA2| AArch64::AEK_AES  | AArch64::AEK_DOTPROD |
-  AArch64::AEK_FP  | AArch64::AEK_SIMD | AArch64::AEK_FP16|
-  AArch64::AEK_FP16FML | AArch64::AEK_RAS  | AArch64::AEK_RCPC|
-  AArch64::AEK_SB, "8-R"));
+  AArch64::AEK_CRC | AArch64::AEK_RDM  | AArch64::AEK_SSBS |
+  AArch64::AEK_DOTPROD | AArch64::AEK_FP   | AArch64::AEK_SIMD |
+  AArch64::AEK_FP16| AArch64::AEK_FP16FML | AArch64::AEK_RAS |
+  AArch64::AEK_RCPC| AArch64::AEK_LSE  | AArch64::AEK_SB,
+  "8-R"));
   EXPECT_TRUE(testAArch64CPU(
   "cortex-x1", "armv8.2-a", "crypto-neon-fp-armv8",
   AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_FP |
@@ -1118,6 +1117,12 @@
 AArch64::ArchKind::INVALID, "ras"));
   EXPECT_TRUE(testAArch64Extension("cortex-a55",
 AArch64::ArchKind::INVALID, "ras"));
+  EXPECT_TRUE(testAArch64Extension("cortex-a55",
+AArch64::ArchKind::INVALID, "fp16"));
+  EXPECT_FALSE(testAArch64Extension("cortex-a55",
+AArch64::ArchKind::INVALID, "fp16fml"));
+  EXPECT_TRUE(testAArch64Extension("cortex-a55",
+AArch64::ArchKind::INVALID, "dotprod"));
   EXPECT_FALSE(testAArch64Extension("cortex-a57",
 AArch64::ArchKind::INVALID, "ras"));
   EXPECT_FALSE(testAArch64Extension("cortex-a72",
@@ -1126,6 +1131,22 @@
 AArch64::ArchKind::INVALID, "ras"));
   EXPECT_TRUE(testAArch64Extension("cortex-a75",
 AArch64::ArchKind::INVALID, "ras"));
+  EXPECT_TRUE(testAArch64Extension("cortex-a75",
+AArch64::ArchKind::INVALID, "fp16"));
+  EXPECT_FALSE(testAArch64Extension("cortex-a75",
+AArch64::ArchKind::INVALID, "fp16fml"));
+  EXPECT_TRUE(testAArch64Extension("cortex-a75",
+   AArch64::ArchKind::INVALID, "dotprod"));
+  EXPECT_TRUE(testAArch64Extension("cortex-r82",
+   AArch64::ArchKind::INVALID, "ras"));
+  EXPECT_TRUE(testAArch64Extension("cortex-r82",
+   AArch64::ArchKind::INVALID, "fp16"));
+  EXPECT_TRUE(testAArch64Extension("cortex-r82",
+   AArch64::ArchKind::INVALID, "fp16fml"));
+  EXPECT_TRUE(testAArch64Extension("cortex-r82",
+   AArch64::ArchKind::INVALID, "dotprod"));
+  EXPECT_TRUE(testAArch64Extension("cortex-r82",
+   AArch64::ArchKind::INVALID, "lse"));
   EXPECT_FALSE(testAArch64Extension("cyclone",
 AArch64::ArchKind::INVALID, "ras"));
   EXPECT_FALSE(testAArch64Extension("exynos-m3",
@@ -1168,18 +1189,6 @@
AArch64::ArchKind::INVALID, "profile"));
   EXPECT_FALSE(testAArch64Extension("saphira",
 AArch64::ArchKind::INVALID, "fp16"));
-  EXPECT_TRUE(testAArch64Extension("cortex-a55",
-AArch64::ArchKind::INVALID, "fp16"));
-  EXPECT_FALSE(testAArch64Extension("cortex-a55",
-AArch64::ArchKind::INVALID, "fp16fml"));
-  EXPECT_TRUE(testAArch64Extension("cortex-a55",
-AArch64::ArchKind::INVALID, "dotprod"));
-  EXPECT_TRUE(testAArch64Extension("cortex-a75",
-AArch64::ArchKind::INVALID, "fp16"));
-  EXPECT_FALSE(testAArch64Extension("cortex-a75",
-AArch64::ArchKind::INVALID, "fp16fml"));
-  EXPECT_TRUE(testAArch64Extension("cortex-a75",
-AArch64::ArchKind::INVALID, "dotprod"));
   EXPECT_FALSE(testAArch64Extension("thunderx2t99",
   

[PATCH] D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors

2020-11-25 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Thanks @david-arm for posting this proposal to the cfe list.
My confusion has been cleared up. The (new) proposal is to have:

1. vectorize_width(X) where X is an integer.
2. vectorize_width(X, fixed|scalable)
3. vectorize_width(fixed|scalable)

And with that 3rd option I agree that this allows us to express everything we 
want, and this patch needs adapting to this new proposal (just stating the 
obvious for clarity/completeness)

If scalable is used on a target that doesn't support this, a warning and 
falling back to fixed seems like the right thing to do.

I did have concerns about this, similarly like @fhahn:

> Hm, I am just a bit worried that it might be a bit confusing to users that do 
> not know what scalable vectors are (it is obvious when knowing all about SVE, 
> but I would assume most people don't necessarily know what this means). I 
> guess that is not the biggest deal, as long vectorize_width(X, scalable) 
> works for every target.

But since the new scalable option is opt-in, people don't need know about this 
if they don't want/need to, this should (hopefully) not be the case.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89031/new/

https://reviews.llvm.org/D89031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors

2020-11-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Because I was not understanding, we have discussed this further offline.

I think the conclusion was: pragma `vectorize_width` controls the vectorisation 
vector `VF` in ``. where `vscale` is not just a separate 
thing but it defines a VectorType. That's why it would make sense to attach 
`scalable|fixed` to `vectorize_width`. I agree with this and seems reasonable.

I still don't see how the proposed extension here allows you to specify fixed 
width vectorisation for:

  #pragma clang loop interleave_count(4)

targeting SVE and would appreciate if someone can comment on this example, but 
won't be holding up this work.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89031/new/

https://reviews.llvm.org/D89031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors

2020-11-16 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

> This approach should be fully complementary to `vectorize_with` so that it 
> would be possible to have:
>
>   // Use scalable vectors, but leave it to the cost-model to choose the most 
> efficient N in .
>   // If the pragma is not specified, it defaults to vectorize_style(fixed).
>   #pragma clang loop vectorize_style(scalable)
>   
>   // Use <4 x eltty>
>   #pragma clang loop vectorize_width(4, fixed)
>   
>   // Use 
>   #pragma clang loop vectorize_width(4, scalable)
>   
>   // If vectorize_style(scalable) is specified, then use  eltty>, otherwise <4 x eltty>
>   #pragma clang loop vectorize_width(4)   // uses <4 
> x eltty>
>   #pragma clang loop vectorize_width(4) vectorize_style(scalable) // uses 
> 
>   
>   // Conflicting options, clang should print diagnostic and error or ignore 
> the hint.
>   #pragma clang loop vectorize_width(4, fixed) vectorize_style(scalable)
>
> I hope that gives a bit more context.

Ok, thanks for clarifying that!

If:

  // Use 
  #pragma clang loop vectorize_width(4, scalable)

is equivalent to:

  // uses 
  #pragma clang loop vectorize_width(4) vectorize_style(scalable)

then I think that illustrates that I don't see the point of extending 
`vectorize_width` because we still can't express scalable vectorisation for:

  // 
  #pragma clang loop vectorize_predicate(enable)

and also for `interleave_count(4)`?

Again, when the idea is to have vectorize_style anyway, wouldn't it be easier 
not to bother extending vectorize_width and just go for vectorize_style? It 
allows you to specify fixed/scalable vectorisation in one way, and avoids 
having conflicting options.

The other thing I thought about: this is extending an existing user-facing 
pragma, and notifying the list would probably be best thing to do.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89031/new/

https://reviews.llvm.org/D89031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors

2020-11-12 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D89031#2391160 , @david-arm wrote:

> Hi @SjoerdMeijer I think that given we now support scalable vectors we 
> thought it made sense to be able to specify whether the user wants 'fixed' or 
> 'scalable' vectorisation along with the vector width, although without 
> specifying the additional property the default continues to remain 'fixed'. 
> However, what you said about having a vectorize_scalable pragma is correct 
> and we are intending to also add a pragma like this in a future patch.

Okay, I haven't looked at the implementation to be honest, but am just trying 
to understand the different use cases of this first.
I just seem to be missing or not understanding why fixed/scalable is an option 
to only vectorize_width, why not to vectorize(enable) or just a separate one 
like vectorize_scalable? By making scalable/fixed and option to 
vectorize_width, you can't toggle this for other pragmas like 
interleave(enable) that enable vectorisation, which would be inconsistent? It 
also seems to be more work to me to do this first for vectorize_width, and then 
fix up other pragmas later. But I might be missing something (obivous) here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89031/new/

https://reviews.llvm.org/D89031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors

2020-11-11 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

I am very sorry that I am late to this... but I do have some concerns.

The concern that I have is that we extend vecorize_width with a scalable/fixed 
boolean, but there are more vectorisation pragma that set vectorisation options 
which imply enabling vectorisation:

> Pragmas setting transformation options imply the transformation is enabled, 
> as if it was enabled via the corresponding transformation pragma (e.g. 
> vectorize(enable))

Thus, unless I miss something, I don't think this should be an option to 
vectorize_width, but to me it looks like we need a separate one, e.g.:

  vectorize_scalable(enable|disable)

what do you think?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89031/new/

https://reviews.llvm.org/D89031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89972: Add pipeline model for HiSilicon's TSV110

2020-10-30 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

I haven't checked the instruction descriptions in detail, but the overall 
structure looks good to me. Perhaps wait a day with committing in case 
@bryanpkc has more comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89972/new/

https://reviews.llvm.org/D89972

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89972: Add pipeline model for HiSilicon's TSV110

2020-10-29 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/test/Driver/aarch64-cpus.c:298
 
+// RUN: %clang -target aarch64 -mcpu=tsv110 -### -c %s 2>&1 | FileCheck 
-check-prefix=TSV110 %s
+// RUN: %clang -target aarch64 -mlittle-endian -mcpu=tsv110 -### -c %s 2>&1 | 
FileCheck -check-prefix=TSV110 %s

This is unrelated to the scheduling model. 
Looks like good tests to me, so please go ahead and commit this separately, no 
need for another review.



Comment at: llvm/lib/Target/AArch64/AArch64SchedTSV110Details.td:1
+//==- AArch64SchedTSV110Details.td - TSV110 Scheduling Defs -*- tablegen 
-*-==//
+//

Just a nit/question on this new file: is there a benefit of having this 
separately? If there's none, just merge into  AArch64SchedTSV110.td, as that 
would be more consistent with other schedmodels?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89972/new/

https://reviews.llvm.org/D89972

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88660: [AArch64] Add CPU Cortex-R82

2020-10-02 Thread Sjoerd Meijer via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8825fec37e73: [AArch64] Add CPU Cortex-R82 (authored by 
SjoerdMeijer).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changed prior to commit:
  https://reviews.llvm.org/D88660?vs=295573=295791#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88660/new/

https://reviews.llvm.org/D88660

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/aarch64-dotprod.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -881,6 +881,14 @@
   AArch64::AEK_LSE | AArch64::AEK_FP16 | AArch64::AEK_DOTPROD |
   AArch64::AEK_RCPC | AArch64::AEK_SSBS,
   "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+ "cortex-r82", "armv8-r", "crypto-neon-fp-armv8",
+  AArch64::AEK_CRC | AArch64::AEK_RDM  | AArch64::AEK_SSBS|
+  AArch64::AEK_CRYPTO  | AArch64::AEK_SM4  | AArch64::AEK_SHA3|
+  AArch64::AEK_SHA2| AArch64::AEK_AES  | AArch64::AEK_DOTPROD |
+  AArch64::AEK_FP  | AArch64::AEK_SIMD | AArch64::AEK_FP16|
+  AArch64::AEK_FP16FML | AArch64::AEK_RAS  | AArch64::AEK_RCPC|
+  AArch64::AEK_SB, "8-R"));
   EXPECT_TRUE(testAArch64CPU(
   "cortex-x1", "armv8.2-a", "crypto-neon-fp-armv8",
   AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_FP |
@@ -1026,7 +1034,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 42;
+static constexpr unsigned NumAArch64CPUArchs = 43;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
===
--- llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+++ llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
@@ -5251,6 +5251,7 @@
 case AArch64::ArchKind::ARMV8_4A:
 case AArch64::ArchKind::ARMV8_5A:
 case AArch64::ArchKind::ARMV8_6A:
+case AArch64::ArchKind::ARMV8R:
   RequestedExtensions.push_back("sm4");
   RequestedExtensions.push_back("sha3");
   RequestedExtensions.push_back("sha2");
Index: llvm/lib/Target/AArch64/AArch64SystemOperands.td
===
--- llvm/lib/Target/AArch64/AArch64SystemOperands.td
+++ llvm/lib/Target/AArch64/AArch64SystemOperands.td
@@ -32,6 +32,11 @@
  AssemblerPredicate<(all_of FeaturePAN_RWV),
  "ARM v8.2 PAN AT S1E1R and AT S1E1W Variation">;
 
+def HasCONTEXTIDREL2
+   : Predicate<"Subtarget->hasCONTEXTIDREL2()">,
+ AssemblerPredicate<(all_of FeatureCONTEXTIDREL2),
+ "Target contains CONTEXTIDR_EL2 RW operand">;
+
 //===--===//
 // AT (address translate) instruction options.
 //===--===//
@@ -1220,7 +1225,6 @@
 //  Op0Op1 CRn CRmOp2
 let Requires = [{ {AArch64::FeatureVH} }] in {
 def : RWSysReg<"TTBR1_EL2",   0b11, 0b100, 0b0010, 0b, 0b001>;
-def : RWSysReg<"CONTEXTIDR_EL2",  0b11, 0b100, 0b1101, 0b, 0b001>;
 def : RWSysReg<"CNTHV_TVAL_EL2",  0b11, 0b100, 0b1110, 0b0011, 0b000>;
 def : RWSysReg<"CNTHV_CVAL_EL2",  0b11, 0b100, 0b1110, 0b0011, 0b010>;
 def : RWSysReg<"CNTHV_CTL_EL2",   0b11, 0b100, 0b1110, 0b0011, 0b001>;
@@ -1246,6 +1250,9 @@
 def : RWSysReg<"CNTV_CVAL_EL02",  0b11, 0b101, 0b1110, 0b0011, 0b010>;
 def : RWSysReg<"SPSR_EL12",   0b11, 0b101, 0b0100, 0b, 0b000>;
 def : RWSysReg<"ELR_EL12",0b11, 0b101, 0b0100, 0b, 0b001>;
+let Requires = [{ {AArch64::FeatureCONTEXTIDREL2} }] in {
+  def : RWSysReg<"CONTEXTIDR_EL2",  0b11, 0b100, 0b1101, 0b, 0b001>;
+}
 }
 // v8.2a registers
 //  Op0Op1 CRn CRmOp2
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -57,6 +57,7 @@
 CortexA76,
 CortexA77,
 CortexA78,
+CortexR82,
 CortexX1,
 ExynosM3,
 Falkor,
@@ -84,6 +85,9 @@
   bool HasV8_5aOps = false;
   bool 

[PATCH] D84703: [clang codegen][AArch64] Use llvm.aarch64.neon.fcvtzs/u where it's necessary

2020-07-28 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Cheers, LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84703/new/

https://reviews.llvm.org/D84703

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84180: [Matrix] Add LowerMatrixIntrinsics to the NPM

2020-07-22 Thread Sjoerd Meijer via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5567c62afa55: [Matrix] Add LowerMatrixIntrinsics to the NPM 
(authored by SjoerdMeijer).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changed prior to commit:
  https://reviews.llvm.org/D84180?vs=279559=279727#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84180/new/

https://reviews.llvm.org/D84180

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/test/CodeGen/matrix-lowering-opt-levels.c
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

Index: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
===
--- llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -149,9 +149,9 @@
 "enable-order-file-instrumentation", cl::init(false), cl::Hidden,
 cl::desc("Enable order file instrumentation (default = off)"));
 
-static cl::opt
-EnableMatrix("enable-matrix", cl::init(false), cl::Hidden,
- cl::desc("Enable lowering of the matrix intrinsics"));
+cl::opt EnableMatrix(
+"enable-matrix", cl::init(false), cl::Hidden,
+cl::desc("Enable lowering of the matrix intrinsics"));
 
 cl::opt AttributorRun(
 "attributor-enable", cl::Hidden, cl::init(AttributorRunOption::NONE),
Index: llvm/lib/Passes/PassBuilder.cpp
===
--- llvm/lib/Passes/PassBuilder.cpp
+++ llvm/lib/Passes/PassBuilder.cpp
@@ -275,6 +275,8 @@
 extern cl::opt AttributorRun;
 extern cl::opt EnableKnowledgeRetention;
 
+extern cl::opt EnableMatrix;
+
 const PassBuilder::OptimizationLevel PassBuilder::OptimizationLevel::O0 = {
 /*SpeedLevel*/ 0,
 /*SizeLevel*/ 0};
@@ -1093,6 +1095,11 @@
   OptimizePM.addPass(Float2IntPass());
   OptimizePM.addPass(LowerConstantIntrinsicsPass());
 
+  if (EnableMatrix) {
+OptimizePM.addPass(LowerMatrixIntrinsicsPass());
+OptimizePM.addPass(EarlyCSEPass());
+  }
+
   // FIXME: We need to run some loop optimizations to re-rotate loops after
   // simplify-cfg and others undo their rotation.
 
Index: clang/test/CodeGen/matrix-lowering-opt-levels.c
===
--- /dev/null
+++ clang/test/CodeGen/matrix-lowering-opt-levels.c
@@ -0,0 +1,28 @@
+// RUN: %clang -O0 -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -O1 -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -O2 -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -O3 -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -Ofast -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -Os -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -Oz -fenable-matrix -S -emit-llvm %s -o - | FileCheck  %s
+
+// RUN: %clang -O0 -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -O1 -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -O2 -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -O3 -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -Ofast -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -Os -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+// RUN: %clang -Oz -fenable-matrix -fexperimental-new-pass-manager -S -emit-llvm %s -o - | FileCheck  %s
+
+// Smoke test that the matrix intrinsics are lowered at any optimisation level.
+
+typedef float m4x4_t __attribute__((matrix_type(4, 4)));
+
+m4x4_t f(m4x4_t a, m4x4_t b, m4x4_t c) {
+  //
+  // CHECK-LABEL: f(
+  // CHECK-NOT: @llvm.matrix
+  // CHECK:   }
+  //
+  return a + b * c;
+}
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -74,6 +74,7 @@
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar.h"
 #include "llvm/Transforms/Scalar/GVN.h"
+#include "llvm/Transforms/Scalar/LowerMatrixIntrinsics.h"
 #include "llvm/Transforms/Utils.h"
 #include "llvm/Transforms/Utils/CanonicalizeAliases.h"
 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
@@ -1372,6 +1373,13 @@
 }
 
 if (CodeGenOpts.OptimizationLevel == 0) {
+  // FIXME: the backends do not handle matrix intrinsics currently. Make
+  // sure they are also lowered in O0. A lightweight version of the pass
+  // should run in the backend pipeline on demand.
+  if (LangOpts.MatrixTypes)
+MPM.addPass(
+createModuleToFunctionPassAdaptor(LowerMatrixIntrinsicsPass()));
+
   addCoroutinePassesAtO0(MPM, 

[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:118
+
+  if (!llvm::AArch64::getArchFeatures(ArchKind, Features))
+return false;

Would it be more consistent to move this



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:273
   const auto ItRFP16FML = std::find(Features.rbegin(), Features.rend(), 
"+fp16fml");
   if (llvm::is_contained(Features, "+v8.4a")) {
 const auto ItRFullFP16  = std::find(Features.rbegin(), Features.rend(), 
"+fullfp16");

...to somewhere here where implied target features are handled



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:373
   }
 
   if (Arg *A = Args.getLastArg(options::OPT_mno_unaligned_access,

For example, to here.



Comment at: clang/test/Preprocessor/aarch64-target-features.c:115
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM 
%s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1

Can you add a run line for v8.5 if there isn't already one, and add CHECK-NOTs 
for these macros.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83079: [clang] Default target features implied by `-march` on AArch64.

2020-07-02 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

I haven't looked into details of these arch extensions yet, will do that 
tomorrow, but I don't think there's any disagreement, i.e., these options and 
their behaviour are synchronised with the GCC community.  Thus, it's better if 
you remove this wording from both description and comments in the source code. 
If there is divergence in behaviour, then that is probably be an oversight 
somewhere. I am also pretty sure that this 8.5 stuff is implemented, not sure 
if that is upstreamed. And also, I think we need to be more specific than 
"Default target features implied by `-march` on AArch64", because this is a big 
topic, which is not addressed in this patch.  This patch is only about a few 
architecture extension. I am also for example surprised to see bfloat there, as 
I saw quite some activity in this area.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50229: [ARM][AArch64] Add feature +fp16fml

2020-07-02 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer marked an inline comment as done.
SjoerdMeijer added inline comments.



Comment at: test/Driver/aarch64-cpus.c:518
+// RUN: %clang -target aarch64 -march=armv8.4-a -### -c %s 2>&1 | FileCheck 
-check-prefix=GENERICV84A-NO-FP16FML %s
+// GENERICV84A-NO-FP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// GENERICV84A-NO-FP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"

fpetrogalli wrote:
> Hi @SjoerdMeijer , I have noticed that this test does something different 
> from what gcc does (well, claims to do, I haven't checked the actual behavior 
> on gcc).
> 
> From the table in [1], it seems that `armv8.4-a` implies `fp16fml`... who got 
> it right? GCC or clang? Or am I missing something?
> 
> Francesco
> 
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options 
> (see the description of `-march=name`)
I think the ARMARM is pretty clear on this. Copied from one of the instruction 
descriptions enabled by fp16ml:

"In Armv8.2 and Armv8.3, this is an OPTIONAL instruction. From Armv8.4 it is 
mandatory for all implementations to support it."

So, looks like this needs fixing.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D50229/new/

https://reviews.llvm.org/D50229



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75169: [ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend

2020-06-11 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:524
+  CallConv))
+return;
   EVT ValueVT = Val.getValueType();

efriedma wrote:
> pratlucas wrote:
> > efriedma wrote:
> > > I'm not sure I understand why the standard 
> > > getCopyFromParts/getCopyToParts codepath doesn't work. Is the issue that 
> > > it uses FP_ROUND/FP_EXTEND to promote from f16 to f32?
> > Yes, the issue is the usage of FP_ROUND/FP_EXTEND indeed. Those cause the 
> > argument to be converted from f16 into f32 - with a `vcvtb.f16.f32` for 
> > instance - instead of simply being placed the value in the LSBs as required 
> > by the AAPCS.
> That makes sense.
> 
> It feels a little weird to have a TLI method to do the splitting, as opposed 
> to adding an extra check to the shared codepath, but I guess this way is more 
> flexible if someone else needs a similar change in the future.
> 
> One other thing to consider is that we could make f16 a "legal" type for all 
> ARM subtargets with floating-point registers, regardless of whether the 
> target actually has native f16 arithmetic instructions. We do this on 
> AArch64.  That would reduce the number of different ways to handle f16 
> values, and I think this change would be unnecessary.
> One other thing to consider is that we could make f16 a "legal" type for all 
> ARM subtargets with floating-point registers, regardless of whether the 
> target actually has native f16 arithmetic instructions. We do this on AArch64.

I am partly guilty here and there when I added _Float16 and v8.2 FP16 support. 
When I did this, life was easy for AArch64, because it doesn't have a soft 
float ABI support and it has or hasn't got FP16 support, and so everything is 
straightforward. Life became an awful lot less pleasant for AArch32, because of 
the hard/soft float and different FP16 support, i.e. there are early FPU 
versions with limited fp16 support for the storage only type (some conversion), 
and from v8.2 and up the native fp16 instruction. It's been a few years now so 
can't remember exactly, which is a bit unhelpful,  but somewhere here for these 
corner cases I got into trouble by treating f16 as a legal type.

But in case you're interested / this might be useful, I think this is the mail 
that I wrote to the llvm dev list when I got into trouble here (which I 
actually need to reread too for details):

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120537.html

and as referred in that mail, earlier revisions of this:

https://reviews.llvm.org/D38315

might have taken exactly that approach. Long story short, I am not saying we 
shouldn't do it, just pointing out some background info. And since we're all a 
few years wiser now since that happened, perhaps we should try again ;-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75169/new/

https://reviews.llvm.org/D75169



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80716: [AArch64]: BFloat Load/Store Intrinsics

2020-06-10 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/test/CodeGen/AArch64/aarch64-bf16-ldst-intrinsics.ll:264
+; Function Attrs: argmemonly nounwind readonly
+declare { <8 x bfloat>, <8 x bfloat> } 
@llvm.aarch64.neon.ld2lane.v8bf16.p0i8(<8 x bfloat>, <8 x bfloat>, i64, i8*) #3
+

chill wrote:
> SjoerdMeijer wrote:
> > chill wrote:
> > > LukeGeeson wrote:
> > > > SjoerdMeijer wrote:
> > > > > LukeGeeson wrote:
> > > > > > arsenm wrote:
> > > > > > > Why is the IR type name bfloat and not bfloat16?
> > > > > > The naming for the IR type was agreed upon here after quite a big 
> > > > > > discussion. 
> > > > > > https://reviews.llvm.org/D78190
> > > > > I regret very much that I didn't notice this earlier... I.e., I 
> > > > > noticed this in D76077 and wrote that I am relatively unhappy about 
> > > > > this (I think I mentioned this on another ticket too).
> > > > > Because like @arsenm , I would expect the IR type name to be bfloat16.
> > > > > 
> > > > > Correct me if I am wrong, but I don't see a big discussion about this 
> > > > > in D78190. I only see 1 or 2 comments about `BFloat` vs `Bfloat`.
> > > > I cannot see a discussion about the IR type name per-se but I can see 
> > > > you were both involved in the discussion more generally.
> > > > 
> > > > I am concerned that this patch is the wrong place to discuss such 
> > > > issues, and that we should bring this up in a more appropriate place as 
> > > > you mention so that this patch isn't held back.
> > > I don't see a compelling reason for the name to be `bfloat16` or 
> > > `bfloat3`, etc. Like other floating-point types (`float`, `double`, and 
> > > `half`), the name denotes a specific externally defined format, unlike 
> > > `iN`.
> > > Like other floating-point types (float, double, and half), the name 
> > > denotes a specific externally defined format, 
> > 
> > Is the defined format not called bfloat16?
> Indeed, people use the name "bfloat16". But then the `half`, `float`, and 
> `double` also differ from the official `binary16`, `binarty32`, and 
> `binary64`.
> IMHO `bfloat` fits better in the LLVM IR naming convention.
yeah, so that's exactly why I don't follow your logic. If there's any logic in 
the names here, the mapping from source-language type to IR type seems the most 
plausible one. And I just don't see the benefit of dropping the 16, and how 
that would fit better in some naming scheme or how that makes things clearer 
here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80716/new/

https://reviews.llvm.org/D80716



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80716: [AArch64]: BFloat Load/Store Intrinsics

2020-06-10 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/test/CodeGen/AArch64/aarch64-bf16-ldst-intrinsics.ll:264
+; Function Attrs: argmemonly nounwind readonly
+declare { <8 x bfloat>, <8 x bfloat> } 
@llvm.aarch64.neon.ld2lane.v8bf16.p0i8(<8 x bfloat>, <8 x bfloat>, i64, i8*) #3
+

chill wrote:
> LukeGeeson wrote:
> > SjoerdMeijer wrote:
> > > LukeGeeson wrote:
> > > > arsenm wrote:
> > > > > Why is the IR type name bfloat and not bfloat16?
> > > > The naming for the IR type was agreed upon here after quite a big 
> > > > discussion. 
> > > > https://reviews.llvm.org/D78190
> > > I regret very much that I didn't notice this earlier... I.e., I noticed 
> > > this in D76077 and wrote that I am relatively unhappy about this (I think 
> > > I mentioned this on another ticket too).
> > > Because like @arsenm , I would expect the IR type name to be bfloat16.
> > > 
> > > Correct me if I am wrong, but I don't see a big discussion about this in 
> > > D78190. I only see 1 or 2 comments about `BFloat` vs `Bfloat`.
> > I cannot see a discussion about the IR type name per-se but I can see you 
> > were both involved in the discussion more generally.
> > 
> > I am concerned that this patch is the wrong place to discuss such issues, 
> > and that we should bring this up in a more appropriate place as you mention 
> > so that this patch isn't held back.
> I don't see a compelling reason for the name to be `bfloat16` or `bfloat3`, 
> etc. Like other floating-point types (`float`, `double`, and `half`), the 
> name denotes a specific externally defined format, unlike `iN`.
> Like other floating-point types (float, double, and half), the name denotes a 
> specific externally defined format, 

Is the defined format not called bfloat16?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80716/new/

https://reviews.llvm.org/D80716



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80716: [AArch64]: BFloat Load/Store Intrinsics

2020-06-10 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/test/CodeGen/AArch64/aarch64-bf16-ldst-intrinsics.ll:264
+; Function Attrs: argmemonly nounwind readonly
+declare { <8 x bfloat>, <8 x bfloat> } 
@llvm.aarch64.neon.ld2lane.v8bf16.p0i8(<8 x bfloat>, <8 x bfloat>, i64, i8*) #3
+

LukeGeeson wrote:
> arsenm wrote:
> > Why is the IR type name bfloat and not bfloat16?
> The naming for the IR type was agreed upon here after quite a big discussion. 
> https://reviews.llvm.org/D78190
I regret very much that I didn't notice this earlier... I.e., I noticed this in 
D76077 and wrote that I am relatively unhappy about this (I think I mentioned 
this on another ticket too).
Because like @arsenm , I would expect the IR type name to be bfloat16.

Correct me if I am wrong, but I don't see a big discussion about this in 
D78190. I only see 1 or 2 comments about `BFloat` vs `Bfloat`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80716/new/

https://reviews.llvm.org/D80716



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-06-04 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-06-01 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

> At the moment when going through the GCC compatibility driver (standard 
> interface), we get __bf16 is not supported on this target.

If this is the current behaviour, and consistent with GCC, that sounds 
reasonable. Probably best to be explicit about this and document this in 
LangRef. With a note that this is subject to change?
Oh, and I guess that means we can add a test this for this? We can check this 
error message?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-06-01 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Can you summarise where we are? I.e.,

- float-abi=soft doesn't work. But what is the problem? Are we not simply 
passing i16s, is that not what we are supposed to do?

Can you also update the description of this patch, I got totally confused by:

- "introduces an opaque, storage-only C-type __bf16, which does not introduce a 
new LLVM IR type, but maps it to either i16 or half type."




Comment at: clang/test/CodeGen/arm-mangle-16bit-float.cpp:1
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi 
-fallow-half-arguments-and-returns -target-feature +fullfp16 -emit-llvm -o - %s 
| FileCheck %s --check-prefix=CHECK64
+// RUN: %clang_cc1 -triple arm-arm-none-eabi 
-fallow-half-arguments-and-returns -target-feature +fullfp16 -emit-llvm -o - %s 
| FileCheck %s --check-prefix=CHECK32

This is testing name mangling of __fp16, so is not related to this patch, 
please commit this separately if this is not tested yet. But I do suspect Dh is 
tested already, but I could be wrong.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-06-01 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/test/CodeGen/arm-bf16-params-returns.c:5
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-abi aapcs -mfloat-abi 
softfp -target-feature +bf16 -target-feature +neon -emit-llvm -O2 -o - %s | opt 
-S -mem2reg -sroa | FileCheck %s --check-prefix=CHECK64-SOFTFP
+
+// function return types

stuij wrote:
> SjoerdMeijer wrote:
> > what happens with `-mfloat-abi=soft`. Does that deserve a test?
> Yes, this one is interesting. I think we shouldn't support bfloat at all in 
> combination with -mfloat-abi=soft. We don't support software emulation of 
> bfloat instructions and all operations on bfloat are simd instructions.
> 
> It turns out cc1 will accept -mfloat-abi=soft with neon intrinsics, which 
> will happily churn out neon instructions. This doesn't sound very soft. The 
> driver will ignore -mfloat-abi=soft in certain combinations of cmdline 
> instructions, but I haven't delved deep enough to know what's what.
> 
> GCC doesn't allow soft+neon combination. Unfortunately it will actually crash 
> for just a bfloat type by itself, which is quite useless without intrinsics. 
> The Arm GCC folks will raise a ticket on this with as proposed solution to 
> not allow this combination.
> 
> As this issue seems bigger than just bfloat, and potentially there's driver 
> code involved as well I thought it'd make sense to handle this in a separate 
> patch.
I think we first need agreement what -mfloat-abi=soft  with bf16 means and how 
it should behave, document this, and have some tests. Possibly document how we 
diverge from this.

I think I tend to disagree with this:

> I think we shouldn't support bfloat at all in combination with 
> -mfloat-abi=soft.

why would it not be supported in some way (promotions to another type, or even 
library calls), like the other float-types? 

> It turns out cc1 will accept -mfloat-abi=soft with neon intrinsics, which 
> will happily churn out neon instructions.

I would say the fact that there are other problems, shouldn't distract us too 
much from trying to get this right; I think at this point that is not yet a 
justification. 

> As this issue seems bigger than just bfloat, and potentially there's driver 
> code involved as well I thought it'd make sense to handle this in a separate 
> patch.

I think at this point I disagree with this, mainly because of my first point: 
the behaviour should be specified. I would also say that not doing this, could 
be a bit of bad precedent for adding a new C type. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/test/CodeGen/arm-mangle-bf16.cpp:3
+// RUN: %clang_cc1 -triple arm-arm-none-eabi -target-feature +bf16 -mfloat-abi 
hard -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK32-HARD
+// RUN: %clang_cc1 -triple arm-arm-none-eabi -target-feature +bf16 -mfloat-abi 
softfp -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK32-SOFTFP
+

SjoerdMeijer wrote:
> nit: if the name mangling is Independent of the float ABI, then you might as 
> well one runline and the abi option.
-> ...  then you might as well have only one runline and remove the abi option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

As I wrote in one of the inlined comments, I am relatively unhappy that we have 
both bfloat and bfloat16 as names. But that looks like a complain for another 
patch, and not really this one.




Comment at: clang/lib/Basic/Targets/AArch64.cpp:74
+  BFloat16Width = BFloat16Align = 16;
+  BFloat16Format = ::APFloat::BFloat();
+

stuij wrote:
> SjoerdMeijer wrote:
> > Nit: we use bfloat16 everywhere, would `llvm::APFloat::BFloat16()` be 
> > better for consistency?
> In the IR world bfloat is consistently called bfloat, without the 16. I think 
> this might turn into bikeshedding to justify if either one or both sides of 
> the divide should or should not include the '16'. I can think of arguments to 
> support the various options.
> 
> As I think there's no clear winner, I'd like to take the route of less effort 
> and keep things as is.
ah, I couldn't remember on which patch I commented on the bfloat16 vs bfloat 
naming.
Since everything is called bfloat16, from the architecture extension to the 
source language type, I find this a bit of a missed opportunity to get the 
names consistent. And I wouldn't say that naming of types is bikeshedding. But 
maybe that's just me...



Comment at: clang/lib/Basic/Targets/ARM.cpp:80
 DoubleAlign = LongLongAlign = LongDoubleAlign = SuitableAlign = 32;
+  BFloat16Width = BFloat16Align = 16;
+  BFloat16Format = ::APFloat::BFloat();

Is this tested? Can it be tested?



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4483
+ bool V1Ty=false,
+ bool HasBFloat16Type=true) {
   int IsQuad = TypeFlags.isQuad();

nit: the linter says some spaces here.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:5514
 
-  llvm::VectorType *VTy = GetNeonType(this, Type, HasLegalHalfType);
+  llvm::VectorType *VTy = GetNeonType(this, Type, HasLegalHalfType, false, 
HasBFloat16Type);
   llvm::Type *Ty = VTy;

nit: looks like the hinter right to be unhappy about the formatting



Comment at: clang/lib/CodeGen/TargetInfo.cpp:6270
 
+  // __bf16  get passed using the bfloat ir type, or using i32 but
+  // with the top 16 bits unspecified.

nit: ir -> IR



Comment at: clang/lib/CodeGen/TargetInfo.cpp:6471
+// FP16/BF16 vectors should be converted to integer vectors
+// This check is similar  to isIllegalVectorType - refactor?
+if ((!getTarget().hasLegalHalfType() &&

nit: perhaps add TODO, or just remove this.



Comment at: clang/lib/Sema/SemaOverload.cpp:1874
+
+// conversions between bfloat and other floats are not permitted
+if (FromType == S.Context.BFloat16Ty || ToType == S.Context.BFloat16Ty)

Nit: conversions -> Conversions,  permitted -> permitted.

Same on line no. 1895.



Comment at: clang/test/CodeGen/arm-bf16-params-returns.c:5
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-abi aapcs -mfloat-abi 
softfp -target-feature +bf16 -target-feature +neon -emit-llvm -O2 -o - %s | opt 
-S -mem2reg -sroa | FileCheck %s --check-prefix=CHECK64-SOFTFP
+
+// function return types

what happens with `-mfloat-abi=soft`. Does that deserve a test?



Comment at: clang/test/CodeGen/arm-mangle-16bit-float.cpp:4
+
+// CHECK64: define {{.*}}void @_Z3foou6__bf16(half %b)
+// CHECK32: define {{.*}}void @_Z3foou6__bf16(i32 %b.coerce)

LukeGeeson wrote:
> SjoerdMeijer wrote:
> > LukeGeeson wrote:
> > > craig.topper wrote:
> > > > How can bfloat16 be passed as half? Don't they have a different format?
> > > see the above comment about soft-abis
> > Probably I am bit confused too now... are the three possible types that we 
> > are expecing not bfloat, i32, or i16? 
> Hi Sjoerd, Good spot here I forgot to add a default case somewhere, which 
> means AArch32 didn't observe `-mfloat-abi softfp` by default - and hence used 
> bfloat where i32 was expected. 
> 
> I have a local patch for this and will send to Ties to merge into this one :) 
> 
> 
> We're not sure what you mean by i16 however?
Ah, as I said, got confused. Now I see we use i16 for targets that don't 
support bfloat16. So we probably need a test for that.



Comment at: clang/test/CodeGen/arm-mangle-bf16.cpp:3
+// RUN: %clang_cc1 -triple arm-arm-none-eabi -target-feature +bf16 -mfloat-abi 
hard -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK32-HARD
+// RUN: %clang_cc1 -triple arm-arm-none-eabi -target-feature +bf16 -mfloat-abi 
softfp -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK32-SOFTFP
+

nit: if the name mangling is Independent of the float ABI, then you might as 
well one runline and the abi option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  

[PATCH] D79708: [clang][BFloat] add NEON emitter for bfloat

2020-05-27 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

No objections.
Some nits inlined, which you can ignore if you think they are not correct.




Comment at: clang/include/clang/Basic/arm_neon_incl.td:218
 // d: double
+// b: bfloat
 //

nit: perhaps bfloat16?



Comment at: clang/include/clang/Basic/arm_neon_incl.td:240
 // F: change to floating category.
+// B: change to BFloat
 // P: change to polynomial category.

nit: perhaps BFloat16?



Comment at: clang/test/Preprocessor/aarch64-target-features.c:378
+
+// == Check BFloat Extensions.
+// RUN: %clang -target aarch64-arm-none-eabi -march=armv8.6-a+bf16 -x c -E -dM 
%s -o - 2>&1 | FileCheck -check-prefix=CHECK-BFLOAT %s

nit: BFloat16



Comment at: clang/test/Preprocessor/arm-target-features.c:855
+
+// == Check BFloat Extensions.
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.6-a+bf16 -x c -E -dM %s 
-o - 2>&1 | FileCheck -check-prefix=CHECK-BFLOAT %s

same



Comment at: clang/utils/TableGen/NeonEmitter.cpp:746
 Name = Name.drop_front(4);
+  } else if (Name.startswith("bfloat")) {
+T.Kind = BFloat16;

same?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79708/new/

https://reviews.llvm.org/D79708



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-19 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Best to ask the release manager. Today Tom Stellard posted a message to the 
llvm dev list with subject:

  [llvm-dev] LLVM 10.0.1-rc1 release update

So best to reply to this(*) asap with your query if it's possible to get this 
onto the branch.

(*) http://lists.llvm.org/pipermail/llvm-dev/2020-May/141670.html


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

There are Sema and CodeGen tests, but was wondering if there would be some 
value in having an AST test too? There are some other types that do have AST 
tests.




Comment at: clang/lib/AST/ASTContext.cpp:2052
+  Width = Target->getBFloat16Width();
+  Align = Target->getBFloat16Align();
 case BuiltinType::Float16:

Is a `break` missing here?



Comment at: clang/lib/CodeGen/CodeGenTypes.cpp:307
+else
+  return llvm::Type::getInt16Ty(VMContext);
+  }

Is this covered in tests?



Comment at: clang/lib/CodeGen/TargetInfo.cpp:6454
+// types into integer vectors.
+// We do not depend on haslegalhalf type for bfloat as it is a
+// separate IR type

nit: camelcase fase for haslegalhalf



Comment at: clang/test/CodeGen/arm-mangle-16bit-float.cpp:4
+
+// CHECK64: define {{.*}}void @_Z3foou6__bf16(half %b)
+// CHECK32: define {{.*}}void @_Z3foou6__bf16(i32 %b.coerce)

LukeGeeson wrote:
> craig.topper wrote:
> > How can bfloat16 be passed as half? Don't they have a different format?
> see the above comment about soft-abis
Probably I am bit confused too now... are the three possible types that we are 
expecing not bfloat, i32, or i16? 



Comment at: clang/test/CodeGen/arm-mangle-16bit-float.cpp:2
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi 
-fallow-half-arguments-and-returns -target-feature +bf16 -target-feature 
+fullfp16 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK64
+// RUN: %clang_cc1 -triple arm-arm-none-eabi 
-fallow-half-arguments-and-returns -target-feature +bf16 -target-feature 
+fullfp16 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK32
+

Do you need to pass `+fullfp16`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:8111
   "pointer cannot be cast to type %0">;
+def err_cast_to_bfloat : Error<"cannot type-cast to __bf16">;
+def err_cast_from_bfloat : Error<"cannot type-cast from __bf16">;

Nit: was wondering if `err_cast_to_bfloat16` would be more consistent



Comment at: clang/lib/AST/ASTContext.cpp:6007
 case Float16Rank:
 case HalfRank: llvm_unreachable("Complex half is not supported");
 case FloatRank:  return FloatComplexTy;

nit: perhaps this error message is not entirely accurate for bfloat16?



Comment at: clang/lib/Basic/Targets/AArch64.cpp:74
+  BFloat16Width = BFloat16Align = 16;
+  BFloat16Format = ::APFloat::BFloat();
+

Nit: we use bfloat16 everywhere, would `llvm::APFloat::BFloat16()` be better 
for consistency?



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:115
   HalfTy = llvm::Type::getHalfTy(LLVMContext);
+  BFloatTy = llvm::Type::getBFloatTy(LLVMContext);
   FloatTy = llvm::Type::getFloatTy(LLVMContext);

nit: perhaps `getBFloat16Ty()`



Comment at: clang/lib/CodeGen/CodeGenTypeCache.h:39
+  /// half, bfloat, float, double
+  llvm::Type *HalfTy, *BFloatTy, *FloatTy, *DoubleTy;
 

And so here too, BFloat16Ty?



Comment at: clang/lib/CodeGen/TargetInfo.cpp:5844
   ABIKind Kind;
+  bool IsSoftFloatABI;
 

Nit: to distinguish the unfortunate float ABI names, I was wondering whether 
`IsFloatABISoftFP` is clearer, or something along those lines, just to make 
clear it is not the "soft" but the "softfp" variant


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:518
+Clang supports three half-precision (16-bit) floating point types: ``__fp16``,
+``_Float16`` and ``__bf16``.  These types are supported in all language modes.
 

stuij wrote:
> SjoerdMeijer wrote:
> > stuij wrote:
> > > SjoerdMeijer wrote:
> > > > Not my field of expertise, and sorry if I've missed this somewhere, but 
> > > > was wondering if this (i.e. all language modes) means we need C++ name 
> > > > mangling support for bfloat? In the AArch64 backend I saw this:
> > > > 
> > > >   getBFloat16Mangling() const override { return "u6__bf16"; }
> > > > 
> > > > But that's something else I guess, and I was guessing a 2 letter 
> > > > mangling needs to be added here?
> > > > 
> > > > https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin
> > > Yes, we need name-mangling support, and yes that method you mentioned 
> > > does that. There's one for AArch32 as well. And yes we have documented it 
> > > in the 'C++ ABI for the latest Arm Architecture' docs:
> > > 
> > > aarch32: https://developer.arm.com/docs/ihi0041/latest
> > > aarch64: https://developer.arm.com/docs/ihi0059/latest
> > But does that mean that for other targets this is not (yet) supported?
> Currently bfloat is a vendor-extended type, hence the 'u' prefix. So to me it 
> would make sense that backends should implement their own naming, as there's 
> no guarantee that the naming scheme chosen for Arm is compatible with all 
> backends.
Ok, true, fair enough


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-18 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:518
+Clang supports three half-precision (16-bit) floating point types: ``__fp16``,
+``_Float16`` and ``__bf16``.  These types are supported in all language modes.
 

stuij wrote:
> SjoerdMeijer wrote:
> > Not my field of expertise, and sorry if I've missed this somewhere, but was 
> > wondering if this (i.e. all language modes) means we need C++ name mangling 
> > support for bfloat? In the AArch64 backend I saw this:
> > 
> >   getBFloat16Mangling() const override { return "u6__bf16"; }
> > 
> > But that's something else I guess, and I was guessing a 2 letter mangling 
> > needs to be added here?
> > 
> > https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin
> Yes, we need name-mangling support, and yes that method you mentioned does 
> that. There's one for AArch32 as well. And yes we have documented it in the 
> 'C++ ABI for the latest Arm Architecture' docs:
> 
> aarch32: https://developer.arm.com/docs/ihi0041/latest
> aarch64: https://developer.arm.com/docs/ihi0059/latest
But does that mean that for other targets this is not (yet) supported?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-05-14 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:518
+Clang supports three half-precision (16-bit) floating point types: ``__fp16``,
+``_Float16`` and ``__bf16``.  These types are supported in all language modes.
 

Not my field of expertise, and sorry if I've missed this somewhere, but was 
wondering if this (i.e. all language modes) means we need C++ name mangling 
support for bfloat? In the AArch64 backend I saw this:

  getBFloat16Mangling() const override { return "u6__bf16"; }

But that's something else I guess, and I was guessing a 2 letter mangling needs 
to be added here?

https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78190: Add Bfloat IR type

2020-05-13 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.

Nothing much has changed here: there was already broad consensus on this change 
and direction, and the last few weeks we have only seen a few rounds of minor 
comments and nits, so still LGTM.
Please wait a day with committing to provide the opportunity for a last-minute 
objection to this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78190/new/

https://reviews.llvm.org/D78190



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79708: [clang][BFloat] add NEON emitter for bfloat

2020-05-13 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/include/clang/Basic/arm_bf16.td:1
+//===--- arm_fp16.td - ARM FP16 compiler interface 
===//
+//

typo: fp16 - > bf16?
Here, and a few more places, or is it intentional? If so, I guess that can be a 
bit confusing?




Comment at: clang/utils/TableGen/NeonEmitter.cpp:2411
+/// is comprised of type definitions and function declarations.
+void NeonEmitter::runFP16(raw_ostream ) {
+  OS << "/*=== arm_fp16.h - ARM FP16 intrinsics "

I am a bit confused here, we already have a runFP16, I am missing something?



Comment at: clang/utils/TableGen/NeonEmitter.cpp:2416
+" *\n"
+" * Permission is hereby granted, free of charge, to any person "
+"obtaining a copy\n"

I can't remember the outcome, but I had a discussion with @sdesmalen about this 
license, if this should be the new or old copyright notice. I believe, but am 
not certain, that this should be the new one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79708/new/

https://reviews.llvm.org/D79708



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-13 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Forgot to ask/add: can you commit this, do you have commit rights?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-13 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-11 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

This looks good now, but sorry, one more request: I've just noticed a Clang 
driver test is missing. Can you add a test for this to 
`clang/test/Driver/aarch64-cpus.c`? And related to this, the relevant tests to 
`llvm/unittests/Support/TargetParserTest.cpp`?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-08 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D78129#2023772 , @joelkevinjones 
wrote:

> I don't think it makes sense to combine two unrelated things SVE and PA 
> support into a combined thing. Since we already have UnsupportedFeatures in 
> every sub-target .td file, I think it would be better to instead have:
>
>   def PAUnsupported : AArch64Unsupported {
> let F = [HasPA];
>   }
>
>
> and modify each .td file to have
>
>   list UnsupportedFeatures = !listconcat(SVEUnsupported.F, 
> PAUnsupported.F);
>


Can you add what Joel suggested? I don't see the point of doing this as a 
follow up. This very simple thing is done half right here, so is best fixed 
here as there is no reason to leave this for another time; following up on this 
and talking about this is more work than just doing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Let's separate out HasPA from SVEFeatures now while we are at it, probably it's 
more work to do this as a follow up, and after that this looks good to me.

Bonus points for adding some llvm-mca tests, see the 
`llvm-project/llvm/test/tools/llvm-mca/` directory for some inspiration how 
this latencies can be checked, but that seems like something for follow-up.




Comment at: llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td:1992
+def : InstRW<[THX3T110Write_11Cyc_LS01_I1], (instregex "^LDRAA", "^LDRAB")>;
+// disable these instructions for the time being
+// pending further decisions

just remove them then instead of commenting them out.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-07 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Agreed about the UnsupportedFeatures list. But I just wanted to unblock this 
work, create a first version that compiles, so you can pick it up and clean 
things further up.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-06 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Since debugging on phabricator is a bit difficult, I took the patch and had a 
little play. Now I actually remember seeing this before. I think the way this 
works is that when you describe new instructions (PA in this case) that other 
models don't have, they start complaining about incomplete models when they 
have set `let CompleteModel = 1`

I got things compiling with these changes:

1. leave all existing models untouched, revert those changes,

2. remove that predicate HasPA that you added,

3. Change this:

>   def SVEUnsupported : AArch64Unsupported {
>   let F = [HasSVE, HasSVE2, HasSVE2AES, HasSVE2SM4, HasSVE2SHA3,
>   -   HasSVE2BitPerm];
>   +   HasSVE2BitPerm, HasPA];
>   }



4. In your new model set:

>   let CompleteModel =   0;
>   list UnsupportedFeatures = SVEUnsupported.F;



5. This is where the problem is, comment out these descriptions in your model:

>   //def : InstRW<[THX3T110Write_7Cyc_I1],
>   //(instrs AUTDA, AUTDB, AUTDZA, AUTDZB, AUTIA, AUTIA1716, 
> AUTIASP,
>   //AUTIAZ, AUTIB, AUTIB1716, AUTIBSP, AUTIBZ, AUTIZA, 
> AUTIZB,
>   //PACDA, PACDB, PACDZA, PACDZB, PACGA, PACIA, PACIA1716,
>   //PACIASP, PACIAZ, PACIB, PACIB1716, PACIBSP, PACIBZ,
>   //PACIZA, PACIZB, XPACD, XPACI, XPACLRI)>;

I have never really looked into these instructions, and quick grep doesn't show 
any descriptions for some. For example, AUTDA doesn't exist. The others look 
aliases, perhaps they need a different treatment, or a regexp, I don't know, 
but perhaps you can take it from here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-05-06 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

In D78129#2022307 , @joelkevinjones 
wrote:

> In email Wei asked for help about he following error message:
>
>   error message from tblgen
>   Included from 
> /home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64.td:439:
>   
> /home/wei/project/tx3/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.td:961:5:
>  error: 'CycloneModel' lacks information for 'AUTDZA'
>  
>   def DZA  : SignAuthZero;
>   ^
>   In the end:
>   Incomplete schedule models found.
>  
>   · Consider setting 'CompleteModel = 0' while developing new models.
>   · Pseudo instructions can be marked with 'hasNoSchedulingInfo = 1'.
>   · Instructions should usually have Sched<[...]> as a superclass, 
> you may temporarily use an empty list.
>   · Instructions related to unsupported features can be excluded with 
> list UnsupportedFeatures = [HasA,..,HasY]; in the processor model.
>   error: Incomplete schedule model
>
>
> And the comment from the person on the email was to define the instruction. 
> The instruction is defined, as evidenced when table gen is run to produce the 
> record-list. The error message can be suppressed by defining CompleteModele = 
> 0, but that isn't correct, as there are models for those instructions.
>
> However, I now note that at least In the output that Wei captured, it isn't 
> complaining about the ThunderX3T110 model, but about Cyclone.


Hi Joel,
Thanks for your message, and apologies for not replying earlier to the updates; 
missed the notification.
I am now taking a closer at some changes I hadn't looked at. In the mean, some 
drive-by comments:

It is okay'ish to set `CompleteModel = 0` if you're not interested in 
describing all instructions. The other way is of course to add the missing 
instruction to the model.

But first things first. @chill commented earlier about predicates. I also don't 
think you need to add `HasV8_3a` for the reason Momchill described, and there 
is still one left AArch64InstrInfo.td.

And while I am still looking into this, let me ask the question what I am 
wondering: why do you need to change the other scheduling models? These 
instructions are always supported, as a NOP, or otherwise.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78129: Add Marvell ThunderX3T110 support

2020-04-29 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64.td:849
+  HasV8_1aOps,
+  HasV8_3aOps]>;
+

`HasV8_3aOps` implies `HasV8_2aOps`, which implies `HasV8_1aOps`.
So you can just remove `HasV8_1aOps`? 



Comment at: llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td:13
+//===--===//
+
+//===--===//

I don't intend to check the numbers here, but just curious if there's an 
optimisation guide if people are curious?



Comment at: llvm/test/CodeGen/AArch64/loop-micro-op-buffer-size-t110.ll:1
+; REQUIRES: asserts
+; RUN: opt -mcpu=thunderx3t110 -loop-unroll --debug-only=loop-unroll 
--debug-only=basicblock-utils -S -unroll-allow-partial < %s 2>&1 | FileCheck %s

Couple of question about this test.
Looks like you're both testing output on stdout and stderr, and your checks for 
this are interleaved. If I am not mistaken, the behaviour of this can vary on 
different platforms, and thus may fail on different platforms.

But more importantly, what is this test exactly testing? The file name gives 
away a hint of course, but I don't see yet how this interact with loop 
unrolling. Is the loop unroller looking at MaxInterleaveFactor that is set 
here, but is that what we are testing here?



Comment at: llvm/test/CodeGen/AArch64/loop-micro-op-buffer-size-t110.ll:25
+; CHECK: %val5 = add nuw nsw i32 %counter, 10
+; CHECK-NOT: %val = add i32 %counter, 5
+; CHECK-NOT: %val = add i32 %counter, 6

I guess there won't be another define %val, it will be %val6, so this CHECK-NOT 
will never match even if there's another add?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78129/new/

https://reviews.llvm.org/D78129



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78995: [SveEmitter] NFCI: Describe splat operands like any other modifier

2020-04-28 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added inline comments.



Comment at: clang/utils/TableGen/SveEmitter.cpp:68
   TypeSpec TS;
+  bool IsSplat;
   bool Float, Signed, Immediate, Void, Constant, Pointer;

I was wondering if IsSplat belongs here or somewhere else. It doesn't sound 
like a type, like the other members do, but more a specific case of a vector? 
But saying this off the top of my head, haven't looked at the code and the 
context, but perhaps you can comment on this if this is where it should be or 
not.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78995/new/

https://reviews.llvm.org/D78995



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78190: Add Bfloat IR type

2020-04-23 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

This direction of an IR type was taken after a discussion on the list. All 
previous comments have been addressed, and the changes here all look very 
reasonable to me. So, LGTM, but I think we need a LGTM from someone else too. 
So please wait for one more LGTM.




Comment at: llvm/include/llvm/IR/DataLayout.h:655
 return TypeSize::Fixed(16);
+  case Type::BfloatTyID:
+return TypeSize::Fixed(16);

Nit:

  case Type::HalfTyID:
  case Type::BfloatTyID:
return TypeSize::Fixed(16);



Comment at: llvm/lib/Support/APFloat.cpp:3266
 
+APInt IEEEFloat::convertBfloatAPFloatToAPInt() const {
+  assert(semantics == (const llvm::fltSemantics *));

I am not real big fan of all the code duplication here in this file (just the 
constants are different). But there's enough prior art here, so good for now I 
think, and a nice opportunity for a refactoring follow-up.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78190/new/

https://reviews.llvm.org/D78190



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78565: [clang][doc] Clang ARM CPU command line argument reference

2020-04-22 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

To be completely open about this, I had an offline chat with @psmith  and 
@kristof.beyls  about this. The reasoning is that this is probably related to 
my relatively poor choice of the 2 first cores that I describe here. They are 
microcontrollers, and Clang/LLVM doesn't provide a full embedded toolchain 
(linker, libraries, etc.), which is e.g. what we add downstream. So, if you're 
creating an embedded toolchain, you probably know what you're doing, and this 
is not one of the first problems that you have.

While I agree with that, I also mentioned that there are many issues orthogonal 
to this. Yes, we need a full toolchain, but I don't understand how improving 
things step-by-step can be a bad thing, or why we should wait for another thing 
to appear first before this becomes more valuable.  And yes, also -m option 
documentation needs to be improved, but again, for me that is orthogonal to 
this doc change. And what I also mentioned, is that absolutely nothing changes 
about the whole story here if you take an A-core application processor instead 
of an M-core, for which the toolchain probably looks simpler.

I agree with all your updates and replies on the cfe dev list. You mentioned 
"complete overhaul", and yes, I think option handling is inconsistent, 
undocumented, and so in general quite broken. Problem is that I didn't get a 
lot of buy in here and on the mailing list. So, I was happy to abandon this as 
I didn't feel like spending more time on this at this point.   But my idea was 
to step-by-step add more cores to this doc, also A-cores.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78565/new/

https://reviews.llvm.org/D78565



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   >