[PATCH] D65863: [ARM] Add support for the s,j,x,N,O inline asm constraints

2019-08-07 Thread David Candler via Phabricator via cfe-commits
dcandler created this revision.
dcandler added reviewers: rsmith, t.p.northover, compnerd, void, joerg, 
efriedma, ostannard.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls, 
javed.absar.
Herald added projects: clang, LLVM.

A number of inline assembly constraints are currently supported by LLVM, but 
rejected as invalid by Clang:

Target independent constraints:

s: An integer constant, but allowing only relocatable values

ARM specific constraints:

j: An immediate integer between 0 and 65535 (valid for MOVW)
x: A 32, 64, or 128-bit floating-point/SIMD register: s0-s15, d0-d7, or q0-q3
N: An immediate integer between 0 and 31 (Thumb1 only)
O: An immediate integer which is a multiple of 4 between -508 and 508. (Thumb1 
only)

This patch adds support to Clang for the missing constraints, along with checks 
to ensure that each constraint is used with the correct target and Thumb mode, 
and that immediates are within valid ranges (at least where possible). The 
constraints are already implemented in LLVM, but a couple of its checks need 
minor corrections (V8M Baseline includes MOVW, so it should accept 'j'; 'N' and 
'O' shouldn't be valid in Thumb2) so that Clang and LLVM are in line with each 
other and the documentation.
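
As a rough illustration of the immediate ranges listed above, the following 
standalone sketch mirrors the checks for 'j', 'N' and 'O'. The `Target` struct 
and `isValidImmediate` are hypothetical names used only for this example (they 
are not Clang or LLVM APIs), and the subtarget gating is simplified to two 
booleans.

```cpp
#include <cstdint>

// Hypothetical, simplified model of the subtarget; not an LLVM type.
struct Target {
  bool hasMovw;    // assumed true for v6T2 and later, and for v8-M Baseline
  bool thumb1Only; // assumed true only when targeting Thumb1
};

// Returns true if `value` satisfies constraint letter `c` under the ranges
// quoted in the description above.
bool isValidImmediate(char c, int64_t value, const Target &t) {
  switch (c) {
  case 'j': // 0..65535, only where MOVW is available
    return t.hasMovw && value >= 0 && value <= 65535;
  case 'N': // 0..31, Thumb1 only
    return t.thumb1Only && value >= 0 && value <= 31;
  case 'O': // multiple of 4 between -508 and 508, Thumb1 only
    return t.thumb1Only && value >= -508 && value <= 508 && (value & 3) == 0;
  default:
    return false; // other constraint letters are outside this sketch
  }
}
```

Under these assumptions, 65535 is accepted for 'j' on a target with MOVW while 
65536 is rejected, which matches the expectations in the test file below.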


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D65863

Files:
  clang/lib/Basic/Targets/ARM.cpp
  clang/test/Sema/arm_inline_asm_constraints.c
  llvm/lib/Target/ARM/ARMISelLowering.cpp

Index: llvm/lib/Target/ARM/ARMISelLowering.cpp
===
--- llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -15143,7 +15143,7 @@
   case 'j':
 // Constant suitable for movw, must be between 0 and
 // 65535.
-if (Subtarget->hasV6T2Ops())
+if (Subtarget->hasV6T2Ops() || (Subtarget->hasV8MBaselineOps()))
   if (CVal >= 0 && CVal <= 65535)
 break;
 return;
@@ -15251,7 +15251,7 @@
 return;
 
   case 'N':
-if (Subtarget->isThumb()) {  // FIXME thumb2
+if (Subtarget->isThumb1Only()) {
   // This must be a constant between 0 and 31, for shift amounts.
   if (CVal >= 0 && CVal <= 31)
 break;
@@ -15259,7 +15259,7 @@
 return;
 
   case 'O':
-if (Subtarget->isThumb()) {  // FIXME thumb2
+if (Subtarget->isThumb1Only()) {
   // This must be a multiple of 4 between -508 and 508, for
   // ADD/SUB sp = sp + immediate.
   if ((CVal >= -508 && CVal <= 508) && ((CVal & 3) == 0))
Index: clang/test/Sema/arm_inline_asm_constraints.c
===
--- /dev/null
+++ clang/test/Sema/arm_inline_asm_constraints.c
@@ -0,0 +1,305 @@
+// REQUIRES: arm-registered-target
+
+// RUN: %clang_cc1 -triple armv6 -verify=arm6 %s
+// RUN: %clang_cc1 -triple armv7 -verify=arm7 %s
+// RUN: %clang_cc1 -triple thumbv6 -verify=thumb1 %s
+// RUN: %clang_cc1 -triple thumbv7 -verify=thumb2 %s
+
+// j: An immediate integer between 0 and 65535 (valid for MOVW) (ARM/Thumb2)
+int test_j(int i) {
+  int res;
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(-1), [ input ] "r"(i)
+:);
+  // arm6-error@13 {{invalid input constraint 'j' in asm}}
+  // arm7-error@13 {{value '-1' out of range for constraint 'j'}}
+  // thumb1-error@13 {{invalid input constraint 'j' in asm}}
+  // thumb2-error@13 {{value '-1' out of range for constraint 'j'}}
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(0), [ input ] "r"(i)
+:);
+  // arm6-error@21 {{invalid input constraint 'j' in asm}}
+  // arm7-no-error
+  // thumb1-error@21 {{invalid input constraint 'j' in asm}}
+  // thumb2-no-error
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(65535), [ input ] "r"(i)
+:);
+  // arm6-error@29 {{invalid input constraint 'j' in asm}}
+  // arm7-no-error
+  // thumb1-error@29 {{invalid input constraint 'j' in asm}}
+  // thumb2-no-error
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(65536), [ input ] "r"(i)
+:);
+  // arm6-error@37 {{invalid input constraint 'j' in asm}}
+  // arm7-error@37 {{value '65536' out of range for constraint 'j'}}
+  // thumb1-error@37 {{invalid input constraint 'j' in asm}}
+  // thumb2-error@37 {{value '65536' out of range for constraint 'j'}}
+  return res;
+}
+
+// I: An immediate integer valid for a data-processing instruction. (ARM/Thumb2)
+//    An immediate integer between 0 and 255. (Thumb1)
+int test_I(int i) {
+  int res;
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(-1), [ input ] "r"(i)
+  :); // thumb1-error@53 {{value '-1' out of range for constraint 'I'}}
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(0), [ input ] "r"(

[PATCH] D65863: [ARM] Add support for the s,j,x,N,O inline asm constraints

2019-08-08 Thread David Candler via Phabricator via cfe-commits
dcandler marked 5 inline comments as done.
dcandler added inline comments.



Comment at: clang/lib/Basic/Targets/ARM.cpp:938
+// Thumb1: An immediate integer which is a multiple of 4 between 0 and 1020.
+Info.setRequiresImmediate();
 return true;

compnerd wrote:
> Can we leave this as a FIXME?  This needs additional validation on the input.
I think it's not just the `M` constraint that requires additional validation. 
Most of these immediate constraints require values that can fit in specific 
encodings to be valid, or have properties like being a multiple of a number, 
but `setRequiresImmediate` at present can only check against a min/max or exact 
values.
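
To make the limitation concrete, here is a small sketch comparing a pure 
min/max check with a fuller predicate for the Thumb1 case quoted above (a 
multiple of 4 between 0 and 1020). `ImmRange` is a hypothetical stand-in for a 
check that can only record a range; it is not Clang's actual data structure.

```cpp
#include <cstdint>

// Hypothetical stand-in for a check that can only record a min/max pair.
struct ImmRange {
  int64_t min;
  int64_t max;
  bool contains(int64_t v) const { return v >= min && v <= max; }
};

// Fuller check: the value must be in range and also a multiple of 4.
bool isMultipleOf4InRange(int64_t v) {
  return v >= 0 && v <= 1020 && v % 4 == 0;
}
```

With the range set to {0, 1020}, a value such as 6 passes the range check but 
fails the fuller predicate, which is the kind of input the FIXMEs would cover.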


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65863/new/

https://reviews.llvm.org/D65863



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65863: [ARM] Add support for the s,j,x,N,O inline asm constraints

2019-08-08 Thread David Candler via Phabricator via cfe-commits
dcandler updated this revision to Diff 214205.
dcandler marked an inline comment as done.
dcandler added a comment.

Adjusted the formatting on some comment lines, and added FIXMEs for all the 
constraints that require additional validation to clarify what is still needed 
and where.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65863/new/

https://reviews.llvm.org/D65863

Files:
  clang/lib/Basic/Targets/ARM.cpp
  clang/test/Sema/arm_inline_asm_constraints.c
  llvm/lib/Target/ARM/ARMISelLowering.cpp

Index: llvm/lib/Target/ARM/ARMISelLowering.cpp
===
--- llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -15143,7 +15143,7 @@
   case 'j':
 // Constant suitable for movw, must be between 0 and
 // 65535.
-if (Subtarget->hasV6T2Ops())
+if (Subtarget->hasV6T2Ops() || (Subtarget->hasV8MBaselineOps()))
   if (CVal >= 0 && CVal <= 65535)
 break;
 return;
@@ -15251,7 +15251,7 @@
 return;
 
   case 'N':
-if (Subtarget->isThumb()) {  // FIXME thumb2
+if (Subtarget->isThumb1Only()) {
   // This must be a constant between 0 and 31, for shift amounts.
   if (CVal >= 0 && CVal <= 31)
 break;
@@ -15259,7 +15259,7 @@
 return;
 
   case 'O':
-if (Subtarget->isThumb()) {  // FIXME thumb2
+if (Subtarget->isThumb1Only()) {
   // This must be a multiple of 4 between -508 and 508, for
   // ADD/SUB sp = sp + immediate.
   if ((CVal >= -508 && CVal <= 508) && ((CVal & 3) == 0))
Index: clang/test/Sema/arm_inline_asm_constraints.c
===
--- /dev/null
+++ clang/test/Sema/arm_inline_asm_constraints.c
@@ -0,0 +1,305 @@
+// REQUIRES: arm-registered-target
+
+// RUN: %clang_cc1 -triple armv6 -verify=arm6 %s
+// RUN: %clang_cc1 -triple armv7 -verify=arm7 %s
+// RUN: %clang_cc1 -triple thumbv6 -verify=thumb1 %s
+// RUN: %clang_cc1 -triple thumbv7 -verify=thumb2 %s
+
+// j: An immediate integer between 0 and 65535 (valid for MOVW) (ARM/Thumb2)
+int test_j(int i) {
+  int res;
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(-1), [ input ] "r"(i)
+:);
+  // arm6-error@13 {{invalid input constraint 'j' in asm}}
+  // arm7-error@13 {{value '-1' out of range for constraint 'j'}}
+  // thumb1-error@13 {{invalid input constraint 'j' in asm}}
+  // thumb2-error@13 {{value '-1' out of range for constraint 'j'}}
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(0), [ input ] "r"(i)
+:);
+  // arm6-error@21 {{invalid input constraint 'j' in asm}}
+  // arm7-no-error
+  // thumb1-error@21 {{invalid input constraint 'j' in asm}}
+  // thumb2-no-error
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(65535), [ input ] "r"(i)
+:);
+  // arm6-error@29 {{invalid input constraint 'j' in asm}}
+  // arm7-no-error
+  // thumb1-error@29 {{invalid input constraint 'j' in asm}}
+  // thumb2-no-error
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(65536), [ input ] "r"(i)
+:);
+  // arm6-error@37 {{invalid input constraint 'j' in asm}}
+  // arm7-error@37 {{value '65536' out of range for constraint 'j'}}
+  // thumb1-error@37 {{invalid input constraint 'j' in asm}}
+  // thumb2-error@37 {{value '65536' out of range for constraint 'j'}}
+  return res;
+}
+
+// I: An immediate integer valid for a data-processing instruction. (ARM/Thumb2)
+//    An immediate integer between 0 and 255. (Thumb1)
+int test_I(int i) {
+  int res;
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(-1), [ input ] "r"(i)
+  :); // thumb1-error@53 {{value '-1' out of range for constraint 'I'}}
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(0), [ input ] "r"(i)
+  :); // No errors expected.
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(255), [ input ] "r"(i)
+  :); // No errors expected.
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(256), [ input ] "r"(i)
+  :); // thumb1-error@68 {{value '256' out of range for constraint 'I'}}
+  return res;
+}
+
+// J: An immediate integer between -4095 and 4095. (ARM/Thumb2)
+//An immediate integer between -255 and -1. (Thumb1)
+int test_J(int i) {
+  int res;
+  __asm(
+  "movw %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "J"(-4096), [ input ] "r"(i)
+  :);
+  // arm6-error@80 {{value '-4096' out of range for constraint 'J'}}
+  // arm7-error@80 {{value '-4096' out of range for constraint 'J'}}
+  // thumb1-error@80 {{value '-4096' out of range for constraint 'J'}}
+  // thumb2-error@80 {{value '-4096' out of range for constraint 'J'}}
+  __asm(
+

[PATCH] D65863: [ARM] Add support for the s,j,x,N,O inline asm constraints

2019-08-15 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.

Ping. @compnerd any other changes before this could be accepted?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65863/new/

https://reviews.llvm.org/D65863





[PATCH] D67216: [cfi] Add flag to always generate .debug_frame

2019-10-11 Thread David Candler via Phabricator via cfe-commits
dcandler updated this revision to Diff 224581.
dcandler retitled this revision from "[cfi] Add flag to always generate call 
frame information" to "[cfi] Add flag to always generate .debug_frame".
dcandler edited the summary of this revision.
dcandler added reviewers: rengolin, joerg.
dcandler added a comment.
Herald added subscribers: jsji, MaskRay, kbarton, nemanjai.

I've modified the patch so that the new flag also ensures the CFI instructions 
are actually present to be emitted. I also renamed the flag to -gdwarf-frame, 
to better reflect that it deals with the debug information you'd otherwise get 
with -g, and that it is meant to put that information specifically in a 
.debug_frame section and not .eh_frame.

Currently, two things signal the need for CFI: exceptions (via the function's 
needsUnwindTableEntry()), and debug (via the machine module information's 
hasDebugInfo()). At frame lowering, both trigger the same thing. But when the 
assembly printer decides which section to use, needsUnwindTableEntry() is 
checked first and triggers the need for .eh_frame, while hasDebugInfo() is 
checked afterwards for whether .debug_frame is needed. So .debug_frame is only 
present when some level of debug is requested and no functions need unwinding 
for exceptions.

It wouldn't be appropriate to change either needsUnwindTableEntry() or 
hasDebugInfo(), so I've added a check for my flag alongside them. Because the 
same logic is used in multiple places, I've wrapped all three checks into one 
function to try and clean things up slightly. When deciding on which section to 
emit, the new flag means .debug_frame is produced instead of nothing. If 
.eh_frame would have been needed, rather than replace it, the new flag simply 
emits both .debug_frame and .eh_frame.

The end result is that -gdwarf-frame should only provide a .debug_frame section 
as additional information, without otherwise modifying anything. The existing 
-funwind-tables (and -fasynchronous-unwind-tables) flag can be used to provide 
similar information, but because it takes the exception angle, it alters 
function attributes and ultimately produces .eh_frame instead.
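
The decision described above can be sketched roughly as follows. This is a 
simplified, per-function model written for illustration only: `FunctionInfo`, 
`needsCFI` and `chooseSections` are hypothetical names, and the real logic in 
the assembly printer is not reproduced here.

```cpp
// Hypothetical inputs; the field names echo the description above.
struct FunctionInfo {
  bool needsUnwindTableEntry; // exceptions / unwind tables requested
  bool hasDebugInfo;          // some level of -g requested
  bool forceDwarfFrame;       // the new flag
};

enum class CFISections { None, EHFrame, DebugFrame, Both };

// All three conditions trigger CFI generation at frame lowering.
bool needsCFI(const FunctionInfo &f) {
  return f.needsUnwindTableEntry || f.hasDebugInfo || f.forceDwarfFrame;
}

// The unwind check comes first; the new flag adds .debug_frame on top of
// .eh_frame rather than replacing it.
CFISections chooseSections(const FunctionInfo &f) {
  if (f.needsUnwindTableEntry)
    return f.forceDwarfFrame ? CFISections::Both : CFISections::EHFrame;
  if (f.hasDebugInfo || f.forceDwarfFrame)
    return CFISections::DebugFrame;
  return CFISections::None;
}
```

In this model the flag only ever adds .debug_frame: it turns "nothing" into 
.debug_frame and ".eh_frame only" into both sections.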


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/Driver/gdwarf-frame.c
  llvm/include/llvm/CodeGen/CommandFlags.inc
  llvm/include/llvm/CodeGen/MachineFunction.h
  llvm/include/llvm/Target/TargetOptions.h
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/lib/CodeGen/AsmPrinter/DwarfCFIException.cpp
  llvm/lib/CodeGen/CFIInstrInserter.cpp
  llvm/lib/CodeGen/MachineFunction.cpp
  llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  llvm/lib/Target/ARC/ARCRegisterInfo.cpp
  llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
  llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
  llvm/lib/Target/X86/X86FrameLowering.cpp
  llvm/lib/Target/X86/X86InstrInfo.cpp
  llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
  llvm/test/CodeGen/ARM/dwarf-frame.ll

Index: llvm/test/CodeGen/ARM/dwarf-frame.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/dwarf-frame.ll
@@ -0,0 +1,38 @@
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -o - %s | FileCheck %s --check-prefix=CHECK-NO-CFI
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -dwarf-frame-section -o - %s | FileCheck %s --check-prefix=CHECK-ALWAYS-CFI
+
+declare void @dummy_use(i32*, i32)
+
+define void @test_basic() #0 {
+%mem = alloca i32, i32 10
+call void @dummy_use (i32* %mem, i32 10)
+  ret void
+}
+
+; CHECK-NO-CFI-LABEL: test_basic:
+; CHECK-NO-CFI:   .fnstart
+; CHECK-NO-CFI-NOT:   .cfi_sections .debug_frame
+; CHECK-NO-CFI-NOT:   .cfi_startproc
+; CHECK-NO-CFI:   @ %bb.0:
+; CHECK-NO-CFI:   push {r11, lr}
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_offset 8
+; CHECK-NO-CFI-NOT:   .cfi_offset lr, -4
+; CHECK-NO-CFI-NOT:   .cfi_offset r11, -8
+; CHECK-NO-CFI:   mov r11, sp
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_register r11
+; CHECK-NO-CFI-NOT:   .cfi_endproc
+; CHECK-NO-CFI:   .fnend
+
+; CHECK-ALWAYS-CFI-LABEL: test_basic:
+; CHECK-ALWAYS-CFI:   .fnstart
+; CHECK-ALWAYS-CFI:   .cfi_sections .debug_frame
+; CHECK-ALWAYS-CFI:   .cfi_startproc
+; CHECK-ALWAYS-CFI:   @ %bb.0:
+; CHECK-ALWAYS-CFI:   push {r11, lr}
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_offset 8
+; CHECK-ALWAYS-CFI:   .cfi_offset lr, -4
+; CHECK-ALWAYS-CFI:   .cfi_offset r11, -8
+; CHECK-ALWAYS-CFI:   mov r11, sp
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_register r11
+; CHECK-ALWAYS-CFI:   .cfi_endproc
+; CHECK-ALWAYS-CFI:   .fnend
Index: llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
===
--- llvm/lib/Target/XCore/XCoreRegisterInfo.c

[PATCH] D67216: [cfi] Add flag to always generate .debug_frame

2019-10-17 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.

Ping?

I already spotted the line in CommandFlags.inc needs formatting with a couple 
of breaks. Also the help text in Options.td could be clearer. In particular, 
-gno-dwarf-frame shouldn't suggest .debug_frame won't be generated at all 
(since -g might still emit it). It should probably be more along the lines of:

-gdwarf-frame: "Emit a debug_frame section"
-gno-dwarf-frame: "Don't explicitly emit a debug_frame section"


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216





[PATCH] D67216: [cfi] Add flag to always generate .debug_frame

2019-10-17 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.

I added the negative option more as a way to disable the flag, since I'm 
currently looking at cases where it may want to be turned on by default (and a 
negative option would then allow you to only get .eh_frame in cases where you'd 
get both .debug_frame/.eh_frame).

The other suggested name of -gdwarf-frame-always might then better describe the 
behavior, as the negative -gno-dwarf-frame-always wouldn't sound quite so 
misleading (since 'not always' equates to 'sometimes').


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216





[PATCH] D67216: [cfi] Add flag to always generate .debug_frame

2019-10-22 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.

I think `-f[no-]force-dwarf-frame` suitably describes the behavior, and looks 
in line with other options. I'll update the patch shortly unless anyone else 
has any other input.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216





[PATCH] D67216: [cfi] Add flag to always generate .debug_frame

2019-10-23 Thread David Candler via Phabricator via cfe-commits
dcandler updated this revision to Diff 226147.
dcandler added a comment.

Updated with the new name for the option.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/Driver/fforce-dwarf-frame.c
  llvm/include/llvm/CodeGen/CommandFlags.inc
  llvm/include/llvm/CodeGen/MachineFunction.h
  llvm/include/llvm/Target/TargetOptions.h
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/lib/CodeGen/AsmPrinter/DwarfCFIException.cpp
  llvm/lib/CodeGen/CFIInstrInserter.cpp
  llvm/lib/CodeGen/MachineFunction.cpp
  llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  llvm/lib/Target/ARC/ARCRegisterInfo.cpp
  llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
  llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
  llvm/lib/Target/X86/X86FrameLowering.cpp
  llvm/lib/Target/X86/X86InstrInfo.cpp
  llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
  llvm/test/CodeGen/ARM/dwarf-frame.ll

Index: llvm/test/CodeGen/ARM/dwarf-frame.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/dwarf-frame.ll
@@ -0,0 +1,38 @@
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -o - %s | FileCheck %s --check-prefix=CHECK-NO-CFI
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -force-dwarf-frame-section -o - %s | FileCheck %s --check-prefix=CHECK-ALWAYS-CFI
+
+declare void @dummy_use(i32*, i32)
+
+define void @test_basic() #0 {
+%mem = alloca i32, i32 10
+call void @dummy_use (i32* %mem, i32 10)
+  ret void
+}
+
+; CHECK-NO-CFI-LABEL: test_basic:
+; CHECK-NO-CFI:   .fnstart
+; CHECK-NO-CFI-NOT:   .cfi_sections .debug_frame
+; CHECK-NO-CFI-NOT:   .cfi_startproc
+; CHECK-NO-CFI:   @ %bb.0:
+; CHECK-NO-CFI:   push {r11, lr}
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_offset 8
+; CHECK-NO-CFI-NOT:   .cfi_offset lr, -4
+; CHECK-NO-CFI-NOT:   .cfi_offset r11, -8
+; CHECK-NO-CFI:   mov r11, sp
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_register r11
+; CHECK-NO-CFI-NOT:   .cfi_endproc
+; CHECK-NO-CFI:   .fnend
+
+; CHECK-ALWAYS-CFI-LABEL: test_basic:
+; CHECK-ALWAYS-CFI:   .fnstart
+; CHECK-ALWAYS-CFI:   .cfi_sections .debug_frame
+; CHECK-ALWAYS-CFI:   .cfi_startproc
+; CHECK-ALWAYS-CFI:   @ %bb.0:
+; CHECK-ALWAYS-CFI:   push {r11, lr}
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_offset 8
+; CHECK-ALWAYS-CFI:   .cfi_offset lr, -4
+; CHECK-ALWAYS-CFI:   .cfi_offset r11, -8
+; CHECK-ALWAYS-CFI:   mov r11, sp
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_register r11
+; CHECK-ALWAYS-CFI:   .cfi_endproc
+; CHECK-ALWAYS-CFI:   .fnend
Index: llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
===
--- llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
+++ llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
@@ -203,7 +203,7 @@
 }
 
 bool XCoreRegisterInfo::needsFrameMoves(const MachineFunction &MF) {
-  return MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry();
+  return MF.needsFrameMoves();
 }
 
 const MCPhysReg *
Index: llvm/lib/Target/X86/X86InstrInfo.cpp
===
--- llvm/lib/Target/X86/X86InstrInfo.cpp
+++ llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -3963,9 +3963,7 @@
   MachineFunction &MF = *MBB.getParent();
   const X86FrameLowering *TFL = Subtarget.getFrameLowering();
   bool IsWin64Prologue = MF.getTarget().getMCAsmInfo()->usesWindowsCFI();
-  bool NeedsDwarfCFI =
-  !IsWin64Prologue &&
-  (MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry());
+  bool NeedsDwarfCFI = !IsWin64Prologue && MF.needsFrameMoves();
   bool EmitCFI = !TFL->hasFP(MF) && NeedsDwarfCFI;
   if (EmitCFI) {
 TFL->BuildCFI(MBB, I, DL,
Index: llvm/lib/Target/X86/X86FrameLowering.cpp
===
--- llvm/lib/Target/X86/X86FrameLowering.cpp
+++ llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -993,8 +993,7 @@
   bool NeedsWinFPO =
   !IsFunclet && STI.isTargetWin32() && MMI.getModule()->getCodeViewFlag();
   bool NeedsWinCFI = NeedsWin64CFI || NeedsWinFPO;
-  bool NeedsDwarfCFI =
-  !IsWin64Prologue && (MMI.hasDebugInfo() || Fn.needsUnwindTableEntry());
+  bool NeedsDwarfCFI = !IsWin64Prologue && MF.needsFrameMoves();
   Register FramePtr = TRI->getFrameRegister(MF);
   const Register MachineFramePtr =
   STI.isTarget64BitILP32()
@@ -1614,10 +1613,9 @@
   bool HasFP = hasFP(MF);
   uint64_t NumBytes = 0;
 
-  bool NeedsDwarfCFI =
-  (!MF.getTarget().getTargetTriple().isOSDarwin() &&
-   !MF.getTarget().getTargetTriple().isOSWindows()) &&
-  (MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry());
+  bool NeedsDwarfCFI = (!MF.getTarget().getTargetTriple().isOSDarwin() &&
+!MF.get

[PATCH] D129298: Add denormal-fp-math attribute for f16

2022-07-07 Thread David Candler via Phabricator via cfe-commits
dcandler created this revision.
dcandler added reviewers: arsenm, spatel, echristo, andrew.w.kaylor, 
cameron.mcinally.
Herald added subscribers: jsji, kosarev, jdoerfert, pengfei, hiraditya, 
kristof.beyls, tpr.
Herald added a project: All.
dcandler requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, MaskRay, wdng.
Herald added projects: clang, LLVM.

Denormal flushing behavior is currently controlled with the
denormal-fp-math attribute, with a denormal-fp-math-f32 variant for
targets such as AMDGPU where f32 denormals are controlled separately
from f16/f64. However there are other targets such as Arm (and I
think x86) where f16 denormals can be distinct from f32/f64. As the
attributes are now used for constant folding, this can lead to
incorrect folded values for half precision floats on those targets.

This patch adds a denormal-fp-math-f16 attribute, which functions
identically to denormal-fp-math-f32, but overrides the denormal
handling mode for f16 only. Constant folding tests have been
expanded to include half floats, and check both f16 and f32
variants of the attribute.
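
The intended lookup can be sketched as below. This is only an illustration 
under simplifying assumptions: the real attributes carry separate output and 
input modes, which are collapsed into a single value here, and 
`DenormalAttrs` and `modeFor` are hypothetical names rather than LLVM APIs.

```cpp
#include <optional>

enum class FPType { Half, Float, Double };
enum class DenormalKind { IEEE, PreserveSign, PositiveZero };

// Hypothetical container for the function attributes discussed above.
struct DenormalAttrs {
  DenormalKind general = DenormalKind::IEEE; // denormal-fp-math
  std::optional<DenormalKind> f32;           // denormal-fp-math-f32
  std::optional<DenormalKind> f16;           // denormal-fp-math-f16 (this patch)
};

// A type-specific attribute, when present, overrides the general one
// for that type only.
DenormalKind modeFor(FPType ty, const DenormalAttrs &a) {
  if (ty == FPType::Float && a.f32)
    return *a.f32;
  if (ty == FPType::Half && a.f16)
    return *a.f16;
  return a.general;
}
```

With only the f16 attribute set to positive-zero, half values are flushed 
while float and double stay at the general mode, which is the behavior the 
new half and float tests check from both directions.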


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D129298

Files:
  clang/docs/UsersManual.rst
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/docs/LangRef.rst
  llvm/include/llvm/CodeGen/CommandFlags.h
  llvm/include/llvm/Target/TargetOptions.h
  llvm/lib/CodeGen/CommandFlags.cpp
  llvm/lib/IR/Function.cpp
  llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll

Index: llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll
===
--- llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll
+++ llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll
@@ -12,6 +12,73 @@
 ; normal operand (a number plus zero is the same number).
 ;  ;
 
+define half @test_half_fadd_ieee() #0 {
+; CHECK-LABEL: @test_half_fadd_ieee(
+; CHECK-NEXT:ret half 0xH8200
+;
+; default ieee mode leaves result as a denormal
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
+define half @test_half_fadd_pzero_out() #1 {
+; CHECK-LABEL: @test_half_fadd_pzero_out(
+; CHECK-NEXT:ret half 0xH0000
+;
+; denormal result is flushed to positive zero
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
+define half @test_half_fadd_psign_out() #2 {
+; CHECK-LABEL: @test_half_fadd_psign_out(
+; CHECK-NEXT:ret half 0xH8000
+;
+; denormal result is flushed to sign preserved zero
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
+define half @test_half_fadd_pzero_in() #3 {
+; CHECK-LABEL: @test_half_fadd_pzero_in(
+; CHECK-NEXT:ret half 0xH8400
+;
+; denormal operand is treated as zero
+; normal operand added to zero results in the same operand as a result
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
+define half @test_half_fadd_psign_in() #4 {
+; CHECK-LABEL: @test_half_fadd_psign_in(
+; CHECK-NEXT:ret half 0xH8400
+;
+; denormal operand is treated as zero
+; normal operand added to zero results in the same operand as a result
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
+define half @test_half_fadd_pzero_f16_pzero_out() #9 {
+; CHECK-LABEL: @test_half_fadd_pzero_f16_pzero_out(
+; CHECK-NEXT:ret half 0xH0000
+;
+; f16 only attribute should flush half float output
+; same as pzero_out above
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
+define half @test_half_fadd_pzero_f32_pzero_out() #5 {
+; CHECK-LABEL: @test_half_fadd_pzero_f32_pzero_out(
+; CHECK-NEXT:ret half 0xH8200
+;
+; f32 only attribute should not flush half float output
+; default ieee mode leaves result as a denormal
+  %result = fadd half 0xH8400, 0xH0200
+  ret half %result
+}
+
 define float @test_float_fadd_ieee() #0 {
 ; CHECK-LABEL: @test_float_fadd_ieee(
 ; CHECK-NEXT:ret float 0xB800000000000000
@@ -59,12 +126,22 @@
   ret float %result
 }
 
-define float @test_float_fadd_pzero_f32_out() #5 {
-; CHECK-LABEL: @test_float_fadd_pzero_f32_out(
+define float @test_float_fadd_pzero_f16_pzero_out() #9 {
+; CHECK-LABEL: @test_float_fadd_pzero_f16_pzero_out(
+; CHECK-NEXT:ret float 0xB800000000000000
+;
+; f16 only attribute should not flush float output
+; default ieee mode leaves result as a denormal
+  %result = fadd float 0xB810000000000000, 0x3800000000000000
+  ret float %result
+}
+
+define float @test_float_fadd_pzero_f32_pzero_out() #5 {
+; CHECK-LABEL: @test_float_fadd_pzero_f32_pzero_out(
 ; CHECK-NEXT:ret float 0.00e+00
 ;
 ; f32 only attribute should flush float output
-; default ieee mode leaves result as a denormal
+; same as pzero_out above
   %result = fadd float 0xB810, 0x380

[PATCH] D129298: Add denormal-fp-math attribute for f16

2022-07-08 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.

There are currently no Arm-specific changes; this patch just makes it possible 
to describe the floating point environment more accurately via attributes in 
the case where singles and doubles should be flushed, but not halves.

With three precisions to control, an alternative may be to specify them 
individually (denormal-fp-math-f64, denormal-fp-math-f32 and 
denormal-fp-math-f16) so that one doesn't override another, but that would be a 
much larger and more intrusive change.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129298/new/

https://reviews.llvm.org/D129298



[PATCH] D122589: Additionally set f32 mode with denormal-fp-math

2022-03-28 Thread David Candler via Phabricator via cfe-commits
dcandler created this revision.
dcandler added reviewers: arsenm, spatel.
Herald added a project: All.
dcandler requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay, wdng.
Herald added a project: clang.

When the denormal-fp-math option is used, it should set the
denormal handling mode for all floating point types. However,
32-bit float types can currently ignore this setting: there is a
variant of the option, denormal-fp-math-f32, specifically for that type,
which takes priority when the mode is looked up by type and otherwise
remains at its default of IEEE. From the description, denormal-fp-math would
be expected to set the mode for floats unless overridden by the f32
variant, and code in the front end only emits the f32 option if it
differs from the general one, so setting just denormal-fp-math should
be valid.

This patch changes the denormal-fp-math option to also set the f32
mode. If denormal-fp-math-f32 is also specified, this is then
overridden as expected, but if it is absent floats will be set to the
mode specified by the former option, rather than remain on the default.
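
A minimal sketch of the resulting behavior, assuming the two options are 
handled in the order described (general first, then the f32 variant). 
`DenormalOpts` and `resolveDenormalOpts` are hypothetical names for this 
example and do not correspond to the functions touched by the diff below.

```cpp
#include <optional>
#include <string>

// Hypothetical result type: the modes are kept as plain strings here.
struct DenormalOpts {
  std::string fp = "ieee";   // mode from denormal-fp-math
  std::string fp32 = "ieee"; // mode from denormal-fp-math-f32
};

DenormalOpts resolveDenormalOpts(const std::optional<std::string> &general,
                                 const std::optional<std::string> &f32) {
  DenormalOpts opts;
  if (general) {
    opts.fp = *general;
    opts.fp32 = *general; // the change in this patch: also set the f32 mode
  }
  if (f32)
    opts.fp32 = *f32; // an explicit f32 option still overrides
  return opts;
}
```

With only the general option given as preserve-sign, the f32 mode now follows 
it instead of staying at ieee; supplying the f32 option as well restores the 
per-type override.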


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D122589

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -1497,7 +1497,8 @@
   if (Opts.FPDenormalMode != llvm::DenormalMode::getIEEE())
 GenerateArg(Args, OPT_fdenormal_fp_math_EQ, Opts.FPDenormalMode.str(), SA);
 
-  if (Opts.FP32DenormalMode != llvm::DenormalMode::getIEEE())
+  if ((Opts.FPDenormalMode != Opts.FP32DenormalMode) ||
+  (Opts.FP32DenormalMode != llvm::DenormalMode::getIEEE()))
 GenerateArg(Args, OPT_fdenormal_fp_math_f32_EQ, Opts.FP32DenormalMode.str(),
 SA);
 
@@ -1852,6 +1853,7 @@
   if (Arg *A = Args.getLastArg(OPT_fdenormal_fp_math_EQ)) {
 StringRef Val = A->getValue();
 Opts.FPDenormalMode = llvm::parseDenormalFPAttribute(Val);
+Opts.FP32DenormalMode = Opts.FPDenormalMode;
 if (!Opts.FPDenormalMode.isValid())
   Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val;
   }
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -2870,6 +2870,7 @@
 
 case options::OPT_fdenormal_fp_math_EQ:
   DenormalFPMath = llvm::parseDenormalFPAttribute(A->getValue());
+  DenormalFP32Math = DenormalFPMath;
   if (!DenormalFPMath.isValid()) {
 D.Diag(diag::err_drv_invalid_value)
 << A->getAsString(Args) << A->getValue();


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122589: Additionally set f32 mode with denormal-fp-math

2022-03-29 Thread David Candler via Phabricator via cfe-commits
dcandler updated this revision to Diff 418865.
dcandler added a comment.

Added a test that checks attributes based on the -fdenormal-fp-math and 
-fdenormal-fp-math-f32 flags.

Only the cases where -fdenormal-fp-math is set to preserve-sign or 
positive-zero and -fdenormal-fp-math-f32 is unset are changed by this patch. 
Previously, these would still result in a "denormal-fp-math-f32"="ieee,ieee" 
attribute.
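
The behavior described above can be sketched as a small standalone model. This is only an illustration, not the Clang implementation: `Mode`, `DenormalOpts`, `parseFlags`, and `functionAttrs` are hypothetical names, and the attribute-emission rule (emit "denormal-fp-math" when it is not IEEE, emit "denormal-fp-math-f32" when it differs from the general mode) is inferred from the CHECK lines of the new test rather than taken from the patch.

```cpp
#include <cassert>   // for the usage checks mentioned below
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for llvm::DenormalMode (symmetric modes only).
enum class Mode { IEEE, PreserveSign, PositiveZero };

struct DenormalOpts {
  Mode FPDenormalMode = Mode::IEEE;
  Mode FP32DenormalMode = Mode::IEEE;
};

std::string modeName(Mode M) {
  switch (M) {
  case Mode::PreserveSign: return "preserve-sign";
  case Mode::PositiveZero: return "positive-zero";
  case Mode::IEEE: break;
  }
  return "ieee";
}

// Hypothetical model of the option handling after the patch:
// -fdenormal-fp-math also sets the f32 mode, and an explicit
// -fdenormal-fp-math-f32 then overrides it.
DenormalOpts parseFlags(std::optional<Mode> FpMath,
                        std::optional<Mode> FpMathF32) {
  DenormalOpts Opts;
  if (FpMath) {
    Opts.FPDenormalMode = *FpMath;
    Opts.FP32DenormalMode = *FpMath; // the assignment this patch adds
  }
  if (FpMathF32)
    Opts.FP32DenormalMode = *FpMathF32;
  return Opts;
}

// Assumed emission rule, inferred from the test's CHECK prefixes.
std::vector<std::string> functionAttrs(const DenormalOpts &Opts) {
  std::vector<std::string> Attrs;
  auto Pair = [](Mode M) {
    return "\"" + modeName(M) + "," + modeName(M) + "\"";
  };
  if (Opts.FPDenormalMode != Mode::IEEE)
    Attrs.push_back("\"denormal-fp-math\"=" + Pair(Opts.FPDenormalMode));
  if (Opts.FP32DenormalMode != Opts.FPDenormalMode)
    Attrs.push_back("\"denormal-fp-math-f32\"=" + Pair(Opts.FP32DenormalMode));
  return Attrs;
}
```

Under this model, `parseFlags(Mode::PreserveSign, std::nullopt)` yields only the "denormal-fp-math" attribute, while `parseFlags(Mode::PreserveSign, Mode::IEEE)` yields both, which corresponds to the CHECK-F32-NONE and CHECK-F32-IEEE cases in the test.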


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122589/new/

https://reviews.llvm.org/D122589

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/denormalfpmode-f32.c


Index: clang/test/CodeGen/denormalfpmode-f32.c
===
--- /dev/null
+++ clang/test/CodeGen/denormalfpmode-f32.c
@@ -0,0 +1,35 @@
+// RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE
+
+// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-IEEE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-IEEE
+
+// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-PS
+
+// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-PZ
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE
+
+// CHECK-LABEL: main
+
+// CHECK-ATTR: attributes #0 =
+// CHECK-NONE-NOT:"denormal-fp-math"
+// CHECK-IEEE: "denormal-fp-math"="ieee,ieee"
+// CHECK-PS: "denormal-fp-math"="preserve-sign,preserve-sign"
+// CHECK-PZ: "denormal-fp-math"="positive-zero,positive-zero"
+// CHECK-F32-NONE-NOT:"denormal-fp-math-f32"
+// CHECK-F32-IEEE: "denormal-fp-math-f32"="ieee,ieee"
+// CHECK-F32-PS: "denormal-fp-math-f32"="preserve-sign,preserve-sign"
+// CHECK-F32-PZ: "denormal-fp-math-f32"="positive-zero,positive-zero"
+
+int main(void) {
+  return 0;
+}
Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -1497,7 +1497,8 @@
   if (Opts.FPDenormalMode != llvm::DenormalMode::getIEEE())
 GenerateArg(Args, OPT_fdenormal_fp_math_EQ, Opts.FPDenormalMode.str(), SA);
 
-  if (Opts.FP32DenormalMode != llvm::DenormalMode::getIEEE())
+  if ((Opts.FPDenormalMode != Opts.FP32DenormalMode) ||
+  (Opts.FP32DenormalMode != llvm::DenormalMode::getIEEE()))
 GenerateArg(Args, OPT_fdenormal_fp_math_f32_EQ, Opts.FP32DenormalMode.str(),
 SA);
 
@@ -1852,6 +1853,7 @@
   if (Arg *A = Args.getLastArg(OPT_fdenormal_fp_math_EQ)) {
 StringRef Val = A->getValue();
 Opts.FPDenormalMode = llvm::parseDenormalFPAttribute(Val);
+Opts.FP32DenormalMode = Opts.FPDenormalMode;
 if (!Opts.FPDenormalMode.isValid())
   Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val;
   }
Index: clang/lib/Driver/ToolChains/C

[PATCH] D122589: Additionally set f32 mode with denormal-fp-math

2022-03-30 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.

The issue I found came from trying to use getDefaultDenormalModeForType during 
constant folding to account for denormals (https://reviews.llvm.org/D116952). 
Setting denormal-fp-math to a non-IEEE mode without specifying 
denormal-fp-math-f32 still results in the denormal-fp-math-f32 attribute being 
present (even if unused elsewhere), which leads to the wrong result for targets 
that do not support denormal-fp-math-f32.

Only emitting denormal-fp-math-f32 when specified makes sense, but the current 
default effectively always specifies it as IEEE. One alternative would be to 
simply change the default.
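
To make the problem concrete, here is a minimal sketch of a per-type lookup that prefers the f32 attribute when it is present. It is a simplified model under stated assumptions, not LLVM's actual code: `AttrMap` and `denormalModeFor` are hypothetical names, and the fallback to "ieee,ieee" when no attribute is set is assumed.

```cpp
#include <cassert>   // for the usage checks mentioned below
#include <map>
#include <string>

// Function attributes modeled as a plain string map (hypothetical).
using AttrMap = std::map<std::string, std::string>;

// For 32-bit floats, use "denormal-fp-math-f32" if it exists; otherwise
// fall back to "denormal-fp-math", and to IEEE if neither is present.
std::string denormalModeFor(const AttrMap &Attrs, bool IsFloat32) {
  if (IsFloat32) {
    auto F32 = Attrs.find("denormal-fp-math-f32");
    if (F32 != Attrs.end())
      return F32->second;
  }
  auto General = Attrs.find("denormal-fp-math");
  return General != Attrs.end() ? General->second : "ieee,ieee";
}
```

With the attributes produced before this patch for -fdenormal-fp-math=preserve-sign alone (both "denormal-fp-math"="preserve-sign,preserve-sign" and "denormal-fp-math-f32"="ieee,ieee"), the float lookup returns "ieee,ieee". With only the general attribute present, it returns "preserve-sign,preserve-sign".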


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122589/new/

https://reviews.llvm.org/D122589



[PATCH] D122589: Additionally set f32 mode with denormal-fp-math

2022-04-29 Thread David Candler via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9e7c9967c3fd: Additionally set f32 mode with 
denormal-fp-math (authored by dcandler).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122589/new/

https://reviews.llvm.org/D122589

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/denormalfpmode-f32.c


Index: clang/test/CodeGen/denormalfpmode-f32.c
===
--- /dev/null
+++ clang/test/CodeGen/denormalfpmode-f32.c
@@ -0,0 +1,35 @@
+// RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE
+
+// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-IEEE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-IEEE
+
+// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-PS
+
+// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ
+// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ
+// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-PZ
+// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE
+
+// CHECK-LABEL: main
+
+// CHECK-ATTR: attributes #0 =
+// CHECK-NONE-NOT:"denormal-fp-math"
+// CHECK-IEEE: "denormal-fp-math"="ieee,ieee"
+// CHECK-PS: "denormal-fp-math"="preserve-sign,preserve-sign"
+// CHECK-PZ: "denormal-fp-math"="positive-zero,positive-zero"
+// CHECK-F32-NONE-NOT:"denormal-fp-math-f32"
+// CHECK-F32-IEEE: "denormal-fp-math-f32"="ieee,ieee"
+// CHECK-F32-PS: "denormal-fp-math-f32"="preserve-sign,preserve-sign"
+// CHECK-F32-PZ: "denormal-fp-math-f32"="positive-zero,positive-zero"
+
+int main(void) {
+  return 0;
+}
Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -1526,7 +1526,8 @@
   if (Opts.FPDenormalMode != llvm::DenormalMode::getIEEE())
 GenerateArg(Args, OPT_fdenormal_fp_math_EQ, Opts.FPDenormalMode.str(), SA);
 
-  if (Opts.FP32DenormalMode != llvm::DenormalMode::getIEEE())
+  if ((Opts.FPDenormalMode != Opts.FP32DenormalMode) ||
+  (Opts.FP32DenormalMode != llvm::DenormalMode::getIEEE()))
 GenerateArg(Args, OPT_fdenormal_fp_math_f32_EQ, Opts.FP32DenormalMode.str(),
 SA);
 
@@ -1879,6 +1880,7 @@
   if (Arg *A = Args.getLastArg(OPT_fdenormal_fp_math_EQ)) {
 StringRef Val = A->getValue();
 Opts.FPDenormalMode = llvm::parseDenormalFPAttribute(Val);
+Opts.FP32DenormalMode = Opts.FPDenormalMode;
 if (!Opts.FPDenormalMode.isValid())
   Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val;
   }
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -2884,6 +2884,7 @@
 
 case 

[PATCH] D67216: [cfi] Add flag to always generate call frame information

2019-09-05 Thread David Candler via Phabricator via cfe-commits
dcandler created this revision.
dcandler added reviewers: echristo, probinson, aprantl.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls, 
javed.absar.
Herald added projects: clang, LLVM.

This adds a flag to LLVM and clang to always generate call frame information, 
even if other debug information is not being generated. This is useful for the 
Arm toolchain, where the .debug_frame section is always expected to be present 
(with or without other debug sections) and can be used to calculate stack 
usage. The flag itself has been left generic to cover any other situations 
where CFI directives are desired but other debug information is not.
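
The intended effect on the AsmPrinter decision can be sketched as a small standalone model. This is a simplification of the hunk in this diff, not the real code: `FunctionState` and the free function `needsCFIMoves` are hypothetical, and the exception-handling condition is reduced to a single boolean.

```cpp
#include <cassert>   // for the usage checks mentioned below

enum CFIMoveType { CFI_M_None, CFI_M_EH, CFI_M_Debug };

// Hypothetical summary of the state the real code queries.
struct FunctionState {
  bool NeedsEHUnwindTable; // stands in for the unwind-table check
  bool HasDebugInfo;       // stands in for MMI->hasDebugInfo()
};

// Model of the decision: EH unwind info takes priority, then debug-style
// CFI if debug info exists or the new flag forces it.
CFIMoveType needsCFIMoves(const FunctionState &F, bool AlwaysNeedCFI) {
  if (F.NeedsEHUnwindTable)
    return CFI_M_EH;
  if (F.HasDebugInfo || AlwaysNeedCFI)
    return CFI_M_Debug;
  return CFI_M_None;
}
```

In this model a function with neither debug info nor an unwind table gets `CFI_M_None` by default and `CFI_M_Debug` once the flag is set, which matches the CHECK-NO-CFI and CHECK-ALWAYS-CFI expectations in the llc test.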


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D67216

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/Driver/always-need-cfi.c
  llvm/docs/CommandGuide/llc.rst
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/test/CodeGen/ARM/always-need-cfi.ll

Index: llvm/test/CodeGen/ARM/always-need-cfi.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/always-need-cfi.ll
@@ -0,0 +1,38 @@
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -o - %s | FileCheck %s --check-prefix=CHECK-NO-CFI
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -always-need-cfi -o - %s | FileCheck %s --check-prefix=CHECK-ALWAYS-CFI
+
+declare void @dummy_use(i32*, i32)
+
+define void @test_basic() #0 {
+%mem = alloca i32, i32 10
+call void @dummy_use (i32* %mem, i32 10)
+	ret void
+}
+
+; CHECK-NO-CFI-LABEL: test_basic:
+; CHECK-NO-CFI:   .fnstart
+; CHECK-NO-CFI-NOT:   .cfi_sections .debug_frame
+; CHECK-NO-CFI-NOT:   .cfi_startproc
+; CHECK-NO-CFI:   @ %bb.0:
+; CHECK-NO-CFI:   push {r11, lr}
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_offset 8
+; CHECK-NO-CFI-NOT:   .cfi_offset lr, -4
+; CHECK-NO-CFI-NOT:   .cfi_offset r11, -8
+; CHECK-NO-CFI:   mov r11, sp
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_register r11
+; CHECK-NO-CFI-NOT:   .cfi_endproc
+; CHECK-NO-CFI:   .fnend
+
+; CHECK-ALWAYS-CFI-LABEL: test_basic:
+; CHECK-ALWAYS-CFI:   .fnstart
+; CHECK-ALWAYS-CFI:   .cfi_sections .debug_frame
+; CHECK-ALWAYS-CFI:   .cfi_startproc
+; CHECK-ALWAYS-CFI:   @ %bb.0:
+; CHECK-ALWAYS-CFI:   push {r11, lr}
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_offset 8
+; CHECK-ALWAYS-CFI:   .cfi_offset lr, -4
+; CHECK-ALWAYS-CFI:   .cfi_offset r11, -8
+; CHECK-ALWAYS-CFI:   mov r11, sp
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_register r11
+; CHECK-ALWAYS-CFI:   .cfi_endproc
+; CHECK-ALWAYS-CFI:   .fnend
Index: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
===
--- llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -149,6 +149,11 @@
 cl::desc("Emit a section containing remark diagnostics metadata"),
 cl::init(false));
 
+static cl::opt<bool> AlwaysNeedCFI(
+"always-need-cfi",
+cl::desc("Always emit call frame information"),
+cl::init(false));
+
 char AsmPrinter::ID = 0;
 
 using gcp_map_type = DenseMap<GCStrategy *, std::unique_ptr<GCMetadataPrinter>>;
@@ -929,7 +934,7 @@
   MF->getFunction().needsUnwindTableEntry())
 return CFI_M_EH;
 
-  if (MMI->hasDebugInfo())
+  if (MMI->hasDebugInfo() || AlwaysNeedCFI)
 return CFI_M_Debug;
 
   return CFI_M_None;
Index: llvm/docs/CommandGuide/llc.rst
===
--- llvm/docs/CommandGuide/llc.rst
+++ llvm/docs/CommandGuide/llc.rst
@@ -152,6 +152,10 @@
  Emit the .remarks (ELF) / __remarks (MachO) section which contains metadata
  about remark diagnostics.
 
+.. option:: -always-need-cfi
+
+ Emit call frame information even if other debug information is not present.
+
 Tuning/Configuration Options
 
 
Index: clang/test/Driver/always-need-cfi.c
===
--- /dev/null
+++ clang/test/Driver/always-need-cfi.c
@@ -0,0 +1,7 @@
+
+// RUN: %clang -target arm -c -### %s -falways-need-cfi 2>&1 | FileCheck --check-prefix=CHECK-ALWAYS %s
+// RUN: %clang -target arm -c -### %s -fno-always-need-cfi 2>&1 | FileCheck --check-prefix=CHECK-NO-ALWAYS %s
+// RUN: %clang -target arm -c -### %s 2>&1 | FileCheck --check-prefix=CHECK-NO-ALWAYS %s
+
+// CHECK-ALWAYS: -falways-need-cfi
+// CHECK-NO-ALWAYS-NOT: -falways-need-cfi
\ No newline at end of file
Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -953,6 +953,8 @@
   Args.hasFlag(OPT_fstack_size_section, OPT_fno_stack_size_section, false);
   Opts.UniqueSectionNames = Args.hasFlag(OPT_funique_section_names,
  OPT_fno_un

[PATCH] D65863: [ARM] Add support for the s,j,x,N,O inline asm constraints

2019-09-05 Thread David Candler via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL371079: [ARM] Add support for the s,j,x,N,O inline asm 
constraints (authored by dcandler, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D65863?vs=214205&id=218927#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65863/new/

https://reviews.llvm.org/D65863

Files:
  cfe/trunk/lib/Basic/Targets/ARM.cpp
  cfe/trunk/test/Sema/arm_inline_asm_constraints.c
  llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp

Index: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
===
--- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
+++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
@@ -15323,7 +15323,7 @@
   case 'j':
 // Constant suitable for movw, must be between 0 and
 // 65535.
-if (Subtarget->hasV6T2Ops())
+if (Subtarget->hasV6T2Ops() || (Subtarget->hasV8MBaselineOps()))
   if (CVal >= 0 && CVal <= 65535)
 break;
 return;
@@ -15431,7 +15431,7 @@
 return;
 
   case 'N':
-if (Subtarget->isThumb()) {  // FIXME thumb2
+if (Subtarget->isThumb1Only()) {
   // This must be a constant between 0 and 31, for shift amounts.
   if (CVal >= 0 && CVal <= 31)
 break;
@@ -15439,7 +15439,7 @@
 return;
 
   case 'O':
-if (Subtarget->isThumb()) {  // FIXME thumb2
+if (Subtarget->isThumb1Only()) {
   // This must be a multiple of 4 between -508 and 508, for
   // ADD/SUB sp = sp + immediate.
   if ((CVal >= -508 && CVal <= 508) && ((CVal & 3) == 0))
Index: cfe/trunk/test/Sema/arm_inline_asm_constraints.c
===
--- cfe/trunk/test/Sema/arm_inline_asm_constraints.c
+++ cfe/trunk/test/Sema/arm_inline_asm_constraints.c
@@ -0,0 +1,305 @@
+// REQUIRES: arm-registered-target
+
+// RUN: %clang_cc1 -triple armv6 -verify=arm6 %s
+// RUN: %clang_cc1 -triple armv7 -verify=arm7 %s
+// RUN: %clang_cc1 -triple thumbv6 -verify=thumb1 %s
+// RUN: %clang_cc1 -triple thumbv7 -verify=thumb2 %s
+
+// j: An immediate integer between 0 and 65535 (valid for MOVW) (ARM/Thumb2)
+int test_j(int i) {
+  int res;
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(-1), [ input ] "r"(i)
+:);
+  // arm6-error@13 {{invalid input constraint 'j' in asm}}
+  // arm7-error@13 {{value '-1' out of range for constraint 'j'}}
+  // thumb1-error@13 {{invalid input constraint 'j' in asm}}
+  // thumb2-error@13 {{value '-1' out of range for constraint 'j'}}
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(0), [ input ] "r"(i)
+:);
+  // arm6-error@21 {{invalid input constraint 'j' in asm}}
+  // arm7-no-error
+  // thumb1-error@21 {{invalid input constraint 'j' in asm}}
+  // thumb2-no-error
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(65535), [ input ] "r"(i)
+:);
+  // arm6-error@29 {{invalid input constraint 'j' in asm}}
+  // arm7-no-error
+  // thumb1-error@29 {{invalid input constraint 'j' in asm}}
+  // thumb2-no-error
+  __asm("movw %0, %1;"
+: [ result ] "=r"(res)
+: [ constant ] "j"(65536), [ input ] "r"(i)
+:);
+  // arm6-error@37 {{invalid input constraint 'j' in asm}}
+  // arm7-error@37 {{value '65536' out of range for constraint 'j'}}
+  // thumb1-error@37 {{invalid input constraint 'j' in asm}}
+  // thumb2-error@37 {{value '65536' out of range for constraint 'j'}}
+  return res;
+}
+
+// I: An immediate integer valid for a data-processing instruction. (ARM/Thumb2)
+//An immediate integer between 0 and 255. (Thumb1)
+int test_I(int i) {
+  int res;
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(-1), [ input ] "r"(i)
+  :); // thumb1-error@53 {{value '-1' out of range for constraint 'I'}}
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(0), [ input ] "r"(i)
+  :); // No errors expected.
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(255), [ input ] "r"(i)
+  :); // No errors expected.
+  __asm(
+  "add %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "I"(256), [ input ] "r"(i)
+  :); // thumb1-error@68 {{value '256' out of range for constraint 'I'}}
+  return res;
+}
+
+// J: An immediate integer between -4095 and 4095. (ARM/Thumb2)
+//An immediate integer between -255 and -1. (Thumb1)
+int test_J(int i) {
+  int res;
+  __asm(
+  "movw %0, %1;"
+  : [ result ] "=r"(res)
+  : [ constant ] "J"(-4096), [ input ] "r"(i)
+  :);
+  // arm6-error@80 {{value '-4096' out of range for constraint 'J'}}
+  // arm7-error@80 {{value '-4096' out of range for constraint 'J'}}
+  // thumb1-error@80 {{value '-4096' out of range for co

[PATCH] D67216: [cfi] Add flag to always generate call frame information

2019-09-06 Thread David Candler via Phabricator via cfe-commits
dcandler added a comment.
Herald added a subscriber: ychen.

I was actually torn myself on whether to put the flag in the g group or not, so 
I'm happy to rename it. As far as I could find, no compiler has an existing 
option to control this: instead, armcc always includes a .debug_frame section 
by default to follow Arm's DWARF specification. Having it as an option seems 
more flexible than forcing a different behavior.




Comment at: clang/lib/Frontend/CompilerInvocation.cpp:956
  OPT_fno_unique_section_names, true);
+  Opts.AlwaysNeedCFI =
+  Args.hasFlag(OPT_falways_need_cfi, OPT_fno_always_need_cfi, false);

ostannard wrote:
> Is this option actually being read anywhere?
Not any more, since I moved the option directly into AsmPrinter. I'll make sure 
this (and the other line) doesn't get included in the next diff.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216





[PATCH] D102406: [ARM][AArch64] Correct __ARM_FEATURE_CRYPTO macro and crypto feature

2021-05-13 Thread David Candler via Phabricator via cfe-commits
dcandler created this revision.
dcandler added reviewers: t.p.northover, lenary.
Herald added subscribers: danielkiss, kristof.beyls.
dcandler requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch contains a couple of minor corrections to my previous
crypto patch:

Since both AArch32 and AArch64 are now correctly setting the aes and
sha2 features individually, it is not necessary to continue to check
the crypto feature when defining feature macros.

In the AArch32 driver, the feature vector is only modified when the
crypto feature is actually in the vector. If crypto is not present,
there is no need to split it and explicitly define crypto/sha2/aes.
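
The driver-side change can be sketched with standard containers only. This is an illustrative model, not the Clang code: `FeatureList`, `lastStateOf`, and `splitCrypto` are hypothetical names, and the explicit "-crypto", "+sha2"/"-sha2", and "+aes"/"-aes" entries are assumed from the description above, since the diff hunk is cut off after the first branch.

```cpp
#include <algorithm>
#include <cassert>   // for the usage checks mentioned below
#include <string>
#include <vector>

using FeatureList = std::vector<std::string>;

// Finds the last feature mentioning any of Names. Found reports whether one
// exists; the return value is true if that entry is enabled with '+'.
static bool lastStateOf(const FeatureList &Features,
                        const std::vector<std::string> &Names, bool &Found) {
  auto It = std::find_if(
      Features.rbegin(), Features.rend(), [&](const std::string &F) {
        return std::any_of(Names.begin(), Names.end(),
                           [&](const std::string &N) {
                             return F.find(N) != std::string::npos;
                           });
      });
  Found = It != Features.rend();
  return Found && !It->empty() && It->front() == '+';
}

// Only rewrite the vector when "crypto" itself appears in it.
void splitCrypto(FeatureList &Features) {
  bool FoundCrypto = false, FoundSHA2 = false, FoundAES = false;
  lastStateOf(Features, {"crypto"}, FoundCrypto);
  bool HasSHA2 = lastStateOf(Features, {"crypto", "sha2"}, FoundSHA2);
  bool HasAES = lastStateOf(Features, {"crypto", "aes"}, FoundAES);
  if (!FoundCrypto)
    return; // crypto absent: leave the feature vector untouched
  Features.push_back(HasSHA2 && HasAES ? "+crypto" : "-crypto");
  Features.push_back(HasSHA2 ? "+sha2" : "-sha2"); // assumed split
  Features.push_back(HasAES ? "+aes" : "-aes");    // assumed split
}
```

For example, in this model `{"+crypto", "-sha2"}` gains `"-crypto"`, `"-sha2"`, `"+aes"`, whereas `{"+sha2"}` is left unchanged because crypto never appears.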


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D102406

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Driver/ToolChains/Arch/ARM.cpp


Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -636,6 +636,10 @@
   // FIXME: this needs reimplementation after the TargetParser rewrite
   bool HasSHA2 = false;
   bool HasAES = false;
+  const auto ItCrypto =
+  llvm::find_if(llvm::reverse(Features), [](const StringRef F) {
+return F.contains("crypto");
+  });
   const auto ItSHA2 =
   llvm::find_if(llvm::reverse(Features), [](const StringRef F) {
 return F.contains("crypto") || F.contains("sha2");
@@ -650,7 +654,7 @@
 HasSHA2 = ItSHA2->take_front() == "+";
   if (FoundAES)
 HasAES = ItAES->take_front() == "+";
-  if (FoundSHA2 || FoundAES) {
+  if (ItCrypto != Features.rend()) {
 if (HasSHA2 && HasAES)
   Features.push_back("+crypto");
 else
Index: clang/lib/Basic/Targets/ARM.cpp
===
--- clang/lib/Basic/Targets/ARM.cpp
+++ clang/lib/Basic/Targets/ARM.cpp
@@ -649,7 +649,7 @@
 // ACLE 6.5.7 Crypto Extension
 // The __ARM_FEATURE_CRYPTO is deprecated in favor of finer grained
 // feature macros for AES and SHA2
-if (Crypto || (SHA2 && AES))
+if (SHA2 && AES)
   Builder.defineMacro("__ARM_FEATURE_CRYPTO", "1");
 if (SHA2)
   Builder.defineMacro("__ARM_FEATURE_SHA2", "1");
Index: clang/lib/Basic/Targets/AArch64.cpp
===
--- clang/lib/Basic/Targets/AArch64.cpp
+++ clang/lib/Basic/Targets/AArch64.cpp
@@ -289,7 +289,7 @@
 
   // The __ARM_FEATURE_CRYPTO is deprecated in favor of finer grained feature
   // macros for AES, SHA2, SHA3 and SM4
-  if (HasCrypto || (HasAES && HasSHA2))
+  if (HasAES && HasSHA2)
 Builder.defineMacro("__ARM_FEATURE_CRYPTO", "1");
 
   if (HasAES)




[PATCH] D102406: [ARM][AArch64] Correct __ARM_FEATURE_CRYPTO macro and crypto feature

2021-05-14 Thread David Candler via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3d59f9d22440: [ARM][AArch64] Correct __ARM_FEATURE_CRYPTO 
macro and crypto feature (authored by dcandler).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102406/new/

https://reviews.llvm.org/D102406

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Driver/ToolChains/Arch/ARM.cpp


Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -636,6 +636,10 @@
   // FIXME: this needs reimplementation after the TargetParser rewrite
   bool HasSHA2 = false;
   bool HasAES = false;
+  const auto ItCrypto =
+  llvm::find_if(llvm::reverse(Features), [](const StringRef F) {
+return F.contains("crypto");
+  });
   const auto ItSHA2 =
   llvm::find_if(llvm::reverse(Features), [](const StringRef F) {
 return F.contains("crypto") || F.contains("sha2");
@@ -650,7 +654,7 @@
 HasSHA2 = ItSHA2->take_front() == "+";
   if (FoundAES)
 HasAES = ItAES->take_front() == "+";
-  if (FoundSHA2 || FoundAES) {
+  if (ItCrypto != Features.rend()) {
 if (HasSHA2 && HasAES)
   Features.push_back("+crypto");
 else
Index: clang/lib/Basic/Targets/ARM.cpp
===
--- clang/lib/Basic/Targets/ARM.cpp
+++ clang/lib/Basic/Targets/ARM.cpp
@@ -649,7 +649,7 @@
 // ACLE 6.5.7 Crypto Extension
 // The __ARM_FEATURE_CRYPTO is deprecated in favor of finer grained
 // feature macros for AES and SHA2
-if (Crypto || (SHA2 && AES))
+if (SHA2 && AES)
   Builder.defineMacro("__ARM_FEATURE_CRYPTO", "1");
 if (SHA2)
   Builder.defineMacro("__ARM_FEATURE_SHA2", "1");
Index: clang/lib/Basic/Targets/AArch64.cpp
===
--- clang/lib/Basic/Targets/AArch64.cpp
+++ clang/lib/Basic/Targets/AArch64.cpp
@@ -289,7 +289,7 @@
 
   // The __ARM_FEATURE_CRYPTO is deprecated in favor of finer grained feature
   // macros for AES, SHA2, SHA3 and SM4
-  if (HasCrypto || (HasAES && HasSHA2))
+  if (HasAES && HasSHA2)
 Builder.defineMacro("__ARM_FEATURE_CRYPTO", "1");
 
   if (HasAES)




[PATCH] D129298: Add denormal-fp-math attribute for f16

2022-11-02 Thread David Candler via Phabricator via cfe-commits
dcandler abandoned this revision.
dcandler added a comment.

Sorry for the quiet on this. I'm going to abandon this for the moment, as what 
I eventually found was that there was some ambiguity in the ARM ABI regarding 
half-floats which would be better to address first, so that the attributes can 
map directly. There is currently only one ARM build attribute for denormals 
which reads as though it affects all precisions, but may not have been updated 
after half-float support was added. Since that maps to denormal-fp-math, which 
also controls all precisions, both may need splitting rather than just the 
function level attribute.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129298/new/

https://reviews.llvm.org/D129298



[PATCH] D67216: [cfi] Add flag to always generate .debug_frame

2019-10-31 Thread David Candler via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG92aa0c2dbcb7: [cfi] Add flag to always generate .debug_frame 
(authored by dcandler).
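
The hunks below replace several copies of the "debug info or unwind table"
check with a single MF.needsFrameMoves() call. A minimal sketch of that kind of
centralized predicate follows, assuming the new flag is simply OR-ed into the
old condition. FrameMoveInputs, its field names and needsDwarfCFI are
hypothetical stand-ins for illustration, not LLVM's API.

```cpp
#include <cassert>

// Hypothetical stand-in for the state a MachineFunction can query.
// The field names are assumptions made for this sketch.
struct FrameMoveInputs {
  bool HasDebugInfo;           // the module carries debug info
  bool NeedsUnwindTableEntry;  // the function needs an unwind table entry
  bool ForceDwarfFrameSection; // assumed name for the "always emit" flag
};

// One predicate shared by all targets: frame moves are needed when any of
// the three conditions holds.
bool needsFrameMoves(const FrameMoveInputs &In) {
  return In.HasDebugInfo || In.ForceDwarfFrameSection ||
         In.NeedsUnwindTableEntry;
}

// Mirrors the shape of the X86 call sites in the diff: DWARF CFI is never
// emitted for a Win64 prologue, whatever the predicate says.
bool needsDwarfCFI(bool IsWin64Prologue, const FrameMoveInputs &In) {
  return !IsWin64Prologue && needsFrameMoves(In);
}

// Usage: with only the force flag set, frame moves are still requested.
void checkFrameMoves() {
  FrameMoveInputs OnlyForced{false, false, true};
  assert(needsFrameMoves(OnlyForced));
  assert(!needsDwarfCFI(/*IsWin64Prologue=*/true, OnlyForced));
}
```

Keeping the condition in one place means a target that already called the
shared check picks up the new flag without further changes.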

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67216/new/

https://reviews.llvm.org/D67216

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/Driver/fforce-dwarf-frame.c
  llvm/include/llvm/CodeGen/CommandFlags.inc
  llvm/include/llvm/CodeGen/MachineFunction.h
  llvm/include/llvm/Target/TargetOptions.h
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/lib/CodeGen/AsmPrinter/DwarfCFIException.cpp
  llvm/lib/CodeGen/CFIInstrInserter.cpp
  llvm/lib/CodeGen/MachineFunction.cpp
  llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  llvm/lib/Target/ARC/ARCRegisterInfo.cpp
  llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
  llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
  llvm/lib/Target/X86/X86FrameLowering.cpp
  llvm/lib/Target/X86/X86InstrInfo.cpp
  llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
  llvm/test/CodeGen/ARM/dwarf-frame.ll

Index: llvm/test/CodeGen/ARM/dwarf-frame.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/dwarf-frame.ll
@@ -0,0 +1,38 @@
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -o - %s | FileCheck %s --check-prefix=CHECK-NO-CFI
+; RUN: llc -mtriple armv7-unknown -frame-pointer=all -filetype=asm -force-dwarf-frame-section -o - %s | FileCheck %s --check-prefix=CHECK-ALWAYS-CFI
+
+declare void @dummy_use(i32*, i32)
+
+define void @test_basic() #0 {
+  %mem = alloca i32, i32 10
+  call void @dummy_use (i32* %mem, i32 10)
+  ret void
+}
+
+; CHECK-NO-CFI-LABEL: test_basic:
+; CHECK-NO-CFI:   .fnstart
+; CHECK-NO-CFI-NOT:   .cfi_sections .debug_frame
+; CHECK-NO-CFI-NOT:   .cfi_startproc
+; CHECK-NO-CFI:   @ %bb.0:
+; CHECK-NO-CFI:   push {r11, lr}
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_offset 8
+; CHECK-NO-CFI-NOT:   .cfi_offset lr, -4
+; CHECK-NO-CFI-NOT:   .cfi_offset r11, -8
+; CHECK-NO-CFI:   mov r11, sp
+; CHECK-NO-CFI-NOT:   .cfi_def_cfa_register r11
+; CHECK-NO-CFI-NOT:   .cfi_endproc
+; CHECK-NO-CFI:   .fnend
+
+; CHECK-ALWAYS-CFI-LABEL: test_basic:
+; CHECK-ALWAYS-CFI:   .fnstart
+; CHECK-ALWAYS-CFI:   .cfi_sections .debug_frame
+; CHECK-ALWAYS-CFI:   .cfi_startproc
+; CHECK-ALWAYS-CFI:   @ %bb.0:
+; CHECK-ALWAYS-CFI:   push {r11, lr}
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_offset 8
+; CHECK-ALWAYS-CFI:   .cfi_offset lr, -4
+; CHECK-ALWAYS-CFI:   .cfi_offset r11, -8
+; CHECK-ALWAYS-CFI:   mov r11, sp
+; CHECK-ALWAYS-CFI:   .cfi_def_cfa_register r11
+; CHECK-ALWAYS-CFI:   .cfi_endproc
+; CHECK-ALWAYS-CFI:   .fnend
Index: llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
===
--- llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
+++ llvm/lib/Target/XCore/XCoreRegisterInfo.cpp
@@ -203,7 +203,7 @@
 }
 
 bool XCoreRegisterInfo::needsFrameMoves(const MachineFunction &MF) {
-  return MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry();
+  return MF.needsFrameMoves();
 }
 
 const MCPhysReg *
Index: llvm/lib/Target/X86/X86InstrInfo.cpp
===
--- llvm/lib/Target/X86/X86InstrInfo.cpp
+++ llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -3963,9 +3963,7 @@
   MachineFunction &MF = *MBB.getParent();
   const X86FrameLowering *TFL = Subtarget.getFrameLowering();
   bool IsWin64Prologue = MF.getTarget().getMCAsmInfo()->usesWindowsCFI();
-  bool NeedsDwarfCFI =
-      !IsWin64Prologue &&
-      (MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry());
+  bool NeedsDwarfCFI = !IsWin64Prologue && MF.needsFrameMoves();
   bool EmitCFI = !TFL->hasFP(MF) && NeedsDwarfCFI;
   if (EmitCFI) {
     TFL->BuildCFI(MBB, I, DL,
Index: llvm/lib/Target/X86/X86FrameLowering.cpp
===
--- llvm/lib/Target/X86/X86FrameLowering.cpp
+++ llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -993,8 +993,7 @@
   bool NeedsWinFPO =
       !IsFunclet && STI.isTargetWin32() && MMI.getModule()->getCodeViewFlag();
   bool NeedsWinCFI = NeedsWin64CFI || NeedsWinFPO;
-  bool NeedsDwarfCFI =
-      !IsWin64Prologue && (MMI.hasDebugInfo() || Fn.needsUnwindTableEntry());
+  bool NeedsDwarfCFI = !IsWin64Prologue && MF.needsFrameMoves();
   Register FramePtr = TRI->getFrameRegister(MF);
   const Register MachineFramePtr =
       STI.isTarget64BitILP32()
@@ -1614,10 +1613,9 @@
   bool HasFP = hasFP(MF);
   uint64_t NumBytes = 0;
 
-  bool NeedsDwarfCFI =
-  (!MF.getTarget().getTargetTriple().isOSDarwin() &&
-   !MF.getTarget().getTargetTriple().isOSWindows()) &&
-  (MF.getMMI().hasDebugInfo() || MF.getFunction().needsUnwindTableEntry());
+  boo

[PATCH] D99079: [ARM][AArch64] Require appropriate features for crypto algorithms

2021-04-28 Thread David Candler via Phabricator via cfe-commits
dcandler added inline comments.



Comment at: clang/lib/Basic/Targets/AArch64.h:36
+  bool HasSHA3;
+  bool HasSM4;
   bool HasUnaligned;

rsanthir.quic wrote:
> Would it make sense to further differentiate SM3 and SM4? I see that we 
> differentiate between the two in arm_neon.td ("ARM_FEATURE_SM3" & 
> "ARM_FEATURE_SM4") but we don't include this differentiation as flags (only 
> HasSM4, +sm4 etc)
It might make more sense in the code to differentiate them, however the current 
feature and command line options align with GCC, so changing them would go 
beyond the scope of this patch.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99079/new/

https://reviews.llvm.org/D99079



[PATCH] D99079: [ARM][AArch64] Require appropriate features for crypto algorithms

2021-04-28 Thread David Candler via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb8baa2a91324: [ARM][AArch64] Require appropriate features 
for crypto algorithms (authored by dcandler).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99079/new/

https://reviews.llvm.org/D99079

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Basic/Targets/ARM.h
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/aarch64-neon-range-checks.c
  clang/test/CodeGen/aarch64-neon-sha3.c
  clang/test/CodeGen/aarch64-neon-sm4-sm3.c
  clang/test/CodeGen/arm-target-features.c
  clang/test/CodeGen/arm64_crypto.c
  clang/test/CodeGen/neon-crypto.c
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  clang/test/Driver/arm-features.c
  clang/test/Driver/arm-mfpu.c
  clang/test/Driver/armv8.1m.main.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/lib/Target/ARM/ARMInstrNEON.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/Bindings/llvm-c/ARM/disassemble.test
  llvm/test/MC/ARM/directive-arch_extension-aes-sha2.s
  llvm/test/MC/ARM/directive-arch_extension-crypto.s
  llvm/test/MC/ARM/neon-crypto.s

Index: llvm/test/MC/ARM/neon-crypto.s
===
--- llvm/test/MC/ARM/neon-crypto.s
+++ llvm/test/MC/ARM/neon-crypto.s
@@ -9,10 +9,10 @@
 @ CHECK: aese.8 q0, q1  @ encoding: [0x02,0x03,0xb0,0xf3]
 @ CHECK: aesimc.8 q0, q1@ encoding: [0xc2,0x03,0xb0,0xf3]
 @ CHECK: aesmc.8 q0, q1 @ encoding: [0x82,0x03,0xb0,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
 
 sha1h.32  q0, q1
 sha1su1.32  q0, q1
@@ -20,9 +20,9 @@
 @ CHECK: sha1h.32  q0, q1   @ encoding: [0xc2,0x02,0xb9,0xf3]
 @ CHECK: sha1su1.32 q0, q1  @ encoding: [0x82,0x03,0xba,0xf3]
 @ CHECK: sha256su0.32 q0, q1@ encoding: [0xc2,0x03,0xba,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 sha1c.32  q0, q1, q2
 sha1m.32  q0, q1, q2
@@ -38,14 +38,14 @@
 @ CHECK: sha256h.32  q0, q1, q2  @ encoding: [0x44,0x0c,0x02,0xf3]
 @ CHECK: sha256h2.32 q0, q1, q2  @ encoding: [0x44,0x0c,0x12,0xf3]
 @ CHECK: sha256su1.32 q0, q1, q2 @ encoding: [0x44,0x0c,0x22,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 vmull.p64 q8, d16, d17
 @ CHECK: vmull.p64  q8, d16, d17@ encoding: [0xa1,0x0e,0xe0,0xf2]
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
Index: llvm/test/MC/ARM/directive-arch_extension-crypto.s
===
--- llvm/test/MC/ARM/directive-arch_extension-crypto.s
+++ llvm/test/MC/ARM/directive-arch_extension-crypto.s
@@ -17,38 +17,38 @@
 	.type crypto,%function
 crypto:
 	vmull.p64 q0, d0, d1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 
 	aesd.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aese.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aesimc.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aesmc.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 
 	sha1h.32 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: sha2 armv8
 	sha1su1.32 q0, q1
-@ CHECK-V7: error: instruction requires:

[PATCH] D99079: [ARM][AArch64] Require appropriate features for crypto algorithms

2021-03-22 Thread David Candler via Phabricator via cfe-commits
dcandler created this revision.
dcandler added reviewers: t.p.northover, rsanthir.quic, SjoerdMeijer, efriedma, 
peter.smith, labrinea.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
dcandler requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.

Several small changes are included due to the knock-on effect this has:

- The AArch32 driver has been modified to ensure sha2/aes is correctly set 
based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not enabled by 
default.
- ACLE feature macros have been updated with the fine grained crypto 
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
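
As a rough illustration of the macro change for user code, the sketch below
reports which of the ACLE crypto macros the compiler predefined, and prefers
the fine-grained ones. The enum, the function names and the fallback policy are
hypothetical and not part of the patch; on a non-ARM target none of the macros
are defined and the result is simply 0.

```cpp
#include <cassert>

// Bit flags used only by this illustration; the names are hypothetical.
enum CryptoMacroBits : unsigned {
  MacroAES = 1u << 0,    // __ARM_FEATURE_AES
  MacroSHA2 = 1u << 1,   // __ARM_FEATURE_SHA2
  MacroCrypto = 1u << 2, // __ARM_FEATURE_CRYPTO (deprecated umbrella macro)
};

// Reports which ACLE crypto macros were predefined for the current target.
unsigned definedCryptoMacros() {
  unsigned Bits = 0;
#if defined(__ARM_FEATURE_AES)
  Bits |= MacroAES;
#endif
#if defined(__ARM_FEATURE_SHA2)
  Bits |= MacroSHA2;
#endif
#if defined(__ARM_FEATURE_CRYPTO)
  Bits |= MacroCrypto;
#endif
  return Bits;
}

// Prefer the fine-grained macro. Falling back to the umbrella macro is an
// assumption made here to cope with compilers that only define the old one.
bool canUseAES() {
  return (definedCryptoMacros() & (MacroAES | MacroCrypto)) != 0;
}

// Usage: the result never contains bits outside the three flags above.
void checkCryptoMacros() {
  assert((definedCryptoMacros() & ~7u) == 0);
}
```

With the patch applied, __ARM_FEATURE_CRYPTO is only defined when both sha2 and
aes are enabled, so code that needs just one algorithm should test the specific
macro.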


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D99079

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Basic/Targets/ARM.h
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/aarch64-neon-range-checks.c
  clang/test/CodeGen/aarch64-neon-sha3.c
  clang/test/CodeGen/aarch64-neon-sm4-sm3.c
  clang/test/CodeGen/arm-target-features.c
  clang/test/CodeGen/arm64_crypto.c
  clang/test/CodeGen/neon-crypto.c
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  clang/test/Driver/arm-features.c
  clang/test/Driver/arm-mfpu.c
  clang/test/Driver/armv8.1m.main.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/lib/Target/ARM/ARMInstrNEON.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/Bindings/llvm-c/ARM/disassemble.test
  llvm/test/MC/ARM/directive-arch_extension-aes-sha2.s
  llvm/test/MC/ARM/directive-arch_extension-crypto.s
  llvm/test/MC/ARM/neon-crypto.s

Index: llvm/test/MC/ARM/neon-crypto.s
===
--- llvm/test/MC/ARM/neon-crypto.s
+++ llvm/test/MC/ARM/neon-crypto.s
@@ -9,10 +9,10 @@
 @ CHECK: aese.8 q0, q1  @ encoding: [0x02,0x03,0xb0,0xf3]
 @ CHECK: aesimc.8 q0, q1@ encoding: [0xc2,0x03,0xb0,0xf3]
 @ CHECK: aesmc.8 q0, q1 @ encoding: [0x82,0x03,0xb0,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
 
 sha1h.32  q0, q1
 sha1su1.32  q0, q1
@@ -20,9 +20,9 @@
 @ CHECK: sha1h.32  q0, q1   @ encoding: [0xc2,0x02,0xb9,0xf3]
 @ CHECK: sha1su1.32 q0, q1  @ encoding: [0x82,0x03,0xba,0xf3]
 @ CHECK: sha256su0.32 q0, q1@ encoding: [0xc2,0x03,0xba,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 sha1c.32  q0, q1, q2
 sha1m.32  q0, q1, q2
@@ -38,14 +38,14 @@
 @ CHECK: sha256h.32  q0, q1, q2  @ encoding: [0x44,0x0c,0x02,0xf3]
 @ CHECK: sha256h2.32 q0, q1, q2  @ encoding: [0x44,0x0c,0x12,0xf3]
 @ CHECK: sha256su1.32 q0, q1, q2 @ encoding: [0x44,0x0c,0x22,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 vmull.p64 q8, d16, d17
 @ CHECK: vmull.p64  q8, d16, d17@ encoding: [0xa1,0x0e,0xe0,0xf2]
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
Index: llvm/test/MC/ARM/directive-arch_extension-crypto.s
===
--- llvm/test/MC/ARM/directive-arch_extension-crypto.s
+++ llvm/test/

[PATCH] D99079: [ARM][AArch64] Require appropriate features for crypto algorithms

2021-04-16 Thread David Candler via Phabricator via cfe-commits
dcandler updated this revision to Diff 338046.
dcandler added a comment.

I've updated the patch to fix the test failures, and slightly reworked the 
driver code to avoid the above iterator invalidation. I've also added a comment 
there to clarify what it is doing: individually determining whether the sha2 
and aes features should be enabled and explicitly setting them, since they can 
be controlled both by crypto and their specific feature. Using the last 
occurrence of either in the vector ensures whatever options are passed to 
-mcpu/-march are evaluated in the correct order.
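
The "last occurrence wins" idea can be sketched as below. This is not the
driver code from the patch: resolveCryptoSubFeature is a hypothetical helper,
and the "+name"/"-name" string form of the feature vector is an assumption made
for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: decides whether a sub-feature ("aes" or "sha2") ends
// up enabled, given that both "crypto" and the specific feature control it.
bool resolveCryptoSubFeature(const std::vector<std::string> &Features,
                             const std::string &Name, bool Default) {
  const std::string Plus = "+" + Name;
  const std::string Minus = "-" + Name;
  // Walk backwards so the last mention of either "crypto" or the specific
  // feature decides the outcome.
  for (auto It = Features.rbegin(); It != Features.rend(); ++It) {
    if (*It == Plus || *It == "+crypto")
      return true;
    if (*It == Minus || *It == "-crypto")
      return false;
  }
  // Neither was mentioned: keep whatever the arch/cpu default was.
  return Default;
}

// Usage: "+crypto" followed by "-aes" leaves sha2 on and aes off.
void checkFeatureOrdering() {
  const std::vector<std::string> Features = {"+crypto", "-aes"};
  assert(resolveCryptoSubFeature(Features, "sha2", false));
  assert(!resolveCryptoSubFeature(Features, "aes", false));
}
```

Resolving each sub-feature separately and then setting it explicitly avoids
depending on how later stages interpret a mixed list of crypto, sha2 and aes
entries.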


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99079/new/

https://reviews.llvm.org/D99079

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Basic/Targets/ARM.h
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/aarch64-neon-range-checks.c
  clang/test/CodeGen/aarch64-neon-sha3.c
  clang/test/CodeGen/aarch64-neon-sm4-sm3.c
  clang/test/CodeGen/arm-target-features.c
  clang/test/CodeGen/arm64_crypto.c
  clang/test/CodeGen/neon-crypto.c
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  clang/test/Driver/arm-features.c
  clang/test/Driver/arm-mfpu.c
  clang/test/Driver/armv8.1m.main.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/lib/Target/ARM/ARMInstrNEON.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/Bindings/llvm-c/ARM/disassemble.test
  llvm/test/MC/ARM/directive-arch_extension-aes-sha2.s
  llvm/test/MC/ARM/directive-arch_extension-crypto.s
  llvm/test/MC/ARM/neon-crypto.s

Index: llvm/test/MC/ARM/neon-crypto.s
===
--- llvm/test/MC/ARM/neon-crypto.s
+++ llvm/test/MC/ARM/neon-crypto.s
@@ -9,10 +9,10 @@
 @ CHECK: aese.8 q0, q1  @ encoding: [0x02,0x03,0xb0,0xf3]
 @ CHECK: aesimc.8 q0, q1@ encoding: [0xc2,0x03,0xb0,0xf3]
 @ CHECK: aesmc.8 q0, q1 @ encoding: [0x82,0x03,0xb0,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
 
 sha1h.32  q0, q1
 sha1su1.32  q0, q1
@@ -20,9 +20,9 @@
 @ CHECK: sha1h.32  q0, q1   @ encoding: [0xc2,0x02,0xb9,0xf3]
 @ CHECK: sha1su1.32 q0, q1  @ encoding: [0x82,0x03,0xba,0xf3]
 @ CHECK: sha256su0.32 q0, q1@ encoding: [0xc2,0x03,0xba,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 sha1c.32  q0, q1, q2
 sha1m.32  q0, q1, q2
@@ -38,14 +38,14 @@
 @ CHECK: sha256h.32  q0, q1, q2  @ encoding: [0x44,0x0c,0x02,0xf3]
 @ CHECK: sha256h2.32 q0, q1, q2  @ encoding: [0x44,0x0c,0x12,0xf3]
 @ CHECK: sha256su1.32 q0, q1, q2 @ encoding: [0x44,0x0c,0x22,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 vmull.p64 q8, d16, d17
 @ CHECK: vmull.p64  q8, d16, d17@ encoding: [0xa1,0x0e,0xe0,0xf2]
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
Index: llvm/test/MC/ARM/directive-arch_extension-crypto.s
===
--- llvm/test/MC/ARM/directive-arch_extension-crypto.s
+++ llvm/test/MC/ARM/directive-arch_extension-crypto.s
@@ -17,38 +17,38 @@
 	.type crypto,%function
 crypto:
 	vmull.p64 q0, d0, d1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 
 	aesd.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aese.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aesimc.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: err

[PATCH] D99079: [ARM][AArch64] Require appropriate features for crypto algorithms

2021-04-16 Thread David Candler via Phabricator via cfe-commits
dcandler updated this revision to Diff 338126.
dcandler marked 2 inline comments as done.
dcandler added a comment.

Removed one duplicated line.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99079/new/

https://reviews.llvm.org/D99079

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Basic/Targets/ARM.h
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/aarch64-neon-range-checks.c
  clang/test/CodeGen/aarch64-neon-sha3.c
  clang/test/CodeGen/aarch64-neon-sm4-sm3.c
  clang/test/CodeGen/arm-target-features.c
  clang/test/CodeGen/arm64_crypto.c
  clang/test/CodeGen/neon-crypto.c
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  clang/test/Driver/arm-features.c
  clang/test/Driver/arm-mfpu.c
  clang/test/Driver/armv8.1m.main.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/lib/Target/ARM/ARMInstrNEON.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/Bindings/llvm-c/ARM/disassemble.test
  llvm/test/MC/ARM/directive-arch_extension-aes-sha2.s
  llvm/test/MC/ARM/directive-arch_extension-crypto.s
  llvm/test/MC/ARM/neon-crypto.s

Index: llvm/test/MC/ARM/neon-crypto.s
===
--- llvm/test/MC/ARM/neon-crypto.s
+++ llvm/test/MC/ARM/neon-crypto.s
@@ -9,10 +9,10 @@
 @ CHECK: aese.8 q0, q1  @ encoding: [0x02,0x03,0xb0,0xf3]
 @ CHECK: aesimc.8 q0, q1@ encoding: [0xc2,0x03,0xb0,0xf3]
 @ CHECK: aesmc.8 q0, q1 @ encoding: [0x82,0x03,0xb0,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
+@ CHECK-V7: instruction requires: aes armv8
 
 sha1h.32  q0, q1
 sha1su1.32  q0, q1
@@ -20,9 +20,9 @@
 @ CHECK: sha1h.32  q0, q1   @ encoding: [0xc2,0x02,0xb9,0xf3]
 @ CHECK: sha1su1.32 q0, q1  @ encoding: [0x82,0x03,0xba,0xf3]
 @ CHECK: sha256su0.32 q0, q1@ encoding: [0xc2,0x03,0xba,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 sha1c.32  q0, q1, q2
 sha1m.32  q0, q1, q2
@@ -38,14 +38,14 @@
 @ CHECK: sha256h.32  q0, q1, q2  @ encoding: [0x44,0x0c,0x02,0xf3]
 @ CHECK: sha256h2.32 q0, q1, q2  @ encoding: [0x44,0x0c,0x12,0xf3]
 @ CHECK: sha256su1.32 q0, q1, q2 @ encoding: [0x44,0x0c,0x22,0xf3]
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
+@ CHECK-V7: instruction requires: sha2 armv8
 
 vmull.p64 q8, d16, d17
 @ CHECK: vmull.p64  q8, d16, d17@ encoding: [0xa1,0x0e,0xe0,0xf2]
-@ CHECK-V7: instruction requires: crypto armv8
+@ CHECK-V7: instruction requires: aes armv8
Index: llvm/test/MC/ARM/directive-arch_extension-crypto.s
===
--- llvm/test/MC/ARM/directive-arch_extension-crypto.s
+++ llvm/test/MC/ARM/directive-arch_extension-crypto.s
@@ -17,38 +17,38 @@
 	.type crypto,%function
 crypto:
 	vmull.p64 q0, d0, d1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 
 	aesd.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aese.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aesimc.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 	aesmc.8 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: aes armv8
 
 	sha1h.32 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: sha2 armv8
 	sha1su1.32 q0, q1
-@ CHECK-V7: error: instruction requires: crypto armv8
+@ CHECK-V7: error: instruction requires: sha2 armv8
 	sha256su0.32 q0, q1
-