[clang] [llvm] [Inliner] Propagate more attributes to params when inlining (PR #91101)

2024-05-11 Thread Yingwei Zheng via cfe-commits


@@ -1352,20 +1352,43 @@ static void AddParamAndFnBasicAttributes(const CallBase &CB,
   auto &Context = CalledFunction->getContext();
 
   // Collect valid attributes for all params.
-  SmallVector<AttrBuilder> ValidParamAttrs;
+  SmallVector<AttrBuilder> ValidObjParamAttrs, ValidExactParamAttrs;
   bool HasAttrToPropagate = false;
 
   for (unsigned I = 0, E = CB.arg_size(); I < E; ++I) {
-    ValidParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
+    ValidObjParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
+    ValidExactParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
     // Access attributes can be propagated to any param with the same underlying
     // object as the argument.
     if (CB.paramHasAttr(I, Attribute::ReadNone))
-      ValidParamAttrs.back().addAttribute(Attribute::ReadNone);
+      ValidObjParamAttrs.back().addAttribute(Attribute::ReadNone);
     if (CB.paramHasAttr(I, Attribute::ReadOnly))
-      ValidParamAttrs.back().addAttribute(Attribute::ReadOnly);
+      ValidObjParamAttrs.back().addAttribute(Attribute::ReadOnly);
     if (CB.paramHasAttr(I, Attribute::WriteOnly))
-      ValidParamAttrs.back().addAttribute(Attribute::WriteOnly);
-    HasAttrToPropagate |= ValidParamAttrs.back().hasAttributes();
+      ValidObjParamAttrs.back().addAttribute(Attribute::WriteOnly);
+
+    // Attributes we can only propagate if the exact parameter is forwarded.
+
+    // We can propagate both poison generating an UB generating attributes
+    // without any extra checks. The only attribute that is tricky to propagate
+    // is `noundef` (skipped for now) as that can create new UB where previous

dtcxzyw wrote:

`noundef` should be safe to propagate. If a poison/undef value were passed
into the callee, that would already trigger immediate UB.
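
For readers following the hunk above, here is a minimal C++ sketch of the two buckets the patch introduces. It is not LLVM code: `OuterArgAttrs`, `InnerOperand`, `Propagated`, and `propagate` are hypothetical names, and an operand is modelled only as "callee argument plus byte offset". It assumes access attributes follow the underlying object while attributes such as `dereferenceable` are only copied when the argument is forwarded unchanged. `noundef` is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
struct OuterArgAttrs {          // attributes on the call site being inlined
  bool readOnly = false;        // an access attribute
  std::uint64_t derefBytes = 0; // 0 means "no dereferenceable attribute"
};

struct InnerOperand {           // operand of a call inside the inlined callee
  unsigned baseArg = 0;         // callee argument it is derived from
  std::int64_t offset = 0;      // 0 means the argument is forwarded as-is
};

struct Propagated {
  bool readOnly = false;
  std::uint64_t derefBytes = 0;
};

// Access attributes apply to anything based on the same underlying object.
// Exact-parameter attributes apply only when the value itself is forwarded.
Propagated propagate(const OuterArgAttrs &outer, const InnerOperand &op) {
  Propagated p;
  p.readOnly = outer.readOnly;
  if (op.offset == 0)
    p.derefBytes = outer.derefBytes;
  return p;
}
```

With `readOnly` set and `derefBytes == 16` on the outer call, an inner operand at offset 4 keeps `readOnly` but gets no `dereferenceable` guarantee, while a directly forwarded operand gets both.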


https://github.com/llvm/llvm-project/pull/91101
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Inliner] Propagate more attributes to params when inlining (PR #91101)

2024-05-11 Thread Yingwei Zheng via cfe-commits


@@ -1352,20 +1352,43 @@ static void AddParamAndFnBasicAttributes(const CallBase &CB,
   auto &Context = CalledFunction->getContext();
 
   // Collect valid attributes for all params.
-  SmallVector<AttrBuilder> ValidParamAttrs;
+  SmallVector<AttrBuilder> ValidObjParamAttrs, ValidExactParamAttrs;
   bool HasAttrToPropagate = false;
 
   for (unsigned I = 0, E = CB.arg_size(); I < E; ++I) {
-    ValidParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
+    ValidObjParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
+    ValidExactParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
     // Access attributes can be propagated to any param with the same underlying
     // object as the argument.
     if (CB.paramHasAttr(I, Attribute::ReadNone))
-      ValidParamAttrs.back().addAttribute(Attribute::ReadNone);
+      ValidObjParamAttrs.back().addAttribute(Attribute::ReadNone);
     if (CB.paramHasAttr(I, Attribute::ReadOnly))
-      ValidParamAttrs.back().addAttribute(Attribute::ReadOnly);
+      ValidObjParamAttrs.back().addAttribute(Attribute::ReadOnly);
     if (CB.paramHasAttr(I, Attribute::WriteOnly))
-      ValidParamAttrs.back().addAttribute(Attribute::WriteOnly);
-    HasAttrToPropagate |= ValidParamAttrs.back().hasAttributes();
+      ValidObjParamAttrs.back().addAttribute(Attribute::WriteOnly);
+
+    // Attributes we can only propagate if the exact parameter is forwarded.
+
+    // We can propagate both poison generating an UB generating attributes

dtcxzyw wrote:

```suggestion
    // We can propagate both poison generating and UB generating attributes
```

https://github.com/llvm/llvm-project/pull/91101


[clang] [llvm] [Inliner] Propagate more attributes to params when inlining (PR #91101)

2024-05-10 Thread Yingwei Zheng via cfe-commits


@@ -1383,15 +1406,54 @@ static void AddParamAndFnBasicAttributes(const CallBase &CB,
       AttributeList AL = NewInnerCB->getAttributes();
       for (unsigned I = 0, E = InnerCB->arg_size(); I < E; ++I) {
         // Check if the underlying value for the parameter is an argument.
-        const Value *UnderlyingV =
-            getUnderlyingObject(InnerCB->getArgOperand(I));
-        const Argument *Arg = dyn_cast<Argument>(UnderlyingV);
-        if (!Arg)
-          continue;
+        const Argument *Arg = dyn_cast<Argument>(InnerCB->getArgOperand(I));
+        unsigned ArgNo;
+        if (Arg) {
+          ArgNo = Arg->getArgNo();
+          // For dereferenceable, dereferenceable_or_null, align, etc...
+          // we don't want to propagate if the existing param has the same
+          // attribute with "better" constraints. So, only remove from the
+          // existing AL if the region of the existing param is smaller than
+          // what we can propagate. AttributeList's merge API honours the
+          // already existing attribute value so we choose the "better"
+          // attribute by removing if the existing one is worse.
+          if (AL.getParamDereferenceableBytes(I) <
+              ValidExactParamAttrs[ArgNo].getDereferenceableBytes())
+            AL =
+                AL.removeParamAttribute(Context, I, Attribute::Dereferenceable);
+          if (AL.getParamDereferenceableOrNullBytes(I) <
+              ValidExactParamAttrs[ArgNo].getDereferenceableOrNullBytes())
+            AL =
+                AL.removeParamAttribute(Context, I, Attribute::Dereferenceable);
+          if (AL.getParamAlignment(I).valueOrOne() <
+              ValidExactParamAttrs[ArgNo].getAlignment().valueOrOne())
+            AL = AL.removeParamAttribute(Context, I, Attribute::Alignment);
+
+          auto ExistingRange = AL.getParamRange(I);
+          AL = AL.addParamAttributes(Context, I, ValidExactParamAttrs[ArgNo]);
+
+          // For range we use the exact intersection.
+          if (ExistingRange.has_value()) {
+            if (auto NewRange = ValidExactParamAttrs[ArgNo].getRange()) {
+              auto CombinedRange = ExistingRange->exactIntersectWith(*NewRange);
+              if (!CombinedRange.has_value())
+                CombinedRange =
+                    ConstantRange::getEmpty(NewRange->getBitWidth());
+              AL = AL.removeParamAttribute(Context, I, Attribute::Range);
+              AL = AL.addRangeParamAttr(Context, I, *CombinedRange);
+            }
+          }
+        } else {
+          const Value *UnderlyingV =
dtcxzyw wrote:

```suggestion
          // Check if the underlying value for the parameter is an argument.
          const Value *UnderlyingV =
```
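
The "keep the better constraint" rule described in the hunk's comment can be sketched outside LLVM as below. This is a simplified illustration, not the patch's implementation: `ParamAttrs` and `mergeKeepingStronger` are hypothetical names, and ranges are modelled as plain non-wrapping half-open intervals `[lo, hi)`, which is an assumption (LLVM's `ConstantRange` can also wrap). The sketch takes the larger dereferenceable size, the larger alignment, and the intersection of the two ranges, using an empty interval when they do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical, simplified attribute set for one call parameter.
struct ParamAttrs {
  std::uint64_t derefBytes = 0; // 0 means "attribute absent"
  std::uint64_t align = 1;      // 1 means "no alignment guarantee"
  // Half-open, non-wrapping interval [first, second); absent if unset.
  std::optional<std::pair<std::int64_t, std::int64_t>> range;
};

// Combine the attributes already on the inner call with the ones that can be
// propagated from the outer call, keeping the stronger guarantee for each.
ParamAttrs mergeKeepingStronger(const ParamAttrs &existing,
                                const ParamAttrs &incoming) {
  ParamAttrs out;
  out.derefBytes = std::max(existing.derefBytes, incoming.derefBytes);
  out.align = std::max(existing.align, incoming.align);

  if (existing.range && incoming.range) {
    // Both sides constrain the value: intersect the intervals.
    std::int64_t lo = std::max(existing.range->first, incoming.range->first);
    std::int64_t hi = std::min(existing.range->second, incoming.range->second);
    if (lo < hi)
      out.range.emplace(lo, hi);
    else
      out.range.emplace(0, 0); // disjoint: represent as an empty interval
  } else {
    // Only one side (or neither) has a range: keep whichever exists.
    out.range = existing.range ? existing.range : incoming.range;
  }
  return out;
}
```

For example, merging `{derefBytes = 8, align = 4, range = [0, 10)}` with `{derefBytes = 16, align = 2, range = [5, 20)}` yields 16 dereferenceable bytes, alignment 4, and the range `[5, 10)`.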


https://github.com/llvm/llvm-project/pull/91101


[clang] [llvm] [Inliner] Propagate more attributes to params when inlining (PR #91101)

2024-05-10 Thread Yingwei Zheng via cfe-commits


@@ -1352,20 +1352,43 @@ static void AddParamAndFnBasicAttributes(const CallBase &CB,
   auto &Context = CalledFunction->getContext();
 
   // Collect valid attributes for all params.
-  SmallVector<AttrBuilder> ValidParamAttrs;
+  SmallVector<AttrBuilder> ValidObjParamAttrs, ValidExactParamAttrs;
   bool HasAttrToPropagate = false;
 
   for (unsigned I = 0, E = CB.arg_size(); I < E; ++I) {
-    ValidParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
+    ValidObjParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
+    ValidExactParamAttrs.emplace_back(AttrBuilder{CB.getContext()});
     // Access attributes can be propagated to any param with the same underlying
     // object as the argument.
     if (CB.paramHasAttr(I, Attribute::ReadNone))
-      ValidParamAttrs.back().addAttribute(Attribute::ReadNone);
+      ValidObjParamAttrs.back().addAttribute(Attribute::ReadNone);
     if (CB.paramHasAttr(I, Attribute::ReadOnly))
-      ValidParamAttrs.back().addAttribute(Attribute::ReadOnly);
+      ValidObjParamAttrs.back().addAttribute(Attribute::ReadOnly);
     if (CB.paramHasAttr(I, Attribute::WriteOnly))
-      ValidParamAttrs.back().addAttribute(Attribute::WriteOnly);
-    HasAttrToPropagate |= ValidParamAttrs.back().hasAttributes();
+      ValidObjParamAttrs.back().addAttribute(Attribute::WriteOnly);
+
+    // Attributes we can only propagate if the exact parameter is forwarded.
+
+    // We can propagate both poison generating an UB generating attributes
+    // without any extra checks. The only attribute that is tricky to propagate
+    // is `noundef` (skipped for now) as that can create new UB where previous
+    // behavior was just using a poison value.
+    if (auto DerefBytes = CB.getParamDereferenceableBytes(I))
+      ValidExactParamAttrs.back().addDereferenceableAttr(DerefBytes);

dtcxzyw wrote:

```suggestion
    auto &ExactAttr = ValidExactParamAttrs.back();
    if (auto DerefBytes = CB.getParamDereferenceableBytes(I))
      ExactAttr.addDereferenceableAttr(DerefBytes);
```


https://github.com/llvm/llvm-project/pull/91101


[clang] [llvm] [Inliner] Propagate more attributes to params when inlining (PR #91101)

2024-05-10 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw commented:

Oops I forgot to submit my review comment :(


https://github.com/llvm/llvm-project/pull/91101


[clang] [llvm] [Inliner] Propagate more attributes to params when inlining (PR #91101)

2024-05-10 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw edited 
https://github.com/llvm/llvm-project/pull/91101


[clang] [llvm] [RISCV] Add processor definition and scheduling model for XiangShan-KunMingHu (PR #90392)

2024-04-28 Thread Yingwei Zheng via cfe-commits



@@ -0,0 +1,1489 @@
+//==- RISCVSchedXiangShanKunMingHu.td - XiangShanKunMingHu Scheduling Defs -*- tablegen -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// The XiangShan is a high-performance open-source RISC-V processor project
+// initiated by the Institute of Computing Technology(ICT), Chinese Academy of Sciences(CAS).
+// The KunMingHu architecture is its third-generation derivative,
+// developed by the Institute of Computing Technology, Chinese Academy of Sciences
+// and the Beijing Institute of Open Source Chip (BOSC),
+// with a focus on achieving higher performance.
+// Source: https://github.com/OpenXiangShan/XiangShan
+// Documentation: https://github.com/OpenXiangShan/XiangShan-doc
+
+//===----------------------------------------------------------------------===//
+// KunMingHu core supports "RV64IMAFDCV_zba_zbb_zbc_zbs_zbkb_zbkc_zbkx_zknd_zkne_zknh
+// _zksed_zksh_svinval_zicbom_zicboz_zicsr_zifencei"
+// then floating-point SEW can only be 64 and 32, not 16 and 8.
+class NoZvfhSchedSEWSet_rm8and16 {
+  defvar t = SchedSEWSet.val; 
+  defvar remove8and16 = !if(isF, !listremove(t, [8, 16]), t);
+  list val = remove8and16;
+}
+
+class NoZvfhSmallestSEW {
+  int r = !head(NoZvfhSchedSEWSet_rm8and16.val);
+}
+
+multiclass NoZvfh_LMULSEWReadAdvanceImpl writes = [],
+  list MxList, bit isF = 0,
+  bit isWidening = 0> {
+  if !exists(name # "_WorstCase") then
+def : ReadAdvance(name # "_WorstCase"), val, writes>;
+  foreach mx = MxList in {
+foreach sew = NoZvfhSchedSEWSet_rm8and16.val in
+  if !exists(name # "_" # mx # "_E" # sew) then
+def : ReadAdvance(name # "_" # mx # "_E" # sew), val, 
writes>;
+  }
+}
+
+multiclass LMULSEWReadAdvanceFnoZvfh 
writes = []>
+  : NoZvfh_LMULSEWReadAdvanceImpl;
+
+multiclass LMULSEWReadAdvanceFWnoZvfh 
writes = []>
+: NoZvfh_LMULSEWReadAdvanceImpl;
+
+//===----------------------------------------------------------------------===//
+// If Zvfhmin and Zvfh are not supported, floating-point SEW can only be 32 or 64.
+class NoZvfhSchedSEWSet_rm32and64 {
+  defvar t = SchedSEWSet.val;
+  defvar remove32and64 = !if(isF, !listremove(t, [32, 64]), t);
+  list val = remove32and64;
+}
+
+// Write-Impl
+multiclass NoZvfhLMULSEWWriteResImpl 
resources,
+   list MxList, bit isF = 0,
+   bit isWidening = 0> {
+  foreach mx = MxList in {
+foreach sew = NoZvfhSchedSEWSet_rm32and64.val in
+  if !exists(name # "_" # mx # "_E" # sew) then
+def : WriteRes(name # "_" # mx # "_E" # sew), 
resources>;
+  }
+}
+// Read-Impl
+multiclass NoZvfhLMULSEWReadAdvanceImpl 
writes = [],
+  list MxList, bit isF = 0,
+  bit isWidening = 0> {
+  foreach mx = MxList in {
+foreach sew = NoZvfhSchedSEWSet_rm32and64.val in
+  if !exists(name # "_" # mx # "_E" # sew) then
+def : ReadAdvance(name # "_" # mx # "_E" # sew), val, 
writes>;
+  }
+}
+
+// Write
+multiclass NoZvfhLMULSEWWriteResF 
resources>
+: NoZvfhLMULSEWWriteResImpl;
+
+multiclass NoZvfhLMULSEWWriteResFW 
resources>
+: NoZvfhLMULSEWWriteResImpl;
+
+multiclass NoZvfhLMULSEWWriteResFWRed 
resources>
+: NoZvfhLMULSEWWriteResImpl;
+
+// Read
+multiclass NoZvfhLMULSEWReadAdvanceF 
writes = []>
+  : NoZvfhLMULSEWReadAdvanceImpl;
+multiclass
+NoZvfhLMULSEWReadAdvanceFW writes = 
[]>
+: NoZvfhLMULSEWReadAdvanceImpl;
+
+multiclass UnsupportedSchedZvfh {
+let Unsupported = true in {
+// Write 
+// 13. Vector Floating-Point Instructions
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFALUV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFALUF", []>;
+defm "" : NoZvfhLMULSEWWriteResFW<"WriteVFWALUV", []>;
+defm "" : NoZvfhLMULSEWWriteResFW<"WriteVFWALUF", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFMulV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFMulF", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFDivV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFDivF", []>;
+defm "" : NoZvfhLMULSEWWriteResFW<"WriteVFWMulV", []>;
+defm "" : NoZvfhLMULSEWWriteResFW<"WriteVFWMulF", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFMulAddV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFMulAddF", []>;
+defm "" : NoZvfhLMULSEWWriteResFW<"WriteVFWMulAddV", []>;
+defm "" : NoZvfhLMULSEWWriteResFW<"WriteVFWMulAddF", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFSqrtV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFRecpV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFMinMaxV", []>;
+defm "" : NoZvfhLMULSEWWriteResF<"WriteVFMinMaxF", []>;
+defm "" : 

[clang] [llvm] [RISCV] Add processor definition and scheduling model for XiangShan-KunMingHu (PR #90392)

2024-04-28 Thread Yingwei Zheng via cfe-commits



@@ -378,3 +378,31 @@ def XIANGSHAN_NANHU : RISCVProcessorModel<"xiangshan-nanhu",
 TuneZExtHFusion,
 TuneZExtWFusion,
 TuneShiftedZExtWFusion]>;
+   
 
+def XIANGSHAN_KUNMINGHU : RISCVProcessorModel<"xiangshan-kunminghu",
+  XiangShanKunMingHuModel,
+  [Feature64Bit,
+   FeatureStdExtI,
+   FeatureStdExtZicsr,
+   FeatureStdExtZifencei,
+   FeatureStdExtM,
+   FeatureStdExtA,
+   FeatureStdExtF,
+   FeatureStdExtD,
+   FeatureStdExtC,
+   FeatureStdExtZba,
+   FeatureStdExtZbb,
+   FeatureStdExtZbc,
+   FeatureStdExtZbs,
+   FeatureStdExtZkn,
+   FeatureStdExtZksed,
+   FeatureStdExtZksh,
+   FeatureStdExtSvinval,
+   FeatureStdExtZicbom,
+   FeatureStdExtZicboz,
+   FeatureStdExtV,
+   FeatureStdExtZvl128b],
+   [TuneNoDefaultUnroll,

dtcxzyw wrote:

See https://github.com/llvm/llvm-project/pull/89359#discussion_r1574366104

https://github.com/llvm/llvm-project/pull/90392


[clang] [llvm] [InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (PR #88776)

2024-04-24 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/88776


[clang] [llvm] [RISCV] Add processor definition for XiangShan-KunMingHu (PR #89359)

2024-04-22 Thread Yingwei Zheng via cfe-commits


@@ -378,3 +378,30 @@ def XIANGSHAN_NANHU : RISCVProcessorModel<"xiangshan-nanhu",
 TuneZExtHFusion,
 TuneZExtWFusion,
 TuneShiftedZExtWFusion]>;
+def XIANGSHAN_KUNMINGHU : RISCVProcessorModel<"xiangshan-kunminghu",
+  NoSchedModel,
+  [Feature64Bit,
+   FeatureStdExtI,
+   FeatureStdExtZicsr,
+   FeatureStdExtZifencei,
+   FeatureStdExtM,
+   FeatureStdExtA,
+   FeatureStdExtF,
+   FeatureStdExtD,
+   FeatureStdExtC,
+   FeatureStdExtZba,
+   FeatureStdExtZbb,
+   FeatureStdExtZbc,
+   FeatureStdExtZbs,
+   FeatureStdExtZkn,
+   FeatureStdExtZksed,
+   FeatureStdExtZksh,
+   FeatureStdExtSvinval,
+   FeatureStdExtZicbom,
+   FeatureStdExtZicboz,
+   FeatureStdExtV,
+   FeatureStdExtZvl128b],
+   [TuneNoDefaultUnroll,

dtcxzyw wrote:

Can you provide some performance data about these options?
IIRC KunMingHu core supports the `lui + addi` fusion.


https://github.com/llvm/llvm-project/pull/89359


[clang] [llvm] [mlir] Fix warning about mismatches between function parameter and call-site args names (PR #89294)

2024-04-19 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Isn't the warning about a mismatch between declaration and definition, not
> call args? The InstCombine change does make the definition and declaration
> match.
>
> On Fri, Apr 19, 2024, at 17:07, Mehdi Amini wrote:
> > In llvm/lib/Transforms/InstCombine/InstCombineInternal.h
> > (https://github.com/llvm/llvm-project/pull/89294#discussion_r1571996570):
> >
> > @@ -433,7 +433,7 @@ class LLVM_LIBRARY_VISIBILITY InstCombinerImpl final
> >    Value *foldAndOrOfICmpsOfAndWithPow2(ICmpInst *LHS, ICmpInst *RHS,
> >                                         Instruction *CxtI, bool IsAnd,
> >                                         bool IsLogical = false);
> > -  Value *matchSelectFromAndOr(Value *A, Value *B, Value *C, Value *D,
> >
> > You could fix the warning by using vastly different names for the
> > function parameters?

Sorry, I misread it. It would be better to add a header comment to the
declaration of `matchSelectFromAndOr`.


https://github.com/llvm/llvm-project/pull/89294


[clang] [llvm] [mlir] Fix Definition Mismatches (PR #89294)

2024-04-19 Thread Yingwei Zheng via cfe-commits


@@ -433,7 +433,7 @@ class LLVM_LIBRARY_VISIBILITY InstCombinerImpl final
   Value *foldAndOrOfICmpsOfAndWithPow2(ICmpInst *LHS, ICmpInst *RHS,
Instruction *CxtI, bool IsAnd,
bool IsLogical = false);
-  Value *matchSelectFromAndOr(Value *A, Value *B, Value *C, Value *D,

dtcxzyw wrote:

This is odd. I don't think this change can fix the warning.

See the users of matchSelectFromAndOr:
```
  if (Value *V = matchSelectFromAndOr(A, C, B, D))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(A, C, D, B))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(C, A, B, D))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(C, A, D, B))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(B, D, A, C))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(B, D, C, A))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(D, B, A, C))
    return replaceInstUsesWith(I, V);
  if (Value *V = matchSelectFromAndOr(D, B, C, A))
    return replaceInstUsesWith(I, V);
```


https://github.com/llvm/llvm-project/pull/89294


[clang] [llvm] [mlir] Fix Definition Mismatches (PR #89294)

2024-04-19 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/89294


[clang] [llvm] [mlir] Fix Definition Mismatches (PR #89294)

2024-04-19 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw edited 
https://github.com/llvm/llvm-project/pull/89294


[clang] [llvm] [InstCombine] Add canonicalization of `sitofp` -> `uitofp nneg` (PR #88299)

2024-04-17 Thread Yingwei Zheng via cfe-commits


@@ -1964,11 +1964,25 @@ Instruction *InstCombinerImpl::visitFPToSI(FPToSIInst 
) {
 }
 
 Instruction *InstCombinerImpl::visitUIToFP(CastInst &CI) {
-  return commonCastTransforms(CI);
+  if (Instruction *R = commonCastTransforms(CI))
+    return R;
+  if (!CI.hasNonNeg() && isKnownNonNegative(CI.getOperand(0), SQ)) {
+    CI.setNonNeg();
+    return &CI;
+  }
+  return nullptr;
 }
 
 Instruction *InstCombinerImpl::visitSIToFP(CastInst &CI) {
-  return commonCastTransforms(CI);
+  if (Instruction *R = commonCastTransforms(CI))
+    return R;
+  if (isKnownNonNegative(CI.getOperand(0), SQ)) {
+    auto UI =

dtcxzyw wrote:

@goldsteinn We always use `auto *` for pointers.

https://llvm.org/docs/CodingStandards.html#beware-unnecessary-copies-with-auto
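
As a small illustration of that rule, here is a standalone sketch. It is not code from this PR, and `Node`, `sumValues` and `incrementAll` are made-up names. The idea is to spell out `auto *` when the deduced type is a pointer, and to use `auto &` when iterating over values that should not be copied.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type used only for this illustration.
struct Node {
  int Value;
};

// Pointer elements: write `auto *` so it is obvious that N is a pointer.
static int sumValues(const std::vector<Node *> &Nodes) {
  int Sum = 0;
  for (const auto *N : Nodes)
    Sum += N->Value;
  return Sum;
}

// Non-pointer elements: `auto &` avoids copying each element and lets the
// loop modify the elements in place.
static void incrementAll(std::vector<Node> &Nodes) {
  for (auto &N : Nodes)
    ++N.Value;
}
```

With plain `auto N` in the first loop the code would still compile, but the reader could no longer tell from the declaration that `N` is a pointer.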

https://github.com/llvm/llvm-project/pull/88299


[clang] [llvm] [ValueTracking] Convert `isKnownNonZero` to use SimplifyQuery (PR #85863)

2024-04-12 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/85863


[clang] [llvm] [ValueTracking] Convert `isKnownNonZero` to use SimplifyQuery (PR #85863)

2024-04-12 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Looks like the clang build is failing again?

Done.

https://github.com/llvm/llvm-project/pull/85863


[clang] [llvm] [ValueTracking] Convert `isKnownNonZero` to use SimplifyQuery (PR #85863)

2024-04-12 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/85863

>From 9b725ffdb93b3029263129063d021063783f9cd9 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 21 Mar 2024 21:10:46 +0800
Subject: [PATCH 1/4] [ValueTracking] Add pre-commit tests. NFC.

---
 llvm/test/Transforms/InstCombine/icmp-dom.ll | 139 +++
 1 file changed, 139 insertions(+)

diff --git a/llvm/test/Transforms/InstCombine/icmp-dom.ll 
b/llvm/test/Transforms/InstCombine/icmp-dom.ll
index f4b9022d14349b2..138254d912b259b 100644
--- a/llvm/test/Transforms/InstCombine/icmp-dom.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-dom.ll
@@ -403,3 +403,142 @@ truelabel:
 falselabel:
   ret i8 0
 }
+
+define i1 @and_mask1_eq(i32 %conv) {
+; CHECK-LABEL: @and_mask1_eq(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 1
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 1
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+define i1 @and_mask1_ne(i32 %conv) {
+; CHECK-LABEL: @and_mask1_ne(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 1
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:[[CMP1:%.*]] = icmp ne i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 1
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp ne i32 %and1, 0
+  ret i1 %cmp1
+}
+
+define i1 @and_mask2(i32 %conv) {
+; CHECK-LABEL: @and_mask2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 4
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 4
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+; TODO: %cmp1 can be folded into false.
+
+define i1 @and_mask3(i32 %conv) {
+; CHECK-LABEL: @and_mask3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 3
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 7
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 3
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 7
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+; TODO: %cmp1 can be folded into false.
+
+define i1 @and_mask4(i32 %conv) {
+; CHECK-LABEL: @and_mask4(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 4
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 7
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 4
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 7
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}

>From b80b7de04dcad4be5eff5545ee0a67984c55f17e Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 21 Mar 2024 21:21:13 +0800
Subject: [PATCH 2/4] [ValueTracking] Convert `isKnownNonZero` to use
 SimplifyQuery

---
 llvm/include/llvm/Analysis/ValueTracking.h|  6 +---
 llvm/lib/Analysis/BasicAliasAnalysis.cpp  |  3 +-
 llvm/lib/Analysis/InstructionSimplify.cpp | 29 ---
 llvm/lib/Analysis/LazyValueInfo.cpp   |  5 ++--
 llvm/lib/Analysis/Loads.cpp   |  6 ++--
 llvm/lib/Analysis/ScalarEvolution.cpp |  2 +-
 llvm/lib/Analysis/ValueTracking.cpp   | 17 +++
 

[clang] [llvm] [InstCombine] Infer nsw/nuw for trunc (PR #87910)

2024-04-11 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/87910


[clang] [llvm] [InstCombine] Add canonicalization of `sitofp` -> `uitofp nneg` (PR #88299)

2024-04-11 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/88299


[clang] [llvm] [InstCombine] Add canonicalization of `sitofp` -> `uitofp nneg` (PR #88299)

2024-04-11 Thread Yingwei Zheng via cfe-commits


@@ -1964,11 +1964,25 @@ Instruction *InstCombinerImpl::visitFPToSI(FPToSIInst 
) {
 }
 
 Instruction *InstCombinerImpl::visitUIToFP(CastInst &CI) {
-  return commonCastTransforms(CI);
+  if (Instruction *R = commonCastTransforms(CI))
+    return R;
+  if (!CI.hasNonNeg() && isKnownNonNegative(CI.getOperand(0), SQ)) {
+    CI.setNonNeg();
+    return &CI;
+  }
+  return nullptr;
 }
 
 Instruction *InstCombinerImpl::visitSIToFP(CastInst &CI) {
-  return commonCastTransforms(CI);
+  if (Instruction *R = commonCastTransforms(CI))
+    return R;
+  if (isKnownNonNegative(CI.getOperand(0), SQ)) {
+    auto UI =

dtcxzyw wrote:

```suggestion
auto *UI =
```

https://github.com/llvm/llvm-project/pull/88299


[clang] [libclc] [libcxx] [lld] [llvm] [openmp] llvm encode decode (PR #87187)

2024-04-11 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Please send this PR to your downstream fork 
https://github.com/x-codingman/llvm-project.



https://github.com/llvm/llvm-project/pull/87187


[clang] [libclc] [libcxx] [lld] [llvm] [openmp] llvm encode decode (PR #87187)

2024-04-11 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/87187


[clang] [llvm] [InstCombine] Infer nsw/nuw for trunc (PR #87910)

2024-04-10 Thread Yingwei Zheng via cfe-commits


@@ -897,7 +897,20 @@ Instruction *InstCombinerImpl::visitTrunc(TruncInst 
) {
 }
   }
 
-  return nullptr;
+  bool Changed = false;
+  if (!Trunc.hasNoSignedWrap() &&
+      ComputeMaxSignificantBits(Src, /*Depth=*/0, &Trunc) <= DestWidth) {
+    Trunc.setHasNoSignedWrap(true);
+    Changed = true;
+  }
+  if (!Trunc.hasNoUnsignedWrap() &&
+      MaskedValueIsZero(Src, APInt::getBitsSetFrom(SrcWidth, DestWidth),
+                        /*Depth=*/0, &Trunc)) {
+    Trunc.setHasNoUnsignedWrap(true);
+    Changed = true;

dtcxzyw wrote:

Ping @nikic


https://github.com/llvm/llvm-project/pull/87910


[clang] [llvm] [InstCombine] Infer nsw/nuw for trunc (PR #87910)

2024-04-09 Thread Yingwei Zheng via cfe-commits


@@ -897,7 +897,20 @@ Instruction *InstCombinerImpl::visitTrunc(TruncInst 
) {
 }
   }
 
-  return nullptr;
+  bool Changed = false;
+  if (!Trunc.hasNoSignedWrap() &&
+      ComputeMaxSignificantBits(Src, /*Depth=*/0, &Trunc) <= DestWidth) {
+    Trunc.setHasNoSignedWrap(true);
+    Changed = true;
+  }
+  if (!Trunc.hasNoUnsignedWrap() &&
+      MaskedValueIsZero(Src, APInt::getBitsSetFrom(SrcWidth, DestWidth),
+                        /*Depth=*/0, &Trunc)) {
+    Trunc.setHasNoUnsignedWrap(true);
+    Changed = true;

dtcxzyw wrote:

> We can't infer nsw, but we can infer nuw.

I prefer to infer both flags here, so that we can reuse KnownBits in later 
patches.

> Do you see any reason why doing this in SimplifyDemanded would be problematic?

`SimplifyDemanded` is context-sensitive. IMO it is not the right place to infer 
flags.


https://github.com/llvm/llvm-project/pull/87910


[clang] [llvm] [InstCombine] Infer nsw/nuw for trunc (PR #87910)

2024-04-09 Thread Yingwei Zheng via cfe-commits


@@ -897,7 +897,20 @@ Instruction *InstCombinerImpl::visitTrunc(TruncInst 
) {
 }
   }
 
-  return nullptr;
+  bool Changed = false;
+  if (!Trunc.hasNoSignedWrap() &&
+      ComputeMaxSignificantBits(Src, /*Depth=*/0, &Trunc) <= DestWidth) {
+    Trunc.setHasNoSignedWrap(true);
+    Changed = true;
+  }
+  if (!Trunc.hasNoUnsignedWrap() &&
+      MaskedValueIsZero(Src, APInt::getBitsSetFrom(SrcWidth, DestWidth),
+                        /*Depth=*/0, &Trunc)) {
+    Trunc.setHasNoUnsignedWrap(true);
+    Changed = true;

dtcxzyw wrote:

We cannot infer nsw flags from KnownBits (e.g., `trunc (ashr i64 X, 32) to 
i32`). BTW we never set poison-generating flags in `SimplifyDemanded`.
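
To make the `trunc (ashr i64 X, 32) to i32` example concrete, here is a minimal standalone sketch in plain C++. It does not use LLVM's APIs, and `ashr32`, `numSignBits`, `fitsSigned32`, `Known` and `commonKnownBits` are hypothetical names made up for the illustration. It models why a sign-bit count can prove the truncation is `nsw` while per-bit knowledge cannot: the shifted value always has at least 33 sign bits, yet its upper 32 bits are all zeros for some `X` and all ones for others.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Portable arithmetic shift right by 32 (models `ashr i64 X, 32`).
static int64_t ashr32(int64_t X) {
  return X < 0 ? ~(~X >> 32) : (X >> 32);
}

// Number of leading bits equal to the sign bit, the sign bit included.
static unsigned numSignBits(int64_t V) {
  uint64_t U = static_cast<uint64_t>(V);
  uint64_t Sign = U >> 63;
  unsigned N = 1;
  for (int Bit = 62; Bit >= 0 && ((U >> Bit) & 1) == Sign; --Bit)
    ++N;
  return N;
}

// Truncating V to 32 bits keeps the signed value (the `nsw` condition)
// exactly when V is representable as a signed 32-bit integer.
static bool fitsSigned32(int64_t V) {
  return V >= INT32_MIN && V <= INT32_MAX;
}

// Bits that have the same value in every sample. This is the most that a
// per-bit (KnownBits-style) summary could say about these samples.
struct Known {
  uint64_t Zero; // bits that are 0 in every sample
  uint64_t One;  // bits that are 1 in every sample
};

static Known commonKnownBits(std::initializer_list<int64_t> Values) {
  Known K{~0ULL, ~0ULL};
  for (int64_t V : Values) {
    K.Zero &= ~static_cast<uint64_t>(V);
    K.One &= static_cast<uint64_t>(V);
  }
  return K;
}
```

For instance, shifting `5 << 32` and `-(7 << 32)` gives 5 and -7. Both fit in a signed 32-bit integer and both have at least 33 sign bits, but `commonKnownBits` over the two results reports no known bit in the upper half, so a check that only looks at known-zero or known-one bits has nothing to work with.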


https://github.com/llvm/llvm-project/pull/87910


[clang] [llvm] [ValueTracking] Convert `isKnownNonZero` to use SimplifyQuery (PR #85863)

2024-03-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Can you please fix the clang build?

Done.

https://github.com/llvm/llvm-project/pull/85863


[clang] [llvm] [ValueTracking] Convert `isKnownNonZero` to use SimplifyQuery (PR #85863)

2024-03-21 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/85863

>From bacdc24af088560a986028824a0ac43e929c2f1b Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 21 Mar 2024 21:10:46 +0800
Subject: [PATCH 1/2] [ValueTracking] Add pre-commit tests. NFC.

---
 llvm/test/Transforms/InstCombine/icmp-dom.ll | 139 +++
 1 file changed, 139 insertions(+)

diff --git a/llvm/test/Transforms/InstCombine/icmp-dom.ll 
b/llvm/test/Transforms/InstCombine/icmp-dom.ll
index f4b9022d14349b..138254d912b259 100644
--- a/llvm/test/Transforms/InstCombine/icmp-dom.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-dom.ll
@@ -403,3 +403,142 @@ truelabel:
 falselabel:
   ret i8 0
 }
+
+define i1 @and_mask1_eq(i32 %conv) {
+; CHECK-LABEL: @and_mask1_eq(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 1
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 1
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+define i1 @and_mask1_ne(i32 %conv) {
+; CHECK-LABEL: @and_mask1_ne(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 1
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:[[CMP1:%.*]] = icmp ne i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 1
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp ne i32 %and1, 0
+  ret i1 %cmp1
+}
+
+define i1 @and_mask2(i32 %conv) {
+; CHECK-LABEL: @and_mask2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 4
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 4
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+; TODO: %cmp1 can be folded into false.
+
+define i1 @and_mask3(i32 %conv) {
+; CHECK-LABEL: @and_mask3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 3
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 7
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 3
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 7
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+; TODO: %cmp1 can be folded into false.
+
+define i1 @and_mask4(i32 %conv) {
+; CHECK-LABEL: @and_mask4(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[AND:%.*]] = and i32 [[CONV:%.*]], 4
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:ret i1 false
+; CHECK:   else:
+; CHECK-NEXT:[[AND1:%.*]] = and i32 [[CONV]], 7
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 4
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  %and1 = and i32 %conv, 7
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}

>From 746f3cc306d2cddb222904e73157daf29114a3f3 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 21 Mar 2024 21:21:13 +0800
Subject: [PATCH 2/2] [ValueTracking] Convert `isKnownNonZero` to use
 SimplifyQuery

---
 clang/lib/CodeGen/CGCall.cpp  |  4 +--
 llvm/include/llvm/Analysis/ValueTracking.h|  6 +---
 llvm/lib/Analysis/BasicAliasAnalysis.cpp  |  3 +-
 llvm/lib/Analysis/InstructionSimplify.cpp | 29 ---
 llvm/lib/Analysis/LazyValueInfo.cpp   |  5 ++--
 llvm/lib/Analysis/Loads.cpp   |  6 ++--
 llvm/lib/Analysis/ScalarEvolution.cpp |  2 +-
 

[clang] [llvm] [InstCombine] Canonicalize `(sitofp x)` -> `(uitofp x)` if `x >= 0` (PR #82404)

2024-03-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Apart from the correctness issues, we've seen some regressions on various 
> benchmarks from LLVM Test Suite after this patch. Specifically, around 3-5% 
> regression on x86-64 in various metrics of the 
> [Interpolation](https://github.com/llvm/llvm-test-suite/tree/main/MicroBenchmarks/ImageProcessing/Interpolation)
>  benchmarks, and up to 30% regression on a number of floating point-centric 
> benchmarks from 
> https://github.com/llvm/llvm-test-suite/tree/main/SingleSource/Benchmarks/Misc
>  (flops-4.c, flops-5.c, flops-6.c, flops-8.c, fp-convert.c). The numbers vary 
> depending on the microarchitecture, with Skylake being less affected (on the 
> order of ~10%) and AMD Rome showing larger regressions (up to 30%).

FYI, this patch saves ~3% of instructions on some benchmarks from the LLVM 
test suite on RISC-V.
https://github.com/dtcxzyw/llvm-ci/issues/1115
https://github.com/dtcxzyw/llvm-ci/issues/1114

https://github.com/llvm/llvm-project/pull/82404


[clang] [lld] [llvm] [IR] Change representation of getelementptr inrange (PR #84341)

2024-03-20 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> This is a very niche feature, and I don't think trying to upgrade it is 
> worthwhile.

It exists in many real-world applications. If you are not willing to implement 
the upgrader, I will do this for the original IR files in my benchmark :)


https://github.com/llvm/llvm-project/pull/84341


[clang] [lld] [llvm] [IR] Change representation of getelementptr inrange (PR #84341)

2024-03-20 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> bin/opt: ../../llvm-opt-benchmark/bench/icu/original/servlkf.ll:776:98: 
> error: expected ')' in constantexpr
>   store ptr getelementptr inbounds ({ [11 x ptr] }, ptr 
> @_ZTVN6icu_7516LocaleKeyFactoryE, i32 0, inrange i32 0, i32 2), ptr %this1, 
> align 8

@nikic Do we need an auto-upgrader?

https://github.com/llvm/llvm-project/pull/84341


[clang] [llvm] [RISCV] Add back SiFive's cdiscard.d.l1, cflush.d.l1, and cease instructions. (PR #83896)

2024-03-13 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.


https://github.com/llvm/llvm-project/pull/83896


[clang] [NFC] Eliminate trailing white space causing CI build failure (PR #84632)

2024-03-09 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/84632


[clang] [llvm] [RISCV] Add back SiFive's cdiscard.d.l1 and cflush.d.l1 instructions. (PR #83896)

2024-03-04 Thread Yingwei Zheng via cfe-commits


@@ -60,6 +60,8 @@
 // CHECK-NOT: __riscv_xsfvfwmaccqqq {{.*$}}
 // CHECK-NOT: __riscv_xsfqmaccdod {{.*$}}
 // CHECK-NOT: __riscv_xsfvqmaccqoq {{.*$}}
+// CHECK-NOT: __riscv_sifivecdiscarddlone {{.*$}}
+// CHECK-NOT: __riscv_sifivecflushdlone {{.*$}}

dtcxzyw wrote:

```suggestion
// CHECK-NOT: __riscv_xsifivecdiscarddlone {{.*$}}
// CHECK-NOT: __riscv_xsifivecflushdlone {{.*$}}
```

https://github.com/llvm/llvm-project/pull/83896


[clang] [llvm] [RISCV] Add support of Sscofpmf (PR #83831)

2024-03-04 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.

LGTM. Nice catch!
Related patch: https://github.com/llvm/llvm-project/pull/79399

https://github.com/llvm/llvm-project/pull/83831


[clang] [llvm] [RISCV] Support RISC-V Profiles in -march option (PR #76357)

2024-03-01 Thread Yingwei Zheng via cfe-commits


@@ -839,6 +860,33 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool 
EnableExperimentalExtension,
  "string must be lowercase");
   }
 
+  bool IsProfile = Arch.starts_with("rvi") || Arch.starts_with("rva") ||
+   Arch.starts_with("rvb") || Arch.starts_with("rvm");

dtcxzyw wrote:

> profile-name ::= 
> "RV"``
> profile-family-name ::= "I" | "M" | "A"

Missing tests for `rvm`.
Do you know what "rvb" stands for?
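
The kind of coverage being asked for could look roughly like the sketch below. It is standalone C++ rather than the PR's code: `hasProfilePrefix` is a hypothetical helper that only mirrors the prefix condition in the hunk above, and the input strings are illustrative, not taken from the PR's test files.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the prefix test from the hunk: an arch
// string is treated as a profile if it starts with one of these prefixes.
static bool hasProfilePrefix(const std::string &Arch) {
  static const char *const Prefixes[] = {"rvi", "rva", "rvb", "rvm"};
  for (const char *Prefix : Prefixes)
    if (Arch.compare(0, 3, Prefix) == 0)
      return true;
  return false;
}
```

A test for this check would exercise every accepted prefix, including `rvm`, and at least one ordinary `-march` string such as `rv64gc` that must not be classified as a profile.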


https://github.com/llvm/llvm-project/pull/76357


[clang] [llvm] [RISCV] Remove experimental from Zacas. (PR #83195)

2024-02-28 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/83195


[clang] [RISCV] Disable generation of asynchronous unwind tables for RISCV baremetal (PR #81727)

2024-02-14 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw edited 
https://github.com/llvm/llvm-project/pull/81727


[clang] [Clang][RISCV] Add assumptions to vsetvli/vsetvlimax (PR #79975)

2024-02-12 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Can we implement this in `computeKnownBitsFromOperator/getRangeForIntrinsic`?
https://github.com/llvm/llvm-project/blob/b21e3282864c9f7ad656c64bc375f5869ef76d19/llvm/lib/Analysis/ValueTracking.cpp#L1578-L1584

https://github.com/llvm/llvm-project/pull/79975


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.

LGTM. Thanks!

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value 
*InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return 
that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass ,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;
+  }
+
+  if (!I->hasOneUse())
+return nullptr;
+
+  // TODO: Should account for nofpclass/FastMathFlags on current instruction
+  switch (I->getOpcode()) {
+  case Instruction::FNeg: {
+if (SimplifyDemandedFPClass(I, 0, llvm::fneg(DemandedMask), Known,
+Depth + 1))
+  return I;
+Known.fneg();
+break;
+  }
+  case Instruction::Call: {
+CallInst *CI = cast(I);
+switch (CI->getIntrinsicID()) {
+case Intrinsic::fabs:
+  if (SimplifyDemandedFPClass(I, 0, llvm::inverse_fabs(DemandedMask), 
Known,
+  Depth + 1))
+return I;
+  Known.fabs();
+  break;
+case Intrinsic::arithmetic_fence:
+  if (SimplifyDemandedFPClass(I, 0, DemandedMask, Known, Depth + 1))
+return I;
+  break;
+case Intrinsic::copysign: {
+  // Flip on more potentially demanded classes
+  const FPClassTest DemandedMaskAnySign = llvm::unknown_sign(DemandedMask);
+  if (SimplifyDemandedFPClass(I, 0, DemandedMaskAnySign, Known, Depth + 1))
+return I;
+
+  if ((DemandedMask & fcPositive) == fcNone) {
+// Roundabout way of replacing with fneg(fabs)
+I->setOperand(1, ConstantFP::get(VTy, -1.0));
+return I;
+  }
+
+  if ((DemandedMask & fcNegative) == fcNone) {
+// Roundabout way of replacing with fabs
+I->setOperand(1, ConstantFP::getZero(VTy));
+return I;
+  }
+
+  KnownFPClass KnownSign =
+  computeKnownFPClass(I->getOperand(1), fcAllFlags, CxtI, Depth + 1);
+  Known.copysign(KnownSign);
+  break;
+}
+default:
+  Known = computeKnownFPClass(I, ~DemandedMask, CxtI, Depth + 1);

dtcxzyw wrote:

Oh, I had mistakenly thought that `computeKnownFPClass` returns a subset of 
`~DemandedMask` :(


https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value 
*InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return 
that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass ,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;

dtcxzyw wrote:

Ah, sorry for my misreading.

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> > I don't know why it fails:
> > ```
> > error: patch failed: 
> > llvm/lib/Transforms/InstCombine/InstCombineInternal.h:551
> > error: llvm/lib/Transforms/InstCombine/InstCombineInternal.h: patch does 
> > not apply
> > error: patch failed: 
> > llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp:466
> > error: llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp: 
> > patch does not apply
> > ```
> 
> How are you trying to apply this?

```
cd llvm-project
git checkout main
wget https://github.com/llvm/llvm-project/pull/74056.patch
git apply 74056.patch
```


https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value 
*InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return 
that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast<Instruction>(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;
+  }
+
+  if (!I->hasOneUse())
+return nullptr;
+
+  // TODO: Should account for nofpclass/FastMathFlags on current instruction
+  switch (I->getOpcode()) {
+  case Instruction::FNeg: {
+if (SimplifyDemandedFPClass(I, 0, llvm::fneg(DemandedMask), Known,
+Depth + 1))
+  return I;
+Known.fneg();
+break;
+  }
+  case Instruction::Call: {
+CallInst *CI = cast<CallInst>(I);
+switch (CI->getIntrinsicID()) {
+case Intrinsic::fabs:
+  if (SimplifyDemandedFPClass(I, 0, llvm::inverse_fabs(DemandedMask), Known,
+  Depth + 1))
+return I;
+  Known.fabs();
+  break;
+case Intrinsic::arithmetic_fence:
+  if (SimplifyDemandedFPClass(I, 0, DemandedMask, Known, Depth + 1))
+return I;
+  break;
+case Intrinsic::copysign: {
+  // Flip on more potentially demanded classes
+  const FPClassTest DemandedMaskAnySign = llvm::unknown_sign(DemandedMask);
+  if (SimplifyDemandedFPClass(I, 0, DemandedMaskAnySign, Known, Depth + 1))
+return I;
+
+  if ((DemandedMask & fcPositive) == fcNone) {
+// Roundabout way of replacing with fneg(fabs)
+I->setOperand(1, ConstantFP::get(VTy, -1.0));
+return I;
+  }
+
+  if ((DemandedMask & fcNegative) == fcNone) {
+// Roundabout way of replacing with fabs
+I->setOperand(1, ConstantFP::getZero(VTy));
+return I;
+  }
+
+  KnownFPClass KnownSign =
+  computeKnownFPClass(I->getOperand(1), fcAllFlags, CxtI, Depth + 1);
+  Known.copysign(KnownSign);
+  break;
+}
+default:
+  Known = computeKnownFPClass(I, ~DemandedMask, CxtI, Depth + 1);
+  break;
+}
+
+break;
+  }
+  case Instruction::Select: {
+KnownFPClass KnownLHS, KnownRHS;
+if (SimplifyDemandedFPClass(I, 2, DemandedMask, KnownRHS, Depth + 1) ||
+SimplifyDemandedFPClass(I, 1, DemandedMask, KnownLHS, Depth + 1))
+  return I;
+
+if (KnownLHS.isKnownNever(DemandedMask))
+  return I->getOperand(2);
+if (KnownRHS.isKnownNever(DemandedMask))
+  return I->getOperand(1);
+
+// TODO: Recognize clamping patterns
+Known = KnownLHS | KnownRHS;
+break;
+  }
+  default:
+Known = computeKnownFPClass(I, ~DemandedMask, CxtI, Depth + 1);

dtcxzyw wrote:

Should this be `DemandedMask` or `fcAllFlags`? Otherwise, `DemandedMask & 
Known.KnownFPClasses` will evaluate to `fcNone`.
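
To make the concern concrete, here is a minimal sketch of the bit arithmetic only. It assumes a toy model in which an analysis reports nothing outside the classes it was asked about; whether `computeKnownFPClass` really behaves that way is the question being raised, not something this sketch settles. `ClassMask`, the `k*` constants, `knownClasses` and `demandedAndKnown` are hypothetical names, not LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for llvm::FPClassTest. The names and bit values here are
// assumptions for illustration and are not taken from the patch.
using ClassMask = std::uint32_t;
constexpr ClassMask kNan = 0x003;
constexpr ClassMask kNegative = 0x03C;
constexpr ClassMask kPositive = 0x3C0;
constexpr ClassMask kAll = kNan | kNegative | kPositive;

// Hypothetical analysis result: only classes inside `interested` are reported.
constexpr ClassMask knownClasses(ClassMask possible, ClassMask interested) {
  return possible & interested;
}

// The expression from the comment: DemandedMask & Known.KnownFPClasses.
constexpr ClassMask demandedAndKnown(ClassMask demanded, ClassMask possible,
                                     ClassMask interested) {
  return demanded & knownClasses(possible, interested);
}
```

Under this model, querying with the complement of the demanded mask always leaves an empty intersection, while querying with the demanded mask or with all classes keeps the demanded bits.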


https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast<Instruction>(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;
+  }
+
+  if (!I->hasOneUse())
+return nullptr;
+
+  // TODO: Should account for nofpclass/FastMathFlags on current instruction
+  switch (I->getOpcode()) {
+  case Instruction::FNeg: {
+if (SimplifyDemandedFPClass(I, 0, llvm::fneg(DemandedMask), Known,
+Depth + 1))
+  return I;
+Known.fneg();
+break;
+  }
+  case Instruction::Call: {
+CallInst *CI = cast<CallInst>(I);
+switch (CI->getIntrinsicID()) {
+case Intrinsic::fabs:
+  if (SimplifyDemandedFPClass(I, 0, llvm::inverse_fabs(DemandedMask), Known,
+  Depth + 1))
+return I;
+  Known.fabs();
+  break;
+case Intrinsic::arithmetic_fence:
+  if (SimplifyDemandedFPClass(I, 0, DemandedMask, Known, Depth + 1))
+return I;
+  break;
+case Intrinsic::copysign: {
+  // Flip on more potentially demanded classes
+  const FPClassTest DemandedMaskAnySign = llvm::unknown_sign(DemandedMask);
+  if (SimplifyDemandedFPClass(I, 0, DemandedMaskAnySign, Known, Depth + 1))
+return I;
+
+  if ((DemandedMask & fcPositive) == fcNone) {
+// Roundabout way of replacing with fneg(fabs)
+I->setOperand(1, ConstantFP::get(VTy, -1.0));
+return I;
+  }
+
+  if ((DemandedMask & fcNegative) == fcNone) {

dtcxzyw wrote:

```suggestion
  if ((DemandedMask & (fcNegative | fcNan)) == fcNone) {
```
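
A possible rationale for including `fcNan`, sketched in standard C++ rather than with LLVM API: `copysign` transfers the sign bit even when the magnitude is a NaN, so rewriting the sign operand to `+0.0` (the `fabs` form) changes the sign bit of a NaN result. That rewrite is only unobservable when NaN results are not demanded. The helper name `signAfterCopysign` is made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Returns the sign bit of copysign(magnitude, sign). This is well defined
// even when `magnitude` is a NaN, because copysign only copies the sign bit.
bool signAfterCopysign(double magnitude, double sign) {
  return std::signbit(std::copysign(magnitude, sign));
}
```

For a NaN magnitude, a negative sign operand yields a NaN with the sign bit set, while a `+0.0` sign operand yields a NaN with the sign bit clear, so the two forms are distinguishable whenever a NaN can reach the user.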

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast<Instruction>(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;
+  }
+
+  if (!I->hasOneUse())
+return nullptr;
+
+  // TODO: Should account for nofpclass/FastMathFlags on current instruction
+  switch (I->getOpcode()) {
+  case Instruction::FNeg: {
+if (SimplifyDemandedFPClass(I, 0, llvm::fneg(DemandedMask), Known,
+Depth + 1))
+  return I;
+Known.fneg();
+break;
+  }
+  case Instruction::Call: {
+CallInst *CI = cast<CallInst>(I);
+switch (CI->getIntrinsicID()) {
+case Intrinsic::fabs:
+  if (SimplifyDemandedFPClass(I, 0, llvm::inverse_fabs(DemandedMask), Known,
+  Depth + 1))
+return I;
+  Known.fabs();
+  break;
+case Intrinsic::arithmetic_fence:
+  if (SimplifyDemandedFPClass(I, 0, DemandedMask, Known, Depth + 1))
+return I;
+  break;
+case Intrinsic::copysign: {
+  // Flip on more potentially demanded classes
+  const FPClassTest DemandedMaskAnySign = llvm::unknown_sign(DemandedMask);
+  if (SimplifyDemandedFPClass(I, 0, DemandedMaskAnySign, Known, Depth + 1))
+return I;
+
+  if ((DemandedMask & fcPositive) == fcNone) {
+// Roundabout way of replacing with fneg(fabs)
+I->setOperand(1, ConstantFP::get(VTy, -1.0));
+return I;
+  }
+
+  if ((DemandedMask & fcNegative) == fcNone) {
+// Roundabout way of replacing with fabs
+I->setOperand(1, ConstantFP::getZero(VTy));
+return I;
+  }
+
+  KnownFPClass KnownSign =
+  computeKnownFPClass(I->getOperand(1), fcAllFlags, CxtI, Depth + 1);
+  Known.copysign(KnownSign);
+  break;
+}
+default:
+  Known = computeKnownFPClass(I, ~DemandedMask, CxtI, Depth + 1);

dtcxzyw wrote:

Should this be `DemandedMask` or `fcAllFlags`?

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast<Instruction>(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;
+  }
+
+  if (!I->hasOneUse())
+return nullptr;
+
+  // TODO: Should account for nofpclass/FastMathFlags on current instruction
+  switch (I->getOpcode()) {
+  case Instruction::FNeg: {
+if (SimplifyDemandedFPClass(I, 0, llvm::fneg(DemandedMask), Known,
+Depth + 1))
+  return I;
+Known.fneg();
+break;
+  }
+  case Instruction::Call: {
+CallInst *CI = cast<CallInst>(I);
+switch (CI->getIntrinsicID()) {
+case Intrinsic::fabs:
+  if (SimplifyDemandedFPClass(I, 0, llvm::inverse_fabs(DemandedMask), Known,
+  Depth + 1))
+return I;
+  Known.fabs();
+  break;
+case Intrinsic::arithmetic_fence:
+  if (SimplifyDemandedFPClass(I, 0, DemandedMask, Known, Depth + 1))
+return I;
+  break;
+case Intrinsic::copysign: {
+  // Flip on more potentially demanded classes
+  const FPClassTest DemandedMaskAnySign = llvm::unknown_sign(DemandedMask);
+  if (SimplifyDemandedFPClass(I, 0, DemandedMaskAnySign, Known, Depth + 1))
+return I;
+
+  if ((DemandedMask & fcPositive) == fcNone) {

dtcxzyw wrote:

```suggestion
  if ((DemandedMask & (fcPositive | fcNan)) == fcNone) {
```

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits


@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+return ConstantFP::getZero(Ty);
+  case fcNegZero:
+return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+return PoisonValue::get(Ty);
+  default:
+return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
+unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+return nullptr;
+
+  Instruction *I = dyn_cast<Instruction>(V);
+  if (!I) {
+// Handle constants and arguments
+Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+Value *FoldedToConst =
+getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+return FoldedToConst == V ? nullptr : FoldedToConst;

dtcxzyw wrote:

`FoldedToConst == V` always evaluates to false.

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw edited 
https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> > @arsenm Can you rebase this patch first?
> 
> It was already fresh, I just re-merged again with no conflicts

I don't know why it fails:
```
error: patch failed: llvm/lib/Transforms/InstCombine/InstCombineInternal.h:551
error: llvm/lib/Transforms/InstCombine/InstCombineInternal.h: patch does not 
apply
error: patch failed: 
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp:466
error: llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp: patch 
does not apply
```
baseline: 7e4ac8541dcc389ca8f0d11614e19ea7bae07af7

https://github.com/llvm/llvm-project/pull/74056


[clang] [llvm] [clang-tools-extra] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-06 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

@arsenm Can you rebase this patch first?


https://github.com/llvm/llvm-project/pull/74056


[llvm] [clang] Revert "InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp" (PR #76338)

2024-02-06 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> @dtcxzyw are you planning on a codegen patch to improve the backend handling?

I will post a patch this week.


https://github.com/llvm/llvm-project/pull/76338


[llvm] [clang] [RISCV] Add Ssqosid support to -march. (PR #80747)

2024-02-05 Thread Yingwei Zheng via cfe-commits


@@ -1612,6 +1613,14 @@
 // RUN:   -o - | FileCheck --check-prefix=CHECK-SUPM-EXT %s
 // CHECK-SUPM-EXT: __riscv_supm 8000{{$}}
 
+// RUN: %clang --target=riscv32 -menable-experimental-extensions \
+// RUN:   -march=rv32i_ssqosid1p0 -E -dM %s \
+// RUN:   -o - | FileCheck --check-prefix=CHECK-SSQOSID-EXT %s
+// RUN: %clang --target=riscv64 \
+// RUN:   -march=rv64i_ssqosid1p0 -E -dM %s -menable-experimental-extensions \

dtcxzyw wrote:

Clang requires this flag when experimental RISC-V extensions are used:
> def menable_experimental_extensions : Flag<["-"], 
> "menable-experimental-extensions">, Group,
>   HelpText<"Enable use of experimental RISC-V extensions.">;


https://github.com/llvm/llvm-project/pull/80747


[clang] [Clang][CodeGen] Mark `__dynamic_cast` as `willreturn` (PR #80409)

2024-02-03 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/80409


[clang] [Clang][CodeGen] Mark `__dynamic_cast` as `willreturn` (PR #80409)

2024-02-02 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw created 
https://github.com/llvm/llvm-project/pull/80409

According to the C++ standard, `dynamic_cast` of pointers either returns a 
pointer (7.6.1.7) or results in undefined behavior (11.9.5). This patch marks 
`__dynamic_cast` as `willreturn` so that unused calls can be removed.

Fixes #77606.
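
For illustration only, a small sketch of the source-level behavior the description relies on: a pointer `dynamic_cast` yields either a valid pointer or a null pointer, so a cast whose result is never used has no observable effect. The types `A`, `B`, `C` and the helpers `isB` and `discardCast` are hypothetical and are not part of the patch.

```cpp
#include <cassert>

struct A { virtual ~A() = default; };
struct B : A {};
struct C : A {};

// Returns true when `a` points at a B (or a class derived from B).
// A failed pointer cast produces nullptr rather than throwing.
bool isB(A *a) { return dynamic_cast<B *>(a) != nullptr; }

// The cast result is discarded, so the call contributes nothing observable.
// This is the kind of call the description says can be removed.
void discardCast(A *a) {
  B *b = dynamic_cast<B *>(a);
  (void)b;
}
```

Whether a given compiler actually drops the call in `discardCast` depends on the attributes attached to the runtime function and on the optimization level.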

>From f96205dbcdbc5bb89a95cd563e47a4bb3616d843 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Fri, 2 Feb 2024 16:14:54 +0800
Subject: [PATCH] [Clang][CodeGen] Mark `__dynamic_cast` as `willreturn`

---
 clang/lib/CodeGen/ItaniumCXXABI.cpp  | 3 ++-
 clang/test/CodeGenCXX/dynamic-cast-address-space.cpp | 2 +-
 clang/test/CodeGenCXX/dynamic-cast-dead.cpp  | 8 
 clang/test/CodeGenCXX/dynamic-cast.cpp   | 2 +-
 4 files changed, 12 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenCXX/dynamic-cast-dead.cpp

diff --git a/clang/lib/CodeGen/ItaniumCXXABI.cpp b/clang/lib/CodeGen/ItaniumCXXABI.cpp
index d173806ec8ce5..60b45ee78d931 100644
--- a/clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ b/clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -1347,9 +1347,10 @@ static llvm::FunctionCallee getItaniumDynamicCastFn(CodeGenFunction &CGF) {
 
   llvm::FunctionType *FTy = llvm::FunctionType::get(Int8PtrTy, Args, false);
 
-  // Mark the function as nounwind readonly.
+  // Mark the function as nounwind willreturn readonly.
   llvm::AttrBuilder FuncAttrs(CGF.getLLVMContext());
   FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+  FuncAttrs.addAttribute(llvm::Attribute::WillReturn);
   FuncAttrs.addMemoryAttr(llvm::MemoryEffects::readOnly());
   llvm::AttributeList Attrs = llvm::AttributeList::get(
   CGF.getLLVMContext(), llvm::AttributeList::FunctionIndex, FuncAttrs);
diff --git a/clang/test/CodeGenCXX/dynamic-cast-address-space.cpp b/clang/test/CodeGenCXX/dynamic-cast-address-space.cpp
index c278988c9647b..83a408984b760 100644
--- a/clang/test/CodeGenCXX/dynamic-cast-address-space.cpp
+++ b/clang/test/CodeGenCXX/dynamic-cast-address-space.cpp
@@ -20,5 +20,5 @@ const B& f(A *a) {
 
 // CHECK: declare ptr @__dynamic_cast(ptr, ptr addrspace(1), ptr addrspace(1), i64) [[NUW_RO:#[0-9]+]]
 
-// CHECK: attributes [[NUW_RO]] = { nounwind memory(read) }
+// CHECK: attributes [[NUW_RO]] = { nounwind willreturn memory(read) }
 // CHECK: attributes [[NR]] = { noreturn }
diff --git a/clang/test/CodeGenCXX/dynamic-cast-dead.cpp b/clang/test/CodeGenCXX/dynamic-cast-dead.cpp
new file mode 100644
index 0..8154cc1ba123a
--- /dev/null
+++ b/clang/test/CodeGenCXX/dynamic-cast-dead.cpp
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -I%S %s -O3 -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s
+struct A { virtual ~A(); };
+struct B : A { };
+
+void foo(A* a) {
+  // CHECK-NOT: call {{.*}} @__dynamic_cast
+  B* b = dynamic_cast<B*>(a);
+}
diff --git a/clang/test/CodeGenCXX/dynamic-cast.cpp b/clang/test/CodeGenCXX/dynamic-cast.cpp
index 1d36376a55bc7..b39186c85b60a 100644
--- a/clang/test/CodeGenCXX/dynamic-cast.cpp
+++ b/clang/test/CodeGenCXX/dynamic-cast.cpp
@@ -20,5 +20,5 @@ const B& f(A *a) {
 
 // CHECK: declare ptr @__dynamic_cast(ptr, ptr, ptr, i64) [[NUW_RO:#[0-9]+]]
 
-// CHECK: attributes [[NUW_RO]] = { nounwind memory(read) }
+// CHECK: attributes [[NUW_RO]] = { nounwind willreturn memory(read) }
 // CHECK: attributes [[NR]] = { noreturn }



[clang] [llvm] [RISCV] Add experimental support of Zaamo and Zalrsc (PR #77424)

2024-01-18 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> I guess Zaamo + Zacas is technically a way one could implement atomics 
> without LR/SC?

The Zacas extension depends upon the A extension.


https://github.com/llvm/llvm-project/pull/77424


[llvm] [flang] [clang] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2024-01-18 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Ping?

https://github.com/llvm/llvm-project/pull/68882


[clang] [RISCV][clang] Optimize memory usage of intrinsic lookup table (PR #77487)

2024-01-09 Thread Yingwei Zheng via cfe-commits


@@ -463,7 +464,7 @@ void 
RISCVIntrinsicManagerImpl::CreateRVVIntrinsicDecl(LookupResult ,
 bool RISCVIntrinsicManagerImpl::CreateIntrinsicIfFound(LookupResult ,
IdentifierInfo *II,
Preprocessor ) {
-  StringRef Name = II->getName();
+  StringRef Name = II->getName().substr(8);

dtcxzyw wrote:

Looks like it is the string length of `__riscv_`.
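
If that reading is right, the magic number could be derived from the prefix instead of being hard-coded. A minimal sketch in standard C++17, not the code in the PR: `kRiscvPrefix` and `stripRiscvPrefix` are hypothetical names, and `std::string_view` stands in for `llvm::StringRef`.

```cpp
#include <cassert>
#include <string_view>

// The prefix whose length the literal 8 appears to encode.
constexpr std::string_view kRiscvPrefix = "__riscv_";

// Drops the prefix when present; returns the name unchanged otherwise.
constexpr std::string_view stripRiscvPrefix(std::string_view name) {
  return name.substr(0, kRiscvPrefix.size()) == kRiscvPrefix
             ? name.substr(kRiscvPrefix.size())
             : name;
}
```

Deriving the offset from the named prefix keeps the two in sync and makes the intent visible at the call site.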

https://github.com/llvm/llvm-project/pull/77487


[clang] [llvm] [RISCV] Add experimental support of Zaamo and Zalrsc (PR #77424)

2024-01-09 Thread Yingwei Zheng via cfe-commits


@@ -0,0 +1,11 @@
+# RUN: not llvm-mc -triple riscv32 -mattr=+experimental-zaamo < %s 2>&1 | FileCheck %s

dtcxzyw wrote:

Can we split `rv32a-invalid.s` into two files? I think it is better than 
duplicating tests for new extensions.


https://github.com/llvm/llvm-project/pull/77424


[clang] [llvm] Revert "InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp" (PR #76338)

2024-01-07 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/76338


[clang] [clang] static operators should evaluate object argument (PR #68485)

2024-01-04 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Ping.

https://github.com/llvm/llvm-project/pull/68485


[lld] [clang] [llvm] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2024-01-01 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Can you please fix or revert 
> https://lab.llvm.org/buildbot/#/builders/74/builds/24592 ?

Should be fixed by 
https://github.com/llvm/llvm-project/commit/7e405eb722e40c79b7726201d0f76b5dab34ba0f.
https://lab.llvm.org/buildbot/#/builders/74/builds/24613

https://github.com/llvm/llvm-project/pull/76553


[lld] [llvm] [clang] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2024-01-01 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> > Yeah, we should skip this inference for functions with the sanitize_memory 
> > attribute.
> 
> I will post a patch later.

Candidate patch: https://github.com/llvm/llvm-project/pull/76691

https://github.com/llvm/llvm-project/pull/76553


[llvm] [lld] [clang] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2024-01-01 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Yeah, we should skip this inference for functions with the sanitize_memory 
> attribute.

I will post a patch later.

https://github.com/llvm/llvm-project/pull/76553


[llvm] [clang] [lld] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2024-01-01 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Can you please fix or revert 
> https://lab.llvm.org/buildbot/#/builders/74/builds/24592 ?

Thank you for reporting this! I will check the error log.


https://github.com/llvm/llvm-project/pull/76553


[llvm] [clang] [lld] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/76553


[llvm] [clang] [lld] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw edited 
https://github.com/llvm/llvm-project/pull/76553


[clang] [lld] [llvm] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/76553

>From 30dcc33c4ea3ab50397a7adbe85fe977d4a400bd Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Fri, 29 Dec 2023 14:27:22 +0800
Subject: [PATCH 1/2] [FuncAttrs] Add pre-commit tests. NFC.

---
 llvm/test/Transforms/FunctionAttrs/noundef.ll | 145 ++
 1 file changed, 145 insertions(+)
 create mode 100644 llvm/test/Transforms/FunctionAttrs/noundef.ll

diff --git a/llvm/test/Transforms/FunctionAttrs/noundef.ll b/llvm/test/Transforms/FunctionAttrs/noundef.ll
new file mode 100644
index 00..9eca495e111e8f
--- /dev/null
+++ b/llvm/test/Transforms/FunctionAttrs/noundef.ll
@@ -0,0 +1,145 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes='function-attrs' -S | FileCheck %s
+
+define i32 @test_ret_constant() {
+; CHECK-LABEL: define i32 @test_ret_constant(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:ret i32 0
+;
+  ret i32 0
+}
+
+define i32 @test_ret_poison() {
+; CHECK-LABEL: define i32 @test_ret_poison(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 poison
+;
+  ret i32 poison
+}
+
+define i32 @test_ret_undef() {
+; CHECK-LABEL: define i32 @test_ret_undef(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 undef
+;
+  ret i32 undef
+}
+
+define i32 @test_ret_param(i32 %x) {
+; CHECK-LABEL: define i32 @test_ret_param(
+; CHECK-SAME: i32 returned [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 [[X]]
+;
+  ret i32 %x
+}
+
+define i32 @test_ret_noundef_param(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_noundef_param(
+; CHECK-SAME: i32 noundef returned [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 [[X]]
+;
+  ret i32 %x
+}
+
+define i32 @test_ret_noundef_expr(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_noundef_expr(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[Y]]
+;
+  %y = add i32 %x, 1
+  ret i32 %y
+}
+
+define i32 @test_ret_create_poison_expr(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_create_poison_expr(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[Y]]
+;
+  %y = add nsw i32 %x, 1
+  ret i32 %y
+}
+
+define i32 @test_ret_freezed(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_freezed(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:[[Z:%.*]] = freeze i32 [[Y]]
+; CHECK-NEXT:ret i32 [[Z]]
+;
+  %y = add nsw i32 %x, 1
+  %z = freeze i32 %y
+  ret i32 %z
+}
+
+define i32 @test_ret_control_flow(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 0
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 0
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add i32 %x, 1
+  ret i32 %ret
+}
+
+define i32 @test_ret_control_flow_may_poison(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow_may_poison(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 0
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 0
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add nsw i32 %x, 1
+  ret i32 %ret
+}
+
+; TODO: use context-sensitive analysis
+define i32 @test_ret_control_flow_never_poison(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow_never_poison(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 2147483647
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 2147483647
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add nsw i32 %x, 1
+  ret i32 %ret
+}
+
+define i32 @test_noundef_prop() {
+; CHECK-LABEL: define i32 @test_noundef_prop(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:[[RET:%.*]] = call i32 @test_ret_constant()
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %ret = call i32 @test_ret_constant()
+  ret i32 %ret
+}

>From c5e8738d4bfbf1e97e3f455fded90b791f223d74 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 

[lld] [clang] [llvm] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Failed Tests (3):
>   LLVM :: CodeGen/BPF/loop-exit-cond.ll
>   LLVM :: CodeGen/NVPTX/nvvm-reflect-opaque.ll
>   LLVM :: CodeGen/NVPTX/nvvm-reflect.ll

https://github.com/llvm/llvm-project/pull/76553


[llvm] [clang] [lld] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> There are lld test failures.

Done.

https://github.com/llvm/llvm-project/pull/76553


[llvm] [clang] [lld] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/76553

>From 30dcc33c4ea3ab50397a7adbe85fe977d4a400bd Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Fri, 29 Dec 2023 14:27:22 +0800
Subject: [PATCH 1/2] [FuncAttrs] Add pre-commit tests. NFC.

---
 llvm/test/Transforms/FunctionAttrs/noundef.ll | 145 ++
 1 file changed, 145 insertions(+)
 create mode 100644 llvm/test/Transforms/FunctionAttrs/noundef.ll

diff --git a/llvm/test/Transforms/FunctionAttrs/noundef.ll b/llvm/test/Transforms/FunctionAttrs/noundef.ll
new file mode 100644
index 00..9eca495e111e8f
--- /dev/null
+++ b/llvm/test/Transforms/FunctionAttrs/noundef.ll
@@ -0,0 +1,145 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes='function-attrs' -S | FileCheck %s
+
+define i32 @test_ret_constant() {
+; CHECK-LABEL: define i32 @test_ret_constant(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:ret i32 0
+;
+  ret i32 0
+}
+
+define i32 @test_ret_poison() {
+; CHECK-LABEL: define i32 @test_ret_poison(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 poison
+;
+  ret i32 poison
+}
+
+define i32 @test_ret_undef() {
+; CHECK-LABEL: define i32 @test_ret_undef(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 undef
+;
+  ret i32 undef
+}
+
+define i32 @test_ret_param(i32 %x) {
+; CHECK-LABEL: define i32 @test_ret_param(
+; CHECK-SAME: i32 returned [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 [[X]]
+;
+  ret i32 %x
+}
+
+define i32 @test_ret_noundef_param(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_noundef_param(
+; CHECK-SAME: i32 noundef returned [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 [[X]]
+;
+  ret i32 %x
+}
+
+define i32 @test_ret_noundef_expr(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_noundef_expr(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[Y]]
+;
+  %y = add i32 %x, 1
+  ret i32 %y
+}
+
+define i32 @test_ret_create_poison_expr(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_create_poison_expr(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[Y]]
+;
+  %y = add nsw i32 %x, 1
+  ret i32 %y
+}
+
+define i32 @test_ret_freezed(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_freezed(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:[[Z:%.*]] = freeze i32 [[Y]]
+; CHECK-NEXT:ret i32 [[Z]]
+;
+  %y = add nsw i32 %x, 1
+  %z = freeze i32 %y
+  ret i32 %z
+}
+
+define i32 @test_ret_control_flow(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 0
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 0
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add i32 %x, 1
+  ret i32 %ret
+}
+
+define i32 @test_ret_control_flow_may_poison(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow_may_poison(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 0
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 0
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add nsw i32 %x, 1
+  ret i32 %ret
+}
+
+; TODO: use context-sensitive analysis
+define i32 @test_ret_control_flow_never_poison(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow_never_poison(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 2147483647
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 2147483647
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add nsw i32 %x, 1
+  ret i32 %ret
+}
+
+define i32 @test_noundef_prop() {
+; CHECK-LABEL: define i32 @test_noundef_prop(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:[[RET:%.*]] = call i32 @test_ret_constant()
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %ret = call i32 @test_ret_constant()
+  ret i32 %ret
+}

>From fe11127cd7e9ed4669243502eda5991504b9809a Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 

[llvm] [clang] [lld] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-31 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/76553

>From 30dcc33c4ea3ab50397a7adbe85fe977d4a400bd Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Fri, 29 Dec 2023 14:27:22 +0800
Subject: [PATCH 1/2] [FuncAttrs] Add pre-commit tests. NFC.

---
 llvm/test/Transforms/FunctionAttrs/noundef.ll | 145 ++
 1 file changed, 145 insertions(+)
 create mode 100644 llvm/test/Transforms/FunctionAttrs/noundef.ll

diff --git a/llvm/test/Transforms/FunctionAttrs/noundef.ll b/llvm/test/Transforms/FunctionAttrs/noundef.ll
new file mode 100644
index 00..9eca495e111e8f
--- /dev/null
+++ b/llvm/test/Transforms/FunctionAttrs/noundef.ll
@@ -0,0 +1,145 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes='function-attrs' -S | FileCheck %s
+
+define i32 @test_ret_constant() {
+; CHECK-LABEL: define i32 @test_ret_constant(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:ret i32 0
+;
+  ret i32 0
+}
+
+define i32 @test_ret_poison() {
+; CHECK-LABEL: define i32 @test_ret_poison(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 poison
+;
+  ret i32 poison
+}
+
+define i32 @test_ret_undef() {
+; CHECK-LABEL: define i32 @test_ret_undef(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 undef
+;
+  ret i32 undef
+}
+
+define i32 @test_ret_param(i32 %x) {
+; CHECK-LABEL: define i32 @test_ret_param(
+; CHECK-SAME: i32 returned [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 [[X]]
+;
+  ret i32 %x
+}
+
+define i32 @test_ret_noundef_param(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_noundef_param(
+; CHECK-SAME: i32 noundef returned [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:ret i32 [[X]]
+;
+  ret i32 %x
+}
+
+define i32 @test_ret_noundef_expr(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_noundef_expr(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[Y]]
+;
+  %y = add i32 %x, 1
+  ret i32 %y
+}
+
+define i32 @test_ret_create_poison_expr(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_create_poison_expr(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[Y]]
+;
+  %y = add nsw i32 %x, 1
+  ret i32 %y
+}
+
+define i32 @test_ret_freezed(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_freezed(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[Y:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:[[Z:%.*]] = freeze i32 [[Y]]
+; CHECK-NEXT:ret i32 [[Z]]
+;
+  %y = add nsw i32 %x, 1
+  %z = freeze i32 %y
+  ret i32 %z
+}
+
+define i32 @test_ret_control_flow(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 0
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 0
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add i32 %x, 1
+  ret i32 %ret
+}
+
+define i32 @test_ret_control_flow_may_poison(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow_may_poison(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 0
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 0
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add nsw i32 %x, 1
+  ret i32 %ret
+}
+
+; TODO: use context-sensitive analysis
+define i32 @test_ret_control_flow_never_poison(i32 noundef %x) {
+; CHECK-LABEL: define i32 @test_ret_control_flow_never_poison(
+; CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:[[COND:%.*]] = icmp eq i32 [[X]], 2147483647
+; CHECK-NEXT:br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i32 2
+; CHECK:   if.else:
+; CHECK-NEXT:[[RET:%.*]] = add nsw i32 [[X]], 1
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %cond = icmp eq i32 %x, 2147483647
+  br i1 %cond, label %if.then, label %if.else
+if.then:
+  ret i32 2
+if.else:
+  %ret = add nsw i32 %x, 1
+  ret i32 %ret
+}
+
+define i32 @test_noundef_prop() {
+; CHECK-LABEL: define i32 @test_noundef_prop(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:[[RET:%.*]] = call i32 @test_ret_constant()
+; CHECK-NEXT:ret i32 [[RET]]
+;
+  %ret = call i32 @test_ret_constant()
+  ret i32 %ret
+}

>From 46188dfe1a8069c94fc4628660889e070e6a82cb Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 

[clang] [llvm] [RFC][RISCV] Support RISC-V Profiles in -march option (PR #76357)

2023-12-28 Thread Yingwei Zheng via cfe-commits


@@ -0,0 +1,112 @@
+// RUN: %clang -### -c %s 2>&1 -march=rvi20u32 | FileCheck -check-prefix=RVI20U32 %s
+// RVI20U32: "-target-cpu" "generic-rv32"
+// RVI20U32: "-target-feature" "-a"
+// RVI20U32: "-target-feature" "-c"
+// RVI20U32: "-target-feature" "-d"
+// RVI20U32: "-target-feature" "-f"
+// RVI20U32: "-target-feature" "-m"
+// RVI20U32: "-target-feature" "+rvi20u32"
+// RVI20U32: "-target-abi" "ilp32"
+
+// RUN: %clang -### -c %s 2>&1 -march=rvi20u64 | FileCheck -check-prefix=RVI20U64 %s
+// RVI20U64: "-target-cpu" "generic-rv64"
+// RVI20U64: "-target-feature" "-a"
+// RVI20U64: "-target-feature" "-c"
+// RVI20U64: "-target-feature" "-d"
+// RVI20U64: "-target-feature" "-f"
+// RVI20U64: "-target-feature" "-m"
+// RVI20U64: "-target-feature" "+rvi20u64"
+// RVI20U64: "-target-abi" "lp64"
+
+// RUN: %clang -### -c %s 2>&1 -march=rva20u64 | FileCheck -check-prefix=RVA20U64 %s
+// RVA20U64: "-target-cpu" "generic-rv64"
+// RVA20U64: "-target-feature" "+m"
+// RVA20U64: "-target-feature" "+a"
+// RVA20U64: "-target-feature" "+f"
+// RVA20U64: "-target-feature" "+d"
+// RVA20U64: "-target-feature" "+c"
+// RVA20U64: "-target-feature" "+zicsr"
+// RVA20U64: "-target-feature" "+rva20u64"
+// RVA20U64: "-target-abi" "lp64d"
+
+// RUN: %clang -### -c %s 2>&1 -march=rva20s64 | FileCheck -check-prefix=RVA20S64 %s
+// RVA20S64: "-target-cpu" "generic-rv64"
+// RVA20S64: "-target-feature" "+m"
+// RVA20S64: "-target-feature" "+a"
+// RVA20S64: "-target-feature" "+f"
+// RVA20S64: "-target-feature" "+d"
+// RVA20S64: "-target-feature" "+c"
+// RVA20S64: "-target-feature" "+zicsr"
+// RVA20S64: "-target-feature" "+zifencei"
+// RVA20S64: "-target-feature" "+rva20s64"
+// RVA20S64: "-target-abi" "lp64d"
+
+// RUN: %clang -### -c %s 2>&1 -march=rva22u64 | FileCheck -check-prefix=RVA22U64 %s
+// RVA22U64: "-target-cpu" "generic-rv64"
+// RVA22U64: "-target-feature" "+m"
+// RVA22U64: "-target-feature" "+a"
+// RVA22U64: "-target-feature" "+f"
+// RVA22U64: "-target-feature" "+d"
+// RVA22U64: "-target-feature" "+c"
+// RVA22U64: "-target-feature" "+zicbom"
+// RVA22U64: "-target-feature" "+zicbop"
+// RVA22U64: "-target-feature" "+zicboz"
+// RVA22U64: "-target-feature" "+zicsr"
+// RVA22U64: "-target-feature" "+zihintpause"
+// RVA22U64: "-target-feature" "+zfhmin" 
+// RVA22U64: "-target-feature" "+zba"
+// RVA22U64: "-target-feature" "+zbb"
+// RVA22U64: "-target-feature" "+zbs"
+// RVA22U64: "-target-feature" "+zkt"
+// RVA22U64: "-target-feature" "+rva22u64"
+// RVA22U64: "-target-abi" "lp64d"
+
+// RUN: %clang -### -c %s 2>&1 -march=rva22s64 | FileCheck -check-prefix=RVA22S64 %s
+// RVA22S64: "-target-cpu" "generic-rv64"
+// RVA22S64: "-target-feature" "+m"
+// RVA22S64: "-target-feature" "+a"
+// RVA22S64: "-target-feature" "+f"
+// RVA22S64: "-target-feature" "+d"
+// RVA22S64: "-target-feature" "+c"
+// RVA22S64: "-target-feature" "+zicbom"
+// RVA22S64: "-target-feature" "+zicbop"
+// RVA22S64: "-target-feature" "+zicboz"
+// RVA22S64: "-target-feature" "+zicsr"
+// RVA22S64: "-target-feature" "+zifencei"
+// RVA22S64: "-target-feature" "+zihintpause"
+// RVA22S64: "-target-feature" "+zfhmin" 
+// RVA22S64: "-target-feature" "+zba"
+// RVA22S64: "-target-feature" "+zbb"
+// RVA22S64: "-target-feature" "+zbs"
+// RVA22S64: "-target-feature" "+zkt"
+// RVA22S64: "-target-feature" "+svinval"
+// RVA22S64: "-target-feature" "+svpbmt"
+// RVA22S64: "-target-feature" "+rva22s64"
+// RVA22S64: "-target-abi" "lp64d"
+
+// RUN: %clang -### -c %s 2>&1 -march=rva22u64_zfa | FileCheck -check-prefix=PROFILE-WITH-ADDITIONAL %s

dtcxzyw wrote:

```suggestion
// RUN: %clang -### -c %s 2>&1 -march=RVA22U64+zfa | FileCheck -check-prefix=PROFILE-WITH-ADDITIONAL %s
```

> profiles format has the following BNF form `"-march="<profile-name>"+"[option-ext]*`.

https://github.com/llvm/llvm-project/pull/76357


[llvm] [clang] [RFC][RISCV] Support RISC-V Profiles in -march option (PR #76357)

2023-12-28 Thread Yingwei Zheng via cfe-commits


@@ -206,6 +210,17 @@ static const RISCVSupportedExtension SupportedExperimentalExtensions[] = {
 {"zvfbfwma", RISCVExtensionVersion{0, 8}},
 };
 
+static const RISCVProfile SupportedProfiles[] = {
+{"rvi20u32", "rv32i"},

dtcxzyw wrote:

Profile names should use uppercase letters.
> e.g. `-march=RVA20U64` is a legal profile input, it will be expanded into:
> `-march=rv64imafdc_zicsr_ziccif_ziccrse_ziccamoa_zicclsm_za128rs`,
> which include all the mandatory extensions required by this profile.
> `-march=RVA20U64+zba_zbb_zbc_zbs` is also a legal profile input, it will add
> four new extensions after expanded profile strings:
> `-march=rv64imafdc_zicsr_ziccif_ziccrse_ziccamoa_zicclsm_za128rs_zba_zbb_zbc_zbs`
> and `-march=rva20u64` is an illegal profile input, it does not use uppercase letters.
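
To make the quoted convention concrete, here is a minimal sketch of the expansion it describes. It is not the code in this PR: `expandProfileMarch` and its one-entry table are hypothetical names for illustration, and the expansion string is copied from the quote. An empty result stands for "not a legal profile input".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper, not part of the PR: expands a -march value of the form
// <profile-name>["+"<option-ext>] as described in the quoted convention.
// Returns an empty string when the input is not a legal profile string.
std::string expandProfileMarch(const std::string &March) {
  // Only the one profile spelled out in the quote is listed here.
  static const std::map<std::string, std::string> Profiles = {
      {"RVA20U64", "rv64imafdc_zicsr_ziccif_ziccrse_ziccamoa_zicclsm_za128rs"},
  };

  const std::size_t Plus = March.find('+');
  const std::string Name = March.substr(0, Plus);

  // The lookup is case-sensitive, so "rva20u64" is rejected.
  const std::map<std::string, std::string>::const_iterator It =
      Profiles.find(Name);
  if (It == Profiles.end())
    return std::string();

  std::string Expanded = It->second;
  if (Plus != std::string::npos) {
    const std::string Extra = March.substr(Plus + 1);
    if (Extra.empty())
      return std::string(); // "RVA20U64+" with nothing after the plus
    Expanded += "_" + Extra; // extra extensions follow the expanded profile
  }
  return Expanded;
}
```

With this reading, `RVA20U64+zba_zbb_zbc_zbs` expands to the profile string followed by `_zba_zbb_zbc_zbs`, while the lowercase spelling fails the lookup.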

https://github.com/llvm/llvm-project/pull/76357


[clang] [llvm] [FuncAttrs] Deduce `noundef` attributes for return values (PR #76553)

2023-12-28 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw created 
https://github.com/llvm/llvm-project/pull/76553

This patch deduces `noundef` attributes for return values.
IIUC, a function returns `noundef` values iff all of its return values are 
guaranteed not to be `undef` or `poison`.
Definition of `noundef` from LangRef:
```
noundef
This attribute applies to parameters and return values. If the value
representation contains any undefined or poison bits, the behavior is
undefined. Note that this does not refer to padding introduced by the
type’s storage representation.
```
Alive2: https://alive2.llvm.org/ce/z/g8Eis6
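
As a rough illustration of the rule above (a toy model, not the patch itself), the deduction can be thought of as "every returned value is guaranteed not to be `undef` or `poison`". The `Value` struct and helper names below are hypothetical; in LLVM I would expect a ValueTracking query such as `isGuaranteedNotToBeUndefOrPoison` to play this role, but that is an assumption about the implementation. The cases mirror the pre-commit tests: constants, `noundef` params, plain `add`, `add nsw`, and `freeze`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value kinds, loosely mirroring the cases in noundef.ll.
enum class Kind { Constant, Undef, Poison, Param, Add, Freeze };

struct Value {
  Kind K;
  bool NoundefParam; // meaningful for Param only
  bool NSW;          // meaningful for Add only
  const Value *Op;   // operand of Add/Freeze, otherwise null
};

Value constant() { return Value{Kind::Constant, false, false, nullptr}; }
Value param(bool Noundef) { return Value{Kind::Param, Noundef, false, nullptr}; }
// Models `add [nsw] %op, <constant>`.
Value add(const Value &Op, bool NSW) { return Value{Kind::Add, false, NSW, &Op}; }
Value freeze(const Value &Op) { return Value{Kind::Freeze, false, false, &Op}; }

// Conservative check: true only if V can never be undef or poison.
bool notUndefOrPoison(const Value &V) {
  switch (V.K) {
  case Kind::Constant:
    return true;
  case Kind::Undef:
  case Kind::Poison:
    return false;
  case Kind::Param:
    return V.NoundefParam;
  case Kind::Add:
    // `nsw` may create poison on overflow, so give up on it here.
    return !V.NSW && V.Op && notUndefOrPoison(*V.Op);
  case Kind::Freeze:
    return true; // freeze never yields undef or poison
  }
  return false;
}

// A return can be marked noundef iff all returned values pass the check.
bool returnIsNoundef(const std::vector<const Value *> &Returns) {
  for (std::size_t I = 0; I < Returns.size(); ++I)
    if (!notUndefOrPoison(*Returns[I]))
      return false;
  return true;
}
```

Under this model a function returning `2` on one path and `add nsw %x, 1` on the other is not marked, which matches the intent of `test_ret_control_flow_may_poison`.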

Compile-time impact: 
http://llvm-compile-time-tracker.com/compare.php?from=7f69c8b3a6c02ea32fefb16c2016dfa1ba994858&to=1dafc281ff8c04bb0a968fb3d898f08876dc59e0&stat=instructions:u
|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|+0.01%|+0.01%|-0.01%|+0.00%|+0.03%|+0.02%|+0.01%|

The motivation for this patch is to reduce the number of `freeze` instructions
and enable more optimizations.
Example:
```
diff --git a/bench/flac/optimized/replaygain.c.ll b/bench/flac/optimized/replaygain.c.ll
index fa826475..413bd717 100644
--- a/bench/flac/optimized/replaygain.c.ll
+++ b/bench/flac/optimized/replaygain.c.ll
@@ -63,7 +63,7 @@ entry:
 declare i32 @InitGainAnalysis(i64 noundef) local_unnamed_addr #1
 
 ; Function Attrs: nounwind sspstrong uwtable
-define dso_local i32 @grabbag__replaygain_analyze(ptr nocapture noundef readonly %input, i32 noundef %is_stereo, i32 noundef %bps, i32 noundef %samples) local_unnamed_addr #0 {
+define dso_local noundef i32 @grabbag__replaygain_analyze(ptr nocapture noundef readonly %input, i32 noundef %is_stereo, i32 noundef %bps, i32 noundef %samples) local_unnamed_addr #0 {
 entry:
   %cmp = icmp eq i32 %bps, 16
   br i1 %cmp, label %if.then, label %if.else71
@@ -337,7 +337,7 @@ entry:
 declare float @GetTitleGain() local_unnamed_addr #1
 
 ; Function Attrs: nounwind sspstrong uwtable
-define dso_local ptr @grabbag__replaygain_analyze_file(ptr noundef %filename, ptr nocapture noundef writeonly %title_gain, ptr nocapture noundef writeonly %title_peak) local_unnamed_addr #0 {
+define dso_local noundef ptr @grabbag__replaygain_analyze_file(ptr noundef %filename, ptr nocapture noundef writeonly %title_gain, ptr nocapture noundef writeonly %title_peak) local_unnamed_addr #0 {
 entry:
   %instance = alloca %struct.DecoderInstance, align 4
   %call = tail call ptr @FLAC__stream_decoder_new() #15
@@ -392,7 +392,7 @@ declare i32 @FLAC__stream_decoder_set_metadata_respond(ptr noundef, i32 noundef)
 declare i32 @FLAC__stream_decoder_init_file(ptr noundef, ptr noundef, ptr noundef, ptr noundef, ptr noundef, ptr noundef) local_unnamed_addr #1
 
 ; Function Attrs: nounwind sspstrong uwtable
-define internal i32 @write_callback_(ptr nocapture readnone %decoder, ptr nocapture noundef readonly %frame, ptr nocapture noundef readonly %buffer, ptr nocapture noundef %client_data) #0 {
+define internal noundef i32 @write_callback_(ptr nocapture readnone %decoder, ptr nocapture noundef readonly %frame, ptr nocapture noundef readonly %buffer, ptr nocapture noundef %client_data) #0 {
 entry:
   %bits_per_sample1 = getelementptr inbounds %struct.FLAC__FrameHeader, ptr %frame, i64 0, i32 4
   %0 = load i32, ptr %bits_per_sample1, align 8
@@ -429,23 +429,16 @@ land.lhs.true14:  ; preds = %land.lhs.true11
   %cmp16 = icmp eq i32 %2, %8
   br i1 %cmp16, label %if.end, label %if.end.thread
 
-if.end.thread:  ; preds = %land.lhs.true, %land.lhs.true14, %land.lhs.true11, %land.lhs.true8, %entry
-  store i32 1, ptr %error, align 4
-  br label %9
-
 if.end:   ; preds = %land.lhs.true14
   %conv = zext i1 %cmp to i32
   %call = tail call i32 @grabbag__replaygain_analyze(ptr noundef %buffer, i32 noundef %conv, i32 noundef %0, i32 noundef %3), !range !14
-  %call.fr = freeze i32 %call
-  %lnot.ext = xor i32 %call.fr, 1
-  store i32 %lnot.ext, ptr %error, align 4
-  %tobool22.not = icmp ne i32 %lnot.ext, 0
-  %spec.select = zext i1 %tobool22.not to i32
-  br label %9
-
-9:  ; preds = %if.end, %if.end.thread
-  %10 = phi i32 [ 1, %if.end.thread ], [ %spec.select, %if.end ]
-  ret i32 %10
+  %lnot.ext = xor i32 %call, 1
+  br label %if.end.thread
+
+if.end.thread:  ; preds = %entry, %land.lhs.true8, %land.lhs.true11, %land.lhs.true14, %land.lhs.true, %if.end
+  %storemerge = phi i32 [ %lnot.ext, %if.end ], [ 1, %land.lhs.true ], [ 1, %land.lhs.true14 ], [ 1, %land.lhs.true11 ], [ 1, %land.lhs.true8 ], [ 1, %entry ]
+  store i32 %storemerge, ptr %error, align 4
+  ret i32 %storemerge
 }
 
 ; Function Attrs: nounwind sspstrong uwtable

```

>From 30dcc33c4ea3ab50397a7adbe85fe977d4a400bd Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Fri, 29 Dec 2023 14:27:22 

[clang] [llvm] [FuncAttrs] Infer `norecurse` for funcs with calls to `nocallback` callees (PR #76372)

2023-12-26 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/76372


[clang] [llvm] [FuncAttrs] Infer `norecurse` for funcs with calls to `nocallback` callees (PR #76372)

2023-12-26 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> There is a failing clang test.

Fixed.

https://github.com/llvm/llvm-project/pull/76372


[clang] [llvm] [FuncAttrs] Infer `norecurse` for funcs with calls to `nocallback` callees (PR #76372)

2023-12-26 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/76372

>From 5ceb22715cdcfc52b77b451110295ea083c09327 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Tue, 26 Dec 2023 05:10:06 +0800
Subject: [PATCH] [FuncAttrs] Infer `norecurse` for funcs with calls to
 `nocallback` callees

---
 .../RISCV/rvv-intrinsics-handcrafted/vlenb.c  | 24 +--
 llvm/lib/Transforms/IPO/FunctionAttrs.cpp |  5 +++-
 .../TypeBasedAliasAnalysis/functionattrs.ll   | 12 --
 .../Transforms/FunctionAttrs/argmemonly.ll|  6 ++---
 .../Transforms/FunctionAttrs/convergent.ll|  2 +-
 .../FunctionAttrs/int_sideeffect.ll   |  4 ++--
 .../FunctionAttrs/make-buffer-rsrc.ll |  2 +-
 .../Transforms/FunctionAttrs/nocapture.ll | 16 ++---
 .../FunctionAttrs/nofree-attributor.ll|  4 ++--
 .../Transforms/FunctionAttrs/norecurse.ll | 17 ++---
 llvm/test/Transforms/FunctionAttrs/nosync.ll  |  6 ++---
 .../Transforms/FunctionAttrs/readattrs.ll |  4 ++--
 .../Transforms/FunctionAttrs/writeonly.ll | 18 +++---
 13 files changed, 61 insertions(+), 59 deletions(-)

diff --git a/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/vlenb.c b/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/vlenb.c
index 9d95acc33dddcd..582d5fd812bc34 100644
--- a/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/vlenb.c
+++ b/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/vlenb.c
@@ -21,19 +21,19 @@ unsigned long test_vlenb(void) {
   return __riscv_vlenb();
 }
 //.
-// RV32: attributes #0 = { mustprogress nofree noinline nosync nounwind willreturn memory(read) vscale_range(2,1024) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+32bit,+d,+f,+v,+zicsr,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b" }
-// RV32: attributes #1 = { mustprogress nocallback nofree nosync nounwind willreturn memory(read) }
+// RV32: attributes #[[ATTR0:[0-9]+]] = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(read) vscale_range(2,1024) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+32bit,+d,+f,+v,+zicsr,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b" }
+// RV32: attributes #[[ATTR1:[0-9]+]] = { mustprogress nocallback nofree nosync nounwind willreturn memory(read) }
 //.
-// RV64: attributes #0 = { mustprogress nofree noinline nosync nounwind willreturn memory(read) vscale_range(2,1024) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+64bit,+d,+f,+v,+zicsr,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b" }
-// RV64: attributes #1 = { mustprogress nocallback nofree nosync nounwind willreturn memory(read) }
+// RV64: attributes #[[ATTR0:[0-9]+]] = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(read) vscale_range(2,1024) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+64bit,+d,+f,+v,+zicsr,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b" }
+// RV64: attributes #[[ATTR1:[0-9]+]] = { mustprogress nocallback nofree nosync nounwind willreturn memory(read) }
 //.
-// RV32: !0 = !{i32 1, !"wchar_size", i32 4}
-// RV32: !1 = !{i32 1, !"target-abi", !"ilp32d"}
-// RV32: !2 = !{i32 8, !"SmallDataLimit", i32 0}
-// RV32: !3 = !{!"vlenb"}
+// RV32: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
+// RV32: [[META1:![0-9]+]] = !{i32 1, !"target-abi", !"ilp32d"}
+// RV32: [[META2:![0-9]+]] = !{i32 8, !"SmallDataLimit", i32 0}
+// RV32: [[META3]] = !{!"vlenb"}
 //.
-// RV64: !0 = !{i32 1, !"wchar_size", i32 4}
-// RV64: !1 = !{i32 1, !"target-abi", !"lp64d"}
-// RV64: !2 = !{i32 8, !"SmallDataLimit", i32 0}
-// RV64: !3 = !{!"vlenb"}
+// RV64: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
+// RV64: [[META1:![0-9]+]] = !{i32 1, !"target-abi", !"lp64d"}
+// RV64: [[META2:![0-9]+]] = !{i32 8, !"SmallDataLimit", i32 0}
+// RV64: [[META3]] = !{!"vlenb"}
 //.
diff --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
index 7c277518b21dbf..9ce9f8451a95fa 100644
--- a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
@@ -1629,7 +1629,10 @@ static void addNoRecurseAttrs(const SCCNodeSet &SCCNodes,
     for (auto &I : BB.instructionsWithoutDebug())
       if (auto *CB = dyn_cast<CallBase>(&I)) {
         Function *Callee = CB->getCalledFunction();
-        if (!Callee || Callee == F || !Callee->doesNotRecurse())
+        if (!Callee || Callee == F ||
+            (!Callee->doesNotRecurse() &&
+             !(Callee->isDeclaration() &&
+               Callee->hasFnAttribute(Attribute::NoCallback))))
           // Function calls a potentially recursive function.
           return;
       }
diff --git a/llvm/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll b/llvm/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll
index 86e7f8c113d1d8..bea56a72bdeaef 
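
For readers skimming the `FunctionAttrs.cpp` hunk, the new bail-out condition can be restated as a standalone predicate. This is only a sketch: `CalleeInfo` and `callMayRecurse` are hypothetical names, with the fields standing in for the `Function` queries used in the diff. As I read it, the idea is that a `nocallback` declaration cannot call back into the module, so a call to it cannot lead back to the caller.

```cpp
#include <cassert>

// Hypothetical stand-in for the queries made on the callee in the diff.
struct CalleeInfo {
  bool Known;          // false for indirect calls (getCalledFunction() == null)
  bool IsSelf;         // Callee == F
  bool DoesNotRecurse; // Callee->doesNotRecurse()
  bool IsDeclaration;  // Callee->isDeclaration()
  bool NoCallback;     // Callee->hasFnAttribute(Attribute::NoCallback)
};

// True when the call forces addNoRecurseAttrs to give up on the caller.
bool callMayRecurse(const CalleeInfo &C) {
  if (!C.Known || C.IsSelf)
    return true;
  if (C.DoesNotRecurse)
    return false;
  // New in the patch: a nocallback declaration is also accepted.
  return !(C.IsDeclaration && C.NoCallback);
}
```

Before the patch, the last line would effectively have been `return true;`, which is why intrinsics-only callers such as `test_vlenb` now pick up `norecurse`.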

[llvm] [clang] Revert "InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp" (PR #76338)

2023-12-24 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/76338

>From a646e872e72bab7b143db7496adfeb633b882dc4 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Mon, 25 Dec 2023 01:39:27 +0800
Subject: [PATCH] Revert "InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp"

This reverts commit 2b582440c16c72b6b021ea5c212ceda3bdfb2b9b.
---
 clang/test/CodeGen/isfpclass.c| 23 -
 clang/test/Headers/__clang_hip_math.hip   | 40 ++-
 .../InstCombine/InstCombineCalls.cpp  | 18 ---
 llvm/test/Transforms/InstCombine/and-fcmp.ll  |  9 ++--
 .../combine-is.fpclass-and-fcmp.ll| 26 --
 .../create-class-from-logic-fcmp.ll   | 30 ---
 .../test/Transforms/InstCombine/is_fpclass.ll | 51 ---
 7 files changed, 72 insertions(+), 125 deletions(-)

diff --git a/clang/test/CodeGen/isfpclass.c b/clang/test/CodeGen/isfpclass.c
index 34873c08e04f87..08c2633266dbd5 100644
--- a/clang/test/CodeGen/isfpclass.c
+++ b/clang/test/CodeGen/isfpclass.c
@@ -4,9 +4,8 @@
 // CHECK-LABEL: define dso_local i1 @check_isfpclass_finite
 // CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call float @llvm.fabs.f32(float [[X]])
-// CHECK-NEXT:[[TMP1:%.*]] = fcmp one float [[TMP0]], 0x7FF0
-// CHECK-NEXT:ret i1 [[TMP1]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504)
+// CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isfpclass_finite(float x) {
   return __builtin_isfpclass(x, 504 /*Finite*/);
@@ -15,7 +14,7 @@ _Bool check_isfpclass_finite(float x) {
 // CHECK-LABEL: define dso_local i1 @check_isfpclass_finite_strict
 // CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504) #[[ATTR6:[0-9]+]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504) #[[ATTR5:[0-9]+]]
 // CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isfpclass_finite_strict(float x) {
@@ -36,7 +35,7 @@ _Bool check_isfpclass_nan_f32(float x) {
 // CHECK-LABEL: define dso_local i1 @check_isfpclass_nan_f32_strict
 // CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR6]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR5]]
 // CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isfpclass_nan_f32_strict(float x) {
@@ -57,7 +56,7 @@ _Bool check_isfpclass_snan_f64(double x) {
 // CHECK-LABEL: define dso_local i1 @check_isfpclass_snan_f64_strict
 // CHECK-SAME: (double noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f64(double [[X]], i32 1) #[[ATTR6]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f64(double [[X]], i32 1) #[[ATTR5]]
 // CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isfpclass_snan_f64_strict(double x) {
@@ -78,7 +77,7 @@ _Bool check_isfpclass_zero_f16(_Float16 x) {
 // CHECK-LABEL: define dso_local i1 @check_isfpclass_zero_f16_strict
 // CHECK-SAME: (half noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f16(half [[X]], i32 96) #[[ATTR6]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f16(half [[X]], i32 96) #[[ATTR5]]
 // CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isfpclass_zero_f16_strict(_Float16 x) {
@@ -89,7 +88,7 @@ _Bool check_isfpclass_zero_f16_strict(_Float16 x) {
 // CHECK-LABEL: define dso_local i1 @check_isnan
 // CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR6]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR5]]
 // CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isnan(float x) {
@@ -100,7 +99,7 @@ _Bool check_isnan(float x) {
 // CHECK-LABEL: define dso_local i1 @check_isinf
 // CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 516) #[[ATTR6]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 516) #[[ATTR5]]
 // CHECK-NEXT:ret i1 [[TMP0]]
 //
 _Bool check_isinf(float x) {
@@ -111,7 +110,7 @@ _Bool check_isinf(float x) {
 // CHECK-LABEL: define dso_local i1 @check_isfinite
 // CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float 
[[X]], i32 504) 
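
For context on the mask constants in `isfpclass.c` (504, 516, 3, 96, 1), the sketch below decodes them and compares the `is.fpclass` form with the `fabs` + `fcmp one` form that this revert removes from the expected output. The bit assignments are my understanding of LLVM's fpclass mask layout. `classify`, `isFPClass` and `finiteViaFabs` are hypothetical helpers, and the quiet-NaN check assumes IEEE-754 binary32 with the quiet bit at mantissa bit 22.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// One bit per floating-point class; a test mask is an OR of these.
enum FPClassBits : unsigned {
  SNan = 1, QNan = 2, NegInf = 4, NegNormal = 8, NegSubnormal = 16,
  NegZero = 32, PosZero = 64, PosSubnormal = 128, PosNormal = 256, PosInf = 512
};

// Returns the single class bit describing X.
unsigned classify(float X) {
  const bool Neg = std::signbit(X);
  switch (std::fpclassify(X)) {
  case FP_NAN: {
    std::uint32_t Bits;
    std::memcpy(&Bits, &X, sizeof Bits);
    return (Bits & 0x00400000u) ? QNan : SNan; // assumed quiet-bit position
  }
  case FP_INFINITE:
    return Neg ? NegInf : PosInf;
  case FP_NORMAL:
    return Neg ? NegNormal : PosNormal;
  case FP_SUBNORMAL:
    return Neg ? NegSubnormal : PosSubnormal;
  default: // FP_ZERO
    return Neg ? NegZero : PosZero;
  }
}

// Software model of is.fpclass(x, mask).
bool isFPClass(float X, unsigned Mask) { return (classify(X) & Mask) != 0; }

// Model of the removed CHECK lines: fabs(x) compared "ordered and not equal"
// against +inf, which is false for NaN and for both infinities.
bool finiteViaFabs(float X) {
  const float Inf = std::numeric_limits<float>::infinity();
  const float A = std::fabs(X);
  return A < Inf || A > Inf;
}
```

Here 504 is every class except the NaNs and infinities, 516 is `NegInf | PosInf`, 3 is `SNan | QNan`, and 96 is the two zeros.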

[clang] [Clang][RISCV] Add missing support for `__riscv_clmulr_32/64` in `riscv_bitmanip.h` (PR #76289)

2023-12-24 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/76289


[clang] [Clang][RISCV] Use `__builtin_popcount` in `__riscv_cpop_32/64` (PR #76286)

2023-12-24 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/76286


[clang] [Clang][RISCV] Add missing support for `__builtin_riscv_cpop_32/64` (PR #76256)

2023-12-24 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/76256


[clang] [Clang][RISCV] Add missing support for `__riscv_clmulr_32/64` in `riscv_bitmanip.h` (PR #76289)

2023-12-23 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw created 
https://github.com/llvm/llvm-project/pull/76289

This patch adds support for `__riscv_clmulr_32/64` in `riscv_bitmanip.h`.
It also fixes the extension requirements of `clmul/clmulh`.

Thanks to @Liaoshihua for reporting this!
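
For readers unfamiliar with the Zbc operations, here is a minimal software sketch of what the three 32-bit carry-less multiply variants return. It is not part of the patch: the `ref_*` names are hypothetical, and the bit ranges follow my reading of the RISC-V Bitmanip specification (clmul returns the low half of the product, clmulh the high half, clmulr bits 62:31).

```c
#include <stdint.h>

/* Full carry-less product of two 32-bit operands: shifted copies of x are
   combined with XOR instead of addition. The result has at most 63
   significant bits. (Hypothetical helper, for illustration only.) */
static uint64_t ref_clmul_wide_32(uint32_t x, uint32_t y) {
  uint64_t acc = 0;
  for (int i = 0; i < 32; ++i)
    if ((y >> i) & 1u)
      acc ^= (uint64_t)x << i;
  return acc;
}

/* clmul: bits [31:0] of the product. */
static uint32_t ref_clmul_32(uint32_t x, uint32_t y) {
  return (uint32_t)ref_clmul_wide_32(x, y);
}

/* clmulh: bits [63:32] of the product. */
static uint32_t ref_clmulh_32(uint32_t x, uint32_t y) {
  return (uint32_t)(ref_clmul_wide_32(x, y) >> 32);
}

/* clmulr: bits [62:31] of the product (the "reversed" variant). */
static uint32_t ref_clmulr_32(uint32_t x, uint32_t y) {
  return (uint32_t)(ref_clmul_wide_32(x, y) >> 31);
}
```

A reference like this can be handy for cross-checking the header wrappers on a target with Zbc enabled; it says nothing about how the builtins are lowered.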
 

>From bad9203e4416f02eb03475b8874db7a999d83657 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Sat, 23 Dec 2023 22:26:29 +0800
Subject: [PATCH 1/2] [Clang][RISCV] Add missing support for
 `__riscv_clmulr_32/64` in `riscv_bitmanip.h`

---
 clang/lib/Headers/riscv_bitmanip.h | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Headers/riscv_bitmanip.h b/clang/lib/Headers/riscv_bitmanip.h
index 1a81cc8618c975..ee388de735f770 100644
--- a/clang/lib/Headers/riscv_bitmanip.h
+++ b/clang/lib/Headers/riscv_bitmanip.h
@@ -120,7 +120,23 @@ __riscv_zip_32(uint32_t __x) {
 #endif
 #endif // defined(__riscv_zbkb)
 
-#if defined(__riscv_zbkc)
+#if defined(__riscv_zbc)
+#if __riscv_xlen == 32
+static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
+__riscv_clmulr_32(uint32_t __x, uint32_t __y) {
+  return __builtin_riscv_clmulr_32(__x, __y);
+}
+#endif
+
+#if __riscv_xlen == 64
+static __inline__ uint64_t __attribute__((__always_inline__, __nodebug__))
+__riscv_clmulr_64(uint64_t __x, uint64_t __y) {
+  return __builtin_riscv_clmulr_64(__x, __y);
+}
+#endif
+#endif // defined(__riscv_zbc)
+
+#if defined(__riscv_zbkc) || defined(__riscv_zbc)
 static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
 __riscv_clmul_32(uint32_t __x, uint32_t __y) {
   return __builtin_riscv_clmul_32(__x, __y);
@@ -144,7 +160,7 @@ __riscv_clmulh_64(uint64_t __x, uint64_t __y) {
   return __builtin_riscv_clmulh_64(__x, __y);
 }
 #endif
-#endif // defined(__riscv_zbkc)
+#endif // defined(__riscv_zbkc) || defined(__riscv_zbc)
 
 #if defined(__riscv_zbkx)
 #if __riscv_xlen == 32

>From dee8cafe3c364c55dbf6bd6ce07cb65ea5522d93 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Sat, 23 Dec 2023 22:27:31 +0800
Subject: [PATCH 2/2] [Clang][RISCV] Use riscv_bitmanip.h in zbc.c. NFC.

---
 clang/test/CodeGen/RISCV/rvb-intrinsics/zbc.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/clang/test/CodeGen/RISCV/rvb-intrinsics/zbc.c b/clang/test/CodeGen/RISCV/rvb-intrinsics/zbc.c
index ae9153eff155e1..93db3a482ef2bc 100644
--- a/clang/test/CodeGen/RISCV/rvb-intrinsics/zbc.c
+++ b/clang/test/CodeGen/RISCV/rvb-intrinsics/zbc.c
@@ -6,7 +6,7 @@
 // RUN: -disable-O0-optnone | opt -S -passes=mem2reg \
 // RUN: | FileCheck %s  -check-prefix=RV64ZBC
 
-#include 
+#include <riscv_bitmanip.h>
 
 #if __riscv_xlen == 64
 // RV64ZBC-LABEL: @clmul_64(
@@ -15,7 +15,7 @@
 // RV64ZBC-NEXT:ret i64 [[TMP0]]
 //
 uint64_t clmul_64(uint64_t a, uint64_t b) {
-  return __builtin_riscv_clmul_64(a, b);
+  return __riscv_clmul_64(a, b);
 }
 
 // RV64ZBC-LABEL: @clmulh_64(
@@ -24,7 +24,7 @@ uint64_t clmul_64(uint64_t a, uint64_t b) {
 // RV64ZBC-NEXT:ret i64 [[TMP0]]
 //
 uint64_t clmulh_64(uint64_t a, uint64_t b) {
-  return __builtin_riscv_clmulh_64(a, b);
+  return __riscv_clmulh_64(a, b);
 }
 
 // RV64ZBC-LABEL: @clmulr_64(
@@ -33,7 +33,7 @@ uint64_t clmulh_64(uint64_t a, uint64_t b) {
 // RV64ZBC-NEXT:ret i64 [[TMP0]]
 //
 uint64_t clmulr_64(uint64_t a, uint64_t b) {
-  return __builtin_riscv_clmulr_64(a, b);
+  return __riscv_clmulr_64(a, b);
 }
 #endif
 
@@ -48,7 +48,7 @@ uint64_t clmulr_64(uint64_t a, uint64_t b) {
 // RV64ZBC-NEXT:ret i32 [[TMP0]]
 //
 uint32_t clmul_32(uint32_t a, uint32_t b) {
-  return __builtin_riscv_clmul_32(a, b);
+  return __riscv_clmul_32(a, b);
 }
 
 #if __riscv_xlen == 32
@@ -58,7 +58,7 @@ uint32_t clmul_32(uint32_t a, uint32_t b) {
 // RV32ZBC-NEXT:ret i32 [[TMP0]]
 //
 uint32_t clmulh_32(uint32_t a, uint32_t b) {
-  return __builtin_riscv_clmulh_32(a, b);
+  return __riscv_clmulh_32(a, b);
 }
 
 // RV32ZBC-LABEL: @clmulr_32(
@@ -67,6 +67,6 @@ uint32_t clmulh_32(uint32_t a, uint32_t b) {
 // RV32ZBC-NEXT:ret i32 [[TMP0]]
 //
 uint32_t clmulr_32(uint32_t a, uint32_t b) {
-  return __builtin_riscv_clmulr_32(a, b);
+  return __riscv_clmulr_32(a, b);
 }
 #endif



[clang] [Clang][RISCV] Use `__builtin_popcount` in `__riscv_cpop_32/64` (PR #76286)

2023-12-23 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw created 
https://github.com/llvm/llvm-project/pull/76286

This patch replaces `__builtin_riscv_cpop_32/64` with `__builtin_popcount(ll)` 
because `__builtin_riscv_cpop_32/64` is not implemented in clang.
Thanks to @Liaoshihua for reporting this!

It is an alternative to #76256.
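
As a rough illustration of the replacement described above (not the header itself), the generic GCC/Clang population-count builtins can stand in for a target-specific one. The `sketch_cpop_*` names are hypothetical, and the sketch assumes `unsigned int` is 32 bits wide and `unsigned long long` is 64 bits wide, as on the RISC-V ABIs targeted here.

```c
#include <stdint.h>

/* Count set bits in a 32-bit value using the generic builtin.
   Assumes unsigned int is 32 bits wide. */
static unsigned sketch_cpop_32(uint32_t x) {
  return (unsigned)__builtin_popcount(x);
}

/* Count set bits in a 64-bit value; the `ll` variant takes
   unsigned long long, assumed here to be 64 bits wide. */
static unsigned sketch_cpop_64(uint64_t x) {
  return (unsigned)__builtin_popcountll(x);
}
```

Because these builtins are target-independent, the compiler is free to pick the best available instruction sequence when the extension is enabled.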

>From b9c654fcc9b25fdfbd1a323f4c3820a367378e19 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Sat, 23 Dec 2023 21:56:09 +0800
Subject: [PATCH] [Clang][RISCV] Use `__builtin_popcount` in
 `__riscv_cpop_32/64`

---
 clang/lib/Headers/riscv_bitmanip.h|  4 +--
 clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c | 34 ---
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/clang/lib/Headers/riscv_bitmanip.h b/clang/lib/Headers/riscv_bitmanip.h
index 1a81cc8618c975..044cbaa037e43a 100644
--- a/clang/lib/Headers/riscv_bitmanip.h
+++ b/clang/lib/Headers/riscv_bitmanip.h
@@ -34,7 +34,7 @@ __riscv_ctz_32(uint32_t __x) {
 
 static __inline__ unsigned __attribute__((__always_inline__, __nodebug__))
 __riscv_cpop_32(uint32_t __x) {
-  return __builtin_riscv_cpop_32(__x);
+  return __builtin_popcount(__x);
 }
 
 #if __riscv_xlen == 64
@@ -55,7 +55,7 @@ __riscv_ctz_64(uint64_t __x) {
 
 static __inline__ unsigned __attribute__((__always_inline__, __nodebug__))
 __riscv_cpop_64(uint64_t __x) {
-  return __builtin_riscv_cpop_64(__x);
+  return __builtin_popcountll(__x);
 }
 #endif
 #endif // defined(__riscv_zbb)
diff --git a/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c b/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c
index 5edbc578e82e9a..fbc51b4bf144ae 100644
--- a/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c
+++ b/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c
@@ -51,8 +51,8 @@ unsigned int clz_32(uint32_t a) {
 // RV64ZBB-LABEL: @clz_64(
 // RV64ZBB-NEXT:  entry:
 // RV64ZBB-NEXT:[[TMP0:%.*]] = call i64 @llvm.ctlz.i64(i64 [[A:%.*]], i1 false)
-// RV64ZBB-NEXT:[[CAST:%.*]] = trunc i64 [[TMP0]] to i32
-// RV64ZBB-NEXT:ret i32 [[CAST]]
+// RV64ZBB-NEXT:[[CAST_I:%.*]] = trunc i64 [[TMP0]] to i32
+// RV64ZBB-NEXT:ret i32 [[CAST_I]]
 //
 unsigned int clz_64(uint64_t a) {
   return __riscv_clz_64(a);
@@ -77,10 +77,36 @@ unsigned int ctz_32(uint32_t a) {
 // RV64ZBB-LABEL: @ctz_64(
 // RV64ZBB-NEXT:  entry:
 // RV64ZBB-NEXT:[[TMP0:%.*]] = call i64 @llvm.cttz.i64(i64 [[A:%.*]], i1 false)
-// RV64ZBB-NEXT:[[CAST:%.*]] = trunc i64 [[TMP0]] to i32
-// RV64ZBB-NEXT:ret i32 [[CAST]]
+// RV64ZBB-NEXT:[[CAST_I:%.*]] = trunc i64 [[TMP0]] to i32
+// RV64ZBB-NEXT:ret i32 [[CAST_I]]
 //
 unsigned int ctz_64(uint64_t a) {
   return __riscv_ctz_64(a);
 }
 #endif
+
+// RV32ZBB-LABEL: @cpop_32(
+// RV32ZBB-NEXT:  entry:
+// RV32ZBB-NEXT:[[TMP0:%.*]] = call i32 @llvm.ctpop.i32(i32 [[A:%.*]])
+// RV32ZBB-NEXT:ret i32 [[TMP0]]
+//
+// RV64ZBB-LABEL: @cpop_32(
+// RV64ZBB-NEXT:  entry:
+// RV64ZBB-NEXT:[[TMP0:%.*]] = call i32 @llvm.ctpop.i32(i32 [[A:%.*]])
+// RV64ZBB-NEXT:ret i32 [[TMP0]]
+//
+unsigned int cpop_32(uint32_t a) {
+  return __riscv_cpop_32(a);
+}
+
+#if __riscv_xlen == 64
+// RV64ZBB-LABEL: @cpop_64(
+// RV64ZBB-NEXT:  entry:
+// RV64ZBB-NEXT:[[TMP0:%.*]] = call i64 @llvm.ctpop.i64(i64 [[A:%.*]])
+// RV64ZBB-NEXT:[[CAST_I:%.*]] = trunc i64 [[TMP0]] to i32
+// RV64ZBB-NEXT:ret i32 [[CAST_I]]
+//
+unsigned int cpop_64(uint64_t a) {
+  return __riscv_cpop_64(a);
+}
+#endif



[clang] [Clang][RISCV] Add missing support for `__builtin_riscv_cpop_32/64` (PR #76256)

2023-12-22 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw created 
https://github.com/llvm/llvm-project/pull/76256

This patch adds support for `__builtin_riscv_cpop_32/64`, which are used by 
`riscv_bitmanip.h`.
See also 
https://github.com/llvm/llvm-project/blob/04c473bea3e0f135432698fcaafab52e1fe1b5ec/clang/lib/Headers/riscv_bitmanip.h#L35-L60.
Thanks to @Liaoshihua for reporting this!


>From a69599fcda5f1a4df13ec0bfe3432ba39ef09246 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Sat, 23 Dec 2023 01:21:39 +0800
Subject: [PATCH] [Clang][RISCV] Add missing support for
 __builtin_riscv_cpop_32/64

---
 clang/include/clang/Basic/BuiltinsRISCV.def   |  2 ++
 clang/lib/CodeGen/CGBuiltin.cpp   | 10 +++
 clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c | 26 +++
 3 files changed, 38 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsRISCV.def b/clang/include/clang/Basic/BuiltinsRISCV.def
index 1528b18c82eade..1df1c53733cfa1 100644
--- a/clang/include/clang/Basic/BuiltinsRISCV.def
+++ b/clang/include/clang/Basic/BuiltinsRISCV.def
@@ -22,6 +22,8 @@ TARGET_BUILTIN(__builtin_riscv_clz_32, "UiUi", "nc", "zbb|xtheadbb")
 TARGET_BUILTIN(__builtin_riscv_clz_64, "UiUWi", "nc", "zbb|xtheadbb,64bit")
 TARGET_BUILTIN(__builtin_riscv_ctz_32, "UiUi", "nc", "zbb")
 TARGET_BUILTIN(__builtin_riscv_ctz_64, "UiUWi", "nc", "zbb,64bit")
+TARGET_BUILTIN(__builtin_riscv_cpop_32, "UiUi", "nc", "zbb")
+TARGET_BUILTIN(__builtin_riscv_cpop_64, "UiUWi", "nc", "zbb,64bit")
 
 // Zbc or Zbkc extension
 TARGET_BUILTIN(__builtin_riscv_clmul_32, "UiUiUi", "nc", "zbc|zbkc")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 5081062da2862e..64210e76ed2218 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -20696,6 +20696,8 @@ Value *CodeGenFunction::EmitRISCVBuiltinExpr(unsigned BuiltinID,
   case RISCV::BI__builtin_riscv_clz_64:
   case RISCV::BI__builtin_riscv_ctz_32:
   case RISCV::BI__builtin_riscv_ctz_64:
+  case RISCV::BI__builtin_riscv_cpop_32:
+  case RISCV::BI__builtin_riscv_cpop_64:
   case RISCV::BI__builtin_riscv_clmul_32:
   case RISCV::BI__builtin_riscv_clmul_64:
   case RISCV::BI__builtin_riscv_clmulh_32:
@@ -20735,6 +20737,14 @@ Value *CodeGenFunction::EmitRISCVBuiltinExpr(unsigned BuiltinID,
"cast");
   return Result;
 }
+case RISCV::BI__builtin_riscv_cpop_32:
+case RISCV::BI__builtin_riscv_cpop_64: {
+  Value *Result = Builder.CreateUnaryIntrinsic(Intrinsic::ctpop, Ops[0]);
+  if (Result->getType() != ResultType)
+Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/ true,
+   "cast");
+  return Result;
+}
 
 // Zbc
 case RISCV::BI__builtin_riscv_clmul_32:
diff --git a/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c b/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c
index 3a421f8c6cd421..a5715e330172bd 100644
--- a/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c
+++ b/clang/test/CodeGen/RISCV/rvb-intrinsics/zbb.c
@@ -82,3 +82,29 @@ unsigned int ctz_64(unsigned long a) {
   return __builtin_riscv_ctz_64(a);
 }
 #endif
+
+// RV32ZBB-LABEL: @cpop_32(
+// RV32ZBB-NEXT:  entry:
+// RV32ZBB-NEXT:[[TMP0:%.*]] = call i32 @llvm.ctpop.i32(i32 [[A:%.*]])
+// RV32ZBB-NEXT:ret i32 [[TMP0]]
+//
+// RV64ZBB-LABEL: @cpop_32(
+// RV64ZBB-NEXT:  entry:
+// RV64ZBB-NEXT:[[TMP0:%.*]] = call i32 @llvm.ctpop.i32(i32 [[A:%.*]])
+// RV64ZBB-NEXT:ret i32 [[TMP0]]
+//
+unsigned int cpop_32(unsigned int a) {
+  return __builtin_riscv_cpop_32(a);
+}
+
+#if __riscv_xlen == 64
+// RV64ZBB-LABEL: @cpop_64(
+// RV64ZBB-NEXT:  entry:
+// RV64ZBB-NEXT:[[TMP0:%.*]] = call i64 @llvm.ctpop.i64(i64 [[A:%.*]])
+// RV64ZBB-NEXT:[[CAST:%.*]] = trunc i64 [[TMP0]] to i32
+// RV64ZBB-NEXT:ret i32 [[CAST]]
+//
+unsigned int cpop_64(unsigned long a) {
+  return __builtin_riscv_cpop_64(a);
+}
+#endif



[llvm] [flang] [clang] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2023-12-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> arrayidx

We should teach `foldCmpLoadFromIndexedGlobal` to handle constant GEPs with i8 
source element type.
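
As background on what an i8-typed constant GEP means, the following C sketch shows that a struct-field address and a raw byte offset from the struct base name the same location. It mirrors the six-`i32` layout of `%struct.PreparedDictionary` seen in the brotli diff in this thread; the struct name and the first field name `magic` are hypothetical, and the sketch is only an analogy, not InstCombine code.

```c
#include <stddef.h>
#include <stdint.h>

/* Six 32-bit fields, matching { i32, i32, i32, i32, i32, i32 }.
   `magic` is a made-up name for field 0. */
struct PreparedDictionarySketch {
  uint32_t magic;
  uint32_t num_items;
  uint32_t source_size;
  uint32_t hash_bits;
  uint32_t bucket_bits;
  uint32_t slot_bits;
};

/* Field index 3 (hash_bits) lives 12 bytes past the base, which is what
   `getelementptr inbounds i8, ptr %p, i64 12` expresses after the
   canonicalization. Returns 1 when both address computations agree. */
static int offsets_agree(void) {
  struct PreparedDictionarySketch d = {0};
  return (void *)&d.hash_bits == (void *)((unsigned char *)&d + 12);
}
```

The struct type carries no extra information for address computation, which is why folds that pattern-match on typed GEPs need to be taught the byte-offset form.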


https://github.com/llvm/llvm-project/pull/68882


[llvm] [flang] [clang] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2023-12-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> @nikic Could you please have a look at 
> [dtcxzyw/llvm-opt-benchmark#17](https://github.com/dtcxzyw/llvm-opt-benchmark/pull/17)?
>  One regression:
> 
> ```
> diff --git a/bench/brotli/optimized/compound_dictionary.c.ll 
> b/bench/brotli/optimized/compound_dictionary.c.ll
> index 21fd37fd..b9894810 100644
> --- a/bench/brotli/optimized/compound_dictionary.c.ll
> +++ b/bench/brotli/optimized/compound_dictionary.c.ll
> @@ -3,9 +3,6 @@ source_filename = 
> "bench/brotli/original/compound_dictionary.c.ll"
>  target datalayout = 
> "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
>  target triple = "x86_64-unknown-linux-gnu"
>  
> -%struct.PreparedDictionary = type { i32, i32, i32, i32, i32, i32 }
> -%struct.CompoundDictionary = type { i64, i64, [16 x ptr], [16 x ptr], [16 x 
> i64], i64, [16 x ptr] }
> -
>  ; Function Attrs: nounwind uwtable
>  define hidden ptr @CreatePreparedDictionary(ptr noundef %m, ptr noundef 
> %source, i64 noundef %source_size) local_unnamed_addr #0 {
>  entry:
> @@ -168,25 +165,29 @@ cond.true119.i:   ; 
> preds = %for.end106.i
>  
>  cond.end123.i:; preds = %cond.true119.i, 
> %for.end106.i
>%cond124.i = phi ptr [ %call121.i, %cond.true119.i ], [ null, 
> %for.end106.i ]
> -  %arrayidx125.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
> %cond124.i, i64 1
> +  %arrayidx125.i = getelementptr inbounds i8, ptr %cond124.i, i64 24
>%arrayidx127.i = getelementptr inbounds i32, ptr %arrayidx125.i, i64 
> %idxprom.i
>%arrayidx129.i = getelementptr inbounds i16, ptr %arrayidx127.i, i64 
> %idxprom26.i
>%arrayidx131.i = getelementptr inbounds i32, ptr %arrayidx129.i, i64 
> %conv113.i
>store i32 -558043677, ptr %cond124.i, align 4
> -  %num_items.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
> %cond124.i, i64 0, i32 1
> +  %num_items.i = getelementptr inbounds i8, ptr %cond124.i, i64 4
>store i32 %add100.i, ptr %num_items.i, align 4
>%conv132.i = trunc i64 %source_size to i32
> -  %source_size133.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
> %cond124.i, i64 0, i32 2
> +  %source_size133.i = getelementptr inbounds i8, ptr %cond124.i, i64 8
>store i32 %conv132.i, ptr %source_size133.i, align 4
> -  %hash_bits134.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
> %cond124.i, i64 0, i32 3
> +  %hash_bits134.i = getelementptr inbounds i8, ptr %cond124.i, i64 12
>store i32 40, ptr %hash_bits134.i, align 4
> -  %bucket_bits135.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
> %cond124.i, i64 0, i32 4
> +  %bucket_bits135.i = getelementptr inbounds i8, ptr %cond124.i, i64 16
>store i32 %bucket_bits.0.lcssa, ptr %bucket_bits135.i, align 4
> -  %slot_bits136.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
> %cond124.i, i64 0, i32 5
> +  %slot_bits136.i = getelementptr inbounds i8, ptr %cond124.i, i64 20
>store i32 %slot_bits.0.lcssa, ptr %slot_bits136.i, align 4
>store ptr %source, ptr %arrayidx131.i, align 1
>br label %for.body140.i
>  
> +for.cond151.preheader.i:  ; preds = %for.body140.i
> +  %invariant.gep.i = getelementptr i8, ptr %arrayidx129.i, i64 -4
> +  br label %for.body154.i
> +
>  for.body140.i:; preds = %for.body140.i, 
> %cond.end123.i
>%indvars.iv145.i = phi i64 [ 0, %cond.end123.i ], [ %indvars.iv.next146.i, 
> %for.body140.i ]
>%total_items.1139.i = phi i32 [ 0, %cond.end123.i ], [ %add145.i, 
> %for.body140.i ]
> @@ -198,10 +199,10 @@ for.body140.i:; 
> preds = %for.body140.i, %con
>store i32 0, ptr %arrayidx144.i, align 4
>%indvars.iv.next146.i = add nuw nsw i64 %indvars.iv145.i, 1
>%exitcond150.not.i = icmp eq i64 %indvars.iv.next146.i, %idxprom.i
> -  br i1 %exitcond150.not.i, label %for.body154.i, label %for.body140.i, 
> !llvm.loop !9
> +  br i1 %exitcond150.not.i, label %for.cond151.preheader.i, label 
> %for.body140.i, !llvm.loop !9
>  
> -for.body154.i:; preds = %for.body140.i, 
> %for.inc204.i
> -  %indvars.iv152.i = phi i64 [ %indvars.iv.next153.i, %for.inc204.i ], [ 0, 
> %for.body140.i ]
> +for.body154.i:; preds = %for.inc204.i, 
> %for.cond151.preheader.i
> +  %indvars.iv152.i = phi i64 [ 0, %for.cond151.preheader.i ], [ 
> %indvars.iv.next153.i, %for.inc204.i ]
>%5 = trunc i64 %indvars.iv152.i to i32
>%and155.i = and i32 %sub3.i, %5
>%arrayidx158.i = getelementptr inbounds i16, ptr %arrayidx25.i, i64 
> %indvars.iv152.i
> @@ -243,7 +244,7 @@ for.body194.i:; preds 
> = %for.body194.i, %if.
>%pos.0.in140.i = phi ptr [ %arrayidx189.i, %if.end177.i ], [ 
> %arrayidx198.i, %for.body194.i ]
>%pos.0.i = load i32, ptr %pos.0.in140.i, align 4
>%inc195.i = add nuw nsw i64 %cursor.0142.i, 1
> 

[flang] [llvm] [clang] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2023-12-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

A unique regression:
```
diff --git a/bench/openssl/optimized/hexstr_test-bin-hexstr_test.ll b/bench/openssl/optimized/hexstr_test-bin-hexstr_test.ll
index 534c0a07..85a097fc 100644
--- a/bench/openssl/optimized/hexstr_test-bin-hexstr_test.ll
+++ b/bench/openssl/optimized/hexstr_test-bin-hexstr_test.ll
@@ -48,7 +48,7 @@ entry:
   %idxprom = sext i32 %test_index to i64
   %arrayidx = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom
   %0 = load ptr, ptr %arrayidx, align 16
-  %sep = getelementptr inbounds [6 x %struct.testdata], ptr @tbl_testdata, i64 
0, i64 %idxprom, i32 3
+  %sep = getelementptr inbounds i8, ptr %arrayidx, i64 24
   %1 = load i8, ptr %sep, align 8
   %call = call ptr @ossl_hexstr2buf_sep(ptr noundef %0, ptr noundef nonnull 
%len, i8 noundef signext %1) #2
   %call1 = call i32 @test_ptr(ptr noundef nonnull @.str.3, i32 noundef 71, ptr 
noundef nonnull @.str.4, ptr noundef %call) #2
@@ -57,9 +57,9 @@ entry:
 
 lor.lhs.false:; preds = %entry
   %2 = load i64, ptr %len, align 8
-  %expected = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom, i32 1
+  %expected = getelementptr inbounds i8, ptr %arrayidx, i64 8
   %3 = load ptr, ptr %expected, align 8
-  %expected_len = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom, i32 2
+  %expected_len = getelementptr inbounds i8, ptr %arrayidx, i64 16
   %4 = load i64, ptr %expected_len, align 16
   %call2 = call i32 @test_mem_eq(ptr noundef nonnull @.str.3, i32 noundef 72, 
ptr noundef nonnull @.str.5, ptr noundef nonnull @.str.6, ptr noundef %call, 
i64 noundef %2, ptr noundef %3, i64 noundef %4) #2
   %tobool3.not = icmp eq i32 %call2, 0
@@ -93,8 +93,9 @@ entry:
   store i64 0, ptr %len, align 8
   %idxprom = sext i32 %test_index to i64
   %arrayidx = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom
-  %0 = and i32 %test_index, -2
-  %cmp.not = icmp eq i32 %0, 2
+  %sep = getelementptr inbounds i8, ptr %arrayidx, i64 24
+  %0 = load i8, ptr %sep, align 8
+  %cmp.not = icmp eq i8 %0, 95
   %1 = load ptr, ptr %arrayidx, align 16
   %call28 = call ptr @OPENSSL_hexstr2buf(ptr noundef %1, ptr noundef nonnull 
%len) #2
   br i1 %cmp.not, label %if.else26, label %if.then
@@ -106,9 +107,9 @@ if.then:  ; preds = 
%entry
 
 lor.lhs.false:; preds = %if.then
   %2 = load i64, ptr %len, align 8
-  %expected = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom, i32 1
+  %expected = getelementptr inbounds i8, ptr %arrayidx, i64 8
   %3 = load ptr, ptr %expected, align 8
-  %expected_len = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom, i32 2
+  %expected_len = getelementptr inbounds i8, ptr %arrayidx, i64 16
   %4 = load i64, ptr %expected_len, align 16
   %call3 = call i32 @test_mem_eq(ptr noundef nonnull @.str.3, i32 noundef 94, 
ptr noundef nonnull @.str.5, ptr noundef nonnull @.str.6, ptr noundef %call28, 
i64 noundef %2, ptr noundef %3, i64 noundef %4) #2
   %tobool4.not = icmp eq i32 %call3, 0
@@ -122,7 +123,7 @@ lor.lhs.false5:   ; preds = 
%lor.lhs.false
   br i1 %tobool8.not, label %err, label %if.end
 
 if.end:   ; preds = %lor.lhs.false5
-  %cmp12 = icmp ult i32 %test_index, 2
+  %cmp12 = icmp eq i8 %0, 58
   br i1 %cmp12, label %if.then14, label %if.else
 
 if.then14:; preds = %if.end
@@ -171,9 +172,9 @@ entry:
 
 land.lhs.true:; preds = %entry
   %1 = load i64, ptr %len, align 8
-  %expected = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom, i32 1
+  %expected = getelementptr inbounds i8, ptr %arrayidx, i64 8
   %2 = load ptr, ptr %expected, align 8
-  %expected_len = getelementptr inbounds [6 x %struct.testdata], ptr 
@tbl_testdata, i64 0, i64 %idxprom, i32 2
+  %expected_len = getelementptr inbounds i8, ptr %arrayidx, i64 16
   %3 = load i64, ptr %expected_len, align 16
   %call3 = call i32 @test_mem_eq(ptr noundef nonnull @.str.3, i32 noundef 122, 
ptr noundef nonnull @.str.5, ptr noundef nonnull @.str.6, ptr noundef nonnull 
%buf, i64 noundef %1, ptr noundef %2, i64 noundef %3) #2
   %tobool4.not = icmp eq i32 %call3, 0

```

https://github.com/llvm/llvm-project/pull/68882


[flang] [llvm] [clang] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2023-12-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> @dtcxzyw GitHub can't display the diff, and struggles to clone the repo. Can 
> you share the diffs for just the mentioned files?

I have posted the diff between optimized IRs.


https://github.com/llvm/llvm-project/pull/68882


[clang] [flang] [llvm] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2023-12-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Another example:
```
diff --git a/bench/hermes/optimized/Sorting.cpp.ll b/bench/hermes/optimized/Sorting.cpp.ll
index 1a808c47..e03089ca 100644
--- a/bench/hermes/optimized/Sorting.cpp.ll
+++ b/bench/hermes/optimized/Sorting.cpp.ll
@@ -41,20 +41,22 @@ if.end:   ; preds = 
%entry
   %call5.i.i.i.i.i.i = tail call noalias noundef nonnull ptr @_Znwm(i64 
noundef %mul.i.i.i.i.i.i) #9
   store ptr %call5.i.i.i.i.i.i, ptr %index, align 8
   %add.ptr.i.i.i = getelementptr inbounds i32, ptr %call5.i.i.i.i.i.i, i64 
%conv
-  %_M_end_of_storage.i.i.i = getelementptr inbounds 
%"struct.std::_Vector_base>::_Vector_impl_data", ptr %index, i64 0, i32 2
+  %_M_end_of_storage.i.i.i = getelementptr inbounds i8, ptr %index, i64 16
   store ptr %add.ptr.i.i.i, ptr %_M_end_of_storage.i.i.i, align 8
   store i32 0, ptr %call5.i.i.i.i.i.i, align 4
-  %incdec.ptr.i.i.i.i.i = getelementptr i32, ptr %call5.i.i.i.i.i.i, i64 1
-  %cmp.i.i.i.i.i.i.i = icmp eq i32 %sub, 1
+  %incdec.ptr.i.i.i.i.i = getelementptr i8, ptr %call5.i.i.i.i.i.i, i64 4
+  %sub.i.i.i.i.i = add nsw i64 %conv, -1
+  %cmp.i.i.i.i.i.i.i = icmp eq i64 %sub.i.i.i.i.i, 0
   br i1 %cmp.i.i.i.i.i.i.i, label %_ZNSt6vectorIjSaIjEEC2EmRKS0_.exit, label 
%if.end.i.i.i.i.i.i.i
 
 if.end.i.i.i.i.i.i.i: ; preds = %if.end
   %1 = add nsw i64 %mul.i.i.i.i.i.i, -4
   tail call void @llvm.memset.p0.i64(ptr align 4 %incdec.ptr.i.i.i.i.i, i8 0, 
i64 %1, i1 false)
+  %add.ptr.i.i.i.i.i.i.i = getelementptr inbounds i32, ptr 
%incdec.ptr.i.i.i.i.i, i64 %sub.i.i.i.i.i
   br label %_ZNSt6vectorIjSaIjEEC2EmRKS0_.exit
 
 _ZNSt6vectorIjSaIjEEC2EmRKS0_.exit:   ; preds = %if.end, 
%if.end.i.i.i.i.i.i.i
-  %__first.addr.0.i.i.i.i.i = phi ptr [ %incdec.ptr.i.i.i.i.i, %if.end ], [ 
%add.ptr.i.i.i, %if.end.i.i.i.i.i.i.i ]
+  %__first.addr.0.i.i.i.i.i = phi ptr [ %incdec.ptr.i.i.i.i.i, %if.end ], [ 
%add.ptr.i.i.i.i.i.i.i, %if.end.i.i.i.i.i.i.i ]
   store ptr %__first.addr.0.i.i.i.i.i, ptr %0, align 8
   %cmp116.not = icmp eq i32 %end, %begin
   br i1 %cmp116.not, label %for.end, label %for.body
```

https://github.com/llvm/llvm-project/pull/68882


[flang] [llvm] [clang] [InstCombine] Canonicalize constant GEPs to i8 source element type (PR #68882)

2023-12-21 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

@nikic Could you please have a look at 
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/17?
One regression:
```
diff --git a/bench/brotli/optimized/compound_dictionary.c.ll b/bench/brotli/optimized/compound_dictionary.c.ll
index 21fd37fd..b9894810 100644
--- a/bench/brotli/optimized/compound_dictionary.c.ll
+++ b/bench/brotli/optimized/compound_dictionary.c.ll
@@ -3,9 +3,6 @@ source_filename = 
"bench/brotli/original/compound_dictionary.c.ll"
 target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
 
-%struct.PreparedDictionary = type { i32, i32, i32, i32, i32, i32 }
-%struct.CompoundDictionary = type { i64, i64, [16 x ptr], [16 x ptr], [16 x 
i64], i64, [16 x ptr] }
-
 ; Function Attrs: nounwind uwtable
 define hidden ptr @CreatePreparedDictionary(ptr noundef %m, ptr noundef 
%source, i64 noundef %source_size) local_unnamed_addr #0 {
 entry:
@@ -168,25 +165,29 @@ cond.true119.i:   ; preds 
= %for.end106.i
 
 cond.end123.i:; preds = %cond.true119.i, 
%for.end106.i
   %cond124.i = phi ptr [ %call121.i, %cond.true119.i ], [ null, %for.end106.i ]
-  %arrayidx125.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
%cond124.i, i64 1
+  %arrayidx125.i = getelementptr inbounds i8, ptr %cond124.i, i64 24
   %arrayidx127.i = getelementptr inbounds i32, ptr %arrayidx125.i, i64 
%idxprom.i
   %arrayidx129.i = getelementptr inbounds i16, ptr %arrayidx127.i, i64 
%idxprom26.i
   %arrayidx131.i = getelementptr inbounds i32, ptr %arrayidx129.i, i64 
%conv113.i
   store i32 -558043677, ptr %cond124.i, align 4
-  %num_items.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
%cond124.i, i64 0, i32 1
+  %num_items.i = getelementptr inbounds i8, ptr %cond124.i, i64 4
   store i32 %add100.i, ptr %num_items.i, align 4
   %conv132.i = trunc i64 %source_size to i32
-  %source_size133.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
%cond124.i, i64 0, i32 2
+  %source_size133.i = getelementptr inbounds i8, ptr %cond124.i, i64 8
   store i32 %conv132.i, ptr %source_size133.i, align 4
-  %hash_bits134.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
%cond124.i, i64 0, i32 3
+  %hash_bits134.i = getelementptr inbounds i8, ptr %cond124.i, i64 12
   store i32 40, ptr %hash_bits134.i, align 4
-  %bucket_bits135.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
%cond124.i, i64 0, i32 4
+  %bucket_bits135.i = getelementptr inbounds i8, ptr %cond124.i, i64 16
   store i32 %bucket_bits.0.lcssa, ptr %bucket_bits135.i, align 4
-  %slot_bits136.i = getelementptr inbounds %struct.PreparedDictionary, ptr 
%cond124.i, i64 0, i32 5
+  %slot_bits136.i = getelementptr inbounds i8, ptr %cond124.i, i64 20
   store i32 %slot_bits.0.lcssa, ptr %slot_bits136.i, align 4
   store ptr %source, ptr %arrayidx131.i, align 1
   br label %for.body140.i
 
+for.cond151.preheader.i:  ; preds = %for.body140.i
+  %invariant.gep.i = getelementptr i8, ptr %arrayidx129.i, i64 -4
+  br label %for.body154.i
+
 for.body140.i:; preds = %for.body140.i, 
%cond.end123.i
   %indvars.iv145.i = phi i64 [ 0, %cond.end123.i ], [ %indvars.iv.next146.i, 
%for.body140.i ]
   %total_items.1139.i = phi i32 [ 0, %cond.end123.i ], [ %add145.i, 
%for.body140.i ]
@@ -198,10 +199,10 @@ for.body140.i:; preds 
= %for.body140.i, %con
   store i32 0, ptr %arrayidx144.i, align 4
   %indvars.iv.next146.i = add nuw nsw i64 %indvars.iv145.i, 1
   %exitcond150.not.i = icmp eq i64 %indvars.iv.next146.i, %idxprom.i
-  br i1 %exitcond150.not.i, label %for.body154.i, label %for.body140.i, 
!llvm.loop !9
+  br i1 %exitcond150.not.i, label %for.cond151.preheader.i, label 
%for.body140.i, !llvm.loop !9
 
-for.body154.i:; preds = %for.body140.i, 
%for.inc204.i
-  %indvars.iv152.i = phi i64 [ %indvars.iv.next153.i, %for.inc204.i ], [ 0, 
%for.body140.i ]
+for.body154.i:; preds = %for.inc204.i, 
%for.cond151.preheader.i
+  %indvars.iv152.i = phi i64 [ 0, %for.cond151.preheader.i ], [ 
%indvars.iv.next153.i, %for.inc204.i ]
   %5 = trunc i64 %indvars.iv152.i to i32
   %and155.i = and i32 %sub3.i, %5
   %arrayidx158.i = getelementptr inbounds i16, ptr %arrayidx25.i, i64 
%indvars.iv152.i
@@ -243,7 +244,7 @@ for.body194.i:; preds = 
%for.body194.i, %if.
   %pos.0.in140.i = phi ptr [ %arrayidx189.i, %if.end177.i ], [ %arrayidx198.i, 
%for.body194.i ]
   %pos.0.i = load i32, ptr %pos.0.in140.i, align 4
   %inc195.i = add nuw nsw i64 %cursor.0142.i, 1
-  %arrayidx196.i = getelementptr i32, ptr %arrayidx129.i, i64 %cursor.0142.i
+  %arrayidx196.i = getelementptr inbounds i32, ptr %arrayidx129.i, i64 
%cursor.0142.i
   store i32 %pos.0.i, ptr %arrayidx196.i, align 4
   %idxprom197.i = zext 

[clang] [clang] static operators should evaluate object argument (PR #68485)

2023-12-14 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Ping.

https://github.com/llvm/llvm-project/pull/68485


[clang] [llvm] [RISCV] Add support for experimental Zimop extension (PR #74824)

2023-12-08 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

I guess you should split it into a patch series:
+ [ ] MC support (and docs)
+ [ ] Sched support
+ [ ] ISel support
+ [ ] Builtin intrinsic support in clang


https://github.com/llvm/llvm-project/pull/74824


[clang] [llvm] [ValueTracking] Add dominating condition support in computeKnownBits() (PR #73662)

2023-12-06 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.

The regression in `Shootout-C++-ary2` may be caused by ThinLTO. But I think it 
is OK to go ahead and merge :)


https://github.com/llvm/llvm-project/pull/73662


[clang] [llvm] [ValueTracking] Add dominating condition support in computeKnownBits() (PR #73662)

2023-12-04 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Looks like the regression in `DILATE` has been addressed.
Could you please have a look at 
`MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode`?


https://github.com/llvm/llvm-project/pull/73662


[clang] [llvm] [ValueTracking] Add dominating condition support in computeKnownBits() (PR #73662)

2023-12-03 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

Could you please rebase this patch on 
https://github.com/llvm/llvm-project/pull/74246 and add a test for 
https://github.com/llvm/llvm-project/issues/74242?


https://github.com/llvm/llvm-project/pull/73662

