from:"Anna Thomas via Phabricator via cfe\-commits"

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

thank you @craig.topper  and @pengfei .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

In D155145#4554786 , @anna wrote:

>> Can you capture the values of EAX, EBX, ECX, and EDX after the two calls to 
>> getX86CpuIDAndInfoEx that have 0x7 as the first argument? Maybe there's a 
>> bug in CPUID on Sandy Bridge.
>
> Sure, on the original code before the patch you suggested right?
> The two calls are:
>
>bool HasLeaf7 =
>   MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x0, , , , 
> );
>   +   llvm::errs() << "Before setting fsgsbase the value for EAX: " << EAX
>   + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " << 
> EDX
>   + << "\n";
> 
>   Features["fsgsbase"]   = HasLeaf7 && ((EBX >>  0) & 1);
>   
>   bool HasLeaf7Subleaf1 =
>   MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x1, , , , 
> );
>   +   llvm::errs() << "Before setting sha512 the value for EAX: " << EAX
>   + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " << 
> EDX
>   + << "\n";
>   Features["sha512"] = HasLeaf7Subleaf1 && ((EAX >> 0) & 1);
>   ...
>   we set avxvnniint16 after this
>
> Takes a while to get a build on this machine, should have the output soon.

@ctopper here is the output:

  Before setting fsgsbase the value for EAX: 0 EBX: 0 ECX: 0  EDX: 2617246720 
// this is after the HasLeaf7 calculation
  Before setting sha512 the value for EAX: 0 EBX: 0 ECX: 0  EDX: 2617246720 // 
this is after the HasLeaf7Subleaf1 calculation

So, with your patch `HasLeaf7Subleaf1` is 0 as EAX is 0. Pls let me know if you 
need  any additional diagnostics output (we actually lose access to the machine 
on friday, since it is being retired!).

> The documentation says that invalid subleaves of leaf 7 should return all 0s. 
> So we thought it was safe to check the bits of sub leaf 1 even if eax from 
> subleaf 0 doesn't say subleaf 1 is supported.

This means the CPUID doesn't satisfy the documentation since EDX != 0 for 
SubLeaf1?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

> Can you capture the values of EAX, EBX, ECX, and EDX after the two calls to 
> getX86CpuIDAndInfoEx that have 0x7 as the first argument? Maybe there's a bug 
> in CPUID on Sandy Bridge.

Sure, on the original code before the patch you suggested right?
The two calls are:

   bool HasLeaf7 =
  MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x0, , , , 
);
  +   llvm::errs() << "Before setting fsgsbase the value for EAX: " << EAX
  + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " << 
EDX
  + << "\n";

  Features["fsgsbase"]   = HasLeaf7 && ((EBX >>  0) & 1);
  
  bool HasLeaf7Subleaf1 =
  MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x1, , , , 
);
  +   llvm::errs() << "Before setting sha512 the value for EAX: " << EAX
  + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " << 
EDX
  + << "\n";
  Features["sha512"] = HasLeaf7Subleaf1 && ((EAX >> 0) & 1);
  ...
  we set avxvnniint16 after this

Takes a while to get a build on this machine, should have the output soon.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

In D155145#4551621 , @craig.topper 
wrote:

> In D155145#4551526 , @anna wrote:
>
>> In D155145#4544068 , @pengfei 
>> wrote:
>>
>>> In D155145#4543326 , @anna wrote:
>>>
 We see a crash bisected to this patch about using an illegal instruction. 
 Here's the CPUInfo for the machine:

   CPU info:
   current cpu id: 22
   total 32(physical cores 16) (assigned logical cores 32) (assigned 
 physical cores 16) (assigned_sockets:2 of 2) (8 cores per cpu, 2 threads 
 per core) family 6 model 45 stepping 7 microcode 0x71a, cmov, cx8, fxsr, 
 mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, aes, 
 clmul, ht, tsc, tscinvbit, tscinv, clflush
   AvgLoads: 0.30, 0.10, 0.18
   CPU Model and flags from /proc/cpuinfo:
   model name  : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
   flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
 mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall 
 nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl 
 xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl 
 vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
 tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp 
 tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts 
 md_clear flush_l1d
   Online cpus: 0-31
   Offline cpus:
   BIOS frequency limitation: 
   Frequency switch latency (ns): 2
   Available cpu frequencies: 
   Current governor: schedutil
   Core performance/turbo boost: 

 I don't see `avxvnniint16` in the flags list nor avx2. So, this 
 (relatively new) instruction shouldn't be generated for this machine. Any 
 ideas on why this might be happening?
>>>
>>> As far as I can see from the patch, the only way to generate avxvnniint16 
>>> instructions is to call its specific intrinsics explicitly. And we will 
>>> check compiling options in FE before allowing to call the intrinsics. We do 
>>> have an optimization to generate vnni instructions without intrinsics, but 
>>> we haven't extend it to avxvnniint16 so far.
>>> So I don't know what's wrong in your case, could you provide a reproducer 
>>> for your problem?
>>
>> I've investigated what is going on. With this patch, we are now passing in 
>> `+avxvnniint16` into machine attributes. With that attribute, we now 
>> generate an instruction which is illegal on sandybridge machine:
>>
>>0x3013f2af:   jmpq   0x3013f09b
>>  0x3013f2b4: mov%rax,%rdi
>>  0x3013f2b7: and$0xfff0,%rdi
>>   => 0x3013f2bb: vpbroadcastd %xmm0,%ymm2
>>  0x3013f2c0: vpbroadcastd %xmm1,%ymm3
>>
>> The instruction `vpbroadcastd %xmm0,%ymm2` requires `AVX2` CPU flag: 
>> https://www.felixcloutier.com/x86/vpbroadcast. However, the machine has only 
>> AVX flag.
>>
>> This is the complete mattr generated:
>>
>>   !3 = 
>> !{!"-mattr=-prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,+crc32,-xsaves,-avx512fp16,-sm4,+sse4.1,-avx512ifma,+xsave,-avx512pf,+sse4.2,-tsxldtrk,-ptwrite,-widekl,-sm3,-invpcid,+64bit,-xsavec,-avx512vpopcntdq,+cmov,-avx512vp2intersect,-avx512cd,-movbe,-avxvnniint8,-avx512er,-amx-int8,-kl,-sha512,-avxvnni,-rtm,-adx,-avx2,-hreset,-movdiri,-serialize,-vpclmulqdq,-avx512vl,-uintr,-clflushopt,-raoint,-cmpccxadd,-bmi,-amx-tile,+sse,-gfni,+avxvnniint16,-amx-fp16,+xsaveopt,-rdrnd,-avx512f,-amx-bf16,-avx512bf16,-avx512vnni,+cx8,-avx512bw,+sse3,-pku,-fsgsbase,-clzero,-mwaitx,-lwp,-lzcnt,-sha,-movdir64b,-wbnoinvd,-enqcmd,-prefetchwt1,-avxneconvert,-tbm,-pconfig,-amx-complex,+ssse3,+cx16,-bmi2,-fma,+popcnt,-avxifma,-f16c,-avx512bitalg,-rdpru,-clwb,+mmx,+sse2,-rdseed,-avx512vbmi2,-prefetchi,-rdpid,-fma4,-avx512vbmi,-shstk,-vaes,-waitpkg,-sgx,+fxsr,-avx512dq,-sse4a"}
>>
>> I've confirmed if we changed to `-avxvnniint16` we do not generate 
>> `vpbroadcastd`.
>>
>> W.r.t. how we get the machine attributes generated through our front-end:
>>
>>   if (!sys::getHostCPUFeatures(Features))
>> return std::move(mattr);
>> 
>>   // Fill mattr with default values.
>>   mattr.reserve(Features.getNumItems());
>>   for (auto  : Features) {
>> std::string attr(I.first());
>> mattr.emplace_back(std::string(I.second ? "+" : "-") + attr);
>>   }
>>
>> So, the problem is in getHostCPUFeatures, possibly this line from the patch 
>> : 
>> `Features["avxvnniint16"] = HasLeaf7Subleaf1 && ((EDX >> 10) & 1) && 
>> HasAVXSave;`.
>
> Does this patch help
>
>   diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
>   index 1141df09307c..11a6879fb76a 100644
>   --- a/llvm/lib/TargetParser/Host.cpp
>   +++

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-01 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

In D155145#4544068 , @pengfei wrote:

> In D155145#4543326 , @anna wrote:
>
>> We see a crash bisected to this patch about using an illegal instruction. 
>> Here's the CPUInfo for the machine:
>>
>>   CPU info:
>>   current cpu id: 22
>>   total 32(physical cores 16) (assigned logical cores 32) (assigned physical 
>> cores 16) (assigned_sockets:2 of 2) (8 cores per cpu, 2 threads per core) 
>> family 6 model 45 stepping 7 microcode 0x71a, cmov, cx8, fxsr, mmx, sse, 
>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, aes, clmul, ht, 
>> tsc, tscinvbit, tscinv, clflush
>>   AvgLoads: 0.30, 0.10, 0.18
>>   CPU Model and flags from /proc/cpuinfo:
>>   model name  : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
>>   flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
>> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
>> nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est 
>> tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
>> tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp 
>> tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts 
>> md_clear flush_l1d
>>   Online cpus: 0-31
>>   Offline cpus:
>>   BIOS frequency limitation: 
>>   Frequency switch latency (ns): 2
>>   Available cpu frequencies: 
>>   Current governor: schedutil
>>   Core performance/turbo boost: 
>>
>> I don't see `avxvnniint16` in the flags list nor avx2. So, this (relatively 
>> new) instruction shouldn't be generated for this machine. Any ideas on why 
>> this might be happening?
>
> As far as I can see from the patch, the only way to generate avxvnniint16 
> instructions is to call its specific intrinsics explicitly. And we will check 
> compiling options in FE before allowing to call the intrinsics. We do have an 
> optimization to generate vnni instructions without intrinsics, but we haven't 
> extend it to avxvnniint16 so far.
> So I don't know what's wrong in your case, could you provide a reproducer for 
> your problem?

I've investigated what is going on. With this patch, we are now passing in 
`+avxvnniint16` into machine attributes. With that attribute, we now generate 
an instruction which is illegal on sandybridge machine:

   0x3013f2af:  jmpq   0x3013f09b
 0x3013f2b4:mov%rax,%rdi
 0x3013f2b7:and$0xfff0,%rdi
  => 0x3013f2bb:vpbroadcastd %xmm0,%ymm2
 0x3013f2c0:vpbroadcastd %xmm1,%ymm3

The instruction `vpbroadcastd %xmm0,%ymm2` requires `AVX2` CPU flag: 
https://www.felixcloutier.com/x86/vpbroadcast. However, the machine has only 
AVX flag.

This is the complete mattr generated:

  !3 = 
!{!"-mattr=-prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,+crc32,-xsaves,-avx512fp16,-sm4,+sse4.1,-avx512ifma,+xsave,-avx512pf,+sse4.2,-tsxldtrk,-ptwrite,-widekl,-sm3,-invpcid,+64bit,-xsavec,-avx512vpopcntdq,+cmov,-avx512vp2intersect,-avx512cd,-movbe,-avxvnniint8,-avx512er,-amx-int8,-kl,-sha512,-avxvnni,-rtm,-adx,-avx2,-hreset,-movdiri,-serialize,-vpclmulqdq,-avx512vl,-uintr,-clflushopt,-raoint,-cmpccxadd,-bmi,-amx-tile,+sse,-gfni,+avxvnniint16,-amx-fp16,+xsaveopt,-rdrnd,-avx512f,-amx-bf16,-avx512bf16,-avx512vnni,+cx8,-avx512bw,+sse3,-pku,-fsgsbase,-clzero,-mwaitx,-lwp,-lzcnt,-sha,-movdir64b,-wbnoinvd,-enqcmd,-prefetchwt1,-avxneconvert,-tbm,-pconfig,-amx-complex,+ssse3,+cx16,-bmi2,-fma,+popcnt,-avxifma,-f16c,-avx512bitalg,-rdpru,-clwb,+mmx,+sse2,-rdseed,-avx512vbmi2,-prefetchi,-rdpid,-fma4,-avx512vbmi,-shstk,-vaes,-waitpkg,-sgx,+fxsr,-avx512dq,-sse4a"}

I've confirmed if we changed to `-avxvnniint16` we do not generate 
`vpbroadcastd`.

W.r.t. how we get the machine attributes generated through our front-end:

  if (!sys::getHostCPUFeatures(Features))
return std::move(mattr);

  // Fill mattr with default values.
  mattr.reserve(Features.getNumItems());
  for (auto  : Features) {
std::string attr(I.first());
mattr.emplace_back(std::string(I.second ? "+" : "-") + attr);
  }

So, the problem is in getHostCPUFeatures, possibly this line from the patch : 
`Features["avxvnniint16"] = HasLeaf7Subleaf1 && ((EDX >> 10) & 1) && 
HasAVXSave;`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-07-28 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

We see a crash bisected to this patch about using an illegal instruction. 
Here's the CPUInfo for the machine:

  CPU info:
  current cpu id: 22
  total 32(physical cores 16) (assigned logical cores 32) (assigned physical 
cores 16) (assigned_sockets:2 of 2) (8 cores per cpu, 2 threads per core) 
family 6 model 45 stepping 7 microcode 0x71a, cmov, cx8, fxsr, mmx, sse, sse2, 
sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, aes, clmul, ht, tsc, 
tscinvbit, tscinv, clflush
  AvgLoads: 0.30, 0.10, 0.18
  CPU Model and flags from /proc/cpuinfo:
  model name  : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est 
tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp 
tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear 
flush_l1d
  Online cpus: 0-31
  Offline cpus:
  BIOS frequency limitation: 
  Frequency switch latency (ns): 2
  Available cpu frequencies: 
  Current governor: schedutil
  Core performance/turbo boost: 

I don't see `avxvnniint16` in the flags list nor avx2. So, this (relatively 
new) instruction shouldn't be generated for this machine. Any ideas on why this 
might be happening?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-04-02 Thread Anna Thomas via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGbf7a16a76871: [InlineFunction] Update valid return 
attributes at callsite within callee body (authored by anna).

Changed prior to commit:
  https://reviews.llvm.org/D76140?vs=254222=254573#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,223 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK-LABEL: @callee(
+; CHECK-NEXT:[[R:%.*]] = call i8* @foo(i8* noalias [[P:%.*]])
+; CHECK-NEXT:ret i8* [[R]]
+;
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call nonnull i8* @foo(i8* noalias [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:[[COND_I:%.*]] = icmp ne i8* [[R_I]], null
+; CHECK-NEXT:call void (i1, ...) @llvm.experimental.guard(i1 [[COND_I]]) [ "deopt"() ]
+; CHECK-NEXT:[[R_I1:%.*]] = call i8* @bar(i8* [[GEP]])
+; CHECK-NEXT:[[COND_I2:%.*]] = icmp ne i8* [[R_I1]], null
+; CHECK-NEXT:br i1 [[COND_I2]], label [[RET_I:%.*]], label [[ORIG_I:%.*]]
+; CHECK:   ret.i:
+; CHECK-NEXT:br label [[CALLEE_WITH_EXPLICIT_CONTROL_FLOW_EXIT:%.*]]
+; CHECK:   orig.i:
+; CHECK-NEXT:br label [[CALLEE_WITH_EXPLICIT_CONTROL_FLOW_EXIT]]
+; CHECK:   callee_with_explicit_control_flow.exit:
+; CHECK-NEXT:[[Q3:%.*]] = phi i8* [ [[R_I1]], [[RET_I]] ], [ [[GEP]], [[ORIG_I]] ]
+; CHECK-NEXT:br i1 [[COND:%.*]], label [[PRET:%.*]], label [[QRET:%.*]]
+; CHECK:   pret:
+; CHECK-NEXT:ret i8* [[R_I]]
+; CHECK:   qret:
+; CHECK-NEXT:ret i8* [[Q3]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller3(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call noalias dereferenceable_or_null(12) i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @test4(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:[[V_I:%.*]] = call i8* @inf_loop_call(i8* [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-04-02 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

In D76140#1957416 , @reames wrote:

> LGTM again, with minor change.


will update it.

> p.s. Sorry for missing the functional issue the first time.  All of the test 
> changes should have made the issue obvious, but despite reading the LangRef 
> description of signext, I somehow managed to miss the separation between ABI 
> and optimization attributes.

thanks for the review Philip and pointing out the problem. All of us had missed 
the functional issue the first time around.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-04-01 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 254222.
anna added a comment.

fixed missing code left out during rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,223 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK-LABEL: @callee(
+; CHECK-NEXT:[[R:%.*]] = call i8* @foo(i8* noalias [[P:%.*]])
+; CHECK-NEXT:ret i8* [[R]]
+;
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call nonnull i8* @foo(i8* noalias [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:[[COND_I:%.*]] = icmp ne i8* [[R_I]], null
+; CHECK-NEXT:call void (i1, ...) @llvm.experimental.guard(i1 [[COND_I]]) [ "deopt"() ]
+; CHECK-NEXT:[[R_I1:%.*]] = call i8* @bar(i8* [[GEP]])
+; CHECK-NEXT:[[COND_I2:%.*]] = icmp ne i8* [[R_I1]], null
+; CHECK-NEXT:br i1 [[COND_I2]], label [[RET_I:%.*]], label [[ORIG_I:%.*]]
+; CHECK:   ret.i:
+; CHECK-NEXT:br label [[CALLEE_WITH_EXPLICIT_CONTROL_FLOW_EXIT:%.*]]
+; CHECK:   orig.i:
+; CHECK-NEXT:br label [[CALLEE_WITH_EXPLICIT_CONTROL_FLOW_EXIT]]
+; CHECK:   callee_with_explicit_control_flow.exit:
+; CHECK-NEXT:[[Q3:%.*]] = phi i8* [ [[R_I1]], [[RET_I]] ], [ [[GEP]], [[ORIG_I]] ]
+; CHECK-NEXT:br i1 [[COND:%.*]], label [[PRET:%.*]], label [[QRET:%.*]]
+; CHECK:   pret:
+; CHECK-NEXT:ret i8* [[R_I]]
+; CHECK:   qret:
+; CHECK-NEXT:ret i8* [[Q3]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller3(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call noalias dereferenceable_or_null(12) i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @test4(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:[[V_I:%.*]] = call i8* @inf_loop_call(i8* [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee_with_sideeffect_callsite(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @baz(i8*) nounwind readonly
+define internal i8* @callee5(i8* %p)

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-04-01 Thread Anna Thomas via Phabricator via cfe-commits

anna requested review of this revision.
anna added a comment.

fixed buildbot failure. see above comments and added testcase `test8`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-04-01 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 254213.
anna added a comment.
This revision is now accepted and ready to land.

whitelist valid return attributes and only add those. Added testcase for 
signext.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,223 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK-LABEL: @callee(
+; CHECK-NEXT:[[R:%.*]] = call i8* @foo(i8* noalias [[P:%.*]])
+; CHECK-NEXT:ret i8* [[R]]
+;
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call nonnull i8* @foo(i8* noalias [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:[[COND_I:%.*]] = icmp ne i8* [[R_I]], null
+; CHECK-NEXT:call void (i1, ...) @llvm.experimental.guard(i1 [[COND_I]]) [ "deopt"() ]
+; CHECK-NEXT:[[R_I1:%.*]] = call i8* @bar(i8* [[GEP]])
+; CHECK-NEXT:[[COND_I2:%.*]] = icmp ne i8* [[R_I1]], null
+; CHECK-NEXT:br i1 [[COND_I2]], label [[RET_I:%.*]], label [[ORIG_I:%.*]]
+; CHECK:   ret.i:
+; CHECK-NEXT:br label [[CALLEE_WITH_EXPLICIT_CONTROL_FLOW_EXIT:%.*]]
+; CHECK:   orig.i:
+; CHECK-NEXT:br label [[CALLEE_WITH_EXPLICIT_CONTROL_FLOW_EXIT]]
+; CHECK:   callee_with_explicit_control_flow.exit:
+; CHECK-NEXT:[[Q3:%.*]] = phi i8* [ [[R_I1]], [[RET_I]] ], [ [[GEP]], [[ORIG_I]] ]
+; CHECK-NEXT:br i1 [[COND:%.*]], label [[PRET:%.*]], label [[QRET:%.*]]
+; CHECK:   pret:
+; CHECK-NEXT:ret i8* [[R_I]]
+; CHECK:   qret:
+; CHECK-NEXT:ret i8* [[Q3]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller3(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call noalias dereferenceable_or_null(12) i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @test4(
+; CHECK-NEXT:[[GEP:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[X:%.*]]
+; CHECK-NEXT:[[R_I:%.*]] = call i8* @foo(i8* [[GEP]])
+; CHECK-NEXT:[[V_I:%.*]] = call i8* @inf_loop_call(i8* [[GEP]])
+; CHECK-NEXT:ret i8* [[R_I]]
+;
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee_with_sideeffect_callsite(i8* %gep)
+  ret i8* %p
+}

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-31 Thread Anna Thomas via Phabricator via cfe-commits

anna reopened this revision.
anna added a comment.
This revision is now accepted and ready to land.

In D76140#1953201 , @anna wrote:

> I got a failure in one of the binaryFormats:
>  lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/MsgPackReader.cpp
>
>   Attributes 'zeroext and signext' are incompatible!
> %rev.i.i.i.i.i.i.i.i = tail call signext zeroext i16 @llvm.bswap.i16(i16 
> %ret.0.copyload.i.i.i.i) #6
>   in function _ZN4llvm7msgpack6Reader7readIntIsEENS_8ExpectedIbEERNS0_6ObjectE
>   fatal error: error in backend: Broken function found, compilation aborted!
>
>
> I believe this just showcases undefined behaviour since we were having a 
> returnValue (i.e. call) with an incomptable attribute compared to the return 
> attribute on the callsite.

The last statement is not true. Had a discussion offline with Philip and he 
pointed out that we missed the fact that attributes such as `signext` and 
`zeroext` are part of the *call* itself. We cannot propagate these attributes 
into the callee since such attributes are part of the ABI for the call it is 
attached to. 
I'm reopening this review to fix this issue.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-31 Thread Anna Thomas via Phabricator via cfe-commits

anna marked an inline comment as done.
anna added inline comments.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1175
+  continue;
+// Sanity check that the cloned return instruction exists and is a return
+// instruction itself.

anna wrote:
> anna wrote:
> > reames wrote:
> > > Ok, after staring at it a bit, I've convinced myself the code here is 
> > > correct, just needlessly conservative.
> > > 
> > > What you're doing is:
> > > If the callees return instruction and returned call both map to the same 
> > > instructions once inlined, determine whether there's a possible exit 
> > > between the inlined copy.
> > > 
> > > What you could be doing:
> > > If the callee returns a call, check if *in the callee* there's a possible 
> > > exit between call and return, then apply attribute to cloned call.
> > > 
> > > The key difference is when the caller directly returns the result vs uses 
> > > it locally.  The result here is that your transform is much more narrow 
> > > in applicability than it first appears.
> > yes, thanks for pointing it out. I realized it after our offline discussion 
> > :) 
> > For now, I will add a FIXME testcase which showcases the difference in code 
> > and handle that testcase in a followon change. 
> > The key difference is when the caller directly returns the result vs uses 
> > it locally. The result here is that your transform is much more narrow in 
> > applicability than it first appears.
> I tried multiple test cases to showcase the difference between the two ideas 
> above but failed. Specifically, `simplifyInstruction` used during inlining 
> the callee is not too great at optimizing the body. For example, see added 
> testcase `test7`. 
> 
> I also tried the less restrictive version (check the safety of the 
> optimization in the callee itself, and do the attribute update on the cloned 
> instruction), but didn't see any testcases in clang that needed update. Of 
> course, that doesn't mean anything :)
> 
> 
Clarified this with Philip offline. The current patch is not restrictive. In 
fact, now that I think of it, sometimes, it may be better - 
`simplifyInstruction` can fold away instructions and reduce the "window size" 
between the RV and the ReturnInst.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-31 Thread Anna Thomas via Phabricator via cfe-commits

anna planned changes to this revision.
anna added a comment.

see above comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-31 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

I got a failure in one of the binaryFormats:
lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/MsgPackReader.cpp

  Attributes 'zeroext and signext' are incompatible!
%rev.i.i.i.i.i.i.i.i = tail call signext zeroext i16 @llvm.bswap.i16(i16 
%ret.0.copyload.i.i.i.i) #6
  in function _ZN4llvm7msgpack6Reader7readIntIsEENS_8ExpectedIbEERNS0_6ObjectE
  fatal error: error in backend: Broken function found, compilation aborted!

I believe this just showcases undefined behaviour since we were having a 
returnValue (i.e. call) with an incomptable attribute compared to the return 
attribute on the callsite.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-31 Thread Anna Thomas via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG28518d9ae39f: [InlineFunction] Handle return attributes on 
call within inlined body (authored by anna).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  clang/test/CodeGen/builtins-systemz-zvector.c
  clang/test/CodeGen/builtins-systemz-zvector2.c
  clang/test/CodeGen/movbe-builtins.c
  clang/test/CodeGen/rot-intrinsics.c
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,159 @@
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK: @callee(
+; CHECK: call i8* @foo(i8* noalias %p)
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller
+; CHECK: call nonnull i8* @foo(i8* noalias
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_throwable
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_explicit_control_flow
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2
+; CHECK: call i8* @foo
+; CHECK: call i8* @bar
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+; CHECK-NOT: callee3
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: caller3
+; CHECK: call noalias dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_sideeffect_callsite
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test4
+; CHECK: call i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee_with_sideeffect_callsite(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @baz(i8*) nounwind readonly
+define internal i8* @callee5(i8* %p) alwaysinline {
+; CHECK-NOT: callee5
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @baz(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to foo.
+define i8* @test5(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test5
+; CHECK: call dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %s = call dereferenceable_or_null(12) i8* @callee5(i8* %gep)
+  ret i8* %s
+}
+
+; deref attributes have different values on the callee and the call feeding into
+; the return.
+; AttrBuilder chooses the already existing value and does not overwrite it.
+define internal i8* @callee6(i8* %p) alwaysinline {
+; CHECK-NOT: callee6
+  %r = call dereferenceable_or_null(16) i8* @foo(i8* %p)
+  %v = call i8* @baz(i8* %p)
+  ret i8* %r
+}
+
+
+define i8* @test6(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test6
+; CHECK: call dereferenceable_or_null(16) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %s = call dereferenceable_or_null(12) i8* @callee6(i8* %gep)
+  ret i8* %s
+}
+
+; We add the attributes from the callee to both the calls below.
+define internal i8* @callee7(i8 *%ptr, i1 %cond) alwaysinline {
+; CHECK-NOT: @callee7(
+  br i1 %cond, label %pass, label %fail
+
+pass:
+  %r = call i8* @foo(i8* noalias %ptr)
+  ret i8* %r
+
+fail:

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-30 Thread Anna Thomas via Phabricator via cfe-commits

anna marked 3 inline comments as done.
anna added inline comments.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1175
+  continue;
+// Sanity check that the cloned return instruction exists and is a return
+// instruction itself.

anna wrote:
> reames wrote:
> > Ok, after staring at it a bit, I've convinced myself the code here is 
> > correct, just needlessly conservative.
> > 
> > What you're doing is:
> > If the callees return instruction and returned call both map to the same 
> > instructions once inlined, determine whether there's a possible exit 
> > between the inlined copy.
> > 
> > What you could be doing:
> > If the callee returns a call, check if *in the callee* there's a possible 
> > exit between call and return, then apply attribute to cloned call.
> > 
> > The key difference is when the caller directly returns the result vs uses 
> > it locally.  The result here is that your transform is much more narrow in 
> > applicability than it first appears.
> yes, thanks for pointing it out. I realized it after our offline discussion 
> :) 
> For now, I will add a FIXME testcase which showcases the difference in code 
> and handle that testcase in a followon change. 
> The key difference is when the caller directly returns the result vs uses it 
> locally. The result here is that your transform is much more narrow in 
> applicability than it first appears.
I tried multiple test cases to showcase the difference between the two ideas 
above but failed. Specifically, `simplifyInstruction` used during inlining the 
callee is not too great at optimizing the body. For example, see added testcase 
`test7`. 

I also tried the less restrictive version (check the safety of the optimization 
in the callee itself, and do the attribute update on the cloned instruction), 
but didn't see any testcases in clang that needed update. Of course, that 
doesn't mean anything :)





Comment at: llvm/test/Transforms/Inline/ret_attr_update.ll:112
+  ret i8* %s
+}

anna wrote:
> reames wrote:
> > There's a critical missing test case here:
> > - Callee and caller have the same attributes w/different values (i.e. deref)
> > 
> > And thinking through the code, I think there might be a bug here.  It's not 
> > a serious one, but the if the callee specifies a larger deref than the 
> > caller, it looks like the the smaller value is being written over the 
> > larger.
> > 
> > Actually, digging through the attribute code, I think I'm wrong about the 
> > bug.  However, you should definitely write the test to confirm and document 
> > merging behaviour!
> > 
> > If it does turn out I'm correct, I'm fine with this being addressed in a 
> > follow up patch provided that the test is added in this one and isn't a 
> > functional issue.  
> will check this.
added test case and documented merge behaviour. No bug in code, since we use 
the already existing value on attribute.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-30 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 253704.
anna added a comment.

addressed review comments. Added two test cases: deref value being different, 
inlined callee body better optimized compared to callee.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  clang/test/CodeGen/builtins-systemz-zvector.c
  clang/test/CodeGen/builtins-systemz-zvector2.c
  clang/test/CodeGen/movbe-builtins.c
  clang/test/CodeGen/rot-intrinsics.c
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,159 @@
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK: @callee(
+; CHECK: call i8* @foo(i8* noalias %p)
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller
+; CHECK: call nonnull i8* @foo(i8* noalias
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_throwable
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_explicit_control_flow
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2
+; CHECK: call i8* @foo
+; CHECK: call i8* @bar
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+; CHECK-NOT: callee3
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: caller3
+; CHECK: call noalias dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_sideeffect_callsite
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test4
+; CHECK: call i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee_with_sideeffect_callsite(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @baz(i8*) nounwind readonly
+define internal i8* @callee5(i8* %p) alwaysinline {
+; CHECK-NOT: callee5
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @baz(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to foo.
+define i8* @test5(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test5
+; CHECK: call dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %s = call dereferenceable_or_null(12) i8* @callee5(i8* %gep)
+  ret i8* %s
+}
+
+; deref attributes have different values on the callee and the call feeding into
+; the return.
+; AttrBuilder chooses the already existing value and does not overwrite it.
+define internal i8* @callee6(i8* %p) alwaysinline {
+; CHECK-NOT: callee6
+  %r = call dereferenceable_or_null(16) i8* @foo(i8* %p)
+  %v = call i8* @baz(i8* %p)
+  ret i8* %r
+}
+
+
+define i8* @test6(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test6
+; CHECK: call dereferenceable_or_null(16) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %s = call dereferenceable_or_null(12) i8* @callee6(i8* %gep)
+  ret i8* %s
+}
+
+; We add the attributes from the callee to both the calls below.
+define internal i8* @callee7(i8 *%ptr, i1 %cond) alwaysinline {
+; CHECK-NOT: @callee7(
+  br i1 %cond, label %pass, label %fail
+
+pass:
+  %r = call i8* @foo(i8* noalias %ptr)
+  ret i8* %r
+

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-30 Thread Anna Thomas via Phabricator via cfe-commits

anna marked 3 inline comments as done.
anna added inline comments.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1159
+
+  auto MayContainThrowingOrExitingCall = [&](Instruction *RVal,
+ Instruction *RInst) {

reames wrote:
> Pull this out as a static helper instead of a lambda, add an assert 
> internally that the two instructions are in the same block.  
> 
> Why?  Because I'm 80% sure the state capture on the lambda isn't needed, and 
> having it as a separate function forces that discipline.  
agreed. I'll do that in this change itself before landing. I am using this 
static helper in followon change D76792.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1175
+  continue;
+// Sanity check that the cloned return instruction exists and is a return
+// instruction itself.

reames wrote:
> Ok, after staring at it a bit, I've convinced myself the code here is 
> correct, just needlessly conservative.
> 
> What you're doing is:
> If the callees return instruction and returned call both map to the same 
> instructions once inlined, determine whether there's a possible exit between 
> the inlined copy.
> 
> What you could be doing:
> If the callee returns a call, check if *in the callee* there's a possible 
> exit between call and return, then apply attribute to cloned call.
> 
> The key difference is when the caller directly returns the result vs uses it 
> locally.  The result here is that your transform is much more narrow in 
> applicability than it first appears.
yes, thanks for pointing it out. I realized it after our offline discussion :) 
For now, I will add a FIXME testcase which showcases the difference in code and 
handle that testcase in a followon change. 



Comment at: llvm/test/Transforms/Inline/ret_attr_update.ll:112
+  ret i8* %s
+}

reames wrote:
> There's a critical missing test case here:
> - Callee and caller have the same attributes w/different values (i.e. deref)
> 
> And thinking through the code, I think there might be a bug here.  It's not a 
> serious one, but the if the callee specifies a larger deref than the caller, 
> it looks like the the smaller value is being written over the larger.
> 
> Actually, digging through the attribute code, I think I'm wrong about the 
> bug.  However, you should definitely write the test to confirm and document 
> merging behaviour!
> 
> If it does turn out I'm correct, I'm fine with this being addressed in a 
> follow up patch provided that the test is added in this one and isn't a 
> functional issue.  
will check this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-26 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 252868.
anna added a comment.

NFC w.r.t prev diff. Use VMap.lookup instead of a lambda which does the same.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  clang/test/CodeGen/builtins-systemz-zvector.c
  clang/test/CodeGen/builtins-systemz-zvector2.c
  clang/test/CodeGen/movbe-builtins.c
  clang/test/CodeGen/rot-intrinsics.c
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,112 @@
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK: @callee(
+; CHECK: call i8* @foo(i8* noalias %p)
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller
+; CHECK: call nonnull i8* @foo(i8* noalias
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_throwable
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_explicit_control_flow
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2
+; CHECK: call i8* @foo
+; CHECK: call i8* @bar
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+; CHECK-NOT: callee3
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: caller3
+; CHECK: call noalias dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_sideeffect_callsite
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test4
+; CHECK: call i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee_with_sideeffect_callsite(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @baz(i8*) nounwind readonly
+define internal i8* @callee5(i8* %p) alwaysinline {
+; CHECK-NOT: callee5
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @baz(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to foo.
+define i8* @test5(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test5
+; CHECK: call dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %s = call dereferenceable_or_null(12) i8* @callee5(i8* %gep)
+  ret i8* %s
+}
Index: llvm/lib/Transforms/Utils/InlineFunction.cpp
===
--- llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -80,11 +80,21 @@
   cl::Hidden,
   cl::desc("Convert noalias attributes to metadata during inlining."));
 
+static cl::opt UpdateReturnAttributes(
+"update-return-attrs", cl::init(true), cl::Hidden,
+cl::desc("Update return attributes on calls within inlined body"));
+
 static cl::opt
 PreserveAlignmentAssumptions("preserve-alignment-assumptions-during-inlining",
   cl::init(true), cl::Hidden,
   cl::desc("Convert align attributes to assumptions during inlining."));
 
+static cl::opt MaxInstCheckedForThrow(
+"max-inst-checked-for-throw-during-inlining", cl::Hidden,
+cl::desc("the maximum number of instructions analyzed for may throw during "
+ "attribute inference in

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-25 Thread Anna Thomas via Phabricator via cfe-commits

anna added a comment.

ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-23 Thread Anna Thomas via Phabricator via cfe-commits

anna marked an inline comment as done.
anna added inline comments.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1172
+  if (NumInstChecked++ > MaxInstCheckedForThrow ||
+  isGuaranteedToTransferExecutionToSuccessor())
+return true;

lebedev.ri wrote:
> anna wrote:
> > lebedev.ri wrote:
> > > anna wrote:
> > > > Noticed while adding couple more tests, there are 2 bugs here:
> > > > 1. the `isGuaranteedToTransferExecutionToSuccessor` check should be 
> > > > inverted
> > > > 2.  make_range should be until the return instruction - so we do not 
> > > > want `std::prev` on the returnInstruction. what's needed is: 
> > > > `make_range(RVal->getIterator(), RInst->getIterator())`
> > > > This means that from the callsite until (and excluding) the return 
> > > > instruction should be guaranteed to transfer execution to successor - 
> > > > only then we can backward propagate the attribute to that callsite.
> > > Are you aware of `llvm::isValidAssumeForContext()`?
> > > All this (including pitfalls) sound awfully close to that function.
> > as stated in a previous comment (https://reviews.llvm.org/D76140#1922292), 
> > adding `Assumes` here for simple cases seems like an overkill. It has 
> > significant IR churn and it also adds a use for something which can be 
> > easily inferred. 
> > Consider optimizations that depend on facts such as `hasOneUse` or a 
> > limited number of uses. We will now be inhibiting those optimizations. 
> While i venomously disagree with the avoidance of the usage of one versatile 
> interface
> and hope things will change once there's more progress on attributor & assume 
> bundles,
> in this case, as it can be seen even from the signature of the 
> `isValidAssumeForContext()` function,
> it implies/forces nothing about using assumes, but only performs a validity 
> checking,
> similar to the `MayContainThrowingOrExitingCall()`
> 
> https://github.com/llvm/llvm-project/blob/ca04d0c8fd269978be1c13fe1241172cdfe6a6ea/llvm/lib/Analysis/ValueTracking.cpp#L603
> 
> That being said, i haven't reviewed this code so maybe there's some 
> differences here
> that make that function unapplicable here.
> 
`isValidAssumeForContext(Inv, CxtI, DT)` does not force anything about assumes, 
but AFAICT all code which uses this function either has some sort of guard in 
the caller that the instruction is an assume. Also, the comments in the code 
state that it is for an assume. In fact, I believe if we intend to use that 
function more widely for other purposes, we should rename the function before 
using it (just a thought), and currently we should assert that `Inv` is an 
assume. It captures the intent of the function.

That being said, I checked the code in `isValidAssumeForContext` and it does 
not fit the bill here for multiple reasons. We either do:
1. `isValidAssumeForContext(RVal /* Inv */, RInst /* CxtI  */)` which fails 
when we do not have DT and just return true when RVal comes before RInst - this 
is always the case, since RVal will come before RInst.
2. `isValidAssumeForContext(RInst /* Inv*/, RVal /* CxtI*/)` and it fails at 
the `!isEphemeralValueOf(Inv /* RI */, CxtI /* RV*/)` check.  
(By fail here, I mean, it does not have the same behaviour as 
`MayContainThrowingOrExitingCall`).



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-22 Thread Anna Thomas via Phabricator via cfe-commits

anna marked an inline comment as done.
anna added inline comments.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1172
+  if (NumInstChecked++ > MaxInstCheckedForThrow ||
+  isGuaranteedToTransferExecutionToSuccessor())
+return true;

lebedev.ri wrote:
> anna wrote:
> > Noticed while adding couple more tests, there are 2 bugs here:
> > 1. the `isGuaranteedToTransferExecutionToSuccessor` check should be inverted
> > 2.  make_range should be until the return instruction - so we do not want 
> > `std::prev` on the returnInstruction. what's needed is: 
> > `make_range(RVal->getIterator(), RInst->getIterator())`
> > This means that from the callsite until (and excluding) the return 
> > instruction should be guaranteed to transfer execution to successor - only 
> > then we can backward propagate the attribute to that callsite.
> Are you aware of `llvm::isValidAssumeForContext()`?
> All this (including pitfalls) sound awfully close to that function.
as stated in a previous comment (https://reviews.llvm.org/D76140#1922292), 
adding `Assumes` here for simple cases seems like an overkill. It has 
significant IR churn and it also adds a use for something which can be easily 
inferred. 
Consider optimizations that depend on facts such as `hasOneUse` or a limited 
number of uses. We will now be inhibiting those optimizations. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-22 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 251901.
anna added a comment.

Noticed while adding couple more tests, there were 2 bugs:

1 the isGuaranteedToTransferExecutionToSuccessor check should be inverted

2. make_range should be until the return instruction - so we do not want 
std::prev on the returnInstruction. what's needed is: 
make_range(RVal->getIterator(), RInst->getIterator())

This means that from the callsite until (and excluding) the return instruction 
should be guaranteed to transfer execution to successor - only then we can 
backward propagate the attribute to that callsite.
Updated patch and added test cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  clang/test/CodeGen/builtins-systemz-zvector.c
  clang/test/CodeGen/builtins-systemz-zvector2.c
  clang/test/CodeGen/movbe-builtins.c
  clang/test/CodeGen/rot-intrinsics.c
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,112 @@
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*) argmemonly nounwind
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK: @callee(
+; CHECK: call i8* @foo(i8* noalias %p)
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller
+; CHECK: call nonnull i8* @foo(i8* noalias
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo
+; because the guard is a throwing call
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_throwable
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+declare i8* @bar(i8*) readonly nounwind
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_explicit_control_flow
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2
+; CHECK: call i8* @foo
+; CHECK: call i8* @bar
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+; CHECK-NOT: callee3
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: caller3
+; CHECK: call noalias dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @inf_loop_call(i8*) nounwind
+; We cannot propagate attributes to foo because we do not know whether inf_loop_call
+; will return execution.
+define internal i8* @callee_with_sideeffect_callsite(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_sideeffect_callsite
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @inf_loop_call(i8* %p)
+  ret i8* %r
+}
+
+; do not add deref attribute to foo
+define i8* @test4(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test4
+; CHECK: call i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee_with_sideeffect_callsite(i8* %gep)
+  ret i8* %p
+}
+
+declare i8* @baz(i8*) nounwind readonly
+define internal i8* @callee5(i8* %p) alwaysinline {
+; CHECK-NOT: callee5
+  %r = call i8* @foo(i8* %p)
+  %v = call i8* @baz(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to foo.
+define i8* @test5(i8* %ptr, i64 %x) {
+; CHECK-LABEL: test5
+; CHECK: call dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %s = call dereferenceable_or_null(12) i8* @callee5(i8* %gep)
+  ret i8* %s
+}
Index: llvm/lib/Transforms/Utils/InlineFunction.cpp
===
--- llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -80,11 +80,21 @@
   cl::Hidden,
   cl::desc("Convert noalias attributes to metadata during inlining."));
 
+static cl::opt UpdateReturnAttributes(
+"update-return-attrs", cl::init(true), cl::Hidden,
+cl::desc("Update

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-22 Thread Anna Thomas via Phabricator via cfe-commits

anna marked an inline comment as done.
anna added inline comments.



Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1172
+  if (NumInstChecked++ > MaxInstCheckedForThrow ||
+  isGuaranteedToTransferExecutionToSuccessor())
+return true;

Noticed while adding couple more tests, there are 2 bugs here:
1. the `isGuaranteedToTransferExecutionToSuccessor` check should be inverted
2.  make_range should be until the return instruction - so we do not want 
`std::prev` on the returnInstruction. what's needed is: 
`make_range(RVal->getIterator(), RInst->getIterator())`
This means that from the callsite until (and excluding) the return instruction 
should be guaranteed to transfer execution to successor - only then we can 
backward propagate the attribute to that callsite.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-20 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 251615.
anna added a comment.

use isGuaranteedToTransferExecutionToSuccessor instead of `MayThrow`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  clang/test/CodeGen/builtins-systemz-zvector.c
  clang/test/CodeGen/builtins-systemz-zvector2.c
  clang/test/CodeGen/movbe-builtins.c
  clang/test/CodeGen/rot-intrinsics.c
  clang/test/CodeGen/waitpkg.c
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,75 @@
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*)
+declare i8* @bar(i8*)
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK: @callee(
+; CHECK: call i8* @foo(i8* noalias %p)
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller
+; CHECK: call nonnull i8* @foo(i8* noalias
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo.
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_throwable
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_explicit_control_flow
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2
+; CHECK: call i8* @foo
+; CHECK: call i8* @bar
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+; CHECK-NOT: callee3
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: caller3
+; CHECK: call noalias dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
Index: llvm/lib/Transforms/Utils/InlineFunction.cpp
===
--- llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -80,11 +80,21 @@
   cl::Hidden,
   cl::desc("Convert noalias attributes to metadata during inlining."));
 
+static cl::opt UpdateReturnAttributes(
+"update-return-attrs", cl::init(true), cl::Hidden,
+cl::desc("Update return attributes on calls within inlined body"));
+
 static cl::opt
 PreserveAlignmentAssumptions("preserve-alignment-assumptions-during-inlining",
   cl::init(true), cl::Hidden,
   cl::desc("Convert align attributes to assumptions during inlining."));
 
+static cl::opt MaxInstCheckedForThrow(
+"max-inst-checked-for-throw-during-inlining", cl::Hidden,
+cl::desc("the maximum number of instructions analyzed for may throw during "
+ "attribute inference in inlined body"),
+cl::init(4));
+
 llvm::InlineResult llvm::InlineFunction(CallBase *CB, InlineFunctionInfo ,
 AAResults *CalleeAAR,
 bool InsertLifetime) {
@@ -1136,6 +1146,80 @@
   }
 }
 
+static void AddReturnAttributes(CallSite CS, ValueToValueMapTy ) {
+  if (!UpdateReturnAttributes)
+return;
+  AttrBuilder AB(CS.getAttributes(), AttributeList::ReturnIndex);
+  if (AB.empty())
+return;
+
+  auto *CalledFunction = CS.getCalledFunction();
+  auto  = CalledFunction->getContext();
+
+  auto GetClonedValue = [&](Instruction *I) -> Value * {
+ValueToValueMapTy::iterator VMI = VMap.find(I);
+if (VMI == VMap.end())
+  return nullptr;
+return VMI->second;
+  };
+
+  auto MayContainThrowingOrExitingCall = [&](Instruction *RVal,
+ Instruction *RInst) {
+unsigned NumInstChecked = 0;
+for (auto  :
+ make_range(RVal->getIterator(), std::prev(RInst->getIterator(
+  if (NumInstChecked++ > MaxInstCheckedForThrow ||
+  isGuaranteedToTransferExecutionToSuccessor())
+return true;
+return false;

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-20 Thread Anna Thomas via Phabricator via cfe-commits

anna marked an inline comment as done.
anna added inline comments.
Herald added a reviewer: aartbik.

Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1172
+return true;
+return false;
+  };

jdoerfert wrote:
> `mayThrow` is not sufficient. As with my earlier example, a potential `exit` 
> is sufficient to break this, thus you need `willreturn` as well.
What we need is just `isGuaranteedToTransferExecutionToSuccessor`. That handles 
`mayThrow`, exits/pthread_exit and willreturn.

Just to note, an unconditional exit in the callee itself is not an issue here. 
The problem is something like this:
```
;nothrow
foo(i8* %arg) {
if (%arg == null)
  exit;
ret %arg
}
callee() {
  %r = call i8* @bar
  %v = call i8* @foo(i8* %r)
   ret i8* %r
}
caller() {
  call nonnull i8* @callee
}
```
 Here propagating nonnull to callsite `bar` is incorrect since if %r is null, 
the program exits. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D76140: [InlineFunction] update attributes during inlining

2020-03-19 Thread Anna Thomas via Phabricator via cfe-commits

anna updated this revision to Diff 251447.
anna added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

fixed clang tests. rot-intrinsics.c testcase has 5 different RUNs with 3 
prefixes. Depending on target-triple, the attribute is added to the caller, so 
I've disabled the optimization for that specific test with 
-update-return-attrs=false


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76140/new/

https://reviews.llvm.org/D76140

Files:
  clang/test/CodeGen/builtins-systemz-zvector.c
  clang/test/CodeGen/builtins-systemz-zvector2.c
  clang/test/CodeGen/movbe-builtins.c
  clang/test/CodeGen/rot-intrinsics.c
  clang/test/CodeGen/waitpkg.c
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/ret_attr_update.ll

Index: llvm/test/Transforms/Inline/ret_attr_update.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/ret_attr_update.ll
@@ -0,0 +1,75 @@
+; RUN: opt < %s -inline-threshold=0 -always-inline -S | FileCheck %s
+; RUN: opt < %s -passes=always-inline -S | FileCheck %s
+
+declare i8* @foo(i8*)
+declare i8* @bar(i8*)
+
+define i8* @callee(i8 *%p) alwaysinline {
+; CHECK: @callee(
+; CHECK: call i8* @foo(i8* noalias %p)
+  %r = call i8* @foo(i8* noalias %p)
+  ret i8* %r
+}
+
+define i8* @caller(i8* %ptr, i64 %x) {
+; CHECK-LABEL: @caller
+; CHECK: call nonnull i8* @foo(i8* noalias
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee(i8* %gep)
+  ret i8* %p
+}
+
+declare void @llvm.experimental.guard(i1,...)
+; Cannot add nonnull attribute to foo.
+define internal i8* @callee_with_throwable(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_throwable
+  %r = call i8* @foo(i8* %p)
+  %cond = icmp ne i8* %r, null
+  call void (i1, ...) @llvm.experimental.guard(i1 %cond) [ "deopt"() ]
+  ret i8* %r
+}
+
+; Here also we cannot add nonnull attribute to the call bar.
+define internal i8* @callee_with_explicit_control_flow(i8* %p) alwaysinline {
+; CHECK-NOT: callee_with_explicit_control_flow
+  %r = call i8* @bar(i8* %p)
+  %cond = icmp ne i8* %r, null
+  br i1 %cond, label %ret, label %orig
+
+ret:
+  ret i8* %r
+
+orig:
+  ret i8* %p
+}
+
+define i8* @caller2(i8* %ptr, i64 %x, i1 %cond) {
+; CHECK-LABEL: @caller2
+; CHECK: call i8* @foo
+; CHECK: call i8* @bar
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call nonnull i8* @callee_with_throwable(i8* %gep)
+  %q = call nonnull i8* @callee_with_explicit_control_flow(i8* %gep)
+  br i1 %cond, label %pret, label %qret
+
+pret:
+  ret i8* %p
+
+qret:
+  ret i8* %q
+}
+
+define internal i8* @callee3(i8 *%p) alwaysinline {
+; CHECK-NOT: callee3
+  %r = call noalias i8* @foo(i8* %p)
+  ret i8* %r
+}
+
+; add the deref attribute to the existing attributes on foo.
+define i8* @caller3(i8* %ptr, i64 %x) {
+; CHECK-LABEL: caller3
+; CHECK: call noalias dereferenceable_or_null(12) i8* @foo
+  %gep = getelementptr inbounds i8, i8* %ptr, i64 %x
+  %p = call dereferenceable_or_null(12) i8* @callee3(i8* %gep)
+  ret i8* %p
+}
Index: llvm/lib/Transforms/Utils/InlineFunction.cpp
===
--- llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -80,11 +80,21 @@
   cl::Hidden,
   cl::desc("Convert noalias attributes to metadata during inlining."));
 
+static cl::opt UpdateReturnAttributes(
+"update-return-attrs", cl::init(true), cl::Hidden,
+cl::desc("Update return attributes on calls within inlined body"));
+
 static cl::opt
 PreserveAlignmentAssumptions("preserve-alignment-assumptions-during-inlining",
   cl::init(true), cl::Hidden,
   cl::desc("Convert align attributes to assumptions during inlining."));
 
+static cl::opt MaxInstCheckedForThrow(
+"max-inst-checked-for-throw-during-inlining", cl::Hidden,
+cl::desc("the maximum number of instructions analyzed for may throw during "
+ "attribute inference in inlined body"),
+cl::init(4));
+
 llvm::InlineResult llvm::InlineFunction(CallBase *CB, InlineFunctionInfo ,
 AAResults *CalleeAAR,
 bool InsertLifetime) {
@@ -1136,6 +1146,77 @@
   }
 }
 
+static void AddReturnAttributes(CallSite CS, ValueToValueMapTy ) {
+  if (!UpdateReturnAttributes)
+return;
+  AttrBuilder AB(CS.getAttributes(), AttributeList::ReturnIndex);
+  if (AB.empty())
+return;
+
+  auto *CalledFunction = CS.getCalledFunction();
+  auto  = CalledFunction->getContext();
+
+  auto GetClonedValue = [&](Instruction *I) -> Value * {
+ValueToValueMapTy::iterator VMI = VMap.find(I);
+if (VMI == VMap.end())
+  return nullptr;
+return VMI->second;
+  };
+
+  auto MayContainThrowingCall = [&](Instruction *RVal, Instruction *RInst) {
+unsigned NumInstChecked = 0;
+for (auto  :
+ make_range(RVal->getIterator(),

[PATCH] D41077: [analyser] different.CallArgsOrder checker implementation

2017-12-14 Thread Anna Thomas via Phabricator via cfe-commits

anna resigned from this revision.
anna added a comment.

Perhaps added me incorrectly as reviewer?


Repository:
  rC Clang

https://reviews.llvm.org/D41077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D76140: [InlineFunction] update attributes during inlining

[PATCH] D41077: [analyser] different.CallArgsOrder checker implementation

29 matches

Site Navigation

Mail list logo

Footer information