[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

Jon Chesterfield via cfe-commits Mon, 01 Jul 2024 05:57:34 -0700

================
@@ -0,0 +1,77 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -emit-llvm -o - %s | FileCheck 
%s
+
+extern void varargs_simple(int, ...);
+
+// CHECK-LABEL: define dso_local void @foo(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[C:%.*]] = alloca i8, align 1
+// CHECK-NEXT:    [[S:%.*]] = alloca i16, align 2
+// CHECK-NEXT:    [[I:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[L:%.*]] = alloca i64, align 8
+// CHECK-NEXT:    [[F:%.*]] = alloca float, align 4
+// CHECK-NEXT:    [[D:%.*]] = alloca double, align 8
+// CHECK-NEXT:    [[A:%.*]] = alloca [[STRUCT_ANON:%.*]], align 4
+// CHECK-NEXT:    [[V:%.*]] = alloca <4 x i32>, align 16
+// CHECK-NEXT:    store i8 1, ptr [[C]], align 1
+// CHECK-NEXT:    store i16 1, ptr [[S]], align 2
+// CHECK-NEXT:    store i32 1, ptr [[I]], align 4
+// CHECK-NEXT:    store i64 1, ptr [[L]], align 8
+// CHECK-NEXT:    store float 1.000000e+00, ptr [[F]], align 4
+// CHECK-NEXT:    store double 1.000000e+00, ptr [[D]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load i8, ptr [[C]], align 1
+// CHECK-NEXT:    [[CONV:%.*]] = sext i8 [[TMP0]] to i32
----------------
JonChesterfield wrote:


C promotes them to i32. C has a lot of rules around vararg type promotion that 
have not aged brilliantly.

If you want a i8 or i16, put it in a struct. C doesn't say anything about 
promoting that and amdgpu will pass it inlined into the struct, i.e. with no 
overhead. I believe nvptx will do likewise.

https://github.com/llvm/llvm-project/pull/96369
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

Reply via email to