MrSidims wrote:

> There's nothing new to do here. This has always existed

@arsenm here is a small experiment. I compiled the following OpenCL code:
```
struct S {
    char i8_3[3];
};

kernel void test(global struct S *p, float3 v)
{
   int3 tmp;
   frexp(v, &tmp);
   tmp += 1;
   p->i8_3[0] = tmp.x;
   p->i8_3[1] = tmp.y;
   p->i8_3[2] = tmp.z;
}
```
with the PR pulled in (on top of LLVM's HEAD aadfba9b2a), the compilation command is:
`clang++ -cl-std=CL2.0 -emit-llvm -c -x cl -g0 --target=spir -Xclang -finclude-default-header -O2 test.cl`
The output LLVM IR after the optimizations is:
```
; Function Attrs: convergent norecurse nounwind
define dso_local spir_kernel void @test(ptr addrspace(1) nocapture noundef writeonly align 1 %p, <3 x float> noundef %v) local_unnamed_addr #0 !kernel_arg_a>
entry:
  %tmp = alloca <3 x i32>, align 16
  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %tmp) #3
  %tmp.ascast = addrspacecast ptr %tmp to ptr addrspace(4)
  %call = call spir_func <3 x float> @_Z5frexpDv3_fPU3AS4Dv3_i(<3 x float> noundef %v, ptr addrspace(4) noundef %tmp.ascast) #4
  %loadVec42 = load <4 x i32>, ptr %tmp, align 16
  %extractVec4 = add <4 x i32> %loadVec42, <i32 1, i32 1, i32 1, i32 1>
  %0 = bitcast <4 x i32> %extractVec4 to i128
  %1 = trunc i128 %0 to i96
  %2 = bitcast i96 %1 to <12 x i8>
  %conv = trunc i128 %0 to i8
  store i8 %conv, ptr addrspace(1) %p, align 1, !tbaa !9
  %conv5 = extractelement <12 x i8> %2, i64 4
  %arrayidx7 = getelementptr inbounds i8, ptr addrspace(1) %p, i32 1
  store i8 %conv5, ptr addrspace(1) %arrayidx7, align 1, !tbaa !9
  %conv8 = extractelement <12 x i8> %2, i64 8
  %arrayidx10 = getelementptr inbounds i8, ptr addrspace(1) %p, i32 2
  store i8 %conv8, ptr addrspace(1) %arrayidx10, align 1, !tbaa !9
  call void @llvm.lifetime.end.p0(i64 16, ptr nonnull %tmp) #3
  ret void
}
```
Note the bitcast to i128 followed by the truncation to i96: those types aren't
part of the datalayout, yet some optimization generated them. So something has
to be done about it, and changing the datalayout alone is not enough.
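
For readers following the IR, here is a minimal C++ sketch of what the i128/i96/`<12 x i8>` chain computes, assuming a little-endian layout. The function names are made up for illustration and are not part of the PR or of LLVM. It only shows that picking bytes 0, 4 and 8 of the reinterpreted vector gives the same result as truncating each lane to 8 bits, which is what the OpenCL source asks for.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the lowest-addressed byte of an integer is its least
// significant byte.
bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Mirrors the optimized IR: add 1 to all four lanes, view the vector as raw
// bytes (the bitcast to i128), keep the low 12 bytes (the trunc to i96) and
// pick bytes 0, 4 and 8 (the extractelement instructions).
std::array<std::uint8_t, 3> via_byte_view(std::array<std::uint32_t, 4> v) {
    assert(host_is_little_endian() && "sketch assumes a little-endian host");
    for (auto &lane : v)
        lane += 1;  // add <4 x i32> %loadVec42, <1, 1, 1, 1>
    unsigned char bytes[16];
    std::memcpy(bytes, v.data(), sizeof bytes);
    return {bytes[0], bytes[4], bytes[8]};
}

// Mirrors the source: truncate each of the first three lanes to 8 bits.
std::array<std::uint8_t, 3> via_lane_trunc(std::array<std::uint32_t, 4> v) {
    return {static_cast<std::uint8_t>(v[0] + 1),
            static_cast<std::uint8_t>(v[1] + 1),
            static_cast<std::uint8_t>(v[2] + 1)};
}
```

Both helpers return the same three bytes for any input, so the wide integer types in the IR are a bitwise reinterpretation, not extra arithmetic.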

> This does not mean arbitrary integer bitwidths do not work. The n field is 
> weird, it's more of an optimization hint.

Let me clarify: _BitInt(N) will work with the change, I have no doubt.
But I can imagine a SPIR-V extension appearing that adds support for 4-bit
integers. And I can imagine that we would want not only to emit 4-bit
integers in the frontend, but also to allow optimization passes to emit
them. For this it would be nice to have a mechanism that changes the
datalayout depending on --spirv-ext (or another option).
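
A rough sketch of such a mechanism, purely as an illustration: the extension name `SPV_HYPOTHETICAL_int4`, both function names, the base string and the default width set {8, 16, 32, 64} are assumptions made for this example, not existing LLVM or SPIR-V APIs. The only thing taken from LLVM is the datalayout syntax, where native integer widths are written as `n<size1>:<size2>:...` and components are joined with `-`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Builds the "n" component of a datalayout string from a set of native
// integer widths, e.g. {8, 16, 32, 64} becomes "n8:16:32:64".
std::string native_widths_component(const std::set<unsigned> &widths) {
    assert(!widths.empty() && "at least one native width is expected");
    std::string out = "n";
    bool first = true;
    for (unsigned w : widths) {  // std::set iterates in ascending order
        if (!first)
            out += ':';
        out += std::to_string(w);
        first = false;
    }
    return out;
}

// Hypothetical hook: extend the native widths when an (imaginary) 4-bit
// integer extension is enabled, then append them to a base datalayout.
std::string datalayout_for(const std::string &base,
                           const std::set<std::string> &enabled_exts) {
    std::set<unsigned> widths{8, 16, 32, 64};  // assumed default set
    if (enabled_exts.count("SPV_HYPOTHETICAL_int4"))
        widths.insert(4);
    return base + "-" + native_widths_component(widths);
}
```

With a placeholder base of `e-p:32:32`, enabling the imaginary extension would turn the `n8:16:32:64` suffix into `n4:8:16:32:64`.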

https://github.com/llvm/llvm-project/pull/110695
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
