Hi,

I believe llvm_target_triple and llvm_cpu should be enough to get the OpenCL 
source compiled for the target CPU. The compiled bitcode is then linked against 
the kernel library and parallelized. Just to be sure: are you linking against 
a kernel library built for ARMv8? You can easily check by running your program 
with POCL_DEBUG=all; it should print something like this:


  |      LLVM |  Using <builddir>/lib/kernel/host/kernel-x86_64-unknown-linux-gnu-skylake.bc as the built-in lib.
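
To pick that one line out of the rather verbose debug output, something like the 
following might help. It is only a sketch: kernel_lib_from_log is a made-up 
helper name, ./my_opencl_app is a placeholder for your program, and it assumes 
the message has exactly the "Using <path> as the built-in lib." form shown above 
and that the debug output goes to stderr (hence the 2>&1).

```shell
# Hypothetical helper: read pocl debug output on stdin and print only the
# path of the kernel library that was picked as the built-in lib.
# Assumes the message looks like "Using <path>.bc as the built-in lib."
kernel_lib_from_log() {
    sed -n 's/.*Using \(.*\.bc\) as the built-in lib.*/\1/p'
}

# Usage (./my_opencl_app is a placeholder for your own program):
#   POCL_DEBUG=all ./my_opencl_app 2>&1 | kernel_lib_from_log
```

If this prints nothing, the message format on your pocl version probably differs 
and a plain grep for "built-in lib" is the safer choice.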


The format is lib/kernel/<device>/kernel-<triple>-<cpu>.bc; make sure it 
matches your ARM device. It might also be worthwhile to compile the kernel 
library on your ARM device (if it's a Cortex-A53, even a Raspberry Pi 3 running 
in 64-bit mode will suffice; just make sure you use the same LLVM version), then 
copy the resulting kernel.bc to your host and see if linking against that 
fixes the error. BTW, you don't need to compile the entire pocl (which takes a 
while on ARM); 'cmake <options> && cd lib/kernel && make' is enough to get the 
kernel library bc.
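
A quick way to check a kernel library file name against the 
kernel-<triple>-<cpu>.bc pattern is sketched below. The function name is 
hypothetical, and the triple and CPU in the usage line (aarch64-unknown-linux-gnu, 
cortex-a53) are only example values; substitute whatever you pass as 
llvm_target_triple and llvm_cpu.

```shell
# Hypothetical check: does the kernel library file name match the
# expected triple and CPU?
#   $1 = path to the .bc file
#   $2 = expected triple
#   $3 = expected cpu
kernel_lib_matches() {
    [ "$(basename "$1")" = "kernel-$2-$3.bc" ]
}

# Usage (example values, not taken from your setup):
#   kernel_lib_matches "$path" aarch64-unknown-linux-gnu cortex-a53 \
#       && echo "kernel library matches" \
#       || echo "kernel library was built for a different target"
```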


As for the malformed LLVM IR, you can dump the IR to the console. At the 
beginning of pocl_llvm_codegen():


https://github.com/pocl/pocl/blob/dc5832685994f34d24e922907515b28a00933e03/lib/CL/pocl_llvm_wg.cc#L588


add this:


Input->dump();


This will print the IR to stderr.



Regards,

  -- mb
