Hi,

I believe llvm_target_triple and llvm_cpu should be enough to get the OpenCL source compiled for the target CPU. The compiled bitcode is then linked against the kernel library and parallelized.

Just to be sure: are you linking against a kernel library built for ARMv8? You can easily check by running your program with POCL_DEBUG=all; it should print something like this:

  | LLVM | Using <builddir>/lib/kernel/host/kernel-x86_64-unknown-linux-gnu-skylake.bc as the built-in lib.

The format is lib/kernel/<device>/kernel-<triple>-<cpu>.bc; make sure it matches your ARM device.

It might also be worthwhile to compile the kernel library on your ARM device (if it's a Cortex-A53, even a Raspberry Pi 3 running in 64-bit mode will suffice; just make sure you use the same LLVM version), then copy the resulting kernel.bc to your host and see if linking against that fixes the error. BTW, you don't need to compile the entire pocl (which takes a while on ARM):

  cmake <options> && cd lib/kernel && make

is enough to get the kernel library bc.

As for the malformed LLVM IR, you can dump the IR on the console. At the beginning of pocl_llvm_codegen():

  https://github.com/pocl/pocl/blob/dc5832685994f34d24e922907515b28a00933e03/lib/CL/pocl_llvm_wg.cc#L588

add this:

  Input->dump();

This will print the IR on STDERR.

Regards,
-- mb
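
P.S. A minimal sketch of the kernel library check described above, in case it helps. The helper names (kernel_lib_from_log, is_aarch64_lib) are hypothetical and not part of pocl, and the sketch assumes that a 64-bit ARMv8 triple starts with "aarch64"; adjust the pattern if your triple differs.

```shell
#!/bin/sh
# Hypothetical helper: read debug output on stdin and print the path of the
# kernel library named in the "Using ... as the built-in lib" message.
kernel_lib_from_log() {
  sed -n 's/.*Using \(.*\.bc\) as the built-in lib.*/\1/p'
}

# Hypothetical helper: succeed if the file name looks like
# kernel-<triple>-<cpu>.bc with a triple starting with "aarch64" (assumption).
is_aarch64_lib() {
  case "$1" in
    *kernel-aarch64-*.bc) return 0 ;;
    *) return 1 ;;
  esac
}

# Demonstration on the sample line quoted in this mail (an x86_64 library).
sample='| LLVM | Using <builddir>/lib/kernel/host/kernel-x86_64-unknown-linux-gnu-skylake.bc as the built-in lib.'
lib=$(printf '%s\n' "$sample" | kernel_lib_from_log)
echo "built-in lib: $lib"
if is_aarch64_lib "$lib"; then
  echo "looks like an AArch64 kernel library"
else
  echo "does not look like an AArch64 kernel library"
fi
```

With a real program (here called ./your_program as a placeholder), the same filter could be applied as: POCL_DEBUG=all ./your_program 2>&1 | kernel_lib_from_log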
