On Tue, Oct 7, 2014 at 3:12 AM, Kalle Raiskila <[email protected]> wrote:
>
>> > If not, one has to set the feature set (-triple and -march passed to
>> > Clang/llc) to the lowest possible, and it results in much worse
>> > performing kernel compilation results (no SIMD or other special
>> > instructions used) which is not nice as OpenCL is all about
>> > performance.
>> >
>> > Do you know if there is such a minimal ISA feature set defined
>> > for CPU architectures in Debian?
>>
>> I will try to find them. But, another possibility is to require an
>> extended feature set but bails at runtime with a clear error message
>> if it appears that the current CPU is not powerful enough. I think
>> that, at least for x86-32, it would be a better option. I know from
>> previous Debian discussion that the minimal ISA feature set on this
>> architecture is pretty low but the use case would probably be x86-64
>> processors that run 32bits software...
>
> For x86_64 it seems all processors are supported on Debian:
> http://www.debian.org/releases/stable/amd64/ch02s01.html.en
> I think the minimum set is not that bad, it has e.g. SSE2. If I
> understand correctly, this is the "x86-64" CPU for LLVM (note the
> underscore in the arch vs. dash in the CPU variant...)
>
> For x86, the minimum requirement seems to be the 486:
> http://www.debian.org/releases/stable/i386/ch02s01.html.en
> which kills performance.
>
> ARM is supported on "any ARM CPU".
> http://www.debian.org/releases/stable/armhf/ch02s01.html.en
> This suggests some really old ARM ISA - but pocl is exclusively tested
> on ARMv7 (Cortex-series). To add to the confusion, the NEON SIMD
> extension is not to my knowledge mandated even by the ARMv7, so to be
> portable even here, it would have to disable the SIMD.
>
> And then there are all the other cpu architectures pocl 0.10 is not
> tested on...
>
> Bailing out run-time sounds like a nice solution, but probably would
> need quite a bit of work, or at least quite a bit of testing.
> The ultimate solution is here to compile the OCL kernel run-time at
> run-time. But for performance reasons, this would need a bit of code to
> selectively pick just the few needed kernel functions to compile.

We could probably pick a few (three?) CPU feature sets, build the kernel
library once for each, and then choose at run time the highest one that
the host CPU supports. My suggestion for these would be i486, SSE2, and
AVX.

-erik

-- 
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/
AIM: eschnett247, Skype: eschnett, Google Talk: [email protected]
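
To make the run-time selection among the three proposed feature sets concrete, here is a minimal sketch in C. It is not pocl code: the names cpu_level, pick_level, level_name and detect_level are hypothetical, and the probe assumes GCC or Clang on x86, where the __builtin_cpu_init and __builtin_cpu_supports builtins are available. On any other compiler or architecture it falls back to the lowest level.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature levels, ordered from lowest to highest. */
typedef enum { LEVEL_I486 = 0, LEVEL_SSE2 = 1, LEVEL_AVX = 2 } cpu_level;

/* Pure selection logic: return the highest level whose requirements
   are met. AVX is only chosen when SSE2 is present as well. */
cpu_level pick_level(int has_sse2, int has_avx)
{
    if (has_sse2 && has_avx)
        return LEVEL_AVX;
    if (has_sse2)
        return LEVEL_SSE2;
    return LEVEL_I486;
}

/* Hypothetical suffix identifying a prebuilt kernel library variant. */
const char *level_name(cpu_level level)
{
    static const char *const names[] = { "i486", "sse2", "avx" };
    assert((int)level >= 0 && (int)level <= 2);
    return names[level];
}

/* Probe the host CPU. Assumption: GCC/Clang x86 builtins; everything
   else conservatively reports the lowest level. */
cpu_level detect_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return pick_level(__builtin_cpu_supports("sse2") != 0,
                      __builtin_cpu_supports("avx") != 0);
#else
    return LEVEL_I486;
#endif
}
```

A loader could call level_name(detect_level()) once at start-up and use the result to choose which prebuilt variant to load. Keeping pick_level separate from the probe means the selection logic can be tested without access to each kind of CPU.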
