On 09/08/2013 11:53 PM, Erik Schnetter wrote:
> Some CPU attributes influence the ABI. These need to be set correctly at
> all times, otherwise the executable won't work. This influences e.g. the
> calling conventions for functions, which is explicitly represented in
> bytecode. That is, a fully generic bytecode library is not possible, but we
> may be able to get away with using just a few per architecture.

Ah, true, the ABI.

> One would probably also need to make sure that earlier optimizations don't
> already expand builtins, since a different CPU may offer a more efficient
> implementation in terms of a CPU instruction that exists only on some CPUs
> (e.g. popcount, clz).

Yes. This is the idea of the intrinsics. The backend expands them to
whatever is the most efficient implementation for the target.

> Apart from this -- implementing the kernel library purely with scalar
> functions and builtins is possible. We would have to experiment with how to
> present this to the vectorizer to make things as easy as possible.
> Currently, we split e.g. int16 into two int8 operations; this is a nicely
> recursive implementation, but the vectorizer may prefer a loop instead.

Currently the WG autovectorizer reuses LLVM's loop vectorizer. It wants
to see the work-group's parallel regions as parallel loops containing
scalar code that is as free from control constructs as possible. Thus, I
think fully scalarizing (no loops) might give the best results with the
current WG autovectorization.

> I should introduce an option to Vecmathlib to do this. This would easily
> allow comparing performance, and could give hints to shortcomings of the
> vectorizer (and conversely, of Vecmathlib) that could then be addressed.

Good. It would be ideal to provide two versions of the kernel lib: one
optimized for "intra vector usage", for when WG autovectorization is
hopeless but it would still be nice to use vector instructions (e.g., a
WG of size 1), and another for WG autovectorization.

OTOH, for the former, the BB vectorizer might do the trick
automatically. Another thing to try.

-- 
Pekka
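
The two kernel-library styles discussed above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not code
from pocl or Vecmathlib: the type u32x16 and the functions
popcount16_halves and popcount16_loop are hypothetical names, and
__builtin_popcount is the GCC/Clang builtin standing in for an intrinsic
that the backend expands per target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-lane vector, modelled as a plain struct so the sketch
   stays portable C. */
typedef struct { uint32_t lane[16]; } u32x16;

/* Scalar building block. The builtin is left for the backend to expand
   (a single instruction where the target CPU has one). */
static inline uint32_t popcount_scalar(uint32_t x) {
    return (uint32_t)__builtin_popcount(x);
}

/* Recursive-halving style: 16 lanes -> 2 x 8 -> ... -> 1 lane. */
static void popcount_halves(const uint32_t *in, uint32_t *out, size_t n) {
    if (n == 1) {
        out[0] = popcount_scalar(in[0]);
        return;
    }
    popcount_halves(in, out, n / 2);
    popcount_halves(in + n / 2, out + n / 2, n / 2);
}

u32x16 popcount16_halves(u32x16 v) {
    u32x16 r;
    popcount_halves(v.lane, r.lane, 16);
    return r;
}

/* Loop style: one flat loop over the lanes, with a branch-free scalar body. */
u32x16 popcount16_loop(u32x16 v) {
    u32x16 r;
    for (size_t i = 0; i < 16; ++i)
        r.lane[i] = popcount_scalar(v.lane[i]);
    return r;
}
```

Both functions compute the same result; which form a given vectorizer
handles better is exactly the experiment proposed in the thread, and the
sketch makes no claim about the outcome.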
_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
