>
> At pocl build-time rebuild-index.pl generates an indexed version of the
> kernel.bc library? What is the format of this, or are all the kernel
> library functions kept in separate files? If the latter, what are the
> functions.list and functions.index needed for?
>

The kernel library functions are kept in separate files.

At pocl build time, "list-functions.sh" extracts all the function names from
the bitcode files into "functions.list" (with llvm-dis and grep), then
"rebuild-index.pl" figures out the dependencies between bitcode files and
builds a "functions.index" to be installed.
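
To make the build-time step concrete, here is a rough Python sketch of the idea. It is not the actual Perl script: the regexes, the build_index name, and the in-memory dict it returns are all assumptions for illustration, and it takes already-disassembled IR text as input rather than running llvm-dis itself. The real "functions.index" file format is not shown here.

```python
import re

# Assumed patterns for textual LLVM IR lines such as
#   define float @sin(float %x) {
#   declare float @fabs(float)
DEFINE_RE = re.compile(r'^define\b[^@\n]*@"?([\w.$]+)"?\s*\(', re.MULTILINE)
DECLARE_RE = re.compile(r'^declare\b[^@\n]*@"?([\w.$]+)"?\s*\(', re.MULTILINE)


def build_index(ir_by_file):
    """Map each function name to the bitcode files needed to link it.

    ir_by_file: dict of file name -> disassembled IR text (hypothetical input).
    """
    defined_in = {}  # function name -> file that defines it
    needs = {}       # file -> names it declares but does not define
    for path, ir in ir_by_file.items():
        for name in DEFINE_RE.findall(ir):
            defined_in[name] = path
        needs[path] = set(DECLARE_RE.findall(ir))

    def closure(path):
        # Follow declarations transitively to every file that must be linked.
        seen, stack = set(), [path]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            for name in needs[current]:
                dep = defined_in.get(name)
                if dep is not None:
                    stack.append(dep)
        return seen

    return {name: sorted(closure(path)) for name, path in defined_in.items()}
```

Names that no library file defines (LLVM intrinsics, for example) are simply ignored when following dependencies.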

When pocl-workgroup runs, it calls "pocl-objs.pl", which runs the same
llvm-dis/grep step on the user kernel to find out which functions it
references, then looks those up in "functions.index" and returns the list
of bitcode files to link in.
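
The lookup side could look roughly like the sketch below. Again, this is Python for illustration rather than the real "pocl-objs.pl": files_to_link is a made-up name, the index is assumed to be a dict from function name to a list of bitcode files, and the kernel is assumed to be already disassembled to IR text.

```python
import re

# Assumed pattern for external declarations in textual LLVM IR, e.g.
#   declare float @sin(float)
DECLARE_RE = re.compile(r'^declare\b[^@\n]*@"?([\w.$]+)"?\s*\(', re.MULTILINE)


def files_to_link(kernel_ir, index):
    """Return the sorted bitcode files needed by the user kernel.

    kernel_ir: disassembled IR text of the user kernel.
    index: dict of function name -> list of bitcode files (hypothetical format).
    """
    files = set()
    for name in DECLARE_RE.findall(kernel_ir):
        # Names missing from the index are not kernel library functions; skip them.
        files.update(index.get(name, ()))
    return sorted(files)
```

With an index entry like {"sin": ["fabs.bc", "sin.bc"]}, a kernel that declares only sin would get just those two files linked in rather than the whole library.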

> Why put this in the lib/kernel/x86_64 directory? It doesn't seem
> x86_64-specific. At least ARM and PPC would benefit directly from this,
> why not TCE too?
>

Yeah, it should work in all of those places too. I only wanted it faster on
x86_64 this week.


> This sounds essentially as a "dead-code-elimination" done a priori.
> Since I'm trying to get rid of the scripts, I wonder if this could not
> be done without the need for extra helper scripts? Did you trace the
> performance - what exactly did consume the time with the monolithic
> kernel.bc library? I remember trying a DCE as the first pass to opt in
> pocl-workgroup, but it had no effect on execution time.
>

The problem is just too much code in the monolithic kernel.bc file. On my
laptop, timing the "opt" run in pocl-workgroup with -time-passes shows five
seconds just parsing the user kernel with the monolithic bitcode file
linked in.

Selecting which bitcode files to link in to user code could probably be
done as an LLVM pass: traverse the bitcode for extern declarations (or call
sites, if the declarations aren't easy to get at) and then look up those
functions in "functions.index". It'd probably be faster too, if only
because llvm-dis probably isn't a big optimization target for the LLVM
folks, but I don't think it's the bottleneck at this point.