Hi Guys- I'm trying to write re-usable code for automatically tuning (CUDA) kernels and their associated (python) calling functions for optimal execution time.
My main goals: 1. Write re-usable code to handle the optimization of kernels (and their associated support code), given templates that contain: 2. Inline definition of logic for conditional unrolling of loops, disabling sections of code, etc... 3. Inline definition of the tunable parameters, along with the sets of possible values they can take I've found two templating languages that I think are promising for this goal: Cheetah http://www.cheetahtemplate.org and Mako http://www.makotemplates.org Both are pretty rich and Pythonic in terms of the logic you can incorporate into your templates, so that covers 2. pretty well. I'm stuck on 3. To give you some idea what I'm talking about, here's a dummy template that generates python code that just waits different amounts of time based on the (after the template is rendered) hard coded values of x1, x2, y1, y2. __________contents of mainFileName__________________ ## This is a mako template demonstrating how to make a template for use with the AutoTunedFunction class <%! tuneableParameterSets = [] %> import time def main: ## Two tuneable parameters that must be optimized together <%! tuneableParameterSets.append({}) tuneableParameterSets[0]['x1'] = [0, 3, 2] tuneableParameterSets[0]['x2'] = [0, 5, 4] %> time.sleep(.01 * ${x1} * ${x2}) ## Another set of parameters that should be optimized together. <%! tuneableParameterSets.append({}) tuneableParameterSets[1]['y1'] = [True, False] tuneableParameterSets[1]['y2'] = [False, True] %> if ${y1} or ${y2}: time.sleep(.02) ____________________________ So there is a list of dictionaries, tuneableParameterSets, that I want to be able to access from outside the template. When you do: >>> from mako.template import Template >>> t = Template(filename=mainTemplateFileName) the template source gets compiled to template code. I was hoping there'd be a nice way to inspect the resulting class instance t to get the tuneableParameterSets variable, and use that to iterate through the possible values for the tuned parameters. Unfortunately, there doesn't seem to be a nice clean way to do this. One ugly hack would be to scrape the code that sets the variables out of the compiled (but not rendered/filled) template code, perhaps with the help of some added delimiters. This is really ugly, but so far it's the only thing that I've come up with that would work. In Cheetah there is a syntax #attr that lets you set attributes of the resulting class (after the template is compiled, before it is rendered/filled). That almost worked, but unfortunately it doesn't seem like you can set a dictionary, only string and numeric literals. Anyone familiar with these templating engines have any clever ideas? Is anyone else using generic tuning code, or is everyone writing a separate auto-tuning script for each module? I also looked a bit a codepy, but I don't quite understand what to do with it; it doesn't seem suitable for manipulating chunks of code written in another language. I may be misunderstanding it completely though, and there's not much documentation. Thanks! Drew _______________________________________________ PyCUDA mailing list [email protected] http://tiker.net/mailman/listinfo/pycuda_tiker.net
