MXNet has a number of context-specific operator parameters: 'cudnn_tune',
'cudnn_off' and 'workspace' are parameters that control the behavior of
Convolution on GPU contexts with NVIDIA GPUs. Even with these, there would be
benefits to having additional parameters, e.g. to select Convolution
algorithms by number, or to force the compute precision to float16. With the
desire to support multiple backends and a growing number of operators, it's
time to ask the question: "Is this scalable?"

I propose that, rather than adding a new parameter at the Python level for each 
new backend-specific parameter 'knob', all context-specific parameters be swept 
into a single dictionary, called e.g. 'ctx_params':

Convolution(..., ctx_params={'cudnn_tune': 2, 'cudnn_off': False,
'workspace': 2000}, ...)
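For illustration, such a dict could be flattened at the Python level into
plain string key/value pairs, the form a backend could carry around
generically. A minimal sketch; 'flatten_ctx_params' is a hypothetical helper,
not an existing MXNet function:

```python
def flatten_ctx_params(ctx_params):
    """Serialize a ctx_params dict into string key/value pairs, the form
    a Map<string, string> on the backend side could receive.
    (Hypothetical helper for illustration; not an existing MXNet API.)"""
    return {key: str(value) for key, value in ctx_params.items()}

flat = flatten_ctx_params({'cudnn_tune': 2, 'cudnn_off': False,
                           'workspace': 2000})
# flat == {'cudnn_tune': '2', 'cudnn_off': 'False', 'workspace': '2000'}
```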

I'll stop short of working out all the details, in the hope of generating
more discussion. Some open questions:

Do all backends share the same namespace, or do we have separate 
'gpu_ctx_params', 'cpu_ctx_params', etc.?
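Under the separate-namespace option, the front end would have to pick the
dict matching the operator's context. A sketch, assuming the hypothetical
per-backend keyword names above:

```python
def select_ctx_params(device_type, **kwargs):
    """Return the params dict whose keyword matches the current device
    type, e.g. gpu_ctx_params on a gpu context; empty dict if absent.
    (Keyword names and helper are hypothetical, for illustration.)"""
    return kwargs.get(device_type + '_ctx_params', {})

select_ctx_params('gpu',
                  gpu_ctx_params={'cudnn_tune': 2},
                  cpu_ctx_params={'mkl_off': False})
# returns {'cudnn_tune': 2}; the cpu dict is ignored on a gpu context
```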

Is there a clean extension to dmlc's general parameter-parsing facility to
handle this dictionary, and what form do these extension params take in the
backend (Map<string, string>?)
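Whatever the parsing mechanism, if the dictionary arrives in the backend as
Map<string, string>, each operator would need to coerce the string values
back to typed form. A Python sketch of that coercion (the real version would
live in C++ alongside dmlc's parameter parsing):

```python
def coerce(value):
    """Turn a string carried through a Map<string, string> back into a
    typed value: try bool, then int, then fall back to the raw string.
    (Illustrative sketch of what a dmlc parsing extension would do.)"""
    if value in ('True', 'False', 'true', 'false'):
        return value in ('True', 'true')
    try:
        return int(value)
    except ValueError:
        return value

coerce('2000')   # -> 2000
coerce('False')  # -> False
coerce('limited_workspace')  # -> 'limited_workspace'
```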

And while this proposal organizes and consolidates these context-specific
parameters at the Python level, we'd still need to tolerate (and auto-create)
documentation for these new parameters.

Other approaches welcome.
