QueensGambit edited a comment on issue #16173: Saving and loading cudNN autotune and graph optimization
URL: https://github.com/apache/incubator-mxnet/issues/16173#issuecomment-615882613

@mk-61 Thank you for your proposal. Generating a unique key based on model and inference attributes is an option. If I understood your idea correctly, one downside of this approach is that it might become difficult to maintain at some point if every inference backend uses a different set of attributes.

Therefore, I prefer my previous idea of adding a save and a load function to the [Executor class](https://github.com/apache/incubator-mxnet/blob/992c3c0dd90c0723de6934e826a49bad6569eeac/include/mxnet/executor.h#L53). This way, the programmer can define a name for their executor object that includes all optimizations.

I implemented a wrapper for plain TensorRT that saves and reloads the TensorRT optimization in a TRT-engine file:
* https://github.com/QueensGambit/CrazyAra/blob/master/engine/src/nn/tensorrtapi.cpp

The class implements the same functionality as our MXNet wrapper:
* https://github.com/QueensGambit/CrazyAra/blob/master/engine/src/nn/mxnetapi.cpp

Loading the stored TensorRT optimization at start-up takes only a few seconds, while optimizing the model from scratch takes several minutes. This process is called **serialization** and **de-serialization**. For a TensorRT engine, it is as simple as dumping a binary memory buffer into a file and reloading it later:

```c++
void write_buffer(void* buffer, size_t bufferSize, const string& filePath);
const char* read_buffer(const string& filePath, size_t& bufferSize);
```

@mikeobr Would exporting and importing MXNet executor objects suffice in your case?
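For illustration, a minimal sketch of what the two helper functions declared above could look like. The function names come from the comment; the bodies and the ownership convention (caller frees the returned buffer with `delete[]`) are assumptions, not the actual CrazyAra implementation:

```c++
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Dump a raw memory buffer (e.g. a serialized TensorRT engine) to a file.
void write_buffer(void* buffer, size_t bufferSize, const std::string& filePath) {
    std::ofstream out(filePath, std::ios::binary);
    out.write(static_cast<const char*>(buffer), static_cast<std::streamsize>(bufferSize));
}

// Read a file back into a newly allocated buffer and report its size.
// Returns nullptr (and bufferSize == 0) if the file cannot be opened.
// The caller owns the returned buffer and must release it with delete[].
const char* read_buffer(const std::string& filePath, size_t& bufferSize) {
    std::ifstream in(filePath, std::ios::binary | std::ios::ate);
    if (!in) {
        bufferSize = 0;
        return nullptr;
    }
    bufferSize = static_cast<size_t>(in.tellg());
    char* data = new char[bufferSize];
    in.seekg(0);
    in.read(data, static_cast<std::streamsize>(bufferSize));
    return data;
}
```

In the TensorRT case, the buffer passed to `write_buffer` would be the `data()`/`size()` of the `IHostMemory` object returned by serializing the engine, and the bytes returned by `read_buffer` would be handed to the runtime's engine deserialization call.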