arcadiaphy commented on issue #16431: [RFC] MXNet Multithreaded Inference Interface
URL: https://github.com/apache/incubator-mxnet/issues/16431#issuecomment-562052116

@anirudh2290 Just saw this RFC. Let me share what I've done with multithreaded inference; I think it's the only viable way in MXNet right now.

I've deployed many models with the Scala API and run them in multiple threads. The whole system has run smoothly in a production environment for more than two months. The inference backend is the graph executor: one executor is created per thread, with the model parameters shared across all of them, and each thread's executor can be reshaped independently to match the shape of its input data (see the sketches below).

As mentioned above, the dependency engine is not thread safe, so if you run under the threaded engine, deadlocks and core dumps will happen. That leaves the naive engine as the only option. Without dependency scheduling, any write dependency on the model parameters is likely to execute concurrently and corrupt the internal data. If MKL-DNN is used to accelerate inference, you will get non-deterministic results per inference, because MXNet stealthily reorders the data in an NDArray (a write dependency) for MKL-DNN operators. I've worked around this with a temporary method that is not suitable for an official PR.

Multithreaded inference should be used with care. Sharing model parameters reduces the memory footprint of your program, but a lot of memory is consumed by global resources (temporary workspace, random number generator, ...) and the MKL-DNN op cache, all of which are stored in static thread_local variables. So **thread number** is the most important factor for memory footprint: any thread that touches MXNet, even for a trivial imperative operator call, incurs memory overhead by creating its own set of thread_local variables. I've spent a lot of time tracking down memory leaks, and the best solution is to limit the number of threads (see the pool sketch below).

A new way to do multithreaded inference on the threaded engine would be very welcome here. It would solve the above problems automatically.
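For reference, a minimal Python sketch of the per-thread executor pattern described above, assuming an MXNet 1.x checkpoint with a single `data` input (the file names are hypothetical). The naive engine has to be selected before `mxnet` is imported:

```python
import os
# The dependency engine is not thread safe, so force the naive engine.
# Assumption: MXNET_ENGINE_TYPE is read when mxnet is first imported.
os.environ['MXNET_ENGINE_TYPE'] = 'NaiveEngine'

import mxnet as mx

# Load the symbol and parameters once; the parameter NDArrays are shared
# (read-only) by every thread's executor.
sym = mx.sym.load('model-symbol.json')      # hypothetical path
saved = mx.nd.load('model-0000.params')     # hypothetical path
arg_params = {k[4:]: v for k, v in saved.items() if k.startswith('arg:')}
aux_params = {k[4:]: v for k, v in saved.items() if k.startswith('aux:')}

def infer(batch):
    """Run one forward pass on a thread-private executor."""
    # Bind a fresh executor in this thread: the parameters are the shared
    # NDArrays loaded above, only the input placeholder is thread-private.
    args = dict(arg_params)
    args['data'] = mx.nd.empty(batch.shape)
    exe = sym.bind(mx.cpu(), args=args, aux_states=aux_params,
                   grad_req='null')
    exe.arg_dict['data'][:] = batch
    exe.forward(is_train=False)
    return exe.outputs[0].asnumpy()
```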
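Dynamic reshaping within a thread can then use `Executor.reshape`, which returns a new executor sharing memory (including the shared parameters) with the old one. A sketch; the input name and shape are assumptions:

```python
# If the next batch has a different shape, reshape this thread's executor
# instead of rebinding from scratch.
exe = exe.reshape(allow_up_sizing=True, data=(8, 3, 224, 224))
exe.forward(is_train=False)
```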
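And since every thread that calls into MXNet allocates its own set of thread_local resources, one way to bound that overhead is to route all inference through a small fixed pool. This sketch is not from the original comment, just an illustration of the "limit thread number" advice; `infer` is the function from the first sketch and `batches` stands for your inputs:

```python
from concurrent.futures import ThreadPoolExecutor

# A fixed-size pool caps the number of threads that ever touch MXNet, and
# therefore the thread_local overhead (workspace, RNG, MKL-DNN op cache).
pool = ThreadPoolExecutor(max_workers=4)
futures = [pool.submit(infer, b) for b in batches]
results = [f.result() for f in futures]
```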