Hi all,
I have two containers:

1. One running Python 2 and MXNet 1.1
2. An updated container running Python 3 and MXNet 1.4

I have observed significant performance regressions in the py3/MXNet 1.4.1 container, which is built with MKL-DNN enabled. As a minimal reproducible example, I am using the code in this repo: https://github.com/opringle/multivariate_time_series_forecasting

I used the profiler in each container to capture the second training batch, roughly like this:

```
i = 0
for batch in train_iter:
    start_time = time.time()
    if i == 1:
        profiler.set_state('run')
    module.forward(batch, is_train=True)
    module.backward()
    mx.nd.waitall()
    if i == 1:
        profiler.set_state('stop')
        profiler.dump()
    i += 1  # advance the batch counter so the profiled branch is reached
```

This is the profiler output sorted by total op time for the py2/1.1 container:

Same for the py3/1.4.1 container: (uploading as a reply due to a new-user restriction)

Some ops, such as `backward_Convolution`, are significantly slower. My machine's CPU is a 6-core Intel i7. Does anyone know whether this regression is operator-specific, or know a method to determine if it is? Is this issue related to MKL-DNN somehow?

Other context:

- Due to how our code is currently structured in my org, it is quite difficult to upgrade to 1.5+.
- When I run the same example with the same containers on a machine with an Intel Xeon CPU (a c5 instance on AWS), the opposite occurs: the py3/1.4.1 container is much faster per batch than the py2/1.1 container.

---

[Visit Topic](https://discuss.mxnet.apache.org/t/performance-regression-in-1-4/6790/1) to respond.
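P.S. In case it helps whoever digs into this: since `profiler.dump()` writes a Chrome-trace JSON file, the two runs can be compared op-by-op with a short stdlib script. This is just a sketch under assumptions: the helper names `op_totals`/`top_ops` are mine, and I'm assuming the trace contains begin/end event pairs (`"ph": "B"`/`"E"`) with microsecond timestamps (worth checking against your own `profile.json`).

```python
# Sketch: aggregate per-op time from an MXNet profiler dump (Chrome trace
# format). Assumes "B"/"E" begin/end pairs with microsecond "ts" fields;
# also handles "X" complete events, which carry their own "dur".
import json
from collections import defaultdict

def op_totals(trace_path):
    """Return {op_name: total microseconds} from a profiler trace file."""
    with open(trace_path) as f:
        events = json.load(f)["traceEvents"]
    totals = defaultdict(float)
    open_ts = {}  # (name, tid) -> start timestamp of the open interval
    for ev in events:
        key = (ev.get("name"), ev.get("tid"))
        if ev.get("ph") == "B":
            open_ts[key] = ev["ts"]
        elif ev.get("ph") == "E" and key in open_ts:
            totals[ev["name"]] += ev["ts"] - open_ts.pop(key)
        elif ev.get("ph") == "X":  # complete event: duration is explicit
            totals[ev["name"]] += ev.get("dur", 0)
    return dict(totals)

def top_ops(trace_path, n=10):
    """Top-n ops by total time, mirroring the 'total op time' sort above."""
    return sorted(op_totals(trace_path).items(), key=lambda kv: -kv[1])[:n]
```

Running `top_ops` on the dump from each container and diffing the two lists should show whether the regression is concentrated in a few ops (e.g. `backward_Convolution`) or spread across the board.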
