nolanliou opened a new issue #6354: URL: https://github.com/apache/incubator-tvm/issues/6354
I compared two similar BERT models running on CPU with TVM: one is a PyTorch model, the other is an MXNet model. Due to the large performance difference, I did some profiling. The result shows that the run time of the same operation (matmul) with the same workload varies significantly.

ENV:
1. TVM: built with MKL.
2. Intel CPU
3. OpenMP: `KMP_AFFINITY=compact,1,0 OMP_NUM_THREADS=24`

Model inference time:
```
# mxnet model
TVM Mean inference time: 5.53 ms
# pytorch model
TVM Mean inference time: 23.05 ms
```

Profiling result:
```
# MXNet model
Node Name              Ops.                  Time(us)  Time(%)  Shape      Inputs  Outputs
---------
fused_nn_dense_add_15  fused_nn_dense_add_1  308.926   5.58     (32, 768)  3       1
fused_nn_dense_add_11  fused_nn_dense_add_1  307.277   5.551    (32, 768)  3       1

# PyTorch model
Node Name              Ops.                  Time(us)  Time(%)  Shape      Inputs  Outputs
---------
fused_nn_dense_add_3   fused_nn_dense_add_3  1783.75   7.631    (32, 768)  3       1
fused_nn_dense_add_31  fused_nn_dense_add_3  1593.08   6.815    (32, 768)  3       1
```

IR code (same between the PyTorch model and the MXNet model):
```
attr [0] "compute_scope" = "fused_nn_dense_add_3_compute_";
attr [C: handle] "storage_scope" = "global";
allocate(C, float32, [24576]) {
  attr [0] "extern_scope" = 0;
  @tir.tvm_call_packed("tvm.contrib.cblas.matmul",
    @tir.tvm_stack_make_array(placeholder, @tir.tvm_stack_make_shape(32, 3072, dtype=handle), 0, 2, 0f32, 0, dtype=handle),
    @tir.tvm_stack_make_array(placeholder_1, @tir.tvm_stack_make_shape(768, 3072, dtype=handle), 0, 2, 0f32, 0, dtype=handle),
    @tir.tvm_stack_make_array(C, @tir.tvm_stack_make_shape(32, 768, dtype=handle), 0, 2, 0f32, 0, dtype=handle),
    False, True, dtype=int32)
  for (ax0: int32, 0, 32) "parallel" {
    for (ax1: int32, 0, 768) {
      T_add[((ax0*768) + ax1)] = ((float32*)C[((ax0*768) + ax1)] + (float32*)placeholder_2[ax1])
    }
  }
}
```

However, when setting `OMP_NUM_THREADS=1` the model inference time is the same for both, so this seems to be a problem with multiple threads. What may cause the difference?
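To isolate the operator from the rest of the model, here is a minimal sketch of the exact cblas workload taken from the IR dump above ((32, 3072) x (768, 3072)ᵀ → (32, 768), second operand transposed). It assumes TVM was built with a cblas provider (e.g. `USE_BLAS mkl`) so that `tvm.contrib.cblas` dispatches to MKL; input names and the iteration count are arbitrary:

```python
import numpy as np
import tvm
from tvm import te
from tvm.contrib import cblas

# Shapes and transpose flags match the tvm.contrib.cblas.matmul call in the IR.
A = te.placeholder((32, 3072), name="A", dtype="float32")
B = te.placeholder((768, 3072), name="B", dtype="float32")
C = cblas.matmul(A, B, transa=False, transb=True)

s = te.create_schedule(C.op)
f = tvm.build(s, [A, B, C], target="llvm")

ctx = tvm.cpu(0)
a = tvm.nd.array(np.random.uniform(size=(32, 3072)).astype("float32"), ctx)
b = tvm.nd.array(np.random.uniform(size=(768, 3072)).astype("float32"), ctx)
c = tvm.nd.array(np.zeros((32, 768), dtype="float32"), ctx)

# Run under different OMP_NUM_THREADS / KMP_AFFINITY settings and compare.
evaluator = f.time_evaluator(f.entry_name, ctx, number=100)
print("matmul: %.3f us" % (evaluator(a, b, c).mean * 1e6))
```

If this standalone matmul shows the same thread-count sensitivity, the issue would point at the MKL/OpenMP interaction rather than at the PyTorch vs. MXNet frontends.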