kpuatamazon opened a new issue #19502: URL: https://github.com/apache/incubator-mxnet/issues/19502
## Description intgemm is a 3rd-party library written by me and included as a submodule. Unrelated continuous integration tests were going slower afterwards on CentOS 7 CPU. This has been discussed in comments on https://github.com/apache/incubator-mxnet/commit/13936020d4cc3ccb2d4192adccaa282cef509193 After losing several hairs, the issue appears to be OpenMP support in intgemm's `CMakeLists.txt`: https://github.com/kpu/intgemm/blob/8f28282c3bd854922da638024d2659be52e892e9/CMakeLists.txt#L47-L56 I think this is causing MXNet to use the slow CentOS OpenMP instead of the bundled support. ## Question What's the best practice for a standalone library that has its own OpenMP support to not step on MXNet's internal support? ## To Reproduce Started with c5.18xlarge with the latest AL2 machine learning image. To build: ``` # Always delete the build directory. This is sneaky and appears to survive. rm -rf build; ci/build.py --docker-registry mxnetci --platform centos7_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh build_centos7_cpu ``` To run: ``` #Running docker run --cap-add SYS_PTRACE --rm --shm-size=500m -v $HOME/incubator-mxnet:/work/mxnet -v $HOME/incubator-mxnet/build:/work/build -v $HOME/.ccache:/work/ccache -u 1001:1001 -e CCACHE_MAXSIZE=500G -e CCACHE_TEMPDIR=/tmp/ccache -e CCACHE_DIR=/work/ccache -e CCACHE_LOGFILE=/tmp/ccache.log -ti mxnetci/build.centos7_cpu:latest bash CI_CUDA_COMPUTE_CAPABILITIES='-gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_70,code=sm_70' CI_CMAKE_CUDA_ARCH='5.2 7.0' set +x source /opt/rh/rh-python36/enable export PATH=/opt/rh/rh-python36/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PATH=/opt/rh/rh-python36/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin export LD_LIBRARY_PATH=/opt/rh/rh-python36/root/usr/lib64 LD_LIBRARY_PATH=/opt/rh/rh-python36/root/usr/lib64 export MANPATH=/opt/rh/rh-python36/root/usr/share/man: MANPATH=/opt/rh/rh-python36/root/usr/share/man: export PKG_CONFIG_PATH=/opt/rh/rh-python36/root/usr/lib64/pkgconfig PKG_CONFIG_PATH=/opt/rh/rh-python36/root/usr/lib64/pkgconfig export XDG_DATA_DIRS=/opt/rh/rh-python36/root/usr/share:/usr/local/share:/usr/share XDG_DATA_DIRS=/opt/rh/rh-python36/root/usr/share:/usr/local/share:/usr/share cd /work/mxnet nproc expr 72 / 4 OMP_NUM_THREADS=18 python -m pytest --verbose tests/python/unittest/test_gluon.py::test_slice_pooling2d_slice_pooling2d ``` Repeat the above steps for master and again with these lines commented out in `3rdparty/intgemm/CMakeLists.txt` ``` #option(USE_OPENMP "Use OpenMP" OFF) #if (USE_OPENMP) # message(STATUS "Compiling with OpenMP") # find_package(OpenMP) # if (NOT ${OpenMP_CXX_FOUND}) # message(SEND_ERROR "OpenMP requested but C++ support not found") # endif() # add_compile_options(${OpenMP_CXX_FLAGS}) # target_link_libraries(intgemm PUBLIC OpenMP::OpenMP_CXX) #endif() ``` The master version takes about 548.22s on a c5.18xlarge, while the commented version takes about 58.87s. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
