What is meant by:

*Profiling*
The profiler is always on even for production release builds, because MXNet
cannot be built without it [2].  ?

Do you mean that it is always built, or that it is turned on (i.e. recording
and saving profiling information)?  I am not aware of it being turned on by
default.


The profiler has no overhead when it is built in but not turned on.


On Thu, Nov 22, 2018 at 2:35 AM Anton Chernov <mecher...@gmail.com> wrote:

> Dear MXNet community,
>
> I propose raising the minimum cmake version required to build MXNet to
> 3.10, which was tagged on March 16, 2018 [1].
>
> The general effort to repair the cmake scripts aims to deprecate make and
> maintain only 1 build system.
>
> *Need*
>
> The build system is the foundation of every software project. Its quality
> directly impacts the quality of the project. The MXNet build system is
> fragile, partially broken and not maintained.
>
> Users of MXNet and developers are confused by the fact that 2 build systems
> exist at the same time: make and CMake.
>
> The main functional areas which are impacted by the current state of the
> cmake files are:
>
> *OpenMP*
> The current CMake files mix OpenMP libraries from different compilers, which
> is undefined behaviour. It leads to nondeterministic crashes on some
> platforms. Build and deployment are very hard. There is no evidence of any
> benefit from having the llvm OpenMP library as a submodule in MXNet.
>
> *BLAS and LAPACK*
> Basic math library usage is mixed up. It is hard and confusing to configure,
> and there is no logic for choosing the optimal library. MKL and
> OpenBLAS are intermixed in an unpredictable manner.
>
> *Profiling*
> The profiler is always on even for production release builds, because MXNet
> cannot be built without it [2].
>
> *CUDA*
> CUDA is detected by 3 different files in the current cmake scripts, and the
> choice among them is based on obscure logic that involves different
> versions of cmake and the platform it is building on:
>
> * CMakeLists.txt
> * cmake/FirstClassLangCuda.cmake
> * 3rdparty/mshadow/cmake/Cuda.cmake
>
>
> *Confusing and misleading cmake user options*
> For example, USE_CUDA / USE_OLDCMAKECUDA. Some of them will or will not do
> what they are supposed to, depending on the cmake generator and the version
> of cmake [3].
> There are currently more than 30 build parameters for MXNet, none of them
> documented. Some of them are not even located in the main CMakeLists.txt
> file, for example 'BLAS'.
>
>
> *Issues*
> There is a significant number of GitHub issues related to cmake or the build
> in general. New tickets are opened frequently.
>
> * #8702 (https://github.com/apache/incubator-mxnet/issues/8702)
>   [DISCUSSION] Should we deprecate Makefile and only use CMake?
> * #5079 (https://github.com/apache/incubator-mxnet/issues/5079)
>   troubles building python interface on raspberry pi 3
> * #1722 (https://github.com/apache/incubator-mxnet/issues/1722)
>   problem: compile mxnet with hdfs
> * #11549 (https://github.com/apache/incubator-mxnet/issues/11549)
>   Pip package can be much faster (OpenCV version?)
> * #11417 (https://github.com/apache/incubator-mxnet/issues/11417)
>   libomp.so dependency (need REAL fix)
> * #8532 (https://github.com/apache/incubator-mxnet/issues/8532)
>   mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL
>   // (indirectly)
> * #11131 (https://github.com/apache/incubator-mxnet/issues/11131)
>   mxnet-cu92 low efficiency // (indirectly)
> * #10743 (https://github.com/apache/incubator-mxnet/issues/10743)
>   CUDA 9.1.xx failed if not set OLDCMAKECUDA on cmake 3.10.3 with unix
>   makefile or Ninja generator
> * #10742 (https://github.com/apache/incubator-mxnet/issues/10742)
>   typo in cpp-package/CMakeLists.txt
> * #10737 (https://github.com/apache/incubator-mxnet/issues/10737)
>   Cmake is running again when execute make install
> * #10543 (https://github.com/apache/incubator-mxnet/issues/10543)
>   Failed to build from source when set USE_CPP_PACKAGE = 1, fatal error
>   C1083: unabel to open file: “mxnet-cpp/op.h”: No such file or directory
> * #10217 (https://github.com/apache/incubator-mxnet/issues/10217)
>   Building with OpenCV causes link errors
> * #10175 (https://github.com/apache/incubator-mxnet/issues/10175)
>   MXNet MKLDNN build dependency/flow discussion
> * #10009 (https://github.com/apache/incubator-mxnet/issues/10009)
>   [CMAKE][IoT] Remove pthread from android_arm64 build
> * #9944 (https://github.com/apache/incubator-mxnet/issues/9944)
>   MXNet MinGW-w64 build error // (indirectly)
> * #9868 (https://github.com/apache/incubator-mxnet/issues/9868)
>   MKL and CMake
> * #9516 (https://github.com/apache/incubator-mxnet/issues/9516)
>   cmake cuda arch issues
> * #9105 (https://github.com/apache/incubator-mxnet/issues/9105)
>   libmxnet.so load path error
> * #9096 (https://github.com/apache/incubator-mxnet/issues/9096)
>   MXNet built with GPerftools crashes
> * #8786 (https://github.com/apache/incubator-mxnet/issues/8786)
>   Link failure on DEBUG=1 (static member symbol not defined)
>   // (indirectly)
> * #8729 (https://github.com/apache/incubator-mxnet/issues/8729)
>   Build amalgamation using a docker // (indirectly)
> * #8667 (https://github.com/apache/incubator-mxnet/issues/8667)
>   Compiler/linker error while trying to build from source on Mac OSX
>   Sierra 10.12.6
> * #8295 (https://github.com/apache/incubator-mxnet/issues/8295)
>   Building with cmake - error
> * #7852 (https://github.com/apache/incubator-mxnet/issues/7852)
>   Trouble installing MXNet on Raspberry Pi 3
> * #13303 (https://github.com/apache/incubator-mxnet/issues/13303)
>   mxnet-cpp package cross-compilation fails with OSError: "wrong ELF
>   class: ELFCLASS32"
> * #13245 (https://github.com/apache/incubator-mxnet/issues/13245)
>   mxnet::cpp::NDArray::WaitAll() take about 160ms on gtx1080ti
>   // (indirectly, cmake impact on performance)
> * #12849 (https://github.com/apache/incubator-mxnet/issues/12849)
>   [cmake][cpp-package] Building with cmake does not install the
>   cpp-package API
> * #12568 (https://github.com/apache/incubator-mxnet/issues/12568)
>   [Scala][macOS] Trying to build from source
> * #12134 (https://github.com/apache/incubator-mxnet/issues/12134)
>   why MKL and MKL-DNN can't be used simultaneously in ChooseBlas.cmake
> * #12107 (https://github.com/apache/incubator-mxnet/issues/12107)
>   Faulty CUDA detection with cmake
> * #11769 (https://github.com/apache/incubator-mxnet/issues/11769)
>   USE_BLAS=MKL fails due to mshadow requiring openblas
> * #11563 (https://github.com/apache/incubator-mxnet/issues/11563)
>   Deprecate USE_PROFILER from make/cmake
> * #10856 (https://github.com/apache/incubator-mxnet/issues/10856)
>   Failed OpenMP assertion when loading MXNet compiled with DEBUG=1
>
>
> *Approach*
>
> We are going to iteratively fix and simplify the cmake build system and,
> once it is possible, deprecate and remove the make system. These PRs have
> been opened so far:
>
>
> * #11148 (https://github.com/apache/incubator-mxnet/pull/11148)
>   [MXNET-679] Refactor handling BLAS libraries with cmake
> * #12160 (https://github.com/apache/incubator-mxnet/pull/12160)
>   Remove conflicting llvm OpenMP from cmake builds
> * #10564 (https://github.com/apache/incubator-mxnet/pull/10564)
>   Simplified CUDA language detection in cmake
> * #10530 (https://github.com/apache/incubator-mxnet/pull/10530)
>   Jetson build with cmake and CUDA
>
> Unfortunately, none of them has had any success. The question of updating
> the minimal required version was not asked before, so I'm raising it now.
>
> By upgrading the version we could remove all custom error-prone cmake files
> that are related to CUDA, BLAS and LAPACK, which would essentially cover
> most of the problems.
>
> OpenMP and profiling would need to be addressed separately.
>
> *Benefit*
>
> Easier maintenance of the MXNet build, clarity for users, quality and
> predictability.
>
> *Alternatives*
>
> * Leave the situation as is
> * Proceed with the make build
>
>
> I would appreciate hearing your thoughts.
>
> Best
> Anton
>
> [1] https://github.com/Kitware/CMake/releases/tag/v3.10.3
> [2] https://github.com/apache/incubator-mxnet/issues/11563
> [3]
> https://github.com/apache/incubator-mxnet/blob/master/CMakeLists.txt#L46-L57
>
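
For reference, here is a rough sketch of the kind of simplification the
quoted proposal describes, assuming a cmake 3.10 baseline: CUDA is enabled
as a built-in language and BLAS/LAPACK are located with the find modules
that ship with cmake, in place of custom detection scripts. The project
name is a placeholder and the USE_CUDA handling is only illustrative; this
is not MXNet's actual CMakeLists.txt.

```cmake
# Hypothetical sketch only, not MXNet's real build files.
cmake_minimum_required(VERSION 3.10)
project(mxnet_build_sketch LANGUAGES C CXX)  # placeholder project name

# USE_CUDA is named in the proposal; its handling here is an assumption.
option(USE_CUDA "Build with CUDA support" ON)

if(USE_CUDA)
  # CUDA is a first-class language in cmake (since 3.8), so no custom
  # detection module is needed; check_language() sets CMAKE_CUDA_COMPILER
  # when a CUDA compiler can be found.
  include(CheckLanguage)
  check_language(CUDA)
  if(CMAKE_CUDA_COMPILER)
    enable_language(CUDA)
  else()
    message(WARNING "No CUDA compiler found, continuing without CUDA")
    set(USE_CUDA OFF)
  endif()
endif()

# Stock find modules bundled with cmake. Setting BLA_VENDOR (for example
# -DBLA_VENDOR=OpenBLAS) restricts the search to a single vendor.
find_package(BLAS)
find_package(LAPACK)

message(STATUS "CUDA enabled: ${USE_CUDA}")
message(STATUS "BLAS found:   ${BLAS_FOUND}")
message(STATUS "LAPACK found: ${LAPACK_FOUND}")
```

Saved as CMakeLists.txt in an empty directory, this should configure with
"cmake ." and only print what was detected; it builds no targets.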
