nopattern opened a new issue #15482: mx2onnx error about batchnorm
URL: https://github.com/apache/incubator-mxnet/issues/15482
 
 
   
## Description
I use mx2onnx onnx_mxnet.export_model to convert an MXNet symbol to ONNX, but the moving_mean & moving_var params of BatchNorm are not in the params dict, so the export fails with a KeyError (full traceback below).
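For context, MXNet stores the BatchNorm running statistics as auxiliary states, so `mx.model.load_checkpoint` returns them in its third return value (`auxs`), separate from `args`. A minimal sketch, using the checkpoint from the repro below:

```
import mxnet as mx

# moving_mean / moving_var are auxiliary states in MXNet, so load_checkpoint
# returns them in aux_params (auxs), separate from arg_params (args)
sym, args, auxs = mx.model.load_checkpoint('./model/ssd_mobilenetv2_512', 18)
print([k for k in auxs if 'moving' in k])
# expected to include 'first-3x3-conv-batchnorm_moving_mean', etc.
```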
   
## Environment info (Required)
   
   ```
   ----------Python Info----------
   Version      : 3.6.8
   Compiler     : GCC 5.4.0 20160609
   Build        : ('default', 'May  7 2019 14:58:50')
   Arch         : ('64bit', 'ELF')
   ------------Pip Info-----------
   Version      : 19.1.1
   Directory    : /usr/local/lib/python3.6/dist-packages/pip
   ----------MXNet Info-----------
   Version      : 1.5.0
   Directory    : /home/deep/workssd/mxnet/incubator-mxnet/python/mxnet
   Hashtag not found. Not installed from pre-built package.
   ----------System Info----------
   Platform     : Linux-4.4.0-148-generic-x86_64-with-Ubuntu-16.04-xenial
   system       : Linux
   node         : MS-7817
   release      : 4.4.0-148-generic
   version      : #174-Ubuntu SMP Tue May 7 12:20:14 UTC 2019
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                4
   On-line CPU(s) list:   0-3
   Thread(s) per core:    1
   Core(s) per socket:    4
   Socket(s):             1
   NUMA node(s):          1
   Vendor ID:             GenuineIntel
   CPU family:            6
   Model:                 60
   Model name:            Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
   Stepping:              3
   CPU MHz:               3657.070
   CPU max MHz:           3700.0000
   CPU min MHz:           800.0000
   BogoMIPS:              6600.45
   Virtualization:        VT-x
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              256K
   L3 cache:              6144K
   NUMA node0 CPU(s):     0-3
   
   
   ```
   
Package used (Python/R/Scala/Julia):
(I'm using Python)
   
## Build info (Required if built from source)

Compiler (gcc):

MXNet commit hash:
(da4b2a82511df)
   
Build config:

```
   # Licensed to the Apache Software Foundation (ASF) under one
   # or more contributor license agreements.  See the NOTICE file
   # distributed with this work for additional information
   # regarding copyright ownership.  The ASF licenses this file
   # to you under the Apache License, Version 2.0 (the
   # "License"); you may not use this file except in compliance
   # with the License.  You may obtain a copy of the License at
   #
   #   http://www.apache.org/licenses/LICENSE-2.0
   #
   # Unless required by applicable law or agreed to in writing,
   # software distributed under the License is distributed on an
   # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   # KIND, either express or implied.  See the License for the
   # specific language governing permissions and limitations
   # under the License.
   
   
#-------------------------------------------------------------------------------
   #  Template configuration for compiling mxnet
   #
   #  If you want to change the configuration, please use the following
   #  steps. Assume you are on the root directory of mxnet. First copy the this
   #  file so that any local changes will be ignored by git
   #
   #  $ cp make/config.mk .
   #
   #  Next modify the according entries, and then compile by
   #
   #  $ make
   #
   #  or build in parallel with 8 threads
   #
   #  $ make -j8
   
#-------------------------------------------------------------------------------
   
   #---------------------
   # choice of compiler
   #--------------------
   
   ifndef CC
   export CC = gcc
   endif
   ifndef CXX
   export CXX = g++
   endif
   ifndef NVCC
   export NVCC = nvcc
   endif
   
   # whether compile with options for MXNet developer
   DEV = 0
   
   # whether compile with debug
   DEBUG = 0
   
   # whether to turn on segfault signal handler to log the stack trace
   USE_SIGNAL_HANDLER =
   
   # the additional link flags you want to add
   ADD_LDFLAGS =
   
   # the additional compile flags you want to add
   ADD_CFLAGS =
   
   #---------------------------------------------
   # matrix computation libraries for CPU/GPU
   #---------------------------------------------
   
   # whether use CUDA during compile
   USE_CUDA = 1
   
   # add the path to CUDA library to link and compile flag
   # if you have already add them to environment variable, leave it as NONE
   USE_CUDA_PATH = /usr/local/cuda
   #USE_CUDA_PATH = NONE
   
   # whether to enable CUDA runtime compilation
   ENABLE_CUDA_RTC = 1
   
   # whether use CuDNN R3 library
   USE_CUDNN = 1
   
   # whether to use NVTX when profiling
   USE_NVTX = 0
   
   #whether to use NCCL library
   USE_NCCL = 0
   #add the path to NCCL library
   USE_NCCL_PATH = NONE
   
   # whether use opencv during compilation
   # you can disable it, however, you will not able to use
   # imbin iterator
   USE_OPENCV = 1
   # Add OpenCV include path, in which the directory `opencv2` exists
   USE_OPENCV_INC_PATH = NONE
   # Add OpenCV shared library path, in which the shared library exists
   USE_OPENCV_LIB_PATH = NONE
   
   #whether use libjpeg-turbo for image decode without OpenCV wrapper
   USE_LIBJPEG_TURBO = 0
   #add the path to libjpeg-turbo library
   USE_LIBJPEG_TURBO_PATH = NONE
   
   # use openmp for parallelization
   USE_OPENMP = 1
   
   # whether use MKL-DNN library: 0 = disabled, 1 = enabled
# if USE_MKLDNN is not defined, MKL-DNN will be enabled by default on x86 Linux.
# you can disable it explicitly with USE_MKLDNN = 0
   USE_MKLDNN = 0
   
   # whether use NNPACK library
   USE_NNPACK = 0
   
   # choose the version of blas you want to use
   # can be: mkl, blas, atlas, openblas
   # in default use atlas for linux while apple for osx
   UNAME_S := $(shell uname -s)
   ifeq ($(UNAME_S), Darwin)
   USE_BLAS = apple
   else
   USE_BLAS = atlas
   endif
   
   # whether use lapack during compilation
   # only effective when compiled with blas versions openblas/apple/atlas/mkl
   USE_LAPACK = 1
   
   # path to lapack library in case of a non-standard installation
   USE_LAPACK_PATH =
   
# add path to intel library, you may need it for MKL, if you did not add the path
# to environment variable
   USE_INTEL_PATH = NONE
   
# If use MKL only for BLAS, choose static link automatically to allow python wrapper
   ifeq ($(USE_BLAS), mkl)
   USE_STATIC_MKL = 1
   else
   USE_STATIC_MKL = NONE
   endif
   
   #----------------------------
   # Settings for power and arm arch
   #----------------------------
   ARCH := $(shell uname -a)
   ifneq (,$(filter $(ARCH), armv6l armv7l powerpc64le ppc64le aarch64))
        USE_SSE=0
        USE_F16C=0
   else
        USE_SSE=1
   endif
   
   #----------------------------
   # F16C instruction support for faster arithmetic of fp16 on CPU
   #----------------------------
   # For distributed training with fp16, this helps even if training on GPUs
   # If left empty, checks CPU support and turns it on.
# For cross compilation, please check support for F16C on target device and turn off if necessary.
   USE_F16C =
   
   #----------------------------
   # distributed computing
   #----------------------------
   
   # whether or not to enable multi-machine supporting
   USE_DIST_KVSTORE = 0
   
# whether or not allow to read and write HDFS directly. If yes, then hadoop is
# required
   USE_HDFS = 0
   
   # path to libjvm.so. required if USE_HDFS=1
   LIBJVM=$(JAVA_HOME)/jre/lib/amd64/server
   
   # whether or not allow to read and write AWS S3 directly. If yes, then
   # libcurl4-openssl-dev is required, it can be installed on Ubuntu by
   # sudo apt-get install -y libcurl4-openssl-dev
   USE_S3 = 0
   
   #----------------------------
   # performance settings
   #----------------------------
   # Use operator tuning
   USE_OPERATOR_TUNING = 1
   
   # Use gperftools if found
   # Disable because of #8968
   USE_GPERFTOOLS = 0
   
# path to gperftools (tcmalloc) library in case of a non-standard installation
   USE_GPERFTOOLS_PATH =
   
   # Link gperftools statically
   USE_GPERFTOOLS_STATIC =
   
   # Use JEMalloc if found, and not using gperftools
   USE_JEMALLOC = 1
   
   # path to jemalloc library in case of a non-standard installation
   USE_JEMALLOC_PATH =
   
   # Link jemalloc statically
   USE_JEMALLOC_STATIC =
   
   #----------------------------
   # additional operators
   #----------------------------
   
# path to folders containing projects specific operators that you don't want to put in src/operators
   EXTRA_OPERATORS =
   
   #----------------------------
   # other features
   #----------------------------
   
   # Create C++ interface package
   USE_CPP_PACKAGE = 0
   
   # Use int64_t type to represent the total number of elements in a tensor
   # This will cause performance degradation reported in issue #14496
# Set to 1 for large tensor with tensor size greater than INT32_MAX i.e. 2147483647
   # Note: the size of each dimension is still bounded by INT32_MAX
   USE_INT64_TENSOR_SIZE = 0
   
   # Python executable. Needed for cython target
   PYTHON = python
   
   #----------------------------
   # plugins
   #----------------------------
   
   # whether to use caffe integration. This requires installing caffe.
   # You also need to add CAFFE_PATH/build/lib to your LD_LIBRARY_PATH
   # CAFFE_PATH = $(HOME)/caffe
   # MXNET_PLUGINS += plugin/caffe/caffe.mk
   
   
   #WARPCTC_PATH = $(HOME)/warp-ctc
   WARPCTC_PATH = /home/deep/warp-ctc
   MXNET_PLUGINS += plugin/warpctc/warpctc.mk
   
   # whether to use sframe integration. This requires build sframe
   # g...@github.com:dato-code/SFrame.git
   # SFRAME_PATH = $(HOME)/SFrame
# MXNET_PLUGINS += plugin/sframe/plugin.mk
```

## Error Message:

```
INFO:root:Converting idx: 0, op: null, name: data
INFO:root:Converting idx: 1, op: null, name: first-3x3-conv-conv2d_weight
INFO:root:Converting idx: 2, op: Convolution, name: first-3x3-conv-conv2d
INFO:root:Converting idx: 3, op: null, name: first-3x3-conv-batchnorm_gamma
INFO:root:Converting idx: 4, op: null, name: first-3x3-conv-batchnorm_beta
INFO:root:Converting idx: 5, op: null, name: first-3x3-conv-batchnorm_moving_mean
Traceback (most recent call last):
  File "/home/deep/workssd/arm/tvm_app/tune_relay_mobile_gpu.py", line 484, in <module>
    tune_and_evaluate(tuning_option)
  File "/home/deep/workssd/arm/tvm_app/tune_relay_mobile_gpu.py", line 436, in tune_and_evaluate
    net, params, input_shape, _ = get_network(network, batch_size=1)
  File "/home/deep/workssd/arm/tvm_app/tune_relay_mobile_gpu.py", line 93, in get_network
    return get_network_lpr_mb2(name,batch_size)
  File "/home/deep/workssd/arm/tvm_app/tune_relay_mobile_gpu.py", line 143, in get_network_lpr_mb2
    test_onnx()
  File "/home/deep/workssd/arm/tvm_app/tune_relay_mobile_gpu.py", line 135, in test_onnx
    converted_model_path = onnx_mxnet.export_model(mx_sym, args, [input_shape], np.float32, onnx_file, True)
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/export_model.py", line 87, in export_model
    verbose=verbose)
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/export_onnx.py", line 256, in create_onnx_graph_proto
    idx=idx
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/export_onnx.py", line 92, in convert_layer
    return convert_func(node, **kwargs)
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/_op_translations.py", line 170, in convert_weights_and_inputs
    np_arr = weights[name]
KeyError: 'first-3x3-conv-batchnorm_moving_mean'
```
   
## Minimum reproducible example

```
import numpy as np
import mxnet as mx
from mxnet.contrib import onnx as onnx_mxnet
# get_symbol is defined in my SSD model code

batch_size = 1
input_shape = (batch_size, 3, 512, 512)
output_shape = (batch_size, 65520, 14)

mx_sym, args, auxs = mx.model.load_checkpoint('./model/ssd_mobilenetv2_512', 18)
mx_sym = get_symbol('mobilenetv2', 512, num_classes=1, nms_thresh=0.5,
                    force_nms=True, nms_topk=400)

onnx_file = './mxnet_exported_resnet18.onnx'
converted_model_path = onnx_mxnet.export_model(mx_sym, args, [input_shape],
                                               np.float32, onnx_file, True)
```
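Note that only `args` is passed to `export_model` above, while `auxs` (which holds the BatchNorm moving statistics) is not, which appears to be why the lookup fails.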
   
## Steps to reproduce
(Paste the commands you ran that produced the error.)

1. python3 tran2onnx.py
   
## What have you tried to solve it?

1. By debugging, I found that the moving_mean & moving_var of batchnorm are not in the params dict, so the converter treats them as graph inputs, which is not what they really are.
2. There should be code that handles the moving_mean & moving_var of batchnorm independently (a possible workaround is sketched below).
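A possible workaround, assuming `export_model` accepts a single params dict holding both arg and aux params: merge `auxs` into the dict before exporting, so the converter can find the moving statistics.

```
# Sketch of a workaround (my assumption, not verified against the converter
# internals): pass arg_params and aux_params together in one dict.
params = {}
params.update(args)   # weights, biases, gamma, beta (arg_params)
params.update(auxs)   # moving_mean / moving_var (aux_params)
converted_model_path = onnx_mxnet.export_model(mx_sym, params, [input_shape],
                                               np.float32, onnx_file, True)
```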
   
