moseswmwong opened a new issue #19242: URL: https://github.com/apache/incubator-mxnet/issues/19242
## Description
I installed MXNet following this guide: https://mxnet.apache.org/versions/1.7/get_started/windows_setup

MXNet runs very slowly on my 64-bit Windows 10 machine:
- CPU - Intel i7, gaming grade PC
- RAM - 16 GB
- GPU - NVIDIA GTX 1080, 8 GB RAM
- Hard disk - 4 TB

Installations: I followed the instructions as strictly as possible and carefully checked every milestone.
- Anaconda 3, added channel "conda-forge", created a Python 3.6 environment
- Intel MKL software installed - 2020 Update 3
- Microsoft Visual Studio 2017
- NVIDIA CUDA 10.2 installed; the compiled sample code passed the NVIDIA device query and fluid simulator tests. (I know the instructions recommend 9.2, but I already have 10.2 installed, 9.2 is a bit old, and CUDA is downward compatible.)
- NVIDIA cuDNN for CUDA 10.2 installed; all bin, include, and lib files copied into the CUDA 10.2 directory tree

Installation inside the Anaconda environment. Since a new conda environment was created, here are the modules:
- python 3.6
- opencv 4.4.0 from conda-forge
- `pip install mxnet-cu102mkl` failed; from the issue list I found that the package is too large, so it has to be installed from your repo
- `pip install mxnet-cu102mkl==1.7.0 -f https://dist.mxnet.io/python` (Issue [link](https://github.com/apache/incubator-mxnet/issues/17963))
- `pip install gluoncv`

Checks all passed:

check1

    Python 3.6.11 (default, Aug 5 2020, 19:41:03) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import cv2
    >>> print (cv2.__version__)
    4.4.0

check2

    Python 3.6.11 (default, Aug 5 2020, 19:41:03) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mxnet
    >>> print (mxnet.__version__)
    1.7.0

check3

    Python 3.6.11 (default, Aug 5 2020, 19:41:03) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mxnet
    >>> import gluoncv
    >>> print (gluoncv.__version__)
    0.8.0

check4

    Python 3.6.11 (default, Aug 5 2020, 19:41:03) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mxnet as mx
    >>> a = mx.nd.ones((2,3), mx.gpu())
    >>> b = a * 2 + 1
    >>> b.asnumpy()
    array([[3., 3., 3.],
           [3., 3., 3.]], dtype=float32)

check5

    Python 3.6.11 (default, Aug 5 2020, 19:41:03) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mxnet
    >>> from mxnet.runtime import feature_list
    >>> feature_list()
    [✔ CUDA, ✔ CUDNN, ✖ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✖ CPU_SSE, ✖ CPU_SSE2, ✖ CPU_SSE3, ✖ CPU_SSE4_1, ✖ CPU_SSE4_2, ✖ CPU_SSE4A, ✖ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✖ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✔ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✖ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]

### Error Message
No error message. The problem is that the following code runs at about 0.5 frames per second on a 480p YouTube MP4 video.

    import time

    import gluoncv as gcv
    from gluoncv.utils import try_import_cv2
    cv2 = try_import_cv2()
    import mxnet as mx


    def gpu_device(gpu_number=0):
        try:
            _ = mx.nd.array([1, 2, 3], ctx=mx.gpu(gpu_number))
        except mx.MXNetError:
            return False
        _ = mx.gpu(gpu_number)
        return True


    if not gpu_device():
        print('No GPU device found!')
        exit()

    net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_voc', pretrained=True)
    # Compile the model for faster speed
    net.hybridize()

    cap = cv2.VideoCapture('video.mp4')
    time.sleep(1)

    axes = None
    #NUM_FRAMES = 1000
    #for i in range(NUM_FRAMES):
    while True:
        # Load frame from the camera
        ret, frame = cap.read()
        if not ret:
            break

        # Image pre-processing
        frame = mx.nd.array(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)).astype('uint8')
        rgb_nd, frame = gcv.data.transforms.presets.ssd.transform_test(frame, short=512, max_size=700)

        # Run frame through network
        class_IDs, scores, bounding_boxes = net(rgb_nd)

        # Display the result
        img = gcv.utils.viz.cv_plot_bbox(frame, bounding_boxes[0], scores[0], class_IDs[0], class_names=net.classes)
        gcv.utils.viz.cv_plot_image(img)
        if cv2.waitKey(1) == ord('q'):
            break
        #cv2.waitKey(1)

    cap.release()
    cv2.destroyAllWindows()

YOLOv3 should be able to run at more than 30 fps. The Windows Task Manager reports:

                22%    45%            1%
    Name        CPU    Memory   ...   GPU   GPU engine
    Python      16%    278 MB         9%

The GPU is evidently not doing the work for MXNet in this script, even though check5 shows that the CUDA and cuDNN features are all available to MXNet, and check4 (`a = mx.nd.ones((2,3), mx.gpu())`) proves that MXNet can use the GPU.

## To Reproduce
Please see above.

### Steps to reproduce
Please see above.

## What have you tried to solve it?
Please see checks 1-5.

## Environment
Please see above.

Output: a nice YOLO video with bounding boxes on each person, car, bicycle, etc., but it runs at about 0.5 fps.
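The 0.5 fps figure covers the whole loop, so it does not show whether decoding, pre-processing, inference, or display is the slow part. One way to narrow this down is to time each stage separately. The sketch below is only an illustration and is not part of the reported script: `StageTimer` and the stand-in functions `read_frame`, `preprocess`, and `infer` are hypothetical names, and the `time.sleep` calls merely simulate work so the example runs without MXNet or a video file.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name, sync=None):
        start = time.perf_counter()
        try:
            yield
        finally:
            # MXNet queues operations asynchronously, so a real measurement
            # should pass sync=mx.nd.waitall to wait for pending work
            # before the clock is stopped.
            if sync is not None:
                sync()
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return a dict mapping stage name to mean seconds per call."""
        return {name: self.totals[name] / self.counts[name] for name in self.totals}


# Stand-in stages (assumptions for illustration only).
def read_frame():
    time.sleep(0.001)


def preprocess():
    time.sleep(0.002)


def infer():
    time.sleep(0.05)


timer = StageTimer()
for _ in range(5):
    with timer.stage("read"):
        read_frame()
    with timer.stage("preprocess"):
        preprocess()
    with timer.stage("inference"):
        infer()

means = timer.report()
for name, seconds in means.items():
    print(f"{name}: {seconds * 1000:.1f} ms per frame")

slowest = max(means, key=means.get)
print("slowest stage:", slowest)
```

In the real loop the stand-ins would be replaced by `cap.read()`, the `transform_test` call, and `net(rgb_nd)`, with `sync=mx.nd.waitall` passed for the inference stage so that queued GPU or CPU work is included in the measurement.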
