arcadiaphy opened a new issue #14057: validation stucks when training gluoncv ssd model
URL: https://github.com/apache/incubator-mxnet/issues/14057

## Description

When training the gluoncv SSD model, validation sometimes takes far longer than a training epoch. After debugging, the problem turns out to come from the `box_nms` operator, which accounts for most of the time.

## Environment info (Required)

```
Centos 7
CUDA: 9.0
cudnn: 7
mxnet: 1.4.0.rc2
gluon-cv: latest
```

## Minimum reproducible example

The following snippet shows that `box_nms` takes a very long time when processing a large number of prior boxes:

```
import mxnet as mx
import numpy as np

np.random.seed(0)
batch_size = 32
prior_number = 100000

# Each box is laid out as [id, score, xmin, ymin, xmax, ymax].
data = np.zeros((batch_size, prior_number, 6))
# Class id is -1 or 0 (randint excludes the upper bound); -1 stands for background here.
data[:, :, 0] = np.random.randint(-1, 1, (batch_size, prior_number))
data[:, :, 1] = np.random.random((batch_size, prior_number))

xmin = np.random.random((batch_size, prior_number))
ymin = np.random.random((batch_size, prior_number))
width = np.random.random((batch_size, prior_number))
height = np.random.random((batch_size, prior_number))
data[:, :, 2] = xmin
data[:, :, 3] = ymin
data[:, :, 4] = xmin + width
data[:, :, 5] = ymin + height

mx_data = mx.nd.array(data, ctx=mx.gpu(0))
rv = mx.nd.contrib.box_nms(mx_data, overlap_thresh=0.5, valid_thresh=0.01,
                           topk=400, score_index=1, id_index=0)
# Block until the asynchronous GPU work has finished.
mx.nd.waitall()
```

## What I have found out

1. The GPU version of the stable sort in the `SortByKey` function degrades badly as the sorting length grows.
2. The `box_nms` operator doesn't remove background boxes during valid-box filtering, which leads to a large sorting length.

## What I have done

1. Added the `SORT_WITH_THRUST` compile definition in the Makefile: the validation process is still very slow.
2. Added background-box filtering in `box_nms`: the validation process speeds up dramatically, since most boxes are classified as background.

I will post a PR for the second solution.
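To illustrate why the second change shrinks the sorting length, here is a minimal NumPy sketch of filtering background boxes before the score sort. It is not the operator's actual implementation or the code from the planned PR. The helper `filter_candidates` and its `background_id` parameter are hypothetical names used only for this illustration, and the sketch assumes background boxes carry id -1 as in the example above.

```python
import numpy as np

def filter_candidates(boxes, valid_thresh=0.01, id_index=0, score_index=1,
                      background_id=-1):
    """Hypothetical helper: return indices of boxes that pass the score
    threshold and are not background, ordered by descending score."""
    scores = boxes[:, score_index]
    ids = boxes[:, id_index]
    keep = (scores > valid_thresh) & (ids != background_id)
    idx = np.flatnonzero(keep)
    # Only the surviving boxes need to be sorted.
    order = np.argsort(-scores[idx], kind="stable")
    return idx[order]

rng = np.random.default_rng(0)
prior_number = 100000
boxes = np.zeros((prior_number, 6))
boxes[:, 0] = rng.integers(-1, 1, prior_number)  # id is -1 (background) or 0
boxes[:, 1] = rng.random(prior_number)           # score in [0, 1)

# Sorting length when only the score threshold is applied.
score_only = int(np.count_nonzero(boxes[:, 1] > 0.01))
# Sorting length when background boxes are dropped as well.
candidates = filter_candidates(boxes)

print("sort length, score filter only:       ", score_only)
print("sort length, with background filtered:", len(candidates))
```

With the synthetic data above roughly half the boxes are background, so the sorting length is about halved. In real SSD validation most priors are background, so the reduction would be much larger.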