DickJC123 edited a comment on issue #14029: Out of memory error in 3d Conv for matrix splits > 10, CUDNN strange behaviour
URL: https://github.com/apache/incubator-mxnet/issues/14029#issuecomment-459466018

As pointed out earlier, going from 10 to 11 crosses the threshold at which cudnn thinks the fft implementation is fastest. That algo apparently has a huge workspace requirement, probably related to being 3D. There is no cudnn bug here. You have a couple of remedies:

1. Set MXNET_CUDNN_AUTOTUNE_DEFAULT=1. That will result in all convolutions in your model being chosen by cudnnFind(), subject to the limitation that the workspace is less than 1GB. The detrimental fft algo will be avoided because its workspace is too large (although the model may run slower).

2. Leave MXNET_CUDNN_AUTOTUNE_DEFAULT=0, but control the 3d convolution locally, e.g. with Convolution(..., cudnn_tune='limited_workspace', ...). Only the problem Convolution will have its algo determined by cudnnFind(), subject to a workspace limitation of 1GB. If you don't like the 1GB, then override it locally with e.g. workspace=2048 to set the workspace limit to 2GB.

There is currently no way to limit algos by workspace size without also running cudnnFind(). We could add this functionality in a backward-compatible way by adding a new supported value to MXNET_CUDNN_AUTOTUNE_DEFAULT:

- Values: 0, 1, 2, **or 3** (default=1)
- The default value of cudnn auto tuning for convolution layers.
- Value of 0 means there is no auto tuning to pick the convolution algo
- Performance tests are run to pick the convolution algo when value is 1 or 2
- Value of 1 chooses the best algo in a limited workspace
- Value of 2 chooses the fastest algo whose memory requirements may be larger than the default workspace threshold
- **Value of 3 means there is no auto tuning to pick the convolution algo, but the algo cannot have a workspace requirement greater than the limit.**

There would be a locally set equivalent to this in the Convolution parameters:

- cudnn_tune='off' # use cudnnGet(), no workspace limit, even if set locally
- **cudnn_tune='off_limited_workspace' # use cudnnGet() subject to 1GB or locally-set limit**
- cudnn_tune='limited_workspace' # use cudnnFind() subject to 1GB or locally-set limit
- cudnn_tune='fastest' # use cudnnFind(), no workspace limit, even if set locally

While we're at it, I'm not fond of the compiled-in default workspace size of 1GB. I'd suggest adding an environment variable:

MXNET_CUDNN_WORKSPACE_LIMIT_DEFAULT # If not set, then limit = 1024 (MB)
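
The proposed semantics can be sketched in plain Python to make the precedence rules concrete. This is only an illustration of the proposal, not MXNet code: `resolve`, `admissible`, `Selection`, and the two lookup tables are hypothetical names, and it assumes that a locally set `cudnn_tune` takes precedence over MXNET_CUDNN_AUTOTUNE_DEFAULT and that a locally set `workspace` takes precedence over the suggested MXNET_CUDNN_WORKSPACE_LIMIT_DEFAULT.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

DEFAULT_WORKSPACE_MB = 1024  # the compiled-in 1GB default discussed above

# Hypothetical table: cudnn_tune value -> (selection routine, limit enforced?)
TUNE_MODES = {
    "off": ("cudnnGet", False),
    "off_limited_workspace": ("cudnnGet", True),  # proposed value, not in MXNet today
    "limited_workspace": ("cudnnFind", True),
    "fastest": ("cudnnFind", False),
}

# MXNET_CUDNN_AUTOTUNE_DEFAULT values, including the proposed value 3.
ENV_VALUE_TO_MODE = {
    0: "off",
    1: "limited_workspace",
    2: "fastest",
    3: "off_limited_workspace",
}


@dataclass(frozen=True)
class Selection:
    routine: str                       # "cudnnGet" or "cudnnFind"
    workspace_limit_mb: Optional[int]  # None means no workspace limit


def resolve(env_value: int = 1,
            cudnn_tune: Optional[str] = None,
            workspace: Optional[int] = None,
            env_workspace_default: Optional[int] = None) -> Selection:
    """Decide how the algo is picked and which workspace limit applies."""
    # Assumption: the per-layer setting wins over the environment default.
    mode = cudnn_tune if cudnn_tune is not None else ENV_VALUE_TO_MODE[env_value]
    routine, limited = TUNE_MODES[mode]
    if not limited:
        # 'off' and 'fastest' ignore any limit, even one set locally.
        return Selection(routine, None)
    if workspace is not None:
        limit = workspace
    elif env_workspace_default is not None:
        limit = env_workspace_default
    else:
        limit = DEFAULT_WORKSPACE_MB
    return Selection(routine, limit)


def admissible(algo_workspace_mb: Dict[str, int], sel: Selection) -> List[str]:
    """Return the algos whose workspace fits under the selection's limit."""
    if sel.workspace_limit_mb is None:
        return sorted(algo_workspace_mb)
    return sorted(name for name, mb in algo_workspace_mb.items()
                  if mb <= sel.workspace_limit_mb)
```

With placeholder workspace sizes (invented for illustration, not measurements) such as `{"fft": 4096, "winograd": 512, "implicit_gemm": 0}`, `admissible(..., resolve(env_value=3))` excludes `fft` without any cudnnFind() run, which is the gap the proposed value 3 is meant to close.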