stephenrawls commented on issue #14208: Add support for fast variable-length LSTM URL: https://github.com/apache/incubator-mxnet/pull/14208#issuecomment-465680153

Hi @TaoLv,

Basically my motivation is: we want to use stacked bidirectional LSTMs with variable sequence lengths. Currently this is slow in MXNet because, to do it correctly, you have to either (1) use `LSTMCell` and unroll, which doesn't take advantage of cuDNN on the GPU, or (2) use cuDNN one layer at a time and do a lot of reversing of your output before passing it into the backward-direction LSTM, so you can be sure that padding doesn't affect the result.

This all seemed a bit silly, since cuDNN directly and efficiently supports variable-length stacked bidirectional LSTMs.

Beyond my direct use case, for the community I guess it would be nice if:
- All RNN types are supported. (This happens already via cuDNN, even though I am focused on LSTM.)
- Both CPU and GPU are supported.

The API I currently have looks like this:
```
lstm = mx.gluon.rnn.LSTM(..normal params..., use_sequence_length=True)
lstm(input, sequence_length)
```
I am not tied to this API, however. For example, if we want to drop the `use_sequence_length` parameter from initialization, I'm happy to remove it. I only put it there in the first place to be consistent with the existing Sequence* family of operators.

As far as what changes need to happen on the symbol side, I am less sure; I primarily use Gluon. Happy for folks to offer suggestions.

Thanks,
Stephen
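For context on point (2): the padding problem arises because the backward direction reads the padded tail of each sequence first unless every sequence is reversed only over its valid length. Below is a minimal NumPy sketch of that per-sequence reversal (conceptually what MXNet's `SequenceReverse` op does with `use_sequence_length=True`); the function name and shapes here are just illustrative, not part of the proposed API.

```python
import numpy as np

def sequence_reverse(data, seq_len):
    """Reverse each sequence along the time axis over its valid length only.

    data:    (max_len, batch, feat) time-major array, zero-padded
    seq_len: per-batch valid lengths

    Reversing only the first seq_len[b] timesteps keeps padding at the
    end, so a backward-direction RNN never sees padding before real data.
    """
    out = data.copy()
    for b, length in enumerate(seq_len):
        out[:length, b] = data[:length, b][::-1]
    return out

# Two sequences of lengths 3 and 2, max_len 4, feature dim 1.
x = np.zeros((4, 2, 1))
x[:3, 0, 0] = [1, 2, 3]    # sequence 0: valid length 3
x[:2, 1, 0] = [10, 20]     # sequence 1: valid length 2
y = sequence_reverse(x, [3, 2])
```

Doing this reversal (and undoing it afterwards) around every layer is the bookkeeping that makes the current one-layer-at-a-time approach slow and error-prone, which is why handing sequence lengths straight to cuDNN is attractive.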