stephenrawls commented on issue #14208: Add support for fast variable-length 
LSTM
URL: https://github.com/apache/incubator-mxnet/pull/14208#issuecomment-465680153
 
 
   Hi @TaoLv ,
   
   Basically, my motivation is: we want to use stacked bidirectional LSTMs 
with variable sequence lengths. Currently this is slow in MXNet, because to do it 
correctly you must either (1) use `LSTMCell` and unroll, which doesn't take 
advantage of cuDNN on the GPU; or (2) use cuDNN one layer at a time and reverse 
your outputs before passing them into the backward-direction LSTM, so you can be 
sure that padding doesn't affect the result.
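   
   To illustrate why workaround (2) is fiddly, here is a minimal numpy sketch 
(not MXNet code; the function name and shapes are my own for illustration) of 
the per-sequence reversal it requires. A naive `np.flip` over the time axis 
would move a short sequence's padding to the front, which is exactly what 
corrupts the backward-direction LSTM:
   
   ```python
   import numpy as np
   
   def reverse_padded(data, seq_lens):
       """Reverse each batch element only up to its own true length.
   
       data: (max_seq_len, batch, features), the usual time-major RNN layout.
       seq_lens: true length of each batch element; padding stays at the end,
       so the backward-direction LSTM never sees padding before real steps.
       """
       out = data.copy()
       for i, n in enumerate(seq_lens):
           out[:n, i] = data[:n, i][::-1]
       return out
   
   # Two sequences of max length 4; the second has only 2 real timesteps.
   data = np.arange(8, dtype=np.float32).reshape(4, 2, 1)
   rev = reverse_padded(data, [4, 2])
   # np.flip(data, axis=0) would instead put the short sequence's padding first.
   ```
   
   Doing this (plus the reverse on the way back out) between every layer of a 
stacked network is the bookkeeping that native `sequence_length` support in 
cuDNN makes unnecessary.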
   
   This all seemed a bit silly, since cuDNN already supports variable-length 
stacked bidirectional LSTMs efficiently out of the box.
   
   Beyond my direct use case, I think it would be nice for the community if:
     - All RNN types were supported. (This happens already via cuDNN, even 
though I am focused on LSTM.)
     - Both CPU and GPU were supported.
   
   I guess the API I have is currently like this:
   
   ```python
   lstm = mx.gluon.rnn.LSTM(..normal params..., use_sequence_length=True)
   lstm(input, sequence_length)
   ```
   
   I am not tied to this API, however. For example, if we would rather not have 
a `use_sequence_length` parameter in the constructor, I'm happy to remove it; I 
only put it there in the first place to be consistent with the existing 
Sequence* family of operators.
   
   As for what changes need to happen on the symbol side, I am less sure, since 
I primarily use Gluon. Happy for folks to offer suggestions.
   
   Thanks,
   Stephen
