I am trying to work with a **bidirectional GRU**. Since I will be working in
batches, I pad sequences of unequal length with 0s. For that, I need the
`use_sequence_length` argument in the GRU definition. But when I set this
boolean (`use_sequence_length`) to `True`, the deferred-initialization
functionality fails for some reason. Below is code to replicate:
```
from mxnet import gluon
import mxnet as mx

ctx = [mx.gpu(0)]

class TestModel(gluon.nn.HybridBlock):
    def __init__(self, bidirectional=True):
        super(TestModel, self).__init__(prefix="TestModel_")
        with self.name_scope():
            self.embed = gluon.nn.Embedding(input_dim=50, output_dim=5)
            self.rnn = gluon.rnn.GRU(hidden_size=20,
                                     bidirectional=bidirectional,
                                     use_sequence_length=True)
            self.dense = gluon.nn.Dense(1)

    def hybrid_forward(self, F, x, x_len):
        # transpose to (max_sequence_length, batch_size, feature_dims)
        embed = self.embed(x).transpose((1, 0, 2))
        rnn_all = self.rnn(embed, sequence_length=x_len)
        out = F.SequenceLast(rnn_all, sequence_length=x_len,
                             use_sequence_length=True)
        out = self.dense(out)
        return out

example_codes = [[1, 2, 3, 4, 5], [1, 2, 3, 0, 0]]
example_len = [5, 3]
x_input = mx.nd.array(example_codes).as_in_context(ctx[0])
x_len_input = mx.nd.array(example_len).as_in_context(ctx[0])

mx.random.seed(0)
net = TestModel(bidirectional=True)
net.initialize(mx.init.Xavier(), ctx=ctx, force_reinit=True)
net(x_input, x_len_input)
```
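For context on the `F.SequenceLast` call above: with `use_sequence_length=True` it picks, for each batch element, the output at that sequence's true last time step, so padded steps are ignored. A rough NumPy equivalent of that behavior (illustrative only, not MXNet's actual implementation):

```python
import numpy as np

def sequence_last(data, seq_len):
    """Pick each sequence's last valid step from time-major data.

    data: array of shape (max_len, batch, feat)
    seq_len: true length of each sequence in the batch
    """
    return np.stack([data[l - 1, i] for i, l in enumerate(seq_len)])

# (max_len=3, batch=2, feat=1); batch 0 has length 3, batch 1 has length 2
data = np.arange(6).reshape(3, 2, 1)
print(sequence_last(data, [3, 2]))  # [[4], [3]]
```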
This gives a `DeferredInitializationError`. It works with either of the
following edits to the `self.rnn = ...` line:
```
# 1. Turn off use_sequence_length
self.rnn = gluon.rnn.GRU(hidden_size=20, bidirectional=bidirectional,
                         use_sequence_length=False)

# 2. Provide input_size
self.rnn = gluon.rnn.GRU(hidden_size=20, bidirectional=bidirectional,
                         input_size=5, use_sequence_length=True)
```
The above two options make the code work, but neither solves my case because
1) I need the `sequence_length` argument to make sure nothing from the padded
values is used, and 2) the input size can vary from batch to batch depending
on sequence length, and I don't want to fix it in advance.
Can anyone help in debugging this?
Additional info:
There are two exceptions I get when I run the above code chunk:
1.
```
DeferredInitializationError: Parameter 'TestModel_gru0_l0_i2h_weight' has
not been initialized yet because initialization was deferred. Actual
initialization happens during the first forward pass. Please pass one batch of
data through the network before accessing Parameters. You can also avoid
deferred initialization by specifying in_units, num_features, etc., for network
layers.
During handling of the above exception, another exception occurred:
```
2.
```
MXNetError: [07:02:38] src/core/symbolic.cc:91: Symbol.ComposeKeyword
argument name state_cell not found.
Candidate arguments:
[0]data
[1]parameters
[2]state
[3]sequence_length
```
---
[Visit
Topic](https://discuss.mxnet.apache.org/t/gluon-rnn-with-sequence-length-and-defer-initialization/6640/1)