No. Loss != inaccuracy. The loss is the value of the objective function (e.g. the softmax cross-entropy), not the fraction of misclassified samples. If you want to compute the accuracy, you need to create an evaluator=singa.Accuracy(), and call evaluator.Evaluate(o, t), where o is the output from the dense layer and t is the ground truth tensor. You can follow the example here: https://github.com/apache/incubator-singa/blob/master/python/singa/metric.py#L67
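To illustrate the difference, here is a minimal NumPy-only sketch. It does not use the SINGA API; the helper names and the toy logits and labels are made up for illustration. It computes a mean softmax cross-entropy and an accuracy on the same predictions, and the two numbers are unrelated in scale.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    # Mean cross-entropy over the batch; labels are integer class ids.
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def accuracy(logits, labels):
    # Fraction of samples whose highest-scoring class equals the label.
    return float((logits.argmax(axis=1) == labels).mean())


# Hypothetical toy batch: 4 samples, 3 classes (not data from this thread).
logits = np.array([[2.0, 0.1, 0.1],
                   [0.2, 1.5, 0.3],
                   [0.9, 1.0, 0.8],
                   [3.0, 0.0, 0.0]], dtype=np.float32)
labels = np.array([0, 1, 2, 0], dtype=np.int32)

loss = softmax_cross_entropy(logits, labels)
acc = accuracy(logits, labels)
print('loss = %.4f, accuracy = %.2f' % (loss, acc))
```

Here three of the four samples are classified correctly, so the accuracy is 0.75, while the loss is simply a positive number that gets smaller as the predicted probabilities of the correct classes grow.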
Good Luck!

On Sun, Oct 9, 2016 at 12:08 PM Arash Shafiei <arash.shaf...@gmail.com> wrote:
> Thanks for the hint.
>
> I was sending it to the device but the problem turned out to be that I did
> not cast labels to int32.
>
> Now it is working and I am getting:
>
> [................... ] 96.4% training loss = 0.003444
> Epoch 49, train loss is 0.003509
> Epoch 49, evaluation loss is 0.003534
>
> Does this mean that after 50 epochs the evaluation has only 3.5% inaccuracy?
>
> On Sun, Oct 9, 2016 at 11:29 AM, Wei Wang <wangwei...@gmail.com> wrote:
>
> Have you moved all tensors onto the same device? Including the tensor for
> the labels.
>
>
> On 9 Oct 2016, at 11:02 AM, Arash Shafiei <arash.shaf...@gmail.com> wrote:
>
> outputs = rnn.forward(model_pb2.kTrain, inputs)[0:-2]
> grads = []
> batch_loss = 0
> g_dense_w.set_value(0.0)
> g_dense_b.set_value(0.0)
> print 'outputs len', len(outputs)  # 128
> output = outputs[-1]
> act = dense.forward(model_pb2.kTrain, output)
> print 'output shape', output.shape  # (256, 28)
> print 'activation shape', act.shape  # (256, 6)
> print 'labels shape', labels.shape  # (256, 6)
> lvalue = lossfun.forward(model_pb2.kTrain, act, labels)
> batch_loss += lvalue.l1()  # [F d1009 t11:00:24 p23551:016
> /home/wuwf/work/incubator-singa/src/core/tensor/./tensor_math_cuda.h:344]
> Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0)
> CUBLAS_STATUS_MAPPING_ERROR
> Aborted (core dumped)
>
>
> On Sun, Oct 9, 2016 at 10:55 AM, Wei Wang <wangwei...@gmail.com> wrote:
>
> Could you please paste the relevant code leading to this error?
>
>
> On 9 Oct 2016, at 10:32 AM, Arash Shafiei <arash.shaf...@gmail.com> wrote:
>
> Thanks, it worked.
>
> So far, I managed to do rnn::forward(...) but now I am stuck somewhere
> else.
>
> rnn::forward(...) returns a tensor (denoted as lvalue). I have to obtain
> the L1 norm using lvalue.l1().
>
> But I get this error:
> [F d1009 t10:30:14 p23056:-56
> /home/wuwf/work/incubator-singa/src/core/tensor/./tensor_math_cuda.h:344]
> Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0)
> CUBLAS_STATUS_MAPPING_ERROR
> Aborted (core dumped)
>
> On Sat, Oct 8, 2016 at 9:43 PM, Wang Wei <wang...@comp.nus.edu.sg> wrote:
>
> Actually, the char-rnn example is from type (4), where each rnn unit would
> generate a prediction and has a ground truth label.
>
> For your model (type 2), you only need to use the y128 (of shape 256, 28)
> from the rnn::forward() as the input to the dense layer. All other yi
> should be ignored.
> Consequently, you would have an output (denoted as o) of shape (256, 6)
> from the dense layer, which is the prediction for the whole sequence (of
> length 128).
> By feeding the prediction o and the label into the loss layer, you can
> compute the loss value and compute the gradient for o (denoted as o').
> Backward propagating o' through the dense layer, you would get the
> gradient for y128, denoted as y'128.
>
> *The input of the rnn::backward() would be <y'1, y'2, ...y'128, hy', cy'>,
> where only y'128 is a valid tensor. y'1, y'2 ... should be tensors with
> value 0.*
>
> Best,
> Wei
>
>
> On Sat, Oct 8, 2016 at 9:33 PM Arash Shafiei <arash.shaf...@gmail.com>
> wrote:
>
> Thanks. It worked.
>
> I am now at the phase of evaluating the loss.
>
> singa.loss.SoftmaxCrossEntropy has a forward function where it takes
> prediction tensors and ground truth.
>
> My problem now is that the prediction is a sequence and my label is not a
> sequence.
>
> Your char-rnn example is an application of type (1) in the figure below,
> but activity recognition is an application of type (2).
>
>
> <rnn-app.png>
> Therefore for each sequence in a batch I have only 1 label. (although this
> label can be of one dimension from the set of {1,2,3,4,5,6} or of 6
> dimensions from the set of { [1,0,0,0,0,0], [0,1,0,0,0,0] , etc.
> })
>
> So now I need predictions and ground truth. The prediction for me is of
> shape
> (128, 256, 28)
> where 128 is the length of the sequence, 256 is the batch size and 28 is
> the hidden layer size.
>
> And my ground truth is of shape
> (256, 1) or (256, 6) -- depending on how you model it.
>
> But as I understood from the example of char-rnn my ground truth must be
> of shape:
> (128, 256)
>
> Would you have any insight about this?
> Thanks.
>
>
> On Sat, Oct 8, 2016 at 6:42 PM, Wang Wei <wang...@comp.nus.edu.sg> wrote:
>
> Currently, a numpy array of dtype=np.float32 or np.int can be converted
> into a singa tensor.
> Please convert the numpy array into np.float32 and then call
> tensor.from_numpy(t) (without dtype=np.float32).
>
> On Sat, Oct 8, 2016 at 6:36 PM Arash Shafiei <arash.shaf...@gmail.com>
> wrote:
>
> The values that I have are floating-point values in [-1, 1].
>
> While using tensor.from_numpy(...), I was getting this error:
>
> Not implemented yet for float64
>
> I understood from the tutorial that we could pass the data type:
>
> y = tensor.from_numpy(..., dtype=np.float32)
>
> But using dtype, I am getting another error:
>
> TypeError: from_numpy() got an unexpected keyword argument 'dtype'
>
>
> On Sat, Oct 8, 2016 at 3:45 PM, Wang Wei <wang...@comp.nus.edu.sg> wrote:
>
> Hi
>
> According to the API of the forward function:
> http://singa.apache.org/en/docs/layer.html#singa.layer.RNN.forward
> The input should be a vector of Tensors, <x1, x2, ... x128, hx, cx>, xi is
> of shape (1500, 9), hx and cx are optional whose shape should be (1500, 28).
> The output would be a vector of Tensors, <y1, y2, ..., y128, hy, cy>, yi
> is of shape (1500, 28), hy and cy are optional depending on the existence
> of hx and cx.
> If you want to put the dense layer on top of the last rnn unit (i.e. the
> 128-th), then you feed y128 to the dense layer.
>
> function convert just reshapes the raw data into a sequence of tensors
> <x1, x2, ..>.
>
> BTW, typically, people would use a smaller batchsize e.g. less than 256.
>
> May I forward our discussion to the incubator email list in case others
> have similar problems?
> Thanks.
>
> Best,
> Wei
>
> So here is what I have:
>
> input batch of dimension (1500, 128, 9)
> This means a batch of 1500 windows each having 128 vectors of 9 dimensions.
>
> input label of dimension (1500, 6)
> This means a label batch of 1500 vectors of 6 dimensions. This is to label
> if the person is sitting ([1,0,0,0,0,0]) or standing ([0,1,0,0,0,0]), etc.
>
> I am creating an lstm layer with hidden_size=28 and
> input_sample_shape=(9,) and num_stacks=1
>
> Then I create a dense layer with num_output=6 and input_sample_shape=(28,)
>
> Now I would like to feed the data to the 'forward' function of lstm and
> dense layer. But I could not make it work and I could not quite understand
> from the example what 'convert' and 'numpy2tensors' are supposed to do...
>
> I would appreciate your comments.
>
> On Sun, Sep 25, 2016 at 12:23 PM, Arash Shafiei <arash.shaf...@gmail.com>
> wrote:
>
> Yes, I was thinking of batch size to be 32.
>
> Thanks. I am getting more how it works and I am thinking how RNN would be
> helpful. Because we do not want to predict a sequence. We just have a
> sequence (in raw data) and a set of features (in processed data) and we
> want to know the classification.
>
> So I was thinking of using other approaches with SINGA. I understood that
> there is also MLP. We could use MLP from SINGA to see the result first.
>
> In this case input would be a set of 561 values with a label.
> Then the MLP, given a set of test data with 561 features would predict the
> label.
>
> Thanks for the advice.
>
>
> On Sun, Sep 25, 2016 at 12:03 PM, Wang Wei <wang...@comp.nus.edu.sg>
> wrote:
>
>
> On Sun, Sep 25, 2016 at 9:37 AM, Arash Shafiei <arash.shaf...@gmail.com>
> wrote:
>
> Hi Wang Wei,
>
> I am trying to understand the char-rnn example, but there is still
> something that I am missing and cannot figure it out by myself.
>
> The convert function creates two numpy arrays x and y. As I understood the
> array x is the data and array y are labels.
>
> I checked the dimensions of these arrays.
> x.shape is (32, 100, 101)
> y.shape is (32, 100)
>
> 32 is the batch size
> 100 is the sequence size
> 101 is the vocabulary size, i.e. there are 101 unique chars in the
> linux_input.txt. Each input from one sample and at one time step is a
> one-hot vector with all positions being 0 except the position of the
> character (set to 1).
>
>
> given a sequence of chars, a,b,c,d,e,f
> if the input (x) is a, b, c, d, e
> then the label is b, c, d, e, f
>
>
> In my understanding you are taking a batch of 100 characters and the next
> character must be the label. So according to my understanding
> x.shape must be (32, 100)
> y.shape must be (32, 1)
>
> I mean that you have a batch of 32 samples to train and each sample is a
> series of 100 characters. For each sample, there must be a label, which says
> what character must follow this series. And that character is only 1.
>
> Is there anything that I do not quite understand?
>
> I would need this information in order to modify your sample program for
> the activity recognition.
> So ultimately in my use case:
> x.shape probably is (32, 561)
> y.shape probably is (32, 1)
>
>
> For your case, if you use 561 features, then how about the sequence length?
> Is 32 the batchsize?
>
> 561 are floating point features which are in [-1, 1].
> 1 is the label which is in [1,2,3,4,5,6]
>
> I would appreciate your help.
> Thanks.
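
For reference, the data preparation steps discussed in this thread (casting to float32, splitting a batch into per-time-step inputs, casting labels to int32, and passing zero gradients for every rnn output except the last one) can be sketched in plain NumPy. This is only a sketch under assumptions: the random arrays stand in for the real sensor data, class ids are assumed to be zero-based, and all variable names are hypothetical. The conversion to SINGA tensors (tensor.from_numpy) and the move to the device are deliberately left out.

```python
import numpy as np

# Shapes taken from the thread: batch of 256 windows, 128 steps, 9 features.
batch_size, seq_len, feat_dim, hidden_size, num_classes = 256, 128, 9, 28, 6

# Hypothetical stand-in data; NumPy produces float64 and a platform int here.
raw = np.random.uniform(-1.0, 1.0, size=(batch_size, seq_len, feat_dim))
class_ids = np.random.randint(0, num_classes, size=batch_size)

# Cast before converting to tensors: float64 was rejected in the thread,
# and the labels had to be int32.
x = raw.astype(np.float32)
labels = class_ids.astype(np.int32)

# One array per time step, <x1, ..., x128>, each of shape (batch, features).
inputs = [np.ascontiguousarray(x[:, t, :]) for t in range(seq_len)]

# Gradients for the rnn outputs in the many-to-one case: zeros for
# y'1 ... y'127, and the real gradient only for y'128.
dy_last = np.random.randn(batch_size, hidden_size).astype(np.float32)  # stand-in for y'128
grads = [np.zeros((batch_size, hidden_size), dtype=np.float32)
         for _ in range(seq_len - 1)]
grads.append(dy_last)

print(len(inputs), inputs[0].shape, inputs[0].dtype, labels.dtype)
```

Each element of inputs and grads would then be converted into a tensor and placed on the same device as the model, as suggested earlier in the thread.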