XJDKC commented on pull request #730: URL: https://github.com/apache/singa/pull/730#issuecomment-644678297
This PR is almost done. The constructed graph is now correct and executes normally. One issue remains: after a few training iterations, the loss becomes NaN. This happens whether or not the graph is enabled. Any ideas about this problem? @dcslin @chrishkchris

```bash
root@ip-172-31-6-19:/home/ubuntu/Program/singa/examples/qabot git:(lstm-graph*) # python train.py -g
successfully loaded word2vec model and corpus
successfully generated train, eval, test data
epoch 0, time used 7 sec, top1 hits: 0.000000, loss: [6.2321043]
epoch 1, time used 7 sec, top1 hits: 0.000000, loss: [6.1910734]
epoch 2, time used 7 sec, top1 hits: 0.010000, loss: [6.1914635]
epoch 3, time used 7 sec, top1 hits: 1.000000, loss: [nan]
epoch 4, time used 7 sec, top1 hits: 1.000000, loss: [nan]
epoch 5, time used 7 sec, top1 hits: 1.000000, loss: [nan]
epoch 6, time used 7 sec, top1 hits: 1.000000, loss: [nan]
epoch 7, time used 7 sec, top1 hits: 1.000000, loss: [nan]
epoch 8, time used 7 sec, top1 hits: 1.000000, loss: [nan]
epoch 9, time used 7 sec, top1 hits: 1.000000, loss: [nan]
training top1 hits rate: 1.0
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
