[GitHub] [incubator-mxnet] perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional

2019-04-12 Thread GitBox
perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/13103#issuecomment-482569369
 
 
   @pengzhao-intel forgot to say thank you. Thank you! =D


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional

2019-04-12 Thread GitBox
perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/13103#issuecomment-482551962
 
 
   @pengzhao-intel 
   
   ```
   [DEBUG] 1 of 1: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=796240428 to reproduce.
   ok
   
   --
   Ran 1 test in 159.016s
   
   OK
   ```
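
The seed line in the log above is what makes a run repeatable: a test decorator picks a seed, logs it, and honours `MXNET_TEST_SEED` when it is set. A minimal sketch of that idea follows. It is not MXNet's actual `with_seed` implementation: the name `with_seed_sketch` is hypothetical, and it seeds only NumPy and Python's `random` (the real decorator also seeds MXNet, per the "np/mx/python" wording in the log).

```python
import functools
import logging
import os
import random

import numpy as np


def with_seed_sketch(seed=None):
    """Hypothetical, simplified seeding decorator (not MXNet's with_seed)."""
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            # Priority: environment variable, then explicit argument, then a fresh seed.
            env_seed = os.environ.get("MXNET_TEST_SEED")
            if env_seed is not None:
                this_seed = int(env_seed)
            elif seed is not None:
                this_seed = seed
            else:
                this_seed = random.SystemRandom().randint(0, 2**31 - 1)

            # Seed the generators this sketch knows about (NumPy and Python only).
            np.random.seed(this_seed)
            random.seed(this_seed)
            logging.getLogger(__name__).debug(
                "Setting test np/python random seeds, use MXNET_TEST_SEED=%d to reproduce.",
                this_seed,
            )
            return test_fn(*args, **kwargs)
        return wrapper
    return decorator


@with_seed_sketch()
def sample_values():
    # Stand-in for a test body that draws random input data.
    return np.random.rand(3)
```

With `MXNET_TEST_SEED` exported, every call to `sample_values()` draws the same numbers, which is why rerunning with the logged seed should replay the same random inputs.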
   
   I'll close my skip_test PR and post my fix test PR =)




[GitHub] [incubator-mxnet] perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional

2019-04-12 Thread GitBox
perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/13103#issuecomment-482495472
 
 
   @haojin2 no good:
   
   ```
   ==
   FAIL: test_gluon_rnn.test_layer_bidirectional
   --
   Traceback (most recent call last):
     File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
       self.test(*self.arg)
     File "/work/mxnet/tests/python/unittest/common.py", line 110, in test_new
       orig_test(*args, **kwargs)
     File "/work/mxnet/tests/python/unittest/common.py", line 177, in test_new
       orig_test(*args, **kwargs)
     File "/work/mxnet/tests/python/unittest/test_gluon_rnn.py", line 283, in test_layer_bidirectional
       assert_allclose(net(data).asnumpy(), ref_net(data).asnumpy(), rtol=2e-7)
     File "/usr/local/lib/python3.5/dist-packages/numpy/testing/_private/utils.py", line 1452, in assert_allclose
       verbose=verbose, header=header, equal_nan=equal_nan)
     File "/usr/local/lib/python3.5/dist-packages/numpy/testing/_private/utils.py", line 789, in assert_array_compare
       raise AssertionError(msg)
   AssertionError:
   Not equal to tolerance rtol=2e-07, atol=0
   
   (mismatch 0.06493506493507084%)
    x: array([0.424288, 0.560531, 0.600333, ..., 0.402131, 0.560952, 0.505039],
         dtype=float32)
    y: array([0.424288, 0.560531, 0.600333, ..., 0.402131, 0.560952, 0.505039],
         dtype=float32)
    >> begin captured logging << 
   tests.python.unittest.common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1305130208 to reproduce.
   - >> end captured logging << -
   
   --
   Ran 1 test in 0.030s
   ```
   
   What I did to the test code:
   
   ```
   # Added import statement
   from tests.python.unittest.common import with_seed
   
   # Added with_seed decorator to test function
   @with_seed()
   def test_layer_bidirectional():
       ...  # rest of the test body left as it was
   
       # Updated rtol in the assertion statement, as suggested
       assert_allclose(net(data).asnumpy(), ref_net(data).asnumpy(), rtol=2e-7)
   ```
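
One possible reading of the failure above, offered as an assumption rather than a diagnosis: `rtol=2e-7` is below twice the float32 machine epsilon (about 1.19e-7), so it tolerates a difference of roughly one unit in the last place (ULP) and rejects anything larger. The sketch below uses only NumPy; the helper names `step_ulps` and `within_rtol` are made up for illustration, and the value 0.5 is an arbitrary example, not data from the test.

```python
import numpy as np


def step_ulps(value, k):
    """Return float32 `value` moved up by k representable steps (ULPs)."""
    out = np.float32(value)
    for _ in range(k):
        out = np.nextafter(out, np.float32(np.inf))
    return out


def within_rtol(actual, desired, rtol):
    """True if numpy's assert_allclose accepts the pair at this rtol (atol=0)."""
    try:
        np.testing.assert_allclose(
            np.array([actual], dtype=np.float32),
            np.array([desired], dtype=np.float32),
            rtol=rtol,
        )
        return True
    except AssertionError:
        return False


eps32 = float(np.finfo(np.float32).eps)  # roughly 1.19e-07
base = np.float32(0.5)                   # arbitrary illustrative value

# A 1-ULP difference at 0.5 is about 6e-8 and stays under rtol * 0.5 = 1e-7.
one_ulp_ok = within_rtol(step_ulps(base, 1), base, rtol=2e-7)
# A 2-ULP difference at 0.5 is about 1.2e-7 and exceeds 1e-7.
two_ulp_ok = within_rtol(step_ulps(base, 2), base, rtol=2e-7)
```

Differences this small are also invisible at the six digits printed in the assertion message, which is consistent with `x` and `y` looking identical in the log even though a fraction of the elements mismatch.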
   
   To test the changes, I used a g3.8xlarge instance with NVIDIA driver 418 and nvidia-docker:
   
   ```
   # On host: start the static build container
   $ docker run -ti -v `pwd`:/work/mxnet mxnetcd/build.ubuntu_cpu_static /bin/bash
   
   # Within container
   $ source tools/staticbuild/build.sh cu92mkl pip
   $ exit
   
   # On host: start the GPU test container
   $ docker run -ti --runtime=nvidia -v `pwd`:/work/mxnet mxnetcd/build.ubuntu_gpu_cu92 /bin/bash
   
   # Within container
   $ export PYTHONPATH=./python/
   $ MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/unittest/test_gluon_rnn.py:test_layer_bidirectional
   ``` 
   
   




[GitHub] [incubator-mxnet] perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional

2019-04-11 Thread GitBox
perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/13103#issuecomment-482446377
 
 
   @haojin2 I'll give it a go and let you know how it goes. Thanks for the help!




[GitHub] [incubator-mxnet] perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional

2019-04-11 Thread GitBox
perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/13103#issuecomment-482445402
 
 
   @szha in the two cases I've linked to, the test ran against a binary compiled with your static-linking tools, and the variants used were cu80mkl and cu92mkl.
   
   @haojin2 I'm happy to bump them, but I wouldn't know what to bump them to =S I'm not familiar with this side of the code, so I don't know what reasonable tolerance levels would be.




[GitHub] [incubator-mxnet] perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional

2019-04-11 Thread GitBox
perdasilva commented on issue #13103: Flaky test test_gluon_rnn.test_layer_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/13103#issuecomment-482086093
 
 
   I'm currently working on some CD pipelines and I'm seeing this issue crop up:
   
   
http://jenkins.mxnet-ci-dev.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-static-binary-cu80mkl-release/detail/mxnet-static-binary-cu80mkl-release/4/pipeline
   
   
http://jenkins.mxnet-ci-dev.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-static-binary-cu92mkl-release/detail/mxnet-static-binary-cu92mkl-release/11/pipeline
   
   I'll create a quick PR to disable it.

