marcoabreu opened a new issue #11441: Failing KVStore test dist-kvstore tests GPU
URL: https://github.com/apache/incubator-mxnet/issues/11441
 
 
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11433/1/pipeline/1569
 
   This test throws a lot of errors:
   
   ```
   + ../../tools/launch.py -n 7 --launcher local python dist_sync_kvstore.py
   worker 0 is initialized
   worker 3 is initialized
   worker 4 is initialized
   worker 2 is initialized
   worker 5 is initialized
   worker 1 is initialized
   worker 6 is initialized
   worker 6 is done with non compression tests
   worker 1 is done with non compression tests
   Traceback (most recent call last):
     File "dist_sync_kvstore.py", line 384, in <module>
       kv = init_kv()
     File "dist_sync_kvstore.py", line 72, in init_kv
       kv.init(keys_shape, [mx.nd.ones(shape)] * len(keys_shape))
     File "../../python/mxnet/kvstore.py", line 154, in init
       check_call(_LIB.MXKVStoreInitEx(self.handle, mx_uint(len(ckeys)), ckeys, cvals))
     File "../../python/mxnet/base.py", line 210, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [23:38:17] src/kvstore/./kvstore_local.h:84: Check failed: str_key_dict_.find(str_key) == str_key_dict_.end() duplicate init of key 3

   Stack trace returned 10 entries:
   [bt] (0) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5b) [0x7f2efc0db08b]
   [bt] (1) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f2efc0dbbf8]
   [bt] (2) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::kvstore::KVStoreLocal::Init(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)+0x50d) [0x7f2efedb375d]
   [bt] (3) /work/mxnet/python/mxnet/../../lib/libmxnet.so(MXKVStoreInitEx+0x4be) [0x7f2efed2721e]
   [bt] (4) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f2f5d8e7e40]
   [bt] (5) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7f2f5d8e78ab]
   [bt] (6) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(_ctypes_callproc+0x48f) [0x7f2f5daf73df]
   [bt] (7) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(+0x11d82) [0x7f2f5dafbd82]
   [bt] (8) python(PyEval_EvalFrameEx+0x578f) [0x4c15bf]
   [bt] (9) python(PyEval_EvalCodeEx+0x306) [0x4b9ab6]
   ```
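
For readers unfamiliar with the failed check, here is a minimal pure-Python sketch of what `str_key_dict_.find(str_key) == str_key_dict_.end()` guards against: a local store refusing a second `init` of a string key it has already registered. This is only an illustrative model. `ToyLocalStore` and `DuplicateKeyError` are hypothetical names, not MXNet code, and the sketch makes no claim about why the key was initialized twice in this CI run.

```python
class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the MXNetError raised by the failed check."""


class ToyLocalStore:
    """Hypothetical model of a local key-value store's init bookkeeping."""

    def __init__(self):
        self._str_key_dict = {}  # string key -> internal integer id
        self._values = {}        # internal integer id -> stored value
        self._next_id = 0

    def init(self, keys, values):
        """Register each key once; a repeated key is an error."""
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
        for key, value in zip(keys, values):
            # Mirrors the shape of the check in the log: the key must not
            # already be present in the string-key dictionary.
            if key in self._str_key_dict:
                raise DuplicateKeyError("duplicate init of key " + key)
            self._str_key_dict[key] = self._next_id
            self._values[self._next_id] = value
            self._next_id += 1


store = ToyLocalStore()
store.init(["3", "5", "7"], [1.0, 1.0, 1.0])  # first init succeeds

try:
    store.init(["3"], [1.0])                  # second init of "3" is rejected
except DuplicateKeyError as err:
    message = str(err)
    print(message)
```

In this model the second `init` call fails because the key `"3"` is already registered, which is the same condition the log message reports.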

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
