[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-12-05 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-562375617
 
 
   > Hi @samskalicky and @rondogency, is this PR ready to merge once CI passes?
   
   Yes! We're so ready to merge :)
   
   Thanks @zachgk for rerunning the unix_cpu job!




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-18 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-543775483
 
 
   The centos-cpu pipeline timed out, so I had to push again.




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-18 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-543632838
 
 
   More flaky test failures:
   ```
   ======================================================================
   FAIL: test_operator_gpu.test_bulking_operator_gpu
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/usr/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
       self.test(*self.arg)
     File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 177, in test_new
       orig_test(*args, **kwargs)
     File "/work/mxnet/tests/python/gpu/test_operator_gpu.py", line 2398, in test_bulking_operator_gpu
       _test_bulking(_test_bulking_in_process)
     File "/work/mxnet/tests/python/gpu/test_gluon_gpu.py", line 553, in _test_bulking
       time_per_iteration):
     File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 334, in run_in_spawned_process
       assert p.exitcode == 0, "Non-zero exit code %d from %s()." % (p.exitcode, func.__name__)
   AssertionError: Non-zero exit code -6 from _test_bulking_in_process().
   -------------------- >> begin captured logging << --------------------
   common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=298254377 to reproduce.
   --------------------- >> end captured logging << ---------------------
   ```




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-18 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-543570416
 
 
   Thanks @wkcn, I ran into some flaky test failures and had to push another empty commit:
   ```
   ======================================================================
   FAIL: test_quantization_mkldnn.test_requantize_int32_to_int8
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
       self.test(*self.arg)
     File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc
       return func(*arg, **kw)
     File "/work/mxnet/tests/python/mkl/../unittest/common.py", line 177, in test_new
       orig_test(*args, **kwargs)
     File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 186, in test_requantize_int32_to_int8
       check_requantize_with_symbol((3, 4, 10, 10))
     File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 181, in check_requantize_with_symbol
       assert_almost_equal(qdata_int8.asnumpy(), qdata_int8_np)
     File "/work/mxnet/python/mxnet/test_utils.py", line 624, in assert_almost_equal
       raise AssertionError(msg)
   AssertionError:
   Items are not equal:
   Error 1562.50 exceeds tolerance rtol=1.00e-05, atol=1.00e-20 (mismatch 0.08%).
   Location of maximum error: (2, 0, 9, 8), a=63., b=64.
    ACTUAL: array([[[[ 106,   17,  -56, ...,   88,   -9,  -38],
            [ 107, -120,  -49, ...,  -78,   81,   93],
            [-100,  -90,  -17, ...,   84,   49, -118],...
    DESIRED: array([[[[ 106,   17,  -56, ...,   88,   -9,  -38],
            [ 107, -120,  -49, ...,  -78,   81,   93],
            [-100,  -90,  -17, ...,   84,   49, -118],...
   -------------------- >> begin captured stdout << ---------------------
   *** Maximum errors for vector of size 1200:  rtol=1e-05, atol=1e-20
     1: Error 1562.50  Location of error: (2, 0, 9, 8), a=63., b=64.

   ======================================================================
   FAIL: test_operator_gpu.test_fast_lars
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "C:\Python37\lib\site-packages\nose\case.py", line 198, in runTest
       self.test(*self.arg)
     File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\../unittest\common.py", line 177, in test_new
       orig_test(*args, **kwargs)
     File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\test_operator_gpu.py", line 328, in test_fast_lars
       check_fast_lars(w_dtype, g_dtype, shapes, ctx, tol1, tol2)
     File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\test_operator_gpu.py", line 310, in check_fast_lars
       assert_almost_equal(ref_new_lrs.asnumpy(), mx_new_lrs.asnumpy(), atol=tol2, rtol=tol2)
     File "C:\jenkins_slave\workspace\ut-python-gpu\windows_package\python\mxnet\test_utils.py", line 624, in assert_almost_equal
       raise AssertionError(msg)
   AssertionError:
   Items are not equal:
   Error 2.844314 exceeds tolerance rtol=1.00e-06, atol=1.00e-06 (mismatch 1.724138%).
   Location of maximum error: (33,), a=0.01896909, b=0.01897199
    ACTUAL: array([0.00013492, 0.00045911, 0.00021266, ..., 0.00068631, 0.0002449 ,
          0.00085557], dtype=float32)
    DESIRED: array([0.00013493, 0.00045911, 0.00021266, ..., 0.00068631, 0.0002449 ,
          0.00085556], dtype=float32)
   -------------------- >> begin captured stdout << ---------------------
   *** Maximum errors for vector of size 58:  rtol=1e-06, atol=1e-06
     1: Error 2.844314  Location of error: (33,), a=0.01896909, b=0.01897199
   ```




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-17 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-543515556
 
 
   @yzhliu any idea why this is failing?
   ```
   Traceback (most recent call last):
     File "/work/mxnet/contrib/tvmop/compile.py", line 20, in <module>
       import tvm
     File "/work/mxnet/3rdparty/tvm/python/tvm/__init__.py", line 23, in <module>
       from . import tensor
     File "/work/mxnet/3rdparty/tvm/python/tvm/tensor.py", line 20, in <module>
       from ._ffi.node import NodeBase, NodeGeneric, register_node, convert_to_node
     File "/work/mxnet/3rdparty/tvm/python/tvm/_ffi/node.py", line 24, in <module>
       from .node_generic import NodeGeneric, convert_to_node, const
     File "/work/mxnet/3rdparty/tvm/python/tvm/_ffi/node_generic.py", line 23, in <module>
       from .base import string_types
     File "/work/mxnet/3rdparty/tvm/python/tvm/_ffi/base.py", line 60, in <module>
       _LIB, _LIB_NAME = _load_lib()
     File "/work/mxnet/3rdparty/tvm/python/tvm/_ffi/base.py", line 52, in _load_lib
       lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_GLOBAL)
     File "/usr/lib/python3.5/ctypes/__init__.py", line 347, in __init__
       self._handle = _dlopen(self._name, mode)
   OSError: /work/build/3rdparty/tvm/libtvm.so: file too short
   ninja: build stopped: subcommand failed.
   ```
   
   @wkcn can you please restart the unix-cpu job and see if it was transient?




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-14 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-541983523
 
 
   > Just talked with @szha and he found a better way to make `MXTensor` not diverge from the de facto standard tensor format `DLTensor` -- we can have an `MXTensor` class which contains a `DLTensor` as its member, just like [what TVM did](https://github.com/dmlc/tvm/blob/6b0359b440135b19116ded681be9bee0d7d4c985/include/tvm/runtime/ndarray.h#L242-L294) for its own tensor format.
   
   Thanks @junrushao1994 & @szha, that's a good idea. I added the DLTensor to the list of features for the next custom op PR.
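   
   For anyone following along, here is a minimal sketch of that wrapping pattern; the class name, members, and accessor below are hypothetical illustrations under the DLPack header, not the actual MXNet API:
   
   ```
   #include <cstdint>
   #include <utility>
   #include <vector>
   #include <dlpack/dlpack.h>  // defines DLTensor, DLContext, DLDataType
   
   // Hypothetical sketch: the tensor class stores a DLTensor as a member, so
   // its memory-layout description never diverges from the DLPack standard.
   class MXTensorSketch {
    public:
     MXTensorSketch(void* data, std::vector<int64_t> shape,
                    DLDataType dtype, DLContext ctx)
         : shape_(std::move(shape)) {
       dl_tensor_.data = data;
       dl_tensor_.ctx = ctx;
       dl_tensor_.ndim = static_cast<int>(shape_.size());
       dl_tensor_.dtype = dtype;
       dl_tensor_.shape = shape_.data();
       dl_tensor_.strides = nullptr;  // nullptr means compact, row-major
       dl_tensor_.byte_offset = 0;
     }
   
     // Hand out the embedded DLTensor directly -- no conversion, no copy.
     const DLTensor& dltensor() const { return dl_tensor_; }
   
    private:
     DLTensor dl_tensor_;          // DLPack-standard view of the data
     std::vector<int64_t> shape_;  // owns the array dl_tensor_.shape points at
   };
   ```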




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-09 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-540242870
 
 
   Note: the two failing CI jobs are related to numpy:
   sanity:
   ```
   * Module mxnet.numpy_op_signature
   
   python/mxnet/numpy_op_signature.py:21:0: E0402: Attempted relative import beyond top-level package (relative-beyond-top-level)
   ```
   unix-gpu:
   ```
   Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1746065449 to reproduce.
   
   ERROR
   ```
   




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-07 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-539344942
 
 
   > Hey, thank you guys for the nice work! BTW, would you mind clearly stating why DLTensor was not adopted? I believe that would be useful for other community members for reference.
   
   @wkcn (who implemented DLTensor support in MXNet) and I had a long discussion about this. In fact, we did investigate supporting DLTensor:
   
   https://github.com/samskalicky/incubator-mxnet/blob/custom_op/example/custom_op/test.cc#L14
   
   One takeaway was that it would not be easy or convenient to modify the structure of DLPack: it is used across multiple deep learning frameworks, and its definition needs to stay consistent for all of them. So building MXNet custom operators on top of DLTensor would limit the future extensibility of MXNet and its custom operator support.
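   
   For reference, the `DLTensor` struct in `dlpack.h` looks approximately like this (paraphrased from the DLPack headers of that era; consult the actual header for the authoritative definition). Note there is no layout field to extend:
   
   ```
   typedef struct {
     void* data;            // pointer to the underlying memory
     DLContext ctx;         // device type and device id
     int ndim;              // number of dimensions
     DLDataType dtype;      // type code, bits, and lanes
     int64_t* shape;        // shape array of length ndim
     int64_t* strides;      // strides in elements; NULL means compact
     uint64_t byte_offset;  // offset in bytes into the data pointer
   } DLTensor;
   ```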
   
   One example of this would be adding a "layout" field to the tensor structure (e.g. NCHW). This is something I have heard requested, but it is [not currently something the DLPack community is willing to accept](https://github.com/dmlc/dlpack/pull/42). The MXTensor structure in this work is compatible with DLPack/DLTensor, so any user who wants to convert from MXTensor to DLTensor can do so by simply setting the fields of a DLTensor, without copying data and with very small overhead. A sketch of that conversion follows below.
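   
   As an illustration, here is roughly what such a conversion could look like; the `SimpleTensor` struct below is a hypothetical stand-in for MXTensor, not the actual definition from this PR:
   
   ```
   #include <cstdint>
   #include <vector>
   #include <dlpack/dlpack.h>
   
   // Hypothetical stand-in for MXTensor, holding a CPU float32 buffer.
   struct SimpleTensor {
     void* data;
     std::vector<int64_t> shape;
   };
   
   // Wrap an existing tensor as a DLTensor: only metadata fields are set
   // and the data pointer is shared, so no data is copied.
   DLTensor ToDLTensor(SimpleTensor& t) {
     DLTensor dl;
     dl.data = t.data;
     dl.ctx = {kDLCPU, 0};                // CPU, device 0
     dl.ndim = static_cast<int>(t.shape.size());
     dl.dtype = {kDLFloat, 32, 1};        // float32, 1 lane
     dl.shape = t.shape.data();
     dl.strides = nullptr;                // compact, row-major
     dl.byte_offset = 0;
     return dl;
   }
   ```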
   
   Just because we're not using DLTensor now does not mean we cannot support it directly in a future PR. If enough users want this feature, the work in this PR can easily be extended to pass DLTensors from MXNet to the custom operators in the external library.




[GitHub] [incubator-mxnet] samskalicky commented on issue #15921: dynamic custom operator support

2019-10-02 Thread GitBox
samskalicky commented on issue #15921: dynamic custom operator support
URL: https://github.com/apache/incubator-mxnet/pull/15921#issuecomment-537741566
 
 
   The failing centos-gpu job is hitting NumPy issue #16358, which is unrelated to this PR. We've synced with the authors and they've confirmed. Assume all CI jobs are passing at this point until they can address the issue.

