apeforest edited a comment on issue #17292: Can't run horovod with latest 
nightly wheel
URL: 
https://github.com/apache/incubator-mxnet/issues/17292#issuecomment-574848779
 
 
   Thanks @stephenrawls for the analysis. 
   Here are the causes of the problem:
   
   1) Horovod uses MX_API_BEGIN() and MX_API_END() from mxnet/c_api_error.h to 
catch and throw errors in horovod APIs: 
https://github.com/horovod/horovod/blob/master/horovod/mxnet/mpi_ops.cc#L224
   2) MX_API_BEGIN() is a macro that calls MXAPIHandleException 
https://github.com/apache/incubator-mxnet/blob/master/include/mxnet/c_api_error.h#L36
   3) Before #17128, MXAPIHandleException was an inline function, so its body 
was compiled directly into Horovod. Therefore, when #17128 introduced a new 
function call, NormalizeError(), inside MXAPIHandleException, it broke the 
Horovod integration because the NormalizeError symbol is not whitelisted by 
the MXNet distribution.
   4) #17298 removed NormalizeError() from MXAPIHandleException and made it 
non-inline: 
https://github.com/apache/incubator-mxnet/pull/17208/files#diff-875aa4c013dbd73b044531e439e8afddR67
 This time the error became an undefined symbol for MXAPIHandleException 
itself.
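
A minimal sketch of the macro pattern described in steps 1) to 4), to make the dependency concrete. All names here (`API_BEGIN`, `API_END`, `HandleException`, `g_last_error`) are hypothetical stand-ins, not the real MXNet declarations, and in this sketch the handler call sits in the closing macro's catch clause. The point is only that any code expanding the macros ends up calling the handler, so once the handler is no longer inline, its symbol must be resolvable from the library that defines it.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the library's "last error" storage.
static std::string g_last_error;

// Declared here, defined out of line below. In a real shared library a
// non-inline handler like this must be exported, otherwise client code
// that expands the macros fails with an undefined symbol.
int HandleException(const std::exception& e);

// Hypothetical macro pair: wrap an API body in try/catch and convert any
// exception into an integer error code via the handler.
#define API_BEGIN() try {
#define API_END()                      \
  }                                    \
  catch (const std::exception& _e) {   \
    return HandleException(_e);        \
  }                                    \
  return 0;

// Out-of-line definition: records the message and returns an error code.
int HandleException(const std::exception& e) {
  g_last_error = e.what();
  return -1;
}

// Client-side function (think: an op in a plugin) that fails.
int failing_op() {
  API_BEGIN()
  throw std::runtime_error("allreduce failed");
  API_END()
}

// Client-side function that succeeds.
int ok_op() {
  API_BEGIN()
  API_END()
}
```

With these definitions, `ok_op()` returns 0, while `failing_op()` returns -1 and leaves the exception message in `g_last_error`.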
   
   So to summarize, the problem is not that Horovod requires the 
`MXAPIHandleException` function to be inline. The root cause is that MXNet did 
not export the symbol `*MXAPIHandleException*` in its 
[whitelist](https://github.com/apache/incubator-mxnet/blob/a296dad87438624bc6388e4659db4cb039a7908a/make/config/libmxnet.sym#L15),
 but only the symbols that are used inside the `MXAPIHandleException` 
function. That was fine while `MXAPIHandleException` was inline, but it 
became a problem once it was not. A good practice is to whitelist the symbol 
`*MXAPIHandleException*` itself instead of its internals.
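
As a rough illustration of the whitelist check, here is a sketch that assumes the whitelist entries behave like shell-style globs and uses POSIX `fnmatch` to emulate the match. The pattern lists and the helper `is_exported` are hypothetical, chosen for illustration rather than copied from `libmxnet.sym`, and the symbol is the Itanium-ABI mangled name of an assumed global function `int MXAPIHandleException(const std::exception&)`.

```cpp
#include <fnmatch.h>

#include <string>
#include <vector>

// Returns true if any whitelist glob matches the given symbol name.
bool is_exported(const std::vector<std::string>& whitelist,
                 const std::string& symbol) {
  for (const auto& pattern : whitelist) {
    if (fnmatch(pattern.c_str(), symbol.c_str(), 0) == 0) {
      return true;
    }
  }
  return false;
}

// Mangled name of the assumed signature (hypothetical, for illustration).
const std::string kHandlerSymbol = "_Z20MXAPIHandleExceptionRKSt9exception";

// Illustrative whitelist that only covers a helper used inside the handler.
const std::vector<std::string> kInternalsOnly = {"MX*", "*MXAPISetLastError*"};

// Illustrative whitelist that also covers the handler itself.
const std::vector<std::string> kWithHandler = {"MX*", "*MXAPISetLastError*",
                                               "*MXAPIHandleException*"};
```

Under these assumptions, `is_exported(kInternalsOnly, kHandlerSymbol)` is false, because the mangled name starts with `_Z` and so does not match `MX*`, while `is_exported(kWithHandler, kHandlerSymbol)` is true.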
   
   I will create a PR to fix this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
