anirudh2290 commented on a change in pull request #14575: fix custom exception handling
URL: https://github.com/apache/incubator-mxnet/pull/14575#discussion_r272305858
##########
File path: src/operator/custom/custom-inl.h
##########
@@ -96,7 +96,14 @@ class CustomOperator {
           bool prev_recording = Imperative::Get()->set_is_recording(recording);
           bool prev_training = Imperative::Get()->set_is_training(training);
-          func();
+          try {
+            func();
+          } catch (dmlc::Error& e) {
+            exception_ =
+                std::make_shared<std::exception_ptr>(std::current_exception());
+            ctx.async_on_complete();
+            return;
+          }

Review comment:
I think we can solve both 1 and 2 this way: after func is called, do wait_to_read on all elements in arrs, then catch and save any exception. Remove lines 104 and 105. In PushSync, check whether the exception is set and rethrow it. Also catch it there, call async_on_complete, and return. Something like the following:

```
Engine::Get()->PushSync(
    [=](RunContext rctx) {
      try {
        if (exception_) {
          // exception_ is a std::shared_ptr<std::exception_ptr> in the diff
          // above, so it must be dereferenced before rethrowing.
          std::rethrow_exception(*exception_);
        }
      } catch (dmlc::Error& err) {
        // Report the error through the completion callback and stop here.
        ctx.async_on_complete(&err);
        return;
      }
    }
    // remaining PushSync arguments omitted
```

Thanks to the support added for Horovod in https://github.com/apache/incubator-mxnet/pull/13932, we may be able to call async_on_complete with the error.
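To make the save-then-rethrow idea concrete outside of MXNet, here is a minimal standalone sketch. It is only an illustration under stated assumptions: `DeferredTask`, `Run`, and `Complete` are hypothetical names, `std::runtime_error` stands in for `dmlc::Error`, and the `on_complete` callback taking an error pointer merely imitates an error-aware completion callback. It does not use the MXNet engine API.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the operator state; this is not MXNet's CustomOperator.
struct DeferredTask {
  // Same type as exception_ in the diff: a shared pointer to an exception_ptr.
  std::shared_ptr<std::exception_ptr> exception_;

  // Stage 1: run the user function and save any exception instead of
  // letting it propagate out of the worker.
  void Run(const std::function<void()>& func) {
    try {
      func();
    } catch (const std::runtime_error&) {
      exception_ =
          std::make_shared<std::exception_ptr>(std::current_exception());
    }
  }

  // Stage 2: rethrow the saved exception, catch it, and pass it to the
  // completion callback. The callback receives nullptr on success.
  void Complete(
      const std::function<void(const std::runtime_error*)>& on_complete) {
    assert(on_complete != nullptr);  // an empty callback would be a usage bug
    try {
      if (exception_ && *exception_) {
        std::rethrow_exception(*exception_);
      }
    } catch (const std::runtime_error& err) {
      on_complete(&err);
      return;
    }
    on_complete(nullptr);
  }
};
```

In this sketch the failure is recorded where the user function runs, but it is only reported from the later completion step, which is the ordering the comment proposes for PushSync.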