vexilligera commented on a change in pull request #16800: [Numpy] Add NumPy support for np.linalg.det and np.linalg.slogdet
URL: https://github.com/apache/incubator-mxnet/pull/16800#discussion_r347139368
 
 

 ##########
 File path: src/operator/tensor/la_op-inl.h
 ##########
 @@ -921,6 +927,9 @@ struct det_backward {
     using namespace mshadow;
     using namespace mshadow::expr;
     using namespace mxnet_op;
+    if (dA.shape_.Size() == 0U) {
+      return;
+    }
     // compute inverse(A) and stores it to LU
     linalg_batch_det_backward_helper(LU, pivot, det, dA, DType(0), ctx);
     const_cast<Tensor<xpu, 3, DType>&>(dA) = broadcast_to(reshape(det * ddet, \
 
 Review comment:
   @reminisce I believe the perpetrator of the flakiness on Windows is here; I was able to pinpoint the bug thanks to @haojin2. I think this may corrupt memory sometimes (a sketch of the guard pattern follows below).
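
   A minimal sketch of the guard pattern, assuming "slogdet_backward" mirrors "det_backward" above (the function name and reduced parameter list here are illustrative, not the actual signature):

```cpp
#include <mshadow/tensor.h>

// Hypothetical: the same empty-tensor guard applied in slogdet_backward,
// so broadcast_to/reshape never run on a zero-size gradient tensor.
template <typename xpu, typename DType>
void slogdet_backward_guard(const mshadow::Tensor<xpu, 3, DType>& dA) {
  if (dA.shape_.Size() == 0U) {
    return;  // nothing to compute for an empty gradient
  }
  // ... the inverse/broadcast_to gradient logic would follow here ...
}
```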
   
   P.S. It's not exactly line 926; it is another line of almost identical code in "slogdet_backward" in the same file. In fact, the "det" operator does not produce any error in my tests. I later decomposed that line into three separate operator calls, and it turned out that "broadcast_to" affects the flaky behavior. However, the direct source of the crash (usually an access violation on a null pointer or an ndim overflow) is actually located in "LaOpDetBackward", where the access to the "req" array goes out of bounds (illustrated below). I still tend to believe that the bug is at a higher level and that I must have done something silly; I'll take a closer look tomorrow.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

