sxjscience commented on a change in pull request #16979: [Bugfix] [Numpy] Add `kAddTo` and kNullOp to Transpose
URL: https://github.com/apache/incubator-mxnet/pull/16979#discussion_r354451534
 
 

 ##########
 File path: src/operator/tensor/pseudo2DTranspose_op-inl.cuh
 ##########
 @@ -39,22 +39,29 @@ namespace mxnet {
 namespace op {
 namespace cuda {
 
-
-template <typename DType, typename CType>
+/*!
+ * \brief The `transpose_pseudo2D` based on chosen vectorized types. It transpose an array of
+ *    shape (k, m, n) to (k, n, m)
+ * \param out Pointer to output memory.
+ * \param inp Pointer to input memory.
+ * \param m First of tensor dimensions.
+ * \param n Second of tensor dimensions.
+ * \param nIterY The number of iterations in the y-dim of the thread to cover all rows. (1-->m)
+ * \param nIterZ The number of iterations in the z-dim of the thread to cover all rows. (1-->m)
+ * \tparam DType Data type
+ * \tparam CType The type to load the data.
+ * \tparam TSR the vectorized ratio.
+ * \tparam is_addto Whether to perform out += transpose(data) or out = transpose(data)
+ */
+template <typename DType, typename CType, int TSR, bool is_addto>
 __global__ void transpose_pseudo2D(DType* out, DType* inp,
                                    const index_t m, const index_t n,
                                    const index_t nIterY, const index_t nIterZ) {
-  const index_t TSR = sizeof(CType)/sizeof(DType);  // TypeSizeRatio
 
 Review comment:
   I moved `TSR` into the template parameters to avoid the `CType tmp[0];` error. In the inner loop we use a `switch` over all possible dtype sizes, so some instantiations end up with `TSR = 0` (whenever `sizeof(CType) < sizeof(DType)`). It would probably be better to use `max(sizeof(CType)/sizeof(DType), 1)` instead. Let me try that.
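
   A rough host-side sketch of the clamping idea, not the code in this PR. `type_size_ratio` and `LoadBuffer` are hypothetical names used only for illustration; `LoadBuffer` stands in for the kernel's local `CType tmp[TSR];` array. The ratio is computed at compile time and clamped to 1, so instantiations where `CType` is narrower than `DType` no longer declare a zero-length array:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of MXNet): ratio of the vectorized load type
// CType to the element type DType, clamped so that it is never zero.
template <typename DType, typename CType>
constexpr int type_size_ratio() {
  return sizeof(CType) / sizeof(DType) > 0
             ? static_cast<int>(sizeof(CType) / sizeof(DType))
             : 1;  // integer division gave 0: CType is narrower than DType
}

// Hypothetical stand-in for the kernel's local buffer. TSR sizes the array,
// so TSR == 0 would be the `CType tmp[0];` case mentioned above.
template <typename DType, typename CType,
          int TSR = type_size_ratio<DType, CType>()>
struct LoadBuffer {
  static_assert(TSR >= 1, "TSR must be at least 1");
  CType tmp[TSR];
};

// A 1-byte element loaded through a 4-byte type gives a ratio of 4.
static_assert(type_size_ratio<std::uint8_t, std::uint32_t>() == 4, "");
// An 8-byte element with a 4-byte load type would give 0; it is clamped to 1.
static_assert(type_size_ratio<std::uint64_t, std::uint32_t>() == 1, "");
```

   The clamped instantiations still need to be skipped at runtime by the `switch`; the clamp only keeps them compilable.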

