On 08/04/2015 08:21 AM, Sumit Kumar wrote:
> Hi Karl
> Based on all your suggestions,
> I came up with this implementation:
>
> // Copy Sparse Eigen matrices to Dense viennacl matrices
> typedef Eigen::Matrix<ScalarType, Eigen::Dynamic, Eigen::Dynamic,
>                       Eigen::RowMajor> RMMatrix;
> viennacl::matrix<ScalarType, viennacl::row_major>
>     vcl_A(source->rows(), source->cols());
> viennacl::copy(RMMatrix(*source), vcl_A);
> viennacl::matrix<ScalarType, viennacl::row_major>
>     vcl_B(target->rows(), target->cols());
> viennacl::copy(RMMatrix(*target), vcl_B);
> viennacl::matrix<ScalarType, viennacl::row_major>
>     vcl_C(result->rows(), result->cols());
> // Implement the matrix multiplication on the GPU.
> vcl_C = viennacl::linalg::prod(vcl_A, vcl_B);
> // Copy the matrix back to the host matrix
> RMMatrix temp = RMMatrix(*result);
> viennacl::copy(vcl_C, temp);
> (*result) = temp.sparseView();

You should really double-check your uses of RMMatrix. Each of the calls to copy() creates a temporary Eigen matrix from the buffer, resulting in another copy. fast_copy() is *much* more appropriate here. async_copy() is not needed here, because you don't have other computations for overlapping host->device->host transfers.

The sparse-to-dense conversion depends on your use case. If you have more than ~10 percent nonzeros in your matrix, a dense matrix-matrix product may pay off. It depends a lot on the sparse matrix pattern of the result matrix, which can be fairly hard to predict.

Best regards,
Karli
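
To make the ~10 percent rule of thumb concrete, here is a minimal sketch of a density check that could run before choosing between the sparse and the dense path. The helper names nonzero_fraction() and dense_product_may_pay_off() are hypothetical and not part of ViennaCL or Eigen, and the 0.10 default simply encodes the rough figure mentioned above. It is not a tuned threshold, and it says nothing about the sparsity pattern of the result matrix.

```cpp
#include <cstddef>

// Hypothetical helper (not a ViennaCL or Eigen function):
// fraction of entries in a rows x cols matrix that are nonzero.
double nonzero_fraction(std::size_t rows, std::size_t cols, std::size_t nnz)
{
    if (rows == 0 || cols == 0)
        return 0.0;  // an empty matrix has no entries to count
    return static_cast<double>(nnz)
         / (static_cast<double>(rows) * static_cast<double>(cols));
}

// Hypothetical helper: true if the matrix is filled beyond the threshold,
// i.e. a dense matrix-matrix product *may* be worth trying.
// The default of 0.10 is only the rough figure from the text above.
bool dense_product_may_pay_off(std::size_t rows, std::size_t cols,
                               std::size_t nnz, double threshold = 0.10)
{
    return nonzero_fraction(rows, cols, nnz) > threshold;
}
```

With an Eigen sparse matrix, the three arguments would presumably come from rows(), cols() and nonZeros() of that matrix. Treat the result as a hint only and benchmark both paths on representative data.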
