chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521247206

> Just now I did a test: when I replaced the merged tensor.cc file with the latest master branch tensor.cc, the loss could be reduced again. So I changed the tensor.cc of dist_new to the latest master branch one. This omits the changes commit f54a526 made to tensor.cc.
>
> I guess this may be because commit f54a526 in dist_new redesigned the softmax function to call cuDNN, but this is not compatible with the latest master branch.

OK, I checked again in the AWS image: for the previous two months, the code we used for the distributed module had the changes from commit f54a526 to the softmax function commented out, so softmax is the same as in the master branch. This is the softmax code we used in the AWS image of the distributed module:

```cpp
void SoftMax(const Tensor &in, Tensor *out) {
  CHECK_LE(in.nDim(), 2u);
  out->CopyData(in);
  size_t nrow = 1, ncol = in.Size(), size = ncol;
  if (in.nDim() == 2u) {
    nrow = in.shape(0);
    ncol = size / nrow;
    out->Reshape(Shape{nrow, ncol});
  }
  Tensor tmp = RowMax(*out);
  SubColumn(tmp, out);
  Exp(*out, out);
  SumColumns(*out, &tmp);
  DivColumn(tmp, out);
  out->Reshape(in.shape());
}

// void SoftMax(const Tensor &in, Tensor *out) {
//   CHECK_LE(in.nDim(), 2u);
//   TYPE_LANG_SWITCH(in.data_type(), DType, in.device()->lang(), Lang, {
//     out->device()->Exec([in, out](Context * ctx) {
//       SoftMax<DType, Lang>(in, out, ctx);
//     }, {in.block(), out->block()}, {out->block()});
//   });
// }
```