Re: DGL crashes in the recent master branch

2019-05-21 Thread Zheng, Da
sponsibility to it to maintain compatibility rather than the other way around? On Tue, May 21, 2019 at 3:39 PM Zheng, Da wrote: > Hello all, > > I recently found that DGL doesn’t run with the recent MXNet. DGL crashes with > memory errors. > Yesterd

DGL crashes in the recent master branch

2019-05-21 Thread Zheng, Da
Hello all, I recently found that DGL doesn’t run with the recent MXNet. DGL crashes with memory errors. Yesterday we identified a bug in DLPack and Junru has implemented a fix: https://github.com/apache/incubator-mxnet/pull/15016 However, there are some other bugs that cause DGL to crash with

Re: [Announce] Upcoming Apache MXNet (incubating) 1.4.0 release

2018-11-29 Thread Zheng, Da
Hello Steffen, Can this bug be fixed in the 1.4.0 release? It's a significant performance regression in sparse matrix multiplication. https://github.com/apache/incubator-mxnet/issues/13449 Thanks, Da On 11/26/18, 6:42 AM, "Steffen Rochel" wrote: Dear MXNet community, I will be the r

Significant performance regression in SpMV

2018-11-27 Thread Zheng, Da
Hello all, I noticed a significant performance regression in SpMV after this PR (https://github.com/apache/incubator-mxnet/pull/12380). It seems the problem occurs when running on multiple GPUs (e.g., 8 GPUs). When training on multiple GPUs, SpMV on CPU only uses two or three t
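For readers unfamiliar with the operation, SpMV multiplies a sparse matrix by a dense vector. The block below is a minimal plain-Python sketch over the CSR (compressed sparse row) layout. It is not MXNet's implementation; the function name `csr_spmv` and the sample matrix are made up for illustration.

```python
def csr_spmv(data, indices, indptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    data    -- non-zero values, row by row
    indices -- column index of each value in `data`
    indptr  -- indptr[r]:indptr[r + 1] is the slice of `data` for row r
    """
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        # Accumulate the dot product of this sparse row with x.
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


# Hypothetical 3x3 example:
# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
indices = [0, 2, 2, 0, 1]
indptr = [0, 2, 3, 5]
result = csr_spmv(data, indices, indptr, [1.0, 1.0, 1.0])
```

Each output row is computed independently, so the outer loop is the part a CPU implementation would normally split across threads. That is why the number of threads available to SpMV matters for throughput.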

Re: There is a bug in shape inference of the where operator

2018-08-24 Thread Zheng, Da
> Hi Da, > > I am currently running the unit test and will check in the fix once it's > complete. > > Thanks, > > Lin > > On Thu, Aug 23, 2018 at 3:59 PM Zheng, Da > wrote: > > > Hello all, > >

There is a bug in shape inference of the where operator

2018-08-23 Thread Zheng, Da
Hello all, There is a little bug in shape inference of the where operator. Currently, the where operator doesn’t work if the first input is a 1D array. Yuan Lin will provide a patch to fix this bug. Best, Da
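As an illustration of the case being discussed: to my understanding of the operator's documented behaviour, a 1D condition whose length equals the first dimension of the other inputs selects whole rows. The plain-Python sketch below mimics that intended behaviour only; `where_rows` is a hypothetical helper, not an MXNet API, and it does not reproduce the shape-inference bug itself.

```python
def where_rows(condition, x, y):
    """Row-wise select: take row i from x if condition[i] is truthy, else from y."""
    if not (len(condition) == len(x) == len(y)):
        raise ValueError("condition, x and y must agree on the first dimension")
    return [list(xr) if c else list(yr) for c, xr, yr in zip(condition, x, y)]


# Hypothetical inputs: a 1D condition of length 3 and two 3x2 arrays.
x = [[1, 2], [3, 4], [5, 6]]
y = [[-1, -2], [-3, -4], [-5, -6]]
selected = where_rows([1, 0, 1], x, y)
```

Here `selected` keeps rows 0 and 2 from `x` and row 1 from `y`, so the output has the same shape as `x` even though the condition is 1D.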

A proposal of supporting dynamic shape and shape symbol

2018-07-19 Thread Zheng, Da
Hello all, As you know, MXNet performs static shape inference to optimize performance. However, there are cases where the shapes of an operator's output arrays can’t be inferred statically. In addition, MXNet doesn’t support shape symbols. For example, the input shape of mx.sym.ones has to be a Python

Re: Reverting pull request

2018-06-15 Thread Zheng, Da
+1 The PR was merged a while ago, so it has been tested by many people. Other people's work now depends on this PR. Reverting it at this point could cause a lot of problems for many other people. Best, Da On 6/15/18, 2:18 PM, "workc...@gmail.com on behalf of Tianqi Chen" wrote: +1 W

Re: Regarding 1.2.1 patch release

2018-06-07 Thread Zheng, Da
o > have > > below PRs. But it depends on the progress of review and merging. > > 1. bug fix: https://github.com/apache/incubator-mxnet/pull/11095 (under > > review) > > 2. perf improvement: https://github.com/apache/incubator-mxnet/pull/1104

Re: A proposal for unified integration with external acceleration libraries

2018-06-04 Thread Zheng, Da
the future. Any comments and suggestions will be highly appreciated. Thanks. -tao -Original Message- From: Zheng, Da [mailto:dzz...@amazon.com] Sent: Saturday, June 2, 2018 4:38 AM To: dev@mxnet.incubator.apache.org Subject: A proposal for unified integ

A proposal for unified integration with external acceleration libraries

2018-06-01 Thread Zheng, Da
Hello all, We would like to propose a new mechanism that unifies the integration with most external acceleration libraries, including TVM, MKLDNN, TensorRT and more. The main idea is to integrate with the external libraries at the level of subgraphs instead of operators. There are a few

Re: Compilation error in old Mac

2018-05-10 Thread Zheng, Da
https://github.com/apache/incubator-mxnet/issues/10898 On 5/10/18, 10:26 PM, "Zheng, Da" wrote: I didn't create an issue for this. I think Sheng can provide more details. Best, Da On 5/10/18, 10:10 PM, "Anirudh" wrote: Hi Da,

Re: Compilation error in old Mac

2018-05-10 Thread Zheng, Da
hen patch release needs to be considered. Anirudh On Thu, May 10, 2018 at 9:34 PM, Zheng, Da wrote: > Hello, > > It has been reported that MXNet v1.2 has compilation errors on old Macs. > The fix is on the way. > We have been dis

Proposal for optimizing Gluon dynamic models for seamless deployment

2018-05-10 Thread Zheng, Da
Hello, Scientists like to develop models with Gluon or PyTorch and hand the models over to engineers for deployment. It takes a lot of effort to deploy these models because engineers usually need to reimplement them (this is especially true for NLP and speech models). Recently, PyTorch announce

Compilation error in old Mac

2018-05-10 Thread Zheng, Da
Hello, It has been reported that MXNet v1.2 has compilation errors on old Macs. The fix is on the way. Since the problem only exists on uncommon hardware, we have been discussing whether we should avoid blocking the release of v1.2 and instead have a patch release in the near future. Best, Da

Re: segmentation fault in master using mkdlnn

2018-05-04 Thread Zheng, Da
er build. ci/docker/runtime_functions.sh clean_repo Pedro. On Thu, May 3, 2018 at 7:17 PM, Zheng, Da wrote: > Hello Pedro, > > I tried your instructions. It seems I can't run the docker in EC2 > instances. > Where did you r

Re: segmentation fault in master using mkdlnn

2018-05-03 Thread Zheng, Da
pu --into-container --print-docker-run A core should be there. you might need to install gdb as root by executing the previous command without uid so you can use apt-get. Good luck. On Thu, May 3, 2018 at 4:51 PM

Re: segmentation fault in master using mkdlnn

2018-05-03 Thread Zheng, Da
data_heap_ = 0x0}, } (gdb) On Thu, May 3, 2018 at 4:36 PM, Zheng, Da wrote: > There are a few problems with valgrind, which make it not an ideal tool > for mxnet with python interface. > > First, valgrind generates a huge number of irrelevant m

Re: segmentation fault in master using mkdlnn

2018-05-03 Thread Zheng, Da
> > > >> It might also be possible that this isn't an MKLDNN bug. > > > >> I just saw a similar memory error without MKLDNN build. > > > >> > > > >> > > > http://jenkins.mxnet-ci.amazon-ml.c

Re: segmentation fault in master using mkdlnn

2018-05-02 Thread Zheng, Da
There might be a race condition that causes the memory error. It might be caused by this PR: https://github.com/apache/incubator-mxnet/pull/10706/files This PR removes MKLDNN memory from NDArray. However, I don't know why this causes a memory error. If someone is using the memory, it should still ho

Re: [VOTE] Release Apache MXNet(incubating) version 1.2.0.RC1

2018-05-02 Thread Zheng, Da
I have to agree that the DMLC subrepos do make the development much more difficult sometimes. On 5/2/18, 3:57 AM, "Pedro Larroy" wrote: For me the situation with DMLC is problematic. I often find myself having to fix things in the DMLC subrepos. * These changes are imposs

Re: [VOTE] Release Apache MXNet (incubating) version 1.2.0.RC0

2018-04-23 Thread Zheng, Da
, the padding should be converted correctly. For the time being, I'll just fix MKLDNN so it doesn't check the tuple length of padding. Best, Da On 4/23/18, 2:58 PM, "Zheng, Da" wrote: I can reproduce the bug now. I'm working on a fix for the bug. Currently,

Re: [VOTE] Release Apache MXNet (incubating) version 1.2.0.RC0

2018-04-23 Thread Zheng, Da
59 PM, "Zheng, Da" wrote: It seems I have problems compiling Scala when running "make docs". Please see the error below. Are there any instructions for compiling this Scala code? I guess I might be missing some packages. I tried installing libslf4j-java and didn'

Re: [VOTE] Release Apache MXNet (incubating) version 1.2.0.RC0

2018-04-21 Thread Zheng, Da
@ThomasDelteil could you show me how to reproduce the problem? I'll take a look at it as well. Best, Da Sent from my iPhone On Apr 21, 2018, at 1:12 PM, Anirudh Acharya <anirudhk...@gmail.com> wrote: @ThomasDelteil that might be due to the fact that in the example, the context is being se