This vote has been closed. We will make another tag and start the vote again.
-sz
> On Jun 18, 2019, at 5:24 PM, Lin Yuan wrote:
>
> With the PR https://github.com/apache/incubator-mxnet/pull/15213 I could
> verify that building Horovod is successful with MXNet built from source. So
> I will remove
With the PR https://github.com/apache/incubator-mxnet/pull/15213 I could
verify that building Horovod is successful with MXNet built from source. So
I will remove my previous -1 vote.
Best,
Lin
On Tue, Jun 18, 2019 at 2:10 PM Junru Shao wrote:
> Dear community,
>
> I am happy to share some res
Dear community,
I am happy to share some results with regard to commit 83d2c2d0e (PR
#14192, link: https://github.com/apache/incubator-mxnet/pull/14192) that
Pedro mentioned as causing a regression.
First, using the exact model that Pedro provided, we did rigorous profiling
and found out that the
I will reach out to you in private; the model is not public. We should be able to
see this problem in a public model using LSTM I think.
On Thu, Jun 13, 2019 at 11:15 AM Junru Shao wrote:
>
> Hi Pedro,
>
> Thanks for bringing this up!
>
> Could you provide your model so that we can dig into this?
>
> Thanks,
Hi Pedro,
Thanks for bringing this up!
Could you provide your model so that we can dig into this?
Thanks,
Junru
On Thu, Jun 13, 2019 at 10:33 Pedro Larroy
wrote:
> I have isolated some of the commits that are causing performance
> regressions in WaveNet-like models:
>
> Title: 83d2c2d0e:[MXNET
I have isolated some of the commits that are causing performance
regressions in WaveNet-like models:
Title: 83d2c2d0e:[MXNET-1324] Add NaiveRunGraph to imperative utils (#14192)
Causes a regression making hybridize with static slower using GPU inference.
[0f63659be5070af218095a6a460427d2a1b67aba
Hi @dev,
I am canceling the vote as the issue Lin discovered requires a fix [1] and
the solution is not ready yet.
It's a general problem when building from source with MXNet, not only
impacting horovod use cases. Any help is appreciated.
Other issues we are tracking:
1. Regression on hybridize wi
Tested with CPU: 2.6x slower, comparing master vs 1.4.1.
Looks like a general regression.
On Tue, Jun 11, 2019 at 2:31 PM Lai Wei wrote:
>
> Hi guys,
>
> Thanks for the updates. Currently, we are able to confirm Lin's issue with
> Horovod, and there is a fix pending. [1]
> Will update later tod
-1
There's an autogenerated file that doesn't get cleaned up in the
scala-package folder when you run make clean. This causes the scaladoc
step to fail. I'm adding a workaround note to the error message
and that'll go into master, but if anyone wants to specifically run
the scaladocs for 1.5
Hi guys,
Thanks for the updates. Currently, we are able to confirm Lin's issue with
Horovod, and there is a fix pending. [1]
Will update later today to see if we need to cancel this vote for the fix.
As for the hybridize with static alloc performance regression, IMO it does
not need to be a block
On 2019/06/11 18:53:56, Pedro Larroy wrote:
> The stack trace doesn't seem to come from MXNet, do you have more info?
>
> On Tue, Jun 11, 2019 at 11:46 AM Zhi Zhang wrote:
> >
> >
> >
> > On 2019/06/11 17:36:09, Pedro Larroy wrote:
> > > A bit more background into this:
> > >
> > > While tu
Correction, I wanted to say:
1.5 is 33% faster than 1.4.1 when using hybridize without static_alloc
and static_shape.
We are claiming that static_alloc should improve speed and in this
case it makes it worse. Is that a blocker for the release?
Pedro.
On Tue, Jun 11, 2019 at 10:36 AM Pedro Larro
The stack trace doesn't seem to come from MXNet, do you have more info?
On Tue, Jun 11, 2019 at 11:46 AM Zhi Zhang wrote:
>
>
>
> On 2019/06/11 17:36:09, Pedro Larroy wrote:
> > A bit more background into this:
> >
> > While tuning a model using LSTM and convolutions we find that using
> > hybri
On 2019/06/11 17:36:09, Pedro Larroy wrote:
> A bit more background into this:
>
> While tuning a model using LSTM and convolutions we find that using
> hybridize with static_alloc and static_shape is 15% slower in the
> latest revision vs in version 1.4.1 in which using hybridize with
> stat
-1. Built from source, importing mxnet in Python causes a segfault.
Backtrace:
Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x7fff3e8a9f20 in ?? ()
(gdb) bt
#0 0x7fff3e8a9f20 in ?? ()
#1 0x7fffebbf440c in ReadConfigFile(Configuration&,
std::__cxx11::basic_string,
std::
A bit more background into this:
While tuning a model using LSTM and convolutions we find that using
hybridize with static_alloc and static_shape is 15% slower in the
latest revision vs in version 1.4.1 in which using hybridize with
static_alloc and static_shape is 10% faster than without.
Overwa
-1
We found a performance regression vs 1.4 related to CachedOp which
affects Hybrid forward, which we are looking into.
Pedro.
On Mon, Jun 10, 2019 at 4:33 PM Lin Yuan wrote:
>
> -1 (Tentatively until resolved)
>
> I tried to build MXNet 1.5.0 from source and pip install horovod but got
> the
-1 (Tentatively until resolved)
I tried to build MXNet 1.5.0 from source and pip install horovod but got
the following error:
Reproduce:
1) cp make/config.mk .
2) turn on USE_CUDA, USE_CUDNN, USE_NCCL
3) make -j
MXNet can build successfully.
4) pip install horovod
/home/ubuntu/src/incubator-m
+1
On Sun, Jun 9, 2019 at 4:12 AM Lai Wei wrote:
> Dear MXNet community,
>
> This is the 3-day vote to release Apache MXNet (incubating) version 1.5.0.
> Voting on dev@ will start June 8, 23:59:59(PST) and close on June 11,
> 23:59:59.
>
> 1) Link to release notes:
> https://cwiki.apache.org/confluence/displa
Dear MXNet community,
This is the 3-day vote to release Apache MXNet (incubating) version 1.5.0.
Voting on dev@ will start June 8, 23:59:59(PST) and close on June 11,
23:59:59.
1) Link to release notes:
https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Notes
2) Link to release can