Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-12 Thread Anton Chernov
Unfortunately, merging the following PR

Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13121

broke the `dist-kvstore tests CPU` test stage:

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/82/pipeline

A revert PR has been opened:

Revert "Set correct update on kvstore flag in dist_device_sync mode (v1.3.x) (#13121)"
https://github.com/apache/incubator-mxnet/pull/13228

The test already passed, so the PR is good to go. The initial fix will not
be included in the release and will be noted in the known issues section.

Added a version bump to the release branch:

news, readme update for v1.3.1 release
https://github.com/apache/incubator-mxnet/pull/13225

Since patch releases are now done on branches, the master branch needs a
version update. The following PR introduces the change:

Bumped minor version to 1.4.0 as 1.3.1 will be continued in the v1.3x branch
https://github.com/apache/incubator-mxnet/pull/13231
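
To illustrate the versioning step: under semantic versioning the release
branch takes the patch bump while master moves on to the next minor version.
A minimal sketch in Python; the `bump` helper is hypothetical and is not part
of the MXNet release tooling.

```python
def bump(version, part):
    """Return `version` ("MAJOR.MINOR.PATCH") with the given part incremented."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


# Hypothetical usage: the v1.3.x branch gets the patch release,
# master is bumped to the next minor version.
release_branch_version = bump("1.3.0", "patch")
master_version = bump("1.3.0", "minor")
print(release_branch_version, master_version)
```

Lower-order parts are reset to zero on a minor or major bump, which is why
master becomes 1.4.0 rather than 1.4.1.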


The confluence page 'Apache MXNet (incubating) 1.3.1 Release Notes' has
been updated:
https://cwiki.apache.org/confluence/x/eZGzBQ


Best
Anton

On Sat, Nov 10, 2018 at 11:59, Anton Chernov wrote:

> Due to various problems we had to postpone the tagging and vote for the
> release till Monday, the 12th of November 2018.
>
> Following change has been updated and waiting to be merged:
>
> Disable flaky test test_operator.test_dropout (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13200
>
> Indeed the MACOS tests timed out as well for the branch. The proposed
> change contains thus only the build:
>
> [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13179
>
>
> Best
> Anton
>
> On Fri, Nov 9, 2018 at 13:11, Anton Chernov wrote:
>
>> I created the following PR to disable the test:
>>
>> Disable flaky test test_operator.test_dropout (v1.3.x)
>> https://github.com/apache/incubator-mxnet/pull/13200
>>
>> The second failure I suppose is related to:
>>
>> distributed kvstore bug in MXNet
>> https://github.com/apache/incubator-mxnet/issues/12713
>>
>> Which partially was fixed by
>>
>> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
>> https://github.com/apache/incubator-mxnet/pull/13121
>>
>> But another part of the issue is still open and does not have a fix yet:
>>
>> "When distributed kvstore is used, by default gluon.Trainer doesn't work
>> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more
>> specific, the trainer updates once per GPU, the LRScheduler object is
>> shared across GPUs and get a wrong update count."
>>
>>
>> Best
>> Anton
>>
>>
>> On Fri, Nov 9, 2018 at 11:48, Anton Chernov wrote:
>>
>>> In case the tests for MACOS will time out as well we can disable them
>>> and keep at least the build stage as in:
>>>
>>> Disable travis tests
>>> https://github.com/apache/incubator-mxnet/pull/13137
>>>
>>> Best
>>> Anton
>>>
>>> On Fri, Nov 9, 2018 at 11:17, Anton Chernov wrote:
>>>
>>>>
>>>> Hi Naveen,
>>>>
>>>> I believe that the timeout is not an issue for the branch. And I see
>>>> great benefit in having tests for MACOS on the release branch. The travis
>>>> build is not blocking anyway, so I don't see any risk in adding it.
>>>>
>>>> * test_dropout
>>>>
>>>> Currently, there is a problem with test_dropout that fails consistently
>>>> on the branch:
>>>>
>>>> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/97/pipeline
>>>>
>>>> Error reported:
>>>>
>>>> ======================================================================
>>>> FAIL: test_operator.test_dropout
>>>> ----------------------------------------------------------------------
>>>> Traceback (most recent call last):
>>>>   File "C:\Anaconda3\envs\py3\lib\site-packages\nose\case.py", line 197, in runTest
>>>>     self.test(*self.arg)
>>>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\common.py", line 173, in test_new
>>>>     orig_test(*args, **kwargs)
>>>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5853, in test_dropout
>>>>     check_dropout_ratio(0.0, shape)
>>>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5797, in check_dropout_ratio
>>>>     assert exe.outputs[0].asnumpy().min() == min_value
>>>> AssertionError:
>>>> -------------------- >> begin captured logging << --------------------
>>>> common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=428273587 to reproduce.
>>>> --------------------- >> end captured logging << ---------------------
>>>>
>>>> The test is enabled on master:
>>>>
>>>> Re-enables test_operator.test_dropout
>>>> https://github.com/apache/incubator-mxnet/pull/12717
>>>>
>>>> And there are no failures for it [1].
>>>>
>>>> * KVStore tests
>>>>
>>>> Unfortunately, KVStore tests fail

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-10 Thread Anton Chernov
Due to various problems we had to postpone the tagging and vote for the
release till Monday, the 12th of November 2018.

The following change has been updated and is waiting to be merged:

Disable flaky test test_operator.test_dropout (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13200

Indeed, the macOS tests timed out on the branch as well. The proposed
change therefore contains only the build:

[MXNET-908] Enable minimal OSX Travis build (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13179


Best
Anton

On Fri, Nov 9, 2018 at 13:11, Anton Chernov wrote:

> I created the following PR to disable the test:
>
> Disable flaky test test_operator.test_dropout (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13200
>
> The second failure I suppose is related to:
>
> distributed kvstore bug in MXNet
> https://github.com/apache/incubator-mxnet/issues/12713
>
> Which partially was fixed by
>
> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13121
>
> But another part of the issue is still open and does not have a fix yet:
>
> "When distributed kvstore is used, by default gluon.Trainer doesn't work
> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more
> specific, the trainer updates once per GPU, the LRScheduler object is
> shared across GPUs and get a wrong update count."
>
>
> Best
> Anton
>
>
> On Fri, Nov 9, 2018 at 11:48, Anton Chernov wrote:
>
>> In case the tests for MACOS will time out as well we can disable them and
>> keep at least the build stage as in:
>>
>> Disable travis tests
>> https://github.com/apache/incubator-mxnet/pull/13137
>>
>> Best
>> Anton
>>
>> On Fri, Nov 9, 2018 at 11:17, Anton Chernov wrote:
>>
>>>
>>> Hi Naveen,
>>>
>>> I believe that the timeout is not an issue for the branch. And I see
>>> great benefit in having tests for MACOS on the release branch. The travis
>>> build is not blocking anyway, so I don't see any risk in adding it.
>>>
>>> * test_dropout
>>>
>>> Currently, there is a problem with test_dropout that fails consistently
>>> on the branch:
>>>
>>>
>>> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/97/pipeline
>>>
>>> Error reported:
>>>
>>> ======================================================================
>>> FAIL: test_operator.test_dropout
>>> ----------------------------------------------------------------------
>>> Traceback (most recent call last):
>>>   File "C:\Anaconda3\envs\py3\lib\site-packages\nose\case.py", line 197, in runTest
>>>     self.test(*self.arg)
>>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\common.py", line 173, in test_new
>>>     orig_test(*args, **kwargs)
>>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5853, in test_dropout
>>>     check_dropout_ratio(0.0, shape)
>>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5797, in check_dropout_ratio
>>>     assert exe.outputs[0].asnumpy().min() == min_value
>>> AssertionError:
>>> -------------------- >> begin captured logging << --------------------
>>> common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=428273587 to reproduce.
>>> --------------------- >> end captured logging << ---------------------
>>>
>>> The test is enabled on master:
>>>
>>> Re-enables test_operator.test_dropout
>>> https://github.com/apache/incubator-mxnet/pull/12717
>>>
>>> And there are no failures for it [1].
>>>
>>> * KVStore tests
>>>
>>> Unfortunately, KVStore tests fail as well.
>>>
>>>
>>> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/96/pipeline
>>>
>>> Error reported:
>>>
>>> AssertionError
>>> test_gluon_trainer_type()
>>> assert trainer._update_on_kvstore is update_on_kv\
>>>   File "dist_sync_kvstore.py", line 388, in test_gluon_trainer_type
>>>
>>> If nobody has a fix for these issues, I will disable the tests and add
>>> information to the known issues section.
>>>
>>> Best
>>> Anton
>>>
>>> [1]
>>> http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/master/
>>>
>>> On Thu, Nov 8, 2018 at 21:44, Naveen Swamy wrote:
>>>
>>>> Anton, I don't think we need to add the Mac OS tests for 1.3.1 branch
>>>> since travis CI is timing out and creates blockers, it also did not
>>>> exist for v1.3.0.
>>>>
>>>>
>>>> On Thu, Nov 8, 2018 at 10:04 AM Anton Chernov wrote:
>>>>
>>>> > A PR to fix the tests:
>>>> >
>>>> > Remove test for non existing index copy operator (v1.3.x)
>>>> > https://github.com/apache/incubator-mxnet/pull/13180
>>>> >
>>>> >
>>>> > Best
>>>> > Anton
>>>> >
>>>> > On Thu, Nov 8, 2018 at 10:05, Anton Chernov wrote:
>>>> >
>>>> > > An addition has been made to include MacOS tests for the v1.3.x branch:
>>>> > >
>>>> > > [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
>>>> > > https://github.com/apache/incubator-mxnet/pull

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-09 Thread Anton Chernov
I created the following PR to disable the test:

Disable flaky test test_operator.test_dropout (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13200

I suppose the second failure is related to:

distributed kvstore bug in MXNet
https://github.com/apache/incubator-mxnet/issues/12713

which was partially fixed by:

Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13121

But another part of the issue is still open and does not have a fix yet:

"When distributed kvstore is used, by default gluon.Trainer doesn't work
with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more
specific, the trainer updates once per GPU, the LRScheduler object is
shared across GPUs and get a wrong update count."
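
The effect described in the quoted issue text can be modelled in a few lines
of plain Python. This is only a simplified sketch of the counting problem,
not MXNet code: `HalvingScheduler` and `simulate` are hypothetical names, and
the real Trainer and LRScheduler classes are not used.

```python
class HalvingScheduler:
    """Hypothetical stand-in for an LR scheduler: halves the rate every `step` updates."""

    def __init__(self, base_lr, step):
        self.base_lr = base_lr
        self.step = step

    def __call__(self, num_update):
        return self.base_lr * 0.5 ** (num_update // self.step)


def simulate(iterations, num_gpus, scheduler):
    """Count updates the way the quoted issue describes: once per GPU per iteration."""
    num_update = 0
    for _ in range(iterations):
        for _ in range(num_gpus):
            num_update += 1  # the shared counter advances for every device
    return num_update, scheduler(num_update)


scheduler = HalvingScheduler(base_lr=0.1, step=100)
print(simulate(100, 1, scheduler))  # counter matches the number of iterations
print(simulate(100, 4, scheduler))  # counter is 4x too high, so the rate decays too early
```

With one GPU the counter equals the number of training iterations; with four
GPUs it runs four times as fast, so any schedule keyed on the update count
fires early.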


Best
Anton


On Fri, Nov 9, 2018 at 11:48, Anton Chernov wrote:

> In case the tests for MACOS will time out as well we can disable them and
> keep at least the build stage as in:
>
> Disable travis tests
> https://github.com/apache/incubator-mxnet/pull/13137
>
> Best
> Anton
>
> On Fri, Nov 9, 2018 at 11:17, Anton Chernov wrote:
>
>>
>> Hi Naveen,
>>
>> I believe that the timeout is not an issue for the branch. And I see
>> great benefit in having tests for MACOS on the release branch. The travis
>> build is not blocking anyway, so I don't see any risk in adding it.
>>
>> * test_dropout
>>
>> Currently, there is a problem with test_dropout that fails consistently
>> on the branch:
>>
>>
>> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/97/pipeline
>>
>> Error reported:
>>
>> ======================================================================
>> FAIL: test_operator.test_dropout
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>   File "C:\Anaconda3\envs\py3\lib\site-packages\nose\case.py", line 197, in runTest
>>     self.test(*self.arg)
>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\common.py", line 173, in test_new
>>     orig_test(*args, **kwargs)
>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5853, in test_dropout
>>     check_dropout_ratio(0.0, shape)
>>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5797, in check_dropout_ratio
>>     assert exe.outputs[0].asnumpy().min() == min_value
>> AssertionError:
>> -------------------- >> begin captured logging << --------------------
>> common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=428273587 to reproduce.
>> --------------------- >> end captured logging << ---------------------
>>
>> The test is enabled on master:
>>
>> Re-enables test_operator.test_dropout
>> https://github.com/apache/incubator-mxnet/pull/12717
>>
>> And there are no failures for it [1].
>>
>> * KVStore tests
>>
>> Unfortunately, KVStore tests fail as well.
>>
>>
>> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/96/pipeline
>>
>> Error reported:
>>
>> AssertionError
>> test_gluon_trainer_type()
>> assert trainer._update_on_kvstore is update_on_kv\
>>   File "dist_sync_kvstore.py", line 388, in test_gluon_trainer_type
>>
>> If nobody has a fix for these issues, I will disable the tests and add
>> information to the known issues section.
>>
>> Best
>> Anton
>>
>> [1] http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/master/
>>
>> On Thu, Nov 8, 2018 at 21:44, Naveen Swamy wrote:
>>
>>> Anton, I don't think we need to add the Mac OS tests for 1.3.1 branch
>>> since travis CI is timing out and creates blockers, it also did not
>>> exist for v1.3.0.
>>>
>>>
>>> On Thu, Nov 8, 2018 at 10:04 AM Anton Chernov wrote:
>>>
>>> > A PR to fix the tests:
>>> >
>>> > Remove test for non existing index copy operator (v1.3.x)
>>> > https://github.com/apache/incubator-mxnet/pull/13180
>>> >
>>> >
>>> > Best
>>> > Anton
>>> >
>>> > On Thu, Nov 8, 2018 at 10:05, Anton Chernov wrote:
>>> >
>>> > > An addition has been made to include MacOS tests for the v1.3.x branch:
>>> > >
>>> > > [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
>>> > > https://github.com/apache/incubator-mxnet/pull/13179
>>> > >
>>> > > It includes following PR's for master:
>>> > >
>>> > > [MXNET-908] Enable minimal OSX Travis build
>>> > > https://github.com/apache/incubator-mxnet/pull/12462
>>> > >
>>> > > [MXNET-908] Enable python tests in Travis
>>> > > https://github.com/apache/incubator-mxnet/pull/12550
>>> > >
>>> > > [MXNET-968] Fix MacOS python tests
>>> > > https://github.com/apache/incubator-mxnet/pull/12590
>>> > >
>>> > >
>>> > > Best
>>> > > Anton
>>> > >
>>> > >
>>> > > On Thu, Nov 8, 2018 at 9:38, Anton Chernov wrote:
>>> > >
>>> > >> Thank you everyone for your support and suggestions. All proposed PR's
>>> > >> have been merged. We will tag the release candidate and start the vote on
>>> > >> Frida

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-09 Thread Anton Chernov
If the macOS tests time out as well, we can disable them and keep at least
the build stage, as in:

Disable travis tests
https://github.com/apache/incubator-mxnet/pull/13137

Best
Anton

On Fri, Nov 9, 2018 at 11:17, Anton Chernov wrote:

>
> Hi Naveen,
>
> I believe that the timeout is not an issue for the branch. And I see great
> benefit in having tests for MACOS on the release branch. The travis build
> is not blocking anyway, so I don't see any risk in adding it.
>
> * test_dropout
>
> Currently, there is a problem with test_dropout that fails consistently on
> the branch:
>
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/97/pipeline
>
> Error reported:
>
> ======================================================================
> FAIL: test_operator.test_dropout
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "C:\Anaconda3\envs\py3\lib\site-packages\nose\case.py", line 197, in runTest
>     self.test(*self.arg)
>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\common.py", line 173, in test_new
>     orig_test(*args, **kwargs)
>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5853, in test_dropout
>     check_dropout_ratio(0.0, shape)
>   File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5797, in check_dropout_ratio
>     assert exe.outputs[0].asnumpy().min() == min_value
> AssertionError:
> -------------------- >> begin captured logging << --------------------
> common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=428273587 to reproduce.
> --------------------- >> end captured logging << ---------------------
>
> The test is enabled on master:
>
> Re-enables test_operator.test_dropout
> https://github.com/apache/incubator-mxnet/pull/12717
>
> And there are no failures for it [1].
>
> * KVStore tests
>
> Unfortunately, KVStore tests fail as well.
>
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/96/pipeline
>
> Error reported:
>
> AssertionError
> test_gluon_trainer_type()
> assert trainer._update_on_kvstore is update_on_kv\
>   File "dist_sync_kvstore.py", line 388, in test_gluon_trainer_type
>
> If nobody has a fix for these issues, I will disable the tests and add
> information to the known issues section.
>
> Best
> Anton
>
> [1] http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/master/
>
> On Thu, Nov 8, 2018 at 21:44, Naveen Swamy wrote:
>
>> Anton, I don't think we need to add the Mac OS tests for 1.3.1 branch
>> since travis CI is timing out and creates blockers, it also did not
>> exist for v1.3.0.
>>
>>
>> On Thu, Nov 8, 2018 at 10:04 AM Anton Chernov wrote:
>>
>> > A PR to fix the tests:
>> >
>> > Remove test for non existing index copy operator (v1.3.x)
>> > https://github.com/apache/incubator-mxnet/pull/13180
>> >
>> >
>> > Best
>> > Anton
>> >
>> > On Thu, Nov 8, 2018 at 10:05, Anton Chernov wrote:
>> >
>> > > An addition has been made to include MacOS tests for the v1.3.x branch:
>> > >
>> > > [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
>> > > https://github.com/apache/incubator-mxnet/pull/13179
>> > >
>> > > It includes following PR's for master:
>> > >
>> > > [MXNET-908] Enable minimal OSX Travis build
>> > > https://github.com/apache/incubator-mxnet/pull/12462
>> > >
>> > > [MXNET-908] Enable python tests in Travis
>> > > https://github.com/apache/incubator-mxnet/pull/12550
>> > >
>> > > [MXNET-968] Fix MacOS python tests
>> > > https://github.com/apache/incubator-mxnet/pull/12590
>> > >
>> > >
>> > > Best
>> > > Anton
>> > >
>> > >
>> > > On Thu, Nov 8, 2018 at 9:38, Anton Chernov wrote:
>> > >
>> > >> Thank you everyone for your support and suggestions. All proposed PR's
>> > >> have been merged. We will tag the release candidate and start the vote on
>> > >> Friday, the 9th of November 2018.
>> > >>
>> > >> Unfortunately after the merges the tests started to fail:
>> > >>
>> > >> http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.3.x/
>> > >>
>> > >> I will look into the failures, but any help as usual is very appreciated.
>> > >>
>> > >> The nightly tests are fine:
>> > >> http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests/job/v1.3.x/
>> > >>
>> > >>
>> > >> Best
>> > >> Anton
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Wed, Nov 7, 2018 at 17:19, Anton Chernov wrote:
>> > >>
>> > >>> Yes, you are right about the versions wording, thanks for clarification.
>> > >>>
>> > >>> A performance improvement can be considered a bugfix as well. I see no
>> > >>> big risks in including PR's by Haibin and Lin into the patch release.
>> > >>>
>> > >>> @Haibin, if you can reopen the PR's they should be good to go for the
>> > >>> release, considering the importance of the improvements.
>> > >>>
>> > >>> I propose t

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-09 Thread Anton Chernov
Hi Naveen,

I believe that the timeout is not an issue for the branch, and I see great
benefit in having macOS tests on the release branch. The Travis build
is not blocking anyway, so I don't see any risk in adding it.

* test_dropout

Currently, there is a problem with test_dropout that fails consistently on
the branch:

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/97/pipeline

Error reported:

======================================================================
FAIL: test_operator.test_dropout
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py3\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\common.py", line 173, in test_new
    orig_test(*args, **kwargs)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5853, in test_dropout
    check_dropout_ratio(0.0, shape)
  File "C:\jenkins_slave\workspace\ut-python-gpu\tests\python\unittest\test_operator.py", line 5797, in check_dropout_ratio
    assert exe.outputs[0].asnumpy().min() == min_value
AssertionError:
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=428273587 to reproduce.
--------------------- >> end captured logging << ---------------------
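
For context, the failing assertion compares the minimum of the dropout output
with an expected value for ratio 0.0. A minimal sketch of that idea in plain
Python, assuming inverted dropout applied to an all-ones input (an assumption
about the test, not taken from test_operator.py); `dropout` here is a
hypothetical helper, not the MXNet operator.

```python
import random


def dropout(values, ratio, rng):
    """Inverted dropout: zero each element with probability `ratio`,
    scale the survivors by 1 / (1 - ratio)."""
    if ratio >= 1.0:
        return [0.0 for _ in values]
    keep = 1.0 - ratio
    return [v / keep if rng.random() >= ratio else 0.0 for v in values]


# The seed value is borrowed from the log above only to make this sketch
# deterministic; it does not reproduce MXNet's random stream.
rng = random.Random(428273587)
ones = [1.0] * 1000

out = dropout(ones, 0.0, rng)
assert min(out) == 1.0  # with ratio 0.0 nothing may be dropped
```

Under these assumptions any zero in the output for ratio 0.0 would trip the
same kind of assertion as the one in the traceback.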

The test is enabled on master:

Re-enables test_operator.test_dropout
https://github.com/apache/incubator-mxnet/pull/12717

And there are no failures for it [1].

* KVStore tests

Unfortunately, KVStore tests fail as well.

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.3.x/96/pipeline

Error reported:

AssertionError
test_gluon_trainer_type()
assert trainer._update_on_kvstore is update_on_kv\
  File "dist_sync_kvstore.py", line 388, in test_gluon_trainer_type

If nobody has a fix for these issues, I will disable the tests and add
information to the known issues section.

Best
Anton

[1] http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/master/

On Thu, Nov 8, 2018 at 21:44, Naveen Swamy wrote:

> Anton, I don't think we need to add the Mac OS tests for 1.3.1 branch since
> travis CI is timing out and creates blockers, it also did not exist for
> v1.3.0.
>
>
> On Thu, Nov 8, 2018 at 10:04 AM Anton Chernov  wrote:
>
> > A PR to fix the tests:
> >
> > Remove test for non existing index copy operator (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13180
> >
> >
> > Best
> > Anton
> >
> > On Thu, Nov 8, 2018 at 10:05, Anton Chernov wrote:
> >
> > > An addition has been made to include MacOS tests for the v1.3.x branch:
> > >
> > > [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13179
> > >
> > > It includes following PR's for master:
> > >
> > > [MXNET-908] Enable minimal OSX Travis build
> > > https://github.com/apache/incubator-mxnet/pull/12462
> > >
> > > [MXNET-908] Enable python tests in Travis
> > > https://github.com/apache/incubator-mxnet/pull/12550
> > >
> > > [MXNET-968] Fix MacOS python tests
> > > https://github.com/apache/incubator-mxnet/pull/12590
> > >
> > >
> > > Best
> > > Anton
> > >
> > >
> > > On Thu, Nov 8, 2018 at 9:38, Anton Chernov wrote:
> > >
> > >> Thank you everyone for your support and suggestions. All proposed PR's
> > >> have been merged. We will tag the release candidate and start the vote on
> > >> Friday, the 9th of November 2018.
> > >>
> > >> Unfortunately after the merges the tests started to fail:
> > >>
> > >> http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.3.x/
> > >>
> > >> I will look into the failures, but any help as usual is very appreciated.
> > >>
> > >> The nightly tests are fine:
> > >> http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests/job/v1.3.x/
> > >>
> > >>
> > >> Best
> > >> Anton
> > >>
> > >>
> > >>
> > >>
> > >> On Wed, Nov 7, 2018 at 17:19, Anton Chernov wrote:
> > >>
> > >>> Yes, you are right about the versions wording, thanks for clarification.
> > >>>
> > >>> A performance improvement can be considered a bugfix as well. I see no
> > >>> big risks in including PR's by Haibin and Lin into the patch release.
> > >>>
> > >>> @Haibin, if you can reopen the PR's they should be good to go for the
> > >>> release, considering the importance of the improvements.
> > >>>
> > >>> I propose the following bugfixes for the release as well (already
> > >>> created corresponding PR's):
> > >>>
> > >>> Fixed __setattr__ method of _MXClassPropertyMetaClass (v1.3.x)
> > >>> https://github.com/apache/incubator-mxnet/pull/13157
> > >>>
> > >>> fixed symbols naming in RNNCell, LSTMCell, GRUCell (v1.3.x)
> > >>> https://github.com/apache/incubator-mxnet/pull/13158
> > >>>
> > >>> We will be starting to merge the PR's shortly. If there are no more
> > >>> proposals for b

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-08 Thread Naveen Swamy
Anton, I don't think we need to add the macOS tests to the 1.3.1 branch,
since Travis CI is timing out and creating blockers. They also did not exist
for v1.3.0.


On Thu, Nov 8, 2018 at 10:04 AM Anton Chernov  wrote:

> A PR to fix the tests:
>
> Remove test for non existing index copy operator (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13180
>
>
> Best
> Anton
>
> On Thu, Nov 8, 2018 at 10:05, Anton Chernov wrote:
>
> > An addition has been made to include MacOS tests for the v1.3.x branch:
> >
> > [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13179
> >
> > It includes following PR's for master:
> >
> > [MXNET-908] Enable minimal OSX Travis build
> > https://github.com/apache/incubator-mxnet/pull/12462
> >
> > [MXNET-908] Enable python tests in Travis
> > https://github.com/apache/incubator-mxnet/pull/12550
> >
> > [MXNET-968] Fix MacOS python tests
> > https://github.com/apache/incubator-mxnet/pull/12590
> >
> >
> > Best
> > Anton
> >
> >
> > On Thu, Nov 8, 2018 at 9:38, Anton Chernov wrote:
> >
> >> Thank you everyone for your support and suggestions. All proposed PR's
> >> have been merged. We will tag the release candidate and start the vote on
> >> Friday, the 9th of November 2018.
> >>
> >> Unfortunately after the merges the tests started to fail:
> >>
> >> http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.3.x/
> >>
> >> I will look into the failures, but any help as usual is very appreciated.
> >>
> >> The nightly tests are fine:
> >> http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests/job/v1.3.x/
> >>
> >>
> >> Best
> >> Anton
> >>
> >>
> >>
> >>
> >> On Wed, Nov 7, 2018 at 17:19, Anton Chernov wrote:
> >>
> >>> Yes, you are right about the versions wording, thanks for clarification.
> >>>
> >>> A performance improvement can be considered a bugfix as well. I see no
> >>> big risks in including PR's by Haibin and Lin into the patch release.
> >>>
> >>> @Haibin, if you can reopen the PR's they should be good to go for the
> >>> release, considering the importance of the improvements.
> >>>
> >>> I propose the following bugfixes for the release as well (already
> >>> created corresponding PR's):
> >>>
> >>> Fixed __setattr__ method of _MXClassPropertyMetaClass (v1.3.x)
> >>> https://github.com/apache/incubator-mxnet/pull/13157
> >>>
> >>> fixed symbols naming in RNNCell, LSTMCell, GRUCell (v1.3.x)
> >>> https://github.com/apache/incubator-mxnet/pull/13158
> >>>
> >>> We will be starting to merge the PR's shortly. If there are no more
> >>> proposals for backporting I would consider the list as set.
> >>>
> >>> Best
> >>> Anton
> >>>
> >>> On Wed, Nov 7, 2018 at 17:01, Sheng Zha wrote:
> >>>
> >>>> Hi Anton,
> >>>>
> >>>> I hear your concern about a simultaneous 1.4.0 release and it certainly
> >>>> is a valid one.
> >>>>
> >>>> Regarding the release, let’s agree on the language first. According to
> >>>> semver.org, 1.3.1 release is considered patch release, which is for
> >>>> backward compatible bug fixes, while 1.4.0 release is considered minor
> >>>> release, which is for backward compatible new features. A major release
> >>>> would mean 2.0.
> >>>>
> >>>> The three PRs suggested by Haibin and Lin are all introducing new
> >>>> features. If they go into a patch release, it would require an exception
> >>>> accepted by the community. Also, if other violation happens it could be
> >>>> ground for declining a release during votes.
> >>>>
> >>>> -sz
> >>>>
> >>>> > On Nov 7, 2018, at 2:25 AM, Anton Chernov wrote:
> >>>> >
> >>>> > [MXNET-1179] Enforce deterministic algorithms in convolution layers
> 
> >>>
>


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-08 Thread Anton Chernov
Thank you everyone for your support and suggestions. All proposed PR's have
been merged. We will tag the release candidate and start the vote on
Friday, the 9th of November 2018.

Unfortunately, after the merges the tests started to fail:

http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.3.x/

I will look into the failures, but any help is, as usual, much appreciated.

The nightly tests are fine:
http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests/job/v1.3.x/


Best
Anton




On Wed, Nov 7, 2018 at 17:19, Anton Chernov wrote:

> Yes, you are right about the versions wording, thanks for clarification.
>
> A performance improvement can be considered a bugfix as well. I see no big
> risks in including PR's by Haibin and Lin into the patch release.
>
> @Haibin, if you can reopen the PR's they should be good to go for the
> release, considering the importance of the improvements.
>
> I propose the following bugfixes for the release as well (already created
> corresponding PR's):
>
> Fixed __setattr__ method of _MXClassPropertyMetaClass (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13157
>
> fixed symbols naming in RNNCell, LSTMCell, GRUCell (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13158
>
> We will be starting to merge the PR's shortly. If there are no more
> proposals for backporting I would consider the list as set.
>
> Best
> Anton
>
> On Wed, Nov 7, 2018 at 17:01, Sheng Zha wrote:
>
>> Hi Anton,
>>
>> I hear your concern about a simultaneous 1.4.0 release and it certainly
>> is a valid one.
>>
>> Regarding the release, let’s agree on the language first. According to
>> semver.org, 1.3.1 release is considered patch release, which is for
>> backward compatible bug fixes, while 1.4.0 release is considered minor
>> release, which is for backward compatible new features. A major release
>> would mean 2.0.
>>
>> The three PRs suggested by Haibin and Lin are all introducing new
>> features. If they go into a patch release, it would require an exception
>> accepted by the community. Also, if other violation happens it could be
>> ground for declining a release during votes.
>>
>> -sz
>>
>> > On Nov 7, 2018, at 2:25 AM, Anton Chernov  wrote:
>> >
>> > [MXNET-1179] Enforce deterministic algorithms in convolution layers
>>
>


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-08 Thread Anton Chernov
A PR to fix the tests:

Remove test for non existing index copy operator (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13180


Best
Anton

On Thu, Nov 8, 2018 at 10:05, Anton Chernov wrote:

> An addition has been made to include MacOS tests for the v1.3.x branch:
>
> [MXNET-908] Enable minimal OSX Travis build (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13179
>
> It includes following PR's for master:
>
> [MXNET-908] Enable minimal OSX Travis build
> https://github.com/apache/incubator-mxnet/pull/12462
>
> [MXNET-908] Enable python tests in Travis
> https://github.com/apache/incubator-mxnet/pull/12550
>
> [MXNET-968] Fix MacOS python tests
> https://github.com/apache/incubator-mxnet/pull/12590
>
>
> Best
> Anton
>
>
> On Thu, Nov 8, 2018 at 9:38, Anton Chernov wrote:
>
>> Thank you everyone for your support and suggestions. All proposed PR's
>> have been merged. We will tag the release candidate and start the vote on
>> Friday, the 9th of November 2018.
>>
>> Unfortunately after the merges the tests started to fail:
>>
>> http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.3.x/
>>
>> I will look into the failures, but any help as usual is very appreciated.
>>
>> The nightly tests are fine:
>> http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests/job/v1.3.x/
>>
>>
>> Best
>> Anton
>>
>>
>>
>>
>> On Wed, Nov 7, 2018 at 17:19, Anton Chernov wrote:
>>
>>> Yes, you are right about the versions wording, thanks for clarification.
>>>
>>> A performance improvement can be considered a bugfix as well. I see no
>>> big risks in including PR's by Haibin and Lin into the patch release.
>>>
>>> @Haibin, if you can reopen the PR's they should be good to go for the
>>> release, considering the importance of the improvements.
>>>
>>> I propose the following bugfixes for the release as well (already
>>> created corresponding PR's):
>>>
>>> Fixed __setattr__ method of _MXClassPropertyMetaClass (v1.3.x)
>>> https://github.com/apache/incubator-mxnet/pull/13157
>>>
>>> fixed symbols naming in RNNCell, LSTMCell, GRUCell (v1.3.x)
>>> https://github.com/apache/incubator-mxnet/pull/13158
>>>
>>> We will be starting to merge the PR's shortly. If there are no more proposals
>>> for backporting I would consider the list as set.
>>>
>>> Best
>>> Anton
>>>
>>> On Wed, Nov 7, 2018 at 17:01, Sheng Zha wrote:
>>>
>>>> Hi Anton,
>>>>
>>>> I hear your concern about a simultaneous 1.4.0 release and it certainly
>>>> is a valid one.
>>>>
>>>> Regarding the release, let’s agree on the language first. According to
>>>> semver.org, 1.3.1 release is considered patch release, which is for
>>>> backward compatible bug fixes, while 1.4.0 release is considered minor
>>>> release, which is for backward compatible new features. A major release
>>>> would mean 2.0.
>>>>
>>>> The three PRs suggested by Haibin and Lin are all introducing new
>>>> features. If they go into a patch release, it would require an exception
>>>> accepted by the community. Also, if other violation happens it could be
>>>> ground for declining a release during votes.
>>>>
>>>> -sz
>>>>
>>>> > On Nov 7, 2018, at 2:25 AM, Anton Chernov wrote:
>>>> >
>>>> > [MXNET-1179] Enforce deterministic algorithms in convolution layers
>>>>
>>>


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-08 Thread Anton Chernov
An addition has been made to include MacOS tests for the v1.3.x branch:

[MXNET-908] Enable minimal OSX Travis build (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13179

It includes the following PR's for master:

[MXNET-908] Enable minimal OSX Travis build
https://github.com/apache/incubator-mxnet/pull/12462

[MXNET-908] Enable python tests in Travis
https://github.com/apache/incubator-mxnet/pull/12550

[MXNET-968] Fix MacOS python tests
https://github.com/apache/incubator-mxnet/pull/12590


Best
Anton


On Thu, Nov 8, 2018 at 9:38, Anton Chernov wrote:

> Thank you everyone for your support and suggestions. All proposed PR's
> have been merged. We will tag the release candidate and start the vote on
> Friday, the 9th of November 2018.
>
> Unfortunately after the merges the tests started to fail:
>
> http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.3.x/
>
> I will look into the failures, but any help as usual is very appreciated.
>
> The nightly tests are fine:
> http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests/job/v1.3.x/
>
>
> Best
> Anton
>
>
>
>
> On Wed, Nov 7, 2018 at 17:19, Anton Chernov wrote:
>
>> Yes, you are right about the versions wording, thanks for clarification.
>>
>> A performance improvement can be considered a bugfix as well. I see no
>> big risks in including PR's by Haibin and Lin into the patch release.
>>
>> @Haibin, if you can reopen the PR's they should be good to go for the
>> release, considering the importance of the improvements.
>>
>> I propose the following bugfixes for the release as well (already created
>> corresponding PR's):
>>
>> Fixed __setattr__ method of _MXClassPropertyMetaClass (v1.3.x)
>> https://github.com/apache/incubator-mxnet/pull/13157
>>
>> fixed symbols naming in RNNCell, LSTMCell, GRUCell (v1.3.x)
>> https://github.com/apache/incubator-mxnet/pull/13158
>>
>> We will be starting to merge the PR's shortly. If there are no more proposals
>> for backporting I would consider the list as set.
>>
>> Best
>> Anton
>>
>> On Wed, Nov 7, 2018 at 17:01, Sheng Zha wrote:
>>
>>> Hi Anton,
>>>
>>> I hear your concern about a simultaneous 1.4.0 release and it certainly
>>> is a valid one.
>>>
>>> Regarding the release, let’s agree on the language first. According to
>>> semver.org, 1.3.1 release is considered patch release, which is for
>>> backward compatible bug fixes, while 1.4.0 release is considered minor
>>> release, which is for backward compatible new features. A major release
>>> would mean 2.0.
>>>
>>> The three PRs suggested by Haibin and Lin are all introducing new
>>> features. If they go into a patch release, it would require an exception
>>> accepted by the community. Also, if other violation happens it could be
>>> ground for declining a release during votes.
>>>
>>> -sz
>>>
>>> > On Nov 7, 2018, at 2:25 AM, Anton Chernov  wrote:
>>> >
>>> > [MXNET-1179] Enforce deterministic algorithms in convolution layers
>>>
>>


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-07 Thread Anton Chernov
Yes, you are right about the versions wording, thanks for clarification.

A performance improvement can be considered a bugfix as well. I see no big
risks in including PR's by Haibin and Lin into the patch release.

@Haibin, if you can reopen the PR's they should be good to go for the
release, considering the importance of the improvements.

I propose the following bugfixes for the release as well (already created
corresponding PR's):

Fixed __setattr__ method of _MXClassPropertyMetaClass (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13157

fixed symbols naming in RNNCell, LSTMCell, GRUCell (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13158

We will start merging the PR's shortly. If there are no more proposals for
backporting, I would consider the list as set.

Best
Anton

On Wed, Nov 7, 2018 at 17:01, Sheng Zha wrote:

> Hi Anton,
>
> I hear your concern about a simultaneous 1.4.0 release and it certainly is
> a valid one.
>
> Regarding the release, let’s agree on the language first. According to
> semver.org, 1.3.1 release is considered patch release, which is for
> backward compatible bug fixes, while 1.4.0 release is considered minor
> release, which is for backward compatible new features. A major release
> would mean 2.0.
>
> The three PRs suggested by Haibin and Lin are all introducing new
> features. If they go into a patch release, it would require an exception
> accepted by the community. Also, if other violation happens it could be
> ground for declining a release during votes.
>
> -sz
>
> > On Nov 7, 2018, at 2:25 AM, Anton Chernov  wrote:
> >
> > [MXNET-1179] Enforce deterministic algorithms in convolution layers
>


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-07 Thread Sheng Zha
Hi Anton,

I hear your concern about a simultaneous 1.4.0 release and it certainly is a 
valid one.

Regarding the release, let’s agree on the language first. According to
semver.org, the 1.3.1 release is considered a patch release, which is for
backward-compatible bug fixes, while the 1.4.0 release is considered a minor
release, which is for backward-compatible new features. A major release would
mean 2.0.
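
The semver distinction can be sketched in a few lines of Python. This is only
an illustration: the helper names `parse_version` and `classify_bump` are
hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH version strings
without pre-release or build tags.

```python
def parse_version(version):
    """Split a plain MAJOR.MINOR.PATCH string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_bump(old, new):
    """Return 'major', 'minor' or 'patch' for the step from old to new."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be greater than old version")
    if new_v[0] != old_v[0]:
        return "major"  # incompatible API changes
    if new_v[1] != old_v[1]:
        return "minor"  # backward-compatible new features
    return "patch"      # backward-compatible bug fixes


print(classify_bump("1.3.0", "1.3.1"))  # patch
print(classify_bump("1.3.1", "1.4.0"))  # minor
print(classify_bump("1.4.0", "2.0.0"))  # major
```

Under this reading, a PR that adds a feature belongs to a step classified as
"minor" or higher, which is why it would need an exception to ship in 1.3.1.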

The three PRs suggested by Haibin and Lin all introduce new features. If
they go into a patch release, it would require an exception accepted by the
community. Also, if other violations happen, they could be grounds for
declining a release during the vote.

-sz

> On Nov 7, 2018, at 2:25 AM, Anton Chernov  wrote:
> 
> [MXNET-1179] Enforce deterministic algorithms in convolution layers


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-07 Thread Anton Chernov
>> >>>>>
>> >>>>>> On Nov 6, 2018, at 7:17 AM, Anton Chernov 
>> >>> wrote:
>> >>>>>>
>> >>>>>> The following PR's have been created so far:
>> >>>>>>
>> >>>>>> Infer dtype in SymbolBlock import from input symbol (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13117
>> >>>>>>
>> >>>>>> [MXNET-953] Fix oob memory read (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13118
>> >>>>>>
>> >>>>>> [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13119
>> >>>>>>
>> >>>>>> [MXNET-922] Fix memleak in profiler (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13120
>> >>>>>>
>> >>>>>> Set correct update on kvstore flag in dist_device_sync mode
>> >> (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13121
>> >>>>>>
>> >>>>>> update mshadow (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13122
>> >>>>>>
>> >>>>>> CudnnFind() usage improvements (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13123
>> >>>>>>
>> >>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
>> >>>>> (v1.3.x)
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13124
>> >>>>>>
>> >>>>>>
>> >>>>>> As stated previously I would be rather opposed to having the following
>> >>>>>> PR's in the patch release:
>> >>>>>>
>> >>>>>> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13129
>> >>>>>>
>> >>>>>> sample_like operators (#13034) v1.3.x
>> >>>>>> https://github.com/apache/incubator-mxnet/pull/13130
>> >>>>>>
>> >>>>>>
>> >>>>>> Best
>> >>>>>> Anton
>> >>>>>>
>> >>>>>> On Tue, Nov 6, 2018 at 16:06, Anton Chernov wrote:
>> >>>>>>
>> >>>>>>> Hi Haibin,
>> >>>>>>>
>> >>>>>>> I have a few comments regarding the proposed performance
>> >> improvement
>> >>>>>>> changes.
>> >>>>>>>
>> >>>>>>> CUDNN support for LSTM with projection & clipping
>> >>>>>>> https://github.com/apache/incubator-mxnet/pull/13056
>> >>>>>>>
>> >>>>>>> There is no doubt that this change brings value, but I don't see
>> >> it
>> >>>> as a
>> >>>>>>> critical bug fix. I would rather leave it for the next major
>> >>> release.
>> >>>>>>>
>> >>>>>>> sample_like operators
>> >>>>>>> https://github.com/apache/incubator-mxnet/pull/13034
>> >>>>>>>
>> >>>>>>> Even if it's related to performance, this is an addition of
>> >>>>> functionality
>> >>>>>>> and I would also push this to be in the next major release only.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Best
>> >>>>>>> Anton
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Tue, Nov 6, 2018 at 15:55, Anton Chernov wrote:
>> >>>>>>>
>> >>>>>>>> Hi Patric,
>> >>>>>>>>
>> >>>>>>>> This change was listed in the 'PR candidates suggested for
>> >>>>> consideration
>> >>>>>>>> for v1.3.1 patch release' section [1].
>> >>>>>>>>

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-07 Thread Anton Chernov
cubator-mxnet/pull/13121
> >>>>>>
> >>>>>> update mshadow (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13122
> >>>>>>
> >>>>>> CudnnFind() usage improvements (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13123
> >>>>>>
> >>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
> >>>>> (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13124
> >>>>>>
> >>>>>>
> >>>>>> As stated previously I would be rather opposed to having the following
> >>>>>> PR's in the patch release:
> >>>>>>
> >>>>>> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13129
> >>>>>>
> >>>>>> sample_like operators (#13034) v1.3.x
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13130
> >>>>>>
> >>>>>>
> >>>>>> Best
> >>>>>> Anton
> >>>>>>
> >>>>>> On Tue, Nov 6, 2018 at 16:06, Anton Chernov wrote:
> >>>>>>
> >>>>>>> Hi Haibin,
> >>>>>>>
> >>>>>>> I have a few comments regarding the proposed performance
> >> improvement
> >>>>>>> changes.
> >>>>>>>
> >>>>>>> CUDNN support for LSTM with projection & clipping
> >>>>>>> https://github.com/apache/incubator-mxnet/pull/13056
> >>>>>>>
> >>>>>>> There is no doubt that this change brings value, but I don't see
> >> it
> >>>> as a
> >>>>>>> critical bug fix. I would rather leave it for the next major
> >>> release.
> >>>>>>>
> >>>>>>> sample_like operators
> >>>>>>> https://github.com/apache/incubator-mxnet/pull/13034
> >>>>>>>
> >>>>>>> Even if it's related to performance, this is an addition of
> >>>>> functionality
> >>>>>>> and I would also push this to be in the next major release only.
> >>>>>>>
> >>>>>>>
> >>>>>>> Best
> >>>>>>> Anton
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Nov 6, 2018 at 15:55, Anton Chernov wrote:
> >>>>>>>
> >>>>>>>> Hi Patric,
> >>>>>>>>
> >>>>>>>> This change was listed in the 'PR candidates suggested for
> >>>>> consideration
> >>>>>>>> for v1.3.1 patch release' section [1].
> >>>>>>>>
> >>>>>>>> You are right, I also think that this is not a critical hotfix
> >>> change
> >>>>>>>> that should be included into the 1.3.1 patch release.
> >>>>>>>>
> >>>>>>>> Thus I'm not making any further efforts to bring it in.
> >>>>>>>>
> >>>>>>>> Best
> >>>>>>>> Anton
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>>
> >>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric wrote:
> >>>>>>>>
> >>>>>>>>> Hi Anton,
> >>>>>>>>>
> >>>>>>>>> Thanks for looking into the MKL-DNN PR.
> >>>>>>>>>
> >>>>>>>>> As my understanding of cwiki (
> >>>>>>>>>
> >>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> >>>>>>>>> ),

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Sheng Zha
>>>>>> https://github.com/apache/incubator-mxnet/pull/13130
>>>>>> 
>>>>>> 
>>>>>> Best
>>>>>> Anton
>>>>>> 
>>>>>> On Tue, Nov 6, 2018 at 16:06, Anton Chernov wrote:
>>>>>> 
>>>>>>> Hi Haibin,
>>>>>>> 
>>>>>>> I have a few comments regarding the proposed performance
>> improvement
>>>>>>> changes.
>>>>>>> 
>>>>>>> CUDNN support for LSTM with projection & clipping
>>>>>>> https://github.com/apache/incubator-mxnet/pull/13056
>>>>>>> 
>>>>>>> There is no doubt that this change brings value, but I don't see
>> it
>>>> as a
>>>>>>> critical bug fix. I would rather leave it for the next major
>>> release.
>>>>>>> 
>>>>>>> sample_like operators
>>>>>>> https://github.com/apache/incubator-mxnet/pull/13034
>>>>>>> 
>>>>>>> Even if it's related to performance, this is an addition of
>>>>> functionality
>>>>>>> and I would also push this to be in the next major release only.
>>>>>>> 
>>>>>>> 
>>>>>>> Best
>>>>>>> Anton
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Nov 6, 2018 at 15:55, Anton Chernov wrote:
>>>>>>> 
>>>>>>>> Hi Patric,
>>>>>>>> 
>>>>>>>> This change was listed in the 'PR candidates suggested for
>>>>> consideration
>>>>>>>> for v1.3.1 patch release' section [1].
>>>>>>>> 
>>>>>>>> You are right, I also think that this is not a critical hotfix
>>> change
>>>>>>>> that should be included into the 1.3.1 patch release.
>>>>>>>> 
>>>>>>>> Thus I'm not making any further efforts to bring it in.
>>>>>>>> 
>>>>>>>> Best
>>>>>>>> Anton
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> 
>>>>> 
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric wrote:
>>>>>>>> 
>>>>>>>>> Hi Anton,
>>>>>>>>> 
>>>>>>>>> Thanks for looking into the MKL-DNN PR.
>>>>>>>>> 
>>>>>>>>> As my understanding of cwiki (
>>>>>>>>> 
>>>>> 
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>>>>>>>> ),
>>>>>>>>> these features will go into 1.4 rather than patch release of
>>> 1.3.1.
>>>>>>>>> 
>>>>>>>>> Feel free to correct me :)
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> 
>>>>>>>>> --Patric
>>>>>>>>> 
>>>>>>>>>> -Original Message-
>>>>>>>>>> From: Anton Chernov [mailto:mecher...@gmail.com]
>>>>>>>>>> Sent: Tuesday, November 6, 2018 3:11 AM
>>>>>>>>>> To: d...@mxnet.apache.org
>>>>>>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating)
>> 1.3.1
>>>>> patch
>>>>>>>>>> release
>>>>>>>>>> 
>>>>>>>>>> It seems that there is a problem porting following changes to
>> the
>>>>>>>>> v1.3.x
>>>>>>>>>> release branch:
>>>>>>>>>> 
>>>>>>>>>> Implement mkldnn convolution fusion and quantization
>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>>>>> 
>>>>>>>>>> MKL-DNN Quantization Examples and README

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Lin Yuan
hat this change brings value, but I don't see
> it
> > > as a
> > > > >> critical bug fix. I would rather leave it for the next major
> > release.
> > > > >>
> > > > >> sample_like operators
> > > > >> https://github.com/apache/incubator-mxnet/pull/13034
> > > > >>
> > > > >> Even if it's related to performance, this is an addition of
> > > > functionality
> > > > >> and I would also push this to be in the next major release only.
> > > > >>
> > > > >>
> > > > >> Best
> > > > >> Anton
> > > > >>
> > > > >>
> > > > >> On Tue, Nov 6, 2018 at 15:55, Anton Chernov wrote:
> > > > >>
> > > > >>> Hi Patric,
> > > > >>>
> > > > >>> This change was listed in the 'PR candidates suggested for
> > > > consideration
> > > > >>> for v1.3.1 patch release' section [1].
> > > > >>>
> > > > >>> You are right, I also think that this is not a critical hotfix
> > change
> > > > >>> that should be included into the 1.3.1 patch release.
> > > > >>>
> > > > >>> Thus I'm not making any further efforts to bring it in.
> > > > >>>
> > > > >>> Best
> > > > >>> Anton
> > > > >>>
> > > > >>> [1]
> > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> > > > >>>
> > > > >>>
> > > > >>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric wrote:
> > > > >>>
> > > > >>>> Hi Anton,
> > > > >>>>
> > > > >>>> Thanks for looking into the MKL-DNN PR.
> > > > >>>>
> > > > >>>> As my understanding of cwiki (
> > > > >>>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> > > > >>>> ),
> > > > >>>> these features will go into 1.4 rather than patch release of
> > 1.3.1.
> > > > >>>>
> > > > >>>> Feel free to correct me :)
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>>
> > > > >>>> --Patric
> > > > >>>>
> > > > >>>>> -Original Message-
> > > > >>>>> From: Anton Chernov [mailto:mecher...@gmail.com]
> > > > >>>>> Sent: Tuesday, November 6, 2018 3:11 AM
> > > > >>>>> To: d...@mxnet.apache.org
> > > > >>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating)
> 1.3.1
> > > > patch
> > > > >>>>> release
> > > > >>>>>
> > > > >>>>> It seems that there is a problem porting following changes to
> the
> > > > >>>> v1.3.x
> > > > >>>>> release branch:
> > > > >>>>>
> > > > >>>>> Implement mkldnn convolution fusion and quantization
> > > > >>>>> https://github.com/apache/incubator-mxnet/pull/12530
> > > > >>>>>
> > > > >>>>> MKL-DNN Quantization Examples and README
> > > > >>>>> https://github.com/apache/incubator-mxnet/pull/12808
> > > > >>>>>
> > > > >>>>> The bases are different.
> > > > >>>>>
> > > > >>>>> I would need help from authors of these changes to make a
> > backport
> > > > PR.
> > > > >>>>>
> > > > >>>>> @ZhennanQin, @xinyu-intel would you be able to assist me and
> > create
> > > > the
> > > > >>>>> corresponding PR's?
> > > > >>>>>
> > > > >>>>> Without proper history and domain knowledge I would not be able
> > to
> > > > >>>> create
> > > > >>>>> them by my own in reasonable amount of ti

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Aaron Markham
>>> This change was listed in the 'PR candidates suggested for
> > > consideration
> > > >>> for v1.3.1 patch release' section [1].
> > > >>>
> > > >>> You are right, I also think that this is not a critical hotfix
> change
> > > >>> that should be included into the 1.3.1 patch release.
> > > >>>
> > > >>> Thus I'm not making any further efforts to bring it in.
> > > >>>
> > > >>> Best
> > > >>> Anton
> > > >>>
> > > >>> [1]
> > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> > > >>>
> > > >>>
> > > >>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric wrote:
> > > >>>
> > > >>>> Hi Anton,
> > > >>>>
> > > >>>> Thanks for looking into the MKL-DNN PR.
> > > >>>>
> > > >>>> As my understanding of cwiki (
> > > >>>>
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> > > >>>> ),
> > > >>>> these features will go into 1.4 rather than patch release of
> 1.3.1.
> > > >>>>
> > > >>>> Feel free to correct me :)
> > > >>>>
> > > >>>> Thanks,
> > > >>>>
> > > >>>> --Patric
> > > >>>>
> > > >>>>> -Original Message-
> > > >>>>> From: Anton Chernov [mailto:mecher...@gmail.com]
> > > >>>>> Sent: Tuesday, November 6, 2018 3:11 AM
> > > >>>>> To: d...@mxnet.apache.org
> > > >>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1
> > > patch
> > > >>>>> release
> > > >>>>>
> > > >>>>> It seems that there is a problem porting following changes to the
> > > >>>> v1.3.x
> > > >>>>> release branch:
> > > >>>>>
> > > >>>>> Implement mkldnn convolution fusion and quantization
> > > >>>>> https://github.com/apache/incubator-mxnet/pull/12530
> > > >>>>>
> > > >>>>> MKL-DNN Quantization Examples and README
> > > >>>>> https://github.com/apache/incubator-mxnet/pull/12808
> > > >>>>>
> > > >>>>> The bases are different.
> > > >>>>>
> > > >>>>> I would need help from authors of these changes to make a
> backport
> > > PR.
> > > >>>>>
> > > >>>>> @ZhennanQin, @xinyu-intel would you be able to assist me and
> create
> > > the
> > > >>>>> corresponding PR's?
> > > >>>>>
> > > >>>>> Without proper history and domain knowledge I would not be able
> to
> > > >>>> create
> > > >>>>> them by my own in reasonable amount of time, I'm afraid.
> > > >>>>>
> > > >>>>> Best regards,
> > > >>>>> Anton
> > > >>>>>
> > > >>>>> On Mon, Nov 5, 2018 at 19:45, Anton Chernov wrote:
> > > >>>>>
> > > >>>>>>
> > > >>>>>> As part of:
> > > >>>>>>
> > > >>>>>> Implement mkldnn convolution fusion and quantization
> > > >>>>>> https://github.com/apache/incubator-mxnet/pull/12530
> > > >>>>>>
> > > >>>>>> I propose to add the examples and documentation PR as well:
> > > >>>>>>
> > > >>>>>> MKL-DNN Quantization Examples and README
> > > >>>>>> https://github.com/apache/incubator-mxnet/pull/12808
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Best regards,
> > > >>>>>> Anton
> > > >>>>>>
> > > >>>>>> On Mon, Nov 5, 2018 at 19:02, Anton Chernov wrote:
> > > >>>>>>
> > > >>>>>>> De

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Lai Wei
Hi Anton,

Thanks for driving this. I would like to include the following fix in
1.3.1:
Allow infer shape partial on foreach operator:
https://github.com/apache/incubator-mxnet/pull/12471

Keras-MXNet needs this functionality to partially infer shapes
on the foreach operator (used in RNN operators).

Thanks a lot!


Best Regards
Lai Wei



On Tue, Nov 6, 2018 at 10:44 AM Haibin Lin  wrote:

> Hi Naveen and Anton,
>
> Thanks for pointing that out. You are right that these are not critical
> fixes. Putting them in 1.4.0 is more appropriate. PRs are closed.
>
> Best,
> Haibin
>
> On Tue, Nov 6, 2018 at 7:35 AM Naveen Swamy  wrote:
>
> > Please note that this is a patch release(1.3.1) to address critical
> bugs!,
> > For everything else please wait for 1.4.0 which is planned very shortly
> > after 1.3.1
> >
> > > On Nov 6, 2018, at 7:17 AM, Anton Chernov  wrote:
> > >
> > > The following PR's have been created so far:
> > >
> > > Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13117
> > >
> > > [MXNET-953] Fix oob memory read (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13118
> > >
> > > [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13119
> > >
> > > [MXNET-922] Fix memleak in profiler (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13120
> > >
> > > Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13121
> > >
> > > update mshadow (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13122
> > >
> > > CudnnFind() usage improvements (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13123
> > >
> > > Fix lazy record io when used with dataloader and multi_worker > 0
> > (v1.3.x)
> > > https://github.com/apache/incubator-mxnet/pull/13124
> > >
> > >
> > > As stated previously I would be rather opposed to having the following
> > > PR's in the patch release:
> > >
> > > Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> > > https://github.com/apache/incubator-mxnet/pull/13129
> > >
> > > sample_like operators (#13034) v1.3.x
> > > https://github.com/apache/incubator-mxnet/pull/13130
> > >
> > >
> > > Best
> > > Anton
> > >
> > > On Tue, Nov 6, 2018 at 16:06, Anton Chernov wrote:
> > >
> > >> Hi Haibin,
> > >>
> > >> I have a few comments regarding the proposed performance improvement
> > >> changes.
> > >>
> > >> CUDNN support for LSTM with projection & clipping
> > >> https://github.com/apache/incubator-mxnet/pull/13056
> > >>
> > >> There is no doubt that this change brings value, but I don't see it
> as a
> > >> critical bug fix. I would rather leave it for the next major release.
> > >>
> > >> sample_like operators
> > >> https://github.com/apache/incubator-mxnet/pull/13034
> > >>
> > >> Even if it's related to performance, this is an addition of
> > functionality
> > >> and I would also push this to be in the next major release only.
> > >>
> > >>
> > >> Best
> > >> Anton
> > >>
> > >>
> > >> On Tue, Nov 6, 2018 at 15:55, Anton Chernov wrote:
> > >>
> > >>> Hi Patric,
> > >>>
> > >>> This change was listed in the 'PR candidates suggested for
> > consideration
> > >>> for v1.3.1 patch release' section [1].
> > >>>
> > >>> You are right, I also think that this is not a critical hotfix change
> > >>> that should be included into the 1.3.1 patch release.
> > >>>
> > >>> Thus I'm not making any further efforts to bring it in.
> > >>>
> > >>> Best
> > >>> Anton
> > >>>
> > >>> [1]
> > >>>
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> > >>>
> > >>>
> > >>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric wrote:
> > >>>
> > >>>> Hi Anton,
> > >>>>
> > >>>> Thanks for looking into t

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Haibin Lin
Hi Naveen and Anton,

Thanks for pointing that out. You are right that these are not critical
fixes. Putting them in 1.4.0 is more appropriate. PRs are closed.

Best,
Haibin

On Tue, Nov 6, 2018 at 7:35 AM Naveen Swamy  wrote:

> Please note that this is a patch release(1.3.1) to address critical bugs!,
> For everything else please wait for 1.4.0 which is planned very shortly
> after 1.3.1
>
> > On Nov 6, 2018, at 7:17 AM, Anton Chernov  wrote:
> >
> > The following PR's have been created so far:
> >
> > Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13117
> >
> > [MXNET-953] Fix oob memory read (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13118
> >
> > [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13119
> >
> > [MXNET-922] Fix memleak in profiler (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13120
> >
> > Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13121
> >
> > update mshadow (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13122
> >
> > CudnnFind() usage improvements (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13123
> >
> > Fix lazy record io when used with dataloader and multi_worker > 0
> (v1.3.x)
> > https://github.com/apache/incubator-mxnet/pull/13124
> >
> >
> > As stated previously I would be rather opposed to having the following
> > PR's in the patch release:
> >
> > Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> > https://github.com/apache/incubator-mxnet/pull/13129
> >
> > sample_like operators (#13034) v1.3.x
> > https://github.com/apache/incubator-mxnet/pull/13130
> >
> >
> > Best
> > Anton
> >
> > On Tue, Nov 6, 2018 at 16:06, Anton Chernov wrote:
> >
> >> Hi Haibin,
> >>
> >> I have a few comments regarding the proposed performance improvement
> >> changes.
> >>
> >> CUDNN support for LSTM with projection & clipping
> >> https://github.com/apache/incubator-mxnet/pull/13056
> >>
> >> There is no doubt that this change brings value, but I don't see it as a
> >> critical bug fix. I would rather leave it for the next major release.
> >>
> >> sample_like operators
> >> https://github.com/apache/incubator-mxnet/pull/13034
> >>
> >> Even if it's related to performance, this is an addition of
> functionality
> >> and I would also push this to be in the next major release only.
> >>
> >>
> >> Best
> >> Anton
> >>
> >>
> >> On Tue, Nov 6, 2018 at 15:55, Anton Chernov wrote:
> >>
> >>> Hi Patric,
> >>>
> >>> This change was listed in the 'PR candidates suggested for
> consideration
> >>> for v1.3.1 patch release' section [1].
> >>>
> >>> You are right, I also think that this is not a critical hotfix change
> >>> that should be included into the 1.3.1 patch release.
> >>>
> >>> Thus I'm not making any further efforts to bring it in.
> >>>
> >>> Best
> >>> Anton
> >>>
> >>> [1]
> >>>
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> >>>
> >>>
> >>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric wrote:
> >>>
> >>>> Hi Anton,
> >>>>
> >>>> Thanks for looking into the MKL-DNN PR.
> >>>>
> >>>> As my understanding of cwiki (
> >>>>
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> >>>> ),
> >>>> these features will go into 1.4 rather than patch release of 1.3.1.
> >>>>
> >>>> Feel free to correct me :)
> >>>>
> >>>> Thanks,
> >>>>
> >>>> --Patric
> >>>>
> >>>>> -Original Message-
> >>>>> From: Anton Chernov [mailto:mecher...@gmail.com]
> >>>>> Sent: Tuesday, November 6, 2018 3:11 AM
> >>>>> To: d...@mxnet.apache.org
> >>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1
> patch
> >>>>> release
> >>>>>

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Naveen Swamy
Please note that this is a patch release (1.3.1) to address critical bugs! For
everything else, please wait for 1.4.0, which is planned very shortly after 1.3.1.

> On Nov 6, 2018, at 7:17 AM, Anton Chernov  wrote:
> 
> The following PR's have been created so far:
> 
> Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13117
> 
> [MXNET-953] Fix oob memory read (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13118
> 
> [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13119
> 
> [MXNET-922] Fix memleak in profiler (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13120
> 
> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13121
> 
> update mshadow (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13122
> 
> CudnnFind() usage improvements (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13123
> 
> Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13124
> 
> 
> As stated previously I would be rather opposed to having the following PR's
> in the patch release:
> 
> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> https://github.com/apache/incubator-mxnet/pull/13129
> 
> sample_like operators (#13034) v1.3.x
> https://github.com/apache/incubator-mxnet/pull/13130
> 
> 
> Best
> Anton
> 
> On Tue, Nov 6, 2018 at 16:06, Anton Chernov wrote:
> 
>> Hi Haibin,
>> 
>> I have a few comments regarding the proposed performance improvement
>> changes.
>> 
>> CUDNN support for LSTM with projection & clipping
>> https://github.com/apache/incubator-mxnet/pull/13056
>> 
>> There is no doubt that this change brings value, but I don't see it as a
>> critical bug fix. I would rather leave it for the next major release.
>> 
>> sample_like operators
>> https://github.com/apache/incubator-mxnet/pull/13034
>> 
>> Even if it's related to performance, this is an addition of functionality
>> and I would also push this to be in the next major release only.
>> 
>> 
>> Best
>> Anton
>> 
>> 
>> Tue, 6 Nov 2018 at 15:55, Anton Chernov :
>> 
>>> Hi Patric,
>>> 
>>> This change was listed in the 'PR candidates suggested for consideration
>>> for v1.3.1 patch release' section [1].
>>> 
>>> You are right, I also think that this is not a critical hotfix change
>>> that should be included into the 1.3.1 patch release.
>>> 
>>> Thus I'm not making any further efforts to bring it in.
>>> 
>>> Best
>>> Anton
>>> 
>>> [1]
>>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>> 
>>> 
>>> Tue, 6 Nov 2018 at 1:14, Zhao, Patric :
>>> 
>>>> Hi Anton,
>>>> 
>>>> Thanks for looking into the MKL-DNN PR.
>>>> 
>>>> As my understanding of cwiki (
>>>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>>> ),
>>>> these features will go into 1.4 rather than patch release of 1.3.1.
>>>> 
>>>> Feel free to correct me :)
>>>> 
>>>> Thanks,
>>>> 
>>>> --Patric
>>>> 
>>>>> -----Original Message-----
>>>>> From: Anton Chernov [mailto:mecher...@gmail.com]
>>>>> Sent: Tuesday, November 6, 2018 3:11 AM
>>>>> To: d...@mxnet.apache.org
>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
>>>>> release
>>>>> 
>>>>> It seems that there is a problem porting following changes to the
>>>> v1.3.x
>>>>> release branch:
>>>>> 
>>>>> Implement mkldnn convolution fusion and quantization
>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>> 
>>>>> MKL-DNN Quantization Examples and README
>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>> 
>>>>> The bases are different.
>>>>> 
>>>>> I would need help from authors of these changes to make a backport PR.
>>>>> 
>>>>> @ZhennanQin, @xinyu-intel would you be able to assist me and create the
>>>>>

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Anton Chernov
The following PRs have been created so far:

Infer dtype in SymbolBlock import from input symbol (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13117

[MXNET-953] Fix oob memory read (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13118

[MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13119

[MXNET-922] Fix memleak in profiler (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13120

Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13121

update mshadow (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13122

CudnnFind() usage improvements (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13123

Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13124


As stated previously, I would be rather opposed to having the following PRs in
the patch release:

Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
https://github.com/apache/incubator-mxnet/pull/13129

sample_like operators (#13034) v1.3.x
https://github.com/apache/incubator-mxnet/pull/13130


Best
Anton

Tue, 6 Nov 2018 at 16:06, Anton Chernov :

> Hi Haibin,
>
> I have a few comments regarding the proposed performance improvement
> changes.
>
> CUDNN support for LSTM with projection & clipping
> https://github.com/apache/incubator-mxnet/pull/13056
>
> There is no doubt that this change brings value, but I don't see it as a
> critical bug fix. I would rather leave it for the next major release.
>
> sample_like operators
> https://github.com/apache/incubator-mxnet/pull/13034
>
> Even if it's related to performance, this is an addition of functionality
> and I would also push this to be in the next major release only.
>
>
> Best
> Anton
>
>
> Tue, 6 Nov 2018 at 15:55, Anton Chernov :
>
>> Hi Patric,
>>
>> This change was listed in the 'PR candidates suggested for consideration
>> for v1.3.1 patch release' section [1].
>>
>> You are right, I also think that this is not a critical hotfix change
>> that should be included into the 1.3.1 patch release.
>>
>> Thus I'm not making any further efforts to bring it in.
>>
>> Best
>> Anton
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>
>>
>> Tue, 6 Nov 2018 at 1:14, Zhao, Patric :
>>
>>> Hi Anton,
>>>
>>> Thanks for looking into the MKL-DNN PR.
>>>
>>> As my understanding of cwiki (
>>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>> ),
>>> these features will go into 1.4 rather than patch release of 1.3.1.
>>>
>>> Feel free to correct me :)
>>>
>>> Thanks,
>>>
>>> --Patric
>>>
>>> > -----Original Message-----
>>> > From: Anton Chernov [mailto:mecher...@gmail.com]
>>> > Sent: Tuesday, November 6, 2018 3:11 AM
>>> > To: d...@mxnet.apache.org
>>> > Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
>>> > release
>>> >
>>> > It seems that there is a problem porting following changes to the
>>> v1.3.x
>>> > release branch:
>>> >
>>> > Implement mkldnn convolution fusion and quantization
>>> > https://github.com/apache/incubator-mxnet/pull/12530
>>> >
>>> > MKL-DNN Quantization Examples and README
>>> > https://github.com/apache/incubator-mxnet/pull/12808
>>> >
>>> > The bases are different.
>>> >
>>> > I would need help from authors of these changes to make a backport PR.
>>> >
>>> > @ZhennanQin, @xinyu-intel would you be able to assist me and create the
>>> > corresponding PR's?
>>> >
>>> > Without proper history and domain knowledge I would not be able to
>>> create
>>> > them by my own in reasonable amount of time, I'm afraid.
>>> >
>>> > Best regards,
>>> > Anton
>>> >
>>> > Mon, 5 Nov 2018 at 19:45, Anton Chernov :
>>> >
>>> > >
>>> > > As part of:
>>> > >
>>> > > Implement mkldnn convolution fusion and quantization
>>> > > https://github.com/apache/incubator-mxnet/pull/12530
>>> > >
>>> > > I propose to add the examples and documentation PR as well:
>>> > >
>>

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Anton Chernov
Hi Haibin,

I have a few comments regarding the proposed performance improvement
changes.

CUDNN support for LSTM with projection & clipping
https://github.com/apache/incubator-mxnet/pull/13056

There is no doubt that this change brings value, but I don't see it as a
critical bug fix. I would rather leave it for the next major release.

sample_like operators
https://github.com/apache/incubator-mxnet/pull/13034

Even though it is related to performance, this change adds functionality,
so I would also defer it to the next major release.


Best
Anton


Tue, 6 Nov 2018 at 15:55, Anton Chernov :

> Hi Patric,
>
> This change was listed in the 'PR candidates suggested for consideration
> for v1.3.1 patch release' section [1].
>
> You are right, I also think that this is not a critical hotfix change that
> should be included into the 1.3.1 patch release.
>
> Thus I'm not making any further efforts to bring it in.
>
> Best
> Anton
>
> [1]
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>
>
> Tue, 6 Nov 2018 at 1:14, Zhao, Patric :
>
>> Hi Anton,
>>
>> Thanks for looking into the MKL-DNN PR.
>>
>> As my understanding of cwiki (
>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>> ),
>> these features will go into 1.4 rather than patch release of 1.3.1.
>>
>> Feel free to correct me :)
>>
>> Thanks,
>>
>> --Patric
>>
>> > -----Original Message-----
>> > From: Anton Chernov [mailto:mecher...@gmail.com]
>> > Sent: Tuesday, November 6, 2018 3:11 AM
>> > To: d...@mxnet.apache.org
>> > Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
>> > release
>> >
>> > It seems that there is a problem porting following changes to the v1.3.x
>> > release branch:
>> >
>> > Implement mkldnn convolution fusion and quantization
>> > https://github.com/apache/incubator-mxnet/pull/12530
>> >
>> > MKL-DNN Quantization Examples and README
>> > https://github.com/apache/incubator-mxnet/pull/12808
>> >
>> > The bases are different.
>> >
>> > I would need help from authors of these changes to make a backport PR.
>> >
>> > @ZhennanQin, @xinyu-intel would you be able to assist me and create the
>> > corresponding PR's?
>> >
>> > Without proper history and domain knowledge I would not be able to
>> create
>> > them by my own in reasonable amount of time, I'm afraid.
>> >
>> > Best regards,
>> > Anton
>> >
>> > Mon, 5 Nov 2018 at 19:45, Anton Chernov :
>> >
>> > >
>> > > As part of:
>> > >
>> > > Implement mkldnn convolution fusion and quantization
>> > > https://github.com/apache/incubator-mxnet/pull/12530
>> > >
>> > > I propose to add the examples and documentation PR as well:
>> > >
>> > > MKL-DNN Quantization Examples and README
>> > > https://github.com/apache/incubator-mxnet/pull/12808
>> > >
>> > >
>> > > Best regards,
>> > > Anton
>> > >
>> > > Mon, 5 Nov 2018 at 19:02, Anton Chernov :
>> > >
>> > >> Dear MXNet community,
>> > >>
>> > >> I will be the release manager for the upcoming 1.3.1 patch release.
>> > >> Naveen will be co-managing the release and providing help from the
>> > >> committers side.
>> > >>
>> > >> The following dates have been set:
>> > >>
>> > >> Code Freeze: 31st October 2018
>> > >> Release published: 13th November 2018
>> > >>
>> > >> Release notes have been drafted here [1].
>> > >>
>> > >>
>> > >> * Known issues
>> > >>
>> > >> Update MKL-DNN dependency
>> > >> https://github.com/apache/incubator-mxnet/pull/12953
>> > >>
>> > >> This PR hasn't been merged even to master yet. Requires additional
>> > >> discussion and merge.
>> > >>
>> > >> distributed kvstore bug in MXNet
>> > >> https://github.com/apache/incubator-mxnet/issues/12713
>> > >>
>> > >> > When distributed kvstore is used, by default gluon.Trainer doesn't
>> > >> > work
>> > >> with mx.optimizer.LRScheduler if a

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Anton Chernov
Hi Patric,

This change was listed in the 'PR candidates suggested for consideration
for v1.3.1 patch release' section [1].

You are right. I also think that this is not a critical hotfix that
should be included in the 1.3.1 patch release.

Thus I'm not making any further efforts to bring it in.

Best
Anton

[1]
https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates


Tue, 6 Nov 2018 at 1:14, Zhao, Patric :

> Hi Anton,
>
> Thanks for looking into the MKL-DNN PR.
>
> As my understanding of cwiki (
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> ),
> these features will go into 1.4 rather than patch release of 1.3.1.
>
> Feel free to correct me :)
>
> Thanks,
>
> --Patric
>
> > -----Original Message-----
> > From: Anton Chernov [mailto:mecher...@gmail.com]
> > Sent: Tuesday, November 6, 2018 3:11 AM
> > To: d...@mxnet.apache.org
> > Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
> > release
> >
> > It seems that there is a problem porting following changes to the v1.3.x
> > release branch:
> >
> > Implement mkldnn convolution fusion and quantization
> > https://github.com/apache/incubator-mxnet/pull/12530
> >
> > MKL-DNN Quantization Examples and README
> > https://github.com/apache/incubator-mxnet/pull/12808
> >
> > The bases are different.
> >
> > I would need help from authors of these changes to make a backport PR.
> >
> > @ZhennanQin, @xinyu-intel would you be able to assist me and create the
> > corresponding PR's?
> >
> > Without proper history and domain knowledge I would not be able to create
> > them by my own in reasonable amount of time, I'm afraid.
> >
> > Best regards,
> > Anton
> >
> > Mon, 5 Nov 2018 at 19:45, Anton Chernov :
> >
> > >
> > > As part of:
> > >
> > > Implement mkldnn convolution fusion and quantization
> > > https://github.com/apache/incubator-mxnet/pull/12530
> > >
> > > I propose to add the examples and documentation PR as well:
> > >
> > > MKL-DNN Quantization Examples and README
> > > https://github.com/apache/incubator-mxnet/pull/12808
> > >
> > >
> > > Best regards,
> > > Anton
> > >
> > > Mon, 5 Nov 2018 at 19:02, Anton Chernov :
> > >
> > >> Dear MXNet community,
> > >>
> > >> I will be the release manager for the upcoming 1.3.1 patch release.
> > >> Naveen will be co-managing the release and providing help from the
> > >> committers side.
> > >>
> > >> The following dates have been set:
> > >>
> > >> Code Freeze: 31st October 2018
> > >> Release published: 13th November 2018
> > >>
> > >> Release notes have been drafted here [1].
> > >>
> > >>
> > >> * Known issues
> > >>
> > >> Update MKL-DNN dependency
> > >> https://github.com/apache/incubator-mxnet/pull/12953
> > >>
> > >> This PR hasn't been merged even to master yet. Requires additional
> > >> discussion and merge.
> > >>
> > >> distributed kvstore bug in MXNet
> > >> https://github.com/apache/incubator-mxnet/issues/12713
> > >>
> > >> > When distributed kvstore is used, by default gluon.Trainer doesn't
> > >> > work
> > >> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be
> > >> more specific, the trainer updates once per GPU, the LRScheduler
> > >> object is shared across GPUs and get a wrong update count.
> > >>
> > >> This needs to be fixed. [6]
> > >>
> > >>
> > >> * Changes
> > >>
> > >> The following changes will be ported to the release branch, per [2]:
> > >>
> > >> Infer dtype in SymbolBlock import from input symbol [3]
> > >> https://github.com/apache/incubator-mxnet/pull/12412
> > >>
> > >> [MXNET-953] Fix oob memory read
> > >> https://github.com/apache/incubator-mxnet/pull/12631
> > >>
> > >> [MXNET-969] Fix buffer overflow in RNNOp
> > >> https://github.com/apache/incubator-mxnet/pull/12603
> > >>
> > >> [MXNET-922] Fix memleak in profiler
> > >> https://github.com/apache/incubator-mxnet/pull/12499
> > >>
> > >> Implement m

RE: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-05 Thread Zhao, Patric
Hi Anton,

Thanks for looking into the MKL-DNN PR.

As I understand the cwiki page
(https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release),
these features will go into 1.4 rather than the 1.3.1 patch release.

Feel free to correct me :)

Thanks,

--Patric

> -----Original Message-----
> From: Anton Chernov [mailto:mecher...@gmail.com]
> Sent: Tuesday, November 6, 2018 3:11 AM
> To: d...@mxnet.apache.org
> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
> release
> 
> It seems that there is a problem porting following changes to the v1.3.x
> release branch:
> 
> Implement mkldnn convolution fusion and quantization
> https://github.com/apache/incubator-mxnet/pull/12530
> 
> MKL-DNN Quantization Examples and README
> https://github.com/apache/incubator-mxnet/pull/12808
> 
> The bases are different.
> 
> I would need help from authors of these changes to make a backport PR.
> 
> @ZhennanQin, @xinyu-intel would you be able to assist me and create the
> corresponding PR's?
> 
> Without proper history and domain knowledge I would not be able to create
> them by my own in reasonable amount of time, I'm afraid.
> 
> Best regards,
> Anton
> 
> Mon, 5 Nov 2018 at 19:45, Anton Chernov :
> 
> >
> > As part of:
> >
> > Implement mkldnn convolution fusion and quantization
> > https://github.com/apache/incubator-mxnet/pull/12530
> >
> > I propose to add the examples and documentation PR as well:
> >
> > MKL-DNN Quantization Examples and README
> > https://github.com/apache/incubator-mxnet/pull/12808
> >
> >
> > Best regards,
> > Anton
> >
> > Mon, 5 Nov 2018 at 19:02, Anton Chernov :
> >
> >> Dear MXNet community,
> >>
> >> I will be the release manager for the upcoming 1.3.1 patch release.
> >> Naveen will be co-managing the release and providing help from the
> >> committers side.
> >>
> >> The following dates have been set:
> >>
> >> Code Freeze: 31st October 2018
> >> Release published: 13th November 2018
> >>
> >> Release notes have been drafted here [1].
> >>
> >>
> >> * Known issues
> >>
> >> Update MKL-DNN dependency
> >> https://github.com/apache/incubator-mxnet/pull/12953
> >>
> >> This PR hasn't been merged even to master yet. Requires additional
> >> discussion and merge.
> >>
> >> distributed kvstore bug in MXNet
> >> https://github.com/apache/incubator-mxnet/issues/12713
> >>
> >> > When distributed kvstore is used, by default gluon.Trainer doesn't
> >> > work
> >> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be
> >> more specific, the trainer updates once per GPU, the LRScheduler
> >> object is shared across GPUs and get a wrong update count.
> >>
> >> This needs to be fixed. [6]
> >>
> >>
> >> * Changes
> >>
> >> The following changes will be ported to the release branch, per [2]:
> >>
> >> Infer dtype in SymbolBlock import from input symbol [3]
> >> https://github.com/apache/incubator-mxnet/pull/12412
> >>
> >> [MXNET-953] Fix oob memory read
> >> https://github.com/apache/incubator-mxnet/pull/12631
> >>
> >> [MXNET-969] Fix buffer overflow in RNNOp
> >> https://github.com/apache/incubator-mxnet/pull/12603
> >>
> >> [MXNET-922] Fix memleak in profiler
> >> https://github.com/apache/incubator-mxnet/pull/12499
> >>
> >> Implement mkldnn convolution fusion and quantization (MXNet Graph
> >> Optimization and Quantization based on subgraph and MKL-DNN
> proposal
> >> [4])
> >> https://github.com/apache/incubator-mxnet/pull/12530
> >>
> >> Following items (test cases) should be already part of 1.3.0:
> >>
> >> [MXNET-486] Create CPP test for concat MKLDNN operator
> >> https://github.com/apache/incubator-mxnet/pull/11371
> >>
> >> [MXNET-489] MKLDNN Pool test
> >> https://github.com/apache/incubator-mxnet/pull/11608
> >>
> >> [MXNET-484] MKLDNN C++ test for LRN operator
> >> https://github.com/apache/incubator-mxnet/pull/11831
> >>
> >> [MXNET-546] Add unit test for MKLDNNSum
> >> https://github.com/apache/incubator-mxnet/pull/11272
> >>
> >> [MXNET-498] Test MKLDNN backward operators
> >> https://github.com/apache/incubator-mxnet/pull/11232
> >>
> >> [MXNET-5

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-05 Thread Haibin Lin
Hi Anton,

Thanks for driving the patch release. Besides the MKL improvements, I
suggest we include the two *performance improvement* changes for NLP tasks
below:

CUDNN support for LSTM with projection & clipping:
- https://github.com/apache/incubator-mxnet/pull/13056
- It is used in state-of-the-art language models such as BIG-LSTM [1] and
ELMo (NAACL 2018 best paper) [2]

sample_like operators:
- https://github.com/apache/incubator-mxnet/pull/13034
- Many models require candidate sampling (e.g. word2vec [3], fasttext [4])
for training. The sample_like operators enable drawing random samples
without shape information, so the candidate sampling blocks can now
be hybridized and accelerated considerably.

If there are no concerns, I will open two PRs for the above two changes
against the 1.3.x branch. Thanks!

Best,
Haibin

[1] https://arxiv.org/pdf/1602.02410.pdf
[2] https://arxiv.org/pdf/1802.05365.pdf
[3] https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
[4] https://arxiv.org/pdf/1607.01759.pdf
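
The point above about sample_like operators (sampling without shape
information so that a block can be hybridized) can be illustrated with a
small sketch. This is plain Python, not MXNet code: Placeholder, uniform and
uniform_like are hypothetical names made up for the illustration, and the
"graph" is just a closure that runs later. It only shows why a shape-first
sampler cannot be used while a deferred graph is being built, whereas a
*_like sampler can.

```python
import random


class Placeholder:
    """Hypothetical stand-in for a symbolic input: named, but shapeless until data is bound."""

    def __init__(self, name):
        self.name = name

    @property
    def shape(self):
        raise AttributeError("shape is unknown while the graph is being built")


def uniform(shape, rng):
    """Shape-first sampling: the caller must already know the output length."""
    return [rng.random() for _ in range(shape[0])]


def uniform_like(node):
    """Shape-free sampling: returns a deferred op that reads the shape at run time."""

    def run(bindings, rng):
        data = bindings[node.name]
        return [rng.random() for _ in data]

    return run


x = Placeholder("x")

# Building the graph with an explicit shape fails: nothing is bound yet.
try:
    uniform(x.shape, random.Random(0))
    built_with_shape = True
except AttributeError:
    built_with_shape = False

# The *_like form builds fine; the shape is resolved once data arrives.
op = uniform_like(x)
samples = op({"x": [10.0, 20.0, 30.0]}, random.Random(0))
print(built_with_shape, len(samples))
```

The real operators in the PR work on MXNet symbols and NDArrays; the sketch
only mirrors the idea that the sample shape is taken from another tensor at
run time.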

On Mon, Nov 5, 2018 at 11:11 AM Anton Chernov  wrote:

> It seems that there is a problem porting following changes to the v1.3.x
> release branch:
>
> Implement mkldnn convolution fusion and quantization
> https://github.com/apache/incubator-mxnet/pull/12530
>
> MKL-DNN Quantization Examples and README
> https://github.com/apache/incubator-mxnet/pull/12808
>
> The bases are different.
>
> I would need help from authors of these changes to make a backport PR.
>
> @ZhennanQin, @xinyu-intel would you be able to assist me and create the
> corresponding PR's?
>
> Without proper history and domain knowledge I would not be able to create
> them by my own in reasonable amount of time, I'm afraid.
>
> Best regards,
> Anton
>
> Mon, 5 Nov 2018 at 19:45, Anton Chernov :
>
> >
> > As part of:
> >
> > Implement mkldnn convolution fusion and quantization
> > https://github.com/apache/incubator-mxnet/pull/12530
> >
> > I propose to add the examples and documentation PR as well:
> >
> > MKL-DNN Quantization Examples and README
> > https://github.com/apache/incubator-mxnet/pull/12808
> >
> >
> > Best regards,
> > Anton
> >
> > Mon, 5 Nov 2018 at 19:02, Anton Chernov :
> >
> >> Dear MXNet community,
> >>
> >> I will be the release manager for the upcoming 1.3.1 patch release.
> >> Naveen will be co-managing the release and providing help from the
> >> committers side.
> >>
> >> The following dates have been set:
> >>
> >> Code Freeze: 31st October 2018
> >> Release published: 13th November 2018
> >>
> >> Release notes have been drafted here [1].
> >>
> >>
> >> * Known issues
> >>
> >> Update MKL-DNN dependency
> >> https://github.com/apache/incubator-mxnet/pull/12953
> >>
> >> This PR hasn't been merged even to master yet. Requires additional
> >> discussion and merge.
> >>
> >> distributed kvstore bug in MXNet
> >> https://github.com/apache/incubator-mxnet/issues/12713
> >>
> >> > When distributed kvstore is used, by default gluon.Trainer doesn't
> work
> >> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be
> more
> >> specific, the trainer updates once per GPU, the LRScheduler object is
> >> shared across GPUs and get a wrong update count.
> >>
> >> This needs to be fixed. [6]
> >>
> >>
> >> * Changes
> >>
> >> The following changes will be ported to the release branch, per [2]:
> >>
> >> Infer dtype in SymbolBlock import from input symbol [3]
> >> https://github.com/apache/incubator-mxnet/pull/12412
> >>
> >> [MXNET-953] Fix oob memory read
> >> https://github.com/apache/incubator-mxnet/pull/12631
> >>
> >> [MXNET-969] Fix buffer overflow in RNNOp
> >> https://github.com/apache/incubator-mxnet/pull/12603
> >>
> >> [MXNET-922] Fix memleak in profiler
> >> https://github.com/apache/incubator-mxnet/pull/12499
> >>
> >> Implement mkldnn convolution fusion and quantization (MXNet Graph
> >> Optimization and Quantization based on subgraph and MKL-DNN proposal
> [4])
> >> https://github.com/apache/incubator-mxnet/pull/12530
> >>
> >> Following items (test cases) should be already part of 1.3.0:
> >>
> >> [MXNET-486] Create CPP test for concat MKLDNN operator
> >> https://github.com/apache/incubator-mxnet/pull/11371
> >>
> >> [MXNET-489] MKLDNN Pool test
> >> https://github.com/apache/incubator-mxnet/pull/11608
> >>
> >> [MXNET-484] MKLDNN C++ test for LRN operator
> >> https://github.com/apache/incubator-mxnet/pull/11831
> >>
> >> [MXNET-546] Add unit test for MKLDNNSum
> >> https://github.com/apache/incubator-mxnet/pull/11272
> >>
> >> [MXNET-498] Test MKLDNN backward operators
> >> https://github.com/apache/incubator-mxnet/pull/11232
> >>
> >> [MXNET-500] Test cases improvement for MKLDNN on Gluon
> >> https://github.com/apache/incubator-mxnet/pull/10921
> >>
> >> Set correct update on kvstore flag in dist_device_sync mode (as part of
> >> fixing [5])
> >> https://github.com/apache/incubator-mxnet/pull/12786
> >>
> >> upgrade mshadow version
> >> https:/

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-05 Thread Anton Chernov
It seems that there is a problem porting the following changes to the v1.3.x
release branch:

Implement mkldnn convolution fusion and quantization
https://github.com/apache/incubator-mxnet/pull/12530

MKL-DNN Quantization Examples and README
https://github.com/apache/incubator-mxnet/pull/12808

The bases are different.

I would need help from the authors of these changes to make a backport PR.

@ZhennanQin, @xinyu-intel would you be able to assist me and create the
corresponding PRs?

Without proper history and domain knowledge I would not be able to create
them on my own in a reasonable amount of time, I'm afraid.

Best regards,
Anton

Mon, 5 Nov 2018 at 19:45, Anton Chernov :

>
> As part of:
>
> Implement mkldnn convolution fusion and quantization
> https://github.com/apache/incubator-mxnet/pull/12530
>
> I propose to add the examples and documentation PR as well:
>
> MKL-DNN Quantization Examples and README
> https://github.com/apache/incubator-mxnet/pull/12808
>
>
> Best regards,
> Anton
>
> Mon, 5 Nov 2018 at 19:02, Anton Chernov :
>
>> Dear MXNet community,
>>
>> I will be the release manager for the upcoming 1.3.1 patch release.
>> Naveen will be co-managing the release and providing help from the
>> committers side.
>>
>> The following dates have been set:
>>
>> Code Freeze: 31st October 2018
>> Release published: 13th November 2018
>>
>> Release notes have been drafted here [1].
>>
>>
>> * Known issues
>>
>> Update MKL-DNN dependency
>> https://github.com/apache/incubator-mxnet/pull/12953
>>
>> This PR hasn't been merged even to master yet. Requires additional
>> discussion and merge.
>>
>> distributed kvstore bug in MXNet
>> https://github.com/apache/incubator-mxnet/issues/12713
>>
>> > When distributed kvstore is used, by default gluon.Trainer doesn't work
>> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more
>> specific, the trainer updates once per GPU, the LRScheduler object is
>> shared across GPUs and get a wrong update count.
>>
>> This needs to be fixed. [6]
>>
>>
>> * Changes
>>
>> The following changes will be ported to the release branch, per [2]:
>>
>> Infer dtype in SymbolBlock import from input symbol [3]
>> https://github.com/apache/incubator-mxnet/pull/12412
>>
>> [MXNET-953] Fix oob memory read
>> https://github.com/apache/incubator-mxnet/pull/12631
>>
>> [MXNET-969] Fix buffer overflow in RNNOp
>> https://github.com/apache/incubator-mxnet/pull/12603
>>
>> [MXNET-922] Fix memleak in profiler
>> https://github.com/apache/incubator-mxnet/pull/12499
>>
>> Implement mkldnn convolution fusion and quantization (MXNet Graph
>> Optimization and Quantization based on subgraph and MKL-DNN proposal [4])
>> https://github.com/apache/incubator-mxnet/pull/12530
>>
>> Following items (test cases) should be already part of 1.3.0:
>>
>> [MXNET-486] Create CPP test for concat MKLDNN operator
>> https://github.com/apache/incubator-mxnet/pull/11371
>>
>> [MXNET-489] MKLDNN Pool test
>> https://github.com/apache/incubator-mxnet/pull/11608
>>
>> [MXNET-484] MKLDNN C++ test for LRN operator
>> https://github.com/apache/incubator-mxnet/pull/11831
>>
>> [MXNET-546] Add unit test for MKLDNNSum
>> https://github.com/apache/incubator-mxnet/pull/11272
>>
>> [MXNET-498] Test MKLDNN backward operators
>> https://github.com/apache/incubator-mxnet/pull/11232
>>
>> [MXNET-500] Test cases improvement for MKLDNN on Gluon
>> https://github.com/apache/incubator-mxnet/pull/10921
>>
>> Set correct update on kvstore flag in dist_device_sync mode (as part of
>> fixing [5])
>> https://github.com/apache/incubator-mxnet/pull/12786
>>
>> upgrade mshadow version
>> https://github.com/apache/incubator-mxnet/pull/12692
>> But another PR will be used instead:
>> update mshadow
>> https://github.com/apache/incubator-mxnet/pull/12674
>>
>> CudnnFind() usage improvements
>> https://github.com/apache/incubator-mxnet/pull/12804
>> A critical CUDNN fix that reduces GPU memory consumption and addresses
>> this memory leak issue. This is an important fix to include in 1.3.1
>>
>>
>> From discussion about gluon toolkits:
>>
>> disable opencv threading for forked process
>> https://github.com/apache/incubator-mxnet/pull/12025
>>
>> Fix lazy record io when used with dataloader and multi_worker > 0
>> https://github.com/apache/incubator-mxnet/pull/12554
>>
>> fix potential floating number overflow, enable float16
>> https://github.com/apache/incubator-mxnet/pull/12118
>>
>>
>>
>> * Resolved issues
>>
>> MxNet 1.2.1–module get_outputs()
>> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>>
>> As far as I can see from the comments the issue has been resolved, no
>> actions need to be taken for this release. [7] is mentioned in this
>> regards, but I don't see any action points here either.
>>
>>
>> I will start with help of Naveen port the mentioned PR's to the 1.3.x
>> branch.
>>
>>
>> Best regards,
>> Anton
>>
>> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
>> [2]
>> https://cwiki.apache.org/confluenc

Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-05 Thread Anton Chernov
As part of:

Implement mkldnn convolution fusion and quantization
https://github.com/apache/incubator-mxnet/pull/12530

I propose to add the examples and documentation PR as well:

MKL-DNN Quantization Examples and README
https://github.com/apache/incubator-mxnet/pull/12808


Best regards,
Anton

Mon, 5 Nov 2018 at 19:02, Anton Chernov :

> Dear MXNet community,
>
> I will be the release manager for the upcoming 1.3.1 patch release. Naveen
> will be co-managing the release and providing help from the committers side.
>
> The following dates have been set:
>
> Code Freeze: 31st October 2018
> Release published: 13th November 2018
>
> Release notes have been drafted here [1].
>
>
> * Known issues
>
> Update MKL-DNN dependency
> https://github.com/apache/incubator-mxnet/pull/12953
>
> This PR hasn't been merged even to master yet. Requires additional
> discussion and merge.
>
> distributed kvstore bug in MXNet
> https://github.com/apache/incubator-mxnet/issues/12713
>
> > When distributed kvstore is used, by default gluon.Trainer doesn't work
> > with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more
> > specific, the trainer updates once per GPU, the LRScheduler object is
> > shared across GPUs and gets a wrong update count.
>
> This needs to be fixed. [6]
>
>
> * Changes
>
> The following changes will be ported to the release branch, per [2]:
>
> Infer dtype in SymbolBlock import from input symbol [3]
> https://github.com/apache/incubator-mxnet/pull/12412
>
> [MXNET-953] Fix oob memory read
> https://github.com/apache/incubator-mxnet/pull/12631
>
> [MXNET-969] Fix buffer overflow in RNNOp
> https://github.com/apache/incubator-mxnet/pull/12603
>
> [MXNET-922] Fix memleak in profiler
> https://github.com/apache/incubator-mxnet/pull/12499
>
> Implement mkldnn convolution fusion and quantization (MXNet Graph
> Optimization and Quantization based on subgraph and MKL-DNN proposal [4])
> https://github.com/apache/incubator-mxnet/pull/12530
>
> Following items (test cases) should be already part of 1.3.0:
>
> [MXNET-486] Create CPP test for concat MKLDNN operator
> https://github.com/apache/incubator-mxnet/pull/11371
>
> [MXNET-489] MKLDNN Pool test
> https://github.com/apache/incubator-mxnet/pull/11608
>
> [MXNET-484] MKLDNN C++ test for LRN operator
> https://github.com/apache/incubator-mxnet/pull/11831
>
> [MXNET-546] Add unit test for MKLDNNSum
> https://github.com/apache/incubator-mxnet/pull/11272
>
> [MXNET-498] Test MKLDNN backward operators
> https://github.com/apache/incubator-mxnet/pull/11232
>
> [MXNET-500] Test cases improvement for MKLDNN on Gluon
> https://github.com/apache/incubator-mxnet/pull/10921
>
> Set correct update on kvstore flag in dist_device_sync mode (as part of
> fixing [5])
> https://github.com/apache/incubator-mxnet/pull/12786
>
> upgrade mshadow version
> https://github.com/apache/incubator-mxnet/pull/12692
> But another PR will be used instead:
> update mshadow
> https://github.com/apache/incubator-mxnet/pull/12674
>
> CudnnFind() usage improvements
> https://github.com/apache/incubator-mxnet/pull/12804
> A critical CUDNN fix that reduces GPU memory consumption and addresses
> this memory leak issue. This is an important fix to include in 1.3.1
>
>
> From discussion about gluon toolkits:
>
> disable opencv threading for forked process
> https://github.com/apache/incubator-mxnet/pull/12025
>
> Fix lazy record io when used with dataloader and multi_worker > 0
> https://github.com/apache/incubator-mxnet/pull/12554
>
> fix potential floating number overflow, enable float16
> https://github.com/apache/incubator-mxnet/pull/12118
>
>
>
> * Resolved issues
>
> MxNet 1.2.1–module get_outputs()
> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>
> As far as I can see from the comments, the issue has been resolved and no
> action needs to be taken for this release. [7] is mentioned in this
> regard, but I don't see any action points there either.
>
>
> With Naveen's help, I will start porting the mentioned PRs to the 1.3.x
> branch.
>
>
> Best regards,
> Anton
>
> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
> [2]
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> [3] https://github.com/apache/incubator-mxnet/issues/11849
> [4]
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
> [5] https://github.com/apache/incubator-mxnet/issues/12713
> [6]
> https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
> [7] https://github.com/apache/incubator-mxnet/pull/11005
>
>


[Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-05 Thread Anton Chernov
Dear MXNet community,

I will be the release manager for the upcoming 1.3.1 patch release. Naveen
will co-manage the release and provide help from the committers' side.

The following dates have been set:

Code Freeze: 31st October 2018
Release published: 13th November 2018

Release notes have been drafted here [1].


* Known issues

Update MKL-DNN dependency
https://github.com/apache/incubator-mxnet/pull/12953

This PR has not been merged yet, not even to master. It requires additional
discussion before it can be merged.

distributed kvstore bug in MXNet
https://github.com/apache/incubator-mxnet/issues/12713

> When distributed kvstore is used, by default gluon.Trainer doesn't work
> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more
> specific, the trainer updates once per GPU, the LRScheduler object is
> shared across GPUs and gets a wrong update count.

This needs to be fixed. [6]
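
To make the update-count problem concrete, here is a minimal toy sketch in
plain Python. It does not use or reproduce MXNet's actual code: the
`StepScheduler` class and the `lr_after` helper are hypothetical names made up
for illustration, and the step-decay schedule is just an assumed example. It
only shows why a counter that is shared across GPUs and incremented once per
GPU makes the learning rate decay too early.

```python
class StepScheduler:
    """Toy step-decay schedule (hypothetical, not MXNet's class).

    The rate is multiplied by `factor` once every `step` updates.
    """

    def __init__(self, base_lr, step, factor):
        self.base_lr = base_lr
        self.step = step
        self.factor = factor

    def __call__(self, num_update):
        return self.base_lr * self.factor ** (num_update // self.step)


def lr_after(iterations, num_gpus, scheduler, count_per_gpu):
    """Return the learning rate after `iterations` training batches.

    count_per_gpu=True models the reported behaviour: the shared update
    counter is incremented once per GPU for every batch.
    count_per_gpu=False models the intended behaviour: one increment per batch.
    """
    num_update = 0
    for _ in range(iterations):
        num_update += num_gpus if count_per_gpu else 1
    return scheduler(num_update)


sched = StepScheduler(base_lr=0.1, step=100, factor=0.5)

# Intended: after 100 batches the rate has been halved once.
expected = lr_after(100, num_gpus=4, scheduler=sched, count_per_gpu=False)

# Shared counter bumped by all 4 GPUs: the scheduler sees 400 updates.
buggy = lr_after(100, num_gpus=4, scheduler=sched, count_per_gpu=True)

print("expected lr:", expected)
print("lr with per-GPU counting:", buggy)
```

In this toy setup, a worker with 4 GPUs reaches each decay step four times
sooner than intended, which is the kind of wrong update count the issue
describes.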


* Changes

The following changes will be ported to the release branch, per [2]:

Infer dtype in SymbolBlock import from input symbol [3]
https://github.com/apache/incubator-mxnet/pull/12412

[MXNET-953] Fix oob memory read
https://github.com/apache/incubator-mxnet/pull/12631

[MXNET-969] Fix buffer overflow in RNNOp
https://github.com/apache/incubator-mxnet/pull/12603

[MXNET-922] Fix memleak in profiler
https://github.com/apache/incubator-mxnet/pull/12499

Implement mkldnn convolution fusion and quantization (MXNet Graph
Optimization and Quantization based on subgraph and MKL-DNN proposal [4])
https://github.com/apache/incubator-mxnet/pull/12530

The following items (test cases) should already be part of 1.3.0:

[MXNET-486] Create CPP test for concat MKLDNN operator
https://github.com/apache/incubator-mxnet/pull/11371

[MXNET-489] MKLDNN Pool test
https://github.com/apache/incubator-mxnet/pull/11608

[MXNET-484] MKLDNN C++ test for LRN operator
https://github.com/apache/incubator-mxnet/pull/11831

[MXNET-546] Add unit test for MKLDNNSum
https://github.com/apache/incubator-mxnet/pull/11272

[MXNET-498] Test MKLDNN backward operators
https://github.com/apache/incubator-mxnet/pull/11232

[MXNET-500] Test cases improvement for MKLDNN on Gluon
https://github.com/apache/incubator-mxnet/pull/10921

Set correct update on kvstore flag in dist_device_sync mode (as part of
fixing [5])
https://github.com/apache/incubator-mxnet/pull/12786

upgrade mshadow version
https://github.com/apache/incubator-mxnet/pull/12692
But another PR will be used instead:
update mshadow
https://github.com/apache/incubator-mxnet/pull/12674

CudnnFind() usage improvements
https://github.com/apache/incubator-mxnet/pull/12804
A critical CUDNN fix that reduces GPU memory consumption and addresses this
memory leak issue. This is an important fix to include in 1.3.1.


From discussion about gluon toolkits:

disable opencv threading for forked process
https://github.com/apache/incubator-mxnet/pull/12025

Fix lazy record io when used with dataloader and multi_worker > 0
https://github.com/apache/incubator-mxnet/pull/12554

fix potential floating number overflow, enable float16
https://github.com/apache/incubator-mxnet/pull/12118



* Resolved issues

MxNet 1.2.1–module get_outputs()
https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882

As far as I can see from the comments, the issue has been resolved and no
action needs to be taken for this release. [7] is mentioned in this
regard, but I don't see any action points there either.


With Naveen's help, I will start porting the mentioned PRs to the 1.3.x
branch.


Best regards,
Anton

[1] https://cwiki.apache.org/confluence/x/eZGzBQ
[2]
https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
[3] https://github.com/apache/incubator-mxnet/issues/11849
[4]
https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
[5] https://github.com/apache/incubator-mxnet/issues/12713
[6]
https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
[7] https://github.com/apache/incubator-mxnet/pull/11005