Re: Request for Edit Permissions on Confluence

2020-04-02 Thread Zha, Sheng
Granted. Thanks for contributing!

-sz

On 4/2/20, 11:18 AM, "Goggins, Connor"  wrote:

Hello, I would like to be given edit permissions on the Apache MXNet 
Confluence wiki so that I can add important documentation regarding the steps 
to build/update the new MXNet website. My username is cggoggin and my email is 
cggog...@amazon.com.

Thank you,
Connor




Re: MXNet CD pipelines cost savings

2020-02-10 Thread Zha, Sheng
+dev@

-sz

On Feb 10, 2020, at 1:35 PM, Zha, Sheng  wrote:

As already stated in the public threads, I’ve vetoed the CodeBuild solution 
from becoming the long-term solution, as it’s not publicly manageable.

As communicated before, the team should have put effort into maintaining and 
fixing the Jenkins CD pipeline but neglected to do so. Promoting the 
CodeBuild solution this way is a step in the wrong direction that has to be 
stopped.

-sz

On Feb 10, 2020, at 1:13 PM, Davydenko, Denis  wrote:


Hello guys,

I would like to start this discussion so that we can align on handling the CD 
pipelines we currently have. There are two of them: one in 
Jenkins<http://jenkins.mxnet-ci.amazon-ml.com/job/restricted-mxnet-cd/> and one 
in CodeBuild<https://tiny.amazon.com/1h49a01qg/IsenLink>. The one in Jenkins is 
still running, but its runs always fail. The one in CodeBuild is functioning 
and publishing artifacts to an S3 
bucket<https://tiny.amazon.com/39negmk0/IsenLink>.

The MXNet engineering team’s proposal is to shut down the Jenkins-based CD 
completely, as it is currently just a waste of resources, and to use the 
CodeBuild-based setup to continue publishing nightly builds to the S3 bucket, 
which provides public access to all binaries stored in it. This doesn’t affect 
the discussion of whether to publish binaries to S3 or to pypi – once that 
concludes (if ever), we can switch the destination of the CodeBuild projects so 
that they upload MXNet nightly binaries to pypi instead of S3.

This is an effort to get alignment internally, if possible, before bringing 
this as a proposal for community discussion.

--
Thanks,
Denis


Acknowledgement for 1.3.0 release (was: Re: [RESULT][VOTE] Release MXNet version 1.3.0)

2018-09-12 Thread Zha, Sheng
Hi all,

We would like to thank all who contributed to the 1.3.0 release:
Aaron Markham, Alex Li, Alexander Zai, Amol Lele, Andrew Ayres, Anirudh 
Acharya, Anirudh Subramanian, Ankit Khedia, Anton Chernov, starimpact, Asmus 
Hetzel, Aston Zhang, brli, Burness Duan, cclauss, chinakook, ctcyang, Da Zheng, 
Deokjae Lee, Dick Carter, Eric Junyuan Xie, Felix Hieber, Hagay Lupesko, Haibin 
Lin, Hang Zhang, Hao Jin, Hao Li, Haozhi Qi, Hu Shiwen, Indhu Bharathi, Istvan 
Fehervari, JackieWu, James MacGlashan, jeremiedb, Jerry Zhang, Jian Guo, Jin 
Huang, Jun Wu, Kalyanee Chendke, Kellen Sunderland, kpmurali, Leonard Lausen, 
Lin Yuan, Marco de Abreu, Marek Kolodziej, Mu Li, Nan Zhu, Naveen Swamy, Nehal 
J Wani, PatricZhao, Pedro Larroy, Pracheer Gupta, Przemyslaw Tredak, Qiang Kou, 
Qing Lan, Rahul Huilgol, Robert Stone, Roshani Nagmote, Sandeep Krishnamurthy, 
Sebastian Bodenstein, Sergey Kolychev, Sergey Sokolov, Sheng Zha, Sheng-Ying, 
Simon, Sina Afrooze, solin319, Soonhwan-Kwon, Steffen Rochel, Taliesin Beynon, 
Tao Lv, Thom Lane, ThomasDelteil, Tianqi Chen, Tong He, Wei Wu, Wen-Yang Chu, 
Xingjian Shi, Xinyu Chen, yifeim, Yizhi Liu, Yu-Xiang Wang, Yuan Tang, Yuntao 
Chen, Zhi Zhang, Ziyue Huang, Shuai Zheng, Junru Shao, Philip Hyunsu Cho
 
We would especially like to thank the following first-time contributors:
bl0, Abhinav Sharma, access2rohit, Alexander Alexandrov, Arunkumar V Ramanan, 
Burin Choomnuan, Carin Meier, Carl Tsai, Chance Bair, Chudong Tian, ciyong, 
Dang Trung Kien, Francisco Facioni, Frank Liu, Gnanesh, Huilin Qu, Jake Lee, 
jimdunn, Jingbei Li, Lai Wei, Milan Desai, Mingkun Huang, Paul Stadig, 
perdasilva, Piyush Ghai, qiuhan, Rakesh Vasudevan, Ray Zhang, Sam Skalicky, 
Soji Adeshina, Todd Sundsted, Vishaal Kapoor, YouRancestor, Yuelin Zhang, Zach 
Kimberg, zhiyuan-huang, Zhuo Zhang, Ziyi Mu, luobao-intel, Manu Seth, Matthew 
Brookhart, Vandana Kannan, vdantu
 
And a friendly welcome and thank you to everybody who provided PR feedback for 
the first time:
aplikaplik, Ben Kamphaus, Caenorst, Cliff Woolley, Didier A., Faldict, 
hasanmua, Kovas Boguta, Kurman Karabukaev, Lianmin Zheng, lufenamazon, 
miteshyh, Philip Hyunsu Cho, Pishen Tsai, Shen Zhu, slitsey, wangzhe, xcgoner, 
Zhennan Qin

Best regards,
-sz

On 9/7/18, 11:19 AM, "Roshani Nagmote"  wrote:

Hi All,

So, this vote passes with *seven* +1, *two* 0  and *three* -1 votes.

*+1 votes*
*Committers:*
- Joshua Zhang
- Carin
- Naveen
- Indu
- Haibin

*Community:*
- Pigeon Lucky
- Steffen
*0 votes:*
*Community:*
- Thomas
- Aaron
*-1 votes:*
*Committers:*
- Sandeep
- Anirudh

*Community:*
- Hagay

*Vote Thread:*


https://lists.apache.org/thread.html/8ad6f14811be465cdf663d6962980fd95e12193626292631a21ec6f1@%3Cdev.mxnet.apache.org%3E


I will continue with the release process on general@ and the release
announcement will follow in the next few days.

Thanks,
Roshani




Re: [DISCUSS] Subscribe dev@ to Github Activities?

2018-07-12 Thread Zha, Sheng
My intention is really just to bridge the gap between how much is happening on 
github and the principle that "whatever didn't happen on the dev list didn't 
happen".

Also, since dev@ is intended to be an asynchronous way for the community to 
follow technical conversations, there wasn't really a requirement for anyone to 
read all of them in the first place.

Best regards,
-sz

On 7/12/18, 3:20 PM, "Timur Shenkao"  wrote:

Flink - yes
Spark - it was previously but not now

Yeah, the amount of messages would at least triple: Jira + Github issues + PRs

On Thu, Jul 12, 2018 at 11:13 PM, Haibin Lin 
wrote:

> I'm a bit concerned with the amount of emails flooding in. In the past week
> there were 32 new issues and 35 new pull requests. That means on average 10
> emails per day, and I doubt I'll read all of them. Does the Spark community
> subscribe dev@ to github?
>
> Best,
> Haibin
>
> On Thu, Jul 12, 2018 at 3:08 PM, Pedro Larroy <
> pedro.larroy.li...@gmail.com>
> wrote:
>
> > -1   It's a lot of traffic; whoever wants to subscribe can do it in
> > github. I'm afraid it will decrease the signal-to-noise ratio in the list.
> >
> > On Thu, Jul 12, 2018 at 11:32 PM Lin Yuan  wrote:
> >
> > > +1
> > >
> > > On Thu, Jul 12, 2018 at 12:26 PM Anirudh Acharya <
> anirudhk...@gmail.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On Thu, Jul 12, 2018 at 11:51 AM Piyush Ghai 
> > > > wrote:
> > > >
> > > > > +1
> > > > > > On Jul 12, 2018, at 11:50 AM, Tianqi Chen <tqc...@cs.washington.edu> wrote:
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > On Thu, Jul 12, 2018 at 11:10 AM, Sheng Zha wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >>
> > > > > >> Should we subscribe the dev list to github updates on the mxnet
> > > > > >> repo? Both github issues/PRs and the dev list are intended for
> > > > > >> technical discussions and in that aspect largely share the same
> > > > > >> goal. Since MXNet has most of its activity on github, this could
> > > > > >> help dev@ become more active. Some pros and cons:
> > > > > >>
> > > > > >> Pros:
> > > > > >> - There have been many high quality discussions on github from
> > > > > >> which the dev list can benefit.
> > > > > >> - Replies to update emails are reflected on the specific issue/PR.
> > > > > >> - Users can also choose to click on the link and go to github to
> > > > > >> participate in the discussion.
> > > > > >> - We still have the ability to carry out dev@-only conversations.
> > > > > >>
> > > > > >> Cons:
> > > > > >> - Higher volume on the dev list.
> > > > > >> - Some discussions might not be suitable for dev@ (though I can't
> > > > > >> think of why such conversations should happen on github either).
> > > > > >>
> > > > > >> -sz




Re: move entirely to CMakefiles for building MXNet, drop Makefiles

2018-03-07 Thread Zha, Sheng
You can set rpath to $${ORIGIN} to make the directory where libmxnet.so 
resides a lookup path.

-sz



- Sent by my thumb
> On Mar 7, 2018, at 7:45 AM, Pedro Larroy  wrote:
> 
> About libomp.so
> 
> This is giving me some problems when creating a pip package for installing
> on Jetson. I'm thinking that in this case it would be better to compile with
> -fopenmp. I tried adding libomp.so to the pip package next to libmxnet.so
> and still I couldn't load the library...  Any ideas?
> 
> Pedro
> 
>> On Wed, Mar 7, 2018 at 6:31 AM, Eric Xie  wrote:
>> 
>> We want as few dependencies as possible.
>> CMake alone is enough trouble for our users. We don't want to burden them
>> with other stuff.
>> 
>> On 2018/03/06 17:21:15, kellen sunderland 
>> wrote:
>>> Short term solution sounds good to me Chris.  Converting the CI should be
>>> pretty easy.  One thing we should keep in mind is that there's going to be
>>> a bunch of docs we'll have to update.
>>> 
>>> Warning, slight thread hijack ahead:
>>> As a more long term change I was wondering if we had considered using
>>> hunter for third party packages?  It seems like a good system, and while
>>> it likely won't have support for all our projects, we can contribute back
>>> support for the ones we care about.
>>> 
>>> For me the primary benefit would be that it would conditionally fetch
>>> source at build time based on your cmake configuration.  This would mean
>>> it could, say, detect that you want opencv/mp/protobuf (if you're using
>>> onnx) and then it'd check out the pinned version we specify and build for
>>> your platform.
>>> 
>>> 
>>> On Tue, Mar 6, 2018 at 6:19 PM, Chris Olivier 
>> wrote:
>>> 
 Here is discussion:
 
 https://github.com/apache/incubator-mxnet/issues/8702
 
 On Tue, Mar 6, 2018 at 9:14 AM, Chris Olivier 
 wrote:
 
> This was agreed upon some time ago in a github issue thread, unless
>> there
> are new objections to it.
> 
> As far as I know, it's just a matter of someone putting in the work
>> to
 add
> more functionality to cmake or to fuse the two builds.
> 
> One solution for the short term might include having the Makefile
>> launch
> cmake for most of the build and fall back to Makefile for some of the
> remaining stuff, like scalapkg, rpkg, etc.
> 
> btw, cmake uses the openmp in 3rdparty
> 
> 
> 
> On Tue, Mar 6, 2018 at 8:51 AM, Pedro Larroy <
 pedro.larroy.li...@gmail.com
>> wrote:
> 
>> Hi
>> 
>> I would like to raise the issue that maintaining two overlapping
>> build
>> systems is too much overhead. It adds unnecessary complexity and
>> differences on how the project is built.
>> 
>> For example, openmp is used differently by CMake and Make: in the former
>> the one provided by gcc is used, and in the latter it is compiled from
>> the 3rdparty folder.
>> 
>> I think this situation is not sustainable for the project, especially
>> if we want to support compilation and cross-compilation on devices.
>> 
>> My proposal would be to identify any gaps that are not covered by the
>> CMake build system, cover them, and make CMake the single build system
>> for MXNet, well tested and fully supported.
>> 
>> Pedro.
>> 
> 
> 
 
>>> 
>> 


Re: dmlc packages into 3rdparty

2018-01-19 Thread Zha, Sheng
Some existing release pipelines may rely on the folder structure for caching, 
so this needs to be coordinated.

Best regards,
-sz

On 1/19/18, 11:51 AM, "Tianqi Chen"  wrote:

I think it is fine either way as it won’t affect the build status of the
projects.

Tianqi
On Fri, Jan 19, 2018 at 11:43 AM Chris Olivier 
wrote:

> Is the general consensus to move the dmlc packages into 3rdparty?
>
> If so, I can submit a PR that does this.
>
> I have no strong opinion on it either way and am very open to other
> opinions on this.
>
> -Chris
>




Re: Protected master needs to be turned off

2017-11-19 Thread Zha, Sheng
My +1 vote stands. The vote is about what we should do right now, not where we 
should ideally be in 3 months. I don’t think we can move forward without 
disabling the branch protection, because the current CI is not in any state to 
base merge decisions on. For example, here’s why:
1. Master branch protection is currently on. A change that breaks the build 
was still merged in on the 16th despite the protection. I wasn’t able to merge 
the fix yesterday for 8 hours because the CI tests fail.
2. The false negative rate is currently too high (see the red crosses in 
https://github.com/apache/incubator-mxnet/pulls and 
https://github.com/apache/incubator-mxnet/commits/master).

People who are working on test infrastructure might say that it’s “enough work 
to isolate and fix the current issues”, and I can certainly relate to that. On 
the other hand, you too can probably empathize with the developers who have 
“enough work to develop new features and write tests” without having to deal 
with the broken CI. (Note that my argument is about the CI system; flaky test 
cases are a separate issue.)

Regarding “doesn't that mean that our users and customers are also going to 
face those issues”, I honestly don’t think the argument stands. Release cycles 
and distribution channels, as well as other safety measures, exist exactly to 
isolate problems to the development branch and protect the users. If anything, 
turning on branch protection on release branches should suffice.

Finally, master branch protection being off doesn’t mean PRs can be merged 
without being tested. Contributors own the code quality and are responsible for 
the changes. Committers and reviewers are there to ensure that merged changes 
are OK.

Best regards,
-sz

On 11/19/17, 1:51 PM, "Marco de Abreu" <marco.g.ab...@googlemail.com> wrote:

Hello,

-1 (non binding)

Who is going to be responsible for changes that break tests or have other
side effects after they have been merged? I'm afraid that this will harm
further development. At the moment I'm the person responsible for setting
up the new CI, and so far my results have shown that not only the CI itself
is a problem, but also the stability of our code as well as the tests
themselves. At the moment we are having big issues getting a stable CI,
because MXNet seems to rely on architectures, dependencies, and other
factors so specific that I'm not even able to track them down, and this
causes everything to be unstable.

Just to point it out: If we encounter so many problems while setting up a
CI system, doesn't that mean that our users and customers are also going to
face those issues as soon as things are getting more complicated? This is a
red flag in my opinion and I'm really looking forward to the usability
Sprint, but at the moment I'm afraid that an unprotected master will make
the situation even worse. It's already enough work to isolate and fix the
current issues, but if new untested changes get merged, this is going to be
like fighting a wildfire with a bottle of water.

So please revise your thoughts. If anybody is blocked by the protected
master, I would really appreciate it if they could approach me personally
in order to help stabilising the current situation. Just feeding in more
and more changes on one end while we're fixing issues on the other end
won't get us anywhere.

Best regards,
Marco

Am 19.11.2017 10:08 nachm. schrieb "Chris Olivier" <cjolivie...@gmail.com>:

> Revised:
>
>
> +1 at least until new CI is implemented. Then reevaluate.
>
> On Sun, Nov 19, 2017 at 1:07 PM Chris Olivier <cjolivie...@gmail.com>
> wrote:
>
> > +1
> >
> >
> > On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng <zhash...@amazon.com> wrote:
> >
> >> +1
> >>
> >> Best regards,
> >> -sz
> >>
> >> On 11/19/17, 12:51 PM, "Eric Xie" <j...@apache.org> wrote:
> >>
> >> Hi all,
> >> I'm starting this thread to vote on turning off protected master.
> The
> >> reasons are:
> >>
> >> 1. Since we turned on protected master, pending PRs have grown from 40
> >> to 80. It is severely slowing down development.
> >>
> >> 2. Committers, not CI, are ultimately responsible for the code they
> >> merge. You should only override the CI when you are very confident that
> >> CI is the problem, not your code. If it turns out you are wrong, you
> >> should fix it ASAP. This is the bare minimum requirement for all
> >> committers: BE RESPONSIBLE.
> >>

Re: AWS contributing ONNX-MXNet

2017-11-16 Thread Zha, Sheng
Hi Hagay,

> But why assume this was done on intention?
Given that you mentioned that you talked to Mu about this, would it be right to 
assume that you have paid sufficient attention and been extra careful about 
acknowledgements already?

> import_onnx.py [1] is the only one that seems to have been missed… Monday to 
> discuss in detail.
Sure, happy to talk to you then. I’d like to see the reasons why the other 
three files I listed don’t deserve acknowledgement.

Given the similarity in the code, I can’t help but conclude that the code in 
one was copied from the other, and in order to do that you must already have 
been aware of the origin of the code. So, how can files be missed 
unintentionally? Otherwise, I’ll do my best to assume the best intentions in 
your actions. Thanks.

Best regards,
-sz

On 11/16/17, 7:04 PM, "Lupesko, Hagay"  wrote:

Chiming in as well.

First and foremost, I agree wholeheartedly that acknowledgments are due 
when deserved. In fact, we took care to add acknowledgments in the code, and in 
the blog post for that precise reason!
I also personally talked with Mu, to make sure these are in order and 
appropriate, and he had no comments.
Have we missed acknowledgments? Maybe (more on that below). But why assume 
this was done on intention?

Addressing specific points (I won’t repeat Henri’s points):
- I’m happy to take another look and see whether more files need to have 
the “ack” statement. But looking into it again, import_onnx.py [1] is the only 
one that seems to have been missed, and the ack has already been added. Sheng – 
I’ll grab some time with you Monday to discuss in detail.
- The tutorial itself was actually referenced from PyTorch, not nnvm. This 
is acknowledged by onnx-mxnet code, as well as the nnvm code.
- We intentionally ack-ed an open source community (dmlc/nnvm) and not 
individuals. More than just Tianqi and Zhi worked on nnvm and onnx; it is a 
whole community that we thank.
- “I was wondering why your below email didn't include such 
acknowledgement?” – as noted by Hen, the email did include the ack.

One last thing, quoting Sheng: “In general, to have a healthy community, I 
believe the right things to do would be…”
I would stress that in order to have a healthy community, we should 
always assume others have the best intentions – this will make us a stronger 
community, one that works together, and one that is fun to be part of.

Hagay

[1] https://github.com/onnx/onnx-mxnet/blob/master/onnx_mxnet/import_onnx.py

On 11/16/17, 18:06, "Hen"  wrote:

On Thu, Nov 16, 2017 at 4:32 PM, Sheng Zha  wrote:

> Hi Hagay,
>
> (cc'd Zhi, Tianqi to make sure real authors are aware)
>
>
>
> At first glance the code in the repo you shared (i.e.
> https://github.com/onnx/onnx-mxnet) looks very
>
> familiar, so I did some searching. It looks like *almost all* the code
> is adopted from the *nnvm onnx* frontend, but the main contributor
> (*Zhi Zhang*, committer of mxnet and intern at AWS) from this same
> community was not given his due credit in your email. To elaborate on
> why I think almost all the onnx-mxnet code is from the nnvm onnx frontend:
>
>
>
> The following is the content of this repo:
>
> ├── LICENSE.txt
>
> ├── README.md
>
> ├── onnx_mxnet
>
> │   ├── __init__.py
>
> │   ├── common.py
>
> │   ├── import_helper.py
>
> │   ├── import_onnx.py
>
> │   └── tests
>
> │   ├── test_models.py
>
> │   └── test_super_resolution.py
>
> ├── setup.py
>
> ├── super_res_input.jpg
>
> └── super_res_output.jpg
>
> (Also attached a screenshot of the commit history of onnx_mxnet at the
> moment, as well as a copy of the git package, in case commit hash 
mismatch
> happens)
>
>
>
>- Out of the 6 files under the onnx_mxnet package
>   - the following two files are marked as being derived from nnvm:
>  - common.py
>  - import_helper.py
>   - the remaining four files are not marked as being derived from nnvm:
>  - __init__.py
>  looks 

Re: [RESULT][VOTE] Release Apache MXNet (incubating) Version 0.12.0.rc0

2017-10-28 Thread Zha, Sheng
Who’s executing?

Best regards,
-sz

On 10/24/17, 4:50 PM, "Meghna Baijal"  wrote:

Hi All,
The vote for releasing Apache MXNet (incubating) 0.12.0 RC0 passed with the
following result -

+1 binding
- Chris Olivier
- Suneel Marthi
- Indhu Bharathi

+1 non-binding
- Gautam Kumar

There were no -1 or 0 votes.

Vote thread :

https://lists.apache.org/thread.html/800402860be8a1b4055ede075ab465af48b7f8d041b42217372a63b9@%3Cdev.mxnet.apache.org%3E

I am now going to create a vote on the general@ list.

Thanks,
Meghna Baijal




Re: Improving and rationalizing unit tests

2017-10-16 Thread Zha, Sheng
Here’s a package that may help us with flaky tests: 
https://pypi.python.org/pypi/flaky. It can retry tests that are marked flaky 
and pass or fail them based on a specified threshold. In the short term we can 
use this to pass tests that are not 100% reliable.
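
The retry-and-threshold behavior can be sketched in a few lines of plain 
Python. This is an illustration of the semantics only, not the flaky package’s 
actual implementation; the decorator name `retry` and the seeded failure rate 
below are invented for the example:

```python
import functools
import random

def retry(max_runs=3, min_passes=1):
    """Rerun a test up to max_runs times; succeed once it has passed
    min_passes times, otherwise re-raise the last failure."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            passes, last_exc = 0, None
            for _ in range(max_runs):
                try:
                    fn(*args, **kwargs)
                    passes += 1
                    if passes >= min_passes:
                        return
                except AssertionError as exc:
                    last_exc = exc
            raise last_exc  # every needed pass failed to materialize
        return wrapper
    return decorator

# A deliberately unreliable "test": fails whenever the draw lands >= 0.5.
rng = random.Random(0)

@retry(max_runs=5, min_passes=1)
def test_sometimes_flaky():
    assert rng.random() < 0.5

test_sometimes_flaky()
print("flaky test passed within the retry budget")
```

The real plugin exposes the same knobs as decorator arguments, so a threshold 
like “pass if 1 of 5 runs succeeds” is a one-line annotation on the test.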

Best regards,
-sz

On 10/16/17, 10:32 AM, "kellen sunderland"  wrote:

I think you’ve covered the pros/cons of having determinism in your tests.  
It seems like a potential maintenance effort versus forced code robustness 
argument to me.  I’d suggest you have a separate vote on this topic.

For me the main topic that should be focused on is making the CI system 
fast and stable for PR builds.  I think the best road to doing this, as 
previously suggested, is to segment tests and move those that are long running, 
fail occasionally, or require external components into a nightly/weekly test.

I would also propose that:

6) Test fixtures are created to test a subset of functionality, so that we 
don’t have test fixtures like test_operator.py that are nearly 5000 lines long 
and take 20 minutes to run.  There are a few advantages to breaking these tests 
into smaller files:

We will have fewer merge conflicts, because fewer people will be editing 
the same test files across PRs. Debugging issues with tests will become 
easier, as hopefully there will be fewer potential side effects between tests 
(this does happen now). We may be a little more confident that the tests run 
independently, eventually meaning that we could run them in parallel more 
easily, which would reduce test run latency (but not throughput).  Last, 
we will be able to disable tests at some convenient level of granularity, for 
example when running on IoT devices, or without OpenCV.  At the moment we’d 
have to disable tests individually.

7) We cleanup tests that are no longer needed:

I’ve personally found it quite unintuitive in MXNet to discover which tests 
are actually needed, where they are run, how often, etc.  Are the nightly tests 
actually being run nightly?  Are the cpp tests run?  Why is the Travis CI 
folder still there?  What is the difference between the ci_build folder and the 
Jenkins folder?  If we’re going to take a look at revamping the tests folder 
I’d recommend we clean up the folder structure a bit, and delete the 
non-relevant files to make it easier for newcomers to know what’s happening.  
We’ll always have these files for reference in source control.

-Kellen

From: Chris Olivier
Sent: Monday, October 16, 2017 6:46 PM
To: dev@mxnet.incubator.apache.org
Subject: Re: Improving and rationalizing unit tests

My argument is that I am actually categorically against having a
requirement that the same input values be used for testing for every run.

I don't personally view "convenience in reproducing" as outweighing
"finding edge cases that I didn't think of or that haven't been tried
before".

On Mon, Oct 16, 2017 at 9:34 AM, Pedro Larroy 
wrote:

> It's always going to be deterministic one way or another unless you use
> random from the entropy pool such as /dev/random. I don't think it's a good
> practice not to seed properly and have values depend on execution order /
> parallelism / time or whatever, but that's just my opinion. I would want to
> use the same values for all test runs for reproducibility.
>
> I think your argument goes more towards the previously mentioned "property
> based testing" approach, which is in the spirit of what you are 
supporting,
> if I'm not mistaken.
>
> On Mon, Oct 16, 2017 at 6:22 PM, Chris Olivier 
> wrote:
>
> > My take on the suggestion of purely deterministic inputs is (including
> > deterministic seeding):
> >
> > "I want the same values to be used for all test runs because it is
> > inconvenient when a unit test fails for some edge cases.  I prefer that
> > unforseen edge case failures occur in the field and not during testing".
> >
> > Is this the motivation?  Seems strange to me.
> >
> >
> > On Mon, Oct 16, 2017 at 9:09 AM, Pedro Larroy <
> > pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > I think using a properly seeded and initialized (pseudo)random is
> > actually
> > > beneficial (and deterministic), handpicked examples are usually too
> > > simplistic and miss corner cases.
> > >
> > > Better yet is to use property based testing, which will pick corner
> cases
> > > and do fuzzing automatically to check with high degree of confidence
> > that a
> > > testing condition holds.
> > >
> > > Probably it would be good if we used a property-based testing library in
> > > addition to nose to check invariants.
> > >
> > > A quick googling yields 
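
For what it’s worth, the property-based approach discussed in this thread can 
be illustrated without any extra library, using an explicitly seeded generator 
so runs stay reproducible while still exercising many inputs, including corner 
cases. All names here (`check_property`, `gen_int_list`) are invented for the 
sketch; a real setup would use a dedicated library such as Hypothesis:

```python
import random

def check_property(prop, gen, runs=200, seed=42):
    """Tiny property-based check: deterministic because the RNG is
    explicitly seeded, yet it covers many generated inputs per run."""
    rng = random.Random(seed)
    for i in range(runs):
        case = gen(rng)
        assert prop(case), f"property failed on input {case!r} (run {i})"

def gen_int_list(rng):
    # Bias toward corner cases: empty lists, singletons, duplicates, negatives.
    n = rng.choice([0, 1, 2, rng.randint(3, 50)])
    return [rng.randint(-10, 10) for _ in range(n)]

# Example invariants: sorting is idempotent, and reversing a descending
# sort yields the ascending sort.
check_property(
    lambda xs: sorted(sorted(xs)) == sorted(xs)
    and sorted(xs, reverse=True)[::-1] == sorted(xs),
    gen_int_list,
)
print("all property checks passed")
```

This keeps the reproducibility Pedro asks for (fixed seed, same values every 
run) while retaining the edge-case coverage Chris argues for.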

Re: MXNet Slack channel

2017-09-28 Thread Zha, Sheng
Invited.

Best regards,
-sz

On 9/28/17, 12:31 PM, "Jin Sun"  wrote:

Please give me an invite as well.
Thanks,
Jin

2017-09-28 12:14 GMT-07:00 Jean K :

> Hi,
> I would like to join the MXNet Slack Channel.
> Best,
> Jean
>



-- 

*Jin Sun*
Software Engineer II at Uber
M.Eng. in ECE at Carnegie Mellon University




Re: Apache MXNet build failures are mostly valid - verify before merge

2017-09-28 Thread Zha, Sheng
> Thanks Kumar for resuming the important discussion.
>
> Best regards
> - Tsuyoshi
>
> On Thu, Sep 28, 2017 at 12:56 PM, Kumar, Gautam <ga...@amazon.com> wrote:
> > Reviving the discussion.
> >
> > At this point of time we have a couple of stable builds:
> > https://builds.apache.org/view/Incubator%20Projects/job/incubator-mxnet/job/master/448/
> > https://builds.apache.org/view/Incubator%20Projects/job/incubator-mxnet/job/master/449/
> >
> > Should we have a quick discussion or a poll on making the mx-net branch
> > protected? If you still think we shouldn’t make it protected, please
> > provide a reason to support your claim.
> >
> > A few of us have concerns over Jenkins’s stability. If I look two weeks
> > back, after upgrading the Linux slave to g2.8x and the new Windows AMI,
> > we have not seen any case where an instance died due to high memory
> > usage, where a process got killed due to high CPU usage, or any other
> > issue with the Windows slaves.
> >
> > Going forward we are also planning that if we add any new slave, we
> > will not enable the main load immediately, but rather will do a ‘test
> > build’ to make sure that new slaves are not causing any infrastructure
> > issue and are capable of performing as well as existing slaves.
> >
> > -Gautam
> >
> > On 8/31/17, 5:27 PM, "Lupesko, Hagay" <lupe...@gmail.com> wrote:
> >
> >     @madan looking into some failures – you’re right… there’s multiple

Re: Apache MXNet build failures are mostly valid - verify before merge

2017-08-31 Thread Zha, Sheng
Just one thing: please don’t disable more tests or just raise the tolerance 
thresholds.

Best regards,
-sz

On 8/31/17, 10:45 AM, "Madan Jampani"  wrote:

+1
Before we turn on protected mode, I feel we should first get to a stable CI
pipeline.
Sandeep is chasing down known breaking issues.


On Thu, Aug 31, 2017 at 10:27 AM, Hagay Lupesko  wrote:

> Build stability is a major issue; builds have been failing left and right
> over the last week. Some of it is due to Jenkins slave issues, but some
> failures are real regressions.
> We need to be more strict about the code we're committing.
>
> I propose we configure our master to be a protected branch (
> https://help.github.com/articles/about-protected-branches/).
>
> Thoughts?
>
> On 2017-08-28 22:41, sandeep krishnamurthy wrote:
> > Hello Committers and Contributors,
> >
> > Due to unstable build pipelines, over the past week PRs have been merged
> > after CR ignoring PR build status. The build pipeline is much more stable
> > than last week, and most of the build failures you see from now on are
> > likely to be valid failures. Hence, it is recommended to wait for PR
> > builds and look into the root cause of any build failure before
> > proceeding with merges.
> >
> > At this point of time, there are 2 intermittent issues yet to be fixed:
> > * Network error leading to GitHub requests throwing 404
> > * A conflict in artifacts generated between branches/PRs - cause unknown
> >   yet.
> > These issues will be fixed soon.
> >
> > --
> > Sandeep Krishnamurthy