Re: [VOTE] A Separate CI System for Apache MXNet (incubating)

2017-11-10 Thread Madan Jampani
+1 for (1)

On Thu, Nov 9, 2017 at 4:41 PM, Meghna Baijal 
wrote:

> Hi All,
> A need has been identified for MXNet’s CI/CD setup to move away from the
> Apache Jenkins Service. Over the past few days there has been active
> discussion on the necessary and advanced features for such a system and the
> various options available. These are being tracked in this Google Doc
> 
>  (and
> are also in the pdf attached).
>
> I would like to start a vote to choose the framework for this new CI/CD
> system. The options are -
> [1] Jenkins (A setup separated from Apache Jenkins) - with various plugins
> [2] TeamCity
> [3] Travis CI
> [4] GitLabCI
> [5] BuildBot
> [6] Other - Please Name
>
> Please feel free to add a comment to support your choice.
> This vote will be open from now until the end of the day on Monday
> 11/13/2017
>
> Thanks,
> Meghna Baijal
>
>
>


Re: [DISCUSSION] Adding labels to PRs

2017-11-09 Thread madan jampani
I really like (2). Yes it is work to link each PR to a Jira. But it really 
helps users understand what they are getting. The Jira can contain the 
necessary context.

Spark does this well: https://github.com/apache/spark/commits/master

Madan

> On Nov 9, 2017, at 2:43 PM, Meghna Baijal  wrote:
> 
> Hello All,
> 
> Currently, there is no process in place to identify the new features that
> go into every release.  All the commits since the previous release are
> manually parsed to find the important changes that go into the release
> notes.
> 
> 
> In order to improve this process, I want to start a discussion on the
> following options -
> 
> 1. *Better PR titles* - if possible, these should be good enough to be
> picked as is into the release notes.
> 
> 2. *JIRA Issues* - each commit should be tagged with an associated JIRA
> issue. This issue should describe the problem. JIRA tickets can be used to
> automate the generation of release notes.
> 
> 3. *Adding Labels to the PRs/Commits* - There can be a set of 3-5 labels
> such as ‘Bug-Fix’, ‘New Feature’, ‘Docs’, ‘Minor Change’ etc. Atleast those
> PRs which are important and should be included in the release notes should
> be labeled.
> However, labels can only be added by those with write access to the repo.
> So the committers will have to triage this label addition as they
> review/merge the PRs.
> 
> 
> Do you think these changes are feasible? Will they help? What other options
> should be considered?
> 
> Thanks,
> Meghna Baijal


Re: New Apache MXNet logo idea

2017-09-28 Thread Madan Jampani
Is there a picture of Max?

On Thu, Sep 28, 2017 at 12:17 PM, Seb Kiureghian  wrote:

> Hi all,
>
> I have a new idea for a logo that I'd like to propose.
>
> The rabbit (I call him Max) is blazingly fast, like MXNet, but also
> friendly and approachable, like the Gluon interface.
>
> Do you all like it?
>
> Seb
>


Re: Apache MXNet build failures are mostly valid - verify before merge

2017-09-28 Thread Madan Jampani
Chris,
I don't think Naveen is suggesting that a merge happen without full
verification i.e. all tests across all platforms pass.
If a PR has some back and forth and results in multiple revisions (which is
arguably more common than a random unit test failing), we simply delay full
verification until the owner/reviewer have settled on a mutually acceptable
state.

On Thu, Sep 28, 2017 at 10:25 AM, Chris Olivier <cjolivie...@gmail.com>
wrote:

> -1 for running only partial tests.  Most failing unit tests that get
> through fail only for certain platforms/configurations.  I personally
> prefer to be assured the build and test is good before merge.  Most PR
> merges aren't in a huge hurry.
>
> On Thu, Sep 28, 2017 at 9:54 AM, Naveen Swamy <mnnav...@gmail.com> wrote:
>
> > +1 to make it protected. Here is what I am thinking for PR builds
> > on a PR run Sanity Tests + build/test one platform->committer reviews the
> > code and issues "Build Now", a full build is run->Github checks that the
> > full build checks succeed before it can be merged.
> >
> > I agree with Madan that PR should be approved by one another committer.
> >
> >
> >
> > On Thu, Sep 28, 2017 at 9:37 AM, Madan Jampani <madan.jamp...@gmail.com>
> > wrote:
> >
> > > +1
> > >
> > > At a minimum I'd like to see the following two happen:
> > > - Option to merge is disabled until all required checks pass.
> > > - Code is reviewed and given +1 by at least one other committer (no
> self
> > > review).
> > >
> > > On Wed, Sep 27, 2017 at 11:15 PM, Gautam <gautamn...@gmail.com> wrote:
> > >
> > > > Hi Chris,
> > > >
> > > >   Here <https://help.github.com/articles/about-protected-branches/>
> is
> > > > user
> > > > document on semantics of protected branch.
> > > > In short when a branch is protected following applies to that branch.
> > > >
> > > >- Can't be force pushed
> > > >- Can't be deleted
> > > >- Can't have changes merged into it until required status checks
> > > ><https://help.github.com/articles/about-required-status-checks>
> > pass
> > > >- Can't have changes merged into it until required reviews are
> > > approved
> > > ><https://help.github.com/articles/approving-a-pull-
> > > > request-with-required-reviews>
> > > >- Can't be edited or have files uploaded to it from the web
> > > >- Can't have changes merged into it until changes to files that
> > > > have a designated
> > > >code owner <https://help.github.com/articles/about-codeowners/>
> > have
> > > >been approved by that owner
> > > >
> > > >  I am sure many of us might not want to have all these but we can
> > debate
> > > on
> > > > it. My main motive was to "*Can't have changes merged into it until
> > > > required status checks pass*"
> > > >
> > > >
> > > > -Gautam
> > > >
> > > >
> > > >
> > > > On Wed, Sep 27, 2017 at 11:09 PM, Chris Olivier <
> cjolivie...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > What does that mean? "Protected"? Protected from what?
> > > > >
> > > > > On Wed, Sep 27, 2017 at 11:08 PM Gautam <gautamn...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi Chris,
> > > > > >
> > > > > >I mean make "master branch protected" of  MXNet.
> > > > > >
> > > > > > -Gautam
> > > > > >
> > > > > > On Wed, Sep 27, 2017 at 11:04 PM, Chris Olivier <
> > > cjolivie...@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > What does this mean? "Mx-net branch protected"?
> > > > > > >
> > > > > > > On Wed, Sep 27, 2017 at 9:59 PM Tsuyoshi OZAWA <
> > > > > ozawa.tsuyo...@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1,
> > > > > > > >
> > > > > > > > While I'm checking the recent build failures, and I think the
> > > > > decision
> > > > > > > > of making the mx

Re: MXNet: Run PR builds on Apache Jenkins only after the commit is reviewed

2017-09-13 Thread Madan Jampani
I agree with Naveen. This is not a one-way door. We can refine and improve
the process as needed in the future.
I don't see this as case of who works harder (machines or humans). PRs
always involve a manual step. All we are saying is lets do the full
verification when someone reviewed the changes and feels it is in a good
enough shape to be merged.

On Wed, Sep 13, 2017 at 9:56 AM, Naveen Swamy <mnnav...@gmail.com> wrote:

> The current number of build steps for a successful build is around 30 each
> requiring to run on a Jenkins executor(running on one instance), there are
> plans to add additional tests(more build steps) for additional platforms
> such as MAC OSX, IOT devices. I don't think it is practical nor necessary
> to run the entire pipeline on every PR change(even if you cancel the old
> build) before someone has looked into it.  Please see the attached image
> that shows the complete set of build steps. Each step not a stage in the
> pipeline needs an instance, running on average to 2.5 hours. We have
> allocated 20 instances running for the build.
>
> [image: Inline image 1]
>
>
> While we optimize our builds to better handle changes, become smarter in
> what builds to kick off, As a compromise, I suggest we modify the PR builds
> to run the following steps in sequence:
> 1. Sanity Checks(lint, Coverity, etc.,)
> 2. CPU Build for Linux
> 3. Python CPU Unit Tests on Linux
>
> On a PR review and request(label or comment) run the existing pipeline.
>
> Also, we can revisit and refine this again after we have optimized the
> current build pipeline.
>
> Let me know if you still have concerns, we can chat offline as well to
> understand better.
>
> Thanks, Naveen
>
>
>
>
>
> On Tue, Sep 12, 2017 at 11:17 PM, Kumar, Gautam <ga...@amazon.com> wrote:
>
>> Agreed with Bhavin.
>>
>> One point which is being mentioned here is abort the previous build if a
>> new change is being published on same PR.
>> Which can be achievable by just changing the job configuration “Cancel
>> build on update” under Trigger setup.
>> https://github.com/jenkinsci/ghprb-plugin/issues/379 is the git hub
>> discussion page for same.
>>
>> -Gautam
>>
>>
>> On 9/12/17, 9:43 PM, "Lin, Haibin" <haibi...@amazon.com> wrote:
>>
>> Also +1 on this.
>>
>> Haibin
>>
>> On 9/12/17, 9:28 PM, "Chris Olivier" <cjolivie...@gmail.com> wrote:
>>
>> +1 to this
>>
>> On Tue, Sep 12, 2017 at 8:48 PM Bhavin Thaker <
>> bhavintha...@gmail.com>
>> wrote:
>>
>> > My vote is to make the CI build and test process lightweight
>> and efficient
>> > and not involve a human to give a go-ahead for a PR build.
>> >
>> > There are multiple ways to engineer a stable and efficient CI
>> system,
>> > (already discussed on this email thread), including canceling
>> previously
>> > running build for a particular PR before invoking a new build,
>> using
>> > Incredibuild, adding more machines to CI, etc.
>> >
>> > In short, as we scale to many engineers working on MXNet around
>> the globe
>> > in different time-zones, precious human involvement should be
>> minimal and
>> > be used judiciously only for critical things like code-review
>> and only
>> > after sufficient amount of sanity build-tests have passed.
>> >
>>     > Let the machines work harder for humans and not the other way
>> around.
>> >
>> > Bhavin Thaker.
>> >
>> > On Tue, Sep 12, 2017 at 12:20 PM Chris Olivier <
>> cjolivie...@gmail.com>
>> > wrote:
>> >
>> > > The majority of these iterations is to trigger a build on the
>> broken CI
>> > in
>> > > hopes it will finally pass...
>> > >
>> > > On Tue, Sep 12, 2017 at 11:15 AM Madan Jampani <
>> madan.jamp...@gmail.com>
>> > > wrote:
>> > >
>> > > > +1
>> > > > I second only running sanity test (lint) until manual
>> approval.
>> > > >
>> > > > On Tue, Sep 12, 2017 at 11:05 AM, Naveen Swamy <
>> mnnav...@gmail.com>
>> > > wrote:
>> > > >
>> > > > > Just to be clear, the proposal is not to re

Re: MXNet: Run PR builds on Apache Jenkins only after the commit is reviewed

2017-09-12 Thread Madan Jampani
+1
I second only running sanity test (lint) until manual approval.

On Tue, Sep 12, 2017 at 11:05 AM, Naveen Swamy  wrote:

> Just to be clear, the proposal is not to remove the PR build. It's only to
> delay the PR build until a reviewer has looked at it and marks it Approved
> or adds a Label to build. Once it's approved and PR build succeeds a
> reviewer/committer can see the build result and merge to the master.
>
> I don't mean to pick on the author of this PR(Chris), I am just using this
> build to support my argument.
> https://builds.apache.org/view/Incubator%20Projects/job/
> incubator-mxnet/view/change-requests/job/PR-7577/
>  This has gone through 74 iterations, I understand not all of them are due
> to changes and some of them are due to build instability and some dummy
> commits to getting the PR build to pass. However, the PR has been under
> review and changes being made for the last several days. I don't think it
> warrants a build trigger for every new change. Each build on average takes
> about 2 hours running on all platforms.
>
> Probably we can run a lean down version of the current PR build where we
> have sanity-test(lint)->build on linux(cpu)->unit test on linux(cpu) ?
>
>
> Thanks, Naveen
>
>
>
>
> On Tue, Sep 12, 2017 at 4:27 AM, Joern Kottmann 
> wrote:
>
> > Not sure how it works with jenkins, but other CI serves can look at
> > the commit message and skip the CI run based on certain commands in
> > it.
> >
> > Might make sense for small changes such as documentation updates, half
> > done PRs, etc.
> >
> > Jörn
> >
> > On Tue, Sep 12, 2017 at 11:17 AM, Larroy, Pedro 
> > wrote:
> > > Hi
> > >
> > > I would like to integrate our CI system for devices to make sure PRs
> > build on ARM / android etc. Who has admin rights on the repository so we
> > can install the necessary hooks to trigger our builds?
> > >
> > >
> > > Kind regards.
> > > --
> > >
> > > Pedro
> > >
> > > On 12/09/17 02:50, "Meghna Baijal"  wrote:
> > >
> > > Hi All,
> > > We would like to initiate a change in the way the PR builds are
> > being triggered. At the moment, every time a Pull Request is created, a
> > build gets triggered on Jenkins. Additional builds also get triggered due
> > to changes to the same PR.
> > > Too many PR builds leads to resource starvation and very long
> queues
> > and long build times. Hence we would like to add some checks where a
> human
> > reviewer manually marks it to something like “ok to build” before a PR
> > build is triggered.
> > >
> > > Do you think this approach would be helpful and we should move
> > forward with it?
> > >
> > > Thanks,
> > > Meghna Baijal
> > >
> > >
> > >
> > >
> > >
> > >
> > > Amazon Development Center Germany GmbH
> > > Berlin - Dresden - Aachen
> > > main office: Krausenstr. 38, 10117 Berlin
> > > Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
> > > Ust-ID: DE289237879
> > > Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
> >
>


Re: Apache MXNet build failures are mostly valid - verify before merge

2017-08-31 Thread Madan Jampani
@hagay: we agree on the end state. I'm not too particular about how we get
there. If you think enabling it now and fixes regression later is doable,
I'm fine with. I see a bit of a chicken and egg problem. We need to get
some fixes in even when the status checks are failing.

On Thu, Aug 31, 2017 at 11:25 AM, Lupesko, Hagay <lupe...@gmail.com> wrote:

> @madan – re: getting to a stable CI first:
> I’m concerned that by not enabling protected branch mode ASAP, we’re just
> taking in more regressions, which makes a stable build a moving target for
> us…
>
> On 8/31/17, 10:49, "Zha, Sheng" <zhash...@amazon.com> wrote:
>
> Just one thing: please don’t disable more tests or just raise the
> tolerance thresholds.
>
> Best regards,
> -sz
>
> On 8/31/17, 10:45 AM, "Madan Jampani" <madan.jamp...@gmail.com> wrote:
>
> +1
> Before we can turn protected mode I feel we should first get to a
> stable CI
> pipeline.
> Sandeep is chasing down known breaking issues.
>
>
> On Thu, Aug 31, 2017 at 10:27 AM, Hagay Lupesko <lupe...@gmail.com>
> wrote:
>
> > Build stability is a major issue, builds have been failing left
> and right
> > over the last week. Some of it is due to Jenkins slave issues,
> but some are
> > real regressions.
> > We need to be more strict in the code we're committing.
> >
> > I propose we configure our master to be a protected branch (
> > https://help.github.com/articles/about-protected-branches/).
> >
> > Thoughts?
> >
> > On 2017-08-28 22:41, sandeep krishnamurthy <s...@gmail.com>
> wrote:
> > > Hello Committers and Contributors,>
> > >
> > > Due to unstable build pipelines, from past 1 week, PRs are
> being merged>
> > > after CR ignoring PR build status. Build pipeline is much more
> stable
> > than>
> > > last week and most of the build failures you see from now on,
> are likely
> > to>
> > > be a valid failure and hence, it is recommended to wait for PR
> builds,
> > see>
> > > the root cause of any build failures before proceeding with
> merges.>
> > >
> > > At this point of time, there are 2 intermittent issue yet to
> be fixed ->
> > > * Network error leading to GitHub requests throwing 404>
> > > * A conflict in artifacts generated between branches/PR -
> Cause unknown
> > yet.>
> > > These issues will be fixed soon.>
> > >
> > >
> > > -- >
> > > Sandeep Krishnamurthy>
> > >
> >
>
>
>
>
>
>


Re: Apache MXNet build failures are mostly valid - verify before merge

2017-08-31 Thread Madan Jampani
+1
Before we can turn protected mode I feel we should first get to a stable CI
pipeline.
Sandeep is chasing down known breaking issues.


On Thu, Aug 31, 2017 at 10:27 AM, Hagay Lupesko  wrote:

> Build stability is a major issue, builds have been failing left and right
> over the last week. Some of it is due to Jenkins slave issues, but some are
> real regressions.
> We need to be more strict in the code we're committing.
>
> I propose we configure our master to be a protected branch (
> https://help.github.com/articles/about-protected-branches/).
>
> Thoughts?
>
> On 2017-08-28 22:41, sandeep krishnamurthy  wrote:
> > Hello Committers and Contributors,>
> >
> > Due to unstable build pipelines, from past 1 week, PRs are being merged>
> > after CR ignoring PR build status. Build pipeline is much more stable
> than>
> > last week and most of the build failures you see from now on, are likely
> to>
> > be a valid failure and hence, it is recommended to wait for PR builds,
> see>
> > the root cause of any build failures before proceeding with merges.>
> >
> > At this point of time, there are 2 intermittent issue yet to be fixed ->
> > * Network error leading to GitHub requests throwing 404>
> > * A conflict in artifacts generated between branches/PR - Cause unknown
> yet.>
> > These issues will be fixed soon.>
> >
> >
> > -- >
> > Sandeep Krishnamurthy>
> >
>


Re: Formalize Committer Proposal and Application Procedure

2017-08-04 Thread Madan Jampani
There is a middle ground here. Instead of saying someone either has full
committer privileges or zero, an alternative is to have scope of ownership
start small and localized to modules or source folders where their primary
contributions currently lie. For example, there are folks who contributed
full languages bindings, or very good examples/tutorials.

Over time, depending on the scope and complexity of their contributions
they can be nominated to have their commit privileges broadened or even
become core committers. Core committers have full commit privileges.

Irrespective of whether some one is committer or not, we should all be
using the PR process and opening up contributions for review/feedback.

Madan



On Fri, Aug 4, 2017 at 5:04 AM, Isabel Drost-Fromm 
wrote:

> On Fri, Aug 04, 2017 at 12:27:16PM +0100, Chiyuan Zhang wrote:
> > Suppose we lower the standard or completely remove the formal standard
> for
> > committers, then we could probably be able to get more committers from
> the
> > first type. But that might not necessarily be good to us
>
> Can you elaborate your reasoning here? (I'm not implying that I agree or
> disagree with you, I just want to understand where this fear is coming
> from.)
>
>
> > having people that could either contribute relatively important
> components
> > or provide longer term commitment to the project. But on the other hand,
> > having a standard for committers do not (I hope) discourage the first
> type
> > of contributors to contribute PRs.
>
> Let me tell you a little campfire story: Back in the old days of Mahout we
> implicitly had a relatively high bar for becoming a committer. People
> thought
> that in order to become committer they would have to contribute substantial
> patches, often full new algorithm implementations.
>
> What the project really needed were a lot of work polishing, optimising,
> cleaning, making easier to use, documenting etc.
>
> Due to the perception of requiring substantial contributions to get the
> reward of becoming committer however we never received much of the latter.
>
>
> Lesson learnt for me: The way you setup your reward systems greatly
> influences which kind of help your project will receive.
>
>
> Isabel
>
> --
> Sorry for any typos: Mail was typed in vim, written in mutt, via ssh (most
> likely involving some kind of mobile connection only.)
>


development process wiki

2017-07-07 Thread Madan Jampani
Hi Everyone,

I put together a wiki page outlining the process and guidelines we could
follow to better manage contributions to mxnet. This started out as a
proposal to better manage the PR backlog. However it felt necessary to
codify a set of guidelines or best practices new contributors could follow
to better prepare their contributions for a review and subsequent merge.

I am opening it up to the community for feedback.

https://cwiki.apache.org/confluence/display/MXNET/Development+Process

Madan.