Hi Kellen,

Many thanks to you and Marco for this effort! I think this is a crucial
piece of improving MXNet stability.

To add some data points:
1) Customers using the CoreML to MXNet converter were blocked for a while
because the converter was broken and no unit test was in place to detect
that.
2) Developers on Mac could not verify their local commits because some unit
tests on master were broken. This wasted a lot of time and Jenkins server
resources just to detect the failures.
3) Please consider running the CI on Mac OS 10.13, since this is the minimum
Mac OS version that supports CoreML (needed for the CoreML to MXNet
converter).

Best Regards,

Lin

On Wed, Sep 5, 2018, 3:02 AM kellen sunderland <kellen.sunderl...@gmail.com>
wrote:

> I'm bumping this thread as we've recently had our first serious bug on
> MacOS that would have been caught by enabling Travis.
>
> I'm going to do a little experimental work together with Marco with the
> goal of enabling a minimal Travis build that will run python tests.  So far
> I've verified that Travis will in fact find a bug that currently exists in
> master and has been reproduced by MacOS clients.  This indicates to me that
> adding Travis will add value to our CI.
>
> My best guess is that it might take us some iteration before we find a
> scalable way to integrate Travis.  Given this we're going to enable Travis
> in non-blocking mode (i.e. failures are safe to ignore for the time being).
>
> To help mitigate the risk of timeouts, and to remove legacy code I'm going
> to re-create the .travis.yml file from scratch.  I think it'll be much less
> confusing if we only have working code related to Travis in our codebase,
> so that contributors won't have to experiment to see what is or isn't
> working.  We've got some great, but slightly out-of-date functionality in
> the legacy .travis.yml file.  I hope we can work together to update the
> legacy features, ensure they work with the current folder structure and
> also make sure the features run within Travis's 45 minute global time
> window.
>
> I'd also like to set expectations that this is strictly a volunteer
> effort.  I'd welcome help from the community for support and maintenance.
> The model download caching work particularly stands out to me as
> something I'd like to re-enable again as soon as possible.
>
> -Kellen
>
> On Tue, Jan 9, 2018 at 11:52 AM Marco de Abreu <
> marco.g.ab...@googlemail.com>
> wrote:
>
> > Looks good! +1
> >
> > On Tue, Jan 9, 2018 at 10:24 AM, kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > I think most were in favour of at a minimum creating a clang build so
> > I've
> > > created a PR
> > > https://github.com/apache/incubator-mxnet/pull/9330/commits/
> > > 84089ea14123ebe4d66cc92e82a2d529cfbd8b19.
> > > My hope is this will catch many of the issues blocking OSX builds.  In
> > fact
> > > it already caught one issue.  If you guys are in favour I can remove
> the
> > > WIP and ask that it be merged.
> > >
> > > On Thu, Jan 4, 2018 at 6:29 PM, Chris Olivier <cjolivie...@gmail.com>
> > > wrote:
> > >
> > > > Nope, I have been on vacation.
> > > >
> > > > On Thu, Jan 4, 2018 at 9:10 AM, kellen sunderland <
> > > > kellen.sunderl...@gmail.com> wrote:
> > > >
> > > > > Hope everyone had a good break.  Just wanted to check if there were
> > > > further
> > > > > thoughts on OSX builds.  Chris, did you have time to look into
> > > > virtualizing
> > > > > Mac OS?  Would it make sense for us to put something in place in
> the
> > > > > interim e.g. the clang solution?
> > > > >
> > > > > On Tue, Dec 12, 2017 at 7:59 PM, de Abreu, Marco <
> mab...@amazon.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks for looking into this, Chris! No hurries on that one,
> we’ll
> > > look
> > > > > > into it next stage when we add new system- and
> build-configurations
> > > to
> > > > > the
> > > > > > CI.
> > > > > >
> > > > > > On 12.12.17, 19:12, "Chris Olivier" <cjolivie...@gmail.com>
> wrote:
> > > > > >
> > > > > >     I am on vacation starting Thursday.
> > > > > >
> > > > > >     On Tue, Dec 12, 2017 at 9:49 AM kellen sunderland <
> > > > > >     kellen.sunderl...@gmail.com> wrote:
> > > > > >
> > > > > >     > Absolutely, let's do an investigation and see if it's
> > possible
> > > to
> > > > > >     > virtualize.  Would you have time to look into it a bit
> > further?
> > > > > >     >
> > > > > >     > On Tue, Dec 12, 2017 at 6:47 PM, Chris Olivier <
> > > > > > cjolivie...@gmail.com>
> > > > > >     > wrote:
> > > > > >     >
> > > > > >     > > Don’t get me wrong, I’m not saying this Mac OS Jenkins
> > > solution
> > > > > is
> > > > > > doable
> > > > > >     > > but I feel like we should investigate because the payoff
> > > would
> > > > be
> > > > > > large.
> > > > > >     > >
> > > > > >     > >
> > > > > >     > > On Tue, Dec 12, 2017 at 9:38 AM Chris Olivier <
> > > > > > cjolivie...@gmail.com>
> > > > > >     > > wrote:
> > > > > >     > >
> > > > > >     > > > Apple’s Darwin OS is recently open-sourced.
> > > > > >     > > > https://github.com/PureDarwin/PureDarwin
> > > > > >     > > >
> > > > > >     > > > How to convert this into a non-GUI VM I am not sure
> but I
> > > am
> > > > > > willing to
> > > > > >     > > > bet that people have done it already.
> > > > > >     > > >
> > > > > >     > > > On Tue, Dec 12, 2017 at 9:16 AM kellen sunderland <
> > > > > >     > > > kellen.sunderl...@gmail.com> wrote:
> > > > > >     > > >
> > > > > >     > > >> It might be technically possible, but I think it would
> > > > violate
> > > > > > the
> > > > > >     > MacOS
> > > > > >     > > >> license: http://store.apple.com/
> > > > Catalog/US/Images/MacOSX.htm
> > > > > >     > > >>
> > > > > >     > > >> "2. Permitted License Uses and Restrictions.
> > > > > >     > > >> A. This License allows you to install and use one copy
> > of
> > > > the
> > > > > > Apple
> > > > > >     > > >> Software on a single Apple-labeled computer at a time.
> > > This
> > > > > > License
> > > > > >     > does
> > > > > >     > > >> not allow the Apple Software to exist on more than one
> > > > > computer
> > > > > > at a
> > > > > >     > > >> time, and you may not make the Apple Software available
> > > over
> > > > a
> > > > > > network
> > > > > >     > > >> where
> > > > > >     > > >> it could be used by multiple computers at the same
> time.
> > > You
> > > > > > may make
> > > > > >     > > one
> > > > > >     > > >> copy of the Apple Software (excluding the Boot ROM
> code)
> > > in
> > > > > >     > > >> machine-readable form for backup purposes only;
> provided
> > > > that
> > > > > > the
> > > > > >     > backup
> > > > > >     > > >> copy must include all copyright or other proprietary
> > > notices
> > > > > > contained
> > > > > >     > > on
> > > > > >     > > >> the original. "
> > > > > >     > > >>
> > > > > >     > > >> I could be wrong though, does anyone know the details
> of
> > > > MacOS
> > > > > >     > > licensing /
> > > > > >     > > >> virtualization?
> > > > > >     > > >>
> > > > > >     > > >> On Tue, Dec 12, 2017 at 6:10 PM, Chris Olivier <
> > > > > > cjolivie...@gmail.com
> > > > > >     > >
> > > > > >     > > >> wrote:
> > > > > >     > > >>
> > > > > >     > > >> > googling seems to be full of running OSX (and even
> > > > > > open-sourced
> > > > > >     > > >> PureDarwin)
> > > > > >     > > >> > in VMs. One could conceivably run a VM on an EC2
> > > instance,
> > > > > > right?
> > > > > >     > > >> >
> > > > > >     > > >> > On Tue, Dec 12, 2017 at 9:01 AM kellen sunderland <
> > > > > >     > > >> > kellen.sunderl...@gmail.com> wrote:
> > > > > >     > > >> >
> > > > > >     > > >> > > It would be ideal if we could cover OSX in
> Jenkins,
> > > but
> > > > > the
> > > > > > only
> > > > > >     > > >> solution
> > > > > >     > > >> > > that I'm aware of would require physical machines
> to
> > > be
> > > > > the
> > > > > >     > workers.
> > > > > >     > > >> I
> > > > > >     > > >> > > would be weakly opposed to having physical servers
> > > > running
> > > > > > on PRs.
> > > > > >     > > >> The
> > > > > >     > > >> > > downsides that I see in order of importance:
> > > > > >     > > >> > >
> > > > > >     > > >> > > -  We can't autoscale physical hardware.   If we
> > find
> > > > that
> > > > > > the
> > > > > >     > load
> > > > > >     > > is
> > > > > >     > > >> > too
> > > > > >     > > >> > > high we have to buy more machines.
> > > > > >     > > >> > > -  Security would be tricky, as they'd have to be
> > > > > connected
> > > > > > to the
> > > > > >     > > >> > internet
> > > > > >     > > >> > > and then to our Jenkins master instance.
> Connecting
> > > via
> > > > a
> > > > > > wired
> > > > > >     > > >> network
> > > > > >     > > >> > > would probably not be possible on most corporate
> > > > networks
> > > > > > as these
> > > > > >     > > >> > machines
> > > > > >     > > >> > > are by definition running arbitrary code from the
> > > > > > internet.  Many
> > > > > >     > > >> > corporate
> > > > > >     > > >> > > sites have public wifi that this machine could
> > > > potentially
> > > > > > connect
> > > > > >     > > to,
> > > > > >     > > >> > but
> > > > > >     > > >> > > then our PRs start failing if the wifi disconnects
> > > > > > temporarily.
> > > > > >     > To
> > > > > >     > > >> > connect
> > > > > >     > > >> > > to the master we would need to setup a vpn
> solution
> > > with
> > > > > > endpoints
> > > > > >     > > in
> > > > > >     > > >> our
> > > > > >     > > >> > > vpc on AWS.  This is possible but would probably
> > > > require a
> > > > > > lot of
> > > > > >     > > >> > security
> > > > > >     > > >> > > work.
> > > > > >     > > >> > > -  We can't just create a simple startup script or
> > > yaml
> > > > > > file that
> > > > > >     > is
> > > > > >     > > >> > > checked into GitHub to manage the machine.
> Someone
> > > will
> > > > > > actually
> > > > > >     > > >> have to
> > > > > >     > > >> > > physically administer the machine, apply updates,
> > etc.
> > > > > > which will
> > > > > >     > > make
> > > > > >     > > >> > > community ownership difficult.
> > > > > >     > > >> > >
> > > > > >     > > >> > > Specific to an OSX build:
> > > > > >     > > >> > > -  We can't virtualize OSX which means we'd only
> be
> > > able
> > > > > to
> > > > > > cover
> > > > > >     > > one
> > > > > >     > > >> OSX
> > > > > >     > > >> > > build environment per physical device.  We
> couldn't
> > > > > target a
> > > > > >     > matrix
> > > > > >     > > of
> > > > > >     > > >> > OSX
> > > > > >     > > >> > > and Xcode versions as in Travis.
> > > > > >     > > >> > >
> > > > > >     > > >> > > -Kellen
> > > > > >     > > >> > >
> > > > > >     > > >> > > On Tue, Dec 12, 2017 at 5:46 PM, Chris Olivier <
> > > > > >     > > cjolivie...@gmail.com
> > > > > >     > > >> >
> > > > > >     > > >> > > wrote:
> > > > > >     > > >> > >
> > > > > >     > > >> > > > So why Travis when we could possibly use
> Jenkins?
> > > > > >     > > >> > > >
> > > > > >     > > >> > > > On Tue, Dec 12, 2017 at 7:59 AM Marco de Abreu <
> > > > > >     > > >> > > > marco.g.ab...@googlemail.com>
> > > > > >     > > >> > > > wrote:
> > > > > >     > > >> > > >
> > > > > >     > > >> > > > > Yes that's correct, Chris.
> > > > > >     > > >> > > > >
> > > > > >     > > >> > > > > Am 12.12.2017 4:46 nachm. schrieb "Chris
> > Olivier"
> > > <
> > > > > >     > > >> > > cjolivie...@gmail.com
> > > > > >     > > >> > > > >:
> > > > > >     > > >> > > > >
> > > > > >     > > >> > > > > > A quick google search seems to indicate that
> > Mac
> > > > can
> > > > > > be used
> > > > > >     > > as
> > > > > >     > > >> a
> > > > > >     > > >> > > > Jenkins
> > > > > >     > > >> > > > > > slave. Is this correct?
> > > > > >     > > >> > > > > >
> > > > > >     > > >> > > > > > On Tue, Dec 12, 2017 at 7:42 AM Steffen
> > Rochel <
> > > > > >     > > >> > > > steffenroc...@gmail.com>
> > > > > >     > > >> > > > > > wrote:
> > > > > >     > > >> > > > > >
> > > > > >     > > >> > > > > > > +1 for #1 and #2
> > > > > >     > > >> > > > > > >
> > > > > >     > > >> > > > > > > I’m working on getting a MacPro to add to
> CI
> > > > > system.
> > > > > >     > > >> > > > > > > On Tue, Dec 12, 2017 at 1:43 AM kellen
> > > > sunderland
> > > > > <
> > > > > >     > > >> > > > > > > kellen.sunderl...@gmail.com> wrote:
> > > > > >     > > >> > > > > > >
> > > > > >     > > >> > > > > > > > Background:  TravisCI is a startup
> > providing
> > > > > > managed
> > > > > >     > > >> continuous
> > > > > >     > > >> > > > > > > > integration services with GitHub
> > integration
> > > > and
> > > > > > YAML
> > > > > >     > > based
> > > > > >     > > >> > > > > > > configuration.
> > > > > >     > > >> > > > > > > > TravisCI is one of the few CI providers
> > that
> > > > > will
> > > > > > build
> > > > > >     > a
> > > > > >     > > >> > variety
> > > > > >     > > >> > > > of
> > > > > >     > > >> > > > > > > > OSX/MacOS builds for software projects.
> > > Their
> > > > > > pricing
> > > > > >     > > >> ranges
> > > > > >     > > >> > > from
> > > > > >     > > >> > > > > Free
> > > > > >     > > >> > > > > > > > (for open source, 1 concurrent job, to
> > $489
> > > > > > monthly for
> > > > > >     > 10
> > > > > >     > > >> > > > concurrent
> > > > > >     > > >> > > > > > > jobs).
> > > > > >     > > >> > > > > > > >
> > > > > >     > > >> > > > > > > > Problem: We’ve had a few OSX build
> issues
> > > slip
> > > > > > into
> > > > > >     > MXNet
> > > > > >     > > >> > master
> > > > > >     > > >> > > in
> > > > > >     > > >> > > > > the
> > > > > >     > > >> > > > > > > > past few weeks.  We’ve previously had a
> > > Travis
> > > > > CI
> > > > > > based
> > > > > >     > > >> testing
> > > > > >     > > >> > > > > system
> > > > > >     > > >> > > > > > > that
> > > > > >     > > >> > > > > > > > would have caught these issues.
> > > > > >     > > >> > > > > > > >
> > > > > >     > > >> > > > > > > > Proposals so far:
> > > > > >     > > >> > > > > > > >
> > > > > >     > > >> > > > > > > > 1) Use TravisCI in its free mode for a
> > very
> > > > > > minimal
> > > > > >     > > sanity
> > > > > >     > > >> > check
> > > > > >     > > >> > > > on
> > > > > >     > > >> > > > > > OSX.
> > > > > >     > > >> > > > > > > > If we compile the program, and for
> example
> > > run
> > > > > > C++ unit
> > > > > >     > > >> tests
> > > > > >     > > >> > > we’re
> > > > > >     > > >> > > > > > > > unlikely to run into problems with
> queued
> > > > > > builds.  The
> > > > > >     > > total
> > > > > >     > > >> > > build
> > > > > >     > > >> > > > > time
> > > > > >     > > >> > > > > > > > here should be less than 15 minutes.
> > > > > > Configuration
> > > > > >     > should
> > > > > >     > > >> be
> > > > > >     > > >> > > quite
> > > > > >     > > >> > > > > > > simple
> > > > > >     > > >> > > > > > > > and easy to maintain.  Error messages
> > should
> > > > > also
> > > > > > be
> > > > > >     > > >> obvious to
> > > > > >     > > >> > > > > > > > contributors.
> > > > > >     > > >> > > > > > > > 2) Run clang in Linux with our current
> CI.
> > > > > > Building
> > > > > >     > with
> > > > > >     > > >> clang
> > > > > >     > > >> > > > > should
> > > > > >     > > >> > > > > > > > take less than 10 minutes, should flush
> > out
> > > a
> > > > > > large
> > > > > >     > subset
> > > > > >     > > >> of
> > > > > >     > > >> > the
> > > > > >     > > >> > > > > > issues
> > > > > >     > > >> > > > > > > > we’ve seen with OSX, and be quite easy
> to
> > > > > > maintain.
> > > > > >     > > >> > > > > > > > 3) Run full test-suites in TravisCI,
> > > equaling
> > > > > the
> > > > > > level
> > > > > >     > of
> > > > > >     > > >> > > coverage
> > > > > >     > > >> > > > > we
> > > > > >     > > >> > > > > > > > provide to Linux in Jenkins.  This could
> > > > require
> > > > > > us to
> > > > > >     > > >> > subscribe
> > > > > >     > > >> > > > to a
> > > > > >     > > >> > > > > > > > monthly package with Travis to ensure
> our
> > > > build
> > > > > > queue
> > > > > >     > > >> doesn’t
> > > > > >     > > >> > > grow
> > > > > >     > > >> > > > to
> > > > > >     > > >> > > > > > an
> > > > > >     > > >> > > > > > > > unacceptable length.  It may also
> require
> > a
> > > > > > volunteer to
> > > > > >     > > >> setup
> > > > > >     > > >> > > and
> > > > > >     > > >> > > > > > > maintain
> > > > > >     > > >> > > > > > > > long-term.
> > > > > >     > > >> > > > > > > >
> > > > > >     > > >> > > > > > > > I’d +1 #1 and #2 as I think those should
> > be
> > > > > > low-cost,
> > > > > >     > > >> > > low-maintenance
> > > > > >     > > >> > > > > > > > solutions that should catch the majority
> > of
> > > > the
> > > > > > problems
> > > > > >     > > >> we’ve
> > > > > >     > > >> > > seen
> > > > > >     > > >> > > > > > thus
> > > > > >     > > >> > > > > > > > far.
> > > > > >     > > >> > > > > > > >
> > > > > >     > > >> > > > > > > > -Kellen
> > > > > >     > > >> > > > > > > >
> > > > > >     > > >> > > > > > >
> > > > > >     > > >> > > > > >
> > > > > >     > > >> > > > >
> > > > > >     > > >> > > >
> > > > > >     > > >> > >
> > > > > >     > > >> >
> > > > > >     > > >>
> > > > > >     > > >
> > > > > >     > >
> > > > > >     >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
