Do we know whether using "the best" available hardware would improve the build
times?
For example, what if we ran the build on machines with the latest CPU
architecture and enough main memory to mount everything on a ramdisk?

Throwing hardware at the problem could help reduce the time of an
individual build, and using our own infrastructure would remove our
dependency on Apache's Travis account (with the obvious downside of having
to maintain the infrastructure)
We could use an open source travis alternative, to have a similar
experience and make the migration easy.


On Wed, Jun 26, 2019 at 9:34 AM Chesnay Schepler <ches...@apache.org> wrote:

> From what I gathered, there's no special sauce that the Zeppelin
> project uses which actually integrates a user's Travis account into the PR.
>
> They just disabled Travis for PRs. And that's kind of it.
>
> Naturally we can do this (duh) and save the ASF a fair amount of
> resources, but there are downsides:
>
> The discoverability of the Travis check takes a nose-dive. Either we
> require every contributor to always, on every commit, also post a Travis
> build, or we have the reviewer sift through the contributor's account to
> find it.
>
> This is rather cumbersome. Additionally, it's also not equivalent to
> having a PR build.
>
> A normal branch build takes a branch as is and tests it. A PR build
> merges the branch into master, and then tests the result. (Fun fact: This
> is why a PR with merge conflicts is not being run on Travis.)
>
> And ultimately, everyone can already make use of this approach anyway.
>
> On 25/06/2019 08:02, Jark Wu wrote:
> > Hi Jeff,
> >
> > Thanks for sharing the Zeppelin approach. I think it's a good idea to
> > leverage users' Travis accounts.
> > In this way, we can have almost unlimited concurrent build jobs and
> > developers can restart builds themselves (currently only committers
> > can restart a PR's build).
> >
> > But I'm still not very clear on how to integrate a user's Travis build
> > into the Flink pull request's build automatically. Can you explain in
> > more detail?
> >
> > Another question: does Travis only build branches for user accounts?
> > My concern is that PR builds rebase the user's commits against the
> > current master branch, which helps us find problems before merging,
> > whereas branch builds miss the impact of new commits in master.
> > How does Zeppelin solve this problem?
> >
> > Thanks again for sharing the idea.
> >
> > Regards,
> > Jark
> >
> > On Tue, 25 Jun 2019 at 11:01, Jeff Zhang <zjf...@gmail.com> wrote:
> >
> >     Hi Folks,
> >
> >     Zeppelin met this kind of issue before. We solved it by delegating
> >     each contributor's PR build to their own Travis account (everyone
> >     gets 5 free slots for Travis builds).
> >     The Apache account's Travis build is only triggered when a PR is
> >     merged.
> >
> >
> >
> >     On Tue, Jun 25, 2019 at 10:16 AM Kurt Young <ykt...@gmail.com> wrote:
> >
> >     > (Forgot to cc George)
> >     >
> >     > Best,
> >     > Kurt
> >     >
> >     >
> >     > On Tue, Jun 25, 2019 at 10:16 AM Kurt Young <ykt...@gmail.com> wrote:
> >     >
> >     > > Hi Bowen,
> >     > >
> >     > > Thanks for bringing this up. We have actually discussed this, and I
> >     > > think Till and George have already spent some time investigating it.
> >     > > I have cc'ed both of them, and maybe they can share their findings.
> >     > >
> >     > > Best,
> >     > > Kurt
> >     > >
> >     > >
> >     > > On Tue, Jun 25, 2019 at 10:08 AM Jark Wu <imj...@gmail.com> wrote:
> >     > >
> >     > >> Hi Bowen,
> >     > >>
> >     > >> Thanks for bringing this up. We have also suffered from the long
> >     > >> build time. I agree that we should focus on solving the build
> >     > >> capacity problem in this thread.
> >     > >>
> >     > >> My observation is that only one build is running at a time; all the
> >     > >> others (other PRs, master) are pending.
> >     > >> The Travis pricing plans [1] show that it can support concurrent
> >     > >> build jobs. But I don't know which plan we are using; it might be
> >     > >> the free plan for open source.
> >     > >>
> >     > >> I cc'ed Chesnay, who may have some experience with Travis.
> >     > >>
> >     > >> Regards,
> >     > >> Jark
> >     > >>
> >     > >> [1]: https://travis-ci.com/plans
> >     > >>
> >     > >> On Tue, 25 Jun 2019 at 08:11, Bowen Li <bowenl...@gmail.com> wrote:
> >     > >>
> >     > >> > Hi Steven,
> >     > >> >
> >     > >> > I think you may not have read what I wrote. The discussion is about
> >     > >> > "unstable build **capacity**", in other words "unstable / lack of
> >     > >> > build resources", not "unstable build".
> >     > >> >
> >     > >> > On Mon, Jun 24, 2019 at 4:40 PM Steven Wu <stevenz...@gmail.com> wrote:
> >     > >> >
> >     > >> > > A long and sometimes unstable build is definitely a pain point.
> >     > >> > >
> >     > >> > > I suspect the build failure here in flink-connector-kafka is not
> >     > >> > > related to my change, but there is no easy way to re-run the build
> >     > >> > > in the Travis UI. A Google search turned up a trick: closing and
> >     > >> > > reopening the PR triggers a rebuild. But that could add noise to
> >     > >> > > the PR activity.
> >     > >> > > https://travis-ci.org/apache/flink/jobs/545555519
> >     > >> > >
> >     > >> > > travis-ci builds for my personal repo often fail after 4+ hours
> >     > >> > > for exceeding the time limit:
> >     > >> > > The job exceeded the maximum time limit for jobs, and has been
> >     > >> > > terminated.
> >     > >> > >
> >     > >> > > On Mon, Jun 24, 2019 at 4:15 PM Bowen Li <bowenl...@gmail.com> wrote:
> >     > >> > >
> >     > >> > > > https://travis-ci.org/apache/flink/builds/549681530
> >     > >> > > > This build request has been sitting at the **HEAD of the queue**
> >     > >> > > > since I first saw it at 10:30am PST (not sure how long it had
> >     > >> > > > been there before 10:30am). It's 4:12pm PST now and it hasn't
> >     > >> > > > started yet.
> >     > >> > > >
> >     > >> > > > On Mon, Jun 24, 2019 at 2:48 PM Bowen Li <bowenl...@gmail.com> wrote:
> >     > >> > > >
> >     > >> > > > > Hi devs,
> >     > >> > > > >
> >     > >> > > > > I've been experiencing the pain resulting from the lack of stable
> >     > >> > > > > build capacity on Travis for Flink PRs [1]. Specifically, I have
> >     > >> > > > > often noticed that no build in the queue makes any progress for
> >     > >> > > > > hours, and then suddenly 5 or 6 builds kick off together after
> >     > >> > > > > the long pause. I'm in the PST (UTC-08) time zone, and I've seen
> >     > >> > > > > pauses as long as 6 hours, from 9am to 3pm PST (let alone the
> >     > >> > > > > time needed to drain the queue afterwards).
> >     > >> > > > >
> >     > >> > > > > I think this has greatly impacted our productivity. In my
> >     > >> > > > > experience, PRs submitted in the early morning PST won't finish
> >     > >> > > > > their build until late at night the same day.
> >     > >> > > > >
> >     > >> > > > > So my questions are:
> >     > >> > > > >
> >     > >> > > > > - Has anyone else experienced the same problem or made a similar
> >     > >> > > > > observation on TravisCI? (I suspect it has something to do with
> >     > >> > > > > the time zone)
> >     > >> > > > >
> >     > >> > > > > - What pricing plan of TravisCI is Flink currently using? Is it
> >     > >> > > > > the free plan for open source projects? What is the guaranteed
> >     > >> > > > > build capacity of the current plan?
> >     > >> > > > >
> >     > >> > > > > - If the current pricing plan (either free or paid) can't provide
> >     > >> > > > > stable build capacity, can we upgrade to a higher priced plan
> >     > >> > > > > with larger and more stable build capacity?
> >     > >> > > > >
> >     > >> > > > > BTW, another factor that contributes to the productivity problem
> >     > >> > > > > is that our build is slow: we run a full build for every PR, and
> >     > >> > > > > a successful full build takes ~5h. We definitely have more
> >     > >> > > > > options to solve it, for instance, modularizing the build graph
> >     > >> > > > > and reusing artifacts from the previous build. But I think that
> >     > >> > > > > would be a big effort that is much harder to accomplish in a
> >     > >> > > > > short period of time and may deserve its own separate discussion.
> >     > >> > > > >
> >     > >> > > > > [1] https://travis-ci.org/apache/flink/pull_requests
> >     > >> > > > >
> >     > >> > > > >
> >     > >> > > >
> >     > >> > >
> >     > >> >
> >     > >>
> >     > >
> >     >
> >
> >
> >     --
> >     Best Regards
> >
> >     Jeff Zhang
> >
>
>
