+1 to move to a private Travis account.

I can confirm that Ververica will sponsor a Travis CI plan that is
equivalent to, or a bit higher than, the previous ASF quota (10 concurrent
build queues).
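
As a rough sanity check on the figures quoted further down in this thread
(the 5800 build hours reported by INFRA, the 25 queued PRs at about 2h each,
and the travis-ci.com plan prices), here is a minimal Python sketch. The
30-day month is an assumption, and the variable names are purely
illustrative:

```python
# Sanity check of capacity and cost figures quoted in this thread.
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month

# INFRA's rough metric (INFRA-18533): build hours Flink used in one month.
flink_build_hours = 5800
servers_equivalent = flink_build_hours / HOURS_PER_MONTH

# Queue estimate from the thread: 25 PRs at roughly 2h per build.
queued_prs = 25
hours_per_build = 2
queue_hours = queued_prs * hours_per_build

# travis-ci.com plan prices as quoted in the thread (USD per month).
plan_5_concurrency = 249
plan_10_concurrency = 489
yearly_cost_5 = plan_5_concurrency * 12
yearly_cost_10 = plan_10_concurrency * 12

print(f"servers running 24/7: {servers_equivalent:.2f}")
print(f"queued build hours:   {queue_hours}")
print(f"yearly cost (5):      ${yearly_cost_5}")
print(f"yearly cost (10):     ${yearly_cost_10}")
```

By this estimate, a plan with 10 concurrent builds sits a little above the
roughly eight servers' worth of build time that INFRA reported.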

Best,
Stephan

On Thu, Jul 4, 2019 at 10:46 AM Chesnay Schepler <ches...@apache.org> wrote:

> I've raised a JIRA
> <https://issues.apache.org/jira/browse/INFRA-18703> with INFRA to inquire
> whether it would be possible to switch to a different Travis account,
> and if so what steps would need to be taken.
> We need a proper confirmation from INFRA since we are not in full
> control of the flink repository (for example, we cannot access the
> settings page).
>
> If this is indeed possible, Ververica is willing to sponsor a Travis
> account for the Flink project.
> This would provide us with more than enough resources.
>
> Since this makes the project more reliant on resources provided by
> external companies I would like to vote on this.
>
> Please vote on this proposal, as follows:
> [ ] +1, Approve the migration to a Ververica-sponsored Travis account,
> provided that INFRA approves
> [ ] -1, Do not approve the migration to a Ververica-sponsored Travis
> account
>
> The vote will be open for at least 24h, and until we have confirmation
> from INFRA. The voting period may be shorter than the usual 3 days since
> our current CI setup is effectively not working.
>
> On 04/07/2019 06:51, Bowen Li wrote:
> > Re: > Are they using their own Travis CI pool, or did they switch to an
> > entirely different CI service?
> >
> > I reached out to Wes and Krisztián from Apache Arrow PMC. They are
> > currently moving away from ASF's Travis to their own in-house metal
> > machines at [1] with custom CI application at [2]. They've seen
> > significant improvement w.r.t both much higher performance and
> > basically no resource waiting time, "night-and-day" difference quoting
> > Wes.
> >
> > Re: > If we can just switch to our own Travis pool, just for our
> > project, then this might be something we can do fairly quickly?
> >
> > I believe so, according to [3] and [4]
> >
> >
> > [1] https://ci.ursalabs.org/
> > [2] https://github.com/ursa-labs/ursabot
> > [3]
> > https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> > [4] https://docs.travis-ci.com/user/migrate/open-source-on-travis-ci-com
> >
> >
> >
> > On Wed, Jul 3, 2019 at 12:01 AM Chesnay Schepler <ches...@apache.org>
> > wrote:
> >
> >     Are they using their own Travis CI pool, or did they switch to an
> >     entirely different CI service?
> >
> >     If we can just switch to our own Travis pool, just for our
> >     project, then
> >     this might be something we can do fairly quickly?
> >
> >     On 03/07/2019 05:55, Bowen Li wrote:
> >     > I responded in the INFRA ticket [1] that I believe they are using
> >     > the wrong metric against Flink, and that total build time is a
> >     > completely different thing from guaranteed build capacity.
> >     >
> >     > My response:
> >     >
> >     > "As mentioned above, since I started to pay attention to Flink's
> >     > build queue a few tens of days ago, I (in Seattle) have seen no
> >     > build kick off for Flink during PST daytime on weekdays. Our
> >     > teammates in China and Europe have also reported similar
> >     > observations. So we need to evaluate where the large total build
> >     > time came from - if 1) your number and 2) our observations from
> >     > three locations that cover pretty much a full day are all true,
> >     > I **guess** one reason can be that the extra build time most
> >     > likely came from weekends, when other Apache projects may be idle
> >     > and Flink just drains its congested queue hard.
> >     >
> >     > Please be aware that we're not complaining about the lack of
> >     > resources in general; I'm complaining about the lack of **stable,
> >     > dedicated** resources. An example of the latter: currently, even
> >     > if no build is in Flink's queue and I submit a request that is at
> >     > the head of the queue in the PST morning, my build won't even
> >     > start for 6-8+h. That is an absurd amount of waiting time.
> >     >
> >     > That is to say, if ASF INFRA decides to adopt a quota system and
> >     > grants Flink five DEDICATED servers that run all the time only for
> >     > Flink, that'll be PERFECT and can totally solve our problem now.
> >     >
> >     > I feel what's missing in ASF INFRA's Travis resource pool is some
> >     > level of build capacity SLAs and certainty"
> >     >
> >     >
> >     > Again, I believe these two problems differ in nature: long build
> >     > time vs. lack of dedicated build resources. That is to say,
> >     > shortening build time may relieve the situation, or it may not.
> >     > I'm slightly negative on disabling IT cases for PRs, because the
> >     > downside is that we risk bugs in PRs that UTs don't catch, which
> >     > may cost a lot more to fix, especially if they slow down or even
> >     > block others, but I am open to others' opinions on it.
> >     >
> >     > AFAICT from the INFRA ticket [1], donating to ASF INFRA won't
> >     > solve our problem, since INFRA's pool is fully shared and they
> >     > have no control over, or finer insight into, resource allocation
> >     > to a specific Apache project. As mentioned in [1], Apache Arrow is
> >     > moving away from the ASF INFRA Travis pool (they are actually
> >     > surprised Flink hasn't planned to do so). I know that Spark is on
> >     > its own build infra. If we all agree on funding our own build
> >     > infra, I'd be glad to help investigate any potential options
> >     > after releasing 1.9, since I'm super busy with 1.9 now.
> >     >
> >     > [1] https://issues.apache.org/jira/browse/INFRA-18533
> >     >
> >     >
> >     >
> >     > On Tue, Jul 2, 2019 at 4:46 AM Chesnay Schepler
> >     > <ches...@apache.org> wrote:
> >     >
> >     >> As a short-term stopgap, since we can assume this issue will
> >     >> become much worse in the following days/weeks, we could disable IT
> >     >> cases in PRs and only run them on master.
> >     >>
> >     >> On 02/07/2019 12:03, Chesnay Schepler wrote:
> >     >>> People really have to stop thinking that just because something
> >     >>> works for us it is also a good solution.
> >     >>> Also, please remember that our builds run for 2h from start to
> >     >>> finish, and not the 14 _minutes_ it takes for Zeppelin.
> >     >>> We are dealing with an entirely different scale here, both in
> >     >>> terms of build times and number of builds.
> >     >>>
> >     >>> In this very thread people have been complaining about long queue
> >     >>> times for their builds. Surprise, other Apache projects have been
> >     >>> suffering the very same thing due to us not controlling our build
> >     >>> times. While switching services (be it Jenkins, CircleCI or
> >     >>> whatever) will possibly work for us (and these options are
> >     >>> actually attractive, like CircleCI's proper support for build
> >     >>> artifacts), it will also likely result in us negatively affecting
> >     >>> other projects in significant ways.
> >     >>>
> >     >>> Sure, the Jenkins setup has a good user experience for us, at the
> >     >>> cost of blocking Jenkins workers for a _lot_ of time. Right now we
> >     >>> have 25 PRs in our queue; that's possibly 50h we'd consume of
> >     >>> Jenkins resources, and the European contributors haven't even
> >     >>> really started yet.
> >     >>>
> >     >>> FYI, the latest INFRA response from INFRA-18533:
> >     >>>
> >     >>> "Our rough metrics shows that Flink used over 5800 hours of build
> >     >>> time last month. That is equal to EIGHT servers running 24/7 for
> >     >>> the ENTIRE MONTH. EIGHT. nonstop.
> >     >>> When we discovered this last night, we discussed it some and are
> >     >>> going to tune down Flink to allow only five executors maximum. We
> >     >>> cannot allow Flink to consume so much of a Foundation shared
> >     >>> resource."
> >     >>>
> >     >>> So yes, we either
> >     >>> a) have to heavily reduce our CI usage or
> >     >>> b) fund our own, either maintaining it ourselves or donating to
> >     >>> Apache.
> >     >>>
> >     >>> On 02/07/2019 05:11, Bowen Li wrote:
> >     >>>> By looking at the git history of the Jenkins script, its core
> >     >>>> part was finished in March 2017 (with only two minor updates in
> >     >>>> 2017/2018), so it's been running for over two years now and it
> >     >>>> feels like the Zeppelin community has been quite happy with it.
> >     >>>> @Jeff Zhang <zjf...@gmail.com> can you share your insights and
> >     >>>> user experience with the Jenkins+Travis approach?
> >     >>>>
> >     >>>> Things like:
> >     >>>>
> >     >>>> - has the approach completely solved the resource capacity
> >     >>>> problem for the Zeppelin community? is the Zeppelin community
> >     >>>> happy with the result?
> >     >>>> - is the whole configuration chain stable (e.g. uptime) enough?
> >     >>>> - how often do you need to maintain the Jenkins infra? how many
> >     >>>> people are usually involved in maintenance and bug-fixes?
> >     >>>>
> >     >>>> The downside of this approach seems to me to be mostly the
> >     >>>> maintenance - maintaining the script and the Jenkins infra.
> >     >>>>
> >     >>>> ** Having Our Own Travis-CI.com Account **
> >     >>>>
> >     >>>> Another alternative I've been thinking of is to have our own
> >     >>>> travis-ci.com account with paid dedicated resources. Note that
> >     >>>> travis-ci.org is the free version and travis-ci.com is the
> >     >>>> commercial version. We currently use a shared resource pool
> >     >>>> managed by the ASF INFRA team on travis-ci.org, but we have no
> >     >>>> control over it - we can't see how it's configured, how many
> >     >>>> resources are available, how resources are allocated among
> >     >>>> Apache projects, etc. The nice things about having an account on
> >     >>>> travis-ci.com are:
> >     >>>>
> >     >>>> - relatively low cost with a much better resource guarantee than
> >     >>>> what we currently have [1]: $249/month with 5 dedicated
> >     >>>> concurrency, $489/month with 10 concurrency
> >     >>>> - low maintenance work compared to using Jenkins
> >     >>>> - (potentially) no migration cost according to Travis's doc [2]
> >     >>>> (pending verification)
> >     >>>> - full control over the build capacity/configuration compared to
> >     >>>> using ASF INFRA's pool
> >     >>>>
> >     >>>> I'd be surprised if we, as such a vibrant community, cannot find
> >     >>>> and fund $249*12=$2988 a year in exchange for a much better
> >     >>>> developer experience and much higher productivity.
> >     >>>>
> >     >>>> [1] https://travis-ci.com/plans
> >     >>>> [2]
> >     >>>> https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> >     >>>>
> >     >>>> On Sat, Jun 29, 2019 at 8:39 AM Chesnay Schepler
> >     >>>> <ches...@apache.org> wrote:
> >     >>>>
> >     >>>>      So yes, the Jenkins job keeps pulling the state from Travis
> >     >>>>      until it finishes.
> >     >>>>
> >     >>>>      Not sure I'm comfortable with the idea of using Jenkins
> >     >>>>      workers just to idle for several hours.
> >     >>>>
> >     >>>>      On 29/06/2019 14:56, Jeff Zhang wrote:
> >     >>>>      > Here's what the Zeppelin community did: we made a Python
> >     >>>>      > script to check the build status of a pull request.
> >     >>>>      > Here's the script:
> >     >>>>      > https://github.com/apache/zeppelin/blob/master/travis_check.py
> >     >>>>      >
> >     >>>>      > And this is the script we use in the Jenkins build job.
> >     >>>>      >
> >     >>>>      > if [ -f "travis_check.py" ]; then
> >     >>>>      >    git log -n 1
> >     >>>>      >    STATUS=$(curl -s $BUILD_URL | grep -e "GitHub pull request.*from.*" | sed 's/.*GitHub pull request <a href=\"\(https[^"]*\).*from[^"]*.\(https[^"]*\).*/\1 \2/g')
> >     >>>>      >    AUTHOR=$(echo $STATUS | sed 's/.*[/]\(.*\)$/\1/g')
> >     >>>>      >    PR=$(echo $STATUS | awk '{print $1}' | sed 's/.*[/]\(.*\)$/\1/g')
> >     >>>>      >    #COMMIT=$(git log -n 1 | grep "^Merge:" | awk '{print $3}')
> >     >>>>      >    #if [ -z $COMMIT ]; then
> >     >>>>      >    #  COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> >     >>>>      >    #fi
> >     >>>>      >
> >     >>>>      >    # get commit hash from PR
> >     >>>>      >    COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> >     >>>>      >    sleep 30 # sleep few moment to wait travis starts the build
> >     >>>>      >    RET_CODE=0
> >     >>>>      >    python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> >     >>>>      >    if [ $RET_CODE -eq 2 ]; then # try with repository name when travis-ci is not available in the account
> >     >>>>      >      RET_CODE=0
> >     >>>>      >      AUTHOR=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep '"full_name":' | grep -v "apache/zeppelin" | sed 's/.*[:][^"]*["]\([^/]*\).*/\1/g')
> >     >>>>      >      python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> >     >>>>      >    fi
> >     >>>>      >
> >     >>>>      >    if [ $RET_CODE -eq 2 ]; then # fail with can't find build information in the travis
> >     >>>>      >      set +x
> >     >>>>      >      echo "-----------------------------------------------------"
> >     >>>>      >      echo "Looks like travis-ci is not configured for your fork."
> >     >>>>      >      echo "Please setup by swich on 'zeppelin' repository at https://travis-ci.org/profile and travis-ci."
> >     >>>>      >      echo "And then make sure 'Build branch updates' option is enabled in the settings https://travis-ci.org/${AUTHOR}/zeppelin/settings."
> >     >>>>      >      echo ""
> >     >>>>      >      echo "To trigger CI after setup, you will need ammend your last commit with"
> >     >>>>      >      echo "git commit --amend"
> >     >>>>      >      echo "git push your-remote HEAD --force"
> >     >>>>      >      echo ""
> >     >>>>      >      echo "See http://zeppelin.apache.org/contribution/contributions.html#continuous-integration ."
> >     >>>>      >    fi
> >     >>>>      >
> >     >>>>      >    exit $RET_CODE
> >     >>>>      > else
> >     >>>>      >    set +x
> >     >>>>      >    echo "travis_check.py does not exists"
> >     >>>>      >    exit 1
> >     >>>>      > fi
> >     >>>>      >
> >     >>>>      > Chesnay Schepler <ches...@apache.org> wrote on Sat, Jun 29,
> >     >>>>      > 2019 at 3:17 PM:
> >     >>>>      >
> >     >>>>      >> Does this imply that a Jenkins job is active as long as the
> >     >>>>      >> Travis build runs?
> >     >>>>      >>
> >     >>>>      >> On 26/06/2019 21:28, Bowen Li wrote:
> >     >>>>      >>> Hi,
> >     >>>>      >>>
> >     >>>>      >>> @Dawid, I think the "long test running" issue that I
> >     >>>>      >>> mentioned in the first email, as you guys also said,
> >     >>>>      >>> belongs to "a big effort which is much harder to
> >     >>>>      >>> accomplish in a short period of time and may deserve its
> >     >>>>      >>> own separate discussion". Thus I didn't include it in what
> >     >>>>      >>> we can do in the foreseeable short term.
> >     >>>>      >>>
> >     >>>>      >>> Besides, I don't think that's the ultimate reason for the
> >     >>>>      >>> lack of build resources. Even if the build is shortened to
> >     >>>>      >>> something like 2h, the problem I described, where no build
> >     >>>>      >>> machine works for 6 or more hours in PST daytime, will
> >     >>>>      >>> still happen, because no machine from ASF INFRA's pool is
> >     >>>>      >>> allocated to Flink. As I have paid close attention to the
> >     >>>>      >>> build queue over the past few weekdays, it's a pretty
> >     >>>>      >>> clear pattern now.
> >     >>>>      >>>
> >     >>>>      >>> **The ultimate root cause** for that is - we don't have
> >     >>>>      >>> any **dedicated** build resources that we can stably rely
> >     >>>>      >>> on. I'm actually OK with waiting for a long time if there
> >     >>>>      >>> are build requests running; it means at least we are
> >     >>>>      >>> making progress. But I'm not OK with no build resources. A
> >     >>>>      >>> better place I think we should aim for in the short term
> >     >>>>      >>> is to always have at least a central pool (can be 3 or 5)
> >     >>>>      >>> of machines dedicated to building Flink at any time, or
> >     >>>>      >>> maybe use users' resources.
> >     >>>>      >>>
> >     >>>>      >>> @Chesnay @Robert I synced with Jeff offline that the
> >     >>>>      >>> Zeppelin community is using a Jenkins job to automatically
> >     >>>>      >>> build on users' Travis accounts and link the result back
> >     >>>>      >>> to the GitHub PR. I guess the Jenkins job would fetch the
> >     >>>>      >>> latest upstream master and build the PR against it. Jeff
> >     >>>>      >>> has filed tickets to learn about and get access to the
> >     >>>>      >>> Jenkins infra. It'd be better to fully understand it first
> >     >>>>      >>> before judging this approach.
> >     >>>>      >>>
> >     >>>>      >>> I also heard good things about CircleCI, and ASF INFRA
> >     >>>>      >>> seems to have a pool of build capacity there too. It can
> >     >>>>      >>> be an alternative to consider.
> >     >>>>      >>>
> >     >>>>      >>> On Wed, Jun 26, 2019 at 12:44 AM Dawid Wysakowicz
> >     >>>>      >>> <dwysakow...@apache.org> wrote:
> >     >>>>      >>>
> >     >>>>      >>>> Sorry to jump in late, but I think Bowen missed the most
> >     >>>>      >>>> important point from Chesnay's previous message in the
> >     >>>>      >>>> summary. The ultimate reason for all the problems is that
> >     >>>>      >>>> the tests already take close to 2 hours to run.
> >     >>>>      >>>> I fully support this claim: "Unless people start caring
> >     >>>>      >>>> about test times before adding them, this issue cannot be
> >     >>>>      >>>> solved"
> >     >>>>      >>>>
> >     >>>>      >>>> This is also another reason why using users' Travis
> >     >>>>      >>>> accounts won't help. Every few weeks we reach the user's
> >     >>>>      >>>> time limit for a single profile. This makes the user's
> >     >>>>      >>>> builds simply fail, until we either properly decrease the
> >     >>>>      >>>> time the tests take (which I am not sure we ever did) or
> >     >>>>      >>>> postpone the problem by splitting into more profiles.
> >     >>>>      >>>> (Note that the ASF Travis account has higher time limits)
> >     >>>>      >>>>
> >     >>>>      >>>> Best,
> >     >>>>      >>>>
> >     >>>>      >>>> Dawid
> >     >>>>      >>>>
> >     >>>>      >>>> On 26/06/2019 09:36, Robert Metzger wrote:
> >     >>>>      >>>>> Do we know if using "the best" available hardware would
> >     >>>>      >>>>> improve the build times?
> >     >>>>      >>>>> Imagine we ran the build on machines with plenty of main
> >     >>>>      >>>>> memory to mount everything on a ramdisk, plus the latest
> >     >>>>      >>>>> CPU architecture?
> >     >>>>      >>>>>
> >     >>>>      >>>>> Throwing hardware at the problem could help reduce the
> >     >>>>      >>>>> time of an individual build, and using our own
> >     >>>>      >>>>> infrastructure would remove our dependency on Apache's
> >     >>>>      >>>>> Travis account (with the obvious downside of having to
> >     >>>>      >>>>> maintain the infrastructure).
> >     >>>>      >>>>> We could use an open source Travis alternative, to have a
> >     >>>>      >>>>> similar experience and make the migration easy.
> >     >>>>      >>>>>
> >     >>>>      >>>>>
> >     >>>>      >>>>> On Wed, Jun 26, 2019 at 9:34 AM Chesnay Schepler
> >     >>>>      >>>>> <ches...@apache.org> wrote:
> >     >>>>      >>>>>> From what I gathered, there's no special sauce that the
> >     >>>>      >>>>>> Zeppelin project uses which actually integrates a user's
> >     >>>>      >>>>>> Travis account into the PR.
> >     >>>>      >>>>>> They just disabled Travis for PRs. And that's kind of it.
> >     >>>>      >>>>>>
> >     >>>>      >>>>>> Naturally we can do this (duh) and save the ASF a fair
> >     >>>>      >>>>>> amount of resources, but there are downsides:
> >     >>>>      >>>>>>
> >     >>>>      >>>>>> The discoverability of the Travis check takes a
> >     >>>>      >>>>>> nose-dive. Either we require every contributor to always,
> >     >>>>      >>>>>> on every commit, also post a Travis build, or we have the
> >     >>>>      >>>>>> reviewer sift through the contributor's account to find
> >     >>>>      >>>>>> it.
> >     >>>>      >>>>>>
> >     >>>>      >>>>>> This is rather cumbersome. Additionally, it's also not
> >     >>>>      >>>>>> equivalent to having a PR build.
> >     >>>>      >>>>>>
> >     >>>>      >>>>>> A normal branch build takes a branch as is and tests it.
> >     >>>>      >>>>>> A PR build merges the branch into master, and then runs
> >     >>>>      >>>>>> it. (Fun fact: This is why a PR with merge conflicts is
> >     >>>>      >>>>>> not being run on Travis.)
> >     >>>>      >>>>>>
> >     >>>>      >>>>>> And ultimately, everyone can already make use of this
> >     >>>>      >>>>>> approach anyway.
> >     >>>>      >>>>>>
> >     >>>>      >>>>>> On 25/06/2019 08:02, Jark Wu wrote:
> >     >>>>      >>>>>>> Hi Jeff,
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>> Thanks for sharing the Zeppelin approach. I think it's a
> >     >>>>      >>>>>>> good idea to leverage users' Travis accounts.
> >     >>>>      >>>>>>> In this way, we can have almost unlimited concurrent
> >     >>>>      >>>>>>> build jobs, and developers can restart builds by
> >     >>>>      >>>>>>> themselves (currently only committers can restart a PR's
> >     >>>>      >>>>>>> build).
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>> But I'm still not very clear on how to integrate a
> >     >>>>      >>>>>>> user's Travis build into the Flink pull request's build
> >     >>>>      >>>>>>> automatically. Can you explain in more detail?
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>> Another question: does Travis only build branches for
> >     >>>>      >>>>>>> user accounts?
> >     >>>>      >>>>>>> My concern is that builds for PRs rebase the user's
> >     >>>>      >>>>>>> commits against the current master branch.
> >     >>>>      >>>>>>> This helps us find problems before merging. Builds for
> >     >>>>      >>>>>>> branches will lose the impact of new commits in master.
> >     >>>>      >>>>>>> How does Zeppelin solve this problem?
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>> Thanks again for sharing the idea.
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>> Regards,
> >     >>>>      >>>>>>> Jark
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>> On Tue, 25 Jun 2019 at 11:01, Jeff Zhang
> >     >>>>      >>>>>>> <zjf...@gmail.com> wrote:
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>       Hi Folks,
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>       Zeppelin met this kind of issue before; we solved
> >     >>>>      >>>>>>>       it by delegating each contributor's PR build to
> >     >>>>      >>>>>>>       their own Travis account (everyone can have 5 free
> >     >>>>      >>>>>>>       slots for Travis builds).
> >     >>>>      >>>>>>>       The Apache account's Travis build is only
> >     >>>>      >>>>>>>       triggered when a PR is merged.
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>       Kurt Young <ykt...@gmail.com> wrote on Tue,
> >     >>>>      >>>>>>>       Jun 25, 2019 at 10:16 AM:
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>       > (Forgot to cc George)
> >     >>>>      >>>>>>>       >
> >     >>>>      >>>>>>>       > Best,
> >     >>>>      >>>>>>>       > Kurt
> >     >>>>      >>>>>>>       >
> >     >>>>      >>>>>>>       >
> >     >>>>      >>>>>>>       > On Tue, Jun 25, 2019 at 10:16 AM Kurt Young
> >     >>>>      >>>>>>>       > <ykt...@gmail.com> wrote:
> >     >>>>      >>>>>>>       >
> >     >>>>      >>>>>>>       > > Hi Bowen,
> >     >>>>      >>>>>>>       > >
> >     >>>>      >>>>>>>       > > Thanks for bringing this up. We have actually
> >     >>>>      >>>>>>>       > > discussed this, and I think Till and George
> >     >>>>      >>>>>>>       > > have already spent some time investigating it.
> >     >>>>      >>>>>>>       > > I have cc'ed both of them, and maybe they can
> >     >>>>      >>>>>>>       > > share their findings.
> >     >>>>      >>>>>>>       > >
> >     >>>>      >>>>>>>       > > Best,
> >     >>>>      >>>>>>>       > > Kurt
> >     >>>>      >>>>>>>       > >
> >     >>>>      >>>>>>>       > >
> >     >>>>      >>>>>>>       > > On Tue, Jun 25, 2019 at 10:08 AM Jark Wu
> >     >>>>      >>>>>>>       > > <imj...@gmail.com> wrote:
> >     >>>>      >>>>>>>       > >
> >     >>>>      >>>>>>>       > >> Hi Bowen,
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> Thanks for bringing this up. We have also
> >     >>>>      >>>>>>>       > >> suffered from the long build time.
> >     >>>>      >>>>>>>       > >> I agree that we should focus on solving the
> >     >>>>      >>>>>>>       > >> build capacity problem in this thread.
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> My observation is that only one build is
> >     >>>>      >>>>>>>       > >> running, and all the others (other PRs,
> >     >>>>      >>>>>>>       > >> master) are pending.
> >     >>>>      >>>>>>>       > >> The pricing plan [1] of Travis shows it can
> >     >>>>      >>>>>>>       > >> support concurrent build jobs.
> >     >>>>      >>>>>>>       > >> But I don't know which plan we are using; it
> >     >>>>      >>>>>>>       > >> might be the free plan for open source.
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> I've cc'ed Chesnay, who may have some experience with
> >     >>>>      >>>>>>>       > >> Travis.
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> Regards,
> >     >>>>      >>>>>>>       > >> Jark
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> [1]: https://travis-ci.com/plans
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> On Tue, 25 Jun 2019 at 08:11, Bowen Li
> >     >>>>      >>>>>>>       > >> <bowenl...@gmail.com> wrote:
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >> > Hi Steven,
> >     >>>>      >>>>>>>       > >> >
> >     >>>>      >>>>>>>       > >> > I think you may not have read what I wrote. The
> >     >>>>      >>>>>>>       > >> > discussion is about "unstable build **capacity**",
> >     >>>>      >>>>>>>       > >> > in other words "unstable / lack of build resources",
> >     >>>>      >>>>>>>       > >> > not "unstable build".
> >     >>>>      >>>>>>>       > >> >
> >     >>>>      >>>>>>>       > >> > On Mon, Jun 24, 2019 at 4:40 PM Steven Wu
> >     >>>>      >>>>>>>       > >> > <stevenz...@gmail.com> wrote:
> >     >>>>      >>>>>>>       > >> >
> >     >>>>      >>>>>>>       > >> > > A long and sometimes unstable build is definitely a
> >     >>>>      >>>>>>>       > >> > > pain point.
> >     >>>>      >>>>>>>       > >> > >
> >     >>>>      >>>>>>>       > >> > > I suspect the build failure here in
> >     >>>>      >>>>>>>       > >> > > flink-connector-kafka is not related to my change,
> >     >>>>      >>>>>>>       > >> > > but there is no easy way to re-run the build from
> >     >>>>      >>>>>>>       > >> > > the Travis UI. A Google search turned up a trick:
> >     >>>>      >>>>>>>       > >> > > closing and reopening the PR triggers a rebuild,
> >     >>>>      >>>>>>>       > >> > > but that could add noise to the PR activity.
> >     >>>>      >>>>>>>       > >> > >
> >     >>>>      >>>>>>>       > >> > > https://travis-ci.org/apache/flink/jobs/545555519
> >     >>>>      >>>>>>>       > >> > >
> >     >>>>      >>>>>>>       > >> > > Travis CI for my personal repo often failed after
> >     >>>>      >>>>>>>       > >> > > 4+ hours for exceeding the time limit:
> >     >>>>      >>>>>>>       > >> > > "The job exceeded the maximum time limit for jobs,
> >     >>>>      >>>>>>>       > >> > > and has been terminated."
> >     >>>>      >>>>>>>       > >> > >
> >     >>>>      >>>>>>>       > >> > > On Mon, Jun 24, 2019 at 4:15 PM Bowen Li
> >     >>>>      >>>>>>>       > >> > > <bowenl...@gmail.com> wrote:
> >     >>>>      >>>>>>>       > >> > >
> >     >>>>      >>>>>>>       > >> > > > https://travis-ci.org/apache/flink/builds/549681530
> >     >>>>      >>>>>>>       > >> > > > This build request has been sitting at the **HEAD
> >     >>>>      >>>>>>>       > >> > > > of the queue** since I first saw it at 10:30am PST
> >     >>>>      >>>>>>>       > >> > > > (not sure how long it had been there before
> >     >>>>      >>>>>>>       > >> > > > 10:30am). It's 4:12pm PST now and it hasn't
> >     >>>>      >>>>>>>       > >> > > > started yet.
> >     >>>>      >>>>>>>       > >> > > >
> >     >>>>      >>>>>>>       > >> > > > On Mon, Jun 24, 2019 at 2:48 PM Bowen Li
> >     >>>>      >>>>>>>       > >> > > > <bowenl...@gmail.com> wrote:
> >     >>>>      >>>>>>>       > >> > > >
> >     >>>>      >>>>>>>       > >> > > > > Hi devs,
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > I've been experiencing the pain resulting from
> >     >>>>      >>>>>>>       > >> > > > > the lack of stable build capacity on Travis for
> >     >>>>      >>>>>>>       > >> > > > > Flink PRs [1]. Specifically, I've often noticed
> >     >>>>      >>>>>>>       > >> > > > > that no build in the queue makes any progress
> >     >>>>      >>>>>>>       > >> > > > > for hours, and then suddenly 5 or 6 builds kick
> >     >>>>      >>>>>>>       > >> > > > > off all together after the long pause. I'm in
> >     >>>>      >>>>>>>       > >> > > > > the PST (UTC-08) time zone, and I've seen the
> >     >>>>      >>>>>>>       > >> > > > > pause last as long as 6 hours, from 9am to 3pm
> >     >>>>      >>>>>>>       > >> > > > > PST (let alone the time needed to drain the
> >     >>>>      >>>>>>>       > >> > > > > queue afterwards).
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > I think this has greatly impacted our
> >     >>>>      >>>>>>>       > >> > > > > productivity. In my experience, PRs submitted in
> >     >>>>      >>>>>>>       > >> > > > > the early morning PST don't finish their builds
> >     >>>>      >>>>>>>       > >> > > > > until late at night the same day.
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > So my questions are:
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > - Has anyone else experienced the same problem
> >     >>>>      >>>>>>>       > >> > > > > or made similar observations on TravisCI? (I
> >     >>>>      >>>>>>>       > >> > > > > suspect it has something to do with the time
> >     >>>>      >>>>>>>       > >> > > > > zone.)
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > - Which TravisCI pricing plan is Flink currently
> >     >>>>      >>>>>>>       > >> > > > > using? Is it the free plan for open source
> >     >>>>      >>>>>>>       > >> > > > > projects? What is the guaranteed build capacity
> >     >>>>      >>>>>>>       > >> > > > > of the current plan?
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > - If the current pricing plan (either free or
> >     >>>>      >>>>>>>       > >> > > > > paid) can't provide stable build capacity, can
> >     >>>>      >>>>>>>       > >> > > > > we upgrade to a higher priced plan with larger
> >     >>>>      >>>>>>>       > >> > > > > and more stable build capacity?
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > BTW, another factor that contributes to the
> >     >>>>      >>>>>>>       > >> > > > > productivity problem is that our build is slow:
> >     >>>>      >>>>>>>       > >> > > > > we run a full build for every PR, and a
> >     >>>>      >>>>>>>       > >> > > > > successful full build takes ~5h. We definitely
> >     >>>>      >>>>>>>       > >> > > > > have more options to solve it, for instance
> >     >>>>      >>>>>>>       > >> > > > > modularizing the build graph and reusing
> >     >>>>      >>>>>>>       > >> > > > > artifacts from the previous build. But I think
> >     >>>>      >>>>>>>       > >> > > > > that would be a big effort that is much harder
> >     >>>>      >>>>>>>       > >> > > > > to accomplish in a short period of time, and it
> >     >>>>      >>>>>>>       > >> > > > > may deserve its own separate discussion.
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > > [1] https://travis-ci.org/apache/flink/pull_requests
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > > >
> >     >>>>      >>>>>>>       > >> > > >
> >     >>>>      >>>>>>>       > >> > >
> >     >>>>      >>>>>>>       > >> >
> >     >>>>      >>>>>>>       > >>
> >     >>>>      >>>>>>>       > >
> >     >>>>      >>>>>>>       >
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>       --
> >     >>>>      >>>>>>>       Best Regards
> >     >>>>      >>>>>>>
> >     >>>>      >>>>>>>       Jeff Zhang
> >     >>>>      >>>>>>>
> >     >>>>      >>
> >     >>>>
> >     >>>
> >     >>
> >
>
>
