Wow. That's great! Thanks Chesnay.

On Fri, 2 Aug 2019 at 17:50, Chesnay Schepler <ches...@apache.org> wrote:

> I'm currently modifying the cibot to do this automatically; it should be
> finished by Monday.
>
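A minimal sketch of the selection step such automation would need, assuming
the builds are available as lines of "build_id commit state" (a hypothetical
format; this is not how the ci-bot is implemented). It prints the ids of
builds that are still pending or running for anything other than the latest
commit, which are the candidates to cancel:

```shell
# Read "build_id commit state" lines on stdin (hypothetical format) and
# print the ids of unfinished builds that do not belong to the latest commit.
stale_builds() {
  latest="$1"   # hash of the most recently pushed commit
  awk -v latest="$latest" \
    '$2 != latest && ($3 == "pending" || $3 == "running") { print $1 }'
}
```

Each printed id would then be passed to whatever cancel mechanism is
available (for example, cancelling by hand in the Travis UI).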
> On 02/08/2019 07:41, Jark Wu wrote:
> > Hi Chesnay,
> >
> > Can we give Flink committers permissions on the flink-ci/flink repo?
> > Several times, when I pushed new commits, the old build jobs were still
> > pending and not canceled.
> > Until that is fixed, we could manually cancel old jobs to save build
> > resources.
> >
> > Best,
> > Jark
> >
> >
> > On Wed, 10 Jul 2019 at 16:17, Chesnay Schepler <ches...@apache.org>
> wrote:
> >
> >> Your best bet would be to check the first commit in the PR and check the
> >> parent commit.
> >>
> >> To re-run things, you will have to rebase the PR on the latest master.
> >>
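A minimal sketch of that check, assuming the PR branch is checked out
locally and the upstream branch is named something like origin/master (both
names are assumptions). It prints the parent of the first commit that is on
the PR branch but not on upstream, i.e. the commit the PR was based on:

```shell
# Print the commit a PR branch was based on: the parent of the oldest
# commit that is reachable from the branch but not from upstream.
pr_base_commit() {
  upstream="$1"        # e.g. origin/master (assumed name)
  branch="${2:-HEAD}"  # PR branch, defaults to the current checkout
  first=$(git rev-list --reverse "$upstream..$branch" | head -n 1)
  [ -n "$first" ] || return 1   # the branch has no commits of its own
  git rev-parse "$first^"
}
```

If the printed hash differs from `git rev-parse origin/master` after a fresh
fetch, the CI result was produced against an older master.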
> >> On 10/07/2019 03:32, Kurt Young wrote:
> >>> Thanks for all your efforts Chesnay, it indeed improves our development
> >>> experience a lot. BTW, do you know how to find out which master commit
> >>> the CI ran against?
> >>>
> >>> For example, like this one:
> >>> https://travis-ci.com/flink-ci/flink/jobs/214542568
> >>> It shows the commits as passing, rebased on master as of the time the CI
> >>> was triggered. But the master commit the CI ran on may be the same as the
> >>> current master or a different one. If it's the same, I can simply rely on
> >>> the passed result to push the commits; if it's not, I think I should find
> >>> another way to re-trigger the tests based on the newest master.
> >>>
> >>> Do you know where I can get such information?
> >>>
> >>> Best,
> >>> Kurt
> >>>
> >>>
> >>> On Tue, Jul 9, 2019 at 3:27 AM Chesnay Schepler <ches...@apache.org>
> >> wrote:
> >>>> The kinks have been worked out; the bot is running again and pr builds
> >>>> are yet again no longer running on ASF resources.
> >>>>
> >>>> PRs are mirrored to: https://github.com/flink-ci/flink
> >>>> Bot source: https://github.com/flink-ci/ci-bot
> >>>>
> >>>> On 08/07/2019 17:14, Chesnay Schepler wrote:
> >>>>> I have temporarily re-enabled running PR builds on the ASF account;
> >>>>> migrating to the Travis subscription caused some issues in the bot
> >>>>> that I have to fix first.
> >>>>>
> >>>>> On 07/07/2019 23:01, Chesnay Schepler wrote:
> >>>>>> The vote has passed unanimously in favor of migrating to a separate
> >>>>>> Travis account.
> >>>>>>
> >>>>>> I will now set things up such that PullRequests are no longer run on
> >>>>>> the ASF servers.
> >>>>>> This is a major step in reducing our usage of ASF resources.
> >>>>>> For the time being we'll use the free Travis plan for flink-ci (i.e. 5
> >>>>>> workers, which is the same as the ASF gives us). Over the course of the
> >>>>>> next week we'll set up the Ververica subscription to increase this limit.
> >>>>>> From now on, a bot will mirror all new and updated PullRequests to a
> >>>>>> mirror repository (https://github.com/flink-ci/flink-ci) and write an
> >>>>>> update into the PR once the build is complete.
> >>>>>> I have run the bots for the past 3 days in parallel to our existing
> >>>>>> Travis and it was working without major issues.
> >>>>>>
> >>>>>> The biggest change that contributors will see is that there's no
> >>>>>> longer an icon next to each commit. We may revisit this in the future.
> >>>>>>
> >>>>>> I'll set up a repo with the source of the bot later.
> >>>>>>
> >>>>>> On 04/07/2019 10:46, Chesnay Schepler wrote:
> >>>>>>> I've raised a JIRA
> >>>>>>> <https://issues.apache.org/jira/browse/INFRA-18703> with INFRA to
> >>>>>>> inquire whether it would be possible to switch to a different Travis
> >>>>>>> account, and if so what steps would need to be taken.
> >>>>>>> We need a proper confirmation from INFRA since we are not in full
> >>>>>>> control of the flink repository (for example, we cannot access the
> >>>>>>> settings page).
> >>>>>>>
> >>>>>>> If this is indeed possible, Ververica is willing to sponsor a Travis
> >>>>>>> account for the Flink project.
> >>>>>>> This would provide us with more than enough resources.
> >>>>>>>
> >>>>>>> Since this makes the project more reliant on resources provided by
> >>>>>>> external companies I would like to vote on this.
> >>>>>>>
> >>>>>>> Please vote on this proposal, as follows:
> >>>>>>> [ ] +1, Approve the migration to a Ververica-sponsored Travis
> >>>>>>> account, provided that INFRA approves
> >>>>>>> [ ] -1, Do not approve the migration to a Ververica-sponsored
> >>>>>>> Travis account
> >>>>>>>
> >>>>>>> The vote will be open for at least 24h, and until we have
> >>>>>>> confirmation from INFRA. The voting period may be shorter than the
> >>>>>>> usual 3 days since our current setup is effectively not working.
> >>>>>>>
> >>>>>>> On 04/07/2019 06:51, Bowen Li wrote:
> >>>>>>>> Re: > Are they using their own Travis CI pool, or did they switch to
> >>>>>>>> an entirely different CI service?
> >>>>>>>>
> >>>>>>>> I reached out to Wes and Krisztián from Apache Arrow PMC. They are
> >>>>>>>> currently moving away from ASF's Travis to their own in-house
> metal
> >>>>>>>> machines at [1] with custom CI application at [2]. They've seen
> >>>>>>>> significant improvement w.r.t both much higher performance and
> >>>>>>>> basically no resource waiting time, "night-and-day" difference
> >>>>>>>> quoting Wes.
> >>>>>>>>
> >>>>>>>> Re: > If we can just switch to our own Travis pool, just for our
> >>>>>>>> project, then this might be something we can do fairly quickly?
> >>>>>>>>
> >>>>>>>> I believe so, according to [3] and [4]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> [1] https://ci.ursalabs.org/ <https://ci.ursalabs.org/#/>
> >>>>>>>> [2] https://github.com/ursa-labs/ursabot
> >>>>>>>> [3] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> >>>>>>>> [4] https://docs.travis-ci.com/user/migrate/open-source-on-travis-ci-com
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Jul 3, 2019 at 12:01 AM Chesnay Schepler
> >>>>>>>> <ches...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>       Are they using their own Travis CI pool, or did they switch to an
> >>>>>>>>       entirely different CI service?
> >>>>>>>>
> >>>>>>>>       If we can just switch to our own Travis pool, just for our
> >>>>>>>>       project, then
> >>>>>>>>       this might be something we can do fairly quickly?
> >>>>>>>>
> >>>>>>>>       On 03/07/2019 05:55, Bowen Li wrote:
> >>>>>>>>       > I responded in the INFRA ticket [1] that I believe they
> are
> >>>>>>>>       using a wrong
> >>>>>>>>       > metric against Flink and the total build time is a
> completely
> >>>>>>>>       different
> >>>>>>>>       > thing than guaranteed build capacity.
> >>>>>>>>       >
> >>>>>>>>       > My response:
> >>>>>>>>       >
> >>>>>>>>       > "As mentioned above, I'm in Seattle, and since I started paying
> >>>>>>>>       > attention to Flink's build queue a few tens of days ago I have seen
> >>>>>>>>       > no builds kicking off for Flink during PST daytime on weekdays. Our
> >>>>>>>>       > teammates in China and Europe have reported similar observations.
> >>>>>>>>       > So we need to evaluate where the large total build time came from.
> >>>>>>>>       > If 1) your number and 2) our observations from three locations that
> >>>>>>>>       > cover pretty much a full day are all true, I **guess** that the
> >>>>>>>>       > extra build time most likely came from weekends, when other Apache
> >>>>>>>>       > projects may be idle and Flink just drains its congested queue.
> >>>>>>>>       >
> >>>>>>>>       > Please be aware that we're not complaining about the lack of
> >>>>>>>>       > resources in general; I'm complaining about the lack of **stable,
> >>>>>>>>       > dedicated** resources. An example of the latter: currently, even if
> >>>>>>>>       > no build is in Flink's queue and I submit a request that is at the
> >>>>>>>>       > head of the queue in the PST morning, my build won't even start for
> >>>>>>>>       > 6-8+h. That is an absurd amount of waiting time.
> >>>>>>>>       >
> >>>>>>>>       > That is to say, if ASF INFRA decides to adopt a quota system and
> >>>>>>>>       > grants Flink five DEDICATED servers that run all the time only for
> >>>>>>>>       > Flink, that'll be PERFECT and can totally solve our problem now.
> >>>>>>>>       >
> >>>>>>>>       > I feel what's missing in the ASF INFRA's Travis resource
> >> pool is
> >>>>>>>>       some level
> >>>>>>>>       > of build capacity SLAs and certainty"
> >>>>>>>>       >
> >>>>>>>>       >
> >>>>>>>>       > Again, I believe these two problems differ in nature: long build
> >>>>>>>>       > time vs. lack of dedicated build resources. That is to say,
> >>>>>>>>       > shortening build time may relieve the situation, or may not. I'm
> >>>>>>>>       > slightly negative on disabling IT cases for PRs, because the
> >>>>>>>>       > downside is that we are at risk of bugs in PRs that UTs don't
> >>>>>>>>       > catch, which may cost a lot more to fix and may slow down or even
> >>>>>>>>       > block others, but I'm open to others' opinions on it.
> >>>>>>>>       >
> >>>>>>>>       > AFAICT from the INFRA ticket [1], donating to ASF INFRA won't be a
> >>>>>>>>       > feasible way to solve our problem, since INFRA's pool is fully
> >>>>>>>>       > shared and they have no control over or finer insight into resource
> >>>>>>>>       > allocation to a specific Apache project. As mentioned in [1], Apache
> >>>>>>>>       > Arrow is moving away from the ASF INFRA Travis pool (they are
> >>>>>>>>       > actually surprised Flink hasn't planned to do so). I know that Spark
> >>>>>>>>       > is on its own build infra. If we all agree on funding our own build
> >>>>>>>>       > infra, I'd be glad to help investigate potential options after
> >>>>>>>>       > releasing 1.9, since I'm super busy with 1.9 now.
> >>>>>>>>       >
> >>>>>>>>       > [1] https://issues.apache.org/jira/browse/INFRA-18533
> >>>>>>>>       >
> >>>>>>>>       >
> >>>>>>>>       >
> >>>>>>>>       > On Tue, Jul 2, 2019 at 4:46 AM Chesnay Schepler
> >>>>>>>>       > <ches...@apache.org> wrote:
> >>>>>>>>       >
> >>>>>>>>       >> As a short-term stopgap, since we can assume this issue
> to
> >>>>>>>>       become much
> >>>>>>>>       >> worse in the following days/weeks, we could disable IT
> >> cases in
> >>>>>>>>       PRs and
> >>>>>>>>       >> only run them on master.
> >>>>>>>>       >>
> >>>>>>>>       >> On 02/07/2019 12:03, Chesnay Schepler wrote:
> >>>>>>>>       >>> People really have to stop thinking that just because
> >>>>>>>>       something works
> >>>>>>>>       >>> for us it is also a good solution.
> >>>>>>>>       >>> Also, please remember that our builds run for 2h from
> >> start to
> >>>>>>>>       finish,
> >>>>>>>>       >>> and not the 14 _minutes_ it takes for zeppelin.
> >>>>>>>>       >>> We are dealing with an entirely different scale here,
> both
> >> in
> >>>>>>>>       terms of
> >>>>>>>>       >>> build times and number of builds.
> >>>>>>>>       >>>
> >>>>>>>>       >>> In this very thread people have been complaining about
> long
> >>>>>>>> queue
> >>>>>>>>       >>> times for their builds. Surprise, other Apache projects
> >>>>>>>> have been
> >>>>>>>>       >>> suffering the very same thing due to us not controlling
> our
> >>>>>>>> build
> >>>>>>>>       >>> times. While switching services (be it Jenkins,
> CircleCI or
> >>>>>>>>       whatever)
> >>>>>>>>       >>> will possibly work for us (and these options are
> actually
> >>>>>>>>       attractive,
> >>>>>>>>       >>> like CircleCI's proper support for build artifacts), it
> >>>>>>>> will also
> >>>>>>>>       >>> result in us likely negatively affecting other projects
> in
> >>>>>>>>       significant
> >>>>>>>>       >>> ways.
> >>>>>>>>       >>>
> >>>>>>>>       >>> Sure, the Jenkins setup has a good user experience for
> us,
> >> at
> >>>>>>>>       the cost
> >>>>>>>>       >>> of blocking Jenkins workers for a _lot_ of time. Right
> now
> >> we
> >>>>>>>>       have 25
> >>>>>>>>       >>> PR's in our queue; that's possibly 50h we'd consume of
> >> Jenkins
> >>>>>>>>       >>> resources, and the European contributors haven't even
> >> really
> >>>>>>>>       started yet.
> >>>>>>>>       >>>
> >>>>>>>>       >>> FYI, the latest INFRA response from INFRA-18533:
> >>>>>>>>       >>>
> >>>>>>>>       >>> "Our rough metrics shows that Flink used over 5800
> hours of
> >>>>>>>>       build time
> >>>>>>>>       >>> last month. That is equal to EIGHT servers running 24/7
> for
> >>>>>>>>       the ENTIRE
> >>>>>>>>       >>> MONTH. EIGHT. nonstop.
> >>>>>>>>       >>> When we discovered this last night, we discussed it some
> >> and
> >>>>>>>>       are going
> >>>>>>>>       >>> to tune down Flink to allow only five executors
> maximum. We
> >>>>>>>> cannot
> >>>>>>>>       >>> allow Flink to consume so much of a Foundation shared
> >>>>>>>> resource."
> >>>>>>>>       >>>
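The two figures in this message can be sanity-checked with a little
arithmetic. A sketch, assuming a 30-day month and the roughly 2h per build
mentioned above; the function names are made up for illustration:

```shell
# How many servers running 24/7 a given number of build hours amounts to.
server_equivalents() {
  # $1 = total build hours in the month, $2 = days in that month
  LC_ALL=C awk -v h="$1" -v d="$2" 'BEGIN { printf "%.2f\n", h / (24 * d) }'
}

# Worker hours needed to drain a queue of PR builds.
queue_hours() {
  # $1 = number of queued PRs, $2 = hours per build
  echo $(( $1 * $2 ))
}

server_equivalents 5800 30   # prints 8.06, i.e. INFRA's "EIGHT servers"
queue_hours 25 2             # prints 50, the "possibly 50h" above
```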
> >>>>>>>>       >>> So yes, we either
> >>>>>>>>       >>> a) have to heavily reduce our CI usage or
> >>>>>>>>       >>> b) fund our own, either maintaining it ourselves or
> >> donating
> >>>>>>>>       to Apache.
> >>>>>>>>       >>>
> >>>>>>>>       >>> On 02/07/2019 05:11, Bowen Li wrote:
> >>>>>>>>       >>>> By looking at the git history of the Jenkins script, its core part
> >>>>>>>>       >>>> was finished in March 2017 (with only two minor updates in
> >>>>>>>>       >>>> 2017/2018), so it's been running for over two years now and it
> >>>>>>>>       >>>> feels like the Zeppelin community has been quite happy with it.
> >>>>>>>>       >>>> @Jeff Zhang <zjf...@gmail.com> can you share your insights and user
> >>>>>>>>       >>>> experience with the Jenkins+Travis approach?
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> Things like:
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> - has the approach completely solved the resource capacity problem
> >>>>>>>>       >>>>   for the Zeppelin community? is the Zeppelin community happy with
> >>>>>>>>       >>>>   the result?
> >>>>>>>>       >>>> - is the whole configuration chain stable enough (e.g. uptime)?
> >>>>>>>>       >>>> - how often do you need to maintain the Jenkins infra? how many
> >>>>>>>>       >>>>   people are usually involved in maintenance and bug-fixes?
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> To me, the downside of this approach seems to be mostly
> >>>>>>>>       >>>> maintenance: maintaining the script and the Jenkins infra.
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> ** Having Our Own Travis-CI.com Account **
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> Another alternative I've been thinking of is to have our own
> >>>>>>>>       >>>> travis-ci.com account with paid dedicated resources. Note that
> >>>>>>>>       >>>> travis-ci.org is the free version and travis-ci.com is the
> >>>>>>>>       >>>> commercial version. We currently use a shared resource pool managed
> >>>>>>>>       >>>> by the ASF INFRA team on travis-ci.org, but we have no control over
> >>>>>>>>       >>>> it - we can't see how it's configured, how many resources are
> >>>>>>>>       >>>> available, how resources are allocated among Apache projects, etc.
> >>>>>>>>       >>>> The nice things about having an account on travis-ci.com are:
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> - relatively low cost with a much better resource guarantee than
> >>>>>>>>       >>>>   what we currently have [1]: $249/month with 5 dedicated
> >>>>>>>>       >>>>   concurrent jobs, $489/month with 10 concurrent jobs
> >>>>>>>>       >>>> - low maintenance work compared to using Jenkins
> >>>>>>>>       >>>> - (potentially) no migration cost according to Travis's doc [2]
> >>>>>>>>       >>>>   (pending verification)
> >>>>>>>>       >>>> - full control over the build capacity/configuration compared to
> >>>>>>>>       >>>>   using ASF INFRA's pool
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> I'd be surprised if we as such a vibrant community cannot find and
> >>>>>>>>       >>>> fund $249*12=$2988 a year in exchange for a much better developer
> >>>>>>>>       >>>> experience and much higher productivity.
> >>>>>>>>       >>>>
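For comparison, the yearly cost of both plans quoted above, as a small
sketch (the prices are the ones given in this message; the function name is
made up for illustration):

```shell
# Yearly cost in USD for a given monthly plan price.
annual_cost() {
  echo $(( $1 * 12 ))
}

annual_cost 249   # 5 concurrent jobs: prints 2988
annual_cost 489   # 10 concurrent jobs: prints 5868
```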
> >>>>>>>>       >>>> [1] https://travis-ci.com/plans
> >>>>>>>>       >>>> [2] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> >>>>>>>>       >>>>
> >>>>>>>>       >>>> On Sat, Jun 29, 2019 at 8:39 AM Chesnay Schepler
> >>>>>>>>       >>>> <ches...@apache.org> wrote:
> >>>>>>>>       >>>>
> >>>>>>>>       >>>>      So yes, the Jenkins job keeps pulling the state from Travis
> >>>>>>>>       >>>>      until it finishes.
> >>>>>>>>       >>>>
> >>>>>>>>       >>>>      Not sure I'm comfortable with the idea of using Jenkins workers
> >>>>>>>>       >>>>      just to idle for several hours.
> >>>>>>>>       >>>>
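To make the concern concrete, here is a minimal sketch of such a polling
loop, assuming a status command that prints "pending", "passed" or "failed"
(the command and these state names are hypothetical; this is not the
Zeppelin script). The worker running it is occupied for the whole loop,
even though it only sleeps:

```shell
# Poll a status command until it reports a final state or we give up.
# Usage: poll_until_done MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
poll_until_done() {
  max_attempts="$1"; delay="$2"; shift 2
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    state=$("$@")            # hypothetical: prints pending/passed/failed
    case "$state" in
      passed) return 0 ;;
      failed) return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"           # the worker is blocked but idle here
  done
  return 2                   # still pending after MAX_ATTEMPTS: timed out
}
```

With a 2h build and a 60-second delay this would hold a Jenkins executor
for about 120 iterations per PR.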
> >>>>>>>>       >>>>      On 29/06/2019 14:56, Jeff Zhang wrote:
> >>>>>>>>       >>>>      > Here's what the Zeppelin community did: we made a Python script
> >>>>>>>>       >>>>      > to check the build status of a pull request.
> >>>>>>>>       >>>>      > Here's the script:
> >>>>>>>>       >>>>      > https://github.com/apache/zeppelin/blob/master/travis_check.py
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      > And this is the script we use in the Jenkins build job.
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      > if [ -f "travis_check.py" ]; then
> >>>>>>>>       >>>>      >    git log -n 1
> >>>>>>>>       >>>>      >    STATUS=$(curl -s $BUILD_URL | grep -e "GitHub pull request.*from.*" | sed 's/.*GitHub pull request <a href=\"\(https[^"]*\).*from[^"]*.\(https[^"]*\).*/\1 \2/g')
> >>>>>>>>       >>>>      >    AUTHOR=$(echo $STATUS | sed 's/.*[/]\(.*\)$/\1/g')
> >>>>>>>>       >>>>      >    PR=$(echo $STATUS | awk '{print $1}' | sed 's/.*[/]\(.*\)$/\1/g')
> >>>>>>>>       >>>>      >    #COMMIT=$(git log -n 1 | grep "^Merge:" | awk '{print $3}')
> >>>>>>>>       >>>>      >    #if [ -z $COMMIT ]; then
> >>>>>>>>       >>>>      >    #  COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> >>>>>>>>       >>>>      >    #fi
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      >    # get commit hash from PR
> >>>>>>>>       >>>>      >    COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> >>>>>>>>       >>>>      >    sleep 30 # sleep few moment to wait travis starts the build
> >>>>>>>>       >>>>      >    RET_CODE=0
> >>>>>>>>       >>>>      >    python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> >>>>>>>>       >>>>      >    if [ $RET_CODE -eq 2 ]; then # try with repository name when travis-ci is not available in the account
> >>>>>>>>       >>>>      >      RET_CODE=0
> >>>>>>>>       >>>>      >      AUTHOR=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep '"full_name":' | grep -v "apache/zeppelin" | sed 's/.*[:][^"]*["]\([^/]*\).*/\1/g')
> >>>>>>>>       >>>>      >      python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> >>>>>>>>       >>>>      >    fi
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      >    if [ $RET_CODE -eq 2 ]; then # fail with can't find build information in the travis
> >>>>>>>>       >>>>      >      set +x
> >>>>>>>>       >>>>      >      echo "-----------------------------------------------------"
> >>>>>>>>       >>>>      >      echo "Looks like travis-ci is not configured for your fork."
> >>>>>>>>       >>>>      >      echo "Please setup by swich on 'zeppelin' repository at https://travis-ci.org/profile and travis-ci."
> >>>>>>>>       >>>>      >      echo "And then make sure 'Build branch updates' option is enabled in the settings https://travis-ci.org/${AUTHOR}/zeppelin/settings."
> >>>>>>>>       >>>>      >      echo ""
> >>>>>>>>       >>>>      >      echo "To trigger CI after setup, you will need ammend your last commit with"
> >>>>>>>>       >>>>      >      echo "git commit --amend"
> >>>>>>>>       >>>>      >      echo "git push your-remote HEAD --force"
> >>>>>>>>       >>>>      >      echo ""
> >>>>>>>>       >>>>      >      echo "See http://zeppelin.apache.org/contribution/contributions.html#continuous-integration."
> >>>>>>>>       >>>>      >    fi
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      >    exit $RET_CODE
> >>>>>>>>       >>>>      > else
> >>>>>>>>       >>>>      >    set +x
> >>>>>>>>       >>>>      >    echo "travis_check.py does not exists"
> >>>>>>>>       >>>>      >    exit 1
> >>>>>>>>       >>>>      > fi
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      > Chesnay Schepler <ches...@apache.org> wrote on Sat, Jun 29, 2019
> >>>>>>>>       >>>>      > at 3:17 PM:
> >>>>>>>>       >>>>      >
> >>>>>>>>       >>>>      >> Does this imply that a Jenkins job is active as
> >> long
> >>>>>>>>       as the
> >>>>>>>>       >>>>      Travis build
> >>>>>>>>       >>>>      >> runs?
> >>>>>>>>       >>>>      >>
> >>>>>>>>       >>>>      >> On 26/06/2019 21:28, Bowen Li wrote:
> >>>>>>>>       >>>>      >>> Hi,
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>> @Dawid, I think the "long test running" I mentioned in the
> >>>>>>>>       >>>>      >>> first email, as you guys also said, belongs to "a big effort
> >>>>>>>>       >>>>      >>> which is much harder to accomplish in a short period of time
> >>>>>>>>       >>>>      >>> and may deserve its own separate discussion". Thus I didn't
> >>>>>>>>       >>>>      >>> include it in what we can do in the foreseeable short term.
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>> Besides, I don't think that's the ultimate reason for the lack
> >>>>>>>>       >>>>      >>> of build resources. Even if the build is shortened to something
> >>>>>>>>       >>>>      >>> like 2h, the problem I described, that no build machine works
> >>>>>>>>       >>>>      >>> for about 6 or more hours in PST daytime, will still happen,
> >>>>>>>>       >>>>      >>> because no machine from ASF INFRA's pool is allocated to Flink.
> >>>>>>>>       >>>>      >>> As I have paid close attention to the build queue over the past
> >>>>>>>>       >>>>      >>> few weekdays, it's a pretty clear pattern now.
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>> **The ultimate root cause** for that is that we don't have any
> >>>>>>>>       >>>>      >>> **dedicated** build resources that we can stably rely on. I'm
> >>>>>>>>       >>>>      >>> actually ok to wait for a long time if there are build requests
> >>>>>>>>       >>>>      >>> running; it means at least we are making progress. But I'm not
> >>>>>>>>       >>>>      >>> ok with no build resource. A better place I think we should aim
> >>>>>>>>       >>>>      >>> at in the short term is to always have at least a central pool
> >>>>>>>>       >>>>      >>> (can be 3 or 5) of machines dedicated to building Flink at any
> >>>>>>>>       >>>>      >>> time, or maybe use users' resources.
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>> @Chesnay @Robert I synced with Jeff offline: the Zeppelin
> >>>>>>>>       >>>>      >>> community is using a Jenkins job to automatically build on
> >>>>>>>>       >>>>      >>> users' Travis accounts and link the result back to the GitHub
> >>>>>>>>       >>>>      >>> PR. I guess the Jenkins job would fetch the latest upstream
> >>>>>>>>       >>>>      >>> master and build the PR against it. Jeff has filed tickets to
> >>>>>>>>       >>>>      >>> learn about and get access to the Jenkins infra. It'll be
> >>>>>>>>       >>>>      >>> better to fully understand it first before judging this
> >>>>>>>>       >>>>      >>> approach.
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>> I also heard good things about CircleCI, and ASF INFRA seems to
> >>>>>>>>       >>>>      >>> have a pool of build capacity there too. It can be an
> >>>>>>>>       >>>>      >>> alternative to consider.
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>> On Wed, Jun 26, 2019 at 12:44 AM Dawid Wysakowicz
> >>>>>>>>       >>>>      >>> <dwysakow...@apache.org> wrote:
> >>>>>>>>       >>>>      >>>
> >>>>>>>>       >>>>      >>>> Sorry to jump in late, but I think Bowen missed the most
> >>>>>>>>       >>>>      >>>> important point from Chesnay's previous message in the
> >>>>>>>>       >>>>      >>>> summary. The ultimate reason for all the problems is that the
> >>>>>>>>       >>>>      >>>> tests take close to 2 hours to run already. I fully support
> >>>>>>>>       >>>>      >>>> this claim: "Unless people start caring about test times
> >>>>>>>>       >>>>      >>>> before adding them, this issue cannot be solved"
> >>>>>>>>       >>>>      >>>>
> >>>>>>>>       >>>>      >>>> This is also another reason why using user's Travis account
> >>>>>>>>       >>>>      >>>> won't help. Every few weeks we reach the user's time limit for
> >>>>>>>>       >>>>      >>>> a single profile. This makes the user's builds simply fail,
> >>>>>>>>       >>>>      >>>> until we either properly decrease the time the tests take
> >>>>>>>>       >>>>      >>>> (which I am not sure we ever did) or postpone the problem by
> >>>>>>>>       >>>>      >>>> splitting into more profiles. (Note that the ASF Travis
> >>>>>>>>       >>>>      >>>> account has higher time limits)
> >>>>>>>>       >>>>      >>>>
> >>>>>>>>       >>>>      >>>> Best,
> >>>>>>>>       >>>>      >>>>
> >>>>>>>>       >>>>      >>>> Dawid
> >>>>>>>>       >>>>      >>>>
> >>>>>>>>       >>>>      >>>> On 26/06/2019 09:36, Robert Metzger wrote:
> >>>>>>>>       >>>>      >>>>> Do we know if using "the best" available hardware would
> >>>>>>>>       >>>>      >>>>> improve the build times?
> >>>>>>>>       >>>>      >>>>> Imagine we would run the build on machines with plenty of
> >>>>>>>>       >>>>      >>>>> main memory to mount everything to ramdisk + the latest CPU
> >>>>>>>>       >>>>      >>>>> architecture?
> >>>>>>>>       >>>>      >>>>>
> >>>>>>>>       >>>>      >>>>> Throwing hardware at the problem could help reduce the time
> >>>>>>>>       >>>>      >>>>> of an individual build, and using our own infrastructure
> >>>>>>>>       >>>>      >>>>> would remove our dependency on Apache's Travis account (with
> >>>>>>>>       >>>>      >>>>> the obvious downside of having to maintain the
> >>>>>>>>       >>>>      >>>>> infrastructure)
> >>>>>>>>       >>>>      >>>>> We could use an open source travis alternative, to have a
> >>>>>>>>       >>>>      >>>>> similar experience and make the migration easy.
> >>>>>>>>       >>>>      >>>>>
> >>>>>>>>       >>>>      >>>>>
> >>>>>>>>       >>>>      >>>>> On Wed, Jun 26, 2019 at 9:34 AM Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>>>>       >>>>      >>>>>> From what I gathered, there's no special
> >>>>>>>>       sauce that the
> >>>>>>>>       >>>>      Zeppelin
> >>>>>>>>       >>>>      >>>>>> project uses which actually integrates a
> user's
> >>>>>>>> Travis
> >>>>>>>>       >>>>      account into the
> >>>>>>>>       >>>>      >>>> PR.
> >>>>>>>>       >>>>      >>>>>> They just disabled Travis for PRs. And
> that's
> >>>>>>>>       kind of it.
> >>>>>>>>       >>>>      >>>>>>
> >>>>>>>>       >>>>      >>>>>> Naturally we can do this (duh) and save the
> >> ASF a
> >>>>>>>>       fair
> >>>>>>>>       >>>>      amount of
> >>>>>>>>       >>>>      >>>>>> resources, but there are downsides:
> >>>>>>>>       >>>>      >>>>>>
> >>>>>>>>       >>>>      >>>>>> The discoverability of the Travis check
> takes
> >> a
> >>>>>>>>       nose-dive.
> >>>>>>>>       >>>>      Either we
> >>>>>>>>       >>>>      >>>>>> require every contributor to always, on
> every
> >>>>>>>>       commit, also
> >>>>>>>>       >>>>      post a
> >>>>>>>>       >>>>      >> Travis
> >>>>>>>>       >>>>      >>>>>> build, or we have the reviewer sift through
> >> the
> >>>>>>>>       >>>>      contributors account
> >>>>>>>>       >>>>      >> to
> >>>>>>>>       >>>>      >>>>>> find it.
> >>>>>>>>       >>>>      >>>>>>
> >>>>>>>>       >>>>      >>>>>> This is rather cumbersome. Additionally,
> it's
> >>>>>>>>       also not
> >>>>>>>>       >>>>      equivalent to
> >>>>>>>>       >>>>      >>>>>> having a PR build.
> >>>>>>>>       >>>>      >>>>>>
> >>>>>>>>       >>>>      >>>>>> A normal branch build takes a branch as is
> and
> >>>>>>>>       tests it. A
> >>>>>>>>       >>>>      PR build
> >>>>>>>>       >>>>      >>>>>> merges the branch into master, and then
> runs
> >> it.
> >>>>>>>>       (Fun fact:
> >>>>>>>>       >>>>      This is
> >>>>>>>>       >>>>      >> why
> >>>>>>>>       >>>>      >>>>>> a PR with merge conflicts is not being
> run
> >> on
> >>>>>>>>       Travis.)
> >>>>>>>>       >>>>      >>>>>>
> >>>>>>>>       >>>>      >>>>>> And ultimately, everyone can already make
> use
> >>>>>>>> of this
> >>>>>>>>       >>>>      approach anyway.
> >>>>>>>>       >>>>      >>>>>>
> >>>>>>>>       >>>>      >>>>>> On 25/06/2019 08:02, Jark Wu wrote:
> >>>>>>>>       >>>>      >>>>>>> Hi Jeff,
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>> Thanks for sharing the Zeppelin approach.
> I
> >>>>>>>>       think it's a
> >>>>>>>>       >>>>      good idea to
> >>>>>>>>       >>>>      >>>>>>> leverage user's travis account.
> >>>>>>>>       >>>>      >>>>>>> In this way, we can have almost unlimited
> >>>>>>>>       concurrent build
> >>>>>>>>       >>>>      jobs and
> >>>>>>>>       >>>>      >>>>>>> developers can restart build by themselves
> >>>>>>>>       (currently only
> >>>>>>>>       >>>>      committers
> >>>>>>>>       >>>>      >>>>>>> can restart PR's build).
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>> But I'm still not very clear how to
> integrate
> >>>>>>>> user's
> >>>>>>>>       >>>>      travis build
> >>>>>>>>       >>>>      >> into
> >>>>>>>>       >>>>      >>>>>>> the Flink pull request's build
> automatically.
> >>>>>>>>       Can you
> >>>>>>>>       >>>>      explain more in
> >>>>>>>>       >>>>      >>>>>>> detail?
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>> Another question: does travis only build
> >>>>>>>>       branches for user
> >>>>>>>>       >>>>      account?
> >>>>>>>>       >>>>      >>>>>>> My concern is that builds for PRs will
> rebase
> >>>>>>>> user's
> >>>>>>>>       >>>>      commits against
> >>>>>>>>       >>>>      >>>>>>> current master branch.
> >>>>>>>>       >>>>      >>>>>>> This will help us to find problems before
> >>>>>>>>       merge.  Builds
> >>>>>>>>       >>>>      for branches
> >>>>>>>>       >>>>      >>>>>>> will lose the impact of new commits in
> >> master.
> >>>>>>>>       >>>>      >>>>>>> How does Zeppelin solve this problem?
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>> Thanks again for sharing the idea.
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>> Regards,
> >>>>>>>>       >>>>      >>>>>>> Jark
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>> On Tue, 25 Jun 2019 at 11:01, Jeff Zhang <zjf...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>  Hi Folks,
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>  Zeppelin met this kind of issue before;
> we
> >>>>>>>> solved
> >>>>>>>>       >>>> it by
> >>>>>>>>       >>>>      >> delegating
> >>>>>>>>       >>>>      >>>>>>>  each
> >>>>>>>>       >>>>      >>>>>>>  one's PR build to his travis account
> >>>>>>>>       (Everyone can
> >>>>>>>>       >>>>      have 5 free
> >>>>>>>>       >>>>      >>>>>>>  slots for
> >>>>>>>>       >>>>      >>>>>>>  travis build).
> >>>>>>>>       >>>>      >>>>>>>  Apache account travis build is only
> >>>>>>>> triggered when
> >>>>>>>>       >>>>      PR is merged.
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>  Kurt Young <ykt...@gmail.com> wrote on Tue, Jun 25, 2019 at 10:16 AM:
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>  > (Forgot to cc George)
> >>>>>>>>       >>>>      >>>>>>>  >
> >>>>>>>>       >>>>      >>>>>>>  > Best,
> >>>>>>>>       >>>>      >>>>>>>  > Kurt
> >>>>>>>>       >>>>      >>>>>>>  >
> >>>>>>>>       >>>>      >>>>>>>  >
> >>>>>>>>       >>>>      >>>>>>>  > On Tue, Jun 25, 2019 at 10:16 AM Kurt Young <ykt...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>  >
> >>>>>>>>       >>>>      >>>>>>>  > > Hi Bowen,
> >>>>>>>>       >>>>      >>>>>>>  > >
> >>>>>>>>       >>>>      >>>>>>>  > > Thanks for bringing this up. We
> >>>>>>>>       actually have
> >>>>>>>>       >>>>      discussed
> >>>>>>>>       >>>>      >> about
> >>>>>>>>       >>>>      >>>>>>>  this, and I
> >>>>>>>>       >>>>      >>>>>>>  > > think Till and George have
> >>>>>>>>       >>>>      >>>>>>>  > > already spent some time investigating
> >>>>>>>>       it. I have
> >>>>>>>>       >>>>      cced both of
> >>>>>>>>       >>>>      >>>>>>>  them, and
> >>>>>>>>       >>>>      >>>>>>>  > > maybe they can share
> >>>>>>>>       >>>>      >>>>>>>  > > their findings.
> >>>>>>>>       >>>>      >>>>>>>  > >
> >>>>>>>>       >>>>      >>>>>>>  > > Best,
> >>>>>>>>       >>>>      >>>>>>>  > > Kurt
> >>>>>>>>       >>>>      >>>>>>>  > >
> >>>>>>>>       >>>>      >>>>>>>  > >
> >>>>>>>>       >>>>      >>>>>>>  > > On Tue, Jun 25, 2019 at 10:08 AM Jark Wu <imj...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>  > >
> >>>>>>>>       >>>>      >>>>>>>  > >> Hi Bowen,
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> Thanks for bringing this up. We also
> >>>>>>>>       suffered from
> >>>>>>>>       >>>>      the long
> >>>>>>>>       >>>>      >>>>>>>  build time.
> >>>>>>>>       >>>>      >>>>>>>  > >> I agree that we should focus on
> >>>>>>>>       solving build
> >>>>>>>>       >>>>      capacity
> >>>>>>>>       >>>>      >>>>>>>  problem in the
> >>>>>>>>       >>>>      >>>>>>>  > >> thread.
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> My observation is there is only one
> >>>>>>>>       build
> >>>>>>>>       >>>>      running, all
> >>>>>>>>       >>>>      >> the
> >>>>>>>>       >>>>      >>>>>>>  others
> >>>>>>>>       >>>>      >>>>>>>  > >> (other
> >>>>>>>>       >>>>      >>>>>>>  > >> PRs, master) are pending.
> >>>>>>>>       >>>>      >>>>>>>  > >> The pricing plan[1] of travis shows
> >>>>>>>>       it can
> >>>>>>>>       >>>> support
> >>>>>>>>       >>>>      >> concurrent
> >>>>>>>>       >>>>      >>>>>>>  build
> >>>>>>>>       >>>>      >>>>>>>  > jobs.
> >>>>>>>>       >>>>      >>>>>>>  > >> But I don't know which plan we are
> >>>>>>>>       using, might
> >>>>>>>>       >>>>      be the free
> >>>>>>>>       >>>>      >>>>>>>  plan for
> >>>>>>>>       >>>>      >>>>>>>  > open
> >>>>>>>>       >>>>      >>>>>>>  > >> source.
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> I cc-ed Chesnay who may have some
> >>>>>>>>       experience on
> >>>>>>>>       >>>>      Travis.
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> Regards,
> >>>>>>>>       >>>>      >>>>>>>  > >> Jark
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> [1]: https://travis-ci.com/plans
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> On Tue, 25 Jun 2019 at 08:11, Bowen Li <bowenl...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >> > Hi Steven,
> >>>>>>>>       >>>>      >>>>>>>  > >> >
> >>>>>>>>       >>>>      >>>>>>>  > >> > I think you may not have read what I
> >>>>>>>>       wrote. The
> >>>>>>>>       >>>>      discussion is
> >>>>>>>>       >>>>      >>>> about
> >>>>>>>>       >>>>      >>>>>>>  > "unstable
> >>>>>>>>       >>>>      >>>>>>>  > >> > build **capacity**", in other
> words
> >>>>>>>>       >>>>      "unstable / lack of
> >>>>>>>>       >>>>      >>>> build
> >>>>>>>>       >>>>      >>>>>>>  > >> resources",
> >>>>>>>>       >>>>      >>>>>>>  > >> > not "unstable build".
> >>>>>>>>       >>>>      >>>>>>>  > >> >
> >>>>>>>>       >>>>      >>>>>>>  > >> > On Mon, Jun 24, 2019 at 4:40 PM Steven Wu <stevenz...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>  > >> >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > long and sometimes unstable
> build
> >> is
> >>>>>>>>       >>>>      definitely a pain
> >>>>>>>>       >>>>      >>>>>> point.
> >>>>>>>>       >>>>      >>>>>>>  > >> > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > I suspect the build failure
> here in
> >>>>>>>>       >>>>      >> flink-connector-kafka
> >>>>>>>>       >>>>      >>>>>>>  is not
> >>>>>>>>       >>>>      >>>>>>>  > >> related
> >>>>>>>>       >>>>      >>>>>>>  > >> > to
> >>>>>>>>       >>>>      >>>>>>>  > >> > > my change, but there is no easy way to
> >>>>>>>>       re-run the
> >>>>>>>>       >>>>      build on
> >>>>>>>>       >>>>      >>>>>>>  travis UI.
> >>>>>>>>       >>>>      >>>>>>>  > Google
> >>>>>>>>       >>>>      >>>>>>>  > >> > > search showed a trick of
> >>>>>>>>       close-and-open the
> >>>>>>>>       >>>>      PR will
> >>>>>>>>       >>>>      >>>>>>>  trigger rebuild.
> >>>>>>>>       >>>>      >>>>>>>  > >> but
> >>>>>>>>       >>>>      >>>>>>>  > >> > > that could add noise to the PR
> >>>>>>>>       activities.
> >>>>>>>>       >>>>      >>>>>>>  > >> > >
> >>>>>>>>       >>>> https://travis-ci.org/apache/flink/jobs/545555519
> >>>>>>>>       >>>>      >>>>>>>  > >> > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > travis-ci for my personal repo
> >>>>>>>>       often failed
> >>>>>>>>       >>>>      with
> >>>>>>>>       >>>>      >>>>>>>  exceeding time
> >>>>>>>>       >>>>      >>>>>>>  > limit
> >>>>>>>>       >>>>      >>>>>>>  > >> > after
> >>>>>>>>       >>>>      >>>>>>>  > >> > > 4+ hours.
> >>>>>>>>       >>>>      >>>>>>>  > >> > > The job exceeded the maximum
> time
> >>>>>>>>       limit for
> >>>>>>>>       >>>>      jobs, and
> >>>>>>>>       >>>>      >> has
> >>>>>>>>       >>>>      >>>>>>>  been
> >>>>>>>>       >>>>      >>>>>>>  > >> > terminated.
> >>>>>>>>       >>>>      >>>>>>>  > >> > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > On Mon, Jun 24, 2019 at 4:15 PM Bowen Li <bowenl...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>  > >> > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > >
> >>>>>>>>       >>>> https://travis-ci.org/apache/flink/builds/549681530
> >>>>>>>>       >>>>      >>>>>>>  This build
> >>>>>>>>       >>>>      >>>>>>>  > >> > request
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > has
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > been sitting at **HEAD of the
> >>>>>>>>       queue**
> >>>>>>>>       >>>>      since I first
> >>>>>>>>       >>>>      >> saw
> >>>>>>>>       >>>>      >>>>>>>  it at PST
> >>>>>>>>       >>>>      >>>>>>>  > >> > 10:30am
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > (not sure how long it's been
> >>>>>>>>       there before
> >>>>>>>>       >>>>      10:30am).
> >>>>>>>>       >>>>      >>>>>>>  It's PST
> >>>>>>>>       >>>>      >>>>>>>  > 4:12pm
> >>>>>>>>       >>>>      >>>>>>>  > >> now
> >>>>>>>>       >>>>      >>>>>>>  > >> > > and
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > it hasn't started yet.
> >>>>>>>>       >>>>      >>>>>>>  > >> > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > On Mon, Jun 24, 2019 at 2:48 PM Bowen Li <bowenl...@gmail.com> wrote:
> >>>>>>>>       >>>>      >>>>>>>  > >> > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > Hi devs,
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > I've been experiencing the
> pain
> >>>>>>>>       >>>>      resulting from lack
> >>>>>>>>       >>>>      >>>>>>>  of stable
> >>>>>>>>       >>>>      >>>>>>>  > >> build
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > capacity on Travis for Flink
> >>>>>>>>       PRs [1].
> >>>>>>>>       >>>>      >> Specifically, I
> >>>>>>>>       >>>>      >>>>>>>  noticed
> >>>>>>>>       >>>>      >>>>>>>  > >> often
> >>>>>>>>       >>>>      >>>>>>>  > >> > > that
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > no
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > build in the queue is making
> >> any
> >>>>>>>>       >>>>      progress for
> >>>>>>>>       >>>>      >> hours,
> >>>>>>>>       >>>>      >>>> and
> >>>>>>>>       >>>>      >>>>>>>  > suddenly
> >>>>>>>>       >>>>      >>>>>>>  > >> 5
> >>>>>>>>       >>>>      >>>>>>>  > >> > or
> >>>>>>>>       >>>>      >>>>>>>  > >> > > 6
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > builds kick off all together
> >>>>>>>>       after the
> >>>>>>>>       >>>>      long pause.
> >>>>>>>>       >>>>      >>>>>>>  I'm at PST
> >>>>>>>>       >>>>      >>>>>>>  > >> > (UTC-08)
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > time
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > zone, and I've seen pause
> can
> >>>>>>>>       be as
> >>>>>>>>       >>>>      long as 6 hours
> >>>>>>>>       >>>>      >>>>>>>  from PST 9am
> >>>>>>>>       >>>>      >>>>>>>  > >> to
> >>>>>>>>       >>>>      >>>>>>>  > >> > 3pm
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > (let alone the time needed
> to
> >>>>>>>>       drain the
> >>>>>>>>       >>>>      queue
> >>>>>>>>       >>>>      >>>>>>>  afterwards).
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > I think this has greatly
> >>>>>>>>       impacted our
> >>>>>>>>       >>>>      productivity.
> >>>>>>>>       >>>>      >>>> I've
> >>>>>>>>       >>>>      >>>>>>>  > >> experienced
> >>>>>>>>       >>>>      >>>>>>>  > >> > > that
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > PRs submitted in the early
> >>>>>>>>       morning of
> >>>>>>>>       >>>>      PST time zone
> >>>>>>>>       >>>>      >>>>>>>  won't finish
> >>>>>>>>       >>>>      >>>>>>>  > >> > their
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > build until late night of
> the
> >>>>>>>>       same day.
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > So my questions are:
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > - Has anyone else
> experienced
> >>>>>>>>       the same
> >>>>>>>>       >>>>      problem or
> >>>>>>>>       >>>>      >>>>>>>  have similar
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > observation
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > on TravisCI? (I suspect it
> >>>>>>>>       has things
> >>>>>>>>       >>>>      to do with
> >>>>>>>>       >>>>      >> time
> >>>>>>>>       >>>>      >>>>>>>  zone)
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > - What pricing plan of
> >>>>>>>>       TravisCI is
> >>>>>>>>       >>>>      Flink currently
> >>>>>>>>       >>>>      >>>>>>>  using? Is it
> >>>>>>>>       >>>>      >>>>>>>  > >> the
> >>>>>>>>       >>>>      >>>>>>>  > >> > > free
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > plan for open source
> >>>>>>>>       projects? What
> >>>>>>>>       >>>> are the
> >>>>>>>>       >>>>      >>>>>>>  guaranteed build
> >>>>>>>>       >>>>      >>>>>>>  > >> capacity
> >>>>>>>>       >>>>      >>>>>>>  > >> > > of
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > the current plan?
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > - If the current pricing
> plan
> >>>>>>>>       (either
> >>>>>>>>       >>>>      free or paid)
> >>>>>>>>       >>>>      >>>>>> can't
> >>>>>>>>       >>>>      >>>>>>>  > provide
> >>>>>>>>       >>>>      >>>>>>>  > >> > > stable
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > build capacity, can we
> >>>>>>>>       upgrade to a
> >>>>>>>>       >>>>      higher priced
> >>>>>>>>       >>>>      >>>>>>>  plan with
> >>>>>>>>       >>>>      >>>>>>>  > larger
> >>>>>>>>       >>>>      >>>>>>>  > >> > and
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > more
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > stable build capacity?
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > BTW, another factor that
> >>>>>>>>       contributes to
> >>>>>>>>       >>>> the
> >>>>>>>>       >>>>      >>>>>>>  productivity problem
> >>>>>>>>       >>>>      >>>>>>>  > is
> >>>>>>>>       >>>>      >>>>>>>  > >> > that
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > our build is slow - we run
> >>>>>>>>       full build
> >>>>>>>>       >>>>      for every PR
> >>>>>>>>       >>>>      >>>> and a
> >>>>>>>>       >>>>      >>>>>>>  > >> successful
> >>>>>>>>       >>>>      >>>>>>>  > >> > > full
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > build takes ~5h. We
> >>>>>>>>       definitely have
> >>>>>>>>       >>>>      more options to
> >>>>>>>>       >>>>      >>>>>>>  solve it,
> >>>>>>>>       >>>>      >>>>>>>  > for
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > instance,
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > modularize the build graphs
> >>>>>>>>       and reuse
> >>>>>>>>       >>>>      artifacts
> >>>>>>>>       >>>>      >> from
> >>>>>>>>       >>>>      >>>> the
> >>>>>>>>       >>>>      >>>>>>>  > previous
> >>>>>>>>       >>>>      >>>>>>>  > >> > > build.
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > But I think that can be a
> big
> >>>>>>>>       effort
> >>>>>>>>       >>>>      which is much
> >>>>>>>>       >>>>      >>>>>>>  harder to
> >>>>>>>>       >>>>      >>>>>>>  > >> > accomplish
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > in
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > a short period of time and
> >>>>>>>>       may deserve
> >>>>>>>>       >>>>      its own
> >>>>>>>>       >>>>      >>>> separate
> >>>>>>>>       >>>>      >>>>>>>  > >> discussion.
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > > [1]
> >>>>>>>>       >>>>      >>
> https://travis-ci.org/apache/flink/pull_requests
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > > >
> >>>>>>>>       >>>>      >>>>>>>  > >> > >
> >>>>>>>>       >>>>      >>>>>>>  > >> >
> >>>>>>>>       >>>>      >>>>>>>  > >>
> >>>>>>>>       >>>>      >>>>>>>  > >
> >>>>>>>>       >>>>      >>>>>>>  >
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>  --
> >>>>>>>>       >>>>      >>>>>>>  Best Regards
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>>>>>>  Jeff Zhang
> >>>>>>>>       >>>>      >>>>>>>
> >>>>>>>>       >>>>      >>
> >>>>>>>>       >>>>
> >>>>>>>>       >>>
> >>>>>>>>       >>
> >>>>>>>>
> >>
>
>
