Wow. That's great! Thanks Chesnay. On Fri, 2 Aug 2019 at 17:50, Chesnay Schepler <ches...@apache.org> wrote:
> I'm currently modifying the cibot to do this automatically; it should be
> finished by Monday.
>
> On 02/08/2019 07:41, Jark Wu wrote:
> > Hi Chesnay,
> >
> > Can we give Flink committers permissions on the flink-ci/flink repo?
> > Several times, when I pushed new commits, the old build jobs were still
> > pending and not canceled.
> > Until that is fixed, we could manually cancel old jobs to save build
> > resources.
> >
> > Best,
> > Jark
> >
> >
> > On Wed, 10 Jul 2019 at 16:17, Chesnay Schepler <ches...@apache.org> wrote:
> >
> >> Your best bet would be to check the first commit in the PR and check the
> >> parent commit.
> >>
> >> To re-run things, you will have to rebase the PR on the latest master.
> >>
> >> On 10/07/2019 03:32, Kurt Young wrote:
> >>> Thanks for all your efforts Chesnay, it indeed improves our
> >>> development experience a lot. BTW, do you know how to find out which
> >>> master commit the CI ran against?
> >>>
> >>> For example, this one:
> >>> https://travis-ci.com/flink-ci/flink/jobs/214542568
> >>> It shows a pass for the commits, which were rebased on master when the
> >>> CI was triggered. But the master commit the CI ran on may or may not be
> >>> the same as the current master. If it's the same, I can simply rely on
> >>> the pass to push the commits; if it's not, I think I should find
> >>> another way to re-trigger the tests based on the newest master.
> >>>
> >>> Do you know where I can get such information?
> >>>
> >>> Best,
> >>> Kurt
> >>>
> >>>
> >>> On Tue, Jul 9, 2019 at 3:27 AM Chesnay Schepler <ches...@apache.org> wrote:
> >>>> The kinks have been worked out; the bot is running again and PR builds
> >>>> are once again no longer running on ASF resources.
> >>>>
> >>>> PRs are mirrored to: https://github.com/flink-ci/flink
> >>>> Bot source: https://github.com/flink-ci/ci-bot
> >>>>
> >>>> On 08/07/2019 17:14, Chesnay Schepler wrote:
> >>>>> I have temporarily re-enabled running PR builds on the ASF account;
> >>>>> migrating to the Travis subscription caused some issues in the bot
> >>>>> that I have to fix first.
> >>>>>
> >>>>> On 07/07/2019 23:01, Chesnay Schepler wrote:
> >>>>>> The vote has passed unanimously in favor of migrating to a separate
> >>>>>> Travis account.
> >>>>>>
> >>>>>> I will now set things up such that pull requests are no longer run on
> >>>>>> the ASF servers.
> >>>>>> This is a major step in reducing our usage of ASF resources.
> >>>>>> For the time being we'll use the free Travis plan for flink-ci (i.e. 5
> >>>>>> workers, which is the same as the ASF gives us). Over the course of the
> >>>>>> next week we'll set up the Ververica subscription to increase this limit.
> >>>>>> From now on, a bot will mirror all new and updated pull requests to a
> >>>>>> mirror repository (https://github.com/flink-ci/flink-ci) and write an
> >>>>>> update into the PR once the build is complete.
> >>>>>> I have run the bots for the past 3 days in parallel to our existing
> >>>>>> Travis and it was working without major issues.
> >>>>>>
> >>>>>> The biggest change that contributors will see is that there's no
> >>>>>> longer an icon next to each commit. We may revisit this in the future.
> >>>>>>
> >>>>>> I'll set up a repo with the source of the bot later.
> >>>>>>
> >>>>>> On 04/07/2019 10:46, Chesnay Schepler wrote:
> >>>>>>> I've raised a JIRA
> >>>>>>> <https://issues.apache.org/jira/browse/INFRA-18703> with INFRA to
> >>>>>>> inquire whether it would be possible to switch to a different Travis
> >>>>>>> account, and if so what steps would need to be taken.
> >>>>>>> We need a proper confirmation from INFRA since we are not in full
> >>>>>>> control of the flink repository (for example, we cannot access the
> >>>>>>> settings page).
> >>>>>>>
> >>>>>>> If this is indeed possible, Ververica is willing to sponsor a Travis
> >>>>>>> account for the Flink project.
> >>>>>>> This would provide us with more than enough resources.
> >>>>>>>
> >>>>>>> Since this makes the project more reliant on resources provided by
> >>>>>>> external companies I would like to vote on this.
> >>>>>>>
> >>>>>>> Please vote on this proposal, as follows:
> >>>>>>> [ ] +1, Approve the migration to a Ververica-sponsored Travis
> >>>>>>> account, provided that INFRA approves
> >>>>>>> [ ] -1, Do not approve the migration to a Ververica-sponsored
> >>>>>>> Travis account
> >>>>>>>
> >>>>>>> The vote will be open for at least 24h, and until we have
> >>>>>>> confirmation from INFRA. The voting period may be shorter than the
> >>>>>>> usual 3 days since our current setup is effectively not working.
> >>>>>>>
> >>>>>>> On 04/07/2019 06:51, Bowen Li wrote:
> >>>>>>>> Re: > Are they using their own Travis CI pool, or did the switch to
> >>>>>>>> an entirely different CI service?
> >>>>>>>>
> >>>>>>>> I reached out to Wes and Krisztián from the Apache Arrow PMC. They are
> >>>>>>>> currently moving away from ASF's Travis to their own in-house metal
> >>>>>>>> machines at [1] with a custom CI application at [2]. They've seen
> >>>>>>>> significant improvement, with both much higher performance and
> >>>>>>>> basically no resource waiting time: a "night-and-day" difference,
> >>>>>>>> quoting Wes.
> >>>>>>>>
> >>>>>>>> Re: > If we can just switch to our own Travis pool, just for our
> >>>>>>>> project, then this might be something we can do fairly quickly?
> >>>>>>>> > >>>>>>>> I believe so, according to [3] and [4] > >>>>>>>> > >>>>>>>> > >>>>>>>> [1] https://ci.ursalabs.org/ <https://ci.ursalabs.org/#/> > >>>>>>>> [2] https://github.com/ursa-labs/ursabot > >>>>>>>> [3] > >>>>>>>> > >> > https://docs.travis-ci.com/user/migrate/open-source-repository-migration > >>>>>>>> [4] > >>>>>>>> > >> https://docs.travis-ci.com/user/migrate/open-source-on-travis-ci-com > >>>>>>>> > >>>>>>>> > >>>>>>>> On Wed, Jul 3, 2019 at 12:01 AM Chesnay Schepler > >>>>>>>> <ches...@apache.org <mailto:ches...@apache.org>> wrote: > >>>>>>>> > >>>>>>>> Are they using their own Travis CI pool, or did the switch > to > >> an > >>>>>>>> entirely different CI service? > >>>>>>>> > >>>>>>>> If we can just switch to our own Travis pool, just for our > >>>>>>>> project, then > >>>>>>>> this might be something we can do fairly quickly? > >>>>>>>> > >>>>>>>> On 03/07/2019 05:55, Bowen Li wrote: > >>>>>>>> > I responded in the INFRA ticket [1] that I believe they > are > >>>>>>>> using a wrong > >>>>>>>> > metric against Flink and the total build time is a > completely > >>>>>>>> different > >>>>>>>> > thing than guaranteed build capacity. > >>>>>>>> > > >>>>>>>> > My response: > >>>>>>>> > > >>>>>>>> > "As mentioned above, since I started to pay attention to > >> Flink's > >>>>>>>> build > >>>>>>>> > queue a few tens of days ago, I'm in Seattle and I saw no > >> build > >>>>>>>> was kicking > >>>>>>>> > off in PST daytime in weekdays for Flink. Our teammates in > >> China > >>>>>>>> and Europe > >>>>>>>> > have also reported similar observations. 
> >>>>>>>> > So we need to evaluate where the large total build time came from.
> >>>>>>>> > If 1) your number and 2) our observations from three locations that
> >>>>>>>> > cover pretty much a full day are all true, I **guess** one reason can
> >>>>>>>> > be that the extra build time very likely came from weekends, when
> >>>>>>>> > other Apache projects may be idle and Flink just drains its congested
> >>>>>>>> > queue hard.
> >>>>>>>> >
> >>>>>>>> > Please be aware that we're not complaining about the lack of resources
> >>>>>>>> > in general; I'm complaining about the lack of **stable, dedicated**
> >>>>>>>> > resources. An example of the latter: currently, even if no build is
> >>>>>>>> > in Flink's queue and I submit a request that becomes the queue head in
> >>>>>>>> > the PST morning, my build won't even start for 6-8+h. That is an absurd
> >>>>>>>> > amount of waiting time.
> >>>>>>>> >
> >>>>>>>> > That is to say, if ASF INFRA decides to adopt a quota system and grants
> >>>>>>>> > Flink five DEDICATED servers that run all the time only for Flink,
> >>>>>>>> > that'll be PERFECT and can totally solve our problem now.
> >>>>>>>> >
> >>>>>>>> > I feel what's missing in ASF INFRA's Travis resource pool is some level
> >>>>>>>> > of build capacity SLAs and certainty"
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> > Again, I believe these two problems differ in nature: long build time
> >>>>>>>> > vs. lack of dedicated build resources. That is to say, shortening the
> >>>>>>>> > build time may relieve the situation, or it may not. I'm slightly
> >>>>>>>> > negative on disabling IT cases for PRs, because the downside is that we
> >>>>>>>> > are at risk of bugs in PRs that the UTs don't catch, which may cost a
> >>>>>>>> > lot more to fix if they slow down or even block others, but I am
> >>>>>>>> > open to others' opinions on it.
> >>>>>>>> >
> >>>>>>>> > AFAICT from the INFRA ticket [1], donating to ASF INFRA won't be a
> >>>>>>>> > feasible way to solve our problem since INFRA's pool is fully shared
> >>>>>>>> > and they have no control over, or finer insight into, resource
> >>>>>>>> > allocation to a specific Apache project. As mentioned in [1], Apache
> >>>>>>>> > Arrow is moving away from the ASF INFRA Travis pool (they are actually
> >>>>>>>> > surprised Flink has no plan to do so). I know that Spark is on its own
> >>>>>>>> > build infra. If we all agree on funding our own build infra, I'd be
> >>>>>>>> > glad to help investigate any potential options after releasing 1.9,
> >>>>>>>> > since I'm super busy with 1.9 now.
> >>>>>>>> > > >>>>>>>> > [1] https://issues.apache.org/jira/browse/INFRA-18533 > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > On Tue, Jul 2, 2019 at 4:46 AM Chesnay Schepler > >>>>>>>> <ches...@apache.org <mailto:ches...@apache.org>> wrote: > >>>>>>>> > > >>>>>>>> >> As a short-term stopgap, since we can assume this issue > to > >>>>>>>> become much > >>>>>>>> >> worse in the following days/weeks, we could disable IT > >> cases in > >>>>>>>> PRs and > >>>>>>>> >> only run them on master. > >>>>>>>> >> > >>>>>>>> >> On 02/07/2019 12:03, Chesnay Schepler wrote: > >>>>>>>> >>> People really have to stop thinking that just because > >>>>>>>> something works > >>>>>>>> >>> for us it is also a good solution. > >>>>>>>> >>> Also, please remember that our builds run for 2h from > >> start to > >>>>>>>> finish, > >>>>>>>> >>> and not the 14 _minutes_ it takes for zeppelin. > >>>>>>>> >>> We are dealing with an entirely different scale here, > both > >> in > >>>>>>>> terms of > >>>>>>>> >>> build times and number of builds. > >>>>>>>> >>> > >>>>>>>> >>> In this very thread people have been complaining about > long > >>>>>>>> queue > >>>>>>>> >>> times for their builds. Surprise, other Apache projects > >>>>>>>> have been > >>>>>>>> >>> suffering the very same thing due to us not controlling > our > >>>>>>>> build > >>>>>>>> >>> times. While switching services (be it Jenkins, > CircleCI or > >>>>>>>> whatever) > >>>>>>>> >>> will possibly work for us (and these options are > actually > >>>>>>>> attractive, > >>>>>>>> >>> like CircleCI's proper support for build artifacts), it > >>>>>>>> will also > >>>>>>>> >>> result in us likely negatively affecting other projects > in > >>>>>>>> significant > >>>>>>>> >>> ways. > >>>>>>>> >>> > >>>>>>>> >>> Sure, the Jenkins setup has a good user experience for > us, > >> at > >>>>>>>> the cost > >>>>>>>> >>> of blocking Jenkins workers for a _lot_ of time. 
> >>>>>>>> >>> Right now we have 25 PRs in our queue; that's possibly 50h we'd
> >>>>>>>> >>> consume of Jenkins resources, and the European contributors haven't
> >>>>>>>> >>> even really started yet.
> >>>>>>>> >>>
> >>>>>>>> >>> FYI, the latest INFRA response from INFRA-18533:
> >>>>>>>> >>>
> >>>>>>>> >>> "Our rough metrics shows that Flink used over 5800 hours of build time
> >>>>>>>> >>> last month. That is equal to EIGHT servers running 24/7 for the ENTIRE
> >>>>>>>> >>> MONTH. EIGHT. nonstop.
> >>>>>>>> >>> When we discovered this last night, we discussed it some and are going
> >>>>>>>> >>> to tune down Flink to allow only five executors maximum. We cannot
> >>>>>>>> >>> allow Flink to consume so much of a Foundation shared resource."
> >>>>>>>> >>>
> >>>>>>>> >>> So yes, we either
> >>>>>>>> >>> a) have to heavily reduce our CI usage or
> >>>>>>>> >>> b) fund our own, either maintaining it ourselves or donating to Apache.
> >>>>>>>> >>>
> >>>>>>>> >>> On 02/07/2019 05:11, Bowen Li wrote:
> >>>>>>>> >>>> Looking at the git history of the Jenkins script, its core part
> >>>>>>>> >>>> was finished in March 2017 (with only two minor updates in 2017/2018),
> >>>>>>>> >>>> so it's been running for over two years now and it feels like the
> >>>>>>>> >>>> Zeppelin community has been quite happy with it. @Jeff Zhang
> >>>>>>>> >>>> <zjf...@gmail.com> can you share your insights and user
> >>>>>>>> >>>> experience with the Jenkins+Travis approach?
> >>>>>>>> >>>>
> >>>>>>>> >>>> Things like:
> >>>>>>>> >>>>
> >>>>>>>> >>>> - has the approach completely solved the resource capacity problem
> >>>>>>>> >>>> for the Zeppelin community? is the Zeppelin community happy with the
> >>>>>>>> >>>> result?
> >>>>>>>> >>>> - is the whole configuration chain stable enough (e.g. uptime)?
> >>>>>>>> >>>> - how often do you need to maintain the Jenkins infra? how many
> >>>>>>>> >>>> people are usually involved in maintenance and bug fixes?
> >>>>>>>> >>>>
> >>>>>>>> >>>> The downside of this approach seems to me to be mostly the
> >>>>>>>> >>>> maintenance: maintaining the script and the Jenkins infra.
> >>>>>>>> >>>>
> >>>>>>>> >>>> ** Having Our Own Travis-CI.com Account **
> >>>>>>>> >>>>
> >>>>>>>> >>>> Another alternative I've been thinking of is to have our own
> >>>>>>>> >>>> travis-ci.com account with paid dedicated resources. Note that
> >>>>>>>> >>>> travis-ci.org is the free version and travis-ci.com is the commercial
> >>>>>>>> >>>> version. We currently use a shared resource pool managed by the ASF
> >>>>>>>> >>>> INFRA team on travis-ci.org, but we have no control over it - we
> >>>>>>>> >>>> can't see how it's configured, how many resources are available, how
> >>>>>>>> >>>> resources are allocated among Apache projects, etc.
> >>>>>>>> >>>> The nice things about having an account on travis-ci.com are:
> >>>>>>>> >>>>
> >>>>>>>> >>>> - relatively low cost with a much better resource guarantee than what
> >>>>>>>> >>>> we currently have [1]: $249/month with 5 dedicated concurrency,
> >>>>>>>> >>>> $489/month with 10 concurrency
> >>>>>>>> >>>> - low maintenance work compared to using Jenkins
> >>>>>>>> >>>> - (potentially) no migration cost according to Travis's doc [2]
> >>>>>>>> >>>> (pending verification)
> >>>>>>>> >>>> - full control over the build capacity/configuration compared to
> >>>>>>>> >>>> using ASF INFRA's pool
> >>>>>>>> >>>>
> >>>>>>>> >>>> I'd be surprised if we, as such a vibrant community, cannot find and
> >>>>>>>> >>>> fund $249*12=$2988 a year in exchange for a much better developer
> >>>>>>>> >>>> experience and much higher productivity.
> >>>>>>>> >>>>
> >>>>>>>> >>>> [1] https://travis-ci.com/plans
> >>>>>>>> >>>> [2] https://docs.travis-ci.com/user/migrate/open-source-repository-migration
> >>>>>>>> >>>>
> >>>>>>>> >>>> On Sat, Jun 29, 2019 at 8:39 AM Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>>>> >>>>
> >>>>>>>> >>>>     So yes, the Jenkins job keeps polling the state from Travis until
> >>>>>>>> >>>>     it finishes.
> >>>>>>>> >>>>
> >>>>>>>> >>>>     Not sure I'm comfortable with the idea of using Jenkins workers
> >>>>>>>> >>>>     just to idle for several hours.
> >>>>>>>> >>>>
> >>>>>>>> >>>>     On 29/06/2019 14:56, Jeff Zhang wrote:
> >>>>>>>> >>>> > Here's what the Zeppelin community did: we made a Python script to
> >>>>>>>> >>>> > check the build status of a pull request.
> >>>>>>>> >>>> > Here's the script:
> >>>>>>>> >>>> > https://github.com/apache/zeppelin/blob/master/travis_check.py
> >>>>>>>> >>>> >
> >>>>>>>> >>>> > And this is the script we use in the Jenkins build job.
> >>>>>>>> >>>> >
> >>>>>>>> >>>> > if [ -f "travis_check.py" ]; then
> >>>>>>>> >>>> >   git log -n 1
> >>>>>>>> >>>> >   STATUS=$(curl -s $BUILD_URL | grep -e "GitHub pull request.*from.*" | sed 's/.*GitHub pull request <a href=\"\(https[^"]*\).*from[^"]*.\(https[^"]*\).*/\1 \2/g')
> >>>>>>>> >>>> >   AUTHOR=$(echo $STATUS | sed 's/.*[/]\(.*\)$/\1/g')
> >>>>>>>> >>>> >   PR=$(echo $STATUS | awk '{print $1}' | sed 's/.*[/]\(.*\)$/\1/g')
> >>>>>>>> >>>> >   #COMMIT=$(git log -n 1 | grep "^Merge:" | awk '{print $3}')
> >>>>>>>> >>>> >   #if [ -z $COMMIT ]; then
> >>>>>>>> >>>> >   #  COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> >>>>>>>> >>>> >   #fi
> >>>>>>>> >>>> >
> >>>>>>>> >>>> >   # get commit hash from PR
> >>>>>>>> >>>> >   COMMIT=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep -e "\"label\":" -e "\"ref\":" -e "\"sha\":" | tr '\n' ' ' | sed 's/\(.*sha[^,]*,\)\(.*ref.*\)/\1 = \2/g' | tr = '\n' | grep -v "apache:" | sed 's/.*sha.[^"]*["]\([^"]*\).*/\1/g')
> >>>>>>>> >>>> >   sleep 30 # sleep few moment to wait travis starts the build
> >>>>>>>> >>>> >   RET_CODE=0
> >>>>>>>> >>>> >   python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> >>>>>>>> >>>> >   if [ $RET_CODE -eq 2 ]; then # try with repository name when travis-ci is not available in the account
> >>>>>>>> >>>> >     RET_CODE=0
> >>>>>>>> >>>> >     AUTHOR=$(curl -s https://api.github.com/repos/apache/zeppelin/pulls/$PR | grep '"full_name":' | grep -v "apache/zeppelin" | sed 's/.*[:][^"]*["]\([^/]*\).*/\1/g')
> >>>>>>>> >>>> >     python ./travis_check.py ${AUTHOR} ${COMMIT} || RET_CODE=$?
> >>>>>>>> >>>> >   fi
> >>>>>>>> >>>> >
> >>>>>>>> >>>> >   if [ $RET_CODE -eq 2 ]; then # fail with can't find build information in the travis
> >>>>>>>> >>>> >     set +x
> >>>>>>>> >>>> >     echo "-----------------------------------------------------"
> >>>>>>>> >>>> >     echo "Looks like travis-ci is not configured for your fork."
> >>>>>>>> >>>> >     echo "Please setup by swich on 'zeppelin' repository at https://travis-ci.org/profile and travis-ci."
> >>>>>>>> >>>> >     echo "And then make sure 'Build branch updates' option is enabled in the settings https://travis-ci.org/${AUTHOR}/zeppelin/settings."
> >>>>>>>> >>>> >     echo ""
> >>>>>>>> >>>> >     echo "To trigger CI after setup, you will need ammend your last commit with"
> >>>>>>>> >>>> >     echo "git commit --amend"
> >>>>>>>> >>>> >     echo "git push your-remote HEAD --force"
> >>>>>>>> >>>> >     echo ""
> >>>>>>>> >>>> >     echo "See http://zeppelin.apache.org/contribution/contributions.html#continuous-integration."
> >>>>>>>> >>>> >   fi
> >>>>>>>> >>>> >
> >>>>>>>> >>>> >   exit $RET_CODE
> >>>>>>>> >>>> > else
> >>>>>>>> >>>> >   set +x
> >>>>>>>> >>>> >   echo "travis_check.py does not exists"
> >>>>>>>> >>>> >   exit 1
> >>>>>>>> >>>> > fi
> >>>>>>>> >>>> >
> >>>>>>>> >>>> > On Sat, 29 Jun 2019 at 15:17, Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>>>> >>>> >
> >>>>>>>> >>>> >> Does this imply that a Jenkins job is active as long as the Travis
> >>>>>>>> >>>> >> build runs?
> >>>>>>>> >>>> >>
> >>>>>>>> >>>> >> On 26/06/2019 21:28, Bowen Li wrote:
> >>>>>>>> >>>> >>> Hi,
> >>>>>>>> >>>> >>>
> >>>>>>>> >>>> >>> @Dawid, I think the "long test running" issue that I mentioned in
> >>>>>>>> >>>> >>> the first email, as you guys also said, belongs to "a big effort
> >>>>>>>> >>>> >>> which is much harder to accomplish in a short period of time and may
> >>>>>>>> >>>> >>> deserve its own separate discussion". Thus I didn't include it in
> >>>>>>>> >>>> >>> what we can do in the foreseeable short term.
> >>>>>>>> >>>> >>>
> >>>>>>>> >>>> >>> Besides, I don't think that's the ultimate reason for the lack of
> >>>>>>>> >>>> >>> build resources. Even if the build is shortened to something like
> >>>>>>>> >>>> >>> 2h, the problem I described, of no build machine working for 6 or
> >>>>>>>> >>>> >>> more hours in PST daytime, will still happen, because no machine
> >>>>>>>> >>>> >>> from ASF INFRA's pool is allocated to Flink. As I have paid close
> >>>>>>>> >>>> >>> attention to the build queue over the past few weekdays, it's a
> >>>>>>>> >>>> >>> pretty clear pattern now.
> >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> **The ultimate root cause** for that is - we > >> don't > >>>>>>>> have any > >>>>>>>> >>>> **dedicated** > >>>>>>>> >>>> >>> build resources that we can stably rely on. > I'm > >>>>>>>> actually ok to > >>>>>>>> >>>> wait for a > >>>>>>>> >>>> >>> long time if there are build requests > running, it > >>>>>>>> means at > >>>>>>>> >>>> least we are > >>>>>>>> >>>> >>> making progress. But I'm not ok with no build > >>>>>>>> resource. A > >>>>>>>> >>>> better place I > >>>>>>>> >>>> >>> think we should aim at in short term is to > always > >>>>>>>> have at > >>>>>>>> >>>> least a central > >>>>>>>> >>>> >>> pool (can be 3 or 5) of machines dedicated to > >> build > >>>>>>>> Flink at > >>>>>>>> >>>> any time, or > >>>>>>>> >>>> >>> maybe use users resources. > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> @Chesnay @Robert I synced with Jeff offline > that > >>>>>>>> Zeppelin > >>>>>>>> >>>> community is > >>>>>>>> >>>> >>> using a Jenkins job to automatically build on > >> users' > >>>>>>>> travis > >>>>>>>> >>>> account and > >>>>>>>> >>>> >>> link the result back to github PR. I guess the > >>>>>>>> Jenkins job > >>>>>>>> >>>> would fetch > >>>>>>>> >>>> >>> latest upstream master and build the PR > against > >> it. > >>>>>>>> Jeff has > >>>>>>>> >>>> filed > >>>>>>>> >>>> >> tickets > >>>>>>>> >>>> >>> to learn and get access to the Jenkins infra. > >> It'll > >>>>>>>> better to > >>>>>>>> >>>> fully > >>>>>>>> >>>> >>> understand it first before judging this > approach. > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> I also heard good things about CircleCI, and > ASF > >>>>>>>> INFRA seems > >>>>>>>> >>>> to have a > >>>>>>>> >>>> >> pool > >>>>>>>> >>>> >>> of build capacity there too. Can be an > >> alternative > >>>>>>>> to consider. 
> >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>> On Wed, Jun 26, 2019 at 12:44 AM Dawid > >> Wysakowicz < > >>>>>>>> >>>> >> dwysakow...@apache.org > >>>>>>>> <mailto:dwysakow...@apache.org> <mailto: > dwysakow...@apache.org > >>>>>>>> <mailto:dwysakow...@apache.org>>> > >>>>>>>> >>>> >>> wrote: > >>>>>>>> >>>> >>> > >>>>>>>> >>>> >>>> Sorry to jump in late, but I think Bowen > missed > >> the > >>>>>>>> most > >>>>>>>> >>>> important point > >>>>>>>> >>>> >>>> from Chesnay's previous message in the > summary. > >> The > >>>>>>>> ultimate > >>>>>>>> >>>> reason for > >>>>>>>> >>>> >>>> all the problems is that the tests take close > >> to 2 > >>>>>>>> hours to > >>>>>>>> >>>> run already. > >>>>>>>> >>>> >>>> I fully support this claim: "Unless people > start > >>>>>>>> caring about > >>>>>>>> >>>> test times > >>>>>>>> >>>> >>>> before adding them, this issue cannot be > solved" > >>>>>>>> >>>> >>>> > >>>>>>>> >>>> >>>> This is also another reason why using user's > >> Travis > >>>>>>>> account > >>>>>>>> >>>> won't help. > >>>>>>>> >>>> >>>> Every few weeks we reach the user's time > limit > >> for > >>>>>>>> a single > >>>>>>>> >>>> profile. > >>>>>>>> >>>> >>>> This makes the user's builds simply fail, > until > >> we > >>>>>>>> either > >>>>>>>> >>>> properly > >>>>>>>> >>>> >>>> decrease the time the tests take (which I am > not > >>>>>>>> sure we ever > >>>>>>>> >>>> did) or > >>>>>>>> >>>> >>>> postpone the problem by splitting into more > >>>>>>>> profiles. 
(Note > >>>>>>>> >>>> that the ASF > >>>>>>>> >>>> >>>> Travis account has higher time limits) > >>>>>>>> >>>> >>>> > >>>>>>>> >>>> >>>> Best, > >>>>>>>> >>>> >>>> > >>>>>>>> >>>> >>>> Dawid > >>>>>>>> >>>> >>>> > >>>>>>>> >>>> >>>> On 26/06/2019 09:36, Robert Metzger wrote: > >>>>>>>> >>>> >>>>> Do we know if using "the best" available > >> hardware > >>>>>>>> would > >>>>>>>> >>>> improve the > >>>>>>>> >>>> >> build > >>>>>>>> >>>> >>>>> times? > >>>>>>>> >>>> >>>>> Imagine we would run the build on machines > with > >>>>>>>> plenty of > >>>>>>>> >>>> main memory > >>>>>>>> >>>> >> to > >>>>>>>> >>>> >>>>> mount everything to ramdisk + the latest CPU > >>>>>>>> architecture? > >>>>>>>> >>>> >>>>> > >>>>>>>> >>>> >>>>> Throwing hardware at the problem could help > >> reduce > >>>>>>>> the time > >>>>>>>> >>>> of an > >>>>>>>> >>>> >>>>> individual build, and using our own > >> infrastructure > >>>>>>>> would > >>>>>>>> >>>> remove our > >>>>>>>> >>>> >>>>> dependency on Apache's Travis account (with > the > >>>>>>>> obvious > >>>>>>>> >>>> downside of > >>>>>>>> >>>> >>>> having > >>>>>>>> >>>> >>>>> to maintain the infrastructure) > >>>>>>>> >>>> >>>>> We could use an open source travis > >> alternative, to > >>>>>>>> have a > >>>>>>>> >>>> similar > >>>>>>>> >>>> >>>>> experience and make the migration easy. > >>>>>>>> >>>> >>>>> > >>>>>>>> >>>> >>>>> > >>>>>>>> >>>> >>>>> On Wed, Jun 26, 2019 at 9:34 AM Chesnay > >> Schepler > >>>>>>>> >>>> <ches...@apache.org <mailto:ches...@apache.org> > >>>>>>>> <mailto:ches...@apache.org <mailto:ches...@apache.org>>> > >>>>>>>> >>>> >>>> wrote: > >>>>>>>> >>>> >>>>>> >From what I gathered, there's no special > >>>>>>>> sauce that the > >>>>>>>> >>>> Zeppelin > >>>>>>>> >>>> >>>>>> project uses which actually integrates a > users > >>>>>>>> Travis > >>>>>>>> >>>> account into the > >>>>>>>> >>>> >>>> PR. > >>>>>>>> >>>> >>>>>> They just disabled Travis for PRs. And > that's > >>>>>>>> kind of it. 
> >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> Naturally we can do this (duh) and safe the > >> ASF a > >>>>>>>> fair > >>>>>>>> >>>> amount of > >>>>>>>> >>>> >>>>>> resources, but there are downsides: > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> The discoverability of the Travis check > takes > >> a > >>>>>>>> nose-dive. > >>>>>>>> >>>> Either we > >>>>>>>> >>>> >>>>>> require every contributor to always, an > every > >>>>>>>> commit, also > >>>>>>>> >>>> post a > >>>>>>>> >>>> >> Travis > >>>>>>>> >>>> >>>>>> build, or we have the reviewer sift through > >> the > >>>>>>>> >>>> contributors account > >>>>>>>> >>>> >> to > >>>>>>>> >>>> >>>>>> find it. > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> This is rather cumbersome. Additionally, > it's > >>>>>>>> also not > >>>>>>>> >>>> equivalent to > >>>>>>>> >>>> >>>>>> having a PR build. > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> A normal branch build takes a branch as is > and > >>>>>>>> tests it. A > >>>>>>>> >>>> PR build > >>>>>>>> >>>> >>>>>> merges the branch into master, and then > runs > >> it. > >>>>>>>> (Fun fact: > >>>>>>>> >>>> This is > >>>>>>>> >>>> >> why > >>>>>>>> >>>> >>>>>> a PR without merge conflicts is not being > run > >> on > >>>>>>>> Travis.) > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> And ultimately, everyone can already make > use > >>>>>>>> of this > >>>>>>>> >>>> approach anyway. > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> On 25/06/2019 08:02, Jark Wu wrote: > >>>>>>>> >>>> >>>>>>> Hi Jeff, > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> Thanks for sharing the Zeppelin approach. > I > >>>>>>>> think it's a > >>>>>>>> >>>> good idea to > >>>>>>>> >>>> >>>>>>> leverage user's travis account. > >>>>>>>> >>>> >>>>>>> In this way, we can have almost unlimited > >>>>>>>> concurrent build > >>>>>>>> >>>> jobs and > >>>>>>>> >>>> >>>>>>> developers can restart build by themselves > >>>>>>>> (currently only > >>>>>>>> >>>> committers > >>>>>>>> >>>> >>>>>>> can restart PR's build). 
> >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> But I'm still not very clear how to > integrate > >>>>>>>> user's > >>>>>>>> >>>> travis build > >>>>>>>> >>>> >> into > >>>>>>>> >>>> >>>>>>> the Flink pull request's build > automatically. > >>>>>>>> Can you > >>>>>>>> >>>> explain more in > >>>>>>>> >>>> >>>>>>> detail? > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> Another question: does travis only build > >>>>>>>> branches for user > >>>>>>>> >>>> account? > >>>>>>>> >>>> >>>>>>> My concern is that builds for PRs will > rebase > >>>>>>>> user's > >>>>>>>> >>>> commits against > >>>>>>>> >>>> >>>>>>> current master branch. > >>>>>>>> >>>> >>>>>>> This will help us to find problems before > >>>>>>>> merge. Builds > >>>>>>>> >>>> for branches > >>>>>>>> >>>> >>>>>>> will lose the impact of new commits in > >> master. > >>>>>>>> >>>> >>>>>>> How does Zeppelin solve this problem? > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> Thanks again for sharing the idea. > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> Regards, > >>>>>>>> >>>> >>>>>>> Jark > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> On Tue, 25 Jun 2019 at 11:01, Jeff Zhang > >>>>>>>> <zjf...@gmail.com <mailto:zjf...@gmail.com> > >>>>>>>> >>>> <mailto:zjf...@gmail.com <mailto:zjf...@gmail.com > >> > >>>>>>>> >>>> >>>>>>> <mailto:zjf...@gmail.com > >>>>>>>> <mailto:zjf...@gmail.com> <mailto:zjf...@gmail.com > >>>>>>>> <mailto:zjf...@gmail.com>>>> wrote: > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> Hi Folks, > >>>>>>>> >>>> >>>>>>> > >>>>>>>> >>>> >>>>>>> Zeppelin meet this kind of issue before, > we > >>>>>>>> solve > >>>>>>>> >>>> it by > >>>>>>>> >>>> >> delegating > >>>>>>>> >>>> >>>>>>> each > >>>>>>>> >>>> >>>>>>> one's PR build to his travis account > >>>>>>>> (Everyone can > >>>>>>>> >>>> have 5 free > >>>>>>>> >>>> >>>>>>> slot for > >>>>>>>> >>>> >>>>>>> travis build). > >>>>>>>> >>>> >>>>>>> Apache account travis build is only > >>>>>>>> triggered when > >>>>>>>> >>>> PR is merged. 
>>>
>>> Kurt Young <ykt...@gmail.com> wrote on Tue, Jun 25, 2019 at
>>> 10:16 AM:
>>>
>>>> (Forgot to cc George)
>>>>
>>>> Best,
>>>> Kurt
>>>>
>>>>
>>>> On Tue, Jun 25, 2019 at 10:16 AM Kurt Young <ykt...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Bowen,
>>>>>
>>>>> Thanks for bringing this up. We have actually discussed this, and I
>>>>> think Till and George have already spent some time investigating
>>>>> it. I have cc'ed both of them, and maybe they can share their
>>>>> findings.
>>>>>
>>>>> Best,
>>>>> Kurt
>>>>>
>>>>>
>>>>> On Tue, Jun 25, 2019 at 10:08 AM Jark Wu <imj...@gmail.com> wrote:
>>>>>
>>>>>> Hi Bowen,
>>>>>>
>>>>>> Thanks for bringing this up. We have also suffered from the long
>>>>>> build time. I agree that we should focus on solving the build
>>>>>> capacity problem in this thread.
>>>>>>
>>>>>> My observation is that only one build is running; all the others
>>>>>> (other PRs, master) are pending.
>>>>>> The pricing plan[1] of Travis shows it can support concurrent
>>>>>> build jobs.
>>>>>> But I don't know which plan we are using; it might be the free
>>>>>> plan for open source.
>>>>>>
>>>>>> I cc-ed Chesnay, who may have some experience with Travis.
>>>>>>
>>>>>> Regards,
>>>>>> Jark
>>>>>>
>>>>>> [1]: https://travis-ci.com/plans
>>>>>>
>>>>>> On Tue, 25 Jun 2019 at 08:11, Bowen Li <bowenl...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Steven,
>>>>>>>
>>>>>>> I think you may not have read what I wrote. The discussion is
>>>>>>> about "unstable build **capacity**", in other words "unstable /
>>>>>>> lack of build resources", not "unstable build".
>>>>>>>
>>>>>>> On Mon, Jun 24, 2019 at 4:40 PM Steven Wu <stevenz...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Long and sometimes unstable builds are definitely a pain point.
>>>>>>>>
>>>>>>>> I suspect the build failure here in flink-connector-kafka is
>>>>>>>> not related to my change, but there is no easy way to re-run
>>>>>>>> the build in the Travis UI. A Google search showed a trick:
>>>>>>>> closing and reopening the PR triggers a rebuild. But that could
>>>>>>>> add noise to the PR activity.
>>>>>>>> https://travis-ci.org/apache/flink/jobs/545555519
>>>>>>>>
>>>>>>>> travis-ci for my personal repo often failed by exceeding the
>>>>>>>> time limit after 4+ hours:
>>>>>>>> The job exceeded the maximum time limit for jobs, and has been
>>>>>>>> terminated.
>>>>>>>>
>>>>>>>> On Mon, Jun 24, 2019 at 4:15 PM Bowen Li <bowenl...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> https://travis-ci.org/apache/flink/builds/549681530 This build
>>>>>>>>> request has been sitting at **HEAD of the queue** since I first
>>>>>>>>> saw it at PST 10:30am (not sure how long it's been there
>>>>>>>>> before 10:30am). It's PST 4:12pm now and it hasn't started yet.
>>>>>>>>>
>>>>>>>>> On Mon, Jun 24, 2019 at 2:48 PM Bowen Li <bowenl...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi devs,
>>>>>>>>>>
>>>>>>>>>> I've been experiencing the pain resulting from the lack of
>>>>>>>>>> stable build capacity on Travis for Flink PRs [1].
>>>>>>>>>> Specifically, I have often noticed that no build in the queue
>>>>>>>>>> makes any progress for hours, and then suddenly 5 or 6 builds
>>>>>>>>>> kick off all together after the long pause. I'm in the PST
>>>>>>>>>> (UTC-08) time zone, and I've seen that the pause can be as
>>>>>>>>>> long as 6 hours, from PST 9am to 3pm (let alone the time
>>>>>>>>>> needed to drain the queue afterwards).
>>>>>>>>>>
>>>>>>>>>> I think this has greatly impacted our productivity. In my
>>>>>>>>>> experience, PRs submitted in the early morning PST won't
>>>>>>>>>> finish their build until late at night the same day.
>>>>>>>>>>
>>>>>>>>>> So my questions are:
>>>>>>>>>>
>>>>>>>>>> - Has anyone else experienced the same problem or made
>>>>>>>>>>   similar observations on TravisCI? (I suspect it has
>>>>>>>>>>   something to do with the time zone.)
>>>>>>>>>>
>>>>>>>>>> - What pricing plan of TravisCI is Flink currently using? Is
>>>>>>>>>>   it the free plan for open source projects? What is the
>>>>>>>>>>   guaranteed build capacity of the current plan?
>>>>>>>>>>
>>>>>>>>>> - If the current pricing plan (either free or paid) can't
>>>>>>>>>>   provide stable build capacity, can we upgrade to a higher
>>>>>>>>>>   priced plan with larger and more stable build capacity?
>>>>>>>>>>
>>>>>>>>>> BTW, another factor that contributes to the productivity
>>>>>>>>>> problem is that our build is slow - we run a full build for
>>>>>>>>>> every PR and a successful full build takes ~5h. We definitely
>>>>>>>>>> have more options to solve it, for instance, modularizing the
>>>>>>>>>> build graph and reusing artifacts from the previous build.
>>>>>>>>>> But I think that would be a big effort which is much harder
>>>>>>>>>> to accomplish in a short period of time and may deserve its
>>>>>>>>>> own separate discussion.
>>>>>>>>>>
>>>>>>>>>> [1] https://travis-ci.org/apache/flink/pull_requests
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang