+1 for migrating to Azure Pipelines, as this should give us shorter build times
and faster feedback.

Best,
Congxian


On Mon, Dec 9, 2019 at 10:13 AM Xiyuan Wang <wangxiyuan1...@gmail.com> wrote:

> Hi Robert,
>   Thanks for bringing up this topic. The 2 ARM machines (16 cores each) that I
> donated are just for a POC test. We (Huawei) can donate more once we move to
> the official Azure pipeline. :)
>
> On Fri, Dec 6, 2019 at 3:25 AM Robert Metzger <rmetz...@apache.org> wrote:
>
> > Thanks for your comments Yun.
> > If there's strong support for idea 2, it would actually make my
> > life easier: the migration would be easier to do.
> >
> > I also noticed that the uploads to transfer.sh were broken, but this should
> > be fixed in the "rmetzger.flink" builds (coming from rmetzger/flink). The
> > builds in "flink-ci.flink" (coming from flink-ci/flink) might have trouble
> > with transfer.sh.
> >
> >
> > On Thu, Dec 5, 2019 at 5:50 PM Yun Tang <myas...@live.com> wrote:
> >
> > > Hi Robert
> > >
> > > Really exciting to see this new, more powerful CI tool that gets rid of the
> > > 50-minute limit of the free Travis CI account.
> > >
> > > After reading the wiki, I support idea 2 of AZP-setup version-2.
> > >
> > > However, after digging into some failing builds at
> > > https://dev.azure.com/rmetzger/Flink/_build , I found that we cannot view
> > > the logs of some IT cases, which were previously uploaded by
> > > travis_watchdog to transfer.sh.
> > > I think this feature is also easy to implement in AZP, right?
> > >
> > > Best
> > > Yun Tang
> > >
> > > On 12/6/19, 12:19 AM, "Robert Metzger" <rmetz...@apache.org> wrote:
> > >
> > >     I've created a first draft of my plans in the wiki:
> > >
> > >     https://cwiki.apache.org/confluence/display/FLINK/%5Bpreview%5D+Azure+Pipelines .
> > >     I'm looking forward to your comments.
> > >
> > >     On Thu, Dec 5, 2019 at 12:37 PM Robert Metzger <rmetz...@apache.org> wrote:
> > >
> > >     > Thank you all for the positive feedback. I will start putting together a
> > >     > page in the wiki.
> > >     >
> > >     > @Jark: Azure Pipelines provides a free service that is even better than
> > >     > what Travis provides for free: 10 parallel builds with 6-hour timeouts.
> > >     >
> > >     > @Chesnay: I will answer your questions in the yet-to-be-written
> > >     > documentation in the wiki.
> > >     >
> > >     >
> > >     > On Thu, Dec 5, 2019 at 11:58 AM Arvid Heise <ar...@ververica.com> wrote:
> > >     >
> > >     >> +1 I had good experiences with Azure pipelines in the past.
> > >     >>
> > >     >> On Thu, Dec 5, 2019 at 11:35 AM Aljoscha Krettek <aljos...@apache.org> wrote:
> > >     >>
> > >     >> > +1
> > >     >> >
> > >     >> > Thanks for the effort! The tooling seems to be quite a bit nicer and I
> > >     >> > like that we can grow by adding more machines.
> > >     >> >
> > >     >> > Best,
> > >     >> > Aljoscha
> > >     >> >
> > >     >> > > On 5. Dec 2019, at 03:18, Jark Wu <imj...@gmail.com> wrote:
> > >     >> > >
> > >     >> > > +1 for Azure pipeline because it promises better performance.
> > >     >> > >
> > >     >> > > However, I have 2 concerns:
> > >     >> > >
> > >     >> > > 1) Travis provides a free personal service for testing personal branches.
> > >     >> > > Usually, contributors use this feature to test PoCs or run CRON jobs for
> > >     >> > > pull requests.
> > >     >> > >    Using a local machine costs a lot of time. Does AZP provide the same
> > >     >> > > free service?
> > >     >> > > 2) Currently, we have deployed a webhook [1] to receive Travis CI build
> > >     >> > > notifications [2] and send them to the bui...@flink.apache.org mailing list.
> > >     >> > >    We need to figure out how to send Azure build results to the
> > >     >> > > mailing list. And this [3] might be the way to go.
> > >     >> > >
> > >     >> > > builds@f.a.o mailing list
> > >     >> > >
> > >     >> > > Best,
> > >     >> > > Jark
> > >     >> > >
> > >     >> > > [1]: https://github.com/wuchong/flink-notification-bot
> > >     >> > > [2]: https://docs.travis-ci.com/user/notifications/#configuring-webhook-notifications
> > >     >> > > [3]: https://docs.microsoft.com/en-us/azure/devops/service-hooks/overview?view=azure-devops
> > >     >> > >
> > >     >> > >
> > >     >> > >
> > >     >> > > On Wed, 4 Dec 2019 at 22:48, Jeff Zhang <zjf...@gmail.com> wrote:
> > >     >> > >
> > >     >> > >> +1
> > >     >> > >>
> > >     >> > >> On Wed, Dec 4, 2019 at 10:43 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > >     >> > >>
> > >     >> > >>> +1 for moving to Azure pipelines as it promises better scalability and
> > >     >> > >>> tooling. Looking forward to having faster builds and hence shorter
> > >     >> > >>> feedback cycles :-)
> > >     >> > >>>
> > >     >> > >>> Cheers,
> > >     >> > >>> Till
> > >     >> > >>>
> > >     >> > >>> On Wed, Dec 4, 2019 at 1:24 PM Chesnay Schepler <ches...@apache.org> wrote:
> > >     >> > >>>
> > >     >> > >>>> @robert Can you expand on how the Azure setup interacts with CiBot? Do
> > >     >> > >>>> we have to continue mirroring builds into flink-ci? How will the cronjob
> > >     >> > >>>> configuration work? We should have a general idea of how to implement
> > >     >> > >>>> this before proceeding.
> > >     >> > >>>> Additionally, moving *all* jobs into flink-ci requires setting up the
> > >     >> > >>>> environment variables we have; can we set these up via files, or will
> > >     >> > >>>> we have to give all committers permissions for flink-ci/flink?
> > >     >> > >>>>
> > >     >> > >>>> On 04/12/2019 12:55, Chesnay Schepler wrote:
> > >     >> > >>>>> From what I've seen so far, Azure will provide us a better experience,
> > >     >> > >>>>> so I'd say +1 for the transition as a whole.
> > >     >> > >>>>>
> > >     >> > >>>>> I'd delay the merge at least until the feature branch is cut.
> > >     >> > >>>>> Given the parental leave, it may even make sense to only start merging
> > >     >> > >>>>> in January afterwards, to reduce the total time taken for the transition.
> > >     >> > >>>>>
> > >     >> > >>>>> Reviews could maybe be done earlier, but I'm wondering whether anyone
> > >     >> > >>>>> would even have the time at the moment to do so.
> > >     >> > >>>>>
> > >     >> > >>>>> On 04/12/2019 12:35, Kurt Young wrote:
> > >     >> > >>>>>> Thanks Robert for driving this. There is another big pain point with the
> > >     >> > >>>>>> current Travis setup: its cache mechanism fails from time to time. Around
> > >     >> > >>>>>> 50% of the build failures are caused by cache problems. I reported this
> > >     >> > >>>>>> issue to Travis but have not received a response yet. So big +1 from my side.
> > >     >> > >>>>>>
> > >     >> > >>>>>> Just one comment: we are close to the 1.10 feature freeze, and we will
> > >     >> > >>>>>> spend some time making tests stable before the release. I wish this
> > >     >> > >>>>>> replacement could happen after the 1.10 release; otherwise it will be an
> > >     >> > >>>>>> unstable factor during release testing.
> > >     >> > >>>>>>
> > >     >> > >>>>>> Best,
> > >     >> > >>>>>> Kurt
> > >     >> > >>>>>>
> > >     >> > >>>>>>
> > >     >> > >>>>>> On Wed, Dec 4, 2019 at 7:16 PM Zhu Zhu <reed...@gmail.com> wrote:
> > >     >> > >>>>>>
> > >     >> > >>>>>>> Thanks Robert for the updates! And thanks a lot for all the efforts to
> > >     >> > >>>>>>> investigate, experiment with, and tune Azure Pipelines for building Flink.
> > >     >> > >>>>>>> Big +1 for it.
> > >     >> > >>>>>>>
> > >     >> > >>>>>>> It would be great if the community build capacity could be extended with
> > >     >> > >>>>>>> custom machines, so that tests would not be queued for long as the number
> > >     >> > >>>>>>> of PRs grows daily.
> > >     >> > >>>>>>>
> > >     >> > >>>>>>> The increased timeout would also be very helpful.
> > >     >> > >>>>>>> The 50-minute timeout for free Travis accounts is currently a pain,
> > >     >> > >>>>>>> especially when we'd like to run e2e tests in our own Travis. I had to
> > >     >> > >>>>>>> manually split the jobs to make them pass.
> > >     >> > >>>>>>>
> > >     >> > >>>>>>> Thanks,
> > >     >> > >>>>>>> Zhu Zhu
> > >     >> > >>>>>>>
> > >     >> > >>>>>>> On Wed, Dec 4, 2019 at 6:36 PM Robert Metzger <rmetz...@apache.org> wrote:
> > >     >> > >>>>>>>
> > >     >> > >>>>>>>> Hi all,
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> as a follow-up to our discussion on reducing the build time [1], I
> > >     >> > >>>>>>>> would like to propose migrating our build infrastructure to Azure
> > >     >> > >>>>>>>> Pipelines (away from Travis).
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> I believe that we have reached the limits of what Travis can provide the
> > >     >> > >>>>>>>> Flink community, and I don't want the build system to limit or influence
> > >     >> > >>>>>>>> the project's growth.
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> *Benefits:*
> > >     >> > >>>>>>>> 1. The free Travis accounts are limited to 5 parallel builds, with a
> > >     >> > >>>>>>>> timeout of 50 minutes. Azure offers *10 parallel builds with 300 minute
> > >     >> > >>>>>>>> timeouts* for free for open source projects.
> > >     >> > >>>>>>>> 2. Azure Pipelines allows us to *add custom build machines* to the pool
> > >     >> > >>>>>>>> of 10 free parallel builders.
> > >     >> > >>>>>>>> This will allow the Flink community to scale the available build
> > >     >> > >>>>>>>> capacity as the project grows. We are dependent on donations from
> > >     >> > >>>>>>>> supporting companies, but I believe that it is easier for companies to
> > >     >> > >>>>>>>> donate machines than money.
> > >     >> > >>>>>>>> Alibaba is willing to provide 10 machines, with 32 cores each, to the
> > >     >> > >>>>>>>> Flink project for this purpose.
> > >     >> > >>>>>>>> In addition, Xiyuan, who's working on adding ARM support for Flink,
> > >     >> > >>>>>>>> provided me with 2 ARM machines (16 cores each).
> > >     >> > >>>>>>>> I want to use the custom, more efficient build machines for building
> > >     >> > >>>>>>>> Flink's pull requests and master-pushes.
> > >     >> > >>>>>>>> 3. *Azure Pipelines is a more feature-rich tool*, allowing us, for
> > >     >> > >>>>>>>> example, to transfer intermediate build artifacts between pipeline
> > >     >> > >>>>>>>> stages. This will allow us to make the build more reliable (we are
> > >     >> > >>>>>>>> currently abusing the caching mechanism in Travis for this).
> > >     >> > >>>>>>>> It also has some basic analytics on test results / flaky tests etc.
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> *Known problems:*
> > >     >> > >>>>>>>> - Initially, we might see different build instabilities than before
> > >     >> > >>>>>>>> - There's a higher maintenance overhead for the custom build machines
> > >     >> > >>>>>>>> (keeping them up to date etc.)
> > >     >> > >>>>>>>> - We cannot use the build status integration of AZP, because it
> > >     >> > >>>>>>>> requires write access to the repository's source. The foundation does
> > >     >> > >>>>>>>> not allow that [2].
> > >     >> > >>>>>>>> I propose to extend flinkbot / the flink-ci repository.
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> *Current Status:*
> > >     >> > >>>>>>>> - I'm able [3] to execute [4] the current custom build scripts on Azure
> > >     >> > >>>>>>>> Pipelines: This means that we will have one compile stage, and N testing
> > >     >> > >>>>>>>> jobs in the 2nd stage. Currently, we have N=10 testing jobs.
> > >     >> > >>>>>>>> The time from the start of a build till all tests have completed is
> > >     >> > >>>>>>>> 1 hour 22 minutes.
> > >     >> > >>>>>>>> - I'm working on getting the nightly end-to-end tests to run on the new
> > >     >> > >>>>>>>> infrastructure.
> > >     >> > >>>>>>>> - I'm working on getting the build to work on our pool of custom
> > >     >> > >>>>>>>> machines as well
> > >     >> > >>>>>>>> - I'm working on setting up the full matrix of builds (different Scala,
> > >     >> > >>>>>>>> Hadoop etc. versions) for the nightlies
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> *Next Steps:*
> > >     >> > >>>>>>>> - I propose to document the entire build system in the Flink Wiki
> > >     >> > >>>>>>>> - Once Azure can cover the same pull request tests as Travis, I would set
> > >     >> > >>>>>>>> it up to run in parallel (including Flinkbot posting links to Azure). I
> > >     >> > >>>>>>>> hope that this phase lasts for 1-2 weeks only, so that we do not have to
> > >     >> > >>>>>>>> maintain things concurrently. I will monitor the build stability closely,
> > >     >> > >>>>>>>> but would expect some support with debugging potential issues from the
> > >     >> > >>>>>>>> contributors.
> > >     >> > >>>>>>>> - Once there are no problems with the new setup, we remove the Travis
> > >     >> > >>>>>>>> setup.
> > >     >> > >>>>>>>> - Independently, I will work on triggering builds from master /
> > >     >> > >>>>>>>> release-branch pushes, as well as cron builds from the master branch ...
> > >     >> > >>>>>>>> all this will be described in the Wiki.
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> *Timeline:*
> > >     >> > >>>>>>>> - Once I have the feeling that people are supportive of the idea, I will
> > >     >> > >>>>>>>> start documenting in the Wiki. The first pull requests should show up
> > >     >> > >>>>>>>> after a few more days.
> > >     >> > >>>>>>>> I will take a one-month parental leave starting some time later in
> > >     >> > >>>>>>>> December, which will probably delay things a bit. I hope to have
> > >     >> > >>>>>>>> everything finished by the end of January.
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> I'm happy to hear your thoughts on this work.
> > >     >> > >>>>>>>> If nobody objects, I will start documenting the system and prepare
> > >     >> > >>>>>>>> everything for the migration.
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> Best,
> > >     >> > >>>>>>>> Robert
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>>
> > >     >> > >>>>>>>> [1] https://lists.apache.org/thread.html/b90aa518fcabce94f8e1de4132f46120fae613db6e95a2705f1bd1ea@%3Cdev.flink.apache.org%3E
> > >     >> > >>>>>>>> [2] https://issues.apache.org/jira/browse/INFRA-17030
> > >     >> > >>>>>>>> [3] https://github.com/rmetzger/flink/tree/azure_playground
> > >     >> > >>>>>>>> [4] https://dev.azure.com/rmetzger/Flink/_build?definitionId=4&_a=summary
> > >     >> > >>>>>
> > >     >> > >>>>>
> > >     >> > >>>>>
> > >     >> > >>>>
> > >     >> > >>>>
> > >     >> > >>>
> > >     >> > >>
> > >     >> > >>
> > >     >> > >> --
> > >     >> > >> Best Regards
> > >     >> > >>
> > >     >> > >> Jeff Zhang
> > >     >> > >>
> > >     >> >
> > >     >> >
> > >     >>
> > >     >
> > >
> > >
> > >
> >
>
