Thanks Brian and Daniel. I agree with the points you both raised. Brian, to your specific questions/points:

## We need details on each piece of the Travis workflow, where it will be ported to, and a rough estimate of how long each piece would take. I think these things would make a great EPIC.

I have a Github Actions epic. I plan to update it this week based on our conversation and will add more specific details, estimates, etc. I'll respond when it's ready for review.

https://pulp.plan.io/issues/6065

## Who will work on it? It needs I think 2 fully dedicated people who already completely understand the Travis stuff in detail. It's too hard for one person and would take too long...

I definitely agree we need at least 2 people to work on this. We need as many people as possible to understand Github Actions.

I don't know who has time for this right now. I imagine it'll probably have to wait until next sprint (Sprint 67). Or at least I personally won't have time until next week at the earliest. That'll give us time to plan though.

In the meantime, I'd consider letting the installer team merge Fabricio's ansible-pulp PR[0]. This will also alleviate much of the immediate need and let us begin collecting real world data/experience as well.

## It's got to happen fully - If we're leaving Travis for Github Actions, we have to fully leave
## I think it would be good if when a plugin switches, they switch fully-and-at-once from Travis to Github Actions...

Makes sense to me.

## It needs to come with education somehow. Maybe a demo video, blog post recap, and certainly great docs replacing the Travis ones we have now.

100% agreed. I added these items to the epic.

[0] https://github.com/pulp/ansible-pulp/pull/217

David


On Sat, Feb 8, 2020 at 12:28 PM Daniel Alley <[email protected]> wrote:

> Thanks for your response Brian, I think all of those concerns are
> reasonable! I'll try to add to/help with some of them.
>
> The approach Fabricio took with his PR to pulp_file is incredibly smart, I
> think. In his PR to pulp_file, all of the CI scripts remain unchanged.
> He just fakes being in a Travis environment by using the information that GHA
> provides to set all of the $TRAVIS_* environment variables [0] that those
> scripts use. Not only was this much faster to do than doing a wholesale
> conversion to idiomatic GHA (Fabricio got everything working with only
> about 2 days of work!), it means that Travis can continue running for as
> long or as short as we want it to, and once we do switch over the process
> of converting the CI scripts to be more idiomatic with GHA can be done at
> our leisure rather than frontloading a bunch of work.
>
> Adding GHA support to the template and the other plugins should be as
> simple as taking the GHA configs (analogous to .travis.yml) that are
> already written for pulp_file, generalizing them for the template, and then
> re-applying the template to all the other plugins. I don't expect that it
> would take longer than one engineer-day to complete the whole process!
>
> [0] https://github.com/pulp/pulp_file/pull/353/files#diff-d45cbc8d15de0f15cdce609ec195cf2eR34-R47
>
> On Sat, Feb 8, 2020 at 10:22 AM Brian Bouterse <[email protected]>
> wrote:
>
>> Thanks for replying @dalley and @daviddavis, both of your replies make
>> good points that resonate with me. Rather than inline responses, I'll try
>> to bring back some of your points and comment on them.
>>
>> @dalley, your articulation of how we would split up the CI to run each
>> part on only one CI platform sounds good to me. +1 to the SELinux and FIPS
>> testing running on Centos CI, and everything else running in another CI.
>> This addresses my concern that we were going to duplicate features from one
>> CI to another.
>>
>> @daviddavis +1 to merging PRs to give us more Github Actions data on
>> repos that are not managed by the plugin_template. I'm concerned about
>> merging Github Action PRs against plugin_template managed repos.
>> For example with pulp_file, I work on that regularly and I'd like to continue
>> using the existing CI capabilities it has as-is until the new system is
>> ready. Let me know if you think we should do this aspect differently.
>>
>> @daviddavis to your point that we must move to Github Actions and off of
>> Travis makes sense to me because Travis is a huge bottleneck and Github
>> Actions can run a lot more in parallel. If we're going to do that though I
>> think we need to see a plan on how and when Pulp would leave Travis for
>> Github Actions. In terms of making such a plan I would think it would need
>> a few aspects in it:
>>
>> * We need details on each piece of the Travis workflow, where it will be
>>   ported to, and a rough estimate of how long each piece would take. I think
>>   these things would make a great EPIC.
>> * Who will work on it? It needs I think 2 fully dedicated people who
>>   already completely understand the Travis stuff in detail. It's too hard for
>>   one person and would take too long. Not being able to have these people
>>   fully-dedicated on this task would be a deal-breaker for me. This type of
>>   activity needs no distraction.
>> * It's got to happen fully - If we're leaving Travis for Github Actions,
>>   we have to fully leave.
>> * I think it would be good if when a plugin switches, they switch
>>   fully-and-at-once from Travis to Github Actions. I think this because
>>   otherwise, every few days, another plugin_template update will take away a
>>   Travis feature and move it to Github Actions, which across the 10+ plugins
>>   and 10+ features would be painful. This would be very confusing I think.
>> * It needs to come with education somehow. Maybe a demo video, blog post
>>   recap, and certainly great docs replacing the Travis ones we have now.
>>
>> I'm suggesting a plan instead of a decision because without a plan.
>> I don't know how long the work will take, and thus I can't know if we can
>> afford it in terms of development capacity now. Given the whole convo, I'm
>> more wondering if "now is the right time" and less about "if this is the
>> right long-term idea". I think the best long-term situation for the Pulp
>> development community is likely not with Travis. Now could be the right
>> time, if we look at the development team and determine if we can meet all
>> of our goals while fully dedicated 1-3 people to this other effort.
>>
>> Let me know how I can help. Thank you both and Fabricio for continuing to
>> drive this improvement for the community.
>>
>> -Brian
>>
>> On Thu, Feb 6, 2020 at 12:29 PM Daniel Alley <[email protected]> wrote:
>>
>>> I agree that Centos CI should be a high priority, however I think it is
>>> still important to discuss what we want our end-state to look like, because
>>> that will strongly influence our approach going forwards. And FWIW, I
>>> don't think Fabricio's work will do any harm in this respect, especially
>>> given that the main focus has been on repos that don't use the template
>>> (pulp-rpm-prerequisites, ansible-pulp), and are putting enough load on
>>> Travis to cause us tangible problems (ansible-pulp, pulp_file performance
>>> tests).
>>>
>>> I don't believe Fabricio was suggesting that some plugins would use
>>> Travis and other plugins would use Github Actions. It was an idea thrown
>>> around that maybe we would want to support a choice of CI for potential
>>> plugin writers, but personally I think we should just ditch Travis
>>> entirely. The outages (such as the one on Monday) and resource
>>> restrictions are hindering development, and I don't expect it to get better
>>> considering how many senior engineers they laid off after being sold to
>>> a private equity firm with a poor reputation.
>>> <https://news.ycombinator.com/item?id=19218036>
>>>
>>> But I also don't think we should try to use Centos CI to replace all the
>>> things Travis is currently doing. I would rather use Github Actions for
>>> everything except for the very few workflows that require Centos CI,
>>> namely, running tests on a FIPS platform and with SELinux configured. I
>>> think that this proposal would be both the optimal outcome, and also the
>>> easiest thing to do, and here is why.
>>>
>>> Centos CI would not be involved with any of the following:
>>>
>>> * Code formatting lints
>>> * Commit message checks
>>> * Changelog checks
>>> * Everything involving a matrix of different combinations of Python /
>>>   PostgreSQL / Django versions
>>> * Deploy to PyPI upon pushing a new tag
>>> * Testing things against a specific PR or PRs (probably, if we were to
>>>   run the jobs nightly instead of on every PR, which doesn't strike me as
>>>   necessary)
>>>
>>> The majority of CI complexity is due to these auxillary features and I
>>> don't see any reason to try to port this to Jenkins/Centos CI, much less
>>> try to maintain it across both CI systems. Here we agree: that would be a
>>> nightmare. Almost all of the CI-service-specific code deals with these
>>> auxillary checks. But Fabricio has already proven that these things are
>>> relatively easy to port to Github Actions, which, while different from
>>> Travis, is much more similar to Travis than Jenkins is. And this work is
>>> already done, and will be really easy to port back into the plugin template
>>> to use everywhere.
>>>
>>> Of our various CI scripts, the only ones which would be remain in common
>>> between GHA and CentosCI are install.sh and before_script.sh, which perform
>>> the core setup tasks for our containers. Every other script in our
>>> .travis/ directory does something which can be the sole concern of Github
>>> Actions.
>>> So the maintenance burden of maintaining that small amount of
>>> common code would not be very high, and certainly not double.
>>>
>>> On Thu, Feb 6, 2020 at 10:03 AM David Davis <[email protected]>
>>> wrote:
>>>
>>>> I think there is an immediate need to move to Github Actions.
>>>> Yesterday, for example, I spent a good deal of time on failing pulp_file
>>>> jobs, which are exceeding Travis' 50 minute threshold[0] (Github Actions
>>>> has a 6 hour limit). We've also been working for weeks on alleviating the
>>>> bottlenecks that we've been experiencing due to Travis' limit of 3
>>>> concurrent jobs. Paying the Travis tax is detracting from our stakeholder
>>>> work.
>>>>
>>>> Regarding supporting two CIs, won't we have to support multiple CIs to
>>>> run against selinux and FIPS? The only alternative would be to move
>>>> everything to CentOS CI. Fabricio's pulp_file PR demonstrates though that
>>>> our CI scripts can be made to run in multiple CIs. These scripts are the
>>>> majority of our CI/CD code; the Travis/Github Actions configs are only a
>>>> couple hundred lines. So most of our code will be shared across CIs, which
>>>> should alleviate most of the burden of supporting more than one CI.
>>>>
>>>> I would suggest as a next step we merge the ansible-pulp PR[1] as it
>>>> should provide some real world data about running on Github Actions which
>>>> we can consider. Moreover, its CI is independent from the plugin_template
>>>> and it should help to alleviate most of our bottlenecks in Travis. We can
>>>> postpone the decision around plugins until we have more data and consensus.
>>>>
>>>> [0] https://pulp.plan.io/issues/6104
>>>> [1] https://github.com/pulp/ansible-pulp/pull/217
>>>>
>>>> David
>>>>
>>>>
>>>> On Thu, Feb 6, 2020 at 5:51 AM Brian Bouterse <[email protected]>
>>>> wrote:
>>>>
>>>>> Inline replies to three convos would be too confusing, so I'm going to
>>>>> try to bring it back to a single thread.
>>>>>
>>>>> The Pulp team can't afford to do two CI's. I estimate it's taken many
>>>>> hundreds of hours cumulatively and probably >10 hours a week at least
>>>>> maintaining the CI for Travis in the plugin template. The current
>>>>> commitments and size of the pulp dev team can't sustain doubling that
>>>>> additional level of investment. Think about allllllll the changes that we
>>>>> make weekly. Are we prepared to "port" those continuously? I'm not. I think
>>>>> it's categorically a non-starter from a resource perspective.
>>>>>
>>>>> I don't think it's a good thing to split the plugins to use various
>>>>> CI's. Today if something doesn't work, it doesn't work in all plugins CI,
>>>>> and if someone fixes it, all plugins get fixed (for the most part).
>>>>> Splitting plugins across different CI's with incompatible features and no
>>>>> parity between them will put us in a situation where we lose the benefits
>>>>> of every improvement improving everyone.
>>>>>
>>>>> Is this work being done to serve a stakeholder asking for it? I ask
>>>>> because if it isn't, it's taking the place of work stakeholders are asking
>>>>> for to be delivered in Feb and March. Those timelines are so close, I'm
>>>>> surprised others perceive that now is the right time to take on a goal like
>>>>> this.
>>>>>
>>>>> I'm on PTO until the 17th so I will only be able to provide input on
>>>>> his decision sparsely until then.
>>>>>
>>>>> I'm perceiving that people don't want to continue on Travis and this
>>>>> is the way for some plugin writers to leave Travis.
>>>>> The problem is that
>>>>>
>>>>> On Wed, Feb 5, 2020 at 12:44 PM Fabricio Aguiar <[email protected]> wrote:
>>>>>
>>>>>> I believe we can add GH actions on plugin_template, then we would
>>>>>> have:
>>>>>> $ ./plugin-template --travis PLUGIN_NAME
>>>>>> or
>>>>>> $ ./plugin-template --ghactions PLUGIN_NAME
>>>>>> it is not implemented yet on plugin_template,
>>>>>> but my experience with pulp_file
>>>>>> (https://github.com/pulp/pulp_file/pull/353) makes me think it will
>>>>>> be easy to create a template for it since I didn't change many files,
>>>>>> and I have not removed travis.
>>>>>> This way, we can make plugin_template run both, travis and GH actions.
>>>>>> Working with GH actions was a good exercise, I struggled to find a
>>>>>> replacement for TRAVIS_COMMIT_RANGE, and got some config issues with
>>>>>> kubectl and httpie.
>>>>>> I personally think changing to GH is totally optional for plugins,
>>>>>> but I believe ansible-pulp and pulp_rpm_prerequisites should move to GH
>>>>>> actions, as both not use plugin_template and consume a lot of time.
>>>>>> And make plugin_template run in both travis and GH actions, for
>>>>>> pushing us to be more agnostic.
>>>>>>
>>>>>> Best regards,
>>>>>> Fabricio Aguiar
>>>>>> Software Engineer, Pulp Project
>>>>>> Red Hat Brazil - Latam <https://www.redhat.com/>
>>>>>> +55 11 999652368
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 5, 2020 at 2:16 PM David Davis <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Brian,
>>>>>>>
>>>>>>> Thanks for the feedback. Responses inline below.
>>>>>>>
>>>>>>> On Wed, Feb 5, 2020 at 10:31 AM Brian Bouterse <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm concerned about the move to GH actions and also the timing. The
>>>>>>>> benefits of lowering the CI runtime are really great, but I'm worried it
>>>>>>>> isn't helping us towards our goals and even takes us further from them.
>>>>>>>>
>>>>>>>> I'm worried about double the outage risk.
>>>>>>>> There are outages, and
>>>>>>>> structurally repo CI pipelines that require more services are at more risk
>>>>>>>> for total outage. This raises the risk of "total CI pipelines halting" in a
>>>>>>>> concerning way for me. Trading runtime for risk I don't think is an overall
>>>>>>>> win; I'd like to find a way to lower the runtime and keep risk the same or
>>>>>>>> lower.
>>>>>>>>
>>>>>>>
>>>>>>> We've been plagued by Travis outages and bottlenecks over the past
>>>>>>> year. Our plugin_template is currently tied to Travis so one option would
>>>>>>> be to allow plugin writers to choose which CI to use and divorce Pulp from
>>>>>>> being tied to a single one. This ought to reduce risk and the impact of
>>>>>>> outages.
>>>>>>>
>>>>>>>>
>>>>>>>> Whatever we do I want to make sure we're doing it fully through the
>>>>>>>> plugin template. Is this through the plugin template? If it isn't, or it
>>>>>>>> requires additional steps to configure it than they had before, then I'm
>>>>>>>> concerned about it taking us further from our goals of having the plugin
>>>>>>>> writer take as much burden from the plugin writer as possible. I use this
>>>>>>>> thinking to answer the question posed from daviddavis. My take is that the
>>>>>>>> plugin template's goal is to make writing a plugin with great CI as easy as
>>>>>>>> possible. It's design to be a quality improver and a time saver.
>>>>>>>>
>>>>>>>
>>>>>>> Agreed, the goal is to update the plugin_template. The plan is to
>>>>>>> start by moving ansible-pulp to Github Actions first and test out Github
>>>>>>> Actions as a viable replacement for Travis. Then move pulpcore and plugins
>>>>>>> (via the plugin_template).
>>>>>>> The ansible-pulp repo doesn't use
>>>>>>> plugin_template for its CI configuration so we don't have to change the
>>>>>>> plugin_template in testing out Github Actions for ansible-pulp and also
>>>>>>> ansible-pulp is the main hog of our Travis resources consuming job runners
>>>>>>> for 1+ hours.
>>>>>>>
>>>>>>> To your point about the plugin_template, supporting Github Actions
>>>>>>> shouldn't add additional burden to the plugin writer. The two options are
>>>>>>> to either move to Github Actions wholesale or let plugin writers choose
>>>>>>> which CI to use (which we could default). Either option would require zero
>>>>>>> extra steps for plugin writers. And the latter would give more flexibility
>>>>>>> to plugin writers if they want to use a different CI.
>>>>>>>
>>>>>>>>
>>>>>>>> Having the lower runtime is nice, but if we're going to put effort
>>>>>>>> in the CI, I'd like to bring up prioritizing getting the plugin_template
>>>>>>>> integrated with https://ci.centos.org/ as a high-value goal. I'm
>>>>>>>> concerned that we're about to ship the SELinux policy and we have no way to
>>>>>>>> test it. Similar concerns with certguard's dependency and its dependencies
>>>>>>>> not being packaged on Ubuntu (so it's hard to run on Travis). Also, I'm
>>>>>>>> concerned we don't have an environment to evaluate FIPS compatibility with.
>>>>>>>> Relatively speaking if we can only do one of these two initiatives at this
>>>>>>>> time, I believe we should do the CentOS CI.
>>>>>>>>
>>>>>>>
>>>>>>> I don't see moving to CentOS CI and Github Actions as mutually
>>>>>>> exclusive. In fact, I think moving to Github Actions could make it easier
>>>>>>> to use to CentOS CI by making our CI/CD code more CI agnostic. Moreover,
>>>>>>> much of the hard work to move to Github Actions was already completed by
>>>>>>> Fabricio last week.
>>>>>>>
>>>>>>>> Lowering the runtime I'm really in favor of, so I hope these
>>>>>>>> concerns prompt discussion more than stop the initiative. What do you all
>>>>>>>> think?
>>>>>>>>
>>>>>>>> On Wed, Feb 5, 2020 at 9:05 AM David Davis <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Great question. IMO the main benefit in continuing to support
>>>>>>>>> Travis is that we could better separate our test/deployment code from the
>>>>>>>>> CI specific bits so that most of the plugin_template code could be CI
>>>>>>>>> agnostic. That said, this would be more work. I think it comes down to
>>>>>>>>> whether we want our plugin_template to be more opinionated or more
>>>>>>>>> configurable.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 5, 2020 at 8:18 AM Dana Walker <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> +1 to moving to Github Actions.
>>>>>>>>>>
>>>>>>>>>> Can anyone think of reasons a plugin would want to stay with
>>>>>>>>>> Travis specifically? As fao89 pointed out on the issue, at least each
>>>>>>>>>> plugin that does choose to move takes some of the workload with them to
>>>>>>>>>> free up job runners for plugins that choose to remain.
>>>>>>>>>>
>>>>>>>>>> Dana Walker
>>>>>>>>>> She / Her / Hers
>>>>>>>>>> Software Engineer, Pulp Project
>>>>>>>>>> Red Hat <https://www.redhat.com>
>>>>>>>>>> [email protected]
>>>>>>>>>> <https://www.redhat.com>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 4, 2020 at 10:26 AM David Davis <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Over the past year, we've experienced several growing pains with
>>>>>>>>>>> using Travis as our CI/CD environment. Perhaps the biggest has been the
>>>>>>>>>>> limitation of having only 3 concurrent job runners[0] across our entire
>>>>>>>>>>> Pulp organization.
>>>>>>>>>>> At times, it has slowed development by bottlenecking the
>>>>>>>>>>> merging of PRs and delayed numerous releases of Pulp.
>>>>>>>>>>>
>>>>>>>>>>> Last year, Github introduced Github Actions which offers open
>>>>>>>>>>> source projects 20 concurrent jobs[1]. I've filed an issue here to get
>>>>>>>>>>> feedback on moving our repos and plugins to Github Actions:
>>>>>>>>>>>
>>>>>>>>>>> https://pulp.plan.io/issues/6065
>>>>>>>>>>>
>>>>>>>>>>> Also, @fao89 has opened a couple PoC PRs to demonstrate using
>>>>>>>>>>> Github Actions:
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/pulp/pulp_file/pull/353
>>>>>>>>>>> https://github.com/pulp/ansible-pulp/pull/217
>>>>>>>>>>>
>>>>>>>>>>> You'll notice for example that the ansible-pulp build time went
>>>>>>>>>>> from more than 1 hour[2] to 27 minutes[3] as all the jobs ran in parallel
>>>>>>>>>>> on Github Actions.
>>>>>>>>>>>
>>>>>>>>>>> Unless there are objections, we plan to merge the ansible-pulp
>>>>>>>>>>> PR this week since it's CI configuration is independent from other pulp and
>>>>>>>>>>> plugin repos (ie it doesn't use the plugin_template's Travis files).
>>>>>>>>>>>
>>>>>>>>>>> We're hoping though to get feedback on whether we should move
>>>>>>>>>>> pulpcore and plugin repos to Github Actions. If so, should we provide
>>>>>>>>>>> plugins with the option to continue using Travis if they want?
>>>>>>>>>>>
>>>>>>>>>>> If there's no objections by February 11, 2020, we'll proceed
>>>>>>>>>>> with moving pulp_file to Github Actions and look at updating
>>>>>>>>>>> plugin_template.
>>>>>>>>>>>
>>>>>>>>>>> [0] https://travis-ci.com/plans
>>>>>>>>>>> [1] https://help.github.com/en/actions/automating-your-workflow-with-github-actions/workflow-syntax-for-github-actions#usage-limits
>>>>>>>>>>> [2] https://travis-ci.org/pulp/ansible-pulp/builds/645651353
>>>>>>>>>>> [3] https://github.com/fabricio-aguiar/ansible-pulp/actions/runs/33601847
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Pulp-dev mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>> Pulp-dev mailing list
>>>>>>>>> [email protected]
>>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>> Pulp-dev mailing list
>>>>>>> [email protected]
>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>>>>
>>>>>> _______________________________________________
>>>> Pulp-dev mailing list
>>>> [email protected]
>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>
>>>
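
P.S. For anyone skimming the thread: the approach Daniel describes above (leaving the CI scripts untouched and setting the $TRAVIS_* variables from what Github Actions provides) can be sketched roughly as below. This is only an illustration and not the code from the pulp_file PR. The function name travis_env_from_github is hypothetical, and which GITHUB_* variable feeds which TRAVIS_* variable is my assumption based on the two services' documented default environment variables. It also does not cover TRAVIS_COMMIT_RANGE, which Fabricio mentioned had no obvious replacement.

```python
import os
import re
import shlex


def travis_env_from_github(gh):
    """Derive TRAVIS_* variables from a dict of GITHUB_* variables.

    Hypothetical helper: the mapping below is an assumption, not the
    mapping used in the pulp_file PR.
    """
    ref = gh.get("GITHUB_REF", "")
    event = gh.get("GITHUB_EVENT_NAME", "push")
    env = {
        "TRAVIS": "true",
        "TRAVIS_COMMIT": gh.get("GITHUB_SHA", ""),
        "TRAVIS_REPO_SLUG": gh.get("GITHUB_REPOSITORY", ""),
        "TRAVIS_BUILD_DIR": gh.get("GITHUB_WORKSPACE", ""),
        "TRAVIS_EVENT_TYPE": event,  # simplification: event names passed through
        "TRAVIS_PULL_REQUEST": "false",  # Travis uses the string "false" for non-PR builds
        "TRAVIS_TAG": "",
    }
    pr = re.fullmatch(r"refs/pull/(\d+)/merge", ref)
    if event == "pull_request" and pr:
        # On pull requests, Travis reports the PR number and the target branch.
        env["TRAVIS_PULL_REQUEST"] = pr.group(1)
        env["TRAVIS_BRANCH"] = gh.get("GITHUB_BASE_REF", "")
        env["TRAVIS_PULL_REQUEST_BRANCH"] = gh.get("GITHUB_HEAD_REF", "")
    elif ref.startswith("refs/tags/"):
        # On tag builds, Travis sets both TRAVIS_TAG and TRAVIS_BRANCH to the tag.
        tag = ref[len("refs/tags/"):]
        env["TRAVIS_TAG"] = tag
        env["TRAVIS_BRANCH"] = tag
    elif ref.startswith("refs/heads/"):
        env["TRAVIS_BRANCH"] = ref[len("refs/heads/"):]
    else:
        env["TRAVIS_BRANCH"] = ref
    return env


if __name__ == "__main__":
    # Print shell "export" lines that a workflow step could source
    # before calling the existing, unchanged CI scripts.
    for key, value in sorted(travis_env_from_github(os.environ).items()):
        print(f"export {key}={shlex.quote(value)}")
```

The point of keeping the shim in one place is that the existing scripts stay untouched, so Travis can keep running alongside Github Actions for as long as we want.
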
_______________________________________________
Pulp-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-dev
