Reproducible builds [Airflow] -> done

2024-01-14 Thread Jarek Potiuk
Hey everyone, I just wanted to share a little accomplishment I (mostly) implemented in Airflow - I just merged the last PR to get fully reproducible builds for all the Python artifacts we produce and publish in downloads.apache.org (python whl, sdist packages, source tarballs). All our 90 or so

Re: Github hard limit on job execution time

2023-04-18 Thread Jarek Potiuk
thanks for all the feedback. At this time there are no sponsors for > Geode > > > so cannot have self-hosted runners. I have already split the job > running > > > tests into multiple jobs by gradle module, and even then one particular > > > gradle

Re: Github hard limit on job execution time

2023-04-14 Thread Jarek Potiuk
In many cases it can be done by choosing a bigger machine with more CPUs and parallelising, as others mentioned. This is cool if your tests are pure unit tests and you can just add the `-n` flag from pytest-xdist or similar (this is a pytest extension to run your tests in parallel with as many CPUs as you can).
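A minimal sketch of the parallelisation idea in the message above, assuming pytest with the pytest-xdist plugin installed. With pytest-xdist the worker count is passed with `-n` (`-n auto` lets the plugin pick it from the available CPUs). The helper name `pytest_parallel_command` and the `tests` path are hypothetical; the function only builds the command line and does not run anything.

```python
def pytest_parallel_command(test_path, workers=None):
    """Build a pytest command line that runs tests in parallel via pytest-xdist.

    workers=None maps to "-n auto", which lets pytest-xdist choose the
    number of worker processes from the CPUs it detects.
    """
    n = "auto" if workers is None else str(workers)
    return ["pytest", "-n", n, test_path]


if __name__ == "__main__":
    # Hypothetical example: run everything under tests/ with auto-sized workers.
    print(" ".join(pytest_parallel_command("tests")))
```

This only helps when tests are independent of each other; tests that share a database or other external state usually need isolation before they can be spread across workers.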

Re: Multi-arch container images on DockerHub

2022-12-06 Thread Jarek Potiuk
ver 2 hours, compare to 10 > >> minutes for amd64), but this approach eliminates the need to build for > more > >> than one platform at a time, so if the arm64 job were to run on an arm64 > >> runner in the future, that wouldn't need to further complicate the

Re: Multi-arch container images on DockerHub

2022-12-06 Thread Jarek Potiuk
In Airflow we have a somewhat more complex setup (we are building 2x5x2 different images and there are different sets of them for different branches). Building images for Airflow takes quite some time (installing many dependencies), so qemu was out of the question (several hours to build a single image).

Bigger Public Runners on GitHub

2022-12-05 Thread Jarek Potiuk
Hello here, We recently discussed the possibility of enabling bigger public runners for GitHub Actions. I recall that it was supposed to be raised at a meeting with GitHub. We would very much like to start using bigger runners - we started to experience much more frequent intermittent

Re: Meeting tomorrow

2022-11-10 Thread Jarek Potiuk
Hmm. Is it just me, or the link from builds page is incorrect? Got "this link is invalid". On Wed, Nov 9, 2022 at 9:55 AM Jarek Potiuk wrote: > > +1 > > On Wed, Nov 9, 2022 at 8:20 AM Gavin McDonald wrote: >> >> Just a reminder, as advertised on the 27th Octobe

Re: Meeting tomorrow

2022-11-09 Thread Jarek Potiuk
+1 On Wed, Nov 9, 2022 at 8:20 AM Gavin McDonald wrote: > Just a reminder, as advertised on the 27th October so hopefully > plenty of notice this time! - We have a builds meeting tomorrow. > > Add your name to the cwiki page if you plan on attending. > > >

Re: Meeting Summary 20th October 2022

2022-10-26 Thread Jarek Potiuk
Thanks. Really helpful :) On Wed, Oct 26, 2022 at 8:18 AM Gavin McDonald wrote: > > Hi All, > > Meeting notes from last week are now up at: > > https://cwiki.apache.org/confluence/display/BUILDS/Builds+Agenda+--+2022-10-20 > > Future meeting are scheduled for 1530 UTC on the 2nd Thursday of each

Re: Meeting this Thursday

2022-10-19 Thread Jarek Potiuk
Unfortunately, I am at Cape Canaveral, Kennedy Space Centre, and watching the Starlink Launch about that time (or at Atlantis Shuttle tour depending on the launch progress) so I will have to pass this time :(. But I'm looking forward to the result of the Github discussion afterwards. I reviewed

Re: Building with Travis - anyone?

2022-09-01 Thread Jarek Potiuk
In Airflow we moved from Travis to GA 2 years ago, because for us it started to degrade back then. We never looked back. There are a few docs and ongoing discussions that you might want to check out as well, Phil: * The "status" and "some overview" - look here

Re: GitHub Actions runners

2022-08-16 Thread Jarek Potiuk
As the original author - not much has changed since. I regularly follow what GitHub Actions releases to see whether the problems we raised with them have been addressed. All the updates regarding the security tightening are recorded in the wiki page (not much changed - there are a number of

Re: GitHub Action Status

2022-05-11 Thread Jarek Potiuk
Pretty current. I refreshed it recently. On Wed, May 11, 2022 at 12:04 AM Giovanni Bechis wrote: > Hi, > I was planning to work on SpamAssassin regression tests using GitHub > actions but I noticed this wiki: > > https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status > > is

Re: GitHub Actions - Change in Behaviour - Automatic Pull Request Creation via GH

2022-05-05 Thread Jarek Potiuk
GitHub does NOT make things easy :) On Thu, May 5, 2022 at 10:38 AM Gavin McDonald wrote: > On Thu, May 5, 2022 at 10:15 AM Jarek Potiuk wrote: > > > See > > > > > https://github.blog/changelog/2022-05-03-github-actions-prevent-github-actions-from-creating-

Re: GitHub Actions - Change in Behaviour - Automatic Pull Request Creation via GH

2022-05-05 Thread Jarek Potiuk
See https://github.blog/changelog/2022-05-03-github-actions-prevent-github-actions-from-creating-and-approving-pull-requests/ - that prevention was introduced by GitHub 2 days ago. On Thu, May 5, 2022 at 9:57 AM Richard Zowalla wrote: > Hi all, > > we automatically re-generate some files during

Re: Meeting next week

2022-04-21 Thread Jarek Potiuk
+1 On Thu, Apr 21, 2022 at 10:35 PM Gavin McDonald wrote: > > Hi All, > > I have been terrible at remembering to organise our next builds@ meeting, > > I'd like to go for Next Thursday 28th April at 4PM UTC. > > If that is agreeable with everyone I'll lock it in and open up a wiki page > for >

Re: ETA on GHA self-hosted runner security audit?

2022-04-13 Thread Jarek Potiuk
On Sat, Apr 9, 2022 at 8:10 AM Chesnay Schepler wrote: > > i think we'd be alright with those caveats. > > 1) It depends on how far you wanna go. If the goal is to only protect > the workflow files themselves (and not any scripts that the workflow calls), > then you only need to check out a

Re: ETA on GHA self-hosted runner security audit?

2022-04-07 Thread Jarek Potiuk
that. > > For 3rd party actions we, as of right now, only intend to use those > provided by Github (things like checkout or artifact management). Our > pipeline isn't really complex or fancy, just expensive. > Since these actions offer pretty basic functionality there is no need to >

Re: ETA on GHA self-hosted runner security audit?

2022-04-06 Thread Jarek Potiuk
ave very much read that. > > On 06/04/2022 19:22, Jarek Potiuk wrote: > > Since you referred Ash's link you probably have not read this: > > > > However this is not something to tackle lightly, as Infra *will not manage > > or secure your VM* - that is up to you. >

Re: ETA on GHA self-hosted runner security audit?

2022-04-06 Thread Jarek Potiuk
Since you referred Ash's link you probably have not read this: However this is not something to tackle lightly, as Infra *will not manage or secure your VM* - that is up to you. On Wed, Apr 6, 2022 at 7:21 PM Chesnay Schepler wrote: > This article also lists self-hosted runners as an option:

Re: ETA on GHA self-hosted runner security audit?

2022-04-06 Thread Jarek Potiuk
Hello Chesnay, I think your expectations are a bit too high and I am not sure why. Not sure who you talked to at Airflow, but we always underline and stress and warn that our solution is really "experimental" and "works for us" because we invested an awful (and I mean awful) lot of time in

Re: builds meeting

2022-02-18 Thread Jarek Potiuk
Fine for me if it's 5pm UTC. Proposal/Suggestion - I have organised a few meetings with big groups of people and "doodle" worked beautifully for that: https://doodle.com/dashboard - you can send a link with a number of proposals matching the times when you are available and people can mark

Re: Disable GH Actions from approving PRs

2022-02-08 Thread Jarek Potiuk
It would only work If you want "All" your PRs to be "approved". => It would only work If you want "All" your PRs to be "approved" by the GH Actions. On Tue, Feb 8, 2022 at 4:00 PM Jarek Potiuk wrote: > Depends on the scheme you choose. >

Re: Disable GH Actions from approving PRs

2022-02-08 Thread Jarek Potiuk
actions that verify stuff, then the project > >> configures this to N+1. > >> > >> Admittedly this would mean that every project has to use this properly. > >> > >> On 08/02/2022 15:14, Jarek Potiuk wrote: > >>> In short - just to explain why: > >>

Re: Disable GH Actions from approving PRs

2022-02-08 Thread Jarek Potiuk
: > Is it not possible to control the number of approvals that are required? > > So if a project has N github actions that verify stuff, then the project > configures this to N+1. > > Admittedly this would mean that every project has to use this properly. > > On 08/02/2022 1

Re: Disable GH Actions from approving PRs

2022-02-08 Thread Jarek Potiuk
In short - just to explain why: The "protected branches" feature is the way to make sure that the code is looked at and approved by at least 1 committer before it is merged. This is a strong protection - it applies not only in the UI but also prevents you from fast-forwarding the "protected branch" to the

Re: dockerhub limitations - maximum user count reached

2022-01-21 Thread Jarek Potiuk
rtifactory instance can act as a docker > registry. > > -Chris > > > > > > On Jan 21, 2022, at 9:30 AM, Jarek Potiuk wrote: > > > > We are pushing DockerHub images using personal accounts in Airflow. I > > believe we have three "active" user

Re: dockerhub limitations - maximum user count reached

2022-01-21 Thread Jarek Potiuk
We are pushing DockerHub images using personal accounts in Airflow. I believe we have three "active" users - Kaxil, Jarek, Ash - and likely some more seasoned PMC members (those can be removed). Basically everyone who is a member of the "airflow" team in DockerHub. But we would also like to add one or

Re: ephemeral builds via AWS ECS and/or EKS? GPU Nodes?

2021-12-31 Thread Jarek Potiuk
We do not use Jenkins but we do have self-hosted runners in Airflow - we use VMs and auto-scaling groups rather than ECS or EKS, no GPU nodes though, so I am not sure how it would be useful. But we are looking into using EKS instead at some point, so maybe that's a good opportunity to try it out

Re: Pushing Docker Images

2021-11-16 Thread Jarek Potiuk
Cool! I will take a look at that in the coming weeks :) On Tue, Nov 16, 2021 at 11:35 AM Martin Grigorov wrote: > > Hi Allen, > > I've just documented how one could use Oracle Cloud free plan to build and > test on Linux ARM64 for free! > Please check >

Re: Next builds@ meeting

2021-11-01 Thread Jarek Potiuk
I updated the graphs. I think things have changed a lot since December last year. There are no absurdly high (and way above the limits) peaks of workflows in progress for some projects (e.g. Pulsar). However, the number of Apache repos using GA grows steadily. The number of workflows "queued" fluctuates

Re: Owner apache user license exceeded

2021-09-16 Thread Jarek Potiuk
Maybe the "user license exceeded" problem is somehow related and someone is exploiting the leak? Remote possibility, but who knows. On Thu, Sep 16, 2021 at 11:22 PM Jarek Potiuk wrote: > Just learned something (not using Travis CI but I have not seen this > posted here). Several days ago Travis

Re: Owner apache user license exceeded

2021-09-16 Thread Jarek Potiuk
Just learned something (not using Travis CI but I have not seen this posted here). Several days ago Travis CI had a serious security breach. In short - for about 1 hr anyone could get access to any secrets stored in any public project. https://www.infoq.com/news/2021/09/travis-ci-secrets-leak/

Re: docker login

2021-09-15 Thread Jarek Potiuk
Question out of curiosity (I am not a Jenkins user but I know why we had to run 'docker logins' on private runners). Are Jenkins' public IP addresses excluded from the pull rate limit? I am asking because, if not, preventing login might cause a problem with hitting the 100 pulls/day/IP limit

Re: Dependabot-like solution for Apache projects

2021-09-03 Thread Jarek Potiuk
Agree with Christopher that "technically" this does not matter if branch is fork PR or branch PR. And I also see the usefulness of Dependabot. I used it in the past and it's been extremely easy and helpful - with all the changelogs/release notes right in the PR you could find in exactly the moment

Re: Re: Dependabot-like solution for Apache projects

2021-08-30 Thread Jarek Potiuk
I believe that changed when GitHub bought Dependabot and it became "embedded" in GitHub soon after: https://dependabot.com/blog/hello-github/ J. On Mon, Aug 30, 2021 at 3:43 PM Lewis John McGibbney wrote: > Thanks Gary and Sebb. > How do I turn dependabot on? Last time I tried I was informed

Re: Beta access to the new GitHub issues for Apache ?

2021-07-20 Thread Jarek Potiuk
Actually this is the conversation - https://the-asf.slack.com/archives/CBX4TSBQ8/p1624615203074400. The previous link was where I spoke about it in Slack. J. On Tue, Jul 20, 2021 at 4:17 PM Jarek Potiuk wrote: > Sure. Conversation here (mostly with Daniel): > https://the-asf.sla

Re: Beta access to the new GitHub issues for Apache ?

2021-07-20 Thread Jarek Potiuk
21 at 3:11 PM Jarek Potiuk wrote: > > > Thanks Gavin. As explained on slack, > > > Explained on Slack? - Can't recall this, can I get a link to the > conversation please. > > I have a meeting with Matthew Butler- > > who is the PM of the new GitHub Issues on Tu

Re: Beta access to the new GitHub issues for Apache ?

2021-07-18 Thread Jarek Potiuk
wrote: > Hi All, > > Jarek, signed up. > > On Fri, Jun 25, 2021 at 10:33 AM Jarek Potiuk wrote: > > > Hello everyone (especially the INFRA :) ). > > > > GitHub recently introduced a new beta feature (limited access) of a > > new way of managing Github

Beta access to the new GitHub issues for Apache ?

2021-06-25 Thread Jarek Potiuk
Hello everyone (especially the INFRA :) ). GitHub recently introduced a new beta feature (limited access) of a new way of managing Github Issues https://github.blog/2021-06-23-introducing-new-github-issues/ For now it can only be enabled for organisations, not individuals. It looks really cool,

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-06-20 Thread Jarek Potiuk
ng your build. https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/ . This could help prevent sophisticated supply-chain attacks, for example like the recent Codecov attack. > On Wed, Dec 30, 2020 at 5:25 PM Jarek Potiuk wrote: > > > > F

New Concurrency feature of GitHub Actions -> replaces "cancel-workflow" action of mine

2021-05-24 Thread Jarek Potiuk
Hello everyone, In case you have not noticed, GitHub recently (19th of April) introduced a new feature "concurrency" which adds the possibility of cancelling duplicate workflows in a much "nicer" way than the "cancel-workflow-action" I wrote.
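To illustrate the cancelling behaviour the message above refers to, here is a toy model in Python. It is a sketch under simplifying assumptions, not GitHub's implementation: it models only the case where a workflow declares a `concurrency` group with `cancel-in-progress` enabled, so that a newer run in a group replaces the older one. The class name `ConcurrencyGroups`, the group names and the run ids are all hypothetical.

```python
class ConcurrencyGroups:
    """Toy model: at most one active run per group; a newer run cancels the older one."""

    def __init__(self):
        # group name -> id of the run currently active in that group
        self._active = {}

    def submit(self, group, run_id):
        """Register a new run and return the id of the run it cancels (or None)."""
        cancelled = self._active.get(group)
        self._active[group] = run_id
        return cancelled

    def active_runs(self):
        """Return a copy of the group -> active run id mapping."""
        return dict(self._active)


groups = ConcurrencyGroups()
groups.submit("ci-pr-1", 100)              # first push to PR 1
cancelled = groups.submit("ci-pr-1", 101)  # second push to PR 1 replaces run 100
groups.submit("ci-pr-2", 102)              # a different PR is unaffected
```

Keying the group on something like the pull request or branch is what makes duplicate runs for the same change cancel each other while runs for other changes proceed.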

Re: Ephemeral builds for manual testing

2021-05-04 Thread Jarek Potiuk
This is a great post and great approach from the Superset team. FYI. In Airflow we are working on enabling the very same approach that we did for AWS in GCP, with a slightly different approach when it comes to auto-scaling (a bit simpler I think). We will share it when we finish. J. On Mon,

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

2021-04-20 Thread Jarek Potiuk
> > Is it possible to share the raw data in some form? If you can publish > data in any form (csv? sqlite?) I can generate static html files with > python notebooks which can be shared with everybody... > > I am adding Tobiasz who can share it :). > (BTW, how do you get the data? Do you poll

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

2021-04-19 Thread Jarek Potiuk
Also some comments for the stats. This is good stuff Marton. > Apparently, what's named "jobhours" in your statistics is actually the > runtime for an entire workflow (the sum of all job runtimes for that > workflow). That's at least what I conclude if I look at this workflow, > which your

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

2021-04-19 Thread Jarek Potiuk
I really love what Hyukjin has done. I did not have the capacity to participate in this actively, but this is exactly the way to go, I think (with a caveat). Following the motorway metaphor - everyone (every contributor/committer, not every project) has their own lane and they do not interfere

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

2021-04-08 Thread Jarek Potiuk
> > > That's a good idea. We do need to thank Github to give free resources to > ASF projects, but it's better if we can make it a business: we allow > individual projects to sign deals with Github to get dedicated resources. > It's a bit wasteful to ask every project to set up its own dev ops, >

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

2021-04-07 Thread Jarek Potiuk
On Wed, Apr 7, 2021 at 18:45 ocket wrote: > If your project can afford it, you can add self-hosted GHA runners: > > https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners > The issue with that being that the machine running your actions will >

Re: Increase the number of parallel jobs in GitHub Actions at ASF organization level

2021-04-07 Thread Jarek Potiuk
Just a comment here - as I also commented in the ticket: the document https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status gives a complete overview of where GitHub Actions stand for ASF projects. And we have some nice experiences in Apache Airflow that we will be able to

Re: Hitting GitHub API limits

2021-02-17 Thread Jarek Potiuk
Just one comment - you might want to take a look at the https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status document, which describes some of the open issues we have with GitHub Actions - they relate to scalability/performance and security. Also - maybe the IP Addresses of ASF

Re: GA workflow cancellation

2021-02-09 Thread Jarek Potiuk
ueue strain situation" chapter - where I explain how `allDuplicates` mode helps to fight it. J On Tue, Feb 9, 2021 at 7:53 PM Jarek Potiuk wrote: > > Er... so "source" workflow is the triggering workflow, and "target" > workflow is the cancelling workflow? Or

Re: GA workflow cancellation

2021-02-09 Thread Jarek Potiuk
ly, the action is written in TypeScript, not JavaScript (TypeScript is transpiled to JavaScript), and developing in TypeScript is easy if you have a good IDE :). This might be easier than you think. J. On Tue, Feb 9, 2021 at 7:34 PM Antoine Pitrou wrote: > > On 09/02/2021 at 19:28, Jarek Po

Re: GA workflow cancellation

2021-02-09 Thread Jarek Potiuk
why "allDuplicates" cancel mode was introduced - I explained exactly how it works and how the "high strain" situation is handled in this comment: https://github.com/apache/pulsar/pull/9503#issuecomment-774644408 - I am moving part of the explanation to the documentation of the act

Re: GA again unreasonably slow (again)

2021-02-09 Thread Jarek Potiuk
The reasoning for selective checks here: https://github.com/apache/airflow/blob/master/PULL_REQUEST_WORKFLOW.rst (correct link) On Tue, Feb 9, 2021 at 7:05 PM Jarek Potiuk wrote: > | The real hard problem is knowing when a change requires full regression > and integration testing

Re: GA again unreasonably slow (again)

2021-02-09 Thread Jarek Potiuk
easy lightweight branches all being fully tested > > This is my 10,000 meter view. > > But then I’m old school and on my first job the mainframe printout > included how much the run I made was costing my boss in $. > > Best Regards, > Dave > > Sent from my iPhone >

Re: GA again unreasonably slow (again)

2021-02-09 Thread Jarek Potiuk
on every > platform. As your project scales up, CI becomes a Hard Problem. I don’t > think throwing hardware at it indefinitely works, though your research here > is finding most of the useful things. > > On Tue, Feb 9, 2021 at 02:21 Jarek Potiuk wrote: > > > The report show

Re: GA again unreasonably slow (again)

2021-02-09 Thread Jarek Potiuk
te a little help from your side. So maybe you just submit 2-3 PRs yourself any time Monday - Friday 12pm CET -> 8pm CET - this is when bottlenecks regularly happen. Please let everyone know your findings. J, On Tue, Feb 9, 2021 at 8:35 AM Allen Wittenauer wrote: > > > > On Feb 8, 2

Re: GA again unreasonably slow (again)

2021-02-08 Thread Jarek Potiuk
for others who do. J On Tue, Feb 9, 2021 at 1:33 AM Allen Wittenauer wrote: > > > On Feb 7, 2021, at 4:44 PM, Jarek Potiuk wrote: > > > > If you are interested - my document is here. Open for comments - happy to > > add you as editors if you want (just send m

Re: GA again unreasonably slow (again)

2021-02-08 Thread Jarek Potiuk
Lambertus wrote: > > > > On Feb 8, 2021, at 1:51 PM, Jarek Potiuk wrote: > > > > This uses https://github.com/actions/runner/pull/783 to not have > > un-trusted users run code (security is based on the actors of the commit > - > > committer’s PRs and d

Re: GA again unreasonably slow (again)

2021-02-08 Thread Jarek Potiuk
and an AWS Auto-Scaling Group On Mon, Feb 8, 2021 at 09:58 Antoine Pitrou wrote: > > Hi Jarek, > > Thank you for the document. Could you tell us more about the "custom > security layer" that you implemented? > > Regards > > Antoine. > > >

Re: GA again unreasonably slow (again)

2021-02-07 Thread Jarek Potiuk
ing to validate them even more when we implement all the optimisations, but the conclusions should be pretty sound. https://docs.google.com/document/d/1ZZeZ4BYMNX7ycGRUKAXv0s6etz1g-90Onn5nRQQHOfE/edit# J. On Fri, Jan 8, 2021 at 10:02 PM Jarek Potiuk wrote: > > We should be able to make an e

Re: Using GitHub Actions for Apache Hudi repo

2021-01-31 Thread Jarek Potiuk
workflows - I am happy to help. Also if there are any other ideas how the current situation can be improved, you are most welcome. We are in this together :). J. On Sun, Jan 31, 2021 at 11:49 AM Jarek Potiuk wrote: > The space is created (thanks infra!), I turned my message into this pa

Re: Using GitHub Actions for Apache Hudi repo

2021-01-31 Thread Jarek Potiuk
. On Sun, Jan 31, 2021 at 9:06 AM Jarek Potiuk wrote: > Ticket created: > https://issues.apache.org/jira/projects/INFRA/issues/INFRA-21364 > > J. > > On Thu, Jan 28, 2021 at 5:59 AM Chris Lambertus wrote: > >> >> >> > On Jan 23, 2021, at 10:15 PM, J

Re: Using GitHub Actions for Apache Hudi repo

2021-01-31 Thread Jarek Potiuk
Ticket created: https://issues.apache.org/jira/projects/INFRA/issues/INFRA-21364 J. On Thu, Jan 28, 2021 at 5:59 AM Chris Lambertus wrote: > > > > On Jan 23, 2021, at 10:15 PM, Jarek Potiuk wrote: > > > Let me explain then what I see as the current state of Github Acti

Re: Builds Meeting this Thursday

2021-01-13 Thread Jarek Potiuk
Sure. My bad and mental shortcut. > > Apache Airflow is a PMC. That Committee has *members* that are part of that > committee. Those members are NOT "PMCs". We have about 200 PMCs at the > Foundation, established by the Board. Please stop confusing *members* with > *committees*. >

Re: Builds Meeting this Thursday

2021-01-13 Thread Jarek Potiuk
One of PMCs @Airflow (in ASIA timezone) asked if the meeting can be recorded ? J. On Tue, Jan 12, 2021 at 10:42 PM Jarek Potiuk wrote: > Added my two topics. Thanks Gavin for this opportunity ! > > On Tue, Jan 12, 2021 at 10:20 PM Gavin McDonald > wrote: > >> Please list

Re: Builds Meeting this Thursday

2021-01-12 Thread Jarek Potiuk
Added my two topics. Thanks Gavin for this opportunity ! On Tue, Jan 12, 2021 at 10:20 PM Gavin McDonald wrote: > Please list on the cwiki meeting page any questions you have > for Brian so that I may send them to him ahead of time, if > possible. > > On Tue, Jan 12, 2021 at 10:00 PM Gavin

Re: Builds Meeting this Thursday

2021-01-12 Thread Jarek Potiuk
Cool. Looking forward to it! On Tue, Jan 12, 2021 at 9:53 PM Gavin McDonald wrote: > Hi All, > > Sorry for the last minute notice, this Thursday the 14th January > at 1700 UTC time will be our next builds@ meeting. > > Just confirmed a few minutes ago, there will be a guest > representing

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-10 Thread Jarek Potiuk
uld suggest you have a direct chat with Greg Stein. > > > > Best Regards, > > Dave > > > > Sent from my iPhone > > > > > On Jan 10, 2021, at 9:08 AM, Jarek Potiuk wrote: > > > > > > On Sun, Jan 10, 2021 at 5:28 PM Matt Sicker wrote: >

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-10 Thread Jarek Potiuk
ast delegated to me and provided with the means of contacting GitHub and representing the ASF (but I doubt anyone would give me that power, it is a bit risky as with big power I would have no big responsibility. Tough call - I am not sure how else I can help INFRA/ASF to help me and others. J. > >

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-10 Thread Jarek Potiuk
ion like > Apache tries using something). > > Many of the features you’re asking from GA are likely non-trivial > architecture changes they’ll have to make to accommodate the non-trivial > use cases we have. Or maybe it isn’t and they’re just incompetent? > > On Sat, Jan 9, 2

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-09 Thread Jarek Potiuk
"] } Or to only allow the given users, but not all contributors: "pullRequestSecurity": { "allowContributors": false, "allowedAuthors": ["ashb"] } Owners of the repo are always allowed to run jobs. On Sat, Jan 9, 2021 at 2:13 PM
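The policy described in the message above could be evaluated roughly as follows. This is a sketch under stated assumptions: the keys `pullRequestSecurity`, `allowContributors` and `allowedAuthors` and the rule that owners are always allowed come from the quoted text, while the function name `may_run_jobs`, its parameters, the defaults for missing keys and the order of the checks are hypothetical and not taken from the actual runner code.

```python
def may_run_jobs(author, is_owner, is_contributor, config):
    """Return True if a PR by `author` may run jobs under the given config.

    Hypothetical evaluation order (an assumption, not the runner's code):
    owners first, then the explicit allow-list, then the contributor switch.
    """
    policy = config.get("pullRequestSecurity", {})
    if is_owner:
        # "Owners of the repo are always allowed to run jobs."
        return True
    if author in policy.get("allowedAuthors", []):
        return True
    return is_contributor and bool(policy.get("allowContributors", False))


# The variant quoted in the message: only the listed users, not all contributors.
config = {
    "pullRequestSecurity": {"allowContributors": False, "allowedAuthors": ["ashb"]}
}
```

With this configuration a listed author is accepted while an arbitrary contributor is not; flipping `allowContributors` to `true` would admit every contributor.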

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-09 Thread Jarek Potiuk
> > > > > The multiple threads about how shitty those are in practice for your > > needs seem to indicate otherwise. Security and easy learning curves > > don't seem to get along too well, do they? > The usability, integration level (especially GitHub Actions), maintenance effort needed - this is

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-09 Thread Jarek Potiuk
> > > > I presume you are referring to the fact that external non-committers > cannot force a build on a forked repo PR on the ASF Jenkins without > whitelisting [1]. This is by design because it is a huge security problem > to run unvetted 3rd party code on our build infrastructure. This is the

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-09 Thread Jarek Potiuk
> > > > The ASF does not run Travis. We pay a large amount of money TO Travis for > additional workers on top of the free tier. The reason I ask for discussion > regarding Jenkins usability is because throwing infinite money at solutions > like Travis is not an option. Now you find yourself in the

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-08 Thread Jarek Potiuk
> preference comes from. Jenkins is still useful for jobs that don't need to > run on forks, (e.g., periodically checking for Go version updates and > opening a PR if a minor version update is found) VAST majority of our traffic comes from PRs from forks. > > -Zach > > On Fri,

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-08 Thread Jarek Potiuk
I used to love Gradle and Jenkins and started to hate them. Once you move to the Python/JavaScript world, suddenly stuff that takes you days in Java/Gradle takes hours in Python/JavaScript. And it is amplified in the case of CI, where you do not need to have an enterprise-grade system but a bunch of scripts to

Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-08 Thread Jarek Potiuk
Let me just answer from my side. Personally - I hate and love Jenkins at the same time. I have been a Jenkins user and developer for 15 years or so. I even developed mobile app plugins for Jenkins and we ran our own Jenkins farm for mobile app CI, but some 5 years ago we switched to GitLab which was so

Re: GA again unreasonably slow (again)

2021-01-08 Thread Jarek Potiuk
> We should be able to make an efficient query via GraphQL API right? I found > the REST API for actions to be a little underwhelming. That was the first thing I checked when we started looking at the stats. Unfortunately last time that I checked (and I even opened an issue for that to Github

Re: GA again unreasonably slow (again)

2021-01-08 Thread Jarek Potiuk
uggests that Pulsar, Spark and Airflow are the top contributors > to the queue. > I filed issues to Pulsar ( https://github.com/apache/pulsar/issues/9154 ) > and Spark ( https://issues.apache.org/jira/browse/SPARK-34053 ) > Hope they can do something to reduce the build time and the numbe

GA again unreasonably slow (again)

2021-01-08 Thread Jarek Potiuk
t organize the meeting and urge them to fix the security problem for public self-hosted repositories? This is not a complaint, this is just crying for HELP ... We are terribly stuck. J,

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-07 Thread Jarek Potiuk
nned. > However, submodules and "git clone ...action" do not improve security, yet > they make it harder to automatically analyze workflow.yml. > > Jarek>Again is the matter of 'thinking' this way > > I have updated the PR so it uses SHA. Could you please stop making that > comment?

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-07 Thread Jarek Potiuk
so whether it is applicable to everyone, every repo, every project - I have no idea. I just wanted to share it with others so they can make informed decisions (and prepare for what's coming from infra). If anyone decides to bypass the requirements/policies which infra imposed, it's their own deci

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-07 Thread Jarek Potiuk
On Thu, Jan 7, 2021 at 6:56 PM Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > What I mean is that there's a trivial workaround which does not require > significant changes to the repository layout. On top of that, it does not > change developer's workflows (they do not need to learn

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-07 Thread Jarek Potiuk
> impl: openj9 > version: '8' > architecture: x64 > > After: > > - name: Prepare Actions > run: | > git clone -c advice.detachedHead=false --branch v1 --depth 1 > https://github.com/AdoptOpenJDK/install-jdk.git > ./build/.actions/install-jdk >

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-06 Thread Jarek Potiuk
> > > > I am aware of how to use Docker in our build jobs... I am talking about > Docker based GitHub Actions. They can have published docker > images that they used so they do not get rebuilt all of the time. I called > out SuperLinter as just an example. > Same thing. You can build and push

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-06 Thread Jarek Potiuk
> > > > If we ban persisted credentials in the checkout action it becomes very hard > to commit the static website build. My example is what triggered this > whole conversation about how they are persisted because this is how it > works for us today, which I guess was not common knowledge. >

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-06 Thread Jarek Potiuk
> > > > Btw this is really not ideal if the action is docker based like the GitHub > SuperLinter. Rebuilding this takes forever if it does not pull the > existing container. > I believe super linter is an official GitHub one. All official actions that are GitHub owned are 'business as usual'. And

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-06 Thread Jarek Potiuk
I prefer a constructive approach rather than criticising and complaining, and I think we found (together with Ash) an even better approach with submodules. Daniel - I know you will be working on the infra side on that - I think you should consider making the below submodule approach the "highly

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-06 Thread Jarek Potiuk
> > > Please, no. The inclusion of the third-party code into the project > repository has non-trivial licensing implications. > I do not think (but correct me if I am wrong) that this is the case. I think once the licence is OK, it has very few implications. The implications are mostly when

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2021-01-06 Thread Jarek Potiuk
Hello everyone, Coming back to the thread. Apache Beam project solved the problem in a slightly different way than we did in Airflow, and I think it might be a bit better, so maybe that can lead to a recommendation from the INFRA. https://github.com/apache/beam/pull/13670 Rather than cloning

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
FYI. I've filed two issues with GH via https://bounty.github.com/ - let's see what their security teams do with those. BTW. Brennan, if there is any reward, happy to share it with you :) J. On Wed, Dec 30, 2020 at 4:03 PM Jarek Potiuk wrote: > Got some feedback from GH support. It's both g

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
). Vladimir - I think that closes the topic about banning GITHUB_TOKEN usage. J. On Wed, Dec 30, 2020 at 2:37 PM Jarek Potiuk wrote: > FYI We looked at the source code of the checkout action and indeed it > seems it uses some kind of token, possibly GITHUB_TOKEN by simply

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
as INPUT_TOKEN env variable. https://github.com/actions/toolkit/blob/main/packages/core/src/core.ts#L84 This is quite unexpected and really, really bad if that's confirmed. J. On Wed, Dec 30, 2020 at 11:56 AM Jarek Potiuk wrote: > Jarek>What credentials are you talking about? > > P
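For readers following the INPUT_TOKEN remark: a minimal sketch, in Python rather than the toolkit's TypeScript, of the naming convention the linked core.ts appears to use, where an action input called `token` is exposed to the action's process as the `INPUT_TOKEN` environment variable. The function names here are hypothetical and the behaviour is an assumption based on that source, not a copy of it.

```python
import os
from typing import Mapping, Optional


def input_env_name(name: str) -> str:
    """Map an action input name to the env variable assumed to carry it."""
    # Assumed convention: spaces become underscores, name is upper-cased,
    # and the result is prefixed with INPUT_.
    return "INPUT_" + name.replace(" ", "_").upper()


def get_input(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read an input value from the environment; empty string if unset."""
    env = os.environ if env is None else env
    return env.get(input_env_name(name), "").strip()
```

Under this assumption, anything running inside the action's own process can read the token simply by looking up `INPUT_TOKEN` in its environment.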

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
Jarek>What credentials are you talking about? Please report it to security@ then. If it works this way, this is a serious security threat IMHO. On Wed, Dec 30, 2020 at 11:42 AM Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > Jarek>What credentials are you talking about? > > For

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
I also sent an incident report to secur...@apache.org for the checkout action. If it is confirmed that it works this way, this is a really serious issue IMHO. On Wed, Dec 30, 2020 at 11:24 AM Jarek Potiuk wrote: > > Jarek>Installing and even running commands via PIP does not expose >

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
> Jarek>Installing and even running commands via PIP does not expose > GITHUB_TOKEN > (and this is the real threat). It at most exposes the local build > > Running PIP at the ASF Jenkins instance (e.g. https://ci-beam.apache.org/ > ) > exposes ASF credentials to a malicious PIP package. > Does

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
meter. It never even crossed my mind that a write token might be persisted in this case https://github.com/apache/airflow/blob/a4a3d3f262257efbad7a36d6c72e0abd921b3a6f/.github/workflows/ci.yml#L1045 On Wed, Dec 30, 2020 at 10:32 AM Brennan Ashton wrote: > On Wed, Dec 30, 2020, 1:25

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
> > > This is only sort of correct. If you are using the standard checkout > action and install a package from pypi/npm at a later step that package > absolutely can push to the Apache repo when it runs in a push context (pr > context it is read-only). This later step does not need the token

Re: Failure with Github Actions from outside of the organization (out of a sudden!)

2020-12-30 Thread Jarek Potiuk
> builds. Other > projects using Gradle (version 6.2 and above) might also like to consider > using that. > > Cheers, Paul. > [1] https://docs.gradle.org/current/userguide/dependency_verification.html > -- Jarek Potiuk Polidea <https://www.polidea.com/> | Principal Software Engineer M: +48 660 796 129
