Re: CI feedback time

2021-04-15 Thread Jorge Cardoso Leitão
Hi, I agree. I'll submit two requirements though: > - the configuration for CI builds must be kept in the Arrow repository >(as they are currently in .github, etc.) > - CI builds must be runnable from PRs > I'll submit three more: - The result of the build (pass / did not pass) must be

Re: CI feedback time

2021-04-15 Thread Krisztián Szűcs
On Fri, Apr 16, 2021 at 1:11 AM Jed Brown wrote: > > Wes McKinney writes: > > > I think we should take a more serious look at Buildkite for some of our CI. > > > > * First of all, it's very easy to connect self-hosted workers and > > supports ephemeral cloud workers in a way that would be

Re: CI feedback time

2021-04-15 Thread Jed Brown
Wes McKinney writes: > I think we should take a more serious look at Buildkite for some of our CI. > > * First of all, it's very easy to connect self-hosted workers and > supports ephemeral cloud workers in a way that would be difficult or > impossible with GHA. No need to have Infra fiddle with

Re: CI feedback time

2021-04-15 Thread Krisztián Szűcs
On Thu, Apr 15, 2021 at 11:53 PM Andy Grove wrote: > > I started looking at Buildkite and it would solve one large problem for the > DataFusion/Ballista project. We really need to be running integration tests > against large data sets (such as TPC-H @ SF=100GB) and self-hosted > Buildkite makes

Re: CI feedback time

2021-04-15 Thread Andy Grove
I started looking at Buildkite and it would solve one large problem for the DataFusion/Ballista project. We really need to be running integration tests against large data sets (such as TPC-H @ SF=100GB) and self-hosted Buildkite makes this simple to accomplish. I even have some modest hardware

Re: CI feedback time

2021-04-15 Thread Wes McKinney
I think we should take a more serious look at Buildkite for some of our CI. * First of all, it's very easy to connect self-hosted workers and supports ephemeral cloud workers in a way that would be difficult or impossible with GHA. No need to have Infra fiddle with the admin dashboard. So we

Re: CI feedback time

2021-04-15 Thread Krisztián Szűcs
On Thu, Apr 15, 2021 at 2:13 AM Weston Pace wrote: > > It may be worth reaching out to the Airflow project. Based on > https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status > it seems they have been investing time into figuring out how to make > self-hosted runners work (it seems

Re: CI feedback time

2021-04-15 Thread Krisztián Szűcs
On Thu, Apr 15, 2021 at 10:48 AM Antoine Pitrou wrote: > > > Le 15/04/2021 à 03:13, Kazuaki Ishizaki a écrit : > > As we know this is a common issue among Apache projects. While the > > projects do not have the final solution, Apache Spark project has a > > mechanism [1][2] to run a test in own

Re: CI feedback time

2021-04-15 Thread Antoine Pitrou
Le 15/04/2021 à 03:13, Kazuaki Ishizaki a écrit : As we know this is a common issue among Apache projects. While the projects do not have the final solution, Apache Spark project has a mechanism [1][2] to run a test in own local (forked) repository. Can we alleviate the problem a little bit?

Re: CI feedback time

2021-04-14 Thread Kazuaki Ishizaki
] https://github.com/apache/spark-website/pull/286 Regards, Kazuaki Ishizaki Weston Pace wrote on 2021/04/15 09:13:05: > From: Weston Pace > To: dev@arrow.apache.org > Date: 2021/04/15 09:13 > Subject: [EXTERNAL] Re: CI feedback time > > It may be worth reaching out to t

Re: CI feedback time

2021-04-14 Thread Weston Pace
It may be worth reaching out to the Airflow project. Based on https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status it seems they have been investing time into figuring out how to make self-hosted runners work (it seems GitHub's patching model makes this somewhat difficult). On

Re: CI feedback time

2021-04-14 Thread Antoine Pitrou
Hi Krisztian, Thanks for bringing this up. This is definitely becoming a high-priority topic for Arrow development. I don't believe there is much opportunity for reducing the number of builds or their runtime. We simply have a lot of development going on, and the number of different CI

CI feedback time

2021-04-14 Thread Krisztián Szűcs
Hi, The Apache GitHub Actions agent pool seems to be oversubscribed as more Apache projects migrate their CI setup to GHA. We experienced pretty solid feedback times (~20-30m) when we originally moved to GHA but now we are roughly 5 hours behind [1]. Based on other projects' complaints and