Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-07-14 Thread Yikun Jiang
With help from the community, the switch to cache-based jobs has been
completed!

* About the ghcr images:

You might notice that two images are generated in apache ghcr:

- Image cache: spark/apache-spark-github-action-image-cache:
This is the cache built from each branch's dev/infra/Dockerfile.

- CI image: apache-spark-ci-image:
This is for scheduled jobs. It is built just-in-time from the cache
and then used to run the CI jobs.

- Distributed (user) CI image, such as yikun/apache-spark-ci-image:
This is for PR-triggered jobs. Again, it is built just-in-time from the cache
and used to execute the CI jobs in the user's GitHub Actions space.

* About the job:

For Lint/PySpark/SparkR jobs, "Base image build" does a just-in-time
build and generates a CI image for each PR, and the jobs use that image as
their container image.
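
A rough sketch of what a just-in-time build from the cache could look like,
written as a small Python helper that only composes a `docker buildx build`
command line (it does not run it). The `ghcr.io` prefix, the `master` and
`example-run` tags, the build context and the helper name are assumptions
for illustration; the real workflow may pass different options.

```python
from typing import List


def build_ci_image_command(
    cache_ref: str,
    image_ref: str,
    dockerfile: str = "dev/infra/Dockerfile",
    context: str = "dev/infra",  # assumed build context
) -> List[str]:
    """Compose a buildx call that reuses a registry cache for a fresh image."""
    return [
        "docker", "buildx", "build",
        "--file", dockerfile,
        # Reuse layers from the image cache instead of rebuilding them.
        "--cache-from", f"type=registry,ref={cache_ref}",
        # Name of the just-in-time CI image that the jobs will run in.
        "--tag", image_ref,
        "--push",
        context,
    ]


# Hypothetical references: the tags are made up for this example.
cmd = build_ci_image_command(
    cache_ref="ghcr.io/apache/spark/apache-spark-github-action-image-cache:master",
    image_ref="ghcr.io/yikun/apache-spark-ci-image:example-run",
)
print(" ".join(cmd))
```

The point is only that the PR image is a cheap rebuild: when the Dockerfile is
unchanged, every layer should come from the cache.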

* About how to change the infra deps:

Currently, the CI image behaves like a static image unless you change the
Dockerfile.

- If you want to change the version of a dependency of the Lint/PySpark/SparkR
jobs, you can change dev/infra/Dockerfile, as in
https://github.com/apache/spark/pull/37175.

- If you want to trigger a full refresh, you could just change
FULL_REFRESH_DATE in the Dockerfile.
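
Changing a Dockerfile instruction invalidates the build cache for that layer
and every layer after it, which is why bumping FULL_REFRESH_DATE forces a
full rebuild. Below is a minimal sketch of such a bump. It assumes the value
is declared on a single `ENV FULL_REFRESH_DATE <date>` line; the sample
Dockerfile text, the dates and the helper name are made up for illustration.

```python
import re

# Assumed form of the line: "ENV FULL_REFRESH_DATE 20220706" (or with "=").
_REFRESH_LINE = re.compile(
    r"^(ENV[ \t]+FULL_REFRESH_DATE[ =])(\S+)[ \t]*$", re.MULTILINE
)


def bump_full_refresh_date(dockerfile_text: str, new_date: str) -> str:
    """Return the Dockerfile text with FULL_REFRESH_DATE set to new_date."""
    updated, count = _REFRESH_LINE.subn(
        lambda m: m.group(1) + new_date, dockerfile_text
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one FULL_REFRESH_DATE line, found {count}"
        )
    return updated


# Made-up sample, not the real dev/infra/Dockerfile.
sample = (
    "FROM ubuntu:20.04\n"
    "ENV FULL_REFRESH_DATE 20220706\n"
    "\n"
    "RUN apt-get update\n"
)
bumped = bump_full_refresh_date(sample, "20220714")
print(bumped)
```

Only the date line changes; the layers below it are rebuilt because the
instruction above them is no longer identical to the cached one.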

FYI, I have also updated the doc at
https://docs.google.com/document/d/1_uiId-U1DODYyYZejAZeyz2OAjxcnA-xfwjynDF6vd0
to help you understand.


Through this work, I can really appreciate the effort that went into previous
maintenance! A simple version bump of a dependency may lead to a lot of
investigation! Thanks to HyukjinKwon, Dongjoon and the whole community for
keeping the infra deps up to date!

Feel free to ping me if you have any other concerns or ideas!

Regards,
Yikun



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-26 Thread Yikun Jiang
> There’s one last task to simply caching the Docker image (
> https://issues.apache.org/jira/browse/SPARK-39522).
> I will have to be less active for this week and next week because of the
> Spark Summit. Would appreciate if somebody
> finds some time to take a stab.

I did some investigation into the Spark container jobs (pyspark/sparkr/lint)
using the cache, and drafted a doc to help you understand #36980:
https://docs.google.com/document/d/1_uiId-U1DODYyYZejAZeyz2OAjxcnA-xfwjynDF6vd0


> About a quick hallway meetup, I will be there after Holden’s talk at
> least to say hello to her :-).

Some topics I am interested in that relate to build CI:
- K8S integration tests on GA
- What we can do to help improve integration support for various OSes and
multiple architectures/hardware (x86/arm64, GPU)
Please feel free to ping me if necessary. It's a pity that I can't be
there. I hope you all have a fabulous meetup at the summit!

Regards,
Yikun



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-23 Thread Dongjoon Hyun
Yep, I'll be there too. Thank you for the adjustment. See you soon. :)

Dongjoon.



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-23 Thread Hyukjin Kwon
Alright, I'll be there after Holden's talk Thursday
https://databricks.com/dataaisummit/session/tools-assisted-apache-spark-version-migrations-21-32
w/ Dongjoon (since he manages OSS Jenkins too).
Let's have a quickie chat :-).



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-22 Thread Hyukjin Kwon
Oops, I was confused about the time and distance in the US. I won't make it
either.
Let me find another time slot that works for more ppl.




Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-22 Thread Dongjoon Hyun
Thank you, Hyukjin! :)

BTW, unfortunately, it seems that I cannot join that quick meeting.
I have another appointment in the South Bay around 7 PM and need to leave San
Francisco by 5 PM at the latest.

Dongjoon.




Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-22 Thread Hyukjin Kwon
(cc @Yikun Jiang @Gengliang Wang @Maxim Gekk @Yang,Jie(INF) FYI)



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-22 Thread Hyukjin Kwon
Couple of updates:

   - All builds now pass with all the combinations we defined in GitHub
   Actions (e.g., branch-3.2, branch-3.3, JDK 11, JDK 17 and Scala 2.13),
   see https://github.com/apache/spark/actions (cc @Tom Graves
   @Dongjoon Hyun FYI),
   - except one test that is failing due to OOM. That’s being fixed at
   https://github.com/apache/spark/pull/36954, see also
   https://github.com/apache/spark/pull/36787#discussion_r901190636
   - I am now adding PySpark and SparkR jobs to the scheduled builds at
   https://github.com/apache/spark/pull/36940 to see if they pass. We
   might need a couple more fixes there.
   - There’s one last task to simply caching the Docker image (
   https://issues.apache.org/jira/browse/SPARK-39522).
   I will have to be less active this week and next week because of the
   Spark Summit. I would appreciate it if somebody could find some time
   to take a stab.

About a quick hallway meetup, I will be there after Holden’s talk at least
to say hello to her :-).
Let’s have a quick chat about our CI. We still have some general problems
to cope with like the lack of resources in
GitHub Actions.





Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-20 Thread Hyukjin Kwon
Just chatted offline - both Holden and I have multiple sessions :-).
Let's probably meet up for a quick chat after your talk
https://databricks.com/dataaisummit/session/what-do-when-your-job-goes-oom-night-flowcharts
?




Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-20 Thread Holden Karau
How about a hallway meet up at Data AI summit to talk about build CI if
folks are interested?

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-19 Thread Hyukjin Kwon
Increased the priority to a blocker - I don't think we can release with
these build failures and poor CI



Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-19 Thread Hyukjin Kwon
There are too many test failures here. I pinged on some PRs I could
identify from a cursory look, but it would be great if you could take a look
if you haven't tested your changes against other environments like JDK 11
and Scala 2.13.

On Mon, 20 Jun 2022 at 10:04, Hyukjin Kwon  wrote:

> Hi all,
>
> I am trying to rework GitHub Actions CI at
> https://issues.apache.org/jira/browse/SPARK-39515. Any help would be very
> appreciated.
>
>
>