[ 
https://issues.apache.org/jira/browse/SPARK-35607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yikun Jiang updated SPARK-35607:
--------------------------------
    Description: 
There were two Arm CI jobs in AMPLab:

[https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/]

[https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/]

As noted on the mailing list [1], the AMPLab Jenkins was shut down at the end of 2021.

We should consider migrating the Arm jobs from Jenkins to somewhere like GitHub 
Actions. Unfortunately, GitHub Actions doesn't support native arm64 virtual 
environments yet.

 

There are two possible ways to complete the migration:
 * GitHub Actions with self-hosted runners
 ** Runs jobs in an Arm VM.
 ** Carries the potential risk mentioned in [2], but this can be solved.
 ** Reuses many existing actions and build scripts.
 ** Easy to migrate once official GitHub Actions Arm support is ready.

 * GitHub Actions + third-party Kubernetes cluster
 ** Runs jobs in a pod of an Arm Kubernetes cluster and prints the logs in GitHub Actions.
 ** We have to package separate build scripts in a Dockerfile.
 ** We can't easily use existing actions.
 ** Brings significant potential maintenance cost.

The second approach brings significant potential maintenance cost and is not 
easy to migrate once GitHub Actions supports Arm natively, so I think GitHub 
Actions self-hosted runners with a strict allow list would be the best choice here.
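
As a rough illustration of what a strict allow list could mean in practice, the sketch below gates use of the self-hosted Arm runner on the repository, the triggering event, and whether the code comes from a fork. The names (Trigger, may_use_self_hosted_runner), the allowed events, and the repository value are all assumptions made for illustration. They are not part of this proposal or of any GitHub API, and the actual policy would be up to the project.

```python
from dataclasses import dataclass

# Hypothetical allow lists; the real values would be decided by the project.
ALLOWED_REPOSITORIES = frozenset({"apache/spark"})
ALLOWED_EVENTS = frozenset({"push", "schedule"})


@dataclass(frozen=True)
class Trigger:
    repository: str  # "owner/name" of the repository the run belongs to
    event: str       # name of the event that started the run
    from_fork: bool  # True if the code under test comes from a fork


def may_use_self_hosted_runner(trigger: Trigger) -> bool:
    """Return True only when every allow-list condition holds."""
    if trigger.from_fork:
        # Untrusted code from forks never reaches the self-hosted machine.
        return False
    return (
        trigger.repository in ALLOWED_REPOSITORIES
        and trigger.event in ALLOWED_EVENTS
    )
```

Denying by default and listing only the trusted cases keeps the check easy to audit, which is the main point of an allow list as opposed to a block list.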

 

[1] [https://www.mail-archive.com/dev@spark.apache.org/msg28480.html]

  was:
There were two Arm CI jobs in AMPLab:

[https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/]

[https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/]

As noted on the mailing list [1], AMPLab will shut Jenkins down at the end of 
2021. We should consider migrating the Arm jobs from Jenkins to somewhere like 
GitHub Actions. Unfortunately, GitHub Actions doesn't support native arm64 
virtual environments yet.

 

We propose using Kubernetes on Arm to complete this task in two parts:

1. Build the basic Arm64 Docker image with GitHub Actions in a separate repo:

[https://github.com/Yikun/spark-arm-docker]

The Docker build job will run on a self-hosted node (Arm64).

 

2. Run the above image and the test script on Arm Kubernetes (a service 
provided by a public cloud).

Here is the POC job for build test:

[https://github.com/Yikun/spark-arm-docker/pull/4]

Here is the POC job for python test:

[https://github.com/Yikun/spark-arm-docker/pull/6]

 

Alternatively, we could use a self-hosted runner in step 2, but there are 
GitHub Actions security warnings at:

[https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status]

[https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories]

So the Kubernetes service seems more flexible and easier to manage.

 

TODO:

Split the Maven test job into multiple smaller jobs so that each job completes 
within 2 hours (GitHub timeout limitation).
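
One way to approach the split is to pack test modules into jobs so that no job exceeds the 2-hour limit. The sketch below uses a first-fit-decreasing heuristic. The function name and the module durations are hypothetical, and real timings would have to be measured on the Arm machines.

```python
def split_into_jobs(durations, limit_minutes=120):
    """Pack modules into jobs whose total time stays within limit_minutes.

    First-fit decreasing: take modules from longest to shortest and put each
    into the first job that still has room, opening a new job when none does.
    """
    jobs = []  # each entry is [total_minutes, [module names]]
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, minutes in ordered:
        if minutes > limit_minutes:
            raise ValueError(f"{name} alone exceeds the {limit_minutes}-minute limit")
        for job in jobs:
            if job[0] + minutes <= limit_minutes:
                job[0] += minutes
                job[1].append(name)
                break
        else:
            # No existing job has room, so start a new one.
            jobs.append([minutes, [name]])
    return [names for _, names in jobs]


# Hypothetical per-module test durations in minutes (not measured values).
EXAMPLE_DURATIONS = {"core": 90, "sql": 100, "mllib": 40, "streaming": 30, "graphx": 15}
```

With these example durations, split_into_jobs(EXAMPLE_DURATIONS) groups the modules into three jobs: sql with graphx, core with streaming, and mllib alone. The heuristic does not guarantee the minimum number of jobs, but it is simple and never exceeds the limit.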

 

[1] 
[http://apache-spark-developers-list.1001551.n3.nabble.com/please-read-current-state-and-the-future-of-the-apache-spark-build-system-td31075.html]


> Migrate Spark Arm Job from Jenkins to GitHub Actions
> ----------------------------------------------------
>
>                 Key: SPARK-35607
>                 URL: https://issues.apache.org/jira/browse/SPARK-35607
>             Project: Spark
>          Issue Type: Test
>          Components: Project Infra
>    Affects Versions: 3.1.2
>            Reporter: Yikun Jiang
>            Priority: Major


