You can volunteer to be in charge of that for our new infra, since you are a
PMC member.

BTW, personally, I would prefer to raise funds officially rather than
connect to unknown servers.

I'll leave the security issues to you, Holden.

Dongjoon

On Sat, Jan 8, 2022 at 8:15 PM Holden Karau <hol...@pigscanfly.ca> wrote:

> Personally I’d love to see us compiling and testing on Linux arm64 as well.
>
> On Sat, Jan 8, 2022 at 7:49 PM Yikun Jiang <yikunk...@gmail.com> wrote:
>
>> BTW, this is not meant to compete with the Apache Spark Infra 2022 plan that
>> Dongjoon mentioned in "Apache Spark Jenkins Infra 2022".
>> It is just to share a possible approach for the Linux arm64 scheduled job.
>>
>> Also, I think the Spark community should reach a final conclusion on its
>> stance toward self-hosted actions, for future reference.
>>
>> Regards,
>> Yikun
>>
>> On Sun, Jan 9, 2022 at 11:33, Yikun Jiang <yikunk...@gmail.com> wrote:
>>
>>> Hi, all
>>>
>>> I tried to verify the feasibility of a *Linux arm64 scheduled job* using
>>> self-hosted actions. Below is my progress so far, and I would like to hear
>>> your suggestions on the next step (continue or stop).
>>>
>>> Related JIRA: SPARK-35607
>>> <https://issues.apache.org/jira/browse/SPARK-35607>
>>>
>>> *## About self-hosted GitHub Actions:*
>>> Currently, self-hosted runners support x64 (Linux, macOS, Windows),
>>> ARM64 (Linux only), and ARM32 (Linux only)
>>> <https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#architectures>
>>> .
>>>
>>> There is guidance on self-hosted runners from Apache Infra
>>> <https://cwiki.apache.org/confluence/display/INFRA/GitHub+-+self-hosted+runners>.
>>> The gap in enabling a self-hosted runner on an Apache repo is resource
>>> security: specifically, the runner must be prevented from running PRs
>>> from users who are not allowed. Following information and suggestions from
>>> the ASF, the apache/airflow team maintains a custom runner
>>> <https://github.com/ashb/runner/tree/releases/pr-security-options>, which
>>> apache/airflow also uses in its CI. So we could just use it directly.
>>>
>>> TL;DR: what we need is to set up resources with the custom runner, then
>>> enable these resources for self-hosted actions.
>>>
>>> *## Test on self-hosted GitHub Actions with the custom runner:*
>>> Here are some trials on my own repo:
>>> 1. Spark Maven/SBT test:
>>> PR: https://github.com/apache/spark/pull/35088
>>> TEST: https://github.com/Yikun/spark/pull/51
>>> 2. PySpark test:
>>> PR: https://github.com/apache/spark/pull/35049
>>> TEST: https://github.com/Yikun/spark/pull/53
>>> 3. Pull request test from a disallowed user:
>>> TEST: https://github.com/Yikun/spark/pull/60
>>> The self-hosted runner prevents the PR from accessing the runner, reporting
>>> "Running job on worker spark-github-runner-0001 disallowed by security
>>> policy".
>>>
>>> *## Pros of self-hosted GitHub Actions:*
>>> - Satisfies the simple demands of Linux arm64 scheduled jobs.
>>> - Reuses the main GitHub Actions workflow.
>>> - All changes are visible on GitHub and easy to review.
>>> - Easy to migrate when official GA arm64 support is ready.
>>>
>>> *## What's the next step:*
>>> * If we can consider self-hosted actions as an option, I will submit
>>> a JIRA to Apache Infra to request the token to continue, like:
>>> https://issues.apache.org/jira/browse/INFRA-21305
>>> * If we decide that self-hosted actions are not a wise choice, I
>>> will try to find another way.
>>>
>>> There is also some initial discussion, just FYI:
>>> https://github.com/dongjoon-hyun/ApacheSparkGitHubActionImage/pull/6
>>>
>>> Regards,
>>> Yikun
>>>
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
