[ https://issues.apache.org/jira/browse/SPARK-27941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shuheng Dai updated SPARK-27941:
--------------------------------
    Description: 
Public cloud providers have started offering serverless container services. For example, AWS offers Fargate: [https://aws.amazon.com/fargate/]

This opens up the possibility of running Spark workloads in a serverless manner and removes the need to provision and maintain a cluster.

POC: [https://github.com/mu5358271/spark-on-fargate]

While it might not make sense for Spark to favor any particular cloud provider or to support a large number of cloud providers natively, it would make sense to make some of the internal Spark components more pluggable and cloud friendly, so that it is easier for various cloud providers to integrate. For example:
 * authentication: IO and network encryption require authentication via a securely shared secret, and the implementation of this is currently tied to the cluster manager: YARN uses Hadoop UGI, and Kubernetes uses a shared file mounted on all pods. These can be decoupled so that it is possible to swap in an implementation backed by a public cloud. In the POC, this is implemented by passing around an AWS KMS-encrypted secret and decrypting it at each executor, which delegates authentication and authorization to the cloud.
 * deployment & scheduler: adding a new cluster manager and scheduler backend requires changing a number of places in the Spark core package and rebuilding the entire project. Having a pluggable scheduler per https://issues.apache.org/jira/browse/SPARK-19700 would make it easier to add different scheduler backends backed by different cloud providers.
 * client-cluster communication: I am not very familiar with the network part of the code base, so I might be wrong on this. My understanding is that the code base assumes that the client and the cluster are on the same network and that the nodes communicate with each other via hostname/IP.
 * shuffle storage and retrieval:

  was:
Public cloud providers have started offering serverless container services. For example, AWS offers Fargate: [https://aws.amazon.com/fargate/]

This opens up the possibility of running Spark workloads in a serverless manner and removes the need to provision and maintain a cluster.

POC: [https://github.com/mu5358271/spark-on-fargate]

While it might not make sense for Spark to favor any particular cloud provider or to support a large number of cloud providers natively, it would make sense to make some of the internal Spark components more pluggable and cloud friendly, so that it is easier for various cloud providers to integrate. For example:
 * authentication: IO and network encryption require authentication via a securely shared secret, and the implementation of this is currently tied to the cluster manager: YARN uses Hadoop UGI, and Kubernetes uses a shared file mounted on all pods. These can be decoupled so that it is possible to swap in an implementation backed by a public cloud. In the POC, this is implemented by passing around an AWS KMS-encrypted secret and decrypting it at each executor, which delegates authentication and authorization to the cloud.
 * deployment & scheduler: adding a new cluster manager and scheduler backend requires changing a number of places in the Spark core package and rebuilding the entire project.
 * driver-executor communication:
 * shuffle storage and retrieval:


> Serverless Spark in the Cloud
> -----------------------------
>
>                 Key: SPARK-27941
>                 URL: https://issues.apache.org/jira/browse/SPARK-27941
>             Project: Spark
>          Issue Type: New Feature
>          Components: Build, Deploy, Scheduler, Security, Shuffle, Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Shuheng Dai
>            Priority: Major
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)