andygrove opened a new issue, #3537:
URL: https://github.com/apache/datafusion-comet/issues/3537

   ### What is the problem the feature request solves?

   ### Description

   Add support for running benchmarks on a Kubernetes cluster using Spark's
   `spark-submit --master k8s://...` client mode, as in the example below.
   This was explored in #3534 but dropped as out of scope for the initial PR.
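
   For illustration, a client-mode submission could look roughly like the
   following; the API server address, namespace, image tag, JAR path, and
   benchmark script name are placeholders rather than settled choices:

   ```shell
   # Hypothetical client-mode submission; the API server URL, image, jar
   # path, and benchmark script are illustrative placeholders.
   spark-submit \
     --master k8s://https://127.0.0.1:6443 \
     --deploy-mode client \
     --conf spark.kubernetes.namespace=comet-bench \
     --conf spark.kubernetes.container.image=comet-bench:latest \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=comet-bench \
     --conf spark.executor.instances=4 \
     --jars local:///opt/jars/comet-spark.jar \
     benchmark.py --benchmark tpch --data /mnt/data/tpch
   ```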
                                                                                
               
   ### Motivation
   
   The current benchmark runner supports local and standalone Spark clusters 
via docker-compose. Adding K8s support would enable:
   
   - Running benchmarks on multi-node clusters with realistic resource 
constraints
   - Leveraging existing K8s infrastructure (e.g., K3s, EKS, GKE) without 
managing standalone Spark clusters
   - Better reproducibility via containerized executor pods with defined 
resource limits
   
   ### Proposed Scope
   
   - **K8s profile config** (`conf/profiles/k8s.conf`) with
   `spark.master=k8s://...`, executor pod templates, and container image
   settings (a sketch follows this list)
   - **RBAC manifests** (namespace, service account, role, role binding) for 
the `comet-bench` namespace
   - **PV/PVC definitions** for mounting benchmark data and engine JARs into 
executor pods
   - **Documentation** for pushing the `comet-bench` image to a 
cluster-accessible registry and running benchmarks
   - **Validation** with at least one TPC-H query on a multi-node cluster 
(e.g., K3s)
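
   As a rough sketch only, the K8s profile could look like the following;
   the cluster URL, image name, pod template path, and resource sizes are
   assumptions to be settled during implementation:

   ```properties
   # conf/profiles/k8s.conf -- hypothetical sketch; all values are placeholders
   spark.master=k8s://https://127.0.0.1:6443
   spark.submit.deployMode=client
   spark.kubernetes.namespace=comet-bench
   spark.kubernetes.authenticate.driver.serviceAccountName=comet-bench
   spark.kubernetes.container.image=comet-bench:latest
   spark.kubernetes.executor.podTemplateFile=conf/k8s/executor-pod-template.yaml
   spark.executor.instances=4
   spark.executor.memory=8g
   spark.executor.cores=4
   ```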
   
   ### Key Considerations
   
   - The `comet-bench` Docker image already includes both Java 8 and Java 17 
runtimes and the TPC query files, so it can serve as the executor image
   - Spark client mode requires the driver pod (or host) to be reachable
   from executor pods; network configuration may vary by cluster
   - Engine JARs (Comet, Gluten) need to be accessible to executors, either
   baked into the image or mounted via PVCs
   - Gluten requires overriding `JAVA_HOME` to point at Java 8 on all
   executor pods (see the pod template sketch after this list)
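
   To make the last two points concrete, here is a hedged sketch of an
   executor pod template; the volume and claim names, mount paths, and the
   Java 8 location are assumptions, not decisions:

   ```yaml
   # Hypothetical executor pod template; names and paths are illustrative.
   apiVersion: v1
   kind: Pod
   spec:
     containers:
       # default container name Spark matches for executor pod templates
       - name: spark-kubernetes-executor
         env:
           # Gluten needs Java 8; the comet-bench image ships both runtimes
           - name: JAVA_HOME
             value: /usr/lib/jvm/java-8-openjdk-amd64
         volumeMounts:
           - name: bench-data
             mountPath: /mnt/data
           - name: engine-jars
             mountPath: /opt/jars
     volumes:
       - name: bench-data
         persistentVolumeClaim:
           claimName: comet-bench-data
       - name: engine-jars
         persistentVolumeClaim:
           claimName: comet-bench-jars
   ```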
   
   
   ### Describe the potential solution
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

