GitHub user aicam edited a discussion: Approach: unifying AWS (EKS) and 
on-premise Kubernetes deployment under bin/k8s — Helm-only vs eksctl vs 
Terraform

## Problem

Today `bin/k8s` on `main` ships a single Helm chart that targets a **local / 
on-premise Kubernetes cluster only**. Our AWS (EKS) deployment runs the *same* 
Helm chart, but everything underneath it (the EKS cluster, S3 bucket, IAM/IRSA 
roles, Elastic IP, DNS, node pools, cert-manager) is created **by hand**, so 
none of that substrate is captured as code anywhere in the repo.

The result: standing up Texera on a fresh AWS account means working through an 
undocumented, manual checklist of prerequisites that drifts over time and is 
hard to reproduce. We'd like 
to **unify the local and AWS deployment stories** so there's one documented, 
reproducible path from "empty cluster (or empty AWS account)" to "running 
Texera," while keeping the Helm chart itself cloud-agnostic.

## Scope of "substrate"

The Helm chart already covers the *application*. The open question is only how 
we provision/document the *substrate* it runs on:
- EKS cluster + managed node pools
- IAM roles / IRSA for service accounts
- S3 bucket (LakeFS backend), Elastic IP, DNS, cert-manager / Let's Encrypt

## Options considered

### 1. Helm only
Deploy the chart into a cluster that already exists. No AWS substrate creation.

**Pros**
- Simplest — one `helm install`
- Identical for local and AWS — fully cloud-agnostic
- Nothing extra to learn or maintain in the repo
- Keeps the Apache PR scoped to just the chart

**Cons**
- Doesn't create the cluster, S3 bucket, IAM, EIP, node pools, or cert-manager 
— all manual on a new AWS account
- New-AWS-account UX = a long prerequisite checklist done by hand
- Substrate stays undocumented-as-code → drift, hard to reproduce

### 2. eksctl
A dedicated command-line tool for Amazon EKS (separate from the general `aws` 
CLI) that provisions an EKS cluster, node groups (node pools), IAM/IRSA, and 
add-ons from a small YAML file, via CloudFormation. Helm still deploys on top.

**Pros**
- Purpose-built for EKS; minimal config for a working cluster
- Handles IRSA, node pools, and add-ons natively
- "New AWS account → cluster" in ~one command
- Config file is committable and reproducible

**Cons**
- AWS-only — no use for local or other clouds
- Doesn't manage non-cluster resources well (S3 bucket, EIP, DNS) — you still 
script those
- CloudFormation-based: slower to apply changes and less granular state control than Terraform
- Adds an AWS-specific tool to a repo that's otherwise portable
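
To make the "small YAML" concrete, here is a rough sketch of what a committable eksctl config could look like. It is only an illustration under assumptions: the cluster name, region, namespace, service account name, instance type, node counts, and policy ARN are all placeholders invented for this example, not values from our current deployment.

```yaml
# cluster.yaml: illustrative eksctl ClusterConfig (all names and sizes are placeholders)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: texera-eks          # hypothetical cluster name
  region: us-west-2         # assumed region

iam:
  withOIDC: true            # creates the OIDC provider that IRSA relies on
  serviceAccounts:
    - metadata:
        name: lakefs        # hypothetical service account that needs S3 access
        namespace: texera   # hypothetical namespace
      attachPolicyARNs:
        # placeholder ARN; a real policy would be scoped to the LakeFS bucket
        - arn:aws:iam::111122223333:policy/texera-lakefs-s3

managedNodeGroups:
  - name: general           # hypothetical node group
    instanceType: m5.large  # assumed instance type
    minSize: 2
    desiredCapacity: 2
    maxSize: 4

addons:
  - name: vpc-cni
  - name: coredns
  - name: kube-proxy
```

With a file like this, `eksctl create cluster -f cluster.yaml` would create the cluster, and the existing chart would then be installed with `helm install` as it is today. The S3 bucket, Elastic IP, and DNS records are not covered by this file and would still need their own scripts.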

### 3. Terraform
Declarative IaC that can provision everything: EKS cluster, S3, IAM, EIP, DNS, 
node pools — and across clouds. Helm deploys on top (or via the Helm provider).

**Pros**
- Most reproducible & reviewable (full state, plan/apply, diffs)
- One tool for all substrate, not just the cluster
- Cloud-agnostic tooling: same tool and workflow for AWS / GCP / local, though resource definitions stay provider-specific
- Can even invoke Helm itself, so one `apply` does substrate + deploy

**Cons**
- Heaviest to set up and maintain (modules, remote state, locking)
- Steeper learning curve; more moving parts
- State management is its own operational burden
- Overkill if you only ever target one EKS cluster

## Our current position: eksctl

We're currently leaning toward **eksctl**, because:
- **GCP and Azure are low priority for now** — the cloud-agnostic advantage of 
Terraform doesn't pay off yet, so we'd rather not take on its maintenance cost 
speculatively.
- **We prefer easier maintenance over easier deployment.** eksctl keeps the 
AWS-specific config small and declarative while letting the Helm chart stay 
fully portable, rather than introducing Terraform modules + remote state + 
locking that we'd have to operate.
- eksctl handles the genuinely fiddly EKS-specific parts (IRSA, node pools, 
add-ons) natively, and the cluster config is committable.

With eksctl (as with Helm only), a small amount of non-cluster substrate (S3 
bucket, EIP, DNS) would still be scripted alongside; only Terraform would fold 
those resources into the same tool.

## We'd love your input

Which approach should we standardize on for upstreaming AWS deployment into 
`bin/k8s`? Please vote with a reaction on this post and share your reasoning in 
the comments — especially if you deploy Texera (or similar) on a cloud we 
haven't prioritized:
- 👍 — **Helm only**
- 🚀 — **eksctl** (our current lean)
- ❤️ — **Terraform**


GitHub link: https://github.com/apache/texera/discussions/5641
