villebro opened a new pull request, #72:
URL: https://github.com/apache/superset-kubernetes-operator/pull/72
## Summary
Tightens lifecycle correctness, CRD validation, and observability across
parent-owned resources, plus a docs cleanup and a missing-field documentation
entry.
The most consequential change closes a gap in init's task checksum where
config and env mutations could silently fail to trigger re-runs. Init now hashes both the
rendered `superset_config.py` and the resolved env vars the Job mounts — both
derived from the real artifacts, so the checksum stays honest as new
config-rendering or env-injection fields are added.
## Details
### Lifecycle init checksum coverage
The lifecycle Job consumes two kinds of inputs that can change independently:
- **Rendered Python config** — feature flags, celery imports, lifecycle
config, sqlaEngineOptions presets, valkey cache layout, etc.
- **Resolved env vars** — secret refs, metastore
host/database/user/password, valkey host/refs, lifecycle admin user
credentials. The rendered Python references env var *names*
(`os.environ['SUPERSET_OPERATOR__DB_HOST']`), so changing `spec.metastore.host`
does not change the Python — only the resolved env var slice differs.
Previously, init's checksum hashed a hand-curated subset of `SupersetSpec`
fields. Mutations to `spec.featureFlags`, the full `spec.celery`,
`lifecycle.config`, or `lifecycle.sqlaEngineOptions` silently skipped init
re-runs despite `docs/user-guide/lifecycle.md` promising that "any config-affecting
field changes" trigger init. Switching to "hash the rendered config" alone
would have introduced the inverse gap: env-only changes (e.g., a new metastore
host) would no longer trigger init, leaving it pointed at the wrong database.
`initInputs` now returns `{Image, ConfigHash, EnvHash, Trigger}`. A new
`renderLifecycleTaskConfig` helper in
`internal/controller/lifecycle_taskspec.go` is shared with
`buildStandardTaskFlatSpec` so the checksum and the mounted ConfigMap cannot
diverge. A new `initTaskEnv` helper in `internal/controller/lifecycle_init.go`
mirrors what `buildStandardTaskFlatSpec` collects via `buildOperatorInjected`
for `taskTypeInit`. `forceReload` is intentionally excluded from `EnvHash` — it
is a per-component rollout knob, not a lifecycle re-run signal.
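
As a rough illustration of why both hashes are needed, here is a minimal, self-contained sketch. It is not the operator's code: the `EnvVar` type, `hashConfig`, `hashEnv`, and `checksum` are hypothetical names, and the sorting and length-prefixed encoding are assumptions made so the example is deterministic. Only the `{Image, ConfigHash, EnvHash, Trigger}` shape comes from the description above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// EnvVar is a hypothetical stand-in for a resolved env var.
type EnvVar struct {
	Name  string
	Value string
}

// InitInputs mirrors the field set named in this PR description.
type InitInputs struct {
	Image      string
	ConfigHash string
	EnvHash    string
	Trigger    string
}

// hashConfig hashes the rendered config text (hypothetical helper).
func hashConfig(rendered string) string {
	sum := sha256.Sum256([]byte(rendered))
	return hex.EncodeToString(sum[:])
}

// hashEnv hashes env vars in name order so slice ordering does not matter.
// Sorting and length prefixes are assumptions of this sketch.
func hashEnv(env []EnvVar) string {
	sorted := append([]EnvVar(nil), env...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	h := sha256.New()
	for _, e := range sorted {
		fmt.Fprintf(h, "%d:%s=%d:%s;", len(e.Name), e.Name, len(e.Value), e.Value)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// checksum combines all four inputs into one digest.
func checksum(in InitInputs) string {
	h := sha256.New()
	for _, part := range []string{in.Image, in.ConfigHash, in.EnvHash, in.Trigger} {
		fmt.Fprintf(h, "%d:%s;", len(part), part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// The rendered config references the env var by name, so it is identical
	// before and after a host change.
	config := "DB_HOST = os.environ['SUPERSET_OPERATOR__DB_HOST']"
	before := InitInputs{
		Image:      "superset:example",
		ConfigHash: hashConfig(config),
		EnvHash:    hashEnv([]EnvVar{{"SUPERSET_OPERATOR__DB_HOST", "db-old"}}),
	}
	after := before
	after.EnvHash = hashEnv([]EnvVar{{"SUPERSET_OPERATOR__DB_HOST", "db-new"}})

	fmt.Println("config hash unchanged:", before.ConfigHash == after.ConfigHash)
	fmt.Println("checksum changed:", checksum(before) != checksum(after))
}
```

With only a config hash, the two inputs above would look identical; including the env hash is what makes the host change visible.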
### Disabled-aware CEL for clone and rotate
CEL prerequisites on `lifecycle.{clone,rotate}` previously fired regardless
of `disabled: true`, while the controller correctly skipped disabled tasks via
`isDisabled(...)`. The CEL rules are now categorized:
- **Task-readiness rules** (bypass on `disabled: true`): clone
destructive-op environment guard, clone `metastore.host` requirement, rotate
`previousSecretKey*` requirement. These guard against running a misconfigured
task; if the task is disabled, the guard is moot.
- **Secret-hygiene rules** (do NOT bypass): clone `source.password` env
restriction, top-level `secretKey` / `metastore.password` / `valkey.password` /
`previousSecretKey` env restrictions, dev-only `init.adminUser` /
`init.loadExamples`. A plaintext secret in the YAML is a problem regardless of
whether the task runs, so `disabled: true` does not authorize bypassing these.
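
The two categories can be pictured with a small Go model of the rule semantics. This is only a sketch: the real rules are CEL expressions on the CRD, and the `CloneTask` type, its fields, and `validateClone` are hypothetical. The secret-hygiene rule is simplified to "no plaintext password" for illustration.

```go
package main

import "fmt"

// CloneTask is a hypothetical, simplified view of the clone task spec.
type CloneTask struct {
	Disabled                bool
	MetastoreHost           string
	PlaintextSourcePassword bool // true when source.password is inlined in the YAML
}

// validateClone returns the violations for a clone task.
func validateClone(c CloneTask) []string {
	var violations []string

	// Task-readiness rule: only meaningful if the task can actually run,
	// so it is bypassed when the task is disabled.
	if !c.Disabled && c.MetastoreHost == "" {
		violations = append(violations, "clone requires metastore.host")
	}

	// Secret-hygiene rule: applies whether or not the task is disabled.
	if c.PlaintextSourcePassword {
		violations = append(violations, "clone source.password must not be plaintext")
	}
	return violations
}

func main() {
	// Disabled and incomplete: the readiness rule is skipped.
	fmt.Println(validateClone(CloneTask{Disabled: true}))
	// Disabled but with a plaintext secret: still rejected.
	fmt.Println(validateClone(CloneTask{Disabled: true, PlaintextSourcePassword: true}))
	// Enabled and incomplete: the readiness rule fires.
	fmt.Println(validateClone(CloneTask{}))
}
```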
### Uniform labeling on parent-owned resources
Service/HPA/PDB/NetworkPolicy already set `ObjectMeta.Labels`; Deployments
and ConfigMaps now do too. `reconcileParentOwnedConfigMap` accepts a `labels`
parameter; component callers pass `componentLabels(componentType, parentName)`,
and lifecycle task callers pass labels for the lifecycle component. The kubectl
idiom `kubectl get deploy,svc,cm,hpa,pdb -l
app.kubernetes.io/instance=<parent>` now lists every parent-owned resource
consistently.
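
A minimal sketch of the idea, assuming a label set built from the parent name and component type. Only `app.kubernetes.io/instance` is taken from the selector above; the `app.kubernetes.io/component` key, the body of `componentLabels`, and the `matchesSelector` helper are assumptions for illustration, not the operator's implementation.

```go
package main

import "fmt"

// componentLabels builds a label set for a parent-owned resource.
// The exact keys the operator sets are an assumption of this sketch.
func componentLabels(componentType, parentName string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/instance":  parentName,
		"app.kubernetes.io/component": componentType,
	}
}

// matchesSelector reports whether labels satisfy an equality-based selector,
// which is how `kubectl get ... -l key=value` filters resources.
func matchesSelector(labels, selector map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"app.kubernetes.io/instance": "my-superset"}

	// Every resource kind gets the same labels, so one selector finds them all.
	for _, kind := range []string{"Deployment", "Service", "ConfigMap"} {
		labels := componentLabels("web-server", "my-superset")
		fmt.Println(kind, "selected:", matchesSelector(labels, selector))
	}
}
```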
Cleanup stays deterministic and name-based, driven by the descriptor table.
`docs/architecture/internals.md` is rewritten to describe what actually happens
(the prior label-based-cleanup claim did not match the implementation).
### `ContainerImageSpec` for non-Superset images
`ImageSpec.Repository` defaults to the Superset image, but the same struct
was previously reused for `lifecycle.maintenancePage.image` and
`lifecycle.clone.image`. A user supplying `image: {tag: alpine}` for clone got
`apachesuperset.docker.scarf.sh/apache/superset:alpine` instead of
`postgres:alpine`.
A new `ContainerImageSpec` in `api/v1alpha1/shared_types.go` (no kubebuilder
repository default) is used for those nested image fields. `resolveCloneImage`
and `resolveMaintenanceImage` share a new `resolveContainerImage` helper that
merges user fields with context-appropriate defaults (`postgres:17-alpine` /
`mysql:8-alpine` / `nginx:alpine`).
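
The merge behavior can be sketched as follows. The struct fields and the function signature here are assumptions (the real `ContainerImageSpec` and `resolveContainerImage` may differ); the point is only that unset user fields fall back to a context-specific default rather than to the Superset repository.

```go
package main

import "fmt"

// ContainerImageSpec is a simplified, hypothetical version of the new type:
// no built-in repository default, so the caller decides the fallback.
type ContainerImageSpec struct {
	Repository string
	Tag        string
}

// resolveContainerImage merges user-supplied fields with defaults chosen by
// the caller. The signature is an assumption of this sketch.
func resolveContainerImage(user *ContainerImageSpec, defaultRepo, defaultTag string) string {
	repo, tag := defaultRepo, defaultTag
	if user != nil {
		if user.Repository != "" {
			repo = user.Repository
		}
		if user.Tag != "" {
			tag = user.Tag
		}
	}
	return repo + ":" + tag
}

func main() {
	// User sets only the tag for the clone image; the repository falls back
	// to the postgres default instead of the Superset image.
	fmt.Println(resolveContainerImage(&ContainerImageSpec{Tag: "alpine"}, "postgres", "17-alpine"))

	// Nothing set: the full context-appropriate default is used.
	fmt.Println(resolveContainerImage(nil, "nginx", "alpine"))
}
```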
### `podTemplate.container.ports` documentation
The container fields table in `docs/user-guide/configuration.md` was missing
`ports`, even though the API exposes it and the operator consumes the first
resolved port for the Service `targetPort` and the operator-managed
NetworkPolicy ingress port. Added a row.