villebro opened a new pull request, #72:
URL: https://github.com/apache/superset-kubernetes-operator/pull/72

   ## Summary
   
   Tightens lifecycle correctness, CRD validation, and observability across 
parent-owned resources, plus a docs cleanup and a missing-field documentation 
entry.
   
   The most consequential change closes a gap in init's task checksum where 
config and env mutations could silently skip re-runs. Init now hashes both the 
rendered `superset_config.py` and the resolved env vars the Job mounts — both 
derived from the real artifacts, so the checksum stays honest as new 
config-rendering or env-injection fields are added.
   
   ## Details
   
   ### Lifecycle init checksum coverage
   
   The lifecycle Job consumes two kinds of inputs that can change independently:
   
   - **Rendered Python config** — feature flags, celery imports, lifecycle 
config, sqlaEngineOptions presets, valkey cache layout, etc.
   - **Resolved env vars** — secret refs, metastore 
host/database/user/password, valkey host/refs, lifecycle admin user 
credentials. The rendered Python references env var *names* 
(`os.environ['SUPERSET_OPERATOR__DB_HOST']`), so changing `spec.metastore.host` 
does not change the Python — only the resolved env var slice differs.
   
   Previously, init's checksum hashed a hand-curated subset of `SupersetSpec` 
fields. Mutations to `spec.featureFlags`, the full `spec.celery`, 
`lifecycle.config`, or `lifecycle.sqlaEngineOptions` silently skipped init 
re-runs, even though `docs/user-guide/lifecycle.md` promises that "any config-affecting 
field changes" trigger init. Switching to "hash the rendered config" alone 
would have introduced the inverse gap — env-only changes (e.g., a new metastore 
host) would no longer trigger init, leaving it pointed at the wrong database.
   
   `initInputs` now returns `{Image, ConfigHash, EnvHash, Trigger}`. A new 
`renderLifecycleTaskConfig` helper in 
`internal/controller/lifecycle_taskspec.go` is shared with 
`buildStandardTaskFlatSpec` so the checksum and the mounted ConfigMap cannot 
diverge. A new `initTaskEnv` helper in `internal/controller/lifecycle_init.go` 
mirrors what `buildStandardTaskFlatSpec` collects via `buildOperatorInjected` 
for `taskTypeInit`. `forceReload` is intentionally excluded from `EnvHash` — it 
is a per-component rollout knob, not a lifecycle re-run signal.
   
   ### Disabled-aware CEL for clone and rotate
   
   CEL prerequisites on `lifecycle.{clone,rotate}` previously fired regardless 
of `disabled: true`, while the controller correctly skipped disabled tasks via 
`isDisabled(...)`. The CEL rules are now categorized:
   
   - **Task-readiness rules** (bypass on `disabled: true`): clone 
destructive-op environment guard, clone `metastore.host` requirement, rotate 
`previousSecretKey*` requirement. These guard against running a misconfigured 
task; if the task is disabled, the guard is moot.
   - **Secret-hygiene rules** (do NOT bypass): clone `source.password` env 
restriction, top-level `secretKey` / `metastore.password` / `valkey.password` / 
`previousSecretKey` env restrictions, dev-only `init.adminUser` / 
`init.loadExamples`. A plaintext secret in the YAML is a problem regardless of 
whether the task runs, so `disabled: true` does not authorize bypassing these.
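
The categorization above can be modeled in plain Go to make the bypass behavior concrete. This is an illustration only: the real rules are CEL expressions on the CRD, and `ruleKind`, `rule`, and `enforce` are hypothetical names that do not appear in the operator.

```go
package main

import "fmt"

// ruleKind distinguishes the two categories described in the PR.
type ruleKind int

const (
	taskReadiness ruleKind = iota // bypassed when the task is disabled
	secretHygiene                 // enforced regardless of disabled
)

// rule is a hypothetical model of one validation rule and its outcome.
type rule struct {
	name     string
	kind     ruleKind
	violated bool
}

// enforce returns the names of violated rules that still apply,
// skipping task-readiness rules when the task is disabled.
func enforce(disabled bool, rules []rule) []string {
	var failed []string
	for _, r := range rules {
		if r.kind == taskReadiness && disabled {
			continue
		}
		if r.violated {
			failed = append(failed, r.name)
		}
	}
	return failed
}

func main() {
	rules := []rule{
		{"clone requires metastore.host", taskReadiness, true},
		{"clone source.password env restriction", secretHygiene, true},
	}
	fmt.Println(len(enforce(true, rules)))  // disabled: only the hygiene rule remains
	fmt.Println(len(enforce(false, rules))) // enabled: both rules apply
}
```

With `disabled` set, only the secret-hygiene violation is reported; with the task enabled, both are.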
   
   ### Uniform labeling on parent-owned resources
   
   Service/HPA/PDB/NetworkPolicy already set `ObjectMeta.Labels`; Deployments 
and ConfigMaps now do too. `reconcileParentOwnedConfigMap` accepts a `labels` 
parameter; component callers pass `componentLabels(componentType, parentName)`, 
and lifecycle task callers pass labels for the lifecycle component. The kubectl 
idiom `kubectl get deploy,svc,cm,hpa,pdb -l 
app.kubernetes.io/instance=<parent>` now lists every parent-owned resource 
consistently.
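
As a rough sketch of why uniform labels make that selector work, the snippet below builds a label set and checks it against an equality selector. The body of `componentLabels` is assumed, not taken from the operator: only the `app.kubernetes.io/instance` key appears in this description, and the `app.kubernetes.io/component` key is an assumption based on the common Kubernetes label convention. `matchesSelector` is a hypothetical helper.

```go
package main

import "fmt"

// componentLabels is a hypothetical reconstruction; only the instance key
// is confirmed by the PR text, the component key is assumed.
func componentLabels(componentType, parentName string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/instance":  parentName,
		"app.kubernetes.io/component": componentType,
	}
}

// matchesSelector reports whether labels satisfy every key=value pair
// in an equality-based selector.
func matchesSelector(labels, selector map[string]string) bool {
	for k, want := range selector {
		got, ok := labels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"app.kubernetes.io/instance": "my-superset"}
	deployLabels := componentLabels("web-server", "my-superset")
	fmt.Println(matchesSelector(deployLabels, selector)) // labeled resource is selected
	fmt.Println(matchesSelector(nil, selector))          // unlabeled resource is not
}
```

A resource with no `ObjectMeta.Labels` never matches the selector, which is why Deployments and ConfigMaps were previously absent from such listings.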
   
   Cleanup stays deterministic and name-based, driven by the descriptor table — 
`docs/architecture/internals.md` is rewritten to describe what actually happens 
(the prior label-based-cleanup claim did not match the implementation).
   
   ### `ContainerImageSpec` for non-Superset images
   
   `ImageSpec.Repository` defaults to the Superset image, but the same struct 
was previously reused for `lifecycle.maintenancePage.image` and 
`lifecycle.clone.image`. A user supplying `image: {tag: alpine}` for clone got 
`apachesuperset.docker.scarf.sh/apache/superset:alpine` instead of 
`postgres:alpine`.
   
   A new `ContainerImageSpec` in `api/v1alpha1/shared_types.go` (no kubebuilder 
repository default) is used for those nested image fields. `resolveCloneImage` 
and `resolveMaintenanceImage` share a new `resolveContainerImage` helper that 
merges user fields with context-appropriate defaults (`postgres:17-alpine` / 
`mysql:8-alpine` / `nginx:alpine`).
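
The merge can be sketched as follows. The struct fields and the function signature are assumptions for illustration (the real `ContainerImageSpec` and `resolveContainerImage` may differ); the point is that each unset user field falls back to a context-specific default instead of a Superset default.

```go
package main

import "fmt"

// ContainerImageSpec is a simplified stand-in for the API type;
// the field set here is assumed.
type ContainerImageSpec struct {
	Repository string
	Tag        string
}

// resolveContainerImage merges user-supplied fields with the defaults
// for the calling context. This signature is hypothetical.
func resolveContainerImage(user *ContainerImageSpec, defaultRepo, defaultTag string) string {
	repo, tag := defaultRepo, defaultTag
	if user != nil {
		if user.Repository != "" {
			repo = user.Repository
		}
		if user.Tag != "" {
			tag = user.Tag
		}
	}
	return repo + ":" + tag
}

func main() {
	// image: {tag: alpine} for a PostgreSQL clone keeps the postgres repository.
	fmt.Println(resolveContainerImage(&ContainerImageSpec{Tag: "alpine"}, "postgres", "17-alpine")) // postgres:alpine
	// No user override for the maintenance page falls back to the default.
	fmt.Println(resolveContainerImage(nil, "nginx", "alpine")) // nginx:alpine
}
```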
   
   ### `podTemplate.container.ports` documentation
   
   The container fields table in `docs/user-guide/configuration.md` was missing 
`ports`, even though the API exposes it and the operator consumes the first 
resolved port for the Service `targetPort` and the operator-managed 
NetworkPolicy ingress port. Added a row.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

