Andrew, please see my previous response; it covers the secrets case.

Gyula
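For reference, the env-replacement approach discussed in the thread below could be wired up roughly as follows. This is a sketch only: the Secret name, key, deployment name, and the Datadog reporter config key are all made up for illustration, and `kubernetes.*.entrypoint.args` requires Flink 1.16+ (FLINK-29123).

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example                            # hypothetical name
spec:
  flinkVersion: v1_16
  flinkConfiguration:
    # Inject an env var's value into the config at entrypoint time
    # (Flink 1.16+); the metrics config key here is illustrative.
    kubernetes.jobmanager.entrypoint.args: -D metrics.reporter.dtdg.apikey=$DATADOG_API_KEY
    kubernetes.taskmanager.entrypoint.args: -D metrics.reporter.dtdg.apikey=$DATADOG_API_KEY
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: DATADOG_API_KEY        # populated from a Secret,
              valueFrom:                   # never persisted in git
                secretKeyRef:
                  name: datadog            # hypothetical Secret name
                  key: api-key
```

With this, the sensitive value lives only in the Secret and the env var; it never appears in the rendered flink-conf.yaml ConfigMap.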
On Thu, Dec 1, 2022 at 2:54 PM Andrew Otto <o...@wikimedia.org> wrote:

> > several failures to write into $FLINK_HOME/conf/.
> I'm working on building Flink and flink-kubernetes-operator images for the
> Wikimedia Foundation
> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/>,
> and I found this strange as well. It makes sense in a docker /
> docker-compose-only environment, but in k8s, where a ConfigMap is
> responsible for flink-conf.yaml (and logs all go to the console, not
> FLINK_HOME/log), I'd prefer that the image not be modified by the
> ENTRYPOINT.
>
> I believe that for flink-kubernetes-operator, the docker-entrypoint.sh
> <https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh>
> provided by flink-docker is not really needed. It seems to be written more
> for deployments outside of Kubernetes. flink-kubernetes-operator never
> calls the built-in subcommands (e.g. standalone-job) and always runs in
> 'pass-through' mode, just execing the args passed to it. At WMF we build
> our own images <https://doc.wikimedia.org/docker-pkg/>, so I'm planning
> on removing all of the stuff in ENTRYPOINTs that mangles the image.
> Anything I might want to keep from docker-entrypoint.sh (like enabling
> jemalloc
> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#73>)
> I should be able to do in the Dockerfile at image creation time.
>
> > want to set an API key as part of the flink-conf.yaml file, but we
> > don't want it to be persisted in Kubernetes or in our version control
> I personally am still pretty green at k8s, but would using Kubernetes
> Secrets
> <https://kubernetes.io/docs/concepts/configuration/secret/#use-case-secret-visible-to-one-container-in-a-pod>
> work for your use case? I know we use them at WMF, but from a quick glance
> I'm not sure how to combine them with flink-kubernetes-operator's
> ConfigMap that renders flink-conf.yaml, though I feel like there should be
> a way.
>
> On Wed, Nov 30, 2022 at 4:59 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>
>> Hi Lucas!
>>
>> The Flink Kubernetes integration itself is responsible for mounting the
>> ConfigMap and overwriting the entrypoint, not the operator. Therefore
>> this is not something we can easily change from the operator side.
>> However, I think we are looking at the problem from the wrong side, and
>> there may be a solution already :)
>>
>> Ideally what you want is ENV replacement in the Flink configuration.
>> This is not something the Flink community has added yet, unfortunately,
>> but we have it on our radar for the operator at least
>> (https://issues.apache.org/jira/browse/FLINK-27491). It will probably be
>> added in the next 1.4.0 version.
>>
>> This will be possible from Flink 1.16, which introduced a small feature
>> that allows us to inject parameters into the Kubernetes entrypoints:
>> https://issues.apache.org/jira/browse/FLINK-29123
>> https://github.com/apache/flink/commit/c37643031dca2e6d4c299c0d704081a8bffece1d
>>
>> While it's not implemented in the operator yet, you could try setting
>> the following config in Flink 1.16.0:
>>
>> kubernetes.jobmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
>> kubernetes.taskmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
>>
>> If you use this configuration together with the default native mode in
>> the operator, it should work, I believe.
>>
>> Please try and let me know!
>> Gyula
>>
>> On Wed, Nov 30, 2022 at 10:36 PM Lucas Caparelli <
>> lucas.capare...@gympass.com> wrote:
>>
>>> Hello folks,
>>>
>>> Not sure if this is the best list for this; sorry if it isn't.
>>> I'd appreciate some pointers :-)
>>>
>>> When using flink-kubernetes-operator [1], docker-entrypoint.sh [2] goes
>>> through several failures to write into $FLINK_HOME/conf/. We believe
>>> this is due to that volume being mounted from a ConfigMap, which means
>>> it's read-only.
>>>
>>> This has been reported in the past in GCP's operator, but I was unable
>>> to find any kind of resolution for it:
>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/213
>>>
>>> In our use case, we want to set an API key as part of the
>>> flink-conf.yaml file, but we don't want it to be persisted in
>>> Kubernetes or in our version control, since it's sensitive data. This
>>> API key is used by Flink to report metrics to Datadog [3].
>>>
>>> We have automation in place that allows us to accomplish this by
>>> setting environment variables pointing to a path in our secret manager,
>>> which only gets injected at runtime. That part is working fine.
>>>
>>> However, we're trying to inject this secret using the FLINK_PROPERTIES
>>> variable, which is appended [4] to the flink-conf.yaml file in the
>>> docker-entrypoint script; this fails because the filesystem holding the
>>> file is read-only.
>>>
>>> We attempted to work around this in two different ways:
>>>
>>> - providing our own .spec.containers[0].command, which copied
>>> /opt/flink to /tmp/flink and set FLINK_HOME=/tmp/flink. This did not
>>> work because the operator overwrote it and replaced it with its
>>> original command/args;
>>> - providing an initContainer sharing the volumes so it could make the
>>> copy without being overridden by the operator's command/args. This did
>>> not work because the initContainer present in the spec never makes it
>>> to the resulting Deployment; it seems the operator ignores it.
>>>
>>> We have some questions:
>>>
>>> 1. Is this overriding of the pod template present in FlinkDeployment
>>> intentional? That is, should our custom command/args and initContainers
>>> have been overwritten? If so, I find it a bit confusing that these
>>> fields are present and available for use at all.
>>> 2. Since the ConfigMap volume will always be mounted as read-only, it
>>> seems to me there are some adjustments to be made in order for this
>>> script to work correctly. Do you think it would make sense for the
>>> script to copy the contents of the ConfigMap volume to a writable
>>> directory during initialization, and then use this copy for any
>>> subsequent operation? Perhaps copying to $FLINK_HOME, which the user
>>> could set themselves, maybe even with a sane default which wouldn't
>>> fail on writes (e.g. /tmp/flink).
>>>
>>> Thanks in advance for your attention and hard work on the project!
>>>
>>> [1]: https://github.com/apache/flink-kubernetes-operator
>>> [2]: https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh
>>> [3]: https://docs.datadoghq.com/integrations/flink/
>>> [4]: https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh#L86-L88