Andrew, please see my previous response; it covers the secrets case.

Gyula
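For reference, the env-replacement approach discussed in the thread below could be wired up roughly as follows. This is a sketch only: the Secret name, key, deployment name, and the Datadog reporter config key are all made up for illustration, and `kubernetes.*.entrypoint.args` requires Flink 1.16+ (FLINK-29123).

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example                            # hypothetical name
spec:
  flinkVersion: v1_16
  flinkConfiguration:
    # Inject an env var's value into the config at entrypoint time
    # (Flink 1.16+); the metrics config key here is illustrative.
    kubernetes.jobmanager.entrypoint.args: -D metrics.reporter.dtdg.apikey=$DATADOG_API_KEY
    kubernetes.taskmanager.entrypoint.args: -D metrics.reporter.dtdg.apikey=$DATADOG_API_KEY
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: DATADOG_API_KEY        # populated from a Secret,
              valueFrom:                   # never persisted in git
                secretKeyRef:
                  name: datadog            # hypothetical Secret name
                  key: api-key
```

With this, the sensitive value lives only in the Secret and the env var; it never appears in the rendered flink-conf.yaml ConfigMap.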
On Thu, Dec 1, 2022 at 2:54 PM Andrew Otto <o...@wikimedia.org> wrote:

> > several failures to write into $FLINK_HOME/conf/.
> I'm working on building Flink and flink-kubernetes-operator images for the
> Wikimedia Foundation
> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/>,
> and I found this strange as well. It makes sense in a docker /
> docker-compose-only environment, but in k8s, where a ConfigMap is
> responsible for flink-conf.yaml (and logs all go to the console, not
> FLINK_HOME/log), I'd prefer that the image not be modified by the
> ENTRYPOINT.
>
> I believe that for flink-kubernetes-operator, the docker-entrypoint.sh
> <https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh>
> provided by flink-docker is not really needed. It seems to be written more
> for deployments outside of Kubernetes. flink-kubernetes-operator never
> calls the built-in subcommands (e.g. standalone-job) and always runs in
> 'pass-through' mode, just execing the args passed to it. At WMF we build
> our own images <https://doc.wikimedia.org/docker-pkg/>, so I'm planning
> on removing all of the stuff in ENTRYPOINTs that mangles the image.
> Anything I might want to keep from docker-entrypoint.sh (like enabling
> jemalloc
> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#73>)
> I should be able to do in the Dockerfile at image creation time.
>
> > want to set an API key as part of the flink-conf.yaml file, but we
> > don't want it to be persisted in Kubernetes or in our version control
> I personally am still pretty green at k8s, but would using Kubernetes
> Secrets
> <https://kubernetes.io/docs/concepts/configuration/secret/#use-case-secret-visible-to-one-container-in-a-pod>
> work for your use case? I know we use them at WMF, but from a quick glance
> I'm not sure how to combine them with flink-kubernetes-operator's
> ConfigMap that renders flink-conf.yaml, though I feel like there should be
> a way.
>
> On Wed, Nov 30, 2022 at 4:59 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>
>> Hi Lucas!
>>
>> The Flink Kubernetes integration itself is responsible for mounting the
>> ConfigMap and overwriting the entrypoint, not the operator. Therefore
>> this is not something we can easily change from the operator side.
>> However, I think we are looking at the problem from the wrong side, and
>> there may be a solution already :)
>>
>> Ideally what you want is ENV replacement in the Flink configuration.
>> This is not something the Flink community has added yet, unfortunately,
>> but we have it on our radar for the operator at least
>> (https://issues.apache.org/jira/browse/FLINK-27491). It will probably be
>> added in the next 1.4.0 version.
>>
>> This will be possible from Flink 1.16, which introduced a small feature
>> that allows us to inject parameters into the Kubernetes entrypoints:
>> https://issues.apache.org/jira/browse/FLINK-29123
>> https://github.com/apache/flink/commit/c37643031dca2e6d4c299c0d704081a8bffece1d
>>
>> While it's not implemented in the operator yet, you could try setting
>> the following config in Flink 1.16.0:
>>
>> kubernetes.jobmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
>> kubernetes.taskmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
>>
>> If you use this configuration together with the default native mode in
>> the operator, it should work, I believe.
>>
>> Please try and let me know!
>> Gyula
>>
>> On Wed, Nov 30, 2022 at 10:36 PM Lucas Caparelli <
>> lucas.capare...@gympass.com> wrote:
>>
>>> Hello folks,
>>>
>>> Not sure if this is the best list for this; sorry if it isn't.
>>> I'd appreciate some pointers :-)
>>>
>>> When using flink-kubernetes-operator [1], docker-entrypoint.sh [2] goes
>>> through several failures to write into $FLINK_HOME/conf/. We believe
>>> this is due to that volume being mounted from a ConfigMap, which means
>>> it's read-only.
>>>
>>> This has been reported in the past in GCP's operator, but I was unable
>>> to find any kind of resolution for it:
>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/213
>>>
>>> In our use case, we want to set an API key as part of the
>>> flink-conf.yaml file, but we don't want it to be persisted in
>>> Kubernetes or in our version control, since it's sensitive data. This
>>> API key is used by Flink to report metrics to Datadog [3].
>>>
>>> We have automation in place that allows us to accomplish this by
>>> setting environment variables pointing to a path in our secret manager,
>>> which only gets injected at runtime. That part is working fine.
>>>
>>> However, we're trying to inject this secret using the FLINK_PROPERTIES
>>> variable, which is appended [4] to the flink-conf.yaml file in the
>>> docker-entrypoint script; this fails because the filesystem holding the
>>> file is read-only.
>>>
>>> We attempted to work around this in two different ways:
>>>
>>> - providing our own .spec.containers[0].command, which copied
>>> /opt/flink to /tmp/flink and set FLINK_HOME=/tmp/flink. This did not
>>> work because the operator overwrote it and replaced it with its
>>> original command/args;
>>> - providing an initContainer sharing the volumes so it could make the
>>> copy without being overridden by the operator's command/args. This did
>>> not work because the initContainer present in the spec never makes it
>>> to the resulting Deployment; it seems the operator ignores it.
>>>
>>> We have some questions:
>>>
>>> 1. Is this overriding of the pod template present in FlinkDeployment
>>> intentional? That is, should our custom command/args and initContainers
>>> have been overwritten? If so, I find it a bit confusing that these
>>> fields are present and available for use at all.
>>> 2. Since the ConfigMap volume will always be mounted as read-only, it
>>> seems to me there are some adjustments to be made in order for this
>>> script to work correctly. Do you think it would make sense for the
>>> script to copy the contents of the ConfigMap volume to a writable
>>> directory during initialization, and then use this copy for any
>>> subsequent operation? Perhaps copying to $FLINK_HOME, which the user
>>> could set themselves, maybe even with a sane default which wouldn't
>>> fail on writes (e.g. /tmp/flink).
>>>
>>> Thanks in advance for your attention and hard work on the project!
>>>
>>> [1]: https://github.com/apache/flink-kubernetes-operator
>>> [2]: https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh
>>> [3]: https://docs.datadoghq.com/integrations/flink/
>>> [4]: https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh#L86-L88