Hello devs,

I'd like to revive this discussion. There has been a ticket for this effort 
for some time [1], and the problem affects us as well. Right now we have a 
custom solution similar to "environment variables", but it can only be used 
in parts of our downstream product. Our main goal would be to use variables 
in DDLs (not necessarily for hiding sensitive properties). Being able to 
reuse values across multiple tables would be really handy.

That said, it is tempting to kill two birds with one stone, but a sensitive 
property requires much more care than a regular one, so I think the two 
should be handled separately, at least in the beginning. The tricky part of 
"environment variables" is their scope: if they do not come from an external 
system, they will probably need to be persisted. Keeping them in memory is an 
alternative, but depending on the intended scope that may be insufficient.

As for sensitive properties, I think a small step forward could be to hide 
their values in the output of "SHOW CREATE TABLE".
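
To illustrate the masking idea, here is a minimal sketch. It is not existing 
Flink code: the class name, the mask string, and the rule "a key containing 
'password' or 'secret' is sensitive" are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of Flink): masks sensitive table options for display. */
final class SensitiveOptionMasker {

    // Assumed placeholder shown instead of the real value.
    private static final String MASK = "******";

    private SensitiveOptionMasker() {}

    /** Assumed heuristic: a key is sensitive if it mentions "password" or "secret". */
    static boolean isSensitive(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        return lower.contains("password") || lower.contains("secret");
    }

    /** Returns a copy of the options, in the same order, with sensitive values masked. */
    static Map<String, String> mask(Map<String, String> options) {
        Map<String, String> masked = new LinkedHashMap<>();
        options.forEach((key, value) -> masked.put(key, isSensitive(key) ? MASK : value));
        return masked;
    }
}
```

A "SHOW CREATE TABLE" implementation could pass the table options through 
something like mask() before printing the WITH clause, leaving the stored 
options untouched.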

For variables used in a DDL, I'd imagine they could be scoped to a whole 
catalog for starters. As long as the catalog exists, those variables would 
be valid.
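
As a rough sketch of what resolving such variables in an option value could 
look like, using the `${...}` syntax and the `$${...}` escape proposed 
earlier in this thread: the class below is hypothetical (not a Flink API), 
and the Map stands in for whatever catalog-scoped store would hold the 
variables. Failing on an undefined variable is also just an assumption.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Flink): resolves ${name} tokens in an option value. */
final class DdlVariableResolver {

    // Matches "${name}", optionally preceded by one extra "$" (the escape form "$${name}").
    private static final Pattern TOKEN = Pattern.compile("(\\$?)\\$\\{([^}]+)\\}");

    private DdlVariableResolver() {}

    static String resolve(String value, Map<String, String> variables) {
        Matcher matcher = TOKEN.matcher(value);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(2);
            String replacement;
            if (!matcher.group(1).isEmpty()) {
                // Escaped token: "$${name}" yields the literal text "${name}".
                replacement = "${" + name + "}";
            } else {
                replacement = variables.get(name);
                if (replacement == null) {
                    // Assumption: an undefined variable is an error rather than left as-is.
                    throw new IllegalArgumentException("Undefined variable: " + name);
                }
            }
            // quoteReplacement keeps "$" and "\" in the replacement from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With a variable otap_env = test defined for the catalog, the value 
'kafka-${otap_env}-1:9093' would resolve to 'kafka-test-1:9093', while 
'$${not-an-env-var}' would come out as the literal '${not-an-env-var}'.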

I have not checked the implementation details yet, so I may be missing 
something important or be wrong in places, but I wanted to get some feedback 
on the idea.

WDYT?

[1] https://issues.apache.org/jira/browse/FLINK-28028

Best,
F


------- Original Message -------
On Monday, April 4th, 2022 at 09:53, Timo Walther <twal...@apache.org> wrote:


> 
> 
> Hi Fred,
> 
> thanks for starting this discussion. I totally agree that this is an issue
> that the community should solve. It popped up before and is still
> unsolved today. Great that you offer your help here. So let's clarify
> the implementation details.
> 
> 1) Global vs. Local solution
> 
> Is this a DDL-only problem? If yes, it would be easier to solve it in
> the `FactoryUtil` that all Flink connectors and formats use.
> 
> 2) Configuration vs. environment variables
> 
> I agree with Qingsheng that environment variables are not always
> straightforward to identify if you have a "pre-flight phase" and a
> "cluster phase".
> In the DynamicTableFactory, one has access to Flink configuration and
> could resolve `${...}` variables.
> 
> 
> What do you think?
> 
> Regards,
> Timo
> 
> 
> Am 01.04.22 um 12:26 schrieb Qingsheng Ren:
> 
> > Hi Fred,
> > 
> > Thanks for raising the discussion! I think the definition of “environment 
> > variable” varies by context. For Flink on K8s it means the environment 
> > variable of a container, and if you are a SQL client user it could refer 
> > to an environment variable of the SQL client, or even the system 
> > properties of the JVM. So the term “environment variable” is a bit vague 
> > across different environments.
> > 
> > A more generic solution in my mind is to take advantage of 
> > configurations in Flink and pass table options dynamically by adding 
> > configs to TableConfig or even flink-conf.yaml. For example, the option 
> > “table.dynamic.options.my_catalog.my_db.my_table.accessId = foo” means 
> > adding the table option “accessId = foo” to the table 
> > “my_catalog.my_db.my_table”. This way we could decouple the DDL statement 
> > from table options containing secret credentials. What do you think?
> > 
> > Best regards,
> > 
> > Qingsheng
> > 
> > > On Mar 30, 2022, at 16:25, Teunissen, F.G.J. (Fred) 
> > > fred.teunis...@ing.com.INVALID wrote:
> > > 
> > > Hi devs,
> > > 
> > > Some SQL table properties contain sensitive data, like passwords, that 
> > > we do not want to expose to other users in the VVP UI. Also, having 
> > > them in clear text in a SQL statement is not secure. For example,
> > > 
> > > CREATE TABLE Orders (
> > > `user` BIGINT,
> > > product STRING,
> > > order_time TIMESTAMP(3)
> > > ) WITH (
> > > 'connector' = 'kafka',
> > > 
> > > 'properties.bootstrap.servers' = 'kafka-host-1:9093,kafka-host-2:9093',
> > > 'properties.security.protocol' = 'SSL',
> > > 'properties.ssl.key.password' = 'should-be-a-secret',
> > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
> > > 'properties.ssl.keystore.password' = 'should-also-be-a-secret',
> > > 'properties.ssl.truststore.location' = '/tmp/secrets/my-truststore.jks',
> > > 'properties.ssl.truststore.password' = 'should-again-be-a-secret',
> > > 'scan.startup.mode' = 'earliest-offset'
> > > );
> > > 
> > > I would like to bring up for discussion a proposal to provide these 
> > > secret values via environment variables, since these can be populated 
> > > from a K8s ConfigMap or Secret.
> > > 
> > > To implement the SQL table properties, connectors and formats use the 
> > > ConfigOption<T> class. This class could be extended so that it checks 
> > > whether the config value contains certain tokens, like 
> > > ‘${env-var-name}’. If it does, it could fetch the value of the 
> > > environment variable and use it to replace that token in the config 
> > > value.
> > > 
> > > The above SQL statement would then look like,
> > > 
> > > CREATE TABLE Orders (
> > > `user` BIGINT,
> > > product STRING,
> > > order_time TIMESTAMP(3)
> > > ) WITH (
> > > 'connector' = 'kafka',
> > > 
> > > 'properties.bootstrap.servers' = 'kafka-host-1:9093,kafka-host-2:9093',
> > > 'properties.security.protocol' = 'SSL',
> > > 'properties.ssl.key.password' = '${secret_kafka_ssl_key_password}',
> > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
> > > 'properties.ssl.keystore.password' = 
> > > '${secret_kafka_ssl_keystore_password}',
> > > 'properties.ssl.truststore.location' = '/tmp/secrets/my-truststore.jks',
> > > 'properties.ssl.truststore.password' = 
> > > '${secret_kafka_ssl_truststore_password}',
> > > 'scan.startup.mode' = 'earliest-offset'
> > > );
> > > 
> > > For the purpose of secrets I don’t think you need any complex processing 
> > > of tokens but perhaps there are other usages as well. For instance,
> > > 
> > > 'properties.bootstrap.servers' = 
> > > 'kafka-${otap_env}-1:9093,kafka-${otap_env}-2:9093',
> > > 
> > > Because it is possible (though I think unlikely) that someone wants a 
> > > property value like ‘${not-an-env-var}’, you need to be able to escape 
> > > the ‘$’ token, like ‘$${not-an-env-var}’. This also means that, in 
> > > theory, it would break compatibility.
> > > 
> > > Looking forward to your feedback!
> > > 
> > > Best,
> > > Fred Teunissen
> > > 
> 
>
