Hi friends, I like the updated FLIP goals; that’s what I want. I have some feedback:
(1) Minor, interface hierarchy: Why doesn't WritableSecretStore extend SecretStore?
(2) Configurability of SECRET_FIELDS: Could the hardcoded SECRET_FIELDS in BasicConnectionFactory be made configurable (e.g., 'token' vs. 'accessKey') for better connector compatibility?
(3) Inconsistent return types: ConnectionFactory#resolveConnection returns SensitiveConnection, while BasicConnectionFactory#resolveConnection returns Map<String, String>. Should these be aligned?
(4) Framework-level resolution: +1 to Shengkai's point about having the framework (DynamicTableFactory) return complete options to reduce connector adaptation cost.
(5) Secret ID handling: When no encryption is needed, secretId is null (from secrets.isEmpty() ? null : secretStore.storeSecret(secrets)). This behavior should be explicitly documented in the interfaces.

Best,
Leonard

> On Jul 24, 2025, at 11:44, Shengkai Fang <fskm...@gmail.com> wrote:
>
> Hi.
>
> Sorry for the late reply. I just have some questions:
>
> 1. Why does SecretStoreFactory#open throw a CatalogException? I think the external system cannot handle this exception.
>
> 2. I think we can also modify the CREATE CATALOG DDL syntax.
>
> ```
> CREATE CATALOG cat USING CONNECTION mycat.mydb.mysql_prod
> WITH (
>   'type' = 'jdbc'
> );
> ```
>
> 3. It seems the connector factory should merge the WITH options and the connection options together and then create the source/sink. It would be better if the framework merged all these options so that connectors don't need any code for it.
>
> 4. Why do we need to introduce ConnectionFactory? I think a connection is like a CatalogTable. It should hold the basic information, and all information in the connection should be stored in the secret store.
>
> Best,
> Shengkai
>
> Timo Walther <twal...@apache.org> wrote on Tue, Jul 22, 2025 at 22:04:
>
>> Hi Mayank,
>>
>> Thanks for updating the FLIP and clearly documenting our discussion.
>>
>> +1 for moving forward with the vote, unless there are objections from others.
>>
>> Cheers,
>> Timo
>>
>> On 22.07.25 02:14, Mayank Juneja wrote:
>>> Hi Ryan and Austin,
>>>
>>> Thanks for your suggestions. I've updated the FLIP with the following additional info:
>>>
>>> 1. *table.secret-store.kind* key to register the SecretStore in a YAML file
>>> 2. *updateSecret* method in the WritableSecretStore interface
>>>
>>> Thanks,
>>> Mayank
>>>
>>> On Thu, Jul 17, 2025 at 5:42 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>
>>>> Hey all,
>>>>
>>>> Thanks for the nice FLIP, all! I’m just reading through and had one question on the ALTER CONNECTION implementation flow. Would it make sense for the WritableSecretStore to expose a method for updating a secret by ID, so it can be done atomically? Otherwise, would we need to call delete and create again, potentially introducing concurrent resolution errors?
>>>>
>>>> Best,
>>>> Austin
>>>>
>>>> On Thu, Jul 17, 2025 at 13:07 Ryan van Huuksloot <ryan.vanhuuksl...@shopify.com.invalid> wrote:
>>>>
>>>>> Hi Mayank,
>>>>>
>>>>> Thanks for updating the FLIP. Overall it looks good to me.
>>>>>
>>>>> One question I had is how someone could choose the SecretStore they want to use if they use something like the SQL Gateway as the entrypoint on top of a remote Session cluster. I don't see an explicit way to set the SecretStore in the FLIP.
>>>>> I assume we'll do it similarly to the CatalogStore, but I wanted to call this out.
>>>>>
>>>>> table.catalog-store.kind: file
>>>>> table.catalog-store.file.path: file:///path/to/catalog/store/
>>>>>
>>>>> Ryan van Huuksloot
>>>>> Staff Engineer, Infrastructure | Streaming Platform
>>>>>
>>>>> On Wed, Jul 16, 2025 at 2:22 PM Mayank Juneja <mayankjunej...@gmail.com> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> Thanks for your valuable inputs.
>>>>>> I have updated the FLIP with the ideas proposed earlier in the thread. Looking forward to your feedback.
>>>>>> https://cwiki.apache.org/confluence/x/cYroF
>>>>>>
>>>>>> Best,
>>>>>> Mayank
>>>>>>
>>>>>> On Fri, Jun 27, 2025 at 2:59 AM Leonard Xu <xbjt...@gmail.com> wrote:
>>>>>>
>>>>>>> Quick response: thanks Mayank, Hao, and Timo for the effort. The new proposal looks good, +1 from my side.
>>>>>>>
>>>>>>> Could you draft (update) the current FLIP docs so that we can have some specific discussions later?
>>>>>>>
>>>>>>> Best,
>>>>>>> Leonard
>>>>>>>
>>>>>>>> On Jun 26, 2025, at 15:06, Timo Walther <twal...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> Sorry for the late reply, feature freeze kept me busy. Mayank, Hao, and I synced offline and came up with an improved proposal. Before we update the FLIP, let me summarize the most important key facts that hopefully address most concerns:
>>>>>>>>
>>>>>>>> 1) SecretStore
>>>>>>>> - Similar to CatalogStore, we introduce a SecretStore as the highest level in TableEnvironment.
>>>>>>>> - SecretStore is initialized with options and potentially environment variables, including EnvironmentSettings.withSecretStore(SecretStore).
>>>>>>>> - The SecretStore is pluggable and discovered using the regular factory approach.
>>>>>>>> - For example, it could implement Azure Key Vault or other cloud provider secret stores.
>>>>>>>> - Goal: Flink and Flink catalogs do not have to deal with sensitive data.
>>>>>>>>
>>>>>>>> 2) Connections
>>>>>>>> - Connections are catalog objects identified with 3-part identifiers. 3-part identifiers are crucial for manageability of larger projects and align with existing catalog objects.
>>>>>>>> - They contain connection details, e.g. URL, query parameters, and other configuration.
>>>>>>>> - They do not contain secrets, but only pointers to secrets in the SecretStore.
>>>>>>>>
>>>>>>>> 3) Connection DDL
>>>>>>>>
>>>>>>>> CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
>>>>>>>>   'type' = 'basic' | 'bearer' | 'jwt' | 'oauth' | ...,
>>>>>>>>   ...
>>>>>>>> )
>>>>>>>>
>>>>>>>> - Connection type is pluggable and discovered using the regular factory approach.
>>>>>>>> - The factory extracts secrets and puts them into the SecretStore.
>>>>>>>> - The factory leaves only non-confidential options that can be stored in a catalog.
>>>>>>>>
>>>>>>>> When executing:
>>>>>>>> CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
>>>>>>>>   'type' = 'basic',
>>>>>>>>   'url' = 'api.example.com',
>>>>>>>>   'username' = 'bob',
>>>>>>>>   'password' = 'xyz'
>>>>>>>> )
>>>>>>>>
>>>>>>>> The catalog will receive something similar to:
>>>>>>>> CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
>>>>>>>>   'type' = 'basic',
>>>>>>>>   'url' = 'api.example.com',
>>>>>>>>   'secret.store' = 'azure-key-vault',
>>>>>>>>   'secret.id' = 'secretId'
>>>>>>>> )
>>>>>>>>
>>>>>>>> - However, the exact property design is up to the connection factory.
>>>>>>>>
>>>>>>>> 4) Connection Usage
>>>>>>>>
>>>>>>>> CREATE TABLE t (...) USING CONNECTION mycat.mydb.OpenAPI;
>>>>>>>>
>>>>>>>> - MODEL, FUNCTION, and TABLE DDL will support the USING CONNECTION keyword, similar to BigQuery.
>>>>>>>> - The connection will be provided in a table/model provider/function definition factory.
>>>>>>>>
>>>>>>>> 5) CatalogStore / Catalog Initialization
>>>>>>>>
>>>>>>>> Catalog store or catalog can make use of SecretStore to retrieve initial credentials for bootstrapping. All objects lower than catalog store/catalog can then use connections. If you think we still need system-level connections, we can support CREATE SYSTEM CONNECTION GlobalName WITH (..)
>>>>>>>> similar to SYSTEM functions, stored directly in a ConnectionManager in TableEnvironment. But for now I would suggest starting simple with per-catalog connections and evolving the design later.
>>>>>>>>
>>>>>>>> Dealing with secrets is a very sensitive topic and I'm clearly not an expert on it. This is why we should try to push the problem to existing solutions and not start storing secrets in Flink in any way. Thus, the interfaces will be defined very generically.
>>>>>>>>
>>>>>>>> Looking forward to your feedback.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>> On 09.06.25 04:01, Leonard Xu wrote:
>>>>>>>>> Thanks Timo for joining this thread.
>>>>>>>>> I agree that this feature is needed by the community; the current disagreement is only about the implementation method or solution.
>>>>>>>>> Your thoughts look generally good to me, looking forward to your proposal.
>>>>>>>>> Best,
>>>>>>>>> Leonard
>>>>>>>>>> On Jun 6, 2025, at 22:46, Timo Walther <twal...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Thanks for this healthy discussion. Looking at the high number of participants, it looks like we definitely want this feature. We just need to figure out the "how".
>>>>>>>>>>
>>>>>>>>>> This reminds me very much of the discussion we had for CREATE FUNCTION. There, we discussed whether functions should be named globally or catalog-specific. In the end, we decided on both `CREATE SYSTEM FUNCTION` and `CREATE FUNCTION`, satisfying both the data platform team of an organization (which might provide system functions) and individual data teams or use cases (scoped by catalog/database).
>>>>>>>>>>
>>>>>>>>>> Looking at other modern vendors like Snowflake, there is SECRET (scoped to schema) [1] and API INTEGRATION [2] (scoped to account).
>>>>>>>>>> So other vendors also offer global and per-team / per-use-case connection details.
>>>>>>>>>>
>>>>>>>>>> In general, I think fitting connections into the existing concepts for catalog objects (with three-part identifiers) makes managing them easier. But I also see the need for global defaults.
>>>>>>>>>>
>>>>>>>>>> Btw, keep in mind that a catalog implementation should only store metadata. Similar to how a CatalogTable doesn't store the actual data, a CatalogConnection should not store the credentials. It should only offer a factory that allows for storing and retrieving them. In real-world scenarios, a factory is most likely backed by a product like Azure Key Vault.
>>>>>>>>>>
>>>>>>>>>> So code-wise, having a ConnectionManager that behaves similarly to FunctionManager sounds reasonable.
>>>>>>>>>>
>>>>>>>>>> +1 for having special syntax instead of using properties. This allows access to connections in tables, models, and functions. And catalogs, if we agree to have global ones as well.
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> Let me give this some more thought and come back with a concrete proposal by early next week.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Timo
>>>>>>>>>>
>>>>>>>>>> [1] https://docs.snowflake.com/en/sql-reference/sql/create-secret
>>>>>>>>>> [2] https://docs.snowflake.com/en/sql-reference/sql/create-api-integration
>>>>>>>>>>
>>>>>>>>>> On 04.06.25 10:47, Leonard Xu wrote:
>>>>>>>>>>> Hey Mayank,
>>>>>>>>>>> Please see my feedback below:
>>>>>>>>>>> 1. One of the motivations of this FLIP is to improve security. However, the current design stores all connection information in the catalog, and each Flink SQL job reads from the catalog during compilation.
>>>>>>>>>>> The connection information is passed between SQL Gateway and the catalog in plaintext, which actually introduces new security risks.
>>>>>>>>>>> 2. The name "Connection" should be changed to something like ConnectionSpec to clearly indicate that it is an object containing only static properties without a lifecycle. Putting aside the naming issue, I think the current model and hierarchy design is somewhat strange. Storing various kinds of connections (e.g., Kafka, MySQL) in the same Catalog with hierarchical identifiers like catalog-name.db-name.connection-name raises the following questions:
>>>>>>>>>>> (1) What is the purpose of this hierarchical structure of the Connection object?
>>>>>>>>>>> (2) If we can use a Connection to create a MySQL table, why can't we use a Connection to create a MySQL Catalog?
>>>>>>>>>>> 3. Regarding the connector usage examples given in this FLIP:
>>>>>>>>>>> ```sql
>>>>>>>>>>> 1 -- Example 2: Using connection for jdbc tables
>>>>>>>>>>> 2 CREATE OR REPLACE CONNECTION mysql_customer_db
>>>>>>>>>>> 3 WITH (
>>>>>>>>>>> 4   'type' = 'jdbc',
>>>>>>>>>>> 5   'jdbc.url' = 'jdbc:mysql://customer-db.example.com:3306/customerdb',
>>>>>>>>>>> 6   'jdbc.connection.ssl.enabled' = 'true'
>>>>>>>>>>> 7 );
>>>>>>>>>>> 8
>>>>>>>>>>> 9 CREATE TABLE customers (
>>>>>>>>>>> 10   customer_id INT,
>>>>>>>>>>> 11   PRIMARY KEY (customer_id) NOT ENFORCED
>>>>>>>>>>> 12 ) WITH (
>>>>>>>>>>> 13   'connector' = 'jdbc',
>>>>>>>>>>> 14   'jdbc.connection' = 'mysql_customer_db',
>>>>>>>>>>> 15   'jdbc.connection.ssl.enabled' = 'true',
>>>>>>>>>>> 16   'jdbc.connection.max-retry-timeout' = '60s',
>>>>>>>>>>> 17   'jdbc.table-name' = 'customers',
>>>>>>>>>>> 18   'jdbc.lookup.cache' = 'PARTIAL'
>>>>>>>>>>> 19 );
>>>>>>>>>>> ```
>>>>>>>>>>> I see three issues from the SQL semantics and connector compatibility
>>>>>>>>>>> perspectives:
>>>>>>>>>>> (1) Look at line 14: `mysql_customer_db` is an object identifier of a CONNECTION defined in SQL. However, this identifier is referenced via a string value inside the table’s WITH clause, which feels hacky to me.
>>>>>>>>>>> (2) Look at lines 14-16: the use of the specific prefix `jdbc.connection` will confuse users because `connection.xx` may already be used as a prefix for existing configuration items.
>>>>>>>>>>> (3) Look at lines 14-18: Why do all existing configuration options need to be prefixed with `jdbc`, even if they’re not related to Connection properties? This completely changes user habits. Is it backward compatible?
>>>>>>>>>>> In my opinion, Connection should be a model independent of both Catalog and Table, and it can be referenced by all catalog/table/udf/model objects.
>>>>>>>>>>> It should be managed by a component such as a ConnectionManager to enable reuse. For security purposes, authentication mechanisms could be supported within the ConnectionManager.
>>>>>>>>>>> Best,
>>>>>>>>>>> Leonard
>>>>>>>>>>>> On Jun 4, 2025, at 02:04, Martijn Visser <martijnvis...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> First of all, I think having a Connection resource is something that will be beneficial for Apache Flink. I could see that being extended in the future to allow for easier secret handling [1].
>>>>>>>>>>>> In my mind, I'm comparing this proposal against SQL/MED from the ISO standard [2]. I do think that SQL/MED isn't a very user-friendly syntax though, looking at Postgres for example [3].
>>>>>>>>>>>>
>>>>>>>>>>>> I think it's a valid question whether Connection should be considered with a catalog or database-level scope. @Ryan, can you share something more, since you've mentioned "Note: I much prefer catalogs for this case. Which is what we use internally to manage connection properties"? It looks like there isn't a strongly favoured approach looking at other vendors (e.g., Databricks scopes it on a Unity catalog, Snowflake on a database level).
>>>>>>>>>>>>
>>>>>>>>>>>> Also looking forward to Leonard's input.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Martijn
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-36818
>>>>>>>>>>>> [2] https://www.iso.org/standard/84804.html
>>>>>>>>>>>> [3] https://www.postgresql.org/docs/current/sql-createserver.html
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 30, 2025 at 5:07 AM Leonard Xu <xbjt...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Mayank.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the FLIP. I went through it quickly and found some issues which I think we need to discuss in depth later. As we’re on a short Dragon Boat Festival holiday, could you kindly hold on this thread? We will be back to continue the FLIP discussion.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Leonard
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Apr 29, 2025, at 23:07, Mayank Juneja <mayankjunej...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to open up for discussion a new FLIP-529 [1].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Motivation:
>>>>>>>>>>>>>> Currently, Flink SQL handles external connectivity by defining endpoints and credentials in table configuration.
>>>>>>>>>>>>>> This approach prevents reusability of these connections and makes table definitions less secure by exposing sensitive information.
>>>>>>>>>>>>>> We propose the introduction of a new "connection" resource in Flink. This will be a pluggable resource configured with a remote endpoint and associated access key. Once defined, connections can be reused across table definitions, and eventually for model definitions (as discussed in FLIP-437) for inference, enabling seamless and secure integration with external systems.
>>>>>>>>>>>>>> The connection resource will provide a new, optional way to manage external connectivity in Flink. Existing methods for table definitions will remain unchanged.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/cYroF
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>> Mayank Juneja
>>>>>>
>>>>>> --
>>>>>> *Mayank Juneja*
>>>>>> Product Manager | Data Streaming and AI
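
P.S. To make points (1), (2), and (5) of my feedback at the top of this mail concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the names SecretStore, WritableSecretStore, storeSecret, updateSecret, and the 'secret.id' key come from this thread, but every method signature, the InMemorySecretStore class, the getSecret method, and the extractSecrets helper are hypothetical and not taken from the FLIP. The sketch omits the 'secret.id' option when there is nothing to store, which is one possible way to make the "no secrets" case explicit instead of passing null around.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch; signatures are assumptions, not the FLIP's interfaces. */
final class SecretStoreSketch {

    /** Read-only view of a secret store. */
    interface SecretStore {
        /** Returns the secret for the given ID, or null if the ID is unknown. */
        Map<String, String> getSecret(String secretId);
    }

    /** Point (1): the writable variant extends the read-only interface. */
    interface WritableSecretStore extends SecretStore {
        /** Stores the secrets and returns a newly generated ID. */
        String storeSecret(Map<String, String> secrets);

        /** Replaces the secret behind an existing ID in one call. */
        void updateSecret(String secretId, Map<String, String> secrets);
    }

    /** In-memory store for illustration only; real secrets must not be kept like this. */
    static final class InMemorySecretStore implements WritableSecretStore {
        private final Map<String, Map<String, String>> secretsById = new HashMap<>();

        @Override
        public String storeSecret(Map<String, String> secrets) {
            String id = UUID.randomUUID().toString();
            secretsById.put(id, new HashMap<>(secrets));
            return id;
        }

        @Override
        public void updateSecret(String secretId, Map<String, String> secrets) {
            if (!secretsById.containsKey(secretId)) {
                throw new IllegalArgumentException("Unknown secret ID: " + secretId);
            }
            secretsById.put(secretId, new HashMap<>(secrets));
        }

        @Override
        public Map<String, String> getSecret(String secretId) {
            return secretsById.get(secretId);
        }
    }

    /**
     * Point (2): the set of secret fields is a parameter instead of a hardcoded constant.
     * Point (5): 'secret.id' is only added when at least one secret was extracted.
     */
    static Map<String, String> extractSecrets(
            Map<String, String> options, Set<String> secretFields, WritableSecretStore store) {
        Map<String, String> secrets = new LinkedHashMap<>();
        Map<String, String> remaining = new LinkedHashMap<>();
        for (Map.Entry<String, String> option : options.entrySet()) {
            if (secretFields.contains(option.getKey())) {
                secrets.put(option.getKey(), option.getValue());
            } else {
                remaining.put(option.getKey(), option.getValue());
            }
        }
        if (!secrets.isEmpty()) {
            remaining.put("secret.id", store.storeSecret(secrets));
        }
        return remaining;
    }
}
```

With this shape, a connector that calls its credential 'token' or 'accessKey' would only pass a different secretFields set, and a caller can check for the presence of 'secret.id' rather than testing for null.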