hi.

Sorry for the late reply. I just have some questions:

1. Why SecretStoreFactory#open throws a CatalogException? I think the
exteranl system can not handle this exception.

2. I think we can also modify the create catalog ddl syntax.

```
CREATE CATALOG cat USING CONNECTION mycat.mydb.mysql_prod
WITH (
    'type' = 'jdbc'
);
```

3. It seems the connector factory should merge the with options and
connection options together and then create the source/sink. It's
better that framework can merge all these options and connectors don't need
any codes.

4. Why we need to introduce ConnectionFactory? I think connection is like
CatalogTable. It should hold the basic information and all information in
the connection should be stored into secret store.


Best,
Shengkai


Timo Walther <twal...@apache.org> 于2025年7月22日周二 22:04写道:

> Hi Mayank,
>
> Thanks for updating the FLIP and clearly documenting our discussion.
>
> +1 for moving forward with the vote, unless there are objections from
> others.
>
> Cheers,
> Timo
>
> On 22.07.25 02:14, Mayank Juneja wrote:
> > Hi Ryan and Austin,
> >
> > Thanks for your suggestions. I've updated the FLIP with the following
> > additional info -
> >
> > 1. *table.secret-store.kind* key to register the SecretStore in a yaml
> file
> > 2. *updateSecret* method in WritableSecretStore interface
> >
> > Thanks,
> > Mayank
> >
> > On Thu, Jul 17, 2025 at 5:42 PM Austin Cawley-Edwards <
> > austin.caw...@gmail.com> wrote:
> >
> >> Hey all,
> >>
> >> Thanks for the nice flip all! I’m just reading through – had one
> question
> >> on the ALTER CONNECTION implementation flow. Would it make sense for the
> >> WritableSecretStore to expose a method for updating a secret by ID, so
> it
> >> can be done atomically? Else, would we need to call delete and create
> >> again, potentially introducing concurrent resolution errors?
> >>
> >> Best,
> >> Austin
> >>
> >> On Thu, Jul 17, 2025 at 13:07 Ryan van Huuksloot
> >> <ryan.vanhuuksl...@shopify.com.invalid> wrote:
> >>
> >>> Hi Mayank,
> >>>
> >>> Thanks for updating the FLIP. Overall it looks good to me.
> >>>
> >>> One question I had related to how someone could choose the SecretStore
> >> they
> >>> want to use if they use something like the SQL Gateway as the
> entrypoint
> >> on
> >>> top of a remote Session cluster. I don't see an explicit way to set the
> >>> SecretStore in the FLIP.
> >>> I assume we'll do it similar to the CatalogStore but I wanted to call
> >> this
> >>> out.
> >>>
> >>> table.catalog-store.kind: filetable.catalog-store.file.path:
> >>> file:///path/to/catalog/store/
> >>>
> >>> Ryan van Huuksloot
> >>> Staff Engineer, Infrastructure | Streaming Platform
> >>> [image: Shopify]
> >>> <
> https://www.shopify.com/?utm_medium=salessignatures&utm_source=hs_email
> >>>
> >>>
> >>>
> >>> On Wed, Jul 16, 2025 at 2:22 PM Mayank Juneja <
> mayankjunej...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi everyone,
> >>>>
> >>>> Thanks for your valuable inputs. I have updated the FLIP with the
> ideas
> >>>> proposed earlier in the thread. Looking forward to your feedback.
> >>>> https://cwiki.apache.org/confluence/x/cYroF
> >>>>
> >>>> Best,
> >>>> Mayank
> >>>>
> >>>> On Fri, Jun 27, 2025 at 2:59 AM Leonard Xu <xbjt...@gmail.com> wrote:
> >>>>
> >>>>> Quick response, thanks Mayank, Hao and Timo for the effort.  The new
> >>>>> proposal looks well, +1 from my side.
> >>>>>
> >>>>> Could you draft(update) current FLIP docs thus we can have some
> >>> specific
> >>>>> discussions later?
> >>>>>
> >>>>>
> >>>>> Best,
> >>>>> Leonard
> >>>>>
> >>>>>
> >>>>>> 2025 6月 26 15:06,Timo Walther <twal...@apache.org> 写道:
> >>>>>>
> >>>>>> Hi everyone,
> >>>>>>
> >>>>>> sorry for the late reply, feature freeze kept me busy. Mayank, Hao
> >>> and
> >>>> I
> >>>>> synced offline and came up we an improved proposal. Before we update
> >>> the
> >>>>> FLIP let me summarize the most important key facts that hopefully
> >>> address
> >>>>> most concerns:
> >>>>>>
> >>>>>> 1) SecretStore
> >>>>>> - Similar to CatalogStore, we introduce a SecretStore as the
> >> highest
> >>>>> level in TableEnvironment.
> >>>>>> - SecretStore is initialized with options and potentially
> >> environment
> >>>>> variables. Including
> >> EnvironmentSettings.withSecretStore(SecretStore).
> >>>>>> - The SecretStore is pluggable and discovered using the regular
> >>>>> factory-approach.
> >>>>>> - For example, it could implement Azure Key Vault or other cloud
> >>>>> provider secrets stores.
> >>>>>> - Goal: Flink and Flink catalogs do not have to deal with sensitive
> >>>> data.
> >>>>>>
> >>>>>> 2) Connections
> >>>>>> - Connections are catalog objects identified with 3-part
> >> identifiers.
> >>>>> 3-part identifiers are crucial for managability of larger projects
> >> and
> >>>>> align with existing catalog objects.
> >>>>>> - They contain connection details, e.g. URL, query parameters, and
> >>>> other
> >>>>> configuration.
> >>>>>> - They do not contain secrets, but only pointers to secrets in the
> >>>>> SecretStore.
> >>>>>>
> >>>>>> 3) Connection DDL
> >>>>>>
> >>>>>> CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
> >>>>>>   'type' = 'basic' | 'bearer' | 'jwt' | 'oauth' | ...,
> >>>>>>   ...
> >>>>>> )
> >>>>>>
> >>>>>> - Connection type is pluggable and discovered using the regular
> >>>>> factory-approach.
> >>>>>> - The factory extracts secrets and puts them into SecretStore.
> >>>>>> - The factory only leaves non-confidential options left that can be
> >>>>> stored in a catalog.
> >>>>>>
> >>>>>> When executing:
> >>>>>> CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
> >>>>>>   'type' = 'basic',
> >>>>>>   'url' = 'api.example.com',
> >>>>>>   'username' = 'bob',
> >>>>>>   'password' = 'xyz'
> >>>>>> )
> >>>>>>
> >>>>>> The catalog will receive something similar to:
> >>>>>> CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
> >>>>>>   'type' = 'basic',
> >>>>>>   'url' = 'api.example.com',
> >>>>>>   'secret.store' = 'azure-key-vault'
> >>>>>>   'secret.id' = 'secretId'
> >>>>>> )
> >>>>>>
> >>>>>> - However, the exact property design is up to the connection
> >> factory.
> >>>>>>
> >>>>>> 4) Connection Usage
> >>>>>>
> >>>>>> CREATE TABLE t (...) USING CONNECTION mycat.mydb.OpenAPI;
> >>>>>>
> >>>>>> - MODEL, FUNCTION, TABLE DDL will support USING CONNECTION keyword
> >>>>> similar to BigQuery.
> >>>>>> - The connection will be provided in a table/model
> >> provider/function
> >>>>> definition factory.
> >>>>>>
> >>>>>> 5) CatalogStore / Catalog Initialization
> >>>>>>
> >>>>>> Catalog store or catalog can make use of SecretStore to retrieve
> >>>> initial
> >>>>> credentials for bootstrapping. All objects lower then catalog
> >>>> store/catalog
> >>>>> can then use connections. If you think we still need system level
> >>>>> connections, we can support CREATE SYSTEM CONNECTION GlobalName WITH
> >>> (..)
> >>>>> similar to SYSTEM functions directly store in a ConnectioManager in
> >>>>> TableEnvironment. But for now I would suggest to start simple with
> >>>>> per-catalog connections and later evolve the design.
> >>>>>>
> >>>>>> Dealing with secrets is a very sensitive topic and I'm clearly not
> >> an
> >>>>> expert on it. This is why we should try to push the problem to
> >> existing
> >>>>> solutions and don't start storing secrets in Flink in any way. Thus,
> >>> the
> >>>>> interfaces will be defined very generic.
> >>>>>>
> >>>>>> Looking forward to your feedback.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Timo
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 09.06.25 04:01, Leonard Xu wrote:
> >>>>>>> Thanks  Timo for joining this thread.
> >>>>>>> I agree that this feature is needed by the community; the current
> >>>>> disagreement is only about the implementation method or solution.
> >>>>>>> Your thoughts looks generally good to me, looking forward to your
> >>>>> proposal.
> >>>>>>> Best,
> >>>>>>> Leonard
> >>>>>>>> 2025 6月 6 22:46,Timo Walther <twal...@apache.org> 写道:
> >>>>>>>>
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> thanks for this healthy discussion. Looking at high number of
> >>>>> participants, it looks like we definitely want this feature. We just
> >>> need
> >>>>> to figure out the "how".
> >>>>>>>>
> >>>>>>>> This reminds me very much of the discussion we had for CREATE
> >>>>> FUNCTION. There, we discussed whether functions should be named
> >>> globally
> >>>> or
> >>>>> catalog-specific. In the end, we decided for both `CREATE SYSTEM
> >>>> FUNCTION`
> >>>>> and `CREATE FUNCTION`, satisfying both the data platform team of an
> >>>>> organization (which might provide system functions) and individual
> >> data
> >>>>> teams or use cases (scoped by catalog/database).
> >>>>>>>>
> >>>>>>>> Looking at other modern vendors like Snowflake there is SECRET
> >>>> (scoped
> >>>>> to schema) [1] and API INTEGRATION [2] (scoped to account). So also
> >>> other
> >>>>> vendors offer global and per-team / per-use case connections details.
> >>>>>>>>
> >>>>>>>> In general, I think fitting connections into the existing
> >> concepts
> >>>> for
> >>>>> catalog objects (with three-part identifier) makes managing them
> >>> easier.
> >>>>> But I also see the need for global defaults.
> >>>>>>>>
> >>>>>>>> Btw keep in mind that a catalog implementation should only store
> >>>>> metadata. Similar how a CatalogTable doesn't store the actual data, a
> >>>>> CatalogConnection should not store the credentials. It should only
> >>> offer
> >>>> a
> >>>>> factory that allows for storing and retrieving them. In real world
> >>>>> scenarios a factory is most likely backed by a product like Azure Key
> >>>> Vault.
> >>>>>>>>
> >>>>>>>> So code-wise having a ConnectionManager that behaves similar to
> >>>>> FunctionManager sounds reasonable.
> >>>>>>>>
> >>>>>>>> +1 for having special syntax instead of using properties. This
> >>> allows
> >>>>> to access connections in tables, models, functions. And catalogs, if
> >> we
> >>>>> agree to have global ones as well.
> >>>>>>>>
> >>>>>>>> What do you think?
> >>>>>>>>
> >>>>>>>> Let me spend some more thoughts on this and come back with a
> >>> concrete
> >>>>> proposal by early next week.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>> [1]
> >> https://docs.snowflake.com/en/sql-reference/sql/create-secret
> >>>>>>>> [2]
> >>>>>
> >> https://docs.snowflake.com/en/sql-reference/sql/create-api-integration
> >>>>>>>>
> >>>>>>>> On 04.06.25 10:47, Leonard Xu wrote:
> >>>>>>>>> Hey,Mayank
> >>>>>>>>> Please see my feedback as following:
> >>>>>>>>> 1. One of the motivations of this FLIP is to improve security.
> >>>>> However, the current design stores all connection information in the
> >>>>> catalog,
> >>>>>>>>> and each Flink SQL job reads from the catalog during
> >> compilation.
> >>>> The
> >>>>> connection information is passed between SQL Gateway and the
> >>>>>>>>> catalog in plaintext, which actually introduces new security
> >>> risks.
> >>>>>>>>> 2. The name "Connection" should be changed to something like
> >>>>> ConnectionSpec to clearly indicate that it is a object containing
> >> only
> >>>>> static
> >>>>>>>>> properties without a lifecycle. Putting aside the naming issue,
> >> I
> >>>>> think the current model and hierarchy design is somewhat strange.
> >>> Storing
> >>>>>>>>> various kinds of connections (e.g., Kafka, MySQL) in the same
> >>>> Catalog
> >>>>> with hierarchical identifiers like
> >> catalog-name.db-name.connection-name
> >>>>>>>>> raises the following questions:
> >>>>>>>>> (1) What is the purpose of this hierarchical structure of
> >>> Connection
> >>>>> object ?
> >>>>>>>>> (2) If we can use a Connection to create a MySQL table, why
> >> can't
> >>> we
> >>>>> use a Connection to create a MySQL Catalog?
> >>>>>>>>> 3. Regarding the connector usage examples given in this FLIP:
> >>>>>>>>> ```sql
> >>>>>>>>> 1  -- Example 2: Using connection for jdbc tables
> >>>>>>>>> 2  CREATE OR REPLACE CONNECTION mysql_customer_db
> >>>>>>>>> 3  WITH (
> >>>>>>>>> 4    'type' = 'jdbc',
> >>>>>>>>> 5    'jdbc.url' = 'jdbc:mysql://
> >>>>> customer-db.example.com:3306/customerdb',
> >>>>>>>>> 6    'jdbc.connection.ssl.enabled' = 'true'
> >>>>>>>>> 7  );
> >>>>>>>>> 8
> >>>>>>>>> 9  CREATE TABLE customers (
> >>>>>>>>> 10   customer_id INT,
> >>>>>>>>> 11   PRIMARY KEY (customer_id) NOT ENFORCED
> >>>>>>>>> 12 ) WITH (
> >>>>>>>>> 13   'connector' = 'jdbc',
> >>>>>>>>> 14   'jdbc.connection' = 'mysql_customer_db',
> >>>>>>>>> 15   'jdbc.connection.ssl.enabled' = 'true',
> >>>>>>>>> 16   'jdbc.connection.max-retry-timeout' = '60s',
> >>>>>>>>> 17   'jdbc.table-name' = 'customers',
> >>>>>>>>> 18   'jdbc.lookup.cache' = 'PARTIAL'
> >>>>>>>>> 19 );
> >>>>>>>>> ```
> >>>>>>>>> I see three issues from SQL semantics and Connector
> >> compatibility
> >>>>> perspectives:
> >>>>>>>>> (1) Look at line 14: `mysql_customer_db` is an object identifier
> >>> of
> >>>> a
> >>>>> CONNECTION defined in SQL. However, this identifier is referenced
> >>>>>>>>>      via a string value inside the table’s WITH clause, which
> >> feel
> >>>>> hack for me.
> >>>>>>>>> (2) Look at lines 14–16: the use of the specific prefix
> >>>>> `jdbc.connection` will confuse users because `connection.xx` maybe
> >>>> already
> >>>>> used as
> >>>>>>>>>   a prefix for existing configuration items.
> >>>>>>>>> (3) Look at lines 14–18: Why do all existing configuration
> >> options
> >>>>> need to be prefixed with `jdbc`, even they’re not related to
> >> Connection
> >>>>> properties?
> >>>>>>>>> This completely changes user habits — is it backward compatible?
> >>>>>>>>>   In my opinion, Connection should be a model independent of both
> >>>>> Catalog and Table, and can be referenced by all
> >> catalog/table/udf/model
> >>>>> object.
> >>>>>>>>> It should be managed by a Component such as a ConnectionManager
> >> to
> >>>>> enable reuse. For security purposes, authentication mechanisms could
> >>>>>>>>> be supported within the ConnectionManager.
> >>>>>>>>> Best,
> >>>>>>>>> Leonard
> >>>>>>>>>> 2025 6月 4 02:04,Martijn Visser <martijnvis...@apache.org> 写道:
> >>>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> First of all, I think having a Connection resource is something
> >>>> that
> >>>>> will
> >>>>>>>>>> be beneficial for Apache Flink. I could see that being extended
> >>> in
> >>>>> the
> >>>>>>>>>> future to allow for easier secret handling [1].
> >>>>>>>>>> In my mental mind, I'm comparing this proposal against SQL/MED
> >>> from
> >>>>> the ISO
> >>>>>>>>>> standard [2]. I do think that SQL/MED isn't a very user
> >> friendly
> >>>>> syntax
> >>>>>>>>>> though, looking at Postgres for example [3].
> >>>>>>>>>>
> >>>>>>>>>> I think it's a valid question if Connection should be
> >> considered
> >>>>> with a
> >>>>>>>>>> catalog or database-level scope. @Ryan can you share something
> >>>> more,
> >>>>> since
> >>>>>>>>>> you've mentioned "Note: I much prefer catalogs for this case.
> >>> Which
> >>>>> is what
> >>>>>>>>>> we use internally to manage connection properties". It looks
> >> like
> >>>>> there
> >>>>>>>>>> isn't a strong favourable approach looking at other vendors
> >>> (like,
> >>>>>>>>>> Databricks does scopes it on a Unity catalog, Snowflake on a
> >>>> database
> >>>>>>>>>> level).
> >>>>>>>>>>
> >>>>>>>>>> Also looking forward to Leonard's input.
> >>>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>>
> >>>>>>>>>> Martijn
> >>>>>>>>>>
> >>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-36818
> >>>>>>>>>> [2] https://www.iso.org/standard/84804.html
> >>>>>>>>>> [3]
> >>> https://www.postgresql.org/docs/current/sql-createserver.html
> >>>>>>>>>>
> >>>>>>>>>> On Fri, May 30, 2025 at 5:07 AM Leonard Xu <xbjt...@gmail.com>
> >>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hey Mayank.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the FLIP, I went through this FLIP quickly and
> >> found
> >>>> some
> >>>>>>>>>>> issues which I think we
> >>>>>>>>>>> need to deep discuss later. As we’re on a short Dragon boat
> >>>>> Festival,
> >>>>>>>>>>> could you kindly hold
> >>>>>>>>>>> on this thread? and we will back to continue the FLIP discuss.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Leonard
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> 2025 4月 29 23:07,Mayank Juneja <mayankjunej...@gmail.com>
> >> 写道:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to open up for discussion a new FLIP-529 [1].
> >>>>>>>>>>>>
> >>>>>>>>>>>> Motivation:
> >>>>>>>>>>>> Currently, Flink SQL handles external connectivity by
> >> defining
> >>>>> endpoints
> >>>>>>>>>>>> and credentials in table configuration. This approach
> >> prevents
> >>>>>>>>>>> reusability
> >>>>>>>>>>>> of these connections and makes table definition less secure
> >> by
> >>>>> exposing
> >>>>>>>>>>>> sensitive information.
> >>>>>>>>>>>> We propose the introduction of a new "connection" resource in
> >>>>> Flink. This
> >>>>>>>>>>>> will be a pluggable resource configured with a remote
> >> endpoint
> >>>> and
> >>>>>>>>>>>> associated access key. Once defined, connections can be
> >> reused
> >>>>> across
> >>>>>>>>>>> table
> >>>>>>>>>>>> definitions, and eventually for model definition (as
> >> discussed
> >>> in
> >>>>>>>>>>> FLIP-437)
> >>>>>>>>>>>> for inference, enabling seamless and secure integration with
> >>>>> external
> >>>>>>>>>>>> systems.
> >>>>>>>>>>>> The connection resource will provide a new, optional way to
> >>>> manage
> >>>>>>>>>>> external
> >>>>>>>>>>>> connectivity in Flink. Existing methods for table definitions
> >>>> will
> >>>>> remain
> >>>>>>>>>>>> unchanged.
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/cYroF
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>> Mayank Juneja
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> *Mayank Juneja*
> >>>> Product Manager | Data Streaming and AI
> >>>>
> >>>
> >>
> >
> >
>
>

Reply via email to