Thanks Bowen for driving this,

+1 for this. Duplicating DDL schema definitions is a headache for users, and a
catalog is a good solution to this problem.

I have some questions and suggestions:

- We could provide a Builder for the catalog. In my opinion, defaultDatabase,
username, and password can all be derived from the JDBC URL (see the sketch
after these questions).

- Regarding TIMESTAMP and TIME types: could you write down the specific
mapping between Postgres precision and Flink precision?

- I think one part is missing from your document: how to actually use this
catalog. A complete end-to-end example would make it much clearer.

- One more question: which TableFactory will this catalog use? For example,
JDBCTableSourceSinkFactory takes different parameters depending on whether it
creates a source or a sink. What do you think about that?
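
To illustrate the Builder point, here is a minimal sketch in plain Java (all
names are hypothetical, not part of the FLIP); explicit setters would still
override whatever is parsed from the URL:

import java.net.URI;

/** Hypothetical builder deriving defaultDatabase/username/password from the JDBC URL. */
public final class JdbcCatalogBuilder {

    private String url;
    private String defaultDatabase;
    private String username;
    private String password;

    public JdbcCatalogBuilder url(String jdbcUrl) {
        this.url = jdbcUrl;
        // e.g. "jdbc:postgresql://localhost:5432/mydb?user=bob&password=secret";
        // parse the part after the "jdbc:" prefix as a URI (sketch assumes the prefix exists)
        URI uri = URI.create(jdbcUrl.substring("jdbc:".length()));
        if (uri.getPath() != null && uri.getPath().length() > 1) {
            this.defaultDatabase = uri.getPath().substring(1); // drop the leading '/'
        }
        if (uri.getQuery() != null) {
            for (String kv : uri.getQuery().split("&")) {
                String[] parts = kv.split("=", 2);
                if (parts.length == 2 && parts[0].equals("user")) {
                    this.username = parts[1];
                } else if (parts.length == 2 && parts[0].equals("password")) {
                    this.password = parts[1];
                }
            }
        }
        return this;
    }

    // Explicit values win over values parsed from the URL.
    public JdbcCatalogBuilder defaultDatabase(String db) { this.defaultDatabase = db; return this; }
    public JdbcCatalogBuilder username(String name) { this.username = name; return this; }
    public JdbcCatalogBuilder password(String pwd) { this.password = pwd; return this; }
}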

Best,
Jingsong Lee

On Thu, Jan 9, 2020 at 11:33 AM Jark Wu <imj...@gmail.com> wrote:

> Thanks Bowen for driving this.
>
> +1 to this feature.
>
> My concern is: why introduce a `PostgresJDBCCatalog` rather than a generic
> `JDBCCatalog` (catalog.type = 'postgres' vs 'jdbc')?
> From my understanding, a JDBC catalog is similar to the JDBC source/sink.
> For the JDBC source/sink, we have a generic implementation and delegate
> driver-specific operations to JDBCDialect; different drivers may implement
> JDBCDialect differently, e.g. `quoteIdentifier()`.
>
> For the JDBC catalog, I guess we can do it the same way, i.e. a generic
> JDBCCatalog implementation that delegates operations to JDBCDialect, with
> `listDatabases()` and `listTables()` methods added to JDBCDialect (see the
> sketch below). The benefits are:
> 0) we reuse the existing `JDBCDialect`; I guess JDBCCatalog also needs to
> quote identifiers.
> 1) we can easily support a new database catalog (e.g. MySQL) by
> implementing a new dialect (e.g. MySQLDialect).
> 2) it keeps the same behavior as the JDBC source/sink, i.e.
> connector.type=jdbc, catalog.type=jdbc.
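>
> A rough sketch of the shape I have in mind (method names are just
> placeholders, not a concrete proposal):
>
> import java.sql.Connection;
> import java.sql.SQLException;
> import java.util.List;
>
> interface JDBCDialect {
>     String quoteIdentifier(String identifier);
>     // catalog operations delegated to the driver-specific dialect
>     List<String> listDatabases(Connection conn) throws SQLException;
>     List<String> listTables(Connection conn, String database) throws SQLException;
> }
>
> // One generic catalog; all driver-specific behavior lives in the dialect.
> class JDBCCatalog {
>     private final JDBCDialect dialect;
>     private final Connection connection;
>
>     JDBCCatalog(JDBCDialect dialect, Connection connection) {
>         this.dialect = dialect;
>         this.connection = connection;
>     }
>
>     List<String> listDatabases() throws SQLException {
>         // e.g. a PostgresDialect would query pg_database here
>         return dialect.listDatabases(connection);
>     }
> }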
>
> Best,
> Jark
>
>
> On Thu, 9 Jan 2020 at 08:33, Bowen Li <bowenl...@gmail.com> wrote:
>
> > Hi dev,
> >
> > I'd like to kick off a discussion on adding JDBC catalogs, specifically a
> > Postgres catalog, to Flink [1].
> >
> > Currently, users have to manually create schemas in Flink sources/sinks
> > mirroring the tables in their relational databases, in use cases like JDBC
> > read/write and consuming CDC. Many users have complained about this
> > unnecessary, redundant, manual work. Any mismatch leads to a Flink job
> > failing at runtime rather than at compile time. All of this has been quite
> > unpleasant, resulting in a broken user experience.
> >
> > We want to provide a JDBC catalog interface and a Postgres implementation
> > for Flink, as a start toward connecting to all kinds of relational
> > databases, enabling Flink SQL to 1) retrieve table schemas automatically
> > without requiring users to write duplicated DDL and 2) catch schema errors
> > at compile time. This will greatly streamline the user experience when
> > using Flink with popular relational databases like Postgres, MySQL,
> > MariaDB, AWS Aurora, etc. A rough usage sketch follows.
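> >
> > For illustration only (PostgresCatalog and its constructor arguments are
> > hypothetical at this point, since the API is still under discussion):
> >
> > import org.apache.flink.table.api.EnvironmentSettings;
> > import org.apache.flink.table.api.Table;
> > import org.apache.flink.table.api.TableEnvironment;
> > import org.apache.flink.table.catalog.Catalog;
> >
> > TableEnvironment tableEnv =
> >     TableEnvironment.create(EnvironmentSettings.newInstance().build());
> >
> > // Hypothetical constructor: name, JDBC base URL, username, password.
> > Catalog postgres = new PostgresCatalog(
> >     "mypg", "jdbc:postgresql://localhost:5432/mydb", "bob", "secret");
> >
> > tableEnv.registerCatalog("mypg", postgres);
> > tableEnv.useCatalog("mypg");
> >
> > // The schema of "orders" comes from Postgres directly; no DDL in Flink.
> > Table result = tableEnv.sqlQuery("SELECT * FROM orders");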
> >
> > Note that the problem and the solution are actually quite general to
> > Flink connecting to all kinds of external systems; in this FLIP we just
> > focus on solving it for relational databases.
> >
> > Thanks,
> > Bowen
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-92%3A+JDBC+catalog+and+Postgres+catalog
> >
>


-- 
Best, Jingsong Lee
