Timo,

> Do you have a concrete use case in mind where this feature could help?

If I care about backwards compatibility when evolving a schema, giving
DEFAULT values to new columns would be a clean approach.
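For example, something like this (purely a sketch of hypothetical syntax; the
exact form would come from the FLIP, and none of this is implemented in Flink
today; table and column names are made up):

```
-- Hypothetical: add a column to an existing table without breaking
-- older producers; rows that omit it would read back as 'USD'.
ALTER TABLE orders ADD (
  currency STRING DEFAULT 'USD'
);
```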

David

On Mon, Oct 6, 2025 at 12:31 AM Timo Walther <[email protected]> wrote:

> Hi David,
>
> the need for DEFAULT columns popped up a couple of times. And I would
> also support the implementation. But apparently the need was not strong
> enough for concrete resource planning for it.
>
> Do you have a concrete use case in mind where this feature could help?
>
> Regards,
> Timo
>
> On 02.10.25 00:47, David Anderson wrote:
> > I'm wishing we had column DEFAULTs, as was discussed in this thread way
> > back in 2022.
> >
> > I found FLIP-261 [1], but it doesn't appear to have been voted on or
> > implemented. Is that correct?
> >
> > [1] https://cwiki.apache.org/confluence/x/x4ueDQ
> >
> > Regards,
> > David
> >
> > On Thu, Sep 1, 2022 at 1:32 AM Ran Tao <[email protected]> wrote:
> >
> >> Hi, Jark & Timo. I'm glad to support this feature. If you agree, I'll
> >> be ready to create a FLIP, and then you and other developers can
> >> review and check the specifics.
> >>
> >> Thanks.
> >>
> >> Jark Wu <[email protected]> 于2022年8月30日周二 20:24写道:
> >>
> >>> Thank you Ran for the explanation.
> >>>
> >>> The column DEFAULT is a reasonable feature and can also help in other
> >>> cases.
> >>> I’m fine with adding this feature.
> >>> Do you want to prepare a FLIP for it?
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>> 2022年8月29日 15:02,Ran Tao <[email protected]> 写道:
> >>>>
> >>>> Hi Jark. Timo summed it up very well. In fact, my problem is that the
> >>>> current Flink table metadata is fixed and cannot accommodate changes
> >>>> in a connector's metadata columns. A metadata column that did not
> >>>> exist in the past may exist at some point in the future, and vice
> >>>> versa. Both forward and backward compatibility are involved here.
> >>>>
> >>>> Jark Wu <[email protected]> 于2022年8月26日周五 16:28写道:
> >>>>
> >>>>> Hi Ran,
> >>>>>
> >>>>> If the metadata is from the message properties, then you can manually
> >>>>> cast it to your preferred type, such as
> >>>>> `my_dynamic_meta AS CAST(properties['my-new-property'] AS TIMESTAMP)`.
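> >>>>> For example (a sketch only; table and property names are illustrative):
> >>>>>
> >>>>> ```
> >>>>> CREATE TABLE t (
> >>>>>   properties MAP<STRING, STRING> METADATA FROM 'value.source.properties' VIRTUAL,
> >>>>>   -- computed column; NULL if the property is absent
> >>>>>   my_dynamic_meta AS CAST(properties['my-new-property'] AS TIMESTAMP),
> >>>>>   user_id BIGINT
> >>>>> ) WITH (
> >>>>>   'connector' = 'kafka',
> >>>>>   'value.format' = 'debezium-json'
> >>>>>   ...
> >>>>> );
> >>>>> ```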
> >>>>>
> >>>>> If the metadata is not from the message properties, how does the
> >>>>> connector know which field to convert from? Shouldn't the connector
> >>>>> be modified to support this new metadata column?
> >>>>>
> >>>>> Best,
> >>>>> Jark
> >>>>>
> >>>>>
> >>>>>
> >>>>>> 2022年8月26日 15:30,Ran Tao <[email protected]> 写道:
> >>>>>>
> >>>>>> Hi, Timo. I think using one MAP column in the Debezium format you
> >>>>>> illustrated above can't cover the discussed scenario.
> >>>>>> It's not the same thing.
> >>>>>>
> >>>>>> Here is a Debezium format example from the Flink docs: [1]
> >>>>>>
> >>>>>> ```
> >>>>>> CREATE TABLE KafkaTable (
> >>>>>>   origin_ts TIMESTAMP(3) METADATA FROM 'value.ingestion-timestamp' VIRTUAL,
> >>>>>>   origin_properties MAP<STRING, STRING> METADATA FROM 'value.source.properties' VIRTUAL,
> >>>>>>   user_id BIGINT
> >>>>>> ) WITH (
> >>>>>>   'connector' = 'kafka',
> >>>>>>   'value.format' = 'debezium-json'
> >>>>>>   ...
> >>>>>> );
> >>>>>> ```
> >>>>>>
> >>>>>> The `origin_properties` column is used for the properties, so we
> >>>>>> define it as a MAP (just as you suggested). But the other metadata
> >>>>>> columns have their own data types, e.g. `origin_ts` is TIMESTAMP.
> >>>>>> We cannot flatten all metadata columns into one MAP<STRING, STRING>
> >>>>>> column; it's not a good idea.
> >>>>>>
> >>>>>> My suggestion concerns the case where the Kafka connector above
> >>>>>> *adds some new metadata* (just as an example; Kafka may be stable,
> >>>>>> but a certain connector or middleware might still be under
> >>>>>> development, so its metadata could be added or changed). E.g. at
> >>>>>> some point Kafka adds a `host_name` metadata key (indicating the
> >>>>>> address of the message broker).
> >>>>>>
> >>>>>> We can define SQL like this:
> >>>>>> ```
> >>>>>> CREATE TABLE KafkaTable (
> >>>>>>   origin_ts TIMESTAMP(3) METADATA FROM 'value.ingestion-timestamp' VIRTUAL,
> >>>>>>   host_name STRING METADATA VIRTUAL DYNAMIC,
> >>>>>>   origin_properties MAP<STRING, STRING> METADATA FROM 'value.source.properties' VIRTUAL,
> >>>>>>   user_id BIGINT
> >>>>>> ) WITH (
> >>>>>>   'connector' = 'kafka',
> >>>>>>   'value.format' = 'debezium-json'
> >>>>>>   ...
> >>>>>> );
> >>>>>> ```
> >>>>>> Then users can use the `host_name` metadata: because it's a DYNAMIC
> >>>>>> metadata column, Flink doesn't throw an exception even though
> >>>>>> `host_name` didn't belong to Kafka before, and developers don't need
> >>>>>> to modify or rebuild the Flink source code and redeploy it to the
> >>>>>> online environment (which comes at a high cost).
> >>>>>>
> >>>>>> Considering the return value:
> >>>>>> Kafka before (without this metadata): NULL
> >>>>>> Kafka now (with this metadata added): the concrete value
> >>>>>>
> >>>>>> The same user SQL works in the past, now, and even in the future,
> >>>>>> rather than checking and rejecting these new metadata columns or
> >>>>>> frequently modifying the connector implementation to support them.
> >>>>>> And it's opt-in, configured by adding 'DYNAMIC' to the metadata
> >>>>>> column (or some better implementation).
> >>>>>>
> >>>>>> [1]
> >>>>>> https://nightlies.apache.org/flink/flink-docs-release-1.15/zh/docs/connectors/table/formats/debezium/
> >>>>>>
> >>>>>> Timo Walther <[email protected]> 于2022年8月25日周四 21:07写道:
> >>>>>>
> >>>>>>> Hi Ran,
> >>>>>>>
> >>>>>>> what would be the data type of this dynamic metadata column? The
> >>> planner
> >>>>>>> and many parts of the stack will require a data type.
> >>>>>>>
> >>>>>>> Personally, I feel connector developers can already have the same
> >>>>>>> functionality by declaring a metadata column as `MAP<STRING, STRING>`.
> >>>>>>> This is what we already expose as `debezium.source.properties`.
> >>>>>>> Whatever Debezium adds will be available through this property and
> >>>>>>> can be accessed via `SELECT col['my-new-property'] FROM x`,
> >>>>>>> including being NULL by default if not present.
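> >>>>>>> A minimal sketch of that workaround (table and property names are
> >>>>>>> illustrative):
> >>>>>>>
> >>>>>>> ```
> >>>>>>> CREATE TABLE x (
> >>>>>>>   col MAP<STRING, STRING> METADATA FROM 'value.source.properties' VIRTUAL,
> >>>>>>>   user_id BIGINT
> >>>>>>> ) WITH (
> >>>>>>>   'connector' = 'kafka',
> >>>>>>>   'value.format' = 'debezium-json'
> >>>>>>>   ...
> >>>>>>> );
> >>>>>>>
> >>>>>>> -- NULL if Debezium has not set the property
> >>>>>>> SELECT col['my-new-property'] FROM x;
> >>>>>>> ```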
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Timo
> >>>>>>>
> >>>>>>>
> >>>>>>> On 25.08.22 14:04, Ran Tao wrote:
> >>>>>>>> ```
> >>>>>>>> create table test_source (
> >>>>>>>>   __test_metadata__ varchar METADATA,
> >>>>>>>>   f0 varchar,
> >>>>>>>>   f1 varchar,
> >>>>>>>>   f2 bigint,
> >>>>>>>>   ts as CURRENT_TIMESTAMP
> >>>>>>>> ) with (
> >>>>>>>>   'connector' = 'test',
> >>>>>>>>   ...
> >>>>>>>> )
> >>>>>>>> ```
> >>>>>>>>
> >>>>>>>> If we do not pre-define `__test_metadata__` as a metadata key by
> >>>>>>>> implementing `listReadableMetadata` and run the above SQL, it will
> >>>>>>>> cause an exception like this:
> >>>>>>>>
> >>>>>>>> org.apache.flink.table.api.ValidationException: Invalid metadata key
> >>>>>>>> '__test_metadata__' in column '__test_metadata__' of table
> >>>>>>>> 'default_catalog.default_database.test_source'. The DynamicTableSource
> >>>>>>>> class 'com.alipay.flink.connectors.test.source.TestDynamicTableSource'
> >>>>>>>> supports the following metadata keys for reading:
> >>>>>>>> xxx, yyy
> >>>>>>>>
> >>>>>>>> at org.apache.flink.table.planner.connectors.DynamicSourceUtils.lambda$validateAndApplyMetadata$5(DynamicSourceUtils.java:409)
> >>>>>>>>
> >>>>>>>> This is because currently a Flink metadata column must exist in the
> >>>>>>>> results returned by `listReadableMetadata`. When a certain connector
> >>>>>>>> adds new metadata, we cannot use it directly unless we modify the
> >>>>>>>> connector code to support it. In some situations this is
> >>>>>>>> intolerable. Can we support a 'DYNAMIC' metadata column? Its basic
> >>>>>>>> mechanism would be not to check such a column against the existing
> >>>>>>>> metadata keys, so users can define it dynamically. If a certain
> >>>>>>>> connector lacks this metadata, the column value is NULL; otherwise
> >>>>>>>> it is the concrete value. This has great benefits in some scenarios.
> >>>>>>>>
> >>>>>>>> Looking forward to your opinions.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best Regards,
> >>>>>> Ran Tao
> >>>>>> https://github.com/chucheng92
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> Best Regards,
> >>>> Ran Tao
> >>>> https://github.com/chucheng92
> >>>
> >>>
> >>
> >> --
> >> Best Regards,
> >> Ran Tao
> >> https://github.com/chucheng92
> >>
> >
>
>
