Hi Zhenghua,

After removing TableSource::getTableSchema, during optimization, I could
imagine
the schema information might come from relational nodes such as TableScan.

Best,
Kurt


On Wed, Feb 5, 2020 at 8:24 PM Kurt Young <ykt...@gmail.com> wrote:

> Hi Jingsong,
>
> Yes current TableFactory is not ideal for users to use either. I think we
> should
> also spend some time in 1.11 to improve the usability of TableEnvironment
> when
> users trying to read or write something. Automatic scheme inference would
> be
> one of them. Other from this, we also support convert a DataStream to
> Table, which
> can serve some flexible requirements to read or write data.
>
> Best,
> Kurt
>
>
> On Wed, Feb 5, 2020 at 7:29 PM Zhenghua Gao <doc...@gmail.com> wrote:
>
>> +1 to remove these methods.
>>
>> One concern about invocations of TableSource::getTableSchema:
>> By removing such methods, we can stop calling TableSource::getTableSchema
>> in some place(such
>> as BatchTableEnvImpl/TableEnvironmentImpl#validateTableSource,
>> ConnectorCatalogTable, TableSourceQueryOperation).
>>
>> But in other place we need field types and names of the table source(such
>> as
>> BatchExecLookupJoinRule/StreamExecLookupJoinRule,
>> PushProjectIntoTableSourceScanRule,
>> CommonLookupJoin).  So how should we deal with this?
>>
>> *Best Regards,*
>> *Zhenghua Gao*
>>
>>
>> On Wed, Feb 5, 2020 at 2:36 PM Kurt Young <ykt...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > I'd like to bring up a discussion about removing registration of
>> > TableSource and
>> > TableSink in TableEnvironment as well as in ConnectTableDescriptor. The
>> > affected
>> > method would be:
>> >
>> > TableEnvironment::registerTableSource
>> > TableEnvironment::fromTableSource
>> > TableEnvironment::registerTableSink
>> > ConnectTableDescriptor::registerTableSource
>> > ConnectTableDescriptor::registerTableSink
>> > ConnectTableDescriptor::registerTableSourceAndSink
>> >
>> > (Most of them are already deprecated, except for
>> > TableEnvironment::fromTableSource,
>> > which was intended to deprecate but missed by accident).
>> >
>> > FLIP-64 [1] already explained why we want to deprecate TableSource &
>> > TableSink from
>> > user's interface. In a short word, these interfaces should only read &
>> > write the physical
>> > representation of the table, and they are not fitting well after we
>> already
>> > introduced some
>> > logical table fields such as computed column, watermarks.
>> >
>> > Another reason is the exposure of registerTableSource in Table Env just
>> > make the whole
>> > SQL protocol opposite. TableSource should be used as a reader of table,
>> it
>> > should rely on
>> > other metadata information held by framework, which eventually comes
>> from
>> > DDL or
>> > ConnectDescriptor. But if we register a TableSource to Table Env, we
>> have
>> > no choice but
>> > have to rely on TableSource::getTableSchema. It will make the design
>> > obscure, sometimes
>> > TableSource should trust the information comes from framework, but
>> > sometimes it should
>> > also generate its own schema information.
>> >
>> > Furthermore, if the authority about schema information is not clear, it
>> > will make things much
>> > more complicated if we want to improve the table api usability such as
>> > introducing automatic
>> > schema inference in the near future.
>> >
>> > Since this is an API break change, I've also included user mailing list
>> to
>> > gather more feedbacks.
>> >
>> > Best,
>> > Kurt
>> >
>> > [1]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>> >
>>
>

Reply via email to