Hi Dawid,

> How to express projections with TableSchema?
The TableSource holds the original TableSchema (i.e. the one from the DDL), and
the pushed TableSchema represents the schema after projection.
The table source can therefore compare the two to detect a changed field order
or mismatched types.
Most sources that map physical storage by field name (e.g. jdbc,
hbase, json) can simply apply the pushed TableSchema.
Sources that map by field index (e.g. csv), however, need to derive
the projected indexes by comparing the original and the projected schema.
For example, if the original schema is [a: String, b: Int, c: Timestamp] and
b is pruned, then the pushed schema is [a: String, c: Timestamp], so the
source can figure out that index=1 is pruned.
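
To illustrate the comparison for index-based sources, here is a minimal
sketch. It assumes top-level field names are unique and models each schema
as a plain array of field names. `ProjectionIndexSketch` and its methods are
hypothetical names for this illustration, not Flink API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink: compares the field names of the
// original and the pushed schema to derive kept and pruned indexes.
class ProjectionIndexSketch {

    // For each field of the pushed schema, return its index in the original schema.
    static int[] projectedIndexes(String[] originalNames, String[] pushedNames) {
        List<String> original = Arrays.asList(originalNames);
        int[] indexes = new int[pushedNames.length];
        for (int i = 0; i < pushedNames.length; i++) {
            int idx = original.indexOf(pushedNames[i]);
            if (idx < 0) {
                throw new IllegalArgumentException("Unknown field: " + pushedNames[i]);
            }
            indexes[i] = idx;
        }
        return indexes;
    }

    // Indexes of original fields that do not appear in the pushed schema.
    static int[] prunedIndexes(String[] originalNames, String[] pushedNames) {
        List<String> pushed = Arrays.asList(pushedNames);
        List<Integer> pruned = new ArrayList<>();
        for (int i = 0; i < originalNames.length; i++) {
            if (!pushed.contains(originalNames[i])) {
                pruned.add(i);
            }
        }
        return pruned.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        String[] original = {"a", "b", "c"};
        String[] pushed = {"a", "c"};
        System.out.println(Arrays.toString(projectedIndexes(original, pushed))); // [0, 2]
        System.out.println(Arrays.toString(prunedIndexes(original, pushed)));    // [1]
    }
}
```

A csv-like source could then read only the columns at the projected indexes.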

> How do we express projection of a nested field with TableSchema?
This works the same way as above. For example, the original schema is [rk:
String, f1: Row<q1: Int, q2: Double>].
If `f1.q1` is pruned, the pushed schema will be [rk: String, f1: Row<q2:
Double>].
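
The same comparison can be applied recursively to nested rows. The sketch
below is only an illustration: it models a row as an ordered map from field
name to either a type name or a nested row, and `NestedProjectionSketch`,
`prunedPaths` and `row` are hypothetical names, not Flink API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink. A row is modeled as an ordered map:
// field name -> type name (String) or nested row (Map).
class NestedProjectionSketch {

    // Returns the dotted paths of all fields present in the original schema
    // but missing from the pushed schema, descending into nested rows.
    @SuppressWarnings("unchecked")
    static List<String> prunedPaths(
            Map<String, Object> original, Map<String, Object> pushed, String prefix) {
        List<String> pruned = new ArrayList<>();
        for (Map.Entry<String, Object> field : original.entrySet()) {
            String path = prefix.isEmpty() ? field.getKey() : prefix + "." + field.getKey();
            Object pushedType = pushed.get(field.getKey());
            if (pushedType == null) {
                pruned.add(path);
            } else if (field.getValue() instanceof Map && pushedType instanceof Map) {
                pruned.addAll(prunedPaths(
                        (Map<String, Object>) field.getValue(),
                        (Map<String, Object>) pushedType,
                        path));
            }
        }
        return pruned;
    }

    // Builds a row from alternating field names and types.
    static Map<String, Object> row(Object... nameTypePairs) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < nameTypePairs.length; i += 2) {
            row.put((String) nameTypePairs[i], nameTypePairs[i + 1]);
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> original =
                row("rk", "String", "f1", row("q1", "Int", "q2", "Double"));
        Map<String, Object> pushed =
                row("rk", "String", "f1", row("q2", "Double"));
        System.out.println(prunedPaths(original, pushed, "")); // [f1.q1]
    }
}
```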

> TableSchema might be used at too many different places for different
responsibilities.
Agreed. We have recognized that a structure and builder for a pure table
schema is required in many places, but we have mixed many catalog table
schema concepts into TableSchema.
IIRC, in an offline discussion of FLIP-84, we wanted to introduce a new
`CatalogTableSchema` to represent the schema part of a DDL,
and remove all the watermark and computed column information from TableSchema.
Then `TableSchema` can continue to serve as a pure table schema and
stay in a good package.

Best,
Jark




On Thu, 2 Apr 2020 at 19:39, Timo Walther <twal...@apache.org> wrote:

> Hi Dawid,
>
> thanks for your feedback. I agree with your concerns. I also observed
> that TableSchema might be used at too many different places for
> different responsibilities.
>
> How about we introduce a helper class for `SupportsProjectionPushDown`
> and also `LookupTableSource#Context#getKeys()` to represent nested
> structure of names. Data types, constraints, or computed columns are not
> necessary at those locations.
>
> We can also add utility methods for connectors to this helper class
> to quickly figure out the differences between the original table
> schema and the new one.
>
> SelectedFields {
>
>         private LogicalType originalRowType; // set by the planner
>
>         private int[][] indices;
>
>         getNames(int... at): String[]
>
>         getNames(String... at): String[]
>
>         getIndices(int... at): int[]
>
>         getIndices(String... at): int[]
>
>         toTableSchema(): TableSchema
> }
>
> What do others think?
>
> Thanks,
> Timo
>
>
>
> On 02.04.20 12:28, Dawid Wysakowicz wrote:
> > Generally +1
> >
> > One slight concern I have is about `SupportsProjectionPushDown`. I
> > don't necessarily understand how we can express projections with
> > TableSchema. It's unclear to me what happens when the type of a field
> > changes, when fields are in a different order, or when types do not match.
> > How do we express the projection of a nested field with TableSchema?
> >
> > I don't think this changes the core design presented in the FLIP,
> > therefore I'm fine with accepting the FLIP. I wanted to mention my
> > concerns, so that maybe we can adjust the passed around structures
> > slightly.
> >
> > Best,
> >
> > Dawid
> >
> > On 30/03/2020 14:42, Leonard Xu wrote:
> >> +1(non-binding)
> >>
> >> Best,
> >> Leonard Xu
> >>
> >>> On 30 Mar 2020, at 16:43, Jingsong Li <jingsongl...@gmail.com> wrote:
> >>>
> >>> +1
> >>>
> >>> Best,
> >>> Jingsong Lee
> >>>
> >>> On Mon, Mar 30, 2020 at 4:41 PM Kurt Young<k...@apache.org>  wrote:
> >>>
> >>>> +1
> >>>>
> >>>> Best,
> >>>> Kurt
> >>>>
> >>>>
> >>>> On Mon, Mar 30, 2020 at 4:08 PM Benchao Li <libenc...@gmail.com> wrote:
> >>>>
> >>>>> +1 (non-binding)
> >>>>>
> >>>>> On Mon, Mar 30, 2020 at 3:57 PM Jark Wu <imj...@gmail.com> wrote:
> >>>>>
> >>>>>> +1 from my side.
> >>>>>>
> >>>>>> Thanks Timo for driving this.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>> On Mon, 30 Mar 2020 at 15:36, Timo Walther <twal...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I would like to start the vote for FLIP-95 [1], which has been
> >>>>>>> discussed and reached consensus in the discussion thread [2].
> >>>>>>>
> >>>>>>> The vote will be open until April 2nd (72h), unless there is an
> >>>>>>> objection or not enough votes.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Timo
> >>>>>>>
> >>>>>>> [1]
> >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-95%3A+New+TableSource+and+TableSink+interfaces
> >>>>>>> [2]
> >>>>>>> https://lists.apache.org/thread.html/r03cbce8996fd06c9b0406c9ddc0d271bd456f943f313b9261fa061f9%40%3Cdev.flink.apache.org%3E
> >>>>> --
> >>>>>
> >>>>> Benchao Li
> >>>>> School of Electronics Engineering and Computer Science, Peking University
> >>>>> Tel:+86-15650713730
> >>>>> Email:libenc...@gmail.com;libenc...@pku.edu.cn
> >>>>>
> >>> --
> >>> Best, Jingsong Lee
>
>
