Hey Leonard,

Thanks for summarizing the document. I have one quick question. I
understand a temporal table w/o version means each row in the table only
has one version. But are we still able to track different views of such a
table through time, as rows are added/deleted to/from the table? For
example, suppose I have an append-only table source with event-time and PK,
will I be allowed to do an event-time temporal join with this table?
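
To make the question concrete, here is a rough sketch of the kind of query I
have in mind. The table and column names (`orders`, `products`, `order_time`,
`update_time`) are made up for illustration, and the connector options are
left out. The join uses the `FOR SYSTEM_TIME AS OF` form discussed in this
thread; whether an append-only source like this qualifies is exactly what I
am asking.

```sql
-- Hypothetical append-only source with an event-time attribute and a primary key.
CREATE TABLE products (
  product_id  STRING,
  price       DECIMAL(10, 2),
  update_time TIMESTAMP(3),
  WATERMARK FOR update_time AS update_time - INTERVAL '5' SECOND,
  PRIMARY KEY (product_id) NOT ENFORCED
) WITH (
  'connector' = '...'  -- connector options omitted
);

-- Event-time temporal join: each order looks up the product row
-- as of the order's own event time (o.order_time).
SELECT o.order_id, o.order_time, p.price
FROM orders AS o
JOIN products FOR SYSTEM_TIME AS OF o.order_time AS p
ON o.product_id = p.product_id;
```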

On Wed, Aug 12, 2020 at 3:31 PM Leonard Xu <xbjt...@gmail.com> wrote:

> Hi, all
>
> After a detailed offline discussion about the temporal table concepts
> and behavior, we settled on a reliable solution and rejected several
> alternatives.
> Compared to the rejected alternatives, the proposed approach tells a more
> unified story and is friendlier to users and to the current Flink framework.
> I updated the FLIP[1] with the proposed approach and reorganized the
> document to make it clearer.
>
> Please let me know if you have any concerns. I’m looking forward to your
> comments.
>
>
> Best
> Leonard
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>
>
> > On Aug 4, 2020, at 21:25, Leonard Xu <xbjt...@gmail.com> wrote:
> >
> > Hi, all
> >
> > I’ve updated the FLIP[1] with the terminology `ChangelogTime`.
> >
> > Best
> > Leonard
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
> >
> >> On Aug 4, 2020, at 20:58, Leonard Xu <xbjt...@gmail.com> wrote:
> >>
> >> Hi, Timo
> >>
> >> Thanks for your response.
> >>
> >>> 1) Naming: Is operation time a good term for this concept? If I read
> "The operation time is the time when the changes happened in system." or
> "The system time of DML execution in database", why don't we call it
> `ChangelogTime` or `SystemTime`? Introducing another terminology of time in
> Flink should be thought through.
> >>
> >> I agree that we should think this through. I have considered the names
> >> `ChangelogTime` and `SystemTime` too, and I don’t have a strong opinion
> >> on the name.
> >>
> >> I proposed `operationTime` because most changelogs come from databases,
> >> and in databases we always call an action an `operation` rather than a
> >> `change`. Operation time is easier for database users to understand,
> >> but it is more of a database term.
> >>
> >> For `SystemTime`, users may be confused about which system the name
> >> refers to: Flink, the database, or the CDC tool. Maybe it’s not a
> >> good name.
> >>
> >> `ChangelogTime` is a pretty good choice that is more consistent with the
> >> existing terminology `Changelog` and `ChangelogMode`, so let me use
> >> `ChangelogTime`, and I’ll update the FLIP.
> >>
> >>
> >>> 2) Exposing it through `org.apache.flink.types.Row`: Shall we also
> expose the concept of time through the user-level `Row` type? The FLIP does
> not mention this explicitly. I think we can keep it as an internal concept
> but I just wanted to ask for clarification.
> >>
> >> Yes, I want to keep it as an internal concept. We have discussed that
> >> changelog time would be a third time concept (the other two are
> >> event-time and processing-time). It is not easy for ordinary users to
> >> understand the three concepts accurately, and I have not found a
> >> scenario big enough that users need to touch the changelog time for
> >> now, so I prefer not to expose the concept to users.
> >>
> >>
> >> Best,
> >> Leonard
> >>
> >>
> >>>
> >>> On 04.08.20 04:58, Leonard Xu wrote:
> >>>> Thanks Konstantin,
> >>>> Regarding your questions, I hope my comments have addressed them; I
> >>>> have also added a few explanations to the FLIP.
> >>>> Thank you all for the feedback.
> >>>> It seems everyone involved in this thread has reached a consensus.
> >>>> I will start a vote thread later.
> >>>> Best,
> >>>> Leonard
> >>>>> On Aug 3, 2020, at 19:35, godfrey he <godfre...@gmail.com> wrote:
> >>>>>
> >>>>> Thanks Leonard for driving this FLIP.
> >>>>> Looks good to me.
> >>>>>
> >>>>> Best,
> >>>>> Godfrey
> >>>>>
> >>>>> On Mon, Aug 3, 2020 at 12:04 PM Jark Wu <imj...@gmail.com> wrote:
> >>>>>
> >>>>>> Thanks Leonard for the great FLIP. I think it is in very good shape.
> >>>>>> +1 to start a vote.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>> On Fri, 31 Jul 2020 at 17:56, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Leonard,
> >>>>>>>
> >>>>>>> Thanks for this FLIP!
> >>>>>>> Looks good from my side.
> >>>>>>>
> >>>>>>> Cheers, Fabian
> >>>>>>>
> >>>>>>> On Thu, Jul 30, 2020 at 10:15 PM Seth Wiesman <sjwies...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi Leonard,
> >>>>>>>>
> >>>>>>>> Thank you for pushing this, I think the updated syntax looks really
> >>>>>>>> good and the semantics make sense to me.
> >>>>>>>>
> >>>>>>>> +1
> >>>>>>>>
> >>>>>>>> Seth
> >>>>>>>>
> >>>>>>>> On Wed, Jul 29, 2020 at 11:36 AM Leonard Xu <xbjt...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi, Konstantin
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 1) A "Versioned Temporal Table DDL on source" can only be joined
> >>>>>>>>>> on the PRIMARY KEY attribute, correct?
> >>>>>>>>> Yes, the PRIMARY KEY would be the join key.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2) Isn't it the time attribute in the ORDER BY clause of the VIEW
> >>>>>>>>>> definition that defines whether a event-time or processing time
> >>>>>>>>>> temporal table join is used?
> >>>>>>>>>
> >>>>>>>>> I think whether the join is an event-time or a processing-time
> >>>>>>>>> temporal table join depends on the fact table’s time attribute
> >>>>>>>>> used in the temporal join rather than on the temporal table side.
> >>>>>>>>> The event-time or processing-time attribute in the temporal table
> >>>>>>>>> is only used to split the validity periods of the versioned
> >>>>>>>>> snapshots of the temporal table. The processing-time attribute is
> >>>>>>>>> not necessary for a temporal table without versions; only the
> >>>>>>>>> primary key is required. The following VIEW is also valid for a
> >>>>>>>>> temporal table without versions.
> >>>>>>>>> CREATE VIEW latest_rates AS
> >>>>>>>>> SELECT currency, LAST_VALUE(rate)   -- only keep the latest version
> >>>>>>>>> FROM rates
> >>>>>>>>> GROUP BY currency;                  -- inferred primary key
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 3) A "Versioned Temporal Table DDL on source" is always versioned
> >>>>>>>>>> on operation_time regardless of the lookup table attribute
> >>>>>>>>>> (event-time or processing time attribute), correct?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Yes, the semantics of `FOR SYSTEM_TIME AS OF o.time` is to use the
> >>>>>>>>> o.time value to look up the version of the temporal table.
> >>>>>>>>> If the fact table has a processing-time attribute, the join only
> >>>>>>>>> looks up the latest version of the temporal table, and the
> >>>>>>>>> implementation can be optimized, for example by keeping only the
> >>>>>>>>> latest version.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best
> >>>>>>>>> Leonard
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>
> >>
> >
>
>

-- 
Best regards!
Rui Li
