+1 to the updated design.

I agree with Fabian that the naming of "temporal table without version" is
a bit confusing but the actual semantics make sense to me. I think just
saying its a Flink managed lookup join makes sense.

Seth

On Tue, Aug 18, 2020 at 3:07 PM Fabian Hueske <fhue...@gmail.com> wrote:

> Thanks for the updated FLIP Leonard!
> In my opinion this was an improvement.
> So +1 for this design.
>
> I have just one remark regarding the terminology.
> I find the term "Temporal Table without Version" somewhat confusing.
> IMO, versions are the core principle of temporal tables and temporal
> tables without versions don't make much sense to me.
>
> What makes such a table a "Temporal" table? Isn't it just a regular table?
> If I understand the proposal correctly, "Temporal Tables without Version"
> can only be used in processing time temporal table joins, because this join
> only requests the current version.
> But all regular tables can be used in processing time (temporal) table
> joins as well.
> It's basically the same as a lookup join, with the only difference that
> the table is maintained in Flink and not accessed in an external system
> (for example via JDBC).
>
> Are "Temporal Tables without Version" called "Temporal" because they can
> be used in "processing time temporal table joins" and due to its name this
> join needs to join something that's called "Temporal"?
> In that case, we might want to rename "processing time temporal table
> joins" into something else that does not imply a versioning.
> Maybe we can call them just lookup joins to avoid introducing another term?
>
> Thanks, Fabian
>
> Am Di., 18. Aug. 2020 um 04:30 Uhr schrieb Rui Li <lirui.fu...@gmail.com>:
>
>> Thanks Leonard for the clarifications!
>>
>> On Mon, Aug 17, 2020 at 9:17 PM Leonard Xu <xbjt...@gmail.com> wrote:
>>
>>>
>>> > But are we still able to track different views of such a
>>> > table through time, as rows are added/deleted to/from the table?
>>>
>>> Yes, in fact we support temporal table from changlog which contains all
>>> possible message types(INSERT/UPDATE/DELETE).
>>>
>>> > For
>>> > example, suppose I have an append-only table source with event-time
>>> and PK,
>>> > will I be allowed to do an event-time temporal join with this table?
>>> Yes, I list some examples in the doc, the example versioned_rates3  is
>>> this case exactly.
>>>
>>> Best
>>> Leonard
>>>
>>>
>>> >
>>> > On Wed, Aug 12, 2020 at 3:31 PM Leonard Xu <xbjt...@gmail.com <mailto:
>>> xbjt...@gmail.com>> wrote:
>>> >
>>> >> Hi, all
>>> >>
>>> >> After a detailed offline discussion about the temporal table related
>>> >> concept and behavior, we had a reliable solution and rejected several
>>> >> alternatives.
>>> >> Compared to rejected alternatives, the proposed approach is a more
>>> unified
>>> >> story and also friendly to user and current Flink framework.
>>> >> I improved the FLIP[1] with the proposed approach and refactored the
>>> >> document organization to make it clear enough.
>>> >>
>>> >> Please let me know if you have any concerns, I’m looking forward your
>>> >> comments.
>>> >>
>>> >>
>>> >> Best
>>> >> Leonard
>>> >>
>>> >> [1]
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> >> <
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> <
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> >
>>> >>>
>>> >>
>>> >>
>>> >>> 在 2020年8月4日,21:25,Leonard Xu <xbjt...@gmail.com <mailto:
>>> xbjt...@gmail.com>> 写道:
>>> >>>
>>> >>> Hi, all
>>> >>>
>>> >>> I’ve updated the FLIP[1] with the terminology `ChangelogTime`.
>>> >>>
>>> >>> Best
>>> >>> Leonard
>>> >>> [1]
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> <
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> >
>>> >> <
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> <
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> >
>>> >>>
>>> >>>
>>> >>>> 在 2020年8月4日,20:58,Leonard Xu <xbjt...@gmail.com <mailto:
>>> xbjt...@gmail.com> <mailto:
>>> >> xbjt...@gmail.com <mailto:xbjt...@gmail.com>>> 写道:
>>> >>>>
>>> >>>> Hi, Timo
>>> >>>>
>>> >>>> Thanks for you response.
>>> >>>>
>>> >>>>> 1) Naming: Is operation time a good term for this concept? If I
>>> read
>>> >> "The operation time is the time when the changes happened in system."
>>> or
>>> >> "The system time of DML execution in database", why don't we call it
>>> >> `ChangelogTime` or `SystemTime`? Introducing another terminology of
>>> time in
>>> >> Flink should be thought through.
>>> >>>>
>>> >>>> I agree that we should thought through. I have considered the name
>>> >> `ChangelogTime` and `SystemTime` too, I don’t have strong opinion on
>>> the
>>> >> name.
>>> >>>>
>>> >>>> I proposed `operationTime` because most changelog comes from
>>> Database
>>> >> and we always called an action as `operation` rather than `change` in
>>> >> Database, the operation time is  easier to understand  for database
>>> users,
>>> >> but it's more like a database terminology.
>>> >>>>
>>> >>>> For `SystemTime`, user may confuse which one does the system in
>>> >> `SystemTime` represents?  Flink, Database or CDC tool.  Maybe it’s
>>> not a
>>> >> good name.
>>> >>>>
>>> >>>> `ChangelogTime` is a pretty choice which is more unified with
>>> existed
>>> >> terminology `Changelog` and `ChangelogMode`, so let me use
>>> `ChangelogTime`
>>> >> and I’ll update the FLIP.
>>> >>>>
>>> >>>>
>>> >>>>> 2) Exposing it through `org.apache.flink.types.Row`: Shall we also
>>> >> expose the concept of time through the user-level `Row` type? The
>>> FLIP does
>>> >> not mention this explictly. I think we can keep it as an internal
>>> concept
>>> >> but I just wanted to ask for clarification.
>>> >>>>
>>> >>>> Yes, I want to keep it as an internal concept, we have discussed
>>> that
>>> >> changelog time concept should be the third time concept(the other two
>>> are
>>> >> event-time and processing-time). It’s not easy for normal users(or to
>>> help
>>> >> normal users) understand the three concepts accurately, and I did not
>>> find
>>> >> a big enough scenario that user need to touch the changelog time for
>>> now,
>>> >> so I tend to do not expose the concept to users.
>>> >>>>
>>> >>>>
>>> >>>> Best,
>>> >>>> Leonard
>>> >>>>
>>> >>>>
>>> >>>>>
>>> >>>>> On 04.08.20 04:58, Leonard Xu wrote:
>>> >>>>>> Thanks Konstantin,
>>> >>>>>> Regarding your questions, hope my comments has address your
>>> questions
>>> >> and I also add a few explanation in the FLIP.
>>> >>>>>> Thank you all for the feedback,
>>> >>>>>> It seems everyone involved  in this thread has reached a
>>> consensus.
>>> >>>>>> I will start a vote thread  later.
>>> >>>>>> Best,
>>> >>>>>> Leonard
>>> >>>>>>> 在 2020年8月3日,19:35,godfrey he <godfre...@gmail.com <mailto:
>>> godfre...@gmail.com> <mailto:
>>> >> godfre...@gmail.com <mailto:godfre...@gmail.com>>> 写道:
>>> >>>>>>>
>>> >>>>>>> Thanks Lennard for driving this FLIP.
>>> >>>>>>> Looks good to me.
>>> >>>>>>>
>>> >>>>>>> Best,
>>> >>>>>>> Godfrey
>>> >>>>>>>
>>> >>>>>>> Jark Wu <imj...@gmail.com <mailto:imj...@gmail.com> <mailto:
>>> imj...@gmail.com <mailto:imj...@gmail.com>>> 于2020年8月3日周一
>>> >> 下午12:04写道:
>>> >>>>>>>
>>> >>>>>>>> Thanks Leonard for the great FLIP. I think it is in very good
>>> shape.
>>> >>>>>>>> +1 to start a vote.
>>> >>>>>>>>
>>> >>>>>>>> Best,
>>> >>>>>>>> Jark
>>> >>>>>>>>
>>> >>>>>>>> On Fri, 31 Jul 2020 at 17:56, Fabian Hueske <fhue...@gmail.com
>>> <mailto:fhue...@gmail.com>
>>> >> <mailto:fhue...@gmail.com <mailto:fhue...@gmail.com>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Hi Leonard,
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks for this FLIP!
>>> >>>>>>>>> Looks good from my side.
>>> >>>>>>>>>
>>> >>>>>>>>> Cheers, Fabian
>>> >>>>>>>>>
>>> >>>>>>>>> Am Do., 30. Juli 2020 um 22:15 Uhr schrieb Seth Wiesman <
>>> >>>>>>>>> sjwies...@gmail.com <mailto:sjwies...@gmail.com> <mailto:
>>> sjwies...@gmail.com <mailto:sjwies...@gmail.com>>
>>> >>>>>>>>>> :
>>> >>>>>>>>>
>>> >>>>>>>>>> Hi Leondard,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Thank you for pushing this, I think the updated syntax looks
>>> >> really
>>> >>>>>>>> good
>>> >>>>>>>>>> and the semantics make sense to me.
>>> >>>>>>>>>>
>>> >>>>>>>>>> +1
>>> >>>>>>>>>>
>>> >>>>>>>>>> Seth
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Wed, Jul 29, 2020 at 11:36 AM Leonard Xu <
>>> xbjt...@gmail.com <mailto:xbjt...@gmail.com>
>>> >> <mailto:xbjt...@gmail.com <mailto:xbjt...@gmail.com>>> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Hi, Konstantin
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> 1) A  "Versioned Temporal Table DDL on source" can only be
>>> >> joined
>>> >>>>>>>> on
>>> >>>>>>>>>> the
>>> >>>>>>>>>>>> PRIMARY KEY attribute, correct?
>>> >>>>>>>>>>> Yes, the PRIMARY KEY would be join key.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> 2) Isn't it the time attribute in the ORDER BY clause of the
>>> >> VIEW
>>> >>>>>>>>>>> definition that defines
>>> >>>>>>>>>>>> whether a event-time or processing time temporal table join
>>> is
>>> >>>>>>>> used?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> I think event-time or processing-time temporal table join
>>> >> depends on
>>> >>>>>>>>> fact
>>> >>>>>>>>>>> table’s time attribute in temporal join rather than from
>>> temporal
>>> >>>>>>>> table
>>> >>>>>>>>>>> side, the event-time or processing time in temporal table is
>>> just
>>> >>>>>>>> used
>>> >>>>>>>>> to
>>> >>>>>>>>>>> split the validity period of versioned snapshot of temporal
>>> >> table.
>>> >>>>>>>> The
>>> >>>>>>>>>>> processing time attribute is not  necessary for temporal
>>> table
>>> >>>>>>>> without
>>> >>>>>>>>>>> version, only the primary key is required, the following
>>> VIEW is
>>> >> also
>>> >>>>>>>>>> valid
>>> >>>>>>>>>>> for temporal table without version.
>>> >>>>>>>>>>> CREATE VIEW latest_rates AS
>>> >>>>>>>>>>> SELECT currency, LAST_VALUE(rate)            -- only keep the
>>> >> latest
>>> >>>>>>>>>>> version
>>> >>>>>>>>>>> FROM rates
>>> >>>>>>>>>>> GROUP BY currency;                           -- inferred
>>> primary
>>> >> key
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> 3) A "Versioned Temporal Table DDL on source" is always
>>> >> versioned
>>> >>>>>>>> on
>>> >>>>>>>>>>>> operation_time regardless of the lookup table attribute
>>> >> (event-time
>>> >>>>>>>>> or
>>> >>>>>>>>>>>> processing time attribute), correct?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Yes, the semantics of `FOR SYSTEM_TIME AS OF o.time` is
>>> using the
>>> >>>>>>>>> o.time
>>> >>>>>>>>>>> value to lookup the version of the temporal table.
>>> >>>>>>>>>>> For fact table has the processing time attribute, it means
>>> only
>>> >>>>>>>> lookup
>>> >>>>>>>>>> the
>>> >>>>>>>>>>> latest version of temporal table and we can do some
>>> optimization
>>> >> in
>>> >>>>>>>>>>> implementation like only keep the latest version.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Best
>>> >>>>>>>>>>> Leonard
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >
>>> > --
>>> > Best regards!
>>> > Rui Li
>>>
>>>
>>
>> --
>> Best regards!
>> Rui Li
>>
>

Reply via email to