Hi, Timo

Thanks for your response.

> 1) Naming: Is operation time a good term for this concept? If I read "The 
> operation time is the time when the changes happened in system." or "The 
> system time of DML execution in database", why don't we call it 
> `ChangelogTime` or `SystemTime`? Introducing another terminology of time in 
> Flink should be thought through.

I agree that we should think this through. I have considered the names 
`ChangelogTime` and `SystemTime` too; I don’t have a strong opinion on the name.

I proposed `operationTime` because most changelogs come from databases, and in 
databases we usually call an action an `operation` rather than a `change`. The 
operation time is easier for database users to understand, but it is more of 
a database term.

For `SystemTime`, users may be confused about which system the name refers to: 
Flink, the database, or the CDC tool. So it is probably not a good name.

`ChangelogTime` is a good choice because it is more consistent with the existing 
terms `Changelog` and `ChangelogMode`, so let me use `ChangelogTime`. 
I’ll update the FLIP.


> 2) Exposing it through `org.apache.flink.types.Row`: Shall we also expose the 
> concept of time through the user-level `Row` type? The FLIP does not mention 
> this explicitly. I think we can keep it as an internal concept but I just 
> wanted to ask for clarification.

Yes, I want to keep it as an internal concept. We have discussed that changelog 
time would be the third time concept (the other two are event time and 
processing time). It is not easy for ordinary users to understand all three 
concepts accurately, and I have not found a significant scenario in which users 
need to touch the changelog time for now, so I tend not to expose the concept 
to users.


Best,
Leonard


> 
> On 04.08.20 04:58, Leonard Xu wrote:
>> Thanks Konstantin,
>> Regarding your questions, I hope my comments have addressed them; I also 
>> added a few explanations to the FLIP.
>> Thank you all for the feedback,
>> It seems everyone involved in this thread has reached a consensus.
>> I will start a vote thread later.
>> Best,
>> Leonard
>>> On Aug 3, 2020, at 19:35, godfrey he <godfre...@gmail.com> wrote:
>>> 
>>> Thanks Leonard for driving this FLIP.
>>> Looks good to me.
>>> 
>>> Best,
>>> Godfrey
>>> 
>>> On Mon, Aug 3, 2020 at 12:04 PM, Jark Wu <imj...@gmail.com> wrote:
>>> 
>>>> Thanks Leonard for the great FLIP. I think it is in very good shape.
>>>> +1 to start a vote.
>>>> 
>>>> Best,
>>>> Jark
>>>> 
>>>> On Fri, 31 Jul 2020 at 17:56, Fabian Hueske <fhue...@gmail.com> wrote:
>>>> 
>>>>> Hi Leonard,
>>>>> 
>>>>> Thanks for this FLIP!
>>>>> Looks good from my side.
>>>>> 
>>>>> Cheers, Fabian
>>>>> 
>>>>> On Thu, Jul 30, 2020 at 22:15, Seth Wiesman <sjwies...@gmail.com> wrote:
>>>>> 
>>>>>> Hi Leonard,
>>>>>> 
>>>>>> Thank you for pushing this, I think the updated syntax looks really
>>>>>> good and the semantics make sense to me.
>>>>>> 
>>>>>> +1
>>>>>> 
>>>>>> Seth
>>>>>> 
>>>>>> On Wed, Jul 29, 2020 at 11:36 AM Leonard Xu <xbjt...@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi, Konstantin
>>>>>>> 
>>>>>>>> 
>>>>>>>> 1) A "Versioned Temporal Table DDL on source" can only be joined
>>>>>>>> on the PRIMARY KEY attribute, correct?
>>>>>>> Yes, the PRIMARY KEY would be the join key.
>>>>>>> 
>>>>>>>> 
>>>>>>>> 2) Isn't it the time attribute in the ORDER BY clause of the VIEW
>>>>>>>> definition that defines whether an event-time or processing-time
>>>>>>>> temporal table join is used?
>>>>>>> 
>>>>>>> I think whether an event-time or processing-time temporal table join
>>>>>>> is used depends on the fact table’s time attribute in the temporal
>>>>>>> join rather than on the temporal table side. The event time or
>>>>>>> processing time in the temporal table is only used to split the
>>>>>>> validity periods of the versioned snapshots of the temporal table.
>>>>>>> The processing-time attribute is not necessary for a temporal table
>>>>>>> without versions; only the primary key is required. The following
>>>>>>> VIEW is also valid for a temporal table without versions:
>>>>>>> CREATE VIEW latest_rates AS
>>>>>>> SELECT currency, LAST_VALUE(rate)    -- only keep the latest version
>>>>>>> FROM rates
>>>>>>> GROUP BY currency;                           -- inferred primary key
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> 3) A "Versioned Temporal Table DDL on source" is always versioned
>>>>>>>> on operation_time regardless of the lookup table attribute
>>>>>>>> (event-time or processing time attribute), correct?
>>>>>>> 
>>>>>>> 
>>>>>>> Yes, the semantics of `FOR SYSTEM_TIME AS OF o.time` is to use the
>>>>>>> o.time value to look up the version of the temporal table.
>>>>>>> If the fact table has a processing-time attribute, only the latest
>>>>>>> version of the temporal table is looked up, and we can do some
>>>>>>> optimization in the implementation, such as only keeping the latest
>>>>>>> version.
>>>>>>> 
>>>>>>> 
>>>>>>> Best
>>>>>>> Leonard
>>>>>> 
>>>>> 
>>>> 
> 
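
To illustrate the version lookup discussed in point 3 of the quoted mail, here 
is a minimal Python sketch. It is not Flink code: the `VersionedTable` class 
and its methods `upsert`, `as_of` and `latest` are hypothetical names used only 
for illustration, and timestamps are plain integers. `as_of` mimics looking up 
the version that is valid at the fact row's time, while `latest` mimics the 
processing-time case where only the newest version is needed.

```python
from bisect import bisect_right


class VersionedTable:
    """Toy model of a versioned temporal table (hypothetical, not a Flink API)."""

    def __init__(self):
        # primary key -> (sorted version start times, values in the same order)
        self._versions = {}

    def upsert(self, key, time, value):
        """Record that `key` has `value` starting at `time`."""
        times, values = self._versions.setdefault(key, ([], []))
        idx = bisect_right(times, time)
        times.insert(idx, time)
        values.insert(idx, value)

    def as_of(self, key, time):
        """Return the value of the version valid at `time`, or None if none exists yet."""
        times, values = self._versions.get(key, ([], []))
        idx = bisect_right(times, time)
        return values[idx - 1] if idx else None

    def latest(self, key):
        """Return the newest version; enough when the fact side uses processing time."""
        _, values = self._versions.get(key, ([], []))
        return values[-1] if values else None


rates = VersionedTable()
rates.upsert("EUR", 10, 1.10)
rates.upsert("EUR", 20, 1.15)

print(rates.as_of("EUR", 15))  # version that started at time 10
print(rates.as_of("EUR", 5))   # no version has started yet
print(rates.latest("EUR"))     # newest version only
```

In the `latest` case only the last value per key is ever read, which is why an 
implementation could keep just the latest version, as mentioned in the quoted 
mail.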
