Hi everyone,
Thank you to everyone who participated in the discussion. After many rounds 
of discussion, the proposal has been gradually revised and improved. 
We look forward to further feedback and will start a vote in the next day or 
two.







--

Best regards,
Mang Zhang





At 2022-06-28 22:23:16, "Mang Zhang" <zhangma...@163.com> wrote:
>Hi Yuxia,
>Thank you very much for your reply.
>
>
>>1: Also, the mixture of CTAS and RTAS confuses me, as the FLIP says nothing 
>>about RTAS but suddenly refers to it in the configuration. And if we're not 
>>going to implement RTAS in this FLIP, it may be better not to refer to it, 
>>and `rtas` shouldn't be exposed to users as a configuration.
>RTAS is not supported for now because its semantics cannot yet be unified 
>across streaming and batch mode, and the concrete business scenarios are not 
>very clear. We plan to support it in the future; if we only added RTAS to the 
>option name at that point, users would bear the cost of changing their 
>configuration.
>>2: How will the CTASJobStatusHook be passed to the StreamGraph as a hook? 
>>Could you please explain it? Some pseudocode would be even better, if 
>>possible. I'm lost in this part.
>
>
>
>
>This part is largely an implementation detail, and of course we had to 
>make some changes to achieve it. The FLIP focuses on semantic consistency 
>between streaming and batch mode, and on providing optional atomicity support.
>
>
>>3: The name `AtomicCatalog` confuses me. It seems the background for the 
>>naming is this: to implement atomicity for CTAS, we propose an interface for 
>>catalogs to support serialization, and then we name it `AtomicCatalog`. At 
>>least, the interface exists for the atomicity of CTAS. But if we want to 
>>implement other features, such as isolation, which may also require a 
>>serializable catalog in the future, should we introduce a new interface named 
>>`IsolateCatalog`? Have you considered other names like `SerializableCatalog`? 
>>As it's a public interface, maybe we should be careful about the name.
>Regarding the name of the catalog interface, we also discussed 
>`SerializableCatalog`, but it is too specific and does not convey the atomic 
>functionality we want to express. For CTAS/RTAS to support atomicity, the 
>catalog needs to implement `AtomicCatalog`, so this name is more 
>straightforward to understand.
>
>
>Hope this answers your question.
>
>
>
>
>--
>
>Best regards,
>Mang Zhang
>
>
>
>
>
>At 2022-06-28 11:36:51, "yuxia" <luoyu...@alumni.sjtu.edu.cn> wrote:
>>Thanks for updating. The FLIP looks generally good to me. I have only minor 
>>questions:
>>
>>1: Also, the mixture of CTAS and RTAS confuses me, as the FLIP says nothing 
>>about RTAS but suddenly refers to it in the configuration. And if we're not 
>>going to implement RTAS in this FLIP, it may be better not to refer to it, 
>>and `rtas` shouldn't be exposed to users as a configuration.
>>
>>2: How will the CTASJobStatusHook be passed to the StreamGraph as a hook? 
>>Could you please explain it? Some pseudocode would be even better, if 
>>possible. I'm lost in this part.
>>
>>3: The name `AtomicCatalog` confuses me. It seems the background for the 
>>naming is this: to implement atomicity for CTAS, we propose an interface for 
>>catalogs to support serialization, and then we name it `AtomicCatalog`. At 
>>least, the interface exists for the atomicity of CTAS. But if we want to 
>>implement other features, such as isolation, which may also require a 
>>serializable catalog in the future, should we introduce a new interface named 
>>`IsolateCatalog`? Have you considered other names like `SerializableCatalog`? 
>>As it's a public interface, maybe we should be careful about the name.
>>
>>
>>Best regards,
>>Yuxia
>>
>>----- Original Message -----
>>From: "Mang Zhang" <zhangma...@163.com>
>>To: "dev" <dev@flink.apache.org>
>>Cc: imj...@gmail.com
>>Sent: Monday, June 27, 2022, 5:43:50 PM
>>Subject: Re:Re: Re:Re: Re: Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in 
>>CREATE TABLE(CTAS)
>>
>>Hi Jark,
>>First of all, thank you for your very good advice!
>>The RTAS point you mentioned is a good one, and we should support it as well. 
>>However, by investigating the semantics of RTAS and how RTAS is used within 
>>the company, I found that:
>>1. The semantics of RTAS say that if the table exists, the old data must be 
>>deleted and replaced with the new data.
>>These semantics are easier to implement in batch mode: for example, if the 
>>target table is a Hive table, the old data files can be deleted directly.
>>But in streaming mode, the target table is probably a Kafka topic, and we 
>>can't delete the data.
>>So the semantics in streaming and batch scenarios cannot easily be kept 
>>consistent.
>>2. I checked the big data SQL jobs in the company over the last week and 
>>found that RTAS was not used.
>>No users in the company have mentioned a need for RTAS yet, so the 
>>application scenario is not very clear.
>>
>>
>>It is not clear what semantics RTAS should provide in streaming mode, 
>>and the users' business scenarios are not very clear either.
>>Maybe we don't have to support RTAS soon, but we can leave room for 
>>supporting RTAS in the future in the interface definition.
>>What do you think? Looking forward to your response!
>>
>>
>>By the way, the other points raised have been addressed in the FLIP. Thanks.
>>
>>
>>
>>
>>--
>>
>>Best regards,
>>Mang Zhang
>>
>>
>>
>>
>>
>>At 2022-06-26 11:56:53, "Jark Wu" <imj...@gmail.com> wrote:
>>>Thanks for the update, Mang and Ron,
>>>
>>>The new proposal looks good to me in general, especially keeping the
>>>behavior consistent between batch and streaming mode by default. This is
>>>how we do it in the previous "table.dml-sync" option on ML [1].
>>>
>>>Besides that, I just have some final minor comments regarding some
>>>interfaces.
>>>
>>>1) table.ctas-or-rtas.atomicity-enabled
>>>The "OR" keyword sounds like this configuration can only take effect on
>>>one of CTAS and RTAS.
>>>What about "table.ctas-and-rtas" or "table.ctas-rtas"?
>>>
>>>2) In the FLIP, you have mentioned RTAS many times, but have no plan to
>>>support it.
>>>RTAS is another widely used statement similar to CTAS. It seems there is
>>>not much difference between CTAS and RTAS. Considering we are introducing
>>>RTAS configurations, is it possible to support RTAS in this FLIP as well?
>>>
>>>3) connector.type
>>>"connector.type" has been deprecated since FLIP-95, could you replace it
>>>with 'connector'?
>>>
>>>4) SupportsAtomicCatalog
>>>I have some concerns about using the "Supports.." prefix, which is known as
>>>the ability extension for DynamicTableSource and DynamicTableSink. Maybe
>>>"AtomicCatalog" is enough?
>>>
>>>Best,
>>>Jark
>>>
>>>[1]: https://lists.apache.org/thread/78r8ybh4q3hkxf935vzjkb7782hqzcj2
>>>
>>>On Fri, 24 Jun 2022 at 22:51, Mang Zhang <zhangma...@163.com> wrote:
>>>
>>>> Hi all,
>>>> Thank you to all those who participated in the discussion and made
>>>> suggestions!
>>>> After several rounds of online and offline discussions, the solution in
>>>> FLIP has been updated.
>>>> Looking forward to more feedback from everyone.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Best regards,
>>>> Mang Zhang
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2022-06-24 21:58:01, "Mang Zhang" <zhangma...@163.com> wrote:
>>>> >Hi godfrey and ron,
>>>> >Thank you very much for your replies and suggestions.
>>>> >Special thanks to ron for helping to review and improve the FLIP.
>>>> >Looking forward to further feedback from others.
>>>> >
>>>> >
>>>> >
>>>> >--
>>>> >
>>>> >Best regards,
>>>> >Mang Zhang
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >At 2022-06-24 19:52:58, "ron" <ld...@zju.edu.cn> wrote:
>>>> >>Thanks to godfrey for the further feedback. Your suggestions look very good to
>>>> me, and the FLIP has been updated according to your feedback. It would be great
>>>> if you could take another look.
>>>> >>
>>>> >>Also looking forward to further feedback from others.
>>>> >>
>>>> >>
>>>> >>> -----Original Message-----
>>>> >>> From: "godfrey he" <godfre...@gmail.com>
>>>> >>> Sent: 2022-06-24 17:00:51 (Friday)
>>>> >>> To: dev <dev@flink.apache.org>
>>>> >>> Cc: "Yun Gao" <yungao...@aliyun.com>
>>>> >>> Subject: Re: Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE
>>>> TABLE(CTAS)
>>>> >>>
>>>> >>> Hi all,
>>>> >>>
>>>> >>> Sorry for the late reply.
>>>> >>>
>>>> >>> >table.cor-table-as-select.atomicity-enabled
>>>> >>> Regarding `cor`,  this abbreviation is not commonly used.
>>>> >>>
>>>> >>> >Create Table As Select(CTAS) feature depends on the serializability
>>>> of the catalog. To quickly see if the catalog supports CTAS, we need to try
>>>> to serialize the catalog when compiling SQL in the planner and if it fails, an
>>>> exception will be thrown to indicate to the user that the catalog does not
>>>> support CTAS because it cannot be serialized.
>>>> >>> This behavior is too cryptic, and will break the current catalog
>>>> >>> behavior when using 1.16.
>>>> >>> I suggest we introduce a new interface for atomic catalog which
>>>> >>> implements Serializable.
>>>> >>> Existing catalogs can choose whether to implement the new catalog
>>>> interface.
>>>> >>>
>>>> >>> > Catalog#inferTableOptions
>>>> >>> I strongly recommend not introducing this feature now, because the
>>>> >>> behavior is unclear.
>>>> >>> 1) if the catalog supports managed tables, the connector option is
>>>> >>> empty. But if the user forgets to
>>>> >>> set the connector option for the CTAS statement, the created table will be
>>>> >>> a managed table.
>>>> >>> 2) the options and their values for the catalog and for the connector may be
>>>> different,
>>>> >>> so using the catalog options may cause unexpected errors.
>>>> >>>
>>>> >>> > StreamGraph#addJobStatusHook
>>>> >>> I prefer `registerJobStatusHook`
>>>> >>>
>>>> >>> Best,
>>>> >>> Godfrey
>>>> >>>
>>>> >>> On Mon, Jun 13, 2022 at 16:43, Mang Zhang <zhangma...@163.com> wrote:
>>>> >>> >
>>>> >>> > Hi Yun,
>>>> >>> > Thanks for your reply!
>>>> >>> > After offline communication with Dalong, I added the
>>>> JobStatusHook part to the FLIP. Looking forward to your feedback.
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> > --
>>>> >>> >
>>>> >>> > Best regards,
>>>> >>> > Mang Zhang
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> > At 2022-05-31 14:34:25, "Yun Gao" <yungao...@aliyun.com.INVALID>
>>>> wrote:
>>>> >>> > >Hi,
>>>> >>> > >
>>>> >>> > >Regarding the drop operation, with some offline discussion with
>>>> Dalong and Zhu,
>>>> >>> > >we think that listening in the client side might be problematic
>>>> since it would exit
>>>> >>> > >after submitting the jobs in detached mode, thus the operation
>>>> might need to
>>>> >>> > >be in the JobMaster side.
>>>> >>> > >
>>>> >>> > >For the listener interface, currently JobListener only resides in
>>>> the client side
>>>> >>> > >and contains unsuitable methods like onJobSubmitted for this
>>>> scenario, and
>>>> >>> > >the internal JobStatusListener is designed to be used inside JM and
>>>> is not
>>>> >>> > >serializable, thus we tend to add a new interface JobStatusHook,
>>>> >>> > >which could be attached to the JobGraph and executed in the
>>>> JobMaster.
>>>> >>> > >The interface will also be marked as Internal.
>>>> >>> > >
>>>> >>> > >Best,
>>>> >>> > >Yun
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >------------------------------------------------------------------
>>>> >>> > >From:Mang Zhang <zhangma...@163.com>
>>>> >>> > >Send Time:2022 May 25 (Wed.) 10:24
>>>> >>> > >To:dev <dev@flink.apache.org>
>>>> >>> > >Subject:Re:Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE
>>>> TABLE(CTAS)
>>>> >>> > >
>>>> >>> > >Hi, Martijn
>>>> >>> > >Thanks for your reply!
>>>> >>> > >I looked at the SQL standard, CTAS is part of the SQL standard.
>>>> >>> > >Feature T172 is "AS subquery clause in table definition".
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >--
>>>> >>> > >
>>>> >>> > >Best regards,
>>>> >>> > >Mang Zhang
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >At 2022-05-04 21:49:00, "Martijn Visser" <martijnvis...@apache.org>
>>>> wrote:
>>>> >>> > >>Hi everyone,
>>>> >>> > >>
>>>> >>> > >>Can we identify if this proposed syntax is part of the SQL
>>>> standard?
>>>> >>> > >>
>>>> >>> > >>Best regards,
>>>> >>> > >>
>>>> >>> > >>Martijn Visser
>>>> >>> > >>https://twitter.com/MartijnVisser82
>>>> >>> > >>https://github.com/MartijnVisser
>>>> >>> > >>
>>>> >>> > >>
>>>> >>> > >>On Fri, 29 Apr 2022 at 11:19, yuxia <luoyu...@alumni.sjtu.edu.cn>
>>>> wrote:
>>>> >>> > >>
>>>> >>> > >>> Thanks for driving this work, it will be a useful feature.
>>>> >>> > >>> About FLIP-218, I have some questions.
>>>> >>> > >>>
>>>> >>> > >>> 1: Does our CTAS syntax support specifying the target table's schema,
>>>> including
>>>> >>> > >>> column names and data types? I think it may be a useful feature in
>>>> case we want
>>>> >>> > >>> to change the data types in the target table instead of always copying
>>>> the source
>>>> >>> > >>> table's schema. It'll be more flexible with this feature.
>>>> >>> > >>> Btw, MySQL's "CREATE TABLE ... SELECT Statement"[1] supports this
>>>> feature.
>>>> >>> > >>>
>>>> >>> > >>> 2: It seems this will require the sink to implement a public interface to
>>>> drop the table,
>>>> >>> > >>> so what will the interface look like?
>>>> >>> > >>>
>>>> >>> > >>> [1]
>>>> https://dev.mysql.com/doc/refman/8.0/en/create-table-select.html
>>>> >>> > >>>
>>>> >>> > >>> Best regards,
>>>> >>> > >>> Yuxia
>>>> >>> > >>>
>>>> >>> > >>> ----- Original Message -----
>>>> >>> > >>> From: "Mang Zhang" <zhangma...@163.com>
>>>> >>> > >>> To: "dev" <dev@flink.apache.org>
>>>> >>> > >>> Sent: Thursday, April 28, 2022, 4:57:24 PM
>>>> >>> > >>> Subject: [DISCUSS] FLIP-218: Support SELECT clause in CREATE
>>>> TABLE(CTAS)
>>>> >>> > >>>
>>>> >>> > >>> Hi, everyone
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>> I would like to open a discussion on supporting the SELECT clause in
>>>> CREATE
>>>> >>> > >>> TABLE (CTAS).
>>>> >>> > >>> With the development of business and the enhancement of Flink SQL
>>>> >>> > >>> capabilities, queries become more and more complex.
>>>> >>> > >>> Currently, the user needs to use a CREATE TABLE statement to create
>>>> the target
>>>> >>> > >>> table first, and then execute an INSERT statement.
>>>> >>> > >>> However, the target table may have many columns, which brings
>>>> a lot of
>>>> >>> > >>> work outside the business logic to the user.
>>>> >>> > >>> At the same time, the user must ensure that the schema of the created target
>>>> table is
>>>> >>> > >>> consistent with the schema of the query result.
>>>> >>> > >>> Using a CTAS syntax like that of Hive/Spark can greatly simplify this for the
>>>> user.
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>> You can find more details in FLIP-218[1]. Looking forward to
>>>> your feedback.
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>> [1]
>>>> >>> > >>>
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-218%3A+Support+SELECT+clause+in+CREATE+TABLE(CTAS)
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>>
>>>> >>> > >>> --
>>>> >>> > >>>
>>>> >>> > >>> Best regards,
>>>> >>> > >>> Mang Zhang
>>>> >>> > >>>
>>>> >>> > >
>>>> >>
>>>> >>
>>>> >>------------------------------
>>>> >>Best,
>>>> >>Ron
>>>>
