[jira] [Commented] (FLINK-27548) Improve quick-start of table store

2023-02-27 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694284#comment-17694284
 ] 

ZhuoYu Chen commented on FLINK-27548:
-

Hi [~lzljs3620320], please assign this to me. Thanks.

> Improve quick-start of table store
> --
>
> Key: FLINK-27548
> URL: https://issues.apache.org/jira/browse/FLINK-27548
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> When the quick-start is completed, the stream job needs to be killed on the 
> flink page and the table needs to be dropped.
> But the exiting of the stream job is asynchronous and we need to wait a while 
> between these two actions. Otherwise there will be residue in the file 
> directory.
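The required wait between cancelling the job and dropping the table can be sketched as a small polling helper. This is a hedged illustration only: `get_job_state` is a hypothetical supplier, which could for example be backed by Flink's REST endpoint `GET /jobs/<jobid>`.

```python
import time

TERMINAL_STATES = {"FINISHED", "CANCELED", "FAILED"}

def wait_for_job_termination(get_job_state, timeout_s=60.0, poll_interval_s=0.5):
    """Poll a job-state supplier until the job reaches a terminal state.

    get_job_state: callable returning the current job state as a string
    (hypothetical; could query Flink's REST API GET /jobs/<jobid>).
    Returns the terminal state, or raises TimeoutError on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = get_job_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_interval_s)
    raise TimeoutError("job did not terminate within %.1fs" % timeout_s)
```

Only after this helper returns would the quick-start proceed to drop the table, avoiding residue in the file directory.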



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-26999) Introduce ClickHouse Connector

2022-04-01 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-26999:

Description: 
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-202%3A+Introduce+ClickHouse+Connector]

 

> Introduce ClickHouse Connector
> --
>
> Key: FLINK-26999
> URL: https://issues.apache.org/jira/browse/FLINK-26999
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-202%3A+Introduce+ClickHouse+Connector]
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26999) Introduce ClickHouse Connector

2022-04-01 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-26999:
---

 Summary: Introduce ClickHouse Connector
 Key: FLINK-26999
 URL: https://issues.apache.org/jira/browse/FLINK-26999
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-26999) Introduce ClickHouse Connector

2022-04-01 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-26999:

Issue Type: New Feature  (was: Improvement)

> Introduce ClickHouse Connector
> --
>
> Key: FLINK-26999
> URL: https://issues.apache.org/jira/browse/FLINK-26999
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24351) translate "JSON Function" pages into Chinese

2022-03-28 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513850#comment-17513850
 ] 

ZhuoYu Chen commented on FLINK-24351:
-

Hi [~gaoyunhaii], I would be happy to complete some of the follow-up work.

>  translate "JSON Function" pages into Chinese
> -
>
> Key: FLINK-24351
> URL: https://issues.apache.org/jira/browse/FLINK-24351
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: liwei li
>Assignee: ZhuoYu Chen
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: sql_functions_zh.yml, 微信图片_20211009105019.png
>
>
> translate "JSON Function" pages into Chinese, 
> docs/data/sql_functions_zh.yml
>  
> https://github.com/apache/flink/pull/17275#issuecomment-924536467



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25716) Translate "Streaming Concepts" page of "Application Development > Table API & SQL" to Chinese

2022-01-20 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479529#comment-17479529
 ] 

ZhuoYu Chen commented on FLINK-25716:
-

Hi [~MartijnVisser], I am very interested in this and would like to do some 
work for Flink. May I help with this task?

> Translate "Streaming Concepts" page of "Application Development > Table API & 
> SQL" to Chinese
> -
>
> Key: FLINK-25716
> URL: https://issues.apache.org/jira/browse/FLINK-25716
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Martijn Visser
>Priority: Major
>  Labels: chinese-translation
>
> After [https://github.com/apache/flink/pull/18316] is merged, we need to 
> update the translation for 
> [https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/table/concepts/overview/]
> The markdown file is located in 
> flink/docs/content.zh/docs/dev/table/concepts/overview.md



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24351) translate "JSON Function" pages into Chinese

2022-01-20 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479525#comment-17479525
 ] 

ZhuoYu Chen commented on FLINK-24351:
-

Hi [~jark] [~liliwei], the PR has been completed; please review and merge it.

>  translate "JSON Function" pages into Chinese
> -
>
> Key: FLINK-24351
> URL: https://issues.apache.org/jira/browse/FLINK-24351
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: liwei li
>Assignee: ZhuoYu Chen
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: sql_functions_zh.yml, 微信图片_20211009105019.png
>
>
> translate "JSON Function" pages into Chinese, 
> docs/data/sql_functions_zh.yml
>  
> https://github.com/apache/flink/pull/17275#issuecomment-924536467



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25559) SQL JOIN causes data loss

2022-01-12 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475100#comment-17475100
 ] 

ZhuoYu Chen commented on FLINK-25559:
-

Hi [~Ashulin], I am very interested in this and would like to do some work for 
Flink. May I help with this task?

> SQL JOIN causes data loss
> -
>
> Key: FLINK-25559
> URL: https://issues.apache.org/jira/browse/FLINK-25559
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.0, 1.13.2, 1.13.3, 1.13.5, 1.14.2
>Reporter: Zongwen Li
>Priority: Major
> Attachments: image-2022-01-07-11-27-01-010.png
>
>
> {code:java}
> // sink table, omits some physical fields
> CREATE TABLE kd_product_info (
>   productId BIGINT COMMENT 'product id',
>   productSaleId BIGINT COMMENT 'item id',
>   PRIMARY KEY (productSaleId) NOT ENFORCED
> )
> // sql omits some selected fields
> INSERT INTO kd_product_info
> SELECT
>  ps.product AS productId,
>  ps.productsaleid AS productSaleId,
>  CAST(p.complex AS INT) AS complex,
>  p.createtime AS createTime,
>  p.updatetime AS updateTime,
>  p.ean AS ean,
>  ts.availablequantity AS totalAvailableStock,
> IF
>  (
>   ts.availablequantity - ts.lockoccupy - ts.lock_available_quantity > 0,
>   ts.availablequantity - ts.lockoccupy - ts.lock_available_quantity,
>   0
>  ) AS sharedStock
>  ,rps.purchase AS purchase
>  ,v.`name` AS vendorName
>  FROM
>  product_sale ps
>  JOIN product p ON ps.product = p.id
>  LEFT JOIN rate_product_sale rps ON ps.productsaleid = rps.id
>  LEFT JOIN pss_total_stock ts ON ps.productsaleid = ts.productsale
>  LEFT JOIN vendor v ON ps.merchant_id = v.merchant_id AND ps.vendor = v.vendor
>  LEFT JOIN mccategory mc ON ps.merchant_id = mc.merchant_id AND p.mccategory 
> = mc.id
>  LEFT JOIN new_mccategory nmc ON p.mccategory = nmc.mc
>  LEFT JOIN product_sale_grade_plus psgp ON ps.productsaleid = psgp.productsale
>  LEFT JOIN product_sale_extend pse359 ON ps.productsaleid = 
> pse359.product_sale AND pse359.meta = 359
>  LEFT JOIN product_image_url piu ON ps.product = piu.product {code}
> All table sources are upsert-kafka. I have ensured that the joined columns 
> are of the same type:
> {code:java}
> CREATE TABLE product_sale (
>   id BIGINT COMMENT 'primary key',
>   productsaleid BIGINT COMMENT 'item id',
>   product BIGINT COMMENT 'product id',
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant id',
>   vendor STRING COMMENT 'vendor',
>   PRIMARY KEY (productsaleid) NOT ENFORCED
> )
> // No computed columns
> // Just plain physical columns
> WITH (
> 'connector' = 'upsert-kafka',
> 'topic' = 'XXX',
> 'group.id' = '%s',
> 'properties.bootstrap.servers' = '%s',
> 'key.format' = 'json',
> 'value.format' = 'json'
> ) 
> CREATE TABLE product (
>   id BIGINT,
>   mccategory STRING,
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE rate_product_sale ( 
>   id BIGINT COMMENT 'primary key',
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE pss_total_stock (
>   id INT COMMENT 'ID',
>   productsale BIGINT COMMENT 'item code',
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE vendor (
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant id',
>   vendor STRING COMMENT 'vendor code',
>   PRIMARY KEY (merchant_id, vendor) NOT ENFORCED
> )
> CREATE TABLE mccategory (
>   id STRING COMMENT 'mc id',
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant id',
>   PRIMARY KEY (merchant_id, id) NOT ENFORCED
> )
> CREATE TABLE new_mccategory (
>   mc STRING,
>   PRIMARY KEY (mc) NOT ENFORCED
> )
> CREATE TABLE product_sale_grade_plus (
>   productsale BIGINT,
>   PRIMARY KEY (productsale) NOT ENFORCED
> )
> CREATE TABLE product_sale_extend (
>   id BIGINT,
>   product_sale BIGINT,
>   meta BIGINT,
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE product_image_url(
>   product BIGINT,
>   PRIMARY KEY (product) NOT ENFORCED 
> ){code}
> Each table holds between 5 million and 10 million rows; parallelism: 24;
> no TTL is set. In fact, we can notice data loss within as little as 30 minutes.
>  
> The data flow:
> MySQL -> Flink CDC -> ODS (Upsert Kafka) -> the job -> sink
> I'm sure the ODS data in Kafka is correct;
> I have also tried using the flink-cdc source directly; it didn't solve the 
> problem;
>  
> We tested sinking to kudu, Kafka or ES;
> Also tested multiple versions: 1.13.2, 1.13.3, 1.13.5, 1.14.0, 1.14.2;
> The lost data appears out of order in Kafka; we guess this is a bug in the 
> retraction stream:
> !image-2022-01-07-11-27-01-010.png!
>  
> After many tests, we found that the more tables are LEFT JOINed, or the 
> higher the parallelism of the operator, the more easily data is lost.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25595) Specify hash/sort aggregate strategy in SQL hint

2022-01-12 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474534#comment-17474534
 ] 

ZhuoYu Chen commented on FLINK-25595:
-

Hi [~jingzhang], I am very interested in this and would like to do some work 
for Flink. May I help with this task?

> Specify hash/sort aggregate strategy in SQL hint
> 
>
> Key: FLINK-25595
> URL: https://issues.apache.org/jira/browse/FLINK-25595
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jing Zhang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25600) Support new statement set syntax in sql client and update docs

2022-01-12 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474536#comment-17474536
 ] 

ZhuoYu Chen commented on FLINK-25600:
-

Hi [~wenlong.lwl], I am very interested in this and would like to do some work 
for Flink. May I help with this task?

> Support new statement set syntax in sql client and update docs
> --
>
> Key: FLINK-25600
> URL: https://issues.apache.org/jira/browse/FLINK-25600
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Client
>Reporter: Wenlong Lyu
>Priority: Major
>
> This is a follow-up of FLINK-25392 to finish adding the new statement set: 
> 1. The new statement set needs multi-line parsing support in the SQL client, 
> which is not supported currently:
> execute statement set begin
> insert xxx;
> insert xxx;
> end;
> 2. We need to update the docs to introduce the new syntax.
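For illustration, the new multi-line syntax could look as follows in the SQL client; the table names here are hypothetical placeholders:

```sql
EXECUTE STATEMENT SET
BEGIN
  INSERT INTO sink_a SELECT * FROM source_t;
  INSERT INTO sink_b SELECT id, name FROM source_t;
END;
```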



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25631) Support enhanced `show tables` statement

2022-01-12 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474529#comment-17474529
 ] 

ZhuoYu Chen edited comment on FLINK-25631 at 1/12/22, 1:39 PM:
---

Hi [~liyubin117], I am very interested in this and would like to do some work 
for Flink. May I help with this task?


was (Author: monster#12):
Hi [~liyubin117], I am very interested in this and would like to do some work for Flink. May I help with it?

> Support enhanced `show tables` statement
> 
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: Yubin Li
>Priority: Major
>
> An enhanced `show tables` statement like ` show tables from db1 like 't%' ` is 
> broadly supported in many popular data processing engines such as 
> presto/trino/spark:
> [https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html]
> I have investigated the syntax of the engines mentioned above.
>  
> We could use such a statement to easily show the tables of a specified 
> database without switching databases frequently; we could also use a regexp 
> pattern to quickly find the tables of interest among many tables. Besides, the 
> new statement is fully compatible with the old one, so users can still use 
> `show tables` as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
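A few examples of the proposed grammar; the catalog and database names here are hypothetical:

```sql
-- all tables in db1 of the current catalog whose names start with 't'
SHOW TABLES FROM db1 LIKE 't%';

-- tables in a fully qualified database, excluding a pattern
SHOW TABLES IN my_catalog.db1 NOT LIKE 'tmp_%';

-- the old form keeps working
SHOW TABLES;
```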



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25631) Support enhanced `show tables` statement

2022-01-12 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474529#comment-17474529
 ] 

ZhuoYu Chen commented on FLINK-25631:
-

Hi [~liyubin117], I am very interested in this and would like to do some work for Flink. May I help with it?

> Support enhanced `show tables` statement
> 
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: Yubin Li
>Priority: Major
>
> An enhanced `show tables` statement like ` show tables from db1 like 't%' ` is 
> broadly supported in many popular data processing engines such as 
> presto/trino/spark:
> [https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html]
> I have investigated the syntax of the engines mentioned above.
>  
> We could use such a statement to easily show the tables of a specified 
> database without switching databases frequently; we could also use a regexp 
> pattern to quickly find the tables of interest among many tables. Besides, the 
> new statement is fully compatible with the old one, so users can still use 
> `show tables` as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-24456) Support bounded offset in the Kafka table connector

2022-01-11 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472973#comment-17472973
 ] 

ZhuoYu Chen edited comment on FLINK-24456 at 1/11/22, 5:33 PM:
---

Hi [~MartijnVisser], I have finished this PR; could you please take a look and 
see if it can be merged to the master branch?


was (Author: monster#12):
Hi [~MartijnVisser] *I have finished this pr, could you please take a look and 
see if it can be merged to the master branch?*

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2022-01-11 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472973#comment-17472973
 ] 

ZhuoYu Chen commented on FLINK-24456:
-

Hi [~MartijnVisser], I have finished this PR; could you please take a look and 
see if it can be merged to the master branch?

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.
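The bounded-read semantics that `setBounded` would add can be sketched as follows. This is a minimal, hypothetical Python illustration of the idea (stop emitting once every partition reaches a stored end offset), not the actual Kafka connector API:

```python
def read_bounded(records, end_offsets):
    """Consume (partition, offset, value) records until every partition
    has reached its end offset (exclusive), mimicking a bounded source.

    records: iterable of (partition, offset, value) with offsets
    increasing per partition.
    end_offsets: {partition: first offset NOT to emit}.
    """
    done = {p: False for p in end_offsets}
    out = []
    for partition, offset, value in records:
        if offset >= end_offsets.get(partition, float("inf")):
            # This partition is finished; stop entirely once all are done.
            done[partition] = True
            if all(done.values()):
                break
            continue
        out.append((partition, offset, value))
    return out
```

In a test, the end offsets would be captured when the job starts, giving the table source a deterministic, finite input.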



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2022-01-11 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472536#comment-17472536
 ] 

ZhuoYu Chen commented on FLINK-24456:
-

[~MartijnVisser] I think the basic logic is done; there are still some 
debatable field-naming questions. I can put time and effort into this ticket 
first and try to resolve those issues this month.

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24900) Support to run multiple shuffle plugins in one session cluster

2022-01-04 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468528#comment-17468528
 ] 

ZhuoYu Chen commented on FLINK-24900:
-

I'm glad to see your message. I'll read 
https://github.com/apache/flink/pull/18253 in the next few days and ask for 
your advice if I don't understand something. I hope I can contribute a little 
to the community on this issue in the future. :D

> Support to run multiple shuffle plugins in one session cluster
> --
>
> Key: FLINK-24900
> URL: https://issues.apache.org/jira/browse/FLINK-24900
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
>
> Currently, one Flink cluster can only use one shuffle plugin. However, there 
> are cases where different jobs may need different shuffle implementations. By 
> loading shuffle plugin with the plugin manager and letting jobs select their 
> shuffle service freely, Flink can support to run multiple shuffle plugins in 
> one session cluster.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-15571) Create a Redis Streams Connector for Flink

2022-01-01 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467441#comment-17467441
 ] 

ZhuoYu Chen commented on FLINK-15571:
-

Hello, I have already started the implementation and expect to submit it this month.

> Create a Redis Streams Connector for Flink
> --
>
> Key: FLINK-15571
> URL: https://issues.apache.org/jira/browse/FLINK-15571
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Tugdual Grall
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available, stale-assigned
>
> Redis has a "log data structure" called Redis Streams, it would be nice to 
> integrate Redis Streams and Apache Flink as:
>  * Source
>  * Sink
> See Redis Streams introduction: [https://redis.io/topics/streams-intro]
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25397) support grouped_execution

2021-12-20 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25397:

Description: 
Performing bucketed data execution: two tables (orders, orders_item) are 
divided into buckets (bucketing) based on the same field (orderid) and the same 
number of buckets. In a join by order id, the join and aggregation calculations 
can be performed independently per bucket, because identical order ids from 
both tables fall into buckets with the same ids.
This has several advantages:
1. Whenever a bucket of data has been computed, the memory occupied by this 
bucket can be released immediately, so memory consumption can be limited by 
controlling the number of buckets processed in parallel.
2. It reduces a lot of shuffling.

  was:
Performing data bucketing execution: two tables (orders, orders_item), divided 
into buckets (bucketing) based on the same fields (orderid) and the same number 
of buckets. In join by order id, join and aggregation calculations can be 
performed independently, because the same order ids of both tables are divided 
into buckets with the same ids.
This has several advantages. 1. Whenever a bucket of data is computed, the 
memory occupied by this bucket can be released immediately, so memory 
consumption can be limited by controlling the number of buckets processed in 
parallel.
2. reduces a lot of shuffling


> support grouped_execution
> -
>
> Key: FLINK-25397
> URL: https://issues.apache.org/jira/browse/FLINK-25397
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Legacy Planner, Table SQL / Planner, Table 
> SQL / Runtime
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> Performing bucketed data execution: two tables (orders, orders_item) are 
> divided into buckets (bucketing) based on the same field (orderid) and the 
> same number of buckets. In a join by order id, the join and aggregation 
> calculations can be performed independently per bucket, because identical 
> order ids from both tables fall into buckets with the same ids.
> This has several advantages:
> 1. Whenever a bucket of data has been computed, the memory occupied by this 
> bucket can be released immediately, so memory consumption can be limited by 
> controlling the number of buckets processed in parallel.
> 2. It reduces a lot of shuffling.
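The per-bucket join described above can be sketched as follows. This is a minimal, hypothetical Python illustration of the technique (co-partitioned join, processed and freed bucket by bucket), not Flink's implementation:

```python
from collections import defaultdict

NUM_BUCKETS = 4

def bucket_of(order_id, num_buckets=NUM_BUCKETS):
    # Both tables use the same key and bucket count, so matching
    # order ids always land in the same bucket.
    return hash(order_id) % num_buckets

def bucketed_join(orders, order_items, num_buckets=NUM_BUCKETS):
    """Join two row sets on 'orderid', one bucket at a time."""
    left = defaultdict(list)
    right = defaultdict(list)
    for row in orders:
        left[bucket_of(row["orderid"], num_buckets)].append(row)
    for row in order_items:
        right[bucket_of(row["orderid"], num_buckets)].append(row)

    for b in range(num_buckets):
        # Build a hash index only for this bucket's left rows.
        index = defaultdict(list)
        for row in left.pop(b, []):
            index[row["orderid"]].append(row)
        for item in right.pop(b, []):
            for order in index[item["orderid"]]:
                yield {**order, **item}
        # 'index' goes out of scope here, so the bucket's memory can be
        # released before the next bucket is processed.
```

Memory is bounded by the largest single bucket (times the number of buckets processed in parallel), rather than by the whole table.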



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25397) support grouped_execution

2021-12-20 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25397:

Summary: support grouped_execution  (was: grouped_execution)

> support grouped_execution
> -
>
> Key: FLINK-25397
> URL: https://issues.apache.org/jira/browse/FLINK-25397
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Legacy Planner, Table SQL / Planner, Table 
> SQL / Runtime
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> Performing bucketed data execution: two tables (orders, orders_item) are 
> divided into buckets (bucketing) based on the same field (orderid) and the 
> same number of buckets. In a join by order id, the join and aggregation 
> calculations can be performed independently per bucket, because identical 
> order ids from both tables fall into buckets with the same ids.
> This has several advantages:
> 1. Whenever a bucket of data has been computed, the memory occupied by this 
> bucket can be released immediately, so memory consumption can be limited by 
> controlling the number of buckets processed in parallel.
> 2. It reduces a lot of shuffling.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25397) support grouped_execution

2021-12-20 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25397:

Description: 
Performing bucketed data execution: two tables (orders, orders_item) are 
divided into buckets (bucketing) based on the same field (orderid) and the same 
number of buckets. In a join by order id, the join and aggregation calculations 
can be performed independently per bucket, because identical order ids from 
both tables fall into buckets with the same ids.
This has several advantages:
1. Whenever a bucket of data has been computed, the memory occupied by this 
bucket can be released immediately, so memory consumption can be limited by 
controlling the number of buckets processed in parallel.
2. It reduces a lot of shuffling.

  was:
Performing data bucketing execution: two tables (orders, orders_item), divided 
into buckets (bucketing) based on the same fields (orderid) and the same number 
of buckets. In join by order id, join and aggregation calculations can be 
performed independently, because the same order ids of both tables are divided 
into buckets with the same ids.
This has several advantages.:
1. Whenever a bucket of data is computed, the memory occupied by this bucket 
can be released immediately, so memory consumption can be limited by 
controlling the number of buckets processed in parallel.
2. reduces a lot of shuffling


> support grouped_execution
> -
>
> Key: FLINK-25397
> URL: https://issues.apache.org/jira/browse/FLINK-25397
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Legacy Planner, Table SQL / Planner, Table 
> SQL / Runtime
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> Performing bucketed data execution: two tables (orders, orders_item) are 
> divided into buckets (bucketing) based on the same field (orderid) and the 
> same number of buckets. In a join by order id, the join and aggregation 
> calculations can be performed independently per bucket, because identical 
> order ids from both tables fall into buckets with the same ids.
> This has several advantages:
> 1. Whenever a bucket of data has been computed, the memory occupied by this 
> bucket can be released immediately, so memory consumption can be limited by 
> controlling the number of buckets processed in parallel.
> 2. It reduces a lot of shuffling.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25397) grouped_execution

2021-12-20 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25397:
---

 Summary: grouped_execution
 Key: FLINK-25397
 URL: https://issues.apache.org/jira/browse/FLINK-25397
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Legacy Planner, Table SQL / Planner, Table 
SQL / Runtime
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen


Performing bucketed data execution: two tables (orders, orders_item) are 
divided into buckets (bucketing) based on the same field (orderid) and the same 
number of buckets. In a join by order id, the join and aggregation calculations 
can be performed independently per bucket, because identical order ids from 
both tables fall into buckets with the same ids.
This has several advantages:
1. Whenever a bucket of data has been computed, the memory occupied by this 
bucket can be released immediately, so memory consumption can be limited by 
controlling the number of buckets processed in parallel.
2. It reduces a lot of shuffling.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24483) Document what is Public API and what compatibility guarantees Flink is providing

2021-12-20 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462626#comment-17462626
 ] 

ZhuoYu Chen commented on FLINK-24483:
-

[~trohrmann] 
@PublicEvolving indicates that the class/interface is only guaranteed to be 
stable within patch versions, e.g. the class/interface is stable across 1.1.1 
and 1.1.3.
@Public indicates that the class/interface only changes between major versions, 
i.e. it is guaranteed to be stable within a major version, e.g. across 1.1, 
1.2, and 1.3. Is my understanding above correct?

Regarding `Transitive closure for methods`, I don't see anything that puts 
restrictions on it.

Regarding the `Test plan`, where can I see the relevant demos or code?

> Document what is Public API and what compatibility guarantees Flink is 
> providing
> 
>
> Key: FLINK-24483
> URL: https://issues.apache.org/jira/browse/FLINK-24483
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Documentation, Table SQL / API
>Affects Versions: 1.14.0, 1.12.5, 1.13.2
>Reporter: Piotr Nowojski
>Priority: Blocker
> Fix For: 1.15.0, 1.13.6, 1.14.3
>
>
> We should document:
> * What constitute of the Public API, what do 
> Public/PublicEvolving/Experimental/Internal annotations mean.
> * What compatibility guarantees we are providing forward (backward?) 
> functional/compile/binary compatibility for {{@Public}} interfaces?
> A good starting point: 
> https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=44302796#content/view/62686683



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24483) Document what is Public API and what compatibility guarantees Flink is providing

2021-12-17 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461454#comment-17461454
 ] 

ZhuoYu Chen commented on FLINK-24483:
-

Hi [~trohrmann], I am very interested in this and would like to do some work 
for Flink. May I help with this task?
Thank you

> Document what is Public API and what compatibility guarantees Flink is 
> providing
> 
>
> Key: FLINK-24483
> URL: https://issues.apache.org/jira/browse/FLINK-24483
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Documentation, Table SQL / API
>Affects Versions: 1.14.0, 1.12.5, 1.13.2
>Reporter: Piotr Nowojski
>Priority: Blocker
> Fix For: 1.15.0, 1.13.6, 1.14.3
>
>
> We should document:
> * What constitute of the Public API, what do 
> Public/PublicEvolving/Experimental/Internal annotations mean.
> * What compatibility guarantees we are providing forward (backward?) 
> functional/compile/binary compatibility for {{@Public}} interfaces?
> A good starting point: 
> https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=44302796#content/view/62686683



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-12-17 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461307#comment-17461307
 ] 

ZhuoYu Chen commented on FLINK-24456:
-

[~dragonpic] Kafka is unbounded by nature; the bounded parameters are being
added mainly for the sake of small tests. Its primary use case is still
streaming.

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25361) Support bounded timestamp in kafka table connector

2021-12-17 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25361:
---

 Summary: Support bounded timestamp in kafka table connector
 Key: FLINK-25361
 URL: https://issues.apache.org/jira/browse/FLINK-25361
 Project: Flink
  Issue Type: Sub-task
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-12-17 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461291#comment-17461291
 ] 

ZhuoYu Chen commented on FLINK-24456:
-

[~dragonpic] I think this can be addressed by adding two bounds:

1. an end offset value
2. an end timestamp value
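For illustration, a bounded Kafka table might then be declared with options
along these lines (a sketch only; the option names here are hypothetical
illustrations, not a committed API):

```sql
CREATE TABLE orders (
  uid BIGINT,
  amount DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  -- hypothetical bounds: stop reading at an end offset or an end timestamp
  'scan.bounded.mode' = 'timestamp',
  'scan.bounded.timestamp-millis' = '1640000000000'
);
```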

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-7129) Support dynamically changing CEP patterns

2021-12-16 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460504#comment-17460504
 ] 

ZhuoYu Chen commented on FLINK-7129:


Hello [~MartijnVisser], I saw the email about the discussion and wanted to
share my opinion, but the page redirected to:

!image-2021-12-16-16-59-22-133.png!

How do I register for a username and password?

> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, stale-minor
> Attachments: image-2021-12-16-16-59-22-133.png
>
>
> An umbrella task for introducing mechanism for injecting patterns through 
> coStream
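The co-stream idea above can be sketched in plain Python (an illustration of
the control-stream pattern, not Flink's CEP API): one input carries events,
the other carries pattern updates, and the operator keeps the current
pattern, here reduced to a simple predicate, as state.

```python
class DynamicPatternOperator:
    """Sketch of injecting patterns through a co-stream: pattern
    updates arrive on a control input and replace the active
    pattern used to filter the event input."""

    def __init__(self, initial_pattern):
        self.pattern = initial_pattern  # the currently active pattern

    def on_pattern(self, new_pattern):
        # control-stream element: swap in the new pattern
        self.pattern = new_pattern

    def on_event(self, event):
        # event-stream element: emit it only if it matches the pattern
        return event if self.pattern(event) else None
```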



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-7129) Support dynamically changing CEP patterns

2021-12-16 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-7129:
---
Attachment: image-2021-12-16-16-59-22-133.png

> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, stale-minor
> Attachments: image-2021-12-16-16-59-22-133.png
>
>
> An umbrella task for introducing mechanism for injecting patterns through 
> coStream



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-7129) Support dynamically changing CEP patterns

2021-12-16 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460488#comment-17460488
 ] 

ZhuoYu Chen edited comment on FLINK-7129 at 12/16/21, 8:31 AM:
---

[~MartijnVisser], thank you for your reply.

Please assign this to me; I will have enough time to work on it for a long
time to come.


was (Author: monster#12):
Hi [~MartijnVisser] You will assign it to me, and I have plenty of time to do 
that for a long time to come.

> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, stale-minor
>
> An umbrella task for introducing mechanism for injecting patterns through 
> coStream



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-7129) Support dynamically changing CEP patterns

2021-12-16 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460488#comment-17460488
 ] 

ZhuoYu Chen edited comment on FLINK-7129 at 12/16/21, 8:28 AM:
---

Hi [~MartijnVisser], could you assign it to me? I have plenty of time to work
on it for a long time to come.


was (Author: monster#12):
Hi [~MartijnVisser] You will assign it to me, and I will have plenty of time to 
work on it in the next few days.

> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, stale-minor
>
> An umbrella task for introducing mechanism for injecting patterns through 
> coStream



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-7129) Support dynamically changing CEP patterns

2021-12-16 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460488#comment-17460488
 ] 

ZhuoYu Chen commented on FLINK-7129:


Hi [~MartijnVisser], could you assign it to me? I will have plenty of time to
work on it in the next few days.

> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, stale-minor
>
> An umbrella task for introducing mechanism for injecting patterns through 
> coStream



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25286) Improve connector testing framework to support more scenarios

2021-12-15 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460438#comment-17460438
 ] 

ZhuoYu Chen commented on FLINK-25286:
-

Hi [~renqs], I've done a lot of custom work on connectors at my job and I'm
very interested in the issues you mentioned. Is there any work on this issue
I can get involved in?

> Improve connector testing framework to support more scenarios
> -
>
> Key: FLINK-25286
> URL: https://issues.apache.org/jira/browse/FLINK-25286
> Project: Flink
>  Issue Type: Improvement
>  Components: Test Infrastructure
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently the connector testing framework only supports tests for DataStream
> sources, and the available scenarios are quite limited by the current
> interface design.
> This ticket proposes to make improvements to the connector testing framework
> to support more test scenarios, and to add test suites for sink and Table/SQL
> API.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25218) Performance issues with lookup join accessing external dimension tables

2021-12-09 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456397#comment-17456397
 ] 

ZhuoYu Chen commented on FLINK-25218:
-

[~MartijnVisser] Hello, please wait a moment: in the next few days I will
organize my recent research on lookup optimization and then post it to Jira.

>  Performance issues with lookup join accessing external dimension tables
> 
>
> Key: FLINK-25218
> URL: https://issues.apache.org/jira/browse/FLINK-25218
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> Current lookup join: for each input record, the external dimension table is
> accessed to get the result and one record is output.
> Implement a lookup join that improves performance through batching and delaying.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25232) flink sql supports fine-grained state configuration

2021-12-09 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25232:
---

 Summary: flink sql supports fine-grained state configuration
 Key: FLINK-25232
 URL: https://issues.apache.org/jira/browse/FLINK-25232
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: ZhuoYu Chen


In production, I found that if I configure the state TTL in SQL, the TTL
takes effect globally.
Consider the following SQL, grouped by region:
select count(1),region from (select * from A join B on a.uid = b.uid)
If I configure a global TTL, the state of the count's GroupAggFunction is
also eliminated; for example, the accumulated value is cleared after one day.
If I don't configure it, the state of the regular join keeps growing.
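For context, the global knob looks like this in the SQL client (a sketch
assuming the standard SQL-client syntax; the one option necessarily covers
both the join state and the aggregation state of the query):

```sql
-- one global TTL for every stateful operator in the pipeline
SET 'table.exec.state.ttl' = '1 d';

SELECT COUNT(1), region
FROM (SELECT * FROM A JOIN B ON a.uid = b.uid)
GROUP BY region;
```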



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25218) Performance issues with lookup join accessing external dimension tables

2021-12-07 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25218:

Affects Version/s: 1.15.0

>  Performance issues with lookup join accessing external dimension tables
> 
>
> Key: FLINK-25218
> URL: https://issues.apache.org/jira/browse/FLINK-25218
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> Current lookup join: for each input record, the external dimension table is
> accessed to get the result and one record is output.
> Implement a lookup join that improves performance through batching and delaying.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25218) Performance issues with lookup join accessing external dimension tables

2021-12-07 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25218:
---

 Summary:  Performance issues with lookup join accessing external 
dimension tables
 Key: FLINK-25218
 URL: https://issues.apache.org/jira/browse/FLINK-25218
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: ZhuoYu Chen


Current lookup join: for each input record, the external dimension table is
accessed to get the result and one record is output.

Implement a lookup join that improves performance through batching and delaying.
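The batching-and-delaying idea can be sketched in plain Python (all names
here are illustrative, not Flink API): records are buffered, and the
dimension table is queried once per batch instead of once per record,
flushing when the batch is full or a maximum delay has elapsed.

```python
import time

class BatchingLookup:
    """Sketch of a batched, delayed lookup join: one round trip to
    the external dimension table per batch of input records."""

    def __init__(self, lookup_fn, max_batch=100, max_delay_s=0.05):
        self.lookup_fn = lookup_fn    # queries the dimension table for a batch of keys
        self.max_batch = max_batch
        self.max_delay_s = max_delay_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def process(self, record):
        self.buffer.append(record)
        if (len(self.buffer) >= self.max_batch
                or time.monotonic() - self.last_flush >= self.max_delay_s):
            return self.flush()
        return []  # output is delayed until the batch flushes

    def flush(self):
        if not self.buffer:
            return []
        keys = [r["uid"] for r in self.buffer]
        dims = self.lookup_fn(keys)   # single batched access
        joined = [{**r, **dims.get(r["uid"], {})} for r in self.buffer]
        self.buffer.clear()
        self.last_flush = time.monotonic()
        return joined
```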



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25213) Add @Public annotations to flink-table-common

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454598#comment-17454598
 ] 

ZhuoYu Chen commented on FLINK-25213:
-

[~twalthr] Hello, please assign this sub-task to me and I will try to finish it.

> Add @Public annotations to flink-table-common
> -
>
> Key: FLINK-25213
> URL: https://issues.apache.org/jira/browse/FLINK-25213
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25213) Add @Public annotations to flink-table-common

2021-12-07 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25213:
---

 Summary: Add @Public annotations to flink-table-common
 Key: FLINK-25213
 URL: https://issues.apache.org/jira/browse/FLINK-25213
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-21884) Reduce TaskManager failure detection time

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454589#comment-17454589
 ] 

ZhuoYu Chen edited comment on FLINK-21884 at 12/7/21, 11:16 AM:


[~trohrmann] 
As an HDFS cluster ages, some nodes will inevitably suffer "performance
degradation", mainly in the form of slow disk reads/writes and slow network
transfers; we collectively call these slow nodes. Once a cluster grows to a
certain size, such as thousands of nodes, slow nodes are usually not easy to
detect. Most of the time they hide among many healthy nodes, and they are
only noticed when a client frequently accesses them and finds that reads and
writes are slow.

*Network slow monitoring principle*

To detect slow DN network transfers, each DN records the data transmission
time to every other DN in the cluster, identifies outliers, and reports them
to the NN as slow nodes. Under normal circumstances the transmission rates
between nodes are roughly the same; if transmission from A to B takes an
abnormally long time, A reports B to the NN as a slow node. The NN aggregates
all slow node reports to decide whether a node really is slow. For example,
if node X is faulty, many nodes will report X as slow, and the accumulated
reports on the NN side indicate that X is a slow node.

!https://img-blog.csdnimg.cn/img_convert/10b1676de9d4cbdc73521a21fa36c104.png!

To compute the average elapsed time of data transmission from a DN to its
downstream peers, the DN maintains a HashMap<String, Queue<SampleStat>>: the
key is the IP of the downstream DN and the value is a queue of SampleStat
objects recording the number of packets and the total elapsed time. By
default, the DN takes a snapshot every five minutes, generating a SampleStat
that records the network situation of data transmission to that downstream DN
during those five minutes and storing it in the HashMap's queue.

The queue has a fixed length of 36. When it is full, the oldest member is
evicted so the new SampleStat can be appended. This means only the last 3
hours (36 x 5 = 180 min) of network transmission are monitored, which keeps
the monitoring data fresh.


was (Author: monster#12):
[~trohrmann] 
HDFS cluster with the growth of time, there will inevitably be some 
"performance degradation" of the node, mainly manifested as disk read and write 
slow, network transmission slow, we collectively call these nodes slow nodes. 
When a cluster grows to a certain size, such as a cluster of thousands of 
nodes, slow nodes are usually not easily detected. Most of the time, slow nodes 
are hidden in many healthy nodes, and only when the client frequently accesses 
these problematic nodes and finds that the read/write is slow, will it be 
perceived.

*Network slow monitoring principle*

The principle of monitoring a DN network transmission slowdown is to record the 
data transmission time between each DN in the cluster, find out the abnormal 
value and report it to NN as a slow node, under normal circumstances the 
transmission rate between nodes is basically the same, not too much difference, 
if there is an abnormal transmission time of A to B, A will report B to NN as a 
slow node, NN side will determine whether there is a slow node after 
aggregating all the slow node reports. The NN will determine whether there is a 
slow node after aggregating all the slow node reports. For example, if node X 
is a faulty node, then there must be many nodes reporting to NN that X is a 
slow node, and eventually there are similar slow node reports at the NN side, 
indicating that X is a slow node.

In order to calculate the average elapsed time of data transmission from DN to 
downstream, DN maintains a HashMap<String, Queue<SampleStat>>, the key of which 
is the ip of the downstream DN and the value is a queue of SampleStat objects. 
The number of packets and the total elapsed time are recorded. By default, the 
DN does a snapshot every five minutes to generate a SampleStat, which is used 
to record the network situation of data transmission to the downstream DN in 
these five minutes and stored in the Queue of the HashMap.

The Queue of HashMap is a fixed-size queue with a queue length of 36. If the 
queue is full, it will kick off the first member of the queue to add the new 
SampleStat to the end of the queue. This means that we will only monitor the 
last 3 hours (36 X 5 = 180min) of network transmission, in order to ensure that 
the monitoring data is time-sensitive.

> Reduce Task

[jira] [Comment Edited] (FLINK-21884) Reduce TaskManager failure detection time

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454589#comment-17454589
 ] 

ZhuoYu Chen edited comment on FLINK-21884 at 12/7/21, 11:16 AM:


[~trohrmann] 
As an HDFS cluster ages, some nodes will inevitably suffer "performance
degradation", mainly in the form of slow disk reads/writes and slow network
transfers; we collectively call these slow nodes. Once a cluster grows to a
certain size, such as thousands of nodes, slow nodes are usually not easy to
detect. Most of the time they hide among many healthy nodes, and they are
only noticed when a client frequently accesses them and finds that reads and
writes are slow.

*Network slow monitoring principle*

To detect slow DN network transfers, each DN records the data transmission
time to every other DN in the cluster, identifies outliers, and reports them
to the NN as slow nodes. Under normal circumstances the transmission rates
between nodes are roughly the same; if transmission from A to B takes an
abnormally long time, A reports B to the NN as a slow node. The NN aggregates
all slow node reports to decide whether a node really is slow. For example,
if node X is faulty, many nodes will report X as slow, and the accumulated
reports on the NN side indicate that X is a slow node.

To compute the average elapsed time of data transmission from a DN to its
downstream peers, the DN maintains a HashMap<String, Queue<SampleStat>>: the
key is the IP of the downstream DN and the value is a queue of SampleStat
objects recording the number of packets and the total elapsed time. By
default, the DN takes a snapshot every five minutes, generating a SampleStat
that records the network situation of data transmission to that downstream DN
during those five minutes and storing it in the HashMap's queue.

The queue has a fixed length of 36. When it is full, the oldest member is
evicted so the new SampleStat can be appended. This means only the last 3
hours (36 x 5 = 180 min) of network transmission are monitored, which keeps
the monitoring data fresh.


was (Author: monster#12):
[~trohrmann] 
HDFS cluster with the growth of time, there will inevitably be some 
"performance degradation" of the node, mainly manifested as disk read and write 
slow, network transmission slow, we collectively call these nodes slow nodes. 
When a cluster grows to a certain size, such as a cluster of thousands of 
nodes, slow nodes are usually not easily detected. Most of the time, slow nodes 
are hidden in many healthy nodes, and only when the client frequently accesses 
these problematic nodes and finds that the read/write is slow, will it be 
perceived.

The principle of monitoring a DN network transmission slowdown is to record the 
data transmission time between each DN in the cluster, find out the abnormal 
value and report it to NN as a slow node, under normal circumstances the 
transmission rate between nodes is basically the same, not too much difference, 
if there is an abnormal transmission time of A to B, A will report B to NN as a 
slow node, NN side will determine whether there is a slow node after 
aggregating all the slow node reports. The NN will determine whether there is a 
slow node after aggregating all the slow node reports. For example, if node X 
is a faulty node, then there must be many nodes reporting to NN that X is a 
slow node, and eventually there are similar slow node reports at the NN side, 
indicating that X is a slow node.


In order to calculate the average elapsed time of data transmission from DN to 
downstream, DN maintains a HashMap<String, Queue<SampleStat>>, the key of which 
is the ip of the downstream DN and the value is a queue of SampleStat objects. 
The number of packets and the total elapsed time are recorded. By default, the 
DN does a snapshot every five minutes to generate a SampleStat, which is used 
to record the network situation of data transmission to the downstream DN in 
these five minutes and stored in the Queue of the HashMap.

The Queue of HashMap is a fixed-size queue with a queue length of 36. If the 
queue is full, it will kick off the first member of the queue to add the new 
SampleStat to the end of the queue. This means that we will only monitor the 
last 3 hours (36 X 5 = 180min) of network transmission, in order to ensure that 
the monitoring data is time-sensitive.

> Reduce TaskManager failure detection time
> -
>
> Key: FLINK-21884
>   

[jira] [Commented] (FLINK-21884) Reduce TaskManager failure detection time

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454589#comment-17454589
 ] 

ZhuoYu Chen commented on FLINK-21884:
-

[~trohrmann] 
As an HDFS cluster ages, some nodes will inevitably suffer "performance
degradation", mainly in the form of slow disk reads/writes and slow network
transfers; we collectively call these slow nodes. Once a cluster grows to a
certain size, such as thousands of nodes, slow nodes are usually not easy to
detect. Most of the time they hide among many healthy nodes, and they are
only noticed when a client frequently accesses them and finds that reads and
writes are slow.

To detect slow DN network transfers, each DN records the data transmission
time to every other DN in the cluster, identifies outliers, and reports them
to the NN as slow nodes. Under normal circumstances the transmission rates
between nodes are roughly the same; if transmission from A to B takes an
abnormally long time, A reports B to the NN as a slow node. The NN aggregates
all slow node reports to decide whether a node really is slow. For example,
if node X is faulty, many nodes will report X as slow, and the accumulated
reports on the NN side indicate that X is a slow node.

To compute the average elapsed time of data transmission from a DN to its
downstream peers, the DN maintains a HashMap<String, Queue<SampleStat>>: the
key is the IP of the downstream DN and the value is a queue of SampleStat
objects recording the number of packets and the total elapsed time. By
default, the DN takes a snapshot every five minutes, generating a SampleStat
that records the network situation of data transmission to that downstream DN
during those five minutes and storing it in the HashMap's queue.

The queue has a fixed length of 36. When it is full, the oldest member is
evicted so the new SampleStat can be appended. This means only the last 3
hours (36 x 5 = 180 min) of network transmission are monitored, which keeps
the monitoring data fresh.
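The windowed bookkeeping described above can be sketched as follows (a
plain-Python illustration; the outlier threshold and all names are made up
for the example, not the actual HDFS implementation):

```python
from collections import deque
from statistics import median

WINDOW = 36  # 36 snapshots x 5 min = last 3 hours

class PeerStats:
    """Per-downstream-node transfer samples over a fixed window."""

    def __init__(self):
        self.snapshots = deque(maxlen=WINDOW)  # oldest snapshot falls off automatically

    def add_snapshot(self, packets, total_time_ms):
        self.snapshots.append((packets, total_time_ms))

    def avg_time_per_packet(self):
        packets = sum(p for p, _ in self.snapshots)
        total = sum(t for _, t in self.snapshots)
        return total / packets if packets else 0.0

def slow_peers(stats_by_ip, factor=3.0):
    """Flag peers whose average transfer time is an outlier versus
    the median across all peers (the factor is a guessed threshold)."""
    avgs = {ip: s.avg_time_per_packet() for ip, s in stats_by_ip.items()}
    base = median(avgs.values())
    return [ip for ip, a in avgs.items() if base > 0 and a > factor * base]
```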

> Reduce TaskManager failure detection time
> -
>
> Key: FLINK-21884
> URL: https://issues.apache.org/jira/browse/FLINK-21884
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.13.2
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: reactive
> Fix For: 1.15.0
>
> Attachments: image-2021-03-19-20-10-40-324.png
>
>
> In Flink 1.13 (and older versions), TaskManager failures stall the processing 
> for a significant amount of time, even though the system gets indications for 
> the failure almost immediately through network connection losses.
> This is due to a high (default) heartbeat timeout of 50 seconds [1] to 
> accommodate for GC pauses, transient network disruptions or generally slow 
> environments (otherwise, we would unregister a healthy TaskManager).
> Such a high timeout can lead to disruptions in the processing (no processing 
> for certain periods, high latencies, buildup of consumer lag etc.). In 
> Reactive Mode (FLINK-10407), the issue surfaces on scale-down events, where 
> the loss of a TaskManager is immediately visible in the logs, but the job is 
> stuck in "FAILING" for quite a while until the TaskManger is really 
> deregistered. (Note that this issue is not that critical in a autoscaling 
> setup, because Flink can control the scale-down events and trigger them 
> proactively)
> On the attached metrics dashboard, one can see that the job has significant 
> throughput drops / consumer lags during scale down (and also CPU usage spikes 
> on processing the queued events, leading to incorrect scale up events again).
>  !image-2021-03-19-20-10-40-324.png|thumbnail!
> One idea to solve this problem is to:
> - Score TaskManagers based on certain signals (# exceptions reported, 
> exception types (connection losses, akka failures), failure frequencies,  
> ...) and blacklist them accordingly.
> - Introduce a best-effort TaskManager unregistration mechanism: When a 
> TaskManager receives a sigterm, it sends a final message to the JobManager 
> saying "goodbye", and the JobManager can immediately remove the TM from its 
> bookkeeping.
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#heartbeat-timeout



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-21373) Port RabbitMQ Sink to FLIP-143 API

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454587#comment-17454587
 ] 

ZhuoYu Chen commented on FLINK-21373:
-

[~westphal-jan] Hello, I would like to know how this is progressing.

> Port RabbitMQ Sink to FLIP-143 API
> --
>
> Key: FLINK-21373
> URL: https://issues.apache.org/jira/browse/FLINK-21373
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / RabbitMQ
>Reporter: Jan Westphal
>Priority: Minor
>  Labels: auto-unassigned
> Fix For: 1.12.0
>
>
> *Structure*
> The unified Sink API provides a Writer, a Committer and a GlobalCommitter.
> Right now we don't see the need to use the Committer and GlobalCommitter, as
> the Writer is sufficient to uphold the consistency guarantees. Since we need
> asynchronous RabbitMQ callbacks to know whether or not a message was
> published successfully, and have to store unacknowledged messages in the
> checkpoint, there would be a large bidirectional communication and state
> exchange overhead between the Writer and the Committer.
> *At-most-once*
> The Writer receives a message from Flink and simply publishes it to RabbitMQ. 
> The current RabbitMQ Sink only provides this mode.
> *At-least-once*
> The objective here is, to receive an acknowledgement by RabbitMQ for 
> published messages. Therefore, before publishing a message, we store the 
> message in a Map with the sequence number as its key. If the message is 
> acknowledged by RabbitMQ we can remove it from the Map. If we don’t receive 
> an acknowledgement for a certain amount of time (or a RabbitMQ specific so 
> called negative acknowledgement)  we will try to resend the message when 
> doing a checkpoint.
> *Exactly-once*
> On checkpointing, we send all messages received from Flink to RabbitMQ in
> transaction mode. This way, all the messages are either sent or rolled back
> on failure. All messages that are not sent successfully are written to the
> checkpoint and retried with the next checkpoint.
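The at-least-once bookkeeping described above can be sketched like this (a
plain-Python illustration; the publish/ack plumbing is stubbed out and is not
the real RabbitMQ client API, which would use publisher confirms):

```python
class AtLeastOnceWriter:
    """Sketch of the at-least-once scheme: unacknowledged messages
    are kept in a map keyed by publish sequence number and are
    re-sent on checkpoint."""

    def __init__(self, publish):
        self.publish = publish  # stand-in for the broker publish call
        self.seq = 0
        self.pending = {}       # sequence number -> message awaiting ack

    def write(self, message):
        self.seq += 1
        self.pending[self.seq] = message
        self.publish(self.seq, message)

    def on_ack(self, seq):
        self.pending.pop(seq, None)  # confirmed by the broker, forget it

    def on_checkpoint(self):
        # re-send everything still unconfirmed
        for seq, message in list(self.pending.items()):
            self.publish(seq, message)
```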



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24483) Document what is Public API and what compatibility guarantees Flink is providing

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454581#comment-17454581
 ] 

ZhuoYu Chen commented on FLINK-24483:
-

[~trohrmann] Is the deliverable of this issue a document?

> Document what is Public API and what compatibility guarantees Flink is 
> providing
> 
>
> Key: FLINK-24483
> URL: https://issues.apache.org/jira/browse/FLINK-24483
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation, Table SQL / API
>Affects Versions: 1.14.0, 1.12.5, 1.13.2
>Reporter: Piotr Nowojski
>Priority: Blocker
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
>
> We should document:
> * What constitutes the Public API, and what the
> Public/PublicEvolving/Experimental/Internal annotations mean.
> * What compatibility guarantees we are providing: forward (backward?)
> functional/compile/binary compatibility for {{@Public}} interfaces?
> A good starting point: 
> https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=44302796#content/view/62686683



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25187) Apply padding for BINARY()

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454543#comment-17454543
 ] 

ZhuoYu Chen commented on FLINK-25187:
-

Hi [~matriv], I am very interested in this and would like to do some work for 
Flink. Can I help with it?
Thank you

> Apply padding for BINARY()
> -
>
> Key: FLINK-25187
> URL: https://issues.apache.org/jira/browse/FLINK-25187
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Marios Trivyzas
>Priority: Major
>
> When the resulting byte array that is generated for a *CAST(XXX AS 
> BINARY()* has *length* < {*}precision{*}, it should be padded 
> with *0* to the right, to end up with a byte array of *precision* length, 
> similarly to the padding with spaces for {*}CHAR(){*}.
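The padding rule described above can be sketched in a few lines of Java. This is a minimal illustration, not Flink's actual implementation; `padBinary` is a hypothetical helper, and truncation of over-long inputs is out of scope here:

```java
import java.util.Arrays;

/** Sketch of right-padding a byte array with zeros up to the declared precision. */
class BinaryPadding {
    static byte[] padBinary(byte[] bytes, int precision) {
        if (bytes.length >= precision) {
            return bytes; // already long enough; truncation handling omitted
        }
        // Arrays.copyOf zero-fills the tail of the enlarged array.
        return Arrays.copyOf(bytes, precision);
    }
}
```

For example, casting a two-byte value to BINARY(4) would yield the original two bytes followed by two zero bytes, mirroring the space padding applied for CHAR.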



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-7129) Support dynamically changing CEP patterns

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454527#comment-17454527
 ] 

ZhuoYu Chen commented on FLINK-7129:


Hi Dawid Wysakowicz, I was interested to see the issue you posted. Are there 
any plans to split the task? Can I help with some of the work?

> Support dynamically changing CEP patterns
> -
>
> Key: FLINK-7129
> URL: https://issues.apache.org/jira/browse/FLINK-7129
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, stale-minor
>
> An umbrella task for introducing mechanism for injecting patterns through 
> coStream



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25208) AsyncSink/Source lack of documentation or blog

2021-12-07 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25208:
---

 Summary: AsyncSink/Source lack of documentation or blog
 Key: FLINK-25208
 URL: https://issues.apache.org/jira/browse/FLINK-25208
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25196) Add Documentation for Data Sink

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454511#comment-17454511
 ] 

ZhuoYu Chen commented on FLINK-25196:
-

[~MartijnVisser]  The sink part is missing

> Add Documentation for Data Sink
> ---
>
> Key: FLINK-25196
> URL: https://issues.apache.org/jira/browse/FLINK-25196
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
> Attachments: image-2021-12-07-17-50-02-896.png, 
> image-2021-12-07-17-50-57-175.png
>
>
> FLIP-143: Unified Sink API Documentation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25196) Add Documentation for Data Sink

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454509#comment-17454509
 ] 

ZhuoYu Chen edited comment on FLINK-25196 at 12/7/21, 9:51 AM:
---

[~MartijnVisser] 
!image-2021-12-07-17-50-57-175.png!!image-2021-12-07-17-50-02-896.png!


was (Author: monster#12):
[~MartijnVisser] 
!c:\Users\monster\AppData\Local\Temp\%E4%BC%81%E4%B8%9A%E5%BE%AE%E4%BF%A1%E6%88%AA%E5%9B%BE_16388704556879.png!

> Add Documentation for Data Sink
> ---
>
> Key: FLINK-25196
> URL: https://issues.apache.org/jira/browse/FLINK-25196
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
> Attachments: image-2021-12-07-17-50-02-896.png, 
> image-2021-12-07-17-50-57-175.png
>
>
> FLIP-143: Unified Sink API Documentation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25196) Add Documentation for Data Sink

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454509#comment-17454509
 ] 

ZhuoYu Chen edited comment on FLINK-25196 at 12/7/21, 9:49 AM:
---

[~MartijnVisser] 
!c:\Users\monster\AppData\Local\Temp\%E4%BC%81%E4%B8%9A%E5%BE%AE%E4%BF%A1%E6%88%AA%E5%9B%BE_16388704556879.png!


was (Author: monster#12):
[~MartijnVisser] 

> Add Documentation for Data Sink
> ---
>
> Key: FLINK-25196
> URL: https://issues.apache.org/jira/browse/FLINK-25196
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> FLIP-143: Unified Sink API Documentation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25196) Add Documentation for Data Sink

2021-12-07 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454509#comment-17454509
 ] 

ZhuoYu Chen commented on FLINK-25196:
-

[~MartijnVisser] 

> Add Documentation for Data Sink
> ---
>
> Key: FLINK-25196
> URL: https://issues.apache.org/jira/browse/FLINK-25196
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> FLIP-143: Unified Sink API Documentation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25158) Fix formatting for true, false and null to uppercase

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454072#comment-17454072
 ] 

ZhuoYu Chen commented on FLINK-25158:
-

Hi [~slinkydeveloper], I am very interested in this and would like to do some 
work for Flink. Can I help with it?
Thank you

> Fix formatting for true, false and null to uppercase
> 
>
> Key: FLINK-25158
> URL: https://issues.apache.org/jira/browse/FLINK-25158
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Francesco Guardiani
>Priority: Major
>
> All the cast rules using the constant strings {{true}}, {{false}} and 
> {{null}} should use {{TRUE}}, {{FALSE}} and {{NULL}} instead.
> This behavior should be enabled only if legacy behavior is not enabled



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25187) Apply padding for BINARY()

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454069#comment-17454069
 ] 

ZhuoYu Chen commented on FLINK-25187:
-

[~matriv] Can you briefly describe the problem?

 

> Apply padding for BINARY()
> -
>
> Key: FLINK-25187
> URL: https://issues.apache.org/jira/browse/FLINK-25187
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Marios Trivyzas
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25154) FLIP-193: Snapshots ownership

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454067#comment-17454067
 ] 

ZhuoYu Chen commented on FLINK-25154:
-

Hi [~dwysakowicz], I was interested to see the issue you posted. Is there a 
sub-task I can participate in? I will try to do a good job!

> FLIP-193: Snapshots ownership
> -
>
> Key: FLINK-25154
> URL: https://issues.apache.org/jira/browse/FLINK-25154
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Reporter: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.15.0
>
>
> Task for implementing FLIP-193: https://cwiki.apache.org/confluence/x/bIyqCw



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25152) FLIP-188: Introduce Built-in Dynamic Table Storage

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454064#comment-17454064
 ] 

ZhuoYu Chen edited comment on FLINK-25152 at 12/6/21, 3:07 PM:
---

Hi [~lzljs3620320], I was interested to see the issue you posted. Is there a 
sub-task I can participate in? I will try to do a good job!


was (Author: monster#12):
Hello [~lzljs3620320] , I was interested to see the issues you posted, is there 
a sub-task that I can participate in? I will try to do a good job!

> FLIP-188: Introduce Built-in Dynamic Table Storage
> --
>
> Key: FLINK-25152
> URL: https://issues.apache.org/jira/browse/FLINK-25152
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Ecosystem
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce built-in storage support for dynamic tables: a truly unified 
> changelog & table representation from Flink SQL's perspective. This storage 
> will improve usability a lot.
> More detail see: 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25152) FLIP-188: Introduce Built-in Dynamic Table Storage

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454064#comment-17454064
 ] 

ZhuoYu Chen commented on FLINK-25152:
-

Hello [~lzljs3620320], I was interested to see the issue you posted. Is there 
a sub-task I can participate in? I will try to do a good job!

> FLIP-188: Introduce Built-in Dynamic Table Storage
> --
>
> Key: FLINK-25152
> URL: https://issues.apache.org/jira/browse/FLINK-25152
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Ecosystem
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce built-in storage support for dynamic tables: a truly unified 
> changelog & table representation from Flink SQL's perspective. This storage 
> will improve usability a lot.
> More detail see: 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25173) Introduce CatalogLock

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454061#comment-17454061
 ] 

ZhuoYu Chen commented on FLINK-25173:
-

Hi [~lzljs3620320], I am very interested in this and would like to do some 
work for Flink. Can I help with it?
Thank you

> Introduce CatalogLock
> -
>
> Key: FLINK-25173
> URL: https://issues.apache.org/jira/browse/FLINK-25173
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive, Table SQL / API
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, only HiveCatalog can provide this catalog lock.
> {code:java}
> /**
>  * An interface that allows source and sink to use a global lock for some
>  * transaction-related things.
>  */
> @Internal
> public interface CatalogLock extends Closeable {
>
>     /** Run with the catalog lock. The caller should tell the catalog the
>      * database and table name. */
>     <T> T runWithLock(String database, String table, Callable<T> callable)
>             throws Exception;
>
>     /** Factory to create {@link CatalogLock}. */
>     interface Factory extends Serializable {
>         CatalogLock create();
>     }
> } {code}
> And we need an interface to set the lock on source & sink via the catalog:
> {code:java}
> /**
>  * Source and sink implement this interface if they require {@link
>  * CatalogLock}. This is marked as internal. If we need the lock to be more
>  * general, we can put the lock factory into {@link
>  * DynamicTableFactory.Context}.
>  */
> @Internal
> public interface RequireCatalogLock {
>
>     void setLockFactory(CatalogLock.Factory lockFactory);
> } {code}
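To make the intent concrete, here is a hypothetical sketch of how a sink might use these interfaces. The two interfaces mirror the description above (minus the `@Internal` annotation, which lives in Flink); `InMemoryLock`, `MySink`, and the commit logic are invented purely for illustration:

```java
import java.io.Closeable;
import java.io.Serializable;
import java.util.concurrent.Callable;

interface CatalogLock extends Closeable {
    <T> T runWithLock(String database, String table, Callable<T> callable) throws Exception;

    interface Factory extends Serializable {
        CatalogLock create();
    }
}

interface RequireCatalogLock {
    void setLockFactory(CatalogLock.Factory lockFactory);
}

/** Trivial stand-in for a catalog-backed lock, for illustration only. */
class InMemoryLock implements CatalogLock {
    @Override
    public <T> T runWithLock(String database, String table, Callable<T> callable) throws Exception {
        synchronized (InMemoryLock.class) { // one global lock; a real catalog would lock per table
            return callable.call();
        }
    }

    @Override
    public void close() {}
}

/** Hypothetical sink that commits under the catalog lock. */
class MySink implements RequireCatalogLock {
    private CatalogLock.Factory lockFactory;

    @Override
    public void setLockFactory(CatalogLock.Factory lockFactory) {
        this.lockFactory = lockFactory;
    }

    long commit(String database, String table) {
        try (CatalogLock lock = lockFactory.create()) {
            // Run the transaction-related work while holding the lock,
            // so concurrent writers to the same table cannot interleave.
            return lock.runWithLock(database, table, () -> 42L /* do the actual commit */);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The factory indirection keeps the lock serializable-friendly: the catalog hands the sink a `Factory`, and the lock itself is only created where it is used.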



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25169) CsvFileSystemFormatFactory#CsvInputFormat supports recursion

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454059#comment-17454059
 ] 

ZhuoYu Chen commented on FLINK-25169:
-

[~Bo Cui] By recursive support, do you mean recursively reading the contents 
of a folder?

> CsvFileSystemFormatFactory#CsvInputFormat supports recursion
> 
>
> Key: FLINK-25169
> URL: https://issues.apache.org/jira/browse/FLINK-25169
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0
>Reporter: Bo Cui
>Priority: Major
>
> https://github.com/apache/flink/blob/ca4fbd10a1e8919c48e602640bc3238648cc48bb/flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvFileSystemFormatFactory.java#L117
> Why CsvFileSystemFormatFactory#CsvInputFormat doesn't support file recursion?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25196) Add Documentation for Data Sink

2021-12-06 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25196:

Description: 
FLIP-143 :Unified Sink API Documentataion
h1.

> Add Documentation for Data Sink
> ---
>
> Key: FLINK-25196
> URL: https://issues.apache.org/jira/browse/FLINK-25196
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Documentation
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> FLIP-143 :Unified Sink API Documentataion
> h1.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25196) Add Documentation for Data Sink

2021-12-06 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25196:
---

 Summary: Add Documentation for Data Sink
 Key: FLINK-25196
 URL: https://issues.apache.org/jira/browse/FLINK-25196
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, Documentation
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-20188) Add Documentation for new File Source

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454003#comment-17454003
 ] 

ZhuoYu Chen edited comment on FLINK-20188 at 12/6/21, 12:58 PM:


[~MartijnVisser] Then let's set it for the 15th of December. Does that work for you?


was (Author: monster#12):
Then let's set the 15th of December, do you think it's okay?

> Add Documentation for new File Source
> -
>
> Key: FLINK-20188
> URL: https://issues.apache.org/jira/browse/FLINK-20188
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, Documentation
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Stephan Ewen
>Assignee: ZhuoYu Chen
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-16-11-42-32-957.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-20188) Add Documentation for new File Source

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454003#comment-17454003
 ] 

ZhuoYu Chen commented on FLINK-20188:
-

Then let's set it for the 15th of December. Does that work for you?

> Add Documentation for new File Source
> -
>
> Key: FLINK-20188
> URL: https://issues.apache.org/jira/browse/FLINK-20188
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, Documentation
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Stephan Ewen
>Assignee: ZhuoYu Chen
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-16-11-42-32-957.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25170) cep supports dynamic rule updates

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454000#comment-17454000
 ] 

ZhuoYu Chen commented on FLINK-25170:
-

[~nicholasjiang] I have always believed that CEP is one of Flink's most 
important features, and CEP will be an important direction for Flink's future 
development.

> cep supports dynamic rule updates
> -
>
> Key: FLINK-25170
> URL: https://issues.apache.org/jira/browse/FLINK-25170
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: ZhuoYu Chen
>Priority: Major
>
> External loading: a rule engine usually has a dedicated rule-management 
> module that lets users create their own rules, so Flink tasks need to load 
> rules from an external source.
> Dynamic update: a mechanism is needed to periodically detect whether the 
> rules have changed.
> History state cleanup: pattern matching produces a series of NFAState 
> instances; if the rules change, these states become useless and need to be 
> cleaned up.
> Easy API: different business developers may have their own rule management, 
> timing policies, etc., so an easy-to-use API needs to be exposed to them.
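The lifecycle sketched in the description (load rules externally, detect changes on a timer, drop stale matching state) can be illustrated in plain Java. All names here are hypothetical; a real implementation would live inside the CEP operator and clear NFA state through Flink's state backend rather than a local map:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative sketch of dynamic rule loading, update detection, and state cleanup. */
class DynamicRuleManager {
    private final Supplier<String> externalRuleSource; // "external loading"
    private String currentRule;
    // Stand-in for the partial-match (NFA) state kept per key.
    private final Map<String, Object> partialMatchState = new HashMap<>();

    DynamicRuleManager(Supplier<String> externalRuleSource) {
        this.externalRuleSource = externalRuleSource;
        this.currentRule = externalRuleSource.get();
    }

    /** Called periodically ("dynamic update"): reload the rule and, if it
     *  changed, discard now-useless history state ("history state cleanup"). */
    boolean checkForUpdate() {
        String latest = externalRuleSource.get();
        if (!latest.equals(currentRule)) {
            currentRule = latest;
            partialMatchState.clear();
            return true;
        }
        return false;
    }

    void recordPartialMatch(String key, Object state) {
        partialMatchState.put(key, state);
    }

    int pendingMatches() {
        return partialMatchState.size();
    }
}
```

The `Supplier` is the "easy API" seam: different businesses can plug in their own rule store and polling policy without touching the matching logic.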



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25170) cep supports dynamic rule updates

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453999#comment-17453999
 ] 

ZhuoYu Chen commented on FLINK-25170:
-

[~nicholasjiang] If there is a need, I am fully willing to help and contribute.

> cep supports dynamic rule updates
> -
>
> Key: FLINK-25170
> URL: https://issues.apache.org/jira/browse/FLINK-25170
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Reporter: ZhuoYu Chen
>Priority: Major
>
> External loading: a rule engine usually has a dedicated rule-management 
> module that lets users create their own rules, so Flink tasks need to load 
> rules from an external source.
> Dynamic update: a mechanism is needed to periodically detect whether the 
> rules have changed.
> History state cleanup: pattern matching produces a series of NFAState 
> instances; if the rules change, these states become useless and need to be 
> cleaned up.
> Easy API: different business developers may have their own rule management, 
> timing policies, etc., so an easy-to-use API needs to be exposed to them.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-20188) Add Documentation for new File Source

2021-12-06 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453995#comment-17453995
 ] 

ZhuoYu Chen commented on FLINK-20188:
-

[~MartijnVisser] I hope you can give me your advice. I will read more 
information and source code in the next few days and try to finish this task as 
soon as possible.

> Add Documentation for new File Source
> -
>
> Key: FLINK-20188
> URL: https://issues.apache.org/jira/browse/FLINK-20188
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, Documentation
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Stephan Ewen
>Assignee: ZhuoYu Chen
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-16-11-42-32-957.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25170) cep supports dynamic rule updates

2021-12-05 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453768#comment-17453768
 ] 

ZhuoYu Chen commented on FLINK-25170:
-

Flink has broad application potential at the business layer, but CEP's 
rigidity seriously restricts its development.

> cep supports dynamic rule updates
> -
>
> Key: FLINK-25170
> URL: https://issues.apache.org/jira/browse/FLINK-25170
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> External loading: a rule engine usually has a dedicated rule-management 
> module that lets users create their own rules, so Flink tasks need to load 
> rules from an external source.
> Dynamic update: a mechanism is needed to periodically detect whether the 
> rules have changed.
> History state cleanup: pattern matching produces a series of NFAState 
> instances; if the rules change, these states become useless and need to be 
> cleaned up.
> Easy API: different business developers may have their own rule management, 
> timing policies, etc., so an easy-to-use API needs to be exposed to them.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25170) cep supports dynamic rule updates

2021-12-05 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25170:

Description: 
External loading: Usually the rule engine has a special rule management module 
to provide users to create their own rules, for Flink tasks need to go external 
to load rules

Dynamic update: need to provide timing to detect whether the rules change - 
History state cleanup: in pattern matching is a series of NFAState If the rules 
change, the states are useless and need to be cleaned up

Easy API: Different business developers may have their own rule management, 
timing policies, etc., so they need to provide an easy-to-use API to the 
outside world

  was:External loading: Usually the rule engine has a special rule management 
module to provide users to create their own rules, for Flink tasks need to go 
external to load rules - Dynamic update: need to provide timing to detect 
whether the rules change - History state cleanup: in pattern matching is a 
series of NFAState If the rules change, the states are useless and need to be 
cleaned up - Easy API: Different business developers may have their own rule 
management, timing policies, etc., so they need to provide an easy-to-use API 
to the outside world


> cep supports dynamic rule updates
> -
>
> Key: FLINK-25170
> URL: https://issues.apache.org/jira/browse/FLINK-25170
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> External loading: Usually the rule engine has a special rule management 
> module to provide users to create their own rules, for Flink tasks need to go 
> external to load rules
> Dynamic update: need to provide timing to detect whether the rules change - 
> History state cleanup: in pattern matching is a series of NFAState If the 
> rules change, the states are useless and need to be cleaned up
> Easy API: Different business developers may have their own rule management, 
> timing policies, etc., so they need to provide an easy-to-use API to the 
> outside world



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25170) cep supports dynamic rule updates

2021-12-05 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25170:

Description: 
External loading: Usually the rule engine has a special rule management module 
to provide users to create their own rules, for Flink tasks need to go external 
to load rules

Dynamic update: need to provide timing to detect whether the rules change

History state cleanup: in pattern matching is a series of NFAState If the rules 
change, the states are useless and need to be cleaned up

Easy API: Different business developers may have their own rule management, 
timing policies, etc., so they need to provide an easy-to-use API to the 
outside world

  was:
External loading: Usually the rule engine has a special rule management module 
to provide users to create their own rules, for Flink tasks need to go external 
to load rules

Dynamic update: need to provide timing to detect whether the rules change - 
History state cleanup: in pattern matching is a series of NFAState If the rules 
change, the states are useless and need to be cleaned up

Easy API: Different business developers may have their own rule management, 
timing policies, etc., so they need to provide an easy-to-use API to the 
outside world


> cep supports dynamic rule updates
> -
>
> Key: FLINK-25170
> URL: https://issues.apache.org/jira/browse/FLINK-25170
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> External loading: Usually the rule engine has a special rule management 
> module to provide users to create their own rules, for Flink tasks need to go 
> external to load rules
> Dynamic update: need to provide timing to detect whether the rules change
> History state cleanup: in pattern matching is a series of NFAState If the 
> rules change, the states are useless and need to be cleaned up
> Easy API: Different business developers may have their own rule management, 
> timing policies, etc., so they need to provide an easy-to-use API to the 
> outside world



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25170) cep supports dynamic rule updates

2021-12-05 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-25170:

Description: External loading: Usually the rule engine has a special rule 
management module to provide users to create their own rules, for Flink tasks 
need to go external to load rules - Dynamic update: need to provide timing to 
detect whether the rules change - History state cleanup: in pattern matching is 
a series of NFAState If the rules change, the states are useless and need to be 
cleaned up - Easy API: Different business developers may have their own rule 
management, timing policies, etc., so they need to provide an easy-to-use API 
to the outside world

> cep supports dynamic rule updates
> -
>
> Key: FLINK-25170
> URL: https://issues.apache.org/jira/browse/FLINK-25170
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / CEP
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Major
>
> External loading: Usually the rule engine has a special rule management 
> module to provide users to create their own rules, for Flink tasks need to go 
> external to load rules - Dynamic update: need to provide timing to detect 
> whether the rules change - History state cleanup: in pattern matching is a 
> series of NFAState If the rules change, the states are useless and need to be 
> cleaned up - Easy API: Different business developers may have their own rule 
> management, timing policies, etc., so they need to provide an easy-to-use API 
> to the outside world



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25170) cep supports dynamic rule updates

2021-12-05 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-25170:
---

 Summary: cep supports dynamic rule updates
 Key: FLINK-25170
 URL: https://issues.apache.org/jira/browse/FLINK-25170
 Project: Flink
  Issue Type: New Feature
  Components: Library / CEP
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-13395) Add source and sink connector for Alibaba Log Service

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452771#comment-17452771
 ] 

ZhuoYu Chen commented on FLINK-13395:
-

Hi [~liketic], I am very interested in this and would like to do some work 
for Flink. Can I help with it?
Thank you

> Add source and sink connector for Alibaba Log Service
> -
>
> Key: FLINK-13395
> URL: https://issues.apache.org/jira/browse/FLINK-13395
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Ke Li
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Alibaba Log Service is a big data service that is widely used in Alibaba 
> Group and by thousands of customers of Alibaba Cloud. The core storage 
> engine of Log Service, named Loghub, is a large-scale distributed storage 
> system that provides producers and consumers to push and pull data, as 
> Kafka, AWS Kinesis and Azure Event Hubs do.
> Log Service provides a complete solution to help users collect data from 
> both on-premise and cloud data sources. More than 10 PB of data is sent to 
> and consumed from Loghub every day, and hundreds of thousands of users have 
> implemented their DevOps and big data systems based on Log Service.
> Log Service and Flink/Blink have become the de facto standard big data 
> architecture for unified data processing in Alibaba Group and for more 
> users of Alibaba Cloud.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24152) Exactly-once semantics should be configurable through Flink configuration only

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452768#comment-17452768
 ] 

ZhuoYu Chen commented on FLINK-24152:
-

Hi [~mapohl], I am very interested in this and would like to do some work for 
Flink. Can I help with it?
Thank you

> Exactly-once semantics should be configurable through Flink configuration only
> --
>
> Key: FLINK-24152
> URL: https://issues.apache.org/jira/browse/FLINK-24152
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core, Connectors / Common
>Reporter: Matthias
>Priority: Major
>
> The exactly-once constraint needs to be configured in multiple locations: 
> 1. Flink configuration
> 2. Kafka connector configuration
> As a user, I would expect to set this configuration parameter in Flink. The 
> Kafka connector should be able to derive this parameter from the Flink 
> configuration by default. Setting it at the connector level could be 
> considered an advanced configuration parameter which overwrites the default 
> configuration.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24396) Add @Public annotations to Table & SQL API classes

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452765#comment-17452765
 ] 

ZhuoYu Chen commented on FLINK-24396:
-

[~twalthr] One subtask per module?

> Add @Public annotations to Table & SQL API classes
> --
>
> Key: FLINK-24396
> URL: https://issues.apache.org/jira/browse/FLINK-24396
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Major
>
> Many parts of the Table & SQL API have stabilized and we can mark them as 
> {{@Public}} which gives both users and downstream projects more confidence 
> when using Flink.
> A concrete list of classes and methods needs to be compiled. Some parts of 
> the API might stay {{@PublicEvolving}} for now.





[jira] [Updated] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-12-02 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-24959:

Description: 
BITMAP_AND: Computes the intersection of two input bitmaps and returns the new 
bitmap.
{code:java}
select bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))) cnt; {code}
TO_BITMAP: The input is a value in the range 0 ~ 18446744073709551615 (unsigned 
bigint); the output is a bitmap containing that element. This function is 
mainly used by stream load tasks to import integer fields into the bitmap 
field of a table.
{code:java}
select bitmap_count(to_bitmap(10)); {code}
BITMAP_ANDNOT: Computes the set (difference set) that is in A but not in B.
{code:java}
select bitmap_to_string(bitmap_andnot(bitmap_from_string('1, 3'), 
bitmap_from_string('2'))) cnt; {code}
Bitmap functions related to join operations, etc.
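The semantics of the three functions above can be sketched with Python integers standing in for bitmaps. This is an illustrative sketch only; a real engine (ClickHouse, Doris, or a Flink implementation) would use a compressed structure such as Roaring Bitmap rather than raw integers.

```python
def to_bitmap(value):
    """Bitmap containing the single element `value`."""
    return 1 << value

def bitmap_from_string(s):
    """Bitmap from a comma-separated element list, e.g. '1, 3'."""
    bm = 0
    for part in s.split(","):
        bm |= 1 << int(part.strip())
    return bm

def bitmap_and(a, b):      # intersection of two bitmaps
    return a & b

def bitmap_andnot(a, b):   # elements in A but not in B (difference set)
    return a & ~b

def bitmap_count(bm):      # number of distinct elements (popcount)
    return bin(bm).count("1")

def bitmap_to_string(bm):  # comma-separated element list
    return ", ".join(str(i) for i in range(bm.bit_length()) if bm >> i & 1)

# Mirrors the SQL examples above:
print(bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))))  # 0: {1} and {2} are disjoint
print(bitmap_count(to_bitmap(10)))                           # 1
print(bitmap_to_string(bitmap_andnot(bitmap_from_string('1, 3'),
                                     bitmap_from_string('2'))))  # 1, 3
```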

  was:
BITMAP_AND: Computes the intersection of two input bitmaps and returns the new 
bitmap.
select bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))) cnt;
TO_BITMAP: The input is a value in the range 0 ~ 18446744073709551615 (unsigned 
bigint); the output is a bitmap containing that element. This function is 
mainly used by stream load tasks to import integer fields into the bitmap 
field of a table.
{code:java}
select bitmap_count(to_bitmap(10)); {code}
BITMAP_ANDNOT: Computes the set (difference set) that is in A but not in B.
{code:java}
select bitmap_to_string(bitmap_andnot(bitmap_from_string('1, 3'), 
bitmap_from_string('2'))) cnt; {code}
Bitmap functions related to join operations, etc.


> Add a BitMap function to FlinkSQL
> -
>
> Key: FLINK-24959
> URL: https://issues.apache.org/jira/browse/FLINK-24959
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Minor
>
> BITMAP_AND: Computes the intersection of two input bitmaps and returns the 
> new bitmap.
> {code:java}
> select bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))) cnt; {code}
> TO_BITMAP: The input is a value in the range 0 ~ 18446744073709551615 
> (unsigned bigint); the output is a bitmap containing that element. This 
> function is mainly used by stream load tasks to import integer fields into 
> the bitmap field of a table.
> {code:java}
> select bitmap_count(to_bitmap(10)); {code}
> BITMAP_ANDNOT: Computes the set (difference set) that is in A but not in B.
> {code:java}
> select bitmap_to_string(bitmap_andnot(bitmap_from_string('1, 3'), 
> bitmap_from_string('2'))) cnt; {code}
> Bitmap functions related to join operations, etc.





[jira] [Updated] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-12-02 Thread ZhuoYu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuoYu Chen updated FLINK-24959:

Description: 
BITMAP_AND: Computes the intersection of two input bitmaps and returns the new 
bitmap.
select bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))) cnt;
TO_BITMAP: The input is a value in the range 0 ~ 18446744073709551615 (unsigned 
bigint); the output is a bitmap containing that element. This function is 
mainly used by stream load tasks to import integer fields into the bitmap 
field of a table.
{code:java}
select bitmap_count(to_bitmap(10)); {code}
BITMAP_ANDNOT: Computes the set (difference set) that is in A but not in B.
{code:java}
select bitmap_to_string(bitmap_andnot(bitmap_from_string('1, 3'), 
bitmap_from_string('2'))) cnt; {code}
Bitmap functions related to join operations, etc.

  was:
bitmap_and: Computes the intersection of two input bitmaps and returns the new 
bitmap.

bitmap_andnot: Computes the set (difference set) that is in A but not in B.

Bitmap functions related to join operations, etc.


> Add a BitMap function to FlinkSQL
> -
>
> Key: FLINK-24959
> URL: https://issues.apache.org/jira/browse/FLINK-24959
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Minor
>
> BITMAP_AND: Computes the intersection of two input bitmaps and returns the 
> new bitmap.
> select bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))) cnt;
> TO_BITMAP: The input is a value in the range 0 ~ 18446744073709551615 
> (unsigned bigint); the output is a bitmap containing that element. This 
> function is mainly used by stream load tasks to import integer fields into 
> the bitmap field of a table.
> {code:java}
> select bitmap_count(to_bitmap(10)); {code}
> BITMAP_ANDNOT: Computes the set (difference set) that is in A but not in B.
> {code:java}
> select bitmap_to_string(bitmap_andnot(bitmap_from_string('1, 3'), 
> bitmap_from_string('2'))) cnt; {code}
> Bitmap functions related to join operations, etc.





[jira] [Comment Edited] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452755#comment-17452755
 ] 

ZhuoYu Chen edited comment on FLINK-24959 at 12/3/21, 5:12 AM:
---

Hi [~jark], ClickHouse, StarRocks, and Doris all support bitmaps.

Traditional COUNT DISTINCT calculation requires multiple data shuffles 
(transferring data between nodes and deduplicating) during query execution, so 
performance drops roughly linearly as the amount of data increases.

Advantages of bitmap de-duplication:

    Space advantage: Using one bit of a bitmap to mark the presence of the 
corresponding index gives a large space saving; for example, for int32 
de-duplication, a plain bitmap needs only 1/32 of the storage of traditional 
de-duplication. With the optimized Roaring Bitmap implementation, storage is 
reduced much further for sparse bitmaps.
    Time advantage: Bitmap de-duplication involves setting the bit for a given 
index and counting the number of set bits in a bitmap, which are an O(1) 
operation and an O(n) operation, respectively; the latter can be computed 
efficiently using clz, ctz, and similar instructions.


was (Author: monster#12):
Hi [~jark], ClickHouse, StarRocks, and Doris all support bitmaps.

Traditional COUNT DISTINCT calculation requires multiple data shuffles 
(transferring data between nodes and deduplicating) during query execution, so 
performance drops roughly linearly as the amount of data increases.

Advantages of bitmap de-duplication:

    Space advantage: Using one bit of a bitmap to mark the presence of the 
corresponding index gives a large space saving; for example, for int32 
de-duplication, a plain bitmap needs only 1/32 of the storage of traditional 
de-duplication. With the optimized Roaring Bitmap implementation, storage is 
reduced much further for sparse bitmaps.
    Time advantage: Bitmap de-duplication involves setting the bit for a given 
index and counting the number of set bits in a bitmap, which are an O(1) 
operation and an O(n) operation, respectively; the latter can be computed 
efficiently using clz, ctz, and similar instructions. In addition, bitmap 
de-duplication can be parallelized in an MPP execution engine: each compute 
node computes a local sub-bitmap, and the sub-bitmaps are merged into the 
final bitmap with a bitwise OR. The OR operation is more efficient than 
sort-based and hash-based de-duplication, has no conditional or data 
dependencies, and can be vectorized.

> Add a BitMap function to FlinkSQL
> -
>
> Key: FLINK-24959
> URL: https://issues.apache.org/jira/browse/FLINK-24959
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Minor
>
> bitmap_and: Computes the intersection of two input bitmaps and returns the 
> new bitmap.
> bitmap_andnot: Computes the set (difference set) that is in A but not in B.
> Bitmap functions related to join operations, etc.





[jira] [Commented] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452755#comment-17452755
 ] 

ZhuoYu Chen commented on FLINK-24959:
-

Hi [~jark], ClickHouse, StarRocks, and Doris all support bitmaps.

Traditional COUNT DISTINCT calculation requires multiple data shuffles 
(transferring data between nodes and deduplicating) during query execution, so 
performance drops roughly linearly as the amount of data increases.

Advantages of bitmap de-duplication:

    Space advantage: Using one bit of a bitmap to mark the presence of the 
corresponding index gives a large space saving; for example, for int32 
de-duplication, a plain bitmap needs only 1/32 of the storage of traditional 
de-duplication. With the optimized Roaring Bitmap implementation, storage is 
reduced much further for sparse bitmaps.
    Time advantage: Bitmap de-duplication involves setting the bit for a given 
index and counting the number of set bits in a bitmap, which are an O(1) 
operation and an O(n) operation, respectively; the latter can be computed 
efficiently using clz, ctz, and similar instructions. In addition, bitmap 
de-duplication can be parallelized in an MPP execution engine: each compute 
node computes a local sub-bitmap, and the sub-bitmaps are merged into the 
final bitmap with a bitwise OR. The OR operation is more efficient than 
sort-based and hash-based de-duplication, has no conditional or data 
dependencies, and can be vectorized.
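The two claimed advantages can be demonstrated with a small sketch (Python integers standing in for bitmaps, purely for illustration): setting a bit per value is O(1), the final count is a popcount, and partial bitmaps built on different nodes merge with a single bitwise OR.

```python
def local_bitmap(values):
    """Build a node-local sub-bitmap: one O(1) bit-set per input value."""
    bm = 0
    for v in values:
        bm |= 1 << v
    return bm

def distinct_count(*sub_bitmaps):
    """Merge sub-bitmaps with bitwise OR, then popcount the merged bitmap."""
    merged = 0
    for bm in sub_bitmaps:
        merged |= bm          # the cheap, dependency-free merge step
    return bin(merged).count("1")

# Two "compute nodes" deduplicate locally, then their results are merged:
node_a = local_bitmap([1, 2, 2, 5])
node_b = local_bitmap([2, 5, 7])
print(distinct_count(node_a, node_b))  # 4 distinct values: {1, 2, 5, 7}
```

Note the merge never re-examines individual values, which is why the OR-based merge scales better than sort- or hash-based de-duplication.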

> Add a BitMap function to FlinkSQL
> -
>
> Key: FLINK-24959
> URL: https://issues.apache.org/jira/browse/FLINK-24959
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Minor
>
> bitmap_and: Computes the intersection of two input bitmaps and returns the 
> new bitmap.
> bitmap_andnot: Computes the set (difference set) that is in A but not in B.
> Bitmap functions related to join operations, etc.





[jira] [Commented] (FLINK-20873) Upgrade Calcite version to 1.27

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452753#comment-17452753
 ] 

ZhuoYu Chen commented on FLINK-20873:
-

[~jark] Are there still plans to go ahead with this? If so, could this task be 
assigned to me?

> Upgrade Calcite version to 1.27
> ---
>
> Key: FLINK-20873
> URL: https://issues.apache.org/jira/browse/FLINK-20873
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
>
> The following files should be removed from the Flink code base during an 
> upgrade:
>  - org.apache.calcite.rex.RexSimplify
>  - org.apache.calcite.sql.SqlMatchRecognize
>  - org.apache.calcite.sql.SqlTableRef
>  - org.apache.calcite.sql2rel.RelDecorrelator
>  - org.apache.flink.table.planner.functions.sql.SqlJsonObjectFunction (added 
> in FLINK-16203)
>  - Adopt calcite's behaviour and add SQL tests once 
> [https://github.com/apache/calcite/pull/2555] is merged, (check FLINK-24576 )





[jira] [Commented] (FLINK-12429) Translate the "Generating Timestamps / Watermarks" page into Chinese

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452752#comment-17452752
 ] 

ZhuoYu Chen commented on FLINK-12429:
-

Hi [~jark], I am very interested in this and I would like to do some work for 
Flink. Can I help with this issue?
Thank you

> Translate the "Generating Timestamps / Watermarks" page into Chinese
> 
>
> Key: FLINK-12429
> URL: https://issues.apache.org/jira/browse/FLINK-12429
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Affects Versions: 1.9.0
>Reporter: YangFei
>Priority: Major
>  Labels: auto-unassigned
>
> The file is located at flink/docs/dev/event_timestamps_watermarks.zh.md
> [https://ci.apache.org/projects/flink/flink-docs-master/dev/event_timestamps_watermarks.html]





[jira] [Commented] (FLINK-17203) Add metrics for ClickHouse sink

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452747#comment-17452747
 ] 

ZhuoYu Chen commented on FLINK-17203:
-

Hi [~csbliss], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you

> Add metrics for ClickHouse sink
> ---
>
> Key: FLINK-17203
> URL: https://issues.apache.org/jira/browse/FLINK-17203
> Project: Flink
>  Issue Type: Sub-task
>Reporter: jinhai
>Priority: Major
>






[jira] [Commented] (FLINK-17202) Add SQL for ClickHouse connector

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452745#comment-17452745
 ] 

ZhuoYu Chen commented on FLINK-17202:
-

Hi [~csbliss], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you

> Add SQL for ClickHouse connector
> 
>
> Key: FLINK-17202
> URL: https://issues.apache.org/jira/browse/FLINK-17202
> Project: Flink
>  Issue Type: Sub-task
>Reporter: jinhai
>Priority: Major
>






[jira] [Commented] (FLINK-17201) Implement Streaming ClickHouseSink

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452746#comment-17452746
 ] 

ZhuoYu Chen commented on FLINK-17201:
-

Hi [~csbliss], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you

> Implement Streaming ClickHouseSink
> --
>
> Key: FLINK-17201
> URL: https://issues.apache.org/jira/browse/FLINK-17201
> Project: Flink
>  Issue Type: Sub-task
>Reporter: jinhai
>Priority: Major
>






[jira] [Comment Edited] (FLINK-24103) Create time-based LAST_VALUE / FIRST_VALUE

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452738#comment-17452738
 ] 

ZhuoYu Chen edited comment on FLINK-24103 at 12/3/21, 4:21 AM:
---

Hi [~twalthr], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you


was (Author: monster#12):
[~twalthr]Walther I am very interested in this and I would like to do some 
work for Flink. Can I help with this issue?
Thank you

> Create time-based LAST_VALUE / FIRST_VALUE
> --
>
> Key: FLINK-24103
> URL: https://issues.apache.org/jira/browse/FLINK-24103
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Timo Walther
>Priority: Major
>
> LAST_VALUE and FIRST_VALUE don't support merging. As far I can see it, 
> FLINK-20110 tries to solve this by using nano second timestamps internally. 
> However, an easier and consistent approach could be to allow a time parameter 
> in the signature:
> {code}
> LAST_VALUE(timestamp, value)
> FIRST_VALUE(timestamp, value)
> {code}
> This allows merging based on a timestamp in HOP or SESSION windows.
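The merging that the proposed signature enables can be sketched as follows. This is a hypothetical accumulator sketch under the ticket's proposal, not Flink's actual implementation: carrying the timestamp inside the accumulator makes partial results comparable, so two accumulators merge by keeping the one with the larger timestamp.

```python
# Accumulator for a time-based LAST_VALUE: a (timestamp, value) pair.
EMPTY = (float("-inf"), None)

def accumulate(acc, ts, value):
    """Keep the value carrying the latest timestamp seen so far."""
    return (ts, value) if ts >= acc[0] else acc

def merge(acc_a, acc_b):
    """Merging two partial accumulators is now well-defined: the larger
    timestamp wins. This is exactly what HOP/SESSION windows need."""
    return acc_a if acc_a[0] >= acc_b[0] else acc_b

# Two window panes accumulate independently, then merge:
pane1 = accumulate(accumulate(EMPTY, 10, "a"), 12, "b")
pane2 = accumulate(EMPTY, 11, "c")
print(merge(pane1, pane2))  # (12, 'b')
```

A FIRST_VALUE variant is symmetric: keep the smaller timestamp in both `accumulate` and `merge`.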





[jira] [Commented] (FLINK-24103) Create time-based LAST_VALUE / FIRST_VALUE

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452738#comment-17452738
 ] 

ZhuoYu Chen commented on FLINK-24103:
-

[~twalthr] I am very interested in this, and I want to do some work for Flink. 
May I help with it?
Thank you

> Create time-based LAST_VALUE / FIRST_VALUE
> --
>
> Key: FLINK-24103
> URL: https://issues.apache.org/jira/browse/FLINK-24103
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Timo Walther
>Priority: Major
>
> LAST_VALUE and FIRST_VALUE don't support merging. As far I can see it, 
> FLINK-20110 tries to solve this by using nano second timestamps internally. 
> However, an easier and consistent approach could be to allow a time parameter 
> in the signature:
> {code}
> LAST_VALUE(timestamp, value)
> FIRST_VALUE(timestamp, value)
> {code}
> This allows merging based on a timestamp in HOP or SESSION windows.





[jira] [Commented] (FLINK-24900) Support to run multiple shuffle plugins in one session cluster

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452737#comment-17452737
 ] 

ZhuoYu Chen commented on FLINK-24900:
-

Hi [~kevin.cyj], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you

> Support to run multiple shuffle plugins in one session cluster
> --
>
> Key: FLINK-24900
> URL: https://issues.apache.org/jira/browse/FLINK-24900
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Yingjie Cao
>Priority: Major
>
> Currently, one Flink cluster can only use one shuffle plugin. However, there 
> are cases where different jobs may need different shuffle implementations. By 
> loading shuffle plugins with the plugin manager and letting jobs select their 
> shuffle service freely, Flink can support running multiple shuffle plugins in 
> one session cluster.
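A toy sketch of the idea (class and key names are illustrative, not Flink's actual interfaces): a registry holds several shuffle-service factories loaded via the plugin mechanism, and each job picks one by name from its own configuration instead of the whole cluster being pinned to a single implementation.

```python
class ShuffleServiceRegistry:
    """Holds all loaded shuffle plugins; the choice is per job, not per cluster."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        """Register a shuffle-service factory under a plugin name."""
        self._factories[name] = factory

    def create_for_job(self, job_conf, default="netty"):
        """Instantiate the shuffle service the job's configuration selects."""
        name = job_conf.get("shuffle-service-factory", default)
        return self._factories[name]()

registry = ShuffleServiceRegistry()
registry.register("netty", lambda: "NettyShuffleService")
registry.register("remote", lambda: "RemoteShuffleService")

# Two jobs in the same session cluster pick different shuffle services:
print(registry.create_for_job({}))                                     # NettyShuffleService
print(registry.create_for_job({"shuffle-service-factory": "remote"}))  # RemoteShuffleService
```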





[jira] [Comment Edited] (FLINK-24103) Create time-based LAST_VALUE / FIRST_VALUE

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452738#comment-17452738
 ] 

ZhuoYu Chen edited comment on FLINK-24103 at 12/3/21, 4:20 AM:
---

[~twalthr]Walther I am very interested in this and I would like to do some 
work for Flink. Can I help with this issue?
Thank you


was (Author: monster#12):
[~twalthr] I am very interested in this, and I want to do some work for Flink. 
May I help with it?
Thank you

> Create time-based LAST_VALUE / FIRST_VALUE
> --
>
> Key: FLINK-24103
> URL: https://issues.apache.org/jira/browse/FLINK-24103
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Timo Walther
>Priority: Major
>
> LAST_VALUE and FIRST_VALUE don't support merging. As far I can see it, 
> FLINK-20110 tries to solve this by using nano second timestamps internally. 
> However, an easier and consistent approach could be to allow a time parameter 
> in the signature:
> {code}
> LAST_VALUE(timestamp, value)
> FIRST_VALUE(timestamp, value)
> {code}
> This allows merging based on a timestamp in HOP or SESSION windows.





[jira] [Commented] (FLINK-24103) Create time-based LAST_VALUE / FIRST_VALUE

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452736#comment-17452736
 ] 

ZhuoYu Chen commented on FLINK-24103:
-

[~twalthr] Has the discussion reached a conclusion yet?

> Create time-based LAST_VALUE / FIRST_VALUE
> --
>
> Key: FLINK-24103
> URL: https://issues.apache.org/jira/browse/FLINK-24103
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Timo Walther
>Priority: Major
>
> LAST_VALUE and FIRST_VALUE don't support merging. As far I can see it, 
> FLINK-20110 tries to solve this by using nano second timestamps internally. 
> However, an easier and consistent approach could be to allow a time parameter 
> in the signature:
> {code}
> LAST_VALUE(timestamp, value)
> FIRST_VALUE(timestamp, value)
> {code}
> This allows merging based on a timestamp in HOP or SESSION windows.





[jira] [Commented] (FLINK-24910) Propagate the Calcite parser config to SQL Client

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452734#comment-17452734
 ] 

ZhuoYu Chen commented on FLINK-24910:
-

Hi [~Sergey Nuyanzin], I am very interested in this and I would like to do 
some work for Flink. Can I help with this issue?
Thank you

> Propagate the Calcite parser config to SQL Client
> -
>
> Key: FLINK-24910
> URL: https://issues.apache.org/jira/browse/FLINK-24910
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: Sergey Nuyanzin
>Priority: Major
>
> It's required to get dialect-specific info such as keywords and the SQL 
> identifier quote character





[jira] [Commented] (FLINK-20188) Add Documentation for new File Source

2021-12-02 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452250#comment-17452250
 ] 

ZhuoYu Chen commented on FLINK-20188:
-

[~MartijnVisser] I am very sorry that I only started this work today; the 
previous issue was very difficult and took me a lot of time. I have roughly 
read some test cases and prepared a draft. I would like to open a PR and link 
it to this Jira first, so that we can discuss and iterate on the PR together. 
I will keep updating the PR over the next few days, and I hope to complete 
this task as soon as possible. Would that be okay?

> Add Documentation for new File Source
> -
>
> Key: FLINK-20188
> URL: https://issues.apache.org/jira/browse/FLINK-20188
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, Documentation
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Stephan Ewen
>Assignee: ZhuoYu Chen
>Priority: Blocker
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-16-11-42-32-957.png
>
>






[jira] [Commented] (FLINK-15571) Create a Redis Streams Connector for Flink

2021-11-23 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448391#comment-17448391
 ] 

ZhuoYu Chen commented on FLINK-15571:
-

[~yunta] Thank you for your reply. I think you can assign this task to me 
first: I have finished the MongoDB connector submission, which is now being 
reviewed by the community, and I am planning to work on the Redis connector 
next.

> Create a Redis Streams Connector for Flink
> --
>
> Key: FLINK-15571
> URL: https://issues.apache.org/jira/browse/FLINK-15571
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Tugdual Grall
>Priority: Minor
>  Labels: pull-request-available
>
> Redis has a "log data structure" called Redis Streams; it would be nice to 
> integrate Redis Streams and Apache Flink as:
>  * Source
>  * Sink
> See Redis Streams introduction: [https://redis.io/topics/streams-intro]
>  





[jira] [Commented] (FLINK-24396) Add @Public annotations to Table & SQL API classes

2021-11-23 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448377#comment-17448377
 ] 

ZhuoYu Chen commented on FLINK-24396:
-

Hi [~twalthr], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you

> Add @Public annotations to Table & SQL API classes
> --
>
> Key: FLINK-24396
> URL: https://issues.apache.org/jira/browse/FLINK-24396
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Major
>
> Many parts of the Table & SQL API have stabilized and we can mark them as 
> {{@Public}} which gives both users and downstream projects more confidence 
> when using Flink.
> A concrete list of classes and methods needs to be compiled. Some parts of 
> the API might stay {{@PublicEvolving}} for now.





[jira] [Commented] (FLINK-24429) Port FileSystemTableSink to new Unified Sink API (FLIP-143)

2021-11-23 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448376#comment-17448376
 ] 

ZhuoYu Chen commented on FLINK-24429:
-

Hi [~alexanderpreuss], I am very interested in this and I would like to do 
some work for Flink. Can I help with this issue?
Thank you

> Port FileSystemTableSink to new Unified Sink API (FLIP-143)
> ---
>
> Key: FLINK-24429
> URL: https://issues.apache.org/jira/browse/FLINK-24429
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: Alexander Preuss
>Assignee: Alexander Preuss
>Priority: Major
>
> We want to port the FileSystemTableSink to the new Sink API, as was done 
> with the Kafka sink.





[jira] [Commented] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-11-23 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448374#comment-17448374
 ] 

ZhuoYu Chen commented on FLINK-24959:
-

Hi [~jark], I am very interested in this and I would like to do some work for 
Flink. Can I help with this issue?
Thank you

> Add a BitMap function to FlinkSQL
> -
>
> Key: FLINK-24959
> URL: https://issues.apache.org/jira/browse/FLINK-24959
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: ZhuoYu Chen
>Priority: Minor
>
> bitmap_and: Computes the intersection of two input bitmaps and returns the 
> new bitmap.
> bitmap_andnot: Computes the set (difference set) that is in A but not in B.
> Bitmap functions related to join operations, etc.





[jira] [Commented] (FLINK-15571) Create a Redis Streams Connector for Flink

2021-11-23 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448340#comment-17448340
 ] 

ZhuoYu Chen commented on FLINK-15571:
-

Hi [~yunta], [~tgrall], I am very interested in this and I would like to do 
some work for Flink. Can I help with this issue?
Thank you

> Create a Redis Streams Connector for Flink
> --
>
> Key: FLINK-15571
> URL: https://issues.apache.org/jira/browse/FLINK-15571
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Tugdual Grall
>Priority: Minor
>  Labels: pull-request-available
>
> Redis has a "log data structure" called Redis Streams; it would be nice to 
> integrate Redis Streams and Apache Flink as:
>  * Source
>  * Sink
> See Redis Streams introduction: [https://redis.io/topics/streams-intro]
>  





[jira] [Commented] (FLINK-12941) Translate "Amazon AWS Kinesis Streams Connector" page into Chinese

2021-11-22 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447787#comment-17447787
 ] 

ZhuoYu Chen commented on FLINK-12941:
-

Hi [~jark], I am very interested in this and I would like to do some work for 
Flink. Can I help with this issue?
Thank you

> Translate "Amazon AWS Kinesis Streams Connector" page into Chinese
> --
>
> Key: FLINK-12941
> URL: https://issues.apache.org/jira/browse/FLINK-12941
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-unassigned
>
> Translate the internal page 
> "https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/kinesis.html";
>  into Chinese.
>  
> The doc located in "flink/docs/dev/connectors/kinesis.zh.md"





[jira] [Commented] (FLINK-24696) Translate how to configure unaligned checkpoints into Chinese

2021-11-22 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447786#comment-17447786
 ] 

ZhuoYu Chen commented on FLINK-24696:
-

Hi [~pnowojski], I am very interested in this and I would like to do some work 
for Flink. Can I help with this issue?
Thank you

> Translate how to configure unaligned checkpoints into Chinese
> -
>
> Key: FLINK-24696
> URL: https://issues.apache.org/jira/browse/FLINK-24696
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation, Documentation
>Affects Versions: 1.15.0, 1.14.1
>Reporter: Piotr Nowojski
>Priority: Major
> Fix For: 1.15.0
>
>
> As part of FLINK-24695 
> {{docs/content/docs/ops/state/checkpointing_under_backpressure.md}} and 
> {{docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md}} were 
> modified. Those modifications should be translated into Chinese





[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-11-20 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446832#comment-17446832
 ] 

ZhuoYu Chen commented on FLINK-24456:
-

[~wheat9] Hello, I have started modifying the code. Due to delays caused by 
other tasks in the Flink community, I will speed things up and submit the PR 
next week.

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.
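
The DataStream-side API referred to above can be sketched as follows. This is a minimal, illustrative fragment, not runnable on its own: it assumes Flink 1.14+ with the flink-connector-kafka dependency on the classpath, and the broker address, topic, and group id are placeholders.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class BoundedKafkaSourceSketch {
    public static KafkaSource<String> build() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")          // placeholder broker
                .setTopics("test-topic")                        // placeholder topic
                .setGroupId("test-group")                       // placeholder group id
                .setStartingOffsets(OffsetsInitializer.earliest())
                // setBounded makes the source stop at the given offsets,
                // turning the job into a bounded (batch-like) read -- handy in tests.
                .setBounded(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
```

The issue asks for an equivalent knob in the table connector, so that a `CREATE TABLE ... WITH ('connector' = 'kafka', ...)` definition can declare a bounded end offset the same way.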



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-11-18 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-24959:
---

 Summary: Add a BitMap function to FlinkSQL
 Key: FLINK-24959
 URL: https://issues.apache.org/jira/browse/FLINK-24959
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen


bitmap_and: computes the intersection of two input bitmaps and returns the new 
bitmap.

bitmap_andnot: computes the difference set, i.e. the bits that are set in A but 
not in B.

Plus bitmap functions related to join operations, etc.
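
The intended semantics of the two proposed functions can be sketched with `java.util.BitSet`. This is only an illustration of the set semantics, not the FlinkSQL implementation; the class and method names are hypothetical.

```java
import java.util.BitSet;

public class BitmapOps {
    // bitmap_and: intersection -- bits set in both a and b.
    static BitSet bitmapAnd(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.and(b);
        return r;
    }

    // bitmap_andnot: difference -- bits set in a but not in b.
    static BitSet bitmapAndNot(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.andNot(b);
        return r;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1); a.set(2); a.set(3);      // a = {1, 2, 3}
        BitSet b = new BitSet();
        b.set(2); b.set(4);                // b = {2, 4}
        System.out.println(bitmapAnd(a, b));    // {2}
        System.out.println(bitmapAndNot(a, b)); // {1, 3}
    }
}
```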



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-6573) Flink MongoDB Connector

2021-11-17 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445175#comment-17445175
 ] 

ZhuoYu Chen commented on FLINK-6573:


[~arvid] Ok, I will submit the code this Sunday

> Flink MongoDB Connector
> ---
>
> Key: FLINK-6573
> URL: https://issues.apache.org/jira/browse/FLINK-6573
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.2.0
> Environment: Linux Operating System, Mongo DB
>Reporter: Nagamallikarjuna
>Assignee: ZhuoYu Chen
>Priority: Not a Priority
>  Labels: stale-assigned
> Attachments: image-2021-11-15-14-41-07-514.png
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hi Community,
> Currently we are using Flink in the current Project. We have huge amount of 
> data to process using Flink which resides in Mongo DB. We have a requirement 
> of parallel data connectivity in between Flink and Mongo DB for both 
> reads/writes. Currently we are planning to create this connector and 
> contribute to the Community.
> I will update the further details once I receive your feedback 
> Please let us know if you have any concerns.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-20188) Add Documentation for new File Source

2021-11-17 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445174#comment-17445174
 ] 

ZhuoYu Chen commented on FLINK-20188:
-

[~sewen] Thank you for sharing. I am currently working on another Flink task; 
the documentation task will not start until next week.

> Add Documentation for new File Source
> -
>
> Key: FLINK-20188
> URL: https://issues.apache.org/jira/browse/FLINK-20188
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, Documentation
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Stephan Ewen
>Priority: Blocker
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-16-11-42-32-957.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-20188) Add Documentation for new File Source

2021-11-15 Thread ZhuoYu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444257#comment-17444257
 ] 

ZhuoYu Chen commented on FLINK-20188:
-

!image-2021-11-16-11-42-32-957.png!
Is this the documentation for the new File Source? If so, you can assign the 
task to me. I will post the document to JIRA once it is finished for everyone 
to review.

> Add Documentation for new File Source
> -
>
> Key: FLINK-20188
> URL: https://issues.apache.org/jira/browse/FLINK-20188
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, Documentation
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Stephan Ewen
>Priority: Blocker
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-16-11-42-32-957.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

