[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


[ https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dalongliu updated FLINK-35689:
--
Attachment: screenshot-13.png

> Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> --
>
> Key: FLINK-35689
> URL: https://issues.apache.org/jira/browse/FLINK-35689
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / API
> Reporter: dalongliu
> Assignee: dalongliu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.20.0
>
> Attachments: image-2024-06-25-12-00-07-616.png, 
> image-2024-06-25-12-01-09-648.png, image-2024-06-25-12-02-08-558.png, 
> image-2024-06-25-12-02-51-615.png, image-2024-06-25-12-03-20-930.png, 
> image-2024-06-25-12-04-24-948.png, image-2024-06-25-12-05-39-089.png, 
> image-2024-06-25-12-05-54-104.png, image-2024-06-25-12-07-52-182.png, 
> image-2024-06-25-12-09-11-207.png, image-2024-06-25-12-09-22-879.png, 
> image-2024-06-25-12-11-08-720.png, image-2024-06-25-12-13-47-363.png, 
> image-2024-06-25-12-14-13-107.png, image-2024-06-25-12-15-03-493.png, 
> image-2024-06-25-12-16-47-160.png, image-2024-06-25-12-16-57-076.png, 
> image-2024-06-25-12-18-12-506.png, image-2024-06-25-13-38-47-663.png, 
> image-2024-06-25-13-39-44-790.png, image-2024-06-25-13-39-56-133.png, 
> image-2024-06-25-13-43-10-439.png, image-2024-06-25-13-43-22-548.png, 
> image-2024-06-25-13-44-07-669.png, screenshot-1.png, screenshot-10.png, 
> screenshot-11.png, screenshot-12.png, screenshot-13.png, screenshot-14.png, 
> screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png, 
> screenshot-6.png, screenshot-7.png, screenshot-8.png, screenshot-9.png
>
>
> Follow-up to the tests for https://issues.apache.org/jira/browse/FLINK-35187 
> and https://issues.apache.org/jira/browse/FLINK-35345.
> Materialized Table depends on both FLIP-435 and FLIP-448 to complete the 
> end-to-end process, so this release testing verifies the FLIP-435 and 
> FLIP-448 features together.
> Since Materialized Table relies on the CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and a Standalone cluster to go through 
> the whole process, the validation consists of two parts: Environment Setup 
> and Feature Verification.
> h1. Environment Setup
> 1. Create the file CatalogStore directory.
> 2. Create the test-filesystem Catalog and put 
> flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
> 3. Create the Savepoint directory.
> 4. Configure the Flink config.yaml file:
> {code:yaml}
> #==============================================================================
> # Common
> #==============================================================================
> jobmanager:
>   bind-host: localhost
>   rpc:
>     address: localhost
>     # The RPC port where the JobManager is reachable.
>     port: 6123
>   memory:
>     process:
>       size: 1600m
>   execution:
>     failover-strategy: region
> taskmanager:
>   bind-host: localhost
>   host: localhost
>   # The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
>   numberOfTaskSlots: 3
>   memory:
>     process:
>       size: 1728m
> parallelism:
>   # The parallelism used for programs that did not specify and other parallelism.
>   default: 1
> #==============================================================================
> # Rest & web frontend
> #==============================================================================
> rest:
>   # The address to which the REST client will connect to
>   address: localhost
>   bind-address: localhost
> # Catalog Store
> table:
>   catalog-store:
>     kind: file
>     file:
>       path: xxx
> # Embedded Scheduler config
> workflow-scheduler:
>   type: embedded
> # SQL Gateway address
> sql-gateway:
>   endpoint:
>     rest:
>       address: 127.0.0.1
> {code}
> 5. Start the Standalone cluster: ./bin/start-cluster.sh
> 6. Start the SQL Gateway: ./bin/sql-gateway.sh
> 7. Start the SQL Client against the gateway (8083 is the SQL Gateway's 
> default REST port): ./bin/sql-client.sh gateway --endpoint http://127.0.0.1:8083
> 8. Register the test-filesystem Catalog
> {code:sql}
> CREATE CATALOG mt_cat
> WITH (
>   'type' = 'test-filesystem',
>   'path' = '...',
>   'default-database' = 'mydb'  
> );
> USE CATALOG mt_cat;
> {code}
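> A quick sanity check that the file CatalogStore persisted the new catalog (a 
> minimal sketch; SHOW CATALOGS is standard Flink SQL, and the expected output 
> is an assumption based on the setup above):
> {code:sql}
> -- mt_cat should appear here even after restarting the SQL Client, because it
> -- is persisted in the file CatalogStore rather than held only in session memory.
> SHOW CATALOGS;
> -- expected: default_catalog, mt_cat
> {code}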
> 9. Create the test-filesystem source table and insert the data
> {code:sql}
> -- 1. create json format table
> CREATE TABLE json_source (
>   order_id BIGINT,
>   user_id BIGINT,
>   user_name STRING,
>   order_created_at STRING,
>   payment_amount_cents BIGINT
> ) WITH (
>   'format' = 'json',
>   'source.monitor-interval' = '5s'
> );
> -- 2. insert data
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-24 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-24 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-24 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-24 10:03:00', 40);
> {code}
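> With the environment in place, the Feature Verification part exercises the 
> Materialized Table statements on top of this source. A minimal sketch, 
> assuming the FLIP-435 syntax; the table name, freshness interval, and query 
> are illustrative rather than taken from the original test plan:
> {code:sql}
> -- Create a materialized table; the embedded workflow scheduler derives a
> -- background refresh pipeline from the declared FRESHNESS.
> CREATE MATERIALIZED TABLE user_payments
> FRESHNESS = INTERVAL '30' SECOND
> AS SELECT
>   user_id,
>   user_name,
>   SUM(payment_amount_cents) AS total_cents
> FROM json_source
> GROUP BY user_id, user_name;
> 
> -- Lifecycle operations introduced by FLIP-435 (SUSPEND stops the refresh
> -- pipeline with a savepoint, which is why step 3 creates a Savepoint directory):
> ALTER MATERIALIZED TABLE user_payments SUSPEND;
> ALTER MATERIALIZED TABLE user_payments RESUME;
> -- Trigger a one-off refresh through the workflow scheduler:
> ALTER MATERIALIZED TABLE user_payments REFRESH;
> DROP MATERIALIZED TABLE user_payments;
> {code}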

[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-14.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-12.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-11.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-9.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-8.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-10.png


[jira] [Comment Edited] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread Weijie Guo (Jira)


Weijie Guo edited comment on FLINK-35689 at 6/25/24 5:51 AM:
-

Thanks [~lsy] for the quick testing, great work!

[~hackergin] Can you help to confirm whether this is in line with expectations? 
If yes, feel free to close this and mark as done.


was (Author: weijie guo):
Thanks [~lsy] for the quick testing, great work!

[~hackergin] Can you help to confirm whether this is in line with expectations? 
If yes, feel free to close this marked as done.


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-7.png


[jira] [Commented] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread Weijie Guo (Jira)


Weijie Guo commented on FLINK-35689:


Thanks [~lsy] for the quick testing, great work!

[~hackergin] Can you help to confirm whether this is in line with expectations? 
If yes, feel free to close this marked as done.


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-5.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-6.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


dalongliu updated FLINK-35689:
--
Attachment: screenshot-4.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35689:
--
Attachment: screenshot-2.png


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35689:
--
Attachment: screenshot-3.png


[jira] [Comment Edited] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859807#comment-17859807
 ] 

dalongliu edited comment on FLINK-35689 at 6/25/24 5:44 AM:


For Continuous Mode, I performed the following verification.
h2. Create Materialized Table
1. bad case 1: invalid primary key: the column does not exist
 !screenshot-1.png! 

2. bad case 2: invalid primary key: the column is nullable
 !image-2024-06-25-12-00-07-616.png! 

3. bad case 3: invalid partition key
 !image-2024-06-25-12-01-09-648.png! 

4. bad case 4: invalid `partition.fields.pt.date-formatter`
 !image-2024-06-25-12-02-08-558.png! 

5. bad case 5: invalid freshness time unit
 !image-2024-06-25-12-02-51-615.png! 

6. bad case 6: negative freshness value
 !image-2024-06-25-12-03-20-930.png! 

7. bad case 7: specify the plain JSON format; since the definition query cannot 
be compiled into a Flink streaming job, the framework first creates the 
Materialized Table and then deletes it again (weak atomicity guarantee); see 
the sketch at the end of this section
 !image-2024-06-25-12-04-24-948.png! 

8. good case: the Materialized Table is created successfully, the Flink 
streaming job is submitted successfully, and data is written in real time

{code:sql}
CREATE MATERIALIZED TABLE continuous_users_shops
PARTITIONED BY (ds)
WITH (
  'format' = 'debezium-json',
  'sink.rolling-policy.rollover-interval' = '10s',
  'sink.rolling-policy.check-interval' = '10s'
)
FRESHNESS = INTERVAL '30' SECOND
AS SELECT
  user_id,
  ds,
  SUM(payment_amount_cents) AS payed_buy_fee_sum,
  SUM(1) AS pv
FROM (
  SELECT user_id, DATE_FORMAT(order_created_at, 'yyyy-MM-dd') AS ds, payment_amount_cents
  FROM json_source
) AS tmp
GROUP BY (user_id, ds);
{code}

 !image-2024-06-25-12-05-39-089.png! 

 !image-2024-06-25-12-05-54-104.png! 
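For reference, bad case 7 can be reproduced with a statement along these lines 
(a sketch only, assuming the plain 'json' format cannot accept the update 
stream produced by the aggregation, so no streaming job can be generated; the 
exact statement is not part of this report):

{code:sql}
-- expected to fail during job generation: the aggregation emits updates,
-- which the append-only 'json' format cannot consume, so the framework
-- creates the table and then drops it again (weak atomicity)
CREATE MATERIALIZED TABLE bad_case_json
WITH (
  'format' = 'json'
)
FRESHNESS = INTERVAL '30' SECOND
AS SELECT user_id, SUM(payment_amount_cents) AS total_cents
FROM json_source
GROUP BY user_id;
{code}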

h2. Suspend Materialized Table
1. Suspend without specifying a savepoint path
 !image-2024-06-25-12-07-52-182.png! 

2. Suspend with a savepoint path specified

{code:sql}
SET 'execution.checkpointing.savepoint-dir' = 
'file:///Users/ron/mt_demo/savepoint';

ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops SUSPEND;
{code}

 !image-2024-06-25-12-09-11-207.png! 
 !image-2024-06-25-12-09-22-879.png! 

3. Repeat Suspend
 !image-2024-06-25-12-11-08-720.png! 

h2. Resume Materialized Table
1. Resume without options
{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops RESUME;
{code}
 !image-2024-06-25-12-13-47-363.png! 
 !image-2024-06-25-12-14-13-107.png! 

2. Resume with options
{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops RESUME
WITH (
  -- 'sink.parallelism' = '3'
  'sink.shuffle-by-partition.enable' = 'true'
);
{code}

 !image-2024-06-25-12-15-03-493.png! 

3. Repeat Resume: when the background streaming job is already running, a 
second RESUME should report an error rather than resubmit the streaming job.

 !image-2024-06-25-12-16-47-160.png! 
 !image-2024-06-25-12-16-57-076.png! 

h2. Manual Refresh Materialized Table
1. bad case: the partition field does not exist
 !image-2024-06-25-12-18-12-506.png! 

2. bad case: partition fields that are not of STRING type
 !image-2024-06-25-13-38-47-663.png! 

3. good case (refresh statements for cases 1 and 3 are sketched after this list):
 !image-2024-06-25-13-39-44-790.png! 
 !image-2024-06-25-13-39-56-133.png! 
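The refresh statements used for cases 1 and 3 have the following shape (a 
sketch based on the FLIP-435 refresh syntax; the partition values are 
illustrative, not taken from this report):

{code:sql}
-- bad case: 'dt' is not a partition column of the table
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops REFRESH PARTITION (dt = '2024-06-24');

-- good case: refresh one existing partition
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops REFRESH PARTITION (ds = '2024-06-24');
{code}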

h2. Drop Materialized Table

1. Drop without the keyword `IF EXISTS`
 !image-2024-06-25-13-43-10-439.png! 
 !image-2024-06-25-13-43-22-548.png! 

2. Drop with the keyword `IF EXISTS` (both statements are sketched after this list)
 !image-2024-06-25-13-44-07-669.png! 
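Both drop variants follow the standard syntax (sketched here; only the second 
statement is expected to succeed silently once the table is already gone):

{code:sql}
-- reports an error if the materialized table does not exist
DROP MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops;

-- succeeds either way
DROP MATERIALIZED TABLE IF EXISTS mt_cat.mydb.continuous_users_shops;
{code}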




[jira] [Commented] (FLINK-35606) Release Testing Instructions: Verify FLINK-26050 Too many small sst files in rocksdb state backend when using time window created in ascending order

2024-06-24 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859812#comment-17859812
 ] 

Weijie Guo commented on FLINK-35606:


Hi [~roman], could you please confirm whether FLINK-26050 needs release 
testing? If so, please also complete the testing instructions. Thanks!

> Release Testing Instructions: Verify FLINK-26050 Too many small sst files in 
> rocksdb state backend when using time window created in ascending order
> 
>
> Key: FLINK-35606
> URL: https://issues.apache.org/jira/browse/FLINK-35606
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Rui Fan
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-26050





[jira] [Commented] (FLINK-35613) Release Testing Instructions: Verify [FLIP-451] Introduce timeout configuration to AsyncSink

2024-06-24 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859811#comment-17859811
 ] 

Weijie Guo commented on FLINK-35613:


Hi [~chalixar], could you complete the testing instructions (and documentation, 
if needed) for this feature? Thanks!

> Release Testing Instructions: Verify [FLIP-451] Introduce timeout 
> configuration to AsyncSink
> 
>
> Key: FLINK-35613
> URL: https://issues.apache.org/jira/browse/FLINK-35613
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common
>Reporter: Rui Fan
>Assignee: Ahmed Hamdy
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35435





[jira] [Assigned] (FLINK-35669) Release Testing: Verify FLIP-383: Support Job Recovery from JobMaster Failures for Batch Jobs

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo reassigned FLINK-35669:
--

Assignee: (was: Junrui Li)

> Release Testing: Verify FLIP-383: Support Job Recovery from JobMaster 
> Failures for Batch Jobs
> -
>
> Key: FLINK-35669
> URL: https://issues.apache.org/jira/browse/FLINK-35669
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Junrui Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> In 1.20, we introduced a batch job recovery mechanism that enables batch jobs 
> to recover as much progress as possible after a JobMaster failover, avoiding 
> the need to rerun tasks that have already finished.
> More information about this feature and how to enable it can be found in: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/recovery_from_job_master_failure/]
> We may need the following tests:
>  # Start a batch job with High Availability (HA) enabled, and after it has 
> progressed to a certain point, kill the JobManager (jm), then observe whether 
> the job recovers its progress normally.
>  # Use a custom source and ensure that its SplitEnumerator implements the 
> SupportsBatchSnapshot interface, submit the job, and after it has progressed 
> to a certain point, kill the JobManager (jm), then observe whether the job 
> recovers its progress normally.
>  
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-33892





[jira] [Closed] (FLINK-35604) Release Testing Instructions: Verify FLIP-383: Support Job Recovery from JobMaster Failures for Batch Jobs

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo closed FLINK-35604.
--
Resolution: Done

Closed in favor of FLINK-35669.

> Release Testing Instructions: Verify FLIP-383: Support Job Recovery from 
> JobMaster Failures for Batch Jobs
> --
>
> Key: FLINK-35604
> URL: https://issues.apache.org/jira/browse/FLINK-35604
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Rui Fan
>Assignee: Junrui Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>





[jira] [Updated] (FLINK-35604) Release Testing Instructions: Verify FLIP-383: Support Job Recovery from JobMaster Failures for Batch Jobs

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35604:
---
Description: 
In 1.20, we introduced a batch job recovery mechanism to enable batch jobs to 
recover as much progress as possible after a JobMaster failover, avoiding the 
need to rerun tasks that have already been finished.

More information about this feature and how to enable it could be found in: 
[https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/recovery_from_job_master_failure/]

We may need the following tests:
 # Start a batch job with High Availability (HA) enabled, and after it has 
progressed to a certain point, kill the JobManager (jm), then observe whether 
the job recovers its progress normally.
 # Use a custom source and ensure that its SplitEnumerator implements the 
SupportsBatchSnapshot interface, submit the job, and after it has progressed to 
a certain point, kill the JobManager (jm), then observe whether the job 
recovers its progress normally.

  was:Follow up the test for https://issues.apache.org/jira/browse/FLINK-33892







[jira] [Updated] (FLINK-35603) Release Testing Instructions: Verify FLINK-35533(FLIP-459): Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35603:
---
Description: 
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35533

In 1.20, we introduced a batch job recovery mechanism to enable batch jobs to 
recover as much progress as possible after a JobMaster failover, avoiding the 
need to rerun tasks that have already been finished.

More information about this feature and how to enable it could be found in: 
[https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/recovery_from_job_master_failure/]

We may need the following tests:
 # Start a batch job with High Availability (HA) enabled, and after it has 
progressed to a certain point, kill the JobManager (jm), then observe whether 
the job recovers its progress normally.
 # Use a custom source and ensure that its SplitEnumerator implements the 
SupportsBatchSnapshot interface, submit the job, and after it has progressed to 
a certain point, kill the JobManager (jm), then observe whether the job 
recovers its progress normally.

 

Follow up the test for https://issues.apache.org/jira/browse/FLINK-33892

  was:Follow up the test for https://issues.apache.org/jira/browse/FLINK-35533


> Release Testing Instructions: Verify FLINK-35533(FLIP-459): Support Flink 
> hybrid shuffle integration with Apache Celeborn
> -
>
> Key: FLINK-35603
> URL: https://issues.apache.org/jira/browse/FLINK-35603
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Rui Fan
>Assignee: Yuxin Tan
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35533





[jira] [Updated] (FLINK-35603) Release Testing Instructions: Verify FLINK-35533(FLIP-459): Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35603:
---
Description: Follow up the test for 
https://issues.apache.org/jira/browse/FLINK-35533 (was: the longer 
batch-recovery description set in the previous update, reverted here)

> Release Testing Instructions: Verify FLINK-35533(FLIP-459): Support Flink 
> hybrid shuffle integration with Apache Celeborn
> -
>
> Key: FLINK-35603
> URL: https://issues.apache.org/jira/browse/FLINK-35603
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Rui Fan
>Assignee: Yuxin Tan
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35533





[jira] [Comment Edited] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859807#comment-17859807
 ] 

dalongliu edited comment on FLINK-35689 at 6/25/24 4:18 AM:



[jira] [Closed] (FLINK-35656) Hive Source has issues setting max parallelism in dynamic inference mode

2024-06-24 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-35656.
---
Resolution: Fixed

master: 01c3fd67ac46898bd520477ae861cd29cceaa636
release-1.20: d93f7421e0d520d6b2899cbff5844867374b96ab

> Hive Source has issues setting max parallelism in dynamic inference mode
> 
>
> Key: FLINK-35656
> URL: https://issues.apache.org/jira/browse/FLINK-35656
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> In the dynamic parallelism inference mode of Hive Source, when 
> `table.exec.hive.infer-source-parallelism.max` is not configured, it does not 
> use `execution.batch.adaptive.auto-parallelism.default-source-parallelism` as 
> the upper bound for parallelism inference, which is inconsistent with the 
> behavior described in the documentation.
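A minimal way to exercise the fixed behavior from the SQL Client (a sketch, not 
from the original issue; the option names are the documented Flink 1.20 keys, 
and the 'dynamic' mode value is an assumption based on the dynamic-inference 
work referenced above):

{code:sql}
-- leave table.exec.hive.infer-source-parallelism.max unset and rely on the
-- adaptive default as the upper bound for dynamic inference
SET 'table.exec.hive.infer-source-parallelism.mode' = 'dynamic';
SET 'execution.batch.adaptive.auto-parallelism.default-source-parallelism' = '4';
-- with the fix, the inferred Hive source parallelism should not exceed 4
{code}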





[jira] [Commented] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859807#comment-17859807
 ] 

dalongliu commented on FLINK-35689:
---


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35689:
--
Attachment: screenshot-1.png


[jira] [Commented] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859806#comment-17859806
 ] 

dalongliu commented on FLINK-35689:
---

FLIP-435 & FLIP-448 were designed by me, and most of the code implementation 
was done by [~hackergin], so I will do the Release Testing Verify.


[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35689:
--
Description: 
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187, 
https://issues.apache.org/jira/browse/FLINK-35345

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so this release testing covers the FLIP-435 & FLIP-448 features 
together.
Since Materialized Table relies on the CatalogStore, Catalog, Workflow 
Scheduler, SQL Client, SQL Gateway, and a Standalone cluster to go through the 
whole process, validation consists of two parts: Environment Setup and Feature 
Verification.

h1. Environment Setup
1. Create the file CatalogStore directory.
2. Create the test-filesystem Catalog and put 
flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
3. Create the Savepoint directory.
4. Configure the Flink config.yaml file.

{code:yaml}
#==
# Common
#==

jobmanager:
  bind-host: localhost
  rpc:
address: localhost
# The RPC port where the JobManager is reachable.
port: 6123
  memory:
process:
  size: 1600m
  execution:
failover-strategy: region

taskmanager:
  bind-host: localhost
  host: localhost
  # The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
  numberOfTaskSlots: 3
  memory:
process:
  size: 1728m

parallelism:
  # The parallelism used for programs that did not specify and other parallelism.
  default: 1

#==
# Rest & web frontend
#==

rest:
  # The address to which the REST client will connect to
  address: localhost
  bind-address: localhost

# Catalog Store
table:
  catalog-store:
kind: file
file:
  path: xxx

# Embedded Scheduler config
workflow-scheduler:
  type: embedded

# SQL Gateway address
sql-gateway:
  endpoint:
rest:
  address: 127.0.0.1
{code}

5. Start the Standalone cluster: ./bin/start-cluster.sh
6. Start the SQL Gateway: ./bin/sql-gateway.sh
7. Start the SQL Client: ./bin/sql-client.sh gateway --endpoint http://127.0.0.1:8083
8. Register the test-filesystem Catalog

{code:sql}
CREATE CATALOG mt_cat
WITH (
  'type' = 'test-filesystem',
  'path' = '...',
  'default-database' = 'mydb'  
);

USE CATALOG mt_cat;
{code}

9. Create the test-filesystem source table and insert the data

{code:sql}
-- 1. create json format table
CREATE TABLE json_source (
  order_id BIGINT,
  user_id BIGINT,
  user_name STRING,
  order_created_at STRING,
  payment_amount_cents BIGINT
) WITH (
  'format' = 'json',
  'source.monitor-interval' = '5S'
);

-- 2. insert data
INSERT INTO mt_cat.mydb.json_source VALUES
(1001, 1, 'user1', '2024-06-24 10:00:00', 10),
(1002, 1, 'user2', '2024-06-24 10:01:00', 20),
(1003, 2, 'user3', '2024-06-24 10:02:00', 30),
(1004, 2, 'user4', '2024-06-24 10:03:00', 40),
(1005, 1, 'user1', '2024-06-25 10:00:00', 10),
(1006, 1, 'user2', '2024-06-25 10:01:00', 20),
(1007, 2, 'user3', '2024-06-25 10:02:00', 30),
(1008, 2, 'user4', '2024-06-25 10:03:00', 40);

INSERT INTO mt_cat.mydb.json_source VALUES
(1001, 1, 'user1', '2024-06-26 10:00:00', 10),
(1002, 1, 'user2', '2024-06-26 10:01:00', 20),
(1003, 2, 'user3', '2024-06-26 10:02:00', 30),
(1004, 2, 'user4', '2024-06-26 10:03:00', 40),
(1005, 1, 'user1', '2024-06-27 10:00:00', 10),
(1006, 1, 'user2', '2024-06-27 10:01:00', 20),
(1007, 2, 'user3', '2024-06-27 10:02:00', 30),
(1008, 2, 'user4', '2024-06-27 10:03:00', 40);
{code}

h1. Feature verification
h2. Continuous Mode
In Continuous Mode, a Materialized Table runs a Flink streaming job to update 
the data in real time. Feature verification covers scenarios such as Create, 
Suspend, Resume, and Drop.

1. Create Materialized Table, including various bad cases and good cases, and 
execute the following statement in the SQL Client

{code:sql}
CREATE MATERIALIZED TABLE continuous_users_shops
(
  PRIMARY KEY(id) NOT ENFORCED
)
WITH (
  'format' = 'debezium-json'
)
FRESHNESS = INTERVAL '30' SECOND
AS SELECT
  user_id,
  ds,
  SUM(payment_amount_cents) AS payed_buy_fee_sum,
  SUM(1) AS pv
FROM (
  SELECT user_id, DATE_FORMAT(order_created_at, 'yyyy-MM-dd') AS ds, payment_amount_cents
  FROM json_source
) AS tmp
GROUP BY (user_id, ds);
{code}

2. Suspend Materialized Table and execute the following statement in the SQL 
Client

{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops SUSPEND;
{code}

3. Resume Materialized Table

{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops RESUME;
{code}

4. Manual Refresh Materialized Table

{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops REFRESH PARTITION (ds = '2024-06-24');
{code}
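After the create statement in step 1 succeeds, the background refresh job can 
be observed from the same session (a hedged example; SHOW JOBS has been 
available in the SQL Client/Gateway since Flink 1.17):

{code:sql}
-- the continuous refresh job for continuous_users_shops should show up as RUNNING
SHOW JOBS;
{code}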

[jira] [Commented] (FLINK-35610) Release Testing Instructions: Verify FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-06-24 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859804#comment-17859804
 ] 

dalongliu commented on FLINK-35610:
---

Closing this Jira since https://issues.apache.org/jira/browse/FLINK-35689 has 
been created.

> Release Testing Instructions: Verify FLIP-448: Introduce Pluggable Workflow 
> Scheduler Interface for Materialized Table
> --
>
> Key: FLINK-35610
> URL: https://issues.apache.org/jira/browse/FLINK-35610
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35345





[jira] [Closed] (FLINK-35610) Release Testing Instructions: Verify FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu closed FLINK-35610.
-
Resolution: Fixed






[jira] [Commented] (FLINK-35609) Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859803#comment-17859803
 ] 

dalongliu commented on FLINK-35609:
---

Closing this Jira since https://issues.apache.org/jira/browse/FLINK-35689 has 
been created.



> Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> 
>
> Key: FLINK-35609
> URL: https://issues.apache.org/jira/browse/FLINK-35609
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187
> Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
> process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
> feature at the same time.
> Since Materialized Table depends on CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and Standalone cluster to go through the 
> whole process, the validation process consists of two parts: Environment 
> Setup and Feature Verification.
> h1. Environment Setup
> 1. Create the File CatalogStore directory
> 2. Create the test-filesystem Catalog and put 
> flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
> 3. Create the Savepoint directory.
> 4. Configure the Flink config.yaml file.
> {code:yaml}
> #==
> # Common
> #==
> jobmanager:
>   bind-host: localhost
>   rpc:
> address: localhost
> # The RPC port where the JobManager is reachable.
> port: 6123
>   memory:
> process:
>   size: 1600m
>   execution:
> failover-strategy: region
> taskmanager:
>   bind-host: localhost
>   host: localhost
>   # The number of task slots that each TaskManager offers. Each slot runs one 
> parallel pipeline.
>   numberOfTaskSlots: 3
>   memory:
> process:
>   size: 1728m
> parallelism:
>   # The parallelism used for programs that did not specify and other 
> parallelism.
>   default: 1
> #==
> # Rest & web frontend
> #==
> rest:
>   # The address to which the REST client will connect to
>   address: localhost
>   bind-address: localhost
> # Catalog Store
> table:
>   catalog-store:
> kind: file
> file:
>   path: xxx
> # Embedded Scheduler config
> workflow-scheduler:
>   type: embedded
> # SQL Gateway address
> sql-gateway:
>   endpoint:
> rest:
>   address: 127.0.0.1
> {code}
> 5. Start the Standalone cluster: ./bin/start-cluster.sh
> 6. Start the SQL Gateway: ./bin/sql-gateway.sh start
> 7. Start the SQL Client: ./bin/sql-client.sh gateway --endpoint 
> http://127.0.0.1:8083
> 8. Register the test-filesystem Catalog
> {code:sql}
> CREATE CATALOG mt_cat
> WITH (
>   'type' = 'test-filesystem',
>   'path' = '...',
>   'default-database' = 'mydb'  
> );
> USE CATALOG mt_cat;
> {code}
> 9. Create the test-filesystem source table and insert the data
> {code:sql}
> -- 1. create json format table
> CREATE TABLE json_source (
>   order_id BIGINT,
>   user_id BIGINT,
>   user_name STRING,
>   order_created_at STRING,
>   payment_amount_cents BIGINT
> ) WITH (
>   'format' = 'json',
>   'source.monitor-interval' = '5S'
> );
> -- 2. insert data
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-24 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-24 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-24 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-24 10:03:00', 40),
> (1005, 1, 'user1', '2024-06-25 10:00:00', 10),
> (1006, 1, 'user2', '2024-06-25 10:01:00', 20),
> (1007, 2, 'user3', '2024-06-25 10:02:00', 30),
> (1008, 2, 'user4', '2024-06-25 10:03:00', 40);
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-26 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-26 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-26 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-26 10:03:00', 40),
> (1005, 1, 'user1', '2024-06-27 10:00:00', 10),
> (1006, 1, 'user2', '2024-06-27 10:01:00', 20),
> (1007, 2, 'user3', '2024-06-27 10:02:00', 30),
> (1008, 2, 'user4', '2024-06-27 10:03:00', 40);
> {code}
> h1. Feature verification
> h2. Continuous Mode
> In Continuous Mode, a Materialized Table runs a Flink streaming job to update 
> the data in real time. Feature verification covers scenarios such as Create, 
> Suspend, Resume, and Drop.
> 

[jira] [Closed] (FLINK-35609) Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu closed FLINK-35609.
-
Resolution: Fixed

> Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> 
>
> Key: FLINK-35609
> URL: https://issues.apache.org/jira/browse/FLINK-35609
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187
> Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
> process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
> feature at the same time.
> Since Materialized Table depends on CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and Standalone cluster to go through the 
> whole process, the validation process consists of two parts: Environment 
> Setup and Feature Verification.
> h1. Environment Setup
> 1. Create the File CatalogStore directory
> 2. Create the test-filesystem Catalog and put 
> flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
> 3. Create the Savepoint directory.
> 4. Configure the Flink config.yaml file.
> {code:yaml}
> #==
> # Common
> #==
> jobmanager:
>   bind-host: localhost
>   rpc:
> address: localhost
> # The RPC port where the JobManager is reachable.
> port: 6123
>   memory:
> process:
>   size: 1600m
>   execution:
> failover-strategy: region
> taskmanager:
>   bind-host: localhost
>   host: localhost
>   # The number of task slots that each TaskManager offers. Each slot runs one 
> parallel pipeline.
>   numberOfTaskSlots: 3
>   memory:
> process:
>   size: 1728m
> parallelism:
>   # The parallelism used for programs that did not specify and other 
> parallelism.
>   default: 1
> #==
> # Rest & web frontend
> #==
> rest:
>   # The address to which the REST client will connect to
>   address: localhost
>   bind-address: localhost
> # Catalog Store
> table:
>   catalog-store:
> kind: file
> file:
>   path: xxx
> # Embedded Scheduler config
> workflow-scheduler:
>   type: embedded
> # SQL Gateway address
> sql-gateway:
>   endpoint:
> rest:
>   address: 127.0.0.1
> {code}
> 5. Start the Standalone cluster: ./bin/start-cluster.sh
> 6. Start the SQL Gateway: ./bin/sql-gateway.sh start
> 7. Start the SQL Client: ./bin/sql-client.sh gateway --endpoint 
> http://127.0.0.1:8083
> 8. Register the test-filesystem Catalog
> {code:sql}
> CREATE CATALOG mt_cat
> WITH (
>   'type' = 'test-filesystem',
>   'path' = '...',
>   'default-database' = 'mydb'  
> );
> USE CATALOG mt_cat;
> {code}
> 9. Create the test-filesystem source table and insert the data
> {code:sql}
> -- 1. create json format table
> CREATE TABLE json_source (
>   order_id BIGINT,
>   user_id BIGINT,
>   user_name STRING,
>   order_created_at STRING,
>   payment_amount_cents BIGINT
> ) WITH (
>   'format' = 'json',
>   'source.monitor-interval' = '5S'
> );
> -- 2. insert data
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-24 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-24 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-24 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-24 10:03:00', 40),
> (1005, 1, 'user1', '2024-06-25 10:00:00', 10),
> (1006, 1, 'user2', '2024-06-25 10:01:00', 20),
> (1007, 2, 'user3', '2024-06-25 10:02:00', 30),
> (1008, 2, 'user4', '2024-06-25 10:03:00', 40);
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-26 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-26 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-26 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-26 10:03:00', 40),
> (1005, 1, 'user1', '2024-06-27 10:00:00', 10),
> (1006, 1, 'user2', '2024-06-27 10:01:00', 20),
> (1007, 2, 'user3', '2024-06-27 10:02:00', 30),
> (1008, 2, 'user4', '2024-06-27 10:03:00', 40);
> {code}
> h1. Feature verification
> h2. Continuous Mode
> In Continuous Mode, a Materialized Table runs a Flink streaming job to update 
> the data in real time. Feature verification covers scenarios such as Create, 
> Suspend, Resume, and Drop.
> 1. Create Materialized Table, including various bad cases and good cases, and 
> execute the following statement in 

[jira] [Updated] (FLINK-35689) Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35689:
--
Summary: Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New 
Materialized Table for Simplifying Data Pipelines  (was: CLONE - Release 
Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for 
Simplifying Data Pipelines)

> Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> --
>
> Key: FLINK-35689
> URL: https://issues.apache.org/jira/browse/FLINK-35689
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: dalongliu
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187
> Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
> process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
> feature at the same time.
> Since Materialized Table depends on CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and Standalone cluster to go through the 
> whole process, the validation process consists of two parts: Environment 
> Setup and Feature Verification.
> h1. Environment Setup
> 1. Create the File CatalogStore directory
> 2. Create the test-filesystem Catalog and put 
> flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
> 3. Create the Savepoint directory.
> 4. Configure the Flink config.yaml file.
> {code:yaml}
> #==
> # Common
> #==
> jobmanager:
>   bind-host: localhost
>   rpc:
> address: localhost
> # The RPC port where the JobManager is reachable.
> port: 6123
>   memory:
> process:
>   size: 1600m
>   execution:
> failover-strategy: region
> taskmanager:
>   bind-host: localhost
>   host: localhost
>   # The number of task slots that each TaskManager offers. Each slot runs one 
> parallel pipeline.
>   numberOfTaskSlots: 3
>   memory:
> process:
>   size: 1728m
> parallelism:
>   # The parallelism used for programs that did not specify and other 
> parallelism.
>   default: 1
> #==
> # Rest & web frontend
> #==
> rest:
>   # The address to which the REST client will connect to
>   address: localhost
>   bind-address: localhost
> # Catalog Store
> table:
>   catalog-store:
> kind: file
> file:
>   path: xxx
> # Embedded Scheduler config
> workflow-scheduler:
>   type: embedded
> # SQL Gateway address
> sql-gateway:
>   endpoint:
> rest:
>   address: 127.0.0.1
> {code}
> 5. Start the Standalone cluster: ./bin/start-cluster.sh
> 6. Start the SQL Gateway: ./bin/sql-gateway.sh start
> 7. Start the SQL Client: ./bin/sql-client.sh gateway --endpoint 
> http://127.0.0.1:8083
> 8. Register the test-filesystem Catalog
> {code:sql}
> CREATE CATALOG mt_cat
> WITH (
>   'type' = 'test-filesystem',
>   'path' = '...',
>   'default-database' = 'mydb'  
> );
> USE CATALOG mt_cat;
> {code}
> 9. Create the test-filesystem source table and insert the data
> {code:sql}
> -- 1. create json format table
> CREATE TABLE json_source (
>   order_id BIGINT,
>   user_id BIGINT,
>   user_name STRING,
>   order_created_at STRING,
>   payment_amount_cents BIGINT
> ) WITH (
>   'format' = 'json',
>   'source.monitor-interval' = '5S'
> );
> -- 2. insert data
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-24 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-24 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-24 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-24 10:03:00', 40),
> (1005, 1, 'user1', '2024-06-25 10:00:00', 10),
> (1006, 1, 'user2', '2024-06-25 10:01:00', 20),
> (1007, 2, 'user3', '2024-06-25 10:02:00', 30),
> (1008, 2, 'user4', '2024-06-25 10:03:00', 40);
> INSERT INTO mt_cat.mydb.json_source VALUES
> (1001, 1, 'user1', '2024-06-26 10:00:00', 10),
> (1002, 1, 'user2', '2024-06-26 10:01:00', 20),
> (1003, 2, 'user3', '2024-06-26 10:02:00', 30),
> (1004, 2, 'user4', '2024-06-26 10:03:00', 40),
> (1005, 1, 'user1', '2024-06-27 10:00:00', 10),
> (1006, 1, 'user2', '2024-06-27 10:01:00', 20),
> (1007, 2, 'user3', '2024-06-27 10:02:00', 30),
> (1008, 2, 'user4', '2024-06-27 10:03:00', 40);
> {code}
> h1. Feature verification
> h2. Continuous Mode
> In Continuous Mode, Materialized Table runs a Flink streaming job to 

[jira] [Created] (FLINK-35689) CLONE - Release Testing: Verify FLIP-435 & FLIP-448: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)
dalongliu created FLINK-35689:
-

 Summary: CLONE - Release Testing: Verify FLIP-435 & FLIP-448: 
Introduce a New Materialized Table for Simplifying Data Pipelines
 Key: FLINK-35689
 URL: https://issues.apache.org/jira/browse/FLINK-35689
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: dalongliu
Assignee: dalongliu
 Fix For: 1.20.0


Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

h1. Environment Setup
1. Create the File CatalogStore directory
2. Create the test-filesystem Catalog and put 
flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
3. Create the Savepoint directory.
4. Configure the Flink config.yaml file.

{code:yaml}
#==
# Common
#==

jobmanager:
  bind-host: localhost
  rpc:
address: localhost
# The RPC port where the JobManager is reachable.
port: 6123
  memory:
process:
  size: 1600m
  execution:
failover-strategy: region

taskmanager:
  bind-host: localhost
  host: localhost
  # The number of task slots that each TaskManager offers. Each slot runs one 
parallel pipeline.
  numberOfTaskSlots: 3
  memory:
process:
  size: 1728m

parallelism:
  # The parallelism used for programs that did not specify and other 
parallelism.
  default: 1

#==
# Rest & web frontend
#==

rest:
  # The address to which the REST client will connect to
  address: localhost
  bind-address: localhost

# Catalog Store
table:
  catalog-store:
kind: file
file:
  path: xxx

# Embedded Scheduler config
workflow-scheduler:
  type: embedded

# SQL Gateway address
sql-gateway:
  endpoint:
rest:
  address: 127.0.0.1
{code}

5. Start the Standalone cluster: ./bin/start-cluster.sh
6. Start the SQL Gateway: ./bin/sql-gateway.sh start
7. Start the SQL Client: ./bin/sql-client.sh gateway --endpoint http://127.0.0.1:8083
8. Register the test-filesystem Catalog

{code:sql}
CREATE CATALOG mt_cat
WITH (
  'type' = 'test-filesystem',
  'path' = '...',
  'default-database' = 'mydb'  
);

USE CATALOG mt_cat;
{code}
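To confirm the registration took effect before moving on, a quick sanity check with standard Flink SQL:

{code:sql}
-- Should list the newly registered catalog and its default database.
SHOW CATALOGS;
SHOW DATABASES;
{code}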

9. Create the test-filesystem source table and insert the data

{code:sql}
-- 1. create json format table
CREATE TABLE json_source (
  order_id BIGINT,
  user_id BIGINT,
  user_name STRING,
  order_created_at STRING,
  payment_amount_cents BIGINT
) WITH (
  'format' = 'json',
  'source.monitor-interval' = '5S'
);

-- 2. insert data
INSERT INTO mt_cat.mydb.json_source VALUES
(1001, 1, 'user1', '2024-06-24 10:00:00', 10),
(1002, 1, 'user2', '2024-06-24 10:01:00', 20),
(1003, 2, 'user3', '2024-06-24 10:02:00', 30),
(1004, 2, 'user4', '2024-06-24 10:03:00', 40),
(1005, 1, 'user1', '2024-06-25 10:00:00', 10),
(1006, 1, 'user2', '2024-06-25 10:01:00', 20),
(1007, 2, 'user3', '2024-06-25 10:02:00', 30),
(1008, 2, 'user4', '2024-06-25 10:03:00', 40);

INSERT INTO mt_cat.mydb.json_source VALUES
(1001, 1, 'user1', '2024-06-26 10:00:00', 10),
(1002, 1, 'user2', '2024-06-26 10:01:00', 20),
(1003, 2, 'user3', '2024-06-26 10:02:00', 30),
(1004, 2, 'user4', '2024-06-26 10:03:00', 40),
(1005, 1, 'user1', '2024-06-27 10:00:00', 10),
(1006, 1, 'user2', '2024-06-27 10:01:00', 20),
(1007, 2, 'user3', '2024-06-27 10:02:00', 30),
(1008, 2, 'user4', '2024-06-27 10:03:00', 40);
{code}
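Because the source directory is monitored every few seconds, the inserted rows should become visible shortly afterwards; a quick check:

{code:sql}
-- Expect 16 rows spread across the four distinct days.
SELECT * FROM mt_cat.mydb.json_source;
{code}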

h1. Feature verification
h2. Continuous Mode
In Continuous Mode, a Materialized Table runs a Flink streaming job to update the 
data in real time. Feature verification covers scenarios such as Create, 
Suspend, Resume, and Drop.

1. Create a Materialized Table, covering various bad cases and good cases; for 
example, execute the following statement in the SQL Client

{code:sql}
CREATE MATERIALIZED TABLE continuous_users_shops 
(
  PRIMARY KEY(id) NOT ENFORCED
)
WITH(
  'format' = 'debezium-json'
)
FRESHNESS = INTERVAL '30' SECOND
AS SELECT 
  user_id,
  ds,
  SUM (payment_amount_cents) AS payed_buy_fee_sum,
  SUM (1) AS pv
FROM (
  SELECT user_id, DATE_FORMAT(order_created_at, 'yyyy-MM-dd') AS ds,
    payment_amount_cents
  FROM json_source
) AS tmp
GROUP BY (user_id, ds);
{code}
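Once the refresh job has produced data, the materialized table can be queried like any other table to verify the aggregation against the rows inserted in step 9 (a sketch):

{code:sql}
-- Expect one row per (user_id, ds) with the summed payment amounts and counts.
SELECT * FROM mt_cat.mydb.continuous_users_shops;
{code}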

2. Suspend the Materialized Table by executing the following statement in the SQL 
Client

{code:sql}
ALTER 

[jira] [Updated] (FLINK-35609) Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35609:
--
Description: 
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

h1. Environment Setup
1. Create the File CatalogStore directory
2. Create the test-filesystem Catalog and put 
flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
3. Create the Savepoint directory.
4. Configure the Flink config.yaml file.

{code:yaml}
#==
# Common
#==

jobmanager:
  bind-host: localhost
  rpc:
address: localhost
# The RPC port where the JobManager is reachable.
port: 6123
  memory:
process:
  size: 1600m
  execution:
failover-strategy: region

taskmanager:
  bind-host: localhost
  host: localhost
  # The number of task slots that each TaskManager offers. Each slot runs one 
parallel pipeline.
  numberOfTaskSlots: 3
  memory:
process:
  size: 1728m

parallelism:
  # The parallelism used for programs that did not specify and other 
parallelism.
  default: 1

#==
# Rest & web frontend
#==

rest:
  # The address to which the REST client will connect to
  address: localhost
  bind-address: localhost

# Catalog Store
table:
  catalog-store:
kind: file
file:
  path: xxx

# Embedded Scheduler config
workflow-scheduler:
  type: embedded

# SQL Gateway address
sql-gateway:
  endpoint:
rest:
  address: 127.0.0.1
{code}

5. Start the Standalone cluster: ./bin/start-cluster.sh
6. Start the SQL Gateway: ./bin/sql-gateway.sh start
7. Start the SQL Client: ./bin/sql-client.sh gateway --endpoint http://127.0.0.1:8083
8. Register the test-filesystem Catalog

{code:sql}
CREATE CATALOG mt_cat
WITH (
  'type' = 'test-filesystem',
  'path' = '...',
  'default-database' = 'mydb'  
);

USE CATALOG mt_cat;
{code}

9. Create the test-filesystem source table and insert the data

{code:sql}
-- 1. create json format table
CREATE TABLE json_source (
  order_id BIGINT,
  user_id BIGINT,
  user_name STRING,
  order_created_at STRING,
  payment_amount_cents BIGINT
) WITH (
  'format' = 'json',
  'source.monitor-interval' = '5S'
);

-- 2. insert data
INSERT INTO mt_cat.mydb.json_source VALUES
(1001, 1, 'user1', '2024-06-24 10:00:00', 10),
(1002, 1, 'user2', '2024-06-24 10:01:00', 20),
(1003, 2, 'user3', '2024-06-24 10:02:00', 30),
(1004, 2, 'user4', '2024-06-24 10:03:00', 40),
(1005, 1, 'user1', '2024-06-25 10:00:00', 10),
(1006, 1, 'user2', '2024-06-25 10:01:00', 20),
(1007, 2, 'user3', '2024-06-25 10:02:00', 30),
(1008, 2, 'user4', '2024-06-25 10:03:00', 40);

INSERT INTO mt_cat.mydb.json_source VALUES
(1001, 1, 'user1', '2024-06-26 10:00:00', 10),
(1002, 1, 'user2', '2024-06-26 10:01:00', 20),
(1003, 2, 'user3', '2024-06-26 10:02:00', 30),
(1004, 2, 'user4', '2024-06-26 10:03:00', 40),
(1005, 1, 'user1', '2024-06-27 10:00:00', 10),
(1006, 1, 'user2', '2024-06-27 10:01:00', 20),
(1007, 2, 'user3', '2024-06-27 10:02:00', 30),
(1008, 2, 'user4', '2024-06-27 10:03:00', 40);
{code}

h1. Feature verification
h2. Continuous Mode
In Continuous Mode, a Materialized Table runs a Flink streaming job to update the 
data in real time. Feature verification covers scenarios such as Create, 
Suspend, Resume, and Drop.

1. Create a Materialized Table, covering various bad cases and good cases; for 
example, execute the following statement in the SQL Client

{code:sql}
CREATE MATERIALIZED TABLE continuous_users_shops 
(
  PRIMARY KEY(id) NOT ENFORCED
)
WITH(
  'format' = 'debezium-json'
)
FRESHNESS = INTERVAL '30' SECOND
AS SELECT 
  user_id,
  ds,
  SUM (payment_amount_cents) AS payed_buy_fee_sum,
  SUM (1) AS pv
FROM (
  SELECT user_id, DATE_FORMAT(order_created_at, 'yyyy-MM-dd') AS ds,
    payment_amount_cents
  FROM json_source
) AS tmp
GROUP BY (user_id, ds);
{code}

2. Suspend the Materialized Table by executing the following statement in the SQL 
Client

{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops SUSPEND;
{code}

3. Resume the Materialized Table

{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops RESUME;
{code}

4. Manually refresh the Materialized Table

{code:sql}
ALTER MATERIALIZED TABLE mt_cat.mydb.continuous_users_shops 

[jira] [Updated] (FLINK-35609) Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35609:
--
Description: 
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

h1. Environment Setup
1. Create the File CatalogStore directory
2. Create the test-filesystem Catalog and put 
flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
3. Create the Savepoint directory.
4. Configure the Flink config.yaml file.

{code:yaml}
#==
# Common
#==

jobmanager:
  bind-host: localhost
  rpc:
address: localhost
# The RPC port where the JobManager is reachable.
port: 6123
  memory:
process:
  size: 1600m
  execution:
failover-strategy: region

taskmanager:
  bind-host: localhost
  host: localhost
  # The number of task slots that each TaskManager offers. Each slot runs one 
parallel pipeline.
  numberOfTaskSlots: 3
  memory:
process:
  size: 1728m

parallelism:
  # The parallelism used for programs that did not specify and other 
parallelism.
  default: 1

#==
# Rest & web frontend
#==

rest:
  # The address to which the REST client will connect to
  address: localhost
  bind-address: localhost

# Catalog Store
table:
  catalog-store:
kind: file
file:
  path: xxx

# Embedded Scheduler config
workflow-scheduler:
  type: embedded

# SQL Gateway address
sql-gateway:
  endpoint:
rest:
  address: 127.0.0.1
{code}


  was:
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

h1. Environment Setup
1. Create the File CatalogStore directory
2. Create the test-filesystem Catalog and put 
flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
3. Create the Savepoint directory.
4. Configure the Flink config.yaml file.


> Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> 
>
> Key: FLINK-35609
> URL: https://issues.apache.org/jira/browse/FLINK-35609
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187
> Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
> process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
> feature at the same time.
> Since Materialized Table depends on CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and Standalone cluster to go through the 
> whole process, the validation process consists of two parts: Environment 
> Setup and Feature Verification.
> h1. Environment Setup
> 1. Create the File CatalogStore directory
> 2. Create the test-filesystem Catalog and put 
> flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
> 3. Create the Savepoint directory.
> 4. Configure the Flink config.yaml file.
> {code:yaml}
> #==
> # Common
> #==
> jobmanager:
>   bind-host: localhost
>   rpc:
> address: localhost
> # The RPC port where the JobManager is reachable.
> port: 6123
>   memory:
> process:
>   size: 1600m
>   execution:
> failover-strategy: region
> taskmanager:
>   bind-host: localhost
>   host: localhost
>   # The number of task slots that each TaskManager offers. 

[jira] [Updated] (FLINK-35609) Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35609:
--
Description: 
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

h1. Environment Setup
1. Create the File CatalogStore directory
2. Create the test-filesystem Catalog and put 
flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
3. Create the Savepoint directory.
4. Configure the Flink config.yaml file.

  was:
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

*Environment Setup:*



> Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> 
>
> Key: FLINK-35609
> URL: https://issues.apache.org/jira/browse/FLINK-35609
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187
> Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
> process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
> feature at the same time.
> Since Materialized Table depends on CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and Standalone cluster to go through the 
> whole process, the validation process consists of two parts: Environment 
> Setup and Feature Verification.
> h1. Environment Setup
> 1. Create the File CatalogStore directory
> 2. Create the test-filesystem Catalog and put 
> flink-table-filesystem-test-utils-1.20-SNAPSHOT.jar into the lib directory.
> 3. Create the Savepoint directory.
> 4. Configure the Flink config.yaml file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35609) Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35609:
--
Description: 
Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187

Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
feature at the same time.
Since Materialized Table depends on CatalogStore, Catalog, Workflow Scheduler, 
SQL Client, SQL Gateway, and Standalone cluster to go through the whole 
process, the validation process consists of two parts: Environment Setup and 
Feature Verification.

*Environment Setup:*


  was:Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187


> Release Testing Instructions: Verify FLIP-435: Introduce a New Materialized 
> Table for Simplifying Data Pipelines
> 
>
> Key: FLINK-35609
> URL: https://issues.apache.org/jira/browse/FLINK-35609
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.20.0
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-35187
> Materialized Table depends on FLIP-435 & FLIP-448 to complete the end-to-end 
> process, so the Release testing is an overall test of FLIP-435 & FLIP-448 
> feature at the same time.
> Since Materialized Table depends on CatalogStore, Catalog, Workflow 
> Scheduler, SQL Client, SQL Gateway, and Standalone cluster to go through the 
> whole process, the validation process consists of two parts: Environment 
> Setup and Feature Verification.
> *Environment Setup:*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35688) Add materialized table develop guide docs

2024-06-24 Thread dalongliu (Jira)
dalongliu created FLINK-35688:
-

 Summary: Add materialized table develop guide docs
 Key: FLINK-35688
 URL: https://issues.apache.org/jira/browse/FLINK-35688
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.20.0
Reporter: dalongliu
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35644) Add materialized table user guide doc

2024-06-24 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-35644:
--
Summary: Add materialized table user guide doc  (was: Add workflow 
scheduler doc)

> Add materialized table user guide doc
> -
>
> Key: FLINK-35644
> URL: https://issues.apache.org/jira/browse/FLINK-35644
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: dalongliu
>Priority: Major
> Fix For: 1.20.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35681) [Release-1.20] Select executing Release Manager

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35681:
---
Affects Version/s: 1.20.0
   (was: 1.19.0)

> [Release-1.20]  Select executing Release Manager
> 
>
> Key: FLINK-35681
> URL: https://issues.apache.org/jira/browse/FLINK-35681
> Project: Flink
>  Issue Type: Sub-task
>  Components: Release System
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
> Fix For: 1.19.0
>
>
> h4. GPG Key
> You need to have a GPG key to sign the release artifacts. Please be aware of 
> the ASF-wide [release signing 
> guidelines|https://www.apache.org/dev/release-signing.html]. If you don’t 
> have a GPG key associated with your Apache account, please create one 
> according to the guidelines.
> Determine your Apache GPG Key and Key ID, as follows:
> {code:java}
> $ gpg --list-keys
> {code}
> This will list your GPG keys. One of these should reflect your Apache 
> account, for example:
> {code:java}
> --
> pub   2048R/845E6689 2016-02-23
> uid  Nomen Nescio 
> sub   2048R/BA4D50BE 2016-02-23
> {code}
> In the example above, the key ID is the 8-digit hex string in the {{pub}} 
> line: {{{}845E6689{}}}.
> Now, add your Apache GPG key to the Flink’s {{KEYS}} file in the [Apache 
> Flink release KEYS 
> file|https://dist.apache.org/repos/dist/release/flink/KEYS] repository at 
> [dist.apache.org|http://dist.apache.org/]. Follow the instructions listed at 
> the top of these files. (Note: Only PMC members have write access to the 
> release repository. If you end up getting 403 errors ask on the mailing list 
> for assistance.)
> Configure {{git}} to use this key when signing code by giving it your key ID, 
> as follows:
> {code:java}
> $ git config --global user.signingkey 845E6689
> {code}
> You may drop the {{--global}} option if you’d prefer to use this key for the 
> current repository only.
> You may wish to start {{gpg-agent}} to unlock your GPG key only once using 
> your passphrase. Otherwise, you may need to enter this passphrase hundreds of 
> times. The setup for {{gpg-agent}} varies based on operating system, but may 
> be something like this:
> {code:bash}
> $ eval $(gpg-agent --daemon --no-grab --write-env-file $HOME/.gpg-agent-info)
> $ export GPG_TTY=$(tty)
> $ export GPG_AGENT_INFO
> {code}
> h4. Access to Apache Nexus repository
> Configure access to the [Apache Nexus 
> repository|https://repository.apache.org/], which enables final deployment of 
> releases to the Maven Central Repository.
>  # You log in with your Apache account.
>  # Confirm you have appropriate access by finding {{org.apache.flink}} under 
> {{{}Staging Profiles{}}}.
>  # Navigate to your {{Profile}} (top right drop-down menu of the page).
>  # Choose {{User Token}} from the dropdown, then click {{{}Access User 
> Token{}}}. Copy a snippet of the Maven XML configuration block.
>  # Insert this snippet twice into your global Maven {{settings.xml}} file, 
> typically {{{}${HOME}/.m2/settings.xml{}}}. The end result should look like 
> this, where {{TOKEN_NAME}} and {{TOKEN_PASSWORD}} are your secret tokens:
> {code:xml}
> <settings>
>   <servers>
>     <server>
>       <id>apache.releases.https</id>
>       <username>TOKEN_NAME</username>
>       <password>TOKEN_PASSWORD</password>
>     </server>
>     <server>
>       <id>apache.snapshots.https</id>
>       <username>TOKEN_NAME</username>
>       <password>TOKEN_PASSWORD</password>
>     </server>
>   </servers>
> </settings>
> {code}
> h4. Website development setup
> Get ready for updating the Flink website by following the [website 
> development 
> instructions|https://flink.apache.org/contributing/improve-website.html].
> h4. GNU Tar Setup for Mac (Skip this step if you are not using a Mac)
> The default tar application on Mac does not support GNU archive format and 
> defaults to Pax. This bloats the archive with unnecessary metadata that can 
> result in additional files when decompressing (see [1.15.2-RC2 vote 
> thread|https://lists.apache.org/thread/mzbgsb7y9vdp9bs00gsgscsjv2ygy58q]). 
> Install gnu-tar and create a symbolic link so it is used in preference to the 
> default tar program.
> {code:bash}
> $ brew install gnu-tar
> $ ln -s /usr/local/bin/gtar /usr/local/bin/tar
> $ which tar
> {code}
>  
> 
> h3. Expectations
>  * Release Manager’s GPG key is published to 
> [dist.apache.org|http://dist.apache.org/]
>  * Release Manager’s GPG key is configured in git configuration
>  * Release Manager's GPG key is configured as the default gpg key.
>  * Release Manager has {{org.apache.flink}} listed under Staging Profiles in 
> Nexus
>  * Release Manager’s Nexus User Token is configured in settings.xml



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35679) [Release-1.20] Cross team testing

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35679:
---
Fix Version/s: 1.20.0

> [Release-1.20] Cross team testing
> -
>
> Key: FLINK-35679
> URL: https://issues.apache.org/jira/browse/FLINK-35679
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
> Fix For: 1.20.0
>
>
> For user facing features that go into the release we'd like to ensure they 
> can actually _be used_ by Flink users. To achieve this the release managers 
> ensure that an issue for cross team testing is created in the Apache Flink 
> Jira. This can and should be picked up by other community members to verify 
> the functionality and usability of the feature.
> The issue should contain some entry points which enable other community 
> members to test it. It should not contain documentation on how to use the 
> feature as this should be part of the actual documentation. The cross team 
> tests are performed after the feature freeze. Documentation should be in 
> place before that. Those tests are manual tests, so do not confuse them with 
> automated tests.
> To sum that up:
>  * User facing features should be tested by other contributors
>  * The scope is usability and sanity of the feature
>  * The feature needs to be already documented
>  * The contributor creates an issue containing some pointers on how to get 
> started (e.g. link to the documentation, suggested targets of verification)
>  * Other community members pick those issues up and provide feedback
>  * Cross team testing happens right after the feature freeze
>  
> 
> h3. Expectations
>  * Jira issues for each expected release task according to the release plan 
> are created and labeled as {{{}release-testing{}}}.
>  * All the created release-testing-related Jira issues are resolved and the 
> corresponding blocker issues are fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35681) [Release-1.20] Select executing Release Manager

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35681:
---
Fix Version/s: 1.20.0
   (was: 1.19.0)

> [Release-1.20]  Select executing Release Manager
> 
>
> Key: FLINK-35681
> URL: https://issues.apache.org/jira/browse/FLINK-35681
> Project: Flink
>  Issue Type: Sub-task
>  Components: Release System
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
> Fix For: 1.20.0
>
>
> h4. GPG Key
> You need to have a GPG key to sign the release artifacts. Please be aware of 
> the ASF-wide [release signing 
> guidelines|https://www.apache.org/dev/release-signing.html]. If you don’t 
> have a GPG key associated with your Apache account, please create one 
> according to the guidelines.
> Determine your Apache GPG Key and Key ID, as follows:
> {code:java}
> $ gpg --list-keys
> {code}
> This will list your GPG keys. One of these should reflect your Apache 
> account, for example:
> {code:java}
> --
> pub   2048R/845E6689 2016-02-23
> uid  Nomen Nescio 
> sub   2048R/BA4D50BE 2016-02-23
> {code}
> In the example above, the key ID is the 8-digit hex string in the {{pub}} 
> line: {{{}845E6689{}}}.
> Now, add your Apache GPG key to the Flink’s {{KEYS}} file in the [Apache 
> Flink release KEYS 
> file|https://dist.apache.org/repos/dist/release/flink/KEYS] repository at 
> [dist.apache.org|http://dist.apache.org/]. Follow the instructions listed at 
> the top of these files. (Note: Only PMC members have write access to the 
> release repository. If you end up getting 403 errors ask on the mailing list 
> for assistance.)
> Configure {{git}} to use this key when signing code by giving it your key ID, 
> as follows:
> {code:java}
> $ git config --global user.signingkey 845E6689
> {code}
> You may drop the {{--global}} option if you’d prefer to use this key for the 
> current repository only.
> You may wish to start {{gpg-agent}} to unlock your GPG key only once using 
> your passphrase. Otherwise, you may need to enter this passphrase hundreds of 
> times. The setup for {{gpg-agent}} varies based on operating system, but may 
> be something like this:
> {code:bash}
> $ eval $(gpg-agent --daemon --no-grab --write-env-file $HOME/.gpg-agent-info)
> $ export GPG_TTY=$(tty)
> $ export GPG_AGENT_INFO
> {code}
> h4. Access to Apache Nexus repository
> Configure access to the [Apache Nexus 
> repository|https://repository.apache.org/], which enables final deployment of 
> releases to the Maven Central Repository.
>  # You log in with your Apache account.
>  # Confirm you have appropriate access by finding {{org.apache.flink}} under 
> {{{}Staging Profiles{}}}.
>  # Navigate to your {{Profile}} (top right drop-down menu of the page).
>  # Choose {{User Token}} from the dropdown, then click {{{}Access User 
> Token{}}}. Copy a snippet of the Maven XML configuration block.
>  # Insert this snippet twice into your global Maven {{settings.xml}} file, 
> typically {{{}${HOME}/.m2/settings.xml{}}}. The end result should look like 
> this, where {{TOKEN_NAME}} and {{TOKEN_PASSWORD}} are your secret tokens:
> {code:xml}
> <settings>
>   <servers>
>     <server>
>       <id>apache.releases.https</id>
>       <username>TOKEN_NAME</username>
>       <password>TOKEN_PASSWORD</password>
>     </server>
>     <server>
>       <id>apache.snapshots.https</id>
>       <username>TOKEN_NAME</username>
>       <password>TOKEN_PASSWORD</password>
>     </server>
>   </servers>
> </settings>
> {code}
> h4. Website development setup
> Get ready for updating the Flink website by following the [website 
> development 
> instructions|https://flink.apache.org/contributing/improve-website.html].
> h4. GNU Tar Setup for Mac (Skip this step if you are not using a Mac)
> The default tar application on Mac does not support GNU archive format and 
> defaults to Pax. This bloats the archive with unnecessary metadata that can 
> result in additional files when decompressing (see [1.15.2-RC2 vote 
> thread|https://lists.apache.org/thread/mzbgsb7y9vdp9bs00gsgscsjv2ygy58q]). 
> Install gnu-tar and create a symbolic link so it is used in preference to the 
> default tar program.
> {code:bash}
> $ brew install gnu-tar
> $ ln -s /usr/local/bin/gtar /usr/local/bin/tar
> $ which tar
> {code}
>  
> 
> h3. Expectations
>  * Release Manager’s GPG key is published to 
> [dist.apache.org|http://dist.apache.org/]
>  * Release Manager’s GPG key is configured in git configuration
>  * Release Manager's GPG key is configured as the default gpg key.
>  * Release Manager has {{org.apache.flink}} listed under Staging Profiles in 
> Nexus
>  * Release Manager’s Nexus User Token is configured in settings.xml



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35680) [Release-1.20] Review Release Notes in JIRA

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35680:
---
Affects Version/s: 1.20.0

> [Release-1.20] Review Release Notes in JIRA
> ---
>
> Key: FLINK-35680
> URL: https://issues.apache.org/jira/browse/FLINK-35680
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Ufuk Celebi
>Priority: Major
>
> JIRA automatically generates Release Notes based on the {{Fix Version}} field 
> applied to issues. Release Notes are intended for Flink users (not Flink 
> committers/contributors). You should ensure that Release Notes are 
> informative and useful.
> Open the release notes from the version status page by choosing the release 
> underway and clicking Release Notes.
> You should verify that the issues listed automatically by JIRA are 
> appropriate to appear in the Release Notes. Specifically, issues should:
>  * Be appropriately classified as {{{}Bug{}}}, {{{}New Feature{}}}, 
> {{{}Improvement{}}}, etc.
>  * Represent noteworthy user-facing changes, such as new functionality, 
> backward-incompatible API changes, or performance improvements.
>  * Have occurred since the previous release; an issue that was introduced and 
> fixed between releases should not appear in the Release Notes.
>  * Have an issue title that makes sense when read on its own.
> Adjust any of the above properties to improve the clarity and presentation of 
> the Release Notes.
> Ensure that the JIRA release notes are also included in the release notes of 
> the documentation (see section "Review and update documentation").
> h4. Content of Release Notes field from JIRA tickets 
> To get the list of "Release Note" field values from JIRA, you can run the 
> following script using the JIRA REST API (note that maxResults limits the 
> number of entries):
> {code:bash}
> curl -s 
> "https://issues.apache.org/jira/rest/api/2/search?maxResults=200&jql=project%20%3D%20FLINK%20AND%20%22Release%20Note%22%20is%20not%20EMPTY%20and%20fixVersion%20%3D%20${RELEASE_VERSION}"
>  | jq '.issues[]|.key,.fields.summary,.fields.customfield_12310192' | paste - 
> - -
> {code}
> {{jq}}  is present in most Linux distributions and on MacOS can be installed 
> via brew.
>  
> 
> h3. Expectations
>  * Release Notes in JIRA have been audited and adjusted



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35683) [Release-1.20] Verify that no exclusions were erroneously added to the japicmp plugin

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35683:
---
Affects Version/s: 1.20.0

> [Release-1.20]  Verify that no exclusions were erroneously added to the 
> japicmp plugin
> --
>
> Key: FLINK-35683
> URL: https://issues.apache.org/jira/browse/FLINK-35683
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
>
> Verify that no exclusions were erroneously added to the japicmp plugin that 
> break compatibility guarantees. Check the exclusions for the 
> japicmp-maven-plugin in the root pom (see 
> [apache/flink:pom.xml:2175ff|https://github.com/apache/flink/blob/3856c49af77601cf7943a5072d8c932279ce46b4/pom.xml#L2175]
>  ) for exclusions that:
> * For minor releases: break source compatibility for {{@Public}} APIs
> * For patch releases: break source/binary compatibility for 
> {{@Public}}/{{@PublicEvolving}}  APIs
> Any such exclusion must be properly justified, in advance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35678) [Release-1.20] Review and update documentation

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35678:
---
Fix Version/s: 1.20.0
   (was: 1.19.0)

> [Release-1.20] Review and update documentation
> --
>
> Key: FLINK-35678
> URL: https://issues.apache.org/jira/browse/FLINK-35678
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> There are a few pages in the documentation that need to be reviewed and 
> updated for each release.
>  * Ensure that there exists a release notes page for each non-bugfix release 
> (e.g., 1.5.0) in {{{}./docs/release-notes/{}}}, that it is up-to-date, and 
> linked from the start page of the documentation.
>  * Upgrading Applications and Flink Versions: 
> [https://ci.apache.org/projects/flink/flink-docs-master/ops/upgrading.html]
>  * ...
>  
> 
> h3. Expectations
>  * Update upgrade compatibility table 
> ([apache-flink:./docs/content/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content/docs/ops/upgrading.md#compatibility-table]
>  and 
> [apache-flink:./docs/content.zh/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content.zh/docs/ops/upgrading.md#compatibility-table]).
>  * Update [Release Overview in 
> Confluence|https://cwiki.apache.org/confluence/display/FLINK/Release+Management+and+Feature+Plan]
>  * (minor only) The documentation for the new major release is visible under 
> [https://nightlies.apache.org/flink/flink-docs-release-$SHORT_RELEASE_VERSION]
>  (after at least one [doc 
> build|https://github.com/apache/flink/actions/workflows/docs.yml] succeeded).
>  * (minor only) The documentation for the new major release does not contain 
> "-SNAPSHOT" in its version title, and all links refer to the corresponding 
> version docs instead of {{{}master{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35679) [Release-1.20] Cross team testing

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35679:
---
Affects Version/s: 1.20.0

> [Release-1.20] Cross team testing
> -
>
> Key: FLINK-35679
> URL: https://issues.apache.org/jira/browse/FLINK-35679
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
>
> For user facing features that go into the release we'd like to ensure they 
> can actually _be used_ by Flink users. To achieve this the release managers 
> ensure that an issue for cross team testing is created in the Apache Flink 
> Jira. This can and should be picked up by other community members to verify 
> the functionality and usability of the feature.
> The issue should contain some entry points which enable other community 
> members to test it. It should not contain documentation on how to use the 
> feature as this should be part of the actual documentation. The cross team 
> tests are performed after the feature freeze. Documentation should be in 
> place before that. Those tests are manual tests, so do not confuse them with 
> automated tests.
> To sum that up:
>  * User facing features should be tested by other contributors
>  * The scope is usability and sanity of the feature
>  * The feature needs to be already documented
>  * The contributor creates an issue containing some pointers on how to get 
> started (e.g. link to the documentation, suggested targets of verification)
>  * Other community members pick those issues up and provide feedback
>  * Cross team testing happens right after the feature freeze
>  
> 
> h3. Expectations
>  * Jira issues for each expected release task according to the release plan 
> are created and labeled as {{{}release-testing{}}}.
>  * All the created release-testing-related Jira issues are resolved and the 
> corresponding blocker issues are fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35678) [Release-1.20] Review and update documentation

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35678:
---
Affects Version/s: 1.20.0
   (was: 1.19.0)

> [Release-1.20] Review and update documentation
> --
>
> Key: FLINK-35678
> URL: https://issues.apache.org/jira/browse/FLINK-35678
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> There are a few pages in the documentation that need to be reviewed and 
> updated for each release.
>  * Ensure that there exists a release notes page for each non-bugfix release 
> (e.g., 1.5.0) in {{{}./docs/release-notes/{}}}, that it is up-to-date, and 
> linked from the start page of the documentation.
>  * Upgrading Applications and Flink Versions: 
> [https://ci.apache.org/projects/flink/flink-docs-master/ops/upgrading.html]
>  * ...
>  
> 
> h3. Expectations
>  * Update upgrade compatibility table 
> ([apache-flink:./docs/content/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content/docs/ops/upgrading.md#compatibility-table]
>  and 
> [apache-flink:./docs/content.zh/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content.zh/docs/ops/upgrading.md#compatibility-table]).
>  * Update [Release Overview in 
> Confluence|https://cwiki.apache.org/confluence/display/FLINK/Release+Management+and+Feature+Plan]
>  * (minor only) The documentation for the new major release is visible under 
> [https://nightlies.apache.org/flink/flink-docs-release-$SHORT_RELEASE_VERSION]
>  (after at least one [doc 
> build|https://github.com/apache/flink/actions/workflows/docs.yml] succeeded).
>  * (minor only) The documentation for the new major release does not contain 
> "-SNAPSHOT" in its version title, and all links refer to the corresponding 
> version docs instead of {{{}master{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35682) [Release-1.20] Create a release branch

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35682:
---
Fix Version/s: 1.20.0
   (was: 1.19.0)

> [Release-1.20]  Create a release branch
> ---
>
> Key: FLINK-35682
> URL: https://issues.apache.org/jira/browse/FLINK-35682
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> If you are doing a new major release, you need to update Flink version in the 
> following repositories and the [AzureCI project 
> configuration|https://dev.azure.com/apache-flink/apache-flink/]:
>  * [apache/flink|https://github.com/apache/flink]
>  * [apache/flink-docker|https://github.com/apache/flink-docker]
>  * [apache/flink-benchmarks|https://github.com/apache/flink-benchmarks]
> Patch releases don't require these repositories to be touched. Simply 
> check out the already existing branch for that version:
> {code:java}
> $ git checkout release-$SHORT_RELEASE_VERSION
> {code}
> h4. Flink repository
> Create a branch for the new version that we want to release before updating 
> the master branch to the next development version:
> {code:bash}
> $ cd ./tools
> tools $ releasing/create_snapshot_branch.sh
> tools $ git checkout master
> tools $ OLD_VERSION=$CURRENT_SNAPSHOT_VERSION 
> NEW_VERSION=$NEXT_SNAPSHOT_VERSION releasing/update_branch_version.sh
> {code}
> In the {{master}} branch, add a new value (e.g. {{v1_16("1.16")}}) to 
> [apache-flink:flink-annotations/src/main/java/org/apache/flink/FlinkVersion|https://github.com/apache/flink/blob/master/flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java]
>  as the last entry:
> {code:java}
> // ...
> v1_12("1.12"),
> v1_13("1.13"),
> v1_14("1.14"),
> v1_15("1.15"),
> v1_16("1.16");
> {code}
> Additionally in master, update the branch list of the GitHub Actions nightly 
> workflow (see 
> [apache/flink:.github/workflows/nightly-trigger.yml#L31ff|https://github.com/apache/flink/blob/master/.github/workflows/nightly-trigger.yml#L31]):
>  The two most-recent releases and master should be covered.
> The newly created branch and updated {{master}} branch need to be pushed to 
> the official repository.
> h4. Flink Docker Repository
> Afterwards fork off from {{dev-master}} a {{dev-x.y}} branch in the 
> [apache/flink-docker|https://github.com/apache/flink-docker] repository. Make 
> sure that 
> [apache/flink-docker:.github/workflows/ci.yml|https://github.com/apache/flink-docker/blob/dev-master/.github/workflows/ci.yml]
>  points to the correct snapshot version; for {{dev-x.y}} it should point to 
> {{x.y-SNAPSHOT}}, while for {{dev-master}} it should point to the most 
> recent snapshot version ({{$NEXT_SNAPSHOT_VERSION}}).
> After pushing the new minor release branch, as the last step you should also 
> update the documentation workflow to also build the documentation for the new 
> release branch. Check [Managing 
> Documentation|https://cwiki.apache.org/confluence/display/FLINK/Managing+Documentation]
>  on details on how to do that. You may also want to manually trigger a build 
> to make the changes visible as soon as possible.
> h4. Flink Benchmark Repository
> First of all, check out a {{dev-x.y}} branch from the {{master}} branch in 
> [apache/flink-benchmarks|https://github.com/apache/flink-benchmarks], so that 
> we have a branch named {{dev-x.y}} which can be built on top of 
> ({{$CURRENT_SNAPSHOT_VERSION}}).
> Then, inside the repository you need to manually update the {{flink.version}} 
> property inside the parent *pom.xml* file. It should be pointing to the most 
> recent snapshot version ($NEXT_SNAPSHOT_VERSION). For example:
> {code:xml}
> <flink.version>1.18-SNAPSHOT</flink.version>
> {code}
> h4. AzureCI Project Configuration
> The new release branch needs to be configured within AzureCI to make Azure 
> aware of the new release branch. This matter can only be handled by Ververica 
> employees since they own the AzureCI setup.
>  
> 
> h3. Expectations (Minor Version only if not stated otherwise)
>  * Release branch has been created and pushed
>  * Changes on the new release branch are picked up by [Azure 
> CI|https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=1&_a=summary]
>  * {{master}} branch has the version information updated to the new version 
> (check pom.xml files and the 
> [apache-flink:flink-annotations/src/main/java/org/apache/flink/FlinkVersion|https://github.com/apache/flink/blob/master/flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java]
>  enum)
> 

[jira] [Updated] (FLINK-35682) [Release-1.20] Create a release branch

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35682:
---
Affects Version/s: 1.20.0
   (was: 1.19.0)

> [Release-1.20]  Create a release branch
> ---
>
> Key: FLINK-35682
> URL: https://issues.apache.org/jira/browse/FLINK-35682
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> If you are doing a new major release, you need to update Flink version in the 
> following repositories and the [AzureCI project 
> configuration|https://dev.azure.com/apache-flink/apache-flink/]:
>  * [apache/flink|https://github.com/apache/flink]
>  * [apache/flink-docker|https://github.com/apache/flink-docker]
>  * [apache/flink-benchmarks|https://github.com/apache/flink-benchmarks]
> Patch releases don't require these repositories to be touched. Simply 
> check out the already existing branch for that version:
> {code:java}
> $ git checkout release-$SHORT_RELEASE_VERSION
> {code}
> h4. Flink repository
> Create a branch for the new version that we want to release before updating 
> the master branch to the next development version:
> {code:bash}
> $ cd ./tools
> tools $ releasing/create_snapshot_branch.sh
> tools $ git checkout master
> tools $ OLD_VERSION=$CURRENT_SNAPSHOT_VERSION 
> NEW_VERSION=$NEXT_SNAPSHOT_VERSION releasing/update_branch_version.sh
> {code}
> In the {{master}} branch, add a new value (e.g. {{v1_16("1.16")}}) to 
> [apache-flink:flink-annotations/src/main/java/org/apache/flink/FlinkVersion|https://github.com/apache/flink/blob/master/flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java]
>  as the last entry:
> {code:java}
> // ...
> v1_12("1.12"),
> v1_13("1.13"),
> v1_14("1.14"),
> v1_15("1.15"),
> v1_16("1.16");
> {code}
> Additionally in master, update the branch list of the GitHub Actions nightly 
> workflow (see 
> [apache/flink:.github/workflows/nightly-trigger.yml#L31ff|https://github.com/apache/flink/blob/master/.github/workflows/nightly-trigger.yml#L31]):
>  The two most-recent releases and master should be covered.
> The newly created branch and updated {{master}} branch need to be pushed to 
> the official repository.
> h4. Flink Docker Repository
> Afterwards fork off from {{dev-master}} a {{dev-x.y}} branch in the 
> [apache/flink-docker|https://github.com/apache/flink-docker] repository. Make 
> sure that 
> [apache/flink-docker:.github/workflows/ci.yml|https://github.com/apache/flink-docker/blob/dev-master/.github/workflows/ci.yml]
>  points to the correct snapshot version; for {{dev-x.y}} it should point to 
> {{x.y-SNAPSHOT}}, while for {{dev-master}} it should point to the most 
> recent snapshot version ({{$NEXT_SNAPSHOT_VERSION}}).
> After pushing the new minor release branch, as the last step you should also 
> update the documentation workflow to also build the documentation for the new 
> release branch. Check [Managing 
> Documentation|https://cwiki.apache.org/confluence/display/FLINK/Managing+Documentation]
>  on details on how to do that. You may also want to manually trigger a build 
> to make the changes visible as soon as possible.
> h4. Flink Benchmark Repository
> First of all, check out a {{dev-x.y}} branch from the {{master}} branch in 
> [apache/flink-benchmarks|https://github.com/apache/flink-benchmarks], so that 
> we have a branch named {{dev-x.y}} which can be built on top of 
> ({{$CURRENT_SNAPSHOT_VERSION}}).
> Then, inside the repository you need to manually update the {{flink.version}} 
> property inside the parent *pom.xml* file. It should be pointing to the most 
> recent snapshot version ($NEXT_SNAPSHOT_VERSION). For example:
> {code:xml}
> <flink.version>1.18-SNAPSHOT</flink.version>
> {code}
> h4. AzureCI Project Configuration
> The new release branch needs to be configured within AzureCI to make Azure 
> aware of the new release branch. This matter can only be handled by Ververica 
> employees since they own the AzureCI setup.
>  
> 
> h3. Expectations (Minor Version only if not stated otherwise)
>  * Release branch has been created and pushed
>  * Changes on the new release branch are picked up by [Azure 
> CI|https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=1&_a=summary]
>  * {{master}} branch has the version information updated to the new version 
> (check pom.xml files and the 
> [apache-flink:flink-annotations/src/main/java/org/apache/flink/FlinkVersion|https://github.com/apache/flink/blob/master/flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java]
>  enum)
> 

[jira] [Commented] (FLINK-20539) Type mismatch when using ROW in computed column

2024-06-24 Thread xuyang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859708#comment-17859708
 ] 

xuyang commented on FLINK-20539:


I'll take a look soon.

> Type mismatch when using ROW in computed column
> ---
>
> Key: FLINK-20539
> URL: https://issues.apache.org/jira/browse/FLINK-20539
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: xuyang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.19.0, 1.18.2
>
>
> The following SQL:
> {code}
> env.executeSql(
>   "CREATE TABLE Orders (\n"
>   + "order_number BIGINT,\n"
>   + "priceINT,\n"
>   + "first_name   STRING,\n"
>   + "last_nameSTRING,\n"
>   + "buyer_name AS ROW(first_name, last_name)\n"
>   + ") WITH (\n"
>   + "  'connector' = 'datagen'\n"
>   + ")");
> env.executeSql("SELECT * FROM Orders").print();
> {code}
> Fails with:
> {code}
> Exception in thread "main" java.lang.AssertionError: Conversion to relational 
> algebra failed to preserve datatypes:
> validated type:
> RecordType(BIGINT order_number, INTEGER price, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" first_name, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> last_name, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" EXPR$0, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$1) NOT 
> NULL buyer_name) NOT NULL
> converted type:
> RecordType(BIGINT order_number, INTEGER price, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" first_name, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> last_name, RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$0, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$1) NOT NULL buyer_name) NOT 
> NULL
> rel:
> LogicalProject(order_number=[$0], price=[$1], first_name=[$2], 
> last_name=[$3], buyer_name=[ROW($2, $3)])
>   LogicalTableScan(table=[[default_catalog, default_database, Orders]])
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:467)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:582)
> {code}
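> A possible workaround until this is fixed (an untested sketch, not a 
> confirmed fix): give the computed column an explicit type via CAST so that 
> the validated and converted row types agree:
> {code}
> buyer_name AS CAST(ROW(first_name, last_name) AS ROW<first_name STRING, last_name STRING>)
> {code}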



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35553][runtime] Wires up the RescaleManager with the CheckpointLifecycleListener interface [flink]

2024-06-24 Thread via GitHub


ztison commented on code in PR #24912:
URL: https://github.com/apache/flink/pull/24912#discussion_r1651071044


##
docs/layouts/shortcodes/generated/all_jobmanager_section.html:
##
@@ -26,6 +32,12 @@
 Duration
 The maximum time the JobManager will wait to acquire all required resources 
after a job submission or restart. Once elapsed it will try to run the job 
with a lower parallelism, or fail if the minimum amount of resources could 
not be acquired. Increasing this value will make the cluster more resilient 
against temporary resource shortages (e.g., there is more time for a failed 
TaskManager to be restarted). Setting a negative duration will disable the 
resource timeout: the JobManager will wait indefinitely for resources to 
appear. If scheduler-mode is configured to REACTIVE, this configuration 
value will default to a negative value to disable the resource timeout.

+jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count
+2
+Integer
+The number of subsequent failed checkpoints that will initiate rescaling.

Review Comment:
   Yes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35687] JSON_QUERY should return a well formatted nested objects/arrays for ARRAY<STRING> [flink]

2024-06-24 Thread via GitHub


flinkbot commented on PR #24976:
URL: https://github.com/apache/flink/pull/24976#issuecomment-2186630310

   
   ## CI report:
   
   * dda23d4c2d9dbefecb9cb8533d076ed8bb8c9a8f UNKNOWN
   
   
   Bot commands:
   The @flinkbot bot supports the following commands:

   - `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-35687] JSON_QUERY should return a well formatted nested objects/arrays for ARRAY<STRING> [flink]

2024-06-24 Thread via GitHub


dawidwys opened a new pull request, #24976:
URL: https://github.com/apache/flink/pull/24976

   
   ## What is the purpose of the change
   
   Fix a bug where objects and arrays were incorrectly formatted for 
`JSON_QUERY` with the `RETURNING ARRAY<STRING>` clause.
   
   ## Verifying this change
   
   Added tests in `JsonFunctionsITCase`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35687) JSON_QUERY should return a well formatted nested objects/arrays for ARRAY<STRING>

2024-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35687:
---
Labels: pull-request-available  (was: )

> JSON_QUERY should return a well formatted nested objects/arrays for 
> ARRAY<STRING>
> -
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> ['{itemId=1234, count=10}']
> {code}
> but it should:
> {code}
> ['{"itemId":1234, "count":10}']
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35687) JSON_QUERY should return a well formatted nested objects/arrays for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-35687:
-
Summary: JSON_QUERY should return a well formatted nested objects/arrays 
for ARRAY<STRING>  (was: JSON_QUERY should return a proper JSON for 
ARRAY<STRING>)

> JSON_QUERY should return a well formatted nested objects/arrays for 
> ARRAY<STRING>
> -
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> ['{itemId=1234, count=10}']
> {code}
> but it should:
> {code}
> ['{"itemId":1234, "count":10}']
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35553][runtime] Wires up the RescaleManager with the CheckpointLifecycleListener interface [flink]

2024-06-24 Thread via GitHub


XComp commented on code in PR #24912:
URL: https://github.com/apache/flink/pull/24912#discussion_r1651045471


##
docs/layouts/shortcodes/generated/all_jobmanager_section.html:
##
@@ -26,6 +32,12 @@
 Duration
 The maximum time the JobManager will wait to acquire all required resources 
after a job submission or restart. Once elapsed it will try to run the job 
with a lower parallelism, or fail if the minimum amount of resources could 
not be acquired. Increasing this value will make the cluster more resilient 
against temporary resource shortages (e.g., there is more time for a failed 
TaskManager to be restarted). Setting a negative duration will disable the 
resource timeout: the JobManager will wait indefinitely for resources to 
appear. If scheduler-mode is configured to REACTIVE, this configuration 
value will default to a negative value to disable the resource timeout.

+jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count
+2
+Integer
+The number of subsequent failed checkpoints that will initiate rescaling.

Review Comment:
   I updated the description to something like the following:
   > The number of consecutive failed checkpoints that will trigger rescaling 
even in the absence of a completed checkpoint.
   
   Does this work for you?
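
   For illustration, the new option would be set in Flink's config.yaml like 
any other option (a sketch; the key name comes from the generated docs above 
and 2 is its default value):

   jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count: 2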



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-35687:
-
Description: 
{code}
SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
RETURNING ARRAY<STRING>)
{code}

returns

{code}
['{itemId=1234, count=10}']
{code}

but it should:

{code}
['{"itemId":1234, "count":10}']
{code}

We should call jsonize for Collection types here: 
https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268

  was:
{code}
SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
RETURNING ARRAY<STRING>)
{code}

returns

{code}
["{itemId=1234, count=10}"]
{code}

but it should:

{code}
["{"itemId":1234, "count":10}"]
{code}

We should call jsonize for Collection types here: 
https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268


> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> ['{itemId=1234, count=10}']
> {code}
> but it should:
> {code}
> ['{"itemId":1234, "count":10}']
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-35687:
-
Description: 
{code}
SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
RETURNING ARRAY<STRING>)
{code}

returns

{code}
["{itemId=1234, count=10}"]
{code}

but it should:

{code}
["{"itemId":1234, "count":10}"]
{code}

We should call jsonize for Collection types here: 
https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268

  was:
{code}
SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
RETURNING ARRAY<STRING>)
{code}

returns

{code}
{itemId=1234, count=10}
{code}

but it should:

{code}
{"itemId":1234, "count":10}
{code}

We should call jsonize for Collection types here: 
https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268


> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> ["{itemId=1234, count=10}"]
> {code}
> but it should:
> {code}
> ["{"itemId":1234, "count":10}"]
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859675#comment-17859675
 ] 

Dawid Wysakowicz commented on FLINK-35687:
--

I think it is an issue of the RETURNING ARRAY<STRING> clause, which is part of 
1.20, so I'll fix it there.

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread david radley (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859674#comment-17859674
 ] 

david radley commented on FLINK-35687:
--

[~dwysakowicz] Thanks for clarifying. 
 * are you saying that this is not an issue before the RETURNING keyword was 
added? I had thought that the default would be to return a string. Does this 
string have the _=_ signs in it? The response in the example looks like a 
String (i.e. what would have been returned before the RETURNING keyword was 
added).
 * the example given has ARRAY<STRING>. Shouldn't this return an array of 
Strings in this case and not a JSON Object?
 * If this behaviour is new in 1.19 then I agree that correcting it in 1.19.1 
is not an issue. If it has been around since 1.15, there could be more of a 
migration impact.

WDYT?

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859672#comment-17859672
 ] 

Dawid Wysakowicz commented on FLINK-35687:
--

Actually, it hasn't landed in 1.19.x, but is part of 1.20, so I'll remove 
1.19.2 as the target.

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-35687:
-
Fix Version/s: (was: 1.19.2)

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread david radley (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859664#comment-17859664
 ] 

david radley edited comment on FLINK-35687 at 6/24/24 1:07 PM:
---

[~dwysakowicz] I agree your proposal does look better and matches the 
documentation. I am concerned that there is a migration issue, as you have 
targeted 1.20 and 1.19 and this is an SQL API change (even if it is a 
correction). Existing applications since Flink 1.15 (when this came in) will 
expect the existing response - so are likely to break and need changing. If we 
want to target 1.19 and 1.20, I suggest having a config switch that would 
enable this behaviour, or wait for Flink 2.0 - where more breaking changes are 
appropriate.  


was (Author: JIRAUSER300523):
[~dwysakowicz] I agree your proposal does look better. I am concerned that 
there is a migration issue, as you have targeted 1.20 and 1.19 and this is an 
SQL API change. Existing applications since Flink 1.15 (when this came in) will 
expect the existing response - so are likely to break and need changing. If we 
want to target 1.19 and 1.20, I suggest having a config switch that would 
enable this behaviour, or wait for Flink 2.0 - where more breaking changes are 
appropriate.  

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0, 1.19.2
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859669#comment-17859669
 ] 

Dawid Wysakowicz commented on FLINK-35687:
--

> Existing applications since Flink 1.15 (when this came in) will expect the 
> existing response - so are likely to break and need changing

That's not true. `RETURNING` clause was introduced in 1.19. 

I don't think we should maintain incorrect behaviour just because someone 
depends on an incorrect behaviour. 

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0, 1.19.2
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859669#comment-17859669
 ] 

Dawid Wysakowicz edited comment on FLINK-35687 at 6/24/24 1:05 PM:
---

> Existing applications since Flink 1.15 (when this came in) will expect the 
> existing response - so are likely to break and need changing

That's not true. `RETURNING` clause was introduced in 1.19. 

Moreover, I don't think we should maintain incorrect behaviour just because 
someone depends on an incorrect behaviour. 


was (Author: dawidwys):
> Existing applications since Flink 1.15 (when this came in) will expect the 
> existing response - so are likely to break and need changing

That's not true. `RETURNING` clause was introduced in 1.19. 

I don't think we should maintain incorrect behaviour just because someone 
depends on an incorrect behaviour. 

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0, 1.19.2
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread david radley (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859664#comment-17859664
 ] 

david radley commented on FLINK-35687:
--

[~dwysakowicz] I agree your proposal does look better. I am concerned that 
there is a migration issue, as you have targeted 1.20 and 1.19 and this is an 
SQL API change. Existing applications since Flink 1.15 (when this came in) will 
expect the existing response - so are likely to break and need changing. If we 
want to target 1.19 and 1.20, I suggest having a config switch that would 
enable this behaviour, or wait for Flink 2.0 - where more breaking changes are 
appropriate.  

> JSON_QUERY should return a proper JSON for ARRAY<STRING>
> 
>
> Key: FLINK-35687
> URL: https://issues.apache.org/jira/browse/FLINK-35687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.1
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
> Fix For: 1.20.0, 1.19.2
>
>
> {code}
> SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
> RETURNING ARRAY<STRING>)
> {code}
> returns
> {code}
> {itemId=1234, count=10}
> {code}
> but it should:
> {code}
> {"itemId":1234, "count":10}
> {code}
> We should call jsonize for Collection types here: 
> https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35552][runtime] Moves CheckpointStatsTracker out of DefaultExecutionGraphFactory into Scheduler [flink]

2024-06-24 Thread via GitHub


XComp commented on PR #24911:
URL: https://github.com/apache/flink/pull/24911#issuecomment-2186476457

   Force-pushed the rebase onto the most-recent version of base PR #24910 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35551][runtime] Introduces RescaleManager#onTrigger endpoint [flink]

2024-06-24 Thread via GitHub


XComp commented on PR #24910:
URL: https://github.com/apache/flink/pull/24910#issuecomment-2186472875

   Rebased this PR onto the most-recent version of the base PR #24909 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35550][runtime] Move rescaling functionality into dedicated class RescaleManager [flink]

2024-06-24 Thread via GitHub


XComp commented on PR #24909:
URL: https://github.com/apache/flink/pull/24909#issuecomment-2186462260

   Final force-push to rebase onto the most-recent `master`. Can you approve this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-35687) JSON_QUERY should return a proper JSON for ARRAY<STRING>

2024-06-24 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-35687:


 Summary: JSON_QUERY should return a proper JSON for ARRAY<STRING>
 Key: FLINK-35687
 URL: https://issues.apache.org/jira/browse/FLINK-35687
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.1
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.20.0, 1.19.2


{code}
SELECT JSON_QUERY('{"items": [{"itemId":1234, "count":10}]}', '$.items' 
RETURNING ARRAY<STRING>)
{code}

returns

{code}
{itemId=1234, count=10}
{code}

but it should:

{code}
{"itemId":1234, "count":10}
{code}

We should call jsonize for Collection types here: 
https://github.com/apache/flink/blob/f6f88135b3a5fa5616fe905346e5ab6dce084555/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlJsonUtils.java#L268
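
A minimal sketch of the suggested direction (a hypothetical helper, not the 
actual patch; it assumes a Jackson ObjectMapper, while SqlJsonUtils uses its 
own internal JSON facilities):

{code:java}
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Collection;
import java.util.Map;

final class JsonizeSketch {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    /** Re-serializes nested maps/collections as JSON text instead of relying on toString(). */
    static Object jsonizeIfNested(Object value) throws JsonProcessingException {
        if (value instanceof Map || value instanceof Collection) {
            // toString() would yield e.g. {itemId=1234, count=10};
            // writeValueAsString yields well-formed JSON such as {"itemId":1234,"count":10}.
            return MAPPER.writeValueAsString(value);
        }
        return value;
    }
}
{code}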



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35550][runtime] Move rescaling functionality into dedicated class RescaleManager [flink]

2024-06-24 Thread via GitHub


XComp commented on PR #24909:
URL: https://github.com/apache/flink/pull/24909#issuecomment-2186460177

   argh, I forgot the commit that moves the factory instantiation into the 
`AdaptiveScheduler` constructor 
(https://github.com/apache/flink/pull/24909/commits/1fca0a10ff19d038dbac3722cf5047de4fba45fc).
 The subsequent force-push squashes the changes once more 
([diff](https://github.com/apache/flink/compare/1fca0a10ff19d038dbac3722cf5047de4fba45fc..1993e33efdd50f42547ffd36385835110a4d1169)).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-35585) Add documentation for distribution

2024-06-24 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-35585.

Fix Version/s: 1.20.0
   Resolution: Fixed

Fixed in master: f6f88135b3a5fa5616fe905346e5ab6dce084555

> Add documentation for distribution
> --
>
> Key: FLINK-35585
> URL: https://issues.apache.org/jira/browse/FLINK-35585
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Jim Hughes
>Assignee: Jim Hughes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> Add documentation for ALTER TABLE, CREATE TABLE, and the sink abilities.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-20539) Type mismatch when using ROW in computed column

2024-06-24 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859658#comment-17859658
 ] 

Martijn Visser commented on FLINK-20539:


[~xuyangzhong] [~qingyue] I've re-opened the ticket because it indeed doesn't 
yet work. Can you take a look?

> Type mismatch when using ROW in computed column
> ---
>
> Key: FLINK-20539
> URL: https://issues.apache.org/jira/browse/FLINK-20539
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: xuyang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.19.0, 1.18.2
>
>
> The following SQL:
> {code}
> env.executeSql(
>   "CREATE TABLE Orders (\n"
>   + "order_number BIGINT,\n"
>   + "priceINT,\n"
>   + "first_name   STRING,\n"
>   + "last_nameSTRING,\n"
>   + "buyer_name AS ROW(first_name, last_name)\n"
>   + ") WITH (\n"
>   + "  'connector' = 'datagen'\n"
>   + ")");
> env.executeSql("SELECT * FROM Orders").print();
> {code}
> Fails with:
> {code}
> Exception in thread "main" java.lang.AssertionError: Conversion to relational 
> algebra failed to preserve datatypes:
> validated type:
> RecordType(BIGINT order_number, INTEGER price, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" first_name, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> last_name, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" EXPR$0, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$1) NOT 
> NULL buyer_name) NOT NULL
> converted type:
> RecordType(BIGINT order_number, INTEGER price, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" first_name, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> last_name, RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$0, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$1) NOT NULL buyer_name) NOT 
> NULL
> rel:
> LogicalProject(order_number=[$0], price=[$1], first_name=[$2], 
> last_name=[$3], buyer_name=[ROW($2, $3)])
>   LogicalTableScan(table=[[default_catalog, default_database, Orders]])
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:467)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:582)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-20539) Type mismatch when using ROW in computed column

2024-06-24 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser reopened FLINK-20539:


> Type mismatch when using ROW in computed column
> ---
>
> Key: FLINK-20539
> URL: https://issues.apache.org/jira/browse/FLINK-20539
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: xuyang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.19.0, 1.18.2
>
>
> The following SQL:
> {code}
> env.executeSql(
>   "CREATE TABLE Orders (\n"
>   + "order_number BIGINT,\n"
>   + "priceINT,\n"
>   + "first_name   STRING,\n"
>   + "last_nameSTRING,\n"
>   + "buyer_name AS ROW(first_name, last_name)\n"
>   + ") WITH (\n"
>   + "  'connector' = 'datagen'\n"
>   + ")");
> env.executeSql("SELECT * FROM Orders").print();
> {code}
> Fails with:
> {code}
> Exception in thread "main" java.lang.AssertionError: Conversion to relational 
> algebra failed to preserve datatypes:
> validated type:
> RecordType(BIGINT order_number, INTEGER price, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" first_name, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> last_name, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" EXPR$0, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$1) NOT 
> NULL buyer_name) NOT NULL
> converted type:
> RecordType(BIGINT order_number, INTEGER price, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" first_name, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> last_name, RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$0, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" EXPR$1) NOT NULL buyer_name) NOT 
> NULL
> rel:
> LogicalProject(order_number=[$0], price=[$1], first_name=[$2], 
> last_name=[$3], buyer_name=[ROW($2, $3)])
>   LogicalTableScan(table=[[default_catalog, default_database, Orders]])
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:467)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:582)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35550][runtime] Move rescaling functionality into dedicated class RescaleManager [flink]

2024-06-24 Thread via GitHub


XComp commented on PR #24909:
URL: https://github.com/apache/flink/pull/24909#issuecomment-2186454160

   Thanks for your reviews, @ztison and @1996fanrui. I addressed your comments 
and squashed the commits in a separate force-push (see 
[diff](https://github.com/apache/flink/compare/e365b8860b2f0bb95865d0d9fd67e2d0fbf60e1d..d5365715f4f81b2d6aabdcafdbb737c8e634d787)).
   
   @1996fanrui Yes, this PR is solely about collecting the rescale-related code 
in the `Executing` state to prepare for the follow-up PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Update master version to 2.0-SNAPSHOT [flink]

2024-06-24 Thread via GitHub


reswqa commented on code in PR #24974:
URL: https://github.com/apache/flink/pull/24974#discussion_r1650907110


##
.github/workflows/nightly-trigger.yml:
##
@@ -31,8 +31,8 @@ jobs:
   matrix:
 branch:
   - master
+  - release-1.20
   - release-1.19
-  - release-1.18

Review Comment:
   Ah, makes sense. Re-added the removed `1.18`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35585] Add documentation for distribution [flink]

2024-06-24 Thread via GitHub


twalthr merged PR #24929:
URL: https://github.com/apache/flink/pull/24929


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Update master version to 2.0-SNAPSHOT [flink]

2024-06-24 Thread via GitHub


reswqa commented on PR #24974:
URL: https://github.com/apache/flink/pull/24974#issuecomment-2186413637

   Thanks @XComp and @1996fanrui for the review!
   
   > I know that we agreed on 1.20 being the last minor 1.x release. But did 
you reconfirm with the 2.0 release managers on the state of the 2.0 work?
   
   Yes, I have confirmed this with Xintong and Jark.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35550][runtime] Move rescaling functionality into dedicated class RescaleManager [flink]

2024-06-24 Thread via GitHub


XComp commented on code in PR #24909:
URL: https://github.com/apache/flink/pull/24909#discussion_r1650914361


##
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/AdaptiveScheduler.java:
##
@@ -1048,8 +1044,8 @@ public void goToExecuting(
 this,
 userCodeClassLoader,
 failureCollection,
-settings.getScalingIntervalMin(),
-settings.getScalingIntervalMax()));
+DefaultRescaleManager.Factory.fromSettings(settings),

Review Comment:
   sure, good point. That's possible.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Update master version to 2.0-SNAPSHOT [flink]

2024-06-24 Thread via GitHub


reswqa commented on code in PR #24974:
URL: https://github.com/apache/flink/pull/24974#discussion_r1650907110


##
.github/workflows/nightly-trigger.yml:
##
@@ -31,8 +31,8 @@ jobs:
   matrix:
 branch:
   - master
+  - release-1.20
   - release-1.19
-  - release-1.18

Review Comment:
   Ah, makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Update master version to 2.0-SNAPSHOT [flink]

2024-06-24 Thread via GitHub


XComp commented on code in PR #24974:
URL: https://github.com/apache/flink/pull/24974#discussion_r1650882902


##
.github/workflows/nightly-trigger.yml:
##
@@ -31,8 +31,8 @@ jobs:
   matrix:
 branch:
   - master
+  - release-1.20
   - release-1.19
-  - release-1.18

Review Comment:
   Don't remove the 1.18 release branch just yet. There might be another 
1.18.2 being released before 1.18 finally reaches end-of-life.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35656][hive] Fix the issue that Hive Source incorrectly set max parallelism in dynamic inference mode [flink]

2024-06-24 Thread via GitHub


SinBex commented on PR #24962:
URL: https://github.com/apache/flink/pull/24962#issuecomment-2186295086

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35022][Connector/DynamoDB] Add TypeInformed DDB Element Converter as default element converter [flink-connector-aws]

2024-06-24 Thread via GitHub


hlteoh37 commented on PR #136:
URL: 
https://github.com/apache/flink-connector-aws/pull/136#issuecomment-2186279507

   Hi @vahmed-hamdy, my preference would be for us to refactor the code in 
`DynamoDbTypeInformedElementConverter` to use a cleaner structure rather than a 
long list of `if-else`. Is this something we need to close urgently?
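
   For illustration, one "cleaner structure" could be a type-keyed dispatch 
map instead of an if-else chain (a hypothetical sketch, not code from this 
PR):

   import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
   import java.util.HashMap;
   import java.util.Map;
   import java.util.function.Function;

   final class ConverterRegistrySketch {
       // Each supported type registers its own conversion once, instead of
       // being handled by one branch of a long if-else chain.
       private static final Map<Class<?>, Function<Object, AttributeValue>> REGISTRY = new HashMap<>();

       static {
           REGISTRY.put(String.class, v -> AttributeValue.builder().s((String) v).build());
           REGISTRY.put(Integer.class, v -> AttributeValue.builder().n(v.toString()).build());
           REGISTRY.put(Boolean.class, v -> AttributeValue.builder().bool((Boolean) v).build());
       }

       static AttributeValue convert(Object value) {
           Function<Object, AttributeValue> f = REGISTRY.get(value.getClass());
           if (f == null) {
               throw new IllegalArgumentException("Unsupported type: " + value.getClass());
           }
           return f.apply(value);
       }
   }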
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35022][Connector/DynamoDB] Add TypeInformed DDB Element Converter as default element converter [flink-connector-aws]

2024-06-24 Thread via GitHub


hlteoh37 commented on code in PR #136:
URL: 
https://github.com/apache/flink-connector-aws/pull/136#discussion_r1650817669


##
flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/sink/DynamoDbTypeInformedElementConverter.java:
##
@@ -0,0 +1,380 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.dynamodb.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.typeinfo.BasicArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.NumericTypeInfo;
+import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.CompositeType;
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.api.java.tuple.Tuple;
+import org.apache.flink.api.java.typeutils.ObjectArrayTypeInfo;
+import org.apache.flink.api.java.typeutils.PojoTypeInfo;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.connector.base.sink.writer.ElementConverter;
+import 
org.apache.flink.connector.dynamodb.table.converter.ArrayAttributeConverterProvider;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.FlinkRuntimeException;
+import org.apache.flink.util.Preconditions;
+
+import software.amazon.awssdk.core.SdkBytes;
+import software.amazon.awssdk.enhanced.dynamodb.AttributeConverter;
+import software.amazon.awssdk.enhanced.dynamodb.AttributeConverterProvider;
+import software.amazon.awssdk.enhanced.dynamodb.AttributeValueType;
+import software.amazon.awssdk.enhanced.dynamodb.EnhancedType;
+import software.amazon.awssdk.enhanced.dynamodb.TableSchema;
+import 
software.amazon.awssdk.enhanced.dynamodb.internal.mapper.BeanAttributeGetter;
+import software.amazon.awssdk.enhanced.dynamodb.mapper.StaticTableSchema;
+import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
+
+import java.beans.BeanInfo;
+import java.beans.IntrospectionException;
+import java.beans.Introspector;
+import java.beans.PropertyDescriptor;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.function.Function;
+
+/**
+ * An {@link ElementConverter} that converts an element to a {@link 
DynamoDbWriteRequest} using
+ * the provided TypeInformation.
+ */
+@PublicEvolving
+public class DynamoDbTypeInformedElementConverter<T>
+implements ElementConverter<T, DynamoDbWriteRequest> {
+
+private final CompositeType<T> typeInfo;
+private final boolean ignoreNulls;
+private final TableSchema<T> tableSchema;
+
+/**
+ * Creates a {@link DynamoDbTypeInformedElementConverter} that converts an 
element to a {@link
+ * DynamoDbWriteRequest} using the provided {@link CompositeType}. Usage: 
{@code new
+ * 
DynamoDbTypeInformedElementConverter<>(TypeInformation.of(MyPojoClass.class))}
+ *
+ * @param typeInfo The {@link CompositeType} that provides the type 
information for the element.
+ */
+public DynamoDbTypeInformedElementConverter(CompositeType<T> typeInfo) {
+this(typeInfo, true);
+}
+
+public DynamoDbTypeInformedElementConverter(CompositeType<T> typeInfo, 
boolean ignoreNulls) {
+
+try {
+this.typeInfo = typeInfo;
+this.ignoreNulls = ignoreNulls;
+this.tableSchema = createTableSchema(typeInfo);
+} catch (IntrospectionException | IllegalStateException | 
IllegalArgumentException e) {
+throw new FlinkRuntimeException("Failed to extract DynamoDb table 
schema", e);
+}
+}
+
+@Override
+public DynamoDbWriteRequest apply(T input, SinkWriter.Context context) {
+Preconditions.checkNotNull(tableSchema, "TableSchema is not 
initialized");
+try {
+return DynamoDbWriteRequest.builder()
+.setType(DynamoDbWriteRequestType.PUT)
+.setItem(tableSchema.itemToMap(input, ignoreNulls))
+.build();
+} catch 

[jira] [Updated] (FLINK-35656) Hive Source has issues setting max parallelism in dynamic inference mode

2024-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35656:
---
Labels: pull-request-available  (was: )

> Hive Source has issues setting max parallelism in dynamic inference mode
> 
>
> Key: FLINK-35656
> URL: https://issues.apache.org/jira/browse/FLINK-35656
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> In the dynamic parallelism inference mode of Hive Source, when 
> `table.exec.hive.infer-source-parallelism.max` is not configured, it does not 
> use `execution.batch.adaptive.auto-parallelism.default-source-parallelism` as 
> the upper bound for parallelism inference, which is inconsistent with the 
> behavior described in the documentation.
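To make the expected behavior concrete, here is a minimal sketch of the documented fallback order. It is illustrative only, not the actual patch; the method and parameter names are invented.

{code:java}
// Illustrative sketch, not the actual patch. When
// table.exec.hive.infer-source-parallelism.max is unset (modeled as null),
// execution.batch.adaptive.auto-parallelism.default-source-parallelism should
// serve as the upper bound for dynamic source parallelism inference.
static int resolveInferenceUpperBound(
        Integer hiveInferSourceParallelismMax, int adaptiveDefaultSourceParallelism) {
    return hiveInferSourceParallelismMax != null
            ? hiveInferSourceParallelismMax
            : adaptiveDefaultSourceParallelism;
}
{code}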



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35656][hive] Fix the issue that Hive Source incorrectly set max parallelism in dynamic inference mode [flink]

2024-06-24 Thread via GitHub


zhuzhurk commented on code in PR #24962:
URL: https://github.com/apache/flink/pull/24962#discussion_r1650686355


##
flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/HiveSourceTest.java:
##
@@ -202,14 +195,16 @@ void testDynamicParallelismInferenceWithFiltering() throws Exception {
                                                tablePath.getDatabaseName(),
                                                tablePath.getObjectName(),
                                                new LinkedHashMap<>(spec)))
-                        .collect(Collectors.toList()))
-                .buildWithDefaultBulkFormat();
+                        .collect(Collectors.toList()));
+        if (limit != null) {

Review Comment:
   It's better to use `long` and use `-1` to represent `no limit`.
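   A small sketch of the suggested pattern (hypothetical names, not the merged test code):

   ```java
   // Sketch of the reviewer's suggestion: a primitive long with -1 as the
   // "no limit" sentinel, instead of a nullable Long.
   class LimitSetting {
       static final long NO_LIMIT = -1L;

       private long limit = NO_LIMIT;

       void setLimit(long limit) {
           this.limit = limit;
       }

       boolean hasLimit() {
           return limit != NO_LIMIT;
       }
   }
   ```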



##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveDynamicParallelismInferenceFactory.java:
##
@@ -62,6 +62,18 @@ public HiveParallelismInference create() {
                                globalMaxParallelism),
                        globalMaxParallelism);
        int parallelism = ExecutionConfig.PARALLELISM_DEFAULT;
+       reSetInferMaxParallelism(jobConf, inferMaxParallelism);
        return new HiveParallelismInference(tablePath, infer, inferMaxParallelism, parallelism);
    }
+
+   /**
+    * Resets the inferred max source parallelism in jobConf, so that {@link
+    * HiveSourceFileEnumerator#createInputSplits} will create InputSplits based on the
+    * inferMaxParallelism.
+    */
+   private void reSetInferMaxParallelism(JobConf jobConf, int inferMaxParallelism) {

Review Comment:
   reSetInferMaxParallelism -> adjustInferMaxParallelism



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-33607) Add checksum verification for Maven wrapper as well

2024-06-24 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo reassigned FLINK-33607:
--

Assignee: Luke Chen

> Add checksum verification for Maven wrapper as well
> ---
>
> Key: FLINK-33607
> URL: https://issues.apache.org/jira/browse/FLINK-33607
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Matthias Pohl
>Assignee: Luke Chen
>Priority: Major
>  Labels: pull-request-available
>
> FLINK-33503 enabled us to add checksum checks for the Maven wrapper binaries 
> along with the update from 3.1.0 to 3.2.0.
> But there seems to be an issue with verifying the wrapper's checksum under 
> Windows (see the [related PR discussion in 
> Guava|https://github.com/google/guava/pull/6807/files]).
> This issue covers the fix, to be applied as soon as MWRAPPER-103 is resolved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35022][Connector/DynamoDB] Add TypeInformed DDB Element Converter as default element converter [flink-connector-aws]

2024-06-24 Thread via GitHub


vahmed-hamdy commented on PR #136:
URL: 
https://github.com/apache/flink-connector-aws/pull/136#issuecomment-2186211486

   @hlteoh37 Do you have any other comments, or could we merge this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33607][build]: Add checksum verification for Maven wrapper [flink]

2024-06-24 Thread via GitHub


showuon commented on PR #24852:
URL: https://github.com/apache/flink/pull/24852#issuecomment-2186208571

   @XComp , I added another GitHub Actions test today. I modified the `wrapperSha256Sum` to verify that a mismatch is detected successfully, and the result works as expected [here](https://github.com/showuon/flink/actions/runs/9643660323/job/26594117819).
 
   ```
   Run ./mvnw clean package -DskipTests
   Error: Failed to validate Maven wrapper SHA-256, your Maven wrapper might be compromised.
   Investigate or delete /home/runner/work/flink/flink/.mvn/wrapper/maven-wrapper.jar to attempt a clean download.
   If you updated your Maven version, you need to update the specified wrapperSha256Sum property.
   Error: Process completed with exit code 1.
   ```
   
   I think this PR is all verified. FYI.
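   For reference, the verification is driven by `.mvn/wrapper/maven-wrapper.properties`; the sketch below shows the kind of entry that was deliberately broken for this test. The checksum value is a placeholder, not a real hash.

   ```
   # .mvn/wrapper/maven-wrapper.properties (sketch with placeholder values)
   wrapperUrl=https://repo.maven.apache.org/maven2/org/apache/maven/wrapper/maven-wrapper/3.2.0/maven-wrapper-3.2.0.jar
   # Tampering with this value makes ./mvnw fail fast, as in the log above.
   wrapperSha256Sum=0000000000000000000000000000000000000000000000000000000000000000
   ```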
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-35686) Flink-connector-jdbc v3.2.0 support flink 1.17.x

2024-06-24 Thread ZhengJunZhou (Jira)
ZhengJunZhou created FLINK-35686:


 Summary: Flink-connector-jdbc v3.2.0 support flink 1.17.x
 Key: FLINK-35686
 URL: https://issues.apache.org/jira/browse/FLINK-35686
 Project: Flink
  Issue Type: Improvement
Reporter: ZhengJunZhou


Can flink-connector-jdbc v3.2.0 support Flink 1.17.x?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35685) Some metrics in the MetricStore are duplicated when increasing or decreasing task parallelism

2024-06-24 Thread elon_X (Jira)
elon_X created FLINK-35685:
--

 Summary: Some metrics in the MetricStore are duplicated when 
increasing or decreasing task parallelism
 Key: FLINK-35685
 URL: https://issues.apache.org/jira/browse/FLINK-35685
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.19.0
Reporter: elon_X
 Attachments: image-2024-06-24-18-01-40-869.png

When task parallelism is increased or decreased, some metrics in the MetricStore are duplicated. This has two consequences:

1. It can cause confusion for users.
2. As parallelism is adjusted repeatedly, the data in the MetricStore keeps 
growing and occupies more memory.

!image-2024-06-24-18-01-40-869.png!
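A self-contained illustration of the suspected mechanism (this is not Flink's MetricStore code): if metrics are keyed by subtask index and never pruned on rescale, scaling down leaves stale entries behind.

{code:java}
// Illustration only, not Flink's MetricStore: entries keyed by subtask index
// are never pruned, so scaling down from 4 to 2 leaves indices 2 and 3 behind
// as stale entries that also keep consuming memory.
import java.util.HashMap;
import java.util.Map;

public class StaleSubtaskMetricsDemo {
    public static void main(String[] args) {
        Map<Integer, Long> numRecordsInPerSubtask = new HashMap<>();
        for (int i = 0; i < 4; i++) {   // job initially runs at parallelism 4
            numRecordsInPerSubtask.put(i, 100L);
        }
        for (int i = 0; i < 2; i++) {   // after rescaling down to parallelism 2
            numRecordsInPerSubtask.put(i, 200L);
        }
        // Still reports 4 subtasks although only 2 are running.
        System.out.println(numRecordsInPerSubtask.size()); // prints 4
    }
}
{code}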



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35674) MySQL connector cause blocking when searching for binlog file‘s timestamp

2024-06-24 Thread Thorne (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thorne updated FLINK-35674:
---
Description: 
When multiple MySQL connector sources are started in timestamp startup mode at the 
same time, the search for the binlog file's timestamp can block the task, which may 
leave the source unable to obtain any data.

1. I have four tables (products, orders, orders_copy, shipments) to capture in one 
task. For these four tables I generated a very large number of binlog entries, on 
the order of 10 million.

2. I started the task in timestamp mode, and the products table could not get any 
records.

!FBA32597-8783-4678-B391-E450148C1B30.png|width=550,height=264!

3. I started it again in timestamp mode, but this time the orders_copy table could 
not get any records.

!BF180441-9C61-40eb-B07C-A11F8BCEC2D0.png|width=557,height=230!

4. I debugged the code and found the following problem:
{code:java}
// Class: org.apache.flink.cdc.connectors.mysql.debezium.DebeziumUtils

private static String searchBinlogName(
        BinaryLogClient client, long targetMs, List<String> binlogFiles)
        throws IOException, InterruptedException {
    int startIdx = 0;
    int endIdx = binlogFiles.size() - 1;

    while (startIdx <= endIdx) {
        int mid = startIdx + (endIdx - startIdx) / 2;
        long midTs = getBinlogTimestamp(client, binlogFiles.get(mid));
        if (midTs < targetMs) {
            startIdx = mid + 1;
        } else if (targetMs < midTs) {
            endIdx = mid - 1;
        } else {
            return binlogFiles.get(mid);
        }
    }

    return endIdx < 0 ? binlogFiles.get(0) : binlogFiles.get(endIdx);
}

private static long getBinlogTimestamp(BinaryLogClient client, String binlogFile)
        throws IOException, InterruptedException {

    ArrayBlockingQueue<Long> binlogTimestamps = new ArrayBlockingQueue<>(1);
    BinaryLogClient.EventListener eventListener =
            event -> {
                EventData data = event.getData();
                if (data instanceof RotateEventData) {
                    // We skip RotateEventData because it does not contain the
                    // timestamp we are interested in.
                    return;
                }

                EventHeaderV4 header = event.getHeader();
                long timestamp = header.getTimestamp();
                if (timestamp > 0) {
                    binlogTimestamps.offer(timestamp);
                    try {
                        client.disconnect();
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                }
            };

    try {
        client.registerEventListener(eventListener);
        client.setBinlogFilename(binlogFile);
        client.setBinlogPosition(0);

        LOG.info("begin parse binlog: {}", binlogFile);
        client.connect();
    } finally {
        client.unregisterEventListener(eventListener);
    }
    return binlogTimestamps.take();
}{code}
5. The call binlogTimestamps.take() blocks until the queue has an element.

6. The binlogTimestamps queue stays empty, so the call blocks forever and the source 
never gets any data.
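One possible mitigation, sketched below as an illustration (not the shipped fix): replace the unbounded take() with a bounded wait, so a binlog file that never yields a usable event cannot block the task forever.

{code:java}
// Illustrative mitigation, not the actual fix. Requires
// java.util.concurrent.ArrayBlockingQueue and java.util.concurrent.TimeUnit.
private static long awaitFirstTimestamp(
        ArrayBlockingQueue<Long> binlogTimestamps, String binlogFile)
        throws InterruptedException {
    Long timestamp = binlogTimestamps.poll(30, TimeUnit.SECONDS); // bounded wait
    if (timestamp == null) {
        throw new IllegalStateException(
                "No event timestamp could be read from binlog file " + binlogFile);
    }
    return timestamp;
}
{code}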


[jira] [Updated] (FLINK-35674) MySQL connector cause blocking when searching for binlog file‘s timestamp

2024-06-24 Thread Thorne (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thorne updated FLINK-35674:
---
Summary: MySQL connector cause blocking when searching for binlog file‘s 
timestamp  (was: MySQL connector cause blocking when searching for binlog 
timestamps)

> MySQL connector cause blocking when searching for binlog file‘s timestamp
> -
>
> Key: FLINK-35674
> URL: https://issues.apache.org/jira/browse/FLINK-35674
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.1
> Environment: flink-cdc-3.1.x
>Reporter: Thorne
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
> Attachments: A7AE0D63-365D-4572-B63D-96DF5F096BF9.png, 
> BF180441-9C61-40eb-B07C-A11F8BCEC2D0.png, 
> FBA32597-8783-4678-B391-E450148C1B30.png
>
>
> When multiple MySQL connector sources are started in timestamp startup mode at 
> the same time, the search for the binlog file's timestamp can block the task, 
> which may leave the source unable to obtain any data.
>  
> 1. I have four tables (products, orders, orders_copy, shipments) to capture in 
> one task. For these four tables I generated a very large number of binlog 
> entries, on the order of 10 million.
> 2. I started the task in timestamp mode, and the products table could not get 
> any records.
> !FBA32597-8783-4678-B391-E450148C1B30.png|width=550,height=264!
> 3. I started it again in timestamp mode, but this time the orders_copy table 
> could not get any records.
> !BF180441-9C61-40eb-B07C-A11F8BCEC2D0.png|width=557,height=230!
> 4. I debugged the code and found the following problem:
> {code:java}
> // Class: org.apache.flink.cdc.connectors.mysql.debezium.DebeziumUtils
> private static String searchBinlogName(
>         BinaryLogClient client, long targetMs, List<String> binlogFiles)
>         throws IOException, InterruptedException {
>     int startIdx = 0;
>     int endIdx = binlogFiles.size() - 1;
>     while (startIdx <= endIdx) {
>         int mid = startIdx + (endIdx - startIdx) / 2;
>         long midTs = getBinlogTimestamp(client, binlogFiles.get(mid));
>         if (midTs < targetMs) {
>             startIdx = mid + 1;
>         } else if (targetMs < midTs) {
>             endIdx = mid - 1;
>         } else {
>             return binlogFiles.get(mid);
>         }
>     }
>     return endIdx < 0 ? binlogFiles.get(0) : binlogFiles.get(endIdx);
> }
> private static long getBinlogTimestamp(BinaryLogClient client, String binlogFile)
>         throws IOException, InterruptedException {
>     ArrayBlockingQueue<Long> binlogTimestamps = new ArrayBlockingQueue<>(1);
>     BinaryLogClient.EventListener eventListener =
>             event -> {
>                 EventData data = event.getData();
>                 if (data instanceof RotateEventData) {
>                     // We skip RotateEventData because it does not contain
>                     // the timestamp we are interested in.
>                     return;
>                 }
>                 EventHeaderV4 header = event.getHeader();
>                 long timestamp = header.getTimestamp();
>                 if (timestamp > 0) {
>                     binlogTimestamps.offer(timestamp);
>                     try {
>                         client.disconnect();
>                     } catch (IOException e) {
>                         throw new RuntimeException(e);
>                     }
>                 }
>             };
>     try {
>         client.registerEventListener(eventListener);
>         client.setBinlogFilename(binlogFile);
>         client.setBinlogPosition(0);
>         LOG.info("begin parse binlog: {}", binlogFile);
>         client.connect();
>     } finally {
>         client.unregisterEventListener(eventListener);
>     }
>     return binlogTimestamps.take();
> }{code}
> 5. The call binlogTimestamps.take() blocks until the queue has an element.
> 6. The binlogTimestamps queue stays empty, so the call blocks forever and the 
> source never gets any data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35674) MySQL connector cause blocking when searching for binlog timestamps

2024-06-24 Thread Thorne (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thorne updated FLINK-35674:
---
Description: 
When multiple MySQL connector sources are started in timestamp startup mode at the 
same time, the search for the binlog file's timestamp can block the task, which may 
leave the source unable to obtain any data.

1. I have four tables (products, orders, orders_copy, shipments) to capture in one 
task. For these four tables I generated a very large number of binlog entries, on 
the order of 10 million.

2. I started the task in timestamp mode, and the products table could not get any 
records.

!FBA32597-8783-4678-B391-E450148C1B30.png|width=550,height=264!

3. I started it again in timestamp mode, but this time the orders_copy table could 
not get any records.

!BF180441-9C61-40eb-B07C-A11F8BCEC2D0.png|width=557,height=230!

4. I debugged the code and found the following problem:
{code:java}
// Class: org.apache.flink.cdc.connectors.mysql.debezium.DebeziumUtils

private static String searchBinlogName(
        BinaryLogClient client, long targetMs, List<String> binlogFiles)
        throws IOException, InterruptedException {
    int startIdx = 0;
    int endIdx = binlogFiles.size() - 1;

    while (startIdx <= endIdx) {
        int mid = startIdx + (endIdx - startIdx) / 2;
        long midTs = getBinlogTimestamp(client, binlogFiles.get(mid));
        if (midTs < targetMs) {
            startIdx = mid + 1;
        } else if (targetMs < midTs) {
            endIdx = mid - 1;
        } else {
            return binlogFiles.get(mid);
        }
    }

    return endIdx < 0 ? binlogFiles.get(0) : binlogFiles.get(endIdx);
}

private static long getBinlogTimestamp(BinaryLogClient client, String binlogFile)
        throws IOException, InterruptedException {

    ArrayBlockingQueue<Long> binlogTimestamps = new ArrayBlockingQueue<>(1);
    BinaryLogClient.EventListener eventListener =
            event -> {
                EventData data = event.getData();
                if (data instanceof RotateEventData) {
                    // We skip RotateEventData because it does not contain the
                    // timestamp we are interested in.
                    return;
                }

                EventHeaderV4 header = event.getHeader();
                long timestamp = header.getTimestamp();
                if (timestamp > 0) {
                    binlogTimestamps.offer(timestamp);
                    try {
                        client.disconnect();
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                }
            };

    try {
        client.registerEventListener(eventListener);
        client.setBinlogFilename(binlogFile);
        client.setBinlogPosition(0);

        LOG.info("begin parse binlog: {}", binlogFile);
        client.connect();
    } finally {
        client.unregisterEventListener(eventListener);
    }
    return binlogTimestamps.take();
}{code}
5. The call binlogTimestamps.take() blocks until the queue has an element.

6. The binlogTimestamps queue stays empty, so the call blocks forever and the source 
never gets any data.


[jira] [Updated] (FLINK-35662) Use maven batch mode in k8s-operator CI

2024-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35662:
---
Labels: pull-request-available  (was: )

> Use maven batch mode in k8s-operator CI
> ---
>
> Key: FLINK-35662
> URL: https://issues.apache.org/jira/browse/FLINK-35662
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.10.0
>
>
> Currently, the GitHub workflows do not use batch mode in the k8s-operator 
> repo, so there are a lot of lines in the log like this:
> {code}
> Progress (1): 4.1/14 kB
> Progress (1): 8.2/14 kB
> Progress (1): 12/14 kB 
> Progress (1): 14 kB
> {code}
> To produce logs that are easier to navigate, all {{mvn}} calls should 
> use the batch-mode option {{-B}}.
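For illustration, a batch-mode invocation looks like this (`-B`/`--batch-mode` is a standard Maven flag; the goals here are just an example):

{code}
./mvnw -B clean install -DskipTests
{code}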



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35662) Use maven batch mode in k8s-operator CI

2024-06-24 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan resolved FLINK-35662.
-
  Assignee: Ferenc Csaky
Resolution: Fixed

> Use maven batch mode in k8s-operator CI
> ---
>
> Key: FLINK-35662
> URL: https://issues.apache.org/jira/browse/FLINK-35662
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Ferenc Csaky
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.10.0
>
>
> Currently, the GitHub workflows do not use batch mode in the k8s-operator 
> repo, so there are a lot of lines in the log like this:
> {code}
> Progress (1): 4.1/14 kB
> Progress (1): 8.2/14 kB
> Progress (1): 12/14 kB 
> Progress (1): 14 kB
> {code}
> To produce logs that are easier to navigate, all {{mvn}} calls should 
> use the batch-mode option {{-B}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35662) Use maven batch mode in k8s-operator CI

2024-06-24 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859622#comment-17859622
 ] 

Rui Fan commented on FLINK-35662:
-

Merged to main (1.10.0) via 2b78ea315cd6ea261cb92ffab80e3d62481c96a0

> Use maven batch mode in k8s-operator CI
> ---
>
> Key: FLINK-35662
> URL: https://issues.apache.org/jira/browse/FLINK-35662
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.10.0
>
>
> Currently, the GitHub workflows do not use batch mode in the k8s-operator 
> repo, so there are a lot of lines in the log like this:
> {code}
> Progress (1): 4.1/14 kB
> Progress (1): 8.2/14 kB
> Progress (1): 12/14 kB 
> Progress (1): 14 kB
> {code}
> To produce logs that are easier to navigate, all {{mvn}} calls should 
> use the batch-mode option {{-B}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35662] Use maven batch mode in CI workflows [flink-kubernetes-operator]

2024-06-24 Thread via GitHub


1996fanrui merged PR #844:
URL: https://github.com/apache/flink-kubernetes-operator/pull/844


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


