[jira] [Assigned] (FLINK-33926) Can't start a job with a jar in the system classpath in native k8s mode
[ https://issues.apache.org/jira/browse/FLINK-33926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-33926: --- Assignee: Gantigmaa Selenge > Can't start a job with a jar in the system classpath in native k8s mode > --- > > Key: FLINK-33926 > URL: https://issues.apache.org/jira/browse/FLINK-33926 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.0 >Reporter: Trystan >Assignee: Gantigmaa Selenge >Priority: Major > > It appears that the combination of running operator-controlled jobs in > native k8s + application mode + using a job jar in the classpath is invalid. > Avoiding dynamic classloading (as specified in the > [docs|https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code]) > is beneficial for some jobs. This affects at least Flink 1.16.1 and > Kubernetes Operator 1.6.0. > > FLINK-29288 seems to have addressed this for standalone mode. If I am > misunderstanding how to correctly build jars for this native k8s scenario, > apologies for the noise and any pointers would be appreciated! > > Perhaps related, the [spec > documentation|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/reference/#jobspec] > declares *jarURI* optional but isn't clear about the conditions under which > that applies. > * Putting the jar in the system classpath and pointing *jarURI* to that jar > leads to linkage errors. > * Not including *jarURI* leads to NullPointerExceptions in the operator: > {code:java} > {"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"java.lang.NullPointerException","stackTrace":"org.apache.flink.kubernetes.operator.exception.ReconciliationException: > java.lang.NullPointerException\n\tat > org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:148)\n\tat > > org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:56)\n\tat > > io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:138)\n\tat > > io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:96)\n\tat > > org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)\n\tat > > io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:95)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:139)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:119)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:89)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)\n\tat > > io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:414)\n\tat > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown > Source)\n\tat > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown > Source)\n\tat java.base/java.lang.Thread.run(Unknown Source)\nCaused by: > java.lang.NullPointerException\n\tat > 
org.apache.flink.kubernetes.utils.KubernetesUtils.checkJarFileForApplicationMode(KubernetesUtils.java:407)\n\tat > > org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployApplicationCluster(KubernetesClusterDescriptor.java:207)\n\tat > > org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)\n\tat > > org.apache.flink.kubernetes.operator.service.NativeFlinkService.deployApplicationCluster(Native","additionalMetadata":{},"throwableList":[{"type":"java.lang.NullPointerException","additionalMetadata":{}}]} > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874102#comment-17874102 ] Xingcan Cui edited comment on FLINK-36004 at 8/16/24 4:30 PM: -- Fixed in master: b460d3d6fe00a18342783113635d1d80774fefe6 release-1.20: 9601e06bab1c4541dd5f7043c36031ac971dd705 release-1.19: 204db68ac418716e8af6fe371c606c129ddb4daa was (Author: xccui): Fixed in master: b460d3d6fe00a18342783113635d1d80774fefe6 > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-36004. - Resolution: Fixed > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874102#comment-17874102 ] Xingcan Cui commented on FLINK-36004: - Fixed in master: b460d3d6fe00a18342783113635d1d80774fefe6 > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-36004: --- Assignee: xuyang > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-35935: --- Assignee: xuyang (was: rocky-xu) > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Major > Labels: pull-request-available > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-35935: --- Assignee: rocky-xu (was: Xingcan Cui) > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: rocky-xu >Priority: Major > Labels: pull-request-available > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-35935: --- Assignee: Xingcan Cui > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872084#comment-17872084 ] Xingcan Cui commented on FLINK-36004: - Please see the screenshot for an explanation. The data types for {{request_feature_stage}} and {{execution_feature_stage}} are the same. It could be related to the flattened (unqualified) field access feature. !screenshot-1.png! > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-36004: Attachment: screenshot-1.png > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872058#comment-17872058 ] Xingcan Cui commented on FLINK-36004: - I narrowed down the scope a bit. In the table schema, there are two fields with the same *_nested_* type but different names, say, {{f1}} and {{f2}}. When we only select {{f1.foo.bar}}, everything works fine. But when we select {{f1.foo.bar}} and {{f2.foo.bar}}, both of them will return the value for {{f2.foo.bar}}. > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
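A minimal SQL sketch of the behavior described in the comment above (table and field names are hypothetical; it assumes a catalog, e.g. a Paimon catalog, where the table can be created without connector options):
{code:sql}
-- Hypothetical schema: f1 and f2 share the same nested type.
CREATE TABLE t (
  `f1` ROW<`foo` ROW<`bar` BIGINT>>,
  `f2` ROW<`foo` ROW<`bar` BIGINT>>
);

-- Works as expected: only one nested field is selected.
SELECT f1.foo.bar FROM t;

-- Reported bug: both result columns carry the value of f2.foo.bar.
SELECT f1.foo.bar, f2.foo.bar FROM t;
{code}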
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871864#comment-17871864 ] Xingcan Cui commented on FLINK-36004: - Flink version: 1.18.1 and 1.19.1 Paimon version: 0.8.2 > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
Xingcan Cui created FLINK-36004: --- Summary: Flink SQL returns wrong results for Paimon tables with complex schemas Key: FLINK-36004 URL: https://issues.apache.org/jira/browse/FLINK-36004 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.19.1, 1.18.1 Reporter: Xingcan Cui We have a Paimon table with some nested fields such as the following one. {code:java} `f1` ROW < `f2` ROW < `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < FLOAT > > > NOT NULL >, `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > NOT NULL >, `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < BIGINT > > > NOT NULL >, `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` STRING, `f104` BIGINT > NOT NULL > >, `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > {code} When a select query includes some nested columns, the results will be wrong. For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} will return wrong values for {{r}}. The query execution won't throw any exception but fails silently. I'm not sure if this is a Paimon-specific issue, but I also tested running the same query with Spark and StarRocks, and both of them can produce correct results. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871856#comment-17871856 ] Xingcan Cui edited comment on FLINK-35935 at 8/8/24 5:39 AM: - hi [~xuyangzhong], it looks like this doesn't work for the filesystem connector. For instance {code:java} CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH ( 'connector' = 'filesystem','path' = 'path','sink.parallelism' = '1', 'format' = 'json') AS ( SELECT * FROM table WHERE dt = '2024-08-05' LIMIT 10) {code} `FETCH NEXT 10 ROWS ONLY` was added twice to the inner query. {code:java} Caused by: java.lang.AssertionError: not a query: SELECT * FROM `table` WHERE `table`.`dt` = '2024-08-05' FETCH NEXT 10 ROWS ONLY FETCH NEXT 10 ROWS ONLY {code} was (Author: xccui): hi [~xuyangzhong], it looks like this doesn't for the filesystem connector. For instance {code:java} CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH ( 'connector' = 'filesystem','path' = 'path','sink.parallelism' = '1', 'format' = 'json') AS ( SELECT * FROM table WHERE dt = '2024-08-05' LIMIT 10) {code} > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
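For reference, the CTE workaround from the issue description would look like this when applied to the filesystem example above (a sketch only; the source table is renamed to {{src}} since {{table}} is a reserved word, and the options are copied from the comment):
{code:sql}
-- Wrap the LIMIT query in a CTE so that CTAS sees a plain SELECT.
CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH (
  'connector' = 'filesystem', 'path' = 'path',
  'sink.parallelism' = '1', 'format' = 'json'
) AS (
  WITH R AS (SELECT * FROM src WHERE dt = '2024-08-05' LIMIT 10)
  SELECT * FROM R
);
{code}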
[jira] [Commented] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871856#comment-17871856 ] Xingcan Cui commented on FLINK-35935: - hi [~xuyangzhong], it looks like this doesn't work for the filesystem connector. For instance {code:java} CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH ( 'connector' = 'filesystem','path' = 'path','sink.parallelism' = '1', 'format' = 'json') AS ( SELECT * FROM table WHERE dt = '2024-08-05' LIMIT 10) {code} > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
Xingcan Cui created FLINK-35935: --- Summary: CREATE TABLE AS doesn't work with LIMIT Key: FLINK-35935 URL: https://issues.apache.org/jira/browse/FLINK-35935 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui {code:java} CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} The above statement throws a "Caused by: java.lang.AssertionError: not a query:" exception. A workaround is to wrap the query with a CTE. {code:java} CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35485) JobMaster failed with "the job xx has not been finished"
[ https://issues.apache.org/jira/browse/FLINK-35485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854891#comment-17854891 ] Xingcan Cui commented on FLINK-35485: - Hi [~mapohl], I hit the exception again today and it caused the jobmanager to restart. Still no WARN+ logs. I checked the corresponding job and it actually finished successfully. > JobMaster failed with "the job xx has not been finished" > > > Key: FLINK-35485 > URL: https://issues.apache.org/jira/browse/FLINK-35485 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We ran a session cluster on K8s and used Flink SQL gateway to submit queries. > Hit the following rare exception once which caused the job manager to restart. > {code:java} > org.apache.flink.util.FlinkException: JobMaster for job > 50d681ae1e8170f77b4341dda6aba9bc failed. > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454) > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776) > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698) > at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown > Source) > at > java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown > Source) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) > at > org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) > at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) > at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) > at > org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) > at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) > at > org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) > at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) > at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) > at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) > at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) > at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) > at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown > Source) > at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) > at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) > 
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) > Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The > job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished. > at > org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407) > at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown > Source) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405) > at > org.apache.flink.runtime.jobm
[jira] [Commented] (FLINK-35485) JobMaster failed with "the job xx has not been finished"
[ https://issues.apache.org/jira/browse/FLINK-35485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852267#comment-17852267 ] Xingcan Cui commented on FLINK-35485: - Hi [~mapohl], unfortunately, I failed to fetch more logs from the cluster. If it happens again, I'll try to get some logs and post them here. Thanks! > JobMaster failed with "the job xx has not been finished" > > > Key: FLINK-35485 > URL: https://issues.apache.org/jira/browse/FLINK-35485 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We ran a session cluster on K8s and used Flink SQL gateway to submit queries. > Hit the following rare exception once which caused the job manager to restart. > {code:java} > org.apache.flink.util.FlinkException: JobMaster for job > 50d681ae1e8170f77b4341dda6aba9bc failed. > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454) > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776) > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698) > at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown > Source) > at > java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown > Source) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) > at > org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) > at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) > at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) > at > org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) > at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) > at > org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) > at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) > at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) > at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) > at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) > at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) > at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown > Source) > at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) > at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) > at 
java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) > Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The > job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished. > at > org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407) > at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown > Source) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeader
[jira] [Created] (FLINK-35486) Potential sql expression generation issues on SQL gateway
Xingcan Cui created FLINK-35486: --- Summary: Potential sql expression generation issues on SQL gateway Key: FLINK-35486 URL: https://issues.apache.org/jira/browse/FLINK-35486 Project: Flink Issue Type: Bug Components: Table SQL / Gateway, Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui We hit the following exceptions a few times when submitting queries to a session cluster with the Flink SQL gateway. When the same queries were submitted again, everything was good. There might be a concurrency problem for the expression generator. {code:java} "process.thread.name":"sql-gateway-operation-pool-thread-111","log.logger":"org.apache.flink.table.gateway.service.operation.OperationManager","error.type":"org.apache.flink.table.planner.codegen.CodeGenException","error.message":"Mismatch of expected output data type 'ARRAY NOT NULL>' and function's output type 'ARRAY NOT NULL>'.","error.stack_trace":"org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of expected output data type 'ARRAY NOT NULL>' and function's output type 'ARRAY NOT NULL>'. at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.verifyOutputType(BridgingFunctionGenUtil.scala:369) at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.verifyFunctionAwareOutputType(BridgingFunctionGenUtil.scala:359) at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.generateFunctionAwareCallWithDataType(BridgingFunctionGenUtil.scala:107) at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.generateFunctionAwareCall(BridgingFunctionGenUtil.scala:84) at org.apache.flink.table.planner.codegen.calls.BridgingSqlFunctionCallGen.generate(BridgingSqlFunctionCallGen.scala:79) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateCallExpression(ExprCodeGenerator.scala:820) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:481) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.$anonfun$visitCall$1(ExprCodeGenerator.scala:478) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:469) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.$anonfun$visitCall$1(ExprCodeGenerator.scala:478) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at 
scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:469) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateExpression(ExprCodeGenerator.scala:134) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.$anonfun$generateProcessCode$4(CalcCodeGenerator.scala:140) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.produceProjectionCode$1(CalcCodeGenerator.scala:140) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$
[jira] [Created] (FLINK-35485) JobMaster failed with "the job xx has not been finished"
Xingcan Cui created FLINK-35485: --- Summary: JobMaster failed with "the job xx has not been finished" Key: FLINK-35485 URL: https://issues.apache.org/jira/browse/FLINK-35485 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.18.1 Reporter: Xingcan Cui We ran a session cluster on K8s and used Flink SQL gateway to submit queries. Hit the following rare exception once which caused the job manager to restart. {code:java} org.apache.flink.util.FlinkException: JobMaster for job 50d681ae1e8170f77b4341dda6aba9bc failed. at org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454) at org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776) at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698) at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source) at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown Source) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) at org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) at org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) at org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) at org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished. 
at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407) at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown Source) at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown Source) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.runIfStateRunning(JobMasterServiceLeadershipRunner.java:463) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.revokeLeadership(JobMasterServiceLeadershipRunner.java:397) at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.notifyLeaderContenderOfLeadershipLoss(DefaultLeaderElectionService.java:484) at java.base/java.util.HashMap.forEach(Unknown Source) at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.onRevokeLeadershipInternal(DefaultLeaderElectionService.java:452) at o
[jira] [Commented] (FLINK-28867) Parquet reader support nested type in array/map type
[ https://issues.apache.org/jira/browse/FLINK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845539#comment-17845539 ] Xingcan Cui commented on FLINK-28867: - Hey [~jark], any plan to improve this in the near future? I feel that this is a blocker for Flink OLAP despite the data lake projects having their own readers/writers. Sometimes users would like to use Flink to process some raw Parquet files. > Parquet reader support nested type in array/map type > > > Key: FLINK-28867 > URL: https://issues.apache.org/jira/browse/FLINK-28867 > Project: Flink > Issue Type: Sub-task > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.16.0 >Reporter: dalongliu >Priority: Major > Attachments: ReadParquetArray1.java, part-00121.parquet > -- This message was sent by Atlassian Jira (v8.20.10#820010)
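As an illustration of the kind of job this blocks, a sketch of reading raw Parquet files with a nested array type through the filesystem connector (table name, path, and schema are hypothetical):
{code:sql}
-- The ARRAY<ROW<...>> column is the nested type the reader cannot handle yet.
CREATE TABLE raw_events (
  `id` BIGINT,
  `tags` ARRAY<ROW<`key` STRING, `value` STRING>>
) WITH (
  'connector' = 'filesystem',
  'path' = 's3://bucket/events/',  -- hypothetical location
  'format' = 'parquet'
);

SELECT id, CARDINALITY(tags) FROM raw_events;
{code}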
[jira] [Commented] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838379#comment-17838379 ] Xingcan Cui commented on FLINK-34583: - Hi [~xuyangzhong], thanks for looking into this. I hit the issue when using the Paimon table source. The execution plan looks good. However, the options don't work. It could be a runtime issue or a Paimon source implementation bug. I can't remember clearly if Flink generates multiple table sources and then merges them at runtime. If it does, the options may not be merged properly. !image-2024-04-17-16-48-49-073.png! > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image-2024-04-17-16-35-06-153.png, > image-2024-04-17-16-48-49-073.png > > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
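For reference, the failing shape from the description in a form whose plan can be inspected with EXPLAIN (placeholder names taken from the description; the join column {{id}} is hypothetical):
{code:sql}
-- If the hint propagates, the OPTIONS should appear on every TableSourceScan
-- that the shared CTE T2 expands to in the printed plan.
EXPLAIN
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */),
     T3 AS (SELECT * FROM T2),
     T4 AS (SELECT * FROM T2),
     T5 AS (SELECT * FROM T3 JOIN T4 ON T3.id = T4.id)
SELECT * FROM T5;
{code}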
[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34583: Attachment: image-2024-04-17-16-48-49-073.png > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image-2024-04-17-16-35-06-153.png, > image-2024-04-17-16-48-49-073.png > > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34583: Attachment: image-2024-04-17-16-35-06-153.png > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image-2024-04-17-16-35-06-153.png > > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Attachment: image.png > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image.png > > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. > {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Description: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image.png! was: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image.png > > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array<Row<`key`, `value`>> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids, > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. 
> {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Attachment: (was: image_720.png) > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array<Row<`key`, `value`>> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids, > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. > {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Description: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! was: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array<Row<`key`, `value`>> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids, > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. 
> {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
Xingcan Cui created FLINK-34926: --- Summary: Adaptive auto parallelism doesn't work for a query Key: FLINK-34926 URL: https://issues.apache.org/jira/browse/FLINK-34926 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui Attachments: image_720.png We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
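Editor's note: a sketch of the workaround described in the report, disabling adaptive batch parallelism and pinning parallelism.default through the EnvironmentSettings configuration; the value 8 is a placeholder.
{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class AdaptiveParallelismWorkaround {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Disable adaptive batch parallelism and pin a fixed default,
        // mirroring the workaround from the report (8 is a placeholder).
        conf.setString("execution.batch.adaptive.auto-parallelism.enabled", "false");
        conf.setString("parallelism.default", "8");
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance()
                        .inBatchMode()
                        .withConfiguration(conf)
                        .build());
        // ... register features_table and run the query from the description ...
    }
}
{code}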
[jira] [Assigned] (FLINK-34723) Parquet writer should restrict map keys to be not null
[ https://issues.apache.org/jira/browse/FLINK-34723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-34723: --- Assignee: (was: Xingcan Cui) > Parquet writer should restrict map keys to be not null > -- > > Key: FLINK-34723 > URL: https://issues.apache.org/jira/browse/FLINK-34723 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.19.0, 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Labels: pull-request-available > > We got the following exception when reading a parquet file (with map types) > generated by Flink. > {code:java} > Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34723) Parquet writer should restrict map keys to be not null
[ https://issues.apache.org/jira/browse/FLINK-34723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-34723: --- Assignee: Xingcan Cui > Parquet writer should restrict map keys to be not null > -- > > Key: FLINK-34723 > URL: https://issues.apache.org/jira/browse/FLINK-34723 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.19.0, 1.18.1 >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > We got the following exception when reading a parquet file (with map types) > generated by Flink. > {code:java} > Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34723) Parquet writer should restrict map keys to be not null
[ https://issues.apache.org/jira/browse/FLINK-34723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34723: Issue Type: Bug (was: New Feature) > Parquet writer should restrict map keys to be not null > -- > > Key: FLINK-34723 > URL: https://issues.apache.org/jira/browse/FLINK-34723 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.19.0, 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We got the following exception when reading a parquet file (with map types) > generated by Flink. > {code:java} > Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34723) Parquet writer should restrict map keys to be not null
Xingcan Cui created FLINK-34723: --- Summary: Parquet writer should restrict map keys to be not null Key: FLINK-34723 URL: https://issues.apache.org/jira/browse/FLINK-34723 Project: Flink Issue Type: New Feature Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.18.1, 1.19.0 Reporter: Xingcan Cui We got the following exception when reading a parquet file (with map types) generated by Flink. {code:java} Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
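Editor's note: for context, the quoted error comes from the Parquet format spec, which requires map keys to be written as required (non-null) fields. Below is a minimal sketch of the corresponding declaration on the Flink side, with a hypothetical features column: the key type is marked NOT NULL while the value type may stay nullable.
{code:java}
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.DataType;

public class ParquetMapKeyNullability {
    public static void main(String[] args) {
        // Parquet requires map keys to be `required`, i.e. non-null, so a
        // Flink schema destined for the Parquet writer should declare the
        // key type NOT NULL (the value type may remain nullable).
        DataType features = DataTypes.MAP(DataTypes.STRING().notNull(), DataTypes.FLOAT());
        System.out.println(features); // e.g. MAP<STRING NOT NULL, FLOAT>
    }
}
{code}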
[jira] [Assigned] (FLINK-34633) Support unnesting array constants
[ https://issues.apache.org/jira/browse/FLINK-34633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-34633: --- Assignee: Jeyhun Karimov > Support unnesting array constants > - > > Key: FLINK-34633 > URL: https://issues.apache.org/jira/browse/FLINK-34633 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: Jeyhun Karimov >Priority: Minor > Labels: pull-request-available > > It seems that the current planner doesn't support using UNNEST on array > constants.(x) > {code:java} > SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} > > The following query can't be compiled.(x) > {code:java} > SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} > > The rewritten version works. (/) > {code:java} > SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN > UNNEST(A){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34633) Support unnesting array constants
[ https://issues.apache.org/jira/browse/FLINK-34633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34633: Description: It seems that the current planner doesn't support using UNNEST on array constants.(x) {code:java} SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} The following query can't be compiled.(x) {code:java} SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} The rewritten version works. (/) {code:java} SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN UNNEST(A){code} was: It seems that the current planner doesn't support using UNNEST on array constants.(x) {code:java} SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} The following query can be compiled.(x) {code:java} SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} The rewritten version works. (/) {code:java} SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN UNNEST(A){code} > Support unnesting array constants > - > > Key: FLINK-34633 > URL: https://issues.apache.org/jira/browse/FLINK-34633 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Minor > > It seems that the current planner doesn't support using UNNEST on array > constants.(x) > {code:java} > SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} > > The following query can't be compiled.(x) > {code:java} > SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} > > The rewritten version works. (/) > {code:java} > SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN > UNNEST(A){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34633) Support unnesting array constants
Xingcan Cui created FLINK-34633: --- Summary: Support unnesting array constants Key: FLINK-34633 URL: https://issues.apache.org/jira/browse/FLINK-34633 Project: Flink Issue Type: New Feature Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui It seems that the current planner doesn't support using UNNEST on array constants.(x) {code:java} SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} The following query can be compiled.(x) {code:java} SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} The rewritten version works. (/) {code:java} SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN UNNEST(A){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
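Editor's note: a self-contained sketch of the behavior reported above (the failing statement is left commented out): the direct CROSS JOIN UNNEST of an array constant is rejected, while the rewrite that first projects the array into a subquery works.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class UnnestArrayConstants {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Fails to compile at the affected version:
        // tEnv.executeSql("SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3])");
        // The rewrite from the report works: project the array first, then unnest it.
        tEnv.executeSql(
                "SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) "
                        + "CROSS JOIN UNNEST(A)")
                .print();
    }
}
{code}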
[jira] [Created] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
Xingcan Cui created FLINK-34583: --- Summary: Bug for dynamic table option hints with multiple CTEs Key: FLINK-34583 URL: https://issues.apache.org/jira/browse/FLINK-34583 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui The table options hints don't work well with multiple WITH clauses referring to the same table. Please see the following example. The following query with hints works well. {code:java} SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} The following query with multiple WITH clauses also works well. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...) T3 AS (SELECT ... FROM T2 WHERE...) SELECT * FROM T3;{code} The following query with multiple WITH clauses referring to the same original table failed to recognize the hints. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...), T4 AS (SELECT ... FROM T2 WHERE...), T5 AS (SELECT ... FROM T3 JOIN T4 ON...) SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34583: Description: The table options hints don't work well with multiple WITH clauses referring to the same table. Please see the following example. The following query with hints works well. {code:java} SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} The following query with multiple WITH clauses also works well. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...) SELECT * FROM T3;{code} The following query with multiple WITH clauses referring to the same original table failed to recognize the hints. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...), T4 AS (SELECT ... FROM T2 WHERE...), T5 AS (SELECT ... FROM T3 JOIN T4 ON...) SELECT * FROM T5;{code} was: The table options hints don't work well with multiple WITH clauses referring to the same table. Please see the following example. The following query with hints works well. {code:java} SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} The following query with multiple WITH clauses also works well. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...) T3 AS (SELECT ... FROM T2 WHERE...) SELECT * FROM T3;{code} The following query with multiple WITH clauses referring to the same original table failed to recognize the hints. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...), T4 AS (SELECT ... FROM T2 WHERE...), T5 AS (SELECT ... FROM T3 JOIN T4 ON...) SELECT * FROM T5;{code} > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map on AZP
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810661#comment-17810661 ] Xingcan Cui commented on FLINK-33184: - Just hit a similar issue in Flink 1.18.1. If [https://github.com/apache/flink/pull/23532] solved the issue, it's better to backport it. {code:java} ERROR org.apache.flink.runtime.taskmanager.Task [] - Error in the task canceler for task KeyedProcess (112/128)#1. java.lang.IllegalStateException: Leaking buffers. at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.SubpartitionDiskCacheManager.release(SubpartitionDiskCacheManager.java:113) ~[flink-dist-1.18.1.jar:1.18.1] at java.util.Spliterators$ArraySpliterator.forEachRemaining(Unknown Source) ~[?:?] at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source) ~[?:?] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskCacheManager.release(DiskCacheManager.java:128) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskTierProducerAgent.releaseResources(DiskTierProducerAgent.java:222) ~[flink-dist-1.18.1.jar:1.18.1] at java.util.ArrayList.forEach(Unknown Source) ~[?:?] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.taskmanager.Task.access$100(Task.java:139) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1677) [flink-dist-1.18.1.jar:1.18.1] at java.lang.Thread.run(Unknown Source) [?:?] 2024-01-25 03:44:21 [KeyedProcess (112/128)#1] INFO org.apache.flink.runtime.taskmanager.Task [] - KeyedProcess (112/128)#1 (7bb761e84f2d7957d3b927e49a6b28b3_e0d77c22cedd08dc719831d914bf_111_1) switched from CANCELING to CANCELED. {code} > HybridShuffleITCase fails with exception in resource cleanup of task Map on > AZP > --- > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network
[jira] [Closed] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-33547. --- Resolution: Duplicate Duplicate of https://issues.apache.org/jira/browse/FLINK-33523 > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789319#comment-17789319 ] Xingcan Cui commented on FLINK-33547: - Sure. I'll close this. > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789311#comment-17789311 ] Xingcan Cui commented on FLINK-33547: - Hi [~jeyhun], thanks for your attention! What you explained makes sense. However, sometimes it's tricky to deal with primitive array parameters. They make it harder for users to write generic UDFs, e.g., one that takes an arbitrary array and returns the first 3 elements. Also, this is a breaking change in 1.18.0. All the old UDFs that used Object[] arguments can no longer work directly with the primitive arrays that ARRAY functions now generate. Type inference and dealing with null values are challenging in Flink SQL. Users won't understand why a UDF accepting an ARRAY argument can't work for a primitive ARRAY parameter. It's also impossible for UDF developers to code a bunch of functions taking different primitive arrays. We need some changes here. > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
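Editor's note: a minimal sketch of the kind of generic UDF the comment describes (take an arbitrary array, return the first three elements). With boxed arrays such as Float[] the Object[] signature matches, but a primitive float[] produced by an ARRAY[] literal cannot be passed as Object[], which is the mismatch reported above.
{code:java}
import org.apache.flink.table.functions.ScalarFunction;

import java.util.Arrays;

// Generic "first three elements" UDF: works for any boxed array
// (Float[], Integer[], ...) because those are subtypes of Object[].
// A primitive float[]/int[] is not, so calls fail after the 1.18.0 change.
public class FirstThree extends ScalarFunction {
    public Object[] eval(Object[] values) {
        return Arrays.copyOfRange(values, 0, Math.min(3, values.length));
    }
}
{code}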
[jira] [Updated] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-33547: Summary: SQL primitive array type after upgrading to Flink 1.18.0 (was: Primitive SQL array type after upgrading to Flink 1.18.0) > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33547) Primitive SQL array type after upgrading to Flink 1.18.0
Xingcan Cui created FLINK-33547: --- Summary: Primitive SQL array type after upgrading to Flink 1.18.0 Key: FLINK-33547 URL: https://issues.apache.org/jira/browse/FLINK-33547 Project: Flink Issue Type: Technical Debt Components: Table SQL / Runtime Affects Versions: 1.18.0 Reporter: Xingcan Cui We have some Flink SQL UDFs that use object array (Object[]) arguments and take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink 1.18.0, the data created by ARRAY[] SQL function became primitive arrays (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32171) Add PostStart hook to flink k8s operator helm
[ https://issues.apache.org/jira/browse/FLINK-32171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725588#comment-17725588 ] Xingcan Cui commented on FLINK-32171: - Hi [~gyfora], I'd like to get your thoughts on this. I can work on it if you think this feature is reasonable. Thanks! > Add PostStart hook to flink k8s operator helm > - > > Key: FLINK-32171 > URL: https://issues.apache.org/jira/browse/FLINK-32171 > Project: Flink > Issue Type: New Feature > Components: Kubernetes Operator >Reporter: Xingcan Cui >Priority: Minor > Fix For: kubernetes-operator-1.6.0, kubernetes-operator-1.5.1 > > > I feel it would be convenient to add an optional PostStart hook config to the flink > k8s operator helm chart (e.g. when users need to download some Flink plugins). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32171) Add PostStart hook to flink k8s operator helm
Xingcan Cui created FLINK-32171: --- Summary: Add PostStart hook to flink k8s operator helm Key: FLINK-32171 URL: https://issues.apache.org/jira/browse/FLINK-32171 Project: Flink Issue Type: New Feature Components: Kubernetes Operator Reporter: Xingcan Cui Fix For: kubernetes-operator-1.6.0, kubernetes-operator-1.5.1 I feel it would be convenient to add an optional PostStart hook config to the flink k8s operator helm chart (e.g. when users need to download some Flink plugins). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688269#comment-17688269 ] Xingcan Cui commented on FLINK-31021: - [~libenchao] Thanks for the comments. I'm a bit busy these days. Will try to replace the static methods with non-static ones for {{flink-protobuf}} if no one works on this before I get some time. > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687728#comment-17687728 ] Xingcan Cui commented on FLINK-31021: - I'm playing with [https://github.com/apache/flink/blob/c096c03df70648b60b665a09816635b956b201cc/flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/deserialize/ProtoToRowConverter.java#L98]. The generated converter source is too large sometimes. I fixed it locally by removing the static keyword for now. FYI [~libenchao]. It's fine if the code splitter doesn't support static methods. But at least we should inform users with a proper message instead of generating incorrect code. > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-31021: Affects Version/s: 1.15.3 > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
Xingcan Cui created FLINK-31021: --- Summary: JavaCodeSplitter doesn't split static method properly Key: FLINK-31021 URL: https://issues.apache.org/jira/browse/FLINK-31021 Project: Flink Issue Type: Bug Affects Versions: 1.16.1, 1.14.4 Reporter: Xingcan Cui The exception while compiling the generated source {code:java} cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: Instance method "default void org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" cannot be invoked in static context,{code} The original method header {code:java} public static RowData decode(foo.bar.LogData message){{code} The code after split {code:java} Line 3383: public static RowData decode(foo.bar.LogData message){ decodeImpl(message); return decodeReturnValue$0; } Line 3384: Line 3385: void decodeImpl(foo.bar.LogData message) {{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
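Editor's note: for illustration, a self-contained sketch of the inconsistency, with String standing in for foo.bar.LogData and a trivial body. The splitter currently emits a static wrapper around an instance decodeImpl, which cannot compile; keeping the extracted method static, as below, would.
{code:java}
public class CodeSplitIllustration {
    private static String decodeReturnValue$0;

    // Shape of the splitter's current output: a static wrapper that calls an
    // *instance* decodeImpl, which fails with "Instance method ... cannot be
    // invoked in static context". Keeping the extracted method static, as
    // below, compiles and preserves the original semantics.
    public static String decode(String message) {
        decodeImpl(message);
        return decodeReturnValue$0;
    }

    private static void decodeImpl(String message) {
        decodeReturnValue$0 = "decoded:" + message;
    }
}
{code}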
[jira] [Commented] (FLINK-24007) Support Avro timestamp conversion with precision greater than three
[ https://issues.apache.org/jira/browse/FLINK-24007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406049#comment-17406049 ] Xingcan Cui commented on FLINK-24007: - Will close this since it's a duplicate of https://issues.apache.org/jira/browse/FLINK-23589 > Support Avro timestamp conversion with precision greater than three > --- > > Key: FLINK-24007 > URL: https://issues.apache.org/jira/browse/FLINK-24007 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.13.2 >Reporter: Xingcan Cui >Priority: Major > > {{AvroSchemaConverter.convertToSchema()}} doesn't support timestamp with > precision > 3 now. This seems to be a bug and should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24007) Support Avro timestamp conversion with precision greater than three
[ https://issues.apache.org/jira/browse/FLINK-24007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-24007. --- Resolution: Duplicate > Support Avro timestamp conversion with precision greater than three > --- > > Key: FLINK-24007 > URL: https://issues.apache.org/jira/browse/FLINK-24007 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.13.2 >Reporter: Xingcan Cui >Priority: Major > > {{AvroSchemaConverter.convertToSchema()}} doesn't support timestamp with > precision > 3 now. This seems to be a bug and should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24007) Support Avro timestamp conversion with precision greater than three
Xingcan Cui created FLINK-24007: --- Summary: Support Avro timestamp conversion with precision greater than three Key: FLINK-24007 URL: https://issues.apache.org/jira/browse/FLINK-24007 Project: Flink Issue Type: Bug Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.13.2 Reporter: Xingcan Cui {{AvroSchemaConverter.convertToSchema()}} doesn't support timestamp with precision > 3 now. This seems to be a bug and should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
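Editor's note: a minimal repro sketch, assuming the LogicalType overload of {{AvroSchemaConverter.convertToSchema()}}: precision 3 converts to Avro's timestamp-millis, while precision 6 fails at the affected version instead of mapping to timestamp-micros.
{code:java}
import org.apache.flink.formats.avro.typeutils.AvroSchemaConverter;
import org.apache.flink.table.types.logical.TimestampType;

public class AvroTimestampPrecision {
    public static void main(String[] args) {
        // TIMESTAMP(3) maps to Avro timestamp-millis and converts fine.
        System.out.println(AvroSchemaConverter.convertToSchema(new TimestampType(3)));
        // TIMESTAMP(6) would need timestamp-micros; at the affected version
        // this call fails instead of producing a schema.
        System.out.println(AvroSchemaConverter.convertToSchema(new TimestampType(6)));
    }
}
{code}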
[jira] [Commented] (FLINK-8236) Allow to set the parallelism of table queries
[ https://issues.apache.org/jira/browse/FLINK-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227070#comment-17227070 ] Xingcan Cui commented on FLINK-8236: Hi [~rex-remind], unfortunately, there's no progress from my side on this issue. I think we still can only use a global parallelism value for all the operators of a compiled SQL. Also, I'm going to unassign myself and see if other members of the community could work on this. > Allow to set the parallelism of table queries > - > > Key: FLINK-8236 > URL: https://issues.apache.org/jira/browse/FLINK-8236 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Affects Versions: 1.4.0 >Reporter: Timo Walther >Priority: Major > > Right now the parallelism of a table program is determined by the parallelism > of the stream/batch environment. E.g., by default, tumbling window operators > use the default parallelism of the environment. Simple project and select > operations have the same parallelism as the inputs they are applied on. > While we cannot change forwarding operations because this would change the > results when using retractions, it should be possible to change the > parallelism for operators after shuffling operations. > It should be possible to specify the default parallelism of a table program > in the {{TableConfig}} and/or {{QueryConfig}}. The configuration per query > has higher precedence than the configuration per table environment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
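Editor's note: a sketch of the limitation described in the comment above: the only available knob today is a single global value that every operator of the translated query inherits (the value 8 is a placeholder).
{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class GlobalParallelismOnly {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // One global value that every operator of a compiled query inherits;
        // there is no per-operator override for translated SQL today.
        env.setParallelism(8); // placeholder value
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        // ... register tables and submit queries; all operators run with parallelism 8 ...
    }
}
{code}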
[jira] [Assigned] (FLINK-8236) Allow to set the parallelism of table queries
[ https://issues.apache.org/jira/browse/FLINK-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-8236: -- Assignee: (was: Xingcan Cui) > Allow to set the parallelism of table queries > - > > Key: FLINK-8236 > URL: https://issues.apache.org/jira/browse/FLINK-8236 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Affects Versions: 1.4.0 >Reporter: Timo Walther >Priority: Major > > Right now the parallelism of a table program is determined by the parallelism > of the stream/batch environment. E.g., by default, tumbling window operators > use the default parallelism of the environment. Simple project and select > operations have the same parallelism as the inputs they are applied on. > While we cannot change forwarding operations because this would change the > results when using retractions, it should be possible to change the > parallelism for operators after shuffling operations. > It should be possible to specify the default parallelism of a table program > in the {{TableConfig}} and/or {{QueryConfig}}. The configuration per query > has higher precedence than the configuration per table environment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16048) Support read/write confluent schema registry avro data from Kafka
[ https://issues.apache.org/jira/browse/FLINK-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192590#comment-17192590 ] Xingcan Cui commented on FLINK-16048: - Hi all, thanks for your effort on this feature. I wonder if it's possible to provide the authentication info (i.e., {{value.converter.basic.auth.user.info}}) via this new schema class? > Support read/write confluent schema registry avro data from Kafka > -- > > Key: FLINK-16048 > URL: https://issues.apache.org/jira/browse/FLINK-16048 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Ecosystem >Affects Versions: 1.11.0 >Reporter: Leonard Xu >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available, usability > Fix For: 1.12.0 > > > *The background* > I found that the SQL Kafka connector cannot consume avro data that was serialized by > `KafkaAvroSerializer` and can only consume Row data with avro schema because > we use `AvroRowDeserializationSchema/AvroRowSerializationSchema` to se/de > data in `AvroRowFormatFactory`. > I think we should support this because `KafkaAvroSerializer` is very common > in Kafka. > and someone met the same question on stackoverflow[1]. > [[1]https://stackoverflow.com/questions/56452571/caused-by-org-apache-avro-avroruntimeexception-malformed-data-length-is-negat/56478259|https://stackoverflow.com/questions/56452571/caused-by-org-apache-avro-avroruntimeexception-malformed-data-length-is-negat/56478259] > *The format details* > _The factory identifier (or format id)_ > There are 2 candidates now ~ > - {{avro-sr}}: the pattern borrowed from KSQL {{JSON_SR}} format [1] > - {{avro-confluent}}: the pattern borrowed from Clickhouse {{AvroConfluent}} > [2] > Personally I would prefer {{avro-sr}} because it is more concise and confluent is a company name which I think is not that suitable for a format > name. > _The format attributes_ > || Options || required || Remark || > | schema-registry.url | true | URL to connect to schema registry service | > | schema-registry.subject | false | Subject name to write to the Schema > Registry service, required for sink | -- This message was sent by Atlassian Jira (v8.3.4#803005)
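Editor's note: for orientation, a sketch of what a DDL using the proposed format could look like, built from the candidate id avro-confluent and the option keys in the table above. The topic, addresses, and schema are placeholders, and the final option names may differ from this proposal.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class RegistryAvroTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Option keys follow the proposal's table; topic, addresses, and the
        // column schema are placeholders.
        tEnv.executeSql(
                "CREATE TABLE orders (order_id STRING, amount DOUBLE) WITH ("
                        + "'connector' = 'kafka', "
                        + "'topic' = 'orders', "
                        + "'properties.bootstrap.servers' = 'localhost:9092', "
                        + "'format' = 'avro-confluent', "
                        + "'avro-confluent.schema-registry.url' = 'http://localhost:8081')");
    }
}
{code}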
[jira] [Comment Edited] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177438#comment-17177438 ] Xingcan Cui edited comment on FLINK-7865 at 8/14/20, 2:56 AM: -- Hi [~godfreyhe], it took me some time to think about how to do the reversion. However, the codebase has changed too much and I'm afraid I don't have much bandwidth in recent days to continue working on this (just onboarded a new company). I was wondering if you could take it over. Thanks! Some inspections that may help (hopefully): In {{FlinkDecorrelateProgram.optimize()}}, {{RelDecorrelator.decorrelateQuery(root)}} will rewrite the {{LogicalCorrelate}} of a left lateral join plan with extra conditions to {{LogicalJoin}}. For left lateral join without extra conditions, it will preserve the {{LogicalCorrelate}}. was (Author: xccui): Hi [~godfreyhe], it took me some time to think about how to do the reversion. However, the codebase has changed too much and I'm afraid I don't have much bandwidth in recent days to continue working on this. I was wondering if you could take it over. Thanks! Some inspections that may help (hopefully): In {{FlinkDecorrelateProgram.optimize()}}, {{RelDecorrelator.decorrelateQuery(root)}} will rewrite the {{LogicalCorrelate}} of a left lateral join plan with extra conditions to {{LogicalJoin}}. For left lateral join without extra conditions, it will preserve the {{LogicalCorrelate}}. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-7865: -- Assignee: (was: Xingcan Cui) > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177438#comment-17177438 ] Xingcan Cui commented on FLINK-7865: Hi [~godfreyhe], it took me some time to think about how to do the reversion. However, the codebase has changed too much and I'm afraid I don't have much bandwidth in recent days to continue working on this. I was wondering if you could take it over. Thanks! Some inspections that may help (hopefully): In {{FlinkDecorrelateProgram.optimize()}}, {{RelDecorrelator.decorrelateQuery(root)}} will rewrite the {{LogicalCorrelate}} of a left lateral join plan with extra conditions to {{LogicalJoin}}. For left lateral join without extra conditions, it will preserve the {{LogicalCorrelate}}. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176325#comment-17176325 ] Xingcan Cui commented on FLINK-7865: [~liupengcheng] Thanks for your reminder. I'll remove the restrictions asap. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
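Editor's note: to make the restriction discussed above concrete, a sketch with a hypothetical table T(a, c) and a registered table function split: the lateral left outer join is accepted only with the literal TRUE predicate, while any real join condition is rejected until the Calcite fix lands.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class LateralJoinRestriction {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // T and the table function `split` are hypothetical and assumed to be
        // registered elsewhere; only the ON clause differs between the queries.
        // Accepted: the literal TRUE predicate.
        System.out.println(tEnv.explainSql(
                "SELECT a, word FROM T LEFT JOIN LATERAL TABLE(split(c)) AS t(word) ON TRUE"));
        // Rejected while the restriction is in place:
        // "SELECT a, word FROM T LEFT JOIN LATERAL TABLE(split(c)) AS t(word) ON a = word"
    }
}
{code}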
[jira] [Created] (FLINK-13849) The back-pressure monitoring tab in Web UI may cause errors
Xingcan Cui created FLINK-13849: --- Summary: The back-pressure monitoring tab in Web UI may cause errors Key: FLINK-13849 URL: https://issues.apache.org/jira/browse/FLINK-13849 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Affects Versions: 1.9.0 Reporter: Xingcan Cui Clicking the back-pressure monitoring tab for a finished job in Web UI will cause an internal server error. The exceptions are as follows. {code:java} 2019-08-26 01:23:54,845 ERROR org.apache.flink.runtime.rest.handler.job.JobVertexBackPressureHandler - Unhandled exception. org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (09e107685e0b81b443b556062debb443) at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGatewayFuture(Dispatcher.java:825) at org.apache.flink.runtime.dispatcher.Dispatcher.requestOperatorBackPressureStats(Dispatcher.java:524) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:279) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:194) at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at akka.actor.Actor$class.aroundReceive(Actor.scala:517) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at akka.actor.ActorCell.invoke(ActorCell.scala:561) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at akka.dispatch.Mailbox.run(Mailbox.scala:225) at akka.dispatch.Mailbox.exec(Mailbox.scala:235) at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-13405) Translate "Basic API Concepts" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-13405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902700#comment-16902700 ] Xingcan Cui commented on FLINK-13405: - [~WangHW], personally, I'd like to translate it to "数据汇", which corresponds to source ("数据源"). However, as [~jark] suggested, you can choose not to translate it. > Translate "Basic API Concepts" page into Chinese > > > Key: FLINK-13405 > URL: https://issues.apache.org/jira/browse/FLINK-13405 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Affects Versions: 1.10.0 >Reporter: WangHengWei >Assignee: WangHengWei >Priority: Major > Labels: documentation, pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The page url is > [https://github.com/apache/flink/blob/master/docs/dev/api_concepts.zh.md] > The markdown file is located in flink/docs/dev/api_concepts.zh.md -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-13405) Translate "Basic API Concepts" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-13405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894879#comment-16894879 ] Xingcan Cui commented on FLINK-13405: - First, "sorting" here definitely means ordering (排序), not classification (分类). Second, I personally have some doubts about the use of "i.e." here. If accessing their contents were equivalent to efficient sorting, "i.e." would be fine. But as I understand it, efficient sorting may be only one of the purposes of accessing the contents. If it is the latter, "i.e." should be replaced with "e.g.", and a rough translation would be: ……无法访问它们的内容(例如为了高效排序)。 > Translate "Basic API Concepts" page into Chinese > > > Key: FLINK-13405 > URL: https://issues.apache.org/jira/browse/FLINK-13405 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Affects Versions: 1.10.0 >Reporter: WangHengWei >Assignee: WangHengWei >Priority: Major > Labels: documentation, pull-request-available > Fix For: 1.10.0 > > > The page url is > [https://github.com/apache/flink/blob/master/docs/dev/api_concepts.zh.md] > The markdown file is located in flink/docs/dev/api_concepts.zh.md -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Closed] (FLINK-11769) The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset
[ https://issues.apache.org/jira/browse/FLINK-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-11769. --- Resolution: Duplicate > The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset > > > Key: FLINK-11769 > URL: https://issues.apache.org/jira/browse/FLINK-11769 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.6.4, 1.7.2 >Reporter: Xingcan Cui >Priority: Major > > As raised in [this > thread|https://lists.apache.org/thread.html/3a93723d2e74ae667a9aeb7d6ff28955f3ef79b5f20b4848b67fe709@%3Cuser.flink.apache.org%3E]. > The {{estimateDataTypesSize}} method in {{FlinkRelNode}} causes NPE for a > {{Multiset>}} field type. Maybe the {{keyType}} or the > {{valueType}} is empty in that case. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-11433) JOIN on a table having a column of type MULTISET gives a NPE
[ https://issues.apache.org/jira/browse/FLINK-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889287#comment-16889287 ] Xingcan Cui commented on FLINK-11433: - Hi [~azagrebin], I think they refer to the same issue. I'll mark FLINK-11769 as a duplicate and close it. > JOIN on a table having a column of type MULTISET gives a NPE > - > > Key: FLINK-11433 > URL: https://issues.apache.org/jira/browse/FLINK-11433 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.7.0, 1.7.1 >Reporter: Elias Saalmann >Assignee: TANG Wen-hui >Priority: Major > > I get an error (Error while applying rule FlinkLogicalJoinConverter) when > performing a JOIN on a table having a column of type MULTISET (e.g. a COLLECT > as aggregation of a GROUP BY), for instance: > SELECT a, d > FROM TableA JOIN ( > SELECT b, COLLECT(c) AS d > FROM TableB > GROUP BY b > ) TableC ON a = b > Full stacktrace: > Exception in thread "main" java.lang.RuntimeException: Error while applying > rule FlinkLogicalJoinConverter, args > [rel#71:LogicalJoin.NONE(left=rel#69:Subset#3.NONE,right=rel#70:Subset#4.NONE,condition==($2, > $0),joinType=inner)] > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:236) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:646) > at > org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:339) > at > org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:373) > at > org.apache.flink.table.api.TableEnvironment.optimizeLogicalPlan(TableEnvironment.scala:292) > at > org.apache.flink.table.api.BatchTableEnvironment.optimize(BatchTableEnvironment.scala:455) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:475) > at > org.apache.flink.table.api.java.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:165) > at org.myorg.quickstart.TableJob2.main(TableJob2.java:40) > Caused by: java.lang.RuntimeException: Error occurred while applying rule > FlinkLogicalJoinConverter > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:149) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234) > at > org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:141) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212) > ... 
8 more > Caused by: java.lang.NullPointerException > at > org.apache.flink.table.plan.nodes.FlinkRelNode$class.estimateDataTypeSize(FlinkRelNode.scala:84) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.estimateDataTypeSize(FlinkLogicalJoinBase.scala:29) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$class.estimateDataTypeSize(FlinkRelNode.scala:104) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.estimateDataTypeSize(FlinkLogicalJoinBase.scala:29) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$$anonfun$estimateRowSize$2.apply(FlinkRelNode.scala:80) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$$anonfun$estimateRowSize$2.apply(FlinkRelNode.scala:79) > at > scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57) > at > scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66) > at scala.collection.mutable.ArrayBuffer.foldLeft(ArrayBuffer.scala:48) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$class.estimateRowSize(FlinkRelNode.scala:79) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.estimateRowSize(FlinkLogicalJoinBase.scala:29) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.computeSelfCost(FlinkLogicalJoinBase.scala:48) > at > org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:162) > at > GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown > Source) > at > GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown > Source) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:301) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(VolcanoPlanner.java:953) > at > org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:339) > at > org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:322) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1643) > at > o
[jira] [Updated] (FLINK-12116) Args autocast will cause exception for plan transformation in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-12116: Labels: (was: pull-request-available) > Args autocast will cause exception for plan transformation in TableAPI > -- > > Key: FLINK-12116 > URL: https://issues.apache.org/jira/browse/FLINK-12116 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.6.4, 1.7.2 >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > In tableAPI, the automatic typecast for arguments may break their initial > structures, which makes {{TreeNode.makeCopy()}} fail. > Take the {{ConcatWs}} function as an example. It requires a string > {{Expression}} sequence for the second parameter of its constructor. If we > provide some {{Expressions}} with other types, the planner will try to cast > them automatically. However, during this process, the arguments will be > incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two > expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause > {{java.lang.IllegalArgumentException: wrong number of arguments}} for > {{Constructor.newInstance()}}. > As a workaround, we can cast these arguments manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10049) Correctly handle NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10049: Description: Currently, the built-in functions treat NULL arguments in different ways. E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws NPE. The general SQL-way of handling NULL values should be that if one argument is NULL the result is NULL. We should keep the correct semantics and avoid terminating the (continuous) queries unexpectedly. (was: Currently, the built-in functions treat NULL arguments in different ways. E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general SQL-way of handling NULL values should be that if one argument is NULL the result is NULL. We should unify the processing logic for that.) > Correctly handle NULL arguments in SQL built-in functions > - > > Key: FLINK-10049 > URL: https://issues.apache.org/jira/browse/FLINK-10049 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > > Currently, the built-in functions treat NULL arguments in different ways. > E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws NPE. The general > SQL-way of handling NULL values should be that if one argument is NULL the > result is NULL. We should keep the correct semantics and avoid terminating > the (continuous) queries unexpectedly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10049) Correctly handle NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10049: Summary: Correctly handle NULL arguments in SQL built-in functions (was: Unify the processing logic for NULL arguments in SQL built-in functions) > Correctly handle NULL arguments in SQL built-in functions > - > > Key: FLINK-10049 > URL: https://issues.apache.org/jira/browse/FLINK-10049 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > > Currently, the built-in functions treat NULL arguments in different ways. > E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general > SQL-way of handling NULL values should be that if one argument is NULL the > result is NULL. We should unify the processing logic for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813002#comment-16813002 ] Xingcan Cui commented on FLINK-10049: - Hey guys, thanks for the comments. I haven't gone through all the documents, but it does seem that different SQL engines have different mechanisms for this problem. However, since all the fields in Flink SQL are nullable in the current version, simply throwing an NPE and terminating the execution should always be avoided. IMO, each UDF is responsible for handling {{NULL}} arguments itself, with the correct semantics. {{NULL}} means unknown in SQL, and thus most scalar functions should return an unknown result for an unknown input. We can add the {{RETURNS NULL ON NULL INPUT}} option to UDF definitions (maybe as a method to be overridden), but it works more like an optimization hint, which means that even without this declaration, the function should return {{NULL}} when invoked with {{NULL}} arguments (just in case). Actually, there's no need to unify the processing logic. Just keep the correct semantics and avoid terminating the (continuous) queries unexpectedly. Thus, I plan to rename this ticket to "Correctly handle NULL arguments in SQL built-in functions". As for the exception handling mechanism, it's a slightly different topic and we'd better discuss it elsewhere. What do you think? > Unify the processing logic for NULL arguments in SQL built-in functions > --- > > Key: FLINK-10049 > URL: https://issues.apache.org/jira/browse/FLINK-10049 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > > Currently, the built-in functions treat NULL arguments in different ways. > E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general > SQL-way of handling NULL values should be that if one argument is NULL the > result is NULL. We should unify the processing logic for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
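To make the proposed "NULL in, NULL out" contract concrete, here is a minimal hypothetical sketch of a scalar UDF that follows it; the class name is made up and this is not the actual built-in implementation.
{code:java}
import org.apache.flink.table.functions.ScalarFunction;

// Hypothetical UDF showing the "NULL in, NULL out" contract: a NULL argument
// produces a NULL result instead of an exception that kills the query.
public class SafeLog10 extends ScalarFunction {
    public Double eval(Double x) {
        if (x == null) {
            // NULL means unknown in SQL, so the result is unknown as well.
            return null;
        }
        return Math.log10(x);
    }
}
{code}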
[jira] [Commented] (FLINK-10684) Improve the CSV reading process
[ https://issues.apache.org/jira/browse/FLINK-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812514#comment-16812514 ] Xingcan Cui commented on FLINK-10684: - Hi all, thanks for your attention. I ran into several problems when reading CSV files for data wrangling, and that's why this ticket was filed. As shown in the description, the problems come from different aspects: P1. Lack of schema inference. P2. Weak error handling. P3. Other implicit bugs (or the standard incompatibilities mentioned by [~fhueske]). Since the basic {{CsvInputFormat}} is used in both streaming and batch environments, some solutions to these problems may be tricky. For instance, to automatically infer schemas, we need to introduce a new mechanism, stream sampling, from which I believe some other processes such as automatic parallelism tuning and stream SQL optimization will also benefit. To solve these problems in my own project, I applied some workarounds, but they are not general enough. Although I did have some general ideas (e.g., using a side output for bad records), considering that the Flink project has been adopting some major changes recently, it may be better to propose them after those changes settle. All in all, I personally don't think now is a good time to concentrate on this issue because none of the solutions are trivial. What do you think? > Improve the CSV reading process > --- > > Key: FLINK-10684 > URL: https://issues.apache.org/jira/browse/FLINK-10684 > Project: Flink > Issue Type: Improvement > Components: API / DataSet >Reporter: Xingcan Cui >Priority: Major > > CSV is one of the most commonly used file formats in data wrangling. To load > records from CSV files, Flink has provided the basic {{CsvInputFormat}}, as > well as some variants (e.g., {{RowCsvInputFormat}} and > {{PojoCsvInputFormat}}). However, it seems that the reading process can be > improved. For example, we could add a built-in util to automatically infer > schemas from CSV headers and samples of data. Also, the current bad record > handling method can be improved by somehow keeping the invalid lines (and > even the reasons for failed parsing), instead of logging the total number > only. > This is an umbrella issue for all the improvements and bug fixes for the CSV > reading process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
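As a rough illustration of the side-output idea mentioned in the comment, here is a hypothetical sketch; the two-column record layout is assumed for the example and is not part of the ticket.
{code:java}
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Hypothetical sketch: lines that fail to parse are routed to a side output
// together with the failure reason, instead of only being counted in a log.
public class CsvLineParser extends ProcessFunction<String, Row> {

    public static final OutputTag<String> BAD_RECORDS =
            new OutputTag<String>("bad-csv-records") {};

    @Override
    public void processElement(String line, Context ctx, Collector<Row> out) {
        try {
            String[] fields = line.split(",");
            out.collect(Row.of(fields[0], Integer.parseInt(fields[1])));
        } catch (Exception e) {
            // Keep the invalid line and the reason for the failed parsing.
            ctx.output(BAD_RECORDS, line + " -> " + e.getMessage());
        }
    }
}
{code}
The invalid lines would then be retrievable via {{getSideOutput(CsvLineParser.BAD_RECORDS)}} on the resulting stream.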
[jira] [Updated] (FLINK-12116) Args autocast will cause exception for plan transformation in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-12116: Description: In tableAPI, the automatic typecast for arguments may break their initial structures, which makes {{TreeNode.makeCopy()}} fail. Take the {{ConcatWs}} function as an example. It requires a string {{Expression}} sequence for the second parameter of its constructor. If we provide some {{Expressions}} with other types, the planner will try to cast them automatically. However, during this process, the arguments will be incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause {{java.lang.IllegalArgumentException: wrong number of arguments}} for {{Constructor.newInstance()}}. As a workaround, we can cast these arguments manually. was: In tableAPI, the automatic typecast for arguments may break their initial structures, which makes {{TreeNode.makeCopy()}} fail. Take the {{ConcatWs}} function as an example. It requires a string {{Expression}} sequence for the second parameter of its constructor. If we provide some {{Expressions}} with other types, the planner will try to cast them automatically. However, during this process, the arguments will be incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause {{java.lang.IllegalArgumentException: wrong number of arguments}}. As a workaround, we can cast these arguments manually. > Args autocast will cause exception for plan transformation in TableAPI > -- > > Key: FLINK-12116 > URL: https://issues.apache.org/jira/browse/FLINK-12116 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.6.4, 1.7.2 >Reporter: Xingcan Cui >Priority: Major > > In tableAPI, the automatic typecast for arguments may break their initial > structures, which makes {{TreeNode.makeCopy()}} fail. > Take the {{ConcatWs}} function as an example. It requires a string > {{Expression}} sequence for the second parameter of its constructor. If we > provide some {{Expressions}} with other types, the planner will try to cast > them automatically. However, during this process, the arguments will be > incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two > expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause > {{java.lang.IllegalArgumentException: wrong number of arguments}} for > {{Constructor.newInstance()}}. > As a workaround, we can cast these arguments manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12116) Args autocast will cause exception for plan transformation in TableAPI
Xingcan Cui created FLINK-12116: --- Summary: Args autocast will cause exception for plan transformation in TableAPI Key: FLINK-12116 URL: https://issues.apache.org/jira/browse/FLINK-12116 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.7.2, 1.6.4 Reporter: Xingcan Cui In tableAPI, the automatic typecast for arguments may break their initial structures, which makes {{TreeNode.makeCopy()}} fail. Take the {{ConcatWs}} function as an example. It requires a string {{Expression}} sequence for the second parameter of its constructor. If we provide some {{Expressions}} with other types, the planner will try to cast them automatically. However, during this process, the arguments will be incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause {{java.lang.IllegalArgumentException: wrong number of arguments}}. As a workaround, we can cast these arguments manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
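A minimal sketch of the manual-cast workaround mentioned in the description; {{input}}, {{f1}}, and {{f2}} are hypothetical names. Casting the arguments explicitly keeps the planner from applying the automatic cast that breaks {{TreeNode.makeCopy()}}.
{code:java}
import org.apache.flink.table.api.Table;

class ManualCastSketch {
    // `input` is a Table with non-string fields f1 and f2 (hypothetical).
    static Table workaround(Table input) {
        // Casting the arguments to STRING explicitly avoids the planner's
        // automatic cast that incorrectly unwraps the argument sequence.
        return input.select("concat_ws('~', f1.cast(STRING), f2.cast(STRING))");
    }
}
{code}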
[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807880#comment-16807880 ] Xingcan Cui commented on FLINK-7865: Thanks for working on that, [~hyuan]! I'll take care of this issue on the Flink side. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To work around the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-7865: -- Assignee: Xingcan Cui > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To work around the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-11528. --- > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-11528. - Resolution: Done Implemented with 6af7f48bd754fb9a5635c25ec7656677fcf10b9b > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11819) Additional attribute for order by not support by flink sql, but told supported in doc
[ https://issues.apache.org/jira/browse/FLINK-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793592#comment-16793592 ] Xingcan Cui commented on FLINK-11819: - Hi [~hequn8128] and [~hustclf], sorry for the late reply. IMO, the current description seems to be OK. However, this could be subjective. If more readers are confused, we should definitely improve it. Best, Xingcan > Additional attribute for order by not support by flink sql, but told > supported in doc > - > > Key: FLINK-11819 > URL: https://issues.apache.org/jira/browse/FLINK-11819 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.7.2 >Reporter: Lifei Chen >Assignee: Lifei Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3 > > Original Estimate: 3h > Time Spent: 20m > Remaining Estimate: 2h 40m > > I am using flink v1.7.1, when I use flink sql to order by an attribute (not > time attribute), the error logs is as follow. > > sql: > {quote}"SELECT * FROM events order by tenantId" > {quote} > > error logs: > {quote}Exception in thread "main" org.apache.flink.table.api.TableException: > Cannot generate a valid execution plan for the given query: > FlinkLogicalSort(sort0=[$2], dir0=[ASC]) > FlinkLogicalNativeTableScan(table=[[_DataStreamTable_0]]) > This exception indicates that the query uses an unsupported SQL feature. > Please check the documentation for the set of currently supported SQL > features. > at > org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:377) > at > org.apache.flink.table.api.TableEnvironment.optimizePhysicalPlan(TableEnvironment.scala:302) > at > org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:814) > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:860) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:305) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:248) > {quote} > > So as for now, only time attribute is supported by flink for command `order > by`, additional attribute is not supported yet, Is that right ? > If so, there is a mistake, indicated that other attribute except for `time > attribute` is supported . > related links: > [https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/sql.html#orderby--limit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11819) Additional attribute for order by not support by flink sql, but told supported in doc
[ https://issues.apache.org/jira/browse/FLINK-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784452#comment-16784452 ] Xingcan Cui commented on FLINK-11819: - Hi [~hustclf], I think "additional" means you should first sort on a time attribute and then use other attributes, e.g., "order by rowtime, tenantId". > Additional attribute for order by not support by flink sql, but told > supported in doc > - > > Key: FLINK-11819 > URL: https://issues.apache.org/jira/browse/FLINK-11819 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.7.2 >Reporter: Lifei Chen >Assignee: Lifei Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3 > > Original Estimate: 3h > Time Spent: 10m > Remaining Estimate: 2h 50m > > I am using flink v1.7.1, when I use flink sql to order by an attribute (not > time attribute), the error logs is as follow. > > sql: > {quote}"SELECT * FROM events order by tenantId" > {quote} > > error logs: > {quote}Exception in thread "main" org.apache.flink.table.api.TableException: > Cannot generate a valid execution plan for the given query: > FlinkLogicalSort(sort0=[$2], dir0=[ASC]) > FlinkLogicalNativeTableScan(table=[[_DataStreamTable_0]]) > This exception indicates that the query uses an unsupported SQL feature. > Please check the documentation for the set of currently supported SQL > features. > at > org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:377) > at > org.apache.flink.table.api.TableEnvironment.optimizePhysicalPlan(TableEnvironment.scala:302) > at > org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:814) > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:860) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:305) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:248) > {quote} > > So as for now, only time attribute is supported by flink for command `order > by`, additional attribute is not supported yet, Is that right ? > If so, there is a mistake, indicated that other attribute except for `time > attribute` is supported . > related links: > [https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/sql.html#orderby--limit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
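A short sketch of the supported pattern described in the comment above, assuming the {{events}} table is registered with a time attribute named {{rowtime}} (a hypothetical name):
{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

class OrderBySketch {
    static Table sorted(StreamTableEnvironment tEnv) {
        // Supported: the primary sort key is the time attribute; tenantId
        // is the "additional" attribute the documentation refers to.
        return tEnv.sqlQuery("SELECT * FROM events ORDER BY rowtime, tenantId");
    }
}
{code}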
[jira] [Created] (FLINK-11769) The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset
Xingcan Cui created FLINK-11769: --- Summary: The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset Key: FLINK-11769 URL: https://issues.apache.org/jira/browse/FLINK-11769 Project: Flink Issue Type: Bug Components: Table API & SQL Affects Versions: 1.7.2, 1.6.4 Reporter: Xingcan Cui As raised in [this thread|https://lists.apache.org/thread.html/3a93723d2e74ae667a9aeb7d6ff28955f3ef79b5f20b4848b67fe709@%3Cuser.flink.apache.org%3E]. The {{estimateDataTypesSize}} method in {{FlinkRelNode}} causes NPE for a {{Multiset>}} field type. Maybe the {{keyType}} or the {{valueType}} is empty in that case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-11567) Translate "How to Review a Pull Request" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-11567. - Resolution: Done Fixed in flink-web: 63af7e0c6fa4a87072f54fc6bf0cf4ebe5c56b25 > Translate "How to Review a Pull Request" page into Chinese > -- > > Key: FLINK-11567 > URL: https://issues.apache.org/jira/browse/FLINK-11567 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: xulinjie >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Translate "How to Review a Pull Request" page into Chinese. > The markdown file is located in: flink-web/reviewing-prs.zh.md > The url link is: https://flink.apache.org/zh/reviewing-prs.html > Please adjust the links in the page to Chinese pages when translating. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760922#comment-16760922 ] Xingcan Cui commented on FLINK-11528: - Hi [~jark], thanks for your effort on the Chinese docs/website. I'll help to take care of this page. > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-11528: --- Assignee: Xingcan Cui > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11227) The DescriptorProperties contains some bounds checking errors
Xingcan Cui created FLINK-11227: --- Summary: The DescriptorProperties contains some bounds checking errors Key: FLINK-11227 URL: https://issues.apache.org/jira/browse/FLINK-11227 Project: Flink Issue Type: Bug Components: Table API & SQL Affects Versions: 1.7.1, 1.6.3 Reporter: Xingcan Cui Assignee: Xingcan Cui Fix For: 1.6.4, 1.7.2, 1.8.0 In {{DescriptorProperties}}, both the {{validateFixedIndexedProperties()}} and {{validateArray()}} use wrong upperbounds for validation, which leads to the last element not being validated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
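The affected methods are not quoted in the report, but the symptom matches the usual off-by-one in an exclusive upper bound; the following is a hypothetical sketch of the pattern, not the actual {{DescriptorProperties}} code.
{code:java}
import java.util.List;

class BoundsCheckSketch {
    // Buggy pattern: the exclusive upper bound skips the last element.
    static void validateAllBuggy(List<String> values) {
        for (int i = 0; i < values.size() - 1; i++) {
            validate(values.get(i)); // the element at size() - 1 is never checked
        }
    }

    // Fixed pattern: every element, including the last one, is validated.
    static void validateAll(List<String> values) {
        for (int i = 0; i < values.size(); i++) {
            validate(values.get(i));
        }
    }

    static void validate(String v) {
        if (v == null || v.isEmpty()) {
            throw new IllegalArgumentException("invalid property value");
        }
    }
}
{code}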
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729631#comment-16729631 ] Xingcan Cui commented on FLINK-11220: - Hi [~sunjincheng121] and [~jark], I admit that automatically materializing the rowtime fields will be convenient in some cases. However, that seems not to be quite reasonable. Here are my thoughts. 1. As you said, the {{RowtimeIndicator}} and {{TIMESTAMP}} are two different types. It will be confusing if we implicitly change the field types after a join. 2. The current approach makes it possible for the time-windowed join to be used as a sub-query for further rowtime-based operations (e.g., group windows). 3. Now that the results of time-windowed joins are still aligned with watermarks (which are held-back according to the window size), I think it doesn't make sense to ignore it and let users assign rowtime/watermarks manually again. All in all, using {{CAST}} seems to be a little verbose, but should be a feasible solution for the time being. Any idea? Best, Xingcan > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. > Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`, if we want using the timestamp after > toDataSteram, we should using `assignTimestampsAndWatermarks()`. > What do you think ? [~twalthr] [~fhueske] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729271#comment-16729271 ] Xingcan Cui commented on FLINK-11220: - Hi [~sunjincheng121], if users want to preserve a rowtime field (as a common field), they can cast the field to TIMESTAMP manually (i.e., {{select orderTime, cast (payTime as timestamp) from ... join ...}}). > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. > Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`, if we want using the timestamp after > toDataSteram, we should using `assignTimestampsAndWatermarks()`. > What do you think ? [~twalthr] [~fhueske] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
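Spelled out against the tables from the issue description, the suggested workaround would look roughly like this (an untested sketch):
{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

class RowtimeCastSketch {
    static Table query(StreamTableEnvironment tEnv) {
        // payTime is cast to a plain TIMESTAMP, so orderTime remains the only
        // rowtime field when the result is converted to a DataStream.
        return tEnv.sqlQuery(
            "SELECT orderTime, o.orderId, CAST(payTime AS TIMESTAMP) AS payTime "
                + "FROM Orders AS o JOIN Payment AS p "
                + "ON o.orderId = p.orderId AND "
                + "p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR");
    }
}
{code}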
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729019#comment-16729019 ] Xingcan Cui commented on FLINK-11220: - Hi [~sunjincheng121], I'd like to give some explanations about this problem. Unlike the common join, the time-windowed join will produce a time-ordered stream, which means the event times of the results are still aligned with the watermarks. Thus we can directly choose either of the event time fields from the original streams as the new event time field. It's quite an interesting problem to deal with the event time field after some aggregations or joins (e.g., in common join, no event time field can be preserved). Personally, I suggest keeping the current manner and maybe in the future, we can extend the time system to support multiple event times and separated watermarks. What do you think? Best, Xingcan > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. > Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`, if we want using the timestamp after > toDataSteram, we should using `assignTimestampsAndWatermarks()`. > What do you think ? [~twalthr] [~fhueske] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676901#comment-16676901 ] Xingcan Cui edited comment on FLINK-10463 at 11/7/18 11:32 AM: --- Fixed in 1.8.0 5a4e5e9dcaa83368862d27bd80429235c02ed66e Fixed in 1.7.1 98198d4cda78cc815fe6430e2249130efb102b61 Fixed in 1.6.3 761c7db58ea39bfc3a2bfdacba847d2d6e224129 was (Author: xccui): Fixed in 1.8.0 5a4e5e9dcaa83368862d27bd80429235c02ed66e > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.6.3, 1.8.0, 1.7.1 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676968#comment-16676968 ] Xingcan Cui commented on FLINK-10463: - [~twalthr], sure. I'll cherry-pick the commit to versions 1.6.3 and 1.7.1. > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.6.3, 1.8.0, 1.7.1 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10463: Fix Version/s: 1.7.1 1.6.3 > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.6.3, 1.8.0, 1.7.1 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-10463. - Resolution: Fixed Fixed in 1.8.0 5a4e5e9dcaa83368862d27bd80429235c02ed66e > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10463: Fix Version/s: 1.8.0 > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-10684) Improve the CSV reading process
Xingcan Cui created FLINK-10684: --- Summary: Improve the CSV reading process Key: FLINK-10684 URL: https://issues.apache.org/jira/browse/FLINK-10684 Project: Flink Issue Type: Improvement Components: Core Reporter: Xingcan Cui CSV is one of the most commonly used file formats in data wrangling. To load records from CSV files, Flink has provided the basic {{CsvInputFormat}}, as well as some variants (e.g., {{RowCsvInputFormat}} and {{PojoCsvInputFormat}}). However, it seems that the reading process can be improved. For example, we could add a built-in util to automatically infer schemas from CSV headers and samples of data. Also, the current bad record handling method can be improved by somehow keeping the invalid lines (and even the reasons for failed parsing), instead of logging the total number only. This is an umbrella issue for all the improvements and bug fixes for the CSV reading process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-9990) Add regexp_extract supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-9990. Resolution: Implemented Implemented in 1.7.0 5dc360984143005f73b8f70f97ed6b1c2afd7dc3 > Add regexp_extract supported in TableAPI and SQL > > > Key: FLINK-9990 > URL: https://issues.apache.org/jira/browse/FLINK-9990 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > > regexp_extract is a very useful function; it returns a string based on a regex > pattern and an index. > For example : > {code:java} > regexp_extract('foothebar', 'foo(.*?)(bar)', 2) // returns 'bar.' > {code} > It is provided as a UDF in Hive, more details please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
Xingcan Cui created FLINK-10463: --- Summary: Null literal cannot be properly parsed in Java Table API function call Key: FLINK-10463 URL: https://issues.apache.org/jira/browse/FLINK-10463 Project: Flink Issue Type: Bug Components: Table API & SQL Reporter: Xingcan Cui For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws the following exception. {code:java} org.apache.flink.table.api.ExpressionParserException: Could not parse expression at column 13: string matching regex `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found Null(STRING).regexpReplace('oo|ar', '') ^ at org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) at org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7800) Enable window joins without equi-join predicates
[ https://issues.apache.org/jira/browse/FLINK-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632815#comment-16632815 ] Xingcan Cui commented on FLINK-7800: As posted earlier on the Calcite dev list in [Question about join plan optimization in Flink|https://lists.apache.org/thread.html/f40182397d393b4348a2658d256c87fd7566b8add0eef3642a152471@%3Cdev.calcite.apache.org%3E], the mechanism for forbidding some candidate plans still confuses me, and the problem becomes even harder when time attributes are considered (i.e., we need some predicate push-down to obtain the equi-predicate, but that doesn't hold for time attributes). Do you have any suggestions? [~fhueske] [~twalthr] > Enable window joins without equi-join predicates > > > Key: FLINK-7800 > URL: https://issues.apache.org/jira/browse/FLINK-7800 > Project: Flink > Issue Type: New Feature > Components: Table API & SQL >Affects Versions: 1.4.0 >Reporter: Fabian Hueske >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > > Currently, windowed joins can only be translated if they have at least one > equi-join predicate. This limitation exists due to the lack of a good cross > join strategy for the DataSet API. > Due to the window, windowed joins do not have to be executed as cross joins. > Hence, the equi-join limitation does not need to be enforced (even though > non-equi joins are executed with a parallelism of 1 right now). > We can resolve this issue by adding a boolean flag to the > {{FlinkLogicalJoinConverter}} rule to permit non-equi joins and add such a > rule to the logical optimization set of the DataStream API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
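For context, a sketch of the query shape this issue wants to enable; the tables {{A}} and {{B}} with a rowtime attribute {{ts}} are hypothetical. The join is bounded by a time window but carries no equi-join predicate, so it is currently rejected by the translation.
{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

class NonEquiWindowJoinSketch {
    static Table joined(StreamTableEnvironment tEnv) {
        // The join is bounded by the time window only; there is no
        // equi-join predicate relating the two inputs.
        return tEnv.sqlQuery(
            "SELECT * FROM A, B WHERE A.ts BETWEEN "
                + "B.ts - INTERVAL '1' HOUR AND B.ts + INTERVAL '1' HOUR");
    }
}
{code}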
[jira] [Resolved] (FLINK-10145) Add replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-10145. - Resolution: Fixed Fix Version/s: 1.7.0 Implemented in 1.7.0: 24af70fdecbbb66e8555df7aca35a92a2f1aa7ac > Add replace supported in TableAPI and SQL > - > > Key: FLINK-10145 > URL: https://issues.apache.org/jira/browse/FLINK-10145 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: Guibo Pan >Assignee: Guibo Pan >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > replace is a useful function for String. > For example: > {code:java} > select replace("Hello World", "World", "Flink") // return "Hello Flink" > select replace("ababab", "abab", "z") // return "zab" > {code} > It is supported as a UDF in Hive, more details please see [1] > [1]: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringFunctions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-9991) Add regexp_replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-9991. Resolution: Implemented Implemented in 1.7.0: f03d15a08ad5df44a4bb742d3edfdce211bf9e48 > Add regexp_replace supported in TableAPI and SQL > > > Key: FLINK-9991 > URL: https://issues.apache.org/jira/browse/FLINK-9991 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > Fix For: 1.7.0 > > > regexp_replace is a very useful function to process String. > For example : > {code:java} > regexp_replace("foobar", "oo|ar", "") //returns 'fb.' > {code} > It is supported as a UDF in Hive, more details please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9991) Add regexp_replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-9991: --- Fix Version/s: 1.7.0 > Add regexp_replace supported in TableAPI and SQL > > > Key: FLINK-9991 > URL: https://issues.apache.org/jira/browse/FLINK-9991 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > Fix For: 1.7.0 > > > regexp_replace is a very useful function to process String. > For example : > {code:java} > regexp_replace("foobar", "oo|ar", "") //returns 'fb.' > {code} > It is supported as a UDF in Hive, more details please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10323) A single backslash cannot be successfully parsed in Java Table API
[ https://issues.apache.org/jira/browse/FLINK-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611894#comment-16611894 ] Xingcan Cui commented on FLINK-10323: - Yes, it seems to have been fixed and we don't even need to double the backslashes in SQL queries. Thanks for your work, [~twalthr] :-) > A single backslash cannot be successfully parsed in Java Table API > -- > > Key: FLINK-10323 > URL: https://issues.apache.org/jira/browse/FLINK-10323 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.4.2, 1.5.3, 1.6.0 >Reporter: Xingcan Cui >Priority: Major > > The snippet below will cause a parser exception. > {code:java} > testAllApis( > concat_ws("~", "AA", "\\"), > "concat_ws('~','AA','\\')", > "concat_ws('~','AA','')", > "AA~\\") > {code} > {code} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 20: Invalid expression. > concat_ws('~','AA','\') >^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:549) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:542) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.addTableApiTestExpr(ExpressionTestBase.scala:255) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.testAllApis(ExpressionTestBase.scala:265) > at > org.apache.flink.table.expressions.ScalarFunctionsTest.testConcatWs(ScalarFunctionsTest.scala:340) > {code} > However, with double (or more) backslashes, it will be successfully parsed. > {code:java} > testAllApis( > concat_ws("~", "AA", ""), > "concat_ws('~','AA','')", > "concat_ws('~','AA','')", > "AA~") > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10323) A single backslash cannot be successfully parsed in Java Table API
[ https://issues.apache.org/jira/browse/FLINK-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611878#comment-16611878 ] Xingcan Cui commented on FLINK-10323: - Thanks for your reminder [~twalthr]. I'll check it and close this issue if it has been addressed. > A single backslash cannot be successfully parsed in Java Table API > -- > > Key: FLINK-10323 > URL: https://issues.apache.org/jira/browse/FLINK-10323 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.4.2, 1.5.3, 1.6.0 >Reporter: Xingcan Cui >Priority: Major > > The snippet below will cause a parser exception. > {code:java} > testAllApis( > concat_ws("~", "AA", "\\"), > "concat_ws('~','AA','\\')", > "concat_ws('~','AA','')", > "AA~\\") > {code} > {code} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 20: Invalid expression. > concat_ws('~','AA','\') >^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:549) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:542) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.addTableApiTestExpr(ExpressionTestBase.scala:255) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.testAllApis(ExpressionTestBase.scala:265) > at > org.apache.flink.table.expressions.ScalarFunctionsTest.testConcatWs(ScalarFunctionsTest.scala:340) > {code} > However, with double (or more) backslashes, it will be successfully parsed. > {code:java} > testAllApis( > concat_ws("~", "AA", ""), > "concat_ws('~','AA','')", > "concat_ws('~','AA','')", > "AA~") > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)