[jira] [Assigned] (FLINK-33926) Can't start a job with a jar in the system classpath in native k8s mode
[ https://issues.apache.org/jira/browse/FLINK-33926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-33926: --- Assignee: Gantigmaa Selenge > Can't start a job with a jar in the system classpath in native k8s mode > --- > > Key: FLINK-33926 > URL: https://issues.apache.org/jira/browse/FLINK-33926 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.0 >Reporter: Trystan >Assignee: Gantigmaa Selenge >Priority: Major > > It appears that the combination of running operator-controlled jobs in > native k8s + application mode + using a job jar in the classpath is invalid. > Avoiding dynamic classloading (as specified in the > [docs|https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code]) > is beneficial for some jobs. This affects at least Flink 1.16.1 and > Kubernetes Operator 1.6.0. > > FLINK-29288 seems to have addressed this for standalone mode. If I am > misunderstanding how to correctly build jars for this native k8s scenario, > apologies for the noise and any pointers would be appreciated! > > Perhaps related, the [spec > documentation|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/reference/#jobspec] > declares *jarURI* optional but isn't clear about the conditions under which > that applies. > * Putting the jar in the system classpath and pointing *jarURI* to that jar > leads to linkage errors. > * Not including *jarURI* leads to NullPointerExceptions in the operator: > {code:java} > {"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"java.lang.NullPointerException","stackTrace":"org.apache.flink.kubernetes.operator.exception.ReconciliationException: > java.lang.NullPointerException\n\tat > org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:148)\n\tat > > org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:56)\n\tat > > io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:138)\n\tat > > io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:96)\n\tat > > org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)\n\tat > > io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:95)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:139)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:119)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:89)\n\tat > > io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)\n\tat > > io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:414)\n\tat > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown > Source)\n\tat > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown > Source)\n\tat java.base/java.lang.Thread.run(Unknown Source)\nCaused by: > java.lang.NullPointerException\n\tat > 
org.apache.flink.kubernetes.utils.KubernetesUtils.checkJarFileForApplicationMode(KubernetesUtils.java:407)\n\tat > > org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployApplicationCluster(KubernetesClusterDescriptor.java:207)\n\tat > > org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)\n\tat > > org.apache.flink.kubernetes.operator.service.NativeFlinkService.deployApplicationCluster(Native","additionalMetadata":{},"throwableList":[{"type":"java.lang.NullPointerException","additionalMetadata":{}}]} > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874102#comment-17874102 ] Xingcan Cui edited comment on FLINK-36004 at 8/16/24 4:30 PM: -- Fixed in master: b460d3d6fe00a18342783113635d1d80774fefe6 release-1.20: 9601e06bab1c4541dd5f7043c36031ac971dd705 release-1.19: 204db68ac418716e8af6fe371c606c129ddb4daa was (Author: xccui): Fixed in master: b460d3d6fe00a18342783113635d1d80774fefe6 > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-36004. - Resolution: Fixed > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874102#comment-17874102 ] Xingcan Cui commented on FLINK-36004: - Fixed in master: b460d3d6fe00a18342783113635d1d80774fefe6 > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-36004: --- Assignee: xuyang > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Blocker > Labels: pull-request-available > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-35935: --- Assignee: xuyang (was: rocky-xu) > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: xuyang >Priority: Major > Labels: pull-request-available > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-35935: --- Assignee: rocky-xu (was: Xingcan Cui) > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: rocky-xu >Priority: Major > Labels: pull-request-available > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-35935: --- Assignee: Xingcan Cui > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872084#comment-17872084 ] Xingcan Cui commented on FLINK-36004: - Please see the screenshot for an explanation. The data types for {{request_feature_stage}} and {{execution_feature_stage}} are the same. It could be related to the flattened (unqualified) field access feature. !screenshot-1.png! > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-36004: Attachment: screenshot-1.png > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > Attachments: screenshot-1.png > > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872058#comment-17872058 ] Xingcan Cui commented on FLINK-36004: - I narrowed down the scope a bit. In the table schema, there are two fields with the same *_nested_* type but different names, say, {{f1}} and {{f2}}. When we only select {{f1.foo.bar}}, everything works fine. But when we select {{f1.foo.bar}} and {{f2.foo.bar}}, both of them will return the value for {{f2.foo.bar}}. > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
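A minimal SQL sketch of the behavior described in the comment above (table and field names are hypothetical; it assumes a catalog, e.g. a Paimon catalog, where the table can be created without connector options):
{code:sql}
-- Hypothetical schema: f1 and f2 share the same nested type.
CREATE TABLE t (
  `f1` ROW<`foo` ROW<`bar` BIGINT>>,
  `f2` ROW<`foo` ROW<`bar` BIGINT>>
);

-- Works as expected: only one nested field is selected.
SELECT f1.foo.bar FROM t;

-- Reported bug: both result columns carry the value of f2.foo.bar.
SELECT f1.foo.bar, f2.foo.bar FROM t;
{code}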
[jira] [Commented] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
[ https://issues.apache.org/jira/browse/FLINK-36004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871864#comment-17871864 ] Xingcan Cui commented on FLINK-36004: - Flink version: 1.18.1 and 1.19.1 Paimon version: 0.8.2 > Flink SQL returns wrong results for Paimon tables with complex schemas > -- > > Key: FLINK-36004 > URL: https://issues.apache.org/jira/browse/FLINK-36004 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1, 1.19.1 >Reporter: Xingcan Cui >Priority: Blocker > > We have a Paimon table with some nested fields such as the following one. > {code:java} > `f1` ROW < > `f2` ROW < > `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, > `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, > `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, > `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, > `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < > FLOAT > > > NOT NULL >, > `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > > NOT NULL >, > `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < > BIGINT > > > NOT NULL >, > `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` > STRING, `f104` BIGINT > NOT NULL > >, > `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > > > {code} > When a select query includes some nested columns, the results will be wrong. > For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return > correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} > will return wrong values for {{r}}. > The query execution won't throw any exception but fails silently. > I'm not sure if this is a Paimon-specific issue, but I also tested running > the same query with Spark and StarRocks, and both of them can produce correct > results. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-36004) Flink SQL returns wrong results for Paimon tables with complex schemas
Xingcan Cui created FLINK-36004: --- Summary: Flink SQL returns wrong results for Paimon tables with complex schemas Key: FLINK-36004 URL: https://issues.apache.org/jira/browse/FLINK-36004 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.19.1, 1.18.1 Reporter: Xingcan Cui We have a Paimon table with some nested fields such as the following one. {code:java} `f1` ROW < `f2` ROW < `f3` ARRAY < ROW < `key` INT, `value` FLOAT > NOT NULL >, `f4` ARRAY < ROW < `key` INT, `value` STRING > NOT NULL >, `f5` ARRAY < ROW < `key` BIGINT, `value` FLOAT > NOT NULL >, `f6` ARRAY < ROW < `key` BIGINT, `value` BIGINT > NOT NULL >, `f7` ARRAY < ROW < `key` BIGINT, `value` ROW < `f71` ARRAY < FLOAT > > > NOT NULL >, `f8` ARRAY < ROW < `f81` STRING, `f82` STRING, `f83` BIGINT > NOT NULL >, `f9` ARRAY < ROW < `key` BIGINT, `value` ROW < `f91` ARRAY < BIGINT > > > NOT NULL >, `f10` ROW < `f101` ARRAY < ROW < `f102` BIGINT, `f103` STRING, `f104` BIGINT > NOT NULL > >, `f11` ARRAY < ROW < `key` BIGINT, `value` STRING > NOT NULL > > {code} When a select query includes some nested columns, the results will be wrong. For example, {{SELECT CARDINALITY(f1.f2.f3) AS r FROM...WHERE...}} can return correct results but {{SELECT CARDINALITY(f1.f2.f3) AS r, f1 FROM...WHERE...}} will return wrong values for {{r}}. The query execution won't throw any exception but fails silently. I'm not sure if this is a Paimon-specific issue, but I also tested running the same query with Spark and StarRocks, and both of them can produce correct results. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871856#comment-17871856 ] Xingcan Cui edited comment on FLINK-35935 at 8/8/24 5:39 AM: - hi [~xuyangzhong], it looks like this doesn't work for the filesystem connector. For instance {code:java} CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH ( 'connector' = 'filesystem','path' = 'path','sink.parallelism' = '1', 'format' = 'json') AS ( SELECT * FROM table WHERE dt = '2024-08-05' LIMIT 10) {code} `FETCH NEXT 10 ROWS ONLY` was added twice to the inner query. {code:java} Caused by: java.lang.AssertionError: not a query: SELECT * FROM `table` WHERE `table`.`dt` = '2024-08-05' FETCH NEXT 10 ROWS ONLY FETCH NEXT 10 ROWS ONLY {code} was (Author: xccui): hi [~xuyangzhong], it looks like this doesn't for the filesystem connector. For instance {code:java} CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH ( 'connector' = 'filesystem','path' = 'path','sink.parallelism' = '1', 'format' = 'json') AS ( SELECT * FROM table WHERE dt = '2024-08-05' LIMIT 10) {code} > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
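For reference, the CTE workaround from the issue description would look like this when applied to the filesystem example above (a sketch only; the source table is renamed to {{src}} since {{table}} is a reserved word, and the options are copied from the comment):
{code:sql}
-- Wrap the LIMIT query in a CTE so that CTAS sees a plain SELECT.
CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH (
  'connector' = 'filesystem', 'path' = 'path',
  'sink.parallelism' = '1', 'format' = 'json'
) AS (
  WITH R AS (SELECT * FROM src WHERE dt = '2024-08-05' LIMIT 10)
  SELECT * FROM R
);
{code}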
[jira] [Commented] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
[ https://issues.apache.org/jira/browse/FLINK-35935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871856#comment-17871856 ] Xingcan Cui commented on FLINK-35935: - hi [~xuyangzhong], it looks like this doesn't work for the filesystem connector. For instance {code:java} CREATE OR REPLACE TABLE `default_catalog`.`default_database`.`test` WITH ( 'connector' = 'filesystem','path' = 'path','sink.parallelism' = '1', 'format' = 'json') AS ( SELECT * FROM table WHERE dt = '2024-08-05' LIMIT 10) {code} > CREATE TABLE AS doesn't work with LIMIT > --- > > Key: FLINK-35935 > URL: https://issues.apache.org/jira/browse/FLINK-35935 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > {code:java} > CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} > The above statement throws a "Caused by: java.lang.AssertionError: not a query:" > exception. > A workaround is to wrap the query with a CTE. > {code:java} > CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * > FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
Xingcan Cui created FLINK-35935: --- Summary: CREATE TABLE AS doesn't work with LIMIT Key: FLINK-35935 URL: https://issues.apache.org/jira/browse/FLINK-35935 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui {code:java} CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5){code} The above statement throws a "Caused by: java.lang.AssertionError: not a query:" exception. A workaround is to wrap the query with a CTE. {code:java} CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * FROM R){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35485) JobMaster failed with "the job xx has not been finished"
[ https://issues.apache.org/jira/browse/FLINK-35485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854891#comment-17854891 ] Xingcan Cui commented on FLINK-35485: - Hi [~mapohl], I hit the exception again today and it caused the jobmanager to restart. Still no WARN+ logs. I checked the corresponding job and it actually finished successfully. > JobMaster failed with "the job xx has not been finished" > > > Key: FLINK-35485 > URL: https://issues.apache.org/jira/browse/FLINK-35485 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We ran a session cluster on K8s and used Flink SQL gateway to submit queries. > Hit the following rare exception once which caused the job manager to restart. > {code:java} > org.apache.flink.util.FlinkException: JobMaster for job > 50d681ae1e8170f77b4341dda6aba9bc failed. > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454) > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776) > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698) > at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown > Source) > at > java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown > Source) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) > at > org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) > at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) > at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) > at > org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) > at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) > at > org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) > at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) > at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) > at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) > at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) > at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) > at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown > Source) > at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) > at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) > 
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) > Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The > job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished. > at > org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407) > at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown > Source) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405) > at > org.apache.flink.runtime.jobm
[jira] [Commented] (FLINK-35485) JobMaster failed with "the job xx has not been finished"
[ https://issues.apache.org/jira/browse/FLINK-35485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852267#comment-17852267 ] Xingcan Cui commented on FLINK-35485: - Hi [~mapohl], unfortunately, I failed to fetch more logs from the cluster. If it happens again, I'll try to get some logs and post them here. Thanks! > JobMaster failed with "the job xx has not been finished" > > > Key: FLINK-35485 > URL: https://issues.apache.org/jira/browse/FLINK-35485 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We ran a session cluster on K8s and used Flink SQL gateway to submit queries. > Hit the following rare exception once which caused the job manager to restart. > {code:java} > org.apache.flink.util.FlinkException: JobMaster for job > 50d681ae1e8170f77b4341dda6aba9bc failed. > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454) > at > org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776) > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698) > at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown > Source) > at > java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown > Source) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) > at > org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) > at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) > at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) > at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) > at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) > at > org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) > at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) > at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) > at > org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) > at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) > at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) > at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) > at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) > at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) > at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown > Source) > at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) > at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) > at 
java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) > Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The > job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished. > at > org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407) > at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown > Source) > at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown > Source) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405) > at > org.apache.flink.runtime.jobmaster.JobMasterServiceLeader
[jira] [Created] (FLINK-35486) Potential sql expression generation issues on SQL gateway
Xingcan Cui created FLINK-35486: --- Summary: Potential sql expression generation issues on SQL gateway Key: FLINK-35486 URL: https://issues.apache.org/jira/browse/FLINK-35486 Project: Flink Issue Type: Bug Components: Table SQL / Gateway, Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui We hit the following exceptions a few times when submitting queries to a session cluster with the Flink SQL gateway. When the same queries were submitted again, everything was good. There might be a concurrency problem for the expression generator. {code:java} "process.thread.name":"sql-gateway-operation-pool-thread-111","log.logger":"org.apache.flink.table.gateway.service.operation.OperationManager","error.type":"org.apache.flink.table.planner.codegen.CodeGenException","error.message":"Mismatch of expected output data type 'ARRAY NOT NULL>' and function's output type 'ARRAY NOT NULL>'.","error.stack_trace":"org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of expected output data type 'ARRAY NOT NULL>' and function's output type 'ARRAY NOT NULL>'. at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.verifyOutputType(BridgingFunctionGenUtil.scala:369) at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.verifyFunctionAwareOutputType(BridgingFunctionGenUtil.scala:359) at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.generateFunctionAwareCallWithDataType(BridgingFunctionGenUtil.scala:107) at org.apache.flink.table.planner.codegen.calls.BridgingFunctionGenUtil$.generateFunctionAwareCall(BridgingFunctionGenUtil.scala:84) at org.apache.flink.table.planner.codegen.calls.BridgingSqlFunctionCallGen.generate(BridgingSqlFunctionCallGen.scala:79) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateCallExpression(ExprCodeGenerator.scala:820) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:481) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.$anonfun$visitCall$1(ExprCodeGenerator.scala:478) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:469) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.$anonfun$visitCall$1(ExprCodeGenerator.scala:478) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at 
scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:469) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:189) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateExpression(ExprCodeGenerator.scala:134) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.$anonfun$generateProcessCode$4(CalcCodeGenerator.scala:140) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.produceProjectionCode$1(CalcCodeGenerator.scala:140) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$
[jira] [Created] (FLINK-35485) JobMaster failed with "the job xx has not been finished"
Xingcan Cui created FLINK-35485: --- Summary: JobMaster failed with "the job xx has not been finished" Key: FLINK-35485 URL: https://issues.apache.org/jira/browse/FLINK-35485 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.18.1 Reporter: Xingcan Cui We ran a session cluster on K8s and used Flink SQL gateway to submit queries. Hit the following rare exception once which caused the job manager to restart. {code:java} org.apache.flink.util.FlinkException: JobMaster for job 50d681ae1e8170f77b4341dda6aba9bc failed. at org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454) at org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776) at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698) at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source) at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source) at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown Source) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) at org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) at org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) at org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) at org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source) at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished. 
at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407) at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown Source) at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown Source) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.runIfStateRunning(JobMasterServiceLeadershipRunner.java:463) at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.revokeLeadership(JobMasterServiceLeadershipRunner.java:397) at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.notifyLeaderContenderOfLeadershipLoss(DefaultLeaderElectionService.java:484) at java.base/java.util.HashMap.forEach(Unknown Source) at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.onRevokeLeadershipInternal(DefaultLeaderElectionService.java:452) at o
[jira] [Commented] (FLINK-28867) Parquet reader support nested type in array/map type
[ https://issues.apache.org/jira/browse/FLINK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845539#comment-17845539 ] Xingcan Cui commented on FLINK-28867: - Hey [~jark], any plan to improve this in the near future? I feel that this is a blocker for Flink OLAP despite the data lake projects having their own readers/writers. Sometimes users would like to use Flink to process some raw Parquet files. > Parquet reader support nested type in array/map type > > > Key: FLINK-28867 > URL: https://issues.apache.org/jira/browse/FLINK-28867 > Project: Flink > Issue Type: Sub-task > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.16.0 >Reporter: dalongliu >Priority: Major > Attachments: ReadParquetArray1.java, part-00121.parquet > -- This message was sent by Atlassian Jira (v8.20.10#820010)
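As an illustration of the kind of job this blocks, a sketch of reading raw Parquet files with a nested array type through the filesystem connector (table name, path, and schema are hypothetical):
{code:sql}
-- The ARRAY<ROW<...>> column is the nested type the reader cannot handle yet.
CREATE TABLE raw_events (
  `id` BIGINT,
  `tags` ARRAY<ROW<`key` STRING, `value` STRING>>
) WITH (
  'connector' = 'filesystem',
  'path' = 's3://bucket/events/',  -- hypothetical location
  'format' = 'parquet'
);

SELECT id, CARDINALITY(tags) FROM raw_events;
{code}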
[jira] [Commented] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838379#comment-17838379 ] Xingcan Cui commented on FLINK-34583: - Hi [~xuyangzhong], thanks for looking into this. I hit the issue when using the Paimon table source. The execution plan looks good. However, the options don't work. It could be a runtime issue or a Paimon source implementation bug. I can't remember clearly if Flink generates multiple table sources and then merges them at runtime. If it does, the options may not be merged properly. !image-2024-04-17-16-48-49-073.png! > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image-2024-04-17-16-35-06-153.png, > image-2024-04-17-16-48-49-073.png > > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
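For reference, the failing shape from the description in a form whose plan can be inspected with EXPLAIN (placeholder names taken from the description; the join column {{id}} is hypothetical):
{code:sql}
-- If the hint propagates, the OPTIONS should appear on every TableSourceScan
-- that the shared CTE T2 expands to in the printed plan.
EXPLAIN
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */),
     T3 AS (SELECT * FROM T2),
     T4 AS (SELECT * FROM T2),
     T5 AS (SELECT * FROM T3 JOIN T4 ON T3.id = T4.id)
SELECT * FROM T5;
{code}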
[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34583: Attachment: image-2024-04-17-16-48-49-073.png > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image-2024-04-17-16-35-06-153.png, > image-2024-04-17-16-48-49-073.png > > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34583: Attachment: image-2024-04-17-16-35-06-153.png > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image-2024-04-17-16-35-06-153.png > > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Attachment: image.png > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image.png > > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. > {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Description: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image.png! was: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Attachments: image.png > > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array<Row<`key`, `value`>> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids, > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. 
> {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Attachment: (was: image_720.png) > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array<Row<`key`, `value`>> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids, > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. > {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
[ https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34926: Description: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! was: We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! > Adaptive auto parallelism doesn't work for a query > -- > > Key: FLINK-34926 > URL: https://issues.apache.org/jira/browse/FLINK-34926 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We have the following query running in batch mode. > {code:java} > WITH FEATURE_INCLUSION AS ( > SELECT > insertion_id, -- Not unique > features -- Array<Row<`key`, `value`>> > FROM > features_table > ), > TOTAL AS ( > SELECT > COUNT(DISTINCT insertion_id) total_id > FROM > FEATURE_INCLUSION > ), > FEATURE_INCLUSION_COUNTS AS ( > SELECT > `key`, > COUNT(DISTINCT insertion_id) AS id_count > FROM > FEATURE_INCLUSION, > UNNEST(features) as t (`key`, `value`) > WHERE > TRUE > GROUP BY > `key` > ), > RESULTS AS ( > SELECT > `key` > FROM > FEATURE_INCLUSION_COUNTS, > TOTAL > WHERE > (1.0 * id_count)/total_id > 0.1 > ) > SELECT > JSON_ARRAYAGG(`key`) AS feature_ids, > FROM > RESULTS{code} > The parallelism adaptively set by Flink for the following operator was always > 1. 
> {code:java} > [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, > insertion_id]) > +- [38]:LocalHashAggregate(groupBy=[key], select=[key, > Partial_COUNT(insertion_id) AS count$0]){code} > If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and > manually set `parallelism.default` to be greater than one, it worked. > The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34926) Adaptive auto parallelism doesn't work for a query
Xingcan Cui created FLINK-34926: --- Summary: Adaptive auto parallelism doesn't work for a query Key: FLINK-34926 URL: https://issues.apache.org/jira/browse/FLINK-34926 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui Attachments: image_720.png We have the following query running in batch mode. {code:java} WITH FEATURE_INCLUSION AS ( SELECT insertion_id, -- Not unique features -- Array<Row<`key`, `value`>> FROM features_table ), TOTAL AS ( SELECT COUNT(DISTINCT insertion_id) total_id FROM FEATURE_INCLUSION ), FEATURE_INCLUSION_COUNTS AS ( SELECT `key`, COUNT(DISTINCT insertion_id) AS id_count FROM FEATURE_INCLUSION, UNNEST(features) as t (`key`, `value`) WHERE TRUE GROUP BY `key` ), RESULTS AS ( SELECT `key` FROM FEATURE_INCLUSION_COUNTS, TOTAL WHERE (1.0 * id_count)/total_id > 0.1 ) SELECT JSON_ARRAYAGG(`key`) AS feature_ids, FROM RESULTS{code} The parallelism adaptively set by Flink for the following operator was always 1. {code:java} [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id]) +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0]){code} If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to be greater than one, it worked. The screenshot of the full job graph is attached. !image_720.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
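Editor's note: a sketch of the workaround described in the report, disabling adaptive batch parallelism and pinning parallelism.default through the EnvironmentSettings configuration; the value 8 is a placeholder.
{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class AdaptiveParallelismWorkaround {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Disable adaptive batch parallelism and pin a fixed default,
        // mirroring the workaround from the report (8 is a placeholder).
        conf.setString("execution.batch.adaptive.auto-parallelism.enabled", "false");
        conf.setString("parallelism.default", "8");
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance()
                        .inBatchMode()
                        .withConfiguration(conf)
                        .build());
        // ... register features_table and run the query from the description ...
    }
}
{code}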
[jira] [Assigned] (FLINK-34723) Parquet writer should restrict map keys to be not null
[ https://issues.apache.org/jira/browse/FLINK-34723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-34723: --- Assignee: (was: Xingcan Cui) > Parquet writer should restrict map keys to be not null > -- > > Key: FLINK-34723 > URL: https://issues.apache.org/jira/browse/FLINK-34723 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.19.0, 1.18.1 >Reporter: Xingcan Cui >Priority: Major > Labels: pull-request-available > > We got the following exception when reading a parquet file (with map types) > generated by Flink. > {code:java} > Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34723) Parquet writer should restrict map keys to be not null
[ https://issues.apache.org/jira/browse/FLINK-34723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-34723: --- Assignee: Xingcan Cui > Parquet writer should restrict map keys to be not null > -- > > Key: FLINK-34723 > URL: https://issues.apache.org/jira/browse/FLINK-34723 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.19.0, 1.18.1 >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > We got the following exception when reading a parquet file (with map types) > generated by Flink. > {code:java} > Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34723) Parquet writer should restrict map keys to be not null
[ https://issues.apache.org/jira/browse/FLINK-34723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34723: Issue Type: Bug (was: New Feature) > Parquet writer should restrict map keys to be not null > -- > > Key: FLINK-34723 > URL: https://issues.apache.org/jira/browse/FLINK-34723 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.19.0, 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > We got the following exception when reading a parquet file (with map types) > generated by Flink. > {code:java} > Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34723) Parquet writer should restrict map keys to be not null
Xingcan Cui created FLINK-34723: --- Summary: Parquet writer should restrict map keys to be not null Key: FLINK-34723 URL: https://issues.apache.org/jira/browse/FLINK-34723 Project: Flink Issue Type: New Feature Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.18.1, 1.19.0 Reporter: Xingcan Cui We got the following exception when reading a parquet file (with map types) generated by Flink. {code:java} Map keys must be annotated as required.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
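Editor's note: for context, the quoted error comes from the Parquet format spec, which requires map keys to be written as required (non-null) fields. Below is a minimal sketch of the corresponding declaration on the Flink side, with a hypothetical features column: the key type is marked NOT NULL while the value type may stay nullable.
{code:java}
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.DataType;

public class ParquetMapKeyNullability {
    public static void main(String[] args) {
        // Parquet requires map keys to be `required`, i.e. non-null, so a
        // Flink schema destined for the Parquet writer should declare the
        // key type NOT NULL (the value type may remain nullable).
        DataType features = DataTypes.MAP(DataTypes.STRING().notNull(), DataTypes.FLOAT());
        System.out.println(features); // e.g. MAP<STRING NOT NULL, FLOAT>
    }
}
{code}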
[jira] [Assigned] (FLINK-34633) Support unnesting array constants
[ https://issues.apache.org/jira/browse/FLINK-34633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-34633: --- Assignee: Jeyhun Karimov > Support unnesting array constants > - > > Key: FLINK-34633 > URL: https://issues.apache.org/jira/browse/FLINK-34633 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Assignee: Jeyhun Karimov >Priority: Minor > Labels: pull-request-available > > It seems that the current planner doesn't support using UNNEST on array > constants.(x) > {code:java} > SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} > > The following query can't be compiled.(x) > {code:java} > SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} > > The rewritten version works. (/) > {code:java} > SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN > UNNEST(A){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34633) Support unnesting array constants
[ https://issues.apache.org/jira/browse/FLINK-34633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34633: Description: It seems that the current planner doesn't support using UNNEST on array constants.(x) {code:java} SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} The following query can't be compiled.(x) {code:java} SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} The rewritten version works. (/) {code:java} SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN UNNEST(A){code} was: It seems that the current planner doesn't support using UNNEST on array constants.(x) {code:java} SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} The following query can be compiled.(x) {code:java} SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} The rewritten version works. (/) {code:java} SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN UNNEST(A){code} > Support unnesting array constants > - > > Key: FLINK-34633 > URL: https://issues.apache.org/jira/browse/FLINK-34633 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Minor > > It seems that the current planner doesn't support using UNNEST on array > constants.(x) > {code:java} > SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} > > The following query can't be compiled.(x) > {code:java} > SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} > > The rewritten version works. (/) > {code:java} > SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN > UNNEST(A){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34633) Support unnesting array constants
Xingcan Cui created FLINK-34633: --- Summary: Support unnesting array constants Key: FLINK-34633 URL: https://issues.apache.org/jira/browse/FLINK-34633 Project: Flink Issue Type: New Feature Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui It seems that the current planner doesn't support using UNNEST on array constants.(x) {code:java} SELECT * FROM UNNEST(ARRAY[1,2,3]);{code} The following query can be compiled.(x) {code:java} SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3]){code} The rewritten version works. (/) {code:java} SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) CROSS JOIN UNNEST(A){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
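Editor's note: a self-contained sketch of the behavior reported above (the failing statement is left commented out): the direct CROSS JOIN UNNEST of an array constant is rejected, while the rewrite that first projects the array into a subquery works.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class UnnestArrayConstants {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Fails to compile at the affected version:
        // tEnv.executeSql("SELECT * FROM (VALUES('a')) CROSS JOIN UNNEST(ARRAY[1, 2, 3])");
        // The rewrite from the report works: project the array first, then unnest it.
        tEnv.executeSql(
                "SELECT * FROM (SELECT *, ARRAY[1,2,3] AS A FROM (VALUES('a'))) "
                        + "CROSS JOIN UNNEST(A)")
                .print();
    }
}
{code}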
[jira] [Created] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
Xingcan Cui created FLINK-34583: --- Summary: Bug for dynamic table option hints with multiple CTEs Key: FLINK-34583 URL: https://issues.apache.org/jira/browse/FLINK-34583 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui The table options hints don't work well with multiple WITH clauses referring to the same table. Please see the following example. The following query with hints works well. {code:java} SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} The following query with multiple WITH clauses also works well. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...) T3 AS (SELECT ... FROM T2 WHERE...) SELECT * FROM T3;{code} The following query with multiple WITH clauses referring to the same original table failed to recognize the hints. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...), T4 AS (SELECT ... FROM T2 WHERE...), T5 AS (SELECT ... FROM T3 JOIN T4 ON...) SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs
[ https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-34583: Description: The table options hints don't work well with multiple WITH clauses referring to the same table. Please see the following example. The following query with hints works well. {code:java} SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} The following query with multiple WITH clauses also works well. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...) SELECT * FROM T3;{code} The following query with multiple WITH clauses referring to the same original table failed to recognize the hints. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...), T4 AS (SELECT ... FROM T2 WHERE...), T5 AS (SELECT ... FROM T3 JOIN T4 ON...) SELECT * FROM T5;{code} was: The table options hints don't work well with multiple WITH clauses referring to the same table. Please see the following example. The following query with hints works well. {code:java} SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} The following query with multiple WITH clauses also works well. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...) T3 AS (SELECT ... FROM T2 WHERE...) SELECT * FROM T3;{code} The following query with multiple WITH clauses referring to the same original table failed to recognize the hints. {code:java} WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), T3 AS (SELECT ... FROM T2 WHERE...), T4 AS (SELECT ... FROM T2 WHERE...), T5 AS (SELECT ... FROM T3 JOIN T4 ON...) SELECT * FROM T5;{code} > Bug for dynamic table option hints with multiple CTEs > - > > Key: FLINK-34583 > URL: https://issues.apache.org/jira/browse/FLINK-34583 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.1 >Reporter: Xingcan Cui >Priority: Major > > The table options hints don't work well with multiple WITH clauses referring > to the same table. Please see the following example. > > The following query with hints works well. > {code:java} > SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code} > The following query with multiple WITH clauses also works well. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...) > SELECT * FROM T3;{code} > The following query with multiple WITH clauses referring to the same original > table failed to recognize the hints. > {code:java} > WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...), > T3 AS (SELECT ... FROM T2 WHERE...), > T4 AS (SELECT ... FROM T2 WHERE...), > T5 AS (SELECT ... FROM T3 JOIN T4 ON...) > SELECT * FROM T5;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map on AZP
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810661#comment-17810661 ] Xingcan Cui commented on FLINK-33184: - Just hit a similar issue in Flink 1.18.1. If [https://github.com/apache/flink/pull/23532] solved the issue, it's better to backport it. {code:java} ERROR org.apache.flink.runtime.taskmanager.Task [] - Error in the task canceler for task KeyedProcess (112/128)#1. java.lang.IllegalStateException: Leaking buffers. at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.SubpartitionDiskCacheManager.release(SubpartitionDiskCacheManager.java:113) ~[flink-dist-1.18.1.jar:1.18.1] at java.util.Spliterators$ArraySpliterator.forEachRemaining(Unknown Source) ~[?:?] at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source) ~[?:?] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskCacheManager.release(DiskCacheManager.java:128) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskTierProducerAgent.releaseResources(DiskTierProducerAgent.java:222) ~[flink-dist-1.18.1.jar:1.18.1] at java.util.ArrayList.forEach(Unknown Source) ~[?:?] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.taskmanager.Task.access$100(Task.java:139) ~[flink-dist-1.18.1.jar:1.18.1] at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1677) [flink-dist-1.18.1.jar:1.18.1] at java.lang.Thread.run(Unknown Source) [?:?] 2024-01-25 03:44:21 [KeyedProcess (112/128)#1] INFO org.apache.flink.runtime.taskmanager.Task [] - KeyedProcess (112/128)#1 (7bb761e84f2d7957d3b927e49a6b28b3_e0d77c22cedd08dc719831d914bf_111_1) switched from CANCELING to CANCELED. {code} > HybridShuffleITCase fails with exception in resource cleanup of task Map on > AZP > --- > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network
[jira] [Closed] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-33547. --- Resolution: Duplicate Duplicate of https://issues.apache.org/jira/browse/FLINK-33523 > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789319#comment-17789319 ] Xingcan Cui commented on FLINK-33547: - Sure. I'll close this. > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789311#comment-17789311 ] Xingcan Cui commented on FLINK-33547: - Hi [~jeyhun], thanks for your attention! What you explained makes sense. However, sometimes it's tricky to deal with primitive array parameters. They make it harder for users to write generic UDFs, e.g., one that takes an arbitrary array and returns the first 3 elements. Also, this is a breaking change in 1.18.0. All the old UDFs that used Object[] arguments can no longer work directly with the primitive arrays that ARRAY functions now generate. Type inference and dealing with null values are challenging in Flink SQL. Users won't understand why a UDF accepting an ARRAY argument can't work for a primitive ARRAY parameter. It's also impossible for UDF developers to code a bunch of functions taking different primitive arrays. We need some changes here. > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
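Editor's note: a minimal sketch of the kind of generic UDF the comment describes (take an arbitrary array, return the first three elements). With boxed arrays such as Float[] the Object[] signature matches, but a primitive float[] produced by an ARRAY[] literal cannot be passed as Object[], which is the mismatch reported above.
{code:java}
import org.apache.flink.table.functions.ScalarFunction;

import java.util.Arrays;

// Generic "first three elements" UDF: works for any boxed array
// (Float[], Integer[], ...) because those are subtypes of Object[].
// A primitive float[]/int[] is not, so calls fail after the 1.18.0 change.
public class FirstThree extends ScalarFunction {
    public Object[] eval(Object[] values) {
        return Arrays.copyOfRange(values, 0, Math.min(3, values.length));
    }
}
{code}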
[jira] [Updated] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0
[ https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-33547: Summary: SQL primitive array type after upgrading to Flink 1.18.0 (was: Primitive SQL array type after upgrading to Flink 1.18.0) > SQL primitive array type after upgrading to Flink 1.18.0 > > > Key: FLINK-33547 > URL: https://issues.apache.org/jira/browse/FLINK-33547 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Xingcan Cui >Priority: Major > > We have some Flink SQL UDFs that use object array (Object[]) arguments and > take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink > 1.18.0, the data created by ARRAY[] SQL function became primitive arrays > (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's > expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33547) Primitive SQL array type after upgrading to Flink 1.18.0
Xingcan Cui created FLINK-33547: --- Summary: Primitive SQL array type after upgrading to Flink 1.18.0 Key: FLINK-33547 URL: https://issues.apache.org/jira/browse/FLINK-33547 Project: Flink Issue Type: Technical Debt Components: Table SQL / Runtime Affects Versions: 1.18.0 Reporter: Xingcan Cui We have some Flink SQL UDFs that use object array (Object[]) arguments and take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink 1.18.0, the data created by ARRAY[] SQL function became primitive arrays (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's expected. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32171) Add PostStart hook to flink k8s operator helm
[ https://issues.apache.org/jira/browse/FLINK-32171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725588#comment-17725588 ] Xingcan Cui commented on FLINK-32171: - Hi [~gyfora], I'd like to get your thoughts on this. I can work on it if you think this feature is reasonable. Thanks! > Add PostStart hook to flink k8s operator helm > - > > Key: FLINK-32171 > URL: https://issues.apache.org/jira/browse/FLINK-32171 > Project: Flink > Issue Type: New Feature > Components: Kubernetes Operator >Reporter: Xingcan Cui >Priority: Minor > Fix For: kubernetes-operator-1.6.0, kubernetes-operator-1.5.1 > > > I feel it would be convenient to add an optional PostStart hook config to the flink > k8s operator helm chart (e.g. when users need to download some Flink plugins). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32171) Add PostStart hook to flink k8s operator helm
Xingcan Cui created FLINK-32171: --- Summary: Add PostStart hook to flink k8s operator helm Key: FLINK-32171 URL: https://issues.apache.org/jira/browse/FLINK-32171 Project: Flink Issue Type: New Feature Components: Kubernetes Operator Reporter: Xingcan Cui Fix For: kubernetes-operator-1.6.0, kubernetes-operator-1.5.1 I feel it would be convenient to add an optional PostStart hook config to the flink k8s operator helm chart (e.g. when users need to download some Flink plugins). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688269#comment-17688269 ] Xingcan Cui commented on FLINK-31021: - [~libenchao] Thanks for the comments. I'm a bit busy these days. Will try to replace the static methods with non-static ones for {{flink-protobuf}} if no one works on this before I get some time. > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687728#comment-17687728 ] Xingcan Cui commented on FLINK-31021: - I'm playing with [https://github.com/apache/flink/blob/c096c03df70648b60b665a09816635b956b201cc/flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/deserialize/ProtoToRowConverter.java#L98]. The generated converter source is too large sometimes. I fixed it locally by removing the static keyword for now. FYI [~libenchao]. It's fine if the code splitter doesn't support static methods. But at least we should inform users with a proper message instead of generating incorrect code. > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-31021: Affects Version/s: 1.15.3 > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
Xingcan Cui created FLINK-31021: --- Summary: JavaCodeSplitter doesn't split static method properly Key: FLINK-31021 URL: https://issues.apache.org/jira/browse/FLINK-31021 Project: Flink Issue Type: Bug Affects Versions: 1.16.1, 1.14.4 Reporter: Xingcan Cui The exception while compiling the generated source {code:java} cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: Instance method "default void org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" cannot be invoked in static context,{code} The original method header {code:java} public static RowData decode(foo.bar.LogData message){{code} The code after split {code:java} Line 3383: public static RowData decode(foo.bar.LogData message){ decodeImpl(message); return decodeReturnValue$0; } Line 3384: Line 3385: void decodeImpl(foo.bar.LogData message) {{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
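Editor's note: for illustration, a self-contained sketch of the inconsistency, with String standing in for foo.bar.LogData and a trivial body. The splitter currently emits a static wrapper around an instance decodeImpl, which cannot compile; keeping the extracted method static, as below, would.
{code:java}
public class CodeSplitIllustration {
    private static String decodeReturnValue$0;

    // Shape of the splitter's current output: a static wrapper that calls an
    // *instance* decodeImpl, which fails with "Instance method ... cannot be
    // invoked in static context". Keeping the extracted method static, as
    // below, compiles and preserves the original semantics.
    public static String decode(String message) {
        decodeImpl(message);
        return decodeReturnValue$0;
    }

    private static void decodeImpl(String message) {
        decodeReturnValue$0 = "decoded:" + message;
    }
}
{code}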
[jira] [Commented] (FLINK-24007) Support Avro timestamp conversion with precision greater than three
[ https://issues.apache.org/jira/browse/FLINK-24007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406049#comment-17406049 ] Xingcan Cui commented on FLINK-24007: - Will close this since it's a duplicate of https://issues.apache.org/jira/browse/FLINK-23589 > Support Avro timestamp conversion with precision greater than three > --- > > Key: FLINK-24007 > URL: https://issues.apache.org/jira/browse/FLINK-24007 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.13.2 >Reporter: Xingcan Cui >Priority: Major > > {{AvroSchemaConverter.convertToSchema()}} doesn't support timestamp with > precision > 3 now. This seems to be a bug and should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24007) Support Avro timestamp conversion with precision greater than three
[ https://issues.apache.org/jira/browse/FLINK-24007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-24007. --- Resolution: Duplicate > Support Avro timestamp conversion with precision greater than three > --- > > Key: FLINK-24007 > URL: https://issues.apache.org/jira/browse/FLINK-24007 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.13.2 >Reporter: Xingcan Cui >Priority: Major > > {{AvroSchemaConverter.convertToSchema()}} doesn't support timestamp with > precision > 3 now. This seems to be a bug and should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24007) Support Avro timestamp conversion with precision greater than three
Xingcan Cui created FLINK-24007: --- Summary: Support Avro timestamp conversion with precision greater than three Key: FLINK-24007 URL: https://issues.apache.org/jira/browse/FLINK-24007 Project: Flink Issue Type: Bug Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.13.2 Reporter: Xingcan Cui {{AvroSchemaConverter.convertToSchema()}} doesn't support timestamp with precision > 3 now. This seems to be a bug and should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
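Editor's note: a minimal repro sketch, assuming the LogicalType overload of {{AvroSchemaConverter.convertToSchema()}}: precision 3 converts to Avro's timestamp-millis, while precision 6 fails at the affected version instead of mapping to timestamp-micros.
{code:java}
import org.apache.flink.formats.avro.typeutils.AvroSchemaConverter;
import org.apache.flink.table.types.logical.TimestampType;

public class AvroTimestampPrecision {
    public static void main(String[] args) {
        // TIMESTAMP(3) maps to Avro timestamp-millis and converts fine.
        System.out.println(AvroSchemaConverter.convertToSchema(new TimestampType(3)));
        // TIMESTAMP(6) would need timestamp-micros; at the affected version
        // this call fails instead of producing a schema.
        System.out.println(AvroSchemaConverter.convertToSchema(new TimestampType(6)));
    }
}
{code}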
[jira] [Commented] (FLINK-8236) Allow to set the parallelism of table queries
[ https://issues.apache.org/jira/browse/FLINK-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227070#comment-17227070 ] Xingcan Cui commented on FLINK-8236: Hi [~rex-remind], unfortunately, there's no progress from my side on this issue. I think we still can only use a global parallelism value for all the operators of a compiled SQL. Also, I'm going to unassign myself and see if other members of the community could work on this. > Allow to set the parallelism of table queries > - > > Key: FLINK-8236 > URL: https://issues.apache.org/jira/browse/FLINK-8236 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Affects Versions: 1.4.0 >Reporter: Timo Walther >Priority: Major > > Right now the parallelism of a table program is determined by the parallelism > of the stream/batch environment. E.g., by default, tumbling window operators > use the default parallelism of the environment. Simple project and select > operations have the same parallelism as the inputs they are applied on. > While we cannot change forwarding operations because this would change the > results when using retractions, it should be possible to change the > parallelism for operators after shuffling operations. > It should be possible to specify the default parallelism of a table program > in the {{TableConfig}} and/or {{QueryConfig}}. The configuration per query > has higher precedence than the configuration per table environment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
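Editor's note: a sketch of the limitation described in the comment above: the only available knob today is a single global value that every operator of the translated query inherits (the value 8 is a placeholder).
{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class GlobalParallelismOnly {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // One global value that every operator of a compiled query inherits;
        // there is no per-operator override for translated SQL today.
        env.setParallelism(8); // placeholder value
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        // ... register tables and submit queries; all operators run with parallelism 8 ...
    }
}
{code}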
[jira] [Assigned] (FLINK-8236) Allow to set the parallelism of table queries
[ https://issues.apache.org/jira/browse/FLINK-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-8236: -- Assignee: (was: Xingcan Cui) > Allow to set the parallelism of table queries > - > > Key: FLINK-8236 > URL: https://issues.apache.org/jira/browse/FLINK-8236 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Affects Versions: 1.4.0 >Reporter: Timo Walther >Priority: Major > > Right now the parallelism of a table program is determined by the parallelism > of the stream/batch environment. E.g., by default, tumbling window operators > use the default parallelism of the environment. Simple project and select > operations have the same parallelism as the inputs they are applied on. > While we cannot change forwarding operations because this would change the > results when using retractions, it should be possible to change the > parallelism for operators after shuffling operations. > It should be possible to specify the default parallelism of a table program > in the {{TableConfig}} and/or {{QueryConfig}}. The configuration per query > has higher precedence than the configuration per table environment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16048) Support read/write confluent schema registry avro data from Kafka
[ https://issues.apache.org/jira/browse/FLINK-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192590#comment-17192590 ] Xingcan Cui commented on FLINK-16048: - Hi all, thanks for your effort on this feature. I wonder if it's possible to provide the authentication info (i.e., {{value.converter.basic.auth.user.info}}) via this new schema class? > Support read/write confluent schema registry avro data from Kafka > -- > > Key: FLINK-16048 > URL: https://issues.apache.org/jira/browse/FLINK-16048 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Ecosystem >Affects Versions: 1.11.0 >Reporter: Leonard Xu >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available, usability > Fix For: 1.12.0 > > > *The background* > I found that the SQL Kafka connector cannot consume avro data that was serialized by > `KafkaAvroSerializer` and can only consume Row data with avro schema because > we use `AvroRowDeserializationSchema/AvroRowSerializationSchema` to se/de > data in `AvroRowFormatFactory`. > I think we should support this because `KafkaAvroSerializer` is very common > in Kafka. > and someone met the same question on stackoverflow[1]. > [[1]https://stackoverflow.com/questions/56452571/caused-by-org-apache-avro-avroruntimeexception-malformed-data-length-is-negat/56478259|https://stackoverflow.com/questions/56452571/caused-by-org-apache-avro-avroruntimeexception-malformed-data-length-is-negat/56478259] > *The format details* > _The factory identifier (or format id)_ > There are 2 candidates now ~ > - {{avro-sr}}: the pattern borrowed from KSQL {{JSON_SR}} format [1] > - {{avro-confluent}}: the pattern borrowed from Clickhouse {{AvroConfluent}} > [2] > Personally I would prefer {{avro-sr}} because it is more concise and confluent is a company name which I think is not that suitable for a format > name. > _The format attributes_ > || Options || required || Remark || > | schema-registry.url | true | URL to connect to schema registry service | > | schema-registry.subject | false | Subject name to write to the Schema > Registry service, required for sink | -- This message was sent by Atlassian Jira (v8.3.4#803005)
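Editor's note: for orientation, a sketch of what a DDL using the proposed format could look like, built from the candidate id avro-confluent and the option keys in the table above. The topic, addresses, and schema are placeholders, and the final option names may differ from this proposal.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class RegistryAvroTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Option keys follow the proposal's table; topic, addresses, and the
        // column schema are placeholders.
        tEnv.executeSql(
                "CREATE TABLE orders (order_id STRING, amount DOUBLE) WITH ("
                        + "'connector' = 'kafka', "
                        + "'topic' = 'orders', "
                        + "'properties.bootstrap.servers' = 'localhost:9092', "
                        + "'format' = 'avro-confluent', "
                        + "'avro-confluent.schema-registry.url' = 'http://localhost:8081')");
    }
}
{code}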
[jira] [Comment Edited] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177438#comment-17177438 ] Xingcan Cui edited comment on FLINK-7865 at 8/14/20, 2:56 AM: -- Hi [~godfreyhe], it took me some time to think about how to do the reversion. However, the codebase has changed too much and I'm afraid I don't have much bandwidth in recent days to continue working on this (just onboarded a new company). I was wondering if you could take it over. Thanks! Some inspections that may help (hopefully): In {{FlinkDecorrelateProgram.optimize()}}, {{RelDecorrelator.decorrelateQuery(root)}} will rewrite the {{LogicalCorrelate}} of a left lateral join plan with extra conditions to {{LogicalJoin}}. For left lateral join without extra conditions, it will preserve the {{LogicalCorrelate}}. was (Author: xccui): Hi [~godfreyhe], it took me some time to think about how to do the reversion. However, the codebase has changed too much and I'm afraid I don't have much bandwidth in recent days to continue working on this. I was wondering if you could take it over. Thanks! Some inspections that may help (hopefully): In {{FlinkDecorrelateProgram.optimize()}}, {{RelDecorrelator.decorrelateQuery(root)}} will rewrite the {{LogicalCorrelate}} of a left lateral join plan with extra conditions to {{LogicalJoin}}. For left lateral join without extra conditions, it will preserve the {{LogicalCorrelate}}. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-7865: -- Assignee: (was: Xingcan Cui) > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177438#comment-17177438 ] Xingcan Cui commented on FLINK-7865: Hi [~godfreyhe], it took me some time to think about how to do the reversion. However, the codebase has changed too much and I'm afraid I don't have much bandwidth in recent days to continue working on this. I was wondering if you could take it over. Thanks! Some inspections that may help (hopefully): In {{FlinkDecorrelateProgram.optimize()}}, {{RelDecorrelator.decorrelateQuery(root)}} will rewrite the {{LogicalCorrelate}} of a left lateral join plan with extra conditions to {{LogicalJoin}}. For left lateral join without extra conditions, it will preserve the {{LogicalCorrelate}}. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176325#comment-17176325 ] Xingcan Cui commented on FLINK-7865: [~liupengcheng] Thanks for your reminder. I'll remove the restrictions asap. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To cover up the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
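Editor's note: to make the restriction discussed above concrete, a sketch with a hypothetical table T(a, c) and a registered table function split: the lateral left outer join is accepted only with the literal TRUE predicate, while any real join condition is rejected until the Calcite fix lands.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class LateralJoinRestriction {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // T and the table function `split` are hypothetical and assumed to be
        // registered elsewhere; only the ON clause differs between the queries.
        // Accepted: the literal TRUE predicate.
        System.out.println(tEnv.explainSql(
                "SELECT a, word FROM T LEFT JOIN LATERAL TABLE(split(c)) AS t(word) ON TRUE"));
        // Rejected while the restriction is in place:
        // "SELECT a, word FROM T LEFT JOIN LATERAL TABLE(split(c)) AS t(word) ON a = word"
    }
}
{code}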
[jira] [Created] (FLINK-13849) The back-pressure monitoring tab in Web UI may cause errors
Xingcan Cui created FLINK-13849: --- Summary: The back-pressure monitoring tab in Web UI may cause errors Key: FLINK-13849 URL: https://issues.apache.org/jira/browse/FLINK-13849 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Affects Versions: 1.9.0 Reporter: Xingcan Cui Clicking the back-pressure monitoring tab for a finished job in Web UI will cause an internal server error. The exceptions are as follows. {code:java} 2019-08-26 01:23:54,845 ERROR org.apache.flink.runtime.rest.handler.job.JobVertexBackPressureHandler - Unhandled exception. org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (09e107685e0b81b443b556062debb443) at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGatewayFuture(Dispatcher.java:825) at org.apache.flink.runtime.dispatcher.Dispatcher.requestOperatorBackPressureStats(Dispatcher.java:524) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:279) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:194) at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at akka.actor.Actor$class.aroundReceive(Actor.scala:517) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at akka.actor.ActorCell.invoke(ActorCell.scala:561) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at akka.dispatch.Mailbox.run(Mailbox.scala:225) at akka.dispatch.Mailbox.exec(Mailbox.scala:235) at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-13405) Translate "Basic API Concepts" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-13405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902700#comment-16902700 ] Xingcan Cui commented on FLINK-13405: - [~WangHW], personally, I'd like to translate it to "数据汇", which corresponds to source ("数据源"). However, as [~jark] suggested, you can choose not to translate it. > Translate "Basic API Concepts" page into Chinese > > > Key: FLINK-13405 > URL: https://issues.apache.org/jira/browse/FLINK-13405 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Affects Versions: 1.10.0 >Reporter: WangHengWei >Assignee: WangHengWei >Priority: Major > Labels: documentation, pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The page url is > [https://github.com/apache/flink/blob/master/docs/dev/api_concepts.zh.md] > The markdown file is located in flink/docs/dev/api_concepts.zh.md -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-13405) Translate "Basic API Concepts" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-13405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894879#comment-16894879 ] Xingcan Cui commented on FLINK-13405: - First, "sorting" here definitely means ordering (排序), not classification (分类). Second, I personally have some doubts about the use of "i.e." here. If accessing their contents were equivalent to efficient sorting, "i.e." would be fine. But as I understand it, efficient sorting may be only one of the purposes of accessing the contents. If it is the latter, "i.e." should be replaced with "e.g.", and a rough translation would be: ……无法访问它们的内容(例如为了高效排序)。 > Translate "Basic API Concepts" page into Chinese > > > Key: FLINK-13405 > URL: https://issues.apache.org/jira/browse/FLINK-13405 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Affects Versions: 1.10.0 >Reporter: WangHengWei >Assignee: WangHengWei >Priority: Major > Labels: documentation, pull-request-available > Fix For: 1.10.0 > > > The page url is > [https://github.com/apache/flink/blob/master/docs/dev/api_concepts.zh.md] > The markdown file is located in flink/docs/dev/api_concepts.zh.md -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Closed] (FLINK-11769) The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset
[ https://issues.apache.org/jira/browse/FLINK-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-11769. --- Resolution: Duplicate > The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset > > > Key: FLINK-11769 > URL: https://issues.apache.org/jira/browse/FLINK-11769 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.6.4, 1.7.2 >Reporter: Xingcan Cui >Priority: Major > > As raised in [this > thread|https://lists.apache.org/thread.html/3a93723d2e74ae667a9aeb7d6ff28955f3ef79b5f20b4848b67fe709@%3Cuser.flink.apache.org%3E]. > The {{estimateDataTypesSize}} method in {{FlinkRelNode}} causes NPE for a > {{Multiset>}} field type. Maybe the {{keyType}} or the > {{valueType}} is empty in that case. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-11433) JOIN on a table having a column of type MULTISET gives a NPE
[ https://issues.apache.org/jira/browse/FLINK-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889287#comment-16889287 ] Xingcan Cui commented on FLINK-11433: - Hi [~azagrebin], I think they refer to the same issue. I'll mark FLINK-11769 as a duplicate and close it. > JOIN on a table having a column of type MULTISET gives a NPE > - > > Key: FLINK-11433 > URL: https://issues.apache.org/jira/browse/FLINK-11433 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.7.0, 1.7.1 >Reporter: Elias Saalmann >Assignee: TANG Wen-hui >Priority: Major > > I get an error (Error while applying rule FlinkLogicalJoinConverter) when > performing a JOIN on a table having a column of type MULTISET (e.g. a COLLECT > as aggregation of a GROUP BY), for instance: > SELECT a, d > FROM TableA JOIN ( > SELECT b, COLLECT(c) AS d > FROM TableB > GROUP BY b > ) TableC ON a = b > Full stacktrace: > Exception in thread "main" java.lang.RuntimeException: Error while applying > rule FlinkLogicalJoinConverter, args > [rel#71:LogicalJoin.NONE(left=rel#69:Subset#3.NONE,right=rel#70:Subset#4.NONE,condition==($2, > $0),joinType=inner)] > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:236) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:646) > at > org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:339) > at > org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:373) > at > org.apache.flink.table.api.TableEnvironment.optimizeLogicalPlan(TableEnvironment.scala:292) > at > org.apache.flink.table.api.BatchTableEnvironment.optimize(BatchTableEnvironment.scala:455) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:475) > at > org.apache.flink.table.api.java.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:165) > at org.myorg.quickstart.TableJob2.main(TableJob2.java:40) > Caused by: java.lang.RuntimeException: Error occurred while applying rule > FlinkLogicalJoinConverter > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:149) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234) > at > org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:141) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212) > ... 
8 more > Caused by: java.lang.NullPointerException > at > org.apache.flink.table.plan.nodes.FlinkRelNode$class.estimateDataTypeSize(FlinkRelNode.scala:84) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.estimateDataTypeSize(FlinkLogicalJoinBase.scala:29) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$class.estimateDataTypeSize(FlinkRelNode.scala:104) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.estimateDataTypeSize(FlinkLogicalJoinBase.scala:29) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$$anonfun$estimateRowSize$2.apply(FlinkRelNode.scala:80) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$$anonfun$estimateRowSize$2.apply(FlinkRelNode.scala:79) > at > scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57) > at > scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66) > at scala.collection.mutable.ArrayBuffer.foldLeft(ArrayBuffer.scala:48) > at > org.apache.flink.table.plan.nodes.FlinkRelNode$class.estimateRowSize(FlinkRelNode.scala:79) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.estimateRowSize(FlinkLogicalJoinBase.scala:29) > at > org.apache.flink.table.plan.nodes.logical.FlinkLogicalJoinBase.computeSelfCost(FlinkLogicalJoinBase.scala:48) > at > org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:162) > at > GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown > Source) > at > GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown > Source) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:301) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(VolcanoPlanner.java:953) > at > org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:339) > at > org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:322) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1643) > at > o
[jira] [Updated] (FLINK-12116) Args autocast will cause exception for plan transformation in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-12116: Labels: (was: pull-request-available) > Args autocast will cause exception for plan transformation in TableAPI > -- > > Key: FLINK-12116 > URL: https://issues.apache.org/jira/browse/FLINK-12116 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.6.4, 1.7.2 >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > In tableAPI, the automatic typecast for arguments may break their initial > structures, which makes {{TreeNode.makeCopy()}} fail. > Take the {{ConcatWs}} function as an example. It requires a string > {{Expression}} sequence for the second parameter of its constructor. If we > provide some {{Expressions}} with other types, the planner will try to cast > them automatically. However, during this process, the arguments will be > incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two > expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause > {{java.lang.IllegalArgumentException: wrong number of arguments}} for > {{Constructor.newInstance()}}. > As a workaround, we can cast these arguments manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10049) Correctly handle NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10049: Description: Currently, the built-in functions treat NULL arguments in different ways. E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws NPE. The general SQL-way of handling NULL values should be that if one argument is NULL the result is NULL. We should keep the correct semantics and avoid terminating the (continuous) queries unexpectedly. (was: Currently, the built-in functions treat NULL arguments in different ways. E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general SQL-way of handling NULL values should be that if one argument is NULL the result is NULL. We should unify the processing logic for that.) > Correctly handle NULL arguments in SQL built-in functions > - > > Key: FLINK-10049 > URL: https://issues.apache.org/jira/browse/FLINK-10049 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > > Currently, the built-in functions treat NULL arguments in different ways. > E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws NPE. The general > SQL-way of handling NULL values should be that if one argument is NULL the > result is NULL. We should keep the correct semantics and avoid terminating > the (continuous) queries unexpectedly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10049) Correctly handle NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10049: Summary: Correctly handle NULL arguments in SQL built-in functions (was: Unify the processing logic for NULL arguments in SQL built-in functions) > Correctly handle NULL arguments in SQL built-in functions > - > > Key: FLINK-10049 > URL: https://issues.apache.org/jira/browse/FLINK-10049 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > > Currently, the built-in functions treat NULL arguments in different ways. > E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general > SQL-way of handling NULL values should be that if one argument is NULL the > result is NULL. We should unify the processing logic for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813002#comment-16813002 ] Xingcan Cui commented on FLINK-10049: - Hey guys, thanks for the comments. I haven't gone through all the documents, but it does seem that different SQL engines have different mechanisms for this problem. However, since all the fields in Flink SQL are nullable in the current version, simply throwing an NPE and terminating the execution should always be avoided. IMO, each UDF is responsible for handling {{NULL}} arguments itself, with the correct semantics. {{NULL}} means unknown in SQL, and thus most scalar functions should return an unknown result for an unknown input. We can add the {{RETURNS NULL ON NULL INPUT}} option to UDF definitions (maybe as a method to be overridden), but it works more like an optimization hint, which means that even without this declaration, the function should return {{NULL}} when invoked with {{NULL}} arguments (just in case). Actually, there's no need to unify the processing logic. Just keep the correct semantics and avoid terminating the (continuous) queries unexpectedly. Thus, I plan to rename this ticket to "Correctly handle NULL arguments in SQL built-in functions". As for the exception handling mechanism, it's a slightly different topic and we'd better discuss it elsewhere. What do you think? > Unify the processing logic for NULL arguments in SQL built-in functions > --- > > Key: FLINK-10049 > URL: https://issues.apache.org/jira/browse/FLINK-10049 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: vinoyang >Priority: Major > > Currently, the built-in functions treat NULL arguments in different ways. > E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general > SQL-way of handling NULL values should be that if one argument is NULL the > result is NULL. We should unify the processing logic for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
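To make the proposed "NULL in, NULL out" contract concrete, here is a minimal hypothetical sketch of a scalar UDF that follows it; the class name is made up and this is not the actual built-in implementation.
{code:java}
import org.apache.flink.table.functions.ScalarFunction;

// Hypothetical UDF showing the "NULL in, NULL out" contract: a NULL argument
// produces a NULL result instead of an exception that kills the query.
public class SafeLog10 extends ScalarFunction {
    public Double eval(Double x) {
        if (x == null) {
            // NULL means unknown in SQL, so the result is unknown as well.
            return null;
        }
        return Math.log10(x);
    }
}
{code}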
[jira] [Commented] (FLINK-10684) Improve the CSV reading process
[ https://issues.apache.org/jira/browse/FLINK-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812514#comment-16812514 ] Xingcan Cui commented on FLINK-10684: - Hi all, thanks for your attention. I ran into several problems when reading CSV files for data wrangling, and that's why this ticket was filed. As shown in the description, the problems come from different aspects: P1. Lack of schema inference. P2. Weak error handling. P3. Other implicit bugs (or the standard incompatibilities mentioned by [~fhueske]). Since the basic {{CsvInputFormat}} is used in both streaming and batch environments, some solutions to these problems may be tricky. For instance, to automatically infer schemas, we need to introduce a new mechanism, stream sampling, from which I believe some other processes such as automatic parallelism tuning and stream SQL optimization will also benefit. To solve these problems in my own project, I applied some workarounds, but they are not general enough. Although I did have some general ideas (e.g., using a side output for bad records), considering that the Flink project has been adopting some major changes recently, it may be better to propose them after those changes settle. All in all, I personally don't think now is a good time to concentrate on this issue because none of the solutions are trivial. What do you think? > Improve the CSV reading process > --- > > Key: FLINK-10684 > URL: https://issues.apache.org/jira/browse/FLINK-10684 > Project: Flink > Issue Type: Improvement > Components: API / DataSet >Reporter: Xingcan Cui >Priority: Major > > CSV is one of the most commonly used file formats in data wrangling. To load > records from CSV files, Flink has provided the basic {{CsvInputFormat}}, as > well as some variants (e.g., {{RowCsvInputFormat}} and > {{PojoCsvInputFormat}}). However, it seems that the reading process can be > improved. For example, we could add a built-in util to automatically infer > schemas from CSV headers and samples of data. Also, the current bad record > handling method can be improved by somehow keeping the invalid lines (and > even the reasons for failed parsing), instead of logging the total number > only. > This is an umbrella issue for all the improvements and bug fixes for the CSV > reading process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
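As a rough illustration of the side-output idea mentioned in the comment, here is a hypothetical sketch; the two-column record layout is assumed for the example and is not part of the ticket.
{code:java}
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Hypothetical sketch: lines that fail to parse are routed to a side output
// together with the failure reason, instead of only being counted in a log.
public class CsvLineParser extends ProcessFunction<String, Row> {

    public static final OutputTag<String> BAD_RECORDS =
            new OutputTag<String>("bad-csv-records") {};

    @Override
    public void processElement(String line, Context ctx, Collector<Row> out) {
        try {
            String[] fields = line.split(",");
            out.collect(Row.of(fields[0], Integer.parseInt(fields[1])));
        } catch (Exception e) {
            // Keep the invalid line and the reason for the failed parsing.
            ctx.output(BAD_RECORDS, line + " -> " + e.getMessage());
        }
    }
}
{code}
The invalid lines would then be retrievable via {{getSideOutput(CsvLineParser.BAD_RECORDS)}} on the resulting stream.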
[jira] [Updated] (FLINK-12116) Args autocast will cause exception for plan transformation in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-12116: Description: In tableAPI, the automatic typecast for arguments may break their initial structures, which makes {{TreeNode.makeCopy()}} fail. Take the {{ConcatWs}} function as an example. It requires a string {{Expression}} sequence for the second parameter of its constructor. If we provide some {{Expressions}} with other types, the planner will try to cast them automatically. However, during this process, the arguments will be incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause {{java.lang.IllegalArgumentException: wrong number of arguments}} for {{Constructor.newInstance()}}. As a workaround, we can cast these arguments manually. was: In tableAPI, the automatic typecast for arguments may break their initial structures, which makes {{TreeNode.makeCopy()}} fail. Take the {{ConcatWs}} function as an example. It requires a string {{Expression}} sequence for the second parameter of its constructor. If we provide some {{Expressions}} with other types, the planner will try to cast them automatically. However, during this process, the arguments will be incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause {{java.lang.IllegalArgumentException: wrong number of arguments}}. As a workaround, we can cast these arguments manually. > Args autocast will cause exception for plan transformation in TableAPI > -- > > Key: FLINK-12116 > URL: https://issues.apache.org/jira/browse/FLINK-12116 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.6.4, 1.7.2 >Reporter: Xingcan Cui >Priority: Major > > In tableAPI, the automatic typecast for arguments may break their initial > structures, which makes {{TreeNode.makeCopy()}} fail. > Take the {{ConcatWs}} function as an example. It requires a string > {{Expression}} sequence for the second parameter of its constructor. If we > provide some {{Expressions}} with other types, the planner will try to cast > them automatically. However, during this process, the arguments will be > incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two > expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause > {{java.lang.IllegalArgumentException: wrong number of arguments}} for > {{Constructor.newInstance()}}. > As a workaround, we can cast these arguments manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12116) Args autocast will cause exception for plan transformation in TableAPI
Xingcan Cui created FLINK-12116: --- Summary: Args autocast will cause exception for plan transformation in TableAPI Key: FLINK-12116 URL: https://issues.apache.org/jira/browse/FLINK-12116 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.7.2, 1.6.4 Reporter: Xingcan Cui In tableAPI, the automatic typecast for arguments may break their initial structures, which makes {{TreeNode.makeCopy()}} fail. Take the {{ConcatWs}} function as an example. It requires a string {{Expression}} sequence for the second parameter of its constructor. If we provide some {{Expressions}} with other types, the planner will try to cast them automatically. However, during this process, the arguments will be incorrectly unwrapped (e.g., {{[f1, f2]}} will be unwrapped to two expressions {{f1.cast(String)}} and {{f2.cast(String)}}) which will cause {{java.lang.IllegalArgumentException: wrong number of arguments}}. As a workaround, we can cast these arguments manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
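A minimal sketch of the manual-cast workaround mentioned in the description; {{input}}, {{f1}}, and {{f2}} are hypothetical names. Casting the arguments explicitly keeps the planner from applying the automatic cast that breaks {{TreeNode.makeCopy()}}.
{code:java}
import org.apache.flink.table.api.Table;

class ManualCastSketch {
    // `input` is a Table with non-string fields f1 and f2 (hypothetical).
    static Table workaround(Table input) {
        // Casting the arguments to STRING explicitly avoids the planner's
        // automatic cast that incorrectly unwraps the argument sequence.
        return input.select("concat_ws('~', f1.cast(STRING), f2.cast(STRING))");
    }
}
{code}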
[jira] [Commented] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807880#comment-16807880 ] Xingcan Cui commented on FLINK-7865: Thanks for working on that, [~hyuan]! I'll take care of this issue on the Flink side. > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To work around the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-7865) Remove predicate restrictions on TableFunction left outer join
[ https://issues.apache.org/jira/browse/FLINK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-7865: -- Assignee: Xingcan Cui > Remove predicate restrictions on TableFunction left outer join > -- > > Key: FLINK-7865 > URL: https://issues.apache.org/jira/browse/FLINK-7865 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Xingcan Cui >Assignee: Xingcan Cui >Priority: Major > > To work around the improper translation of lateral table left outer join > (CALCITE-2004), we have temporarily forbidden the predicates (except {{true}} > literal) in Table API (FLINK-7853) and SQL (FLINK-7854). Once the issue has > been fixed in Calcite, we should remove the restrictions. The tasks may > include removing Table API/SQL condition check, removing validation tests, > enabling integration tests, updating the documents, etc. > See [this thread on Calcite dev > list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E] > for more information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui closed FLINK-11528. --- > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-11528. - Resolution: Done Implemented with 6af7f48bd754fb9a5635c25ec7656677fcf10b9b > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11819) Additional attribute for order by not support by flink sql, but told supported in doc
[ https://issues.apache.org/jira/browse/FLINK-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793592#comment-16793592 ] Xingcan Cui commented on FLINK-11819: - Hi [~hequn8128] and [~hustclf], sorry for the late reply. IMO, the current description seems to be OK. However, this could be subjective. If more readers are confused, we should definitely improve it. Best, Xingcan > Additional attribute for order by not support by flink sql, but told > supported in doc > - > > Key: FLINK-11819 > URL: https://issues.apache.org/jira/browse/FLINK-11819 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.7.2 >Reporter: Lifei Chen >Assignee: Lifei Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3 > > Original Estimate: 3h > Time Spent: 20m > Remaining Estimate: 2h 40m > > I am using flink v1.7.1, when I use flink sql to order by an attribute (not > time attribute), the error logs is as follow. > > sql: > {quote}"SELECT * FROM events order by tenantId" > {quote} > > error logs: > {quote}Exception in thread "main" org.apache.flink.table.api.TableException: > Cannot generate a valid execution plan for the given query: > FlinkLogicalSort(sort0=[$2], dir0=[ASC]) > FlinkLogicalNativeTableScan(table=[[_DataStreamTable_0]]) > This exception indicates that the query uses an unsupported SQL feature. > Please check the documentation for the set of currently supported SQL > features. > at > org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:377) > at > org.apache.flink.table.api.TableEnvironment.optimizePhysicalPlan(TableEnvironment.scala:302) > at > org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:814) > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:860) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:305) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:248) > {quote} > > So as for now, only time attribute is supported by flink for command `order > by`, additional attribute is not supported yet, Is that right ? > If so, there is a mistake, indicated that other attribute except for `time > attribute` is supported . > related links: > [https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/sql.html#orderby--limit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11819) Additional attribute for order by not support by flink sql, but told supported in doc
[ https://issues.apache.org/jira/browse/FLINK-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784452#comment-16784452 ] Xingcan Cui commented on FLINK-11819: - Hi [~hustclf], I think "additional" means you should first sort on a time attribute and then use other attributes, e.g., "order by rowtime, tenantId". > Additional attribute for order by not support by flink sql, but told > supported in doc > - > > Key: FLINK-11819 > URL: https://issues.apache.org/jira/browse/FLINK-11819 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.7.2 >Reporter: Lifei Chen >Assignee: Lifei Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3 > > Original Estimate: 3h > Time Spent: 10m > Remaining Estimate: 2h 50m > > I am using flink v1.7.1, when I use flink sql to order by an attribute (not > time attribute), the error logs is as follow. > > sql: > {quote}"SELECT * FROM events order by tenantId" > {quote} > > error logs: > {quote}Exception in thread "main" org.apache.flink.table.api.TableException: > Cannot generate a valid execution plan for the given query: > FlinkLogicalSort(sort0=[$2], dir0=[ASC]) > FlinkLogicalNativeTableScan(table=[[_DataStreamTable_0]]) > This exception indicates that the query uses an unsupported SQL feature. > Please check the documentation for the set of currently supported SQL > features. > at > org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:377) > at > org.apache.flink.table.api.TableEnvironment.optimizePhysicalPlan(TableEnvironment.scala:302) > at > org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:814) > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:860) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:305) > at > org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:248) > {quote} > > So as for now, only time attribute is supported by flink for command `order > by`, additional attribute is not supported yet, Is that right ? > If so, there is a mistake, indicated that other attribute except for `time > attribute` is supported . > related links: > [https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/sql.html#orderby--limit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
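A short sketch of the supported pattern described in the comment above, assuming the {{events}} table is registered with a time attribute named {{rowtime}} (a hypothetical name):
{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

class OrderBySketch {
    static Table sorted(StreamTableEnvironment tEnv) {
        // Supported: the primary sort key is the time attribute; tenantId
        // is the "additional" attribute the documentation refers to.
        return tEnv.sqlQuery("SELECT * FROM events ORDER BY rowtime, tenantId");
    }
}
{code}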
[jira] [Created] (FLINK-11769) The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset
Xingcan Cui created FLINK-11769: --- Summary: The estimateDataTypesSize method in FlinkRelNode causes NPE for Multiset Key: FLINK-11769 URL: https://issues.apache.org/jira/browse/FLINK-11769 Project: Flink Issue Type: Bug Components: Table API & SQL Affects Versions: 1.7.2, 1.6.4 Reporter: Xingcan Cui As raised in [this thread|https://lists.apache.org/thread.html/3a93723d2e74ae667a9aeb7d6ff28955f3ef79b5f20b4848b67fe709@%3Cuser.flink.apache.org%3E]. The {{estimateDataTypesSize}} method in {{FlinkRelNode}} causes NPE for a {{Multiset>}} field type. Maybe the {{keyType}} or the {{valueType}} is empty in that case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-11567) Translate "How to Review a Pull Request" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-11567. - Resolution: Done Fixed in flink-web: 63af7e0c6fa4a87072f54fc6bf0cf4ebe5c56b25 > Translate "How to Review a Pull Request" page into Chinese > -- > > Key: FLINK-11567 > URL: https://issues.apache.org/jira/browse/FLINK-11567 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: xulinjie >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Translate "How to Review a Pull Request" page into Chinese. > The markdown file is located in: flink-web/reviewing-prs.zh.md > The url link is: https://flink.apache.org/zh/reviewing-prs.html > Please adjust the links in the page to Chinese pages when translating. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760922#comment-16760922 ] Xingcan Cui commented on FLINK-11528: - Hi [~jark], thanks for your effort on the Chinese docs/website. I'll help to take care of this page. > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-11528) Translate the "Use Cases" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui reassigned FLINK-11528: --- Assignee: Xingcan Cui > Translate the "Use Cases" page into Chinese > --- > > Key: FLINK-11528 > URL: https://issues.apache.org/jira/browse/FLINK-11528 > Project: Flink > Issue Type: Sub-task > Components: Project Website >Reporter: Jark Wu >Assignee: Xingcan Cui >Priority: Major > > Translate flink-web/usecases.zh.md into Chinese. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11227) The DescriptorProperties contains some bounds checking errors
Xingcan Cui created FLINK-11227: --- Summary: The DescriptorProperties contains some bounds checking errors Key: FLINK-11227 URL: https://issues.apache.org/jira/browse/FLINK-11227 Project: Flink Issue Type: Bug Components: Table API & SQL Affects Versions: 1.7.1, 1.6.3 Reporter: Xingcan Cui Assignee: Xingcan Cui Fix For: 1.6.4, 1.7.2, 1.8.0 In {{DescriptorProperties}}, both the {{validateFixedIndexedProperties()}} and {{validateArray()}} use wrong upperbounds for validation, which leads to the last element not being validated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
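The affected methods are not quoted in the report, but the symptom matches the usual off-by-one in an exclusive upper bound; the following is a hypothetical sketch of the pattern, not the actual {{DescriptorProperties}} code.
{code:java}
import java.util.List;

class BoundsCheckSketch {
    // Buggy pattern: the exclusive upper bound skips the last element.
    static void validateAllBuggy(List<String> values) {
        for (int i = 0; i < values.size() - 1; i++) {
            validate(values.get(i)); // the element at size() - 1 is never checked
        }
    }

    // Fixed pattern: every element, including the last one, is validated.
    static void validateAll(List<String> values) {
        for (int i = 0; i < values.size(); i++) {
            validate(values.get(i));
        }
    }

    static void validate(String v) {
        if (v == null || v.isEmpty()) {
            throw new IllegalArgumentException("invalid property value");
        }
    }
}
{code}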
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729631#comment-16729631 ] Xingcan Cui commented on FLINK-11220: - Hi [~sunjincheng121] and [~jark], I admit that automatically materializing the rowtime fields will be convenient in some cases. However, that seems not to be quite reasonable. Here are my thoughts. 1. As you said, the {{RowtimeIndicator}} and {{TIMESTAMP}} are two different types. It will be confusing if we implicitly change the field types after a join. 2. The current approach makes it possible for the time-windowed join to be used as a sub-query for further rowtime-based operations (e.g., group windows). 3. Now that the results of time-windowed joins are still aligned with watermarks (which are held-back according to the window size), I think it doesn't make sense to ignore it and let users assign rowtime/watermarks manually again. All in all, using {{CAST}} seems to be a little verbose, but should be a feasible solution for the time being. Any idea? Best, Xingcan > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. > Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`, if we want using the timestamp after > toDataSteram, we should using `assignTimestampsAndWatermarks()`. > What do you think ? [~twalthr] [~fhueske] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729271#comment-16729271 ] Xingcan Cui commented on FLINK-11220: - Hi [~sunjincheng121], if users want to preserve a rowtime field (as a common field), they can cast the field to TIMESTAMP manually (i.e., {{select orderTime, cast (payTime as timestamp) from ... join ...}}). > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. > Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`, if we want using the timestamp after > toDataSteram, we should using `assignTimestampsAndWatermarks()`. > What do you think ? [~twalthr] [~fhueske] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
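Spelled out against the tables from the issue description, the suggested workaround would look roughly like this (an untested sketch):
{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

class RowtimeCastSketch {
    static Table query(StreamTableEnvironment tEnv) {
        // payTime is cast to a plain TIMESTAMP, so orderTime remains the only
        // rowtime field when the result is converted to a DataStream.
        return tEnv.sqlQuery(
            "SELECT orderTime, o.orderId, CAST(payTime AS TIMESTAMP) AS payTime "
                + "FROM Orders AS o JOIN Payment AS p "
                + "ON o.orderId = p.orderId AND "
                + "p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR");
    }
}
{code}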
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729019#comment-16729019 ] Xingcan Cui commented on FLINK-11220: - Hi [~sunjincheng121], I'd like to give some explanations about this problem. Unlike the common join, the time-windowed join will produce a time-ordered stream, which means the event times of the results are still aligned with the watermarks. Thus we can directly choose either of the event time fields from the original streams as the new event time field. It's quite an interesting problem to deal with the event time field after some aggregations or joins (e.g., in common join, no event time field can be preserved). Personally, I suggest keeping the current manner and maybe in the future, we can extend the time system to support multiple event times and separated watermarks. What do you think? Best, Xingcan > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. > Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`, if we want using the timestamp after > toDataSteram, we should using `assignTimestampsAndWatermarks()`. > What do you think ? [~twalthr] [~fhueske] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676901#comment-16676901 ] Xingcan Cui edited comment on FLINK-10463 at 11/7/18 11:32 AM: --- Fixed in 1.8.0 5a4e5e9dcaa83368862d27bd80429235c02ed66e Fixed in 1.7.1 98198d4cda78cc815fe6430e2249130efb102b61 Fixed in 1.6.3 761c7db58ea39bfc3a2bfdacba847d2d6e224129 was (Author: xccui): Fixed in 1.8.0 5a4e5e9dcaa83368862d27bd80429235c02ed66e > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.6.3, 1.8.0, 1.7.1 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676968#comment-16676968 ] Xingcan Cui commented on FLINK-10463: - [~twalthr], sure. I'll cherry-pick the commit to versions 1.6.3 and 1.7.1. > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.6.3, 1.8.0, 1.7.1 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10463: Fix Version/s: 1.7.1 1.6.3 > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.6.3, 1.8.0, 1.7.1 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-10463. - Resolution: Fixed Fixed in 1.8.0 5a4e5e9dcaa83368862d27bd80429235c02ed66e > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
[ https://issues.apache.org/jira/browse/FLINK-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-10463: Fix Version/s: 1.8.0 > Null literal cannot be properly parsed in Java Table API function call > -- > > Key: FLINK-10463 > URL: https://issues.apache.org/jira/browse/FLINK-10463 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Xingcan Cui >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > > For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws > the following exception. > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 13: string matching regex > `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found > Null(STRING).regexpReplace('oo|ar', '') > ^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-10684) Improve the CSV reading process
Xingcan Cui created FLINK-10684: --- Summary: Improve the CSV reading process Key: FLINK-10684 URL: https://issues.apache.org/jira/browse/FLINK-10684 Project: Flink Issue Type: Improvement Components: Core Reporter: Xingcan Cui CSV is one of the most commonly used file formats in data wrangling. To load records from CSV files, Flink has provided the basic {{CsvInputFormat}}, as well as some variants (e.g., {{RowCsvInputFormat}} and {{PojoCsvInputFormat}}). However, it seems that the reading process can be improved. For example, we could add a built-in util to automatically infer schemas from CSV headers and samples of data. Also, the current bad record handling method can be improved by somehow keeping the invalid lines (and even the reasons for failed parsing), instead of logging the total number only. This is an umbrella issue for all the improvements and bug fixes for the CSV reading process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-9990) Add regexp_extract supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-9990. Resolution: Implemented Implemented in 1.7.0 5dc360984143005f73b8f70f97ed6b1c2afd7dc3 > Add regexp_extract supported in TableAPI and SQL > > > Key: FLINK-9990 > URL: https://issues.apache.org/jira/browse/FLINK-9990 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > > regexp_extract is a very useful function; it returns a string based on a regex > pattern and an index. > For example : > {code:java} > regexp_extract('foothebar', 'foo(.*?)(bar)', 2) // returns 'bar.' > {code} > It is provided as a UDF in Hive, more details please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-10463) Null literal cannot be properly parsed in Java Table API function call
Xingcan Cui created FLINK-10463: --- Summary: Null literal cannot be properly parsed in Java Table API function call Key: FLINK-10463 URL: https://issues.apache.org/jira/browse/FLINK-10463 Project: Flink Issue Type: Bug Components: Table API & SQL Reporter: Xingcan Cui For example, the expression `Null(STRING).regexpReplace('oo|ar', '')` throws the following exception. {code:java} org.apache.flink.table.api.ExpressionParserException: Could not parse expression at column 13: string matching regex `(?i)\Qas\E(?![_$\p{javaJavaIdentifierPart}])' expected but `.' found Null(STRING).regexpReplace('oo|ar', '') ^ at org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:576) at org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:569) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7800) Enable window joins without equi-join predicates
[ https://issues.apache.org/jira/browse/FLINK-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632815#comment-16632815 ] Xingcan Cui commented on FLINK-7800: As posted earlier on the Calcite dev list in [Question about join plan optimization in Flink|https://lists.apache.org/thread.html/f40182397d393b4348a2658d256c87fd7566b8add0eef3642a152471@%3Cdev.calcite.apache.org%3E], the mechanism for forbidding some candidate plans still confuses me, and the problem becomes even harder when time attributes are considered (i.e., we need some predicate push-down to obtain the equi-predicate, but that doesn't hold for time attributes). Do you have any suggestions? [~fhueske] [~twalthr] > Enable window joins without equi-join predicates > > > Key: FLINK-7800 > URL: https://issues.apache.org/jira/browse/FLINK-7800 > Project: Flink > Issue Type: New Feature > Components: Table API & SQL >Affects Versions: 1.4.0 >Reporter: Fabian Hueske >Assignee: Xingcan Cui >Priority: Major > Labels: pull-request-available > > Currently, windowed joins can only be translated if they have at least one > equi-join predicate. This limitation exists due to the lack of a good cross > join strategy for the DataSet API. > Due to the window, windowed joins do not have to be executed as cross joins. > Hence, the equi-join limitation does not need to be enforced (even though > non-equi joins are executed with a parallelism of 1 right now). > We can resolve this issue by adding a boolean flag to the > {{FlinkLogicalJoinConverter}} rule to permit non-equi joins and add such a > rule to the logical optimization set of the DataStream API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
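For context, a sketch of the query shape this issue wants to enable; the tables {{A}} and {{B}} with a rowtime attribute {{ts}} are hypothetical. The join is bounded by a time window but carries no equi-join predicate, so it is currently rejected by the translation.
{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

class NonEquiWindowJoinSketch {
    static Table joined(StreamTableEnvironment tEnv) {
        // The join is bounded by the time window only; there is no
        // equi-join predicate relating the two inputs.
        return tEnv.sqlQuery(
            "SELECT * FROM A, B WHERE A.ts BETWEEN "
                + "B.ts - INTERVAL '1' HOUR AND B.ts + INTERVAL '1' HOUR");
    }
}
{code}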
[jira] [Resolved] (FLINK-10145) Add replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-10145. - Resolution: Fixed Fix Version/s: 1.7.0 Implemented in 1.7.0: 24af70fdecbbb66e8555df7aca35a92a2f1aa7ac > Add replace supported in TableAPI and SQL > - > > Key: FLINK-10145 > URL: https://issues.apache.org/jira/browse/FLINK-10145 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: Guibo Pan >Assignee: Guibo Pan >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > replace is a useful function for String. > For example: > {code:java} > select replace("Hello World", "World", "Flink") // return "Hello Flink" > select replace("ababab", "abab", "z") // return "zab" > {code} > It is supported as a UDF in Hive, more details please see [1] > [1]: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringFunctions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (FLINK-9991) Add regexp_replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui resolved FLINK-9991. Resolution: Implemented Implemented in 1.7.0: f03d15a08ad5df44a4bb742d3edfdce211bf9e48 > Add regexp_replace supported in TableAPI and SQL > > > Key: FLINK-9991 > URL: https://issues.apache.org/jira/browse/FLINK-9991 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > Fix For: 1.7.0 > > > regexp_replace is a very useful function to process String. > For example : > {code:java} > regexp_replace("foobar", "oo|ar", "") //returns 'fb.' > {code} > It is supported as a UDF in Hive, more details please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9991) Add regexp_replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingcan Cui updated FLINK-9991: --- Fix Version/s: 1.7.0 > Add regexp_replace supported in TableAPI and SQL > > > Key: FLINK-9991 > URL: https://issues.apache.org/jira/browse/FLINK-9991 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > Fix For: 1.7.0 > > > regexp_replace is a very useful function to process String. > For example : > {code:java} > regexp_replace("foobar", "oo|ar", "") //returns 'fb.' > {code} > It is supported as a UDF in Hive, more details please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10323) A single backslash cannot be successfully parsed in Java Table API
[ https://issues.apache.org/jira/browse/FLINK-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611894#comment-16611894 ] Xingcan Cui commented on FLINK-10323: - Yes, it seems to have been fixed and we don't even need to double the backslashes in SQL queries. Thanks for your work, [~twalthr] :-) > A single backslash cannot be successfully parsed in Java Table API > -- > > Key: FLINK-10323 > URL: https://issues.apache.org/jira/browse/FLINK-10323 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.4.2, 1.5.3, 1.6.0 >Reporter: Xingcan Cui >Priority: Major > > The snippet below will cause a parser exception. > {code:java} > testAllApis( > concat_ws("~", "AA", "\\"), > "concat_ws('~','AA','\\')", > "concat_ws('~','AA','')", > "AA~\\") > {code} > {code} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 20: Invalid expression. > concat_ws('~','AA','\') >^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:549) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:542) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.addTableApiTestExpr(ExpressionTestBase.scala:255) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.testAllApis(ExpressionTestBase.scala:265) > at > org.apache.flink.table.expressions.ScalarFunctionsTest.testConcatWs(ScalarFunctionsTest.scala:340) > {code} > However, with double (or more) backslashes, it will be successfully parsed. > {code:java} > testAllApis( > concat_ws("~", "AA", ""), > "concat_ws('~','AA','')", > "concat_ws('~','AA','')", > "AA~") > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10323) A single backslash cannot be successfully parsed in Java Table API
[ https://issues.apache.org/jira/browse/FLINK-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611878#comment-16611878 ] Xingcan Cui commented on FLINK-10323: - Thanks for your reminder [~twalthr]. I'll check it and close this issue if it has been addressed. > A single backslash cannot be successfully parsed in Java Table API > -- > > Key: FLINK-10323 > URL: https://issues.apache.org/jira/browse/FLINK-10323 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.4.2, 1.5.3, 1.6.0 >Reporter: Xingcan Cui >Priority: Major > > The snippet below will cause a parser exception. > {code:java} > testAllApis( > concat_ws("~", "AA", "\\"), > "concat_ws('~','AA','\\')", > "concat_ws('~','AA','')", > "AA~\\") > {code} > {code} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 20: Invalid expression. > concat_ws('~','AA','\') >^ > at > org.apache.flink.table.expressions.ExpressionParser$.throwError(ExpressionParser.scala:549) > at > org.apache.flink.table.expressions.ExpressionParser$.parseExpression(ExpressionParser.scala:542) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.addTableApiTestExpr(ExpressionTestBase.scala:255) > at > org.apache.flink.table.expressions.utils.ExpressionTestBase.testAllApis(ExpressionTestBase.scala:265) > at > org.apache.flink.table.expressions.ScalarFunctionsTest.testConcatWs(ScalarFunctionsTest.scala:340) > {code} > However, with double (or more) backslashes, it will be successfully parsed. > {code:java} > testAllApis( > concat_ws("~", "AA", ""), > "concat_ws('~','AA','')", > "concat_ws('~','AA','')", > "AA~") > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)