[jira] [Updated] (FLINK-14027) Add documentation for Python user-defined functions
[ https://issues.apache.org/jira/browse/FLINK-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng updated FLINK-14027: Description: We should add documentation about how to use Python user-defined functions. (was: We should add documentation about how to use Python user-defined functions. Python dependencies should be included in the document. ) > Add documentation for Python user-defined functions > --- > > Key: FLINK-14027 > URL: https://issues.apache.org/jira/browse/FLINK-14027 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Documentation >Reporter: Dian Fu >Assignee: Wei Zhong >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We should add documentation about how to use Python user-defined functions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
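For context, a minimal sketch of the kind of usage such documentation would need to cover, based on the 1.10-era PyFlink Table API (the function, table, and field names here are illustrative, not taken from the ticket):

```python
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import StreamTableEnvironment, DataTypes
from pyflink.table.udf import udf

env = StreamExecutionEnvironment.get_execution_environment()
t_env = StreamTableEnvironment.create(env)

# A simple scalar Python UDF: adds two BIGINT columns.
@udf(input_types=[DataTypes.BIGINT(), DataTypes.BIGINT()],
     result_type=DataTypes.BIGINT())
def add(i, j):
    return i + j

# Register the UDF so it can be referenced from the Table API and SQL.
t_env.register_function("add", add)

# Use it on a small in-memory table.
t = t_env.from_elements([(1, 2), (3, 4)], ['a', 'b'])
result = t.select("add(a, b)")
```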
[jira] [Closed] (FLINK-14027) Add documentation for Python user-defined functions
[ https://issues.apache.org/jira/browse/FLINK-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng closed FLINK-14027. --- Resolution: Resolved > Add documentation for Python user-defined functions > --- > > Key: FLINK-14027 > URL: https://issues.apache.org/jira/browse/FLINK-14027 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Documentation >Reporter: Dian Fu >Assignee: Wei Zhong >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We should add documentation about how to use Python user-defined functions. > Python dependencies should be included in the document. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-541324020 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 9ee50d8d31d809e5c5eb578b3d1d428658006554 (Sun Oct 20 06:57:50 UTC 2019) **Warnings:** * **1 pom.xml files were touched**: Check for build and licensing issues. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-14027) Add documentation for Python user-defined functions
[ https://issues.apache.org/jira/browse/FLINK-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955415#comment-16955415 ] Hequn Cheng commented on FLINK-14027: - Resolved in 1.10.0 via fbb941403bc2f604734e5121a905f54f2a0d0c5f > Add documentation for Python user-defined functions > --- > > Key: FLINK-14027 > URL: https://issues.apache.org/jira/browse/FLINK-14027 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Documentation >Reporter: Dian Fu >Assignee: Wei Zhong >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We should add documentation about how to use Python user-defined functions. > Python dependencies should be included in the document. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] hequn8128 closed pull request #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
hequn8128 closed pull request #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942#issuecomment-54414 ## CI report: * 9edacc5d121c71befd2506f276b4210288c4654e : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132691365) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
flinkbot edited a comment on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9 URL: https://github.com/apache/flink/pull/9943#issuecomment-544224545 ## CI report: * e816681d90c2070b6136ddaca1ce11e4d32a21fd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132692257) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
flinkbot commented on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9 URL: https://github.com/apache/flink/pull/9943#issuecomment-544224545 ## CI report: * e816681d90c2070b6136ddaca1ce11e4d32a21fd : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942#issuecomment-544221279 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 9edacc5d121c71befd2506f276b4210288c4654e (Sun Oct 20 06:05:53 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
zhuzhurk commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942#issuecomment-544223799 Thanks for opening this PR. The `sessionTimeout` has been unused since we moved to the FLIP-6 architecture. The change looks good to me. +1 to merge it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942#issuecomment-54414 ## CI report: * 9edacc5d121c71befd2506f276b4210288c4654e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132691365) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
flinkbot commented on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9 URL: https://github.com/apache/flink/pull/9943#issuecomment-544223382 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e816681d90c2070b6136ddaca1ce11e4d32a21fd (Sun Oct 20 05:55:32 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-14463).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14463) Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
[ https://issues.apache.org/jira/browse/FLINK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14463: --- Labels: pull-request-available (was: ) > Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9 > - > > Key: FLINK-14463 > URL: https://issues.apache.org/jira/browse/FLINK-14463 > Project: Flink > Issue Type: Improvement >Affects Versions: 1.9.0 >Reporter: Jiayi Liao >Priority: Major > Labels: pull-request-available > > Handover.java exists both in flink-connector-kafka(kafka 2.x) module and > flink-connector-kafka-0.9 module. We should put this file into kafka base > module to avoid repeated codes. > cc [~sewen] [~yanghua] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] buptljy opened a new pull request #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
buptljy opened a new pull request #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9 URL: https://github.com/apache/flink/pull/9943 ## What is the purpose of the change Handover.java exists in both the flink-connector-kafka (Kafka 2.x) module and the flink-connector-kafka-0.9 module. We should move this file into the kafka base module to avoid code duplication. ## Brief change log Remove Handover from the flink-connector-kafka (Kafka 2.x) and flink-connector-kafka-0.9 modules, and add Handover to the flink-connector-kafka-base module. ## Verifying this change Unit tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
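For readers unfamiliar with the class being moved: the real Handover in the Kafka connector is more involved, but the hand-over pattern it implements is a single-slot exchange between the Kafka consumer thread and the fetcher thread. The sketch below is a simplified illustration of that pattern under those assumptions; the method names mirror the Flink class, but the body is not the actual Flink code.

```java
import java.io.Closeable;

/** Simplified single-element hand-over between a producer thread and a consumer thread. */
public final class SimpleHandover<T> implements Closeable {

    private final Object lock = new Object();
    private T next;            // the element currently handed over, if any
    private Throwable error;   // an error reported by the producer
    private boolean closed;

    /** Called by the consumer thread; blocks until an element or an error is available. */
    public T pollNext() throws Exception {
        synchronized (lock) {
            while (next == null && error == null && !closed) {
                lock.wait();
            }
            if (error != null) {
                throw new Exception("Producer failed", error);
            }
            if (closed) {
                throw new IllegalStateException("Handover closed");
            }
            T result = next;
            next = null;
            lock.notifyAll(); // wake a producer waiting for the slot to free up
            return result;
        }
    }

    /** Called by the producer thread; blocks while the previous element is still pending. */
    public void produce(T element) throws InterruptedException {
        synchronized (lock) {
            while (next != null && !closed) {
                lock.wait();
            }
            if (closed) {
                throw new IllegalStateException("Handover closed");
            }
            next = element;
            lock.notifyAll();
        }
    }

    /** Lets the producer forward an exception to the consumer thread. */
    public void reportError(Throwable t) {
        synchronized (lock) {
            error = t;
            lock.notifyAll();
        }
    }

    @Override
    public void close() {
        synchronized (lock) {
            closed = true;
            next = null;
            lock.notifyAll();
        }
    }
}
```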
[jira] [Created] (FLINK-14463) Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
Jiayi Liao created FLINK-14463: -- Summary: Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9 Key: FLINK-14463 URL: https://issues.apache.org/jira/browse/FLINK-14463 Project: Flink Issue Type: Improvement Affects Versions: 1.9.0 Reporter: Jiayi Liao Handover.java exists in both the flink-connector-kafka (Kafka 2.x) module and the flink-connector-kafka-0.9 module. We should move this file into the kafka base module to avoid code duplication. cc [~sewen] [~yanghua] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942#issuecomment-54414 ## CI report: * 9edacc5d121c71befd2506f276b4210288c4654e : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-8363) Build Hadoop 2.9.0 convenience binaries
[ https://issues.apache.org/jira/browse/FLINK-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955392#comment-16955392 ] Igor Dvorzhak commented on FLINK-8363: -- Any updates on this? It would be nice to have flink-shaded libraries published for Hadoop 2.9.x and Hadoop 3.x > Build Hadoop 2.9.0 convenience binaries > --- > > Key: FLINK-8363 > URL: https://issues.apache.org/jira/browse/FLINK-8363 > Project: Flink > Issue Type: New Feature > Components: Build System, BuildSystem / Shaded >Affects Versions: 1.5.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Trivial > > Hadoop 2.9.0 was released on 17 November, 2017. A local {{mvn clean verify > -Dhadoop.version=2.9.0}} ran successfully. > With the new Hadoopless build we may be able to improve the build process by > reusing the {{flink-dist}} jar (which differ only in build timestamps) and > simply make each Hadoop-specific tarball by copying in the corresponding > {{flink-shaded-hadoop2-uber}} jar. > What portion of the TravisCI jobs can run Hadoopless? We could build and > verify these once and then run a Hadoop-versioned job for each Hadoop version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942#issuecomment-544221279 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 9edacc5d121c71befd2506f276b4210288c4654e (Sun Oct 20 05:09:03 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14461) Remove unused sessionTimeout from JobGraph
[ https://issues.apache.org/jira/browse/FLINK-14461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14461: --- Labels: pull-request-available (was: ) > Remove unused sessionTimeout from JobGraph > -- > > Key: FLINK-14461 > URL: https://issues.apache.org/jira/browse/FLINK-14461 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Affects Versions: 1.10.0 >Reporter: Zili Chen >Assignee: Zili Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] TisonKun opened a new pull request #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph
TisonKun opened a new pull request #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph URL: https://github.com/apache/flink/pull/9942 ## What is the purpose of the change Remove unused sessionTimeout from JobGraph ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) cc @zentol @tillrohrmann This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-14462) Remove JobGraph#allowQueuedScheduling flag because it is always true
Zili Chen created FLINK-14462: - Summary: Remove JobGraph#allowQueuedScheduling flag because it is always true Key: FLINK-14462 URL: https://issues.apache.org/jira/browse/FLINK-14462 Project: Flink Issue Type: Sub-task Components: Runtime / Configuration Affects Versions: 1.10.0 Reporter: Zili Chen Assignee: Zili Chen Fix For: 1.10.0 CC [~trohrmann][~zhuzh] The only place {{#setAllowQueuedScheduling(false)}} is called is in {{JobGraphGenerator}}. IIRC we always call {{#setAllowQueuedScheduling(true)}} after generation and before submission. To reduce confusion I propose removing {{JobGraph#allowQueuedScheduling}} and refactoring the related logic to always assume {{true}}. This flag was originally used to configure different resource allocation strategies between the legacy mode and the FLIP-6 architecture, and there remain branches in {{Scheduler}} that might cause further confusion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14461) Remove unused sessionTimeout from JobGraph
Zili Chen created FLINK-14461: - Summary: Remove unused sessionTimeout from JobGraph Key: FLINK-14461 URL: https://issues.apache.org/jira/browse/FLINK-14461 Project: Flink Issue Type: Improvement Components: Runtime / Configuration Affects Versions: 1.10.0 Reporter: Zili Chen Assignee: Zili Chen Fix For: 1.10.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14424) Create tpc-ds end to end test to support all tpc-ds queries
[ https://issues.apache.org/jira/browse/FLINK-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955375#comment-16955375 ] Jiayi Liao commented on FLINK-14424: A quick question: are you planning to create a new tpc-ds module and put it into flink-end-to-end-tests? > Create tpc-ds end to end test to support all tpc-ds queries > > > Key: FLINK-14424 > URL: https://issues.apache.org/jira/browse/FLINK-14424 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API, Table SQL / Planner >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Fix For: 1.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. **EDIT**: OK, it seems like the `TableFunctionDefinition` for the temporal table already carries the `DataType` information which is visible via: `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we can use in order to avoid the data losing conversion from `TypeInformation[T]`. WDYT? **EDIT 2:** Turns out this is not enough, since `DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call `tableFunction.getResult`, which has the `TypeInformation[T]` and thus will result in a nullable type as well. **EDIT 3:** `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non null multiset: ```java public RelDataType createMultisetType( RelDataType type, long maxCardinality) { assert maxCardinality == -1; RelDataType newType = new MultisetSqlType(type, false); return canonize(newType); } ``` This is where the two diverge and why the `TableSchema` has a non null multiset type, and it seems like this will happen for any complex data type? **EDIT 4** Perhaps the better way is to make `FlinkTypeFactory` produce a `MultisetSqlType` which is by default nullable? 
i.e.: ```scala override def createMultisetType(elementType: RelDataType, maxCardinality: Long): RelDataType = { // Just validate type, make sure there is a failure in validate phase. toLogicalType(elementType) super.createTypeWithNullability(super.createMultisetType(elementType, maxCardinality), true) } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. **EDIT**: OK, it seems like the `TableFunctionDefinition` for the temporal table already carries the `DataType` information which is visible via: `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we can use in order to avoid the data losing conversion from `TypeInformation[T]`. WDYT? **EDIT 2:** Turns out this is not enough, since `DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call `tableFunction.getResult`, which has the `TypeInformation[T]` and thus will result in a nullable type as well. **EDIT 3:** `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non null multiset: ```java public RelDataType createMultisetType( RelDataType type, long maxCardinality) { assert maxCardinality == -1; RelDataType newType = new MultisetSqlType(type, false); return canonize(newType); } ``` This is where the two diverge and why the `TableSchema` has a non null multiset type, and it seems like this will happen for any complex data type? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-12399) FilterableTableSource does not use filters on job run
[ https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955236#comment-16955236 ] Rong Rong commented on FLINK-12399: --- merged to 1.9: c1019105c22455c554ab91b9fc2ef8512873bee8 > FilterableTableSource does not use filters on job run > - > > Key: FLINK-12399 > URL: https://issues.apache.org/jira/browse/FLINK-12399 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.8.0 >Reporter: Josh Bradt >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Attachments: flink-filter-bug.tar.gz > > Time Spent: 40m > Remaining Estimate: 0h > > As discussed [on the mailing > list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html], > there appears to be a bug where a job that uses a custom > FilterableTableSource does not keep the filters that were pushed down into > the table source. More specifically, the table source does receive filters > via applyPredicates, and a new table source with those filters is returned, > but the final job graph appears to use the original table source, which does > not contain any filters. > I attached a minimal example program to this ticket. The custom table source > is as follows: > {code:java} > public class CustomTableSource implements BatchTableSource, > FilterableTableSource { > private static final Logger LOG = > LoggerFactory.getLogger(CustomTableSource.class); > private final Filter[] filters; > private final FilterConverter converter = new FilterConverter(); > public CustomTableSource() { > this(null); > } > private CustomTableSource(Filter[] filters) { > this.filters = filters; > } > @Override > public DataSet getDataSet(ExecutionEnvironment execEnv) { > if (filters == null) { >LOG.info(" No filters defined "); > } else { > LOG.info(" Found filters "); > for (Filter filter : filters) { > LOG.info("FILTER: {}", filter); > } > } > return execEnv.fromCollection(allModels()); > } > @Override > public TableSource applyPredicate(List predicates) { > LOG.info("Applying predicates"); > List acceptedFilters = new ArrayList<>(); > for (final Expression predicate : predicates) { > converter.convert(predicate).ifPresent(acceptedFilters::add); > } > return new CustomTableSource(acceptedFilters.toArray(new Filter[0])); > } > @Override > public boolean isFilterPushedDown() { > return filters != null; > } > @Override > public TypeInformation getReturnType() { > return TypeInformation.of(Model.class); > } > @Override > public TableSchema getTableSchema() { > return TableSchema.fromTypeInfo(getReturnType()); > } > private List allModels() { > List models = new ArrayList<>(); > models.add(new Model(1, 2, 3, 4)); > models.add(new Model(10, 11, 12, 13)); > models.add(new Model(20, 21, 22, 23)); > return models; > } > } > {code} > > When run, it logs > {noformat} > 15:24:54,888 INFO com.klaviyo.filterbug.CustomTableSource >- Applying predicates > 15:24:54,901 INFO com.klaviyo.filterbug.CustomTableSource >- Applying predicates > 15:24:54,910 INFO com.klaviyo.filterbug.CustomTableSource >- Applying predicates > 15:24:54,977 INFO com.klaviyo.filterbug.CustomTableSource >- No filters defined {noformat} > which appears to indicate that although filters are getting pushed down, the > final job does not use them. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-12399) FilterableTableSource does not use filters on job run
[ https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-12399: -- Fix Version/s: 1.9.2 > FilterableTableSource does not use filters on job run > - > > Key: FLINK-12399 > URL: https://issues.apache.org/jira/browse/FLINK-12399 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.8.0 >Reporter: Josh Bradt >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0, 1.9.2 > > Attachments: flink-filter-bug.tar.gz > > Time Spent: 40m > Remaining Estimate: 0h > > As discussed [on the mailing > list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html], > there appears to be a bug where a job that uses a custom > FilterableTableSource does not keep the filters that were pushed down into > the table source. More specifically, the table source does receive filters > via applyPredicates, and a new table source with those filters is returned, > but the final job graph appears to use the original table source, which does > not contain any filters. > I attached a minimal example program to this ticket. The custom table source > is as follows: > {code:java} > public class CustomTableSource implements BatchTableSource, > FilterableTableSource { > private static final Logger LOG = > LoggerFactory.getLogger(CustomTableSource.class); > private final Filter[] filters; > private final FilterConverter converter = new FilterConverter(); > public CustomTableSource() { > this(null); > } > private CustomTableSource(Filter[] filters) { > this.filters = filters; > } > @Override > public DataSet getDataSet(ExecutionEnvironment execEnv) { > if (filters == null) { >LOG.info(" No filters defined "); > } else { > LOG.info(" Found filters "); > for (Filter filter : filters) { > LOG.info("FILTER: {}", filter); > } > } > return execEnv.fromCollection(allModels()); > } > @Override > public TableSource applyPredicate(List predicates) { > LOG.info("Applying predicates"); > List acceptedFilters = new ArrayList<>(); > for (final Expression predicate : predicates) { > converter.convert(predicate).ifPresent(acceptedFilters::add); > } > return new CustomTableSource(acceptedFilters.toArray(new Filter[0])); > } > @Override > public boolean isFilterPushedDown() { > return filters != null; > } > @Override > public TypeInformation getReturnType() { > return TypeInformation.of(Model.class); > } > @Override > public TableSchema getTableSchema() { > return TableSchema.fromTypeInfo(getReturnType()); > } > private List allModels() { > List models = new ArrayList<>(); > models.add(new Model(1, 2, 3, 4)); > models.add(new Model(10, 11, 12, 13)); > models.add(new Model(20, 21, 22, 23)); > return models; > } > } > {code} > > When run, it logs > {noformat} > 15:24:54,888 INFO com.klaviyo.filterbug.CustomTableSource >- Applying predicates > 15:24:54,901 INFO com.klaviyo.filterbug.CustomTableSource >- Applying predicates > 15:24:54,910 INFO com.klaviyo.filterbug.CustomTableSource >- Applying predicates > 15:24:54,977 INFO com.klaviyo.filterbug.CustomTableSource >- No filters defined {noformat} > which appears to indicate that although filters are getting pushed down, the > final job does not use them. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. **EDIT**: OK, it seems like the `TableFunctionDefinition` for the temporal table already carries the `DataType` information which is visible via: `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we can use in order to avoid the data losing conversion from `TypeInformation[T]`. WDYT? **EDIT 2:** Turns out this is not enough, since `DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call `tableFunction.getResult`, which has the `TypeInformation[T]` and thus will result in a nullable type as well. **EDIT 3:** `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non null multiset: ```java public RelDataType createMultisetType( RelDataType type, long maxCardinality) { assert maxCardinality == -1; RelDataType newType = new MultisetSqlType(type, false); return canonize(newType); } ``` This is where the types converge and why the `TableSchema` has a non null multiset type, and it seems like this will happen for any complex data type? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. **EDIT**: OK, it seems like the `TableFunctionDefinition` for the temporal table already carries the `DataType` information which is visible via: `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we can use in order to avoid the data losing conversion from `TypeInformation[T]`. WDYT? **EDIT 2:** Turns out this is not enough, since `DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call `tableFunction.getResult`, which has the `TypeInformation[T]` and thus will result in a nullable type as well. **EDIT 3:** `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non null multiset: ```java public RelDataType createMultisetType( RelDataType type, long maxCardinality) { assert maxCardinality == -1; RelDataType newType = new MultisetSqlType(type, false); return canonize(newType); } ``` This is where the types converge, and it seems like this will happen for any complex data type? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. **EDIT**: OK, it seems like the `TableFunctionDefinition` for the temporal table already carries the `DataType` information which is visible via: `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we can use in order to avoid the data losing conversion from `TypeInformation[T]`. WDYT? **EDIT 2:** Turns out this is not enough, since `DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call `tableFunction.getResult`, which has the `TypeInformation[T]` and thus will result in a nullable type as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-13850) Refactor part file configuration into a single method
[ https://issues.apache.org/jira/browse/FLINK-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955200#comment-16955200 ] lichong commented on FLINK-13850: - This is exactly what I have needed recently. I want to control the names of the target file and the in-progress file instead of using the defaults, for example a target name without any subtask info, removing the dot prefix of the in-progress file, etc. OutputFileConfig makes more sense to me because it signals that this is the configuration for the output file, and there should also be a way for users to either provide their own configuration or fall back to the default values. Just my opinion, thanks; I am looking forward to this feature. A rough sketch of such a configuration is shown below. > Refactor part file configuration into a single method > - > > Key: FLINK-13850 > URL: https://issues.apache.org/jira/browse/FLINK-13850 > Project: Flink > Issue Type: Sub-task > Components: Connectors / FileSystem >Reporter: Gyula Fora >Assignee: João Boto >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently there are only two methods on both format builders > withPartFilePrefix and withPartFileSuffix for configuring the part files but > in the future it is likely to grow. > * More settings, different directories for pending / inprogress files etc > I suggest we remove these two methods and replace them with a single : > withPartFileConfig(..) where we use an extensible config class. > This should be fixed before 1.10 in order to not release the other methods. -- This message was sent by Atlassian Jira (v8.3.4#803005)
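To make the proposal concrete, here is a rough sketch of what bundling the part-file naming options into a single config object on the StreamingFileSink builder could look like. The OutputFileConfig name follows the suggestion in the comment above; the exact builder methods are assumptions about the API shape being designed at the time, not a statement of the released interface.

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.OutputFileConfig;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class PartFileConfigExample {
    public static void main(String[] args) {
        // All part-file naming options live in one extensible config object
        // instead of separate withPartFilePrefix/withPartFileSuffix builder methods.
        OutputFileConfig fileConfig = OutputFileConfig.builder()
                .withPartPrefix("data")   // e.g. data-0-0.ext instead of part-0-0
                .withPartSuffix(".ext")
                .build();

        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("/tmp/output"), new SimpleStringEncoder<String>("UTF-8"))
                .withOutputFileConfig(fileConfig)
                .build();

        // The sink would then be attached to a DataStream<String> via stream.addSink(sink).
    }
}
```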
[GitHub] [flink] flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base classes of summarizer.
flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base classes of summarizer. URL: https://github.com/apache/flink/pull/9576#issuecomment-526645449 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 010706cbbd12d506f23a52e8b71474ab2b06c2b6 (Sat Oct 19 15:10:48 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-13915).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer.
walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer. URL: https://github.com/apache/flink/pull/9576#discussion_r336740930 ## File path: flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/statistics/basicstatistic/BaseSummarizer.java ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.flink.ml.common.statistics.basicstatistic; + +import org.apache.flink.ml.common.linalg.DenseMatrix; + +import java.io.Serializable; + +/** + * Summarizer is the base class to calculate summary and store intermediate results, and Summary is the result of Summarizer. Review comment: I actually have a higher level questions regarding the type of summary we are going to support and what's the road map in the future. In general: a "summary" to me should be something that can be serialized into a format that an external system can read and use it to produce meaningful insights. such as visualization e.g. [tf.summary](https://www.tensorflow.org/api_docs/python/tf/summary), or for transforming into some object that can be used to export data [R.summary](https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/summary). In this case, the class "summary" in this PR is more like a wrapper around: `sum()`, `count()`, `correlate()`, `covariance()`. I am not sure what's the value add to the ML-library. Maybe I was understanding this entire intention in the wrong way. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
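To illustrate the distinction drawn in the review comment above between a wrapper around `sum()`/`count()` and an exportable artifact, a hypothetical (non-Flink) sketch of a summary that an external tool could consume might look like the following; all names below are made up for illustration only:

```java
import java.io.Serializable;

/** Hypothetical: a summary that can be handed to an external system (dashboard, notebook, ...). */
interface ExportableSummary extends Serializable {
    /** A stable, self-describing representation that external tooling can parse. */
    String toJson();
}

/** Wrapper-style numeric summary, but with an explicit export format. */
class ColumnSummarySketch implements ExportableSummary {
    private final long count;
    private final double sum;
    private final double min;
    private final double max;

    ColumnSummarySketch(long count, double sum, double min, double max) {
        this.count = count;
        this.sum = sum;
        this.min = min;
        this.max = max;
    }

    double mean() {
        return count == 0 ? Double.NaN : sum / count;
    }

    @Override
    public String toJson() {
        // Double.toString via %s keeps the output locale-independent.
        return String.format(
            "{\"count\":%d,\"sum\":%s,\"mean\":%s,\"min\":%s,\"max\":%s}",
            count, sum, mean(), min, max);
    }
}
```

Whether Flink ML should define such an export format, and in what encoding, is exactly the open question raised in the review comment.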
[GitHub] [flink] flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base classes of summarizer.
flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base classes of summarizer. URL: https://github.com/apache/flink/pull/9576#issuecomment-526645449 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 010706cbbd12d506f23a52e8b71474ab2b06c2b6 (Sat Oct 19 15:05:44 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-13915).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer.
walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer. URL: https://github.com/apache/flink/pull/9576#discussion_r33674 ## File path: flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/statistics/basicstatistic/BaseSummarizer.java ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.flink.ml.common.statistics.basicstatistic; + +import org.apache.flink.ml.common.linalg.DenseMatrix; + +import java.io.Serializable; + +/** + * Summarizer is the base class to calculate summary and store intermediate results, and Summary is the result of Summarizer. Review comment: To add to this discussion. Some of the summaries used by Tensorflow for example, is meant to be used by Tensorboard - which understands the format of the summary in order to plot visualization via [protobuf](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/summary.proto), same thing with [tf.Event](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/event.proto). If this is the case, is there an existing discussion regarding what needs to be done in terms of the public APIs of a `Summary` class should have in Flink-ML? I haven't follow up closely with the ML but I cant find any in the [FLIP-39 documentation](https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. **EDIT**: OK, it seems like the `TableFunctionDefinition` for the temporal table already carries the `DataType` information which is visible via: `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we can use in order to avoid the data losing conversion from `TypeInformation[T]`. WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
flinkbot edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-535560331 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 8c0e3b4b94421c9f5ac4cfa470d0859e88471f38 (Sat Oct 19 14:29:10 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-14042).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session
flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session URL: https://github.com/apache/flink/pull/9832#issuecomment-537039262 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 271703eda6f6c55b1641a54206109ef659f62854 (Sat Oct 19 14:29:14 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] YuvalItzchakov commented on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable
YuvalItzchakov commented on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621 @twalthr OK, once I wrote the test I figured out my fix here won't solve the problem. After doing some more debugging, I've come to the following findings: Given the following test: ```scala val util = scalaStreamTestUtil() val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3) val temporal = util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2") val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2) util.tableEnv.registerFunction("f", temporalFunc) val queryTable = util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)") util.verifyPlan(queryTable) ``` If we first look at the generated table schema for the underlying table by the SQL query for the temporal table, we see: ``` root |-- f1: TIMESTAMP(3) |-- f2: BIGINT |-- f3s: MULTISET NOT NULL ``` When `FlinkPlanner` validates the SQL query it reaches the part where it needs to validate the `TemporalTableFunction` I've defined called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to search for the method and once it finds it, it needs to convert it to a standard `SqlFunction`. In order to do that, it needs to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API. Problem is, `TypeInformation[T]` doesn't carry information about the nullability of the field, thus when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor which sets `nullable = true` by default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60), which ends up blowing up at runtime because the `TableSchema` expected a NOT NULL field. I'm not entirely sure how we can get around this issue. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session
tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session URL: https://github.com/apache/flink/pull/9832#discussion_r336739367 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/runner/StoppedDispatcherLeaderProcess.java ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.dispatcher.runner; + +import org.apache.flink.runtime.clusterframework.ApplicationStatus; +import org.apache.flink.runtime.dispatcher.DispatcherGateway; + +import java.util.UUID; +import java.util.concurrent.CompletableFuture; + +/** + * {@link DispatcherLeaderProcess} implementation which is stopped. This class + * is useful as the initial state of the {@link DefaultDispatcherRunner}. + */ +public enum StoppedDispatcherLeaderProcess implements DispatcherLeaderProcess { + INSTANCE; + + private static final CompletableFuture TERMINATION_FUTURE = CompletableFuture.completedFuture(null); + private static final UUID LEADER_SESSION_ID = new UUID(0L, 0L); + private static final CompletableFuture NEVER_COMPLETED_LEADER_SESSION_FUTURE = new CompletableFuture<>(); + private static final CompletableFuture NEVER_COMPLETED_SHUTDOWN_FUTURE = new CompletableFuture<>(); + + @Override + public void start() { + + } + + @Override + public UUID getLeaderSessionId() { + return LEADER_SESSION_ID; + } + + @Override + public CompletableFuture getDispatcherGateway() { + return null; + } + + @Override + public CompletableFuture getConfirmLeaderSessionFuture() { + return NEVER_COMPLETED_LEADER_SESSION_FUTURE; Review comment: I think you are right. I will update the implementation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session
flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session URL: https://github.com/apache/flink/pull/9832#issuecomment-537039262 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 271703eda6f6c55b1641a54206109ef659f62854 (Sat Oct 19 14:25:08 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session
tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session URL: https://github.com/apache/flink/pull/9832#discussion_r336739208 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/dispatcher/runner/DispatcherLeaderProcessImplTest.java ## @@ -239,6 +246,109 @@ public void closeAsync_duringJobRecovery_preventsDispatcherServiceCreation() thr } } + @Test + public void onRemovedJobGraph_cancelsRunningJob() throws Exception { Review comment: In this test class I'm trying to follow a new naming scheme which I believe is more helpful and is described here: https://osherove.com/blog/2005/4/3/naming-standards-for-unit-tests.html. The idea is to have more descriptive names for tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
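For readers unfamiliar with the linked convention, it is roughly `unitOfWork_condition_expectedBehavior`. A small illustration using the test names visible in this PR; the bodies are placeholders, not actual test code:

```java
public class DispatcherLeaderProcessImplNamingIllustrationTest {

    @org.junit.Test
    public void onRemovedJobGraph_cancelsRunningJob() {
        // given: a job is running
        // when:  its job graph is removed
        // then:  the running job is cancelled
    }

    @org.junit.Test
    public void closeAsync_duringJobRecovery_preventsDispatcherServiceCreation() {
        // given: job recovery is in progress
        // when:  closeAsync() is called
        // then:  no dispatcher service is created
    }
}
```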
[GitHub] [flink] tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session
tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session URL: https://github.com/apache/flink/pull/9832#discussion_r336739241 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/runner/DefaultDispatcherRunner.java ## @@ -0,0 +1,235 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.dispatcher.runner; + +import org.apache.flink.runtime.clusterframework.ApplicationStatus; +import org.apache.flink.runtime.concurrent.FutureUtils; +import org.apache.flink.runtime.dispatcher.DispatcherGateway; +import org.apache.flink.runtime.leaderelection.LeaderContender; +import org.apache.flink.runtime.leaderelection.LeaderElectionService; +import org.apache.flink.runtime.rpc.FatalErrorHandler; +import org.apache.flink.util.FlinkException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Arrays; +import java.util.UUID; +import java.util.concurrent.CompletableFuture; + +/** + * Runner for the {@link org.apache.flink.runtime.dispatcher.Dispatcher} which is responsible for the + * leader election. 
+ */ +public class DefaultDispatcherRunner implements DispatcherRunner, LeaderContender { + + private static final Logger LOG = LoggerFactory.getLogger(DefaultDispatcherRunner.class); + + private final Object lock = new Object(); + + private final LeaderElectionService leaderElectionService; + + private final FatalErrorHandler fatalErrorHandler; + + private final DispatcherLeaderProcessFactory dispatcherLeaderProcessFactory; + + private final CompletableFuture terminationFuture; + + private final CompletableFuture shutDownFuture; + + private boolean isRunning; + + private DispatcherLeaderProcess dispatcherLeaderProcess; + + private CompletableFuture previousDispatcherLeaderProcessTerminationFuture; + + private CompletableFuture dispatcherGatewayFuture; + + DefaultDispatcherRunner( + LeaderElectionService leaderElectionService, + FatalErrorHandler fatalErrorHandler, + DispatcherLeaderProcessFactory dispatcherLeaderProcessFactory) throws Exception { + this.leaderElectionService = leaderElectionService; + this.fatalErrorHandler = fatalErrorHandler; + this.dispatcherLeaderProcessFactory = dispatcherLeaderProcessFactory; + this.terminationFuture = new CompletableFuture<>(); + this.shutDownFuture = new CompletableFuture<>(); + + this.isRunning = true; + this.dispatcherLeaderProcess = StoppedDispatcherLeaderProcess.INSTANCE; + this.previousDispatcherLeaderProcessTerminationFuture = CompletableFuture.completedFuture(null); + this.dispatcherGatewayFuture = new CompletableFuture<>(); + + startDispatcherRunner(leaderElectionService); + } + + private void startDispatcherRunner(LeaderElectionService leaderElectionService) throws Exception { + LOG.info("Starting {}.", getClass().getName()); + + leaderElectionService.start(this); + } + + @Override + public CompletableFuture getDispatcherGateway() { + synchronized (lock) { + return dispatcherGatewayFuture; + } + } + + @Override + public CompletableFuture getShutDownFuture() { + return shutDownFuture; + } + + @Override + public CompletableFuture closeAsync() { + synchronized (lock) { + if (!isRunning) { + return terminationFuture; + } else { + isRunning = false; + } + } + + stopDispatcherLeaderProcess(); + final CompletableFuture servicesTerminationFuture = stopServices(); + + FutureUtils.forward( + FutureUtils.completeAll( + Arrays.asList( + p
[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-541326107 ## CI report: * e431c1c28a35dd86e60a0892ff0e0d15b8da7245 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131646097) * 0b45fe65b2a35154095af4944ea8c33e36165d0a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131895829) * 586ea9b19299a7ea6bb9efd679522a872c15d0a8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132129350) * 9ee50d8d31d809e5c5eb578b3d1d428658006554 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132643699) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Closed] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng closed FLINK-14459. --- Resolution: Fixed > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.9.1 >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0, 1.9.2 > > Time Spent: 20m > Remaining Estimate: 0h > > The build of python module hangs when installing conda. See travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it neither on my local mac nor on my repo with travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955156#comment-16955156 ] Hequn Cheng commented on FLINK-14459: - Fix in 1.10.0: 2b1187d299bc6fd8dcae0d4e565238d7800dd4bb 1.9.2: 4474748c31a0d71e23147409ad338d7f5b37d5e4 > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.10.0, 1.9.1 >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The build of python module hangs when installing conda. See travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it neither on my local mac nor on my repo with travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng updated FLINK-14459: Affects Version/s: (was: 1.10.0) > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.9.1 >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0, 1.9.2 > > Time Spent: 20m > Remaining Estimate: 0h > > The build of python module hangs when installing conda. See travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it neither on my local mac nor on my repo with travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng updated FLINK-14459: Fix Version/s: 1.9.2 1.10.0 > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.10.0, 1.9.1 >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0, 1.9.2 > > Time Spent: 20m > Remaining Estimate: 0h > > The build of python module hangs when installing conda. See travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it neither on my local mac nor on my repo with travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem
flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941#issuecomment-544126860 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit d14be543d3faf99a424a5480fda53ca7b7a65e49 (Sat Oct 19 12:21:56 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] hequn8128 closed pull request #9941: [FLINK-14459][python] Fix python module build hang problem
hequn8128 closed pull request #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem
flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941#issuecomment-544131502 ## CI report: * d14be543d3faf99a424a5480fda53ca7b7a65e49 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132643044) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-541326107 ## CI report: * e431c1c28a35dd86e60a0892ff0e0d15b8da7245 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131646097) * 0b45fe65b2a35154095af4944ea8c33e36165d0a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131895829) * 586ea9b19299a7ea6bb9efd679522a872c15d0a8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132129350) * 9ee50d8d31d809e5c5eb578b3d1d428658006554 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132643699) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem
flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941#issuecomment-544131502 ## CI report: * d14be543d3faf99a424a5480fda53ca7b7a65e49 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132643044) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-541326107 ## CI report: * e431c1c28a35dd86e60a0892ff0e0d15b8da7245 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131646097) * 0b45fe65b2a35154095af4944ea8c33e36165d0a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131895829) * 586ea9b19299a7ea6bb9efd679522a872c15d0a8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132129350) * 9ee50d8d31d809e5c5eb578b3d1d428658006554 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-541324020 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 9ee50d8d31d809e5c5eb578b3d1d428658006554 (Sat Oct 19 11:35:06 UTC 2019) **Warnings:** * **1 pom.xml files were touched**: Check for build and licensing issues. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] WeiZhong94 commented on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
WeiZhong94 commented on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-544132906 @hequn8128 Thank you for reminding me of that! I have updated the python_configuration.html in the latest commit. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions. URL: https://github.com/apache/flink/pull/9886#issuecomment-541324020 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 9ee50d8d31d809e5c5eb578b3d1d428658006554 (Sat Oct 19 11:27:59 UTC 2019) **Warnings:** * **1 pom.xml files were touched**: Check for build and licensing issues. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module build hang problem
flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941#issuecomment-544131502 ## CI report: * d14be543d3faf99a424a5480fda53ca7b7a65e49 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-9953) Active Kubernetes integration
[ https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-9953: - Description: This is the umbrella issue tracking Flink's active Kubernetes integration. Active means in this context that the {{ResourceManager}} can talk to Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. The Phase 1 implementation will provide the complete functionality to make Flink run on Kubernetes. Phase 2 is mainly focused on production optimization, including k8s-native high availability, storage, network, log collection, etc. was: This is the umbrella issue tracking Flink's active Kubernetes integration. Active means in this context that the {{ResourceManager}} can talk to Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. The Phase 1 implementation will provide the complete functionality to make Flink run on Kubernetes. Phase 1 is mainly focused on production optimization, including k8s-native high availability, storage, network, log collection, etc. > Active Kubernetes integration > - > > Key: FLINK-9953 > URL: https://issues.apache.org/jira/browse/FLINK-9953 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Till Rohrmann >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is the umbrella issue tracking Flink's active Kubernetes integration. > Active means in this context that the {{ResourceManager}} can talk to > Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. > The Phase 1 implementation will provide the complete functionality to make Flink run on > Kubernetes. Phase 2 is mainly focused on production optimization, including > k8s-native high availability, storage, network, log collection, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14460) Active Kubernetes integration phase2 - Production Optimization
Yang Wang created FLINK-14460: - Summary: Active Kubernetes integration phase2 - Production Optimization Key: FLINK-14460 URL: https://issues.apache.org/jira/browse/FLINK-14460 Project: Flink Issue Type: Improvement Reporter: Yang Wang This is phase 2 of the active Kubernetes integration. It is an umbrella Jira to track all the production optimization features. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-9953) Active Kubernetes integration
[ https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955141#comment-16955141 ] Yang Wang edited comment on FLINK-9953 at 10/19/19 10:59 AM: - > Perjob cluster mode I suggest to build the user image with required dependencies in per job mode. And actually, standalone job cluster is also like this. Many companies has use this way in production. In order to solve dynamic dependency management, we could add the init container before jm and tm pod starting. The init container could download the jars and other files from http server, hdfs and other shared storage. This make flink application more like k8s style. In this way, the `MiniDispatcher` and `ClassPathJobGraphRetriever` is enough for the per job mode. The two parts submission is more like to start a session cluster to simulate per job. So we will need to new dispatcher to accept job from rest and allow only one job. Maybe we could support this in the future, but it need more discussion. > Submission cli Currently the `flink run` coud only support detach mode for per job cluster on Yarn. In attach mode, we use a session to simulate a per job cluster for multi-parts. Do we need to keep the same behavior as flink on Yarn? We do not need user jar in k8s per job mode, so using the `flink run` to start per job cluster will be strange. {code:java} // detach, DeployJobCluster() Use the jar in the image, not in the cli. ./bin/flink run -d -m kubernetes-cluster ./examples/batch/WordCount.jar // attach, DeploySessionCluster() ./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar{code} > Implementation plan Let's focus on current design and move the production optimization to phase2. I have created another umbrella jira to track. Also we need more feedback from other users to improve the active kubernetes integration after phase1. I will attach the PRs in the next few days. [~felixzheng] [~trohrmann] How do you think? was (Author: fly_in_gis): > Perjob cluster mode I suggest to build the user image with required dependencies in per job mode. And actually, standalone job cluster is also like. Many companies has use this way in production. In order to solve dynamic dependency management, we could add the init container before jm and tm pod starting. The init container could download the jars and other files from http server, hdfs and other shared storage. This make flink application more like k8s style. In this way, the `MiniDispatcher` and `ClassPathJobGraphRetriever` is enough for the per job mode. The two parts submission is more like to start a session cluster to simulate per job. So we will need to new dispatcher to accept job from rest and allow only one job. Maybe we could support this in the future, but it need more discussion. > Submission cli Currently the `flink run` coud only support detach mode for per job cluster on Yarn. In attach mode, we use a session to simulate a per job cluster for multi-parts. Do we need to keep the same behavior as flink on Yarn? We do not need user jar in k8s per job mode, so using the `flink run` to start per job cluster will be strange. {code:java} // detach, DeployJobCluster() Use the jar in the image, not in the cli. ./bin/flink run -d -m kubernetes-cluster ./examples/batch/WordCount.jar // attach, DeploySessionCluster() ./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar{code} > Implementation plan Let's focus on current design and move the production optimization to phase2. 
I have created another umbrella jira to track. Also we need more feedback from other users to improve the active kubernetes integration after phase1. I will attach the PRs in the next few days. [~felixzheng] [~trohrmann] How do you think? > Active Kubernetes integration > - > > Key: FLINK-9953 > URL: https://issues.apache.org/jira/browse/FLINK-9953 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Till Rohrmann >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is the umbrella issue tracking Flink's active Kubernetes integration. > Active means in this context that the {{ResourceManager}} can talk to > Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. > The Phase 1 implementation will provide the complete functionality needed to run Flink on > Kubernetes. Phase 2 is mainly focused on production optimization, including > K8s-native high availability, storage, network, log collection, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
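The init-container approach described in the comment above can be sketched as a small entrypoint script. This is only an illustration, not an existing Flink script: the artifact URL, target directory and jar name are assumptions, and in practice the init container would share the download directory with the JobManager/TaskManager container through a volume.
{code:bash}
#!/bin/sh
# Hypothetical init-container entrypoint: localize job artifacts before the
# Flink JobManager/TaskManager container starts. All paths and URLs are examples.
set -e

ARTIFACT_URL="http://artifact-server.example.com/jobs/word-count.jar"  # assumed HTTP location
TARGET_DIR="/opt/flink/usrlib"                                         # assumed shared volume mount

mkdir -p "${TARGET_DIR}"
# Fetch the user jar (and any other dependencies) from shared storage.
curl -fsSL "${ARTIFACT_URL}" -o "${TARGET_DIR}/word-count.jar"

# For HDFS-hosted dependencies, a Hadoop client could be used instead, e.g.:
# hadoop fs -copyToLocal hdfs:///flink/jobs/word-count.jar "${TARGET_DIR}/"
{code}
With the artifacts localized this way, the main container only needs a plain Flink distribution image, which matches the `MiniDispatcher`/`ClassPathJobGraphRetriever` flow described above.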
[jira] [Commented] (FLINK-9953) Active Kubernetes integration
[ https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955141#comment-16955141 ] Yang Wang commented on FLINK-9953: -- > Perjob cluster mode I suggest building the user image with the required dependencies for per-job mode. Actually, the standalone job cluster already works this way, and many companies have used this approach in production. To handle dynamic dependency management, we could add an init container that runs before the JM and TM pods start. The init container could download the jars and other files from an HTTP server, HDFS or other shared storage. This makes the Flink application more Kubernetes-native. In this way, the `MiniDispatcher` and `ClassPathJobGraphRetriever` are enough for per-job mode. The two-part submission is more like starting a session cluster to simulate a per-job cluster, so we would need a new dispatcher that accepts a job via REST and allows only one job. Maybe we could support this in the future, but it needs more discussion. > Submission cli Currently `flink run` only supports detached mode for per-job clusters on Yarn. In attached mode, we use a session cluster to simulate a per-job cluster for multi-part programs. Do we need to keep the same behavior as Flink on Yarn? We do not need a user jar in K8s per-job mode, so using `flink run` to start a per-job cluster would be strange. {code:java} // detach, DeployJobCluster() Use the jar in the image, not in the cli. ./bin/flink run -d -m kubernetes-cluster ./examples/batch/WordCount.jar // attach, DeploySessionCluster() ./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar{code} > Implementation plan Let's focus on the current design and move the production optimization to phase 2. I have created another umbrella JIRA to track it. We also need more feedback from other users to improve the active Kubernetes integration after phase 1. I will attach the PRs in the next few days. [~felixzheng] [~trohrmann] What do you think? > Active Kubernetes integration > - > > Key: FLINK-9953 > URL: https://issues.apache.org/jira/browse/FLINK-9953 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Till Rohrmann >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is the umbrella issue tracking Flink's active Kubernetes integration. > Active means in this context that the {{ResourceManager}} can talk to > Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. > The Phase 1 implementation will provide the complete functionality needed to run Flink on > Kubernetes. Phase 2 is mainly focused on production optimization, including > K8s-native high availability, storage, network, log collection, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module build hang problem
flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941#issuecomment-544126860 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit d14be543d3faf99a424a5480fda53ca7b7a65e49 (Sat Oct 19 10:44:34 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-14459: --- Labels: pull-request-available (was: ) > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.10.0, 1.9.1 >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > > The build of the python module hangs when installing conda. See the Travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it either on my local mac or on my own repo with Travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] hequn8128 opened a new pull request #9941: [FLINK-14459][python] Fix python module build hang problem
hequn8128 opened a new pull request #9941: [FLINK-14459][python] Fix python module build hang problem URL: https://github.com/apache/flink/pull/9941 ## What is the purpose of the change This pull request is a hotfix for the build failure on master and release-1.9. The problem is caused by the latest conda installer: https://github.com/conda/conda/issues/9345. In this PR, we pin a stable installer version instead of using the latest one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
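A minimal sketch of the approach described in the PR, assuming the goal is simply to pin the Miniconda installer to a known-good release instead of always fetching the latest build; the version number and URL below are placeholders, not necessarily the exact change made in #9941.
{code:bash}
#!/bin/sh
# Sketch: download a pinned Miniconda installer instead of the "latest" build,
# which changes silently over time and caused the hang described in FLINK-14459.
set -e

MINICONDA_VERSION="4.7.10"   # placeholder; any release known to work can be pinned here
INSTALLER="Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh"

wget "https://repo.continuum.io/miniconda/${INSTALLER}" -O miniconda.sh
sh miniconda.sh -b -p "${HOME}/miniconda"
{code}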
[jira] [Commented] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955132#comment-16955132 ] Hequn Cheng commented on FLINK-14459: - It seems to be a problem with the latest conda installer: https://github.com/conda/conda/issues/9345 > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.10.0, 1.9.1 >Reporter: Hequn Cheng >Priority: Major > > The build of the python module hangs when installing conda. See the Travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it either on my local mac or on my own repo with Travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-14459) Python module build hangs
[ https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng reassigned FLINK-14459: --- Assignee: Hequn Cheng > Python module build hangs > - > > Key: FLINK-14459 > URL: https://issues.apache.org/jira/browse/FLINK-14459 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0, 1.10.0, 1.9.1 >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > > The build of the python module hangs when installing conda. See the Travis log: > https://api.travis-ci.org/v3/job/599704570/log.txt > Can't reproduce it either on my local mac or on my own repo with Travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14459) Python module build hangs
Hequn Cheng created FLINK-14459: --- Summary: Python module build hangs Key: FLINK-14459 URL: https://issues.apache.org/jira/browse/FLINK-14459 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.9.0, 1.10.0, 1.9.1 Reporter: Hequn Cheng The build of the python module hangs when installing conda. See the Travis log: https://api.travis-ci.org/v3/job/599704570/log.txt Can't reproduce it either on my local mac or on my own repo with Travis. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955110#comment-16955110 ] Arvid Heise edited comment on FLINK-14370 at 10/19/19 8:32 AM: --- We found the root cause and are currently discussing fixes as this is a non-trivial interplay of components. If you are in need of a quick (and dirty) fix: [https://github.com/apache/flink/pull/9918/commits/921ef31baa96bfc7c0629854104515dd856a6d29|https://github.com/apache/flink/pull/9918/commits/6a79fb8e9272b5d56ecb286634170c72403c751e] was (Author: arvid.he...@gmail.com): We found the root cause and are currently discussing fixes as this is a non-trivial interplay of components. If you are in need of a quick (and dirty) fix: https://github.com/apache/flink/pull/9918/commits/921ef31baa96bfc7c0629854104515dd856a6d29 > KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink > fails on Travis > --- > > Key: FLINK-14370 > URL: https://issues.apache.org/jira/browse/FLINK-14370 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.10.0 >Reporter: Till Rohrmann >Assignee: Jiangjie Qin >Priority: Critical > Labels: pull-request-available, test-stability > Time Spent: 10m > Remaining Estimate: 0h > > The > {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}} > fails on Travis with > {code} > Test > testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase) > failed with: > java.lang.AssertionError: Job should fail! > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280) > at > org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBo
[jira] [Comment Edited] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955110#comment-16955110 ] Arvid Heise edited comment on FLINK-14370 at 10/19/19 8:30 AM: --- We found the root cause and are currently discussing fixes as this is a non-trivial interplay of components. If you are in need of a quick (and dirty) fix: https://github.com/apache/flink/pull/9918/commits/921ef31baa96bfc7c0629854104515dd856a6d29 was (Author: arvid.he...@gmail.com): We found the root cause and are currently discussing fixes as this is a non-trivial interplay of components. > KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink > fails on Travis > --- > > Key: FLINK-14370 > URL: https://issues.apache.org/jira/browse/FLINK-14370 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.10.0 >Reporter: Till Rohrmann >Assignee: Jiangjie Qin >Priority: Critical > Labels: pull-request-available, test-stability > Time Spent: 10m > Remaining Estimate: 0h > > The > {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}} > fails on Travis with > {code} > Test > testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase) > failed with: > java.lang.AssertionError: Job should fail! > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280) > at > org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} > https://api.travis-ci.com/v3/job/244297223/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955110#comment-16955110 ] Arvid Heise commented on FLINK-14370: - We found the root cause and are currently discussing fixes as this is a non-trivial interplay of components. > KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink > fails on Travis > --- > > Key: FLINK-14370 > URL: https://issues.apache.org/jira/browse/FLINK-14370 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.10.0 >Reporter: Till Rohrmann >Assignee: Jiangjie Qin >Priority: Critical > Labels: pull-request-available, test-stability > Time Spent: 10m > Remaining Estimate: 0h > > The > {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}} > fails on Travis with > {code} > Test > testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase) > failed with: > java.lang.AssertionError: Job should fail! > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280) > at > org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} > https://api.travis-ci.com/v3/job/244297223/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031 ## CI report: * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/131984556) * 1559637102832603a0dc0d09ab730e00f2e9d224 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132628239) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-9953) Active Kubernetes integration
[ https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955105#comment-16955105 ] Canbin Zheng commented on FLINK-9953: - [~trohrmann] Thanks very much for taking the time to read my design doc. I think this is something really great to have. We plan to run Flink natively on Kubernetes in production shortly; our clusters could be massive and may rapidly grow in size and number. Since this is a significant feature and the future maintenance and enhancement work could be much larger than the initial commit, I have some suggestions for moving it forward. # We can further discuss the design proposals to reach a general consensus, especially on resource resolution, managed pod lifecycle management, the worker store, garbage collection and user-oriented interfaces (shell scripts). After that we can merge the current design docs into a final one; maybe we need a new FLIP too. # Take some time to list the small features we want and plan the implementation incrementally in phases. It is better to keep the initial commit as small as possible to ease code review and help the initial version get merged into a Flink release more quickly; then we can iteratively improve it according to those plans and the feedback from users. > Active Kubernetes integration > - > > Key: FLINK-9953 > URL: https://issues.apache.org/jira/browse/FLINK-9953 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Till Rohrmann >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is the umbrella issue tracking Flink's active Kubernetes integration. > Active means in this context that the {{ResourceManager}} can talk to > Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. > The Phase 1 implementation will provide the complete functionality needed to run Flink on > Kubernetes. Phase 2 is mainly focused on production optimization, including > K8s-native high availability, storage, network, log collection, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031 ## CI report: * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/131984556) * 1559637102832603a0dc0d09ab730e00f2e9d224 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132628239) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031 ## CI report: * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/131984556) * 1559637102832603a0dc0d09ab730e00f2e9d224 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-9953) Active Kubernetes integration
[ https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955099#comment-16955099 ] Canbin Zheng commented on FLINK-9953: - [~fly_in_gis] Thanks a lot for your quick response; I am very glad to have the chance to work with you on this exciting feature. > I think in the per-job cluster in flink means it is dedicated cluster for > only one job and will not accept more other jobs. Yep, you are right; a per-job cluster is always a dedicated cluster for only one job. One option mentioned in my design doc is to split the deploying step into two parts: * Part one, deploy a cluster without a JobGraph attached; * Part two, submit a job to that cluster via `RestClusterClient` and shut down the cluster when the job finishes. For this idea, some new Kubernetes-dedicated specializations, such as `KubernetesMiniDispatcher` and `KubernetesSingleJobGraphStore`, will be introduced to support the new per-job cluster workflow. Neither `KubernetesMiniDispatcher` nor `KubernetesSingleJobGraphStore` needs a JobGraph when we construct them; they accept exactly one `JobGraph` after they are instantiated. So this solution does not change the current definition of a per-job cluster. It is slightly customised, but at a higher level I think it provides a possible way to unify the deployment process of session and per-job clusters. In addition to the previous option, we have other options for solving dynamic dependency management for a per-job cluster. As a prerequisite, we upload the locally-hosted dependencies to a Hadoop Compatible File System, referred to as HCFS, which is accessible to the Flink cluster. Then we fetch those dependencies for the job to run; there are at least two solutions to achieve this. * Solution One Download dependencies from HCFS after starting the JM(s) or TMs. 1. The JM localizes those dependencies by downloading them when a JobManagerImpl is instantiated. 2. TMs fetch those dependencies when Task#run() is invoked. * Solution Two Download dependencies before starting the JM(s) or TMs by utilizing a Kubernetes feature known as init containers. Init containers always run to completion before the main container is started and are typically used to handle initialization work for the primary containers. > The users could put their jars in the image. And the > `ClassPathJobGraphRetriever` will be used to generate and retrieve the job > graph in flink master pod. This is a straightforward workflow: we build an image containing all the necessary application resources, such as application code, input files, etc., and then run the application entirely from that image. Many applications work this way in the Kubernetes ecosystem, so we can add support for this use case. But some dependencies may not be known at image build time, may be too large to be baked into a container image, or may need frequent changes for new business scenarios. For these cases, I propose using a standard image with the Flink distribution and supplying dependencies at runtime; as described above, we have several ways to support dynamic dependencies. > So i suggest to add kubernetes-job.sh to start per-job cluster. It will not > need a user jar as required argument. `flink run` could be used to submitted > a flink job to existed session. We can make changes to the existing `flink` shell script to meet this requirement; it is better not to introduce another dedicated kubernetes-job.sh just to start a per-job cluster on Kubernetes. 
> Active Kubernetes integration > - > > Key: FLINK-9953 > URL: https://issues.apache.org/jira/browse/FLINK-9953 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Till Rohrmann >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is the umbrella issue tracking Flink's active Kubernetes integration. > Active means in this context that the {{ResourceManager}} can talk to > Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. > The Phase 1 implementation will provide the complete functionality needed to run Flink on > Kubernetes. Phase 2 is mainly focused on production optimization, including > K8s-native high availability, storage, network, log collection, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
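The two-part per-job deployment discussed above can be sketched at the command line as follows. Neither the session bootstrap script nor the option and address names are real at this point; they are placeholders for whatever the Kubernetes integration eventually exposes, while part two reuses the existing REST-based submission path of `flink run`.
{code:bash}
#!/bin/sh
# Sketch of the proposed two-part per-job deployment on Kubernetes.
# Script names, options and addresses below are illustrative placeholders.

# Part one: deploy an "empty" cluster, i.e. without a JobGraph attached.
./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-per-job-cluster

# Part two: submit exactly one job to that cluster through its REST endpoint;
# the cluster is shut down once this single job finishes.
./bin/flink run -m my-per-job-cluster-rest.example.com:8081 ./examples/batch/WordCount.jar
{code}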