[jira] [Closed] (FLINK-12677) Add descriptor, validator, and factory for HiveCatalog
[ https://issues.apache.org/jira/browse/FLINK-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li closed FLINK-12677. Resolution: Fixed. Merged in 1.9.0: 80d62e2d7c96b0ec65b01d525f3eb48e58446576 > Add descriptor, validator, and factory for HiveCatalog > -- > > Key: FLINK-12677 > URL: https://issues.apache.org/jira/browse/FLINK-12677 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API, Table SQL / Client >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12625) Support catalogs in SQL Client
[ https://issues.apache.org/jira/browse/FLINK-12625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12625: --- Labels: pull-request-available (was: ) > Support catalogs in SQL Client > -- > > Key: FLINK-12625 > URL: https://issues.apache.org/jira/browse/FLINK-12625 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > > An umbrella ticket for making SQL Client work with Catalog APIs
[GitHub] [flink] asfgit closed pull request #8589: [FLINK-12625][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
asfgit closed pull request #8589: [FLINK-12625][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] bowenli86 commented on issue #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
bowenli86 commented on issue #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#issuecomment-500072991 Thanks. Merging
[jira] [Commented] (FLINK-11105) Add a new implementation of the HighAvailabilityServices using etcd
[ https://issues.apache.org/jira/browse/FLINK-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859030#comment-16859030 ] Nathan Howell commented on FLINK-11105: --- Regarding Kubernetes support, managed Kubernetes offerings such as GKE and EKS do not expose etcd. The same functionality can be implemented using only Kubernetes APIs - a mix of coordination/v1beta1 Lease and ConfigMap resources, or purely with ConfigMaps on older versions of Kubernetes... I think Lease was introduced in 1.13 or 1.14. Atomic replace operations and polling are sufficient to implement cooperative leader election, checkpoint counters, etc. > Add a new implementation of the HighAvailabilityServices using etcd > --- > > Key: FLINK-11105 > URL: https://issues.apache.org/jira/browse/FLINK-11105 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Yang Wang >Assignee: MalcolmSanders >Priority: Major > > In Flink, we use HighAvailabilityServices for many things, e.g. RM/JM > leader election and retrieval. ZooKeeperHaServices is an implementation of > HighAvailabilityServices using Apache ZooKeeper, and it is very easy to integrate > with the Hadoop ecosystem. However, cloud-native and microservice architectures are becoming > more and more popular, and we should follow that trend by adding a new > implementation, EtcdHaService, using etcd. > Flink already supports running a StandaloneSession on Kubernetes, and FLINK-9953 > has started work on a native integration with Kubernetes. If we have the > EtcdHaService, both will benefit from it and we will not have to > deploy a ZooKeeper service on the Kubernetes cluster.
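The atomic-replace-plus-polling scheme Nathan describes can be sketched without any Kubernetes client at all. The following is a minimal, hypothetical simulation in Python: the `FakeConfigMapStore` class stands in for a ConfigMap with `resourceVersion`-based optimistic concurrency, and the field name `"holder"` is an illustrative annotation key, not a real Kubernetes or Flink identifier.

```python
import threading

class FakeConfigMapStore:
    """In-memory stand-in for a ConfigMap with optimistic concurrency:
    a write succeeds only if the caller presents the current resourceVersion."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}            # annotations, e.g. {"holder": ...}
        self._resource_version = 0

    def get(self):
        with self._lock:
            return dict(self._data), self._resource_version

    def replace(self, data, expected_version):
        # Atomic compare-and-replace: the core primitive leader election needs.
        with self._lock:
            if expected_version != self._resource_version:
                return False  # somebody else won the race; caller must retry
            self._data = dict(data)
            self._resource_version += 1
            return True

def try_acquire_leadership(store, candidate):
    """One round of cooperative leader election: read, then atomically claim
    the lock annotation if it is still unclaimed."""
    data, version = store.get()
    if data.get("holder"):
        return data["holder"] == candidate  # already held (possibly by us)
    data["holder"] = candidate
    return store.replace(data, version)

store = FakeConfigMapStore()
results = {c: try_acquire_leadership(store, c) for c in ["jm-1", "jm-2", "jm-3"]}
leaders = [c for c, won in results.items() if won]
print(leaders)  # exactly one candidate becomes leader
```

A real implementation would also renew the claim periodically and let followers poll for expiry, but the compare-and-replace primitive above is all that the consistency argument rests on.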
[jira] [Updated] (FLINK-12781) REST handler should return full stack trace instead of just error msg
[ https://issues.apache.org/jira/browse/FLINK-12781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Zhen Wu updated FLINK-12781: --- Summary: REST handler should return full stack trace instead of just error msg (was: run job REST api doesn't return complete stack trace for start job failure) > REST handler should return full stack trace instead of just error msg > - > > Key: FLINK-12781 > URL: https://issues.apache.org/jira/browse/FLINK-12781 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Steven Zhen Wu >Priority: Major > > We use the REST API to start a job in a Flink cluster. > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run > When there is an exception during job construction, the response payload doesn't > contain the full stack trace. > {code} > {"errors":["org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error."]} > {code} > This problem became more serious after FLINK-11134 was released in 1.7.2, > because the stack trace is completely lost now. FLINK-11134 is doing the right > thing; we just need the response payload to contain the full stack trace. > This seems to have been an issue since 1.5: we know 1.4 doesn't have > this issue and correctly returns the full stack trace. > In the jobmanager log, we only get > {code} > 2019-06-07 17:42:40,136 ERROR > org.apache.flink.runtime.webmonitor.handlers.JarRunHandler- Exception > occurred in REST handler: > org.apache.flink.client.program.ProgramInvocationException: The main method > caused an error. > {code}
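The fix Steven is asking for amounts to serializing the whole exception chain rather than only the top-level message. A hedged sketch of the difference (not Flink's actual handler code; `error_payload` is an illustrative helper) in Python:

```python
import json
import traceback

def error_payload(exc, full_trace=True):
    """Build a REST error response body. With full_trace=False only the
    message survives (the behavior complained about); with full_trace=True
    the payload carries the complete stack trace, including causes."""
    if full_trace:
        text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    else:
        text = str(exc)
    return json.dumps({"errors": [text]})

try:
    try:
        raise ValueError("bad job graph")          # stand-in for the root cause
    except ValueError as cause:
        raise RuntimeError("The main method caused an error.") from cause
except RuntimeError as e:
    short = error_payload(e, full_trace=False)
    full = error_payload(e, full_trace=True)

print("bad job graph" in short)  # False: the root cause is lost
print("bad job graph" in full)   # True: the full chain is preserved
```

The `full_trace=False` branch mirrors the 1.7.2 behavior, where only the wrapping `ProgramInvocationException` message reaches the client and the underlying cause is invisible.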
[GitHub] [flink] xuefuz commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
xuefuz commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#discussion_r291770702 ## File path: flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/table/catalog/hive/HiveTestUtils.java ## @@ -36,25 +37,34 @@ /** * Create a HiveCatalog with an embedded Hive Metastore. */ - public static HiveCatalog createHiveCatalog() throws IOException { - return new HiveCatalog(CatalogTestBase.TEST_CATALOG_NAME, null, getHiveConf()); + public static HiveCatalog createHiveCatalog() { + return createHiveCatalog(CatalogTestBase.TEST_CATALOG_NAME); + } + + public static HiveCatalog createHiveCatalog(String catalogName) { + return new HiveCatalog(catalogName, null, getHiveConf()); } public static HiveCatalog createHiveCatalog(HiveConf hiveConf) { return new HiveCatalog(CatalogTestBase.TEST_CATALOG_NAME, null, hiveConf); } - public static HiveConf getHiveConf() throws IOException { + public static HiveConf getHiveConf() { Review comment: rename to createHiveConf?
[jira] [Commented] (FLINK-12646) Fix broken tests of RestClientTest
[ https://issues.apache.org/jira/browse/FLINK-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859022#comment-16859022 ] Victor Wong commented on FLINK-12646: - I'd love to, [https://github.com/apache/flink/pull/8663] > Fix broken tests of RestClientTest > -- > > Key: FLINK-12646 > URL: https://issues.apache.org/jira/browse/FLINK-12646 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Reporter: Victor Wong >Assignee: Victor Wong >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In > {code:java} > org.apache.flink.runtime.rest.RestClientTest#testConnectionTimeout > {code} > we use an "unroutableIp" with a value of "10.255.255.1" for testing. > But sometimes this IP is reachable inside a company's private network, which > is the case for me. As a result, this test fails with the following exception: > {code:java} > java.lang.AssertionError: Expected: an instance of > org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException but: > Connection refused: /10.255.255.1:80> is a > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.junit.Assert.assertThat(Assert.java:956) at > org.junit.Assert.assertThat(Assert.java:923) at > org.apache.flink.runtime.rest.RestClientTest.testConnectionTimeout(RestClientTest.java:76) > ... > {code} > Can we change the `unroutableIp` to a reserved IP address, e.g. "240.0.0.0", > which is described as _Reserved for future use_ on > [wikipedia|https://en.wikipedia.org/wiki/Reserved_IP_addresses]? > Or change the assertion?
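Victor's distinction between the two addresses is easy to verify with the Python standard library: 10.255.255.1 is merely RFC 1918 private space (and therefore often reachable inside a corporate network), while 240.0.0.0 falls in the IETF-reserved 240.0.0.0/4 block that should never be routed.

```python
import ipaddress

old_ip = ipaddress.ip_address("10.255.255.1")
new_ip = ipaddress.ip_address("240.0.0.0")

# 10.255.255.1 is private, not reserved: unroutable on the public internet,
# but perfectly reachable inside a company's private network.
print(old_ip.is_private, old_ip.is_reserved)           # True False

# 240.0.0.0/4 is "reserved for future use" and should never be routed.
print(new_ip.is_reserved)                              # True
print(new_ip in ipaddress.ip_network("240.0.0.0/4"))   # True
```

This is why the test's `ConnectTimeoutException` expectation holds for 240.0.0.0 but can fail with `Connection refused` for 10.255.255.1 on some networks.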
[GitHub] [flink] xuefuz commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
xuefuz commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#discussion_r291770433 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java ## @@ -114,46 +114,22 @@ private HiveMetastoreClientWrapper client; - public HiveCatalog(String catalogName, @Nullable String defaultDatabase, @Nullable String hiveSiteFilePath) { - this(catalogName, - defaultDatabase == null ? DEFAULT_DB : defaultDatabase, - getHiveConf(loadHiveSiteUrl(hiveSiteFilePath))); - } - public HiveCatalog(String catalogName, @Nullable String defaultDatabase, @Nullable URL hiveSiteUrl) { this(catalogName, defaultDatabase == null ? DEFAULT_DB : defaultDatabase, getHiveConf(hiveSiteUrl)); } - public HiveCatalog(String catalogName, @Nullable String defaultDatabase, @Nullable HiveConf hiveConf) { + @VisibleForTesting + protected HiveCatalog(String catalogName, String defaultDatabase, HiveConf hiveConf) { super(catalogName, defaultDatabase == null ? DEFAULT_DB : defaultDatabase); this.hiveConf = hiveConf == null ? getHiveConf(null) : hiveConf; LOG.info("Created HiveCatalog '{}'", catalogName); } - private static URL loadHiveSiteUrl(String filePath) { - - URL url = null; - - if (!StringUtils.isNullOrWhitespaceOnly(filePath)) { - try { - url = new File(filePath).toURI().toURL(); - - LOG.info("Successfully loaded '{}'", filePath); - - } catch (MalformedURLException e) { - throw new CatalogException( - String.format("Failed to get hive-site.xml from the given path '%s'", filePath), e); - } - } - - return url; - } - - private static HiveConf getHiveConf(URL hiveSiteUrl) { + public static HiveConf getHiveConf(URL hiveSiteUrl) { Review comment: 1. This doesn't seem to need to be public. 2. Maybe rename it to createHiveConf.
[GitHub] [flink] flinkbot commented on issue #8663: [FLINK-12646][runtime] Change the test IP of RestClientTest to 240.0.0.0
flinkbot commented on issue #8663: [FLINK-12646][runtime] Change the test IP of RestClientTest to 240.0.0.0 URL: https://github.com/apache/flink/pull/8663#issuecomment-500056494 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-12646) Fix broken tests of RestClientTest
[ https://issues.apache.org/jira/browse/FLINK-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12646: --- Labels: pull-request-available (was: ) > Fix broken tests of RestClientTest > -- > > Key: FLINK-12646 > URL: https://issues.apache.org/jira/browse/FLINK-12646 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Reporter: Victor Wong >Assignee: Victor Wong >Priority: Minor > Labels: pull-request-available > > In > {code:java} > org.apache.flink.runtime.rest.RestClientTest#testConnectionTimeout > {code} > we use an "unroutableIp" with a value of "10.255.255.1" for testing. > But sometimes this IP is reachable inside a company's private network, which > is the case for me. As a result, this test fails with the following exception: > {code:java} > java.lang.AssertionError: Expected: an instance of > org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException but: > Connection refused: /10.255.255.1:80> is a > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.junit.Assert.assertThat(Assert.java:956) at > org.junit.Assert.assertThat(Assert.java:923) at > org.apache.flink.runtime.rest.RestClientTest.testConnectionTimeout(RestClientTest.java:76) > ... > {code} > Can we change the `unroutableIp` to a reserved IP address, e.g. "240.0.0.0", > which is described as _Reserved for future use_ on > [wikipedia|https://en.wikipedia.org/wiki/Reserved_IP_addresses]? > Or change the assertion?
[GitHub] [flink] jiasheng55 opened a new pull request #8663: [FLINK-12646][runtime] Change the test IP of RestClientTest to 240.0.0.0
jiasheng55 opened a new pull request #8663: [FLINK-12646][runtime] Change the test IP of RestClientTest to 240.0.0.0 URL: https://github.com/apache/flink/pull/8663 ## What is the purpose of the change Fix broken tests of RestClientTest ## Brief change log Change the test IP of RestClientTest from 10.255.255.1 to 240.0.0.0. ## Verifying this change This change is already covered by existing tests, such as `org.apache.flink.runtime.rest.RestClientTest#testConnectionTimeout`. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable)
[GitHub] [flink] xuefuz commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
xuefuz commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#discussion_r291769731 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/descriptors/HiveCatalogDescriptor.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.catalog.hive.descriptors; + +import org.apache.flink.table.catalog.hive.HiveCatalog; +import org.apache.flink.table.descriptors.CatalogDescriptor; +import org.apache.flink.table.descriptors.DescriptorProperties; +import org.apache.flink.util.Preconditions; +import org.apache.flink.util.StringUtils; + +import java.util.Map; + +import static org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator.CATALOG_HIVE_SITE_PATH; +import static org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator.CATALOG_TYPE_VALUE_HIVE; + +/** + * Catalog descriptor for {@link HiveCatalog}. 
+ */ +public class HiveCatalogDescriptor extends CatalogDescriptor { + + private String hiveSitePath; + + // TODO : set default database + public HiveCatalogDescriptor() { + super(CATALOG_TYPE_VALUE_HIVE, 1); + } + + public HiveCatalogDescriptor hiveSitePath(String hiveSitePath) { + Preconditions.checkArgument(!StringUtils.isNullOrWhitespaceOnly(hiveSitePath)); + this.hiveSitePath = hiveSitePath; + + return this; + } + + @Override + protected Map toCatalogProperties() { + final DescriptorProperties properties = new DescriptorProperties(); + + if (hiveSitePath != null) { Review comment: ok.
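The descriptor under review follows a round-trip pattern: a builder-style descriptor collects typed configuration, flattens it to string properties, and a factory on the other side matches on the catalog type and rebuilds the catalog. A hypothetical sketch of that flow (class shapes mirror the Java code above, but the property keys `"type"` and `"hive-site-path"` are illustrative, not Flink's actual constants):

```python
class HiveCatalogDescriptor:
    """Builder-style descriptor: collects config, flattens to string props."""
    def __init__(self):
        # Mirrors super(CATALOG_TYPE_VALUE_HIVE, 1): a type tag plus a
        # property-version for future format evolution.
        self._props = {"type": "hive", "property-version": "1"}

    def hive_site_path(self, path):
        if not path or path.isspace():
            raise ValueError("hive-site path must be non-empty")
        self._props["hive-site-path"] = path
        return self  # chainable, like the Java descriptor

    def to_properties(self):
        return dict(self._props)

class HiveCatalogFactory:
    """Matches on the type property and rebuilds the catalog from strings."""
    def matches(self, props):
        return props.get("type") == "hive"

    def create_catalog(self, name, props):
        # A real factory would run a validator over the properties first,
        # then construct a HiveCatalog; here we just echo its inputs.
        return {"name": name, "hive_site": props.get("hive-site-path")}

props = HiveCatalogDescriptor().hive_site_path("/etc/hive/conf/hive-site.xml").to_properties()
factory = HiveCatalogFactory()
assert factory.matches(props)
catalog = factory.create_catalog("myhive", props)
print(catalog["hive_site"])  # /etc/hive/conf/hive-site.xml
```

The flat string map is the point of the design: it is what the SQL Client can read from a YAML config file and ship across the discovery boundary without compile-time knowledge of Hive classes.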
[jira] [Updated] (FLINK-12781) run job REST api doesn't return complete stack trace for start job failure
[ https://issues.apache.org/jira/browse/FLINK-12781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Zhen Wu updated FLINK-12781: --- Description: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. We just need the response payload to contain the full stack trace, which seems to have been an issue/fix since 1.5. we know 1.4 doesn't have this issue and correctly return full stack trace on the jobmanager log, we only get {code} 2019-06-07 17:42:40,136 ERROR org.apache.flink.runtime.webmonitor.handlers.JarRunHandler- Exception occurred in REST handler: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error. {code} was: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. We just need the response payload to contain the full stack trace, which seems to have been an issue/fix since 1.5. 
we know 1.4 doesn't have this issue and correctly return full stack trace > run job REST api doesn't return complete stack trace for start job failure > -- > > Key: FLINK-12781 > URL: https://issues.apache.org/jira/browse/FLINK-12781 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Steven Zhen Wu >Priority: Major > > We use REST api to start a job in Flink cluster. > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run > When there is exception during job construction, the response payload doesn't > contain the full stack trace. > {code} > {"errors":["org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error."]} > {code} > This problem becomes more serious after FLINK-11134 got released in 1.7.2, > because stack trace is completely lost now. FLINK-11134 is doing the right > thing. We just need the response payload to contain the full stack trace, > which seems to have been an issue/fix since 1.5. we know 1.4 doesn't have > this issue and correctly return full stack trace > on the jobmanager log, we only get > {code} > 2019-06-07 17:42:40,136 ERROR > org.apache.flink.runtime.webmonitor.handlers.JarRunHandler- Exception > occurred in REST handler: > org.apache.flink.client.program.ProgramInvocationException: The main method > caused an error. > {code}
[GitHub] [flink] jiasheng55 commented on issue #8650: [hotfix][runtime] Fix the typo issue
jiasheng55 commented on issue #8650: [hotfix][runtime] Fix the typo issue URL: https://github.com/apache/flink/pull/8650#issuecomment-500053157 Thanks, I'll pay attention to that next time.
[jira] [Updated] (FLINK-12781) run job REST api doesn't return complete stack trace for start job failure
[ https://issues.apache.org/jira/browse/FLINK-12781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Zhen Wu updated FLINK-12781: --- Description: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. We just need the response payload to contain the full stack trace, which seems to have been an issue/fix since 1.5. we know 1.4 doesn't have this issue and correctly return full stack trace was: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. We just need the response payload to contain the full stack trace, which has always been an issue/fix. > run job REST api doesn't return complete stack trace for start job failure > -- > > Key: FLINK-12781 > URL: https://issues.apache.org/jira/browse/FLINK-12781 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Steven Zhen Wu >Priority: Major > > We use REST api to start a job in Flink cluster. 
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run > When there is exception during job construction, the response payload doesn't > contain the full stack trace. > {code} > {"errors":["org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error."]} > {code} > This problem becomes more serious after FLINK-11134 got released in 1.7.2, > because stack trace is completely lost now. FLINK-11134 is doing the right > thing. We just need the response payload to contain the full stack trace, > which seems to have been an issue/fix since 1.5. we know 1.4 doesn't have > this issue and correctly return full stack trace
[jira] [Created] (FLINK-12782) Show formatted timestamp in Watermark Tab of JM Web UI
Felix Wollschläger created FLINK-12782: -- Summary: Show formatted timestamp in Watermark Tab of JM Web UI Key: FLINK-12782 URL: https://issues.apache.org/jira/browse/FLINK-12782 Project: Flink Issue Type: Improvement Components: Runtime / Web Frontend Reporter: Felix Wollschläger In the Watermarks Tab of a Flink-Job in the Web UI, show the timestamps as a formatted Date.
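Felix's request is essentially a presentation-layer change: the watermark value shipped to the UI is an epoch-millisecond long (with Java's Long.MIN_VALUE conventionally meaning that no watermark has been emitted yet), so the frontend only needs a small formatting step. A hedged sketch of that conversion, with the output format and "No Watermark" label chosen for illustration:

```python
from datetime import datetime, timezone

LONG_MIN = -(2**63)  # Java Long.MIN_VALUE, used when no watermark exists yet

def format_watermark(millis):
    """Render an epoch-millisecond watermark as a human-readable UTC date."""
    if millis == LONG_MIN:
        return "No Watermark"
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S.%f")[:-3]  # trim microseconds to milliseconds

print(format_watermark(LONG_MIN))  # No Watermark
print(format_watermark(0))         # 1970-01-01 00:00:00.000
```

Keeping millisecond precision matters here, since watermarks from adjacent subtasks often differ by only a few milliseconds.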
[jira] [Updated] (FLINK-12781) run job REST api doesn't return complete stack trace for start job failure
[ https://issues.apache.org/jira/browse/FLINK-12781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Zhen Wu updated FLINK-12781: --- Description: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. We just need the response payload to contain the full stack trace, which has always been an issue/fix. was: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. But we need the response payload to contain the full stack trace. > run job REST api doesn't return complete stack trace for start job failure > -- > > Key: FLINK-12781 > URL: https://issues.apache.org/jira/browse/FLINK-12781 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Steven Zhen Wu >Priority: Major > > We use REST api to start a job in Flink cluster. > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run > When there is exception during job construction, the response payload doesn't > contain the full stack trace. 
> {code} > {"errors":["org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error."]} > {code} > This problem becomes more serious after FLINK-11134 got released in 1.7.2, > because stack trace is completely lost now. FLINK-11134 is doing the right > thing. We just need the response payload to contain the full stack trace, > which has always been an issue/fix.
[jira] [Updated] (FLINK-12781) run job REST api doesn't return complete stack trace for start job failure
[ https://issues.apache.org/jira/browse/FLINK-12781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Zhen Wu updated FLINK-12781: --- Description: We use REST api to start a job in Flink cluster. [https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run] When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. But we need the response payload to contain the full stack trace. was: We use REST api to start a job in Flink cluster. [https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run] When there is exception during job construction, the response payload doesn't contain the full stack trace. ``` {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} ``` This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. But we need the response payload to contain the full stack trace. > run job REST api doesn't return complete stack trace for start job failure > -- > > Key: FLINK-12781 > URL: https://issues.apache.org/jira/browse/FLINK-12781 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Steven Zhen Wu >Priority: Major > > We use REST api to start a job in Flink cluster. > [https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run] > > > When there is exception during job construction, the response payload doesn't > contain the full stack trace. 
> {code} > {"errors":["org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error."]} > {code} > This problem becomes more serious after FLINK-11134 got released in 1.7.2, > because stack trace is completely lost now. FLINK-11134 is doing the right > thing. But we need the response payload to contain the full stack trace.
[jira] [Updated] (FLINK-12781) run job REST api doesn't return complete stack trace for start job failure
[ https://issues.apache.org/jira/browse/FLINK-12781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Zhen Wu updated FLINK-12781: --- Description: We use REST api to start a job in Flink cluster. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. But we need the response payload to contain the full stack trace. was: We use REST api to start a job in Flink cluster. [https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run] When there is exception during job construction, the response payload doesn't contain the full stack trace. {code} {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} {code} This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. But we need the response payload to contain the full stack trace. > run job REST api doesn't return complete stack trace for start job failure > -- > > Key: FLINK-12781 > URL: https://issues.apache.org/jira/browse/FLINK-12781 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Steven Zhen Wu >Priority: Major > > We use REST api to start a job in Flink cluster. > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run > When there is exception during job construction, the response payload doesn't > contain the full stack trace. 
> {code} > {"errors":["org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error."]} > {code} > This problem becomes more serious after FLINK-11134 got released in 1.7.2, > because stack trace is completely lost now. FLINK-11134 is doing the right > thing. But we need the response payload to contain the full stack trace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12781) run job REST api doesn't return complete stack trace for start job failure
Steven Zhen Wu created FLINK-12781: -- Summary: run job REST api doesn't return complete stack trace for start job failure Key: FLINK-12781 URL: https://issues.apache.org/jira/browse/FLINK-12781 Project: Flink Issue Type: Improvement Components: Runtime / REST Reporter: Steven Zhen Wu We use REST api to start a job in Flink cluster. [https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-jarid-run] When there is exception during job construction, the response payload doesn't contain the full stack trace. ``` {"errors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} ``` This problem becomes more serious after FLINK-11134 got released in 1.7.2, because stack trace is completely lost now. FLINK-11134 is doing the right thing. But we need the response payload to contain the full stack trace.
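The fix the reporter asks for amounts to serializing the full cause chain into the REST error payload, not just the top-level exception message. A minimal, Flink-independent sketch of that serialization using only the JDK (the class and method names here are hypothetical, not Flink's actual utilities):

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class StackTraceUtil {

    // Render a throwable and its full cause chain to a string,
    // as the REST error payload would need to carry it.
    public static String stringify(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        Exception cause = new IllegalStateException("job construction failed");
        Exception wrapper = new RuntimeException("The main method caused an error.", cause);
        // The string retains the root cause, unlike the truncated payload above.
        System.out.println(stringify(wrapper).contains("IllegalStateException"));
    }
}
```

`Throwable#printStackTrace(PrintWriter)` already walks `Caused by:` chains, so the handler producing the `"errors"` array would only need to substitute this string for the bare message.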
[GitHub] [flink] bowenli86 commented on issue #8214: [FLINK-11476] [table] Create CatalogManager to manage multiple catalogs and encapsulate Calcite schema
bowenli86 commented on issue #8214: [FLINK-11476] [table] Create CatalogManager to manage multiple catalogs and encapsulate Calcite schema URL: https://github.com/apache/flink/pull/8214#issuecomment-500043829 closing this as it was merged as part of https://github.com/apache/flink/pull/8404
[GitHub] [flink] bowenli86 closed pull request #8214: [FLINK-11476] [table] Create CatalogManager to manage multiple catalogs and encapsulate Calcite schema
bowenli86 closed pull request #8214: [FLINK-11476] [table] Create CatalogManager to manage multiple catalogs and encapsulate Calcite schema URL: https://github.com/apache/flink/pull/8214
[GitHub] [flink] asfgit closed pull request #8650: [hotfix][runtime] Fix the typo issue
asfgit closed pull request #8650: [hotfix][runtime] Fix the typo issue URL: https://github.com/apache/flink/pull/8650
[GitHub] [flink] bowenli86 commented on issue #8650: [hotfix][runtime] Fix the typo issue
bowenli86 commented on issue #8650: [hotfix][runtime] Fix the typo issue URL: https://github.com/apache/flink/pull/8650#issuecomment-500042428 Thanks @jiasheng55 for your contribution. LGTM, but please use a more detailed PR title and commit message next time, e.g. "fix typo in javadoc of StateSnapshotContextSynchronousImpl". I will change the title and commit message for you this time. Merging
[GitHub] [flink] bowenli86 commented on issue #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
bowenli86 commented on issue #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#issuecomment-500023058 @xuefuz as we discussed offline, I kept only one public constructor and a protected constructor for testing purposes
[GitHub] [flink] bowenli86 commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
bowenli86 commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#discussion_r291738275 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/factories/HiveCatalogFactory.java ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.catalog.hive.factories; + +import org.apache.flink.annotation.VisibleForTesting; +import org.apache.flink.table.catalog.Catalog; +import org.apache.flink.table.catalog.hive.HiveCatalog; +import org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator; +import org.apache.flink.table.descriptors.DescriptorProperties; +import org.apache.flink.table.factories.CatalogFactory; + +import org.apache.hadoop.hive.conf.HiveConf; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +import static org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator.CATALOG_HIVE_SITE_PATH; +import static org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator.CATALOG_TYPE_VALUE_HIVE; +import static org.apache.flink.table.descriptors.CatalogDescriptorValidator.CATALOG_DEFAULT_DATABASE; +import static org.apache.flink.table.descriptors.CatalogDescriptorValidator.CATALOG_PROPERTY_VERSION; +import static org.apache.flink.table.descriptors.CatalogDescriptorValidator.CATALOG_TYPE; + +/** + * Catalog factory for {@link HiveCatalog}. 
+ */ +public class HiveCatalogFactory implements CatalogFactory { + + @Override + public Map requiredContext() { + Map context = new HashMap<>(); + context.put(CATALOG_TYPE, CATALOG_TYPE_VALUE_HIVE); // hive + context.put(CATALOG_PROPERTY_VERSION, "1"); // backwards compatibility + return context; + } + + @Override + public List supportedProperties() { + List properties = new ArrayList<>(); + + // default database + properties.add(CATALOG_DEFAULT_DATABASE); + + properties.add(CATALOG_HIVE_SITE_PATH); + + return properties; + } + + @Override + public Catalog createCatalog(String name, Map properties) { + final DescriptorProperties descriptorProperties = getValidatedProperties(properties); + + final String defaultDatabase = + descriptorProperties.getOptionalString(CATALOG_DEFAULT_DATABASE) + .orElse(HiveCatalog.DEFAULT_DB); + + final Optional hiveSitePath = descriptorProperties.getOptionalString(CATALOG_HIVE_SITE_PATH); + + return new HiveCatalog(name, defaultDatabase, getHiveConf(hiveSitePath.orElse(null))); + } + + @VisibleForTesting + protected HiveConf getHiveConf(String hiveSitePath) { Review comment: removed as a result of the refactoring This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
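The factory in the diff above declares a "required context" (catalog type plus property version) that the SQL Client's discovery mechanism matches against user-supplied descriptor properties. A stdlib-only sketch of that matching idea, with literal key strings standing in for Flink's validator constants (`"type"`, `"property-version"`, and `"default-database"` are assumptions inferred from the constant names in the diff, not verified key values):

```java
import java.util.HashMap;
import java.util.Map;

public class CatalogContextSketch {

    // Stand-in for HiveCatalogFactory#requiredContext(); keys are hypothetical.
    public static Map<String, String> requiredContext() {
        Map<String, String> context = new HashMap<>();
        context.put("type", "hive");          // cf. CATALOG_TYPE_VALUE_HIVE
        context.put("property-version", "1"); // backwards compatibility
        return context;
    }

    // A factory matches when the user's properties contain
    // every required context entry with the same value.
    public static boolean matches(Map<String, String> userProps) {
        return requiredContext().entrySet().stream()
                .allMatch(e -> e.getValue().equals(userProps.get(e.getKey())));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("type", "hive");
        props.put("property-version", "1");
        props.put("default-database", "mydb"); // optional, in supportedProperties()
        System.out.println(matches(props)); // true
    }
}
```

This is why `requiredContext()` and `supportedProperties()` are separate: the former selects the factory, the latter whitelists what else the descriptor may carry.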
[GitHub] [flink] bowenli86 commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
bowenli86 commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#discussion_r291396450 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java ## @@ -132,9 +134,11 @@ public HiveCatalog(String catalogName, @Nullable String defaultDatabase, @Nullab this.hiveConf = hiveConf == null ? getHiveConf(null) : hiveConf; LOG.info("Created HiveCatalog '{}'", catalogName); + + HiveConf.setHiveSiteLocation(null); Review comment: bad merge result. removed
[GitHub] [flink] bowenli86 commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog
bowenli86 commented on a change in pull request #8589: [FLINK-12677][hive][sql-client] Add descriptor, validator, and factory for HiveCatalog URL: https://github.com/apache/flink/pull/8589#discussion_r291396683 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/descriptors/HiveCatalogDescriptor.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.catalog.hive.descriptors; + +import org.apache.flink.table.catalog.hive.HiveCatalog; +import org.apache.flink.table.descriptors.CatalogDescriptor; +import org.apache.flink.table.descriptors.DescriptorProperties; +import org.apache.flink.util.Preconditions; +import org.apache.flink.util.StringUtils; + +import java.util.Map; + +import static org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator.CATALOG_HIVE_SITE_PATH; +import static org.apache.flink.table.catalog.hive.descriptors.HiveCatalogValidator.CATALOG_TYPE_VALUE_HIVE; + +/** + * Catalog descriptor for {@link HiveCatalog}. 
+ */ +public class HiveCatalogDescriptor extends CatalogDescriptor { + + private String hiveSitePath; + + // TODO : set default database + public HiveCatalogDescriptor() { + super(CATALOG_TYPE_VALUE_HIVE, 1); + } + + public HiveCatalogDescriptor hiveSitePath(String hiveSitePath) { + Preconditions.checkArgument(!StringUtils.isNullOrWhitespaceOnly(hiveSitePath)); + this.hiveSitePath = hiveSitePath; + + return this; + } + + @Override + protected Map toCatalogProperties() { + final DescriptorProperties properties = new DescriptorProperties(); + + if (hiveSitePath != null) { Review comment: hiveSiteUrl is not in constructor, it resides in a builder method This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
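The review comment above notes that the hive-site path is supplied through a fluent builder method rather than the constructor, and emitted into properties only when set. A stripped-down, Flink-free sketch of that shape (the class and the `"hive-site-path"` key are illustrative stand-ins, not Flink's real names):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for HiveCatalogDescriptor: the hive-site path is
// optional and set via a builder method, never the constructor.
public class HiveDescriptorSketch {

    private String hiveSitePath;

    public HiveDescriptorSketch hiveSitePath(String path) {
        // mirrors the Preconditions/StringUtils check in the real code
        if (path == null || path.trim().isEmpty()) {
            throw new IllegalArgumentException("hiveSitePath must be non-blank");
        }
        this.hiveSitePath = path;
        return this;
    }

    public Map<String, String> toProperties() {
        Map<String, String> props = new HashMap<>();
        if (hiveSitePath != null) {                    // emitted only when set
            props.put("hive-site-path", hiveSitePath); // hypothetical key name
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> p = new HiveDescriptorSketch()
                .hiveSitePath("/etc/hive/conf/hive-site.xml")
                .toProperties();
        System.out.println(p);
    }
}
```

The null check in `toProperties()` is what the reviewer is pointing at: because the value comes from a builder method, absence is a legal state and must not produce an empty property.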
[jira] [Assigned] (FLINK-12660) Integrate Flink with Hive GenericUDAFResolver and GenericUDAFResolver2
[ https://issues.apache.org/jira/browse/FLINK-12660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li reassigned FLINK-12660: Assignee: Bowen Li > Integrate Flink with Hive GenericUDAFResolver and GenericUDAFResolver2 > -- > > Key: FLINK-12660 > URL: https://issues.apache.org/jira/browse/FLINK-12660 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.9.0 > > > https://hive.apache.org/javadocs/r3.1.1/api/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDAFResolver.html > https://hive.apache.org/javadocs/r3.1.1/api/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFResolver2.html
[jira] [Assigned] (FLINK-12658) Integrate Flink with Hive GenericUDF
[ https://issues.apache.org/jira/browse/FLINK-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li reassigned FLINK-12658: Assignee: Bowen Li > Integrate Flink with Hive GenericUDF > > > Key: FLINK-12658 > URL: https://issues.apache.org/jira/browse/FLINK-12658 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.9.0 > > > https://hive.apache.org/javadocs/r3.1.1/api/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.html
[jira] [Assigned] (FLINK-12657) Integrate Flink with Hive UDF
[ https://issues.apache.org/jira/browse/FLINK-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li reassigned FLINK-12657: Assignee: Bowen Li > Integrate Flink with Hive UDF > - > > Key: FLINK-12657 > URL: https://issues.apache.org/jira/browse/FLINK-12657 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.9.0 > > > https://hive.apache.org/javadocs/r3.1.1/api/org/apache/hadoop/hive/ql/exec/UDF.html
[jira] [Assigned] (FLINK-12659) Integrate Flink with Hive GenericUDTF
[ https://issues.apache.org/jira/browse/FLINK-12659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li reassigned FLINK-12659: Assignee: Bowen Li > Integrate Flink with Hive GenericUDTF > - > > Key: FLINK-12659 > URL: https://issues.apache.org/jira/browse/FLINK-12659 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > > https://hive.apache.org/javadocs/r3.1.1/api/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.html
[GitHub] [flink] flinkbot commented on issue #8662: [FLINK-12780][hive] use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil
flinkbot commented on issue #8662: [FLINK-12780][hive] use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil URL: https://github.com/apache/flink/pull/8662#issuecomment-48159 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-12780) use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil
[ https://issues.apache.org/jira/browse/FLINK-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12780: --- Labels: pull-request-available (was: ) > use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil > --- > > Key: FLINK-12780 > URL: https://issues.apache.org/jira/browse/FLINK-12780 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] bowenli86 commented on issue #8662: [FLINK-12780][hive] use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil
bowenli86 commented on issue #8662: [FLINK-12780][hive] use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil URL: https://github.com/apache/flink/pull/8662#issuecomment-47839 cc @xuefuz @lirui-apache @zjuwangg
[GitHub] [flink] bowenli86 opened a new pull request #8662: [FLINK-12780][hive] use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil
bowenli86 opened a new pull request #8662: [FLINK-12780][hive] use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil URL: https://github.com/apache/flink/pull/8662 ## What is the purpose of the change This PR changes type comparisons in `HiveTypeUtil` to use `LogicalTypeRoot` of Flink's `DataType`. @twalthr and @JingsongLi raised the concerns in https://github.com/apache/flink/pull/8645 that using `DataType` itself for comparison may not be reliable in Flink's new type system. ## Brief change log - changes type comparisons in `HiveTypeUtil` to use `LogicalTypeRoot` of Flink's `DataType` ## Verifying this change This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
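The PR's point is that comparing full `DataType` instances is fragile because a logical type carries attributes such as length or precision, whereas the `LogicalTypeRoot` identifies only the type family. A self-contained toy model of that distinction (not Flink's real classes, whose API is richer) shows why root comparison is the robust check:

```java
// Toy model of the LogicalType/LogicalTypeRoot split: two VARCHARs with
// different lengths are unequal as full types but share the same root.
public class TypeRootSketch {

    public enum Root { CHAR, VARCHAR, DECIMAL }

    public static final class Type {
        public final Root root;
        public final int length; // length/precision attribute

        public Type(Root root, int length) {
            this.root = root;
            this.length = length;
        }
    }

    // Comparison by root, ignoring attributes -- the pattern the PR adopts.
    public static boolean sameRoot(Type a, Type b) {
        return a.root == b.root;
    }

    public static void main(String[] args) {
        Type varchar10 = new Type(Root.VARCHAR, 10);
        Type varchar20 = new Type(Root.VARCHAR, 20);
        // Full-type equality would compare lengths and fail; roots agree.
        System.out.println(sameRoot(varchar10, varchar20)); // true
    }
}
```

In Flink itself the analogous check is reading the root of a type's logical representation rather than testing the whole type for equality, which is exactly what makes the `HiveTypeUtil` dispatch stable under the new type system.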
[GitHub] [flink] lamber-ken commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
lamber-ken commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-47592 > Scratch that, I don't think we can do this. Our Kafka consumer silently swallows null values: > > https://github.com/apache/flink/blob/049994274c9d4fc07925a7639e4044506b090d10/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/AbstractFetcher.java#L407-L410 > . Plus, I think our serializers in general don't always support `null` values. The fact that `StringSerializer` does is more of an anomaly. (also thanks to @GJL for pointing this out to me ) From the NPE stack trace, we can see that the NPE happens before `AbstractFetcher#emitRecord`.
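The failure mode under discussion is constructing a `String` from a null byte array: Kafka tombstone records carry a null value, and `new String((byte[]) null)` throws the NullPointerException seen in the stack trace below, before the fetcher can swallow the record. A JDK-only sketch of the null guard the PR proposes (this is not Flink's actual `SimpleStringSchema`, just the shape of the fix):

```java
import java.nio.charset.StandardCharsets;

public class NullSafeStringSchema {

    // Mirrors SimpleStringSchema#deserialize but tolerates Kafka tombstones,
    // whose record value is null; new String(null) would throw an NPE.
    public static String deserialize(byte[] message) {
        if (message == null) {
            return null; // pass the tombstone through instead of failing the fetch loop
        }
        return new String(message, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(deserialize("hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println(deserialize(null)); // null, no NPE
    }
}
```

Whether returning null here is acceptable is exactly the open question in the thread, since downstream serializers do not uniformly support null values.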
[jira] [Updated] (FLINK-12780) use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil
[ https://issues.apache.org/jira/browse/FLINK-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li updated FLINK-12780: - Summary: use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil (was: use LogicalTypeRoot for type comparison) > use Flink's LogicalTypeRoot for type comparison in HiveTypeUtil > --- > > Key: FLINK-12780 > URL: https://issues.apache.org/jira/browse/FLINK-12780 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12780) use LogicalTypeRoot for type comparison
Bowen Li created FLINK-12780: Summary: use LogicalTypeRoot for type comparison Key: FLINK-12780 URL: https://issues.apache.org/jira/browse/FLINK-12780 Project: Flink Issue Type: Sub-task Components: Connectors / Hive Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.9.0
[GitHub] [flink] walterddr closed pull request #8324: [FLINK-11921][table] Upgrade to calcite 1.19
walterddr closed pull request #8324: [FLINK-11921][table] Upgrade to calcite 1.19 URL: https://github.com/apache/flink/pull/8324
[GitHub] [flink] walterddr commented on issue #8324: [FLINK-11921][table] Upgrade to calcite 1.19
walterddr commented on issue #8324: [FLINK-11921][table] Upgrade to calcite 1.19 URL: https://github.com/apache/flink/pull/8324#issuecomment-499967458 closing this pull request; a new one will be created once Calcite 1.20 is released
[jira] [Commented] (FLINK-11921) Upgrade Calcite dependency to 1.20
[ https://issues.apache.org/jira/browse/FLINK-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858835#comment-16858835 ] Rong Rong commented on FLINK-11921: --- changed title as discussed: we will skip 1.19, directly go with 1.20 > Upgrade Calcite dependency to 1.20 > -- > > Key: FLINK-11921 > URL: https://issues.apache.org/jira/browse/FLINK-11921 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Reporter: Timo Walther >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Umbrella issue for all tasks related to the next Calcite upgrade to 1.20.x > release > We will skip 1.19.x since the change required is minor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11921) Upgrade Calcite dependency to 1.20
[ https://issues.apache.org/jira/browse/FLINK-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-11921: -- Description: Umbrella issue for all tasks related to the next Calcite upgrade to 1.20.x release We will skip 1.19.x since the change required is minor. was:Umbrella issue for all tasks related to the next Calcite upgrade. > Upgrade Calcite dependency to 1.20 > -- > > Key: FLINK-11921 > URL: https://issues.apache.org/jira/browse/FLINK-11921 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Reporter: Timo Walther >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Umbrella issue for all tasks related to the next Calcite upgrade to 1.20.x > release > We will skip 1.19.x since the change required is minor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11921) Upgrade Calcite dependency to 1.20
[ https://issues.apache.org/jira/browse/FLINK-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-11921: -- Summary: Upgrade Calcite dependency to 1.20 (was: Upgrade Calcite dependency to 1.19) > Upgrade Calcite dependency to 1.20 > -- > > Key: FLINK-11921 > URL: https://issues.apache.org/jira/browse/FLINK-11921 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Reporter: Timo Walther >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Umbrella issue for all tasks related to the next Calcite upgrade. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] lamber-ken edited a comment on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
lamber-ken edited a comment on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-499957988 hi, thanks for your comment @aljoscha. Here is the detailed stack trace (Flink version: 1.6.3):
```
Caused by: java.lang.NullPointerException
    at java.lang.String.<init>(String.java:515)
    at org.apache.flink.api.common.serialization.SimpleStringSchema.deserialize(SimpleStringSchema.java:75)
    at org.apache.flink.api.common.serialization.SimpleStringSchema.deserialize(SimpleStringSchema.java:36)
    at org.apache.flink.streaming.util.serialization.KeyedDeserializationSchemaWrapper.deserialize(KeyedDeserializationSchemaWrapper.java:44)
    at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:142)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:738)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:94)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58)
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
```
[GitHub] [flink] lamber-ken edited a comment on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
lamber-ken edited a comment on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-499957988

hi, thanks for your comment @aljoscha. Here is the detailed stack trace:

```
Caused by: java.lang.NullPointerException
	at java.lang.String.<init>(String.java:515)
	at org.apache.flink.api.common.serialization.SimpleStringSchema.deserialize(SimpleStringSchema.java:75)
	at org.apache.flink.api.common.serialization.SimpleStringSchema.deserialize(SimpleStringSchema.java:36)
	at org.apache.flink.streaming.util.serialization.KeyedDeserializationSchemaWrapper.deserialize(KeyedDeserializationSchemaWrapper.java:44)
	at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:142)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:738)
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:94)
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58)
	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
	at java.lang.Thread.run(Thread.java:748)
```

A Kafka demo example:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

import java.util.Properties;

public class KafkaTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "kmaster.bigdata.ly:9092");
        p.setProperty("group.id", "aan");
        FlinkKafkaConsumer010<String> consumer =
                new FlinkKafkaConsumer010<>("RTC_PROJECT_LOG_T_T_null", new SimpleStringSchema(), p);
        env.addSource(consumer).print();
        env.execute();
    }
}
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
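The NPE is triggered by constructing a `String` from the `null` value that Kafka tombstone records carry. A minimal sketch of the guard the fix needs — a plain-Java stand-in for the check inside `SimpleStringSchema.deserialize`, where `deserializeOrNull` is a hypothetical helper name, not Flink's API:

```java
import java.nio.charset.StandardCharsets;

public class NullSafeDeserialize {
    // Hypothetical helper: mirrors what SimpleStringSchema.deserialize does, but
    // checks for the null value of a Kafka tombstone record before calling the
    // String constructor, which is the call that throws the NullPointerException.
    static String deserializeOrNull(byte[] message) {
        return message == null ? null : new String(message, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(deserializeOrNull("hello".getBytes(StandardCharsets.UTF_8))); // hello
        System.out.println(deserializeOrNull(null)); // null
    }
}
```

In a real job the same guard could live in a small `SimpleStringSchema` subclass overriding `deserialize(byte[])`.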
[GitHub] [flink] bowenli86 commented on a change in pull request #8645: [FLINK-12386] Support mapping BinaryType, VarBinaryType, CharType, VarCharType, and DecimalType between Flink and Hive in HiveCa
bowenli86 commented on a change in pull request #8645: [FLINK-12386] Support mapping BinaryType, VarBinaryType, CharType, VarCharType, and DecimalType between Flink and Hive in HiveCatalog URL: https://github.com/apache/flink/pull/8645#discussion_r291674279

## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveTypeUtil.java

```diff
@@ -18,127 +18,174 @@
 package org.apache.flink.table.catalog.hive.util;

-import org.apache.flink.api.common.typeinfo.BasicArrayTypeInfo;
-import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
-import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
-import org.apache.flink.api.common.typeinfo.SqlTimeTypeInfo;
-import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.catalog.exceptions.CatalogException;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.BinaryType;
+import org.apache.flink.table.types.logical.CharType;
+import org.apache.flink.table.types.logical.DecimalType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.VarBinaryType;
+import org.apache.flink.table.types.logical.VarCharType;

-import org.apache.hadoop.hive.serde.serdeConstants;
+import org.apache.hadoop.hive.common.type.HiveChar;
+import org.apache.hadoop.hive.common.type.HiveVarchar;
+import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo;
 import org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo;
 import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
 import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
 import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo;

 /**
  * Utils to convert data types between Flink and Hive.
  */
 public class HiveTypeUtil {

-	// Note: Need to keep this in sync with BaseSemanticAnalyzer::getTypeStringFromAST
-	private static final String HIVE_ARRAY_TYPE_NAME_FORMAT = serdeConstants.LIST_TYPE_NAME + "<%s>";
-
 	private HiveTypeUtil() {
 	}

+	/**
+	 * Convert Flink data type to Hive data type name.
+	 * TODO: the following Hive types are not supported in Flink yet, including MAP, STRUCT
+	 *
+	 * @param type a Flink data type
+	 * @return the corresponding Hive data type name
+	 */
+	public static String toHiveTypeName(DataType type) {
+		return toHiveTypeInfo(type).getTypeName();
+	}
+
 	/**
 	 * Convert Flink data type to Hive data type.
-	 * TODO: the following Hive types are not supported in Flink yet, including CHAR, VARCHAR, DECIMAL, MAP, STRUCT
-	 * [FLINK-12386] Support complete mapping between Flink and Hive data types
 	 *
 	 * @param type a Flink data type
 	 * @return the corresponding Hive data type
 	 */
-	public static String toHiveType(TypeInformation type) {
-		if (type.equals(BasicTypeInfo.BOOLEAN_TYPE_INFO)) {
-			return serdeConstants.BOOLEAN_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.BYTE_TYPE_INFO)) {
-			return serdeConstants.TINYINT_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.SHORT_TYPE_INFO)) {
-			return serdeConstants.SMALLINT_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.INT_TYPE_INFO)) {
-			return serdeConstants.INT_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.LONG_TYPE_INFO)) {
-			return serdeConstants.BIGINT_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.FLOAT_TYPE_INFO)) {
-			return serdeConstants.FLOAT_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.DOUBLE_TYPE_INFO)) {
-			return serdeConstants.DOUBLE_TYPE_NAME;
-		} else if (type.equals(BasicTypeInfo.STRING_TYPE_INFO)) {
-			return serdeConstants.STRING_TYPE_NAME;
-		} else if (type.equals(SqlTimeTypeInfo.DATE)) {
-			return serdeConstants.DATE_TYPE_NAME;
-		} else if (type.equals(PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO)) {
-			return serdeConstants.BINARY_TYPE_NAME;
-		} else if (type.equals(SqlTimeTypeInfo.TIMESTAMP)) {
-			return serdeConstants.TIMESTAMP_TYPE_NAME;
+	public static TypeInfo toHiveTypeInfo(DataType type) {
```

Review comment: I see. I'll have a followup PR for this.
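The new `toHiveTypeName` derives the name from the Hive `TypeInfo` instead of hard-coded `serdeConstants` strings, which matters for the parameterized types this PR adds (CHAR, VARCHAR, DECIMAL) whose type names carry their parameters. A standalone sketch of that naming convention — the helper names here are illustrative only; the real code obtains these names via Hive's `TypeInfoFactory`:

```java
public class HiveTypeNames {
    // Illustrative only: formats Hive type names for the parameterized types
    // added in FLINK-12386. The actual implementation builds a Hive TypeInfo
    // and calls getTypeName() rather than formatting strings directly.
    static String charName(int length) {
        return String.format("char(%d)", length);
    }

    static String varcharName(int length) {
        return String.format("varchar(%d)", length);
    }

    static String decimalName(int precision, int scale) {
        return String.format("decimal(%d,%d)", precision, scale);
    }

    public static void main(String[] args) {
        System.out.println(charName(10));       // char(10)
        System.out.println(varcharName(255));   // varchar(255)
        System.out.println(decimalName(10, 2)); // decimal(10,2)
    }
}
```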
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291673345

## File path: docs/concepts/glossary.md

> The distributed system consisting of (typically) one Flink Master process and one or more Flink
> Taskmanagers processes.

Review comment: just FYI, I also used "Flink master process" when describing the concept in the book.
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291672539

## File path: docs/concepts/glossary.md

```markdown
---
title: Glossary
nav-pos: 3
nav-title: Glossary
nav-parent_id: concepts
---

#### Flink Application Cluster

A Flink Application Cluster is a dedicated [Flink Cluster](./glossary#flink-cluster) that only
executes a single [Flink Job](./glossary#flink-job). The lifetime of the
[Flink Cluster](./glossary#flink-cluster) is bound to the lifetime of the Flink Job. Formerly
Flink Application Clusters were also known as Flink Clusters in *job mode*. Compare to
[Flink Session Cluster](./glossary#flink-session-cluster).

#### Flink Cluster

The distributed system consisting of (typically) one Flink Master process and one or more Flink
Taskmanagers processes.

#### Event

An event is a statement about a change of the state of the domain modelled by the
application. Events can be input and/or output of a stream or batch processing application.
Events are special types of [records](./glossary#Record)

#### ExecutionGraph

see [Physical Graph](./glossary#physical-graph)

#### Function

Functions, or user-defined functions (UDFs), are implemented by the user and encapsulate the
application logic of a Flink program. Most Functions are wrapped by a corresponding
[Operator](./glossary#operator).

#### Instance

The term *instance* is used to describe a specific instance of a specific type (usually
[Operator](./glossary#operator) or [Function](./glossary#function)) during runtime. As Apache Flink
is mostly written in Java, this corresponds to the definition of *Instance* or *Object* in Java.
In the context of Apache Flink, the term *parallel instance* is also frequently used to emphasize
that multiple instances of the same [Operator](./glossary#operator) or
[Function](./glossary#function) type are running in parallel.

#### Flink Job

A Flink Job is the runtime representation of a Flink program. A Flink Job can either be submitted
to a long running [Flink Session Cluster](./glossary#flink-session-cluster) or it can be started as a
self-contained [Flink Application Cluster](./glossary#flink-application-cluster).

#### JobGraph

see [Logical Graph](./glossary#logical-graph)

#### Flink JobManager

JobManagers are one of the components running in the
[Flink JobManger Process](./glossary#flink-jobmanager-process). A JobManager is responsible for
supervising the execution of the [Tasks](./glossary#task) of a single job.

#### Logical Graph

A logical graph is a directed graph describing the high-level logic of a stream processing program.
The nodes are [Operators](./glossary#operator) and the edges indicate input/output-relationships or
data streams or data sets.

#### Managed State

Managed State describes application state which has been registered with the framework. For
Managed State, Apache Flink will take care about persistence and rescaling among other things.

#### Flink JobManager Process

The Job Manager Process is the master of a [Flink Cluster](./glossary#flink-cluster). It is called
*JobManager* for historical reasons, but actually has actually contains three distinct components:
Flink Resource Manager, Flink Dispatcher and one [Flink JobManager](./glossary#flink-jobmanager)
per running [Flink Job](./glossary#flink-job).

#### Operator

Node of a [Logical Graph](./glossary#logical-graph). An Operator performs a certain operation,
which is usually executed by a [Function](./glossary#function). Sources and Sinks are special
Operators for data ingestion and data egress.

#### Operator Chain

An Operator Chain consists of one or more consecutive [Operators](./glossary#operator) without any
repartitioning in between. Operators within the same Operation Chain forward records to each other
directly without going through serialization or Flink's network stack.

#### Partition

A partition is an independent subset of the overall data stream or data set. A data stream or
data set is divided into partitions by assigning each [record](./glossary#Record) to one or more
partitions. Partitions of data streams or data sets are consumed by [Tasks](./glossary#task) during
runtime. A transformation which changes the way a data stream or data set is partitioned is often
called repartitioning.

#### Physical Graph

A physical graph is the result of translating a [Logical Graph](./glossary#logical-graph) for
execution in a distributed runtime. The nodes are [Tasks](./glossary#task) and the edges indicate
input/output-relationships or [partitions](./glossary#partition) of data streams or data sets.

#### Record

Records are the constituent elements of a data set or data stream.
[Operators](./glossary#operator) and
```
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291671831

## File path: docs/concepts/glossary.md
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291628201

## File path: docs/concepts/glossary.md

> Functions, or user-defined functions (UDFs), are implemented by the user and encapsulate the
> application logic of a Flink program. Most Functions are wrapped by a corresponding
> [Operator](./glossary#operator).

Review comment: We have to be careful not to confuse these UDFs with SQL UDFs. (AFAIK) the term UDF originates from SQL, but some of the things described here do not apply to SQL UDFs. I'd rather remove `user-defined functions (UDFs)` here.
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291672843

## File path: docs/concepts/glossary.md
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291672090

## File path: docs/concepts/glossary.md

> An Operator Chain consists of one or more consecutive [Operators](./glossary#operator) without any
> repartitioning in between. Operators within the same Operation Chain forward records to each other
> directly without going through serialization or Flink's network stack.

Review comment: I would not call a single operator an operator chain. IMO a chain consists of at least two operators.
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291630616

## File path: docs/concepts/glossary.md

> The Job Manager Process is the master of a [Flink Cluster](./glossary#flink-cluster). It is called
> *JobManager* for historical reasons, but actually has actually contains three distinct components:
> Flink Resource Manager, Flink Dispatcher and one [Flink JobManager](./glossary#flink-jobmanager)
> per running [Flink Job](./glossary#flink-job).

Review comment: It depends on the setup which components are run together. The description indicates that resource manager, dispatcher and JM are always executed together.
[GitHub] [flink] lamber-ken commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
lamber-ken commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-499957988
[jira] [Created] (FLINK-12779) Avoid field conflicts when generating field names for non-composite TypeInformation
Hequn Cheng created FLINK-12779: --- Summary: Avoid field conflicts when generating field names for non-composite TypeInformation Key: FLINK-12779 URL: https://issues.apache.org/jira/browse/FLINK-12779 Project: Flink Issue Type: Improvement Components: Table SQL / API Reporter: Hequn Cheng Assignee: Hequn Cheng We use {{FieldInfoUtils.getFieldNames(resultType)}} to get the field names of the resultType. There is no problem for composite types. For non-composite types, we always set the field name to `f0`. But `f0` may conflict with the predefined field names. To make it more robust, we should generate a field name with no conflicts. For example, we can use `f0_0` as the field name if `f0` has already been used. This is also consistent with the behavior of SQL. The following test reproduces the problem.
{code:java}
@Test
def testUserDefinedTableFunctionWithParameter(): Unit = {
  val tableFunc1 = new RichTableFunc1
  StreamITCase.testResults = mutable.MutableList()
  val result = StreamTestData.getSmall3TupleDataStream(env)
    .toTable(tEnv, 'f0, 'f1, 'f2)
    .joinLateral(tableFunc1('f2))
  val results = result.toAppendStream[Row]
  results.addSink(new StreamITCase.StringSink[Row])
  env.execute()
  val expected = mutable.MutableList("3,Hello", "3,world")
  assertEquals(expected.sorted, StreamITCase.testResults.sorted)
}
{code}
Exception:
{code:java}
org.apache.flink.table.api.ValidationException: join relations with ambiguous names: [f0]
	at org.apache.flink.table.operations.JoinOperationFactory.validateNamesAmbiguity(JoinOperationFactory.java:115)
	at org.apache.flink.table.operations.JoinOperationFactory.create(JoinOperationFactory.java:78)
	at org.apache.flink.table.operations.OperationTreeBuilder.join(OperationTreeBuilder.scala:358)
	at org.apache.flink.table.operations.OperationTreeBuilder.joinLateral(OperationTreeBuilder.scala:373)
	at org.apache.flink.table.api.TableImpl.joinLateralInternal(tableImpl.scala:256)
	at org.apache.flink.table.api.TableImpl.joinLateral(tableImpl.scala:214)
{code}
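The non-conflicting name generation proposed in the ticket (use `f0_0` when `f0` is taken) can be sketched as a standalone helper; this is a hypothetical illustration, not Flink's actual `FieldInfoUtils` code:

```java
import java.util.HashSet;
import java.util.Set;

public class FieldNames {

    // Hypothetical sketch: derive a field name that does not clash with
    // already-used names, e.g. "f0" -> "f0_0" when "f0" is taken,
    // "f0_1" when both "f0" and "f0_0" are taken.
    public static String uniqueName(String candidate, Set<String> used) {
        if (!used.contains(candidate)) {
            return candidate;
        }
        int i = 0;
        while (used.contains(candidate + "_" + i)) {
            i++;
        }
        return candidate + "_" + i;
    }

    public static void main(String[] args) {
        Set<String> used = new HashSet<>();
        used.add("f0");
        System.out.println(uniqueName("f0", used)); // prints f0_0
        System.out.println(uniqueName("f1", used)); // prints f1
    }
}
```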
[GitHub] [flink] flinkbot commented on issue #8661: [FLINK-12711][table] Separate function implementation and definition
flinkbot commented on issue #8661: [FLINK-12711][table] Separate function implementation and definition URL: https://github.com/apache/flink/pull/8661#issuecomment-499937163 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-12711) Separate function implementation and definition
[ https://issues.apache.org/jira/browse/FLINK-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12711: --- Labels: pull-request-available (was: ) > Separate function implementation and definition > --- > > Key: FLINK-12711 > URL: https://issues.apache.org/jira/browse/FLINK-12711 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > > This issue continues the work that was started in FLINK-11449. It > distinguishes between functions with an implementation (UDFs) and functions > without an implementation. > The rough design looks as follows: > {noformat}
> 1. `interface FunctionDefinition`
>    --> general interface for describing a function
>    --> goal: separation of runtime and planning/optimization
>    --> long-term methods: `getKind()` (aggregate, scalar, table), `getTypeInference()`
> 2. `interface UserDefinedFunctionDefinition extends FunctionDefinition`
>    --> interface for describing a function with an implementation
>    --> methods: `createImplementation(): UserDefinedFunction`
>    --> default: getTypeInference() = Util.DEFAULT_INFERENCE // future work
> 3. `class BuiltInFunctionDefinition implements FunctionDefinition`
>    --> class for describing a function where the planner provides an implementation
>    --> methods: `getName(): String`
> 4. Add `getKind` to `AggregateFunction`, `ScalarFunction`, `TableFunction`
> {noformat}
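The rough design above can be sketched in Java as follows. Only `getKind()`, `createImplementation()`, and `getName()` are taken from the ticket; the enum values, field names, and everything else are assumptions for illustration, not the final Flink API:

```java
public class FunctionDesignSketch {

    enum FunctionKind { SCALAR, TABLE, AGGREGATE }

    // General interface: separates planning/optimization information
    // from the runtime implementation.
    interface FunctionDefinition {
        FunctionKind getKind();
    }

    // A definition backed by a user-provided implementation (UDF).
    interface UserDefinedFunctionDefinition extends FunctionDefinition {
        // In Flink this would return a UserDefinedFunction; Object is a
        // placeholder to keep the sketch self-contained.
        Object createImplementation();
    }

    // A built-in function: the planner supplies the implementation,
    // so the definition only carries a name and planning properties.
    static final class BuiltInFunctionDefinition implements FunctionDefinition {
        private final String name;
        private final FunctionKind kind;

        BuiltInFunctionDefinition(String name, FunctionKind kind) {
            this.name = name;
            this.kind = kind;
        }

        public String getName() { return name; }

        public FunctionKind getKind() { return kind; }
    }

    public static void main(String[] args) {
        BuiltInFunctionDefinition concat =
            new BuiltInFunctionDefinition("concat", FunctionKind.SCALAR);
        System.out.println(concat.getName() + " " + concat.getKind());
    }
}
```

The key design point is that the planner can reason about a `BuiltInFunctionDefinition` (determinism, kind, later type inference) without ever touching runtime classes.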
[GitHub] [flink] twalthr opened a new pull request #8661: [FLINK-12711][table] Separate function implementation and definition
twalthr opened a new pull request #8661: [FLINK-12711][table] Separate function implementation and definition URL: https://github.com/apache/flink/pull/8661 ## What is the purpose of the change This PR is a first step toward unified expression stacks. It uncouples the runtime implementation from information that is needed during validation and planning. With this change it is possible to declare properties such as `isDeterministic` or `requiresOver` also for built-in functions. A next step is to add a `getTypeInference` method to `FunctionDefinition`, which enables the removal of all `PlannerExpression`s by using `BuiltInFunctionDefinition` instead. ## Brief change log See commit messages. ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? JavaDocs
[jira] [Commented] (FLINK-12619) Support TERMINATE/SUSPEND Job with Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858747#comment-16858747 ] Yu Li commented on FLINK-12619: --- Thanks for the clarification about your thoughts [~aljoscha], but I still have some questions. First of all, if we stick to the savepoint solution (regardless of how we implement the "optimized" format), how do we resolve the issue below?: It requires users to trigger savepoints frequently (otherwise, over time the "incremental" savepoint will effectively become "full" as each key-value is updated), which will interfere with the normal system-triggered checkpoint process. bq. I mentioned incremental savepoints only as a possible future development... I think the solution for that is to allow savepoints to be in various different formats... which keeps the clear distinction between checkpoints and savepoints but allows an optimized format for the savepoint which is what users want in some cases. Sorry, but I'm a little bit confused here: if not a unified "incremental savepoint format", what would/could this "optimized" or "canonical/unified" format be? bq. My main point is that the distinction between checkpoints and savepoints is that the former are system controlled while the latter are user controlled and that we should keep that distinction. I think we could resolve the concern in the following way, wdyt?: Introduce a configuration like {{job.stop.with.checkpoint}}; if users set it to true, every job stop/suspend action will be accompanied by a checkpoint unless triggered by the stop-with-savepoint command. Thanks. 
> Support TERMINATE/SUSPEND Job with Checkpoint > - > > Key: FLINK-12619 > URL: https://issues.apache.org/jira/browse/FLINK-12619 > Project: Flink > Issue Type: New Feature > Components: Runtime / State Backends >Reporter: Congxian Qiu(klion26) >Assignee: Congxian Qiu(klion26) >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Inspired by the idea of FLINK-11458, we propose to support terminating/suspending > a job with a checkpoint. This improvement cooperates with the incremental and > external checkpoint features: if checkpoints are retained and this feature > is configured, we will trigger a checkpoint before the job stops. It could > accelerate job recovery a lot since: > 1. No source rewinding is required any more. > 2. It's much faster than taking a savepoint when incremental checkpointing is > enabled. > Please note that savepoints are conceptually different from checkpoints, in a > similar way that backups are different from recovery logs in traditional > database systems. So we suggest using this feature only for job recovery, > while sticking with FLINK-11458 for the > upgrading/cross-cluster-job-migration/state-backend-switch cases.
[jira] [Updated] (FLINK-12513) Improve end-to-end (bash based) tests
[ https://issues.apache.org/jira/browse/FLINK-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex updated FLINK-12513: - Description: Tests in the {{flink-end-to-end-tests/test-scripts}} directory re-use the same Flink directory to configure and launch a Flink cluster for a test. As part of their setup steps, they may modify Flink config files (application config, logs config) and the directory itself (e.g. by copying some jars into the {{lib}} folder). Also, some tests involve additional services (like spinning up ZooKeeper, Kafka, and similar clusters). The corresponding clean-up code (to stop services, to revert the Flink directory to its original state) is spread out and not particularly well structured. In particular: * the test runner itself reverts the Flink config (but doesn't revert other changes in the Flink dir); * some tests use the shell's {{trap}} on-exit hook for the clean-up callback. Adding multiple such callbacks in one test would result in an improper test tear down. As a result, some tests may leave leftovers that affect the execution of subsequent steps. The proposal is to introduce a helper method for using one global (per test run) {{trap}} hook that allows adding multiple clean-up callbacks. This should enable registering "resource" clean-up callbacks in the same place where the resource is used/launched. Optional improvement: -make the test runner create a temporary copy of the Flink directory and launch the test using that temporary directory. After the test is done, the temporary directory would be removed.- Update: the test runner now rolls back some Flink distribution folders (conf, lib, and plugins in another PR). The above idea with a temporary folder currently doesn't work for tests that involve dockerized Flink, because the {{build.sh}} script uses a relative path to the Flink dir under test. was: Tests in the {{flink-end-to-end-tests/test-scripts}} directory re-use the same Flink directory to configure and launch a Flink cluster for a test. 
As part of their setup steps, they may modify Flink config files (application config, logs config) and the directory itself (e.g. by copying some jars into the {{lib}} folder). Also, some tests involve additional services (like spinning up ZooKeeper, Kafka, and similar clusters). The corresponding clean-up code (to stop services, to revert the Flink directory to its original state) is spread out and not particularly well structured. In particular: * the test runner itself reverts the Flink config (but doesn't revert other changes in the Flink dir); * some tests use the shell's {{trap}} on-exit hook for the clean-up callback. Adding multiple such callbacks in one test would result in an improper test tear down. As a result, some tests may leave leftovers that affect the execution of subsequent steps. The proposal is to introduce a helper method for using one global (per test run) {{trap}} hook that allows adding multiple clean-up callbacks. This should enable registering "resource" clean-up callbacks in the same place where the resource is used/launched. Optional improvement: make the test runner create a temporary copy of the Flink directory and launch the test using that temporary directory. After the test is done, the temporary directory would be removed. > Improve end-to-end (bash based) tests > - > > Key: FLINK-12513 > URL: https://issues.apache.org/jira/browse/FLINK-12513 > Project: Flink > Issue Type: Improvement > Components: Tests >Reporter: Alex >Assignee: Alex >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Tests in the {{flink-end-to-end-tests/test-scripts}} directory re-use the > same Flink directory to configure and launch a Flink cluster for a test. > As part of their setup steps, they may modify Flink config files (application > config, logs config) and the directory itself (e.g. by copying some jars into > the {{lib}} folder). 
> Also, some tests involve additional services (like > spinning up ZooKeeper, Kafka, and similar clusters). > The corresponding clean-up code (to stop services, to revert the Flink directory > to its original state) is spread out and not particularly well structured. In > particular > * the test runner itself reverts the Flink config (but doesn't revert other > changes in the Flink dir); > * some tests use the shell's {{trap}} on-exit hook for the clean-up callback. Adding > multiple such callbacks in one test would result in an improper test tear > down. > As a result, some tests may leave leftovers that affect the execution of > subsequent steps. > The proposal is to introduce a helper method for using one global (per test > run) {{trap}} hook that allows adding multiple clean-up callbacks. This > should enable registering "resource" clean-up callbacks in the same place > where the resource is used/launched.
[jira] [Commented] (FLINK-12662) show jobs failover in history server as well
[ https://issues.apache.org/jira/browse/FLINK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858733#comment-16858733 ] vinoyang commented on FLINK-12662: -- Hi [~till.rohrmann], my original thought was that you might be against changing the {{ExecutionGraph}}, so I tried to introduce a new variant of {{ExecutionGraph}}. Now that you think it's possible to change {{ExecutionGraph}} and add more information to record intermediate results, that sounds like good news. However, in order to present this new information in the Flink web UI, we need to add new interface methods, or refine existing methods, in {{AccessExecutionGraph}}. OK, I will change my original plan and try to add this information to {{ExecutionGraph}}. I'll list more details later. > show jobs failover in history server as well > > > Key: FLINK-12662 > URL: https://issues.apache.org/jira/browse/FLINK-12662 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Su Ralph >Assignee: vinoyang >Priority: Major > > Currently > [https://ci.apache.org/projects/flink/flink-docs-release-1.8/monitoring/historyserver.html] > only shows completed jobs (completed, cancelled, failed), not any > intermediate failovers, which makes it hard for the cluster > administrator/developer to find the root cause when failovers happen. > The feature ask is to record each failover in the history server as well.
[GitHub] [flink] wisgood commented on issue #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter
wisgood commented on issue #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter URL: https://github.com/apache/flink/pull/8621#issuecomment-499920683 thanks for your explanation!
[GitHub] [flink] aljoscha edited a comment on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
aljoscha edited a comment on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-499919264 Scratch that, I don't think we can do this. Our Kafka consumer silently swallows null values: https://github.com/apache/flink/blob/049994274c9d4fc07925a7639e4044506b090d10/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/AbstractFetcher.java#L407-L410. Plus, I think our serializers in general don't always support `null` values. The fact that `StringSerializer` does is more of an anomaly. (also thanks to @GJL for pointing this out to me)
[GitHub] [flink] aljoscha commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
aljoscha commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-499919264 Scratch that, I don't think we can do this. Our Kafka consumer silently swallows null values: https://github.com/apache/flink/blob/049994274c9d4fc07925a7639e4044506b090d10/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/AbstractFetcher.java#L408. Plus, I think our serializers in general don't always support `null` values. The fact that `StringSerializer` does is more of an anomaly. (also thanks to @GJL for pointing this out to me)
[GitHub] [flink] fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary
fhueske commented on a change in pull request #8607: [FLINK-12652] [documentation] add first version of a glossary URL: https://github.com/apache/flink/pull/8607#discussion_r291627017 ## File path: docs/concepts/glossary.md ## @@ -0,0 +1,166 @@ +--- +title: Glossary +nav-pos: 3 +nav-title: Glossary +nav-parent_id: concepts +--- + + + Flink Application Cluster + +A Flink Application Cluster is a dedicated [Flink Cluster](./glossary#flink-cluster) that only +executes a single [Flink Job](./glossary#flink-job). The lifetime of the +[Flink Cluster](./glossary#flink-cluster) is bound to the lifetime of the Flink Job. Formerly +Flink Application Clusters were also known as Flink Clusters in *job mode*. Compare to +[Flink Session Cluster](./glossary#flink-session-cluster). + + Flink Cluster + +The distributed system consisting of (typically) one Flink Master process and one or more Flink +TaskManager processes. + + Event + +An event is a statement about a change of the state of the domain modelled by the +application. Events can be input and/or output of a stream or batch processing application. +Events are special types of [records](./glossary#Record) Review comment: Add `.` at the end
[GitHub] [flink] azagrebin commented on issue #8485: [FLINK-12555] Introduce an encapsulated metric group layout for shuffle API
azagrebin commented on issue #8485: [FLINK-12555] Introduce an encapsulated metric group layout for shuffle API URL: https://github.com/apache/flink/pull/8485#issuecomment-499913877 Thanks for the reviews @zentol @zhijiangW, I've addressed the comments. This PR is now based on #8608, which I think needs to be merged first. While rebasing, I introduced a `ShuffleIOOwnerContext` which is now created by the Task using `ShuffleEnvironment.createShuffleIOOwnerContext` before creating partitions/gates. `NettyShuffleEnvironment.createShuffleIOOwnerContext` also creates the netty metric groups, only once.
[GitHub] [flink] azagrebin commented on a change in pull request #8485: [FLINK-12555] Introduce an encapsulated metric group layout for shuffle API
azagrebin commented on a change in pull request #8485: [FLINK-12555] Introduce an encapsulated metric group layout for shuffle API URL: https://github.com/apache/flink/pull/8485#discussion_r291622614 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java ## @@ -388,16 +381,18 @@ public Task( executionId, this, inputGateDeploymentDescriptors, - metrics.getIOMetricGroup(), - inputGroup, - buffersGroup); + taskShuffleMetricGroup); this.inputGates = new InputGate[gates.length]; int counter = 0; for (InputGate gate : gates) { inputGates[counter++] = new InputGateWithMetrics(gate, metrics.getIOMetricGroup().getNumBytesInCounter()); } + // we will have to check type of shuffle service later whether it is NetworkEnvironment + //noinspection deprecation + networkEnvironment.registerLegacyNetworkMetrics(metrics.getIOMetricGroup(), resultPartitionWriters, gates); Review comment: which is now happening after rebasing onto #8608
[GitHub] [flink] aljoscha commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
aljoscha commented on issue #8583: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/8583#issuecomment-499911130 Yes, I think this should work. +1
[jira] [Created] (FLINK-12778) Fix deriveTableAggRowType bug for non-composite types
Hequn Cheng created FLINK-12778: --- Summary: Fix deriveTableAggRowType bug for non-composite types Key: FLINK-12778 URL: https://issues.apache.org/jira/browse/FLINK-12778 Project: Flink Issue Type: Bug Components: Table SQL / API Reporter: Hequn Cheng Assignee: Hequn Cheng Currently, we call {{aggCalls.get(0).`type`.getFieldList.foreach(builder.add)}} when deriving the row type for a table aggregate. However, for types which are not composite types, the field list would be null. Table aggregates should, of course, support non-composite types. To solve the problem, we should check whether the type is structured, because a composite type will be converted to a RelDataType which contains a field list and is structured.
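The proposed fix boils down to branching on whether the derived type is structured before iterating its field list. A self-contained sketch, with a toy `TypeInfo` standing in for Calcite's `RelDataType` and all names hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DeriveRowTypeSketch {

    // Toy stand-in for Calcite's RelDataType: structured types carry a
    // field list, non-composite (atomic) types do not.
    static final class TypeInfo {
        final String name;
        final List<String> fields; // empty for non-composite types

        TypeInfo(String name, List<String> fields) {
            this.name = name;
            this.fields = fields;
        }

        boolean isStruct() {
            return !fields.isEmpty();
        }
    }

    // Sketch of the fix: only iterate the field list for structured
    // types; otherwise register the type itself under a default name
    // instead of dereferencing a null/empty field list.
    static List<String> deriveFields(TypeInfo type) {
        if (type.isStruct()) {
            return new ArrayList<>(type.fields);
        }
        return Collections.singletonList("f0");
    }

    public static void main(String[] args) {
        TypeInfo row = new TypeInfo("Row", Arrays.asList("a", "b"));
        TypeInfo atomic = new TypeInfo("Long", Collections.emptyList());
        System.out.println(deriveFields(row));    // prints [a, b]
        System.out.println(deriveFields(atomic)); // prints [f0]
    }
}
```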
[GitHub] [flink] azagrebin commented on issue #8608: [FLINK-11392][network] Introduce ShuffleEnvironment interface
azagrebin commented on issue #8608: [FLINK-11392][network] Introduce ShuffleEnvironment interface URL: https://github.com/apache/flink/pull/8608#issuecomment-499890156 Thanks for the review @tillrohrmann, I've pushed the final version with the last changes including yours.
[GitHub] [flink] tillrohrmann commented on a change in pull request #8552: [FLINK-12631]:Check if proper JAR file in JobWithJars
tillrohrmann commented on a change in pull request #8552: [FLINK-12631]:Check if proper JAR file in JobWithJars URL: https://github.com/apache/flink/pull/8552#discussion_r291582520 ## File path: flink-clients/src/main/java/org/apache/flink/client/program/JobWithJars.java ##
```
@@ -122,7 +123,11 @@ public static void checkJarFile(URL jar) throws IOException {
 		if (!jarFile.canRead()) {
 			throw new IOException("JAR file can't be read '" + jarFile.getAbsolutePath() + "'");
 		}
-		// TODO: Check if proper JAR file
+		try (JarFile f = new JarFile(jarFile)) {
+
+		} catch (IOException e) {
+			throw new IOException("Error while opening jar file '" + jarFile.getAbsolutePath() + "'", e);
+		}
```
Review comment: could we replace the first check where we create the `File` with this one?
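The check being discussed, opening the file as a JAR and treating any failure as "not a proper JAR", can be sketched independently of Flink. This is an illustration only; `isValidJar` is a hypothetical helper name, not the method from `JobWithJars`:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.jar.JarFile;

public class JarCheck {

    // Hypothetical helper: a file is a proper JAR iff it can be opened
    // as one. JarFile's constructor throws an IOException (e.g. a
    // ZipException) for files that are not valid ZIP/JAR archives.
    public static boolean isValidJar(File file) {
        try (JarFile ignored = new JarFile(file)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A plain text file with a .jar extension is not a valid archive.
        File notAJar = File.createTempFile("not-a-jar", ".jar");
        try (FileWriter w = new FileWriter(notAJar)) {
            w.write("plain text, not a zip archive");
        }
        System.out.println(isValidJar(notAJar)); // prints false
        notAJar.delete();
    }
}
```

Opening the `JarFile` also validates readability, which is why the reviewer suggests it could replace the earlier `File`-based existence/readability checks.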
[GitHub] [flink] flinkbot commented on issue #8660: [FLINK-12143]
flinkbot commented on issue #8660: [FLINK-12143] URL: https://github.com/apache/flink/pull/8660#issuecomment-499876516 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.
[jira] [Updated] (FLINK-12143) Mechanism to ship plugin jars in the cluster
[ https://issues.apache.org/jira/browse/FLINK-12143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12143: --- Labels: pull-request-available (was: ) > Mechanism to ship plugin jars in the cluster > > > Key: FLINK-12143 > URL: https://issues.apache.org/jira/browse/FLINK-12143 > Project: Flink > Issue Type: Sub-task > Components: FileSystems, Runtime / Coordination >Reporter: Stefan Richter >Assignee: Piotr Nowojski >Priority: Major > Labels: pull-request-available
[GitHub] [flink] 1u0 opened a new pull request #8660: [FLINK-12143]
1u0 opened a new pull request #8660: [FLINK-12143] URL: https://github.com/apache/flink/pull/8660 ## What is the purpose of the change This PR changes Flink distribution's starter scripts: defines new configuration environment parameter for plugins dir and passes the configuration to the plugins manager. Currently, only file system components support the plugin manager, so related (to file system) end-to-end tests also have been modified to be loaded via the plugin manager. ## Brief change log - the scripts in `flink-dist` have been modified, a new `FLINK_HOME` environment variable replaces `FLINK_ROOT_DIR`; - the various Flink startup entry points are extended to configure and pass plugins dir to plugins manager; - the plugin manager is configured by system wide parents first patterns for class loading; - some end-to-end tests involving different file system components are modified to use fs components via plugins mechanism. ## Verifying this change This change is already covered by existing tests, such as: - e2e batch wordcount tests for hadoop, presto, azure and dummy fs are modified to use the corresponding fs component via plugin mechanism; - `test_streaming_file_sink.sh` is modified to use hadoop fs via plugin mechanism; - `test_docker_embedded_job.sh` is modified to use dummy fs via plugin mechanism; - `test_yarn_kerberos_docker.sh` is modified to use dummy fs via plugin mechanism. 
## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes** / no) (the distribution Flink scripts) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (**yes** / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
[jira] [Updated] (FLINK-12637) Add metrics for floatingBufferUsage and exclusiveBufferUsage for credit based mode
[ https://issues.apache.org/jira/browse/FLINK-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12637: -- Summary: Add metrics for floatingBufferUsage and exclusiveBufferUsage for credit based mode (was: Add floatingBufferUsage and exclusiveBufferUsage for credit based mode) > Add metrics for floatingBufferUsage and exclusiveBufferUsage for credit based > mode > -- > > Key: FLINK-12637 > URL: https://issues.apache.org/jira/browse/FLINK-12637 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics, Runtime / Network >Affects Versions: 1.9.0 >Reporter: Aitozi >Assignee: Aitozi >Priority: Minor > > Described > [here|https://github.com/apache/flink/pull/8455#issuecomment-496077999]
[jira] [Commented] (FLINK-12640) mvn install flink-shaded-hadoop2-uber: Error creating shaded jar: null
[ https://issues.apache.org/jira/browse/FLINK-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858609#comment-16858609 ] Till Rohrmann commented on FLINK-12640: --- The build seems to work for me. What maven version are you using [~overcls]? > mvn install flink-shaded-hadoop2-uber: Error creating shaded jar: null > --- > > Key: FLINK-12640 > URL: https://issues.apache.org/jira/browse/FLINK-12640 > Project: Flink > Issue Type: Bug > Components: BuildSystem / Shaded >Affects Versions: 1.9.0 > Environment: jdk8 >Reporter: xiezhiqiang >Priority: Major > > {code:java} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-shade-plugin:3.0.0:shade (shade-hadoop) on > project flink-shaded-hadoop2-uber: Error creating shaded jar: null: > IllegalArgumentException -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :flink-shaded-hadoop2-uber > {code}
[jira] [Commented] (FLINK-12646) Fix broken tests of RestClientTest
[ https://issues.apache.org/jira/browse/FLINK-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858607#comment-16858607 ] Till Rohrmann commented on FLINK-12646: --- Sounds like a good idea to me [~victor-wong]. Do you wanna contribute this fix? > Fix broken tests of RestClientTest > -- > > Key: FLINK-12646 > URL: https://issues.apache.org/jira/browse/FLINK-12646 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Reporter: Victor Wong >Assignee: Victor Wong >Priority: Minor > > In > {code:java} > org.apache.flink.runtime.rest.RestClientTest#testConnectionTimeout > {code} > , we use a "unroutableIp" with a value of "10.255.255.1" for test. > But sometimes this IP is reachable in a private network of a company, which > is the case for me. As a result, this test failed with a following exception: > > {code:java} > java.lang.AssertionError: Expected: an instance of > org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException but: > Connection refused: /10.255.255.1:80> is a > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.junit.Assert.assertThat(Assert.java:956) at > org.junit.Assert.assertThat(Assert.java:923) at > org.apache.flink.runtime.rest.RestClientTest.testConnectionTimeout(RestClientTest.java:76) > ... > {code} > > > Can we change the `unroutableIp` to a reserved IP address, i.e "240.0.0.0", > which is described as _Reserved for future use_ in > [wikipedia|https://en.wikipedia.org/wiki/Reserved_IP_addresses] > Or change the assertion?
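The fix suggested in the issue above (use a class E address, which is reserved and therefore never assignable, unlike `10.x` addresses that may be routable inside a private corporate network) can be sketched as follows; the class and constant names are invented for illustration and are not Flink's actual test code:

```java
import java.net.InetAddress;

class UnroutableAddressSketch {
    // 240.0.0.0/4 is "reserved for future use", so a connection attempt to it
    // should always time out rather than be refused by a real host.
    static final String UNROUTABLE_IP = "240.0.0.1";

    public static void main(String[] args) throws Exception {
        // getByName on a dotted-quad literal performs no DNS lookup
        byte[] octets = InetAddress.getByName(UNROUTABLE_IP).getAddress();
        // first octet >= 240 means the address falls inside the reserved range
        boolean reserved = (octets[0] & 0xFF) >= 240;
        System.out.println(reserved); // prints: true
    }
}
```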
[jira] [Closed] (FLINK-12661) test failure: RestAPIStabilityTest.testDispatcherRestAPIStability
[ https://issues.apache.org/jira/browse/FLINK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-12661. - Resolution: Cannot Reproduce Fix Version/s: (was: 1.9.0) Could not reproduce the problem. The {{NoClassDefFoundException}} indicates that this might be a Travis glitch since the dependency usually should be there. > test failure: RestAPIStabilityTest.testDispatcherRestAPIStability > - > > Key: FLINK-12661 > URL: https://issues.apache.org/jira/browse/FLINK-12661 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.9.0 >Reporter: Bowen Li >Priority: Critical > Labels: test-stability > > https://travis-ci.org/apache/flink/jobs/538313859 > {code:java} > 20:02:16.575 [ERROR] testDispatcherRestAPIStability[version = > V1](org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest) Time > elapsed: 0.421 s <<< ERROR! > java.lang.NoClassDefFoundError: > org/apache/flink/shaded/jackson2/com/fasterxml/jackson/module/jsonSchema/JsonSchemaGenerator > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.lambda$createSnapshot$1(RestAPIStabilityTest.java:107) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.createSnapshot(RestAPIStabilityTest.java:114) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.testDispatcherRestAPIStability(RestAPIStabilityTest.java:76) > Caused by: java.lang.ClassNotFoundException: > org.apache.flink.shaded.jackson2.com.fasterxml.jackson.module.jsonSchema.JsonSchemaGenerator > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.lambda$createSnapshot$1(RestAPIStabilityTest.java:107) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.createSnapshot(RestAPIStabilityTest.java:114) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.testDispatcherRestAPIStability(RestAPIStabilityTest.java:76) > ... 
> 20:09:32.746 [ERROR] Errors: > 20:09:32.747 [ERROR] > RestAPIStabilityTest.testDispatcherRestAPIStability:76->createSnapshot:114->lambda$createSnapshot$1:107 > » NoClassDefFound > {code}
[jira] [Updated] (FLINK-12655) SlotCountExceedingParallelismTest failed on Travis
[ https://issues.apache.org/jira/browse/FLINK-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12655: -- Priority: Critical (was: Major) > SlotCountExceedingParallelismTest failed on Travis > -- > > Key: FLINK-12655 > URL: https://issues.apache.org/jira/browse/FLINK-12655 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Critical > Fix For: 1.9.0 > > > https://travis-ci.org/apache/flink/jobs/538166458 > {code} > akka.actor.RobustActorSystemTest > 12:42:34.415 [INFO] > 12:42:34.415 [INFO] Results: > 12:42:34.415 [INFO] > 12:42:34.415 [ERROR] Errors: > 12:42:34.420 [ERROR] > SlotCountExceedingParallelismTest.testNoSlotSharingAndBlockingResultBoth:91->submitJobGraphAndWait:97 > » JobExecution > 12:42:34.420 [ERROR] > SlotCountExceedingParallelismTest.testNoSlotSharingAndBlockingResultReceiver:84->submitJobGraphAndWait:97 > » JobExecution > 12:42:34.420 [ERROR] > SlotCountExceedingParallelismTest.testNoSlotSharingAndBlockingResultSender:77->submitJobGraphAndWait:97 > » > {code}
[jira] [Updated] (FLINK-12655) SlotCountExceedingParallelismTest failed on Travis
[ https://issues.apache.org/jira/browse/FLINK-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12655: -- Labels: test-stability (was: ) > SlotCountExceedingParallelismTest failed on Travis > -- > > Key: FLINK-12655 > URL: https://issues.apache.org/jira/browse/FLINK-12655 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Critical > Labels: test-stability > Fix For: 1.9.0 > > > https://travis-ci.org/apache/flink/jobs/538166458 > {code} > akka.actor.RobustActorSystemTest > 12:42:34.415 [INFO] > 12:42:34.415 [INFO] Results: > 12:42:34.415 [INFO] > 12:42:34.415 [ERROR] Errors: > 12:42:34.420 [ERROR] > SlotCountExceedingParallelismTest.testNoSlotSharingAndBlockingResultBoth:91->submitJobGraphAndWait:97 > » JobExecution > 12:42:34.420 [ERROR] > SlotCountExceedingParallelismTest.testNoSlotSharingAndBlockingResultReceiver:84->submitJobGraphAndWait:97 > » JobExecution > 12:42:34.420 [ERROR] > SlotCountExceedingParallelismTest.testNoSlotSharingAndBlockingResultSender:77->submitJobGraphAndWait:97 > » > {code}
[jira] [Updated] (FLINK-12661) test failure: RestAPIStabilityTest.testDispatcherRestAPIStability
[ https://issues.apache.org/jira/browse/FLINK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12661: -- Labels: test-stability (was: ) > test failure: RestAPIStabilityTest.testDispatcherRestAPIStability > - > > Key: FLINK-12661 > URL: https://issues.apache.org/jira/browse/FLINK-12661 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.9.0 >Reporter: Bowen Li >Priority: Critical > Labels: test-stability > Fix For: 1.9.0 > > > https://travis-ci.org/apache/flink/jobs/538313859 > {code:java} > 20:02:16.575 [ERROR] testDispatcherRestAPIStability[version = > V1](org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest) Time > elapsed: 0.421 s <<< ERROR! > java.lang.NoClassDefFoundError: > org/apache/flink/shaded/jackson2/com/fasterxml/jackson/module/jsonSchema/JsonSchemaGenerator > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.lambda$createSnapshot$1(RestAPIStabilityTest.java:107) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.createSnapshot(RestAPIStabilityTest.java:114) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.testDispatcherRestAPIStability(RestAPIStabilityTest.java:76) > Caused by: java.lang.ClassNotFoundException: > org.apache.flink.shaded.jackson2.com.fasterxml.jackson.module.jsonSchema.JsonSchemaGenerator > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.lambda$createSnapshot$1(RestAPIStabilityTest.java:107) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.createSnapshot(RestAPIStabilityTest.java:114) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.testDispatcherRestAPIStability(RestAPIStabilityTest.java:76) > ... 
> 20:09:32.746 [ERROR] Errors: > 20:09:32.747 [ERROR] > RestAPIStabilityTest.testDispatcherRestAPIStability:76->createSnapshot:114->lambda$createSnapshot$1:107 > » NoClassDefFound > {code}
[jira] [Updated] (FLINK-12661) test failure: RestAPIStabilityTest.testDispatcherRestAPIStability
[ https://issues.apache.org/jira/browse/FLINK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12661: -- Priority: Critical (was: Major) > test failure: RestAPIStabilityTest.testDispatcherRestAPIStability > - > > Key: FLINK-12661 > URL: https://issues.apache.org/jira/browse/FLINK-12661 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.9.0 >Reporter: Bowen Li >Priority: Critical > Fix For: 1.9.0 > > > https://travis-ci.org/apache/flink/jobs/538313859 > {code:java} > 20:02:16.575 [ERROR] testDispatcherRestAPIStability[version = > V1](org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest) Time > elapsed: 0.421 s <<< ERROR! > java.lang.NoClassDefFoundError: > org/apache/flink/shaded/jackson2/com/fasterxml/jackson/module/jsonSchema/JsonSchemaGenerator > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.lambda$createSnapshot$1(RestAPIStabilityTest.java:107) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.createSnapshot(RestAPIStabilityTest.java:114) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.testDispatcherRestAPIStability(RestAPIStabilityTest.java:76) > Caused by: java.lang.ClassNotFoundException: > org.apache.flink.shaded.jackson2.com.fasterxml.jackson.module.jsonSchema.JsonSchemaGenerator > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.lambda$createSnapshot$1(RestAPIStabilityTest.java:107) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.createSnapshot(RestAPIStabilityTest.java:114) > at > org.apache.flink.runtime.rest.compatibility.RestAPIStabilityTest.testDispatcherRestAPIStability(RestAPIStabilityTest.java:76) > ... 
> 20:09:32.746 [ERROR] Errors: > 20:09:32.747 [ERROR] > RestAPIStabilityTest.testDispatcherRestAPIStability:76->createSnapshot:114->lambda$createSnapshot$1:107 > » NoClassDefFound > {code}
[jira] [Commented] (FLINK-12662) show jobs failover in history server as well
[ https://issues.apache.org/jira/browse/FLINK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858597#comment-16858597 ] Till Rohrmann commented on FLINK-12662: --- Could it be enough to keep a history of {{ExecutionGraph}} failures in the {{ExecutionGraph}}? We could have a list of {{ErrorInfos}} or something similar with additional information about restart times, for example. > show jobs failover in history server as well > > > Key: FLINK-12662 > URL: https://issues.apache.org/jira/browse/FLINK-12662 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Su Ralph >Assignee: vinoyang >Priority: Major > > Currently > [https://ci.apache.org/projects/flink/flink-docs-release-1.8/monitoring/historyserver.html] > only shows the completed jobs (completed, cancelled, failed), not any > intermediate failover. > This makes it hard for the cluster administrator/developer to locate the original > failure when two failovers happen. The feature ask is to > - record each failover in the history server as well.
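The idea in the comment above (keep a history of failures, e.g. a list of {{ErrorInfos}} with restart timestamps, inside the ExecutionGraph) could look roughly like the sketch below. The class name, the string-based entries, and the size bound are all invented for illustration; Flink's actual ErrorInfo type is richer:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: a bounded failure history, so memory stays constant even for jobs
// that restart many times. Oldest entries are evicted first.
class FailureHistory {
    static final int MAX_ENTRIES = 16; // illustrative bound

    private final Deque<String> entries = new ArrayDeque<>();

    void record(String errorInfo) {
        if (entries.size() == MAX_ENTRIES) {
            entries.removeFirst(); // evict the oldest failure record
        }
        entries.addLast(errorInfo);
    }

    int size() {
        return entries.size();
    }
}
```

Bounding the history is the key design choice: it keeps the REST/history-server payload small while still exposing the most recent failovers.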
[jira] [Assigned] (FLINK-12706) Introduce ShuffleService interface and its configuration
[ https://issues.apache.org/jira/browse/FLINK-12706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-12706: - Assignee: Andrey Zagrebin > Introduce ShuffleService interface and its configuration > > > Key: FLINK-12706 > URL: https://issues.apache.org/jira/browse/FLINK-12706 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration, Runtime / Coordination, Runtime > / Network >Reporter: Andrey Zagrebin >Assignee: Andrey Zagrebin >Priority: Major > Fix For: 1.9.0 > > > ShuffleService should be a factory for ShuffleMaster in JM and local > ShuffleEnvironment in TM. The default implementation is already available: the > former NetworkEnvironment. To make it pluggable, we need to provide service > loading for the configured ShuffleService implementation class in the Flink > configuration.
[jira] [Commented] (FLINK-12707) Close minicluster will cause memory leak when there are StreamTask closed abnormal
[ https://issues.apache.org/jira/browse/FLINK-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858593#comment-16858593 ] Till Rohrmann commented on FLINK-12707: --- Could this be caused by FLINK-11630? It sounds as if the {{Task}} threads are not being terminated. If this is the case, then please close this issue as a duplicate of FLINK-11630. > Close minicluster will cause memory leak when there are StreamTask closed > abnormal > --- > > Key: FLINK-12707 > URL: https://issues.apache.org/jira/browse/FLINK-12707 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.8.0 >Reporter: liuzhaokun >Assignee: liuzhaokun >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > There are several threads in my application,and every thread will launch a > flink job with LocalStreamEnvironment/MiniCluster.But when I interrupt these > threads again and again, I found there are many residue threads,and that > caused a memory leak. > When I debug it,this message appears > "Recipient[Actor[akka://flink/user/taskmanager_0#-584606215]] had already > been terminated. Sender[null] sent the message of type > "org.apache.flink.runtime.rpc.messages.LocalRpcInvocation". So I think > when flink close minicluster,TaskExecutor will be closed in the first > place,and this operation will cause akka message which will close StreamTask > abnormal.So the work thread will be more and more. > I call Thread sleep in my application to avoid this problem,any good > suggestions?
[jira] [Closed] (FLINK-12513) Improve end-to-end (bash based) tests
[ https://issues.apache.org/jira/browse/FLINK-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-12513. -- Resolution: Fixed Merged to master as a0cf2054cc4ab9ea2a6cce3f60ec0b2dc8a59b27..29f4e523600e80886655fb317961dd58a874e19f > Improve end-to-end (bash based) tests > - > > Key: FLINK-12513 > URL: https://issues.apache.org/jira/browse/FLINK-12513 > Project: Flink > Issue Type: Improvement > Components: Tests >Reporter: Alex >Assignee: Alex >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Tests in the {{flink-end-to-end-tests/test-scripts}} directory re-use the > same Flink directory to configure and launch a Flink cluster for a test. > As part of their setup steps, they may modify Flink config files (application > config, logs config) and the directory itself (e.g. by copying some jars into > the {{lib}} folder). Also, some tests involve additional services (e.g. > spinning up ZooKeeper, Kafka, and similar clusters). > The corresponding clean-up code (to stop services, to revert the Flink directory > to its original state) is spread out and not well structured. In > particular: > * the test runner itself reverts the Flink config (but doesn't revert other > changes in the Flink dir); > * some tests use shell's {{trap}} on-exit hook for the clean-up callback. Adding > multiple such callbacks in one test would result in an improper test tear > down. > As a result, some tests may leave leftovers that affect the execution of > subsequent steps. > The proposal is to introduce a helper method for using one global (per test > run) {{trap}} hook that would enable adding multiple clean-up callbacks. This > should enable registering "resource" clean-up callbacks in the same place > where the resource is used/launched. > Optional improvement: make the test runner create a temporary copy of the Flink > directory and launch the test using that temporary directory. 
After the test is > done, the temporary directory would be removed.
[jira] [Updated] (FLINK-12707) Close minicluster will cause memory leak when there are StreamTask closed abnormal
[ https://issues.apache.org/jira/browse/FLINK-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12707: -- Issue Type: Bug (was: Improvement) > Close minicluster will cause memory leak when there are StreamTask closed > abnormal > --- > > Key: FLINK-12707 > URL: https://issues.apache.org/jira/browse/FLINK-12707 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.8.0 >Reporter: liuzhaokun >Assignee: liuzhaokun >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > There are several threads in my application,and every thread will launch a > flink job with LocalStreamEnvironment/MiniCluster.But when I interrupt these > threads again and again, I found there are many residue threads,and that > caused a memory leak. > When I debug it,this message appears > "Recipient[Actor[akka://flink/user/taskmanager_0#-584606215]] had already > been terminated. Sender[null] sent the message of type > "org.apache.flink.runtime.rpc.messages.LocalRpcInvocation". So I think > when flink close minicluster,TaskExecutor will be closed in the first > place,and this operation will cause akka message which will close StreamTask > abnormal.So the work thread will be more and more. > I call Thread sleep in my application to avoid this problem,any good > suggestions?
[GitHub] [flink] pnowojski commented on issue #8451: [FLINK-12513][e2e] Improve end-to-end (bash based) tests
pnowojski commented on issue #8451: [FLINK-12513][e2e] Improve end-to-end (bash based) tests URL: https://github.com/apache/flink/pull/8451#issuecomment-499870527 Ok :) Merging
[GitHub] [flink] tillrohrmann closed pull request #8595: [FLINK-12707]Close minicluster will cause memory leak when there are StreamTask closed abnormal
tillrohrmann closed pull request #8595: [FLINK-12707]Close minicluster will cause memory leak when there are StreamTask closed abnormal URL: https://github.com/apache/flink/pull/8595
[GitHub] [flink] pnowojski merged pull request #8451: [FLINK-12513][e2e] Improve end-to-end (bash based) tests
pnowojski merged pull request #8451: [FLINK-12513][e2e] Improve end-to-end (bash based) tests URL: https://github.com/apache/flink/pull/8451
[GitHub] [flink] tillrohrmann commented on issue #8595: [FLINK-12707]Close minicluster will cause memory leak when there are StreamTask closed abnormal
tillrohrmann commented on issue #8595: [FLINK-12707]Close minicluster will cause memory leak when there are StreamTask closed abnormal URL: https://github.com/apache/flink/pull/8595#issuecomment-499870633 Thanks for opening this PR @liu-zhaokun. I'm not sure whether this solves the underlying problem. I'll close this PR. Let's discuss on the JIRA first what we want to solve.
[jira] [Updated] (FLINK-12730) Combine BitSet implementations in flink-runtime
[ https://issues.apache.org/jira/browse/FLINK-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12730: -- Component/s: (was: Runtime / Coordination) Runtime / Task > Combine BitSet implementations in flink-runtime > --- > > Key: FLINK-12730 > URL: https://issues.apache.org/jira/browse/FLINK-12730 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Reporter: Liya Fan >Assignee: Liya Fan >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > There are two implementations of BitSet in the flink-runtime component: one is > org.apache.flink.runtime.operators.util.BloomFilter#BitSet, while the other > is org.apache.flink.runtime.operators.util.BitSet. > The two classes are quite similar in their API and implementations. The only > difference is that the former operates on longs while the latter operates > on bytes. This has the following consequences: > # The byte-based BitSet has better performance for get/set operations. > # The long-based BitSet has better performance for the clear operation. > We should combine the two implementations to get the best of both worlds. > 
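The trade-off described in the issue above (long words make `clear()` fast because each store wipes 64 bits, while get/set only ever touch one word anyway) can be illustrated with a minimal sketch. This is not Flink's actual implementation; the class name and sizing are invented:

```java
import java.util.Arrays;

// Sketch: bits stored in long words. set/get index a single word via shifts,
// and clear() benefits from 64-bits-per-store (Arrays.fill on longs).
class CombinedBitSet {
    private final long[] words;

    CombinedBitSet(int bitCount) {
        words = new long[(bitCount + 63) >>> 6]; // round up to whole words
    }

    void set(int index) {
        words[index >>> 6] |= 1L << (index & 63);
    }

    boolean get(int index) {
        return (words[index >>> 6] & (1L << (index & 63))) != 0;
    }

    void clear() {
        Arrays.fill(words, 0L); // clears 64 bits per write
    }
}
```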
[jira] [Commented] (FLINK-12736) ResourceManager may release TM with allocated slots
[ https://issues.apache.org/jira/browse/FLINK-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858589#comment-16858589 ] Till Rohrmann commented on FLINK-12736: --- As a corollary, new partitions could also have been created on the TM, since it can have allocated slots by the time the callback is being processed. I guess in order to properly solve this problem we would need something like a message counter between the RM and the TM. Only if the message counter is the same as before sending the partition check message, we can be sure that nothing has changed on the TM. > ResourceManager may release TM with allocated slots > --- > > Key: FLINK-12736 > URL: https://issues.apache.org/jira/browse/FLINK-12736 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Priority: Critical > Fix For: 1.9.0 > > > The {{ResourceManager}} looks out for TaskManagers that have not had any > slots allocated on them for a while, as these could be released to save > resources. If such a TM is found the RM checks via an RPC call whether the TM > still holds any partitions. If no partition is held then the TM is released. > However, in the RPC callback no check is made whether the TM is actually > _still_ idle. In the meantime a slot could've been allocated on the TM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
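The message-counter idea in the comment above can be sketched roughly as follows: the RM snapshots a counter when it issues the partition check, and only releases the TM if the counter is unchanged when the reply arrives. All names here are invented for illustration, not Flink's actual RPC API:

```java
// Sketch: detect that "something happened" between issuing the partition
// check and processing its callback, using a monotonically increasing counter.
class ReleaseCheck {
    private long messageCounter; // bumped on every state-changing interaction

    long currentCounter() {
        return messageCounter;
    }

    void onSlotRequest() {
        messageCounter++; // a slot allocation invalidates any in-flight check
    }

    /** Release only if the TM holds no partitions AND nothing changed since the check. */
    boolean maySafelyRelease(long counterAtCheckTime, boolean holdsPartitions) {
        return !holdsPartitions && counterAtCheckTime == messageCounter;
    }
}
```

The counter closes the time-of-check/time-of-use gap described in the issue: a stale snapshot makes the release a no-op instead of killing a TM that just received a slot.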
[GitHub] [flink] 1u0 commented on issue #8451: [FLINK-12513][e2e] Improve end-to-end (bash based) tests
1u0 commented on issue #8451: [FLINK-12513][e2e] Improve end-to-end (bash based) tests URL: https://github.com/apache/flink/pull/8451#issuecomment-499868494 > Would be nice to run this changes in CI environment and see if there are any test failures. That message is a little bit outdated. Since then, I have already experimented a little bit with CI runs. > Or do you think that risk of something failing is small enough (basing on passing pre-commit tests) that it's better to commit it as it is and just observe next nightly build? I lean towards this.
[jira] [Updated] (FLINK-12736) ResourceManager may release TM with allocated slots
[ https://issues.apache.org/jira/browse/FLINK-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12736: -- Priority: Critical (was: Major) > ResourceManager may release TM with allocated slots > --- > > Key: FLINK-12736 > URL: https://issues.apache.org/jira/browse/FLINK-12736 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Priority: Critical > Fix For: 1.9.0 > > > The {{ResourceManager}} looks out for TaskManagers that have not had any > slots allocated on them for a while, as these could be released to save > resources. If such a TM is found the RM checks via an RPC call whether the TM > still holds any partitions. If no partition is held then the TM is released. > However, in the RPC callback no check is made whether the TM is actually > _still_ idle. In the meantime a slot could've been allocated on the TM.
[jira] [Updated] (FLINK-12761) Fine grained resource management
[ https://issues.apache.org/jira/browse/FLINK-12761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12761: -- Fix Version/s: 1.9.0 > Fine grained resource management > > > Key: FLINK-12761 > URL: https://issues.apache.org/jira/browse/FLINK-12761 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.8.0, 1.9.0 >Reporter: Tony Xintong Song >Assignee: Tony Xintong Song >Priority: Major > Labels: Umbrella > Fix For: 1.9.0 > > > This is an umbrella issue for enabling fine grained resource management in > Flink. > Fine grained resource management is a big topic that requires long term > efforts. There are many issues to be addressed and designing decisions to be > made, some of which may not be resolved in short time. Here we propose our > design and implementation plan for the upcoming release 1.9, as well as our > thoughts and ideas on the long term road map on this topic. > A practical short term target is to enable fine grained resource management > for batch sql jobs only in the upcoming Flink 1.9. This is necessary for > batch operators added from blink to achieve good performance. > Please find detailed design and implementation plan in attached docs. Any > comment and feedback are welcomed and appreciated.
[jira] [Updated] (FLINK-12761) Fine grained resource management
[ https://issues.apache.org/jira/browse/FLINK-12761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12761: -- Component/s: (was: Runtime / Configuration) Runtime / Coordination > Fine grained resource management > > > Key: FLINK-12761 > URL: https://issues.apache.org/jira/browse/FLINK-12761 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.8.0, 1.9.0 >Reporter: Tony Xintong Song >Assignee: Tony Xintong Song >Priority: Major > Labels: Umbrella > > This is an umbrella issue for enabling fine grained resource management in > Flink. > Fine grained resource management is a big topic that requires long term > efforts. There are many issues to be addressed and designing decisions to be > made, some of which may not be resolved in short time. Here we propose our > design and implementation plan for the upcoming release 1.9, as well as our > thoughts and ideas on the long term road map on this topic. > A practical short term target is to enable fine grained resource management > for batch sql jobs only in the upcoming Flink 1.9. This is necessary for > batch operators added from blink to achieve good performance. > Please find detailed design and implementation plan in attached docs. Any > comment and feedback are welcomed and appreciated.
[jira] [Updated] (FLINK-12776) Ambiguous content in flink-dist NOTICE file
[ https://issues.apache.org/jira/browse/FLINK-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-12776: -- Component/s: API / Python > Ambiguous content in flink-dist NOTICE file > --- > > Key: FLINK-12776 > URL: https://issues.apache.org/jira/browse/FLINK-12776 > Project: Flink > Issue Type: Improvement > Components: API / Python, Release System >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Priority: Blocker > Fix For: 1.9.0 > > > With FLINK-12409 we include the new flink-python module in flink-dist. As a > result we now have 2 {{flink-python}} entries in the flink-dist NOTICE file, > one for the old batch API and one for the newly added one, which is > ambiguous. We should rectify this by either excluding the old batch API from > flink-dist, or rename the new module to something like {{flink-api-python}}. 