[GitHub] [flink] sunjincheng121 commented on issue #8393: [hotfix][table] Fix typo for method name in FieldReferenceLookup
sunjincheng121 commented on issue #8393: [hotfix][table] Fix typo for method name in FieldReferenceLookup URL: https://github.com/apache/flink/pull/8393#issuecomment-491166331 Thanks for your review! @hequn8128 @flinkbot approve all Merging... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] GJL commented on a change in pull request #8309: [FLINK-12229] [runtime] Implement LazyFromSourcesScheduling Strategy
GJL commented on a change in pull request #8309: [FLINK-12229] [runtime] Implement LazyFromSourcesScheduling Strategy
URL: https://github.com/apache/flink/pull/8309#discussion_r282749057

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/strategy/LazyFromSourcesSchedulingStrategy.java ##
@@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler.strategy;
+
+import org.apache.flink.api.common.InputDependencyConstraint;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
+import org.apache.flink.runtime.jobgraph.IntermediateDataSet;
+import org.apache.flink.runtime.jobgraph.IntermediateDataSetID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.DeploymentOption;
+import org.apache.flink.runtime.scheduler.ExecutionVertexDeploymentOption;
+import org.apache.flink.runtime.scheduler.SchedulerOperations;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.flink.runtime.scheduler.strategy.SchedulingResultPartition.ResultPartitionState.DONE;
+import static org.apache.flink.runtime.scheduler.strategy.SchedulingResultPartition.ResultPartitionState.PRODUCING;
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * {@link SchedulingStrategy} instance for batch job which schedule vertices when input data are ready.
+ */
+public class LazyFromSourcesSchedulingStrategy implements SchedulingStrategy {
+
+    private final SchedulerOperations schedulerOperations;
+
+    private final SchedulingTopology schedulingTopology;
+
+    private final Map deploymentOptions;
+
+    private final Map intermediateDataSets;
+
+    public LazyFromSourcesSchedulingStrategy(
+        SchedulerOperations schedulerOperations,
+        SchedulingTopology schedulingTopology) {
+        this.schedulerOperations = checkNotNull(schedulerOperations);
+        this.schedulingTopology = checkNotNull(schedulingTopology);
+        this.intermediateDataSets = new HashMap<>();
+        this.deploymentOptions = new HashMap<>();
+    }
+
+    @Override
+    public void startScheduling() {
+        List executionVertexDeploymentOptions = new ArrayList<>();
+        DeploymentOption updateOption = new DeploymentOption(true);
+        DeploymentOption nonUpdateOption = new DeploymentOption(false);
+
+        for (SchedulingExecutionVertex schedulingVertex : schedulingTopology.getVertices()) {
+            DeploymentOption option = nonUpdateOption;
+            for (SchedulingResultPartition srp : schedulingVertex.getProducedResultPartitions()) {
+                SchedulingIntermediateDataSet sid = intermediateDataSets.computeIfAbsent(srp.getResultId(),
+                    (key) -> new SchedulingIntermediateDataSet());
+                sid.addSchedulingResultPartition(srp);
+                if (srp.getPartitionType().isPipelined()) {
+                    option = updateOption;
+                }
+            }
+            deploymentOptions.put(schedulingVertex.getId(), option);
+
+            if (schedulingVertex.getConsumedResultPartitions().isEmpty()) {
+                // schedule vertices without consumed result partition
+                executionVertexDeploymentOptions.add(
+                    new ExecutionVertexDeploymentOption(schedulingVertex.getId(), option));
+            }
+        }
+
+        schedulerOperations.allocateSlotsAndDeploy(executionVertexDeploymentOptions);
+    }
+
+    @Override
+    public void restartTasks(Set verticesToRestart) {
+        // increase counter of the dataset first
+        for (ExecutionVertexID executionVertexId :
[jira] [Created] (FLINK-12473) Add ML pipeline and ML lib Core API
Shaoxuan Wang created FLINK-12473:
----------------------------------

Summary: Add ML pipeline and ML lib Core API
Key: FLINK-12473
URL: https://issues.apache.org/jira/browse/FLINK-12473
Project: Flink
Issue Type: Sub-task
Components: Library / Machine Learning
Reporter: Shaoxuan Wang
Attachments: image-2019-05-10-12-50-15-869.png

This Jira will introduce the major interfaces for ML pipeline and ML lib. The major interfaces and their relationships are shown in the diagram below. For more details, please refer to the [FLIP39 design doc|https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo].

!image-2019-05-10-12-50-15-869.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] jiasheng55 opened a new pull request #8396: [FLINK-12468][yarn] Unregister application from the YARN Resource Man…
jiasheng55 opened a new pull request #8396: [FLINK-12468][yarn] Unregister application from the YARN Resource Man…
URL: https://github.com/apache/flink/pull/8396

…ager with a valid appTrackingUrl

## What is the purpose of the change

Currently, when a Flink job on Yarn finishes, its tracking URL is not set. As a result, we cannot jump to the Flink history server directly from Yarn. This PR aims to provide a valid "Tracking URL".

## Brief change log

Add a new configuration option that sets the "Tracking URL" of a finished Yarn application to the address of a Flink history server.

## Verifying this change

All the existing integration tests.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
[jira] [Updated] (FLINK-12468) Unregister application from the YARN Resource Manager with a valid appTrackingUrl
[ https://issues.apache.org/jira/browse/FLINK-12468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-12468:
----------------------------------
Labels: pull-request-available (was: )

> Unregister application from the YARN Resource Manager with a valid appTrackingUrl
> ---------------------------------------------------------------------------------
>
> Key: FLINK-12468
> URL: https://issues.apache.org/jira/browse/FLINK-12468
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Reporter: Victor Wong
> Assignee: Victor Wong
> Priority: Minor
> Labels: pull-request-available
> Attachments: image-2019-05-09-18-06-06-954.png, image-2019-05-09-18-07-52-725.png
>
> Currently, when a Flink job on Yarn finishes, its tracking URL is not valid.
> As a result, we cannot jump to the Flink history server directly from Yarn.
> !image-2019-05-09-18-06-06-954.png!
> We can provide a valid appTrackingUrl when unregistering from Yarn:
> _org.apache.flink.yarn.YarnResourceManager#internalDeregisterApplication_
>
> _!image-2019-05-09-18-07-52-725.png!_
[GitHub] [flink] flinkbot commented on issue #8396: [FLINK-12468][yarn] Unregister application from the YARN Resource Man…
flinkbot commented on issue #8396: [FLINK-12468][yarn] Unregister application from the YARN Resource Man…
URL: https://github.com/apache/flink/pull/8396#issuecomment-491151985

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Review Progress

* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process.

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] godfreyhe commented on a change in pull request #8317: [FLINK-12371] [table-planner-blink] Add support for converting (NOT) IN/ (NOT) EXISTS to semi-join/anti-join, and generating opt
godfreyhe commented on a change in pull request #8317: [FLINK-12371] [table-planner-blink] Add support for converting (NOT) IN/ (NOT) EXISTS to semi-join/anti-join, and generating optimized logical plan
URL: https://github.com/apache/flink/pull/8317#discussion_r282738118

## File path: flink-table/flink-table-planner-blink/src/main/java/org/apache/calcite/rel/core/Join.java ##
@@ -0,0 +1,341 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.core;

Review comment:
If the packages of the Flink `Join` and the Calcite `Join` differ, we need to copy all code associated with `Join` from Calcite to Flink. This class will be deleted when calcite-1.20 is released (maybe one month later).
[GitHub] [flink] flinkbot commented on issue #8395: [FLINK-12459][yarn] YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR shou…
flinkbot commented on issue #8395: [FLINK-12459][yarn] YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR shou…
URL: https://github.com/apache/flink/pull/8395#issuecomment-491143035

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Review Progress

* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process.

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] jiasheng55 opened a new pull request #8395: [FLINK-12459][yarn] YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR shou…
jiasheng55 opened a new pull request #8395: [FLINK-12459][yarn] YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR shou…
URL: https://github.com/apache/flink/pull/8395

…ld affect the order of classpath between user jars and flink jars

## What is the purpose of the change

Currently, when `YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR` is set to "LAST", user jars still come before "flink.jar" on the classpath, which may cause class loading conflicts in detached mode.

## Brief change log

Add user jars to the classpath after "flink.jar" when `YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR` is "LAST".

## Verifying this change

All the existing integration tests.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
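The ordering this pull request asks for can be illustrated with a minimal sketch. It is not the code in `AbstractYarnClusterDescriptor#startAppMaster`: the class `ClasspathOrderSketch`, the enum `UserJarInclusion`, and the method `buildClasspath` are hypothetical names chosen for illustration, and only the "FIRST" and "LAST" cases are modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClasspathOrderSketch {

    /** Hypothetical stand-in for the option's values; only two cases are modeled. */
    public enum UserJarInclusion { FIRST, LAST }

    /**
     * Returns the classpath entries in order. With LAST, the user jars are
     * appended after every Flink entry, including flink.jar itself.
     */
    public static List<String> buildClasspath(
            UserJarInclusion mode,
            List<String> userJars,
            List<String> flinkLibs,
            String flinkJar) {
        List<String> classpath = new ArrayList<>();
        if (mode == UserJarInclusion.FIRST) {
            classpath.addAll(userJars);
        }
        classpath.addAll(flinkLibs);
        classpath.add(flinkJar);
        if (mode == UserJarInclusion.LAST) {
            // Appending here, rather than before flinkJar, is the point of the change.
            classpath.addAll(userJars);
        }
        return classpath;
    }

    public static void main(String[] args) {
        List<String> classpath = buildClasspath(
                UserJarInclusion.LAST,
                Arrays.asList("user-job.jar"),
                Arrays.asList("lib/log4j.jar"),
                "flink.jar");
        System.out.println(String.join(":", classpath));
    }
}
```

With this ordering, a user jar that accidentally bundles Flink classes is consulted only after "flink.jar", which is the conflict scenario the description mentions.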
[jira] [Updated] (FLINK-12459) YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR should affect the order of classpath between user jars and flink jars
[ https://issues.apache.org/jira/browse/FLINK-12459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12459: --- Labels: pull-request-available (was: ) > YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR should affect the order of > classpath between user jars and flink jars > -- > > Key: FLINK-12459 > URL: https://issues.apache.org/jira/browse/FLINK-12459 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Reporter: Victor Wong >Assignee: Victor Wong >Priority: Minor > Labels: pull-request-available > Attachments: image-2019-05-09-15-04-36-360.png > > > When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think > the user jars should come after all flink libs in runtime classpath, > including "flink.jar". > But actually it's not: > org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster > !image-2019-05-09-15-04-36-360.png! > I'm not sure if it's an expected behavior, because if a user forgets to mark > Flink dependencies as "provided", it causes conflicts in detached mode. > Can we optimize it like this: > [https://github.com/apache/flink/commit/86f264d76b3cfd89346f8c5ab2b8a9e99600aaee] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12459) YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR should affect the order of classpath between user jars and flink jars
[ https://issues.apache.org/jira/browse/FLINK-12459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Victor Wong updated FLINK-12459: Description: When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think the user jars should come after all flink libs in runtime classpath, including "flink.jar". But actually it's not: org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster !image-2019-05-09-15-04-36-360.png! I'm not sure if it's an expected behavior, because if a user forgets to mark Flink dependencies as "provided", it causes conflicts in detached mode. was: When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think the user jars should come after all flink libs in runtime classpath, including "flink.jar". But actually it's not: org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster !image-2019-05-09-15-04-36-360.png! I'm not sure if it's an expected behavior, because if a user forgets to mark Flink dependencies as "provided", it causes conflicts in detached mode. Can we optimize it like this: [https://github.com/apache/flink/commit/86f264d76b3cfd89346f8c5ab2b8a9e99600aaee] > YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR should affect the order of > classpath between user jars and flink jars > -- > > Key: FLINK-12459 > URL: https://issues.apache.org/jira/browse/FLINK-12459 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Reporter: Victor Wong >Assignee: Victor Wong >Priority: Minor > Labels: pull-request-available > Attachments: image-2019-05-09-15-04-36-360.png > > Time Spent: 10m > Remaining Estimate: 0h > > When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think > the user jars should come after all flink libs in runtime classpath, > including "flink.jar". > But actually it's not: > org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster > !image-2019-05-09-15-04-36-360.png! 
> I'm not sure if it's an expected behavior, because if a user forgets to mark > Flink dependencies as "provided", it causes conflicts in detached mode. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12459) YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR should affect the order of classpath between user jars and flink jars
[ https://issues.apache.org/jira/browse/FLINK-12459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Victor Wong updated FLINK-12459: Description: When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think the user jars should come after all flink libs in runtime classpath, including "flink.jar". But actually it's not: org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster !image-2019-05-09-15-04-36-360.png! I'm not sure if it's an expected behavior, because if a user forgets to mark Flink dependencies as "provided", it causes conflicts in detached mode. Can we optimize it like this: [https://github.com/apache/flink/commit/86f264d76b3cfd89346f8c5ab2b8a9e99600aaee] was: When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think the user jars should come after all flink libs in runtime classpath, including "flink.jar". But actually it's not: org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster !image-2019-05-09-15-04-36-360.png! I'm not sure if it's an expected behavior, because if a user forgets to mark Flink dependencies as "provided", it causes conflicts in detached mode. > YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR should affect the order of > classpath between user jars and flink jars > -- > > Key: FLINK-12459 > URL: https://issues.apache.org/jira/browse/FLINK-12459 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Reporter: Victor Wong >Assignee: Victor Wong >Priority: Minor > Attachments: image-2019-05-09-15-04-36-360.png > > > When setting YarnConfigOptions#CLASSPATH_INCLUDE_USER_JAR to "LAST", I think > the user jars should come after all flink libs in runtime classpath, > including "flink.jar". > But actually it's not: > org.apache.flink.yarn.AbstractYarnClusterDescriptor#startAppMaster > !image-2019-05-09-15-04-36-360.png! 
> I'm not sure if it's an expected behavior, because if a user forgets to mark > Flink dependencies as "provided", it causes conflicts in detached mode. > Can we optimize it like this: > [https://github.com/apache/flink/commit/86f264d76b3cfd89346f8c5ab2b8a9e99600aaee] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11115) Port some flink.ml algorithms to table based
[ https://issues.apache.org/jira/browse/FLINK-11115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaoxuan Wang closed FLINK-11115.
--------------------------------
Resolution: Duplicate

We decided to move the entire ML pipeline and ML lib development under the umbrella Jira (FLINK-12470) of FLIP39.

> Port some flink.ml algorithms to table based
> --------------------------------------------
>
> Key: FLINK-11115
> URL: https://issues.apache.org/jira/browse/FLINK-11115
> Project: Flink
> Issue Type: Sub-task
> Components: Library / Machine Learning
> Reporter: Weihua Jiang
> Priority: Major
>
> This sub-task is to port some flink.ml algorithms to a table-based
> implementation to verify the correctness of the design and implementation.
[jira] [Closed] (FLINK-11114) Support wrapping inference pipeline as a UDF function in SQL
[ https://issues.apache.org/jira/browse/FLINK-11114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaoxuan Wang closed FLINK-11114.
--------------------------------
Resolution: Duplicate

We decided to move the entire ML pipeline and ML lib development under the umbrella Jira (FLINK-12470) of FLIP39.

> Support wrapping inference pipeline as a UDF function in SQL
> ------------------------------------------------------------
>
> Key: FLINK-11114
> URL: https://issues.apache.org/jira/browse/FLINK-11114
> Project: Flink
> Issue Type: Sub-task
> Components: Library / Machine Learning
> Reporter: Weihua Jiang
> Priority: Major
>
> Though the standard ML pipeline inference usage is table based (that is, the
> user constructs the DAG on the client using only the Table API), it is also
> desirable to wrap the inference logic as a UDF to be used in SQL. This means
> it shall be record based.
[GitHub] [flink] YueYeShen commented on a change in pull request #8384: [FLINK-11610][docs-zh] Translate the "Examples" page into Chinese
YueYeShen commented on a change in pull request #8384: [FLINK-11610][docs-zh] Translate the "Examples" page into Chinese
URL: https://github.com/apache/flink/pull/8384#discussion_r282731659

## File path: docs/examples/index.zh.md ##
@@ -1,7 +1,9 @@
 ---
 title: 示例
 nav-id: examples
-nav-title: ' 示例'
+
+nav-title: ' Examples'

Review comment:
OK, thank you, I will do that.
[jira] [Closed] (FLINK-11113) Support periodically update models when inferencing
[ https://issues.apache.org/jira/browse/FLINK-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaoxuan Wang closed FLINK-11113.
--------------------------------
Resolution: Duplicate

We decided to move the entire ML pipeline and ML lib development under the umbrella Jira (FLINK-12470) of FLIP39.

> Support periodically update models when inferencing
> ---------------------------------------------------
>
> Key: FLINK-11113
> URL: https://issues.apache.org/jira/browse/FLINK-11113
> Project: Flink
> Issue Type: Sub-task
> Components: Library / Machine Learning
> Reporter: Weihua Jiang
> Priority: Major
>
> As models will be periodically updated and the inference job may be running
> on a stream and will NOT finish, it is important to have this inference job
> periodically reload the latest model without stopping and restarting the
> inference job.
[jira] [Closed] (FLINK-11111) Create a new set of parameters
[ https://issues.apache.org/jira/browse/FLINK-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaoxuan Wang closed FLINK-11111.
--------------------------------
Resolution: Duplicate

We decided to move the entire ML pipeline and ML lib development under the umbrella Jira (FLINK-12470) of FLIP39.

> Create a new set of parameters
> ------------------------------
>
> Key: FLINK-11111
> URL: https://issues.apache.org/jira/browse/FLINK-11111
> Project: Flink
> Issue Type: Sub-task
> Components: Library / Machine Learning
> Reporter: Weihua Jiang
> Priority: Major
>
> One goal of the new Table-based ML Pipeline is to make tooling easy. That is,
> any ML/AI algorithm that adapts to this ML Pipeline standard shall declare all
> its parameters via a well-defined interface, so that the AI platform can
> uniformly get/set the corresponding parameters while staying agnostic about
> the details of the specific algorithm. The only difference between algorithms,
> from a user's perspective, is the name. All other algorithm parameters are
> self-descriptive.
>
> This will also be useful for future Flink ML SQL, as the SQL parser can
> uniformly handle all of these parameters. This can greatly simplify the SQL
> parser.
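The idea of self-descriptive parameters behind a uniform get/set interface can be sketched as follows. This is only an illustration under assumptions, not the interface proposed in the design document: `ParamsSketch`, `ParamInfo`, `Params`, and `MAX_ITER` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParamsSketch {

    /** A self-describing parameter: name, human-readable description, default value. */
    public static final class ParamInfo<T> {
        public final String name;
        public final String description;
        public final T defaultValue;

        public ParamInfo(String name, String description, T defaultValue) {
            this.name = name;
            this.description = description;
            this.defaultValue = defaultValue;
        }
    }

    /** Uniform container: a platform can get/set values without knowing the algorithm. */
    public static final class Params {
        private final Map<String, Object> values = new LinkedHashMap<>();

        public <T> Params set(ParamInfo<T> info, T value) {
            values.put(info.name, value);
            return this;
        }

        @SuppressWarnings("unchecked")
        public <T> T get(ParamInfo<T> info) {
            // Fall back to the declared default when nothing was set explicitly.
            return values.containsKey(info.name) ? (T) values.get(info.name) : info.defaultValue;
        }
    }

    /** Example declaration an algorithm might expose (hypothetical). */
    public static final ParamInfo<Integer> MAX_ITER =
            new ParamInfo<>("maxIter", "Maximum number of iterations", 100);

    public static void main(String[] args) {
        Params params = new Params().set(MAX_ITER, 10);
        System.out.println(MAX_ITER.name + " = " + params.get(MAX_ITER));
    }
}
```

Because each `ParamInfo` carries its own name and description, a tool or SQL parser could list and validate parameters generically, which is the tooling benefit the issue describes.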
[jira] [Closed] (FLINK-11112) Support pipeline import/export
[ https://issues.apache.org/jira/browse/FLINK-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaoxuan Wang closed FLINK-11112.
--------------------------------
Resolution: Duplicate

We decided to move the entire ML pipeline and ML lib development under the umbrella Jira (FLINK-12470) of FLIP39.

> Support pipeline import/export
> ------------------------------
>
> Key: FLINK-11112
> URL: https://issues.apache.org/jira/browse/FLINK-11112
> Project: Flink
> Issue Type: Sub-task
> Components: Library / Machine Learning
> Reporter: Weihua Jiang
> Priority: Major
>
> It is quite common to have one job that trains and periodically exports the
> trained pipeline and models, and another job that loads the exported pipeline
> for inference.
>
> Thus, we will need functionality for pipeline import/export. This shall work
> in both streaming and batch environments.
[jira] [Closed] (FLINK-11109) Create Table based optimizers
[ https://issues.apache.org/jira/browse/FLINK-11109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaoxuan Wang closed FLINK-11109. - Resolution: Duplicate We decided to move the entire ml pipeline and ml lib development under the umbrella Jira (FLINK-12470) of FLIP39 > Create Table based optimizers > - > > Key: FLINK-11109 > URL: https://issues.apache.org/jira/browse/FLINK-11109 > Project: Flink > Issue Type: Sub-task > Components: Library / Machine Learning >Reporter: Weihua Jiang >Priority: Major > > The existing optimizers in org.apache.flink.ml package are dataset based. > This task is to create a new set of optimizers which are table based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11108) Create a new set of table based ML Pipeline classes
[ https://issues.apache.org/jira/browse/FLINK-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaoxuan Wang closed FLINK-11108. - Resolution: Duplicate We decided to move the entire ml pipeline and ml lib development under the umbrella Jira (FLINK-12470) of FLIP39 > Create a new set of table based ML Pipeline classes > --- > > Key: FLINK-11108 > URL: https://issues.apache.org/jira/browse/FLINK-11108 > Project: Flink > Issue Type: Sub-task > Components: Library / Machine Learning >Reporter: Weihua Jiang >Priority: Major > > The main classes are: > # PipelineStage (the trait for each pipeline stage) > # Estimator (training stage) > # Transformer (the inference/feature engineering stage) > # Pipeline (the whole pipeline) > # Predictor (extends Estimator, for supervised learning) > Detailed design can be referred at parent JIRA's design document. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
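As a rough illustration of how the classes listed in the issue above could relate to one another, here is a minimal sketch. It assumes a generic type `T` standing in for a Flink `Table`, and the method names (`fit`, `transform`, `appendStage`) are assumptions made for this example, not the design from the referenced document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: T stands in for Flink's Table; these are not Flink classes.
interface PipelineStage {}

// Inference / feature-engineering stage.
interface Transformer<T> extends PipelineStage {
    T transform(T input);
}

// Training stage: fitting produces a Transformer (the trained model).
interface Estimator<T> extends PipelineStage {
    Transformer<T> fit(T input);
}

// Marker for supervised learning estimators.
interface Predictor<T> extends Estimator<T> {}

// The whole pipeline: itself an Estimator over an ordered list of stages.
final class Pipeline<T> implements Estimator<T> {
    private final List<PipelineStage> stages = new ArrayList<>();

    Pipeline<T> appendStage(PipelineStage stage) {
        stages.add(stage);
        return this;
    }

    @Override
    @SuppressWarnings("unchecked")
    public Transformer<T> fit(T input) {
        List<Transformer<T>> fitted = new ArrayList<>();
        T current = input;
        for (PipelineStage stage : stages) {
            // Estimators are trained on the data as transformed so far.
            Transformer<T> t = stage instanceof Estimator
                    ? ((Estimator<T>) stage).fit(current)
                    : (Transformer<T>) stage;
            fitted.add(t);
            current = t.transform(current);
        }
        // The fitted pipeline applies every fitted stage in order.
        return data -> {
            T out = data;
            for (Transformer<T> t : fitted) {
                out = t.transform(out);
            }
            return out;
        };
    }
}

final class PipelineDemo {
    // Transformer that doubles every value.
    static Transformer<List<Double>> doubler() {
        return data -> {
            List<Double> out = new ArrayList<>();
            for (double v : data) out.add(v * 2);
            return out;
        };
    }

    // Estimator that learns the training mean and returns a centering transformer.
    static Estimator<List<Double>> meanCenterer() {
        return train -> {
            double mean = train.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
            return data -> {
                List<Double> out = new ArrayList<>();
                for (double v : data) out.add(v - mean);
                return out;
            };
        };
    }
}
```

Making `Pipeline` an `Estimator` lets a whole pipeline be nested as a single stage of another pipeline, which is one common reason for this shape.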
[jira] [Closed] (FLINK-11096) Create a new table based flink ML package
[ https://issues.apache.org/jira/browse/FLINK-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaoxuan Wang closed FLINK-11096. - Resolution: Duplicate We decided to move the entire ml pipeline and ml lib development under the umbrella Jira (FLINK-12470) of FLIP39 > Create a new table based flink ML package > - > > Key: FLINK-11096 > URL: https://issues.apache.org/jira/browse/FLINK-11096 > Project: Flink > Issue Type: Sub-task > Components: Library / Machine Learning >Reporter: Weihua Jiang >Priority: Major > > Currently, the DataSet based ML library is under org.apache._flink.ml_ scala > package and under _flink-libraries/flink-ml directory._ > > There are two questions related to packaging: > # Shall we create a new scala/java package, e.g. org.apache.flink.table.ml? > Or still stay in org.apache.flink.ml? > # Shall we still put new code in flink-libraries/flink-ml directory or > create a new one, e.g. flink-libraries/flink-table-ml and corresponding maven > package? > > I implemented a prototype for the design and found that the new design is > very hard to fit into existing flink.ml codebase. The existing flink.ml code > is tightly coupled with DataSet API. Thus, I have to rewrite almost all parts > of flink.ml to get some sample case to work. The only reusable code from > flink.ml are the base math classes under _org.apache.flink.ml.math_ and > _org.apache.flink.ml.metrics.distance_ packages. > Considering this fact, I will prefer to create a new package > org.apache.flink.table.ml and a new maven package flink-table-ml. > > Please feel free to give your feedbacks. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-11095) Table based ML Pipeline
[ https://issues.apache.org/jira/browse/FLINK-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836837#comment-16836837 ] Shaoxuan Wang edited comment on FLINK-11095 at 5/10/19 2:48 AM: We decided to move the entire ml pipeline and ml lib development under the umbrella Jira ([FLINK-12470|https://issues.apache.org/jira/browse/FLINK-12470]) of FLIP39 was (Author: shaoxuanwang): We decided to move the entire ml pipeline and ml lib development under FLIP39 > Table based ML Pipeline > --- > > Key: FLINK-11095 > URL: https://issues.apache.org/jira/browse/FLINK-11095 > Project: Flink > Issue Type: New Feature > Components: Library / Machine Learning >Reporter: Weihua Jiang >Priority: Major > > As Table API will be the unified high level API for both batch and streaming, > it is desired to have a table API based ML Pipeline definition. > This new table based ML Pipeline shall: > # support unified batch/stream ML/AI functionalities (train/inference). > # seamless integrated with flink Table based ecosystem. > # provide a base for further flink based AI platform/tooling support. > # provide a base for further flink ML SQL integration. > The initial design is here > [https://docs.google.com/document/d/1PLddLEMP_wn4xHwi6069f3vZL7LzkaP0MN9nAB63X90/edit?usp=sharing.] > And based on this design, I made some initial implementation/prototyping. I > will share the code later. > This is the umbrella JIRA. I will create corresponding sub-jira for each > sub-task. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12472) Support setting attemptFailuresValidityInterval of jobs on Yarn
Victor Wong created FLINK-12472: --- Summary: Support setting attemptFailuresValidityInterval of jobs on Yarn Key: FLINK-12472 URL: https://issues.apache.org/jira/browse/FLINK-12472 Project: Flink Issue Type: Improvement Components: Deployment / YARN Reporter: Victor Wong Assignee: Victor Wong According to the documentation of [Yarn|http://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.html], a Yarn application can set an _attemptFailuresValidityInterval_ to reset application attempts. "attemptFailuresValidityInterval. _The default value is -1. when attemptFailuresValidityInterval in milliseconds is set to > 0, the failure number will no take failures which happen out of the validityInterval into failure count. If failure count reaches to maxAppAttempts, the application will be failed."_ We can use this feature so that long-running Flink jobs on Yarn do not exhaust their application attempts because of failures that happened long ago. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
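To make the counting rule quoted above concrete, here is a small self-contained model of it. This is a hypothetical sketch, not YARN's implementation: the class `AttemptFailureTracker` and its methods are invented for illustration, and treating a failure exactly at the interval boundary as expired is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the rule quoted from the YARN documentation; not YARN code.
final class AttemptFailureTracker {
    private final int maxAppAttempts;
    private final long validityIntervalMs; // <= 0 (default -1): failures never expire
    private final List<Long> failureTimestamps = new ArrayList<>();

    AttemptFailureTracker(int maxAppAttempts, long validityIntervalMs) {
        this.maxAppAttempts = maxAppAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    void recordFailure(long nowMs) {
        failureTimestamps.add(nowMs);
    }

    // Number of failures that still count at time nowMs.
    int countedFailures(long nowMs) {
        if (validityIntervalMs <= 0) {
            return failureTimestamps.size();
        }
        int count = 0;
        for (long ts : failureTimestamps) {
            // Assumption: a failure exactly validityIntervalMs old no longer counts.
            if (nowMs - ts < validityIntervalMs) {
                count++;
            }
        }
        return count;
    }

    // The application is failed once the counted failures reach maxAppAttempts.
    boolean shouldFailApplication(long nowMs) {
        return countedFailures(nowMs) >= maxAppAttempts;
    }

    public static void main(String[] args) {
        AttemptFailureTracker windowed = new AttemptFailureTracker(2, 1_000L);
        windowed.recordFailure(0L);
        windowed.recordFailure(5_000L);
        System.out.println("windowed fails app: " + windowed.shouldFailApplication(5_000L));

        AttemptFailureTracker unlimited = new AttemptFailureTracker(2, -1L);
        unlimited.recordFailure(0L);
        unlimited.recordFailure(5_000L);
        System.out.println("default fails app: " + unlimited.shouldFailApplication(5_000L));
    }
}
```

With a positive interval, two failures far apart in time do not add up to `maxAppAttempts`, which is why the setting matters for jobs that run for weeks.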
[jira] [Assigned] (FLINK-11560) Translate "Flink Applications" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] francisdu reassigned FLINK-11560: - Assignee: (was: francisdu) > Translate "Flink Applications" page into Chinese > > > Key: FLINK-11560 > URL: https://issues.apache.org/jira/browse/FLINK-11560 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Priority: Major > > Translate "Flink Applications" page into Chinese. > The markdown file is located in: flink-web/flink-applications.zh.md > The url link is: https://flink.apache.org/zh/flink-applications.html > Please adjust the links in the page to Chinese pages when translating. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-11560) Translate "Flink Applications" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] francisdu reassigned FLINK-11560: - Assignee: francisdu > Translate "Flink Applications" page into Chinese > > > Key: FLINK-11560 > URL: https://issues.apache.org/jira/browse/FLINK-11560 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: francisdu >Priority: Major > > Translate "Flink Applications" page into Chinese. > The markdown file is located in: flink-web/flink-applications.zh.md > The url link is: https://flink.apache.org/zh/flink-applications.html > Please adjust the links in the page to Chinese pages when translating. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-11560) Translate "Flink Applications" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] francisdu reassigned FLINK-11560: - Assignee: (was: francisdu) > Translate "Flink Applications" page into Chinese > > > Key: FLINK-11560 > URL: https://issues.apache.org/jira/browse/FLINK-11560 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Priority: Major > > Translate "Flink Applications" page into Chinese. > The markdown file is located in: flink-web/flink-applications.zh.md > The url link is: https://flink.apache.org/zh/flink-applications.html > Please adjust the links in the page to Chinese pages when translating. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11095) Table based ML Pipeline
[ https://issues.apache.org/jira/browse/FLINK-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaoxuan Wang closed FLINK-11095. - Resolution: Fixed We decided to move the entire ml pipeline and ml lib development under FLIP39 > Table based ML Pipeline > --- > > Key: FLINK-11095 > URL: https://issues.apache.org/jira/browse/FLINK-11095 > Project: Flink > Issue Type: New Feature > Components: Library / Machine Learning >Reporter: Weihua Jiang >Priority: Major > > As Table API will be the unified high level API for both batch and streaming, > it is desired to have a table API based ML Pipeline definition. > This new table based ML Pipeline shall: > # support unified batch/stream ML/AI functionalities (train/inference). > # seamless integrated with flink Table based ecosystem. > # provide a base for further flink based AI platform/tooling support. > # provide a base for further flink ML SQL integration. > The initial design is here > [https://docs.google.com/document/d/1PLddLEMP_wn4xHwi6069f3vZL7LzkaP0MN9nAB63X90/edit?usp=sharing.] > And based on this design, I made some initial implementation/prototyping. I > will share the code later. > This is the umbrella JIRA. I will create corresponding sub-jira for each > sub-task. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] zjffdu commented on issue #8144: [FLINK-12159]. Enable YarnMiniCluster integration test under non-secure mode
zjffdu commented on issue #8144: [FLINK-12159]. Enable YarnMiniCluster integration test under non-secure mode URL: https://github.com/apache/flink/pull/8144#issuecomment-491134217 @tillrohrmann I have updated the PR. I now ship the yarn-site.xml file if IN_TESTS is set, and write the yarn-site.xml to the test-classes directory. Let me know if you have any other concerns. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #8394: [FLINK-12471][docs] Fix broken links in documentation to make CRON travis job work
flinkbot commented on issue #8394: [FLINK-12471][docs] Fix broken links in documentation to make CRON travis job work URL: https://github.com/apache/flink/pull/8394#issuecomment-491133937 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wuchong opened a new pull request #8394: [FLINK-12471][docs] Fix broken links in documentation to make CRON travis job work
wuchong opened a new pull request #8394: [FLINK-12471][docs] Fix broken links in documentation to make CRON travis job work URL: https://github.com/apache/flink/pull/8394 ## What is the purpose of the change Fix the following broken links: ``` [2019-05-09 14:05:44] ERROR `/zh/dev/stream/side_output.html' not found. [2019-05-09 14:05:45] ERROR `/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.8.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.7.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.6.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.5.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/levels_of_abstraction.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table_api.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/program_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/parallel_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/windows.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/event_ingestion_processing_time.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/state_partitioning.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_chains.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/processes.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_slots.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/slot_sharing.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/checkpoints.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking_with_flink.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamps_watermarks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamp_extractors.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_time.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table/(/dev/table/sourceSinks.html' not found. 
[2019-05-09 14:05:49] ERROR `/zh/fig/checkpoint_tuning.svg' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/local_recovery.png' not found. ``` ## Brief change log - Create release note pages for Chinese - Use `{{ site.baseurl }}/fig/checkpoints.svg` instead of relative links `../fig/checkpoints.svg` - Use the new links instead of the redirect links in all pages. ## Verifying this change N/A ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-12471) Fix broken links in documentation to make CRON travis job work
[ https://issues.apache.org/jira/browse/FLINK-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12471: --- Labels: pull-request-available (was: ) > Fix broken links in documentation to make CRON travis job work > -- > > Key: FLINK-12471 > URL: https://issues.apache.org/jira/browse/FLINK-12471 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available > > The CRON travis job is failing because of documentation link checks. > https://travis-ci.org/apache/flink/jobs/530213609 > Following are the broken links: > {code:java} > [2019-05-09 14:05:44] ERROR `/zh/dev/stream/side_output.html' not found. > [2019-05-09 14:05:45] ERROR `/dev/table/(/dev/table/sourceSinks.html' not > found. > [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.8.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.7.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.6.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.5.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/levels_of_abstraction.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/dev/table_api.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/program_dataflow.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/parallel_dataflow.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/windows.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/event_ingestion_processing_time.svg' not > found. > [2019-05-09 14:05:48] ERROR `/zh/fig/state_partitioning.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_chains.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/processes.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_slots.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/slot_sharing.svg' not found. > [2019-05-09 14:05:48] ERROR `/zh/fig/checkpoints.svg' not found. 
> [2019-05-09 14:05:48] ERROR `/zh/dev/linking_with_flink.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/dev/linking.html' not found. > [2019-05-09 14:05:48] ERROR > `/zh/apis/streaming/event_timestamps_watermarks.html' not found. > [2019-05-09 14:05:48] ERROR > `/zh/apis/streaming/event_timestamp_extractors.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_time.html' not found. > [2019-05-09 14:05:48] ERROR `/zh/dev/table/(/dev/table/sourceSinks.html' not > found. > [2019-05-09 14:05:49] ERROR `/zh/fig/checkpoint_tuning.svg' not found. > [2019-05-09 14:05:49] ERROR `/zh/fig/local_recovery.png' not found. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] xuefuz commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database
xuefuz commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#issuecomment-491133575 > > default DB is most likely from configuration and it cannot be verified until a catalog is opened. This shouldn't be a problem as an error will be reported later when the db is actually referenced. > > I wonder if we should verify its existence upon catalog opening and fail early if the db doesn't exist? The reason is that the default db cannot be changed at runtime. If SQL CLI users realize it's nonexistent at a later stage of their workflow, e.g. the default db name they entered has a typo, they have several options: 1) not use any feature related to the default db, 2) reconfigure and restart their clients and lose all their previous progress in memory, 3) create that db (not a good option for a mistyped name). None seems very user-friendly. Okay. Sounds good. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
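The fail-early behaviour agreed on in the comment above could be sketched as follows. This is a hypothetical, in-memory stand-in: `InMemoryCatalog`, its constructor, and `open()` are names invented for this example and do not reflect the catalog API under review in the pull request.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical catalog used only to illustrate validating the default database on open.
final class InMemoryCatalog {
    private final String defaultDatabase;
    private final Set<String> databases = new HashSet<>();
    private boolean open;

    InMemoryCatalog(String defaultDatabase, String... existingDatabases) {
        this.defaultDatabase = defaultDatabase;
        this.databases.addAll(Arrays.asList(existingDatabases));
    }

    // Verify the configured default database immediately, instead of waiting
    // until the first statement that references it.
    void open() {
        if (!databases.contains(defaultDatabase)) {
            throw new IllegalStateException(
                    "Default database '" + defaultDatabase + "' does not exist in this catalog");
        }
        open = true;
    }

    boolean isOpen() {
        return open;
    }

    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog("default", "default", "sales");
        catalog.open();
        System.out.println("opened: " + catalog.isOpen());

        try {
            // A typo in the configured name is reported right away.
            new InMemoryCatalog("defualt", "default").open();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing in `open()` means a mistyped name surfaces before the user has built up any session state that a restart would discard.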
[jira] [Commented] (FLINK-10232) Add a SQL DDL
[ https://issues.apache.org/jira/browse/FLINK-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836834#comment-16836834 ] Kurt Young commented on FLINK-10232: Also cc [~rongrong] > Add a SQL DDL > - > > Key: FLINK-10232 > URL: https://issues.apache.org/jira/browse/FLINK-10232 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Timo Walther >Assignee: Shuyi Chen >Priority: Major > > This is an umbrella issue for all efforts related to supporting a SQL Data > Definition Language (DDL) in Flink's Table & SQL API. > Such a DDL includes creating, deleting, replacing: > - tables > - views > - functions > - types > - libraries > - catalogs > If possible, the parsing/validating/logical part should be done using > Calcite. Related issues are CALCITE-707, CALCITE-2045, CALCITE-2046, > CALCITE-2214, and others. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12471) Fix broken links in documentation to make CRON travis job work
[ https://issues.apache.org/jira/browse/FLINK-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-12471: Description: The CRON travis job is failing because of documentation link checks. https://travis-ci.org/apache/flink/jobs/530213609 Following are the broken links: {code:java} [2019-05-09 14:05:44] ERROR `/zh/dev/stream/side_output.html' not found. [2019-05-09 14:05:45] ERROR `/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.8.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.7.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.6.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.5.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/levels_of_abstraction.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table_api.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/program_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/parallel_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/windows.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/event_ingestion_processing_time.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/state_partitioning.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_chains.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/processes.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_slots.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/slot_sharing.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/checkpoints.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking_with_flink.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamps_watermarks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamp_extractors.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_time.html' not found. 
[2019-05-09 14:05:48] ERROR `/zh/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/checkpoint_tuning.svg' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/local_recovery.png' not found. {code} was: The CRON travis job is failing because of documentation link checks. Following are the broken links: {code:java} [2019-05-09 14:05:44] ERROR `/zh/dev/stream/side_output.html' not found. [2019-05-09 14:05:45] ERROR `/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.8.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.7.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.6.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.5.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/levels_of_abstraction.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table_api.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/program_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/parallel_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/windows.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/event_ingestion_processing_time.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/state_partitioning.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_chains.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/processes.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_slots.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/slot_sharing.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/checkpoints.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking_with_flink.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamps_watermarks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamp_extractors.html' not found. 
[2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_time.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/checkpoint_tuning.svg' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/local_recovery.png' not found. {code} > Fix broken links in documentation to make CRON travis job work > -- > > Key: FLINK-12471 > URL: https://issues.apache.org/jira/browse/FLINK-12471 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > > The CRON travis job is failing because of documentation link checks. > https://travis-ci.org/apache/flink/jobs/530213609 > Following are the broken links: > {code:java} > [2019-05-09 14:05:44] ERROR `/zh/dev/stream/side_output.html' not found. > [2019-05-09
[jira] [Created] (FLINK-12471) Fix broken links in documentation to make CRON travis job work
Jark Wu created FLINK-12471: --- Summary: Fix broken links in documentation to make CRON travis job work Key: FLINK-12471 URL: https://issues.apache.org/jira/browse/FLINK-12471 Project: Flink Issue Type: Bug Components: Documentation Reporter: Jark Wu Assignee: Jark Wu The CRON travis job is failing because of documentation link checks. Following are the broken links: {code:java} [2019-05-09 14:05:44] ERROR `/zh/dev/stream/side_output.html' not found. [2019-05-09 14:05:45] ERROR `/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.8.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.7.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.6.html' not found. [2019-05-09 14:05:48] ERROR `/zh/release-notes/flink-1.5.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/levels_of_abstraction.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table_api.html' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/program_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/parallel_dataflow.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/windows.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/event_ingestion_processing_time.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/state_partitioning.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_chains.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/processes.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/tasks_slots.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/slot_sharing.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/fig/checkpoints.svg' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking_with_flink.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/linking.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamps_watermarks.html' not found. [2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_timestamp_extractors.html' not found. 
[2019-05-09 14:05:48] ERROR `/zh/apis/streaming/event_time.html' not found. [2019-05-09 14:05:48] ERROR `/zh/dev/table/(/dev/table/sourceSinks.html' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/checkpoint_tuning.svg' not found. [2019-05-09 14:05:49] ERROR `/zh/fig/local_recovery.png' not found. {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12470) FLIP39: Flink ML pipeline and ML libs
Shaoxuan Wang created FLINK-12470: - Summary: FLIP39: Flink ML pipeline and ML libs Key: FLINK-12470 URL: https://issues.apache.org/jira/browse/FLINK-12470 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Affects Versions: 1.9.0 Reporter: Shaoxuan Wang Assignee: Shaoxuan Wang Fix For: 1.9.0 This is the umbrella Jira for FLIP39, which intends to enhance the scalability and the ease of use of Flink ML. ML Discussion thread: [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-39-Flink-ML-pipeline-and-ML-libs-td28633.html] Google Doc: (will convert it to an official confluence page very soon) [https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo|https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo/edit] In machine learning, there are mainly two types of people. The first type is MLlib developers. They need a set of standard, well-abstracted core ML APIs to implement the algorithms. Every ML algorithm is a concrete implementation on top of these APIs. The second type is MLlib users, who use the existing, packaged MLlib to train or serve a model. It is pretty common that the entire training or inference is constructed from a sequence of transformations or algorithms. It is essential to provide a workflow/pipeline API for MLlib users so that they can easily combine multiple algorithms to describe the ML workflow/pipeline. Flink currently has a set of ML core interfaces, but they are built on top of the DataSet API. This does not align with the latest Flink [roadmap|https://flink.apache.org/roadmap.html] (the Table API will become the first-class citizen and primary API for analytics use cases, while the DataSet API will be gradually deprecated). Moreover, Flink at present does not have any interface that allows MLlib users to describe an ML workflow/pipeline, nor does it provide any way to persist a pipeline or model and reuse them in the future. 
To solve/improve these issues, in this FLIP we propose to: * Provide a new set of ML core interfaces (on top of the Flink Table API) * Provide an ML pipeline interface (on top of the Flink Table API) * Provide the interfaces for parameter management and pipeline persistence * All the above interfaces should facilitate any new ML algorithm. We will gradually add various standard ML algorithms on top of these newly proposed interfaces to ensure their feasibility and scalability. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] flinkbot commented on issue #8393: [hotfix][table] Fix typo for method name
flinkbot commented on issue #8393: [hotfix][table] Fix typo for method name URL: https://github.com/apache/flink/pull/8393#issuecomment-491125622 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.
[GitHub] [flink] flinkbot commented on issue #8392: [FLINK-12370][python][travis] Integrated Travis for Python Table API.
flinkbot commented on issue #8392: [FLINK-12370][python][travis] Integrated Travis for Python Table API. URL: https://github.com/apache/flink/pull/8392#issuecomment-491125451 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good.
[jira] [Updated] (FLINK-12370) Integrated Travis for Python Table API
[ https://issues.apache.org/jira/browse/FLINK-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12370: --- Labels: pull-request-available (was: ) > Integrated Travis for Python Table API > -- > > Key: FLINK-12370 > URL: https://issues.apache.org/jira/browse/FLINK-12370 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: Wei Zhong >Priority: Major > Labels: pull-request-available > > Integrated Travis for Python Table API -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] sunjincheng121 opened a new pull request #8393: [hotfix][table] Fix typo for method name
sunjincheng121 opened a new pull request #8393: [hotfix][table] Fix typo for method name URL: https://github.com/apache/flink/pull/8393 hotfix typo, `Ambigous` -> `Ambiguous`.
[GitHub] [flink] WeiZhong94 opened a new pull request #8392: [FLINK-12370][python][travis] Integrated Travis for Python Table API.
WeiZhong94 opened a new pull request #8392: [FLINK-12370][python][travis] Integrated Travis for Python Table API. URL: https://github.com/apache/flink/pull/8392 ## What is the purpose of the change *This pull request integrates travis for Python Table API.* ## Brief change log - *Add new stage option in travis_controller.sh to enable python tests* - *Adjust `minimizeCachedFiles` logic to reserve files under `flink-dist` directory and test jars.* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-10232) Add a SQL DDL
[ https://issues.apache.org/jira/browse/FLINK-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836825#comment-16836825 ] Danny Chan commented on FLINK-10232: Hi [~suez1224], I'm Danny from the Blink SQL team. We have a module named *flink-sql-parser* that handles SQL parsing, including all the DDLs and DMLs. I saw that there are many comments on your design doc [1], and there are still a few disputes about the DDL grammar for streaming tables. For batch, however, we have reached agreement because the grammar is almost standard. So I'm wondering whether we can make some progress, for example by contributing the DDLs for batch first. I would like to know the status of your work on this issue to see whether we can cooperate. Thanks again for starting the discussion and for the detailed design doc; I look forward to your suggestions. [1] [https://docs.google.com/document/d/1TTP-GCC8wSsibJaSUyFZ_5NBAHYEB1FVmPpP7RgDGBA/edit#heading=h.wpsqidkaaoil] > Add a SQL DDL > - > > Key: FLINK-10232 > URL: https://issues.apache.org/jira/browse/FLINK-10232 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Timo Walther >Assignee: Shuyi Chen >Priority: Major > > This is an umbrella issue for all efforts related to supporting a SQL Data > Definition Language (DDL) in Flink's Table & SQL API. > Such a DDL includes creating, deleting, replacing: > - tables > - views > - functions > - types > - libraries > - catalogs > If possible, the parsing/validating/logical part should be done using > Calcite. Related issues are CALCITE-707, CALCITE-2045, CALCITE-2046, > CALCITE-2214, and others.
[jira] [Assigned] (FLINK-11560) Translate "Flink Applications" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] francisdu reassigned FLINK-11560: - Assignee: francisdu (was: Stephen) > Translate "Flink Applications" page into Chinese > > > Key: FLINK-11560 > URL: https://issues.apache.org/jira/browse/FLINK-11560 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Project Website >Reporter: Jark Wu >Assignee: francisdu >Priority: Major > > Translate "Flink Applications" page into Chinese. > The markdown file is located in: flink-web/flink-applications.zh.md > The url link is: https://flink.apache.org/zh/flink-applications.html > Please adjust the links in the page to Chinese pages when translating. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] KurtYoung commented on issue #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate
KurtYoung commented on issue #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate URL: https://github.com/apache/flink/pull/8389#issuecomment-491124524 @flinkbot attention @godfreyhe
[GitHub] [flink] KurtYoung commented on issue #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate
KurtYoung commented on issue #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate URL: https://github.com/apache/flink/pull/8389#issuecomment-491124555 @flinkbot attention @beyond1920
[GitHub] [flink] wuchong commented on a change in pull request #8384: [FLINK-11610][docs-zh] Translate the "Examples" page into Chinese
wuchong commented on a change in pull request #8384: [FLINK-11610][docs-zh] Translate the "Examples" page into Chinese URL: https://github.com/apache/flink/pull/8384#discussion_r282721301 ## File path: docs/examples/index.zh.md ## @@ -1,7 +1,9 @@ --- title: 示例 nav-id: examples -nav-title: ' 示例' + +nav-title: ' Examples' Review comment: Revert this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] sunjincheng121 commented on issue #8359: [FLINK-11051][table] Add streaming window FlatAggregate to Table API
sunjincheng121 commented on issue #8359: [FLINK-11051][table] Add streaming window FlatAggregate to Table API URL: https://github.com/apache/flink/pull/8359#issuecomment-491112960 @flinkbot attention @aljoscha @twalthr @dawidwys
[GitHub] [flink] xuefuz commented on a change in pull request #8353: [FLINK-12233][hive] Support table related operations in HiveCatalog
xuefuz commented on a change in pull request #8353: [FLINK-12233][hive] Support table related operations in HiveCatalog URL: https://github.com/apache/flink/pull/8353#discussion_r282712487 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalogBase.java ## @@ -197,4 +237,196 @@ protected void alterHiveDatabase(String name, Database newHiveDatabase, boolean throw new CatalogException(String.format("Failed to alter database %s", name), e); } } + + // -- tables -- + + @Override + public CatalogBaseTable getTable(ObjectPath tablePath) + throws TableNotExistException, CatalogException { + return createCatalogBaseTable(getHiveTable(tablePath)); + } + + @Override + public void createTable(ObjectPath tablePath, CatalogBaseTable table, boolean ignoreIfExists) + throws TableAlreadyExistException, DatabaseNotExistException, CatalogException { + validateCatalogBaseTable(table); + + if (!databaseExists(tablePath.getDatabaseName())) { + throw new DatabaseNotExistException(catalogName, tablePath.getDatabaseName()); + } else { + try { + client.createTable(createHiveTable(tablePath, table)); + } catch (AlreadyExistsException e) { + if (!ignoreIfExists) { + throw new TableAlreadyExistException(catalogName, tablePath); + } + } catch (TException e) { + throw new CatalogException(String.format("Failed to create table %s", tablePath.getFullName()), e); + } + } + } + + @Override + public void renameTable(ObjectPath tablePath, String newTableName, boolean ignoreIfNotExists) + throws TableNotExistException, TableAlreadyExistException, CatalogException { + try { + // alter_table() doesn't throw a clear exception when target table doesn't exist. + // Thus, check the table existence explicitly + if (tableExists(tablePath)) { + ObjectPath newPath = new ObjectPath(tablePath.getDatabaseName(), newTableName); + // alter_table() doesn't throw a clear exception when new table already exists. 
+ // Thus, check the table existence explicitly + if (tableExists(newPath)) { + throw new TableAlreadyExistException(catalogName, newPath); + } else { + Table table = getHiveTable(tablePath); + table.setTableName(newTableName); + client.alter_table(tablePath.getDatabaseName(), tablePath.getObjectName(), table); + } + } else if (!ignoreIfNotExists) { + throw new TableNotExistException(catalogName, tablePath); + } + } catch (TException e) { + throw new CatalogException( + String.format("Failed to rename table %s", tablePath.getFullName()), e); + } + } + + @Override + public void alterTable(ObjectPath tablePath, CatalogBaseTable newCatalogTable, boolean ignoreIfNotExists) + throws TableNotExistException, CatalogException { + validateCatalogBaseTable(newCatalogTable); + + try { + if (!tableExists(tablePath)) { + if (!ignoreIfNotExists) { + throw new TableNotExistException(catalogName, tablePath); + } + } else { + // TODO: [FLINK-12452] alterTable() in all catalogs should ensure existing base table and the new one are of the same type + Table newTable = createHiveTable(tablePath, newCatalogTable); + + // client.alter_table() requires a valid location + // thus, if new table doesn't have that, it reuses location of the old table + if (!newTable.getSd().isSetLocation()) { + Table oldTable = getHiveTable(tablePath); + newTable.getSd().setLocation(oldTable.getSd().getLocation()); + } + +
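The explicit existence checks in the `renameTable` diff above can be sketched in isolation as follows. This is a minimal illustration, not the HiveCatalog code: `RenameSketch`, its in-memory map, and the exception types are all hypothetical stand-ins for the metastore client, `TableNotExistException`, and `TableAlreadyExistException`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a catalog; illustrates only the check order.
class RenameSketch {
    // table name -> opaque table payload (assumed representation)
    private final Map<String, String> tables = new HashMap<>();

    void createTable(String name, String payload) {
        tables.put(name, payload);
    }

    boolean tableExists(String name) {
        return tables.containsKey(name);
    }

    void renameTable(String oldName, String newName, boolean ignoreIfNotExists) {
        if (tableExists(oldName)) {
            // Check the target explicitly instead of relying on the backend's error.
            if (tableExists(newName)) {
                throw new IllegalStateException("Table already exists: " + newName);
            }
            tables.put(newName, tables.remove(oldName));
        } else if (!ignoreIfNotExists) {
            throw new IllegalArgumentException("Table does not exist: " + oldName);
        }
        // Source table missing and ignoreIfNotExists is true: silently do nothing.
    }

    public static void main(String[] args) {
        RenameSketch catalog = new RenameSketch();
        catalog.createTable("t1", "schema-1");
        catalog.renameTable("t1", "t2", false);
        System.out.println(catalog.tableExists("t2"));
        catalog.renameTable("missing", "t3", true); // ignored, no exception
    }
}
```

Checking both names up front means the caller receives a specific error for each failure case rather than a generic backend exception.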
[GitHub] [flink] xuefuz commented on a change in pull request #8353: [FLINK-12233][hive] Support table related operations in HiveCatalog
xuefuz commented on a change in pull request #8353: [FLINK-12233][hive] Support table related operations in HiveCatalog URL: https://github.com/apache/flink/pull/8353#discussion_r282711429 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/GenericHiveMetastoreCatalog.java ## @@ -363,4 +332,23 @@ public CatalogColumnStatistics getPartitionColumnStatistics(ObjectPath tablePath throw new UnsupportedOperationException(); } + // -- utils -- + + /** +* Filter out Hive-created properties, and return Flink-created properties. +*/ + private Map retrieveFlinkProperties(Map hiveTableParams) { Review comment: The difference is that a static method doesn't require an instance of the class in order to invoke it. For example, if another static method of this class needs to call this method, without this one defined as static, one has to create an instance of the class to invoke the method. The general principle is that if a method doesn't access any of the object variables/methods, make it static.
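The point about static methods in the review comment above can be illustrated with a small sketch. The class name, the `flink.` key prefix, and the `countFlinkProperties` helper are hypothetical and chosen only for the example; they are not taken from the pull request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; not the actual GenericHiveMetastoreCatalog.
class PropertyFilterExample {

    // Assumed marker prefix for Flink-created properties (illustrative only).
    private static final String FLINK_PREFIX = "flink.";

    // Reads no instance fields and calls no instance methods, so it can be static.
    static Map<String, String> retrieveFlinkProperties(Map<String, String> params) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (entry.getKey().startsWith(FLINK_PREFIX)) {
                result.put(entry.getKey().substring(FLINK_PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    // Another static method can call it directly; no instance has to be created.
    static int countFlinkProperties(Map<String, String> params) {
        return retrieveFlinkProperties(params).size();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("flink.connector", "kafka");
        params.put("transient_lastDdlTime", "1557400000");
        System.out.println(countFlinkProperties(params));
    }
}
```

If `retrieveFlinkProperties` were an instance method, `countFlinkProperties` would have to construct a `PropertyFilterExample` first just to call it.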
[GitHub] [flink] xuefuz commented on a change in pull request #8353: [FLINK-12233][hive] Support table related operations in HiveCatalog
xuefuz commented on a change in pull request #8353: [FLINK-12233][hive] Support table related operations in HiveCatalog URL: https://github.com/apache/flink/pull/8353#discussion_r282710918 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalogBase.java ## @@ -89,6 +101,34 @@ private static IMetaStoreClient getMetastoreClient(HiveConf hiveConf) { } } + // -- APIs -- + + /** +* Validate input base table. +* +* @param catalogBaseTable the base table to be validated +* @throws IllegalArgumentException thrown if the input base table is invalid. +*/ + protected abstract void validateCatalogBaseTable(CatalogBaseTable catalogBaseTable) Review comment: I'm not backtracking on what I said earlier. It's fine to define an API, but the API needs to be well thought out, which doesn't seem to be the case here. That's why I said we can forgo the API. If we think an API is the way to go, then let's define the API better; the parent class shouldn't define what validation a subclass might do. Personally, I don't feel the API is sufficiently sound, but feel free to get a second opinion.
[jira] [Commented] (FLINK-11580) Provide a shaded netty-tcnative
[ https://issues.apache.org/jira/browse/FLINK-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836794#comment-16836794 ] sunjincheng commented on FLINK-11580: - And I have mentioned this Jira’s situation in VOTE thread. Please see the detail: [https://lists.apache.org/thread.html/85a8d61b545502e3c69495b5e767bcc1502157b77c9141f64583c0bb@%3Cdev.flink.apache.org%3E] > Provide a shaded netty-tcnative > --- > > Key: FLINK-11580 > URL: https://issues.apache.org/jira/browse/FLINK-11580 > Project: Flink > Issue Type: Sub-task > Components: BuildSystem / Shaded, Runtime / Network >Reporter: Nico Kruber >Assignee: Nico Kruber >Priority: Major > Labels: pull-request-available > Fix For: shaded-7.0 > > Time Spent: 10m > Remaining Estimate: 0h > > In order to have an openSSL-based SSL engine available, we need a shaded > netty-tc version in the classpath which relies on openSSL libraries from the > system it is running on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11580) Provide a shaded netty-tcnative
[ https://issues.apache.org/jira/browse/FLINK-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836793#comment-16836793 ] sunjincheng commented on FLINK-11580: - Thanks for the update. I am not quite sure about the `upgrade Travis profile` change; maybe [~Zentol] can give us more comments. If we really need this change, it's better to open a new Jira to track it. > Provide a shaded netty-tcnative > --- > > Key: FLINK-11580 > URL: https://issues.apache.org/jira/browse/FLINK-11580 > Project: Flink > Issue Type: Sub-task > Components: BuildSystem / Shaded, Runtime / Network >Reporter: Nico Kruber >Assignee: Nico Kruber >Priority: Major > Labels: pull-request-available > Fix For: shaded-7.0 > > Time Spent: 10m > Remaining Estimate: 0h > > In order to have an openSSL-based SSL engine available, we need a shaded > netty-tc version in the classpath which relies on openSSL libraries from the > system it is running on.
[jira] [Issue Comment Deleted] (FLINK-12308) Support python language in Flink Table API
[ https://issues.apache.org/jira/browse/FLINK-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated FLINK-12308: Comment: was deleted (was: Fixed in master: 478956c32323ddfa29cf9b673f4854bfe5767dbc) > Support python language in Flink Table API > -- > > Key: FLINK-12308 > URL: https://issues.apache.org/jira/browse/FLINK-12308 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 1.9.0 > > > At the Flink API level, we have DataStreamAPI/DataSetAPI/TableAPI, the > Table API will become the first-class citizen. Table API is declarative, and > can be automatically optimized, which is mentioned in the Flink mid-term > roadmap by Stephan. So, first considering supporting Python at the Table > level to cater to the current large number of analytics users. And Flink's > goal for Python Table API as follows: > * Users can write Flink Table API job in Python, and should mirror Java / > Scala Table API > * Users can submit Python Table API job in the following ways: > ** Submit a job with python script, integrate with `flink run` > ** Submit a job with python script by REST service > ** Submit a job in an interactive way, similar `scala-shell` > ** Local debug in IDE. > * Users can write custom functions(UDF, UDTF, UDAF) > * Pandas functions can be used in Flink Python Table API > A more detailed description can be found in > [FLIP-38.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] > For the API level, we make the following plan: > * The short-term: > We may initially go with a simple approach to map the Python Table API to > the Java Table API via Py4J. > * The long-term: > We may need to create a Python API that follows the same structure as > Flink's Table API that produces the language-independent DAG. 
(As Stephan > already mentioned on the [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html#a28096])
[jira] [Reopened] (FLINK-12308) Support python language in Flink Table API
[ https://issues.apache.org/jira/browse/FLINK-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng reopened FLINK-12308: - > Support python language in Flink Table API > -- > > Key: FLINK-12308 > URL: https://issues.apache.org/jira/browse/FLINK-12308 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 1.9.0 > > > At the Flink API level, we have DataStreamAPI/DataSetAPI/TableAPI, the > Table API will become the first-class citizen. Table API is declarative, and > can be automatically optimized, which is mentioned in the Flink mid-term > roadmap by Stephan. So, first considering supporting Python at the Table > level to cater to the current large number of analytics users. And Flink's > goal for Python Table API as follows: > * Users can write Flink Table API job in Python, and should mirror Java / > Scala Table API > * Users can submit Python Table API job in the following ways: > ** Submit a job with python script, integrate with `flink run` > ** Submit a job with python script by REST service > ** Submit a job in an interactive way, similar `scala-shell` > ** Local debug in IDE. > * Users can write custom functions(UDF, UDTF, UDAF) > * Pandas functions can be used in Flink Python Table API > A more detailed description can be found in > [FLIP-38.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] > For the API level, we make the following plan: > * The short-term: > We may initially go with a simple approach to map the Python Table API to > the Java Table API via Py4J. > * The long-term: > We may need to create a Python API that follows the same structure as > Flink's Table API that produces the language-independent DAG. 
(As Stephan > already mentioned on the [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html#a28096])
[jira] [Closed] (FLINK-12454) Add custom check options for lint-python.sh
[ https://issues.apache.org/jira/browse/FLINK-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng closed FLINK-12454. --- Resolution: Fixed Fix Version/s: 1.9.0 Fixed in master: 478956c32323ddfa29cf9b673f4854bfe5767dbc > Add custom check options for lint-python.sh > --- > > Key: FLINK-12454 > URL: https://issues.apache.org/jira/browse/FLINK-12454 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 20m > Remaining Estimate: 0h > > At present, lint-python.sh has Python compatibility checks and code style > checks. By default, all checks are executed. In the > development stage, only the code style check is required in some cases. Therefore, > in this JIRA, parameters for specifying the check items are added to > lint-python.sh.
[jira] [Updated] (FLINK-12455) Move the packaging of pyflink to flink-dist
[ https://issues.apache.org/jira/browse/FLINK-12455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated FLINK-12455: Issue Type: Improvement (was: Sub-task) Parent: (was: FLINK-12308) > Move the packaging of pyflink to flink-dist > --- > > Key: FLINK-12455 > URL: https://issues.apache.org/jira/browse/FLINK-12455 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently, there is a pom.xml under module flink-python which is responsible > for the package of pyflink. The package logic should be moved to flink-dist > and then we can remove the pom.xml under flink-python and make flink-python a > pure python module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] asfgit closed pull request #8383: [FLINK-12454][python] Add -l(list) -i(include) and -e(exclude) option…
asfgit closed pull request #8383: [FLINK-12454][python] Add -l(list) -i(include) and -e(exclude) option… URL: https://github.com/apache/flink/pull/8383
[jira] [Closed] (FLINK-12308) Support python language in Flink Table API
[ https://issues.apache.org/jira/browse/FLINK-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng closed FLINK-12308. --- Resolution: Fixed Fixed in master: 478956c32323ddfa29cf9b673f4854bfe5767dbc > Support python language in Flink Table API > -- > > Key: FLINK-12308 > URL: https://issues.apache.org/jira/browse/FLINK-12308 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 1.9.0 > > > At the Flink API level, we have DataStreamAPI/DataSetAPI/TableAPI, the > Table API will become the first-class citizen. Table API is declarative, and > can be automatically optimized, which is mentioned in the Flink mid-term > roadmap by Stephan. So, first considering supporting Python at the Table > level to cater to the current large number of analytics users. And Flink's > goal for Python Table API as follows: > * Users can write Flink Table API job in Python, and should mirror Java / > Scala Table API > * Users can submit Python Table API job in the following ways: > ** Submit a job with python script, integrate with `flink run` > ** Submit a job with python script by REST service > ** Submit a job in an interactive way, similar `scala-shell` > ** Local debug in IDE. > * Users can write custom functions(UDF, UDTF, UDAF) > * Pandas functions can be used in Flink Python Table API > A more detailed description can be found in > [FLIP-38.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] > For the API level, we make the following plan: > * The short-term: > We may initially go with a simple approach to map the Python Table API to > the Java Table API via Py4J. > * The long-term: > We may need to create a Python API that follows the same structure as > Flink's Table API that produces the language-independent DAG. 
(As Stephan > already mentioned on the [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html#a28096])
[GitHub] [flink] sunjincheng121 commented on issue #8383: [FLINK-12454][python] Add -l(list) -i(include) and -e(exclude) option…
sunjincheng121 commented on issue #8383: [FLINK-12454][python] Add -l(list) -i(include) and -e(exclude) option… URL: https://github.com/apache/flink/pull/8383#issuecomment-491101352 @flinkbot approve all
[GitHub] [flink] bowenli86 commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database
bowenli86 commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#issuecomment-491097070 > default DB is most likely from configuration and it cannot be verified until a catalog is opened. This shouldn't be a problem as error will be reported later when the db is actually referenced. I wonder if we should verify its existence upon catalog opening and fail early if the db doesn't exist. The reason is that the default db cannot be changed at runtime. If SQL CLI users only realize at a later stage of their workflow that it doesn't exist, e.g. because the default db name they configured has a typo, they have the following options: 1) not use any feature related to the default db, 2) reconfigure and restart their clients and lose all their previous in-memory progress, 3) create that db (not a good option for a mistyped name). None of these seems very user-friendly.
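The fail-early idea raised in this comment could look roughly like the sketch below. It is an illustration under assumptions: `FailFastCatalogSketch`, its in-memory set of databases, and the use of `IllegalStateException` are hypothetical and do not reflect the Catalog API in the pull request.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical catalog that validates its configured default database when opened.
class FailFastCatalogSketch {
    private final String catalogName;
    private final String defaultDatabase;
    // Stand-in for the databases known to an external metastore (assumption).
    private final Set<String> existingDatabases;

    FailFastCatalogSketch(String catalogName, String defaultDatabase, Set<String> existingDatabases) {
        this.catalogName = catalogName;
        this.defaultDatabase = defaultDatabase;
        this.existingDatabases = new HashSet<>(existingDatabases);
    }

    boolean databaseExists(String name) {
        return existingDatabases.contains(name);
    }

    // Existence can only be verified once the catalog is opened, so check here
    // rather than in the constructor, and fail before any query is run.
    void open() {
        if (!databaseExists(defaultDatabase)) {
            throw new IllegalStateException(String.format(
                "Configured default database %s does not exist in catalog %s",
                defaultDatabase, catalogName));
        }
    }

    public static void main(String[] args) {
        Set<String> dbs = new HashSet<>();
        dbs.add("default");
        new FailFastCatalogSketch("hive", "default", dbs).open(); // passes
        try {
            new FailFastCatalogSketch("hive", "defualt", dbs).open(); // mistyped name
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, a mistyped default database surfaces at startup instead of midway through a session.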
[GitHub] [flink] xuefuz commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database
xuefuz commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#issuecomment-491088370 > Thanks @xuefuz for the PR. > > Currently this PR doesn't seem to verify whether user-set "defaultDatabase" in catalog impls' constructors actually exists or not upon initialization. Do we need to ensure it actually exists? if so, how? I have something in mind but want to listen to your ideas first. default DB is most likely from configuration and it cannot be verified until a catalog is opened. This shouldn't be a problem as error will be reported later when the db is actually referenced.
[GitHub] [flink] xuefuz commented on a change in pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database
xuefuz commented on a change in pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#discussion_r282685701 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalogBase.java ## @@ -48,23 +48,26 @@ */ public abstract class HiveCatalogBase implements Catalog { private static final Logger LOG = LoggerFactory.getLogger(HiveCatalogBase.class); - - public static final String DEFAULT_DB = "default"; + private static final String DEFAULT_DB = "default"; protected final String catalogName; protected final HiveConf hiveConf; - protected String currentDatabase = DEFAULT_DB; + private final String defaultDatabase; protected IMetaStoreClient client; public HiveCatalogBase(String catalogName, String hivemetastoreURI) { - this(catalogName, getHiveConf(hivemetastoreURI)); + this(catalogName, DEFAULT_DB, getHiveConf(hivemetastoreURI)); } public HiveCatalogBase(String catalogName, HiveConf hiveConf) { + this(catalogName, DEFAULT_DB, hiveConf); + } + + public HiveCatalogBase(String catalogName, String defaultDatabase, HiveConf hiveConf) { checkArgument(!StringUtils.isNullOrWhitespaceOnly(catalogName), "catalogName cannot be null or empty"); this.catalogName = catalogName; - + this.defaultDatabase = checkNotNull(defaultDatabase, "defaultDatabase cannot be null"); Review comment: I think it cannot be either null or empty string. I will add additional checks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #8391: [FLINK-12416][FLINK-12375] Fix docker build scripts on Flink-1.8
flinkbot commented on issue #8391: [FLINK-12416][FLINK-12375] Fix docker build scripts on Flink-1.8 URL: https://github.com/apache/flink/pull/8391#issuecomment-491023097 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] bowenli86 commented on a change in pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database
bowenli86 commented on a change in pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#discussion_r282612561 ## File path: flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/GenericInMemoryCatalog.java ## @@ -65,9 +64,14 @@ private final Map> partitionColumnStats; public GenericInMemoryCatalog(String name) { + this(name, DEFAULT_DB); + } + + public GenericInMemoryCatalog(String name, String defaultDatabase) { checkArgument(!StringUtils.isNullOrWhitespaceOnly(name), "name cannot be null or empty"); this.catalogName = name; + this.defaultDatabase = defaultDatabase; Review comment: Should we check whether defaultDatabase is null or an empty string?
[GitHub] [flink] bowenli86 commented on a change in pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database
bowenli86 commented on a change in pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#discussion_r282607390 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalogBase.java ## @@ -48,23 +48,26 @@ */ public abstract class HiveCatalogBase implements Catalog { private static final Logger LOG = LoggerFactory.getLogger(HiveCatalogBase.class); - - public static final String DEFAULT_DB = "default"; + private static final String DEFAULT_DB = "default"; protected final String catalogName; protected final HiveConf hiveConf; - protected String currentDatabase = DEFAULT_DB; + private final String defaultDatabase; protected IMetaStoreClient client; public HiveCatalogBase(String catalogName, String hivemetastoreURI) { - this(catalogName, getHiveConf(hivemetastoreURI)); + this(catalogName, DEFAULT_DB, getHiveConf(hivemetastoreURI)); } public HiveCatalogBase(String catalogName, HiveConf hiveConf) { + this(catalogName, DEFAULT_DB, hiveConf); + } + + public HiveCatalogBase(String catalogName, String defaultDatabase, HiveConf hiveConf) { checkArgument(!StringUtils.isNullOrWhitespaceOnly(catalogName), "catalogName cannot be null or empty"); this.catalogName = catalogName; - + this.defaultDatabase = checkNotNull(defaultDatabase, "defaultDatabase cannot be null"); Review comment: Can defaultDatabase be an empty string?
[GitHub] [flink] Myasuka opened a new pull request #8391: [FLINK-12416][FLINK-12375] Fix docker build scripts on Flink-1.8
Myasuka opened a new pull request #8391: [FLINK-12416][FLINK-12375] Fix docker build scripts on Flink-1.8 URL: https://github.com/apache/flink/pull/8391 ## What is the purpose of the change This PR fixes the docker build script error on `Flink-1.8` and also fixes [FLINK-12375](https://issues.apache.org/jira/browse/FLINK-12375) by giving the job jar package proper permissions. ## Brief change log - Change docker build script and related `Dockerfile`. - Change related `README` ## Verifying this change This change was verified manually. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **no** - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **no** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **no** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **no** - If yes, how is the feature documented? **not applicable**
[jira] [Updated] (FLINK-12416) Docker build script fails on symlink creation ln -s
[ https://issues.apache.org/jira/browse/FLINK-12416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12416: --- Labels: pull-request-available (was: ) > Docker build script fails on symlink creation ln -s > --- > > Key: FLINK-12416 > URL: https://issues.apache.org/jira/browse/FLINK-12416 > Project: Flink > Issue Type: Bug > Components: Deployment / Docker >Affects Versions: 1.8.0 >Reporter: Slava D >Assignee: Yun Tang >Priority: Major > Labels: pull-request-available > > When using script 'build.sh' from 'flink-container/docker' it fails on > {code:java} > + ln -s /opt/flink-1.8.0-bin-hadoop28-scala_2.12.tgz /opt/flink > + ln -s /opt/job.jar /opt/flink/lib > ln: /opt/flink/lib: Not a directory > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12417) Unify ReadableCatalog and ReadableWritableCatalog interfaces to Catalog interface
[ https://issues.apache.org/jira/browse/FLINK-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836621#comment-16836621 ] Bowen Li commented on FLINK-12417: -- part 2 merged in 1.9.0: 26273d99fe8b4e3b6bc5d638858ee6f8b369d8ec > Unify ReadableCatalog and ReadableWritableCatalog interfaces to Catalog > interface > - > > Key: FLINK-12417 > URL: https://issues.apache.org/jira/browse/FLINK-12417 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 20m > Remaining Estimate: 0h > > As discussed with [~dawidwys], the original purpose to separate > ReadableCatalog and ReadableWritableCatalog is to isolate access to metadata. > However, we believe access control and authorization is orthogonal to design > of catalog APIs and should be of a different effort. > Thus, we propose to merge ReadableCatalog and ReadableWritableCatalog to > simplify the design. > cc [~twalthr] [~xuefuz] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] piyushnarang edited a comment on issue #8117: [FLINK-12115] [filesystems]: Add support for AzureFS
piyushnarang edited a comment on issue #8117: [FLINK-12115] [filesystems]: Add support for AzureFS URL: https://github.com/apache/flink/pull/8117#issuecomment-491016295 Thanks @tillrohrmann. I'm happy to put in a follow-up on the shading stuff. I can sit down with it next week and put in a fresh PR on that.
[GitHub] [flink] xuefuz opened a new pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database
xuefuz opened a new pull request #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390 ## Brief change log - Removed get/setCurrentDatabase() from Catalog API - Added getDefaultDatabase() in Catalog API - Provided implementations in sub classes ## Verifying this change This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (yes) - If yes, how is the feature documented? (JavaDocs)
[jira] [Updated] (FLINK-12469) Clean up catalog API on default/current DB
[ https://issues.apache.org/jira/browse/FLINK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12469: --- Labels: pull-request-available (was: ) > Clean up catalog API on default/current DB > -- > > Key: FLINK-12469 > URL: https://issues.apache.org/jira/browse/FLINK-12469 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang >Priority: Major > Labels: pull-request-available > > Currently the catalog API has get/setCurrentDatabase(), which is more user-session > specific. By our design principle, a catalog instance is agnostic to > user sessions. Thus, the current database concept doesn't belong there. However, > a catalog should support a (configurable) default database, which would be > taken as the user's current database when the user's session doesn't specify a > current DB. > This JIRA is to remove the current database concept from the catalog API and add > a default database instead.
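The split between a session-level current database and a catalog-level default database can be sketched as follows. This is a minimal illustration of the idea in the ticket, not Flink's API: `SketchCatalog` and `SketchSession` are hypothetical names, and only `getDefaultDatabase()` corresponds to a method mentioned in the pull request.

```java
// Hypothetical minimal catalog: it only knows its (configurable) default database.
interface SketchCatalog {
    String getDefaultDatabase();
}

// Hypothetical user session: the "current database" lives here, not in the catalog.
final class SketchSession {
    private final SketchCatalog catalog;
    private String currentDatabase; // stays null until the user picks a database

    SketchSession(SketchCatalog catalog) {
        this.catalog = catalog;
    }

    void useDatabase(String database) {
        this.currentDatabase = database;
    }

    /** Falls back to the catalog's default when the session has not set a current DB. */
    String currentDatabase() {
        return currentDatabase != null ? currentDatabase : catalog.getDefaultDatabase();
    }

    public static void main(String[] args) {
        SketchCatalog shared = () -> "default";
        SketchSession alice = new SketchSession(shared);
        SketchSession bob = new SketchSession(shared);
        alice.useDatabase("sales");
        // The two sessions share one catalog instance without affecting each other.
        System.out.println(alice.currentDatabase());
        System.out.println(bob.currentDatabase());
    }
}
```

Because the catalog holds no per-user state, one instance can be shared across sessions, which is the session-agnostic property the ticket argues for.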
[GitHub] [flink] flinkbot commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database
flinkbot commented on issue #8390: [FLINK-12469][table] Clean up catalog API on default/current database URL: https://github.com/apache/flink/pull/8390#issuecomment-490999881 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.
[jira] [Created] (FLINK-12469) Clean up catalog API on default/current DB
Xuefu Zhang created FLINK-12469: --- Summary: Clean up catalog API on default/current DB Key: FLINK-12469 URL: https://issues.apache.org/jira/browse/FLINK-12469 Project: Flink Issue Type: Sub-task Components: Table SQL / API Affects Versions: 1.9.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Currently the catalog API has get/setCurrentDatabase(), which is more user-session specific. By our design principle, a catalog instance is agnostic to user sessions. Thus, the current database concept doesn't belong there. However, a catalog should support a (configurable) default database, which would be taken as the user's current database when the user's session doesn't specify a current DB. This JIRA is to remove the current database concept from the catalog API and add a default database instead.
[GitHub] [flink] flinkbot commented on issue #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate
flinkbot commented on issue #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate URL: https://github.com/apache/flink/pull/8389#issuecomment-490965405 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.
[jira] [Updated] (FLINK-12399) FilterableTableSource does not use filters on job run
[ https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12399: --- Labels: pull-request-available (was: ) > FilterableTableSource does not use filters on job run > - > > Key: FLINK-12399 > URL: https://issues.apache.org/jira/browse/FLINK-12399 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.8.0 >Reporter: Josh Bradt >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > Attachments: flink-filter-bug.tar.gz
>
> As discussed [on the mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html], there appears to be a bug where a job that uses a custom FilterableTableSource does not keep the filters that were pushed down into the table source. More specifically, the table source does receive filters via applyPredicate, and a new table source with those filters is returned, but the final job graph appears to use the original table source, which does not contain any filters.
> I attached a minimal example program to this ticket. The custom table source is as follows:
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>, FilterableTableSource<Model> {
>     private static final Logger LOG = LoggerFactory.getLogger(CustomTableSource.class);
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
>
>     public CustomTableSource() {
>         this(null);
>     }
>
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
>
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
>
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
>
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
>
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
>
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
>
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>
> When run, it logs
> {noformat}
> 15:24:54,888 INFO com.klaviyo.filterbug.CustomTableSource - Applying predicates
> 15:24:54,901 INFO com.klaviyo.filterbug.CustomTableSource - Applying predicates
> 15:24:54,910 INFO com.klaviyo.filterbug.CustomTableSource - Applying predicates
> 15:24:54,977 INFO com.klaviyo.filterbug.CustomTableSource - No filters defined
> {noformat}
> which appears to indicate that although filters are getting pushed down, the final job does not use them.
[GitHub] [flink] walterddr opened a new pull request #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate
walterddr opened a new pull request #8389: [FLINK-12399][table] Fix FilterableTableSource does not change after applyPredicate URL: https://github.com/apache/flink/pull/8389 ## What is the purpose of the change This PR fixes the following problem: FilterableTableSource does not generate a new digest after the predicate push-down unless the `explainSource()` API is explicitly overridden. ## Brief change log - Changed the `explainTerms` API in `FlinkLogicalTableSourceScan` to include an auxiliary source description. - Changed `TestFilterableTableSource` to cover both the version without an `explainSource` override and the version with one. - Added tests for both the stream and batch cases. ## Verifying this change - Tests for both stream and batch cases for predicate push-down without an `explainSource` override. - Other cases are covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? n/a
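Why an unchanged digest can make the pushed-down source disappear is easier to see in a toy model. The sketch below is not Flink or Calcite code: `DigestSketch`, `Source`, `ExplainingSource`, and `register` are hypothetical, and it assumes only that the planner treats two nodes with the same digest string as the same node.

```java
import java.util.HashMap;
import java.util.Map;

final class DigestSketch {
    /** Toy table source; explain() plays the role of explainSource(). */
    static class Source {
        final String filter; // null means no filter has been pushed down

        Source(String filter) {
            this.filter = filter;
        }

        /** Default description: ignores the filter, so the digest never changes. */
        String explain() {
            return "Source";
        }
    }

    /** Variant that includes the filter in its description. */
    static class ExplainingSource extends Source {
        ExplainingSource(String filter) {
            super(filter);
        }

        @Override
        String explain() {
            return "Source(filter=" + filter + ")";
        }
    }

    /** Toy planner memo: a node whose digest is already known is replaced by the known node. */
    static Source register(Map<String, Source> memo, Source candidate) {
        return memo.computeIfAbsent(candidate.explain(), digest -> candidate);
    }

    public static void main(String[] args) {
        Map<String, Source> memo = new HashMap<>();
        register(memo, new Source(null));
        Source kept = register(memo, new Source("a > 1"));
        System.out.println("without override, filter kept: " + (kept.filter != null));

        Map<String, Source> memo2 = new HashMap<>();
        register(memo2, new ExplainingSource(null));
        Source kept2 = register(memo2, new ExplainingSource("a > 1"));
        System.out.println("with override, filter kept: " + (kept2.filter != null));
    }
}
```

In this model the filtered source is silently swapped for the original one whenever both produce the same description, which matches the symptom in FLINK-12399. Including the source description in the scan's digest, as this PR proposes, makes the two nodes distinguishable.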
[jira] [Commented] (FLINK-11580) Provide a shaded netty-tcnative
[ https://issues.apache.org/jira/browse/FLINK-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836491#comment-16836491 ] Nico Kruber commented on FLINK-11580: - Small update from my side: I was able to run our unit tests with a dynamically-linked openSSL from flink-shaded as well: see https://github.com/NicoK/flink/commits/f9816-master.testing However, I had to upgrade our Travis profile to Ubuntu 16.04 Xenial and change to openjdk8 because of too old openSSL in trusty and oraclejdk8 not being available in their xenial image. > Provide a shaded netty-tcnative > --- > > Key: FLINK-11580 > URL: https://issues.apache.org/jira/browse/FLINK-11580 > Project: Flink > Issue Type: Sub-task > Components: BuildSystem / Shaded, Runtime / Network >Reporter: Nico Kruber >Assignee: Nico Kruber >Priority: Major > Labels: pull-request-available > Fix For: shaded-7.0 > > Time Spent: 10m > Remaining Estimate: 0h > > In order to have an openSSL-based SSL engine available, we need a shaded > netty-tc version in the classpath which relies on openSSL libraries from the > system it is running on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12043) Null value check in array serializers classes
[ https://issues.apache.org/jira/browse/FLINK-12043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836482#comment-16836482 ] Quan Shi commented on FLINK-12043: -- Hi, IMO, Flink should perform a null check instead of throwing an NPE here. Passing null to a downstream operator is hard to avoid. In a production environment, you may encounter the following situations: 1. Source data arrives in a format that doesn't meet the requirements. To avoid returning null, you have to create an empty instance with no values and filter it out afterwards. 2. One Java class is mapped to another according to some outside conditions, and not every instance needs to be transformed. In these cases, an empty instance is not a graceful choice. Some other serializers, such as DateSerializer, already have a null check. Based on the above, I recommend adding a null value check to the array serializers too. Best, Quan Shi > Null value check in array serializers classes > - > > Key: FLINK-12043 > URL: https://issues.apache.org/jira/browse/FLINK-12043 > Project: Flink > Issue Type: Bug > Components: API / Type Serialization System >Affects Versions: 1.7.2 >Reporter: Quan Shi >Assignee: aloyszhang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > A null pointer exception is thrown when getting the length of _from_ if _from_ is null in the > copy() method: > > Involved classes: > {code:java} > // code placeholder > public String[] copy(String[] from) { >String[] target = new String[from.length]; >System.arraycopy(from, 0, target, 0, from.length); >return target; > } > {code} > Involved serializer classes in package > "org.apache.flink.api.common.typeutils.base.array" >
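A null-tolerant variant of the `copy()` method quoted in the ticket might look like the sketch below. Returning `null` for a `null` input is an assumption made for illustration; the fix eventually adopted in Flink may behave differently. `NullSafeArrayCopy` is a hypothetical class name.

```java
// Sketch only: mirrors the copy() method from the ticket with an added null guard.
final class NullSafeArrayCopy {
    private NullSafeArrayCopy() {}

    /** Copies the array, or returns null when the input itself is null. */
    static String[] copy(String[] from) {
        if (from == null) {
            return null; // assumption: propagate null instead of throwing an NPE
        }
        String[] target = new String[from.length];
        System.arraycopy(from, 0, target, 0, from.length);
        return target;
    }

    public static void main(String[] args) {
        String[] original = {"a", "b"};
        String[] copied = copy(original);
        System.out.println(copied != original && java.util.Arrays.equals(copied, original));
        System.out.println(copy(null) == null);
    }
}
```

This guard only covers `copy()`. Serializing a null array would need a separate mechanism, for example a presence flag written before the elements, which is outside what the ticket's snippet shows.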
[jira] [Assigned] (FLINK-12111) TaskManagerProcessFailureBatchRecoveryITCase fails due to removed Slot
[ https://issues.apache.org/jira/browse/FLINK-12111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-12111: Assignee: Chesnay Schepler > TaskManagerProcessFailureBatchRecoveryITCase fails due to removed Slot > -- > > Key: FLINK-12111 > URL: https://issues.apache.org/jira/browse/FLINK-12111 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.8.1 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Critical > Labels: test-stability > > https://travis-ci.org/apache/flink/jobs/515636826 > {code} > org.apache.flink.client.program.ProgramInvocationException: Job failed. > (JobID: 4f32093d9c7554c3de832d20f0a06eb5) > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:471) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:446) > at > org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:210) > at > org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:187) > at > org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:817) > at org.apache.flink.api.java.DataSet.collect(DataSet.java:413) > at > org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerFailure(TaskManagerProcessFailureBatchRecoveryITCase.java:115) > at > org.apache.flink.test.recovery.AbstractTaskManagerProcessFailureRecoveryTest$1.run(AbstractTaskManagerProcessFailureRecoveryTest.java:143) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed. 
> at > org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146) > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265) > ... 10 more > Caused by: org.apache.flink.util.FlinkException: The assigned slot > 786ec47f893240315bb01291aab680ec_1 was removed. > at > org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:893) > at > org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlots(SlotManager.java:863) > at > org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.internalUnregisterTaskManager(SlotManager.java:1058) > at > org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.unregisterTaskManager(SlotManager.java:385) > at > org.apache.flink.runtime.resourcemanager.ResourceManager.closeTaskManagerConnection(ResourceManager.java:847) > at > org.apache.flink.runtime.resourcemanager.ResourceManager$TaskManagerHeartbeatListener$1.run(ResourceManager.java:1161) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:392) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:185) > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:147) > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40) > at > akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165) > at akka.actor.Actor$class.aroundReceive(Actor.scala:502) > at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) > at akka.actor.ActorCell.invoke(ActorCell.scala:495) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) > at akka.dispatch.Mailbox.run(Mailbox.scala:224) > at akka.dispatch.Mailbox.exec(Mailbox.scala:234) > at 
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > org.apache.flink.client.program.ProgramInvocationException: Job failed. > (JobID: 4f32093d9c7554c3de832d20f0a06eb5) > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268) > at >
[GitHub] [flink] flinkbot commented on issue #8388: [FLINK-12164][runtime] Harden JobMasterTest against timeouts
flinkbot commented on issue #8388: [FLINK-12164][runtime] Harden JobMasterTest against timeouts URL: https://github.com/apache/flink/pull/8388#issuecomment-490945025 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.
[jira] [Updated] (FLINK-12164) JobMasterTest.testJobFailureWhenTaskExecutorHeartbeatTimeout is unstable
[ https://issues.apache.org/jira/browse/FLINK-12164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-12164: --- Labels: pull-request-available test-stability (was: test-stability) > JobMasterTest.testJobFailureWhenTaskExecutorHeartbeatTimeout is unstable > > > Key: FLINK-12164 > URL: https://issues.apache.org/jira/browse/FLINK-12164 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Reporter: Aljoscha Krettek >Assignee: Chesnay Schepler >Priority: Critical > Labels: pull-request-available, test-stability > > {code} > 07:28:23.957 [ERROR] Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time > elapsed: 8.968 s <<< FAILURE! - in > org.apache.flink.runtime.jobmaster.JobMasterTest > 07:28:23.957 [ERROR] > testJobFailureWhenTaskExecutorHeartbeatTimeout(org.apache.flink.runtime.jobmaster.JobMasterTest) > Time elapsed: 0.177 s <<< ERROR! > java.util.concurrent.ExecutionException: java.lang.Exception: Unknown > TaskManager 69a7c8c18a36069ff90a1eae8ec41066 > at > org.apache.flink.runtime.jobmaster.JobMasterTest.registerSlotsAtJobMaster(JobMasterTest.java:1746) > at > org.apache.flink.runtime.jobmaster.JobMasterTest.runJobFailureWhenTaskExecutorTerminatesTest(JobMasterTest.java:1670) > at > org.apache.flink.runtime.jobmaster.JobMasterTest.testJobFailureWhenTaskExecutorHeartbeatTimeout(JobMasterTest.java:1630) > Caused by: java.lang.Exception: Unknown TaskManager > 69a7c8c18a36069ff90a1eae8ec41066 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] zentol opened a new pull request #8388: [FLINK-12164][runtime] Harden JobMasterTest against timeouts
zentol opened a new pull request #8388: [FLINK-12164][runtime] Harden JobMasterTest against timeouts URL: https://github.com/apache/flink/pull/8388 Hardens `JobMasterTest#testJobFailureWhenTaskExecutorHeartbeatTimeout`. The reported failure occurred when offering slots, indicating that at this time the TM was no longer registered at the JM. However, the TM is registered right before the slot offer. The only explanation I could find is that, due to some freak timing, a heartbeat times out between these two calls. `testJobFailureWhenTaskExecutorHeartbeatTimeout` uses a very small heartbeat interval (1ms) and timeout (5ms). I was only able to reproduce the issue locally after reducing the timeout to 2ms. This PR doubles the timeout to reduce the likelihood of this happening again.
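The suspected race can be illustrated with a small, deterministic sketch. Everything in it is hypothetical: the `HeartbeatRegistry` class, its methods, and the hand-fed clock values are not Flink APIs, and the 6 ms stall is an assumed figure. It only shows how a timeout of a few milliseconds can expire between registering a TaskManager and offering its slots, and why doubling the timeout makes that less likely.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the JobMaster's view of registered TaskManagers. */
class HeartbeatRegistry {
    private final long timeoutMillis;
    // Last time (in ms) a registration or heartbeat was seen, per TaskManager id.
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    void register(String taskManagerId, long nowMillis) {
        lastSeen.put(taskManagerId, nowMillis);
    }

    /** Drops every TaskManager whose last sign of life is older than the timeout. */
    void expire(long nowMillis) {
        lastSeen.values().removeIf(seen -> nowMillis - seen > timeoutMillis);
    }

    /** Mirrors the reported failure mode: offers from unknown TaskManagers are rejected. */
    boolean offerSlots(String taskManagerId, long nowMillis) {
        expire(nowMillis);
        return lastSeen.containsKey(taskManagerId);
    }

    public static void main(String[] args) {
        // Assumption: the test thread stalls for 6 ms between the two calls.
        long stallMillis = 6;

        HeartbeatRegistry tight = new HeartbeatRegistry(5);
        tight.register("tm-1", 0);
        System.out.println("timeout 5 ms: " + tight.offerSlots("tm-1", stallMillis));    // false

        HeartbeatRegistry relaxed = new HeartbeatRegistry(10);
        relaxed.register("tm-1", 0);
        System.out.println("timeout 10 ms: " + relaxed.offerSlots("tm-1", stallMillis)); // true
    }
}
```

A larger timeout does not remove the race; it only widens the stall the test can tolerate, which matches the PR's wording about reducing the likelihood.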
[GitHub] [flink] flinkbot commented on issue #8387: [FLINK-11982] BatchTableSourceFactory support Json File
flinkbot commented on issue #8387: [FLINK-11982] BatchTableSourceFactory support Json File URL: https://github.com/apache/flink/pull/8387#issuecomment-490942297 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-11982) BatchTableSourceFactory support Json Format File
[ https://issues.apache.org/jira/browse/FLINK-11982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11982: --- Labels: pull-request-available (was: ) > BatchTableSourceFactory support Json Format File > > > Key: FLINK-11982 > URL: https://issues.apache.org/jira/browse/FLINK-11982 > Project: Flink > Issue Type: Bug > Components: Table SQL / Ecosystem >Affects Versions: 1.6.4, 1.7.2, 1.8.0 >Reporter: pingle wang >Assignee: frank wang >Priority: Major > Labels: pull-request-available > > java code : > {code:java} > val connector = FileSystem().path("data/in/test.json") > val desc = tEnv.connect(connector) > .withFormat( > new Json().failOnMissingField(true) > ).registerTableSource("persion") > val sql = "select * from person" > val result = tEnv.sqlQuery(sql) > {code} > Exception info : > {code:java} > Exception in thread "main" > org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a > suitable table factory for > 'org.apache.flink.table.factories.BatchTableSourceFactory' in > the classpath. > Reason: No context matches. 
> The following properties are requested: > connector.path=file:///Users/batch/test.json > connector.property-version=1 > connector.type=filesystem > format.derive-schema=true > format.fail-on-missing-field=true > format.property-version=1 > format.type=json > The following factories have been considered: > org.apache.flink.table.sources.CsvBatchTableSourceFactory > org.apache.flink.table.sources.CsvAppendTableSourceFactory > org.apache.flink.table.sinks.CsvBatchTableSinkFactory > org.apache.flink.table.sinks.CsvAppendTableSinkFactory > org.apache.flink.formats.avro.AvroRowFormatFactory > org.apache.flink.formats.json.JsonRowFormatFactory > org.apache.flink.streaming.connectors.kafka.Kafka010TableSourceSinkFactory > org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory > at > org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214) > at > org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130) > at > org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81) > at > org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:44) > at > org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:46) > at com.meitu.mlink.sql.batch.JsonExample.main(JsonExample.java:36){code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
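The "No context matches" reason in the report above can be pictured with a simplified model of property-based factory lookup. This is a sketch under stated assumptions, not Flink's `TableFactoryService`: the `FactoryMatching` class, the `Factory` type, and the `props` helper are hypothetical, and the context attributed to the CSV factory (`connector.type=filesystem`, `format.type=csv`) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of property-based table factory lookup. */
class FactoryMatching {

    /** A factory is described only by its name and the context it requires. */
    static final class Factory {
        final String name;
        final Map<String, String> requiredContext;

        Factory(String name, Map<String, String> requiredContext) {
            this.name = name;
            this.requiredContext = requiredContext;
        }
    }

    /** Builds a map from alternating keys and values. */
    static Map<String, String> props(String... keyValues) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    /** Returns the names of all factories whose required context is contained in the request. */
    static List<String> matching(List<Factory> factories, Map<String, String> requested) {
        List<String> result = new ArrayList<>();
        for (Factory factory : factories) {
            if (requested.entrySet().containsAll(factory.requiredContext.entrySet())) {
                result.add(factory.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Properties taken from the exception message in the report.
        Map<String, String> requested = props(
            "connector.type", "filesystem",
            "format.type", "json");

        List<Factory> factories = new ArrayList<>();
        // Assumed context for the CSV batch source factory.
        factories.add(new Factory("CsvBatchTableSourceFactory",
            props("connector.type", "filesystem", "format.type", "csv")));
        System.out.println(matching(factories, requested)); // no factory matches

        // A hypothetical JSON file source factory would close the gap.
        factories.add(new Factory("JsonFileSourceFactory",
            props("connector.type", "filesystem", "format.type", "json")));
        System.out.println(matching(factories, requested));
    }
}
```

In this model the request fails because the only file system source requires `format.type=csv`, while the JSON format factory listed in the report is not a table source factory.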
[GitHub] [flink] ambition119 opened a new pull request #8387: [FLINK-11982] BatchTableSourceFactory support Json File
ambition119 opened a new pull request #8387: [FLINK-11982] BatchTableSourceFactory support Json File
URL: https://github.com/apache/flink/pull/8387

## What is the purpose of the change

Fixes [FLINK-11982](https://issues.apache.org/jira/browse/FLINK-11982).

## Brief change log

- [flink-connector-json] module to support the File System connector's JSON format.
- JsonBatchTableSourceFactory is a BatchTableSourceFactory
- JsonBatchTableSource is a BatchTableSource

## Verifying this change

Unit tests for the added Java classes. *(example:)*

```java
FileSystem connector = new FileSystem().path(path);
tEnv.connect(connector)
    .withFormat(new Json().failOnMissingField(true).deriveSchema())
    .withSchema(new Schema().field(fieldName, Types.STRING()))
    .registerTableSource("jsonTable");

String fullSql = "select * FROM jsonTable";
Table fullTable = tEnv.sqlQuery(fullSql);
```

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (**yes**)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes**)
- The serializers: (**yes**)
- The runtime per-record code paths (performance sensitive): (**no**)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
- The S3 file system connector: (**no**)

## Documentation

- Does this pull request introduce a new feature? (**yes**)
- If yes, how is the feature documented? (example)
[jira] [Assigned] (FLINK-12164) JobMasterTest.testJobFailureWhenTaskExecutorHeartbeatTimeout is unstable
[ https://issues.apache.org/jira/browse/FLINK-12164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-12164: Assignee: Chesnay Schepler > JobMasterTest.testJobFailureWhenTaskExecutorHeartbeatTimeout is unstable > > > Key: FLINK-12164 > URL: https://issues.apache.org/jira/browse/FLINK-12164 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Reporter: Aljoscha Krettek >Assignee: Chesnay Schepler >Priority: Critical > Labels: test-stability > > {code} > 07:28:23.957 [ERROR] Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time > elapsed: 8.968 s <<< FAILURE! - in > org.apache.flink.runtime.jobmaster.JobMasterTest > 07:28:23.957 [ERROR] > testJobFailureWhenTaskExecutorHeartbeatTimeout(org.apache.flink.runtime.jobmaster.JobMasterTest) > Time elapsed: 0.177 s <<< ERROR! > java.util.concurrent.ExecutionException: java.lang.Exception: Unknown > TaskManager 69a7c8c18a36069ff90a1eae8ec41066 > at > org.apache.flink.runtime.jobmaster.JobMasterTest.registerSlotsAtJobMaster(JobMasterTest.java:1746) > at > org.apache.flink.runtime.jobmaster.JobMasterTest.runJobFailureWhenTaskExecutorTerminatesTest(JobMasterTest.java:1670) > at > org.apache.flink.runtime.jobmaster.JobMasterTest.testJobFailureWhenTaskExecutorHeartbeatTimeout(JobMasterTest.java:1630) > Caused by: java.lang.Exception: Unknown TaskManager > 69a7c8c18a36069ff90a1eae8ec41066 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11982) BatchTableSourceFactory support Json Format File
[ https://issues.apache.org/jira/browse/FLINK-11982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pingle wang updated FLINK-11982: Description: java code : {code:java} val connector = FileSystem().path("data/in/test.json") val desc = tEnv.connect(connector) .withFormat( new Json().failOnMissingField(true) ).registerTableSource("persion") val sql = "select * from person" val result = tEnv.sqlQuery(sql) {code} Exception info : {code:java} Exception in thread "main" org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.BatchTableSourceFactory' in the classpath. Reason: No context matches. The following properties are requested: connector.path=file:///Users/batch/test.json connector.property-version=1 connector.type=filesystem format.derive-schema=true format.fail-on-missing-field=true format.property-version=1 format.type=json The following factories have been considered: org.apache.flink.table.sources.CsvBatchTableSourceFactory org.apache.flink.table.sources.CsvAppendTableSourceFactory org.apache.flink.table.sinks.CsvBatchTableSinkFactory org.apache.flink.table.sinks.CsvAppendTableSinkFactory org.apache.flink.formats.avro.AvroRowFormatFactory org.apache.flink.formats.json.JsonRowFormatFactory org.apache.flink.streaming.connectors.kafka.Kafka010TableSourceSinkFactory org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory at org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214) at org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130) at org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81) at org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:44) at org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:46) at 
com.meitu.mlink.sql.batch.JsonExample.main(JsonExample.java:36){code} was: java code : {code:java} val connector = FileSystem().path("data/in/test.json") val desc = tEnv.connect(connector) .withFormat( new Json() .schema( Types.ROW( Array[String]("id", "name", "age"), Array[TypeInformation[_]](Types.STRING, Types.STRING, Types.INT)) ) .failOnMissingField(true) ).registerTableSource("persion") val sql = "select * from person" val result = tEnv.sqlQuery(sql) {code} Exception info : {code:java} Exception in thread "main" org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.BatchTableSourceFactory' in the classpath. Reason: No context matches. The following properties are requested: connector.path=file:///Users/batch/test.json connector.property-version=1 connector.type=filesystem format.derive-schema=true format.fail-on-missing-field=true format.property-version=1 format.type=json The following factories have been considered: org.apache.flink.table.sources.CsvBatchTableSourceFactory org.apache.flink.table.sources.CsvAppendTableSourceFactory org.apache.flink.table.sinks.CsvBatchTableSinkFactory org.apache.flink.table.sinks.CsvAppendTableSinkFactory org.apache.flink.formats.avro.AvroRowFormatFactory org.apache.flink.formats.json.JsonRowFormatFactory org.apache.flink.streaming.connectors.kafka.Kafka010TableSourceSinkFactory org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory at org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214) at org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130) at org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81) at org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:44) at 
org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:46) at com.meitu.mlink.sql.batch.JsonExample.main(JsonExample.java:36){code} > BatchTableSourceFactory support Json Format File > > > Key: FLINK-11982 > URL: https://issues.apache.org/jira/browse/FLINK-11982 > Project: Flink > Issue Type: Bug > Components: Table SQL / Ecosystem >Affects Versions: 1.6.4, 1.7.2, 1.8.0 >Reporter: pingle wang >Assignee: frank wang >Priority: Major > > java code : > {code:java} > val connector = FileSystem().path("data/in/test.json") > val desc = tEnv.connect(connector) > .withFormat( > new Json().failOnMissingField(true) > ).registerTableSource("persion") > val sql = "select * from person" > val result = tEnv.sqlQuery(sql) > {code} > Exception info : > {code:java} > Exception in thread "main" >
[jira] [Updated] (FLINK-11982) BatchTableSourceFactory support Json Format File
[ https://issues.apache.org/jira/browse/FLINK-11982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pingle wang updated FLINK-11982: Affects Version/s: 1.8.0 > BatchTableSourceFactory support Json Format File > > > Key: FLINK-11982 > URL: https://issues.apache.org/jira/browse/FLINK-11982 > Project: Flink > Issue Type: Bug > Components: Table SQL / Ecosystem >Affects Versions: 1.6.4, 1.7.2, 1.8.0 >Reporter: pingle wang >Assignee: frank wang >Priority: Major > > java code : > {code:java} > val connector = FileSystem().path("data/in/test.json") > val desc = tEnv.connect(connector) > .withFormat( > new Json() > .schema( > Types.ROW( > Array[String]("id", "name", "age"), > Array[TypeInformation[_]](Types.STRING, Types.STRING, Types.INT)) > ) > .failOnMissingField(true) > ).registerTableSource("persion") > val sql = "select * from person" > val result = tEnv.sqlQuery(sql) > {code} > Exception info : > {code:java} > Exception in thread "main" > org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a > suitable table factory for > 'org.apache.flink.table.factories.BatchTableSourceFactory' in > the classpath. > Reason: No context matches. 
> The following properties are requested: > connector.path=file:///Users/batch/test.json > connector.property-version=1 > connector.type=filesystem > format.derive-schema=true > format.fail-on-missing-field=true > format.property-version=1 > format.type=json > The following factories have been considered: > org.apache.flink.table.sources.CsvBatchTableSourceFactory > org.apache.flink.table.sources.CsvAppendTableSourceFactory > org.apache.flink.table.sinks.CsvBatchTableSinkFactory > org.apache.flink.table.sinks.CsvAppendTableSinkFactory > org.apache.flink.formats.avro.AvroRowFormatFactory > org.apache.flink.formats.json.JsonRowFormatFactory > org.apache.flink.streaming.connectors.kafka.Kafka010TableSourceSinkFactory > org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory > at > org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214) > at > org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130) > at > org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81) > at > org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:44) > at > org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:46) > at com.meitu.mlink.sql.batch.JsonExample.main(JsonExample.java:36){code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] asfgit closed pull request #8360: [FLINK-12393][table-common] Add the user-facing classes of the new type system
asfgit closed pull request #8360: [FLINK-12393][table-common] Add the user-facing classes of the new type system URL: https://github.com/apache/flink/pull/8360
[jira] [Closed] (FLINK-12393) Add the user-facing classes of the new type system
[ https://issues.apache.org/jira/browse/FLINK-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther closed FLINK-12393. Resolution: Fixed Fix Version/s: 1.9.0 Fixed in 1.9.0: 1cb9d680cbbc46e67de10acc99c5fb510b9382e5 > Add the user-facing classes of the new type system > -- > > Key: FLINK-12393 > URL: https://issues.apache.org/jira/browse/FLINK-12393 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 10m > Remaining Estimate: 0h > > FLINK-12253 introduces logical types that will be used mostly internally. > Users will use the {{DataType}} stack described in FLIP-37. This issue > describes the class hierarchy around this class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11974) Introduce StreamOperatorFactory to help table perform the whole Operator CodeGen
[ https://issues.apache.org/jira/browse/FLINK-11974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-11974. -- Resolution: Fixed Fix Version/s: 1.9.0 h3. merged commit 7b3678f into apache:master > Introduce StreamOperatorFactory to help table perform the whole Operator > CodeGen > > > Key: FLINK-11974 > URL: https://issues.apache.org/jira/browse/FLINK-11974 > Project: Flink > Issue Type: New Feature > Components: Runtime / Operators >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 40m > Remaining Estimate: 0h > > If we need to code-generate an entire Operator, one possible solution is to introduce > an OperatorWrapper, generate the code-generated sub-Operator in OperatorWrapper's > open, and then proxy all methods to the sub-Operator. However, introducing > an OperatorWrapper results in multiple virtual function calls. > The other way is to introduce a StreamOperatorFactory. At runtime, we get > the StreamOperatorFactory and create the real operator to invoke. In this way, > there is no redundant virtual call. Test results show that > performance improves by about 10% after the introduction of > StreamOperatorFactory. (Benchmark for simple query: > [https://github.com/JingsongLi/flink/blob/benchmarkop/flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/benchmark/batch/CalcBenchmark.java]) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
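The two designs compared in the issue can be sketched side by side. The types below (`Operator`, `GeneratedOperator`, `OperatorWrapper`, `OperatorFactory`) are hypothetical simplifications and not Flink's `StreamOperator` or `StreamOperatorFactory` interfaces; the sketch only shows where the extra level of indirection comes from.

```java
/** Minimal operator abstraction; hypothetical, not Flink's StreamOperator. */
interface Operator {
    int process(int value);
}

/** Stands in for an operator class produced by code generation. */
class GeneratedOperator implements Operator {
    @Override
    public int process(int value) {
        return value + 1;
    }
}

/** Wrapper approach: the runtime holds the wrapper, which forwards every call. */
class OperatorWrapper implements Operator {
    private Operator delegate;

    /** The generated operator is only created when the wrapper is opened. */
    void open() {
        delegate = new GeneratedOperator();
    }

    @Override
    public int process(int value) {
        // Second virtual call per record: wrapper first, then delegate.
        return delegate.process(value);
    }
}

/** Factory approach: the runtime asks once for the real operator and calls it directly. */
interface OperatorFactory {
    Operator create();
}

class FactoryDemo {
    public static void main(String[] args) {
        OperatorWrapper wrapped = new OperatorWrapper();
        wrapped.open();
        System.out.println(wrapped.process(1)); // 2, via wrapper and delegate

        OperatorFactory factory = GeneratedOperator::new;
        Operator direct = factory.create();
        System.out.println(direct.process(1));  // 2, via a single call
    }
}
```

Both paths compute the same result; the difference is one forwarded call per record, which is the overhead the issue attributes to the wrapper design.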
[GitHub] [flink] pnowojski merged pull request #8295: [FLINK-11974][runtime] Introduce StreamOperatorFactory
pnowojski merged pull request #8295: [FLINK-11974][runtime] Introduce StreamOperatorFactory URL: https://github.com/apache/flink/pull/8295
[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE) URL: https://github.com/apache/flink/pull/6594#discussion_r282448138 ## File path: flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/common/PubSubDeserializationSchema.java ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.connectors.gcp.pubsub.common; + +import org.apache.flink.api.java.typeutils.ResultTypeQueryable; + +import com.google.pubsub.v1.PubsubMessage; + +import java.io.Serializable; + +/** + * The deserialization schema describes how to turn the PubsubMessages + * into data types (Java/Scala objects) that are processed by Flink. + * + * @param <T> The type created by the deserialization schema. + */ +public interface PubSubDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> { Review comment: I had the same question in the first round of review and I think it is because the deserializer does not (only) take `byte[]` as input, which is unlike the `DeserializationSchema`. 
There are some potential benefits of using `PubsubMessage` as the input. For example, while the `data` field in the `PubsubMessage` could be converted to a byte array, doing so would introduce an additional memory copy. Another example is that if the deserialization depends on an attribute of the `PubsubMessage` (e.g. compression type, encryption hint, etc.), passing only the data bytes to the deserializer may not work.
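The second point can be made concrete with a minimal sketch. The `Message`, `MessageSchema`, and `CharsetAwareSchema` types are hypothetical stand-ins (not the real `PubsubMessage` or `PubSubDeserializationSchema`), and the `charset` attribute is an assumed convention used only for illustration. It shows a deserializer that needs a message attribute to interpret the payload, which a `byte[]`-only signature could not express.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a Pub/Sub message: payload plus string attributes. */
final class Message {
    final byte[] data;
    final Map<String, String> attributes;

    Message(byte[] data, Map<String, String> attributes) {
        this.data = data;
        this.attributes = attributes;
    }
}

/** Schema that sees the whole message, not only the payload bytes. */
interface MessageSchema<T> {
    T deserialize(Message message);
}

/** Picks the decoding charset from an attribute; falls back to UTF-8 when it is absent. */
class CharsetAwareSchema implements MessageSchema<String> {
    @Override
    public String deserialize(Message message) {
        String name = message.attributes.get("charset"); // assumed attribute name
        Charset charset = name == null ? StandardCharsets.UTF_8 : Charset.forName(name);
        return new String(message.data, charset);
    }
}

class SchemaDemo {
    public static void main(String[] args) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("charset", "UTF-16");
        Message message = new Message("hello".getBytes(StandardCharsets.UTF_16), attributes);

        // With the attribute available, the payload decodes correctly.
        System.out.println(new CharsetAwareSchema().deserialize(message));
    }
}
```

A schema that received only `message.data` would have to guess the charset, which is the kind of dependency on attributes (compression type, encryption hint) the review comment describes.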
[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE) URL: https://github.com/apache/flink/pull/6594#discussion_r282462314 ## File path: flink-end-to-end-tests/test-scripts/test_streaming_gcp_pubsub.sh ## @@ -0,0 +1,22 @@ +#!/usr/bin/env bash + +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +cd "${END_TO_END_DIR}/flink-connector-gcp-pubsub-emulator-tests" + +mvn test -DskipTests=false Review comment: After a clean recompile, the tests do not run for me either.
[GitHub] [flink] twalthr commented on issue #8360: [FLINK-12393][table-common] Add the user-facing classes of the new type system
twalthr commented on issue #8360: [FLINK-12393][table-common] Add the user-facing classes of the new type system URL: https://github.com/apache/flink/pull/8360#issuecomment-490880151 Thanks everyone for the feedback. I will merge this now.
[GitHub] [flink] twalthr commented on a change in pull request #8360: [FLINK-12393][table-common] Add the user-facing classes of the new type system
twalthr commented on a change in pull request #8360: [FLINK-12393][table-common] Add the user-facing classes of the new type system URL: https://github.com/apache/flink/pull/8360#discussion_r282458283 ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/DataType.java ## @@ -0,0 +1,427 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.types; + +import org.apache.flink.annotation.PublicEvolving; +import org.apache.flink.table.api.DataTypes; +import org.apache.flink.table.api.ValidationException; +import org.apache.flink.table.types.logical.LogicalType; +import org.apache.flink.util.Preconditions; + +import javax.annotation.Nullable; + +import java.io.Serializable; +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Objects; + +/** + * Describes the data type of a value in the table ecosystem. Instances of this class can be used to + * declare input and/or output types of operations. + * + * The {@link DataType} class has two responsibilities: declaring a logical type and giving hints + * about the physical representation of data to the optimizer. 
While the logical type is mandatory, + * hints are optional but useful at the edges to other APIs. + * + * The logical type is independent of any physical representation and is close to the "data type" + * terminology of the SQL standard. See {@link org.apache.flink.table.types.logical.LogicalType} and + * its subclasses for more information about available logical types and their properties. + * + * Physical hints are required at the edges of the table ecosystem. Hints indicate the data format + * that an implementation expects. For example, a data source could express that it produces values for + * logical timestamps using a {@link java.sql.Timestamp} class instead of using {@link java.time.LocalDateTime}. + * With this information, the runtime is able to convert the produced class into its internal data + * format. In return, a data sink can declare the data format it consumes from the runtime. + * + * @see DataTypes for a list of supported data types and instances of this class. + */ +@PublicEvolving +public abstract class DataType implements Serializable { + + protected LogicalType logicalType; + + protected @Nullable Class conversionClass; + + private DataType(LogicalType logicalType, @Nullable Class conversionClass) { + this.logicalType = Preconditions.checkNotNull(logicalType, "Logical type must not be null."); + this.conversionClass = performEarlyClassValidation(logicalType, conversionClass); + } + + /** +* Returns the corresponding logical type. +* +* @return a parameterized instance of {@link LogicalType} +*/ + public LogicalType getLogicalType() { + return logicalType; + } + + /** +* Returns the corresponding conversion class for representing values. If no conversion class was +* defined manually, the default conversion defined by the logical type is used. 
+* +* @see LogicalType#getDefaultConversion() +* +* @return the expected conversion class +*/ + public Class getConversionClass() { + if (conversionClass == null) { + return logicalType.getDefaultConversion(); + } + return conversionClass; + } + + /** +* Adds a hint that null values are not expected in the data for this type. +* +* @return a new, reconfigured data type instance +*/ + public abstract DataType notNull(); + + /** +* Adds a hint that null values are expected in the data for this type (default behavior). +* +* This method exists for explicit declaration of the default behavior or for invalidation of +* a previous call to {@link #notNull()}. +* +* @return a new, reconfigured data type instance +*/ + public abstract DataType andNull(); + + /** +* Adds a hint that data should be represented using
[GitHub] [flink] yanghua commented on a change in pull request #8322: [FLINK-12364] Introduce a CheckpointFailureManager to centralized manage checkpoint failure
yanghua commented on a change in pull request #8322: [FLINK-12364] Introduce a CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#discussion_r282453955

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraphBuilder.java ##
@@ -296,7 +296,8 @@ public static ExecutionGraph buildGraph(
 	checkpointIdCounter,
 	completedCheckpoints,
 	rootBackend,
-	checkpointStatsTracker);
+	checkpointStatsTracker,
+	chkConfig.getTolerableCheckpointFailureNumber());

Review comment: Agree
[GitHub] [flink] KurtYoung commented on a change in pull request #8317: [FLINK-12371] [table-planner-blink] Add support for converting (NOT) IN/ (NOT) EXISTS to semi-join/anti-join, and generating opt
KurtYoung commented on a change in pull request #8317: [FLINK-12371] [table-planner-blink] Add support for converting (NOT) IN/ (NOT) EXISTS to semi-join/anti-join, and generating optimized logical plan
URL: https://github.com/apache/flink/pull/8317#discussion_r282452526

## File path: flink-table/flink-table-planner-blink/src/main/java/org/apache/calcite/rel/core/Join.java ##
@@ -0,0 +1,341 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rel.core;

Review comment: Maybe we should avoid giving this package exactly the same name as Calcite's? It will confuse developers about which one we are importing.