[jira] [Assigned] (FLINK-5431) time format for akka status
[ https://issues.apache.org/jira/browse/FLINK-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Solovev reassigned FLINK-5431: Assignee: Anton Solovev > time format for akka status > --- > > Key: FLINK-5431 > URL: https://issues.apache.org/jira/browse/FLINK-5431 > Project: Flink > Issue Type: Improvement >Reporter: Alexey Diomin >Assignee: Anton Solovev >Priority: Minor > > In ExecutionGraphMessages we have the code > {code} > private val DATE_FORMATTER: SimpleDateFormat = new > SimpleDateFormat("MM/dd/ HH:mm:ss") > {code} > But sometimes it causes confusion when the main logger is configured with > "dd/MM/". > We should make this format configurable, or maybe keep only "HH:mm:ss", to > prevent misreading the output date-time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
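For illustration, a minimal sketch of the configurable-format idea; the config key `jobmanager.status-time-format` is invented here for the example (not an actual Flink option), and the time-only default avoids the day/month ambiguity:

```scala
import java.text.SimpleDateFormat
import org.apache.flink.configuration.Configuration

// Sketch only: "jobmanager.status-time-format" is a hypothetical key used for illustration.
def createStatusTimeFormatter(config: Configuration): SimpleDateFormat = {
  // Default to a time-only pattern so the output cannot be misread as dd/MM vs. MM/dd.
  val pattern = config.getString("jobmanager.status-time-format", "HH:mm:ss")
  new SimpleDateFormat(pattern)
}
```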
[jira] [Commented] (FLINK-5357) WordCountTable fails
[ https://issues.apache.org/jira/browse/FLINK-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814112#comment-15814112 ] ASF GitHub Bot commented on FLINK-5357: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95308316 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- Good point! The hack existed because Calcite's RelBuilder does not create a project when fields are only renamed. The `project(Iterable nodes, Iterable fieldNames, boolean force)` method introduced by CALCITE-1342 fixes this issue. I have tried to replace the hack with the following code, and all tests passed!
```scala
override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = {
  child.construct(relBuilder)
  relBuilder.project(
    projectList.map {
      case Alias(c, _, _) => c.toRexNode(relBuilder)
      case n@_ => n.toRexNode(relBuilder)
    }.asJava,
    projectList.map(_.name).asJava,
    true)
}
```
> WordCountTable fails > > > Key: FLINK-5357 > URL: https://issues.apache.org/jira/browse/FLINK-5357 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther > > The execution of org.apache.flink.table.examples.java.WordCountTable fails: > {code} > Exception in thread "main" org.apache.flink.table.api.TableException: POJO > does not define field name: TMP_0 > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:85) > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:81) > at scala.collection.immutable.List.foreach(List.scala:318) > at > org.apache.flink.table.typeutils.TypeConverter$.determineReturnType(TypeConverter.scala:81) > at > org.apache.flink.table.plan.nodes.dataset.DataSetCalc.translateToPlan(DataSetCalc.scala:110) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:305) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:289) > at > org.apache.flink.table.api.java.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:146) > at > org.apache.flink.table.examples.java.WordCountTable.main(WordCountTable.java:56) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3063: [FLINK-5357] [table] Fix dropped projections
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95308316 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- Good point! The hack existed because Calcite's RelBuilder does not create a project when fields are only renamed. The `project(Iterable nodes, Iterable fieldNames, boolean force)` method introduced by CALCITE-1342 fixes this issue. I have tried to replace the hack with the following code, and all tests passed!
```scala
override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = {
  child.construct(relBuilder)
  relBuilder.project(
    projectList.map {
      case Alias(c, _, _) => c.toRexNode(relBuilder)
      case n@_ => n.toRexNode(relBuilder)
    }.asJava,
    projectList.map(_.name).asJava,
    true)
}
```
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4920) Add a Scala Function Gauge
[ https://issues.apache.org/jira/browse/FLINK-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814070#comment-15814070 ] ASF GitHub Bot commented on FLINK-4920: --- Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3080#discussion_r95306713 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/metrics/ScalaGauge.scala --- @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.metrics + +import org.apache.flink.metrics.Gauge + +class ScalaGauge[T](value : T) extends Gauge[T] { --- End diff -- Directly assigning the value in the constructor is one possible approach to minimizing the code needed to register a Scala gauge. However, it does not work when the value has to be computed by some logic at read time. One possible solution is to use Scala's Function0 to pass in a "value generator" function instead of the value itself. Something like this: `class ScalaGauge[T](func: () => T) extends Gauge[T]`. Then one can use the class like this: `val myGauge = new ScalaGauge(() => 1)` or, for more complex situations, `val myGauge = new ScalaGauge(() => { /* code to calculate the value */ })`. Here is the Function0 reference, which describes the usage more clearly: http://www.scala-lang.org/api/2.6.0/scala/Function0.html > Add a Scala Function Gauge > -- > > Key: FLINK-4920 > URL: https://issues.apache.org/jira/browse/FLINK-4920 > Project: Flink > Issue Type: Improvement > Components: Metrics, Scala API >Reporter: Stephan Ewen >Assignee: Pattarawat Chormai > Labels: easyfix, starter > > A useful metrics utility for the Scala API would be to add a Gauge that > obtains its value by calling a Scala Function0. > That way, one can add Gauges in Scala programs using Scala lambda notation or > function references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
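For illustration, a minimal sketch of the Function0-based gauge proposed in this thread; the class eventually merged into Flink may differ in name and constructor shape:

```scala
import org.apache.flink.metrics.Gauge

// Sketch of the suggested design: the gauge wraps a value-generator function
// and re-evaluates it each time the metric is reported.
class ScalaGauge[T](func: () => T) extends Gauge[T] {
  override def getValue: T = func()
}

// Simple usage: expose the current wall-clock time as a gauge value.
val currentTime = new ScalaGauge[Long](() => System.currentTimeMillis())
```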
[GitHub] flink pull request #3080: [FLINK-4920] Add a Scala Function Gauge
Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3080#discussion_r95306713 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/metrics/ScalaGauge.scala --- @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.metrics + +import org.apache.flink.metrics.Gauge + +class ScalaGauge[T](value : T) extends Gauge[T] { --- End diff -- Directly assigning the value in the constructor is one possible approach to minimizing the code needed to register a Scala gauge. However, it does not work when the value has to be computed by some logic at read time. One possible solution is to use Scala's Function0 to pass in a "value generator" function instead of the value itself. Something like this: `class ScalaGauge[T](func: () => T) extends Gauge[T]`. Then one can use the class like this: `val myGauge = new ScalaGauge(() => 1)` or, for more complex situations, `val myGauge = new ScalaGauge(() => { /* code to calculate the value */ })`. Here is the Function0 reference, which describes the usage more clearly: http://www.scala-lang.org/api/2.6.0/scala/Function0.html --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
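A possible usage sketch, registering such a gauge from a Scala `RichMapFunction`; it assumes the hypothetical `ScalaGauge(func: () => T)` shown above and Flink's `MetricGroup.gauge(name, gauge)` registration call:

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

// Counts elements and exposes the count through a Function0-based gauge.
class CountingMapper extends RichMapFunction[String, String] {
  private var seen: Long = 0L

  override def open(parameters: Configuration): Unit = {
    // Explicit type parameters keep Scala's inference happy for the generic Java method.
    getRuntimeContext
      .getMetricGroup
      .gauge[Long, ScalaGauge[Long]]("elementsSeen", new ScalaGauge[Long](() => seen))
  }

  override def map(value: String): String = {
    seen += 1
    value
  }
}
```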
[jira] [Commented] (FLINK-5122) Elasticsearch Sink loses documents when cluster has high load
[ https://issues.apache.org/jira/browse/FLINK-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814010#comment-15814010 ] ASF GitHub Bot commented on FLINK-5122: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2861 Hi @static-max, thank you for working on this; it'll be an important fix for proper at-least-once support in the ES connector. Recently, the community agreed to first restructure the multiple ES connector versions, so that important fixes like this one can be done once and for all across all versions (1.x, 2.x, and 5.x, which is currently pending). Here's the discussion: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ElasticSearch-in-Flink-Strategy-td15049.html. Could we wait just a little bit on this PR, and once the ES connector refactoring is complete, come back and rebase this PR on it? You can follow the progress here: #2767. I'm trying to come up with the restructuring PR within the next day. Sorry for the extra wait needed on this, but it'll be good in the long run; I hope you can understand :) > Elasticsearch Sink loses documents when cluster has high load > - > > Key: FLINK-5122 > URL: https://issues.apache.org/jira/browse/FLINK-5122 > Project: Flink > Issue Type: Bug > Components: Streaming Connectors >Affects Versions: 1.2.0 >Reporter: static-max >Assignee: static-max > > My cluster had high load and documents did not get indexed. This violates the "at > least once" semantics of the ES connector. > I put pressure on my cluster to test Flink, causing new indices to be > created and balanced. On such errors the bulk should be retried instead > of being discarded. > Primary shard not active because ES decided to rebalance the index: > 2016-11-15 15:35:16,123 ERROR > org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink - > Failed to index document in Elasticsearch: > UnavailableShardsException[[index-name][3] primary shard is not active > Timeout: [1m], request: [BulkShardRequest to [index-name] containing [20] > requests]] > Bulk queue on node full (I set the queue to a low value to reproduce the error): > 22:37:57,702 ERROR > org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink - > Failed to index document in Elasticsearch: > RemoteTransportException[[node1][192.168.1.240:9300][indices:data/write/bulk[s][p]]]; > nested: EsRejectedExecutionException[rejected execution of > org.elasticsearch.transport.TransportService$4@727e677c on > EsThreadPoolExecutor[bulk, queue capacity = 1, > org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@51322d37[Running, > pool size = 2, active threads = 2, queued tasks = 1, completed tasks = > 2939]]]; > I can try to propose a PR for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
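The at-least-once gap described in the issue comes from silently dropping failed bulk items. A rough sketch of the retry idea, assuming the Elasticsearch 2.x client used by this connector (where `BulkResponse.getItems` is aligned by position with `BulkRequest.requests()`); `reAdd` stands in for whatever mechanism re-queues an action (for example the sink's `BulkProcessor.add`), and this is not the actual connector code:

```scala
import org.elasticsearch.action.bulk.{BulkRequest, BulkResponse}

// Re-queue every failed item of a bulk response instead of discarding it.
def retryFailedActions(request: BulkRequest, response: BulkResponse, reAdd: AnyRef => Unit): Unit = {
  if (response.hasFailures) {
    val items = response.getItems
    items.indices.foreach { i =>
      if (items(i).isFailed) {
        // The i-th response item corresponds to the i-th original action.
        reAdd(request.requests().get(i))
      }
    }
  }
}
```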
[GitHub] flink issue #2861: [FLINK-5122] Index requests will be retried if the error ...
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2861 Hi @static-max, thank you for working on this; it'll be an important fix for proper at-least-once support in the ES connector. Recently, the community agreed to first restructure the multiple ES connector versions, so that important fixes like this one can be done once and for all across all versions (1.x, 2.x, and 5.x, which is currently pending). Here's the discussion: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ElasticSearch-in-Flink-Strategy-td15049.html. Could we wait just a little bit on this PR, and once the ES connector refactoring is complete, come back and rebase this PR on it? You can follow the progress here: #2767. I'm trying to come up with the restructuring PR within the next day. Sorry for the extra wait needed on this, but it'll be good in the long run; I hope you can understand :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5357) WordCountTable fails
[ https://issues.apache.org/jira/browse/FLINK-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814005#comment-15814005 ] ASF GitHub Bot commented on FLINK-5357: --- Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95304042 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- I noticed that Calcite's RelBuilder has this method: `project(Iterable nodes, Iterable fieldNames, boolean force)`. Would it be more consistent to use this method with force set to `true`, instead of creating the projection ourselves? But either is fine with me. > WordCountTable fails > > > Key: FLINK-5357 > URL: https://issues.apache.org/jira/browse/FLINK-5357 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther > > The execution of org.apache.flink.table.examples.java.WordCountTable fails: > {code} > Exception in thread "main" org.apache.flink.table.api.TableException: POJO > does not define field name: TMP_0 > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:85) > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:81) > at scala.collection.immutable.List.foreach(List.scala:318) > at > org.apache.flink.table.typeutils.TypeConverter$.determineReturnType(TypeConverter.scala:81) > at > org.apache.flink.table.plan.nodes.dataset.DataSetCalc.translateToPlan(DataSetCalc.scala:110) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:305) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:289) > at > org.apache.flink.table.api.java.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:146) > at > org.apache.flink.table.examples.java.WordCountTable.main(WordCountTable.java:56) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3063: [FLINK-5357] [table] Fix dropped projections
Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95304042 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- I noticed that Calcite's RelBuilder has this method: `project(Iterable nodes, Iterable fieldNames, boolean force)`. Would it be more consistent to use this method with force set to `true`, instead of creating the projection ourselves? But either is fine with me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data
[ https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813852#comment-15813852 ] ASF GitHub Bot commented on FLINK-5280: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95292990 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/TableSource.scala --- @@ -19,22 +19,23 @@ package org.apache.flink.table.sources import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.table.api.TableEnvironment -/** Defines an external table by providing schema information, i.e., field names and types. +/** Defines an external table by providing schema information and used to produce a + * [[org.apache.flink.api.scala.DataSet]] or [[org.apache.flink.streaming.api.scala.DataStream]]. + * Schema information consists of a data type, field names, and corresponding indices of + * these names in the data type. + * + * To define a TableSource one need to implement [[TableSource#getReturnType]]. In this case + * field names and field indices are derived from the returned type. + * + * In case if custom field names are required one need to additionally implement --- End diff -- In case if -> In case of > Extend TableSource to support nested data > - > > Key: FLINK-5280 > URL: https://issues.apache.org/jira/browse/FLINK-5280 > Project: Flink > Issue Type: Improvement > Components: Table API & SQL >Affects Versions: 1.2.0 >Reporter: Fabian Hueske >Assignee: Ivan Mushketyk > > The {{TableSource}} interface does currently only support the definition of > flat rows. > However, there are several storage formats for nested data that should be > supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can > also natively handle nested rows. > The {{TableSource}} interface and the code to register table sources in > Calcite's schema need to be extended to support nested data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3039: [FLINK-5280] Update TableSource to support nested ...
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95292990 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/TableSource.scala --- @@ -19,22 +19,23 @@ package org.apache.flink.table.sources import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.table.api.TableEnvironment -/** Defines an external table by providing schema information, i.e., field names and types. +/** Defines an external table by providing schema information and used to produce a + * [[org.apache.flink.api.scala.DataSet]] or [[org.apache.flink.streaming.api.scala.DataStream]]. + * Schema information consists of a data type, field names, and corresponding indices of + * these names in the data type. + * + * To define a TableSource one need to implement [[TableSource#getReturnType]]. In this case + * field names and field indices are derived from the returned type. + * + * In case if custom field names are required one need to additionally implement --- End diff -- In case if -> In case of --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4692) Add tumbling group-windows for batch tables
[ https://issues.apache.org/jira/browse/FLINK-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813765#comment-15813765 ] ASF GitHub Bot commented on FLINK-4692: --- Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2938 It's weird. I can't reproduce the wrong results in my local environment. What is the `data` in your test, @twalthr? I'm using the following data in both the batch and the stream test, but the result is the same.
```scala
val data = List(
  (1L, 1, "Hi"),
  (2L, 2, "Hello"),
  (4L, 2, "Hello"),
  (8L, 3, "Hello world"),
  (6L, 3, "Hello world"))
```
cc @fhueske, could you help test this in your environment? > Add tumbling group-windows for batch tables > --- > > Key: FLINK-4692 > URL: https://issues.apache.org/jira/browse/FLINK-4692 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Jark Wu > > Add Tumble group-windows for batch tables as described in > [FLIP-11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations]. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2938: [FLINK-4692] [tableApi] Add tumbling group-windows for ba...
Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2938 It's weird. I can't reproduce the wrong results in my local environment. What is the `data` in your test, @twalthr? I'm using the following data in both the batch and the stream test, but the result is the same.
```scala
val data = List(
  (1L, 1, "Hi"),
  (2L, 2, "Hello"),
  (4L, 2, "Hello"),
  (8L, 3, "Hello world"),
  (6L, 3, "Hello world"))
```
cc @fhueske, could you help test this in your environment? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5364) Rework JAAS configuration to support user-supplied entries
[ https://issues.apache.org/jira/browse/FLINK-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813707#comment-15813707 ] ASF GitHub Bot commented on FLINK-5364: --- Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3057 @StephanEwen updated based on feedback, thanks again. > Rework JAAS configuration to support user-supplied entries > -- > > Key: FLINK-5364 > URL: https://issues.apache.org/jira/browse/FLINK-5364 > Project: Flink > Issue Type: Bug > Components: Cluster Management >Reporter: Eron Wright >Assignee: Eron Wright >Priority: Critical > Labels: kerberos, security > > Recent issues (see linked) have brought to light a critical deficiency in the > handling of JAAS configuration. > 1. the MapR distribution relies on an explicit JAAS conf, rather than > in-memory conf used by stock Hadoop. > 2. the ZK/Kafka/Hadoop security configuration is supposed to be independent > (one can enable each element separately) but isn't. > Perhaps we should rework the JAAS conf code to merge any user-supplied > configuration with our defaults, rather than using an all-or-nothing > approach. > We should also address some recent regressions: > 1. The HadoopSecurityContext should be installed regardless of auth mode, to > login with UserGroupInformation, which: > - handles the HADOOP_USER_NAME variable. > - installs an OS-specific user principal (from UnixLoginModule etc.) > unrelated to Kerberos. > - picks up the HDFS/HBASE delegation tokens. > 2. Fix the use of alternative authentication methods - delegation tokens and > Kerberos ticket cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3057: [FLINK-5364] Rework JAAS configuration to support user-su...
Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3057 @StephanEwen updated based on feedback, thanks again. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4988) Elasticsearch 5.x support
[ https://issues.apache.org/jira/browse/FLINK-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813546#comment-15813546 ] ASF GitHub Bot commented on FLINK-4988: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2767#discussion_r95287899 --- Diff: flink-connectors/flink-connector-elasticsearch5/src/main/java/org/apache/flink/streaming/connectors/elasticsearch5/ElasticsearchSink.java --- @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.streaming.connectors.elasticsearch5; + +import org.apache.flink.api.java.utils.ParameterTool; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.streaming.api.functions.sink.RichSinkFunction; +import org.apache.flink.util.Preconditions; +import org.elasticsearch.action.bulk.BulkItemResponse; +import org.elasticsearch.action.bulk.BulkProcessor; +import org.elasticsearch.action.bulk.BulkRequest; +import org.elasticsearch.action.bulk.BulkResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.network.NetworkModule; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.transport.TransportAddress; +import org.elasticsearch.common.unit.ByteSizeUnit; +import org.elasticsearch.common.unit.ByteSizeValue; +import org.elasticsearch.common.unit.TimeValue; +import org.elasticsearch.transport.Netty3Plugin; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetSocketAddress; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicReference; + +/** + * Sink that emits its input elements in bulk to an Elasticsearch cluster. + * + * + * The first {@link Map} passed to the constructor is forwarded to Elasticsearch when creating + * {@link TransportClient}. The config keys can be found in the Elasticsearch + * documentation. An important setting is {@code cluster.name}, this should be set to the name + * of the cluster that the sink should emit to. + * + * Attention: When using the {@code TransportClient} the sink will fail if no cluster + * can be connected to. + * + * The second {@link Map} is used to configure a {@link BulkProcessor} to send {@link IndexRequest IndexRequests}. + * This will buffer elements before sending a request to the cluster. 
The behaviour of the + * {@code BulkProcessor} can be configured using these config keys: + * + * {@code bulk.flush.max.actions}: Maximum amount of elements to buffer + * {@code bulk.flush.max.size.mb}: Maximum amount of data (in megabytes) to buffer + * {@code bulk.flush.interval.ms}: Interval at which to flush data regardless of the other two + * settings in milliseconds + * + * + * + * You also have to provide an {@link RequestIndexer}. This is used to create an + * {@link IndexRequest} from an element that needs to be added to Elasticsearch. See + * {@link RequestIndexer} for an example. + * + * @param Type of the elements emitted by this sink + */ +public class ElasticsearchSink extends RichSinkFunction { + + public static final String CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS = "bulk.flush.max.actions"; + public static final String CONFIG_KEY_BULK_FLUSH_MAX_SIZE_MB = "bulk.flush.max.size.mb"; + public static final String CONFIG_KEY_BULK_FLUSH_INTERVAL_MS = "bulk.flush.interval.ms"; + + private static final long serialVersionUID = 1L; + + private static f
[GitHub] flink pull request #2767: [FLINK-4988] Elasticsearch 5.x support
Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2767#discussion_r95287899 --- Diff: flink-connectors/flink-connector-elasticsearch5/src/main/java/org/apache/flink/streaming/connectors/elasticsearch5/ElasticsearchSink.java --- @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.streaming.connectors.elasticsearch5; + +import org.apache.flink.api.java.utils.ParameterTool; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.streaming.api.functions.sink.RichSinkFunction; +import org.apache.flink.util.Preconditions; +import org.elasticsearch.action.bulk.BulkItemResponse; +import org.elasticsearch.action.bulk.BulkProcessor; +import org.elasticsearch.action.bulk.BulkRequest; +import org.elasticsearch.action.bulk.BulkResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.network.NetworkModule; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.transport.TransportAddress; +import org.elasticsearch.common.unit.ByteSizeUnit; +import org.elasticsearch.common.unit.ByteSizeValue; +import org.elasticsearch.common.unit.TimeValue; +import org.elasticsearch.transport.Netty3Plugin; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetSocketAddress; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicReference; + +/** + * Sink that emits its input elements in bulk to an Elasticsearch cluster. + * + * + * The first {@link Map} passed to the constructor is forwarded to Elasticsearch when creating + * {@link TransportClient}. The config keys can be found in the Elasticsearch + * documentation. An important setting is {@code cluster.name}, this should be set to the name + * of the cluster that the sink should emit to. + * + * Attention: When using the {@code TransportClient} the sink will fail if no cluster + * can be connected to. + * + * The second {@link Map} is used to configure a {@link BulkProcessor} to send {@link IndexRequest IndexRequests}. + * This will buffer elements before sending a request to the cluster. 
The behaviour of the + * {@code BulkProcessor} can be configured using these config keys: + * + * {@code bulk.flush.max.actions}: Maximum amount of elements to buffer + * {@code bulk.flush.max.size.mb}: Maximum amount of data (in megabytes) to buffer + * {@code bulk.flush.interval.ms}: Interval at which to flush data regardless of the other two + * settings in milliseconds + * + * + * + * You also have to provide an {@link RequestIndexer}. This is used to create an + * {@link IndexRequest} from an element that needs to be added to Elasticsearch. See + * {@link RequestIndexer} for an example. + * + * @param Type of the elements emitted by this sink + */ +public class ElasticsearchSink extends RichSinkFunction { + + public static final String CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS = "bulk.flush.max.actions"; + public static final String CONFIG_KEY_BULK_FLUSH_MAX_SIZE_MB = "bulk.flush.max.size.mb"; + public static final String CONFIG_KEY_BULK_FLUSH_INTERVAL_MS = "bulk.flush.interval.ms"; + + private static final long serialVersionUID = 1L; + + private static final Logger LOG = LoggerFactory.getLogger(ElasticsearchSink.class); + + /** +* The user specified config map that we forward to Elasticsearch when we create the Client. +*/ + private final Map esConfig; + + /**
[jira] [Commented] (FLINK-5364) Rework JAAS configuration to support user-supplied entries
[ https://issues.apache.org/jira/browse/FLINK-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813450#comment-15813450 ] ASF GitHub Bot commented on FLINK-5364: --- Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95283348 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/JaasModule.java --- @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.runtime.security.modules; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.runtime.security.DynamicConfiguration; +import org.apache.flink.runtime.security.KerberosUtils; +import org.apache.flink.runtime.security.SecurityUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.security.auth.login.AppConfigurationEntry; +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.StandardCopyOption; + +/** + * Responsible for installing a process-wide JAAS configuration. + * + * The installed configuration combines login modules based on: + * - the user-supplied JAAS configuration file, if any + * - a Kerberos keytab, if configured + * - any cached Kerberos credentials from the current environment + * + * The module also installs a default JAAS config file (if necessary) for + * compatibility with ZK and Kafka. Note that the JRE actually draws on numerous file locations. 
+ * See: https://docs.oracle.com/javase/7/docs/jre/api/security/jaas/spec/com/sun/security/auth/login/ConfigFile.html + * See: https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/Login.java#L289 + */ +@Internal +public class JaasModule implements SecurityModule { + + private static final Logger LOG = LoggerFactory.getLogger(JaasModule.class); + + static final String JAVA_SECURITY_AUTH_LOGIN_CONFIG = "java.security.auth.login.config"; + + static final String JAAS_CONF_RESOURCE_NAME = "flink-jaas.conf"; + + private String priorConfigFile; + private javax.security.auth.login.Configuration priorConfig; + + private DynamicConfiguration currentConfig; + + @Override + public void install(SecurityUtils.SecurityConfiguration securityConfig) { + + // ensure that a config file is always defined, for compatibility with + // ZK and Kafka which check for the system property and existence of the file + priorConfigFile = System.getProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, null); + if (priorConfigFile == null) { + File configFile = generateDefaultConfigFile(); + System.setProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, configFile.getAbsolutePath()); + } + + // read the JAAS configuration file + priorConfig = javax.security.auth.login.Configuration.getConfiguration(); + + // construct a dynamic JAAS configuration + currentConfig = new DynamicConfiguration(priorConfig); + + // wire up the configured JAAS login contexts to use the krb5 entries + AppConfigurationEntry[] krb5Entries = getAppConfigurationEntries(securityConfig); + if(krb5Entries != null) { + for (String app : securityConfig.getLoginContextNames()) { + currentConfig.addAppConfigurationEntry(app, krb5Entries); + } + } + + javax.security.auth.login.Configuration.setConfiguration(currentConfig); + } + + @Override + public void uninstall() { + if(priorConfigFile != null) { + System.setProperty(JAVA_SEC
[GitHub] flink pull request #3057: [FLINK-5364] Rework JAAS configuration to support ...
Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95283348 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/JaasModule.java --- @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.runtime.security.modules; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.runtime.security.DynamicConfiguration; +import org.apache.flink.runtime.security.KerberosUtils; +import org.apache.flink.runtime.security.SecurityUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.security.auth.login.AppConfigurationEntry; +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.StandardCopyOption; + +/** + * Responsible for installing a process-wide JAAS configuration. + * + * The installed configuration combines login modules based on: + * - the user-supplied JAAS configuration file, if any + * - a Kerberos keytab, if configured + * - any cached Kerberos credentials from the current environment + * + * The module also installs a default JAAS config file (if necessary) for + * compatibility with ZK and Kafka. Note that the JRE actually draws on numerous file locations. 
+ * See: https://docs.oracle.com/javase/7/docs/jre/api/security/jaas/spec/com/sun/security/auth/login/ConfigFile.html + * See: https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/Login.java#L289 + */ +@Internal +public class JaasModule implements SecurityModule { + + private static final Logger LOG = LoggerFactory.getLogger(JaasModule.class); + + static final String JAVA_SECURITY_AUTH_LOGIN_CONFIG = "java.security.auth.login.config"; + + static final String JAAS_CONF_RESOURCE_NAME = "flink-jaas.conf"; + + private String priorConfigFile; + private javax.security.auth.login.Configuration priorConfig; + + private DynamicConfiguration currentConfig; + + @Override + public void install(SecurityUtils.SecurityConfiguration securityConfig) { + + // ensure that a config file is always defined, for compatibility with + // ZK and Kafka which check for the system property and existence of the file + priorConfigFile = System.getProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, null); + if (priorConfigFile == null) { + File configFile = generateDefaultConfigFile(); + System.setProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, configFile.getAbsolutePath()); + } + + // read the JAAS configuration file + priorConfig = javax.security.auth.login.Configuration.getConfiguration(); + + // construct a dynamic JAAS configuration + currentConfig = new DynamicConfiguration(priorConfig); + + // wire up the configured JAAS login contexts to use the krb5 entries + AppConfigurationEntry[] krb5Entries = getAppConfigurationEntries(securityConfig); + if(krb5Entries != null) { + for (String app : securityConfig.getLoginContextNames()) { + currentConfig.addAppConfigurationEntry(app, krb5Entries); + } + } + + javax.security.auth.login.Configuration.setConfiguration(currentConfig); + } + + @Override + public void uninstall() { + if(priorConfigFile != null) { + System.setProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, priorConfigFile); + } else { + System.clearProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG); + } + javax.security.auth.login.Configuration.setConfiguration(priorConfig);
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813305#comment-15813305 ] ASF GitHub Bot commented on FLINK-5395: --- Github user shijinkui commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95276351 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- That's fine. > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh only builds the official Flink release. It's hard to build > a custom local Flink release distribution. > Let create_release_files.sh support: > 1. a custom git repo URL > 2. building specific Scala and Hadoop versions > 3. adding `tools/flink` to .gitignore > 4. a usage message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user shijinkui commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95276351 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- That's fine. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5364) Rework JAAS configuration to support user-supplied entries
[ https://issues.apache.org/jira/browse/FLINK-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813113#comment-15813113 ] ASF GitHub Bot commented on FLINK-5364: --- Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95265401 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/SecurityUtils.java --- @@ -71,163 +64,93 @@ */ public static void install(SecurityConfiguration config) throws Exception { - if (!config.securityIsEnabled()) { - // do not perform any initialization if no Kerberos crendetails are provided - return; - } - - // establish the JAAS config - JaasConfiguration jaasConfig = new JaasConfiguration(config.keytab, config.principal); - javax.security.auth.login.Configuration.setConfiguration(jaasConfig); - - populateSystemSecurityProperties(config.flinkConf); - - // establish the UGI login user - UserGroupInformation.setConfiguration(config.hadoopConf); - - // only configure Hadoop security if we have security enabled - if (UserGroupInformation.isSecurityEnabled()) { - - final UserGroupInformation loginUser; - - if (config.keytab != null && !StringUtils.isBlank(config.principal)) { - String keytabPath = (new File(config.keytab)).getAbsolutePath(); - - UserGroupInformation.loginUserFromKeytab(config.principal, keytabPath); - - loginUser = UserGroupInformation.getLoginUser(); - - // supplement with any available tokens - String fileLocation = System.getenv(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION); - if (fileLocation != null) { - /* -* Use reflection API since the API semantics are not available in Hadoop1 profile. Below APIs are -* used in the context of reading the stored tokens from UGI. -* Credentials cred = Credentials.readTokenStorageFile(new File(fileLocation), config.hadoopConf); -* loginUser.addCredentials(cred); - */ - try { - Method readTokenStorageFileMethod = Credentials.class.getMethod("readTokenStorageFile", - File.class, org.apache.hadoop.conf.Configuration.class); - Credentials cred = (Credentials) readTokenStorageFileMethod.invoke(null, new File(fileLocation), - config.hadoopConf); - Method addCredentialsMethod = UserGroupInformation.class.getMethod("addCredentials", - Credentials.class); - addCredentialsMethod.invoke(loginUser, cred); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - } - } else { - // login with current user credentials (e.g. ticket cache) - try { - //Use reflection API to get the login user object - //UserGroupInformation.loginUserFromSubject(null); - Method loginUserFromSubjectMethod = UserGroupInformation.class.getMethod("loginUserFromSubject", Subject.class); - Subject subject = null; - loginUserFromSubjectMethod.invoke(null, subject); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - - // note that the stored tokens are read automatically - loginUser = UserGroupInformation.getLoginUser(); + // install the security modules + List modules = new ArrayList(); + try { + for (Class moduleClass : config.getSecurityModules(
[GitHub] flink pull request #3057: [FLINK-5364] Rework JAAS configuration to support ...
Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95265401 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/SecurityUtils.java --- @@ -71,163 +64,93 @@ */ public static void install(SecurityConfiguration config) throws Exception { - if (!config.securityIsEnabled()) { - // do not perform any initialization if no Kerberos crendetails are provided - return; - } - - // establish the JAAS config - JaasConfiguration jaasConfig = new JaasConfiguration(config.keytab, config.principal); - javax.security.auth.login.Configuration.setConfiguration(jaasConfig); - - populateSystemSecurityProperties(config.flinkConf); - - // establish the UGI login user - UserGroupInformation.setConfiguration(config.hadoopConf); - - // only configure Hadoop security if we have security enabled - if (UserGroupInformation.isSecurityEnabled()) { - - final UserGroupInformation loginUser; - - if (config.keytab != null && !StringUtils.isBlank(config.principal)) { - String keytabPath = (new File(config.keytab)).getAbsolutePath(); - - UserGroupInformation.loginUserFromKeytab(config.principal, keytabPath); - - loginUser = UserGroupInformation.getLoginUser(); - - // supplement with any available tokens - String fileLocation = System.getenv(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION); - if (fileLocation != null) { - /* -* Use reflection API since the API semantics are not available in Hadoop1 profile. Below APIs are -* used in the context of reading the stored tokens from UGI. -* Credentials cred = Credentials.readTokenStorageFile(new File(fileLocation), config.hadoopConf); -* loginUser.addCredentials(cred); - */ - try { - Method readTokenStorageFileMethod = Credentials.class.getMethod("readTokenStorageFile", - File.class, org.apache.hadoop.conf.Configuration.class); - Credentials cred = (Credentials) readTokenStorageFileMethod.invoke(null, new File(fileLocation), - config.hadoopConf); - Method addCredentialsMethod = UserGroupInformation.class.getMethod("addCredentials", - Credentials.class); - addCredentialsMethod.invoke(loginUser, cred); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - } - } else { - // login with current user credentials (e.g. ticket cache) - try { - //Use reflection API to get the login user object - //UserGroupInformation.loginUserFromSubject(null); - Method loginUserFromSubjectMethod = UserGroupInformation.class.getMethod("loginUserFromSubject", Subject.class); - Subject subject = null; - loginUserFromSubjectMethod.invoke(null, subject); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - - // note that the stored tokens are read automatically - loginUser = UserGroupInformation.getLoginUser(); + // install the security modules + List modules = new ArrayList(); + try { + for (Class moduleClass : config.getSecurityModules()) { + SecurityModule module = moduleClass.newInstance(); + module.install(config); + modules.add(module); } + } +
[jira] [Created] (FLINK-5432) ContinuousFileMonitoringFunction is not monitoring nested files
Yassine Marzougui created FLINK-5432: Summary: ContinuousFileMonitoringFunction is not monitoring nested files Key: FLINK-5432 URL: https://issues.apache.org/jira/browse/FLINK-5432 Project: Flink Issue Type: Bug Components: filesystem-connector Affects Versions: 1.2.0 Reporter: Yassine Marzougui The {{ContinuousFileMonitoringFunction}} does not monitor nested files even if the inputformat has NestedFileEnumeration set to true. This can be fixed by enabling a recursive scan of the directories in the {{listEligibleFiles}} method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
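For illustration, a rough sketch of the recursive scan the issue proposes, using Flink's `FileSystem`/`FileStatus` API; the method and parameter names are illustrative, not the actual patch:

```scala
import org.apache.flink.core.fs.{FileStatus, FileSystem, Path}
import scala.collection.mutable.ListBuffer

// Collect candidate files, descending into sub-directories so nested files
// become eligible for monitoring as well.
def listEligibleFiles(fileSystem: FileSystem, path: Path): Seq[FileStatus] = {
  val result = ListBuffer[FileStatus]()
  val statuses = fileSystem.listStatus(path)
  if (statuses != null) {
    for (status <- statuses) {
      if (status.isDir) {
        result ++= listEligibleFiles(fileSystem, status.getPath) // recurse into nested directories
      } else {
        result += status
      }
    }
  }
  result
}
```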
[jira] [Commented] (FLINK-5386) Refactoring Window Clause
[ https://issues.apache.org/jira/browse/FLINK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812931#comment-15812931 ] ASF GitHub Bot commented on FLINK-5386: --- Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95244683 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/windows.scala --- @@ -150,7 +147,7 @@ class TumblingWindow(size: Expression) extends GroupWindow { def as(alias: String): TumblingWindow = as(ExpressionParser.parseExpression(alias)) override private[flink] def toLogicalWindow: LogicalWindow = -ProcessingTimeTumblingGroupWindow(alias, size) +ProcessingTimeTumblingGroupWindow(name, size) --- End diff -- Better to keep using "alias" here. By the way, I found that some functions take "name" as input, like ProcessingTimeTumblingGroupWindow, while some others take "alias", like TumblingEventTimeWindow. Better to change them consistently to "alias". What do you think? > Refactoring Window Clause > - > > Key: FLINK-5386 > URL: https://issues.apache.org/jira/browse/FLINK-5386 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: sunjincheng >Assignee: sunjincheng > > Similar to SQL, a window clause is defined "as" a symbol which is > explicitly used in groupBy/over. We are proposing to refactor the way to > write groupBy+window Table API queries as follows: > val windowedTable = table > .window(Slide over 10.milli every 5.milli as 'w1) > .window(Tumble over 5.milli as 'w2) > .groupBy('w1, 'key) > .select('string, 'int.count as 'count, 'w1.start) > .groupBy( 'w2, 'key) > .select('string, 'count.sum as sum2) > .window(Tumble over 5.milli as 'w3) > .groupBy( 'w3) // windowAll > .select('sum2, 'w3.start, 'w3.end) > In this way, we can remove both GroupWindowedTable and the window() method in > GroupedTable, which makes the API a bit cleaner. In addition, for row-window, we > need to define the window clause as a symbol anyway. This change will make the > API of window and row-window consistent; an example for row-window: > .window(RowXXXWindow as ‘x, RowYYYWindow as ‘y) > .select(‘a, ‘b.count over ‘x as ‘xcnt, ‘c.count over ‘y as ‘ycnt, ‘x.start, > ‘x.end) > What do you think? [~fhueske] [~twalthr] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3046: [FLINK-5386][Table API & SQL] refactoring Window C...
Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95244683 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/windows.scala --- @@ -150,7 +147,7 @@ class TumblingWindow(size: Expression) extends GroupWindow { def as(alias: String): TumblingWindow = as(ExpressionParser.parseExpression(alias)) override private[flink] def toLogicalWindow: LogicalWindow = -ProcessingTimeTumblingGroupWindow(alias, size) +ProcessingTimeTumblingGroupWindow(name, size) --- End diff -- Better to keep using "alias" here. By the way, I found that some functions take "name" as input, like ProcessingTimeTumblingGroupWindow, while some others take "alias", like TumblingEventTimeWindow. Better to change them consistently to "alias". What do you think? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5355) Handle AmazonKinesisException gracefully in Kinesis Streaming Connector
[ https://issues.apache.org/jira/browse/FLINK-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812879#comment-15812879 ] ASF GitHub Bot commented on FLINK-5355: --- Github user skidder commented on the issue: https://github.com/apache/flink/pull/3078 Thanks @tzulitai , great feedback! The `ProvisionedThroughputExceededException` exception will be reported with an HTTP 400 response status code: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html#API_GetRecords_Errors The AWS SDK will assign the `ErrorType.Client` type to all exceptions with an HTTP status code less than 500 ([AWS SDK source](https://github.com/aws/aws-sdk-java/blob/7844c64cf248aed889811bf2e871ad6b276a89ca/aws-java-sdk-core/src/main/java/com/amazonaws/http/JsonErrorResponseHandler.java#L119-L121)). Perhaps we can perform exponential-backoff for exceptions where: * `Client` error of type `ProvisionedThroughputExceededException` * All `Server` errors (e.g. HTTP 500, 503) * All `Unknown` errors (appear to be limited to errors unmarshalling the Kinesis service response) All other exceptions can be thrown up. What are your thoughts? > Handle AmazonKinesisException gracefully in Kinesis Streaming Connector > --- > > Key: FLINK-5355 > URL: https://issues.apache.org/jira/browse/FLINK-5355 > Project: Flink > Issue Type: Improvement > Components: Kinesis Connector >Affects Versions: 1.2.0, 1.1.3 >Reporter: Scott Kidder >Assignee: Scott Kidder > > My Flink job that consumes from a Kinesis stream must be restarted at least > once daily due to an uncaught AmazonKinesisException when reading from > Kinesis. The complete stacktrace looks like: > {noformat} > com.amazonaws.services.kinesis.model.AmazonKinesisException: null (Service: > AmazonKinesis; Status Code: 500; Error Code: InternalFailure; Request ID: > dc1b7a1a-1b97-1a32-8cd5-79a896a55223) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1545) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1183) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:964) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:676) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:650) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:633) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:601) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:583) > at > com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:447) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:1747) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:1723) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:858) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getRecords(KinesisProxy.java:193) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.getRecords(ShardConsumer.java:268) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:176) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > It's interesting that the Kinesis endpoint returned a 500 status code, but > that's outside the scope of this issue. > I think we can handle this exception in the same manner as a > ProvisionedThroughputException: performing an exponential backoff and > retrying a finite number of times before throwing an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3078: [FLINK-5355] Handle AmazonKinesisException gracefully in ...
Github user skidder commented on the issue: https://github.com/apache/flink/pull/3078 Thanks @tzulitai , great feedback! The `ProvisionedThroughputExceededException` exception will be reported with an HTTP 400 response status code: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html#API_GetRecords_Errors The AWS SDK will assign the `ErrorType.Client` type to all exceptions with an HTTP status code less than 500 ([AWS SDK source](https://github.com/aws/aws-sdk-java/blob/7844c64cf248aed889811bf2e871ad6b276a89ca/aws-java-sdk-core/src/main/java/com/amazonaws/http/JsonErrorResponseHandler.java#L119-L121)). Perhaps we can perform exponential-backoff for exceptions where: * `Client` error of type `ProvisionedThroughputExceededException` * All `Server` errors (e.g. HTTP 500, 503) * All `Unknown` errors (appear to be limited to errors unmarshalling the Kinesis service response) All other exceptions can be thrown up. What are your thoughts? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
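As a rough illustration of the policy proposed in the comment above, the sketch below classifies an `AmazonServiceException` and retries only the recoverable cases with exponential backoff. The class and helper names, the retry parameters, and the use of a plain `Callable` are illustrative assumptions; they are not the connector's actual code or configuration.

```java
import java.util.concurrent.Callable;

import com.amazonaws.AmazonServiceException;
import com.amazonaws.AmazonServiceException.ErrorType;

class KinesisRetrySketch {

    // Client errors are only retried when they signal throttling; service-side
    // (HTTP 5xx) and unknown (unmarshalling) errors are always retried.
    static boolean isRecoverable(AmazonServiceException e) {
        if (e.getErrorType() == ErrorType.Client) {
            return "ProvisionedThroughputExceededException".equals(e.getErrorCode());
        }
        return true;
    }

    // Retry 'call' with exponential backoff until it succeeds, the error is not
    // recoverable, or the retry budget is exhausted.
    static <T> T callWithBackoff(Callable<T> call, int maxRetries, long baseBackoffMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return call.call();
            } catch (AmazonServiceException e) {
                if (!isRecoverable(e) || attempt >= maxRetries) {
                    throw e;
                }
                Thread.sleep(baseBackoffMillis << attempt); // 1x, 2x, 4x, ...
            }
        }
    }
}
```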
[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
[ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812569#comment-15812569 ] ASF GitHub Bot commented on FLINK-2821: --- Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 short: it works! long: my scenario: a docker-compose setup:

```
jobmanager:
  image: flink:1.2.0_rc0
  container_name: "jobmanager"
  expose:
    - "6123"
    - "8081"
  ports:
    - "8081:8081"
    - "10.0.0.11:6123:6123"
  command: jobmanager
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
```

and the taskmanager(s):

```
taskmanager:
  image: flink:1.2.0_rc0
  expose:
    - "6121"
    - "6122"
  command: taskmanager - "jobmanager:jobmanager"
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
  extra_hosts:
    - "jobmanager:10.0.0.11"
```

1. the jobmanager web interface works now
2. the taskmanager can now connect to the jobmanager

previous error message: > dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@10.0.0.11:6123/]] arriving at [akka.tcp://flink@10.0.0.11:6123] inbound addresses are [akka.tcp://flink@172.31.0.2:6123] > flink-master| 2017-01-09 15:30:39,865 ERROR akka.remote.EndpointWriter

that's very cool! > Change Akka configuration to allow accessing actors from different URLs > --- > > Key: FLINK-2821 > URL: https://issues.apache.org/jira/browse/FLINK-2821 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Reporter: Robert Metzger >Assignee: Maximilian Michels > Fix For: 1.2.0 > > > Akka expects the actor's URL to be exactly matching. > As pointed out here, cases where users were complaining about this: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html > - Proxy routing (as described here, send to the proxy URL, receiver > recognizes only original URL) > - Using hostname / IP interchangeably does not work (we solved this by > always putting IP addresses into URLs, never hostnames) > - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still > no solution to that (but seems not too much of a restriction) > I am aware that this is not possible due to Akka, so it is actually not a > Flink bug. But I think we should track the resolution of the issue here > anyways because its affecting our user's satisfaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...
Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 short: it works! long: my scenario: a docker-compose setup:

```
jobmanager:
  image: flink:1.2.0_rc0
  container_name: "jobmanager"
  expose:
    - "6123"
    - "8081"
  ports:
    - "8081:8081"
    - "10.0.0.11:6123:6123"
  command: jobmanager
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
```

and the taskmanager(s):

```
taskmanager:
  image: flink:1.2.0_rc0
  expose:
    - "6121"
    - "6122"
  command: taskmanager - "jobmanager:jobmanager"
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
  extra_hosts:
    - "jobmanager:10.0.0.11"
```

1. the jobmanager web interface works now
2. the taskmanager can now connect to the jobmanager

previous error message: > dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@10.0.0.11:6123/]] arriving at [akka.tcp://flink@10.0.0.11:6123] inbound addresses are [akka.tcp://flink@172.31.0.2:6123] > flink-master| 2017-01-09 15:30:39,865 ERROR akka.remote.EndpointWriter

that's very cool! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3046: [FLINK-5386][Table API & SQL] refactoring Window C...
Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95219196 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/table.scala --- @@ -785,13 +793,19 @@ class Table( * will be processed by a single operator. * * @param groupWindow group-window that specifies how elements are grouped. -* @return A windowed table. */ - def window(groupWindow: GroupWindow): GroupWindowedTable = { + def window(groupWindow: GroupWindow): Table = { if (tableEnv.isInstanceOf[BatchTableEnvironment]) { throw new ValidationException(s"Windows on batch tables are currently not supported.") } -new GroupWindowedTable(this, Seq(), groupWindow) +if (None == groupWindow.name) { + throw new ValidationException("An alias must be specified for the window.") +} +if (windowPool.contains(groupWindow.name.get)) { + throw new ValidationException("The window alias can not be duplicated.") --- End diff -- s/"The window alias can not be duplicated."/"The same window alias has been defined"/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5386) Refactoring Window Clause
[ https://issues.apache.org/jira/browse/FLINK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812507#comment-15812507 ] ASF GitHub Bot commented on FLINK-5386: --- Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95219196 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/table.scala --- @@ -785,13 +793,19 @@ class Table( * will be processed by a single operator. * * @param groupWindow group-window that specifies how elements are grouped. -* @return A windowed table. */ - def window(groupWindow: GroupWindow): GroupWindowedTable = { + def window(groupWindow: GroupWindow): Table = { if (tableEnv.isInstanceOf[BatchTableEnvironment]) { throw new ValidationException(s"Windows on batch tables are currently not supported.") } -new GroupWindowedTable(this, Seq(), groupWindow) +if (None == groupWindow.name) { + throw new ValidationException("An alias must be specified for the window.") +} +if (windowPool.contains(groupWindow.name.get)) { + throw new ValidationException("The window alias can not be duplicated.") --- End diff -- s/"The window alias can not be duplicated."/"The same window alias has been defined"/ > Refactoring Window Clause > - > > Key: FLINK-5386 > URL: https://issues.apache.org/jira/browse/FLINK-5386 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: sunjincheng >Assignee: sunjincheng > > Similar to SQL, a window clause is defined "as" a symbol which is > explicitly used in groupBy/over. We are proposing to refactor the way to > write groupBy+window Table API queries as follows: > val windowedTable = table > .window(Slide over 10.milli every 5.milli as 'w1) > .window(Tumble over 5.milli as 'w2) > .groupBy('w1, 'key) > .select('string, 'int.count as 'count, 'w1.start) > .groupBy( 'w2, 'key) > .select('string, 'count.sum as sum2) > .window(Tumble over 5.milli as 'w3) > .groupBy( 'w3) // windowAll > .select('sum2, 'w3.start, 'w3.end) > In this way, we can remove both GroupWindowedTable and the window() method in > GroupedTable, which makes the API a bit cleaner. In addition, for row-window, we > need to define the window clause as a symbol anyway. This change will make the > API of window and row-window consistent; an example for row-window: > .window(RowXXXWindow as ‘x, RowYYYWindow as ‘y) > .select(‘a, ‘b.count over ‘x as ‘xcnt, ‘c.count over ‘y as ‘ycnt, ‘x.start, > ‘x.end) > What do you think? [~fhueske] [~twalthr] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3061: [FLINK-5030] Support hostname verification
Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3061 @tillrohrmann @StephanEwen --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5030) Support hostname verification
[ https://issues.apache.org/jira/browse/FLINK-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812494#comment-15812494 ] ASF GitHub Bot commented on FLINK-5030: --- Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3061 @tillrohrmann @StephanEwen > Support hostname verification > - > > Key: FLINK-5030 > URL: https://issues.apache.org/jira/browse/FLINK-5030 > Project: Flink > Issue Type: Sub-task > Components: Security >Reporter: Eron Wright >Assignee: Eron Wright > Fix For: 1.2.0 > > > _See [Dangerous Code|http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf] and > [further > commentary|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] > for useful background._ > When hostname verification is performed, it should use the hostname (not IP > address) to match the certificate. The current code is wrongly using the > address. > In technical terms, ensure that calls to `SSLContext::createSSLEngine` supply > the expected hostname, not host address. > Please audit all SSL setup code as to whether hostname verification is > enabled, and file follow-ups where necessary. For example, Akka 2.4 > supports it but 2.3 doesn't > ([ref|http://doc.akka.io/docs/akka/2.4.4/scala/http/client-side/https-support.html#Hostname_verification]). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/3059 It does, thanks! I'll try to remember to include "closes #_{pr number}_" in my commit messages which might be handy. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user uce commented on the issue: https://github.com/apache/flink/pull/3059 Since the comments were very minor, I addressed them myself (used the backticks and skipped the change in "If checkpointing is activated..." as it didn't really improve anything). The workflow was as follows: I checked out your PR branch, rebased it onto master (via `git rebase apache/master`) and pushed it with another related commit. My related commit had a comment `This closes #3059`, which tells the ASF bot to close this PR. Then I cherry-picked it to the 1.2 branch. I think GitHub only notices that the changes were merged if they are merged to master and the commit hashes are unchanged. Since I rebased, they did change, and I needed the manual close. Does this make sense? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (FLINK-5431) time format for akka status
Alexey Diomin created FLINK-5431: Summary: time format for akka status Key: FLINK-5431 URL: https://issues.apache.org/jira/browse/FLINK-5431 Project: Flink Issue Type: Improvement Reporter: Alexey Diomin Priority: Minor In ExecutionGraphMessages we have code {code} private val DATE_FORMATTER: SimpleDateFormat = new SimpleDateFormat("MM/dd/ HH:mm:ss") {code} But sometimes it cause confusion when main logger configured with "dd/MM/". We need making this format configurable or maybe stay only "HH:mm:ss" for prevent misunderstanding output date-time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
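A sketch of one way to make the pattern configurable, as suggested in the issue; the configuration key used below is purely illustrative and not an existing Flink option, and the fallback defaults to the unambiguous "HH:mm:ss" proposed above.

```java
import java.text.SimpleDateFormat;

import org.apache.flink.configuration.Configuration;

class StatusDateFormatSketch {

    // Read the date-time pattern from the Flink configuration; the key name is a
    // placeholder for whatever option the fix would actually introduce.
    static SimpleDateFormat fromConfig(Configuration config) {
        String pattern = config.getString("jobmanager.status-message.date-format", "HH:mm:ss");
        return new SimpleDateFormat(pattern);
    }
}
```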
[jira] [Closed] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Mushketyk closed FLINK-5360. - Resolution: Fixed > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user mushketyk closed the pull request at: https://github.com/apache/flink/pull/3022 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812358#comment-15812358 ] ASF GitHub Bot commented on FLINK-5360: --- Github user mushketyk closed the pull request at: https://github.com/apache/flink/pull/3022 > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812357#comment-15812357 ] ASF GitHub Bot commented on FLINK-5360: --- Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/3022 Thank you @aljoscha, @xhumanoid Closing the issue. > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/3022 Thank you @aljoscha, @xhumanoid Closing the issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/3059 I'm curious as to how you added my commit to `master` and `release-1.2` but it doesn't show up in this PR. The whole "rehevkor5 committed with uce 5 days ago" thing. Github isn't aware that the changes in this PR were merged. If you have a moment, can you describe your workflow so I can understand what's happening behind the scenes? Is it rebase onto master + merge --ff-only or something? Maybe it's due to the fact that Github is only mirroring another repo? I'd appreciate it! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
[ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812341#comment-15812341 ] ASF GitHub Bot commented on FLINK-2821: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2917 There is a release candidate out: https://github.com/apache/flink/tree/release-1.2.0-rc0 Download at http://people.apache.org/~rmetzger/flink-1.2.0-rc0/ The official release comes when the testing phase is done, which should be soon. If you can try the release candidate and tell us if it worked for your scenario, that would help with the release testing. > Change Akka configuration to allow accessing actors from different URLs > --- > > Key: FLINK-2821 > URL: https://issues.apache.org/jira/browse/FLINK-2821 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Reporter: Robert Metzger >Assignee: Maximilian Michels > Fix For: 1.2.0 > > > Akka expects the actor's URL to be exactly matching. > As pointed out here, cases where users were complaining about this: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html > - Proxy routing (as described here, send to the proxy URL, receiver > recognizes only original URL) > - Using hostname / IP interchangeably does not work (we solved this by > always putting IP addresses into URLs, never hostnames) > - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still > no solution to that (but seems not too much of a restriction) > I am aware that this is not possible due to Akka, so it is actually not a > Flink bug. But I think we should track the resolution of the issue here > anyways because its affecting our user's satisfaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2917 There is a release candidate out: https://github.com/apache/flink/tree/release-1.2.0-rc0 Download at http://people.apache.org/~rmetzger/flink-1.2.0-rc0/ The official release comes when the testing phase is done, which should be soon. If you can try the release candidate and tell us if it worked for your scenario, that would help with the release testing. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202903 --- Diff: tools/create_release_files.sh --- @@ -85,18 +88,74 @@ else MD5SUM="md5sum" fi +usage() { + set +x + echo "./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + echo "" + echo "usage:" + echo "[--scala-version ] [--hadoop-version ]" + echo "" + echo "example:" + echo " sonatype_user=APACHEID sonatype_pw=APACHEIDPASSWORD \ " + echo " NEW_VERSION=1.2.0 \ " + echo " RELEASE_CANDIDATE="rc1" RELEASE_BRANCH=release-1.2.0 \ " + echo " OLD_VERSION=1.1-SNAPSHOT \ " + echo " USER_NAME=APACHEID \ " + echo " GPG_PASSPHRASE=XXX GPG_KEY=KEYID \ " + echo " GIT_AUTHOR=\"`git config --get user.name` <`git config --get user.email`>\" \ " + echo " IS_LOCAL_DIST=true GIT_REPO=github.com/apache/flink.git" + echo " ./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + exit 1 +} + +# Parse arguments +while (( "$#" )); do + case $1 in +--scala-version) + scalaV="$2" + shift + ;; +--hadoop-version) + hadoopV="$2" + shift + ;; +--help) + usage + ;; +*) + break + ;; + esac + shift +done + +### prepare() { # prepare - git clone http://git-wip-us.apache.org/repos/asf/flink.git flink + target_branch=release-$RELEASE_VERSION-$RELEASE_CANDIDATE + if [ ! -d ./flink ]; then +git clone http://$GIT_REPO flink + else +# if flink git repo exist, delete target branch, delete builded distribution +rm -rf flink-*.tgz +cd flink +# try-catch +{ + git pull --all + git checkout master + git branch -D $target_branch -f +} || { + echo "branch $target_branch maybe not found" --- End diff -- I think the error message should just be `branch $target_branch not found` without the maybe. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95203611 --- Diff: tools/create_release_files.sh --- @@ -201,19 +262,34 @@ prepare make_source_release -make_binary_release "hadoop2" "" 2.10 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 - -make_binary_release "hadoop2" "" 2.11 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 - -copy_data - -deploy_to_maven +# build dist by input parameter of "--scala-vervion xxx --hadoop-version xxx" +if [ "$scalaV" == "none" ] && [ "$hadoopV" == "none" ]; then + make_binary_release "hadoop2" "" 2.10 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 + + make_binary_release "hadoop2" "" 2.11 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 +elif [ "$scalaV" == none ] && [ "$hadoopV" != "none" ] --- End diff -- Can we add the `"..."` for all strings to have it consistent? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202566 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- Should we rename this to `SCALA_VERSION` and `HADOOP_VERSION` in order to follow the style of the other variables? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812336#comment-15812336 ] ASF GitHub Bot commented on FLINK-5395: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202566 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- Should we rename this to `SCALA_VERSION` and `HADOOP_VERSION` in order to follow the style of the other variables? > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh is build flink release only. It's hard to build > custom local Flink release distribution. > Let create_release_files.sh support: > 1. custom git repo url > 2. custom build special scala and hadoop version > 3. add `tools/flink` to .gitignore > 4. add usage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812337#comment-15812337 ] ASF GitHub Bot commented on FLINK-5395: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95203611 --- Diff: tools/create_release_files.sh --- @@ -201,19 +262,34 @@ prepare make_source_release -make_binary_release "hadoop2" "" 2.10 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 - -make_binary_release "hadoop2" "" 2.11 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 - -copy_data - -deploy_to_maven +# build dist by input parameter of "--scala-vervion xxx --hadoop-version xxx" +if [ "$scalaV" == "none" ] && [ "$hadoopV" == "none" ]; then + make_binary_release "hadoop2" "" 2.10 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 + + make_binary_release "hadoop2" "" 2.11 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 +elif [ "$scalaV" == none ] && [ "$hadoopV" != "none" ] --- End diff -- Can we add the `"..."` for all strings to have it consistent? > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh is build flink release only. It's hard to build > custom local Flink release distribution. > Let create_release_files.sh support: > 1. custom git repo url > 2. custom build special scala and hadoop version > 3. add `tools/flink` to .gitignore > 4. add usage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812338#comment-15812338 ] ASF GitHub Bot commented on FLINK-5395: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202903 --- Diff: tools/create_release_files.sh --- @@ -85,18 +88,74 @@ else MD5SUM="md5sum" fi +usage() { + set +x + echo "./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + echo "" + echo "usage:" + echo "[--scala-version ] [--hadoop-version ]" + echo "" + echo "example:" + echo " sonatype_user=APACHEID sonatype_pw=APACHEIDPASSWORD \ " + echo " NEW_VERSION=1.2.0 \ " + echo " RELEASE_CANDIDATE="rc1" RELEASE_BRANCH=release-1.2.0 \ " + echo " OLD_VERSION=1.1-SNAPSHOT \ " + echo " USER_NAME=APACHEID \ " + echo " GPG_PASSPHRASE=XXX GPG_KEY=KEYID \ " + echo " GIT_AUTHOR=\"`git config --get user.name` <`git config --get user.email`>\" \ " + echo " IS_LOCAL_DIST=true GIT_REPO=github.com/apache/flink.git" + echo " ./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + exit 1 +} + +# Parse arguments +while (( "$#" )); do + case $1 in +--scala-version) + scalaV="$2" + shift + ;; +--hadoop-version) + hadoopV="$2" + shift + ;; +--help) + usage + ;; +*) + break + ;; + esac + shift +done + +### prepare() { # prepare - git clone http://git-wip-us.apache.org/repos/asf/flink.git flink + target_branch=release-$RELEASE_VERSION-$RELEASE_CANDIDATE + if [ ! -d ./flink ]; then +git clone http://$GIT_REPO flink + else +# if flink git repo exist, delete target branch, delete builded distribution +rm -rf flink-*.tgz +cd flink +# try-catch +{ + git pull --all + git checkout master + git branch -D $target_branch -f +} || { + echo "branch $target_branch maybe not found" --- End diff -- I think the error message should just be `branch $target_branch not found` without the maybe. > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh is build flink release only. It's hard to build > custom local Flink release distribution. > Let create_release_files.sh support: > 1. custom git repo url > 2. custom build special scala and hadoop version > 3. add `tools/flink` to .gitignore > 4. add usage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/3059 @uce Thanks. I didn't get an opportunity to address your comments... is there anything you want me to do? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812311#comment-15812311 ] ASF GitHub Bot commented on FLINK-5360: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 Merged. Could you please close this PR and the Jira issue? > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 Merged. Could you please close this PR and the Jira issue? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4651) Re-register processing time timers at the WindowOperator upon recovery.
[ https://issues.apache.org/jira/browse/FLINK-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812288#comment-15812288 ] Scott Kidder commented on FLINK-4651: - I can confirm that this issue appears to be fixed in the release-1.2.0-rc0 branch (commit f3c59cedae7a508825d8032a8aa9f5af6d177555). > Re-register processing time timers at the WindowOperator upon recovery. > --- > > Key: FLINK-4651 > URL: https://issues.apache.org/jira/browse/FLINK-4651 > Project: Flink > Issue Type: Bug > Components: Windowing Operators >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > Labels: windows > Fix For: 1.2.0, 1.1.5 > > > Currently the {{WindowOperator}} checkpoints the processing time timers, but > upon recovery it does not re-registers them with the {{TimeServiceProvider}}. > To actually reprocess them it relies on another element that will come and > register a new timer for a future point in time. Although this is a realistic > assumption in long running jobs, we can remove this assumption by > re-registering the restored timers with the {{TimeServiceProvider}} in the > {{open()}} method of the {{WindowOperator}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
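To make the re-registration idea concrete, here is a minimal standalone sketch. It deliberately uses a plain `ScheduledExecutorService` rather than the operator's actual `TimeServiceProvider`, so the class, method names, and types are stand-ins, not the WindowOperator API.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class TimerReRegistrationSketch {

    private final ScheduledExecutorService timerService = Executors.newSingleThreadScheduledExecutor();

    // Called once after state restore (conceptually in open()): every restored
    // timer is scheduled again; timers that expired while the job was down get a
    // delay of zero and therefore fire immediately.
    void reRegisterRestoredTimers(List<Long> restoredTimerTimestamps, Runnable onTimer) {
        long now = System.currentTimeMillis();
        for (long timestamp : restoredTimerTimestamps) {
            long delay = Math.max(0L, timestamp - now);
            timerService.schedule(onTimer, delay, TimeUnit.MILLISECONDS);
        }
    }
}
```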
[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
[ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812287#comment-15812287 ] ASF GitHub Bot commented on FLINK-2821: --- Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 when is this planned to be released? > Change Akka configuration to allow accessing actors from different URLs > --- > > Key: FLINK-2821 > URL: https://issues.apache.org/jira/browse/FLINK-2821 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Reporter: Robert Metzger >Assignee: Maximilian Michels > Fix For: 1.2.0 > > > Akka expects the actor's URL to be exactly matching. > As pointed out here, cases where users were complaining about this: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html > - Proxy routing (as described here, send to the proxy URL, receiver > recognizes only original URL) > - Using hostname / IP interchangeably does not work (we solved this by > always putting IP addresses into URLs, never hostnames) > - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still > no solution to that (but seems not too much of a restriction) > I am aware that this is not possible due to Akka, so it is actually not a > Flink bug. But I think we should track the resolution of the issue here > anyways because its affecting our user's satisfaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...
Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 when is this planned to be released? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5178) allow BlobCache to use a distributed file system irrespective of the HA mode
[ https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812271#comment-15812271 ] ASF GitHub Bot commented on FLINK-5178: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3085 [FLINK-5178] allow BlobCache to use a distributed file system irrespective of the HA mode Allow the BlobServer and BlobCache to use a distributed file system for distributing BLOBs even if not in HA-mode. For this, we always try to use the path given by the `high-availability.storageDir` config option and, if accessible, set it up appropriately. This builds upon https://github.com/apache/flink/pull/3084 You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5178a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3085.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3085 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. 
commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly commit b3bcf944df87f37cccd831e8fb56b95caa620dad Author: Nico Kruber Date: 2017-01-09T13:41:59Z [FLINK-5129] let FileSystemBlobStore#get() remove the target file on failure If the copy fai
[GitHub] flink pull request #3085: [FLINK-5178] allow BlobCache to use a distributed ...
GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3085 [FLINK-5178] allow BlobCache to use a distributed file system irrespective of the HA mode Allow the BlobServer and BlobCache to use a distributed file system for distributing BLOBs even if not in HA-mode. For this, we always try to use the path given by the `high-availability.storageDir` config option and, if accessible, set it up appropriately. This builds upon https://github.com/apache/flink/pull/3084 You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5178a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3085.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3085 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. 
commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly commit b3bcf944df87f37cccd831e8fb56b95caa620dad Author: Nico Kruber Date: 2017-01-09T13:41:59Z [FLINK-5129] let FileSystemBlobStore#get() remove the target file on failure If the copy fails, an IOException was thrown but the target file remained and was (most likely) not finished. This cleans up the file in that case so that code above, e.g. BlobServer and BlobCache, can rely on a file being complete as long as it exists. co
[jira] [Commented] (FLINK-4988) Elasticsearch 5.x support
[ https://issues.apache.org/jira/browse/FLINK-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812217#comment-15812217 ] ASF GitHub Bot commented on FLINK-4988: --- Github user mikedias commented on the issue: https://github.com/apache/flink/pull/2767 @tzulitai sure, no problem! :) > Elasticsearch 5.x support > - > > Key: FLINK-4988 > URL: https://issues.apache.org/jira/browse/FLINK-4988 > Project: Flink > Issue Type: New Feature >Reporter: Mike Dias > > Elasticsearch 5.x was released: > https://www.elastic.co/blog/elasticsearch-5-0-0-released -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2767: [FLINK-4988] Elasticsearch 5.x support
Github user mikedias commented on the issue: https://github.com/apache/flink/pull/2767 @tzulitai sure, no problem! :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3555) Web interface does not render job information properly
[ https://issues.apache.org/jira/browse/FLINK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812183#comment-15812183 ] ASF GitHub Bot commented on FLINK-3555: --- Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/2941 Imo, the best way to achieve the change equivalent to the change to vendor.css file would be to add a new class, say, .panel-body-flowable in index.styl which defines the overflow rule, and add this class to the elements wherever needed. > Web interface does not render job information properly > -- > > Key: FLINK-3555 > URL: https://issues.apache.org/jira/browse/FLINK-3555 > Project: Flink > Issue Type: Bug > Components: Webfrontend >Affects Versions: 1.0.0 >Reporter: Till Rohrmann >Assignee: Sergey_Sokur >Priority: Minor > Attachments: Chrome.png, Safari.png > > > In Chrome and Safari, the different tabs of the detailed job view are not > properly rendered. The text goes beyond the surrounding box. I would guess > that this is some kind of css issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2941: [FLINK-3555] Web interface does not render job informatio...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/2941 Imo, the best way to achieve the change equivalent to the change to vendor.css file would be to add a new class, say, .panel-body-flowable in index.styl which defines the overflow rule, and add this class to the elements wherever needed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3039: [FLINK-5280] Update TableSource to support nested ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95178608 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/CsvTableSource.scala --- @@ -53,7 +53,9 @@ class CsvTableSource( ignoreFirstLine: Boolean = false, ignoreComments: String = null, lenient: Boolean = false) - extends AbstractBatchStreamTableSource[Row] + extends BatchTableSource[Row] + with StreamTableSource[Row] + with DefinedFieldNames --- End diff -- If we define `returnType` as `new RowTypeInfo(fieldTypes, fieldNames)`, we do not need to implement `DefinedFieldNames`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data
[ https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812080#comment-15812080 ] ASF GitHub Bot commented on FLINK-5280: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95178608 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/CsvTableSource.scala --- @@ -53,7 +53,9 @@ class CsvTableSource( ignoreFirstLine: Boolean = false, ignoreComments: String = null, lenient: Boolean = false) - extends AbstractBatchStreamTableSource[Row] + extends BatchTableSource[Row] + with StreamTableSource[Row] + with DefinedFieldNames --- End diff -- If we define `returnType` as `new RowTypeInfo(fieldTypes, fieldNames)`, we do not need to implement `DefinedFieldNames`. > Extend TableSource to support nested data > - > > Key: FLINK-5280 > URL: https://issues.apache.org/jira/browse/FLINK-5280 > Project: Flink > Issue Type: Improvement > Components: Table API & SQL >Affects Versions: 1.2.0 >Reporter: Fabian Hueske >Assignee: Ivan Mushketyk > > The {{TableSource}} interface does currently only support the definition of > flat rows. > However, there are several storage formats for nested data that should be > supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can > also natively handle nested rows. > The {{TableSource}} interface and the code to register table sources in > Calcite's schema need to be extended to support nested data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
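To illustrate the suggestion above: when the return type is constructed with explicit field names, those names travel with the type information, so a separate `DefinedFieldNames` implementation becomes redundant. The sketch below is a hedged Java rendering of that idea (the actual `CsvTableSource` is written in Scala); the field arrays are placeholders and the constructor used is the one referenced in the comment.
{code}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.types.Row;

// Hedged sketch of the reviewer's suggestion; the field arrays are placeholders.
public final class ReturnTypeSketch {
    static TypeInformation<Row> returnType(TypeInformation<?>[] fieldTypes, String[] fieldNames) {
        // carrying the names inside the RowTypeInfo makes DefinedFieldNames unnecessary
        return new RowTypeInfo(fieldTypes, fieldNames);
    }
}
{code}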
[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system
[ https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812063#comment-15812063 ] ASF GitHub Bot commented on FLINK-5129: --- Github user NicoK commented on the issue: https://github.com/apache/flink/pull/3076 fixed a typo in the unit test that led to the tests passing although there was still something wrong, which is now fixed as well > make the BlobServer use a distributed file system > - > > Key: FLINK-5129 > URL: https://issues.apache.org/jira/browse/FLINK-5129 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > > Currently, the BlobServer uses local storage and, in addition, when the HA > mode is set, a distributed file system, e.g. hdfs. This, however, is only > used by the JobManager and all TaskManager instances request blobs from the > JobManager. By using the distributed file system there as well, we would > lower the load on the JobManager and increase scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3084: [FLINK-5129] make the BlobServer use a distributed...
GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3084 [FLINK-5129] make the BlobServer use a distributed file system Make the BlobCache use the BlobServer's distributed file system in HA mode: previously even in HA mode and if the cache has access to the file system, it would download BLOBs from one central BlobServer. By using the distributed file system beneath we may leverage its scalability and remove a single point of (performance) failure. If the distributed file system is not accessible at the blob caches, the old behaviour is used. @uce can you have a look? (this is an updated and fixed version of https://github.com/apache/flink/pull/3076) You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5129a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3084.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3084 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. 
commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly commit b3bcf944df87f37cccd831e8fb56b95caa620dad Author: Nico Kruber Date: 2017-01-09T13:41:59Z [FLINK-5129] let FileSystemBlobStore#get() remove the target file on failure If the copy fails, an IOException was thrown but the target file remained and was (most likely) not finished. This cleans up the file in that case so that code above, e.g. BlobServer and BlobCache, can rely on a file being complete as long as it exists.
[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system
[ https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812074#comment-15812074 ] ASF GitHub Bot commented on FLINK-5129: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3084 [FLINK-5129] make the BlobServer use a distributed file system Make the BlobCache use the BlobServer's distributed file system in HA mode: previously even in HA mode and if the cache has access to the file system, it would download BLOBs from one central BlobServer. By using the distributed file system beneath we may leverage its scalability and remove a single point of (performance) failure. If the distributed file system is not accessible at the blob caches, the old behaviour is used. @uce can you have a look? (this is an updated and fixed version of https://github.com/apache/flink/pull/3076) You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5129a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3084.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3084 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. 
commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly
[jira] [Created] (FLINK-5430) Bring documentation up to speed for current feature set
Stephan Ewen created FLINK-5430: --- Summary: Bring documentation up to speed for current feature set Key: FLINK-5430 URL: https://issues.apache.org/jira/browse/FLINK-5430 Project: Flink Issue Type: Improvement Components: Documentation Reporter: Stephan Ewen Fix For: 1.2.0, 1.3.0 The documentation is still lacking in many parts that were developed or changed since the 1.1 release. I would like to fix that for {{master}} and {{release-1.2}}. This is a parent issue, let's create sub issues for each respective component and feature that we want to document better. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3076: [FLINK-5129] make the BlobServer use a distributed file s...
Github user NicoK commented on the issue: https://github.com/apache/flink/pull/3076 fixed a typo in the unit test that led to the tests passing although there was still something wrong, which is now fixed as well --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812046#comment-15812046 ] ASF GitHub Bot commented on FLINK-4410: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 +1 to merge > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 +1 to merge --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Closed] (FLINK-5066) add more efficient isEvent check to EventSerializer
[ https://issues.apache.org/jira/browse/FLINK-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi closed FLINK-5066. -- Resolution: Fixed Fix Version/s: 1.3.0 Implemented in 9457465 (master). > add more efficient isEvent check to EventSerializer > --- > > Key: FLINK-5066 > URL: https://issues.apache.org/jira/browse/FLINK-5066 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > Fix For: 1.3.0 > > > -LocalInputChannel#getNextBuffer de-serialises all incoming events on the > lookout for an EndOfPartitionEvent.- > Some buffer code de-serialises all incoming events on the lookout for an > EndOfPartitionEvent > (now applies to PartitionRequestQueue#isEndOfPartitionEvent()). > Instead, if EventSerializer offered a function to check for an event type > only without de-serialising the whole event, we could save some resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
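The idea behind such an isEvent check can be sketched roughly as follows: the serialized form of an event begins with a type tag, so the event class can be recognized by peeking at the leading bytes instead of reconstructing the whole object. The method name and tag layout below are assumptions for illustration only, not Flink's actual EventSerializer API.
{code}
import java.nio.ByteBuffer;

// Illustrative sketch only; the tag layout and method name are assumptions,
// not the actual EventSerializer implementation.
public final class EventPeekSketch {
    static boolean isEvent(ByteBuffer serializedEvent, int expectedTypeTag) {
        // peek at the leading type tag without consuming or deserializing the buffer
        return serializedEvent.remaining() >= Integer.BYTES
                && serializedEvent.getInt(serializedEvent.position()) == expectedTypeTag;
    }
}
{code}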
[jira] [Closed] (FLINK-5059) only serialise events once in RecordWriter#broadcastEvent
[ https://issues.apache.org/jira/browse/FLINK-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi closed FLINK-5059. -- Resolution: Implemented Fix Version/s: 1.3.0 Fixed in 9cff8c9 (master). > only serialise events once in RecordWriter#broadcastEvent > - > > Key: FLINK-5059 > URL: https://issues.apache.org/jira/browse/FLINK-5059 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > Fix For: 1.3.0 > > > Currently, > org.apache.flink.runtime.io.network.api.writer.RecordWriter#broadcastEvent > serialises the event once per target channel. Instead, it could serialise the > event only once and use the serialised form for every channel and thus save > resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
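A rough sketch of the optimization described above: serialize the event a single time and hand every target channel an independent read-only view of the same bytes. The class and method names are illustrative, not the actual RecordWriter code.
{code}
import java.nio.ByteBuffer;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only; names are not Flink's actual RecordWriter API.
public final class BroadcastOnceSketch {
    static void broadcast(byte[] serializedEvent, List<Consumer<ByteBuffer>> channels) {
        // wrap the already-serialized event once ...
        ByteBuffer shared = ByteBuffer.wrap(serializedEvent).asReadOnlyBuffer();
        for (Consumer<ByteBuffer> channel : channels) {
            // ... and give every channel its own view of the same bytes
            // instead of re-serializing the event per channel
            channel.accept(shared.duplicate());
        }
    }
}
{code}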
[jira] [Commented] (FLINK-5059) only serialise events once in RecordWriter#broadcastEvent
[ https://issues.apache.org/jira/browse/FLINK-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812039#comment-15812039 ] ASF GitHub Bot commented on FLINK-5059: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2805 > only serialise events once in RecordWriter#broadcastEvent > - > > Key: FLINK-5059 > URL: https://issues.apache.org/jira/browse/FLINK-5059 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > Fix For: 1.3.0 > > > Currently, > org.apache.flink.runtime.io.network.api.writer.RecordWriter#broadcastEvent > serialises the event once per target channel. Instead, it could serialise the > event only once and use the serialised form for every channel and thus save > resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5066) add more efficient isEvent check to EventSerializer
[ https://issues.apache.org/jira/browse/FLINK-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812040#comment-15812040 ] ASF GitHub Bot commented on FLINK-5066: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2806 > add more efficient isEvent check to EventSerializer > --- > > Key: FLINK-5066 > URL: https://issues.apache.org/jira/browse/FLINK-5066 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > > -LocalInputChannel#getNextBuffer de-serialises all incoming events on the > lookout for an EndOfPartitionEvent.- > Some buffer code de-serialises all incoming events on the lookout for an > EndOfPartitionEvent > (now applies to PartitionRequestQueue#isEndOfPartitionEvent()). > Instead, if EventSerializer offered a function to check for an event type > only without de-serialising the whole event, we could save some resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #2829: [hotfix] prevent RecordWriter#flush() to clear the...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2829 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2806: [FLINK-5066] add more efficient isEvent check to E...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2806 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2805: [FLINK-5059] only serialise events once in RecordW...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2805 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3083: [FLINK-2662] [optimizer] Fix translation of broadc...
GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/3083 [FLINK-2662] [optimizer] Fix translation of broadcasted unions. Fix optimizer to support union with > 2 broadcasted inputs. 1. `BinaryUnionDescriptor` is changed to compute fully replicated global property if both inputs are fully replicated. 2. Enumeration of plans for binary union is changed to enforce local forward connection if input is fully replicated. 3. Add a test to check that union with three inputs can be on broadcasted side of join (fully replicated property is requested and pushed to all inputs of the union). You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink unionBug3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3083.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3083 commit a0019d386e09f727874e54b7c8ce2ca94b26f2e5 Author: Fabian Hueske Date: 2017-01-05T23:00:30Z [FLINK-2662] [optimizer] Fix translation of broadcasted unions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2662) CompilerException: "Bug: Plan generation for Unions picked a ship strategy between binary plan operators."
[ https://issues.apache.org/jira/browse/FLINK-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812018#comment-15812018 ] ASF GitHub Bot commented on FLINK-2662: --- GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/3083 [FLINK-2662] [optimizer] Fix translation of broadcasted unions. Fix optimizer to support union with > 2 broadcasted inputs. 1. `BinaryUnionDescriptor` is changed to compute fully replicated global property if both inputs are fully replicated. 2. Enumeration of plans for binary union is changed to enforce local forward connection if input is fully replicated. 3. Add a test to check that union with three inputs can be on broadcasted side of join (fully replicated property is requested and pushed to all inputs of the union). You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink unionBug3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3083.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3083 commit a0019d386e09f727874e54b7c8ce2ca94b26f2e5 Author: Fabian Hueske Date: 2017-01-05T23:00:30Z [FLINK-2662] [optimizer] Fix translation of broadcasted unions. > CompilerException: "Bug: Plan generation for Unions picked a ship strategy > between binary plan operators." > -- > > Key: FLINK-2662 > URL: https://issues.apache.org/jira/browse/FLINK-2662 > Project: Flink > Issue Type: Bug > Components: Optimizer >Affects Versions: 0.9.1, 0.10.0 >Reporter: Gabor Gevay >Assignee: Fabian Hueske >Priority: Critical > Fix For: 1.0.0, 1.2.0, 1.3.0, 1.1.5 > > Attachments: Bug.java, FlinkBug.scala > > > I have a Flink program which throws the exception in the jira title. Full > text: > Exception in thread "main" org.apache.flink.optimizer.CompilerException: Bug: > Plan generation for Unions picked a ship strategy between binary plan > operators. 
> at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.collect(BinaryUnionReplacer.java:113) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:72) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:41) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:170) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163) > at > org.apache.flink.optimizer.plan.WorksetIterationPlanNode.acceptForStepFunction(WorksetIterationPlanNode.java:194) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:49) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:41) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:162) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:127) > at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:520) > at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:402) > at > org.apache.flink.client.LocalExecutor.getOptimizerPlanAsJSON(LocalExecutor.java:202) > at > org.apache.flink.api.java.LocalEnvironment.getExecutionPlan(LocalEnvironment.java:63) > at malom.Solver.main(Solver.java:66) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) > The execution plan: > http://compalg.inf.elte.hu/~ggevay/flink/plan_3_4_0_0_without_verif.txt > (I obtained this by commenting out the line that throws the exception) > The code is here: > https://github.com/ggevay/flink/tree/plan-generation-bug > The class to run is "Solver". It needs a command line argument, which
[GitHub] flink issue #2941: [FLINK-3555] Web interface does not render job informatio...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2941 Sorry, I have to take a step back here: The changes you made were directly to the resulting CSS files. These files are generated by the web build scripts, so the changes have to be made to the original files. Otherwise they will be overwritten the next time somebody re-builds the web UI. See here: https://github.com/apache/flink/blob/master/flink-runtime-web/README.md The change to `flink-runtime-web/web-dashboard/web/css/index.css` is easy to put into `index.styl` instead. However, the change in `flink-runtime-web/web-dashboard/web/css/vendor.css` overwrites a style that is pulled in as a dependency. Not sure how to handle this correctly. @uce @sachingoel0101 or @iampeter can probably help here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3555) Web interface does not render job information properly
[ https://issues.apache.org/jira/browse/FLINK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811940#comment-15811940 ] ASF GitHub Bot commented on FLINK-3555: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2941 Sorry, I have to take a step back here: The changes you made were directly to the resulting CSS files. These files are generated by the web build scripts, so the changes have to be made to the original files. Otherwise they will be overwritten the next time somebody re-builds the web UI. See here: https://github.com/apache/flink/blob/master/flink-runtime-web/README.md The change to `flink-runtime-web/web-dashboard/web/css/index.css` is easy to put into `index.styl` instead. However, the change in `flink-runtime-web/web-dashboard/web/css/vendor.css` overwrites a style that is pulled in as a dependency. Not sure how to handle this correctly. @uce @sachingoel0101 or @iampeter can probably help here. > Web interface does not render job information properly > -- > > Key: FLINK-3555 > URL: https://issues.apache.org/jira/browse/FLINK-3555 > Project: Flink > Issue Type: Bug > Components: Webfrontend >Affects Versions: 1.0.0 >Reporter: Till Rohrmann >Assignee: Sergey_Sokur >Priority: Minor > Attachments: Chrome.png, Safari.png > > > In Chrome and Safari, the different tabs of the detailed job view are not > properly rendered. The text goes beyond the surrounding box. I would guess > that this is some kind of css issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811918#comment-15811918 ] ASF GitHub Bot commented on FLINK-4410: --- Github user uce commented on the issue: https://github.com/apache/flink/pull/3042 Thanks for checking it out Robert. Would love to merge it for 1.2 as well. I fixed the rounding issue. > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user uce commented on the issue: https://github.com/apache/flink/pull/3042 Thanks for checking it out Robert. Would love to merge it for 1.2 as well. I fixed the rounding issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5420) Make the CEP library rescalable.
[ https://issues.apache.org/jira/browse/FLINK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811887#comment-15811887 ] Till Rohrmann commented on FLINK-5420: -- Adding a description containing some more details about the problem could be helpful. > Make the CEP library rescalable. > > > Key: FLINK-5420 > URL: https://issues.apache.org/jira/browse/FLINK-5420 > Project: Flink > Issue Type: Bug > Components: CEP >Affects Versions: 1.2.0 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > Fix For: 1.2.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-5423) Implement Stochastic Outlier Selection
[ https://issues.apache.org/jira/browse/FLINK-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-5423: - Assignee: Fokko Driesprong > Implement Stochastic Outlier Selection > -- > > Key: FLINK-5423 > URL: https://issues.apache.org/jira/browse/FLINK-5423 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong > > I've implemented the Stochastic Outlier Selection (SOS) algorithm by Jeroen > Janssens. > http://jeroenjanssens.com/2013/11/24/stochastic-outlier-selection.html > Integrated as much as possible with the components from the machine learning > library. > The algorithm itself has been compared to four other algorithms and it > shows that SOS has a higher performance on most of these real-world datasets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5426) Clean up the Flink Machine Learning library
[ https://issues.apache.org/jira/browse/FLINK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811864#comment-15811864 ] ASF GitHub Bot commented on FLINK-5426: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3081 Thanks for your contribution @Fokko. I'll take a look at your PR in the next days. > Clean up the Flink Machine Learning library > --- > > Key: FLINK-5426 > URL: https://issues.apache.org/jira/browse/FLINK-5426 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong > > Hi Guys, > I would like to clean up the Machine Learning library. A lot of the code in > the ML Library does not conform to the original contribution guide. For > example: > Duplicate tests, different names, but exactly the same testcase: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L148 > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L164 > Lots of multi-line test cases: > https://github.com/Fokko/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala > Mis-use of constants: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseMatrix.scala#L58 > Please allow me to clean this up, and I'm looking forward to contributing more > code, especially to the ML part. I have been a contributor to Apache Spark > and am happy to extend the codebase with new distributed algorithms and make > the codebase more mature. > Cheers, Fokko -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5423) Implement Stochastic Outlier Selection
[ https://issues.apache.org/jira/browse/FLINK-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811868#comment-15811868 ] ASF GitHub Bot commented on FLINK-5423: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3077 Thanks for your contribution @Fokko. I'll take a look at this PR in the next days :-) > Implement Stochastic Outlier Selection > -- > > Key: FLINK-5423 > URL: https://issues.apache.org/jira/browse/FLINK-5423 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong > > I've implemented the Stochastic Outlier Selection (SOS) algorithm by Jeroen > Janssens. > http://jeroenjanssens.com/2013/11/24/stochastic-outlier-selection.html > Integrated as much as possible with the components from the machine learning > library. > The algorithm itself has been compared to four other algorithms and it > shows that SOS has a higher performance on most of these real-world datasets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3077: [FLINK-5423] Implement Stochastic Outlier Selection
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3077 Thanks for your contribution @Fokko. I'll take a look at this PR in the next days :-) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3081: [FLINK-5426] Clean up the Flink Machine Learning library
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3081 Thanks for your contribution @Fokko. I'll take a look at your PR in the next days. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (FLINK-5426) Clean up the Flink Machine Learning library
[ https://issues.apache.org/jira/browse/FLINK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-5426: - Assignee: Fokko Driesprong > Clean up the Flink Machine Learning library > --- > > Key: FLINK-5426 > URL: https://issues.apache.org/jira/browse/FLINK-5426 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong > > Hi Guys, > I would like to clean up the Machine Learning library. A lot of the code in > the ML Library does not conform to the original contribution guide. For > example: > Duplicate tests, different names, but exactly the same testcase: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L148 > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L164 > Lots of multi-line test cases: > https://github.com/Fokko/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala > Mis-use of constants: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseMatrix.scala#L58 > Please allow me to clean this up, and I'm looking forward to contributing more > code, especially to the ML part. I have been a contributor to Apache Spark > and am happy to extend the codebase with new distributed algorithms and make > the codebase more mature. > Cheers, Fokko -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5429) Code generate types between operators in Table API
Timo Walther created FLINK-5429: --- Summary: Code generate types between operators in Table API Key: FLINK-5429 URL: https://issues.apache.org/jira/browse/FLINK-5429 Project: Flink Issue Type: New Feature Components: Table API & SQL Reporter: Timo Walther Currently, the Table API uses the generic Row type for shipping records between operators in underlying DataSet and DataStream API. For efficiency reasons we should code generate those records. The final design is up for discussion but here are some ideas: A row like {{(a: INT NULL, b: INT NOT NULL, c: STRING)}} could look like {code} final class GeneratedRow$123 { public boolean a_isNull; public int a; public int b; public String c; } {code} Types could be generated using Janino in the pre-flight phase. The generated types should use primitive types wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
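As a rough illustration of the Janino route mentioned in the issue, the snippet below compiles the example row class in memory, as could happen in a pre-flight step. The class body mirrors the example above; the compile step is a generic Janino usage sketch, not Flink's actual code generator.
{code}
import org.codehaus.janino.SimpleCompiler;

// Generic Janino usage sketch; not Flink's actual code-generation pipeline.
public final class GeneratedRowCompileSketch {
    static Class<?> compileGeneratedRow() throws Exception {
        String source =
            "public final class GeneratedRow$123 {\n" +
            "  public boolean a_isNull;\n" +
            "  public int a;\n" +
            "  public int b;\n" +
            "  public String c;\n" +
            "}\n";
        SimpleCompiler compiler = new SimpleCompiler();
        compiler.cook(source); // compile the generated source in memory
        return compiler.getClassLoader().loadClass("GeneratedRow$123");
    }
}
{code}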
[jira] [Commented] (FLINK-4391) Provide support for asynchronous operations over streams
[ https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811848#comment-15811848 ] Till Rohrmann commented on FLINK-4391: -- The plan is to document this feature: https://issues.apache.org/jira/browse/FLINK-5371. > Provide support for asynchronous operations over streams > > > Key: FLINK-4391 > URL: https://issues.apache.org/jira/browse/FLINK-4391 > Project: Flink > Issue Type: New Feature > Components: DataStream API >Reporter: Jamie Grier >Assignee: david.wang > Fix For: 1.2.0 > > > Many Flink users need to do asynchronous processing driven by data from a > DataStream. The classic example would be joining against an external > database in order to enrich a stream with extra information. > It would be nice to add general support for this type of operation in the > Flink API. Ideally this could simply take the form of a new operator that > manages async operations, keeps so many of them in flight, and then emits > results to downstream operators as the async operations complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811826#comment-15811826 ] ASF GitHub Bot commented on FLINK-4410: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 Very nice change! I would love to merge it to 1.2 as well, it's so helpful! One very minor thing: I would suggest rounding the percentages shown for the completion. I had an instance where it was showing a progress of 51.1%. Here's a screenshot from my testing job: ![image](https://cloud.githubusercontent.com/assets/89049/21768632/e7800fcc-d67a-11e6-8961-6063f4faa138.png) > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 Very nice change! I would love to merge it to 1.2 as well, it's so helpful! One very minor thing: I would suggest rounding the percentages shown for the completion. I had an instance where it was showing a progress of 51.1%. Here's a screenshot from my testing job: ![image](https://cloud.githubusercontent.com/assets/89049/21768632/e7800fcc-d67a-11e6-8961-6063f4faa138.png) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811814#comment-15811814 ] ASF GitHub Bot commented on FLINK-5360: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 I'm running a last check and then I'm merging. Thanks for reviewing @xhumanoid! And thanks for fixing it @mushketyk. 👍 > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 I'm running a last check and then I'm merging. Thanks for reviewing @xhumanoid! And thanks for fixing it @mushketyk. 👍 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811806#comment-15811806 ] ASF GitHub Bot commented on FLINK-4410: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3042 I think a triggering delay would be very nice and helpful. We can do this as a next step, separate from this pull request. > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3042 I think a triggering delay would be very nice and helpful. We can do this as a next step, separate from this pull request. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---