[jira] [Assigned] (FLINK-5431) time format for akka status
[ https://issues.apache.org/jira/browse/FLINK-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Solovev reassigned FLINK-5431: Assignee: Anton Solovev > time format for akka status > --- > > Key: FLINK-5431 > URL: https://issues.apache.org/jira/browse/FLINK-5431 > Project: Flink > Issue Type: Improvement >Reporter: Alexey Diomin >Assignee: Anton Solovev >Priority: Minor > > In ExecutionGraphMessages we have the code > {code} > private val DATE_FORMATTER: SimpleDateFormat = new > SimpleDateFormat("MM/dd/ HH:mm:ss") > {code} > But sometimes it causes confusion when the main logger is configured with > "dd/MM/". > We should make this format configurable, or maybe keep only "HH:mm:ss", to > prevent misreading the output date-time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
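For illustration, a minimal sketch of the configurable-format idea; the config key `jobmanager.status-time-format` is invented here for the example (not an actual Flink option), and the time-only default avoids the day/month ambiguity:

```scala
import java.text.SimpleDateFormat
import org.apache.flink.configuration.Configuration

// Sketch only: "jobmanager.status-time-format" is a hypothetical key used for illustration.
def createStatusTimeFormatter(config: Configuration): SimpleDateFormat = {
  // Default to a time-only pattern so the output cannot be misread as dd/MM vs. MM/dd.
  val pattern = config.getString("jobmanager.status-time-format", "HH:mm:ss")
  new SimpleDateFormat(pattern)
}
```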
[jira] [Commented] (FLINK-5357) WordCountTable fails
[ https://issues.apache.org/jira/browse/FLINK-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814112#comment-15814112 ] ASF GitHub Bot commented on FLINK-5357: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95308316 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- Good point! The hack existed because Calcite's RelBuilder does not create a project when fields are only renamed. The `project(Iterable nodes, Iterable fieldNames, boolean force)` method introduced by CALCITE-1342 fixes this issue. I have tried to replace the hack with the following code, and all tests passed!
```scala
override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = {
  child.construct(relBuilder)
  relBuilder.project(
    projectList.map {
      case Alias(c, _, _) => c.toRexNode(relBuilder)
      case n@_ => n.toRexNode(relBuilder)
    }.asJava,
    projectList.map(_.name).asJava,
    true)
}
```
> WordCountTable fails > > > Key: FLINK-5357 > URL: https://issues.apache.org/jira/browse/FLINK-5357 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther > > The execution of org.apache.flink.table.examples.java.WordCountTable fails: > {code} > Exception in thread "main" org.apache.flink.table.api.TableException: POJO > does not define field name: TMP_0 > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:85) > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:81) > at scala.collection.immutable.List.foreach(List.scala:318) > at > org.apache.flink.table.typeutils.TypeConverter$.determineReturnType(TypeConverter.scala:81) > at > org.apache.flink.table.plan.nodes.dataset.DataSetCalc.translateToPlan(DataSetCalc.scala:110) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:305) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:289) > at > org.apache.flink.table.api.java.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:146) > at > org.apache.flink.table.examples.java.WordCountTable.main(WordCountTable.java:56) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3063: [FLINK-5357] [table] Fix dropped projections
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95308316 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- Good point! The hack existed because Calcite's RelBuilder does not create a project when fields are only renamed. The `project(Iterable nodes, Iterable fieldNames, boolean force)` method introduced by CALCITE-1342 fixes this issue. I have tried to replace the hack with the following code, and all tests passed!
```scala
override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = {
  child.construct(relBuilder)
  relBuilder.project(
    projectList.map {
      case Alias(c, _, _) => c.toRexNode(relBuilder)
      case n@_ => n.toRexNode(relBuilder)
    }.asJava,
    projectList.map(_.name).asJava,
    true)
}
```
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4920) Add a Scala Function Gauge
[ https://issues.apache.org/jira/browse/FLINK-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814070#comment-15814070 ] ASF GitHub Bot commented on FLINK-4920: --- Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3080#discussion_r95306713 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/metrics/ScalaGauge.scala --- @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.metrics + +import org.apache.flink.metrics.Gauge + +class ScalaGauge[T](value : T) extends Gauge[T] { --- End diff -- Directly assigning the value in the constructor is one possible approach to minimizing the code needed to register a Scala gauge. However, it does not work when the value has to be computed by some logic at read time. One possible solution is to use Scala's Function0 to pass in a "value generator" function instead of the value itself. Something like this: `class ScalaGauge[T](func: () => T) extends Gauge[T]`. Then one can use the class like this: `val myGauge = new ScalaGauge(() => 1)` or, for more complex situations, `val myGauge = new ScalaGauge(() => { /* code to calculate the value */ })`. Here is the Function0 reference, which describes the usage more clearly: http://www.scala-lang.org/api/2.6.0/scala/Function0.html > Add a Scala Function Gauge > -- > > Key: FLINK-4920 > URL: https://issues.apache.org/jira/browse/FLINK-4920 > Project: Flink > Issue Type: Improvement > Components: Metrics, Scala API >Reporter: Stephan Ewen >Assignee: Pattarawat Chormai > Labels: easyfix, starter > > A useful metrics utility for the Scala API would be to add a Gauge that > obtains its value by calling a Scala Function0. > That way, one can add Gauges in Scala programs using Scala lambda notation or > function references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
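For illustration, a minimal sketch of the Function0-based gauge proposed in this thread; the class eventually merged into Flink may differ in name and constructor shape:

```scala
import org.apache.flink.metrics.Gauge

// Sketch of the suggested design: the gauge wraps a value-generator function
// and re-evaluates it each time the metric is reported.
class ScalaGauge[T](func: () => T) extends Gauge[T] {
  override def getValue: T = func()
}

// Simple usage: expose the current wall-clock time as a gauge value.
val currentTime = new ScalaGauge[Long](() => System.currentTimeMillis())
```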
[GitHub] flink pull request #3080: [FLINK-4920] Add a Scala Function Gauge
Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3080#discussion_r95306713 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/metrics/ScalaGauge.scala --- @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.metrics + +import org.apache.flink.metrics.Gauge + +class ScalaGauge[T](value : T) extends Gauge[T] { --- End diff -- Directly assigning the value in the constructor is one possible approach to minimizing the code needed to register a Scala gauge. However, it does not work when the value has to be computed by some logic at read time. One possible solution is to use Scala's Function0 to pass in a "value generator" function instead of the value itself. Something like this: `class ScalaGauge[T](func: () => T) extends Gauge[T]`. Then one can use the class like this: `val myGauge = new ScalaGauge(() => 1)` or, for more complex situations, `val myGauge = new ScalaGauge(() => { /* code to calculate the value */ })`. Here is the Function0 reference, which describes the usage more clearly: http://www.scala-lang.org/api/2.6.0/scala/Function0.html --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
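A possible usage sketch, registering such a gauge from a Scala `RichMapFunction`; it assumes the hypothetical `ScalaGauge(func: () => T)` shown above and Flink's `MetricGroup.gauge(name, gauge)` registration call:

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

// Counts elements and exposes the count through a Function0-based gauge.
class CountingMapper extends RichMapFunction[String, String] {
  private var seen: Long = 0L

  override def open(parameters: Configuration): Unit = {
    // Explicit type parameters keep Scala's inference happy for the generic Java method.
    getRuntimeContext
      .getMetricGroup
      .gauge[Long, ScalaGauge[Long]]("elementsSeen", new ScalaGauge[Long](() => seen))
  }

  override def map(value: String): String = {
    seen += 1
    value
  }
}
```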
[jira] [Commented] (FLINK-5122) Elasticsearch Sink loses documents when cluster has high load
[ https://issues.apache.org/jira/browse/FLINK-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814010#comment-15814010 ] ASF GitHub Bot commented on FLINK-5122: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2861 Hi @static-max, thank you for working on this; it'll be an important fix for proper at-least-once support in the ES connector. Recently, the community agreed to first restructure the multiple ES connector versions, so that important fixes like this one can be done once and for all across all versions (1.x, 2.x, and 5.x, which is currently pending). Here's the discussion: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ElasticSearch-in-Flink-Strategy-td15049.html. Could we wait just a little bit on this PR, and once the ES connector refactoring is complete, come back and rebase this PR on it? You can follow the progress here: #2767. I'm trying to come up with the restructuring PR within the next day. Sorry for the extra wait needed on this, but it'll be good in the long run; I hope you can understand :) > Elasticsearch Sink loses documents when cluster has high load > - > > Key: FLINK-5122 > URL: https://issues.apache.org/jira/browse/FLINK-5122 > Project: Flink > Issue Type: Bug > Components: Streaming Connectors >Affects Versions: 1.2.0 >Reporter: static-max >Assignee: static-max > > My cluster had high load and documents did not get indexed. This violates the "at > least once" semantics of the ES connector. > I put pressure on my cluster to test Flink, causing new indices to be > created and balanced. On such errors the bulk should be retried instead > of being discarded. > Primary shard not active because ES decided to rebalance the index: > 2016-11-15 15:35:16,123 ERROR > org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink - > Failed to index document in Elasticsearch: > UnavailableShardsException[[index-name][3] primary shard is not active > Timeout: [1m], request: [BulkShardRequest to [index-name] containing [20] > requests]] > Bulk queue on node full (I set the queue to a low value to reproduce the error): > 22:37:57,702 ERROR > org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink - > Failed to index document in Elasticsearch: > RemoteTransportException[[node1][192.168.1.240:9300][indices:data/write/bulk[s][p]]]; > nested: EsRejectedExecutionException[rejected execution of > org.elasticsearch.transport.TransportService$4@727e677c on > EsThreadPoolExecutor[bulk, queue capacity = 1, > org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@51322d37[Running, > pool size = 2, active threads = 2, queued tasks = 1, completed tasks = > 2939]]]; > I can try to propose a PR for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
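The at-least-once gap described in the issue comes from silently dropping failed bulk items. A rough sketch of the retry idea, assuming the Elasticsearch 2.x client used by this connector (where `BulkResponse.getItems` is aligned by position with `BulkRequest.requests()`); `reAdd` stands in for whatever mechanism re-queues an action (for example the sink's `BulkProcessor.add`), and this is not the actual connector code:

```scala
import org.elasticsearch.action.bulk.{BulkRequest, BulkResponse}

// Re-queue every failed item of a bulk response instead of discarding it.
def retryFailedActions(request: BulkRequest, response: BulkResponse, reAdd: AnyRef => Unit): Unit = {
  if (response.hasFailures) {
    val items = response.getItems
    items.indices.foreach { i =>
      if (items(i).isFailed) {
        // The i-th response item corresponds to the i-th original action.
        reAdd(request.requests().get(i))
      }
    }
  }
}
```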
[GitHub] flink issue #2861: [FLINK-5122] Index requests will be retried if the error ...
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2861 Hi @static-max, thank you for working on this; it'll be an important fix for proper at-least-once support in the ES connector. Recently, the community agreed to first restructure the multiple ES connector versions, so that important fixes like this one can be done once and for all across all versions (1.x, 2.x, and 5.x, which is currently pending). Here's the discussion: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ElasticSearch-in-Flink-Strategy-td15049.html. Could we wait just a little bit on this PR, and once the ES connector refactoring is complete, come back and rebase this PR on it? You can follow the progress here: #2767. I'm trying to come up with the restructuring PR within the next day. Sorry for the extra wait needed on this, but it'll be good in the long run; I hope you can understand :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5357) WordCountTable fails
[ https://issues.apache.org/jira/browse/FLINK-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814005#comment-15814005 ] ASF GitHub Bot commented on FLINK-5357: --- Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95304042 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- I noticed that Calcite's RelBuilder has this method: `project(Iterable nodes, Iterable fieldNames, boolean force)`. Would it be more consistent to use this method with force set to `true`, instead of creating the projection ourselves? But either is fine with me. > WordCountTable fails > > > Key: FLINK-5357 > URL: https://issues.apache.org/jira/browse/FLINK-5357 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther > > The execution of org.apache.flink.table.examples.java.WordCountTable fails: > {code} > Exception in thread "main" org.apache.flink.table.api.TableException: POJO > does not define field name: TMP_0 > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:85) > at > org.apache.flink.table.typeutils.TypeConverter$$anonfun$2.apply(TypeConverter.scala:81) > at scala.collection.immutable.List.foreach(List.scala:318) > at > org.apache.flink.table.typeutils.TypeConverter$.determineReturnType(TypeConverter.scala:81) > at > org.apache.flink.table.plan.nodes.dataset.DataSetCalc.translateToPlan(DataSetCalc.scala:110) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:305) > at > org.apache.flink.table.api.BatchTableEnvironment.translate(BatchTableEnvironment.scala:289) > at > org.apache.flink.table.api.java.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:146) > at > org.apache.flink.table.examples.java.WordCountTable.main(WordCountTable.java:56) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3063: [FLINK-5357] [table] Fix dropped projections
Github user KurtYoung commented on a diff in the pull request: https://github.com/apache/flink/pull/3063#discussion_r95304042 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/logical/operators.scala --- @@ -92,20 +92,19 @@ case class Project(projectList: Seq[NamedExpression], child: LogicalNode) extend } override protected[logical] def construct(relBuilder: RelBuilder): RelBuilder = { -val allAlias = projectList.forall(_.isInstanceOf[Alias]) child.construct(relBuilder) -if (allAlias) { - // Calcite's RelBuilder does not translate identity projects even if they rename fields. - // Add a projection ourselves (will be automatically removed by translation rules). - val project = LogicalProject.create(relBuilder.peek(), +// Calcite's RelBuilder does not translate identity projects even if they rename fields. +// We add a projection ourselves (will be automatically removed by translation rules). +val project = LogicalProject.create( --- End diff -- I noticed that Calcite's RelBuilder has this method: `project(Iterable nodes, Iterable fieldNames, boolean force)`. Would it be more consistent to use this method with force set to `true`, instead of creating the projection ourselves? But either is fine with me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data
[ https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813852#comment-15813852 ] ASF GitHub Bot commented on FLINK-5280: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95292990 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/TableSource.scala --- @@ -19,22 +19,23 @@ package org.apache.flink.table.sources import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.table.api.TableEnvironment -/** Defines an external table by providing schema information, i.e., field names and types. +/** Defines an external table by providing schema information and used to produce a + * [[org.apache.flink.api.scala.DataSet]] or [[org.apache.flink.streaming.api.scala.DataStream]]. + * Schema information consists of a data type, field names, and corresponding indices of + * these names in the data type. + * + * To define a TableSource one need to implement [[TableSource#getReturnType]]. In this case + * field names and field indices are derived from the returned type. + * + * In case if custom field names are required one need to additionally implement --- End diff -- In case if -> In case of > Extend TableSource to support nested data > - > > Key: FLINK-5280 > URL: https://issues.apache.org/jira/browse/FLINK-5280 > Project: Flink > Issue Type: Improvement > Components: Table API & SQL >Affects Versions: 1.2.0 >Reporter: Fabian Hueske >Assignee: Ivan Mushketyk > > The {{TableSource}} interface does currently only support the definition of > flat rows. > However, there are several storage formats for nested data that should be > supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can > also natively handle nested rows. > The {{TableSource}} interface and the code to register table sources in > Calcite's schema need to be extended to support nested data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3039: [FLINK-5280] Update TableSource to support nested ...
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95292990 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/TableSource.scala --- @@ -19,22 +19,23 @@ package org.apache.flink.table.sources import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.table.api.TableEnvironment -/** Defines an external table by providing schema information, i.e., field names and types. +/** Defines an external table by providing schema information and used to produce a + * [[org.apache.flink.api.scala.DataSet]] or [[org.apache.flink.streaming.api.scala.DataStream]]. + * Schema information consists of a data type, field names, and corresponding indices of + * these names in the data type. + * + * To define a TableSource one need to implement [[TableSource#getReturnType]]. In this case + * field names and field indices are derived from the returned type. + * + * In case if custom field names are required one need to additionally implement --- End diff -- In case if -> In case of --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4692) Add tumbling group-windows for batch tables
[ https://issues.apache.org/jira/browse/FLINK-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813765#comment-15813765 ] ASF GitHub Bot commented on FLINK-4692: --- Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2938 It's weird. I can't reproduce the wrong results in my local environment. What is the `data` in your test, @twalthr? I'm using the following data in both the batch and the stream test, but the result is the same.
```scala
val data = List(
  (1L, 1, "Hi"),
  (2L, 2, "Hello"),
  (4L, 2, "Hello"),
  (8L, 3, "Hello world"),
  (6L, 3, "Hello world"))
```
cc @fhueske, could you help test this in your environment? > Add tumbling group-windows for batch tables > --- > > Key: FLINK-4692 > URL: https://issues.apache.org/jira/browse/FLINK-4692 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Jark Wu > > Add Tumble group-windows for batch tables as described in > [FLIP-11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations]. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2938: [FLINK-4692] [tableApi] Add tumbling group-windows for ba...
Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2938 It's weird. I can't reproduce the wrong results in my local environment. What is the `data` in your test, @twalthr? I'm using the following data in both the batch and the stream test, but the result is the same.
```scala
val data = List(
  (1L, 1, "Hi"),
  (2L, 2, "Hello"),
  (4L, 2, "Hello"),
  (8L, 3, "Hello world"),
  (6L, 3, "Hello world"))
```
cc @fhueske, could you help test this in your environment? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5364) Rework JAAS configuration to support user-supplied entries
[ https://issues.apache.org/jira/browse/FLINK-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813707#comment-15813707 ] ASF GitHub Bot commented on FLINK-5364: --- Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3057 @StephanEwen updated based on feedback, thanks again. > Rework JAAS configuration to support user-supplied entries > -- > > Key: FLINK-5364 > URL: https://issues.apache.org/jira/browse/FLINK-5364 > Project: Flink > Issue Type: Bug > Components: Cluster Management >Reporter: Eron Wright >Assignee: Eron Wright >Priority: Critical > Labels: kerberos, security > > Recent issues (see linked) have brought to light a critical deficiency in the > handling of JAAS configuration. > 1. the MapR distribution relies on an explicit JAAS conf, rather than > in-memory conf used by stock Hadoop. > 2. the ZK/Kafka/Hadoop security configuration is supposed to be independent > (one can enable each element separately) but isn't. > Perhaps we should rework the JAAS conf code to merge any user-supplied > configuration with our defaults, rather than using an all-or-nothing > approach. > We should also address some recent regressions: > 1. The HadoopSecurityContext should be installed regardless of auth mode, to > login with UserGroupInformation, which: > - handles the HADOOP_USER_NAME variable. > - installs an OS-specific user principal (from UnixLoginModule etc.) > unrelated to Kerberos. > - picks up the HDFS/HBASE delegation tokens. > 2. Fix the use of alternative authentication methods - delegation tokens and > Kerberos ticket cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3057: [FLINK-5364] Rework JAAS configuration to support user-su...
Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3057 @StephanEwen updated based on feedback, thanks again. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4988) Elasticsearch 5.x support
[ https://issues.apache.org/jira/browse/FLINK-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813546#comment-15813546 ] ASF GitHub Bot commented on FLINK-4988: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2767#discussion_r95287899 --- Diff: flink-connectors/flink-connector-elasticsearch5/src/main/java/org/apache/flink/streaming/connectors/elasticsearch5/ElasticsearchSink.java --- @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.streaming.connectors.elasticsearch5; + +import org.apache.flink.api.java.utils.ParameterTool; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.streaming.api.functions.sink.RichSinkFunction; +import org.apache.flink.util.Preconditions; +import org.elasticsearch.action.bulk.BulkItemResponse; +import org.elasticsearch.action.bulk.BulkProcessor; +import org.elasticsearch.action.bulk.BulkRequest; +import org.elasticsearch.action.bulk.BulkResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.network.NetworkModule; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.transport.TransportAddress; +import org.elasticsearch.common.unit.ByteSizeUnit; +import org.elasticsearch.common.unit.ByteSizeValue; +import org.elasticsearch.common.unit.TimeValue; +import org.elasticsearch.transport.Netty3Plugin; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetSocketAddress; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicReference; + +/** + * Sink that emits its input elements in bulk to an Elasticsearch cluster. + * + * + * The first {@link Map} passed to the constructor is forwarded to Elasticsearch when creating + * {@link TransportClient}. The config keys can be found in the Elasticsearch + * documentation. An important setting is {@code cluster.name}, this should be set to the name + * of the cluster that the sink should emit to. + * + * Attention: When using the {@code TransportClient} the sink will fail if no cluster + * can be connected to. + * + * The second {@link Map} is used to configure a {@link BulkProcessor} to send {@link IndexRequest IndexRequests}. + * This will buffer elements before sending a request to the cluster. 
The behaviour of the + * {@code BulkProcessor} can be configured using these config keys: + * + * {@code bulk.flush.max.actions}: Maximum amount of elements to buffer + * {@code bulk.flush.max.size.mb}: Maximum amount of data (in megabytes) to buffer + * {@code bulk.flush.interval.ms}: Interval at which to flush data regardless of the other two + * settings in milliseconds + * + * + * + * You also have to provide an {@link RequestIndexer}. This is used to create an + * {@link IndexRequest} from an element that needs to be added to Elasticsearch. See + * {@link RequestIndexer} for an example. + * + * @param Type of the elements emitted by this sink + */ +public class ElasticsearchSink extends RichSinkFunction { + + public static final String CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS = "bulk.flush.max.actions"; + public static final String CONFIG_KEY_BULK_FLUSH_MAX_SIZE_MB = "bulk.flush.max.size.mb"; + public static final String CONFIG_KEY_BULK_FLUSH_INTERVAL_MS = "bulk.flush.interval.ms"; + + private static final long serialVersionUID = 1L; + + private static f
[GitHub] flink pull request #2767: [FLINK-4988] Elasticsearch 5.x support
Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2767#discussion_r95287899 --- Diff: flink-connectors/flink-connector-elasticsearch5/src/main/java/org/apache/flink/streaming/connectors/elasticsearch5/ElasticsearchSink.java --- @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.streaming.connectors.elasticsearch5; + +import org.apache.flink.api.java.utils.ParameterTool; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.streaming.api.functions.sink.RichSinkFunction; +import org.apache.flink.util.Preconditions; +import org.elasticsearch.action.bulk.BulkItemResponse; +import org.elasticsearch.action.bulk.BulkProcessor; +import org.elasticsearch.action.bulk.BulkRequest; +import org.elasticsearch.action.bulk.BulkResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.network.NetworkModule; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.transport.TransportAddress; +import org.elasticsearch.common.unit.ByteSizeUnit; +import org.elasticsearch.common.unit.ByteSizeValue; +import org.elasticsearch.common.unit.TimeValue; +import org.elasticsearch.transport.Netty3Plugin; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetSocketAddress; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicReference; + +/** + * Sink that emits its input elements in bulk to an Elasticsearch cluster. + * + * + * The first {@link Map} passed to the constructor is forwarded to Elasticsearch when creating + * {@link TransportClient}. The config keys can be found in the Elasticsearch + * documentation. An important setting is {@code cluster.name}, this should be set to the name + * of the cluster that the sink should emit to. + * + * Attention: When using the {@code TransportClient} the sink will fail if no cluster + * can be connected to. + * + * The second {@link Map} is used to configure a {@link BulkProcessor} to send {@link IndexRequest IndexRequests}. + * This will buffer elements before sending a request to the cluster. 
The behaviour of the + * {@code BulkProcessor} can be configured using these config keys: + * + * {@code bulk.flush.max.actions}: Maximum amount of elements to buffer + * {@code bulk.flush.max.size.mb}: Maximum amount of data (in megabytes) to buffer + * {@code bulk.flush.interval.ms}: Interval at which to flush data regardless of the other two + * settings in milliseconds + * + * + * + * You also have to provide an {@link RequestIndexer}. This is used to create an + * {@link IndexRequest} from an element that needs to be added to Elasticsearch. See + * {@link RequestIndexer} for an example. + * + * @param Type of the elements emitted by this sink + */ +public class ElasticsearchSink extends RichSinkFunction { + + public static final String CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS = "bulk.flush.max.actions"; + public static final String CONFIG_KEY_BULK_FLUSH_MAX_SIZE_MB = "bulk.flush.max.size.mb"; + public static final String CONFIG_KEY_BULK_FLUSH_INTERVAL_MS = "bulk.flush.interval.ms"; + + private static final long serialVersionUID = 1L; + + private static final Logger LOG = LoggerFactory.getLogger(ElasticsearchSink.class); + + /** +* The user specified config map that we forward to Elasticsearch when we create the Client. +*/ + private final Map esConfig; + + /**
[jira] [Commented] (FLINK-5364) Rework JAAS configuration to support user-supplied entries
[ https://issues.apache.org/jira/browse/FLINK-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813450#comment-15813450 ] ASF GitHub Bot commented on FLINK-5364: --- Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95283348 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/JaasModule.java --- @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.runtime.security.modules; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.runtime.security.DynamicConfiguration; +import org.apache.flink.runtime.security.KerberosUtils; +import org.apache.flink.runtime.security.SecurityUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.security.auth.login.AppConfigurationEntry; +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.StandardCopyOption; + +/** + * Responsible for installing a process-wide JAAS configuration. + * + * The installed configuration combines login modules based on: + * - the user-supplied JAAS configuration file, if any + * - a Kerberos keytab, if configured + * - any cached Kerberos credentials from the current environment + * + * The module also installs a default JAAS config file (if necessary) for + * compatibility with ZK and Kafka. Note that the JRE actually draws on numerous file locations. 
+ * See: https://docs.oracle.com/javase/7/docs/jre/api/security/jaas/spec/com/sun/security/auth/login/ConfigFile.html + * See: https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/Login.java#L289 + */ +@Internal +public class JaasModule implements SecurityModule { + + private static final Logger LOG = LoggerFactory.getLogger(JaasModule.class); + + static final String JAVA_SECURITY_AUTH_LOGIN_CONFIG = "java.security.auth.login.config"; + + static final String JAAS_CONF_RESOURCE_NAME = "flink-jaas.conf"; + + private String priorConfigFile; + private javax.security.auth.login.Configuration priorConfig; + + private DynamicConfiguration currentConfig; + + @Override + public void install(SecurityUtils.SecurityConfiguration securityConfig) { + + // ensure that a config file is always defined, for compatibility with + // ZK and Kafka which check for the system property and existence of the file + priorConfigFile = System.getProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, null); + if (priorConfigFile == null) { + File configFile = generateDefaultConfigFile(); + System.setProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, configFile.getAbsolutePath()); + } + + // read the JAAS configuration file + priorConfig = javax.security.auth.login.Configuration.getConfiguration(); + + // construct a dynamic JAAS configuration + currentConfig = new DynamicConfiguration(priorConfig); + + // wire up the configured JAAS login contexts to use the krb5 entries + AppConfigurationEntry[] krb5Entries = getAppConfigurationEntries(securityConfig); + if(krb5Entries != null) { + for (String app : securityConfig.getLoginContextNames()) { + currentConfig.addAppConfigurationEntry(app, krb5Entries); + } + } + + javax.security.auth.login.Configuration.setConfiguration(currentConfig); + } + + @Override + public void uninstall() { + if(priorConfigFile != null) { + System.setProperty(JAVA_SEC
[GitHub] flink pull request #3057: [FLINK-5364] Rework JAAS configuration to support ...
Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95283348 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/JaasModule.java --- @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.runtime.security.modules; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.runtime.security.DynamicConfiguration; +import org.apache.flink.runtime.security.KerberosUtils; +import org.apache.flink.runtime.security.SecurityUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.security.auth.login.AppConfigurationEntry; +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.StandardCopyOption; + +/** + * Responsible for installing a process-wide JAAS configuration. + * + * The installed configuration combines login modules based on: + * - the user-supplied JAAS configuration file, if any + * - a Kerberos keytab, if configured + * - any cached Kerberos credentials from the current environment + * + * The module also installs a default JAAS config file (if necessary) for + * compatibility with ZK and Kafka. Note that the JRE actually draws on numerous file locations. 
+ * See: https://docs.oracle.com/javase/7/docs/jre/api/security/jaas/spec/com/sun/security/auth/login/ConfigFile.html + * See: https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/Login.java#L289 + */ +@Internal +public class JaasModule implements SecurityModule { + + private static final Logger LOG = LoggerFactory.getLogger(JaasModule.class); + + static final String JAVA_SECURITY_AUTH_LOGIN_CONFIG = "java.security.auth.login.config"; + + static final String JAAS_CONF_RESOURCE_NAME = "flink-jaas.conf"; + + private String priorConfigFile; + private javax.security.auth.login.Configuration priorConfig; + + private DynamicConfiguration currentConfig; + + @Override + public void install(SecurityUtils.SecurityConfiguration securityConfig) { + + // ensure that a config file is always defined, for compatibility with + // ZK and Kafka which check for the system property and existence of the file + priorConfigFile = System.getProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, null); + if (priorConfigFile == null) { + File configFile = generateDefaultConfigFile(); + System.setProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, configFile.getAbsolutePath()); + } + + // read the JAAS configuration file + priorConfig = javax.security.auth.login.Configuration.getConfiguration(); + + // construct a dynamic JAAS configuration + currentConfig = new DynamicConfiguration(priorConfig); + + // wire up the configured JAAS login contexts to use the krb5 entries + AppConfigurationEntry[] krb5Entries = getAppConfigurationEntries(securityConfig); + if(krb5Entries != null) { + for (String app : securityConfig.getLoginContextNames()) { + currentConfig.addAppConfigurationEntry(app, krb5Entries); + } + } + + javax.security.auth.login.Configuration.setConfiguration(currentConfig); + } + + @Override + public void uninstall() { + if(priorConfigFile != null) { + System.setProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG, priorConfigFile); + } else { + System.clearProperty(JAVA_SECURITY_AUTH_LOGIN_CONFIG); + } + javax.security.auth.login.Configuration.setConfiguration(priorConfig);
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813305#comment-15813305 ] ASF GitHub Bot commented on FLINK-5395: --- Github user shijinkui commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95276351 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- That's fine. > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh only builds the official Flink release. It's hard to build > a custom local Flink release distribution. > Let create_release_files.sh support: > 1. a custom git repo URL > 2. building specific Scala and Hadoop versions > 3. adding `tools/flink` to .gitignore > 4. a usage message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user shijinkui commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95276351 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- That's fine. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5364) Rework JAAS configuration to support user-supplied entries
[ https://issues.apache.org/jira/browse/FLINK-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813113#comment-15813113 ] ASF GitHub Bot commented on FLINK-5364: --- Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95265401 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/SecurityUtils.java --- @@ -71,163 +64,93 @@ */ public static void install(SecurityConfiguration config) throws Exception { - if (!config.securityIsEnabled()) { - // do not perform any initialization if no Kerberos crendetails are provided - return; - } - - // establish the JAAS config - JaasConfiguration jaasConfig = new JaasConfiguration(config.keytab, config.principal); - javax.security.auth.login.Configuration.setConfiguration(jaasConfig); - - populateSystemSecurityProperties(config.flinkConf); - - // establish the UGI login user - UserGroupInformation.setConfiguration(config.hadoopConf); - - // only configure Hadoop security if we have security enabled - if (UserGroupInformation.isSecurityEnabled()) { - - final UserGroupInformation loginUser; - - if (config.keytab != null && !StringUtils.isBlank(config.principal)) { - String keytabPath = (new File(config.keytab)).getAbsolutePath(); - - UserGroupInformation.loginUserFromKeytab(config.principal, keytabPath); - - loginUser = UserGroupInformation.getLoginUser(); - - // supplement with any available tokens - String fileLocation = System.getenv(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION); - if (fileLocation != null) { - /* -* Use reflection API since the API semantics are not available in Hadoop1 profile. Below APIs are -* used in the context of reading the stored tokens from UGI. -* Credentials cred = Credentials.readTokenStorageFile(new File(fileLocation), config.hadoopConf); -* loginUser.addCredentials(cred); - */ - try { - Method readTokenStorageFileMethod = Credentials.class.getMethod("readTokenStorageFile", - File.class, org.apache.hadoop.conf.Configuration.class); - Credentials cred = (Credentials) readTokenStorageFileMethod.invoke(null, new File(fileLocation), - config.hadoopConf); - Method addCredentialsMethod = UserGroupInformation.class.getMethod("addCredentials", - Credentials.class); - addCredentialsMethod.invoke(loginUser, cred); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - } - } else { - // login with current user credentials (e.g. ticket cache) - try { - //Use reflection API to get the login user object - //UserGroupInformation.loginUserFromSubject(null); - Method loginUserFromSubjectMethod = UserGroupInformation.class.getMethod("loginUserFromSubject", Subject.class); - Subject subject = null; - loginUserFromSubjectMethod.invoke(null, subject); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - - // note that the stored tokens are read automatically - loginUser = UserGroupInformation.getLoginUser(); + // install the security modules + List modules = new ArrayList(); + try { + for (Class moduleClass : config.getSecurityModules(
[GitHub] flink pull request #3057: [FLINK-5364] Rework JAAS configuration to support ...
Github user EronWright commented on a diff in the pull request: https://github.com/apache/flink/pull/3057#discussion_r95265401 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/security/SecurityUtils.java --- @@ -71,163 +64,93 @@ */ public static void install(SecurityConfiguration config) throws Exception { - if (!config.securityIsEnabled()) { - // do not perform any initialization if no Kerberos crendetails are provided - return; - } - - // establish the JAAS config - JaasConfiguration jaasConfig = new JaasConfiguration(config.keytab, config.principal); - javax.security.auth.login.Configuration.setConfiguration(jaasConfig); - - populateSystemSecurityProperties(config.flinkConf); - - // establish the UGI login user - UserGroupInformation.setConfiguration(config.hadoopConf); - - // only configure Hadoop security if we have security enabled - if (UserGroupInformation.isSecurityEnabled()) { - - final UserGroupInformation loginUser; - - if (config.keytab != null && !StringUtils.isBlank(config.principal)) { - String keytabPath = (new File(config.keytab)).getAbsolutePath(); - - UserGroupInformation.loginUserFromKeytab(config.principal, keytabPath); - - loginUser = UserGroupInformation.getLoginUser(); - - // supplement with any available tokens - String fileLocation = System.getenv(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION); - if (fileLocation != null) { - /* -* Use reflection API since the API semantics are not available in Hadoop1 profile. Below APIs are -* used in the context of reading the stored tokens from UGI. -* Credentials cred = Credentials.readTokenStorageFile(new File(fileLocation), config.hadoopConf); -* loginUser.addCredentials(cred); - */ - try { - Method readTokenStorageFileMethod = Credentials.class.getMethod("readTokenStorageFile", - File.class, org.apache.hadoop.conf.Configuration.class); - Credentials cred = (Credentials) readTokenStorageFileMethod.invoke(null, new File(fileLocation), - config.hadoopConf); - Method addCredentialsMethod = UserGroupInformation.class.getMethod("addCredentials", - Credentials.class); - addCredentialsMethod.invoke(loginUser, cred); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - } - } else { - // login with current user credentials (e.g. ticket cache) - try { - //Use reflection API to get the login user object - //UserGroupInformation.loginUserFromSubject(null); - Method loginUserFromSubjectMethod = UserGroupInformation.class.getMethod("loginUserFromSubject", Subject.class); - Subject subject = null; - loginUserFromSubjectMethod.invoke(null, subject); - } catch (NoSuchMethodException e) { - LOG.warn("Could not find method implementations in the shaded jar. Exception: {}", e); - } - - // note that the stored tokens are read automatically - loginUser = UserGroupInformation.getLoginUser(); + // install the security modules + List modules = new ArrayList(); + try { + for (Class moduleClass : config.getSecurityModules()) { + SecurityModule module = moduleClass.newInstance(); + module.install(config); + modules.add(module); } + } +
[jira] [Created] (FLINK-5432) ContinuousFileMonitoringFunction is not monitoring nested files
Yassine Marzougui created FLINK-5432: Summary: ContinuousFileMonitoringFunction is not monitoring nested files Key: FLINK-5432 URL: https://issues.apache.org/jira/browse/FLINK-5432 Project: Flink Issue Type: Bug Components: filesystem-connector Affects Versions: 1.2.0 Reporter: Yassine Marzougui The {{ContinuousFileMonitoringFunction}} does not monitor nested files even if the inputformat has NestedFileEnumeration set to true. This can be fixed by enabling a recursive scan of the directories in the {{listEligibleFiles}} method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
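For illustration, a rough sketch of the recursive scan the issue proposes, using Flink's `FileSystem`/`FileStatus` API; the method and parameter names are illustrative, not the actual patch:

```scala
import org.apache.flink.core.fs.{FileStatus, FileSystem, Path}
import scala.collection.mutable.ListBuffer

// Collect candidate files, descending into sub-directories so nested files
// become eligible for monitoring as well.
def listEligibleFiles(fileSystem: FileSystem, path: Path): Seq[FileStatus] = {
  val result = ListBuffer[FileStatus]()
  val statuses = fileSystem.listStatus(path)
  if (statuses != null) {
    for (status <- statuses) {
      if (status.isDir) {
        result ++= listEligibleFiles(fileSystem, status.getPath) // recurse into nested directories
      } else {
        result += status
      }
    }
  }
  result
}
```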
[jira] [Commented] (FLINK-5386) Refactoring Window Clause
[ https://issues.apache.org/jira/browse/FLINK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812931#comment-15812931 ] ASF GitHub Bot commented on FLINK-5386: --- Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95244683 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/windows.scala --- @@ -150,7 +147,7 @@ class TumblingWindow(size: Expression) extends GroupWindow { def as(alias: String): TumblingWindow = as(ExpressionParser.parseExpression(alias)) override private[flink] def toLogicalWindow: LogicalWindow = -ProcessingTimeTumblingGroupWindow(alias, size) +ProcessingTimeTumblingGroupWindow(name, size) --- End diff -- Better to keep using "alias" here. By the way, I found that some functions take "name" as input, like ProcessingTimeTumblingGroupWindow, while some others take "alias", like TumblingEventTimeWindow. Better to change them consistently to "alias". What do you think? > Refactoring Window Clause > - > > Key: FLINK-5386 > URL: https://issues.apache.org/jira/browse/FLINK-5386 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: sunjincheng >Assignee: sunjincheng > > Similar to SQL, a window clause is defined "as" a symbol which is > explicitly used in groupBy/over. We are proposing to refactor the way to > write groupBy+window Table API queries as follows: > val windowedTable = table > .window(Slide over 10.milli every 5.milli as 'w1) > .window(Tumble over 5.milli as 'w2) > .groupBy('w1, 'key) > .select('string, 'int.count as 'count, 'w1.start) > .groupBy( 'w2, 'key) > .select('string, 'count.sum as sum2) > .window(Tumble over 5.milli as 'w3) > .groupBy( 'w3) // windowAll > .select('sum2, 'w3.start, 'w3.end) > In this way, we can remove both GroupWindowedTable and the window() method in > GroupedTable, which makes the API a bit cleaner. In addition, for row-window, we > need to define the window clause as a symbol anyway. This change will make the > API of window and row-window consistent; an example for row-window: > .window(RowXXXWindow as ‘x, RowYYYWindow as ‘y) > .select(‘a, ‘b.count over ‘x as ‘xcnt, ‘c.count over ‘y as ‘ycnt, ‘x.start, > ‘x.end) > What do you think? [~fhueske] [~twalthr] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3046: [FLINK-5386][Table API & SQL] refactoring Window C...
Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95244683 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/windows.scala --- @@ -150,7 +147,7 @@ class TumblingWindow(size: Expression) extends GroupWindow { def as(alias: String): TumblingWindow = as(ExpressionParser.parseExpression(alias)) override private[flink] def toLogicalWindow: LogicalWindow = -ProcessingTimeTumblingGroupWindow(alias, size) +ProcessingTimeTumblingGroupWindow(name, size) --- End diff -- Better to keep using "alias" here. By the way, I found that some functions take "name" as input, like ProcessingTimeTumblingGroupWindow, while some others take "alias", like TumblingEventTimeWindow. Better to change them consistently to "alias". What do you think? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5355) Handle AmazonKinesisException gracefully in Kinesis Streaming Connector
[ https://issues.apache.org/jira/browse/FLINK-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812879#comment-15812879 ] ASF GitHub Bot commented on FLINK-5355: --- Github user skidder commented on the issue: https://github.com/apache/flink/pull/3078 Thanks @tzulitai , great feedback! The `ProvisionedThroughputExceededException` exception will be reported with an HTTP 400 response status code: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html#API_GetRecords_Errors The AWS SDK will assign the `ErrorType.Client` type to all exceptions with an HTTP status code less than 500 ([AWS SDK source](https://github.com/aws/aws-sdk-java/blob/7844c64cf248aed889811bf2e871ad6b276a89ca/aws-java-sdk-core/src/main/java/com/amazonaws/http/JsonErrorResponseHandler.java#L119-L121)). Perhaps we can perform exponential-backoff for exceptions where: * `Client` error of type `ProvisionedThroughputExceededException` * All `Server` errors (e.g. HTTP 500, 503) * All `Unknown` errors (appear to be limited to errors unmarshalling the Kinesis service response) All other exceptions can be thrown up. What are your thoughts? > Handle AmazonKinesisException gracefully in Kinesis Streaming Connector > --- > > Key: FLINK-5355 > URL: https://issues.apache.org/jira/browse/FLINK-5355 > Project: Flink > Issue Type: Improvement > Components: Kinesis Connector >Affects Versions: 1.2.0, 1.1.3 >Reporter: Scott Kidder >Assignee: Scott Kidder > > My Flink job that consumes from a Kinesis stream must be restarted at least > once daily due to an uncaught AmazonKinesisException when reading from > Kinesis. The complete stacktrace looks like: > {noformat} > com.amazonaws.services.kinesis.model.AmazonKinesisException: null (Service: > AmazonKinesis; Status Code: 500; Error Code: InternalFailure; Request ID: > dc1b7a1a-1b97-1a32-8cd5-79a896a55223) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1545) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1183) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:964) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:676) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:650) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:633) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:601) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:583) > at > com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:447) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:1747) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:1723) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:858) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getRecords(KinesisProxy.java:193) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.getRecords(ShardConsumer.java:268) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:176) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > It's interesting that the Kinesis endpoint returned a 500 status code, but > that's outside the scope of this issue. > I think we can handle this exception in the same manner as a > ProvisionedThroughputException: performing an exponential backoff and > retrying a finite number of times before throwing an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3078: [FLINK-5355] Handle AmazonKinesisException gracefully in ...
Github user skidder commented on the issue: https://github.com/apache/flink/pull/3078 Thanks @tzulitai , great feedback! The `ProvisionedThroughputExceededException` exception will be reported with an HTTP 400 response status code: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html#API_GetRecords_Errors The AWS SDK will assign the `ErrorType.Client` type to all exceptions with an HTTP status code less than 500 ([AWS SDK source](https://github.com/aws/aws-sdk-java/blob/7844c64cf248aed889811bf2e871ad6b276a89ca/aws-java-sdk-core/src/main/java/com/amazonaws/http/JsonErrorResponseHandler.java#L119-L121)). Perhaps we can perform exponential-backoff for exceptions where: * `Client` error of type `ProvisionedThroughputExceededException` * All `Server` errors (e.g. HTTP 500, 503) * All `Unknown` errors (appear to be limited to errors unmarshalling the Kinesis service response) All other exceptions can be thrown up. What are your thoughts? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
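As a rough illustration of the policy proposed in the comment above, the sketch below classifies an `AmazonServiceException` and retries only the recoverable cases with exponential backoff. The class and helper names, the retry parameters, and the use of a plain `Callable` are illustrative assumptions; they are not the connector's actual code or configuration.

```java
import java.util.concurrent.Callable;

import com.amazonaws.AmazonServiceException;
import com.amazonaws.AmazonServiceException.ErrorType;

class KinesisRetrySketch {

    // Client errors are only retried when they signal throttling; service-side
    // (HTTP 5xx) and unknown (unmarshalling) errors are always retried.
    static boolean isRecoverable(AmazonServiceException e) {
        if (e.getErrorType() == ErrorType.Client) {
            return "ProvisionedThroughputExceededException".equals(e.getErrorCode());
        }
        return true;
    }

    // Retry 'call' with exponential backoff until it succeeds, the error is not
    // recoverable, or the retry budget is exhausted.
    static <T> T callWithBackoff(Callable<T> call, int maxRetries, long baseBackoffMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return call.call();
            } catch (AmazonServiceException e) {
                if (!isRecoverable(e) || attempt >= maxRetries) {
                    throw e;
                }
                Thread.sleep(baseBackoffMillis << attempt); // 1x, 2x, 4x, ...
            }
        }
    }
}
```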
[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
[ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812569#comment-15812569 ] ASF GitHub Bot commented on FLINK-2821: --- Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 short: it works! long: my scenario: a docker-compose setup:

```
jobmanager:
  image: flink:1.2.0_rc0
  container_name: "jobmanager"
  expose:
    - "6123"
    - "8081"
  ports:
    - "8081:8081"
    - "10.0.0.11:6123:6123"
  command: jobmanager
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
```

and the taskmanager(s):

```
taskmanager:
  image: flink:1.2.0_rc0
  expose:
    - "6121"
    - "6122"
  command: taskmanager - "jobmanager:jobmanager"
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
  extra_hosts:
    - "jobmanager:10.0.0.11"
```

1. the jobmanager web interface works now
2. the taskmanager can now connect to the jobmanager

previous error message: > dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@10.0.0.11:6123/]] arriving at [akka.tcp://flink@10.0.0.11:6123] inbound addresses are [akka.tcp://flink@172.31.0.2:6123] > flink-master| 2017-01-09 15:30:39,865 ERROR akka.remote.EndpointWriter

that's very cool! > Change Akka configuration to allow accessing actors from different URLs > --- > > Key: FLINK-2821 > URL: https://issues.apache.org/jira/browse/FLINK-2821 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Reporter: Robert Metzger >Assignee: Maximilian Michels > Fix For: 1.2.0 > > > Akka expects the actor's URL to be exactly matching. > As pointed out here, cases where users were complaining about this: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html > - Proxy routing (as described here, send to the proxy URL, receiver > recognizes only original URL) > - Using hostname / IP interchangeably does not work (we solved this by > always putting IP addresses into URLs, never hostnames) > - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still > no solution to that (but seems not too much of a restriction) > I am aware that this is not possible due to Akka, so it is actually not a > Flink bug. But I think we should track the resolution of the issue here > anyways because its affecting our user's satisfaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...
Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 short: it works! long: my scenario: a docker-compose setup:

```
jobmanager:
  image: flink:1.2.0_rc0
  container_name: "jobmanager"
  expose:
    - "6123"
    - "8081"
  ports:
    - "8081:8081"
    - "10.0.0.11:6123:6123"
  command: jobmanager
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
```

and the taskmanager(s):

```
taskmanager:
  image: flink:1.2.0_rc0
  expose:
    - "6121"
    - "6122"
  command: taskmanager - "jobmanager:jobmanager"
  environment:
    - JOB_MANAGER_RPC_ADDRESS=10.0.0.11
  extra_hosts:
    - "jobmanager:10.0.0.11"
```

1. the jobmanager web interface works now
2. the taskmanager can now connect to the jobmanager

previous error message: > dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@10.0.0.11:6123/]] arriving at [akka.tcp://flink@10.0.0.11:6123] inbound addresses are [akka.tcp://flink@172.31.0.2:6123] > flink-master| 2017-01-09 15:30:39,865 ERROR akka.remote.EndpointWriter

that's very cool! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3046: [FLINK-5386][Table API & SQL] refactoring Window C...
Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95219196 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/table.scala --- @@ -785,13 +793,19 @@ class Table( * will be processed by a single operator. * * @param groupWindow group-window that specifies how elements are grouped. -* @return A windowed table. */ - def window(groupWindow: GroupWindow): GroupWindowedTable = { + def window(groupWindow: GroupWindow): Table = { if (tableEnv.isInstanceOf[BatchTableEnvironment]) { throw new ValidationException(s"Windows on batch tables are currently not supported.") } -new GroupWindowedTable(this, Seq(), groupWindow) +if (None == groupWindow.name) { + throw new ValidationException("An alias must be specified for the window.") +} +if (windowPool.contains(groupWindow.name.get)) { + throw new ValidationException("The window alias can not be duplicated.") --- End diff -- s/"The window alias can not be duplicated."/"The same window alias has been defined"/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5386) Refactoring Window Clause
[ https://issues.apache.org/jira/browse/FLINK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812507#comment-15812507 ] ASF GitHub Bot commented on FLINK-5386: --- Github user shaoxuan-wang commented on a diff in the pull request: https://github.com/apache/flink/pull/3046#discussion_r95219196 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/table.scala --- @@ -785,13 +793,19 @@ class Table( * will be processed by a single operator. * * @param groupWindow group-window that specifies how elements are grouped. -* @return A windowed table. */ - def window(groupWindow: GroupWindow): GroupWindowedTable = { + def window(groupWindow: GroupWindow): Table = { if (tableEnv.isInstanceOf[BatchTableEnvironment]) { throw new ValidationException(s"Windows on batch tables are currently not supported.") } -new GroupWindowedTable(this, Seq(), groupWindow) +if (None == groupWindow.name) { + throw new ValidationException("An alias must be specified for the window.") +} +if (windowPool.contains(groupWindow.name.get)) { + throw new ValidationException("The window alias can not be duplicated.") --- End diff -- s/"The window alias can not be duplicated."/"The same window alias has been defined"/ > Refactoring Window Clause > - > > Key: FLINK-5386 > URL: https://issues.apache.org/jira/browse/FLINK-5386 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: sunjincheng >Assignee: sunjincheng > > Similar to SQL, a window clause is defined "as" a symbol which is > explicitly used in groupBy/over. We are proposing to refactor the way to > write groupBy+window Table API queries as follows: > val windowedTable = table > .window(Slide over 10.milli every 5.milli as 'w1) > .window(Tumble over 5.milli as 'w2) > .groupBy('w1, 'key) > .select('string, 'int.count as 'count, 'w1.start) > .groupBy( 'w2, 'key) > .select('string, 'count.sum as sum2) > .window(Tumble over 5.milli as 'w3) > .groupBy( 'w3) // windowAll > .select('sum2, 'w3.start, 'w3.end) > In this way, we can remove both GroupWindowedTable and the window() method in > GroupedTable, which makes the API a bit cleaner. In addition, for row-window, we > need to define the window clause as a symbol anyway. This change will make the > API of window and row-window consistent; an example for row-window: > .window(RowXXXWindow as ‘x, RowYYYWindow as ‘y) > .select(‘a, ‘b.count over ‘x as ‘xcnt, ‘c.count over ‘y as ‘ycnt, ‘x.start, > ‘x.end) > What do you think? [~fhueske] [~twalthr] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3061: [FLINK-5030] Support hostname verification
Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3061 @tillrohrmann @StephanEwen --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5030) Support hostname verification
[ https://issues.apache.org/jira/browse/FLINK-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812494#comment-15812494 ] ASF GitHub Bot commented on FLINK-5030: --- Github user EronWright commented on the issue: https://github.com/apache/flink/pull/3061 @tillrohrmann @StephanEwen > Support hostname verification > - > > Key: FLINK-5030 > URL: https://issues.apache.org/jira/browse/FLINK-5030 > Project: Flink > Issue Type: Sub-task > Components: Security >Reporter: Eron Wright >Assignee: Eron Wright > Fix For: 1.2.0 > > > _See [Dangerous Code|http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf] and > [further > commentary|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] > for useful background._ > When hostname verification is performed, it should use the hostname (not IP > address) to match the certificate. The current code is wrongly using the > address. > In technical terms, ensure that calls to `SSLContext::createSSLEngine` supply > the expected hostname, not host address. > Please audit all SSL setup code as to whether hostname verification is > enabled, and file follow-ups where necessary. For example, Akka 2.4 > supports it but 2.3 doesn't > ([ref|http://doc.akka.io/docs/akka/2.4.4/scala/http/client-side/https-support.html#Hostname_verification]). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/3059 It does, thanks! I'll try to remember to include "closes #_{pr number}_" in my commit messages which might be handy. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user uce commented on the issue: https://github.com/apache/flink/pull/3059 Since the comments were very minor, I addressed them myself (used the backticks and skipped the change in "If checkpointing is activated..." as it didn't really improve anything). The workflow was as follows: I checked out your PR branch, rebased it onto master (via `git rebase apache/master`) and pushed it with another related commit. My related commit had a comment `This closes #3059`, which tells the ASF bot to close this PR. Then I cherry-picked it to the 1.2 branch. I think GitHub only notices that the changes were merged if they are merged to master and the commit hashes are unchanged. Since I rebased, they did change, and I needed the manual close. Does this make sense? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (FLINK-5431) time format for akka status
Alexey Diomin created FLINK-5431: Summary: time format for akka status Key: FLINK-5431 URL: https://issues.apache.org/jira/browse/FLINK-5431 Project: Flink Issue Type: Improvement Reporter: Alexey Diomin Priority: Minor In ExecutionGraphMessages we have code {code} private val DATE_FORMATTER: SimpleDateFormat = new SimpleDateFormat("MM/dd/ HH:mm:ss") {code} But sometimes it cause confusion when main logger configured with "dd/MM/". We need making this format configurable or maybe stay only "HH:mm:ss" for prevent misunderstanding output date-time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
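A sketch of one way to make the pattern configurable, as suggested in the issue; the configuration key used below is purely illustrative and not an existing Flink option, and the fallback defaults to the unambiguous "HH:mm:ss" proposed above.

```java
import java.text.SimpleDateFormat;

import org.apache.flink.configuration.Configuration;

class StatusDateFormatSketch {

    // Read the date-time pattern from the Flink configuration; the key name is a
    // placeholder for whatever option the fix would actually introduce.
    static SimpleDateFormat fromConfig(Configuration config) {
        String pattern = config.getString("jobmanager.status-message.date-format", "HH:mm:ss");
        return new SimpleDateFormat(pattern);
    }
}
```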
[jira] [Closed] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Mushketyk closed FLINK-5360. - Resolution: Fixed > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user mushketyk closed the pull request at: https://github.com/apache/flink/pull/3022 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812358#comment-15812358 ] ASF GitHub Bot commented on FLINK-5360: --- Github user mushketyk closed the pull request at: https://github.com/apache/flink/pull/3022 > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812357#comment-15812357 ] ASF GitHub Bot commented on FLINK-5360: --- Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/3022 Thank you @aljoscha, @xhumanoid Closing the issue. > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/3022 Thank you @aljoscha, @xhumanoid Closing the issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/3059 I'm curious as to how you added my commit to `master` and `release-1.2` but it doesn't show up in this PR. The whole "rehevkor5 committed with uce 5 days ago" thing. Github isn't aware that the changes in this PR were merged. If you have a moment, can you describe your workflow so I can understand what's happening behind the scenes? Is it rebase onto master + merge --ff-only or something? Maybe it's due to the fact that Github is only mirroring another repo? I'd appreciate it! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
[ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812341#comment-15812341 ] ASF GitHub Bot commented on FLINK-2821: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2917 There is a release candidate out: https://github.com/apache/flink/tree/release-1.2.0-rc0 Download at http://people.apache.org/~rmetzger/flink-1.2.0-rc0/ The official release comes when the testing phase is done, which should be soon. If you can try the release candidate and tell us if it worked for your scenario, that would help with the release testing. > Change Akka configuration to allow accessing actors from different URLs > --- > > Key: FLINK-2821 > URL: https://issues.apache.org/jira/browse/FLINK-2821 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Reporter: Robert Metzger >Assignee: Maximilian Michels > Fix For: 1.2.0 > > > Akka expects the actor's URL to be exactly matching. > As pointed out here, cases where users were complaining about this: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html > - Proxy routing (as described here, send to the proxy URL, receiver > recognizes only original URL) > - Using hostname / IP interchangeably does not work (we solved this by > always putting IP addresses into URLs, never hostnames) > - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still > no solution to that (but seems not too much of a restriction) > I am aware that this is not possible due to Akka, so it is actually not a > Flink bug. But I think we should track the resolution of the issue here > anyways because its affecting our user's satisfaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2917 There is a release candidate out: https://github.com/apache/flink/tree/release-1.2.0-rc0 Download at http://people.apache.org/~rmetzger/flink-1.2.0-rc0/ The official release comes when the testing phase is done, which should be soon. If you can try the release candidate and tell us if it worked for your scenario, that would help with the release testing. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202903 --- Diff: tools/create_release_files.sh --- @@ -85,18 +88,74 @@ else MD5SUM="md5sum" fi +usage() { + set +x + echo "./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + echo "" + echo "usage:" + echo "[--scala-version ] [--hadoop-version ]" + echo "" + echo "example:" + echo " sonatype_user=APACHEID sonatype_pw=APACHEIDPASSWORD \ " + echo " NEW_VERSION=1.2.0 \ " + echo " RELEASE_CANDIDATE="rc1" RELEASE_BRANCH=release-1.2.0 \ " + echo " OLD_VERSION=1.1-SNAPSHOT \ " + echo " USER_NAME=APACHEID \ " + echo " GPG_PASSPHRASE=XXX GPG_KEY=KEYID \ " + echo " GIT_AUTHOR=\"`git config --get user.name` <`git config --get user.email`>\" \ " + echo " IS_LOCAL_DIST=true GIT_REPO=github.com/apache/flink.git" + echo " ./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + exit 1 +} + +# Parse arguments +while (( "$#" )); do + case $1 in +--scala-version) + scalaV="$2" + shift + ;; +--hadoop-version) + hadoopV="$2" + shift + ;; +--help) + usage + ;; +*) + break + ;; + esac + shift +done + +### prepare() { # prepare - git clone http://git-wip-us.apache.org/repos/asf/flink.git flink + target_branch=release-$RELEASE_VERSION-$RELEASE_CANDIDATE + if [ ! -d ./flink ]; then +git clone http://$GIT_REPO flink + else +# if flink git repo exist, delete target branch, delete builded distribution +rm -rf flink-*.tgz +cd flink +# try-catch +{ + git pull --all + git checkout master + git branch -D $target_branch -f +} || { + echo "branch $target_branch maybe not found" --- End diff -- I think the error message should just be `branch $target_branch not found` without the maybe. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95203611 --- Diff: tools/create_release_files.sh --- @@ -201,19 +262,34 @@ prepare make_source_release -make_binary_release "hadoop2" "" 2.10 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 - -make_binary_release "hadoop2" "" 2.11 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 - -copy_data - -deploy_to_maven +# build dist by input parameter of "--scala-vervion xxx --hadoop-version xxx" +if [ "$scalaV" == "none" ] && [ "$hadoopV" == "none" ]; then + make_binary_release "hadoop2" "" 2.10 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 + + make_binary_release "hadoop2" "" 2.11 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 +elif [ "$scalaV" == none ] && [ "$hadoopV" != "none" ] --- End diff -- Can we add the `"..."` for all strings to have it consistent? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3049: [FLINK-5395] [Build System] support locally build ...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202566 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- Should we rename this to `SCALA_VERSION` and `HADOOP_VERSION` in order to follow the style of the other variables? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812336#comment-15812336 ] ASF GitHub Bot commented on FLINK-5395: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202566 --- Diff: tools/create_release_files.sh --- @@ -66,16 +66,19 @@ fi GPG_PASSPHRASE=${GPG_PASSPHRASE:-XXX} GPG_KEY=${GPG_KEY:-XXX} GIT_AUTHOR=${GIT_AUTHOR:-"Your name "} -OLD_VERSION=${OLD_VERSION:-1.1-SNAPSHOT} -RELEASE_VERSION=${NEW_VERSION} +OLD_VERSION=${OLD_VERSION:-1.2-SNAPSHOT} +RELEASE_VERSION=${NEW_VERSION:-1.3-SNAPSHOT} RELEASE_CANDIDATE=${RELEASE_CANDIDATE:-rc1} RELEASE_BRANCH=${RELEASE_BRANCH:-master} USER_NAME=${USER_NAME:-yourapacheidhere} MVN=${MVN:-mvn} GPG=${GPG:-gpg} sonatype_user=${sonatype_user:-yourapacheidhere} sonatype_pw=${sonatype_pw:-XXX} - +IS_LOCAL_DIST=${IS_LOCAL_DIST:-false} +GIT_REPO=${GIT_REPO:-git-wip-us.apache.org/repos/asf/flink.git} +scalaV=none --- End diff -- Should we rename this to `SCALA_VERSION` and `HADOOP_VERSION` in order to follow the style of the other variables? > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh is build flink release only. It's hard to build > custom local Flink release distribution. > Let create_release_files.sh support: > 1. custom git repo url > 2. custom build special scala and hadoop version > 3. add `tools/flink` to .gitignore > 4. add usage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812337#comment-15812337 ] ASF GitHub Bot commented on FLINK-5395: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95203611 --- Diff: tools/create_release_files.sh --- @@ -201,19 +262,34 @@ prepare make_source_release -make_binary_release "hadoop2" "" 2.10 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 - -make_binary_release "hadoop2" "" 2.11 -make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 -make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 -make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 - -copy_data - -deploy_to_maven +# build dist by input parameter of "--scala-vervion xxx --hadoop-version xxx" +if [ "$scalaV" == "none" ] && [ "$hadoopV" == "none" ]; then + make_binary_release "hadoop2" "" 2.10 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.10 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.10 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.10 + + make_binary_release "hadoop2" "" 2.11 + make_binary_release "hadoop24" "-Dhadoop.version=2.4.1" 2.11 + make_binary_release "hadoop26" "-Dhadoop.version=2.6.3" 2.11 + make_binary_release "hadoop27" "-Dhadoop.version=2.7.2" 2.11 +elif [ "$scalaV" == none ] && [ "$hadoopV" != "none" ] --- End diff -- Can we add the `"..."` for all strings to have it consistent? > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh is build flink release only. It's hard to build > custom local Flink release distribution. > Let create_release_files.sh support: > 1. custom git repo url > 2. custom build special scala and hadoop version > 3. add `tools/flink` to .gitignore > 4. add usage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5395) support locally build distribution by script create_release_files.sh
[ https://issues.apache.org/jira/browse/FLINK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812338#comment-15812338 ] ASF GitHub Bot commented on FLINK-5395: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/3049#discussion_r95202903 --- Diff: tools/create_release_files.sh --- @@ -85,18 +88,74 @@ else MD5SUM="md5sum" fi +usage() { + set +x + echo "./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + echo "" + echo "usage:" + echo "[--scala-version ] [--hadoop-version ]" + echo "" + echo "example:" + echo " sonatype_user=APACHEID sonatype_pw=APACHEIDPASSWORD \ " + echo " NEW_VERSION=1.2.0 \ " + echo " RELEASE_CANDIDATE="rc1" RELEASE_BRANCH=release-1.2.0 \ " + echo " OLD_VERSION=1.1-SNAPSHOT \ " + echo " USER_NAME=APACHEID \ " + echo " GPG_PASSPHRASE=XXX GPG_KEY=KEYID \ " + echo " GIT_AUTHOR=\"`git config --get user.name` <`git config --get user.email`>\" \ " + echo " IS_LOCAL_DIST=true GIT_REPO=github.com/apache/flink.git" + echo " ./create_release_files.sh --scala-version 2.11 --hadoop-version 2.7.2" + exit 1 +} + +# Parse arguments +while (( "$#" )); do + case $1 in +--scala-version) + scalaV="$2" + shift + ;; +--hadoop-version) + hadoopV="$2" + shift + ;; +--help) + usage + ;; +*) + break + ;; + esac + shift +done + +### prepare() { # prepare - git clone http://git-wip-us.apache.org/repos/asf/flink.git flink + target_branch=release-$RELEASE_VERSION-$RELEASE_CANDIDATE + if [ ! -d ./flink ]; then +git clone http://$GIT_REPO flink + else +# if flink git repo exist, delete target branch, delete builded distribution +rm -rf flink-*.tgz +cd flink +# try-catch +{ + git pull --all + git checkout master + git branch -D $target_branch -f +} || { + echo "branch $target_branch maybe not found" --- End diff -- I think the error message should just be `branch $target_branch not found` without the maybe. > support locally build distribution by script create_release_files.sh > > > Key: FLINK-5395 > URL: https://issues.apache.org/jira/browse/FLINK-5395 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: shijinkui > > create_release_files.sh is build flink release only. It's hard to build > custom local Flink release distribution. > Let create_release_files.sh support: > 1. custom git repo url > 2. custom build special scala and hadoop version > 3. add `tools/flink` to .gitignore > 4. add usage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3059: [docs] Clarify restart strategy defaults set by checkpoin...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/3059 @uce Thanks. I didn't get an opportunity to address your comments... is there anything you want me to do? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812311#comment-15812311 ] ASF GitHub Bot commented on FLINK-5360: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 Merged. Could you please close this PR and the Jira issue? > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 Merged. Could you please close this PR and the Jira issue? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4651) Re-register processing time timers at the WindowOperator upon recovery.
[ https://issues.apache.org/jira/browse/FLINK-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812288#comment-15812288 ] Scott Kidder commented on FLINK-4651: - I can confirm that this issue appears to be fixed in the release-1.2.0-rc0 branch (commit f3c59cedae7a508825d8032a8aa9f5af6d177555). > Re-register processing time timers at the WindowOperator upon recovery. > --- > > Key: FLINK-4651 > URL: https://issues.apache.org/jira/browse/FLINK-4651 > Project: Flink > Issue Type: Bug > Components: Windowing Operators >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > Labels: windows > Fix For: 1.2.0, 1.1.5 > > > Currently the {{WindowOperator}} checkpoints the processing time timers, but > upon recovery it does not re-registers them with the {{TimeServiceProvider}}. > To actually reprocess them it relies on another element that will come and > register a new timer for a future point in time. Although this is a realistic > assumption in long running jobs, we can remove this assumption by > re-registering the restored timers with the {{TimeServiceProvider}} in the > {{open()}} method of the {{WindowOperator}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
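To make the re-registration idea concrete, here is a minimal standalone sketch. It deliberately uses a plain `ScheduledExecutorService` rather than the operator's actual `TimeServiceProvider`, so the class, method names, and types are stand-ins, not the WindowOperator API.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class TimerReRegistrationSketch {

    private final ScheduledExecutorService timerService = Executors.newSingleThreadScheduledExecutor();

    // Called once after state restore (conceptually in open()): every restored
    // timer is scheduled again; timers that expired while the job was down get a
    // delay of zero and therefore fire immediately.
    void reRegisterRestoredTimers(List<Long> restoredTimerTimestamps, Runnable onTimer) {
        long now = System.currentTimeMillis();
        for (long timestamp : restoredTimerTimestamps) {
            long delay = Math.max(0L, timestamp - now);
            timerService.schedule(onTimer, delay, TimeUnit.MILLISECONDS);
        }
    }
}
```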
[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
[ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812287#comment-15812287 ] ASF GitHub Bot commented on FLINK-2821: --- Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 when is this planned to be released? > Change Akka configuration to allow accessing actors from different URLs > --- > > Key: FLINK-2821 > URL: https://issues.apache.org/jira/browse/FLINK-2821 > Project: Flink > Issue Type: Bug > Components: Distributed Coordination >Reporter: Robert Metzger >Assignee: Maximilian Michels > Fix For: 1.2.0 > > > Akka expects the actor's URL to be exactly matching. > As pointed out here, cases where users were complaining about this: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html > - Proxy routing (as described here, send to the proxy URL, receiver > recognizes only original URL) > - Using hostname / IP interchangeably does not work (we solved this by > always putting IP addresses into URLs, never hostnames) > - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still > no solution to that (but seems not too much of a restriction) > I am aware that this is not possible due to Akka, so it is actually not a > Flink bug. But I think we should track the resolution of the issue here > anyways because its affecting our user's satisfaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...
Github user schmichri commented on the issue: https://github.com/apache/flink/pull/2917 when is this planned to be released? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5178) allow BlobCache to use a distributed file system irrespective of the HA mode
[ https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812271#comment-15812271 ] ASF GitHub Bot commented on FLINK-5178: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3085 [FLINK-5178] allow BlobCache to use a distributed file system irrespective of the HA mode Allow the BlobServer and BlobCache to use a distributed file system for distributing BLOBs even if not in HA-mode. For this, we always try to use the path given by the `high-availability.storageDir` config option and, if accessible, set it up appropriately. This builds upon https://github.com/apache/flink/pull/3084 You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5178a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3085.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3085 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. 
commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly commit b3bcf944df87f37cccd831e8fb56b95caa620dad Author: Nico Kruber Date: 2017-01-09T13:41:59Z [FLINK-5129] let FileSystemBlobStore#get() remove the target file on failure If the copy fai
[GitHub] flink pull request #3085: [FLINK-5178] allow BlobCache to use a distributed ...
GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3085 [FLINK-5178] allow BlobCache to use a distributed file system irrespective of the HA mode Allow the BlobServer and BlobCache to use a distributed file system for distributing BLOBs even if not in HA-mode. For this, we always try to use the path given by the `high-availability.storageDir` config option and, if accessible, set it up appropriately. This builds upon https://github.com/apache/flink/pull/3084 You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5178a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3085.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3085 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. 
commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly commit b3bcf944df87f37cccd831e8fb56b95caa620dad Author: Nico Kruber Date: 2017-01-09T13:41:59Z [FLINK-5129] let FileSystemBlobStore#get() remove the target file on failure If the copy fails, an IOException was thrown but the target file remained and was (most likely) not finished. This cleans up the file in that case so that code above, e.g. BlobServer and BlobCache, can rely on a file being complete as long as it exists. co
[jira] [Commented] (FLINK-4988) Elasticsearch 5.x support
[ https://issues.apache.org/jira/browse/FLINK-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812217#comment-15812217 ] ASF GitHub Bot commented on FLINK-4988: --- Github user mikedias commented on the issue: https://github.com/apache/flink/pull/2767 @tzulitai sure, no problem! :) > Elasticsearch 5.x support > - > > Key: FLINK-4988 > URL: https://issues.apache.org/jira/browse/FLINK-4988 > Project: Flink > Issue Type: New Feature >Reporter: Mike Dias > > Elasticsearch 5.x was released: > https://www.elastic.co/blog/elasticsearch-5-0-0-released -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2767: [FLINK-4988] Elasticsearch 5.x support
Github user mikedias commented on the issue: https://github.com/apache/flink/pull/2767 @tzulitai sure, no problem! :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3555) Web interface does not render job information properly
[ https://issues.apache.org/jira/browse/FLINK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812183#comment-15812183 ] ASF GitHub Bot commented on FLINK-3555: --- Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/2941 Imo, the best way to achieve the change equivalent to the change to vendor.css file would be to add a new class, say, .panel-body-flowable in index.styl which defines the overflow rule, and add this class to the elements wherever needed. > Web interface does not render job information properly > -- > > Key: FLINK-3555 > URL: https://issues.apache.org/jira/browse/FLINK-3555 > Project: Flink > Issue Type: Bug > Components: Webfrontend >Affects Versions: 1.0.0 >Reporter: Till Rohrmann >Assignee: Sergey_Sokur >Priority: Minor > Attachments: Chrome.png, Safari.png > > > In Chrome and Safari, the different tabs of the detailed job view are not > properly rendered. The text goes beyond the surrounding box. I would guess > that this is some kind of css issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2941: [FLINK-3555] Web interface does not render job informatio...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/2941 Imo, the best way to achieve the change equivalent to the change to vendor.css file would be to add a new class, say, .panel-body-flowable in index.styl which defines the overflow rule, and add this class to the elements wherever needed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3039: [FLINK-5280] Update TableSource to support nested ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95178608 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/CsvTableSource.scala --- @@ -53,7 +53,9 @@ class CsvTableSource( ignoreFirstLine: Boolean = false, ignoreComments: String = null, lenient: Boolean = false) - extends AbstractBatchStreamTableSource[Row] + extends BatchTableSource[Row] + with StreamTableSource[Row] + with DefinedFieldNames --- End diff -- If we define `returnType` as `new RowTypeInfo(fieldTypes, fieldNames)`, we do not need to implement `DefinedFieldNames`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data
[ https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812080#comment-15812080 ] ASF GitHub Bot commented on FLINK-5280: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3039#discussion_r95178608 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/CsvTableSource.scala --- @@ -53,7 +53,9 @@ class CsvTableSource( ignoreFirstLine: Boolean = false, ignoreComments: String = null, lenient: Boolean = false) - extends AbstractBatchStreamTableSource[Row] + extends BatchTableSource[Row] + with StreamTableSource[Row] + with DefinedFieldNames --- End diff -- If we define `returnType` as `new RowTypeInfo(fieldTypes, fieldNames)`, we do not need to implement `DefinedFieldNames`. > Extend TableSource to support nested data > - > > Key: FLINK-5280 > URL: https://issues.apache.org/jira/browse/FLINK-5280 > Project: Flink > Issue Type: Improvement > Components: Table API & SQL >Affects Versions: 1.2.0 >Reporter: Fabian Hueske >Assignee: Ivan Mushketyk > > The {{TableSource}} interface does currently only support the definition of > flat rows. > However, there are several storage formats for nested data that should be > supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can > also natively handle nested rows. > The {{TableSource}} interface and the code to register table sources in > Calcite's schema need to be extended to support nested data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
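To illustrate the suggestion above: when the return type is constructed with explicit field names, those names travel with the type information, so a separate `DefinedFieldNames` implementation becomes redundant. The sketch below is a hedged Java rendering of that idea (the actual `CsvTableSource` is written in Scala); the field arrays are placeholders and the constructor used is the one referenced in the comment.
{code}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.types.Row;

// Hedged sketch of the reviewer's suggestion; the field arrays are placeholders.
public final class ReturnTypeSketch {
    static TypeInformation<Row> returnType(TypeInformation<?>[] fieldTypes, String[] fieldNames) {
        // carrying the names inside the RowTypeInfo makes DefinedFieldNames unnecessary
        return new RowTypeInfo(fieldTypes, fieldNames);
    }
}
{code}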
[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system
[ https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812063#comment-15812063 ] ASF GitHub Bot commented on FLINK-5129: --- Github user NicoK commented on the issue: https://github.com/apache/flink/pull/3076 fixed a typo in the unit test that led to the tests passing although there was still something wrong, which is now fixed as well > make the BlobServer use a distributed file system > - > > Key: FLINK-5129 > URL: https://issues.apache.org/jira/browse/FLINK-5129 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > > Currently, the BlobServer uses local storage and, in addition, when the HA > mode is set, a distributed file system, e.g. hdfs. This, however, is only > used by the JobManager and all TaskManager instances request blobs from the > JobManager. By using the distributed file system there as well, we would > lower the load on the JobManager and increase scalability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #3084: [FLINK-5129] make the BlobServer use a distributed...
GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3084 [FLINK-5129] make the BlobServer use a distributed file system Make the BlobCache use the BlobServer's distributed file system in HA mode: previously even in HA mode and if the cache has access to the file system, it would download BLOBs from one central BlobServer. By using the distributed file system beneath we may leverage its scalability and remove a single point of (performance) failure. If the distributed file system is not accessible at the blob caches, the old behaviour is used. @uce can you have a look? (this is an updated and fixed version of https://github.com/apache/flink/pull/3076) You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5129a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3084.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3084 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. 
commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly commit b3bcf944df87f37cccd831e8fb56b95caa620dad Author: Nico Kruber Date: 2017-01-09T13:41:59Z [FLINK-5129] let FileSystemBlobStore#get() remove the target file on failure If the copy fails, an IOException was thrown but the target file remained and was (most likely) not finished. This cleans up the file in that case so that code above, e.g. BlobServer and BlobCache, can rely on a file being complete as long as it exists.
[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system
[ https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812074#comment-15812074 ] ASF GitHub Bot commented on FLINK-5129: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/3084 [FLINK-5129] make the BlobServer use a distributed file system Make the BlobCache use the BlobServer's distributed file system in HA mode: previously even in HA mode and if the cache has access to the file system, it would download BLOBs from one central BlobServer. By using the distributed file system beneath we may leverage its scalability and remove a single point of (performance) failure. If the distributed file system is not accessible at the blob caches, the old behaviour is used. @uce can you have a look? (this is an updated and fixed version of https://github.com/apache/flink/pull/3076) You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink FLINK-5129a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3084.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3084 commit 464f2c834688507c67acb3ad584827132ebe444e Author: Nico Kruber Date: 2016-11-22T11:49:03Z [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath This was actually the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File) and either of the two could have been removed but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore. commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea Author: Nico Kruber Date: 2016-11-23T15:11:35Z [hotfix] do not create intermediate strings inside String.format in BlobUtils commit 36ab6121e336f63138e442ea48a751ede7fb04c3 Author: Nico Kruber Date: 2016-11-24T16:11:19Z [hotfix] properly shut down the BlobServer in BlobServerRangeTest commit c8c12c67ae875ca5c96db78375bef880cf2a3c59 Author: Nico Kruber Date: 2017-01-05T17:06:01Z [hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too This makes cleaning up simpler. commit a078cb0c26071fe70e3668d23d0c8bef8550892f Author: Nico Kruber Date: 2017-01-05T17:27:00Z [hotfix] add a missing "'" to the BlobStore class commit a643f0b989c640a81b112ad14ae27a2a2b1ab257 Author: Nico Kruber Date: 2017-01-05T17:07:13Z [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs This applies to the ZookeeperHaServices implementation. 
commit 7d832919040059961940fc96d0cdb285bc9f77d3 Author: Nico Kruber Date: 2017-01-05T17:18:10Z [FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices (this was introduced by c64860677f) commit 19879a01b99c4772a09627eb5f380f794f6c1e27 Author: Nico Kruber Date: 2016-11-30T13:52:12Z [hotfix] add some more documentation in BlobStore-related classes commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8 Author: Nico Kruber Date: 2017-01-06T10:55:23Z [hotfix] minor code beautifications when checking parameters + also check the blobService parameter in BlobLibraryCacheManager commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88 Author: Nico Kruber Date: 2017-01-06T13:21:42Z [FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException commit c8e2815787338f52e5ad369bcaedb1798284dd29 Author: Nico Kruber Date: 2017-01-06T13:59:51Z [hotfix] simplify code in BlobCache#deleteGlobal() Also, re-order the code so that a local delete is always tried before creating a connection to the BlobServer. If that fails, the local file is deleted at least. commit 5cd1c20aa604a9556c069ab78d8e471fa058499e Author: Nico Kruber Date: 2016-11-29T17:11:06Z [hotfix] re-use some code in BlobServerDeleteTest commit d39948a6baa0cd6f68c4dfd8daffdd65e573fbca Author: Nico Kruber Date: 2016-11-30T13:35:38Z [hotfix] improve some failure messages in the BlobService's HA unit tests commit dc87ae36088cc48a4122351ebe5b09a31d7fba41 Author: Nico Kruber Date: 2017-01-06T14:06:30Z [FLINK-5129] make the BlobCache also use a distributed file system in HA mode If available (in HA mode), download the jar files from the distributed file system directly instead of querying the BlobServer. This way the load is more distributed among the nodes of the file system (depending on its implementation of course) compared to putting all the burden on a single BlobServer. commit 389eaa9779d4bf22cc3972208d4f35ac7a966f5c Author: Nico Kruber Date: 2017-01-06T16:21:05Z [FLINK-5129] add unit tests for the BlobCache accessing the distributed FS directly
[jira] [Created] (FLINK-5430) Bring documentation up to speed for current feature set
Stephan Ewen created FLINK-5430: --- Summary: Bring documentation up to speed for current feature set Key: FLINK-5430 URL: https://issues.apache.org/jira/browse/FLINK-5430 Project: Flink Issue Type: Improvement Components: Documentation Reporter: Stephan Ewen Fix For: 1.2.0, 1.3.0 The documentation is still lacking in many parts that were developed or changed since the 1.1 release. I would like to fix that for {{master}} and {{release-1.2}}. This is a parent issue, let's create sub issues for each respective component and feature that we want to document better. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3076: [FLINK-5129] make the BlobServer use a distributed file s...
Github user NicoK commented on the issue: https://github.com/apache/flink/pull/3076 fixed a typo in the unit test that led to the tests passing although there was still something wrong, which is now fixed as well --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812046#comment-15812046 ] ASF GitHub Bot commented on FLINK-4410: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 +1 to merge > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 +1 to merge --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Closed] (FLINK-5066) add more efficient isEvent check to EventSerializer
[ https://issues.apache.org/jira/browse/FLINK-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi closed FLINK-5066. -- Resolution: Fixed Fix Version/s: 1.3.0 Implemented in 9457465 (master). > add more efficient isEvent check to EventSerializer > --- > > Key: FLINK-5066 > URL: https://issues.apache.org/jira/browse/FLINK-5066 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > Fix For: 1.3.0 > > > -LocalInputChannel#getNextBuffer de-serialises all incoming events on the > lookout for an EndOfPartitionEvent.- > Some buffer code de-serialises all incoming events on the lookout for an > EndOfPartitionEvent > (now applies to PartitionRequestQueue#isEndOfPartitionEvent()). > Instead, if EventSerializer offered a function to check for an event type > only without de-serialising the whole event, we could save some resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
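The idea behind such an isEvent check can be sketched roughly as follows: the serialized form of an event begins with a type tag, so the event class can be recognized by peeking at the leading bytes instead of reconstructing the whole object. The method name and tag layout below are assumptions for illustration only, not Flink's actual EventSerializer API.
{code}
import java.nio.ByteBuffer;

// Illustrative sketch only; the tag layout and method name are assumptions,
// not the actual EventSerializer implementation.
public final class EventPeekSketch {
    static boolean isEvent(ByteBuffer serializedEvent, int expectedTypeTag) {
        // peek at the leading type tag without consuming or deserializing the buffer
        return serializedEvent.remaining() >= Integer.BYTES
                && serializedEvent.getInt(serializedEvent.position()) == expectedTypeTag;
    }
}
{code}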
[jira] [Closed] (FLINK-5059) only serialise events once in RecordWriter#broadcastEvent
[ https://issues.apache.org/jira/browse/FLINK-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi closed FLINK-5059. -- Resolution: Implemented Fix Version/s: 1.3.0 Fixed in 9cff8c9 (master). > only serialise events once in RecordWriter#broadcastEvent > - > > Key: FLINK-5059 > URL: https://issues.apache.org/jira/browse/FLINK-5059 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > Fix For: 1.3.0 > > > Currently, > org.apache.flink.runtime.io.network.api.writer.RecordWriter#broadcastEvent > serialises the event once per target channel. Instead, it could serialise the > event only once and use the serialised form for every channel and thus save > resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
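A rough sketch of the optimization described above: serialize the event a single time and hand every target channel an independent read-only view of the same bytes. The class and method names are illustrative, not the actual RecordWriter code.
{code}
import java.nio.ByteBuffer;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only; names are not Flink's actual RecordWriter API.
public final class BroadcastOnceSketch {
    static void broadcast(byte[] serializedEvent, List<Consumer<ByteBuffer>> channels) {
        // wrap the already-serialized event once ...
        ByteBuffer shared = ByteBuffer.wrap(serializedEvent).asReadOnlyBuffer();
        for (Consumer<ByteBuffer> channel : channels) {
            // ... and give every channel its own view of the same bytes
            // instead of re-serializing the event per channel
            channel.accept(shared.duplicate());
        }
    }
}
{code}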
[jira] [Commented] (FLINK-5059) only serialise events once in RecordWriter#broadcastEvent
[ https://issues.apache.org/jira/browse/FLINK-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812039#comment-15812039 ] ASF GitHub Bot commented on FLINK-5059: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2805 > only serialise events once in RecordWriter#broadcastEvent > - > > Key: FLINK-5059 > URL: https://issues.apache.org/jira/browse/FLINK-5059 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > Fix For: 1.3.0 > > > Currently, > org.apache.flink.runtime.io.network.api.writer.RecordWriter#broadcastEvent > serialises the event once per target channel. Instead, it could serialise the > event only once and use the serialised form for every channel and thus save > resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5066) add more efficient isEvent check to EventSerializer
[ https://issues.apache.org/jira/browse/FLINK-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812040#comment-15812040 ] ASF GitHub Bot commented on FLINK-5066: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2806 > add more efficient isEvent check to EventSerializer > --- > > Key: FLINK-5066 > URL: https://issues.apache.org/jira/browse/FLINK-5066 > Project: Flink > Issue Type: Improvement > Components: Network >Reporter: Nico Kruber >Assignee: Nico Kruber > > -LocalInputChannel#getNextBuffer de-serialises all incoming events on the > lookout for an EndOfPartitionEvent.- > Some buffer code de-serialises all incoming events on the lookout for an > EndOfPartitionEvent > (now applies to PartitionRequestQueue#isEndOfPartitionEvent()). > Instead, if EventSerializer offered a function to check for an event type > only without de-serialising the whole event, we could save some resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #2829: [hotfix] prevent RecordWriter#flush() to clear the...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2829 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2806: [FLINK-5066] add more efficient isEvent check to E...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2806 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2805: [FLINK-5059] only serialise events once in RecordW...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2805 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3083: [FLINK-2662] [optimizer] Fix translation of broadc...
GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/3083 [FLINK-2662] [optimizer] Fix translation of broadcasted unions. Fix optimizer to support union with > 2 broadcasted inputs. 1. `BinaryUnionDescriptor` is changed to compute fully replicated global property if both inputs are fully replicated. 2. Enumeration of plans for binary union is changed to enforce local forward connection if input is fully replicated. 3. Add a test to check that union with three inputs can be on broadcasted side of join (fully replicated property is requested and pushed to all inputs of the union). You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink unionBug3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3083.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3083 commit a0019d386e09f727874e54b7c8ce2ca94b26f2e5 Author: Fabian Hueske Date: 2017-01-05T23:00:30Z [FLINK-2662] [optimizer] Fix translation of broadcasted unions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2662) CompilerException: "Bug: Plan generation for Unions picked a ship strategy between binary plan operators."
[ https://issues.apache.org/jira/browse/FLINK-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812018#comment-15812018 ] ASF GitHub Bot commented on FLINK-2662: --- GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/3083 [FLINK-2662] [optimizer] Fix translation of broadcasted unions. Fix optimizer to support union with > 2 broadcasted inputs. 1. `BinaryUnionDescriptor` is changed to compute fully replicated global property if both inputs are fully replicated. 2. Enumeration of plans for binary union is changed to enforce local forward connection if input is fully replicated. 3. Add a test to check that union with three inputs can be on broadcasted side of join (fully replicated property is requested and pushed to all inputs of the union). You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink unionBug3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3083.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3083 commit a0019d386e09f727874e54b7c8ce2ca94b26f2e5 Author: Fabian Hueske Date: 2017-01-05T23:00:30Z [FLINK-2662] [optimizer] Fix translation of broadcasted unions. > CompilerException: "Bug: Plan generation for Unions picked a ship strategy > between binary plan operators." > -- > > Key: FLINK-2662 > URL: https://issues.apache.org/jira/browse/FLINK-2662 > Project: Flink > Issue Type: Bug > Components: Optimizer >Affects Versions: 0.9.1, 0.10.0 >Reporter: Gabor Gevay >Assignee: Fabian Hueske >Priority: Critical > Fix For: 1.0.0, 1.2.0, 1.3.0, 1.1.5 > > Attachments: Bug.java, FlinkBug.scala > > > I have a Flink program which throws the exception in the jira title. Full > text: > Exception in thread "main" org.apache.flink.optimizer.CompilerException: Bug: > Plan generation for Unions picked a ship strategy between binary plan > operators. 
> at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.collect(BinaryUnionReplacer.java:113) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:72) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:41) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:170) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163) > at > org.apache.flink.optimizer.plan.WorksetIterationPlanNode.acceptForStepFunction(WorksetIterationPlanNode.java:194) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:49) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:41) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:162) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:127) > at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:520) > at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:402) > at > org.apache.flink.client.LocalExecutor.getOptimizerPlanAsJSON(LocalExecutor.java:202) > at > org.apache.flink.api.java.LocalEnvironment.getExecutionPlan(LocalEnvironment.java:63) > at malom.Solver.main(Solver.java:66) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) > The execution plan: > http://compalg.inf.elte.hu/~ggevay/flink/plan_3_4_0_0_without_verif.txt > (I obtained this by commenting out the line that throws the exception) > The code is here: > https://github.com/ggevay/flink/tree/plan-generation-bug > The class to run is "Solver". It needs a command line argument, which
[GitHub] flink issue #2941: [FLINK-3555] Web interface does not render job informatio...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2941 Sorry, I have to take a step back here: The changes you made were directly to the resulting CSS files. These files are generated by the web build scripts, so the changes have to be made to the original files. Otherwise they will be overwritten the next time somebody re-builds the web UI. See here: https://github.com/apache/flink/blob/master/flink-runtime-web/README.md The change to `flink-runtime-web/web-dashboard/web/css/index.css` is easy to put into `index.styl` instead. However, the change in `flink-runtime-web/web-dashboard/web/css/vendor.css` overwrites a style that is pulled in as a dependency. Not sure how to handle this correctly. @uce @sachingoel0101 or @iampeter can probably help here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3555) Web interface does not render job information properly
[ https://issues.apache.org/jira/browse/FLINK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811940#comment-15811940 ] ASF GitHub Bot commented on FLINK-3555: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2941 Sorry, I have to take a step back here: The changes you made were directly to the resulting CSS files. These files are generated by the web build scripts, so the changes have to be made to the original files. Otherwise they will be overwritten the next time somebody re-builds the web UI. See here: https://github.com/apache/flink/blob/master/flink-runtime-web/README.md The change to `flink-runtime-web/web-dashboard/web/css/index.css` is easy to put into `index.styl` instead. However, the change in `flink-runtime-web/web-dashboard/web/css/vendor.css` overwrites a style that is pulled in as a dependency. Not sure how to handle this correctly. @uce @sachingoel0101 or @iampeter can probably help here. > Web interface does not render job information properly > -- > > Key: FLINK-3555 > URL: https://issues.apache.org/jira/browse/FLINK-3555 > Project: Flink > Issue Type: Bug > Components: Webfrontend >Affects Versions: 1.0.0 >Reporter: Till Rohrmann >Assignee: Sergey_Sokur >Priority: Minor > Attachments: Chrome.png, Safari.png > > > In Chrome and Safari, the different tabs of the detailed job view are not > properly rendered. The text goes beyond the surrounding box. I would guess > that this is some kind of css issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811918#comment-15811918 ] ASF GitHub Bot commented on FLINK-4410: --- Github user uce commented on the issue: https://github.com/apache/flink/pull/3042 Thanks for checking it out Robert. Would love to merge it for 1.2 as well. I fixed the rounding issue. > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user uce commented on the issue: https://github.com/apache/flink/pull/3042 Thanks for checking it out Robert. Would love to merge it for 1.2 as well. I fixed the rounding issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5420) Make the CEP library rescalable.
[ https://issues.apache.org/jira/browse/FLINK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811887#comment-15811887 ] Till Rohrmann commented on FLINK-5420: -- Adding a description containing some more details about the problem could be helpful. > Make the CEP library rescalable. > > > Key: FLINK-5420 > URL: https://issues.apache.org/jira/browse/FLINK-5420 > Project: Flink > Issue Type: Bug > Components: CEP >Affects Versions: 1.2.0 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > Fix For: 1.2.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-5423) Implement Stochastic Outlier Selection
[ https://issues.apache.org/jira/browse/FLINK-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-5423: - Assignee: Fokko Driesprong > Implement Stochastic Outlier Selection > -- > > Key: FLINK-5423 > URL: https://issues.apache.org/jira/browse/FLINK-5423 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong > > I've implemented the Stochastic Outlier Selection (SOS) algorithm by Jeroen > Janssens. > http://jeroenjanssens.com/2013/11/24/stochastic-outlier-selection.html > Integrated as much as possible with the components from the machine learning > library. > The algorithm itself has been compared to four other algorithms and it > shows that SOS has a higher performance on most of these real-world datasets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5426) Clean up the Flink Machine Learning library
[ https://issues.apache.org/jira/browse/FLINK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811864#comment-15811864 ] ASF GitHub Bot commented on FLINK-5426: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3081 Thanks for your contribution @Fokko. I'll take a look at your PR in the next days. > Clean up the Flink Machine Learning library > --- > > Key: FLINK-5426 > URL: https://issues.apache.org/jira/browse/FLINK-5426 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong > > Hi Guys, > I would like to clean up the Machine Learning library. A lot of the code in > the ML Library does not conform to the original contribution guide. For > example: > Duplicate tests, different names, but exactly the same testcase: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L148 > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L164 > Lots of multi-line test cases: > https://github.com/Fokko/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala > Mis-use of constants: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseMatrix.scala#L58 > Please allow me to clean this up, and I'm looking forward to contributing more > code, especially to the ML part. I have been a contributor to Apache Spark > and am happy to extend the codebase with new distributed algorithms and make > the codebase more mature. > Cheers, Fokko -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-5423) Implement Stochastic Outlier Selection
[ https://issues.apache.org/jira/browse/FLINK-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811868#comment-15811868 ] ASF GitHub Bot commented on FLINK-5423: --- Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3077 Thanks for your contribution @Fokko. I'll take a look at this PR in the next days :-) > Implement Stochastic Outlier Selection > -- > > Key: FLINK-5423 > URL: https://issues.apache.org/jira/browse/FLINK-5423 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong > > I've implemented the Stochastic Outlier Selection (SOS) algorithm by Jeroen > Janssens. > http://jeroenjanssens.com/2013/11/24/stochastic-outlier-selection.html > Integrated as much as possible with the components from the machine learning > library. > The algorithm itself has been compared to four other algorithms and it > shows that SOS has a higher performance on most of these real-world datasets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3077: [FLINK-5423] Implement Stochastic Outlier Selection
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3077 Thanks for your contribution @Fokko. I'll take a look at this PR in the next days :-) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3081: [FLINK-5426] Clean up the Flink Machine Learning library
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/3081 Thanks for your contribution @Fokko. I'll take a look at your PR in the next days. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (FLINK-5426) Clean up the Flink Machine Learning library
[ https://issues.apache.org/jira/browse/FLINK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-5426: - Assignee: Fokko Driesprong > Clean up the Flink Machine Learning library > --- > > Key: FLINK-5426 > URL: https://issues.apache.org/jira/browse/FLINK-5426 > Project: Flink > Issue Type: Improvement > Components: Machine Learning Library >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong > > Hi Guys, > I would like to clean up the Machine Learning library. A lot of the code in > the ML Library does not conform to the original contribution guide. For > example: > Duplicate tests, different names, but exactly the same testcase: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L148 > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L164 > Lots of multi-line test cases: > https://github.com/Fokko/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala > Mis-use of constants: > https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseMatrix.scala#L58 > Please allow me to clean this up, and I'm looking forward to contributing more > code, especially to the ML part. I have been a contributor to Apache Spark > and am happy to extend the codebase with new distributed algorithms and make > the codebase more mature. > Cheers, Fokko -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5429) Code generate types between operators in Table API
Timo Walther created FLINK-5429: --- Summary: Code generate types between operators in Table API Key: FLINK-5429 URL: https://issues.apache.org/jira/browse/FLINK-5429 Project: Flink Issue Type: New Feature Components: Table API & SQL Reporter: Timo Walther Currently, the Table API uses the generic Row type for shipping records between operators in underlying DataSet and DataStream API. For efficiency reasons we should code generate those records. The final design is up for discussion but here are some ideas: A row like {{(a: INT NULL, b: INT NOT NULL, c: STRING)}} could look like {code} final class GeneratedRow$123 { public boolean a_isNull; public int a; public int b; public String c; } {code} Types could be generated using Janino in the pre-flight phase. The generated types should use primitive types wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
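As a rough illustration of the Janino route mentioned in the issue, the snippet below compiles the example row class in memory, as could happen in a pre-flight step. The class body mirrors the example above; the compile step is a generic Janino usage sketch, not Flink's actual code generator.
{code}
import org.codehaus.janino.SimpleCompiler;

// Generic Janino usage sketch; not Flink's actual code-generation pipeline.
public final class GeneratedRowCompileSketch {
    static Class<?> compileGeneratedRow() throws Exception {
        String source =
            "public final class GeneratedRow$123 {\n" +
            "  public boolean a_isNull;\n" +
            "  public int a;\n" +
            "  public int b;\n" +
            "  public String c;\n" +
            "}\n";
        SimpleCompiler compiler = new SimpleCompiler();
        compiler.cook(source); // compile the generated source in memory
        return compiler.getClassLoader().loadClass("GeneratedRow$123");
    }
}
{code}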
[jira] [Commented] (FLINK-4391) Provide support for asynchronous operations over streams
[ https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811848#comment-15811848 ] Till Rohrmann commented on FLINK-4391: -- The plan is to document this feature: https://issues.apache.org/jira/browse/FLINK-5371. > Provide support for asynchronous operations over streams > > > Key: FLINK-4391 > URL: https://issues.apache.org/jira/browse/FLINK-4391 > Project: Flink > Issue Type: New Feature > Components: DataStream API >Reporter: Jamie Grier >Assignee: david.wang > Fix For: 1.2.0 > > > Many Flink users need to do asynchronous processing driven by data from a > DataStream. The classic example would be joining against an external > database in order to enrich a stream with extra information. > It would be nice to add general support for this type of operation in the > Flink API. Ideally this could simply take the form of a new operator that > manages async operations, keeps so many of them in flight, and then emits > results to downstream operators as the async operations complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811826#comment-15811826 ] ASF GitHub Bot commented on FLINK-4410: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 Very nice change! I would love to merge it to 1.2 as well, it's so helpful! One very minor thing: I would suggest rounding the percentages shown for the completion. I had an instance where it was showing a progress of 51.1%. Here's a screenshot from my testing job: ![image](https://cloud.githubusercontent.com/assets/89049/21768632/e7800fcc-d67a-11e6-8961-6063f4faa138.png) > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3042 Very nice change! I would love to merge it to 1.2 as well, it's so helpful! One very minor thing: I would suggest rounding the percentages shown for the completion. I had an instance where it was showing a progress of 51.1%. Here's a screenshot from my testing job: ![image](https://cloud.githubusercontent.com/assets/89049/21768632/e7800fcc-d67a-11e6-8961-6063f4faa138.png) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5360) Fix arguments names in WindowedStream
[ https://issues.apache.org/jira/browse/FLINK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811814#comment-15811814 ] ASF GitHub Bot commented on FLINK-5360: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 I'm running a last check and then I'm merging. Thanks for reviewing @xhumanoid! And thanks for fixing it @mushketyk. 👍 > Fix arguments names in WindowedStream > - > > Key: FLINK-5360 > URL: https://issues.apache.org/jira/browse/FLINK-5360 > Project: Flink > Issue Type: Bug >Reporter: Ivan Mushketyk >Assignee: Ivan Mushketyk > > Should be "field" instead of "positionToMaxBy" in some methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3022: [FLINK-5360] Fix argument names in WindowedStream
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3022 I'm running a last check and then I'm merging. Thanks for reviewing @xhumanoid! And thanks for fixing it @mushketyk. 👍 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811806#comment-15811806 ] ASF GitHub Bot commented on FLINK-4410: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3042 I think a triggering delay would be very nice and helpful. We can do this as a next step, separate from this pull request. > Report more information about operator checkpoints > -- > > Key: FLINK-4410 > URL: https://issues.apache.org/jira/browse/FLINK-4410 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing, Webfrontend >Affects Versions: 1.1.2 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > Fix For: 1.2.0 > > > Checkpoint statistics contain the duration of a checkpoint, measured as from > the CheckpointCoordinator's start to the point when the acknowledge message > came. > We should additionally expose > - duration of the synchronous part of a checkpoint > - duration of the asynchronous part of a checkpoint > - number of bytes buffered during the stream alignment phase > - duration of the stream alignment phase > Note: In the case of using *at-least once* semantics, the latter two will > always be zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #3042: [FLINK-4410] Expose more fine grained checkpoint statisti...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3042 I think a triggering delay would be very nice and helpful. We can do this as a next step, separate from this pull request. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---