[jira] [Updated] (FLINK-7611) add metrics to measure the data drop by watermark

2017-09-11 Thread aitozi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

aitozi updated FLINK-7611:
--
Affects Version/s: 1.3.0

> add metrics to measure the data drop by watermark
> -
>
> Key: FLINK-7611
> URL: https://issues.apache.org/jira/browse/FLINK-7611
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.2.0, 1.3.0
>Reporter: aitozi
>Priority: Minor
>  Labels: patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When using the window operator with event time, data that arrives after 
> window.endtime + allowedLateness is dropped, but there is no existing metric 
> to measure the number of dropped records. Such a metric would help users set 
> a correct allowedLateness.
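
For illustration, here is a minimal sketch of such a metric, written as a
user-level ProcessFunction rather than as the window-operator change the issue
asks for; the metric name "numLateElementsDropped" and the surrounding types
are assumptions, not existing Flink API:

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Counts elements that arrive behind the current watermark, i.e. elements a
// downstream event-time window would drop once allowedLateness is exceeded.
public class LateDataCounter extends ProcessFunction<Long, Long> {

    private transient Counter lateElementsDropped;

    @Override
    public void open(Configuration parameters) {
        // illustrative metric name, not an existing Flink metric
        lateElementsDropped = getRuntimeContext()
                .getMetricGroup()
                .counter("numLateElementsDropped");
    }

    @Override
    public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
        if (ctx.timestamp() != null
                && ctx.timestamp() < ctx.timerService().currentWatermark()) {
            lateElementsDropped.inc(); // too late: count it instead of forwarding
        } else {
            out.collect(value);
        }
    }
}
{code}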



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-2395) Check Scala catch blocks for Throwable

2017-09-11 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162401#comment-16162401
 ] 

mingleizhang commented on FLINK-2395:
-

The link is broken; it returns a 404 error.

> Check Scala catch blocks for Throwable
> --
>
> Key: FLINK-2395
> URL: https://issues.apache.org/jira/browse/FLINK-2395
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Scala API
>Reporter: Till Rohrmann
>Priority: Minor
>
> As described in [1], it's not good practice to catch {{Throwable}} in 
> Scala catch blocks because Scala uses special exceptions for control 
> flow. Therefore, we should check whether we can restrict ourselves to 
> catching only subtypes of {{Throwable}}, such as {{Exception}}, instead.
> [1] 
> https://www.sumologic.com/2014/05/05/why-you-should-never-catch-throwable-in-scala/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7611) add metrics to measure the data drop by watermark

2017-09-11 Thread aitozi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

aitozi updated FLINK-7611:
--
Affects Version/s: 1.2.0

> add metrics to measure the data drop by watermark
> -
>
> Key: FLINK-7611
> URL: https://issues.apache.org/jira/browse/FLINK-7611
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.2.0
>Reporter: aitozi
>Priority: Minor
>  Labels: patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When using the window operator with event time, data that arrives after 
> window.endtime + allowedLateness is dropped, but there is no existing metric 
> to measure the number of dropped records. Such a metric would help users set 
> a correct allowedLateness.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-2428) Clean up unused properties in StreamConfig

2017-09-11 Thread Hai Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou closed FLINK-2428.
---
Resolution: Fixed

> Clean up unused properties in StreamConfig
> --
>
> Key: FLINK-2428
> URL: https://issues.apache.org/jira/browse/FLINK-2428
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Priority: Minor
>
> There is a multitude of unused properties in the {{StreamConfig}}, which 
> should be removed, if no longer relevant.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-3542) FlinkKafkaConsumer09 cannot handle changing number of partitions

2017-09-11 Thread Hai Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162054#comment-16162054
 ] 

Hai Zhou commented on FLINK-3542:
-

[~till.rohrmann]
In Flink 1.3.x, does FlinkKafkaConsumer08 also have this problem?


> FlinkKafkaConsumer09 cannot handle changing number of partitions
> 
>
> Key: FLINK-3542
> URL: https://issues.apache.org/jira/browse/FLINK-3542
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.0.0
>Reporter: Till Rohrmann
>Priority: Minor
>
> The current {{FlinkKafkaConsumer09}} cannot handle increasing the number of 
> partitions of a topic while running. The consumer will simply leave the newly 
> created partitions out and thus miss all data which is written to the new 
> partitions. The reason seems to be a static assignment of partitions to 
> consumer tasks when the job is started.
> We should either fix this behaviour or clearly document it in the online and 
> code docs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-6703) Document how to take a savepoint on YARN

2017-09-11 Thread Bowen Li (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161996#comment-16161996
 ] 

Bowen Li edited comment on FLINK-6703 at 9/11/17 9:09 PM:
--

I tested on our Flink-YARN cluster, and {{-m yarn-cluster}} is actually not 
required. So I'll probably just add a code snippet like:

{{./bin/flink savepoint <jobId> -yid|-yarnapplicationId <applicationId>}}

[~Zentol] What do you think?


was (Author: phoenixjiangnan):
I tested on our Flink-YARN cluster, and {{-m yarn-cluster}} is actually not 
required. So I'll probably just add a code snippet like:

{{./bin/flink savepoint <jobId> -yid|-yarnapplicationId <applicationId>}}

> Document how to take a savepoint on YARN
> 
>
> Key: FLINK-6703
> URL: https://issues.apache.org/jira/browse/FLINK-6703
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, YARN
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
>
> The documentation should have a separate entry for savepoint related CLI 
> commands in combination with YARN. It is currently not documented that you 
> have to supply the application id, nor how you can pass it.
> {code}
> ./bin/flink savepoint <jobId> -m yarn-cluster (-yid|-yarnapplicationId) 
> <applicationId>
> {code}
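
For illustration, a concrete invocation could look like the following; the job
ID and YARN application ID below are made up:

{code}
./bin/flink savepoint a5f7aa57ec5e394f7b6753b4e391f1e4 -yid application_1504649922393_0010
{code}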



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6703) Document how to take a savepoint on YARN

2017-09-11 Thread Bowen Li (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161996#comment-16161996
 ] 

Bowen Li commented on FLINK-6703:
-

I tested on our Flink-YARN cluster, and {{-m yarn-cluster}} is actually not 
required. So I'll probably just add a code snippet like:

{{./bin/flink savepoint <jobId> -yid|-yarnapplicationId <applicationId>}}

> Document how to take a savepoint on YARN
> 
>
> Key: FLINK-6703
> URL: https://issues.apache.org/jira/browse/FLINK-6703
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, YARN
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
>
> The documentation should have a separate entry for savepoint related CLI 
> commands in combination with YARN. It is currently not documented that you 
> have to supply the application id, nor how you can pass it.
> {code}
> ./bin/flink savepoint <jobId> -m yarn-cluster (-yid|-yarnapplicationId) 
> <applicationId>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (FLINK-6703) Document how to take a savepoint on YARN

2017-09-11 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-6703:

Comment: was deleted

(was: [~Zentol] is this still valid? Is the above savepoint command for a 
Flink job running in yarn-session mode?

One thing I found missing in the documentation is the following command example:

{code:java}
$ bin/flink savepoint :jobId [-m jobManagerAddress:jobManagerPort] 
[:targetDirectory]
{code}

I've been using the above command to take savepoints in YARN all the time)

> Document how to take a savepoint on YARN
> 
>
> Key: FLINK-6703
> URL: https://issues.apache.org/jira/browse/FLINK-6703
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, YARN
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
>
> The documentation should have a separate entry for savepoint related CLI 
> commands in combination with YARN. It is currently not documented that you 
> have to supply the application id, nor how you can pass it.
> {code}
> ./bin/flink savepoint <jobId> -m yarn-cluster (-yid|-yarnapplicationId) 
> <applicationId>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6957) WordCountTable example cannot be run

2017-09-11 Thread Bowen Li (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161843#comment-16161843
 ] 

Bowen Li commented on FLINK-6957:
-

[~Zentol] I'm able to run this example after building the full Flink 
source code. Let me know whether you can run it now.

> WordCountTable example cannot be run
> 
>
> Key: FLINK-6957
> URL: https://issues.apache.org/jira/browse/FLINK-6957
> Project: Flink
>  Issue Type: Bug
>  Components: Examples, Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
> Fix For: 1.4.0
>
>
> Running the example (with the fix for FLINK-6956 applied) gives the following 
> exception:
> {code}
> Table program cannot be compiled. This is a bug. Please file an issue.
> 
> org.apache.flink.table.codegen.Compiler$class.compile(Compiler.scala:36)
> org.apache.flink.table.runtime.MapRunner.compile(MapRunner.scala:28)
> org.apache.flink.table.runtime.MapRunner.open(MapRunner.scala:42)
> 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 
> org.apache.flink.api.common.operators.base.MapOperatorBase.executeOnCollections(MapOperatorBase.java:64)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:250)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:148)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:228)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:148)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:228)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:148)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:228)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:148)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:228)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:148)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:130)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:181)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:157)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:130)
> 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:114)
> 
> org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:35)
> 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
> org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
> 
> org.apache.flink.table.examples.java.WordCountTable.main(WordCountTable.java:58)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7609) WindowWordCount example doesn't print countWindow output with default configs

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161656#comment-16161656
 ] 

ASF GitHub Bot commented on FLINK-7609:
---

Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/4662
  
There's no output with 100/50 either, because the default text we use in 
WordCountData.WORDS is not very long and cannot fill count windows of that 
size. There are only 11 window outputs even with 10/5, which is still a 
fairly large size.

I prefer 4/2, but 10/5 is also ok, since the program is for demo 
purposes and new users would doubt the program's correctness if it doesn't 
generate enough output. What do you think? @zentol 




> WindowWordCount example doesn't print countWindow output with default configs
> -
>
> Key: FLINK-7609
> URL: https://issues.apache.org/jira/browse/FLINK-7609
> Project: Flink
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 1.4.0
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0
>
>
> When running the WindowWordCount example with no params (i.e. using the 
> default params), no output is generated or printed, because the default 
> 'window' and 'slide' values are too large (250 and 150).
> The solution is to lower the default 'window' and 'slide' values, probably 
> to 4 and 2.
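
For context, a minimal sketch of the count-window wiring in question; the
Tokenizer, windowSize, and slideSize names follow the word-count examples and
are assumptions here:

{code:java}
// With windowSize = 250 and slideSize = 150, the short default input never
// fills a window, so nothing is ever emitted.
DataStream<Tuple2<String, Integer>> counts = text
        .flatMap(new Tokenizer())
        .keyBy(0)
        .countWindow(windowSize, slideSize)
        .sum(1);
{code}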



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4662: [FLINK-7609][examples] WindowWordCount example doesn't pr...

2017-09-11 Thread bowenli86
Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/4662
  
There's no output with 100/50 either, because the default text we use in 
WordCountData.WORDS is not very long and cannot fill count windows of that 
size. There are only 11 window outputs even with 10/5, which is still a 
fairly large size.

I prefer 4/2, but 10/5 is also ok, since the program is for demo 
purposes and new users would doubt the program's correctness if it doesn't 
generate enough output. What do you think? @zentol 




---


[jira] [Commented] (FLINK-4047) Fix documentation about determinism of KeySelectors

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161616#comment-16161616
 ] 

ASF GitHub Bot commented on FLINK-4047:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4659
  
+1.


> Fix documentation about determinism of KeySelectors
> ---
>
> Key: FLINK-4047
> URL: https://issues.apache.org/jira/browse/FLINK-4047
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0, 1.0.3
>Reporter: Fabian Hueske
>Priority: Blocker
>  Labels: starter
> Fix For: 1.4.0
>
>
> KeySelectors must return deterministic keys, i.e., if invoked multiple times 
> on the same object, the returned key must be the same.
> The documentation about this aspect is broken ("The key can be of any type 
> and be derived from arbitrary computations.").
> We need to fix the JavaDocs of the {{KeySelector}} interface and the web 
> documentation 
> (https://ci.apache.org/projects/flink/flink-docs-master/apis/common/index.html#specifying-keys).
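
For reference, a minimal example of a deterministic KeySelector; the tuple
type is only an illustration:

{code:java}
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;

// Deterministic: the key is derived purely from the element itself, so
// repeated invocations on the same object always return the same key.
public class WordKeySelector implements KeySelector<Tuple2<String, Integer>, String> {
    @Override
    public String getKey(Tuple2<String, Integer> value) {
        return value.f0;
    }
}
{code}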



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4659: [FLINK-4047] Fix documentation about determinism of KeySe...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4659
  
+1.


---


[jira] [Commented] (FLINK-7521) Remove the 10MB limit from the current REST implementation.

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161611#comment-16161611
 ] 

ASF GitHub Bot commented on FLINK-7521:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
The actual solution that I'm thinking of currently is to have a modified 
`HttpObjectAggregator` implementation that spills larger messages to disk.


> Remove the 10MB limit from the current REST implementation.
> ---
>
> Key: FLINK-7521
> URL: https://issues.apache.org/jira/browse/FLINK-7521
> Project: Flink
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.4.0
>Reporter: Kostas Kloudas
>Assignee: Fang Yong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> In the current {{AbstractRestServer}} we impose an upper bound of 10MB on 
> the states we can transfer. This is in the line {{.addLast(new 
> HttpObjectAggregator(1024 * 1024 * 10))}} of the server implementation. 
> This limit is restrictive for some of the use cases planned to use this 
> implementation (e.g. the job submission client, which has to send full jars, 
> or the queryable state client, which may have to receive states bigger than 
> that).
> This issue proposes the elimination of this limit.
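
For illustration, a minimal sketch of making the limit configurable instead of
hard-coded; the method and parameter names are assumptions, not the actual
Flink code:

{code:java}
import io.netty.channel.ChannelPipeline;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;

public class RestPipelineSketch {
    // Instead of .addLast(new HttpObjectAggregator(1024 * 1024 * 10)), take
    // the maximum content length from configuration.
    static void configurePipeline(ChannelPipeline pipeline, int maxContentLengthBytes) {
        pipeline.addLast(new HttpServerCodec());
        pipeline.addLast(new HttpObjectAggregator(maxContentLengthBytes));
    }
}
{code}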



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
The actual solution that I'm thinking of currently is to have a modified 
`HttpObjectAggregator` implementation that spills larger messages to disk.


---


[GitHub] flink pull request #4647: [FLINK-7575] [WEB-DASHBOARD] Display "Fetching..."...

2017-09-11 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4647#discussion_r138133547
  
--- Diff: 
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/utils/MutableIOMetrics.java
 ---
@@ -72,13 +72,54 @@ public void addIOMetrics(AccessExecution attempt, 
@Nullable MetricFetcher fetche
} else { // execAttempt is still running, use 
MetricQueryService instead
if (fetcher != null) {
fetcher.update();
-   MetricStore.SubtaskMetricStore metrics = 
fetcher.getMetricStore().getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
-   if (metrics != null) {
-   this.numBytesInLocal += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_LOCAL, "0"));
-   this.numBytesInRemote += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_REMOTE, "0"));
-   this.numBytesOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_OUT, "0"));
-   this.numRecordsIn += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_IN, "0"));
-   this.numRecordsOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_OUT, "0"));
+   MetricStore metricStore = 
fetcher.getMetricStore();
+   synchronized (metricStore) {
+   MetricStore.SubtaskMetricStore metrics 
= metricStore.getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
+   if (metrics != null) {
+   /**
+* We want to keep track of 
missing metrics to be able to make a difference between 0 as a value
+* and a missing value.
+* In case a metric is missing 
for a parallel instance of a task, we initialize if with -1 and
+* will be considered as 
incomplete
--- End diff --

missing period, and the sentence flow is a bit weird, how about "and 
consider it as incomplete."?


---


[jira] [Commented] (FLINK-7575) Dashboard jobs/tasks metrics display 0 when metrics are not yet available

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161608#comment-16161608
 ] 

ASF GitHub Bot commented on FLINK-7575:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4647#discussion_r138133369
  
--- Diff: 
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/utils/MutableIOMetrics.java
 ---
@@ -72,13 +72,54 @@ public void addIOMetrics(AccessExecution attempt, 
@Nullable MetricFetcher fetche
} else { // execAttempt is still running, use 
MetricQueryService instead
if (fetcher != null) {
fetcher.update();
-   MetricStore.SubtaskMetricStore metrics = 
fetcher.getMetricStore().getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
-   if (metrics != null) {
-   this.numBytesInLocal += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_LOCAL, "0"));
-   this.numBytesInRemote += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_REMOTE, "0"));
-   this.numBytesOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_OUT, "0"));
-   this.numRecordsIn += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_IN, "0"));
-   this.numRecordsOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_OUT, "0"));
+   MetricStore metricStore = 
fetcher.getMetricStore();
+   synchronized (metricStore) {
+   MetricStore.SubtaskMetricStore metrics 
= metricStore.getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
+   if (metrics != null) {
+   /**
+* We want to keep track of 
missing metrics to be able to make a difference between 0 as a value
+* and a missing value.
+* In case a metric is missing 
for a parallel instance of a task, we initialize if with -1 and
--- End diff --

typo: if -> it


> Dashboard jobs/tasks metrics display 0 when metrics are not yet available
> -
>
> Key: FLINK-7575
> URL: https://issues.apache.org/jira/browse/FLINK-7575
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.3.2
>Reporter: James Lafa
>Assignee: James Lafa
>Priority: Minor
>
> The web frontend currently displays "0" when a metric is not yet available 
> (e.g. records-in/out, bytes-in/out). 
> 0 is misleading; it's preferable to display no value while the value is 
> still unknown.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7575) Dashboard jobs/tasks metrics display 0 when metrics are not yet available

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161607#comment-16161607
 ] 

ASF GitHub Bot commented on FLINK-7575:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4647#discussion_r138133763
  
--- Diff: 
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/utils/MutableIOMetrics.java
 ---
@@ -99,11 +140,35 @@ public void addIOMetrics(AccessExecution attempt, 
@Nullable MetricFetcher fetche
 * @throws IOException
 */
public void writeIOMetricsAsJson(JsonGenerator gen) throws IOException {
+   /**
+* Ask describe in the addIOMetrics, we want to distinguish 
incomplete values from 0.
--- End diff --

typo: ask -> as
remove "the" before addIOMetrics
could use an ```@link``` annotation for the method


> Dashboard jobs/tasks metrics display 0 when metrics are not yet available
> -
>
> Key: FLINK-7575
> URL: https://issues.apache.org/jira/browse/FLINK-7575
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.3.2
>Reporter: James Lafa
>Assignee: James Lafa
>Priority: Minor
>
> The web frontend currently displays "0" when a metric is not yet available 
> (e.g. records-in/out, bytes-in/out). 
> 0 is misleading; it's preferable to display no value while the value is 
> still unknown.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7575) Dashboard jobs/tasks metrics display 0 when metrics are not yet available

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161609#comment-16161609
 ] 

ASF GitHub Bot commented on FLINK-7575:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4647#discussion_r138133547
  
--- Diff: 
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/utils/MutableIOMetrics.java
 ---
@@ -72,13 +72,54 @@ public void addIOMetrics(AccessExecution attempt, 
@Nullable MetricFetcher fetche
} else { // execAttempt is still running, use 
MetricQueryService instead
if (fetcher != null) {
fetcher.update();
-   MetricStore.SubtaskMetricStore metrics = 
fetcher.getMetricStore().getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
-   if (metrics != null) {
-   this.numBytesInLocal += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_LOCAL, "0"));
-   this.numBytesInRemote += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_REMOTE, "0"));
-   this.numBytesOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_OUT, "0"));
-   this.numRecordsIn += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_IN, "0"));
-   this.numRecordsOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_OUT, "0"));
+   MetricStore metricStore = 
fetcher.getMetricStore();
+   synchronized (metricStore) {
+   MetricStore.SubtaskMetricStore metrics 
= metricStore.getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
+   if (metrics != null) {
+   /**
+* We want to keep track of 
missing metrics to be able to make a difference between 0 as a value
+* and a missing value.
+* In case a metric is missing 
for a parallel instance of a task, we initialize if with -1 and
+* will be considered as 
incomplete
--- End diff --

missing period, and the sentence flow is a bit weird, how about "and 
consider it as incomplete."?


> Dashboard jobs/tasks metrics display 0 when metrics are not yet available
> -
>
> Key: FLINK-7575
> URL: https://issues.apache.org/jira/browse/FLINK-7575
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.3.2
>Reporter: James Lafa
>Assignee: James Lafa
>Priority: Minor
>
> The web frontend currently displays "0" when a metric is not yet available 
> (e.g. records-in/out, bytes-in/out). 
> 0 is misleading; it's preferable to display no value while the value is 
> still unknown.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4647: [FLINK-7575] [WEB-DASHBOARD] Display "Fetching..."...

2017-09-11 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4647#discussion_r138133369
  
--- Diff: 
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/utils/MutableIOMetrics.java
 ---
@@ -72,13 +72,54 @@ public void addIOMetrics(AccessExecution attempt, 
@Nullable MetricFetcher fetche
} else { // execAttempt is still running, use 
MetricQueryService instead
if (fetcher != null) {
fetcher.update();
-   MetricStore.SubtaskMetricStore metrics = 
fetcher.getMetricStore().getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
-   if (metrics != null) {
-   this.numBytesInLocal += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_LOCAL, "0"));
-   this.numBytesInRemote += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_IN_REMOTE, "0"));
-   this.numBytesOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_BYTES_OUT, "0"));
-   this.numRecordsIn += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_IN, "0"));
-   this.numRecordsOut += 
Long.valueOf(metrics.getMetric(MetricNames.IO_NUM_RECORDS_OUT, "0"));
+   MetricStore metricStore = 
fetcher.getMetricStore();
+   synchronized (metricStore) {
+   MetricStore.SubtaskMetricStore metrics 
= metricStore.getSubtaskMetricStore(jobID, taskID, 
attempt.getParallelSubtaskIndex());
+   if (metrics != null) {
+   /**
+* We want to keep track of 
missing metrics to be able to make a difference between 0 as a value
+* and a missing value.
+* In case a metric is missing 
for a parallel instance of a task, we initialize if with -1 and
--- End diff --

typo: if -> it


---


[GitHub] flink pull request #4647: [FLINK-7575] [WEB-DASHBOARD] Display "Fetching..."...

2017-09-11 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4647#discussion_r138133763
  
--- Diff: 
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/utils/MutableIOMetrics.java
 ---
@@ -99,11 +140,35 @@ public void addIOMetrics(AccessExecution attempt, 
@Nullable MetricFetcher fetche
 * @throws IOException
 */
public void writeIOMetricsAsJson(JsonGenerator gen) throws IOException {
+   /**
+* Ask describe in the addIOMetrics, we want to distinguish 
incomplete values from 0.
--- End diff --

typo: ask -> as
remove "the" before addIOMetrics
could use an ```@link``` annotation for the method


---


[jira] [Commented] (FLINK-7609) WindowWordCount example doesn't print countWindow output with default configs

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161598#comment-16161598
 ] 

ASF GitHub Bot commented on FLINK-7609:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4662
  
Aren't these values a bit too low? (I'm specifically worried about cases 
where users supply their own data)
Would 100/50 maybe work?


> WindowWordCount example doesn't print countWindow output with default configs
> -
>
> Key: FLINK-7609
> URL: https://issues.apache.org/jira/browse/FLINK-7609
> Project: Flink
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 1.4.0
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0
>
>
> When running the WindowWordCount example with no params (i.e. using the 
> default params), no output is generated or printed, because the default 
> 'window' and 'slide' values are too large (250 and 150).
> The solution is to lower the default 'window' and 'slide' values, probably 
> to 4 and 2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4662: [FLINK-7609][examples] WindowWordCount example doesn't pr...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4662
  
Aren't these values a bit too low? (I'm specifically worried about cases 
where users supply their own data)
Would 100/50 maybe work?


---


[jira] [Commented] (FLINK-5944) Flink should support reading Snappy Files

2017-09-11 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161590#comment-16161590
 ] 

Chesnay Schepler commented on FLINK-5944:
-

Options 2/3 are out imo, as they extend the existing API, which is already 
really loaded.
Option 4 is not really viable since, realistically, users are just not going 
to do it.

So I would go for Option 1.

> Flink should support reading Snappy Files
> -
>
> Key: FLINK-5944
> URL: https://issues.apache.org/jira/browse/FLINK-5944
> Project: Flink
>  Issue Type: New Feature
>  Components: Batch Connectors and Input/Output Formats
>Reporter: Ilya Ganelin
>Assignee: Mikhail Lipkovich
>  Labels: features
>
> Snappy is a widely used, extremely performant compression format offering 
> fast compression and decompression. 
> This can easily be implemented by creating a SnappyInflaterInputStreamFactory 
> and updating initDefaultInflateInputStreamFactories in FileInputFormat.
> Flink already includes the Snappy dependency in the project. 
> There is a minor gotcha: if we wish to use this with Hadoop, we must provide 
> two separate implementations, since Hadoop uses a different version of the 
> Snappy format than snappy-java (the xerial/snappy library included in 
> Flink). 
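
A minimal sketch of what that factory could look like, assuming Flink's
InflaterInputStreamFactory interface and the xerial snappy-java
SnappyInputStream:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.util.Collection;
import java.util.Collections;

import org.apache.flink.api.common.io.compression.InflaterInputStreamFactory;
import org.xerial.snappy.SnappyInputStream;

public class SnappyInflaterInputStreamFactory
        implements InflaterInputStreamFactory<SnappyInputStream> {

    @Override
    public SnappyInputStream create(InputStream in) throws IOException {
        // wrap the raw file stream in a decompressing snappy-java stream
        return new SnappyInputStream(in);
    }

    @Override
    public Collection<String> getCommonFileExtensions() {
        return Collections.singleton("snappy");
    }
}
{code}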



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-4831) Implement a slf4j metric reporter

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161582#comment-16161582
 ] 

ASF GitHub Bot commented on FLINK-4831:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4661
  
Really nice that you thought of renaming the package to slf4j :)

I'll check this out in more detail next week.


> Implement a slf4j metric reporter
> -
>
> Key: FLINK-4831
> URL: https://issues.apache.org/jira/browse/FLINK-4831
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Hai Zhou
>Priority: Minor
>  Labels: easyfix, starter
>
> For debugging purposes it would be very useful to have an slf4j metric 
> reporter. If you don't want to set up a metrics backend you currently have to 
> rely on JMX, which a) works a bit differently than other reporters (for 
> example, it doesn't extend AbstractReporter) and b) makes it a bit tricky to 
> analyze results, as metrics are cleaned up once a job finishes.
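
For illustration, a minimal sketch of such a reporter's shape, assuming
Flink's AbstractReporter and Scheduled interfaces; a real implementation would
also report gauges, histograms, and meters:

{code:java}
import java.util.Map;

import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.reporter.AbstractReporter;
import org.apache.flink.metrics.reporter.Scheduled;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Slf4jReporter extends AbstractReporter implements Scheduled {

    private static final Logger LOG = LoggerFactory.getLogger(Slf4jReporter.class);

    @Override
    public void open(MetricConfig config) {}

    @Override
    public void close() {}

    @Override
    public void report() {
        // periodically log all registered counters (map inherited from AbstractReporter)
        for (Map.Entry<Counter, String> entry : counters.entrySet()) {
            LOG.info("{}: {}", entry.getValue(), entry.getKey().getCount());
        }
    }

    @Override
    public String filterCharacters(String input) {
        return input;
    }
}
{code}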



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4661: [FLINK-4831] Implement a slf4j metric reporter

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4661
  
Really nice that you thought of renaming the package to slf4j :)

I'll check this out in more detail next week.


---


[jira] [Commented] (FLINK-6549) Improve error message for type mismatches with side outputs

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161573#comment-16161573
 ] 

ASF GitHub Bot commented on FLINK-6549:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4663
  
I think this looks good.


> Improve error message for type mismatches with side outputs
> ---
>
> Key: FLINK-6549
> URL: https://issues.apache.org/jira/browse/FLINK-6549
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
>Priority: Minor
>
> A type mismatch when using side outputs causes a ClassCastException to be 
> thrown. It would be neat to include the names of the OutputTags in the 
> exception message.
> This can occur when multiple {{OutputTag}}s with different types but 
> identical names are being used.
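
For illustration, a minimal sketch of the clash; the tag name and types are
made up. Both tags share the name "side", so emitting through one and reading
through the other fails with a ClassCastException whose message currently does
not mention the tag name:

{code:java}
import org.apache.flink.util.OutputTag;

public class SideOutputClashSketch {
    // Same name, different element types: the source of the mismatch.
    // (Anonymous subclasses so Flink can capture the element type.)
    static final OutputTag<String> STRING_TAG = new OutputTag<String>("side") {};
    static final OutputTag<Integer> INT_TAG = new OutputTag<Integer>("side") {};
}
{code}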



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4663: [FLINK-6549] [DataStream API] Improve error message for t...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4663
  
I think this looks good.


---


[jira] [Commented] (FLINK-7562) uploaded jars sort by upload-time on dashboard page

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161570#comment-16161570
 ] 

ASF GitHub Bot commented on FLINK-7562:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4664
  
I'll check this out sometime next week.


> uploaded jars sort by upload-time on dashboard page
> ---
>
> Key: FLINK-7562
> URL: https://issues.apache.org/jira/browse/FLINK-7562
> Project: Flink
>  Issue Type: Wish
>  Components: Webfrontend
>Reporter: Hai Zhou
>Assignee: Hai Zhou
>Priority: Minor
>
> Currently, the uploaded jar packages are shown in no particular order on the 
> Apache Flink Web Dashboard page. 
> I think we should sort them by upload time, so the list looks more friendly!
> Regards,
> Hai Zhou.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4664: [FLINK-7562] Show uploaded-jars descending by upload time

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4664
  
I'll check this out sometime next week.


---


[jira] [Commented] (FLINK-7521) Remove the 10MB limit from the current REST implementation.

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161568#comment-16161568
 ] 

ASF GitHub Bot commented on FLINK-7521:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
Removing the object aggregator will break the remaining pipeline, so that 
option is out too.

As for the client upload, that's exactly what's being done right now. The 
jar is uploaded to the blob server, and only the blob key is transmitted via 
REST. The problem here is that this also requires any client to be written in 
Java and to rely on Flink, whereas we would prefer that any client, written in 
any language, can communicate with the REST APIs.


> Remove the 10MB limit from the current REST implementation.
> ---
>
> Key: FLINK-7521
> URL: https://issues.apache.org/jira/browse/FLINK-7521
> Project: Flink
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.4.0
>Reporter: Kostas Kloudas
>Assignee: Fang Yong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> In the current {{AbstractRestServer}} we impose an upper bound of 10MB on 
> the states we can transfer. This is in the line {{.addLast(new 
> HttpObjectAggregator(1024 * 1024 * 10))}} of the server implementation. 
> This limit is restrictive for some of the use cases planned to use this 
> implementation (e.g. the job submission client, which has to send full jars, 
> or the queryable state client, which may have to receive states bigger than 
> that).
> This issue proposes the elimination of this limit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
Removing the object aggregator will break the remaining pipeline, so that 
option is out too.

As for the client upload, that's exactly what's being done right now. The 
jar is uploaded to the blob server, and only the blob key is transmitted via 
REST. The problem here is that this also requires any client to be written in 
Java and to rely on Flink, whereas we would prefer that any client, written in 
any language, can communicate with the REST APIs.


---


[jira] [Created] (FLINK-7611) add metrics to measure the data drop by watermark

2017-09-11 Thread aitozi (JIRA)
aitozi created FLINK-7611:
-

 Summary: add metrics to measure the data drop by watermark
 Key: FLINK-7611
 URL: https://issues.apache.org/jira/browse/FLINK-7611
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Reporter: aitozi
Priority: Minor


When using the window operator with event time, data that arrives after 
window.endtime + allowedLateness is dropped, but there is no existing metric 
to measure the number of dropped records. Such a metric would help users set a 
correct allowedLateness.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7606) Memory leak on NestedMapsStateTable

2017-09-11 Thread Kostas Kloudas (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161200#comment-16161200
 ] 

Kostas Kloudas commented on FLINK-7606:
---

Hi Matteo,

Thanks for reporting this!

Any additional information and details that you have would help us pin down the 
problem. 
It would be really helpful if you could actually provide a minimal code example 
that reproduces the problem
(e.g. job code and format of the elements).

> Memory leak on NestedMapsStateTable
> ---
>
> Key: FLINK-7606
> URL: https://issues.apache.org/jira/browse/FLINK-7606
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.3.1
>Reporter: Matteo Ferrario
>
> The NestedMapsStateTable grows continuously without freeing heap memory.
> We created a simple job that processes a stream of messages and uses CEP to 
> generate an outcome message when a specific pattern is identified.
> The messages coming from the stream are grouped by a key defined in a 
> specific field of the message.
> We've also added the "within" clause (set to 5 minutes), indicating that two 
> incoming messages match the pattern only if they arrive within a certain 
> time window.
> What we've seen is that for every key present in the messages, an NFA object 
> is instantiated in the NestedMapsStateTable and never deallocated.
> The "within" clause didn't help either: if we send messages that don't match 
> the pattern, the memory grows (I suppose the NFA state is updated), but it 
> is not cleaned up even after the 5-minute window defined in the "within" 
> clause.
> If you need, I can provide more details about the job we've implemented and 
> also screenshots of the memory leak.
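
For illustration, a minimal sketch of a pattern of the shape described above;
the Event type and its field are made-up placeholders:

{code:java}
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.windowing.time.Time;

public class PatternSketch {

    // Stand-in for the job's message class.
    public static class Event {
        public String type;
    }

    // Two matching events must arrive within 5 minutes of each other.
    static final Pattern<Event, ?> PATTERN = Pattern.<Event>begin("first")
            .where(new SimpleCondition<Event>() {
                @Override
                public boolean filter(Event evt) {
                    return "start".equals(evt.type);
                }
            })
            .followedBy("second")
            .where(new SimpleCondition<Event>() {
                @Override
                public boolean filter(Event evt) {
                    return "end".equals(evt.type);
                }
            })
            .within(Time.minutes(5));
}
{code}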



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6444) Add a check that '@VisibleForTesting' methods are only used in tests

2017-09-11 Thread Hai Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161119#comment-16161119
 ] 

Hai Zhou commented on FLINK-6444:
-

I already have an idea.
When I am not busy with work ;), I will fix it.

> Add a check that '@VisibleForTesting' methods are only used in tests
> 
>
> Key: FLINK-6444
> URL: https://issues.apache.org/jira/browse/FLINK-6444
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
>
> Some methods are annotated with {{@VisibleForTesting}}. These methods should 
> only be called from tests.
> This is currently not enforced / checked during the build. We should add such 
> a check.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-4319) Rework Cluster Management (FLIP-6)

2017-09-11 Thread Ashish Pokharel (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161107#comment-16161107
 ] 

Ashish Pokharel commented on FLINK-4319:


Thanks Till - this is great.

> Rework Cluster Management (FLIP-6)
> --
>
> Key: FLINK-4319
> URL: https://issues.apache.org/jira/browse/FLINK-4319
> Project: Flink
>  Issue Type: Improvement
>  Components: Cluster Management
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
>  Labels: flip-6
>
> This is the root issue to track progress of the rework of cluster management 
> (FLIP-6) 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization

2017-09-11 Thread Philipp Grulich (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161075#comment-16161075
 ] 

Philipp Grulich commented on FLINK-7001:


Hey,
what is the current state of this?
Over the last month, I worked on a similar research project.
We implemented this kind of aggregation sharing for all time-based window 
operators.
Maybe it would make sense to work together on this.

Best,
Philipp

> Improve performance of Sliding Time Window with pane optimization
> -
>
> Key: FLINK-7001
> URL: https://issues.apache.org/jira/browse/FLINK-7001
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Jark Wu
>Assignee: Jark Wu
> Fix For: 1.4.0
>
>
> Currently, the implementation of time-based sliding windows treats each 
> window individually and replicates records to each window. For a window of 
> 10 minutes that slides by 1 second, the data is replicated 600-fold (10 
> minutes / 1 second). We can optimize sliding windows by dividing them into 
> panes (aligned with the slide), so that we can avoid record duplication and 
> leverage checkpointing.
> I will attach a more detailed design doc to the issue.
> The following issues are similar to this one: FLINK-5387, FLINK-6990
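
For reference, a sliding window of the shape described above; the stream and
field positions are made up:

{code:java}
// A 10-minute window sliding by 1 second: today each record is copied into
// 600 windows (10 min / 1 s); pane-based aggregation would avoid the copies.
stream
        .keyBy(0)
        .timeWindow(Time.minutes(10), Time.seconds(1))
        .sum(1);
{code}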



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6444) Add a check that '@VisibleForTesting' methods are only used in tests

2017-09-11 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160925#comment-16160925
 ] 

mingleizhang commented on FLINK-6444:
-

Hello, [~StephanEwen]. I would like to work on this, but I need to check a few 
points with you first to make sure I understand correctly. Do you mean that 
when I run mvn clean install -DskipTests, methods annotated with 
@VisibleForTesting can still be called from non-test code, and that we should 
add something to the build that prevents this? Am I correct?

> Add a check that '@VisibleForTesting' methods are only used in tests
> 
>
> Key: FLINK-6444
> URL: https://issues.apache.org/jira/browse/FLINK-6444
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
>
> Some methods are annotated with {{@VisibleForTesting}}. These methods should 
> only be called from tests.
> This is currently not enforced / checked during the build. We should add such 
> a check.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-4319) Rework Cluster Management (FLIP-6)

2017-09-11 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160890#comment-16160890
 ] 

Till Rohrmann commented on FLINK-4319:
--

Hi Ashish,

great to hear that things are working so far :-)

Dynamic scaling will be part of the Flip-6 work. With Flip-6 we will offer an 
API call to scale running jobs up/down. Some of the code is already merged into 
master and we are currently working on merging the rest. We hope to get it all 
in for Flink 1.4. 

> Rework Cluster Management (FLIP-6)
> --
>
> Key: FLINK-4319
> URL: https://issues.apache.org/jira/browse/FLINK-4319
> Project: Flink
>  Issue Type: Improvement
>  Components: Cluster Management
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
>  Labels: flip-6
>
> This is the root issue to track progress of the rework of cluster management 
> (FLIP-6) 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-4796) Add new Sink interface with access to more meta data

2017-09-11 Thread Erik van Oosten (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160810#comment-16160810
 ] 

Erik van Oosten commented on FLINK-4796:


A workaround is to override {{setRuntimeContext}} (making sure to call 
{{super.setRuntimeContext}}) and use the passed-in context, possibly storing 
it in a private field for later access.
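
For illustration, a minimal sketch of that workaround; the sink itself is made up:

{code:java}
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class ContextCapturingSink extends RichSinkFunction<String> {

    private transient RuntimeContext capturedContext;

    @Override
    public void setRuntimeContext(RuntimeContext ctx) {
        super.setRuntimeContext(ctx); // keep the default wiring intact
        this.capturedContext = ctx;   // store for later access
    }

    @Override
    public void invoke(String value) throws Exception {
        // capturedContext is available here, e.g. for subtask information
    }
}
{code}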

> Add new Sink interface with access to more meta data
> 
>
> Key: FLINK-4796
> URL: https://issues.apache.org/jira/browse/FLINK-4796
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>
> The current {{SinkFunction}} cannot access the timestamps of elements, which 
> resulted in the (somewhat hacky) {{FlinkKafkaProducer010}}. Due to other 
> limitations {{GenericWriteAheadSink}} is currently also a {{StreamOperator}} 
> and not a {{SinkFunction}}.
> We should add a new interface for sinks that takes a context parameter, 
> similar to {{ProcessFunction}}. This will allow sinks to query additional 
> meta data about the element that they're receiving. 
> This is one ML thread where a user ran into a problem caused by this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Why-I-am-getting-Null-pointer-exception-while-accessing-RuntimeContext-in-FlinkKafkaProducer010-td12633.html#a12635
> h3. Original Text (that is still valid but not general)
> The Kafka 0.10 connector supports writing event timestamps to Kafka.
> Currently, the regular DataStream APIs don't allow user code to access the 
> event timestamp easily. That's why the Kafka connector is using a custom 
> operator ({{transform()}}) to access the event time.
> With this JIRA, I would like to provide the event timestamp in the regular 
> DataStream APIs.
> Once I look into the issue, I'll post some proposals for how to add the 
> timestamp. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)