[jira] [Commented] (FLINK-5103) TaskManager process virtual memory and physical memory used size gauge

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678795#comment-15678795
 ] 

ASF GitHub Bot commented on FLINK-5103:
---

GitHub user zhuhaifengleon opened a pull request:

https://github.com/apache/flink/pull/2833

[FLINK-5103] [Metrics] TaskManager process virtual memory and physical 
memory used size gauge

This PR adds TaskManager process virtual memory and physical memory used size 
gauge metrics.
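
A minimal sketch of what exposing these values could look like, assuming the 
gauges are registered on a TaskManager MetricGroup and read /proc/self/status 
on Linux; the metric names and the helper class below are illustrative, not 
necessarily what this PR implements:

{code}
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.MetricGroup;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProcessMemoryGauges {

    // Registers two gauges that report VmRSS (physical) and VmSize (virtual)
    // memory in kB, read from /proc/self/status. Linux only; -1 on failure.
    public static void register(MetricGroup group) {
        group.gauge("VmRSS", (Gauge<Long>) () -> readStatusValueKb("VmRSS:"));
        group.gauge("VmSize", (Gauge<Long>) () -> readStatusValueKb("VmSize:"));
    }

    private static long readStatusValueKb(String key) {
        try {
            for (String line : Files.readAllLines(Paths.get("/proc/self/status"))) {
                if (line.startsWith(key)) {
                    // a matching line looks like "VmRSS:   123456 kB"
                    return Long.parseLong(line.split("\\s+")[1]);
                }
            }
        } catch (IOException | NumberFormatException e) {
            // ignore and fall through to the error value
        }
        return -1L;
    }
}
{code}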

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhuhaifengleon/flink FLINK-5103

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2833


commit f65176360593ab72a6fee27f4ae54a143e386a67
Author: zhuhaifengleon 
Date:   2016-11-11T01:29:53Z

[metrics] expose process VmRSS/VmSize metrics

commit 5f1bf29913b809f6dad178b7f0e7e59381dfdd6a
Author: zhuhaifengleon 
Date:   2016-11-19T07:22:50Z

[FLINK-5103] [Metrics] TaskManager process virtual memory and physical 
memory used size gauge




> TaskManager process virtual memory and physical memory used size gauge
> --
>
> Key: FLINK-5103
> URL: https://issues.apache.org/jira/browse/FLINK-5103
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: zhuhaifeng
>Assignee: zhuhaifeng
>Priority: Minor
> Fix For: 1.2.0
>
>
> Add TaskManager process virtual memory and physical memory used size gauge 
> metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2833: [FLINK-5103] [Metrics] TaskManager process virtual...

2016-11-18 Thread zhuhaifengleon
GitHub user zhuhaifengleon opened a pull request:

https://github.com/apache/flink/pull/2833

[FLINK-5103] [Metrics] TaskManager process virtual memory and physical 
memory used size gauge

This PR adds TaskManager process virtual memory and physical memory used size 
gauge metrics.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhuhaifengleon/flink FLINK-5103

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2833


commit f65176360593ab72a6fee27f4ae54a143e386a67
Author: zhuhaifengleon 
Date:   2016-11-11T01:29:53Z

[metrics] expose process VmRSS/VmSize metrics

commit 5f1bf29913b809f6dad178b7f0e7e59381dfdd6a
Author: zhuhaifengleon 
Date:   2016-11-19T07:22:50Z

[FLINK-5103] [Metrics] TaskManager process virtual memory and physical 
memory used size gauge




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5103) TaskManager process virtual memory and physical memory used size gauge

2016-11-18 Thread zhuhaifeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuhaifeng updated FLINK-5103:
--
Summary: TaskManager process virtual memory and physical memory used size 
gauge  (was: Process virtual memory and physical memory used size gauge)

> TaskManager process virtual memory and physical memory used size gauge
> --
>
> Key: FLINK-5103
> URL: https://issues.apache.org/jira/browse/FLINK-5103
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: zhuhaifeng
>Assignee: zhuhaifeng
>Priority: Minor
> Fix For: 1.2.0
>
>
> Add TaskManager process virtual memory and physical memory used size gauge 
> metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5103) TaskManager process virtual memory and physical memory used size gauge

2016-11-18 Thread zhuhaifeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuhaifeng updated FLINK-5103:
--
Component/s: Metrics

> TaskManager process virtual memory and physical memory used size gauge
> --
>
> Key: FLINK-5103
> URL: https://issues.apache.org/jira/browse/FLINK-5103
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: zhuhaifeng
>Assignee: zhuhaifeng
>Priority: Minor
> Fix For: 1.2.0
>
>
> Add TaskManager process virtual memory and physical memory used size gauge 
> metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5103) Process virtual memory and physical memory used size gauge

2016-11-18 Thread zhuhaifeng (JIRA)
zhuhaifeng created FLINK-5103:
-

 Summary: Process virtual memory and physical memory used size gauge
 Key: FLINK-5103
 URL: https://issues.apache.org/jira/browse/FLINK-5103
 Project: Flink
  Issue Type: Improvement
Reporter: zhuhaifeng
Assignee: zhuhaifeng
Priority: Minor
 Fix For: 1.2.0


Add TaskManager process virtual memory and physical memory used size gauge 
metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-5031) Consecutive DataStream.split() ignored

2016-11-18 Thread Renkai Ge (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678716#comment-15678716
 ] 

Renkai Ge edited comment on FLINK-5031 at 11/19/16 6:29 AM:


[~fhueske] The second split is not ignored; its output is unioned with that of 
the first one: {code}union({1,2},{1,2,3,4,5}) = {1,2,3,4,5}{code}. If the 
second select is changed to "GreaterEqual", the result is 
{code}{3,4,5,6,7,8,9,10,11}{code}, i.e. 
{code}union({3,4,5,6,7,8,9,10,11},{6,7,8,9,10,11}){code}, see 
https://github.com/apache/flink/blob/a612b9966f3ee020a5721ac2f039a3633c40146c/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/DirectedOutput.java#L114.
With the current implementation of split you get the unioned result of all 
split & select combinations, which I find somewhat surprising. We might solve 
this issue by reimplementing the split function as a OneInputTransformation.
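
For illustration, a fragment of the program from the issue description below 
with the commented-out identity {{MapFunction}} re-enabled; it inserts a 
separate operator between the two splits, so the second OutputSelector is 
evaluated on its own and stream21 prints 1 and 2 as expected:

{code}
SplitStream<Long> split2 = stream11
    // identity map: creates a new operator, so the second selector is not
    // unioned with the first one
    .map(new MapFunction<Long, Long>() {
        @Override
        public Long map(Long value) throws Exception {
            return value;
        }
    })
    .split(new ThresholdSelector(3));

DataStream<Long> stream21 = split2.select("Less");
stream21.print();   // 1, 2
{code}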


was (Author: renkaige):
[~fhueske]The second split was not ignored, it was unioned by the first 
one.{code}union({1,2},{1,2,3,4,5})={1,2,3,4,5}{code},if the second select 
change to "GreaterEqual", the result would be {3,4,5,6,7,8,9,10,11},that was 
{code} union({3,4,5,6,7,8,9,10,11},{6,7,8,9,10,11}) {code} see 
https://github.com/apache/flink/blob/a612b9966f3ee020a5721ac2f039a3633c40146c/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/DirectedOutput.java#L114.
In the current implementation of split, you will get a unioned result of all 
split&select combination, I think it  was strange somehow.We might solve this 
issue by reimplement the split function by an OneInputTransformation.

> Consecutive DataStream.split() ignored
> --
>
> Key: FLINK-5031
> URL: https://issues.apache.org/jira/browse/FLINK-5031
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>Assignee: Renkai Ge
> Fix For: 1.2.0
>
>
> The output of the following program 
> {code}
> static final class ThresholdSelector implements OutputSelector<Long> {
>   long threshold;
>
>   public ThresholdSelector(long threshold) {
>     this.threshold = threshold;
>   }
>
>   @Override
>   public Iterable<String> select(Long value) {
>     if (value < threshold) {
>       return Collections.singletonList("Less");
>     } else {
>       return Collections.singletonList("GreaterEqual");
>     }
>   }
> }
>
> public static void main(String[] args) throws Exception {
>   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>   env.setParallelism(1);
>
>   SplitStream<Long> split1 = env.generateSequence(1, 11)
>     .split(new ThresholdSelector(6));
>
>   // stream11 should be [1,2,3,4,5]
>   DataStream<Long> stream11 = split1.select("Less");
>
>   SplitStream<Long> split2 = stream11
>     //.map(new MapFunction<Long, Long>() {
>     //  @Override
>     //  public Long map(Long value) throws Exception {
>     //    return value;
>     //  }
>     //})
>     .split(new ThresholdSelector(3));
>
>   DataStream<Long> stream21 = split2.select("Less");
>   // stream21 should be [1,2]
>   stream21.print();
>
>   env.execute();
> }
> {code}
> should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second 
> {{split}} operation is ignored.
> The program is evaluated correctly if the identity {{MapFunction}} is added 
> to the program.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5031) Consecutive DataStream.split() ignored

2016-11-18 Thread Renkai Ge (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678716#comment-15678716
 ] 

Renkai Ge commented on FLINK-5031:
--

[~fhueske] The second split is not ignored; its output is unioned with that of 
the first one: {code}union({1,2},{1,2,3,4,5}) = {1,2,3,4,5}{code}. If the 
second select is changed to "GreaterEqual", the result is 
{3,4,5,6,7,8,9,10,11}, i.e. 
{code}union({3,4,5,6,7,8,9,10,11},{6,7,8,9,10,11}){code}, see 
https://github.com/apache/flink/blob/a612b9966f3ee020a5721ac2f039a3633c40146c/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/DirectedOutput.java#L114.
With the current implementation of split you get the unioned result of all 
split & select combinations, which I find somewhat surprising. We might solve 
this issue by reimplementing the split function as a OneInputTransformation.

> Consecutive DataStream.split() ignored
> --
>
> Key: FLINK-5031
> URL: https://issues.apache.org/jira/browse/FLINK-5031
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>Assignee: Renkai Ge
> Fix For: 1.2.0
>
>
> The output of the following program 
> {code}
> static final class ThresholdSelector implements OutputSelector<Long> {
>   long threshold;
>
>   public ThresholdSelector(long threshold) {
>     this.threshold = threshold;
>   }
>
>   @Override
>   public Iterable<String> select(Long value) {
>     if (value < threshold) {
>       return Collections.singletonList("Less");
>     } else {
>       return Collections.singletonList("GreaterEqual");
>     }
>   }
> }
>
> public static void main(String[] args) throws Exception {
>   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>   env.setParallelism(1);
>
>   SplitStream<Long> split1 = env.generateSequence(1, 11)
>     .split(new ThresholdSelector(6));
>
>   // stream11 should be [1,2,3,4,5]
>   DataStream<Long> stream11 = split1.select("Less");
>
>   SplitStream<Long> split2 = stream11
>     //.map(new MapFunction<Long, Long>() {
>     //  @Override
>     //  public Long map(Long value) throws Exception {
>     //    return value;
>     //  }
>     //})
>     .split(new ThresholdSelector(3));
>
>   DataStream<Long> stream21 = split2.select("Less");
>   // stream21 should be [1,2]
>   stream21.print();
>
>   env.execute();
> }
> {code}
> should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second 
> {{split}} operation is ignored.
> The program is evaluated correctly if the identity {{MapFunction}} is added 
> to the program.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5053) Incremental / lightweight snapshots for checkpoints

2016-11-18 Thread Stefan Richter (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter updated FLINK-5053:
--
Description: 
There is currently basically no difference between savepoints and checkpoints 
in Flink and both are created through exactly the same process.

However, savepoints and checkpoints have a slightly different meaning which we 
should take into account to keep Flink efficient:

- Savepoints are (typically infrequently) triggered by the user to create a 
state from which the application can be restarted, e.g. because Flink, some 
code, or the parallelism needs to be changed.

- Checkpoints are (typically frequently) triggered by the system to allow for 
fast recovery in case of failure, while keeping the job/system unchanged.

This means that savepoints and checkpoints can have different properties in 
that:

- Savepoints should represent a state of the application in which 
characteristics of the job (e.g. parallelism) can be adjusted for the next 
restart. One example of something that savepoints need to be aware of is 
key-groups. Savepoints can potentially be a little more expensive than 
checkpoints, because they are usually created far less frequently, by the user.

- Checkpoints are frequently triggered by the system to allow for fast failure 
recovery. However, failure recovery leaves all characteristics of the job 
unchanged, so checkpoints do not have to be aware of those, e.g. think again 
of key-groups. Checkpoints should run faster than savepoint creation; in 
particular, it would be nice to have incremental checkpoints.

For a first approach, I would suggest the following steps/changes:

- In checkpoint coordination: differentiate between triggering checkpoints 
and savepoints. Introduce properties for checkpoints that describe their set of 
abilities, e.g. "is-key-group-aware", "is-incremental".

- In state handle infrastructure: introduce state handles that reflect 
incremental checkpoints and drop full key-group awareness, i.e. covering 
folders instead of files and not having keygroup_id -> file/offset mapping, but 
keygroup_range -> folder?

- Backend side: We should start with RocksDB by reintroducing something similar 
to semi-async snapshots, but using 
BackupableDBOptions::setShareTableFiles(true) and transferring only the new 
incremental outputs to HDFS. Notice that using RocksDB's internal backup 
mechanism means giving up the information about individual key-groups. But as 
explained above, this should be totally acceptable for checkpoints, while 
savepoints should use the key-group-aware, fully async mode. Of course we also 
need to implement the ability to restore from both types of snapshots.
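
A rough sketch of the RocksDB side, based on the RocksDB Java backup API 
(BackupEngine / BackupableDBOptions, class names as in the 4.x/5.x Java API); 
this only illustrates the mechanism, paths and error handling are simplified:

{code}
import org.rocksdb.BackupEngine;
import org.rocksdb.BackupableDBOptions;
import org.rocksdb.Env;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class IncrementalBackupSketch {

    // Creates an incremental backup of the given RocksDB instance in backupDir.
    // setShareTableFiles(true) lets unchanged SST files be shared between
    // backups, so only newly created files have to be transferred to DFS.
    public static void backup(RocksDB db, String backupDir) throws RocksDBException {
        try (BackupableDBOptions options =
                 new BackupableDBOptions(backupDir).setShareTableFiles(true);
             BackupEngine engine = BackupEngine.open(Env.getDefault(), options)) {
            engine.createNewBackup(db, true); // flush the memtable before backing up
        }
    }
}
{code}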

One remaining problem with the suggested approach is that even checkpoints 
should support scale-down, in case only a smaller number of instances is 
available for recovery.




  was:
There is currently basically no difference between savepoints and checkpoints 
in Flink and both are created through exactly the same process.

However, savepoints and checkpoints have a slightly different meaning which we 
should take into account to keep Flink efficient:

- Savepoints are (typically infrequently) triggered by the user to create a 
state from which the application can be restarted, e.g. because Flink, some 
code, or the parallelism needs to be changed.

- Checkpoints are (typically frequently) triggered by the System to allow for 
fast recovery in case of failure, but keeping the job/system unchanged.

This means that savepoints and checkpoints can have different properties in 
that:

- Savepoint should represent a state of the application, where characteristics 
of the job (e.g. parallelism) can be adjusted for the next restart. One example 
for things that savepoints need to be aware of are key-groups. Savepoints can 
potentially be a little more expensive than checkpoints, because they are 
usually created a lot less frequently through the user.

- Checkpoints are frequently triggered by the system to allow for fast failure 
recovery. However, failure recovery leaves all characteristics of the job 
unchanged. This checkpoints do not have to be aware of those, e.g. think again 
of key groups. Checkpoints should run faster than creating savepoints, in 
particular it would be nice to have incremental checkpoints.

For a first approach, I would suggest the following steps/changes:

- In checkpoint coordination: differentiate between triggering checkpoints 
and savepoints. Introduce properties for checkpoints that describe their set of 
abilities, e.g. "is-key-group-aware", "is-incremental".

- In state handle infrastructure: introduce state handles that reflect 
incremental checkpoints and drop full key-group awareness, i.e. covering 
folders instead of files and not having keygroup_id -> file/offset mapping, but 
keygroup_range -> folder?

- Backend side: We should start wit

[jira] [Created] (FLINK-5102) Connection establishment does not react to interrupt

2016-11-18 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5102:
--

 Summary: Connection establishment does not react to interrupt
 Key: FLINK-5102
 URL: https://issues.apache.org/jira/browse/FLINK-5102
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.1.3
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
 Fix For: 1.2.0, 1.1.4


Connection establishment does not react to interrupts; task cancellation can 
get stuck in the following wait:

{code}
Task - Task '... (60/120)' did not react to cancelling signal, but is stuck in 
method:
java.lang.Object.$$YJP$$wait(Native Method)
java.lang.Object.wait(Object.java)
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:191)
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:131)
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:83)
org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:60)
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:118)
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:395)
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:414)
org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:152)
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:195)
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:67)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:638)
java.lang.Thread.run(Thread.java:745)
{code}
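
For context, a minimal, hypothetical sketch of a wait loop that stays 
responsive to interruption by propagating the InterruptedException to the 
caller instead of swallowing it; this is not the actual 
PartitionRequestClientFactory code:

{code}
// Hypothetical helper: waits until the connection is marked established or a
// timeout expires. Thread.interrupt() cancels the wait because the
// InterruptedException from Object.wait() is propagated, not caught and retried.
public final class ConnectionWaiter {

    private final Object lock = new Object();
    private boolean established;

    public void markEstablished() {
        synchronized (lock) {
            established = true;
            lock.notifyAll();
        }
    }

    public void awaitEstablished(long timeoutMillis) throws InterruptedException {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (lock) {
            while (!established) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    throw new IllegalStateException("Connection was not established in time.");
                }
                lock.wait(remainingMillis);
            }
        }
    }
}
{code}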



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4288) Make it possible to unregister tables

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677737#comment-15677737
 ] 

ASF GitHub Bot commented on FLINK-4288:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2511#discussion_r88737974
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala
 ---
@@ -133,12 +134,24 @@ abstract class TableEnvironment(val config: 
TableConfig) {
 registerTableInternal(name, tableTable)
   case e: StreamTableEnvironment =>
 val sTableTable = new TransStreamTable(table.getRelNode, true)
-tables.add(name, sTableTable)
+schema.addTable(name, sTableTable)
 }
 
   }
 
   /**
+* Unregisters a [[Table]] in the TableEnvironment's catalog.
+* Unregistered tables cannot be referenced in SQL queries anymore.
+*
+* @param name The name under which the table is registered.
+*/
+  def unregisterTable(name: String): Unit = {
+
+checkValidTableName(name)
--- End diff --

Do we need to check if the name is valid? Couldn't we just try to delete it?



> Make it possible to unregister tables
> -
>
> Key: FLINK-4288
> URL: https://issues.apache.org/jira/browse/FLINK-4288
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Table names can not be changed yet. After registration you can not modify the 
> table behind a table name. Maybe this behavior is too restrictive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2511: [FLINK-4288] [table] Make it possible to unregiste...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2511#discussion_r88737982
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala
 ---
@@ -133,12 +134,24 @@ abstract class TableEnvironment(val config: 
TableConfig) {
 registerTableInternal(name, tableTable)
   case e: StreamTableEnvironment =>
 val sTableTable = new TransStreamTable(table.getRelNode, true)
-tables.add(name, sTableTable)
+schema.addTable(name, sTableTable)
 }
 
   }
 
   /**
+* Unregisters a [[Table]] in the TableEnvironment's catalog.
+* Unregistered tables cannot be referenced in SQL queries anymore.
+*
+* @param name The name under which the table is registered.
+*/
+  def unregisterTable(name: String): Unit = {
--- End diff --

Do we want to return a boolean success value or throw an exception if the 
table does not exist?
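
For illustration, the two options side by side on a hypothetical catalog class 
(not the PR's code), just to make the trade-off concrete:

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical catalog contrasting the two API styles discussed above.
public class TableCatalog {

    private final Map<String, Object> tables = new ConcurrentHashMap<>();

    // Option 1: return a boolean and let the caller decide whether a missing
    // table is an error.
    public boolean tryUnregister(String name) {
        return tables.remove(name) != null;
    }

    // Option 2: throw, so that typos in table names fail loudly.
    public void unregister(String name) {
        if (tables.remove(name) == null) {
            throw new IllegalArgumentException("Table '" + name + "' is not registered.");
        }
    }
}
{code}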


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4288) Make it possible to unregister tables

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677735#comment-15677735
 ] 

ASF GitHub Bot commented on FLINK-4288:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2511#discussion_r88737982
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala
 ---
@@ -133,12 +134,24 @@ abstract class TableEnvironment(val config: 
TableConfig) {
 registerTableInternal(name, tableTable)
   case e: StreamTableEnvironment =>
 val sTableTable = new TransStreamTable(table.getRelNode, true)
-tables.add(name, sTableTable)
+schema.addTable(name, sTableTable)
 }
 
   }
 
   /**
+* Unregisters a [[Table]] in the TableEnvironment's catalog.
+* Unregistered tables cannot be referenced in SQL queries anymore.
+*
+* @param name The name under which the table is registered.
+*/
+  def unregisterTable(name: String): Unit = {
--- End diff --

Do we want to return a boolean success value or throw an exception if the 
table does not exist?


> Make it possible to unregister tables
> -
>
> Key: FLINK-4288
> URL: https://issues.apache.org/jira/browse/FLINK-4288
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Table names can not be changed yet. After registration you can not modify the 
> table behind a table name. Maybe this behavior is too restrictive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4288) Make it possible to unregister tables

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677736#comment-15677736
 ] 

ASF GitHub Bot commented on FLINK-4288:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2511#discussion_r88737674
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/FlinkSchema.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table
+
+import java.util
+
+import org.apache.calcite.jdbc.CalciteSchema
+import org.apache.calcite.jdbc.CalciteSchema.TableEntry
+import org.apache.calcite.schema
+import org.apache.calcite.schema.SchemaPlus
+
+/**
+  * Wraps [[CalciteSchema]] and allows for deleting tables.
+  */
+class FlinkSchema {
--- End diff --

I think the benefit of this wrapper is rather limited.
Couldn't we just call `schema.tableMap.remove()` in `TableEnvironment`?


> Make it possible to unregister tables
> -
>
> Key: FLINK-4288
> URL: https://issues.apache.org/jira/browse/FLINK-4288
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Table names can not be changed yet. After registration you can not modify the 
> table behind a table name. Maybe this behavior is too restrictive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2511: [FLINK-4288] [table] Make it possible to unregiste...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2511#discussion_r88737974
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala
 ---
@@ -133,12 +134,24 @@ abstract class TableEnvironment(val config: 
TableConfig) {
 registerTableInternal(name, tableTable)
   case e: StreamTableEnvironment =>
 val sTableTable = new TransStreamTable(table.getRelNode, true)
-tables.add(name, sTableTable)
+schema.addTable(name, sTableTable)
 }
 
   }
 
   /**
+* Unregisters a [[Table]] in the TableEnvironment's catalog.
+* Unregistered tables cannot be referenced in SQL queries anymore.
+*
+* @param name The name under which the table is registered.
+*/
+  def unregisterTable(name: String): Unit = {
+
+checkValidTableName(name)
--- End diff --

Do we need to check if the name is valid? Couldn't we just try to delete it?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2511: [FLINK-4288] [table] Make it possible to unregiste...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2511#discussion_r88737674
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/FlinkSchema.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table
+
+import java.util
+
+import org.apache.calcite.jdbc.CalciteSchema
+import org.apache.calcite.jdbc.CalciteSchema.TableEntry
+import org.apache.calcite.schema
+import org.apache.calcite.schema.SchemaPlus
+
+/**
+  * Wraps [[CalciteSchema]] and allows for deleting tables.
+  */
+class FlinkSchema {
--- End diff --

I think the benefit of this wrapper is rather limited.
Couldn't we just call `schema.tableMap.remove()` in `TableEnvironment`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677686#comment-15677686
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88683102
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/composite.scala
 ---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.expressions
+
+import org.apache.calcite.rex.RexNode
+import org.apache.calcite.tools.RelBuilder
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.table.UnresolvedException
+import org.apache.flink.api.table.validate.{ExprValidationResult, 
ValidationFailure, ValidationSuccess}
+
+
+case class Flattening(child: Expression) extends UnaryExpression {
+
+  override def toString = s"$child.flatten()"
+
+  override private[flink] def resultType: TypeInformation[_] =
+throw UnresolvedException(s"Invalid call to on ${this.getClass}.")
--- End diff --

Please check exception message.


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677690#comment-15677690
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88682935
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala
 ---
@@ -324,10 +335,19 @@ object ExpressionParser extends JavaTokenParsers with 
PackratParsers {
 case _ ~ _ ~ operand ~ _ ~ unit ~ _ => TemporalCeil(unit, operand)
   }
 
+  lazy val prefixGet: PackratParser[Expression] =
+GET ~ "(" ~ composite ~ ","  ~ literalExpr ~ ")" ^^ {
+  case _ ~ _ ~ e ~ _ ~ index ~ _ =>
+GetCompositeField(e, index.asInstanceOf[Literal].value)
+  }
+
+  lazy val prefixFlattening: PackratParser[Expression] =
+GET ~ "(" ~> composite <~ ")" ^^ { e => Flattening(e) }
--- End diff --

`GET` -> `FLATTEN`?
Please add a test case for flatten in prefix notation.


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677684#comment-15677684
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88682137
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/FlinkTypeFactory.scala
 ---
@@ -79,14 +83,27 @@ class FlinkTypeFactory(typeSystem: RelDataTypeSystem) 
extends JavaTypeFactoryImp
   }
 
   private def createAdvancedType(typeInfo: TypeInformation[_]): 
RelDataType = typeInfo match {
-// TODO add specific RelDataTypes
-// for PrimitiveArrayTypeInfo, ObjectArrayTypeInfo, CompositeType
+case ct: CompositeType[_] =>
+  new CompositeRelDataType(ct, this)
+
+// TODO add specific RelDataTypes for PrimitiveArrayTypeInfo, 
ObjectArrayTypeInfo
 case ti: TypeInformation[_] =>
   new GenericRelDataType(typeInfo, 
getTypeSystem.asInstanceOf[FlinkTypeSystem])
 
 case ti@_ =>
   throw TableException(s"Unsupported type information: $ti")
   }
+
+  override def createTypeWithNullability(
+  relDataType: RelDataType,
+  nullable: Boolean)
+: RelDataType = relDataType match {
+case composite: CompositeRelDataType =>
+  // at the moment we do not care about nullability
--- End diff --

I see, thanks!


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677687#comment-15677687
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88679280
  
--- Diff: docs/dev/table_api.md ---
@@ -1656,6 +1656,29 @@ temporalOverlaps(TIMEPOINT, TEMPORAL, TIMEPOINT, 
TEMPORAL)
   
 
 
+
+  
+{% highlight java %}
+ANY.flatten()
+{% endhighlight %}
+  
+  
+Converts a Flink composite type (such as Tuple, POJO, etc.) and 
all of its subtypes into a flat representation where every subtype is a 
separate field.
--- End diff --

How are the fields named?


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677688#comment-15677688
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88699109
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/RexNodeTranslator.scala
 ---
@@ -74,15 +75,52 @@ object RexNodeTranslator {
 
   /**
 * Parses all input expressions to [[UnresolvedAlias]].
-* And expands star to parent's full project list.
+* And expands star to parent's full project list and flattens 
composite types.
 */
-  def expandProjectList(exprs: Seq[Expression], parent: LogicalNode): 
Seq[NamedExpression] = {
+  def expandProjectList(
+  exprs: Seq[Expression],
+  parent: LogicalNode,
+  tableEnv: TableEnvironment)
+: Seq[NamedExpression] = {
+
 val projectList = new ListBuffer[NamedExpression]
 exprs.foreach {
+
   case n: UnresolvedFieldReference if n.name == "*" =>
 projectList ++= parent.output.map(UnresolvedAlias(_))
+
+  // flattening can only applied on field references
+  case Flattening(composite) if
--- End diff --

Add a case for `Flattening(_)` that catches Flattenings on non-field 
expressions and throw a `ValidationException`?


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677685#comment-15677685
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88679626
  
--- Diff: docs/dev/table_api.md ---
@@ -1656,6 +1656,29 @@ temporalOverlaps(TIMEPOINT, TEMPORAL, TIMEPOINT, 
TEMPORAL)
   
 
 
+
+  
+{% highlight java %}
+ANY.flatten()
+{% endhighlight %}
+  
+  
+Converts a Flink composite type (such as Tuple, POJO, etc.) and 
all of its subtypes into a flat representation where every subtype is a 
separate field.
+  
+
+
+
+  
+{% highlight java %}
+COMPOSITE.get(STRING)
--- End diff --

OK, I agree `.get()` has benefits too.


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677689#comment-15677689
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88684517
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/composite.scala
 ---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.expressions
+
+import org.apache.calcite.rex.RexNode
+import org.apache.calcite.tools.RelBuilder
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.table.UnresolvedException
+import org.apache.flink.api.table.validate.{ExprValidationResult, 
ValidationFailure, ValidationSuccess}
+
+
+case class Flattening(child: Expression) extends UnaryExpression {
+
+  override def toString = s"$child.flatten()"
+
+  override private[flink] def resultType: TypeInformation[_] =
+throw UnresolvedException(s"Invalid call to on ${this.getClass}.")
+
+  override private[flink] def validateInput(): ExprValidationResult =
+ValidationFailure(s"Unresolved flattening of $child")
+}
+
+case class GetCompositeField(child: Expression, key: Any) extends 
UnaryExpression {
+
+  private var fieldIndex: Option[Int] = None
+
+  override def toString = s"$child.get($key)"
+
+  override private[flink] def validateInput(): ExprValidationResult = {
--- End diff --

Don't we need to recursively check if `child` is valid as well? Could be 
the result of an expression, right?


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4294) Allow access of composite type fields

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677691#comment-15677691
 ] 

ASF GitHub Bot commented on FLINK-4294:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88699648
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/composite.scala
 ---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.expressions
+
+import org.apache.calcite.rex.RexNode
+import org.apache.calcite.tools.RelBuilder
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.table.UnresolvedException
+import org.apache.flink.api.table.validate.{ExprValidationResult, 
ValidationFailure, ValidationSuccess}
+
+
+case class Flattening(child: Expression) extends UnaryExpression {
+
+  override def toString = s"$child.flatten()"
+
+  override private[flink] def resultType: TypeInformation[_] =
+throw UnresolvedException(s"Invalid call to on ${this.getClass}.")
+
+  override private[flink] def validateInput(): ExprValidationResult =
+ValidationFailure(s"Unresolved flattening of $child")
--- End diff --

Add a comment that `Flattening` is converted into `GetCompositeFields` 
before validation?


> Allow access of composite type fields
> -
>
> Key: FLINK-4294
> URL: https://issues.apache.org/jira/browse/FLINK-4294
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It 
> would be better to access individual fields of composite types, too. e.g.
> {code}
> SELECT composite.name FROM composites
> SELECT tuple.f0 FROM tuples
> 'f0.getField(0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88679280
  
--- Diff: docs/dev/table_api.md ---
@@ -1656,6 +1656,29 @@ temporalOverlaps(TIMEPOINT, TEMPORAL, TIMEPOINT, 
TEMPORAL)
   
 
 
+
+  
+{% highlight java %}
+ANY.flatten()
+{% endhighlight %}
+  
+  
+Converts a Flink composite type (such as Tuple, POJO, etc.) and 
all of its subtypes into a flat representation where every subtype is a 
separate field.
--- End diff --

How are the fields named?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88684517
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/composite.scala
 ---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.expressions
+
+import org.apache.calcite.rex.RexNode
+import org.apache.calcite.tools.RelBuilder
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.table.UnresolvedException
+import org.apache.flink.api.table.validate.{ExprValidationResult, 
ValidationFailure, ValidationSuccess}
+
+
+case class Flattening(child: Expression) extends UnaryExpression {
+
+  override def toString = s"$child.flatten()"
+
+  override private[flink] def resultType: TypeInformation[_] =
+throw UnresolvedException(s"Invalid call to on ${this.getClass}.")
+
+  override private[flink] def validateInput(): ExprValidationResult =
+ValidationFailure(s"Unresolved flattening of $child")
+}
+
+case class GetCompositeField(child: Expression, key: Any) extends 
UnaryExpression {
+
+  private var fieldIndex: Option[Int] = None
+
+  override def toString = s"$child.get($key)"
+
+  override private[flink] def validateInput(): ExprValidationResult = {
--- End diff --

Don't we need to recursively check if `child` is valid as well? Could be 
the result of an expression, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88679626
  
--- Diff: docs/dev/table_api.md ---
@@ -1656,6 +1656,29 @@ temporalOverlaps(TIMEPOINT, TEMPORAL, TIMEPOINT, 
TEMPORAL)
   
 
 
+
+  
+{% highlight java %}
+ANY.flatten()
+{% endhighlight %}
+  
+  
+Converts a Flink composite type (such as Tuple, POJO, etc.) and 
all of its subtypes into a flat representation where every subtype is a 
separate field.
+  
+
+
+
+  
+{% highlight java %}
+COMPOSITE.get(STRING)
--- End diff --

OK, I agree `.get()` has benefits too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88683102
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/composite.scala
 ---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.expressions
+
+import org.apache.calcite.rex.RexNode
+import org.apache.calcite.tools.RelBuilder
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.table.UnresolvedException
+import org.apache.flink.api.table.validate.{ExprValidationResult, 
ValidationFailure, ValidationSuccess}
+
+
+case class Flattening(child: Expression) extends UnaryExpression {
+
+  override def toString = s"$child.flatten()"
+
+  override private[flink] def resultType: TypeInformation[_] =
+throw UnresolvedException(s"Invalid call to on ${this.getClass}.")
--- End diff --

Please check exception message.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88699109
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/RexNodeTranslator.scala
 ---
@@ -74,15 +75,52 @@ object RexNodeTranslator {
 
   /**
 * Parses all input expressions to [[UnresolvedAlias]].
-* And expands star to parent's full project list.
+* And expands star to parent's full project list and flattens 
composite types.
 */
-  def expandProjectList(exprs: Seq[Expression], parent: LogicalNode): 
Seq[NamedExpression] = {
+  def expandProjectList(
+  exprs: Seq[Expression],
+  parent: LogicalNode,
+  tableEnv: TableEnvironment)
+: Seq[NamedExpression] = {
+
 val projectList = new ListBuffer[NamedExpression]
 exprs.foreach {
+
   case n: UnresolvedFieldReference if n.name == "*" =>
 projectList ++= parent.output.map(UnresolvedAlias(_))
+
+  // flattening can only applied on field references
+  case Flattening(composite) if
--- End diff --

Add a case for `Flattening(_)` that catches Flattenings on non-field 
expressions and throw a `ValidationException`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88682935
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala
 ---
@@ -324,10 +335,19 @@ object ExpressionParser extends JavaTokenParsers with 
PackratParsers {
 case _ ~ _ ~ operand ~ _ ~ unit ~ _ => TemporalCeil(unit, operand)
   }
 
+  lazy val prefixGet: PackratParser[Expression] =
+GET ~ "(" ~ composite ~ ","  ~ literalExpr ~ ")" ^^ {
+  case _ ~ _ ~ e ~ _ ~ index ~ _ =>
+GetCompositeField(e, index.asInstanceOf[Literal].value)
+  }
+
+  lazy val prefixFlattening: PackratParser[Expression] =
+GET ~ "(" ~> composite <~ ")" ^^ { e => Flattening(e) }
--- End diff --

`GET` -> `FLATTEN`?
Please add a test case for flatten in prefix notation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88682137
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/FlinkTypeFactory.scala
 ---
@@ -79,14 +83,27 @@ class FlinkTypeFactory(typeSystem: RelDataTypeSystem) 
extends JavaTypeFactoryImp
   }
 
   private def createAdvancedType(typeInfo: TypeInformation[_]): 
RelDataType = typeInfo match {
-// TODO add specific RelDataTypes
-// for PrimitiveArrayTypeInfo, ObjectArrayTypeInfo, CompositeType
+case ct: CompositeType[_] =>
+  new CompositeRelDataType(ct, this)
+
+// TODO add specific RelDataTypes for PrimitiveArrayTypeInfo, 
ObjectArrayTypeInfo
 case ti: TypeInformation[_] =>
   new GenericRelDataType(typeInfo, 
getTypeSystem.asInstanceOf[FlinkTypeSystem])
 
 case ti@_ =>
   throw TableException(s"Unsupported type information: $ti")
   }
+
+  override def createTypeWithNullability(
+  relDataType: RelDataType,
+  nullable: Boolean)
+: RelDataType = relDataType match {
+case composite: CompositeRelDataType =>
+  // at the moment we do not care about nullability
--- End diff --

I see, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2319: [FLINK-4294] [table] Allow access of composite typ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2319#discussion_r88699648
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/composite.scala
 ---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.expressions
+
+import org.apache.calcite.rex.RexNode
+import org.apache.calcite.tools.RelBuilder
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.common.typeutils.CompositeType
+import org.apache.flink.api.table.UnresolvedException
+import org.apache.flink.api.table.validate.{ExprValidationResult, 
ValidationFailure, ValidationSuccess}
+
+
+case class Flattening(child: Expression) extends UnaryExpression {
+
+  override def toString = s"$child.flatten()"
+
+  override private[flink] def resultType: TypeInformation[_] =
+throw UnresolvedException(s"Invalid call to on ${this.getClass}.")
+
+  override private[flink] def validateInput(): ExprValidationResult =
+ValidationFailure(s"Unresolved flattening of $child")
--- End diff --

Add a comment that `Flattening` is converted into `GetCompositeFields` 
before validation?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1914) Wrong FS while starting YARN session without correct HADOOP_HOME

2016-11-18 Thread Malte Schwarzer (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677519#comment-15677519
 ] 

Malte Schwarzer commented on FLINK-1914:


I'm having the same issue with Flink 1.1.3 and Hadoop 2.7.3, when I try to run 
a Flink job on YARN. HADOOP_HOME is not set.

{code}flink/bin/flink run -m yarn-cluster -yn 1 -yjm 1024 -ytm 4096 
flink/examples/batch/WordCount.jar hdfs://power1:55000/info.txt
{code}

{code}
2016-11-18 20:06:48,987 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner  
   - Setting up resources for TaskManagers
2016-11-18 20:06:49,989 ERROR org.apache.flink.yarn.YarnApplicationMasterRunner 
- YARN Application Master initialization failed
java.lang.IllegalArgumentException: Wrong FS: 
file:/home/hadoop/.flink/application_1479495922304_0001/flink-dist_2.11-1.1.3.jar,
 expected: hdfs://ibm-power-1.dima.tu-berlin.de:55000
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:191)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:102)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.flink.yarn.Utils.registerLocalResource(Utils.java:135)
at 
org.apache.flink.yarn.YarnApplicationMasterRunner.createTaskManagerContext(YarnApplicationMasterRunner.java:543)
at 
org.apache.flink.yarn.YarnApplicationMasterRunner.runApplicationMaster(YarnApplicationMasterRunner.java:261)
at 
org.apache.flink.yarn.YarnApplicationMasterRunner$1.run(YarnApplicationMasterRunner.java:153)
at 
org.apache.flink.yarn.YarnApplicationMasterRunner$1.run(YarnApplicationMasterRunner.java:150)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1536)
at 
org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:150)
at 
org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:112)
{code}

> Wrong FS while starting YARN session without correct HADOOP_HOME
> 
>
> Key: FLINK-1914
> URL: https://issues.apache.org/jira/browse/FLINK-1914
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Reporter: Zoltán Zvara
>Priority: Trivial
>  Labels: yarn, yarn-client
>
> When a YARN session is invoked ({{yarn-session.sh}}) without a correct 
> {{HADOOP_HOME}}, the AM is still deployed (for example to {{0.0.0.0:8032}}), but 
> the deployed AM fails with an {{IllegalArgumentException}}:
> {code}
> java.lang.IllegalArgumentException: Wrong FS: 
> file:/home/.../flink-dist-0.9-SNAPSHOT.jar, expected: hdfs://localhost:9000
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:642)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:181)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:92)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>   at org.apache.flink.yarn.Utils.registerLocalResource(Utils.java:105)
>   at 
> org.apache.flink.yarn.ApplicationMasterActor$$anonfun$org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession$2.apply(ApplicationMasterActor.scala:436)
>   at 
> org.apache.flink.yarn.ApplicationMasterActor$$anonfun$org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession$2.apply(ApplicationMasterActor.scala:371)
>   at scala.util.Try$.apply(Try.scala:161)
>   at 
> org.apache.flink.yarn.ApplicationMasterActor$class.org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession(ApplicationMasterActor.scala:371)
>   at 
> org.apache.flink.yarn.ApplicationMasterActor$$anonfun$receiveYarnMessages$1.applyOrElse(ApplicationMasterActor.scala:155)
>   at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>   at 
> org.apache.flink.

[jira] [Assigned] (FLINK-5101) Test CassandraConnectorITCase instable

2016-11-18 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-5101:
---

Assignee: Chesnay Schepler

> Test CassandraConnectorITCase instable
> --
>
> Key: FLINK-5101
> URL: https://issues.apache.org/jira/browse/FLINK-5101
> Project: Flink
>  Issue Type: Bug
>  Components: Cassandra Connector
>Reporter: Stefan Richter
>Assignee: Chesnay Schepler
>
> I observed this test fail on Travis (very rarely):
>  
>  Running 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 80.843 sec 
> <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase
> testCassandraBatchFormats(org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase)
>   Time elapsed: 5.82 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<40> but was:<20>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase.testCassandraBatchFormats(CassandraConnectorITCase.java:442)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4936) Operator names for Gelly inputs

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677434#comment-15677434
 ] 

ASF GitHub Bot commented on FLINK-4936:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2832

[FLINK-4936] [gelly] Operator names for Gelly inputs

Provide descriptive operator names for Graph and GraphCsvReader.
Condense multiple type conversion maps into a single mapper.
Reuse objects in operations wrapping user-defined-functions.

Travis CI tests are currently broken due to an SSL certificate error at 
https://repo.maven.apache.org/maven2

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4936_operator_names_for_gelly_inputs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2832


commit 3971d1414d4c4dcf372a7e117d16956aa8a9cce2
Author: Greg Hogan 
Date:   2016-10-26T19:18:50Z

[FLINK-4936] [gelly] Operator names for Gelly inputs

Provide descriptive operator names for Graph and GraphCsvReader.
Condense multiple type conversion maps into a single mapper.
Reuse objects in operations wrapping user-defined-functions.




> Operator names for Gelly inputs
> ---
>
> Key: FLINK-4936
> URL: https://issues.apache.org/jira/browse/FLINK-4936
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
>
> Provide descriptive operator names for Gelly's {{Graph}} and 
> {{GraphCsvReader}}. Also, condense multiple type conversion maps into a 
> single mapper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2832: [FLINK-4936] [gelly] Operator names for Gelly inpu...

2016-11-18 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2832

[FLINK-4936] [gelly] Operator names for Gelly inputs

Provide descriptive operator names for Graph and GraphCsvReader.
Condense multiple type conversion maps into a single mapper.
Reuse objects in operations wrapping user-defined-functions.

Travis CI tests are currently broken due to an SSL certificate error at 
https://repo.maven.apache.org/maven2

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4936_operator_names_for_gelly_inputs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2832


commit 3971d1414d4c4dcf372a7e117d16956aa8a9cce2
Author: Greg Hogan 
Date:   2016-10-26T19:18:50Z

[FLINK-4936] [gelly] Operator names for Gelly inputs

Provide descriptive operator names for Graph and GraphCsvReader.
Condense multiple type conversion maps into a single mapper.
Reuse objects in operations wrapping user-defined-functions.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5002) Lack of synchronization in LocalBufferPool#getNumberOfUsedBuffers

2016-11-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677332#comment-15677332
 ] 

Stephan Ewen commented on FLINK-5002:
-

The link is referring to opening a pull request. Can you post the proper commit 
link?

> Lack of synchronization in LocalBufferPool#getNumberOfUsedBuffers
> -
>
> Key: FLINK-5002
> URL: https://issues.apache.org/jira/browse/FLINK-5002
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Roman Maier
>Priority: Minor
>  Labels: easyfix, starter
>
> {code}
>   public int getNumberOfUsedBuffers() {
> return numberOfRequestedMemorySegments - availableMemorySegments.size();
>   }
> {code}
> Access to availableMemorySegments should be protected with proper 
> synchronization as other methods do.
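
For illustration only, a sketch of the intended fix (this is not the actual LocalBufferPool code, just the shape of guarding the derived count with the same lock the mutators use):

{code}
// Illustrative Scala sketch, not the real LocalBufferPool: the derived count
// reads the queue under the same monitor that the mutating methods use.
import java.util.ArrayDeque

class BufferPoolSketch {
  private val availableMemorySegments = new ArrayDeque[AnyRef]()
  private var numberOfRequestedMemorySegments = 0

  def requestMemorySegment(): AnyRef = availableMemorySegments.synchronized {
    numberOfRequestedMemorySegments += 1
    new Object() // placeholder for an actual memory segment
  }

  def recycle(segment: AnyRef): Unit = availableMemorySegments.synchronized {
    availableMemorySegments.add(segment)
  }

  // previously an unsynchronized read; now guarded like the other methods
  def getNumberOfUsedBuffers: Int = availableMemorySegments.synchronized {
    numberOfRequestedMemorySegments - availableMemorySegments.size()
  }
}
{code}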



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5101) Test CassandraConnectorITCase instable

2016-11-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5101:
-

 Summary: Test CassandraConnectorITCase instable
 Key: FLINK-5101
 URL: https://issues.apache.org/jira/browse/FLINK-5101
 Project: Flink
  Issue Type: Bug
  Components: Cassandra Connector
Reporter: Stefan Richter


I observed this test fail on Travis (very rarely):
 
 Running 
org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase


Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 80.843 sec <<< 
FAILURE! - in 
org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase
testCassandraBatchFormats(org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase)
  Time elapsed: 5.82 sec  <<< FAILURE!
java.lang.AssertionError: expected:<40> but was:<20>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase.testCassandraBatchFormats(CassandraConnectorITCase.java:442)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5100) Test testZooKeeperReelection is instable

2016-11-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5100:
-

 Summary: Test testZooKeeperReelection is instable
 Key: FLINK-5100
 URL: https://issues.apache.org/jira/browse/FLINK-5100
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Reporter: Stefan Richter


I observed this test failing (very rarely) on Travis:
 
testZooKeeperReelection(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)
  Time elapsed: 303.321 sec  <<< FAILURE!
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertFalse(Assert.java:64)
at org.junit.Assert.assertFalse(Assert.java:74)
at 
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testZooKeeperReelection(ZooKeeperLeaderElectionTest.java:197)


Results :

Failed tests: 
  ZooKeeperLeaderElectionTest.testZooKeeperReelection:197 null



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5099) Test testCancelPartitionRequest is instable

2016-11-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5099:
-

 Summary: Test testCancelPartitionRequest is instable
 Key: FLINK-5099
 URL: https://issues.apache.org/jira/browse/FLINK-5099
 Project: Flink
  Issue Type: Bug
  Components: Network
Reporter: Stefan Richter


I observed this test fail on Travis (very rarely):

testCancelPartitionRequest(org.apache.flink.runtime.io.network.netty.CancelPartitionRequestTest)
  Time elapsed: 168.756 sec  <<< ERROR!
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at 
java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at 
org.apache.flink.runtime.io.network.netty.CancelPartitionRequestTest.testCancelPartitionRequest(CancelPartitionRequestTest.java:94)

Results :

Tests in error: 
  CancelPartitionRequestTest.testCancelPartitionRequest:94 » OutOfMemory Java 
he...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3703) Add sequence matching semantics to discard matched events

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677048#comment-15677048
 ] 

ASF GitHub Bot commented on FLINK-3703:
---

Github user LordFB commented on the issue:

https://github.com/apache/flink/pull/2367
  
Hi @mushketyk,

Too bad; with this feature missing it is something of a dealbreaker for Flink in my 
use case.

Yeah, it would be great if Till could take some action on this and the 
related PRs.


> Add sequence matching semantics to discard matched events
> -
>
> Key: FLINK-3703
> URL: https://issues.apache.org/jira/browse/FLINK-3703
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> There is no easy way to decide whether events can be part of multiple 
> matching sequences or not. Currently, the default is that an event can 
> participate in multiple matching sequences. E.g. if you have the pattern 
> {{Pattern.begin("a").followedBy("b")}} and the input event stream 
> {{Event("A"), Event("B"), Event("C")}}, then you will generate the following 
> matching sequences: {{Event("A"), Event("B")}}, {{Event("A"), Event("C")}} 
> and {{Event("B"), Event("C")}}. 
> It would be useful to allow the user to define where the matching algorithm 
> should continue after a matching sequence has been found. Possible option 
> values could be 
>  * {{from first}} - continue keeping all events for future matches (that is 
> the current behaviour) 
>  * {{after first}} -  continue after the first element (remove first matching 
> event and continue with the second event)
>  * {{after last}} - continue after the last element (effectively discarding 
> all elements of the matching sequence)
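
For illustration, a toy sketch of the three options on the two-step pattern from the example above (this is not the CEP implementation, just a model of the intended semantics):

{code}
// Toy model of the continuation strategies for a two-step pattern over an ordered
// event list; purely illustrative, not the flink-cep library.
object MatchContinuationSketch {
  sealed trait Strategy
  case object FromFirst  extends Strategy // keep all events (current behaviour)
  case object AfterFirst extends Strategy // drop the first event of each match
  case object AfterLast  extends Strategy // drop all events of each match

  def matches(events: List[String], s: Strategy): List[(String, String)] = events match {
    case a :: rest if rest.nonEmpty =>
      s match {
        case FromFirst  => rest.map(b => (a, b)) ::: matches(rest, s)
        case AfterFirst => (a, rest.head) :: matches(rest, s)
        case AfterLast  => (a, rest.head) :: matches(rest.tail, s)
      }
    case _ => Nil
  }

  def main(args: Array[String]): Unit = {
    val events = List("A", "B", "C")
    println(matches(events, FromFirst))  // List((A,B), (A,C), (B,C))
    println(matches(events, AfterFirst)) // List((A,B), (B,C))
    println(matches(events, AfterLast))  // List((A,B))
  }
}
{code}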



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2367: [FLINK-3703][cep] Add sequence matching semantics to disc...

2016-11-18 Thread LordFB
Github user LordFB commented on the issue:

https://github.com/apache/flink/pull/2367
  
Hi @mushketyk,

Too bad; with this feature missing it is something of a dealbreaker for Flink in my 
use case.

Yeah, it would be great if Till could take some action on this and the 
related PRs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-11-18 Thread Philipp von dem Bussche (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677009#comment-15677009
 ] 

Philipp von dem Bussche commented on FLINK-2821:


[~StephanEwen] I am not quite sure this is going to work. The orchestration 
framework I am using (Rancher) exposes a 10.x IP address which is not available on 
the host itself (the host only has the 172.x address from Docker). What I have seen 
with binding previously is that when the host binds to 172.x, it rejects requests 
against the 10.x address. If we expect that this won't happen when binding to 
0.0.0.0, then I am fine with the change :) 
If this is too theoretical, I am more than happy to do more testing if 
[~mxm] wants to make the change.

> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Reporter: Robert Metzger
>Assignee: Maximilian Michels
>
> Akka expects the actor's URL to be exactly matching.
> As pointed out here, cases where users were complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver 
> recognizes only original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution to that (but seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyways because its affecting our user's satisfaction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4679) Add TumbleRow row-windows for streaming tables

2016-11-18 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676958#comment-15676958
 ] 

Fabian Hueske commented on FLINK-4679:
--

Hi [~jark], I think this approach could work for row windows which are 
count-based. Event-time count-windows would additionally require sorting the 
stream on event time before the window operation is applied.

For event-time time-window row-windows, we would need something more complex. I 
was thinking of a custom operator that collects records in a priority queue 
ordered by timestamp. Once a watermark is received for the upper bound of a 
window (the bound can be defined with preceding and following time, but initially we 
should start with preceding only, IMO), the priority queue is used to evaluate 
the window function and to purge records that are too old.
An event-time count-window row-window could use the same infrastructure (the 
priority queue would take care of the sorting) and use different logic to evaluate 
windows and drop records (based on count rather than time).

For processing-time windows, we could use a simple queue and evaluate it every time 
a new record is added. A processing-time count-window could be implemented similarly. 

So it might make sense to have everything implemented as custom operators, as 
it seems that we could reuse some code parts.
What do you think [~jark], [~twalthr]?
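
To make the buffering idea above a bit more concrete, here is a self-contained sketch of that logic (illustrative only, not a Flink operator; `evaluate` stands in for the window function and only a preceding bound is modelled):

{code}
import scala.collection.mutable

// Illustrative sketch of the priority-queue buffering described above.
class RowWindowBufferSketch[Row](precedingMillis: Long)(evaluate: Seq[(Long, Row)] => Unit) {

  // priority queue ordered so that the oldest record sits at the head
  private val buffer =
    mutable.PriorityQueue.empty[(Long, Row)](Ordering.by[(Long, Row), Long](_._1).reverse)

  def onElement(timestamp: Long, row: Row): Unit =
    buffer.enqueue((timestamp, row))

  // a watermark closes all windows whose upper bound is at or before it
  def onWatermark(watermark: Long): Unit = {
    val ready = buffer.toList.filter(_._1 <= watermark).sortBy(_._1)
    if (ready.nonEmpty) evaluate(ready)

    // purge records that can no longer contribute to any future window
    val cutoff = watermark - precedingMillis
    val kept = buffer.toList.filter(_._1 > cutoff)
    buffer.clear()
    kept.foreach(buffer.enqueue(_))
  }
}
{code}

An event-time count-based variant could reuse the same buffer and evaluate/purge by count instead of time, as noted above.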

> Add TumbleRow row-windows for streaming tables
> --
>
> Key: FLINK-4679
> URL: https://issues.apache.org/jira/browse/FLINK-4679
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Jark Wu
>
> Add TumbleRow row-windows for streaming tables as described in 
> [FLIP-11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations].
>  
> This task requires to implement a custom stream operator and integrate it 
> with checkpointing and timestamp / watermark logic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2390) Replace iteration timeout with algorithm for detecting termination

2016-11-18 Thread Paris Carbone (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paris Carbone updated FLINK-2390:
-
Assignee: (was: Paris Carbone)

> Replace iteration timeout with algorithm for detecting termination
> --
>
> Key: FLINK-2390
> URL: https://issues.apache.org/jira/browse/FLINK-2390
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Gyula Fora
> Fix For: 1.0.0
>
>
> Currently the user can set a timeout which will shut down the iteration 
> source/sink nodes if no new data is received during that time to allow 
> program termination in iterative streaming jobs.
> This method is used due to the non-trivial nature of termination in iterative 
> streaming jobs. While termination is not a main concern in long running 
> streaming jobs, this behaviour makes iterative tests non-deterministic and 
> they often fail on travis due to the timeout. Also setting a timeout can 
> cause jobs to terminate prematurely.
> I propose to remove iteration timeouts and replace them with the following 
> algorithm for detecting termination:
> -We first identify loop edges in the jobgraph (the channels from the 
> iteration sources to the head operators)
> -Once the head operators (the ones with loop input) finish with all their 
> non-loop inputs they broadcast a marker to their outputs.
> -Each operator will broadcast a marker once it received a marker from all its 
> non-finished inputs
> -Iteration sources are terminated when they receive 2 consecutive markers 
> without receiving any record in-between
> The idea behind the algorithm is to find out when no more outputs are 
> generated from the operators inside an iteration after their normal inputs 
> are finished.
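
For illustration, a minimal sketch of the termination condition at an iteration source (it deliberately ignores the distributed marker propagation described above and only models the local check):

{code}
// Illustrative sketch: a source may terminate once it receives two consecutive
// markers with no record arriving in between.
class IterationSourceTerminationCheck {
  private var seenMarkerBefore = false
  private var sawRecordSinceLastMarker = false

  def onRecord(): Unit =
    sawRecordSinceLastMarker = true

  /** Returns true if the iteration source may terminate after this marker. */
  def onMarker(): Boolean = {
    val terminate = seenMarkerBefore && !sawRecordSinceLastMarker
    seenMarkerBefore = true
    sawRecordSinceLastMarker = false
    terminate
  }
}
{code}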



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2390) Replace iteration timeout with algorithm for detecting termination

2016-11-18 Thread Paris Carbone (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paris Carbone reassigned FLINK-2390:


Assignee: Paris Carbone

> Replace iteration timeout with algorithm for detecting termination
> --
>
> Key: FLINK-2390
> URL: https://issues.apache.org/jira/browse/FLINK-2390
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Gyula Fora
>Assignee: Paris Carbone
> Fix For: 1.0.0
>
>
> Currently the user can set a timeout which will shut down the iteration 
> source/sink nodes if no new data is received during that time to allow 
> program termination in iterative streaming jobs.
> This method is used due to the non-trivial nature of termination in iterative 
> streaming jobs. While termination is not a main concern in long running 
> streaming jobs, this behaviour makes iterative tests non-deterministic and 
> they often fail on travis due to the timeout. Also setting a timeout can 
> cause jobs to terminate prematurely.
> I propose to remove iteration timeouts and replace them with the following 
> algorithm for detecting termination:
> -We first identify loop edges in the jobgraph (the channels from the 
> iteration sources to the head operators)
> -Once the head operators (the ones with loop input) finish with all their 
> non-loop inputs they broadcast a marker to their outputs.
> -Each operator will broadcast a marker once it received a marker from all its 
> non-finished inputs
> -Iteration sources are terminated when they receive 2 consecutive markers 
> without receiving any record in-between
> The idea behind the algorithm is to find out when no more outputs are 
> generated from the operators inside an iteration after their normal inputs 
> are finished.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676845#comment-15676845
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88638112
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateReduceFunction.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import org.apache.flink.api.common.functions.ReduceFunction
+import org.apache.flink.api.table.Row
+import org.apache.flink.util.Preconditions
+
+/**
+  * For Incremental intermediate aggregate Rows, merge every row into 
aggregate buffer.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  */
+class IncrementalAggregateReduceFunction(
+private val aggregates: Array[Aggregate[_]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val intermediateRowArity: Int)extends ReduceFunction[Row] {
--- End diff --

add space before `extends`


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676846#comment-15676846
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88657597
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -135,50 +128,123 @@ class DataStreamAggregate(
   namedProperties)
 
 val prepareOpName = s"prepare select: ($aggString)"
-val mappedInput = inputDS
-  .map(aggregateResult._1)
-  .name(prepareOpName)
-
-val groupReduceFunction = aggregateResult._2
-val rowTypeInfo = new RowTypeInfo(fieldTypes)
-
-val result = {
-  // grouped / keyed aggregation
-  if (groupingKeys.length > 0) {
-val aggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
-  s"window: ($window), " +
-  s"select: ($aggString)"
-val aggregateFunction =
-  createWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val keyedStream = mappedInput.keyBy(groupingKeys: _*)
-
-val windowedStream = createKeyedWindowedStream(window, keyedStream)
-  .asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
-  .name(aggOpName)
-  .asInstanceOf[DataStream[Any]]
+val keyedAggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
+  s"window: ($window), " +
+  s"select: ($aggString)"
+val nonKeyedAggOpName = s"window: ($window), select: ($aggString)"
+
+val (aggFieldIndexes, aggregates) =
+  AggregateUtil.transformToAggregateFunctions(
+namedAggregates.map(_.getKey), inputType, grouping.length)
+
+val result: DataStream[Any] = {
+  // check whether all aggregates support partial aggregate
+  if (aggregates.forall(_.supportPartial)){
+// do Incremental Aggregation
+// add grouping fields, position keys in the input, and input type
+val (mapFunction,
+reduceFunction,
+groupingOffsetMapping,
+aggOffsetMapping,
+intermediateRowArity) = 
AggregateUtil.createOperatorFunctionsForIncrementalAggregates(
--- End diff --

Can we add a separate method to `AggregateUtil` that creates the preparing 
`MapFunction`?
This code is shared by all aggregations (batch and streaming, incremental and 
non-incremental), etc. 

It would be nice to have that extracted and the mapper applied outside of this 
large condition. It would also be great if you could refactor the DataSetAggregate code 
along the way.


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676839#comment-15676839
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88659308
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -135,50 +128,123 @@ class DataStreamAggregate(
   namedProperties)
 
 val prepareOpName = s"prepare select: ($aggString)"
-val mappedInput = inputDS
-  .map(aggregateResult._1)
-  .name(prepareOpName)
-
-val groupReduceFunction = aggregateResult._2
-val rowTypeInfo = new RowTypeInfo(fieldTypes)
-
-val result = {
-  // grouped / keyed aggregation
-  if (groupingKeys.length > 0) {
-val aggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
-  s"window: ($window), " +
-  s"select: ($aggString)"
-val aggregateFunction =
-  createWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val keyedStream = mappedInput.keyBy(groupingKeys: _*)
-
-val windowedStream = createKeyedWindowedStream(window, keyedStream)
-  .asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
-  .name(aggOpName)
-  .asInstanceOf[DataStream[Any]]
+val keyedAggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
+  s"window: ($window), " +
+  s"select: ($aggString)"
+val nonKeyedAggOpName = s"window: ($window), select: ($aggString)"
+
+val (aggFieldIndexes, aggregates) =
+  AggregateUtil.transformToAggregateFunctions(
+namedAggregates.map(_.getKey), inputType, grouping.length)
+
+val result: DataStream[Any] = {
+  // check whether all aggregates support partial aggregate
+  if (aggregates.forall(_.supportPartial)){
+// do Incremental Aggregation
+// add grouping fields, position keys in the input, and input type
+val (mapFunction,
+reduceFunction,
+groupingOffsetMapping,
+aggOffsetMapping,
+intermediateRowArity) = 
AggregateUtil.createOperatorFunctionsForIncrementalAggregates(
--- End diff --

This method could be split into three methods, one for the mapper, one for 
the reduce function, and one for the window function.
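
Roughly, the split could look like the following (signatures and types below are illustrative stand-ins only, not the actual flink-table classes):

```
// Illustrative factory split; all types are simplified stand-ins.
object IncrementalAggregateFactoriesSketch {
  trait PrepareMapper   { def map(input: Array[Any]): Array[Any] }
  trait PartialReducer  { def reduce(a: Array[Any], b: Array[Any]): Array[Any] }
  trait WindowEvaluator { def evaluate(partial: Array[Any]): Array[Any] }

  /** Builds the mapper that projects the grouping keys and prepares the intermediate row. */
  def createIncrementalAggregateMapFunction(grouping: Array[Int]): PrepareMapper =
    new PrepareMapper {
      def map(input: Array[Any]): Array[Any] = grouping.map(input(_)) // toy projection only
    }

  /** Builds the reduce function that merges two intermediate aggregate rows. */
  def createIncrementalAggregateReduceFunction(): PartialReducer =
    new PartialReducer {
      def reduce(a: Array[Any], b: Array[Any]): Array[Any] = a // toy: real code merges field by field
    }

  /** Builds the window function that turns the final intermediate row into the output row. */
  def createIncrementalAggregateWindowFunction(): WindowEvaluator =
    new WindowEvaluator {
      def evaluate(partial: Array[Any]): Array[Any] = partial // toy: real code emits the output row
    }
}
```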


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676838#comment-15676838
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88641722
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateReduceFunction.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import org.apache.flink.api.common.functions.ReduceFunction
+import org.apache.flink.api.table.Row
+import org.apache.flink.util.Preconditions
+
+/**
+  * For Incremental intermediate aggregate Rows, merge every row into 
aggregate buffer.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  */
+class IncrementalAggregateReduceFunction(
+private val aggregates: Array[Aggregate[_]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val intermediateRowArity: Int)extends ReduceFunction[Row] {
+
+  Preconditions.checkNotNull(aggregates)
+  Preconditions.checkNotNull(groupKeysMapping)
+  @transient var accumulatorRow:Row = _
+
+  /**
+* For Incremental intermediate aggregate Rows, merge value1 and value2
+* into aggregate buffer, return aggregate buffer.
+*
+* @param value1 The first value to combined.
+* @param value2 The second value to combined.
+* @return The combined value of both input values.
+*
+*/
+  override def reduce(value1: Row, value2: Row): Row = {
+
+if(null == accumulatorRow){
+  accumulatorRow = new Row(intermediateRowArity)
--- End diff --

Can you add a comment that this can be moved to `RichFunction.open()` once 
[FLINK-5094](https://issues.apache.org/jira/browse/FLINK-5094) is resolved?


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676844#comment-15676844
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88636193
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateTimeWindowFunction.scala
 ---
@@ -52,6 +52,6 @@ class AggregateTimeWindowFunction(
 collector.timeWindow = window
 
 // call wrapped reduce function with property collector
-groupReduceFunction.reduce(input, collector)
+super.apply(key,window,input,collector)
--- End diff --

Please add spaces between the function arguments.


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676842#comment-15676842
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88659144
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -135,50 +128,123 @@ class DataStreamAggregate(
   namedProperties)
 
 val prepareOpName = s"prepare select: ($aggString)"
-val mappedInput = inputDS
-  .map(aggregateResult._1)
-  .name(prepareOpName)
-
-val groupReduceFunction = aggregateResult._2
-val rowTypeInfo = new RowTypeInfo(fieldTypes)
-
-val result = {
-  // grouped / keyed aggregation
-  if (groupingKeys.length > 0) {
-val aggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
-  s"window: ($window), " +
-  s"select: ($aggString)"
-val aggregateFunction =
-  createWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val keyedStream = mappedInput.keyBy(groupingKeys: _*)
-
-val windowedStream = createKeyedWindowedStream(window, keyedStream)
-  .asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
-  .name(aggOpName)
-  .asInstanceOf[DataStream[Any]]
+val keyedAggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
+  s"window: ($window), " +
+  s"select: ($aggString)"
+val nonKeyedAggOpName = s"window: ($window), select: ($aggString)"
+
+val (aggFieldIndexes, aggregates) =
+  AggregateUtil.transformToAggregateFunctions(
+namedAggregates.map(_.getKey), inputType, grouping.length)
+
+val result: DataStream[Any] = {
+  // check whether all aggregates support partial aggregate
+  if (aggregates.forall(_.supportPartial)){
+// do Incremental Aggregation
+// add grouping fields, position keys in the input, and input type
+val (mapFunction,
+reduceFunction,
+groupingOffsetMapping,
+aggOffsetMapping,
+intermediateRowArity) = 
AggregateUtil.createOperatorFunctionsForIncrementalAggregates(
+  namedAggregates,
+  inputType,
+  getRowType,
+  grouping,
+  aggregates,
+  aggFieldIndexes)
+
+val mappedInput = inputDS
+  .map(mapFunction)
+  .name(prepareOpName)
+
+// grouped / keyed aggregation
+if (groupingKeys.length > 0) {
+
+  val winFunction =
+createWindowIncrementalAggregationFunction(
+  aggregates,
+  groupingOffsetMapping,
+  aggOffsetMapping,
+  getRowType.getFieldCount,
+  intermediateRowArity,
+  window,
+  namedProperties)
+
+  val keyedStream = mappedInput.keyBy(groupingKeys: _*)
+  val windowedStream = createKeyedWindowedStream(window, 
keyedStream)
+.asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
+
+  windowedStream
+.apply(reduceFunction, winFunction)
+.returns(rowTypeInfo)
+.name(keyedAggOpName)
+.asInstanceOf[DataStream[Any]]
+}
+// global / non-keyed aggregation
+else {
+  val winFunction =
+createAllWindowIncrementalAggregationFunction(
+  aggregates,
+  groupingOffsetMapping,
+  aggOffsetMapping,
+  getRowType.getFieldCount,
+  intermediateRowArity,
+  window,
+  namedProperties)
+
+  val windowedStream = createNonKeyedWindowedStream(window, 
mappedInput)
+.asInstanceOf[AllWindowedStream[Row, DataStreamWindow]]
+  windowedStream
+.apply(reduceFunction, winFunction)
+.returns(rowTypeInfo)
+.name(nonKeyedAggOpName)
+.asInstanceOf[DataStream[Any]]
+}
   }
-  // global / non-keyed aggregation
   else {
-val aggOpName = s"window: ($window), select: ($aggString)"
-val aggregateFunction =
-  createAllWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val windowedS

[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676848#comment-15676848
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88653721
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateReduceFunction.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import org.apache.flink.api.common.functions.ReduceFunction
+import org.apache.flink.api.table.Row
+import org.apache.flink.util.Preconditions
+
+/**
+  * For Incremental intermediate aggregate Rows, merge every row into 
aggregate buffer.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  */
+class IncrementalAggregateReduceFunction(
+private val aggregates: Array[Aggregate[_]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val intermediateRowArity: Int)extends ReduceFunction[Row] {
+
+  Preconditions.checkNotNull(aggregates)
+  Preconditions.checkNotNull(groupKeysMapping)
+  @transient var accumulatorRow:Row = _
+
+  /**
+* For Incremental intermediate aggregate Rows, merge value1 and value2
+* into aggregate buffer, return aggregate buffer.
+*
+* @param value1 The first value to combined.
+* @param value2 The second value to combined.
+* @return The combined value of both input values.
+*
+*/
+  override def reduce(value1: Row, value2: Row): Row = {
+
+if(null == accumulatorRow){
+  accumulatorRow = new Row(intermediateRowArity)
+}
+
+// Initiate intermediate aggregate value.
+aggregates.foreach(_.initiate(accumulatorRow))
--- End diff --

I think we can make this a bit more lightweight.
Instead of initializing `accumulatorRow` we could copy all fields of 
`value1`:
```
(0 until intermediateRowArity)
  .foreach(i => accumulatorRow.setField(i, value1.productElement(i)))
```

Then we only need to merge `value2` into `accumulatorRow` and do not need 
to copy the groupKeys (`groupKeysMapping` becomes obsolete).
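
A sketch of the resulting reduce() shape (illustrative only; `RowLike` and `AggregateLike` are simplified stand-ins for the flink-table `Row` and `Aggregate` types):

```
// Illustrative sketch of the suggested reduce(): seed the buffer from value1,
// then merge only value2; no re-initialization and no separate group key copying.
class RowLike(arity: Int) {
  private val fields = new Array[Any](arity)
  def setField(i: Int, value: Any): Unit = fields(i) = value
  def productElement(i: Int): Any = fields(i)
}

trait AggregateLike {
  /** Merges the intermediate aggregate fields of `from` into `into`, in place. */
  def merge(from: RowLike, into: RowLike): Unit
}

class IncrementalAggregateReduceSketch(
    aggregates: Array[AggregateLike],
    intermediateRowArity: Int) {

  @transient private var accumulatorRow: RowLike = _

  def reduce(value1: RowLike, value2: RowLike): RowLike = {
    if (accumulatorRow == null) {
      accumulatorRow = new RowLike(intermediateRowArity)
    }
    // copy all fields of value1 (group keys included) into the buffer
    (0 until intermediateRowArity)
      .foreach(i => accumulatorRow.setField(i, value1.productElement(i)))
    // then merge only value2 into the buffer
    aggregates.foreach(_.merge(value2, accumulatorRow))
    accumulatorRow
  }
}
```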


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676847#comment-15676847
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88658743
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -231,6 +297,64 @@ object DataStreamAggregate {
 
   }
 
+  private def createAllWindowIncrementalAggregationFunction(
--- End diff --

I think we can move all `create*Window*Function` methods (also the ones 
that have been here before) to `AggregateUtil`. 


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676840#comment-15676840
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88633784
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -30,9 +30,9 @@ class AggregateMapFunction[IN, OUT](
 private val groupingKeys: Array[Int],
 @transient private val returnType: TypeInformation[OUT])
 extends RichMapFunction[IN, OUT] with ResultTypeQueryable[OUT] {
-  
--- End diff --

Please undo the changes on this file.


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676843#comment-15676843
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88635921
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateAllTimeWindowFunction.scala
 ---
@@ -48,6 +48,6 @@ class AggregateAllTimeWindowFunction(
 collector.timeWindow = window
 
 // call wrapped reduce function with property collector
-groupReduceFunction.reduce(input, collector)
+super.apply(window,input,collector)
--- End diff --

please add space between function arguments.


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676841#comment-15676841
 ] 

ASF GitHub Bot commented on FLINK-4937:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88637570
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateTimeWindowFunction.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import java.lang.Iterable
+
+import org.apache.flink.api.java.tuple.Tuple
+import org.apache.flink.api.table.Row
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.windowing.windows.TimeWindow
+import org.apache.flink.util.Collector
+
+/**
+  * It Evaluate final aggregate value.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  * @param aggregateMapping The index mapping between aggregate function 
list and aggregated value
+  * index in output Row.
+  */
+class IncrementalAggregateTimeWindowFunction(
+private val aggregates: Array[Aggregate[_ <: Any]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val aggregateMapping: Array[(Int, Int)],
+private val finalRowArity: Int,
+private val windowStartPos: Option[Int],
+private val windowEndPos: Option[Int])
+  extends IncrementalAggregateWindowFunction[TimeWindow](
+aggregates,
+groupKeysMapping,
+aggregateMapping, finalRowArity) {
+
+  private var collector: TimeWindowPropertyCollector = _
+
+  override def open(parameters: Configuration): Unit = {
+collector = new TimeWindowPropertyCollector(windowStartPos, 
windowEndPos)
+super.open(parameters)
+  }
+
+  override def apply(
+key: Tuple,
+window: TimeWindow,
+records: Iterable[Row],
+out: Collector[Row]): Unit = {
+
+// set collector and window
+collector.wrappedCollector = out
+collector.timeWindow = window
+
+super.apply(key,window,records,collector)
--- End diff --

please add spaces


> Add incremental group window aggregation for streaming Table API
> 
>
> Key: FLINK-4937
> URL: https://issues.apache.org/jira/browse/FLINK-4937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Group-window aggregates for streaming tables are currently not done in an 
> incremental fashion. This means that the window collects all records and 
> performs the aggregation when the window is closed instead of eagerly 
> updating a partial aggregate for every added record. Since records are 
> buffered, non-incremental aggregation requires more storage space than 
> incremental aggregation.
> The DataStream API which is used under the hood of the streaming Table API 
> features [incremental 
> aggregation|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation]
>  using a {{ReduceFunction}}.
> We should add support for incremental aggregation in group-windows.
> This is a follow-up task of FLINK-4691.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88641722
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateReduceFunction.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import org.apache.flink.api.common.functions.ReduceFunction
+import org.apache.flink.api.table.Row
+import org.apache.flink.util.Preconditions
+
+/**
+  * For Incremental intermediate aggregate Rows, merge every row into 
aggregate buffer.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  */
+class IncrementalAggregateReduceFunction(
+private val aggregates: Array[Aggregate[_]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val intermediateRowArity: Int)extends ReduceFunction[Row] {
+
+  Preconditions.checkNotNull(aggregates)
+  Preconditions.checkNotNull(groupKeysMapping)
+  @transient var accumulatorRow:Row = _
+
+  /**
+* For Incremental intermediate aggregate Rows, merge value1 and value2
+* into aggregate buffer, return aggregate buffer.
+*
+* @param value1 The first value to combined.
+* @param value2 The second value to combined.
+* @return The combined value of both input values.
+*
+*/
+  override def reduce(value1: Row, value2: Row): Row = {
+
+if(null == accumulatorRow){
+  accumulatorRow = new Row(intermediateRowArity)
--- End diff --

Can you add a comment that this can be moved to `RichFunction.open()` once 
[FLINK-5094](https://issues.apache.org/jira/browse/FLINK-5094) is resolved?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88657597
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -135,50 +128,123 @@ class DataStreamAggregate(
   namedProperties)
 
 val prepareOpName = s"prepare select: ($aggString)"
-val mappedInput = inputDS
-  .map(aggregateResult._1)
-  .name(prepareOpName)
-
-val groupReduceFunction = aggregateResult._2
-val rowTypeInfo = new RowTypeInfo(fieldTypes)
-
-val result = {
-  // grouped / keyed aggregation
-  if (groupingKeys.length > 0) {
-val aggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
-  s"window: ($window), " +
-  s"select: ($aggString)"
-val aggregateFunction =
-  createWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val keyedStream = mappedInput.keyBy(groupingKeys: _*)
-
-val windowedStream = createKeyedWindowedStream(window, keyedStream)
-  .asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
-  .name(aggOpName)
-  .asInstanceOf[DataStream[Any]]
+val keyedAggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
+  s"window: ($window), " +
+  s"select: ($aggString)"
+val nonKeyedAggOpName = s"window: ($window), select: ($aggString)"
+
+val (aggFieldIndexes, aggregates) =
+  AggregateUtil.transformToAggregateFunctions(
+namedAggregates.map(_.getKey), inputType, grouping.length)
+
+val result: DataStream[Any] = {
+  // check whether all aggregates support partial aggregate
+  if (aggregates.forall(_.supportPartial)){
+// do Incremental Aggregation
+// add grouping fields, position keys in the input, and input type
+val (mapFunction,
+reduceFunction,
+groupingOffsetMapping,
+aggOffsetMapping,
+intermediateRowArity) = 
AggregateUtil.createOperatorFunctionsForIncrementalAggregates(
--- End diff --

Can we add a separate method to create the preparing `MapFunction` to 
`AggregateUtil`?
This is code that is shared for all aggregations (batch, streaming), 
(incremental, non-incremental), etc. 

Would be nice to have that extracted and the mapper applied outside of this 
large condition. Would be great if you could refactor the DataSetAggregate code 
on the way as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88638112
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateReduceFunction.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import org.apache.flink.api.common.functions.ReduceFunction
+import org.apache.flink.api.table.Row
+import org.apache.flink.util.Preconditions
+
+/**
+  * For Incremental intermediate aggregate Rows, merge every row into 
aggregate buffer.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  */
+class IncrementalAggregateReduceFunction(
+private val aggregates: Array[Aggregate[_]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val intermediateRowArity: Int)extends ReduceFunction[Row] {
--- End diff --

add space before `extends`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88659144
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -135,50 +128,123 @@ class DataStreamAggregate(
   namedProperties)
 
 val prepareOpName = s"prepare select: ($aggString)"
-val mappedInput = inputDS
-  .map(aggregateResult._1)
-  .name(prepareOpName)
-
-val groupReduceFunction = aggregateResult._2
-val rowTypeInfo = new RowTypeInfo(fieldTypes)
-
-val result = {
-  // grouped / keyed aggregation
-  if (groupingKeys.length > 0) {
-val aggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
-  s"window: ($window), " +
-  s"select: ($aggString)"
-val aggregateFunction =
-  createWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val keyedStream = mappedInput.keyBy(groupingKeys: _*)
-
-val windowedStream = createKeyedWindowedStream(window, keyedStream)
-  .asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
-  .name(aggOpName)
-  .asInstanceOf[DataStream[Any]]
+val keyedAggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
+  s"window: ($window), " +
+  s"select: ($aggString)"
+val nonKeyedAggOpName = s"window: ($window), select: ($aggString)"
+
+val (aggFieldIndexes, aggregates) =
+  AggregateUtil.transformToAggregateFunctions(
+namedAggregates.map(_.getKey), inputType, grouping.length)
+
+val result: DataStream[Any] = {
+  // check whether all aggregates support partial aggregate
+  if (aggregates.forall(_.supportPartial)){
+// do Incremental Aggregation
+// add grouping fields, position keys in the input, and input type
+val (mapFunction,
+reduceFunction,
+groupingOffsetMapping,
+aggOffsetMapping,
+intermediateRowArity) = 
AggregateUtil.createOperatorFunctionsForIncrementalAggregates(
+  namedAggregates,
+  inputType,
+  getRowType,
+  grouping,
+  aggregates,
+  aggFieldIndexes)
+
+val mappedInput = inputDS
+  .map(mapFunction)
+  .name(prepareOpName)
+
+// grouped / keyed aggregation
+if (groupingKeys.length > 0) {
+
+  val winFunction =
+createWindowIncrementalAggregationFunction(
+  aggregates,
+  groupingOffsetMapping,
+  aggOffsetMapping,
+  getRowType.getFieldCount,
+  intermediateRowArity,
+  window,
+  namedProperties)
+
+  val keyedStream = mappedInput.keyBy(groupingKeys: _*)
+  val windowedStream = createKeyedWindowedStream(window, 
keyedStream)
+.asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
+
+  windowedStream
+.apply(reduceFunction, winFunction)
+.returns(rowTypeInfo)
+.name(keyedAggOpName)
+.asInstanceOf[DataStream[Any]]
+}
+// global / non-keyed aggregation
+else {
+  val winFunction =
+createAllWindowIncrementalAggregationFunction(
+  aggregates,
+  groupingOffsetMapping,
+  aggOffsetMapping,
+  getRowType.getFieldCount,
+  intermediateRowArity,
+  window,
+  namedProperties)
+
+  val windowedStream = createNonKeyedWindowedStream(window, 
mappedInput)
+.asInstanceOf[AllWindowedStream[Row, DataStreamWindow]]
+  windowedStream
+.apply(reduceFunction, winFunction)
+.returns(rowTypeInfo)
+.name(nonKeyedAggOpName)
+.asInstanceOf[DataStream[Any]]
+}
   }
-  // global / non-keyed aggregation
   else {
-val aggOpName = s"window: ($window), select: ($aggString)"
-val aggregateFunction =
-  createAllWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val windowedStream = createNonKeyedWindowedStream(window, 
mappedInput)
-  .asInstanceOf[AllWindowedStream[Row, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
- 

[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88633784
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -30,9 +30,9 @@ class AggregateMapFunction[IN, OUT](
 private val groupingKeys: Array[Int],
 @transient private val returnType: TypeInformation[OUT])
 extends RichMapFunction[IN, OUT] with ResultTypeQueryable[OUT] {
-  
--- End diff --

Please undo the changes on this file.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88635921
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateAllTimeWindowFunction.scala
 ---
@@ -48,6 +48,6 @@ class AggregateAllTimeWindowFunction(
 collector.timeWindow = window
 
 // call wrapped reduce function with property collector
-groupReduceFunction.reduce(input, collector)
+super.apply(window,input,collector)
--- End diff --

please add space between function arguments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88653721
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateReduceFunction.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import org.apache.flink.api.common.functions.ReduceFunction
+import org.apache.flink.api.table.Row
+import org.apache.flink.util.Preconditions
+
+/**
+  * For Incremental intermediate aggregate Rows, merge every row into 
aggregate buffer.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  */
+class IncrementalAggregateReduceFunction(
+private val aggregates: Array[Aggregate[_]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val intermediateRowArity: Int)extends ReduceFunction[Row] {
+
+  Preconditions.checkNotNull(aggregates)
+  Preconditions.checkNotNull(groupKeysMapping)
+  @transient var accumulatorRow:Row = _
+
+  /**
+* For Incremental intermediate aggregate Rows, merge value1 and value2
+* into aggregate buffer, return aggregate buffer.
+*
+* @param value1 The first value to combined.
+* @param value2 The second value to combined.
+* @return The combined value of both input values.
+*
+*/
+  override def reduce(value1: Row, value2: Row): Row = {
+
+if(null == accumulatorRow){
+  accumulatorRow = new Row(intermediateRowArity)
+}
+
+// Initiate intermediate aggregate value.
+aggregates.foreach(_.initiate(accumulatorRow))
--- End diff --

I think we can make this a bit more lightweight.
Instead of initializing `accumulatorRow` we could copy all fields of 
`value1`:
```
(0 until intermediateRowArity)
  .foreach(i => accumulatorRow.setField(i, value1.productElement(i)))
```

Then we only need to merge `value2` into `accumulatorRow` and do not need 
to copy the groupKeys (`groupKeysMapping` becomes obsolete).
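
A rough sketch of the resulting `reduce()`, reusing the fields already declared in this file; the `Aggregate.merge(value, buffer)` call is an assumption based on how the other aggregate runtime functions merge intermediate rows:
```
override def reduce(value1: Row, value2: Row): Row = {

  if (null == accumulatorRow) {
    accumulatorRow = new Row(intermediateRowArity)
  }

  // Copy all fields of value1 into the buffer. This also copies the group
  // keys, so no separate groupKeysMapping handling is needed here.
  (0 until intermediateRowArity)
    .foreach(i => accumulatorRow.setField(i, value1.productElement(i)))

  // Merge only value2 into the buffer.
  aggregates.foreach(_.merge(value2, accumulatorRow))

  accumulatorRow
}
```
With that, `groupKeysMapping` could also be dropped from the constructor.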


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88658743
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -231,6 +297,64 @@ object DataStreamAggregate {
 
   }
 
+  private def createAllWindowIncrementalAggregationFunction(
--- End diff --

I think we can move all `create*Window*Function` methods (also the ones 
that have been here before) to `AggregateUtil`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88637570
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/IncrementalAggregateTimeWindowFunction.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.table.runtime.aggregate
+
+import java.lang.Iterable
+
+import org.apache.flink.api.java.tuple.Tuple
+import org.apache.flink.api.table.Row
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.windowing.windows.TimeWindow
+import org.apache.flink.util.Collector
+
+/**
+  * It Evaluate final aggregate value.
+  *
+  * @param aggregates   The aggregate functions.
+  * @param groupKeysMapping The index mapping of group keys between 
intermediate aggregate Row
+  * and output Row.
+  * @param aggregateMapping The index mapping between aggregate function 
list and aggregated value
+  * index in output Row.
+  */
+class IncrementalAggregateTimeWindowFunction(
+private val aggregates: Array[Aggregate[_ <: Any]],
+private val groupKeysMapping: Array[(Int, Int)],
+private val aggregateMapping: Array[(Int, Int)],
+private val finalRowArity: Int,
+private val windowStartPos: Option[Int],
+private val windowEndPos: Option[Int])
+  extends IncrementalAggregateWindowFunction[TimeWindow](
+aggregates,
+groupKeysMapping,
+aggregateMapping, finalRowArity) {
+
+  private var collector: TimeWindowPropertyCollector = _
+
+  override def open(parameters: Configuration): Unit = {
+collector = new TimeWindowPropertyCollector(windowStartPos, 
windowEndPos)
+super.open(parameters)
+  }
+
+  override def apply(
+key: Tuple,
+window: TimeWindow,
+records: Iterable[Row],
+out: Collector[Row]): Unit = {
+
+// set collector and window
+collector.wrappedCollector = out
+collector.timeWindow = window
+
+super.apply(key,window,records,collector)
--- End diff --

please add spaces


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88659308
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamAggregate.scala
 ---
@@ -135,50 +128,123 @@ class DataStreamAggregate(
   namedProperties)
 
 val prepareOpName = s"prepare select: ($aggString)"
-val mappedInput = inputDS
-  .map(aggregateResult._1)
-  .name(prepareOpName)
-
-val groupReduceFunction = aggregateResult._2
-val rowTypeInfo = new RowTypeInfo(fieldTypes)
-
-val result = {
-  // grouped / keyed aggregation
-  if (groupingKeys.length > 0) {
-val aggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
-  s"window: ($window), " +
-  s"select: ($aggString)"
-val aggregateFunction =
-  createWindowAggregationFunction(window, namedProperties, 
groupReduceFunction)
-
-val keyedStream = mappedInput.keyBy(groupingKeys: _*)
-
-val windowedStream = createKeyedWindowedStream(window, keyedStream)
-  .asInstanceOf[WindowedStream[Row, Tuple, DataStreamWindow]]
-
-windowedStream
-  .apply(aggregateFunction)
-  .returns(rowTypeInfo)
-  .name(aggOpName)
-  .asInstanceOf[DataStream[Any]]
+val keyedAggOpName = s"groupBy: (${groupingToString(inputType, 
grouping)}), " +
+  s"window: ($window), " +
+  s"select: ($aggString)"
+val nonKeyedAggOpName = s"window: ($window), select: ($aggString)"
+
+val (aggFieldIndexes, aggregates) =
+  AggregateUtil.transformToAggregateFunctions(
+namedAggregates.map(_.getKey), inputType, grouping.length)
+
+val result: DataStream[Any] = {
+  // check whether all aggregates support partial aggregate
+  if (aggregates.forall(_.supportPartial)){
+// do Incremental Aggregation
+// add grouping fields, position keys in the input, and input type
+val (mapFunction,
+reduceFunction,
+groupingOffsetMapping,
+aggOffsetMapping,
+intermediateRowArity) = 
AggregateUtil.createOperatorFunctionsForIncrementalAggregates(
--- End diff --

This method could be split into three methods, one for the mapper, one for 
the reduce function, and one for the window function.
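
To illustrate, the call site could then look roughly like the sketch below. All three method names are hypothetical (they do not exist in `AggregateUtil` yet); the arguments reuse variables already present in this diff.
```
// Hypothetical factory methods; the names are placeholders for the proposed split.
val mapFunction = AggregateUtil.createPrepareMapFunction(
  namedAggregates, grouping, inputType)

val reduceFunction = AggregateUtil.createIncrementalAggregateReduceFunction(
  namedAggregates, inputType, getRowType, grouping)

val winFunction = AggregateUtil.createIncrementalAggregateWindowFunction(
  window, namedAggregates, inputType, getRowType, grouping, namedProperties)

// The preparing mapper can then be applied once, outside of the
// incremental / non-incremental condition.
val mappedInput = inputDS
  .map(mapFunction)
  .name(prepareOpName)
```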


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2792: [FLINK-4937] [Table] Add incremental group window ...

2016-11-18 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2792#discussion_r88636193
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateTimeWindowFunction.scala
 ---
@@ -52,6 +52,6 @@ class AggregateTimeWindowFunction(
 collector.timeWindow = window
 
 // call wrapped reduce function with property collector
-groupReduceFunction.reduce(input, collector)
+super.apply(key,window,input,collector)
--- End diff --

Please add spaces between function arguments


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5010) Decouple the death watch parameters from the Akka ask timeout

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676830#comment-15676830
 ] 

ASF GitHub Bot commented on FLINK-5010:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2831

[FLINK-5010] [akka] Introduce default configuration values for Akka's 
deathwatch

Set the akka deathwatch interval to 10s, the akka deathwatch pause to 60s 
and the tcp
connection timeout to 20s per default.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
decoupleAskTimeoutFromDeathwatch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2831.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2831


commit a27a484ff5e6542f6e4050cb6748964ba5d12f65
Author: Till Rohrmann 
Date:   2016-11-18T14:24:51Z

[FLINK-5010] [akka] Introduce default configuration values for Akka's 
deathwatch

Set the akka deathwatch interval to 10s, the akka deathwatch pause to 60s 
and the tcp
connection timeout to 20s per default.




> Decouple the death watch parameters from the Akka ask timeout
> -
>
> Key: FLINK-5010
> URL: https://issues.apache.org/jira/browse/FLINK-5010
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Minor
> Fix For: 1.2.0, 1.1.4
>
>
> The Akka ask timeout should not influence the death watch interval or the 
> timeout. Thus, I propose to introduce default values for these configuration 
> options.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2831: [FLINK-5010] [akka] Introduce default configuratio...

2016-11-18 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2831

[FLINK-5010] [akka] Introduce default configuration values for Akka's 
deathwatch

Set the akka deathwatch interval to 10s, the akka deathwatch pause to 60s 
and the tcp
connection timeout to 20s per default.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
decoupleAskTimeoutFromDeathwatch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2831.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2831


commit a27a484ff5e6542f6e4050cb6748964ba5d12f65
Author: Till Rohrmann 
Date:   2016-11-18T14:24:51Z

[FLINK-5010] [akka] Introduce default configuration values for Akka's 
deathwatch

Set the akka deathwatch interval to 10s, the akka deathwatch pause to 60s 
and the tcp
connection timeout to 20s per default.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5002) Lack of synchronization in LocalBufferPool#getNumberOfUsedBuffers

2016-11-18 Thread Roman Maier (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676805#comment-15676805
 ] 

Roman Maier commented on FLINK-5002:


Please check the implementation of this issue:
https://github.com/apache/flink/compare/master...MayerRoman:FLINK-5002?expand=1


> Lack of synchronization in LocalBufferPool#getNumberOfUsedBuffers
> -
>
> Key: FLINK-5002
> URL: https://issues.apache.org/jira/browse/FLINK-5002
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Roman Maier
>Priority: Minor
>  Labels: easyfix, starter
>
> {code}
>   public int getNumberOfUsedBuffers() {
> return numberOfRequestedMemorySegments - availableMemorySegments.size();
>   }
> {code}
> Access to availableMemorySegments should be protected with proper 
> synchronization as other methods do.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4591) Select star does not work with grouping

2016-11-18 Thread Jark Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-4591.
--
Resolution: Won't Fix

> Select star does not work with grouping
> ---
>
> Key: FLINK-4591
> URL: https://issues.apache.org/jira/browse/FLINK-4591
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> It would be consistent if this would also work:
> {{table.groupBy( '* ).select( '* )}}
> Currently, the star only works in a plain select without grouping.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676772#comment-15676772
 ] 

Fabian Hueske commented on FLINK-5094:
--

Yes, the incremental aggregation functions are part of the state objects. I'm 
not familiar with the details; I just noticed that {{RichFunction.open()}} is not 
called. [~aljoscha] knows this part better than I do.
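
For reference, a simplified sketch of why {{open()}} is not called today: with 
incremental window aggregation the {{ReduceFunction}} is handed to the state 
backend inside a state descriptor instead of being executed as a regular operator 
UDF ({{reduceFunction}} and {{rowSerializer}} below are placeholders):

{code}
// Simplified sketch (not the exact WindowOperator code): the window operator
// registers the user's ReduceFunction as part of a ReducingStateDescriptor ...
val stateDesc = new ReducingStateDescriptor[Row](
  "window-contents", reduceFunction, rowSerializer)

// ... and HeapReducingState / RocksDBReducingState later call
// reduceFunction.reduce(a, b) directly, so the RichFunction life-cycle
// (open() / close()) never runs for it.
{code}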

> Support RichReduceFunction and RichFoldFunction as incremental window 
> aggregation functions
> ---
>
> Key: FLINK-5094
> URL: https://issues.apache.org/jira/browse/FLINK-5094
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, Windowing Operators
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>
> Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
> aggregation functions in order to initialize the functions via {{open()}}.
> The main problem is that we do not want to provide the full power of 
> {{RichFunction}} for incremental aggregation functions, such as defining their 
> own operator state. This could be achieved by providing some kind of 
> {{RestrictedRuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Jark Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676759#comment-15676759
 ] 

Jark Wu commented on FLINK-5094:


Hi [~fhueske], this may require modifying the implementations of {{FoldingState}} 
and {{ReducingState}}, i.e. {{HeapReducingState}} and {{RocksDBReducingState}}, right?

> Support RichReduceFunction and RichFoldFunction as incremental window 
> aggregation functions
> ---
>
> Key: FLINK-5094
> URL: https://issues.apache.org/jira/browse/FLINK-5094
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, Windowing Operators
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>
> Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
> aggregation functions in order to initialize the functions via {{open()}}.
> The main problem is that we do not want to provide the full power of 
> {{RichFunction}} for incremental aggregation functions, such as defining their 
> own operator state. This could be achieved by providing some kind of 
> {{RestrictedRuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2830: [FLINK-5098] [akka] Detect unreachable remote acto...

2016-11-18 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2830

[FLINK-5098] [akka] Detect unreachable remote actors to fail ask calls 
eagerly

This PR adds an additional Identify message to every sent ask message, which is 
used to detect whether the target actor is actually reachable. The Identify 
message makes it possible to detect unreachable ActorSystems, or actors that do 
not exist on an ActorSystem, without having to wait for a timeout. This then 
allows the ask operation to fail eagerly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink eagerTimeout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2830.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2830


commit 0e4f059ca2a1764082851c8e7538483180c3e40b
Author: Till Rohrmann 
Date:   2016-11-09T14:29:32Z

[FLINK-5098] [akka] Detect unreachable remote actors to fail ask calls 
eagerly

This PR adds an additional Identify message to every sent ask message, which is 
used to detect whether the target actor is actually reachable. The Identify 
message makes it possible to detect unreachable ActorSystems, or actors that do 
not exist on an ActorSystem, without having to wait for a timeout. This then 
allows the ask operation to fail eagerly.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5098) Detect network problems to eagerly time out ask operations

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676758#comment-15676758
 ] 

ASF GitHub Bot commented on FLINK-5098:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2830

[FLINK-5098] [akka] Detect unreachable remote actors to fail ask calls 
eagerly

This PR adds an additional Identify message to every sent ask message, which is 
used to detect whether the target actor is actually reachable. The Identify 
message makes it possible to detect unreachable ActorSystems, or actors that do 
not exist on an ActorSystem, without having to wait for a timeout. This then 
allows the ask operation to fail eagerly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink eagerTimeout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2830.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2830


commit 0e4f059ca2a1764082851c8e7538483180c3e40b
Author: Till Rohrmann 
Date:   2016-11-09T14:29:32Z

[FLINK-5098] [akka] Detect unreachable remote actors to fail ask calls 
eagerly

This PR adds an additional Identify message to every sent ask message, which is 
used to detect whether the target actor is actually reachable. The Identify 
message makes it possible to detect unreachable ActorSystems, or actors that do 
not exist on an ActorSystem, without having to wait for a timeout. This then 
allows the ask operation to fail eagerly.




> Detect network problems to eagerly time out ask operations
> --
>
> Key: FLINK-5098
> URL: https://issues.apache.org/jira/browse/FLINK-5098
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.2.0
>
>
> Akka's ask operations are given a timeout after which they should fail with 
> an {{AskTimeoutException}}. In some cases, however, it is possible to fail 
> early because one has detected that the remote host is not reachable or that 
> the actor does not exist on the remote {{ActorSystem}}.
> Usually failing early if one cannot hope for a successful message delivery is 
> a desirable behaviour since it speeds up recovery. 
> I propose to send Akka's {{Identify}} message with each ask request. The 
> Identify message makes it possible to detect unreachable or non-existing actors 
> and, thus, enables us to fail the ask operation early.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5098) Detect network problems to eagerly time out ask operations

2016-11-18 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5098:


 Summary: Detect network problems to eagerly time out ask operations
 Key: FLINK-5098
 URL: https://issues.apache.org/jira/browse/FLINK-5098
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.2.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.2.0


Akka's ask operations are given a timeout after which they should fail with an 
{{AskTimeoutException}}. In some cases, however, it is possible to fail early 
because one has detected that the remote host is not reachable or that the 
actor does not exist on the remote {{ActorSystem}}.

Usually failing early if one cannot hope for a successful message delivery is a 
desirable behaviour since it speeds up recovery. 

I propose to send Akka's {{Identify}} message with each ask request. The 
Identify message makes it possible to detect unreachable or non-existing actors 
and, thus, enables us to fail the ask operation early.
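
A minimal sketch of the idea (simplified, not the code of the actual change; the 
object and method names below are made up for illustration):

{code}
import akka.actor.{ActorIdentity, ActorSelection, Identify}
import akka.pattern.ask
import akka.util.Timeout

import scala.concurrent.{ExecutionContext, Future, Promise}

object EagerAsk {

  // Sends the regular ask plus an Identify probe to the same target. If the
  // probe reports that the actor does not exist, the returned future fails
  // eagerly instead of waiting for the AskTimeoutException.
  def askWithIdentify(target: ActorSelection, message: Any)
      (implicit timeout: Timeout, ec: ExecutionContext): Future[Any] = {

    val result = Promise[Any]()

    // Regular ask: completes the promise on success or, eventually, on timeout.
    (target ? message).onComplete(result.tryComplete)

    // Identify probe: ActorIdentity(_, None) means the actor is not running
    // on the (reachable) remote ActorSystem, so we can fail right away.
    (target ? Identify(1)).foreach {
      case ActorIdentity(_, None) =>
        result.tryFailure(new Exception(s"Actor $target could not be identified."))
      case _ => // actor exists or unexpected reply: let the regular ask decide
    }

    result.future
  }
}
{code}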



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3617) NPE from CaseClassSerializer when dealing with null Option field

2016-11-18 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676714#comment-15676714
 ] 

Fabian Hueske commented on FLINK-3617:
--

Don't know. [~jgrier] who reported the issue might comment on this.
IMO, it's not urgent but a decision is required to resolve the PR that was 
contributed.

A solution to upgrade serializers would be great!

> NPE from CaseClassSerializer when dealing with null Option field
> 
>
> Key: FLINK-3617
> URL: https://issues.apache.org/jira/browse/FLINK-3617
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Jamie Grier
>
> This error occurs when serializing a Scala case class with an field of 
> Option[] type where the value is not Some or None, but null.
> If this is not supported we should have a good error message.
> java.lang.RuntimeException: ConsumerThread threw an exception: null
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.run(FlinkKafkaConsumer09.java:336)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:78)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:81)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:39)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$NonTimestampContext.collect(StreamSource.java:158)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09$ConsumerThread.run(FlinkKafkaConsumer09.java:473)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:100)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:30)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:100)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:30)
>   at 
> org.apache.flink.streaming.runtime.streamrecord.StreamRecordSerializer.serialize(StreamRecordSerializer.java:107)
>   at 
> org.apache.flink.streaming.runtime.streamrecord.StreamRecordSerializer.serialize(StreamRecordSerializer.java:42)
>   at 
> org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:56)
>   at 
> org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.addRecord(SpanningRecordSerializer.java:79)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:84)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.emit(StreamRecordWriter.java:86)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:78)
>   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4679) Add TumbleRow row-windows for streaming tables

2016-11-18 Thread Jark Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676708#comment-15676708
 ] 

Jark Wu commented on FLINK-4679:


Hi [~fhueske] [~twalthr], if I understand correctly, the row-window will 
evaluate the aggregates every time a row arrives in the window. I think this is 
really like window early-firing, which is controlled by a Trigger. Could we 
implement a specific Trigger that fires on every element, so that no custom 
stream operator is needed? Have I missed anything?

A row-count row-window trigger could look like this:

{code}
class RowWindowCountTrigger[W <: Window](maxCount: Long) extends Trigger[Any, W] {

  val stateDesc = new ReducingStateDescriptor[JLong]("count", Sum, LongSerializer.INSTANCE)

  override def onElement(element: Any, timestamp: Long, window: W, ctx: TriggerContext)
    : TriggerResult = {

    val count: ReducingState[JLong] = ctx.getPartitionedState(stateDesc)
    count.add(1L)
    if (count.get >= maxCount) {
      count.clear()
      TriggerResult.FIRE_AND_PURGE
    } else {
      TriggerResult.FIRE
    }
  }

  override def onProcessingTime(time: Long, window: W, ctx: TriggerContext): TriggerResult =
    TriggerResult.CONTINUE

  override def onEventTime(time: Long, window: W, ctx: TriggerContext): TriggerResult =
    TriggerResult.CONTINUE

  override def clear(window: W, ctx: TriggerContext): Unit =
    ctx.getPartitionedState(stateDesc).clear()

  @SerialVersionUID(1L)
  object Sum extends ReduceFunction[JLong] {
    @throws[Exception]
    def reduce(value1: JLong, value2: JLong): JLong = value1 + value2
  }
}
{code}
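
If that direction works, wiring the trigger into the translated job might look 
roughly like this ({{mappedInput}}, {{groupingKeys}}, {{maxCount}} and 
{{windowFunction}} are placeholders, not names from FLIP-11):

{code}
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow

// Hypothetical usage: GlobalWindows plus the trigger above would emulate a
// TumbleRow window over maxCount rows that also fires on every element.
val result = mappedInput
  .keyBy(groupingKeys: _*)
  .window(GlobalWindows.create())
  .trigger(new RowWindowCountTrigger[GlobalWindow](maxCount))
  .apply(windowFunction)
{code}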

> Add TumbleRow row-windows for streaming tables
> --
>
> Key: FLINK-4679
> URL: https://issues.apache.org/jira/browse/FLINK-4679
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Jark Wu
>
> Add TumbleRow row-windows for streaming tables as described in 
> [FLIP-11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations].
>  
> This task requires to implement a custom stream operator and integrate it 
> with checkpointing and timestamp / watermark logic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5097) The TypeExtractor is missing input type information in some Graph methods

2016-11-18 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-5097:


 Summary: The TypeExtractor is missing input type information in 
some Graph methods
 Key: FLINK-5097
 URL: https://issues.apache.org/jira/browse/FLINK-5097
 Project: Flink
  Issue Type: Bug
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Vasia Kalavri


The TypeExtractor is called without information about the input type in 
{{mapVertices}}, {{mapEdges}}, and {{fromDataSet}}, although this information 
can be easily retrieved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3617) NPE from CaseClassSerializer when dealing with null Option field

2016-11-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676696#comment-15676696
 ] 

Stephan Ewen commented on FLINK-3617:
-

How pressing is that? Is it a serious problem to work around?
Having an upgrade story for serializers (I know some people are starting to 
design that) would let us do this more smoothly in the future.

> NPE from CaseClassSerializer when dealing with null Option field
> 
>
> Key: FLINK-3617
> URL: https://issues.apache.org/jira/browse/FLINK-3617
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Jamie Grier
>
> This error occurs when serializing a Scala case class with an field of 
> Option[] type where the value is not Some or None, but null.
> If this is not supported we should have a good error message.
> java.lang.RuntimeException: ConsumerThread threw an exception: null
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.run(FlinkKafkaConsumer09.java:336)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:78)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:81)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:39)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$NonTimestampContext.collect(StreamSource.java:158)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09$ConsumerThread.run(FlinkKafkaConsumer09.java:473)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:100)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:30)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:100)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:30)
>   at 
> org.apache.flink.streaming.runtime.streamrecord.StreamRecordSerializer.serialize(StreamRecordSerializer.java:107)
>   at 
> org.apache.flink.streaming.runtime.streamrecord.StreamRecordSerializer.serialize(StreamRecordSerializer.java:42)
>   at 
> org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:56)
>   at 
> org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.addRecord(SpanningRecordSerializer.java:79)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:84)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.emit(StreamRecordWriter.java:86)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:78)
>   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5090) Expose optionally detailed metrics about network queue lengths

2016-11-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676692#comment-15676692
 ] 

Stephan Ewen commented on FLINK-5090:
-

I have added min/max/avg across the channels for now. Having all channels 
creates a flood of metrics.
Do you think min/max/avg is okay, or are all channels needed?

> Expose optionally detailed metrics about network queue lengths
> --
>
> Key: FLINK-5090
> URL: https://issues.apache.org/jira/browse/FLINK-5090
> Project: Flink
>  Issue Type: New Feature
>  Components: Network
>Affects Versions: 1.1.3
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0, 1.1.4
>
>
> For debugging purposes, it is important to have access to more detailed 
> metrics about the length of network input and output queues.





[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-11-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676689#comment-15676689
 ] 

Stephan Ewen commented on FLINK-2821:
-

Concerning HA setups: publishing the JobManager host name is necessary for 
discovery (ZooKeeper effectively becomes the naming service).
In Kubernetes and similar setups there is already dynamic naming, so publishing 
the host name is redundant there, but it should not hurt.

The only thing to think through is that we internally always replace host names 
with resolved IP addresses. That is important because Akka does an "exact string 
match" on the actor URLs, so we need a normalized format. We decided to go with 
IP addresses in the URL because forward name resolution works more reliably than 
reverse name lookups.
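As a side note, the normalization described above boils down to a forward DNS lookup 
before the actor URL is built. A minimal sketch of the idea (the URL shape is 
approximated and this class is not part of Flink):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative only: build an actor URL from the resolved IP so that two
 *  references to the same JobManager always yield the identical string. */
public class AddressNormalization {

	public static String normalizedJobManagerUrl(String host, int port) throws UnknownHostException {
		// Forward resolution: host name -> IP address.
		String ip = InetAddress.getByName(host).getHostAddress();
		// Akka compares actor URLs by exact string match, so the normalized
		// IP form must be used consistently on both sides.
		return "akka.tcp://flink@" + ip + ":" + port + "/user/jobmanager";
	}

	public static void main(String[] args) throws UnknownHostException {
		// Prints e.g. akka.tcp://flink@127.0.0.1:6123/user/jobmanager
		System.out.println(normalizedJobManagerUrl("localhost", 6123));
	}
}
{code}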

> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Reporter: Robert Metzger
>Assignee: Maximilian Michels
>
> Akka expects the actor's URL to match exactly.
> As pointed out here, cases where users were complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver 
> recognizes only the original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution to that (but it seems not too much of a restriction)
> I am aware that this is a limitation of Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyway because it's affecting our users' satisfaction.





[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-11-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676678#comment-15676678
 ] 

Stephan Ewen commented on FLINK-2821:
-

+1 for getting rid of {{jobmanager.rpc.bind-address}} and 
{{jobmanager.rpc.bind-port}}, always binding to {{0.0.0.0}}, and having the 
bind port be the same as the advertised port.

[~melentye] and [~philipp.bussche] do you see any issues with that?

> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Reporter: Robert Metzger
>Assignee: Maximilian Michels
>
> Akka expects the actor's URL to match exactly.
> As pointed out here, cases where users were complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver 
> recognizes only the original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution to that (but it seems not too much of a restriction)
> I am aware that this is a limitation of Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyway because it's affecting our users' satisfaction.





[jira] [Created] (FLINK-5096) Make the RollingSink rescalable.

2016-11-18 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-5096:
-

 Summary: Make the RollingSink rescalable.
 Key: FLINK-5096
 URL: https://issues.apache.org/jira/browse/FLINK-5096
 Project: Flink
  Issue Type: Improvement
  Components: filesystem-connector
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.2.0


Integrate the RollingSink with the new state abstractions so that its 
parallelism can change after restoring from a savepoint.





[jira] [Created] (FLINK-5095) Add explicit notifyOfAddedX methods to MetricReporter interface

2016-11-18 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-5095:
---

 Summary: Add explicit notifyOfAddedX methods to MetricReporter 
interface
 Key: FLINK-5095
 URL: https://issues.apache.org/jira/browse/FLINK-5095
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.1.3
Reporter: Chesnay Schepler


I would like to start a discussion on the MetricReporter interface, 
specifically the methods that notify a reporter of added or removed metrics.

Currently, the methods are defined as follows:
{code}
void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group);
void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group);
{code}

All metrics, regardless of their actual type, are passed to the reporter with 
these methods.

Since the different metric types have to be handled differently, every reporter 
is forced to do something like this:
{code}
if (metric instanceof Counter) {
    Counter c = (Counter) metric;
    // deal with counter
} else if (metric instanceof Gauge) {
    // deal with gauge
} else if (metric instanceof Histogram) {
    // deal with histogram
} else if (metric instanceof Meter) {
    // deal with meter
} else {
    // log something or throw an exception
}
{code}

This has a few issues:
* the instanceof checks and casts are unnecessary overhead
* it requires the implementer to be aware of every metric type
* it encourages throwing an exception in the final else block

We could remedy all of these by reworking the interface to contain explicit 
add/remove methods for every metric type. This would however be a breaking 
change and blow up the interface to 12 methods from the current 4. We could 
also add a RichMetricReporter interface with these methods, which would require 
relatively few changes but add additional complexity.

I was wondering what other people think about this.
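To make the proposal easier to discuss, here is one purely hypothetical shape such an 
extended interface could take; the method names are invented and this is not an 
existing Flink API.

{code}
import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.Histogram;
import org.apache.flink.metrics.Meter;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;

/** Hypothetical extension with explicit per-type callbacks (8 new methods on
 *  top of the existing 4 in MetricReporter, i.e. 12 in total). */
public interface RichMetricReporter extends MetricReporter {

	void notifyOfAddedCounter(Counter counter, String metricName, MetricGroup group);
	void notifyOfRemovedCounter(Counter counter, String metricName, MetricGroup group);

	void notifyOfAddedGauge(Gauge<?> gauge, String metricName, MetricGroup group);
	void notifyOfRemovedGauge(Gauge<?> gauge, String metricName, MetricGroup group);

	void notifyOfAddedHistogram(Histogram histogram, String metricName, MetricGroup group);
	void notifyOfRemovedHistogram(Histogram histogram, String metricName, MetricGroup group);

	void notifyOfAddedMeter(Meter meter, String metricName, MetricGroup group);
	void notifyOfRemovedMeter(Meter meter, String metricName, MetricGroup group);
}
{code}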





[jira] [Updated] (FLINK-5095) Add explicit notifyOfAddedX methods to MetricReporter interface

2016-11-18 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-5095:

Priority: Minor  (was: Major)

> Add explicit notifyOfAddedX methods to MetricReporter interface
> ---
>
> Key: FLINK-5095
> URL: https://issues.apache.org/jira/browse/FLINK-5095
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.3
>Reporter: Chesnay Schepler
>Priority: Minor
>
> I would like to start a discussion on the MetricReporter interface, 
> specifically the methods that notify a reporter of added or removed metrics.
> Currently, the methods are defined as follows:
> {code}
> void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group);
> void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup 
> group);
> {code}
> All metrics, regardless of their actual type, are passed to the reporter with 
> these methods.
> Since the different metric types have to be handled differently, every 
> reporter is forced to do something like this:
> {code}
> if (metric instanceof Counter) {
>     Counter c = (Counter) metric;
>     // deal with counter
> } else if (metric instanceof Gauge) {
>     // deal with gauge
> } else if (metric instanceof Histogram) {
>     // deal with histogram
> } else if (metric instanceof Meter) {
>     // deal with meter
> } else {
>     // log something or throw an exception
> }
> {code}
> This has a few issues:
> * the instanceof checks and casts are unnecessary overhead
> * it requires the implementer to be aware of every metric type
> * it encourages throwing an exception in the final else block
> We could remedy all of these by reworking the interface to contain explicit 
> add/remove methods for every metric type. This would however be a breaking 
> change and blow up the interface to 12 methods from the current 4. We could 
> also add a RichMetricReporter interface with these methods, which would 
> require relatively few changes but add additional complexity.
> I was wondering what other people think about this.





[GitHub] flink issue #2829: [hotfix] prevent RecordWriter#flush() to clear the serial...

2016-11-18 Thread NicoK
Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/2829
  
I don't expect this to change any behaviour: clearing the serializer 
twice does not actually hurt, it is merely a waste of resources, so FLINK-4719 
should not be affected at all.




[jira] [Commented] (FLINK-4719) KryoSerializer random exception

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676660#comment-15676660
 ] 

ASF GitHub Bot commented on FLINK-4719:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/2829
  
I don't expect this to change any behaviour: clearing the serializer 
twice does not actually hurt, it is merely a waste of resources, so FLINK-4719 
should not be affected at all.


> KryoSerializer random exception
> ---
>
> Key: FLINK-4719
> URL: https://issues.apache.org/jira/browse/FLINK-4719
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.1
>Reporter: Flavio Pompermaier
>  Labels: kryo, serialization
>
> There's a random exception that somehow involves the KryoSerializer when 
> using POJOs in Flink jobs that read large volumes of data.
> It is usually thrown in several places, e.g. (the exceptions reported here 
> may refer to previous versions of Flink):
> {code}
> java.lang.Exception: The data preparation for task 'CHAIN GroupReduce 
> (GroupReduce at createResult(IndexMappingExecutor.java:42)) -> Map (Map at 
> main(Jsonizer.java:128))' , caused an error: Error obtaining the sorted 
> input: Thread 'SortMerger spilling thread' terminated due to an exception: 
> Unable to find class: java.ttil.HashSet
> at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456)
> at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Error obtaining the sorted input: 
> Thread 'SortMerger spilling thread' terminated due to an exception: Unable to 
> find class: java.ttil.HashSet
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:619)
> at 
> org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1079)
> at 
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:94)
> at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
> ... 3 more
> Caused by: java.io.IOException: Thread 'SortMerger spilling thread' 
> terminated due to an exception: Unable to find class: java.ttil.HashSet
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:800)
> Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: 
> java.ttil.HashSet
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:641)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:752)
> at 
> com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:143)
> at 
> com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
> at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:228)
> at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:242)
> at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:252)
> at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.copy(PojoSerializer.java:556)
> at 
> org.apache.flink.api.java.typeutils.runtime.TupleSerializerBase.copy(TupleSerializerBase.java:75)
> at 
> org.apache.flink.runtime.operators.sort.NormalizedKeySorter.writeToOutput(NormalizedKeySorter.java:499)
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$SpillingThread.go(UnilateralSortMerger.java:1344)
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:796)
> Caused by: java.lang.ClassNotFoundException: java.ttil.HashSet
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> {code}
> {code}
> Caused by: java.io.IOException: Serializer consumed more bytes than the 
> record had. This indicates broken serialization. If

[jira] [Commented] (FLINK-4719) KryoSerializer random exception

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676633#comment-15676633
 ] 

ASF GitHub Bot commented on FLINK-4719:
---

Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/2829
  
Could this be a possible fix for FLINK-4719?


> KryoSerializer random exception
> ---
>
> Key: FLINK-4719
> URL: https://issues.apache.org/jira/browse/FLINK-4719
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.1
>Reporter: Flavio Pompermaier
>  Labels: kryo, serialization
>
> There's a random exception that somehow involves the KryoSerializer when 
> using POJOs in Flink jobs that read large volumes of data.
> It is usually thrown in several places, e.g. (the exceptions reported here 
> may refer to previous versions of Flink):
> {code}
> java.lang.Exception: The data preparation for task 'CHAIN GroupReduce 
> (GroupReduce at createResult(IndexMappingExecutor.java:42)) -> Map (Map at 
> main(Jsonizer.java:128))' , caused an error: Error obtaining the sorted 
> input: Thread 'SortMerger spilling thread' terminated due to an exception: 
> Unable to find class: java.ttil.HashSet
> at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456)
> at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Error obtaining the sorted input: 
> Thread 'SortMerger spilling thread' terminated due to an exception: Unable to 
> find class: java.ttil.HashSet
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:619)
> at 
> org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1079)
> at 
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:94)
> at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
> ... 3 more
> Caused by: java.io.IOException: Thread 'SortMerger spilling thread' 
> terminated due to an exception: Unable to find class: java.ttil.HashSet
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:800)
> Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: 
> java.ttil.HashSet
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:641)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:752)
> at 
> com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:143)
> at 
> com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
> at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:228)
> at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:242)
> at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:252)
> at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.copy(PojoSerializer.java:556)
> at 
> org.apache.flink.api.java.typeutils.runtime.TupleSerializerBase.copy(TupleSerializerBase.java:75)
> at 
> org.apache.flink.runtime.operators.sort.NormalizedKeySorter.writeToOutput(NormalizedKeySorter.java:499)
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$SpillingThread.go(UnilateralSortMerger.java:1344)
> at 
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:796)
> Caused by: java.lang.ClassNotFoundException: java.ttil.HashSet
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> {code}
> {code}
> Caused by: java.io.IOException: Serializer consumed more bytes than the 
> record had. This indicates broken serialization. If you are using custom 
> serialization types (Value or Writable), check their serialization methods. 
> If you are using a Kryo-se

[GitHub] flink issue #2829: Hotfix 2016 11 18

2016-11-18 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2829
  
Could you modify the PR title to something more descriptive?




[GitHub] flink issue #2829: Hotfix 2016 11 18

2016-11-18 Thread fpompermaier
Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/2829
  
Could this be a possible fix for FLINK-4719?




[GitHub] flink pull request #2829: Hotfix 2016 11 18

2016-11-18 Thread NicoK
Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/2829#discussion_r88649470
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpanningRecordSerializer.java
 ---
@@ -151,6 +176,15 @@ private SerializationResult getSerializationResult() {
return SerializationResult.PARTIAL_RECORD_MEMORY_SEGMENT_FULL;
}
 
+   /**
+* Gets the currently set target buffer and sets its size to the actual
+* number of written bytes.
+*
+* After calling this method, a new target buffer is required to 
continue
+* writing (see {@link #setNextBuffer(Buffer)}).
+*
+* @return the target buffer that was used
+*/
--- End diff --

I wasn't quite sure whether this is the intended behaviour for all (future) 
implementations of the interface - at the moment there is only this one, though, 
so I described its actual behaviour.




[GitHub] flink pull request #2829: Hotfix 2016 11 18

2016-11-18 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2829#discussion_r88649329
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpanningRecordSerializer.java
 ---
@@ -151,6 +176,15 @@ private SerializationResult getSerializationResult() {
return SerializationResult.PARTIAL_RECORD_MEMORY_SEGMENT_FULL;
}
 
+   /**
+* Gets the currently set target buffer and sets its size to the actual
+* number of written bytes.
+*
+* After calling this method, a new target buffer is required to 
continue
+* writing (see {@link #setNextBuffer(Buffer)}).
+*
+* @return the target buffer that was used
+*/
--- End diff --

Shouldn't these javadocs be added to the RecordSerializer interface 
instead? (the same applies to other implemented methods)




[GitHub] flink pull request #2829: Hotfix 2016 11 18

2016-11-18 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/2829

Hotfix 2016 11 18

Prevent RecordWriter#flush() from clearing the serializer twice.
Also add some documentation to RecordWriter, RecordSerializer and 
SpanningRecordSerializer.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink hotfix_2016-11-18

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2829.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2829


commit 272737006e150665111fdfcd1e3f2194bd7243af
Author: Nico Kruber 
Date:   2016-11-17T12:57:18Z

[hotfix] add javadocs to SpanningRecordSerializer and RecordSerializer

commit c8f4db9572521f3a1eba4bef33184178128917b7
Author: Nico Kruber 
Date:   2016-11-17T13:01:52Z

[hotfix] no need to clear the serializer twice in RecordWriter#flush()

commit 85ac76e097a482ef89d6c3d1a7e913b666de8159
Author: Nico Kruber 
Date:   2016-11-17T16:52:09Z

[hotfix] further javadoc tweaks in RecordWriter






[jira] [Updated] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-5094:
-
Description: 
Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
aggregation functions in order to initialize the functions via {{open()}}.

The main problem is that we do not want to provide the full power of 
{{RichFunction}} for incremental aggregation functions, such as defining their 
own operator state. This could be achieved by providing some kind of 
{{RestrictedRuntimeContext}}.

  was:Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental 
window aggregation functions in order to initialize the functions via 
{{open()}}.


> Support RichReduceFunction and RichFoldFunction as incremental window 
> aggregation functions
> ---
>
> Key: FLINK-5094
> URL: https://issues.apache.org/jira/browse/FLINK-5094
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, Windowing Operators
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>
> Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
> aggregation functions in order to initialize the functions via {{open()}}.
> The main problem is that we do not want to provide the full power of 
> {{RichFunction}} for incremental aggregation functions, such as defining their 
> own operator state. This could be achieved by providing some kind of 
> {{RestrictedRuntimeContext}}.
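For context, the kind of usage this issue would enable looks roughly like the sketch 
below; the snippet is only illustrative, and with the versions discussed here the 
rich variant is rejected for incremental window aggregation.

{code}
import org.apache.flink.api.common.functions.RichReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

/** Illustrative only: a rich reduce function as the incremental window
 *  aggregate, so that open() can be used for initialization. */
public class RichIncrementalAggregation {

	public static void main(String[] args) throws Exception {
		StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

		env.fromElements(Tuple2.of("a", 1L), Tuple2.of("a", 2L), Tuple2.of("b", 3L))
			.keyBy(0)
			.window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
			.reduce(new RichReduceFunction<Tuple2<String, Long>>() {
				@Override
				public void open(Configuration parameters) {
					// the initialization hook this issue wants to make usable
				}

				@Override
				public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
					return Tuple2.of(a.f0, a.f1 + b.f1);
				}
			})
			.print();

		env.execute("rich incremental aggregation sketch");
	}
}
{code}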





[jira] [Created] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5094:


 Summary: Support RichReduceFunction and RichFoldFunction as 
incremental window aggregation functions
 Key: FLINK-5094
 URL: https://issues.apache.org/jira/browse/FLINK-5094
 Project: Flink
  Issue Type: Improvement
  Components: Windowing Operators
Affects Versions: 1.1.3, 1.2.0
Reporter: Fabian Hueske


Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
aggregation functions in order to initialize the functions via {{open()}}.





[jira] [Updated] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-5094:
-
Component/s: Streaming

> Support RichReduceFunction and RichFoldFunction as incremental window 
> aggregation functions
> ---
>
> Key: FLINK-5094
> URL: https://issues.apache.org/jira/browse/FLINK-5094
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, Windowing Operators
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>
> Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
> aggregation functions in order to initialize the functions via {{open()}}.





[jira] [Commented] (FLINK-2609) Automatic type registration is only called from the batch execution environment

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676468#comment-15676468
 ] 

ASF GitHub Bot commented on FLINK-2609:
---

Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/1833


> Automatic type registration is only called from the batch execution 
> environment
> ---
>
> Key: FLINK-2609
> URL: https://issues.apache.org/jira/browse/FLINK-2609
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>Assignee: Chesnay Schepler
>
> Kryo types in the streaming API are quite expensive to serialize because they 
> are not automatically registered with Kryo.





[GitHub] flink pull request #1833: [FLINK-2609] [streaming] auto-register types

2016-11-18 Thread zentol
Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/1833




[jira] [Closed] (FLINK-5054) Make the BucketingSink rescalable.

2016-11-18 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-5054.
---
Resolution: Fixed

Fixed in 13ebb36bb6c7fbe591d9e7834a2fc34d8469bc00

> Make the BucketingSink rescalable.
> --
>
> Key: FLINK-5054
> URL: https://issues.apache.org/jira/browse/FLINK-5054
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector
>Affects Versions: 1.2.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.2.0
>
>
> Aims at integrating the BucketingSink with the rescalable state 
> abstractions so that the parallelism can change when 
> restoring from a savepoint without sacrificing the provided guarantees.




