[jira] [Commented] (FLINK-4406) Implement job master registration at resource manager

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544260#comment-15544260
 ] 

ASF GitHub Bot commented on FLINK-4406:
---

Github user KurtYoung closed the pull request at:

https://github.com/apache/flink/pull/2565


> Implement job master registration at resource manager
> -
>
> Key: FLINK-4406
> URL: https://issues.apache.org/jira/browse/FLINK-4406
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: Wenlong Lyu
>Assignee: Kurt Young
>
> The Job Master needs to register with the Resource Manager at startup, then 
> watch for leadership changes of the RM and trigger re-registration.
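
A minimal sketch of that register-and-watch loop (all class and method names
here are illustrative, not the actual FLIP-6 interfaces):

{code:java}
import java.util.UUID;

// Illustrative sketch: re-register with the ResourceManager whenever a new
// RM leader is announced. None of these names are the real FLIP-6 classes.
public class JobMasterRegistrationSketch {

    private volatile String currentRmAddress;

    // Invoked once at startup and again on every RM leadership change.
    public void notifyResourceManagerLeader(String rmAddress, UUID leaderSessionId) {
        if (rmAddress != null && !rmAddress.equals(currentRmAddress)) {
            currentRmAddress = rmAddress;
            registerAtResourceManager(rmAddress, leaderSessionId);
        }
    }

    private void registerAtResourceManager(String rmAddress, UUID leaderSessionId) {
        // send a registration RPC to the new leader and retry with backoff
        // until the registration is acknowledged
    }
}
{code}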



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4657) Implement HighAvailabilityServices based on zookeeper

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544259#comment-15544259
 ] 

ASF GitHub Bot commented on FLINK-4657:
---

Github user KurtYoung closed the pull request at:

https://github.com/apache/flink/pull/2550


> Implement HighAvailabilityServices based on zookeeper
> -
>
> Key: FLINK-4657
> URL: https://issues.apache.org/jira/browse/FLINK-4657
> Project: Flink
>  Issue Type: New Feature
>  Components: Cluster Management
>Reporter: Kurt Young
>Assignee: Kurt Young
>
> For FLIP-6, we will have the ResourceManager and every JobManager as potential 
> leader contenders and retrievers. We should separate them by using different 
> ZooKeeper paths. 
> For example, the path could be /leader/resource-manager for the RM, and for 
> each JM, the path could be /leader/job-managers/JobID.
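
A rough sketch of that path layout using Apache Curator's {{LeaderLatch}}
(Flink's ZooKeeper utilities build on Curator; the actual classes used by the
HA services may differ, and the JobID below is only an example):

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeaderPathsSketch {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // one contender path for the ResourceManager ...
        LeaderLatch rmLatch = new LeaderLatch(client, "/leader/resource-manager");

        // ... and one path per JobManager, keyed by the job's ID
        String jobId = "2f4c6bbd3b1170e0f45bfbd345645ac2"; // example JobID
        LeaderLatch jmLatch = new LeaderLatch(client, "/leader/job-managers/" + jobId);

        rmLatch.start();
        jmLatch.start();
    }
}
{code}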



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2550: [FLINK-4657] Implement HighAvailabilityServices ba...

2016-10-03 Thread KurtYoung
Github user KurtYoung closed the pull request at:

https://github.com/apache/flink/pull/2550


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2565: [FLINK-4406] [cluster management] Implement job ma...

2016-10-03 Thread KurtYoung
Github user KurtYoung closed the pull request at:

https://github.com/apache/flink/pull/2565


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544170#comment-15544170
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/4/16 3:17 AM:
-

One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source to 
transform the stream into a {{KeyedStream}} before windowing. I think, in use 
cases like yours, you would benefit if the stream returned from the source were 
already a {{KeyedStream}}, to take advantage of data pre-partitioned outside of 
Flink. Not sure if it makes sense yet, but I'll think a bit about this idea 
(it would be great if you could let me know what you think of it)!

In any case, writing your own custom consumer also works for now :)
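
As a sketch, such a schema could look like the following against the 1.1-era
API (the payload is kept as raw bytes here, and the class name is made up):

{code:java}
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema;

// Tags every record with the Kafka partition it was read from, so no extra
// map function is needed for the keying.
public class PartitionTaggingSchema
        implements KeyedDeserializationSchema<Tuple2<Integer, byte[]>> {

    @Override
    public Tuple2<Integer, byte[]> deserialize(
            byte[] messageKey, byte[] message,
            String topic, int partition, long offset) {
        // a real job would deserialize 'message' into its matrix type here
        return Tuple2.of(partition, message);
    }

    @Override
    public boolean isEndOfStream(Tuple2<Integer, byte[]> nextElement) {
        return false; // unbounded stream
    }

    @Override
    public TypeInformation<Tuple2<Integer, byte[]>> getProducedType() {
        return TypeInformation.of(new TypeHint<Tuple2<Integer, byte[]>>() {});
    }
}
{code}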


was (Author: tzulitai):
One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source to 
transform the stream into a {{KeyedStream}} before windowing. I think, in use 
cases like yours, you would benefit if the stream returned from the source were 
already a {{KeyedStream}}, to take advantage of data pre-partitioned outside of 
Flink. I'll think a bit about this idea (it would be great if you could let me 
know what you think of it)!

In any case, writing your own custom consumer also works for now :)

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544170#comment-15544170
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/4/16 3:04 AM:
-

One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source to 
transform the stream into a {{KeyedStream}} before windowing. I think, in use 
cases like yours, you would benefit if the stream returned from the source were 
already a {{KeyedStream}}, to take advantage of data pre-partitioned outside of 
Flink. I'll think a bit about this idea (it would be great if you could let me 
know what you think of it)!

In any case, writing your own custom consumer also works for now :)


was (Author: tzulitai):
One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source to 
transform the stream into a {{KeyedStream}} before windowing. I think, in use 
cases like yours, you would benefit if the stream returned from the source were 
already a {{KeyedStream}}, to take advantage of data pre-partitioned outside of 
Flink. I'll think a bit about this idea!

In any case, writing your own custom consumer also works for now :)

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544170#comment-15544170
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/4/16 3:01 AM:
-

One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source to 
transform the stream into a {{KeyedStream}} before windowing. I think, in use 
cases like yours, you would benefit if the stream returned from the source were 
already a {{KeyedStream}}, to take advantage of data pre-partitioned outside of 
Flink. I'll think a bit about this idea!

In any case, writing your own custom consumer also works for now :)


was (Author: tzulitai):
One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source. I 
think, in use cases like yours, you would benefit if the stream returned from 
the source were already a {{KeyedStream}}, to take advantage of data 
pre-partitioned outside of Flink. I'll think a bit about this idea!

In any case, writing your own custom consumer also works for now :)

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544170#comment-15544170
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4722:


One last comment:
Iterating over all data shouldn't be required. Say you have a parallelism of 3 
(3 subtasks) for the FlinkKafkaConsumer. Each subtask will be assigned a single 
partition and read only that partition. Using a {{KeyedDeserializationSchema}}, 
you can key each data record with the partition id "within the source". That 
means you won't need another map function to do this keying. The data that 
comes out of the FlinkKafkaConsumer will be {{Tuple(partition, matrixData)}}.

The only "iteration" would probably be a "keyBy" operation after the source. I 
think, in use cases like yours, you would benefit if the stream returned from 
the source were already a {{KeyedStream}}, to take advantage of data 
pre-partitioned outside of Flink. I'll think a bit about this idea!

In any case, writing your own custom consumer also works for now :)

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2425: FLINK-3930 Added shared secret based authorization for Fl...

2016-10-03 Thread vijikarthi
Github user vijikarthi commented on the issue:

https://github.com/apache/flink/pull/2425
  
Addressed [FLINK-4635] Netty data transfer authentication (missing piece of 
FLINK-3930)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3930) Implement Service-Level Authorization

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543956#comment-15543956
 ] 

ASF GitHub Bot commented on FLINK-3930:
---

Github user vijikarthi commented on the issue:

https://github.com/apache/flink/pull/2425
  
Addressed [FLINK-4635] Netty data transfer authentication (missing piece of 
FLINK-3930)


> Implement Service-Level Authorization
> -
>
> Key: FLINK-3930
> URL: https://issues.apache.org/jira/browse/FLINK-3930
> Project: Flink
>  Issue Type: New Feature
>  Components: Security
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>  Labels: security
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> _This issue is part of a series of improvements detailed in the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Service-level authorization is the initial authorization mechanism to ensure 
> clients (or servers) connecting to the Flink cluster are authorized to do so. 
>   The purpose is to prevent a cluster from being used by an unauthorized 
> user, whether to execute jobs, disrupt cluster functionality, or gain access 
> to secrets stored within the cluster.
> Implement service-level authorization as described in the design doc.
> - Introduce a shared secret cookie
> - Enable Akka security cookie
> - Implement data transfer authentication
> - Secure the web dashboard
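
The design doc has the details; purely as a generic illustration of the
shared-secret-cookie idea (a sketch, not Flink's implementation), the cookie
check should use a constant-time comparison:

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public final class SecretCookieValidator {

    private final byte[] expected;

    public SecretCookieValidator(String sharedSecret) {
        this.expected = sharedSecret.getBytes(StandardCharsets.UTF_8);
    }

    // MessageDigest.isEqual compares in constant time, so response timing
    // does not leak how many leading bytes of the secret were correct.
    public boolean isAuthorized(byte[] presentedCookie) {
        return MessageDigest.isEqual(expected, presentedCookie);
    }
}
{code}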



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3999) Rename the `running` flag in the drivers to `canceled`

2016-10-03 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543749#comment-15543749
 ] 

Neelesh Srinivas Salian commented on FLINK-3999:


Apologies. Haven't had the chance to work on this. Will post a PR soon.


> Rename the `running` flag in the drivers to `canceled`
> --
>
> Key: FLINK-3999
> URL: https://issues.apache.org/jira/browse/FLINK-3999
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Reporter: Gabor Gevay
>Assignee: Neelesh Srinivas Salian
>Priority: Trivial
>
> The name of the {{running}} flag in the drivers doesn't reflect its usage: 
> when the operator just stops normally, it is not running anymore, but the 
> {{running}} flag will still be true, since the flag is only updated when 
> cancelling.
> It should be renamed, and its value inverted.
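
A minimal before/after sketch of the rename (illustrative, not the actual
driver code):

{code:java}
// Before: 'running' stays true after a normal stop, despite its name.
class DriverBefore {
    volatile boolean running = true;
    void cancel() { running = false; } // the only place the flag changes
}

// After: the name matches the one event that mutates the flag.
class DriverAfter {
    volatile boolean canceled = false;
    void cancel() { canceled = true; }
}
{code}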



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4728) Replace reference equality with object equality

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543370#comment-15543370
 ] 

ASF GitHub Bot commented on FLINK-4728:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2582

[FLINK-4728] [core,optimizer] Replace reference equality with object 
equality

Some cases of testing Integer equality using == rather than 
Integer.equals(Integer), and some additional cleanup.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4728_replace_reference_equality_with_object_equality

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2582.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2582


commit 409883eb3c3ffa7a94ad25e67cfd6dfdc25010ba
Author: Greg Hogan 
Date:   2016-10-03T17:59:57Z

[FLINK-4728] [core,optimizer] Replace reference equality with object 
equality

Some cases of testing Integer equality using == rather than
Integer.equals(Integer), and some additional cleanup.




> Replace reference equality with object equality
> ---
>
> Key: FLINK-4728
> URL: https://issues.apache.org/jira/browse/FLINK-4728
> Project: Flink
>  Issue Type: Improvement
>  Components: Core, Optimizer
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
> Fix For: 1.2.0
>
>
> Some cases of testing {{Integer}} equality using {{==}} rather than 
> {{Integer.equals(Integer)}}, and some additional cleanup.
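
The pitfall in question, in brief (standard Java boxing: only values in
-128..127 are cached by {{Integer.valueOf}}):

{code:java}
public class BoxedEquality {
    public static void main(String[] args) {
        Integer a = 127, b = 127;
        Integer c = 128, d = 128;
        System.out.println(a == b);      // true  -- both come from the cache
        System.out.println(c == d);      // false -- two distinct boxed objects
        System.out.println(c.equals(d)); // true  -- value comparison, always safe
    }
}
{code}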



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4406) Implement job master registration at resource manager

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543365#comment-15543365
 ] 

ASF GitHub Bot commented on FLINK-4406:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2565
  
Thanks for your contribution @KurtYoung. I've merged the code. You can 
close the PR now.


> Implement job master registration at resource manager
> -
>
> Key: FLINK-4406
> URL: https://issues.apache.org/jira/browse/FLINK-4406
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: Wenlong Lyu
>Assignee: Kurt Young
>
> The Job Master needs to register with the Resource Manager at startup, then 
> watch for leadership changes of the RM and trigger re-registration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2582: [FLINK-4728] [core,optimizer] Replace reference eq...

2016-10-03 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2582

[FLINK-4728] [core,optimizer] Replace reference equality with object 
equality

Some cases of testing Integer equality using == rather than 
Integer.equals(Integer), and some additional cleanup.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4728_replace_reference_equality_with_object_equality

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2582.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2582


commit 409883eb3c3ffa7a94ad25e67cfd6dfdc25010ba
Author: Greg Hogan 
Date:   2016-10-03T17:59:57Z

[FLINK-4728] [core,optimizer] Replace reference equality with object 
equality

Some cases of testing Integer equality using == rather than
Integer.equals(Integer), and some additional cleanup.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4657) Implement HighAvailabilityServices based on zookeeper

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543367#comment-15543367
 ] 

ASF GitHub Bot commented on FLINK-4657:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2550
  
Thanks for your contribution @KurtYoung. I've merged the code. You can 
close the PR now.


> Implement HighAvailabilityServices based on zookeeper
> -
>
> Key: FLINK-4657
> URL: https://issues.apache.org/jira/browse/FLINK-4657
> Project: Flink
>  Issue Type: New Feature
>  Components: Cluster Management
>Reporter: Kurt Young
>Assignee: Kurt Young
>
> For FLIP-6, we will have the ResourceManager and every JobManager as potential 
> leader contenders and retrievers. We should separate them by using different 
> ZooKeeper paths. 
> For example, the path could be /leader/resource-manager for the RM, and for 
> each JM, the path could be /leader/job-managers/JobID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2565: [FLINK-4406] [cluster management] Implement job master re...

2016-10-03 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2565
  
Thanks for your contribution @KurtYoung. I've merged the code. You can 
close the PR now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2550: [FLINK-4657] Implement HighAvailabilityServices based on ...

2016-10-03 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2550
  
Thanks for your contribution @KurtYoung. I've merged the code. You can 
close the PR now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-4406) Implement job master registration at resource manager

2016-10-03 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-4406.
--
Resolution: Fixed

Added via ba2b59096022b480a70f6410d9b1643da5158608

> Implement job master registration at resource manager
> -
>
> Key: FLINK-4406
> URL: https://issues.apache.org/jira/browse/FLINK-4406
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: Wenlong Lyu
>Assignee: Kurt Young
>
> The Job Master needs to register with the Resource Manager at startup, then 
> watch for leadership changes of the RM and trigger re-registration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4478) Implement heartbeat logic

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543355#comment-15543355
 ] 

ASF GitHub Bot commented on FLINK-4478:
---

Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/2435


> Implement heartbeat logic
> -
>
> Key: FLINK-4478
> URL: https://issues.apache.org/jira/browse/FLINK-4478
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.2.0
>
>
> With the Flip-6 refactoring, we'll have the need for a dedicated heartbeat 
> component. The heartbeat component is used to check the liveness of the 
> distributed components among each other. Furthermore, heartbeats are used to 
> regularly transmit status updates to another component. For example, the 
> TaskManager informs the ResourceManager with each heartbeat about the current 
> slot allocation.
> The heartbeat is initiated from one component. This component sends a 
> heartbeat request to another component, which answers with a heartbeat 
> response. Thus, one can differentiate between a sending and a receiving side. 
> Apart from the triggering of the heartbeat request, the logic for treating 
> heartbeats, marking components dead, and delivering payloads is the same and 
> should be reusable by the different distributed components (JM, TM, RM).
> Different models for the heartbeat reporting are conceivable. First of all, 
> the heartbeat request could be sent as an ask operation where the heartbeat 
> response is returned as a future on the sending side. Alternatively, the 
> sending side could request a heartbeat response by sending a tell message. 
> The heartbeat response is then delivered by an RPC back to the heartbeat 
> sender. The latter model has the advantage that a heartbeat response is not 
> tightly coupled to a heartbeat request. Such a tight coupling could cause 
> heartbeat responses to be ignored after the future has timed out, even though 
> they might still contain valuable information (the receiver is still alive).
> Furthermore, different strategies for triggering heartbeats and for marking 
> heartbeat targets as dead are conceivable. For example, we could trigger a 
> heartbeat request periodically (with a fixed period) and mark all targets as 
> dead if we didn't receive a heartbeat response within a given time period. 
> Furthermore, we could adapt the heartbeat interval and heartbeat timeouts 
> with respect to the latency of previous heartbeat responses. This would 
> better reflect the current load and network conditions.
> For the first version, I would propose to use a fixed-period heartbeat with a 
> maximum heartbeat timeout before a target is marked dead. Furthermore, I 
> would propose to use tell messages (fire and forget) to request and report 
> heartbeats because they are the more flexible model, imho.
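
A minimal sketch of the proposed fixed-period, fire-and-forget model
(illustrative names, not the actual {{HeartbeatManager}} from PR #2435):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sending side: periodically fire heartbeat requests as tell messages and
// mark a target dead when no response arrived within the timeout.
public class FixedPeriodHeartbeatSender {

    private final Map<String, Long> lastResponse = new ConcurrentHashMap<>();
    private final long timeoutMs;

    public FixedPeriodHeartbeatSender(long periodMs, long timeoutMs) {
        this.timeoutMs = timeoutMs;
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::trigger, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    public void monitorTarget(String target) {
        lastResponse.put(target, System.currentTimeMillis());
    }

    // Called by the RPC layer whenever a heartbeat response (plus payload,
    // e.g. the TaskManager's slot allocation) comes back.
    public void reportHeartbeat(String target) {
        lastResponse.put(target, System.currentTimeMillis());
    }

    private void trigger() {
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> e : lastResponse.entrySet()) {
            if (now - e.getValue() > timeoutMs) {
                markDead(e.getKey());             // no response in time
            } else {
                sendHeartbeatRequest(e.getKey()); // fire-and-forget tell
            }
        }
    }

    private void sendHeartbeatRequest(String target) { /* RPC tell message */ }

    private void markDead(String target) {
        lastResponse.remove(target); // a real impl would notify listeners
    }
}
{code}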



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2435: [FLINK-4478] [flip-6] Add HeartbeatManager

2016-10-03 Thread tillrohrmann
Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/2435


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3888) Custom Aggregator with Convergence can't be registered directly with DeltaIteration

2016-10-03 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543308#comment-15543308
 ] 

Greg Hogan commented on FLINK-3888:
---

Okay, that sounds better than requiring the user to additionally test for 
workset convergence.

> Custom Aggregator with Convergence can't be registered directly with 
> DeltaIteration
> ---
>
> Key: FLINK-3888
> URL: https://issues.apache.org/jira/browse/FLINK-3888
> Project: Flink
>  Issue Type: Bug
>  Components: Iterations
>Reporter: Martin Liesenberg
>
> Contrary to the 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/iterations.html], 
> the method to add an aggregator with a custom convergence criterion to a 
> DeltaIteration is not exposed directly on {{DeltaIteration}}; it can only be 
> accessed via the {{aggregatorRegistry}}.
> Moreover, when registering an aggregator with a custom convergence criterion 
> and running the program, the following exception appears in the logs:
> {noformat}
> Error: Cannot use custom convergence criterion with workset iteration. 
> Workset iterations have implicit convergence criterion where workset is empty.
> org.apache.flink.optimizer.CompilerException: Error: Cannot use custom 
> convergence criterion with workset iteration. Workset iterations have 
> implicit convergence criterion where workset is empty.
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.finalizeWorksetIteration(JobGraphGenerator.java:1518)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:198)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:164)
>   at 
> org.apache.flink.test.util.TestEnvironment.execute(TestEnvironment.java:76)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
> {noformat}
> The issue has been found while discussing FLINK-2926



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3888) Custom Aggregator with Convergence can't be registered directly with DeltaIteration

2016-10-03 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543256#comment-15543256
 ] 

Vasia Kalavri commented on FLINK-3888:
--

We shouldn't override the default convergence criterion of the delta iteration. 
When the workset is empty there's no work to do. Instead, if a custom criterion 
is provided, the convergence condition should be the disjunction of the two.

> Custom Aggregator with Convergence can't be registered directly with 
> DeltaIteration
> ---
>
> Key: FLINK-3888
> URL: https://issues.apache.org/jira/browse/FLINK-3888
> Project: Flink
>  Issue Type: Bug
>  Components: Iterations
>Reporter: Martin Liesenberg
>
> Contrary to the 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/iterations.html], 
> the method to add an aggregator with a custom convergence criterion to a 
> DeltaIteration is not exposed directly on {{DeltaIteration}}; it can only be 
> accessed via the {{aggregatorRegistry}}.
> Moreover, when registering an aggregator with a custom convergence criterion 
> and running the program, the following exception appears in the logs:
> {noformat}
> Error: Cannot use custom convergence criterion with workset iteration. 
> Workset iterations have implicit convergence criterion where workset is empty.
> org.apache.flink.optimizer.CompilerException: Error: Cannot use custom 
> convergence criterion with workset iteration. Workset iterations have 
> implicit convergence criterion where workset is empty.
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.finalizeWorksetIteration(JobGraphGenerator.java:1518)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:198)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:164)
>   at 
> org.apache.flink.test.util.TestEnvironment.execute(TestEnvironment.java:76)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
> {noformat}
> The issue has been found while discussing FLINK-2926



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3888) Custom Aggregator with Convergence can't be registered directly with DeltaIteration

2016-10-03 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543190#comment-15543190
 ] 

Greg Hogan commented on FLINK-3888:
---

This looks quite useful. Why have a function to enable the default rather than 
having {{TaskConfig.setConvergenceCriterion()}} override the default? Do we 
need {{TaskConfig.setDefaultConvergeCriterion()}}?

> Custom Aggregator with Convergence can't be registered directly with 
> DeltaIteration
> ---
>
> Key: FLINK-3888
> URL: https://issues.apache.org/jira/browse/FLINK-3888
> Project: Flink
>  Issue Type: Bug
>  Components: Iterations
>Reporter: Martin Liesenberg
>
> Contrary to the 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/iterations.html], 
> the method to add an aggregator with a custom convergence criterion to a 
> DeltaIteration is not exposed directly on {{DeltaIteration}}; it can only be 
> accessed via the {{aggregatorRegistry}}.
> Moreover, when registering an aggregator with a custom convergence criterion 
> and running the program, the following exception appears in the logs:
> {noformat}
> Error: Cannot use custom convergence criterion with workset iteration. 
> Workset iterations have implicit convergence criterion where workset is empty.
> org.apache.flink.optimizer.CompilerException: Error: Cannot use custom 
> convergence criterion with workset iteration. Workset iterations have 
> implicit convergence criterion where workset is empty.
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.finalizeWorksetIteration(JobGraphGenerator.java:1518)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:198)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:164)
>   at 
> org.apache.flink.test.util.TestEnvironment.execute(TestEnvironment.java:76)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
> {noformat}
> The issue has been found while discussing FLINK-2926



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3888) Custom Aggregator with Convergence can't be registered directly with DeltaIteration

2016-10-03 Thread Vasia Kalavri (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasia Kalavri updated FLINK-3888:
-
Component/s: Iterations

> Custom Aggregator with Convergence can't be registered directly with 
> DeltaIteration
> ---
>
> Key: FLINK-3888
> URL: https://issues.apache.org/jira/browse/FLINK-3888
> Project: Flink
>  Issue Type: Bug
>  Components: Iterations
>Reporter: Martin Liesenberg
>
> Contrary to the 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/iterations.html], 
> the method to add an aggregator with a custom convergence criterion to a 
> DeltaIteration is not exposed directly on {{DeltaIteration}}; it can only be 
> accessed via the {{aggregatorRegistry}}.
> Moreover, when registering an aggregator with a custom convergence criterion 
> and running the program, the following exception appears in the logs:
> {noformat}
> Error: Cannot use custom convergence criterion with workset iteration. 
> Workset iterations have implicit convergence criterion where workset is empty.
> org.apache.flink.optimizer.CompilerException: Error: Cannot use custom 
> convergence criterion with workset iteration. Workset iterations have 
> implicit convergence criterion where workset is empty.
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.finalizeWorksetIteration(JobGraphGenerator.java:1518)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:198)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:164)
>   at 
> org.apache.flink.test.util.TestEnvironment.execute(TestEnvironment.java:76)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
> {noformat}
> The issue has been found while discussing FLINK-2926



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3888) Custom Aggregator with Convergence can't be registered directly with DeltaIteration

2016-10-03 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543133#comment-15543133
 ] 

Vasia Kalavri commented on FLINK-3888:
--

I had a quick look into this and I don't see any fundamental reason why we 
can't add a custom convergence criterion in delta iterations. It seems that 
only one convergence criterion can currently be registered, and since delta 
iterations have the {{WorksetEmptyConvergenceCriterion}} by default, adding a 
custom one is not possible.

So, one solution could be the following:
1. use {{TaskConfig.setConvergenceCriterion()}} to set the custom, user-defined 
convergence criterion (like in the case of bulk iteration)
2. add a new method {{TaskConfig.setDefaultConvergeCriterion()}} to add the 
default empty workset convergence
3. check both criteria in 
{{IterationSynchronizationSinkTask.checkForConvergence()}}
4. expose the custom convergence criterion in {{DeltaIteration}}

If I'm not missing something and this seems acceptable, I'd like to resolve this 
issue. Custom convergence would be helpful in several Gelly algorithms.
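
For reference, a user-defined criterion would implement Flink's
{{ConvergenceCriterion}}; a minimal sketch (the aggregator wiring is omitted
and the threshold logic is only an example). Under the proposal above, step 3
then reduces to the disjunction {{worksetEmpty || custom.isConverged(...)}}:

{code:java}
import org.apache.flink.api.common.aggregators.ConvergenceCriterion;
import org.apache.flink.types.LongValue;

// Example: converge once fewer than 'threshold' elements changed in the
// last superstep (the count would be collected via a LongSumAggregator).
public class UpdatedElementsCriterion implements ConvergenceCriterion<LongValue> {

    private final long threshold;

    public UpdatedElementsCriterion(long threshold) {
        this.threshold = threshold;
    }

    @Override
    public boolean isConverged(int iteration, LongValue value) {
        return value.getValue() < threshold;
    }
}
{code}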

> Custom Aggregator with Convergence can't be registered directly with 
> DeltaIteration
> ---
>
> Key: FLINK-3888
> URL: https://issues.apache.org/jira/browse/FLINK-3888
> Project: Flink
>  Issue Type: Bug
>  Components: Iterations
>Reporter: Martin Liesenberg
>
> Contrary to the 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/iterations.html], 
> the method to add an aggregator with a custom convergence criterion to a 
> DeltaIteration is not exposed directly on {{DeltaIteration}}; it can only be 
> accessed via the {{aggregatorRegistry}}.
> Moreover, when registering an aggregator with a custom convergence criterion 
> and running the program, the following exception appears in the logs:
> {noformat}
> Error: Cannot use custom convergence criterion with workset iteration. 
> Workset iterations have implicit convergence criterion where workset is empty.
> org.apache.flink.optimizer.CompilerException: Error: Cannot use custom 
> convergence criterion with workset iteration. Workset iterations have 
> implicit convergence criterion where workset is empty.
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.finalizeWorksetIteration(JobGraphGenerator.java:1518)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:198)
>   at 
> org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:164)
>   at 
> org.apache.flink.test.util.TestEnvironment.execute(TestEnvironment.java:76)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
> {noformat}
> The issue has been found while discussing FLINK-2926



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Sudhanshu Sekhar Lenka (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543036#comment-15543036
 ] 

Sudhanshu Sekhar Lenka commented on FLINK-4722:
---

According to your solution, I have to iterate over all data that comes to each 
consumer, which is an issue: we don't want to iterate over all data for 
performance reasons.

Thanks for your suggestion. By writing my own custom consumer I was able to 
solve this issue.

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4729) Use optional VertexCentric CombineFunction

2016-10-03 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-4729:
-

 Summary: Use optional VertexCentric CombineFunction
 Key: FLINK-4729
 URL: https://issues.apache.org/jira/browse/FLINK-4729
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.2.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.2.0


Passes the {{CombineFunction}} through to {{VertexCentricIteration}}, plus other 
code cleanup discovered via IntelliJ's code analyzer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4728) Replace reference equality with object equality

2016-10-03 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-4728:
-

 Summary: Replace reference equality with object equality
 Key: FLINK-4728
 URL: https://issues.apache.org/jira/browse/FLINK-4728
 Project: Flink
  Issue Type: Improvement
  Components: Core, Optimizer
Affects Versions: 1.2.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.2.0


Some cases of testing {{Integer}} equality using {{==}} rather than 
{{Integer.equals(Integer)}}, and some additional cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542855#comment-15542855
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/3/16 4:58 PM:
-

I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer; the deserialization schema 
exposes the partition of the read record, so you can use that to key the data, 
ex. output {{Tuple(partitionId,matrixData)}} or your own POJO / case class 
directly from the FlinkKafkaConsumer09).
Then, on the keyed stream, you can perform windows on each key (i.e., in your 
case, each matrix) like you mentioned.

Does this help?
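
A sketch of that pipeline shape against the 1.1-era API
({{PartitionTaggingSchema}} stands for the kind of
{{KeyedDeserializationSchema}} just described, assumed to emit
{{Tuple2<partitionId, payload>}}; the reduce is a placeholder that keeps the
latest record):

{code:java}
import java.util.Properties;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;

public class PerPartitionWindows {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "myGroup");

        // one consumer reads all partitions; records arrive tagged by partition
        DataStream<Tuple2<Integer, byte[]>> tagged = env.addSource(
                new FlinkKafkaConsumer09<>("myTopic", new PartitionTaggingSchema(), props));

        tagged
            .keyBy(0)                       // key = Kafka partition id
            .timeWindow(Time.seconds(10))   // one window per partition/matrix
            .reduce((left, right) -> right) // placeholder: keep the latest record
            .print();

        env.execute("per-partition windows");
    }
}
{code}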


was (Author: tzulitai):
I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer; the deserialization schema 
exposes the partition of the read record, so you can use that to key the data, 
ex. output {{Tuple(partitionId,matrixData)}} or your own POJO / case class 
directly from the source).
Then, on the keyed stream, you can perform windows on each key (i.e., in your 
case, each matrix) like you mentioned.

Does this help?

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542855#comment-15542855
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/3/16 4:58 PM:
-

I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer; the deserialization schema 
exposes the partition of the read record, so you can use that to key the data, 
ex. output {{Tuple(partitionId,matrixData)}} or your own POJO / case class 
directly from the source).
Then, on the keyed stream, you can perform windows on each key (i.e., in your 
case, each matrix) like you mentioned.

Does this help?


was (Author: tzulitai):
I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer). Then, on the keyed stream, you 
can perform windows on each key (i.e., in your case, each matrix) like you 
mentioned.

Does this help?

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542855#comment-15542855
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/3/16 4:55 PM:
-

I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer). Then, on the keyed stream, you 
can perform windows on each key (i.e., in your case, each matrix) like you 
mentioned.

Does this help?


was (Author: tzulitai):
I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer). Then, on the keyed stream, you 
can perform windows on each key (i.e., in your case, each matrix) like you 
mentioned.

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542855#comment-15542855
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4722:


I'm not sure if I fully understand your use case, but you could consider this 
solution:
Use a single FlinkKafkaConsumer to read the topic, and key the input elements 
by the partition id (you can do this by supplying your own 
{{KeyedDeserializationSchema}} to the consumer). Then, on the keyed stream, you 
can perform windows on each key (i.e., in your case, each matrix) like you 
mentioned.

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Sudhanshu Sekhar Lenka (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542840#comment-15542840
 ] 

Sudhanshu Sekhar Lenka commented on FLINK-4722:
---

I was able to figure out a solution by writing my own FlinkConsumer extending 
FlinkKafkaConsumerBase and assigning each consumer instance to one partition:

private static List<KafkaTopicPartition> convertToFlinkKafkaTopicPartition(
        List<PartitionInfo> partitions, int partition) {
    checkNotNull(partitions);
    // keep only the one requested partition instead of all of them
    List<KafkaTopicPartition> ret = new ArrayList<>(partitions.size());
    ret.add(new KafkaTopicPartition(
            partitions.get(partition).topic(),
            partitions.get(partition).partition()));
    return ret;
}


> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Sudhanshu Sekhar Lenka (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542828#comment-15542828
 ] 

Sudhanshu Sekhar Lenka commented on FLINK-4722:
---

The real case is that I am sending 1 lakh (100k) records per second to one 
topic, with 6 different matrices (into 6 partitions). The problem is that I 
have to window specific matrix data and wait a certain time for another 
interface's stream to arrive. If I have to filter the matrix data on some 
condition again on the consumer side, which was already done on the producer 
side, filtering out the other 5 matrices' data causes a performance issue 
before putting the filtered matrix into a window.

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15542796#comment-15542796
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/3/16 4:35 PM:
-

Is there any reason / use case why you need to have 3 FlinkKafkaConsumers, and 
let each read a single partition?

Otherwise, a single FlinkKafkaConsumer09 can read all 3 partitions exactly-once 
(each partition is assigned to a source subtask; you can check out the details 
of how Flink assigns Kafka topic partitions to source subtasks here: 
http://data-artisans.com/kafka-flink-a-practical-how-to/; the FAQ part "How are 
Kafka partitions assigned to Flink workers?" should be of interest to you).


was (Author: tzulitai):
Is there any reason / use case why you need to have 3 FlinkKafkaConsumers, and 
let each read a single partition?

Otherwise, a single FlinkKafkaConsumer09 can read all 3 partitions exactly-once 
(each partition is assigned to a source subtask; you can check out the details 
of how Flink assigns Kafka topic partitions to source subtasks here: 
http://data-artisans.com/kafka-flink-a-practical-how-to/).

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property set to "myGroup", 
> each Flink consumer still gets all data pushed to all 3 partitions, while it 
> works properly with a normal Java consumer: each consumer gets only its own 
> partition's data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2002: Support for bz2 compression in flink-core

2016-10-03 Thread mtanski
Github user mtanski commented on a diff in the pull request:

https://github.com/apache/flink/pull/2002#discussion_r81589240
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/io/compression/InflaterInputStreamFactory.java
 ---
@@ -23,13 +23,12 @@
 import java.io.IOException;
 import java.io.InputStream;
 import java.util.Collection;
-import java.util.zip.InflaterInputStream;
 
 /**
  * Creates a new instance of a certain subclass of {@link 
java.util.zip.InflaterInputStream}.
  */
 @Internal
-public interface InflaterInputStreamFactory<T extends InflaterInputStream> {
+public interface InflaterInputStreamFactory<T extends InputStream> {
--- End diff --

This causes a build error due to binary backwards compatibility.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542796#comment-15542796
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/3/16 4:30 PM:
-

Is there any reason / use case why you need to have 3 FlinkKafkaConsumers, and 
let each read a single partition?

Otherwise, a single FlinkKafkaConsumer09 can read all 3 partitions exactly-once 
(each partition is assigned to a source subtask, you can check out the details 
of how Flink assigns Kafka topic partitions to source subtasks here: 
http://data-artisans.com/kafka-flink-a-practical-how-to/).


was (Author: tzulitai):
Is there any reason / use case why you need to have 3 FlinkKafkaConsumers, and 
let each read particular partitions?

Otherwise, a single FlinkKafkaConsumer09 can read all 3 partitions exactly-once 
(each partition is assigned to a source subtask, you can check out the details 
of how Flink assigns Kafka topic partitions to source subtasks here: 
http://data-artisans.com/kafka-flink-a-practical-how-to/).

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property "myGroup", each Flink 
> consumer still gets all data pushed to all 3 partitions, while this works 
> properly with a normal Java consumer, where each consumer gets its own share of 
> the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542796#comment-15542796
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4722:


Is there any reason / use case why you need to have 3 FlinkKafkaConsumers, and 
let each read particular partitions?

Otherwise, a single FlinkKafkaConsumer09 can read all 3 partitions exactly-once 
(each partition is assigned to a source subtask, you can check out the details 
of how Flink assigns Kafka topic partitions to source subtasks here: 
http://data-artisans.com/kafka-flink-a-practical-how-to/).

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property "myGroup", each Flink 
> consumer still gets all data pushed to all 3 partitions, while this works 
> properly with a normal Java consumer, where each consumer gets its own share of 
> the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Sudhanshu Sekhar Lenka (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542774#comment-15542774
 ] 

Sudhanshu Sekhar Lenka commented on FLINK-4722:
---

Thanks for your reply.

Is there any configuration or logic available to modify 
KafkaConsumer#subscribe() so that it subscribes to specific partitions?

If I have one topic with 3 partitions (0, 1, 2) and 3 FlinkKafkaConsumer09 
instances, is it possible to subscribe each one to a particular partition? 
Inside FlinkKafkaConsumer09, it subscribes to all partitions by default. That 
causes the problem of getting duplicate data and having to filter it again 
inside Flink, which is causing a performance issue for us.







> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property "myGroup", each Flink 
> consumer still gets all data pushed to all 3 partitions, while this works 
> properly with a normal Java consumer, where each consumer gets its own share of 
> the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2002: Support for bz2 compression in flink-core

2016-10-03 Thread mtanski
Github user mtanski commented on the issue:

https://github.com/apache/flink/pull/2002
  
I tried changing the name from InflaterInputStreamFactory to 
DecompressingStreamFactory.

But now it will not build because of:
[ERROR] Failed to execute goal 
com.github.siom79.japicmp:japicmp-maven-plugin:0.7.0:cmp (default) on project 
flink-core: Breaking the build because there is at least one binary 
incompatible class: org.apache.flink.api.common.io.BinaryInputFormat -> [Help 1]

This is because the rename changes signatures in a bunch of places, e.g. 
flink.api.common.io.FileInputFormat.

The next change will have bz2, xz, docs & the Maven dependency for Commons 
Compress. I'll also rebase it against master. Can we get it in after that?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-3734) Unclosed DataInputView in AbstractAlignedProcessingTimeWindowOperator#restoreState()

2016-10-03 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved FLINK-3734.
---
Resolution: Cannot Reproduce

> Unclosed DataInputView in 
> AbstractAlignedProcessingTimeWindowOperator#restoreState()
> 
>
> Key: FLINK-3734
> URL: https://issues.apache.org/jira/browse/FLINK-3734
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> DataInputView in = inputState.getState(getUserCodeClassloader());
> final long nextEvaluationTime = in.readLong();
> final long nextSlideTime = in.readLong();
> AbstractKeyedTimePanes panes = 
> createPanes(keySelector, function);
> panes.readFromInput(in, keySerializer, stateTypeSerializer);
> restoredState = new RestoredState<>(panes, nextEvaluationTime, 
> nextSlideTime);
>   }
> {code}
> DataInputView in is not closed upon return.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2580: [FLINK-4723] [kafka-connector] Unify committed offsets to...

2016-10-03 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2580
  
Seems like one of the new IT tests is a bit unstable, fixing it ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4723) Unify behaviour of committed offsets to Kafka / ZK for Kafka 0.8 and 0.9 consumer

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542696#comment-15542696
 ] 

ASF GitHub Bot commented on FLINK-4723:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2580
  
Seems like one of the new IT tests is a bit unstable, fixing it ...


> Unify behaviour of committed offsets to Kafka / ZK for Kafka 0.8 and 0.9 
> consumer
> -
>
> Key: FLINK-4723
> URL: https://issues.apache.org/jira/browse/FLINK-4723
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0, 1.1.3
>
>
> The proper "behaviour" of the offsets committed back to Kafka / ZK should be 
> "the next offset that consumers should read (in Kafka terms, the 'position')".
> This is already fixed for the 0.9 consumer by FLINK-4618, by incrementing the 
> committed offsets back to Kafka by the 0.9 by 1, so that the internal 
> {{KafkaConsumer}} picks up the correct start position when committed offsets 
> are present. This fix was required because the start position from committed 
> offsets was implicitly determined with Kafka 0.9 APIs.
> However, since the 0.8 consumer handles offset committing and start position 
> using Flink's own {{ZookeeperOffsetHandler}} and not Kafka's high-level APIs, 
> the 0.8 consumer did not require a fix.
> I propose to still unify the behaviour of committed offsets across 0.8 and 
> 0.9 to the definition above.
> Otherwise, if users in any case first uses the 0.8 consumer to read data and 
> have Flink-committed offsets in ZK, and then uses a high-level 0.8 Kafka 
> consumer to read the same topic in a non-Flink application, the first record 
> will be duplicate (because, like described above, Kafka high-level consumers 
> expect the committed offsets to be "the next record to process" and not "the 
> last processed record").
> This requires incrementing the committed ZK offsets in 0.8 to also be 
> incremented by 1, and changing how Flink internal offsets are initialized 
> with accordance to the acquired ZK offsets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2581: [FLINK-4709] Fix resource leak in InputStreamFSInp...

2016-10-03 Thread fholger
GitHub user fholger opened a pull request:

https://github.com/apache/flink/pull/2581

[FLINK-4709] Fix resource leak in InputStreamFSInputWrapper

InputStreamFSInputWrapper is a wrapper around an arbitrary Java 
InputStream. As such, it needs to reimplement the close() function and forward 
it to the wrapped stream, otherwise there is a potential resource leak. This PR 
forwards the close() call to the underlying stream.
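As a minimal sketch of the pattern (simplified: the real wrapper extends Flink's FSDataInputStream; the class name below is made up):

{code}
import java.io.IOException;
import java.io.InputStream;

/** Sketch: a wrapper that forwards close() so the wrapped stream is released. */
class ForwardingStreamWrapper extends InputStream {

	private final InputStream wrapped;

	ForwardingStreamWrapper(InputStream wrapped) {
		this.wrapped = wrapped;
	}

	@Override
	public int read() throws IOException {
		return wrapped.read();
	}

	@Override
	public void close() throws IOException {
		// without this override, closing the wrapper would leak the
		// underlying stream's file handle
		wrapped.close();
	}
}
{code}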

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fholger/flink flink-4709

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2581.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2581


commit 8ac14e34f87bcc318e9a57b8385fc419f3443345
Author: Holger Frydrych 
Date:   2016-10-03T12:34:19Z

[FLINK-4709] Fix resource leak in InputStreamFSInputWrapper




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4709) InputStreamFSInputWrapper does not close wrapped stream

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542651#comment-15542651
 ] 

ASF GitHub Bot commented on FLINK-4709:
---

GitHub user fholger opened a pull request:

https://github.com/apache/flink/pull/2581

[FLINK-4709] Fix resource leak in InputStreamFSInputWrapper

InputStreamFSInputWrapper is a wrapper around an arbitrary Java 
InputStream. As such, it needs to reimplement the close() function and forward 
it to the wrapped stream, otherwise there is a potential resource leak. This PR 
forwards the close() call to the underlying stream.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fholger/flink flink-4709

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2581.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2581


commit 8ac14e34f87bcc318e9a57b8385fc419f3443345
Author: Holger Frydrych 
Date:   2016-10-03T12:34:19Z

[FLINK-4709] Fix resource leak in InputStreamFSInputWrapper




> InputStreamFSInputWrapper does not close wrapped stream
> ---
>
> Key: FLINK-4709
> URL: https://issues.apache.org/jira/browse/FLINK-4709
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.3, 1.1.2
>Reporter: Holger Frydrych
>
> In FileInputFormat::decorateInputStream (flink.api.common.io), if the 
> inputStream gets wrapped by an InputStreamFSInputWrapper, it will never get 
> closed. FileInputFormat does call close() on its stream, but the wrapper does 
> not implement close() to forward this call to the wrapped stream. This leads 
> to a resource leak which can eventually exhaust the maximum number of open 
> file handles allowed if processing a lot of input files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4643) Average Clustering Coefficient

2016-10-03 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-4643.
-
Resolution: Implemented

Implemented in 95c08eab36bfa09c501a84f5b5f116d666d03ae1

> Average Clustering Coefficient
> --
>
> Key: FLINK-4643
> URL: https://issues.apache.org/jira/browse/FLINK-4643
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
> Fix For: 1.2.0
>
>
> Gelly has Global Clustering Coefficient and Local Clustering Coefficient. 
> This adds Average Clustering Coefficient. The distinction is discussed in 
> [http://jponnela.com/web_documents/twomode.pdf] (pdf page 2, document page 
> 32).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4624) Gelly's summarization algorithm cannot deal with null vertex group values

2016-10-03 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-4624.
-
Resolution: Fixed

Fixed in a79efdc23e1bd35eb7cd2e5eb9551cc297c02dd5

> Gelly's summarization algorithm cannot deal with null vertex group values
> -
>
> Key: FLINK-4624
> URL: https://issues.apache.org/jira/browse/FLINK-4624
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
>Reporter: Till Rohrmann
>Assignee: Martin Junghanns
> Fix For: 1.2.0
>
>
> Gelly's {{Summarization}} algorithm cannot handle null values in the 
> `VertexGroupItem.f2`. This behaviour is hidden by using Strings as a vertex 
> value in the {{SummarizationITCase}}, because the {{StringSerializer}} can 
> handle null values. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4643) Average Clustering Coefficient

2016-10-03 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-4643:
--
Fix Version/s: 1.2.0

> Average Clustering Coefficient
> --
>
> Key: FLINK-4643
> URL: https://issues.apache.org/jira/browse/FLINK-4643
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
> Fix For: 1.2.0
>
>
> Gelly has Global Clustering Coefficient and Local Clustering Coefficient. 
> This adds Average Clustering Coefficient. The distinction is discussed in 
> [http://jponnela.com/web_documents/twomode.pdf] (pdf page 2, document page 
> 32).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4624) Gelly's summarization algorithm cannot deal with null vertex group values

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542606#comment-15542606
 ] 

ASF GitHub Bot commented on FLINK-4624:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2527


> Gelly's summarization algorithm cannot deal with null vertex group values
> -
>
> Key: FLINK-4624
> URL: https://issues.apache.org/jira/browse/FLINK-4624
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
>Reporter: Till Rohrmann
>Assignee: Martin Junghanns
> Fix For: 1.2.0
>
>
> Gelly's {{Summarization}} algorithm cannot handle null values in the 
> `VertexGroupItem.f2`. This behaviour is hidden by using Strings as a vertex 
> value in the {{SummarizationITCase}}, because the {{StringSerializer}} can 
> handle null values. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2527: [FLINK-4624] Allow for null values in Graph Summar...

2016-10-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2527


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4643) Average Clustering Coefficient

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542605#comment-15542605
 ] 

ASF GitHub Bot commented on FLINK-4643:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2528


> Average Clustering Coefficient
> --
>
> Key: FLINK-4643
> URL: https://issues.apache.org/jira/browse/FLINK-4643
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
>
> Gelly has Global Clustering Coefficient and Local Clustering Coefficient. 
> This adds Average Clustering Coefficient. The distinction is discussed in 
> [http://jponnela.com/web_documents/twomode.pdf] (pdf page 2, document page 
> 32).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2528: [FLINK-4643] [gelly] Average Clustering Coefficien...

2016-10-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2528


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4624) Gelly's summarization algorithm cannot deal with null vertex group values

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542449#comment-15542449
 ] 

ASF GitHub Bot commented on FLINK-4624:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2527
  
Will merge today. I'll add a note that the `TypeInformation` will be 
removed once we resolve the issues with typing `Either`.


> Gelly's summarization algorithm cannot deal with null vertex group values
> -
>
> Key: FLINK-4624
> URL: https://issues.apache.org/jira/browse/FLINK-4624
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
>Reporter: Till Rohrmann
>Assignee: Martin Junghanns
> Fix For: 1.2.0
>
>
> Gelly's {{Summarization}} algorithm cannot handle null values in the 
> `VertexGroupItem.f2`. This behaviour is hidden by using Strings as a vertex 
> value in the {{SummarizationITCase}}, because the {{StringSerializer}} can 
> handle null values. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2527: [FLINK-4624] Allow for null values in Graph Summarization

2016-10-03 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2527
  
Will merge today. I'll add a note that the `TypeInformation` will be 
removed once we resolve the issues with typing `Either`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4723) Unify behaviour of committed offsets to Kafka / ZK for Kafka 0.8 and 0.9 consumer

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542348#comment-15542348
 ] 

ASF GitHub Bot commented on FLINK-4723:
---

GitHub user tzulitai opened a pull request:

https://github.com/apache/flink/pull/2580

[FLINK-4723] [kafka-connector] Unify committed offsets to Kafka to be the 
next record to process

The description within the JIRA ticket 
([FLINK-4723](https://issues.apache.org/jira/browse/FLINK-4723)) explains the 
reasoning for this change.

With this change, offsets committed to Kafka are larger by 1 compared to 
the internally checkpointed offsets. This is changed at the 
`FlinkKafkaConsumerBase` level, so that offsets given through the abstract 
`commitSpecificOffsetsToKafka()` method to the version-specific implementations 
are already incremented and represent the next record to process. This way, the 
version-specific implementations simply commit the given offsets without the 
need to manipulate them.
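A minimal sketch of that adjustment (illustrative fragment; only `commitSpecificOffsetsToKafka()` is named above, the surrounding names are assumed):

{code}
// checkpointed offsets denote the last processed record per partition;
// add 1 so the committed value is the next record to process
Map<KafkaTopicPartition, Long> offsetsToCommit = new HashMap<>(checkpointedOffsets.size());
for (Map.Entry<KafkaTopicPartition, Long> entry : checkpointedOffsets.entrySet()) {
	offsetsToCommit.put(entry.getKey(), entry.getValue() + 1);
}
commitSpecificOffsetsToKafka(offsetsToCommit);
{code}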

This PR also includes major refactoring of the IT tests to add commit 
offset related IT tests to `FlinkKafkaConsumerTestBase`, and let both the 0.8 
and 0.9 consumers run offset committing / initial offset startup tests 
(previously only the 0.8 consumer had these tests).

R: @rmetzger what's your take on this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tzulitai/flink FLINK-4723

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2580.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2580


commit cc782ffd4c174f23c45349771b318a08a2be75a3
Author: Tzu-Li (Gordon) Tai 
Date:   2016-10-02T08:54:57Z

[FLINK-4723] [kafka-connector] Unify committed offsets to Kafka to be next 
record to process




> Unify behaviour of committed offsets to Kafka / ZK for Kafka 0.8 and 0.9 
> consumer
> -
>
> Key: FLINK-4723
> URL: https://issues.apache.org/jira/browse/FLINK-4723
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0, 1.1.3
>
>
> The proper "behaviour" of the offsets committed back to Kafka / ZK should be 
> "the next offset that consumers should read (in Kafka terms, the 'position')".
> This is already fixed for the 0.9 consumer by FLINK-4618, by incrementing the 
> committed offsets back to Kafka by the 0.9 by 1, so that the internal 
> {{KafkaConsumer}} picks up the correct start position when committed offsets 
> are present. This fix was required because the start position from committed 
> offsets was implicitly determined with Kafka 0.9 APIs.
> However, since the 0.8 consumer handles offset committing and start position 
> using Flink's own {{ZookeeperOffsetHandler}} and not Kafka's high-level APIs, 
> the 0.8 consumer did not require a fix.
> I propose to still unify the behaviour of committed offsets across 0.8 and 
> 0.9 to the definition above.
> Otherwise, if users in any case first uses the 0.8 consumer to read data and 
> have Flink-committed offsets in ZK, and then uses a high-level 0.8 Kafka 
> consumer to read the same topic in a non-Flink application, the first record 
> will be duplicate (because, like described above, Kafka high-level consumers 
> expect the committed offsets to be "the next record to process" and not "the 
> last processed record").
> This requires incrementing the committed ZK offsets in 0.8 to also be 
> incremented by 1, and changing how Flink internal offsets are initialized 
> with accordance to the acquired ZK offsets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2580: [FLINK-4723] [kafka-connector] Unify committed off...

2016-10-03 Thread tzulitai
GitHub user tzulitai opened a pull request:

https://github.com/apache/flink/pull/2580

[FLINK-4723] [kafka-connector] Unify committed offsets to Kafka to be the 
next record to process

The description within the JIRA ticket 
([FLINK-4723](https://issues.apache.org/jira/browse/FLINK-4723)) explains the 
reasoning for this change.

With this change, offsets committed to Kafka are larger by 1 compared to 
the internally checkpointed offsets. This is changed at the 
`FlinkKafkaConsumerBase` level, so that offsets given through the abstract 
`commitSpecificOffsetsToKafka()` method to the version-specific implementations 
are already incremented and represent the next record to process. This way, the 
version-specific implementations simply commit the given offsets without the 
need to manipulate them.

This PR also includes major refactoring of the IT tests to add commit 
offset related IT tests to `FlinkKafkaConsumerTestBase`, and let both the 0.8 
and 0.9 consumers run offset committing / initial offset startup tests 
(previously only the 0.8 consumer had these tests).

R: @rmetzger what's your take on this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tzulitai/flink FLINK-4723

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2580.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2580


commit cc782ffd4c174f23c45349771b318a08a2be75a3
Author: Tzu-Li (Gordon) Tai 
Date:   2016-10-02T08:54:57Z

[FLINK-4723] [kafka-connector] Unify committed offsets to Kafka to be next 
record to process




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4624) Gelly's summarization algorithm cannot deal with null vertex group values

2016-10-03 Thread Martin Junghanns (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542267#comment-15542267
 ] 

Martin Junghanns commented on FLINK-4624:
-

ping [~greghogan]

> Gelly's summarization algorithm cannot deal with null vertex group values
> -
>
> Key: FLINK-4624
> URL: https://issues.apache.org/jira/browse/FLINK-4624
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
>Reporter: Till Rohrmann
>Assignee: Martin Junghanns
> Fix For: 1.2.0
>
>
> Gelly's {{Summarization}} algorithm cannot handle null values in the 
> `VertexGroupItem.f2`. This behaviour is hidden by using Strings as a vertex 
> value in the {{SummarizationITCase}}, because the {{StringSerializer}} can 
> handle null values. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542145#comment-15542145
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4722:


Hi [~sudhanshulenka],
I think this is expected behaviour.

Internally, each source subtask of {{FlinkKafkaConsumer09}} uses 
{{KafkaConsumer#assign()}} (no consumer group functionality) instead of 
{{KafkaConsumer#subscribe()}} (has consumer group functionality). So that's why 
all 3 FlinkKafkaConsumer09 are getting all records.

Right now, we need to internally use {{KafkaConsumer#assign()}} because 
partition-to-subtask assignment must be deterministic to achieve exactly-once 
guarantees using Flink's checkpointing mechanism.
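For illustration, the difference at the Kafka 0.9 client level looks roughly like the sketch below (topic name and partition are made up):

{code}
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
props.setProperty("group.id", "myGroup");
props.setProperty("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);

// consumer-group semantics (plain Java clients): the broker balances
// partitions across the members of "myGroup"
consumer.subscribe(Arrays.asList("myTopic"));

// manual assignment (what each Flink source subtask effectively does):
// a fixed, deterministic partition, ignoring group balancing.
// Note: assign() and subscribe() are mutually exclusive on one instance.
// consumer.assign(Arrays.asList(new TopicPartition("myTopic", 0)));
{code}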

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property "myGroup", each Flink 
> consumer still gets all data pushed to all 3 partitions, while this works 
> properly with a normal Java consumer, where each consumer gets its own share of 
> the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4722) Consumer group concept not working properly with FlinkKafkaConsumer09

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542145#comment-15542145
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4722 at 10/3/16 11:00 AM:
--

Hi [~sudhanshulenka],
I think this is expected behaviour.

Internally, each source subtask of {{FlinkKafkaConsumer09}} uses 
{{KafkaConsumer#assign()}} (no consumer group functionality) instead of 
{{KafkaConsumer#subscribe()}} (has consumer group functionality) to read from 
topic partitions. So that's why all 3 FlinkKafkaConsumer09 are getting all 
records.

Right now, we need to internally use {{KafkaConsumer#assign()}} because 
partition-to-subtask assignment must be deterministic to achieve exactly-once 
guarantees using Flink's checkpointing mechanism.


was (Author: tzulitai):
Hi [~sudhanshulenka],
I think this is expected behaviour.

Internally, each source subtask of {{FlinkKafkaConsumer09}} uses 
{{KafkaConsumer#assign()}} (no consumer group functionality) instead of 
{{KafkaConsumer#subscribe()}} (has consumer group functionality). So that's why 
all 3 FlinkKafkaConsumer09 are getting all records.

Right now, we need to internally use {{KafkaConsumer#assign()}} because 
partition-to-subtask assignment must be deterministic to achieve exactly-once 
guarantees using Flink's checkpointing mechanism.

> Consumer group concept not working properly with FlinkKafkaConsumer09  
> ---
>
> Key: FLINK-4722
> URL: https://issues.apache.org/jira/browse/FLINK-4722
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.2
>Reporter: Sudhanshu Sekhar Lenka
>
> When one Kafka topic has 3 partitions and 3 FlinkKafkaConsumer09 instances are 
> connected to that same topic using the "group.id" property "myGroup", each Flink 
> consumer still gets all data pushed to all 3 partitions, while this works 
> properly with a normal Java consumer, where each consumer gets its own share of 
> the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1815) Add methods to read and write a Graph as adjacency list

2016-10-03 Thread Faye Beligianni (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541994#comment-15541994
 ] 

Faye Beligianni commented on FLINK-1815:


Hey, 

I am not sure why I wasn't aware of the {{FieldParser}}'s functionality; that 
would have saved me a lot of time, since I wouldn't have had to use 
{{GraphAdjacencyListReader.fromString()}} and test the parsing functionality 
again and again :/ . I am sorry for that, but of course I can/will change that.

Regarding the string splits and concatenations: since I read the file with the 
{{readTextFile}} method, I used Java's {{String.split}} method for the splits. 
I split each line with respect to the delimiters separating the source from 
the neighbours, the vertex ids from the vertex/edge values, etc. Again, please 
tell me if there is a better way to perform the splits of a text line.
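For the value-less format from the issue description ("1 2,3,4"), a split-based parse might look like the following sketch (the helper is illustrative, not the PR code):

{code}
/** Sketch: parse one adjacency-list line of the form "1 2,3,4". */
static void parseLine(String line) {
	String[] parts = line.split(" ");             // source vs. neighbor list
	long source = Long.parseLong(parts[0]);
	for (String neighbor : parts[1].split(",")) { // neighbor delimiter
		long target = Long.parseLong(neighbor);
		System.out.println("edge: " + source + " -> " + target);
	}
}
{code}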

If, though, you believe that this read/write-a-Graph-as-adjacency-list issue 
should be approached in a very different way from the one I implemented, I of 
course have no problem reassigning the ticket, or rejecting the PR and 
discussing the implementation of this issue further.



> Add methods to read and write a Graph as adjacency list
> ---
>
> Key: FLINK-1815
> URL: https://issues.apache.org/jira/browse/FLINK-1815
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Assignee: Faye Beligianni
>Priority: Minor
>
> It would be nice to add utility methods to read a graph from an Adjacency 
> list format and also write a graph in such a format.
> The simple case would be to read a graph with no vertex or edge values, where 
> we would need to define (a) a line delimiter, (b) a delimiter to separate 
> vertices from neighbor list and (c) and a delimiter to separate the neighbors.
> For example, "1 2,3,4\n2 1,3" would give vertex 1 with neighbors 2, 3 and 4 
> and vertex 2 with neighbors 1 and 3.
> If we have vertex values and/or edge values, we also need to have a way to 
> separate IDs from values. For example, we could have "1 0.1 2 0.5, 3 0.2" to 
> define a vertex 1 with value 0.1, edge (1, 2) with weight 0.5 and edge (1, 3) 
> with weight 0.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3204) TaskManagers are not shutting down properly on YARN

2016-10-03 Thread Márton Balassi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi updated FLINK-3204:
--
Assignee: Nikolay Vasilishin

> TaskManagers are not shutting down properly on YARN
> ---
>
> Key: FLINK-3204
> URL: https://issues.apache.org/jira/browse/FLINK-3204
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Robert Metzger
>Assignee: Nikolay Vasilishin
>  Labels: test-stability
>
> While running some experiments on a YARN cluster, I saw the following error
> {code}
> 10:15:24,741 INFO  org.apache.flink.yarn.YarnJobManager   
>- Stopping YARN JobManager with status SUCCEEDED and diagnostic Flink YARN 
> Client requested shutdown.
> 10:15:24,748 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl  
>- Waiting for application to be successfully unregistered.
> 10:15:24,852 INFO  
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl  - 
> Interrupted while waiting for queue
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)
> 10:15:24,875 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_10 when 
> stopping NMClientImpl
> 10:15:24,899 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_07 when 
> stopping NMClientImpl
> 10:15:24,954 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_06 when 
> stopping NMClientImpl
> 10:15:24,982 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_09 when 
> stopping NMClientImpl
> 10:15:25,013 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_11 when 
> stopping NMClientImpl
> 10:15:25,037 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_08 when 
> stopping NMClientImpl
> 10:15:25,041 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_12 when 
> stopping NMClientImpl
> 10:15:25,072 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_05 when 
> stopping NMClientImpl
> 10:15:25,075 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_03 when 
> stopping NMClientImpl
> 10:15:25,077 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_04 when 
> stopping NMClientImpl
> 10:15:25,079 ERROR org.apache.hadoop.yarn.client.api.impl.NMClientImpl
>- Failed to stop Container container_1452019681933_0002_01_02 when 
> stopping NMClientImpl
> 10:15:25,080 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-worker-0.c.astral-sorter-757.internal:8041
> 10:15:25,080 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-worker-1.c.astral-sorter-757.internal:8041
> 10:15:25,080 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-master.c.astral-sorter-757.internal:8041
> 10:15:25,080 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-worker-4.c.astral-sorter-757.internal:8041
> 10:15:25,081 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-worker-2.c.astral-sorter-757.internal:8041
> 10:15:25,081 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-worker-3.c.astral-sorter-757.internal:8041
> 10:15:25,081 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Closing proxy : cdh544-worker-5.c.astral-sorter-757.internal:8041
> 10:15:25,085 INFO  

[jira] [Updated] (FLINK-4727) Kafka 0.9 Consumer should also checkpoint auto retrieved offsets even when no data is read

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4727:
---
Description: 
This is basically the 0.9 version counterpart for FLINK-3440.

When the 0.9 consumer fetches initial offsets from Kafka on startup, but does 
not have any data to read, it should also checkpoint & commit these initial 
offsets.

  was:
This is basically the 0.9 version counterpart for FLINK-3440.

When the 0.9 consumer fetches initial offsets from Kafka, but does not have any 
data to read, it should also checkpoint & commit these initial offsets.


> Kafka 0.9 Consumer should also checkpoint auto retrieved offsets even when no 
> data is read
> --
>
> Key: FLINK-4727
> URL: https://issues.apache.org/jira/browse/FLINK-4727
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> This is basically the 0.9 version counterpart for FLINK-3440.
> When the 0.9 consumer fetches initial offsets from Kafka on startup, but does 
> not have any data to read, it should also checkpoint & commit these initial 
> offsets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3734) Unclosed DataInputView in AbstractAlignedProcessingTimeWindowOperator#restoreState()

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541912#comment-15541912
 ] 

ASF GitHub Bot commented on FLINK-3734:
---

Github user lw-lin closed the pull request at:

https://github.com/apache/flink/pull/2557


> Unclosed DataInputView in 
> AbstractAlignedProcessingTimeWindowOperator#restoreState()
> 
>
> Key: FLINK-3734
> URL: https://issues.apache.org/jira/browse/FLINK-3734
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> DataInputView in = inputState.getState(getUserCodeClassloader());
> final long nextEvaluationTime = in.readLong();
> final long nextSlideTime = in.readLong();
> AbstractKeyedTimePanes panes = 
> createPanes(keySelector, function);
> panes.readFromInput(in, keySerializer, stateTypeSerializer);
> restoredState = new RestoredState<>(panes, nextEvaluationTime, 
> nextSlideTime);
>   }
> {code}
> DataInputView in is not closed upon return.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2557: [FLINK-3734] Close stream state handles after stat...

2016-10-03 Thread lw-lin
Github user lw-lin closed the pull request at:

https://github.com/apache/flink/pull/2557


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3734) Unclosed DataInputView in AbstractAlignedProcessingTimeWindowOperator#restoreState()

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541909#comment-15541909
 ] 

ASF GitHub Bot commented on FLINK-3734:
---

Github user lw-lin commented on the issue:

https://github.com/apache/flink/pull/2557
  
Glad to see this issue got fixed along with 
https://github.com/apache/flink/pull/2512 -- the fix is [this line in 
StreamTask](https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L577).
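For reference, a self-contained illustration of the try-with-resources pattern that fix relies on (the snapshot bytes are made up):

{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class RestoreCloseExample {
	public static void main(String[] args) throws IOException {
		byte[] snapshot = {0, 0, 0, 0, 0, 0, 0, 42};
		try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(snapshot))) {
			long nextEvaluationTime = in.readLong();
			System.out.println("restored nextEvaluationTime = " + nextEvaluationTime);
		} // close() is guaranteed here, even if the restore logic throws
	}
}
{code}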

So @tedyu would you mind closing FLINK-3734? :-D


> Unclosed DataInputView in 
> AbstractAlignedProcessingTimeWindowOperator#restoreState()
> 
>
> Key: FLINK-3734
> URL: https://issues.apache.org/jira/browse/FLINK-3734
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> DataInputView in = inputState.getState(getUserCodeClassloader());
> final long nextEvaluationTime = in.readLong();
> final long nextSlideTime = in.readLong();
> AbstractKeyedTimePanes panes = 
> createPanes(keySelector, function);
> panes.readFromInput(in, keySerializer, stateTypeSerializer);
> restoredState = new RestoredState<>(panes, nextEvaluationTime, 
> nextSlideTime);
>   }
> {code}
> DataInputView in is not closed upon return.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2557: [FLINK-3734] Close stream state handles after state resto...

2016-10-03 Thread lw-lin
Github user lw-lin commented on the issue:

https://github.com/apache/flink/pull/2557
  
Glad to see this issue got fixed along with 
https://github.com/apache/flink/pull/2512 -- the fix is [this line in 
StreamTask](https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L577).

So @tedyu would you mind closing FLINK-3734? :-D


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-4727) Kafka 0.9 Consumer should also checkpoint auto retrieved offsets even when no data is read

2016-10-03 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-4727:
--

 Summary: Kafka 0.9 Consumer should also checkpoint auto retrieved 
offsets even when no data is read
 Key: FLINK-4727
 URL: https://issues.apache.org/jira/browse/FLINK-4727
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
Priority: Blocker
 Fix For: 1.2.0, 1.1.3


This is basically the 0.9 version counterpart for FLINK-3440.

When the 0.9 consumer fetches initial offsets from Kafka, but does not have any 
data to read, it should also checkpoint & commit these initial offsets.
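A minimal sketch of the intended fallback, assuming the internal convention that checkpointed offsets denote the last processed record while {{KafkaConsumer#position()}} returns the next one (names are illustrative):

{code}
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

/** Sketch: pick the offset to checkpoint for one partition. */
static long offsetToCheckpoint(KafkaConsumer<?, ?> consumer, TopicPartition tp, Long lastEmittedOffset) {
	if (lastEmittedOffset != null) {
		return lastEmittedOffset;          // normal case: records were read
	}
	// nothing read yet: still checkpoint the auto-retrieved start position
	return consumer.position(tp) - 1;
}
{code}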



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4315) Remove Hadoop Dependency from flink-java

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541713#comment-15541713
 ] 

ASF GitHub Bot commented on FLINK-4315:
---

Github user kenmy closed the pull request at:

https://github.com/apache/flink/pull/2576


> Remove Hadoop Dependency from flink-java
> 
>
> Key: FLINK-4315
> URL: https://issues.apache.org/jira/browse/FLINK-4315
> Project: Flink
>  Issue Type: Sub-task
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Evgeny Kincharov
> Fix For: 2.0.0
>
>
> The API projects should be independent of Hadoop, because Hadoop is not an 
> integral part of the Flink stack, and we should have the option to offer 
> Flink without Hadoop dependencies.
> The current batch APIs have a hard dependency on Hadoop, mainly because the 
> API has utility methods like `readHadoopFile(...)`.
> I suggest to remove those methods and instead add helpers in the 
> `flink-hadoop-compatibility` project.
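For illustration only, such a helper could look like the sketch below; the class placement, names, and exact signature are assumptions, not an existing API:

{code}
import org.apache.flink.api.java.hadoop.mapred.HadoopInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

/** Hypothetical helper for flink-hadoop-compatibility. */
public static <K, V> HadoopInputFormat<K, V> readHadoopFile(
		org.apache.hadoop.mapred.FileInputFormat<K, V> mapredInputFormat,
		Class<K> keyClass, Class<V> valueClass, String inputPath) {
	JobConf jobConf = new JobConf();
	org.apache.hadoop.mapred.FileInputFormat.addInputPath(jobConf, new Path(inputPath));
	return new HadoopInputFormat<>(mapredInputFormat, keyClass, valueClass, jobConf);
}
{code}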



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2576: [2.0] [FLINK-4315] Remove Hadoop Dependency from flink-ja...

2016-10-03 Thread kenmy
Github user kenmy commented on the issue:

https://github.com/apache/flink/pull/2576
  
Merge this PR when the 2.0 branch is created.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4315) Remove Hadoop Dependency from flink-java

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541714#comment-15541714
 ] 

ASF GitHub Bot commented on FLINK-4315:
---

Github user kenmy commented on the issue:

https://github.com/apache/flink/pull/2576
  
Merge this PR when the 2.0 branch is created.


> Remove Hadoop Dependency from flink-java
> 
>
> Key: FLINK-4315
> URL: https://issues.apache.org/jira/browse/FLINK-4315
> Project: Flink
>  Issue Type: Sub-task
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Evgeny Kincharov
> Fix For: 2.0.0
>
>
> The API projects should be independent of Hadoop, because Hadoop is not an 
> integral part of the Flink stack, and we should have the option to offer 
> Flink without Hadoop dependencies.
> The current batch APIs have a hard dependency on Hadoop, mainly because the 
> API has utility methods like `readHadoopFile(...)`.
> I suggest to remove those methods and instead add helpers in the 
> `flink-hadoop-compatibility` project.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2576: [2.0] [FLINK-4315] Remove Hadoop Dependency from f...

2016-10-03 Thread kenmy
Github user kenmy closed the pull request at:

https://github.com/apache/flink/pull/2576


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---