[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

2016-08-26 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2414
  
Minus @rmetzger's comment, this looks good to merge! Thanks for fixing this 
@tzulitai!




[jira] [Commented] (FLINK-4341) Kinesis connector does not emit maximum watermark properly

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440877#comment-15440877
 ] 

ASF GitHub Bot commented on FLINK-4341:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2414
  
Minus @rmetzger's comment, this looks good to merge! Thanks for fixing this 
@tzulitai!


> Kinesis connector does not emit maximum watermark properly
> --
>
> Key: FLINK-4341
> URL: https://issues.apache.org/jira/browse/FLINK-4341
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.1.0, 1.1.1
>Reporter: Scott Kidder
>Assignee: Robert Metzger
>Priority: Blocker
> Fix For: 1.2.0, 1.1.2
>
>
> **Previously reported as "Checkpoint state size grows unbounded when task 
> parallelism not uniform"**
> This issue was first encountered with Flink release 1.1.0 (commit 45f7825). I 
> was previously using a 1.1.0 snapshot (commit 18995c8) which performed as 
> expected.  This issue was introduced somewhere between those commits.
> I've got a Flink application that uses the Kinesis Stream Consumer to read 
> from a Kinesis stream with 2 shards. I've got 2 task managers with 2 slots 
> each, providing a total of 4 slots.  When running the application with a 
> parallelism of 4, the Kinesis consumer uses 2 slots (one per Kinesis shard) 
> and 4 slots for subsequent tasks that process the Kinesis stream data. I use 
> an in-memory store for checkpoint data.
> Yesterday I upgraded to Flink 1.1.0 (45f7825) and noticed that checkpoint 
> states were growing unbounded when running with a parallelism of 4, 
> checkpoint interval of 10 seconds:
> {code}
> ID  State Size
> 1   11.3 MB
> 2   20.9 MB
> 3   30.6 MB
> 4   41.4 MB
> 5   52.6 MB
> 6   62.5 MB
> 7   71.5 MB
> 8   83.3 MB
> 9   93.5 MB
> {code}
> The first 4 checkpoints generally succeed, but later checkpoints fail with 
> an exception like the following:
> {code}
> java.lang.RuntimeException: Error triggering a checkpoint as the result of 
> receiving checkpoint barrier at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:768)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:758)
>   at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
>   at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
>   at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:183)
>   at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Size of the state is larger than the maximum 
> permitted memory-backed state. Size=12105407 , maxSize=5242880 . Consider 
> using a different state backend, like the File System State backend.
>   at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend.checkSize(MemoryStateBackend.java:146)
>   at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetBytes(MemoryStateBackend.java:200)
>   at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetHandle(MemoryStateBackend.java:190)
>   at 
> org.apache.flink.runtime.state.AbstractStateBackend$CheckpointStateOutputView.closeAndGetHandle(AbstractStateBackend.java:447)
>   at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.snapshotOperatorState(WindowOperator.java:879)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:598)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:762)
>   ... 8 more
> {code}
> Or:
> {code}
> 2016-08-09 17:44:43,626 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Restoring 
> checkpointed state to task Fold: property_id, player -> 10-minute 
> Sliding-Window Percentile Aggregation -> Sink: InfluxDB (2/4)
> 2016-08-09 17:44:51,236 ERROR akka.remote.EndpointWriter- 
> Transient association error (association remains live) 
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to 
> Actor[akka.tcp://flink@10.55.2.212:6123/user/jobmanager#510517238]: max 
> allowed size 10485760 bytes, actual size of encoded class 
> org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint was 
> 10891825 bytes.
> {code}
> This can be fixed by simply submitting the job with a parallelism of 2. I 
> suspect there was a r
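
The size-limit exception above suggests its own remedy. A minimal hedged 
sketch of switching to the file-system state backend (Flink 1.1 streaming 
API; the checkpoint URI is an illustrative assumption):

{code}
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FsBackendSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Snapshots go to a file system instead of the ~5 MB in-memory cap.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
        env.enableCheckpointing(10_000); // 10-second interval, as in the report
        // ... define sources/operators/sinks and env.execute() ...
    }
}
{code}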

[jira] [Commented] (FLINK-4329) Fix Streaming File Source Timestamps/Watermarks Handling

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440870#comment-15440870
 ] 

ASF GitHub Bot commented on FLINK-4329:
---

Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/2350


> Fix Streaming File Source Timestamps/Watermarks Handling
> 
>
> Key: FLINK-4329
> URL: https://issues.apache.org/jira/browse/FLINK-4329
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Kostas Kloudas
> Fix For: 1.2.0, 1.1.2
>
>
> The {{ContinuousFileReaderOperator}} does not correctly deal with watermarks, 
> i.e. they are just passed through. This means that when the 
> {{ContinuousFileMonitoringFunction}} closes and emits a {{Long.MAX_VALUE}} 
> that watermark can "overtake" the records that are to be emitted in the 
> {{ContinuousFileReaderOperator}}. Together with the new "allowed lateness" 
> setting in window operator this can lead to elements being dropped as late.
> Also, {{ContinuousFileReaderOperator}} does not correctly assign ingestion 
> timestamps since it is not technically a source but looks like one to the 
> user.
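
A hedged, framework-free sketch of the behaviour the fix needs (all names 
here are hypothetical, not the actual ContinuousFileReaderOperator code): the 
final Long.MAX_VALUE watermark must be held back until all pending splits 
have been read, instead of being passed straight through:

{code}
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: defer the final watermark while records are pending,
// so it cannot overtake them and cause late-element drops downstream.
class DeferredFinalWatermarkSketch {
    private final Queue<String> pendingSplits = new ArrayDeque<>();
    private Long deferredWatermark;

    void onWatermark(long timestamp) {
        if (timestamp == Long.MAX_VALUE && !pendingSplits.isEmpty()) {
            deferredWatermark = timestamp;  // hold until reading finishes
        } else {
            emitWatermark(timestamp);       // safe: nothing left to overtake
        }
    }

    void onSplitFinished(String split) {
        pendingSplits.remove(split);
        if (pendingSplits.isEmpty() && deferredWatermark != null) {
            emitWatermark(deferredWatermark);
            deferredWatermark = null;
        }
    }

    private void emitWatermark(long timestamp) {
        System.out.println("watermark: " + timestamp);
    }
}
{code}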





[GitHub] flink pull request #2350: [FLINK-4329] Fix Streaming File Source Timestamps/...

2016-08-26 Thread kl0u
Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/2350


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4499) Introduce findbugs maven plugin

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440831#comment-15440831
 ] 

ASF GitHub Bot commented on FLINK-4499:
---

Github user smarthi commented on the issue:

https://github.com/apache/flink/pull/2422
  
I updated the findbugs-exclude.xml with some basic rules that can be 
ignored.

I set the setting to 'default' for now - I can't keep it at 'High' since it 
was freezing up my laptop.

There are still more rules that can be excluded, but I think this is a good 
start to seed the excluded rules.

Build times take a hit when the setting is 'High', so I now have it set to 
'default'.
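
For reference, a hedged sketch of a findbugs-maven-plugin configuration of 
the kind under discussion; the plugin version, the exclude-file path, and the 
exact setting the comment refers to (the plugin has both an effort and a 
threshold parameter) are assumptions, not the PR's contents:

{code}
<!-- Sketch only; the actual configuration in PR #2422 may differ. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>findbugs-maven-plugin</artifactId>
  <version>3.0.1</version>
  <configuration>
    <!-- analysis cost: Min | Default | Max -->
    <effort>Default</effort>
    <!-- minimum bug severity to report: High | Default | Low -->
    <threshold>Default</threshold>
    <excludeFilterFile>findbugs-exclude.xml</excludeFilterFile>
  </configuration>
</plugin>
{code}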


> Introduce findbugs maven plugin
> ---
>
> Key: FLINK-4499
> URL: https://issues.apache.org/jira/browse/FLINK-4499
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> As suggested by Stephan in FLINK-4482, this issue is to add 
> findbugs-maven-plugin into the build process so that we can detect lack of 
> proper locking and other defects automatically.





[GitHub] flink issue #2422: FLINK-4499: Introduce findbugs maven plugin

2016-08-26 Thread smarthi
Github user smarthi commented on the issue:

https://github.com/apache/flink/pull/2422
  
I updated the findbugs-exclude.xml with some basic rules that can be 
ignored.

I set the setting to 'default' for now - I can't keep it at 'High' since it 
was freezing up my laptop.

There are still more rules that can be excluded, but I think this is a good 
start to seed the excluded rules.

Build times take a hit when the setting is 'High', so I now have it set to 
'default'.




[jira] [Commented] (FLINK-4516) ResourceManager leadership election

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440828#comment-15440828
 ] 

ASF GitHub Bot commented on FLINK-4516:
---

GitHub user beyond1920 opened a pull request:

https://github.com/apache/flink/pull/2427

[FLINK-4516] [cluster management]leader election of resourcemanager

This pull request implements resourceManager leader election, which 
includes the following:
1. When a resourceManager is started, it first starts the leadership election 
service and takes part in contending for leadership.
2. Every resourceManager contains a ResourceManagerLeaderContender. When it 
is granted leadership, it will start the SlotManager and other main 
components; when its leadership is revoked, it will stop all of its 
components and clear everything.

The main changes are 3 points (a sketch follows this list):
1. Add ResourceManagerLeaderContender
2. Add a getResourceManagerLeaderElectionService method to 
HighAvailabilityServices, NonHaServices, and TestingHighAvailabilityServices 
to get the leadership election service for the resourceManager
3. Add a test for ResourceManager HA
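
A hedged sketch of the contender lifecycle the PR describes. Flink's real 
LeaderContender interface does provide grantLeadership/revokeLeadership 
callbacks, but the component wiring shown here is an illustrative 
simplification, not the PR's code:

{code}
import java.util.UUID;

// Illustrative only: start main components when leadership is granted,
// stop them and clear state when it is revoked.
class ResourceManagerLeaderContenderSketch {
    private volatile boolean componentsRunning;

    public void grantLeadership(UUID leaderSessionID) {
        startSlotManagerAndComponents();    // start SlotManager etc.
        componentsRunning = true;
    }

    public void revokeLeadership() {
        if (componentsRunning) {
            stopComponentsAndClearState();  // stop everything, clear state
            componentsRunning = false;
        }
    }

    private void startSlotManagerAndComponents() { /* ... */ }
    private void stopComponentsAndClearState()   { /* ... */ }
}
{code}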

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alibaba/flink jira-4516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2427.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2427


commit 3117989b87c5e9a002251353a47271fed0a84271
Author: beyond1920 
Date:   2016-08-27T06:14:28Z

leader election of resourcemanager




> ResourceManager leadership election
> ---
>
> Key: FLINK-4516
> URL: https://issues.apache.org/jira/browse/FLINK-4516
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: zhangjing
>Assignee: zhangjing
>
> 1. When a resourceManager is started, it starts the leadership election 
> service first and takes part in contending for leadership.
> 2. Every resourceManager contains a ResourceManagerLeaderContender. When it 
> is granted leadership, it will start the SlotManager and other main 
> components; when its leadership is revoked, it will stop all of its 
> components and clear everything.





[GitHub] flink pull request #2427: [FLINK-4516] [cluster management]leader election o...

2016-08-26 Thread beyond1920
GitHub user beyond1920 opened a pull request:

https://github.com/apache/flink/pull/2427

[FLINK-4516] [cluster management]leader election of resourcemanager

This pull request implements resourceManager leader election, which 
includes the following:
1. When a resourceManager is started, it first starts the leadership election 
service and takes part in contending for leadership.
2. Every resourceManager contains a ResourceManagerLeaderContender. When it 
is granted leadership, it will start the SlotManager and other main 
components; when its leadership is revoked, it will stop all of its 
components and clear everything.

The main changes are 3 points:
1. Add ResourceManagerLeaderContender
2. Add a getResourceManagerLeaderElectionService method to 
HighAvailabilityServices, NonHaServices, and TestingHighAvailabilityServices 
to get the leadership election service for the resourceManager
3. Add a test for ResourceManager HA

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alibaba/flink jira-4516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2427.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2427


commit 3117989b87c5e9a002251353a47271fed0a84271
Author: beyond1920 
Date:   2016-08-27T06:14:28Z

leader election of resourcemanager






[jira] [Assigned] (FLINK-4516) ResourceManager leadership election

2016-08-26 Thread zhangjing (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjing reassigned FLINK-4516:


Assignee: zhangjing

> ResourceManager leadership election
> ---
>
> Key: FLINK-4516
> URL: https://issues.apache.org/jira/browse/FLINK-4516
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: zhangjing
>Assignee: zhangjing
>
> 1. When a resourceManager is started, it starts the leadership election 
> service first and takes part in contending for leadership.
> 2. Every resourceManager contains a ResourceManagerLeaderContender. When it 
> is granted leadership, it will start the SlotManager and other main 
> components; when its leadership is revoked, it will stop all of its 
> components and clear everything.





[jira] [Created] (FLINK-4516) ResourceManager leadership election

2016-08-26 Thread zhangjing (JIRA)
zhangjing created FLINK-4516:


 Summary: ResourceManager leadership election
 Key: FLINK-4516
 URL: https://issues.apache.org/jira/browse/FLINK-4516
 Project: Flink
  Issue Type: Sub-task
  Components: Cluster Management
Reporter: zhangjing


1. When a resourceManager is started, it starts the leadership election service 
first and takes part in contending for leadership.
2. Every resourceManager contains a ResourceManagerLeaderContender. When it is 
granted leadership, it will start the SlotManager and other main components; 
when its leadership is revoked, it will stop all of its components and clear 
everything.





[jira] [Commented] (FLINK-4347) Implement SlotManager core

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440532#comment-15440532
 ] 

ASF GitHub Bot commented on FLINK-4347:
---

Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/2388
  
I think I shall do the resolving.
The commits are squashed and the conflicts have been resolved.


> Implement SlotManager core
> --
>
> Key: FLINK-4347
> URL: https://issues.apache.org/jira/browse/FLINK-4347
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: Kurt Young
>Assignee: Kurt Young
>
> The slot manager is responsible to maintain the list of slot requests and 
> slot allocations. It allows to request slots from the registered 
> TaskExecutors and issues container allocation requests in case that there are 
> not enough available resources.
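
A hedged, framework-free sketch of the bookkeeping this describes (all names 
illustrative, not Flink's code): pending requests are matched against free 
slots, and a new container is requested when none is free:

{code}
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class SlotManagerSketch {
    private final Queue<String> freeSlots = new ArrayDeque<>();
    private final Queue<String> pendingRequests = new ArrayDeque<>();
    private final Map<String, String> allocations = new HashMap<>(); // slot -> request

    void requestSlot(String requestId) {
        String slot = freeSlots.poll();
        if (slot != null) {
            allocations.put(slot, requestId);  // satisfy immediately
        } else {
            pendingRequests.add(requestId);    // remember the request
            requestNewContainer();             // not enough resources
        }
    }

    // called when a registered TaskExecutor reports a free slot
    void slotRegistered(String slotId) {
        String waiting = pendingRequests.poll();
        if (waiting != null) {
            allocations.put(slotId, waiting);
        } else {
            freeSlots.add(slotId);
        }
    }

    private void requestNewContainer() { /* issue container allocation request */ }
}
{code}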





[GitHub] flink issue #2388: [FLINK-4347][cluster management] Implement SlotManager co...

2016-08-26 Thread KurtYoung
Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/2388
  
I think I shall do the resolving.
The commits are squashed and the conflicts have been resolved.




[jira] [Commented] (FLINK-1979) Implement Loss Functions

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440354#comment-15440354
 ] 

ASF GitHub Bot commented on FLINK-1979:
---

Github user skavulya commented on the issue:

https://github.com/apache/flink/pull/1985
  
Hi @chiwanpark, sorry, I hadn't checked this PR in a while. I merged the 
latest master. Do you have any preference on an alternative name for 
RegularizationPenalty? Would Regularizer work better?


> Implement Loss Functions
> 
>
> Key: FLINK-1979
> URL: https://issues.apache.org/jira/browse/FLINK-1979
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Johannes Günther
>Assignee: Johannes Günther
>Priority: Minor
>  Labels: ML
>
> For convex optimization problems, optimizer methods like SGD rely on a 
> pluggable implementation of a loss function and its first derivative.
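
A hedged sketch of such a pluggable abstraction (names are illustrative, not 
FlinkML's actual API): the optimizer only needs the loss value and its first 
derivative, shown here with squared loss:

{code}
interface LossFunction {
    double loss(double prediction, double label);        // L(p, y)
    double derivative(double prediction, double label);  // dL/dp
}

// Example implementation: squared loss, whose derivative is the residual.
class SquaredLoss implements LossFunction {
    public double loss(double p, double y)       { return 0.5 * (p - y) * (p - y); }
    public double derivative(double p, double y) { return p - y; }
}
{code}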





[GitHub] flink issue #1985: [FLINK-1979] Add logistic loss, hinge loss and regulariza...

2016-08-26 Thread skavulya
Github user skavulya commented on the issue:

https://github.com/apache/flink/pull/1985
  
Hi @chiwanpark, sorry, I hadn't checked this PR in a while. I merged the 
latest master. Do you have any preference on an alternative name for 
RegularizationPenalty? Would Regularizer work better?




[jira] [Commented] (FLINK-3950) Add Meter Metric Type

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439854#comment-15439854
 ] 

ASF GitHub Bot commented on FLINK-3950:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2374
  
Hey @zentol. Thank you for your comments.
I've updated the PR. Could you take a look at it again?


> Add Meter Metric Type
> -
>
> Key: FLINK-3950
> URL: https://issues.apache.org/jira/browse/FLINK-3950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Stephan Ewen
>Assignee: Ivan Mushketyk
>






[GitHub] flink issue #2374: [FLINK-3950] Add Meter Metric Type

2016-08-26 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2374
  
Hey @zentol. Thank you for your comments.
I've updated the PR. Could you take a look at it again?




[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439838#comment-15439838
 ] 

ASF GitHub Bot commented on FLINK-3677:
---

GitHub user mushketyk opened a pull request:

https://github.com/apache/flink/pull/2426

[FLINK-3677] Remove Guava dependency from flink-core

Removed Guava dependency from flink-core as discussed here: 
https://github.com/apache/flink/pull/2109

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mushketyk/flink remove-guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2426


commit 9c8647a91fc82a09f0be98543771df63c9ac41e8
Author: Ivan Mushketyk 
Date:   2016-08-26T20:58:27Z

[FLINK-3677] Remove Guava dependency from flink-core




> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.





[GitHub] flink pull request #2426: [FLINK-3677] Remove Guava dependency from flink-co...

2016-08-26 Thread mushketyk
GitHub user mushketyk opened a pull request:

https://github.com/apache/flink/pull/2426

[FLINK-3677] Remove Guava dependency from flink-core

Removed Guava dependency from flink-core as discussed here: 
https://github.com/apache/flink/pull/2109

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mushketyk/flink remove-guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2426


commit 9c8647a91fc82a09f0be98543771df63c9ac41e8
Author: Ivan Mushketyk 
Date:   2016-08-26T20:58:27Z

[FLINK-3677] Remove Guava dependency from flink-core






[jira] [Created] (FLINK-4515) org.apache.flink.runtime.jobmanager.JobInfo class is not backwards compatible with 1.1 released version

2016-08-26 Thread Nagarjun Guraja (JIRA)
Nagarjun Guraja created FLINK-4515:
--

 Summary: org.apache.flink.runtime.jobmanager.JobInfo class is not 
backwards compatible with 1.1 released version
 Key: FLINK-4515
 URL: https://issues.apache.org/jira/browse/FLINK-4515
 Project: Flink
  Issue Type: Bug
Reporter: Nagarjun Guraja


The current org.apache.flink.runtime.jobmanager.JobInfo in the 1.2 trunk is 
not backwards compatible, which breaks job recovery when upgrading to the 
latest Flink build from the 1.1 release:

2016-08-26 13:39:56,618 INFO  org.apache.flink.runtime.jobmanager.JobManager
- Attempting to recover all jobs.
2016-08-26 13:39:57,225 ERROR org.apache.flink.runtime.jobmanager.JobManager
- Fatal error: Failed to recover jobs.
java.io.InvalidClassException: org.apache.flink.runtime.jobmanager.JobInfo; 
local class incompatible: stream classdesc serialVersionUID = 
4102282956967236682, local class serialVersionUID = -2377916285980374169
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at 
org.apache.flink.runtime.state.filesystem.FileSerializableStateHandle.getState(FileSerializableStateHandle.java:58)
at 
org.apache.flink.runtime.state.filesystem.FileSerializableStateHandle.getState(FileSerializableStateHandle.java:35)
at 
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraphs(ZooKeeperSubmittedJobGraphStore.java:173)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$2$$anonfun$apply$mcV$sp$2.apply$mcV$sp(JobManager.scala:543)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$2$$anonfun$apply$mcV$sp$2.apply(JobManager.scala:539)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$2$$anonfun$apply$mcV$sp$2.apply(JobManager.scala:539)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$2.apply$mcV$sp(JobManager.scala:539)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$2.apply(JobManager.scala:535)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$2.apply(JobManager.scala:535)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
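
This is standard Java serialization behaviour: without an explicit 
serialVersionUID, the JVM derives one from the class layout, so any change to 
the class invalidates previously serialized state. A hedged sketch of the 
usual remedy, pinning the ID to the 1.1 value reported in the log above 
(fields elided):

{code}
import java.io.Serializable;

// Sketch only: with the 1.1 serialVersionUID pinned, the 1.2 class can
// deserialize recovered 1.1 state, provided the fields stay read-compatible.
class JobInfoSketch implements Serializable {
    private static final long serialVersionUID = 4102282956967236682L;
    // ... fields as in org.apache.flink.runtime.jobmanager.JobInfo ...
}
{code}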





[jira] [Commented] (FLINK-1379) add RSS feed for the blog

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439690#comment-15439690
 ] 

ASF GitHub Bot commented on FLINK-1379:
---

GitHub user lresende opened a pull request:

https://github.com/apache/zeppelin/pull/1370

[FLINK-1379] Flink interpreter is missing scala libraries

### What is this PR for?
On the Flink interpreter, remove the provided scope from the scala libraries 
so they are copied to the interpreter location.

### What type of PR is it?
[Bug Fix]

### What is the Jira issue?
[ZEPPELIN-1379](https://issues.apache.org/jira/browse/ZEPPELIN-1379)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lresende/incubator-zeppelin flink-dependencies

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zeppelin/pull/1370.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1370


commit 7d39c0d040a29ed92b515a70862bcd7a9e7c0824
Author: Luciano Resende 
Date:   2016-08-26T19:49:19Z

[FLINK-1379] Flink interpreter is missing scala libraries

Remove provided scope from scala libraries to enable copying
them to interpreter location.
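
A hedged sketch of the kind of dependency change the commit describes; the 
coordinates and version are inferred from the jar listing in the follow-up 
comment and may not match the PR exactly:

{code}
<!-- Dropping provided scope so the scala jars are copied to the
     interpreter directory. -->
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.5</version>
  <!-- <scope>provided</scope> removed by this fix -->
</dependency>
{code}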




> add RSS feed for the blog
> -
>
> Key: FLINK-1379
> URL: https://issues.apache.org/jira/browse/FLINK-1379
> Project: Flink
>  Issue Type: Improvement
>  Components: Project Website
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Attachments: feed.patch
>
>
> I couldn't find an RSS feed for the Flink blog. I think that a feed helps a 
> lot of people to stay up to date with the changes in Flink. 
> [FLINK-391] mentions a RSS feed but it does not seem to exist.





[jira] [Commented] (FLINK-1379) add RSS feed for the blog

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439691#comment-15439691
 ] 

ASF GitHub Bot commented on FLINK-1379:
---

Github user lresende commented on the issue:

https://github.com/apache/zeppelin/pull/1370
  
After the fix, I performed a full build and compared the scala libraries from 
flink and ignite:

$ cd flink/
$ ls scala*
-rw-r--r--  1 lresende  staff    14M Aug 26 12:39 scala-compiler-2.10.5.jar
-rw-r--r--  1 lresende  staff   6.8M Aug 26 12:39 scala-library-2.10.5.jar
-rw-r--r--  1 lresende  staff   3.1M Aug 26 12:39 scala-reflect-2.10.5.jar
$ cd ..
$ cd ignite/
$ ls scala*
-rw-r--r--  1 lresende  staff    14M Aug 26 12:39 scala-compiler-2.10.5.jar
-rw-r--r--  1 lresende  staff   6.8M Aug 26 12:39 scala-library-2.10.5.jar
-rw-r--r--  1 lresende  staff   3.1M Aug 26 12:39 scala-reflect-2.10.5.jar


> add RSS feed for the blog
> -
>
> Key: FLINK-1379
> URL: https://issues.apache.org/jira/browse/FLINK-1379
> Project: Flink
>  Issue Type: Improvement
>  Components: Project Website
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Attachments: feed.patch
>
>
> I couldn't find an RSS feed for the Flink blog. I think that a feed helps a 
> lot of people to stay up to date with the changes in Flink. 
> [FLINK-391] mentions a RSS feed but it does not seem to exist.





[jira] [Resolved] (FLINK-4480) Incorrect link to elastic.co in documentation

2016-08-26 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi resolved FLINK-4480.
--
Resolution: Fixed

> Incorrect link to elastic.co in documentation
> -
>
> Key: FLINK-4480
> URL: https://issues.apache.org/jira/browse/FLINK-4480
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.2.0, 1.1.1
>Reporter: Fabian Hueske
>Assignee: Suneel Marthi
>Priority: Trivial
> Fix For: 1.1.2
>
>
> The link URL of the entry "Elasticsearch 2x (sink)" on the connector's 
> documentation page 
> https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/index.html
>  is pointing to http://elastic.com but should point to http://elastic.co





[jira] [Updated] (FLINK-4480) Incorrect link to elastic.co in documentation

2016-08-26 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated FLINK-4480:
-
Fix Version/s: 1.1.2

> Incorrect link to elastic.co in documentation
> -
>
> Key: FLINK-4480
> URL: https://issues.apache.org/jira/browse/FLINK-4480
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.2.0, 1.1.1
>Reporter: Fabian Hueske
>Assignee: Suneel Marthi
>Priority: Trivial
> Fix For: 1.1.2
>
>
> The link URL of the entry "Elasticsearch 2x (sink)" on the connector's 
> documentation page 
> https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/index.html
>  is pointing to http://elastic.com but should point to http://elastic.co





[jira] [Commented] (FLINK-3930) Implement Service-Level Authorization

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439611#comment-15439611
 ] 

ASF GitHub Bot commented on FLINK-3930:
---

GitHub user vijikarthi opened a pull request:

https://github.com/apache/flink/pull/2425

FLINK-3930 Added shared secret based authorization for Flink service …

This PR addresses the FLINK-3930 requirements. It enables shared-secret-based 
secure cookie authorization for the following components:

- Akka layer
- Blob Service
- Web UI

Secure cookie authentication can be enabled by adding the configurations 
below to the Flink configuration file.

- `security.enabled`: A boolean value (true|false) indicating whether 
security is enabled.
- `security.cookie`: The secure cookie value to be used for authentication. 
For standalone deployment mode, the secure cookie value is mandatory when 
security is enabled; for Yarn mode it is optional (auto-generated if not 
provided).

Alternatively, the secure cookie value can be provided through the Flink/Yarn 
CLI using the "-k" or "--cookie" parameter option.

The web runtime module prompts for the secure cookie using the standard basic 
HTTP authentication mechanism, where the user id field is a noop and the 
password field is used to capture the secure cookie.
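
A hedged example of the flink-conf.yaml entries this describes (the cookie 
value is a placeholder):

{code}
# Enable shared-secret authorization (illustrative values)
security.enabled: true
security.cookie: my-shared-secret-value
{code}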

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vijikarthi/flink FLINK-3930

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2425.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2425


commit 33d391cb17e68dd203328a91fa6b63218884b49d
Author: Vijay Srinivasaraghavan 
Date:   2016-08-26T19:02:20Z

FLINK-3930 Added shared secret based authorization for Flink service 
components




> Implement Service-Level Authorization
> -
>
> Key: FLINK-3930
> URL: https://issues.apache.org/jira/browse/FLINK-3930
> Project: Flink
>  Issue Type: New Feature
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>  Labels: security
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> _This issue is part of a series of improvements detailed in the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Service-level authorization is the initial authorization mechanism to ensure 
> clients (or servers) connecting to the Flink cluster are authorized to do so. 
>   The purpose is to prevent a cluster from being used by an unauthorized 
> user, whether to execute jobs, disrupt cluster functionality, or gain access 
> to secrets stored within the cluster.
> Implement service-level authorization as described in the design doc.
> - Introduce a shared secret cookie
> - Enable Akka security cookie
> - Implement data transfer authentication
> - Secure the web dashboard





[GitHub] flink pull request #2425: FLINK-3930 Added shared secret based authorization...

2016-08-26 Thread vijikarthi
GitHub user vijikarthi opened a pull request:

https://github.com/apache/flink/pull/2425

FLINK-3930 Added shared secret based authorization for Flink service …

This PR addresses the FLINK-3930 requirements. It enables shared-secret-based 
secure cookie authorization for the following components:

- Akka layer
- Blob Service
- Web UI

Secure cookie authentication can be enabled by adding the configurations 
below to the Flink configuration file.

- `security.enabled`: A boolean value (true|false) indicating whether 
security is enabled.
- `security.cookie`: The secure cookie value to be used for authentication. 
For standalone deployment mode, the secure cookie value is mandatory when 
security is enabled; for Yarn mode it is optional (auto-generated if not 
provided).

Alternatively, the secure cookie value can be provided through the Flink/Yarn 
CLI using the "-k" or "--cookie" parameter option.

The web runtime module prompts for the secure cookie using the standard basic 
HTTP authentication mechanism, where the user id field is a noop and the 
password field is used to capture the secure cookie.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vijikarthi/flink FLINK-3930

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2425.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2425


commit 33d391cb17e68dd203328a91fa6b63218884b49d
Author: Vijay Srinivasaraghavan 
Date:   2016-08-26T19:02:20Z

FLINK-3930 Added shared secret based authorization for Flink service 
components






[GitHub] flink issue #2418: [FLINK-4245] JMXReporter exposes all defined variables

2016-08-26 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2418
  
The funky thing is that internally the ObjectName reconverts the Hashtable 
to a Map...




[jira] [Commented] (FLINK-4245) Metric naming improvements

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439462#comment-15439462
 ] 

ASF GitHub Bot commented on FLINK-4245:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2418
  
The funky thing is that internally the ObjectName reconverts the Hashtable 
to a Map...


> Metric naming improvements
> --
>
> Key: FLINK-4245
> URL: https://issues.apache.org/jira/browse/FLINK-4245
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Stephan Ewen
>
> A metric currently has two parts to it:
>   - The name of that particular metric
>   - The "scope" (or namespace), defined by the group that contains the metric.
> A metric group actually always implicitly has a map of naming "tags", like:
>   - taskmanager_host : 
>   - taskmanager_id : 
>   - task_name : "map() -> filter()"
> We derive the scope from that map, following the defined scope formats.
> For JMX (and some users that use JMX), it would be natural to expose that map 
> of tags. Some users reconstruct that map by parsing the metric scope. In 
> JMX, we can expose a metric like:
>   - domain: "taskmanager.task.operator.io"
>   - name: "numRecordsIn"
>   - tags: { "hostname" -> "localhost", "operator_name" -> "map() at 
> X.java:123", ... }
> For many other reporters, the formatted scope makes a lot of sense, since 
> they think only in terms of (scope, metric-name).
> We may even have the formatted scope in JMX as well (in the domain), if we 
> want to go that route. 
> [~jgrier] and [~Zentol] - what do you think about that?
> [~mdaxini] Does that match your use of the metrics?
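
A hedged illustration of the proposed (domain, name, tags) split using the 
standard javax.management API; the domain and tag values are taken from the 
description, lightly sanitized because ObjectName property values may not 
contain unquoted special characters:

{code}
import java.util.Hashtable;
import javax.management.ObjectName;

public class MetricNameExample {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> tags = new Hashtable<>();
        tags.put("name", "numRecordsIn");
        tags.put("hostname", "localhost");
        tags.put("operator_name", "map_at_X.java_123"); // sanitized value
        // The ObjectName keeps the tags as key properties under the domain.
        ObjectName metric = new ObjectName("taskmanager.task.operator.io", tags);
        System.out.println(metric);
    }
}
{code}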





[jira] [Updated] (FLINK-4514) ExpiredIteratorException in Kinesis Consumer on long catch-ups to head of stream

2016-08-26 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4514:
---
Affects Version/s: 1.1.1
   1.1.0

> ExpiredIteratorException in Kinesis Consumer on long catch-ups to head of 
> stream
> 
>
> Key: FLINK-4514
> URL: https://issues.apache.org/jira/browse/FLINK-4514
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.1.0, 1.1.1
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0, 1.1.2
>
>
> Original mailing thread for the reported issue:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-connector-Iterator-expired-exception-td8711.html
> Normally, the exception is thrown when the consumer uses the same shard 
> iterator after 5 minutes since it was retrieved. I've still yet to clarify & 
> reproduce the root cause of the {{ExpiredIteratorException}}, because from 
> the code this seems to be impossible. I'm leaning towards suspecting this is 
> a Kinesis-side issue (from the description in the ML, the behaviour also 
> seems indeterminate).
> Either way, the exception can be fairly easily handled so that the consumer 
> doesn't just fail. When caught, we request a new shard iterator from Kinesis 
> with the last sequence number.
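
A hedged sketch of that handling with the AWS Java SDK (the SDK calls are 
real; the surrounding method and its state - stream name, shard id, last 
sequence number - are illustrative):

{code}
import com.amazonaws.services.kinesis.AmazonKinesisClient;
import com.amazonaws.services.kinesis.model.ExpiredIteratorException;
import com.amazonaws.services.kinesis.model.GetRecordsRequest;
import com.amazonaws.services.kinesis.model.GetRecordsResult;
import com.amazonaws.services.kinesis.model.GetShardIteratorRequest;
import com.amazonaws.services.kinesis.model.ShardIteratorType;

class ShardConsumerSketch {
    // On an expired iterator, request a fresh one positioned right after the
    // last consumed sequence number, then retry the read.
    GetRecordsResult getRecordsWithRetry(AmazonKinesisClient kinesis,
                                         String streamName, String shardId,
                                         String shardIterator, String lastSeqNum) {
        try {
            return kinesis.getRecords(
                    new GetRecordsRequest().withShardIterator(shardIterator));
        } catch (ExpiredIteratorException e) {
            String fresh = kinesis.getShardIterator(new GetShardIteratorRequest()
                            .withStreamName(streamName)
                            .withShardId(shardId)
                            .withShardIteratorType(ShardIteratorType.AFTER_SEQUENCE_NUMBER)
                            .withStartingSequenceNumber(lastSeqNum))
                    .getShardIterator();
            return kinesis.getRecords(
                    new GetRecordsRequest().withShardIterator(fresh));
        }
    }
}
{code}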





[jira] [Updated] (FLINK-4514) ExpiredIteratorException in Kinesis Consumer on long catch-ups to head of stream

2016-08-26 Thread Tzu-Li (Gordon) Tai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-4514:
---
Description: 
Original mailing thread for the reported issue:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-connector-Iterator-expired-exception-td8711.html

Normally, the exception is thrown when the consumer uses the same shard 
iterator after 5 minutes since it was retrieved. I've still yet to clarify & 
reproduce the root cause of the {{ExpiredIteratorException}}, because from the 
code this seems to be impossible. I'm leaning towards suspecting this is a 
Kinesis-side issue (from the description in the ML, the behaviour also seems 
indeterminate).

Either way, the exception can be fairly easily handled so that the consumer 
doesn't just fail. When caught, we request a new shard iterator from Kinesis 
with the last sequence number.

  was:
Original mailing thread for the reported issue:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-connector-Iterator-expired-exception-td8711.html

Normally, the exception is thrown when the consumer uses the same shard 
iterator after 5 minutes since it was retrieved. I've still yet to clarify & 
reproduce the root cause of the {{ExpiredIteratorException}}, because from the 
code it seems to be impossible. I'm leaning towards suspecting this is a 
Kinesis-side issue (from the description in the ML, the behaviour also seems 
indeterminate).

Either way, the exception can be fairly easily handled so that the consumer 
doesn't just fail. When caught, we request a new shard iterator from Kinesis 
with the last sequence number.


> ExpiredIteratorException in Kinesis Consumer on long catch-ups to head of 
> stream
> 
>
> Key: FLINK-4514
> URL: https://issues.apache.org/jira/browse/FLINK-4514
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0, 1.1.2
>
>
> Original mailing thread for the reported issue:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-connector-Iterator-expired-exception-td8711.html
> Normally, the exception is thrown when the consumer uses the same shard 
> iterator after 5 minutes since it was retrieved. I've still yet to clarify & 
> reproduce the root cause of the {{ExpiredIteratorException}}, because from 
> the code this seems to be impossible. I'm leaning towards suspecting this is 
> a Kinesis-side issue (from the description in the ML, the behaviour also 
> seems indeterminate).
> Either way, the exception can be fairly easily handled so that the consumer 
> doesn't just fail. When caught, we request a new shard iterator from Kinesis 
> with the last sequence number.





[jira] [Created] (FLINK-4514) ExpiredIteratorException in Kinesis Consumer on long catch-ups to head of stream

2016-08-26 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-4514:
--

 Summary: ExpiredIteratorException in Kinesis Consumer on long 
catch-ups to head of stream
 Key: FLINK-4514
 URL: https://issues.apache.org/jira/browse/FLINK-4514
 Project: Flink
  Issue Type: Bug
  Components: Kinesis Connector
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: 1.2.0, 1.1.2


Original mailing thread for the reported issue:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-connector-Iterator-expired-exception-td8711.html

Normally, the exception is thrown when the consumer uses the same shard 
iterator after 5 minutes since it was retrieved. I've still yet to clarify & 
reproduce the root cause of the {{ExpiredIteratorException}}, because from the 
code it seems to be impossible. I'm leaning towards suspecting this is a 
Kinesis-side issue (from the description in the ML, the behaviour also seems 
indeterminate).

Either way, the exception can be fairly easily handled so that the consumer 
doesn't just fail. When caught, we request a new shard iterator from Kinesis 
with the last sequence number.





[jira] [Commented] (FLINK-4502) Cassandra connector documentation has misleading consistency guarantees

2016-08-26 Thread Elias Levy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439445#comment-15439445
 ] 

Elias Levy commented on FLINK-4502:
---

Chesnay, thanks for correcting me about the overwriting issue.  I missed the 
implication of setting the timestamp on the queries.

I still think the documentation is not clear that idempotent queries are a 
prerequisite of exactly-once semantics when the WAL is enabled, as shown by the 
portion of the documentation I quoted.  The connector documentation page 
mentions idempotent once, in the paragraph previous to the one describing the 
WAL.

The WAL functionality description could also be clearer.  For instance, it says 
"The write-ahead log guarantees that the replayed checkpoint is identical to 
the first attempt."  But that does not seem accurate.  The WAL doesn't appear 
to guarantee that a replayed checkpoint is identical.  Rather, it guarantees 
that no output associated with a checkpoint is written to the sink until the 
checkpoint is complete, which avoids possibly incompatible duplicate outputs 
when data from non-completed checkpoints are replayed on recovery.


> Cassandra connector documentation has misleading consistency guarantees
> ---
>
> Key: FLINK-4502
> URL: https://issues.apache.org/jira/browse/FLINK-4502
> Project: Flink
>  Issue Type: Bug
>  Components: Cassandra Connector, Documentation
>Affects Versions: 1.1.0
>Reporter: Elias Levy
>
> The Cassandra connector documentation states that  "enableWriteAheadLog() is 
> an optional method, that allows exactly-once processing for non-deterministic 
> algorithms."  This claim appears to be false.
> From what I gather, the write ahead log feature of the connector works as 
> follows:
> - The sink is replaced with a stateful operator that writes incoming messages 
> to the state backend based on checkpoint they belong in.
> - When the operator is notified that a Flink checkpoint has been completed 
> it, for each set of checkpoints older than and including the committed one:
>   * reads its messages from the state backend
>   * writes them to Cassandra
>   * records that it has committed them to Cassandra for the specific 
> checkpoint and operator instance
>* and erases them from the state backend.
> This process attempts to avoid resubmitting queries to Cassandra that would 
> otherwise occur when recovering a job from a checkpoint and having messages 
> replayed.
> Alas, this does not guarantee exactly once semantics at the sink.  The writes 
> to Cassandra that occur when the operator is notified that checkpoint is 
> completed are not atomic and they are potentially non-idempotent.  If the job 
> dies while writing to Cassandra or before committing the checkpoint via 
> committer, queries will be replayed when the job recovers.  Thus the 
> documentation appear to be incorrect in stating this provides exactly-once 
> semantics.
> There also seems to be an issue in GenericWriteAheadSink's 
> notifyOfCompletedCheckpoint which may result in incorrect output.  If 
> sendValues returns false because a write failed, instead of bailing, it 
> simply moves on to the next checkpoint to commit if there is one, keeping the 
> previous one around to try again later.  But that can result in newer data 
> being overwritten with older data when the previous checkpoint is retried.  
> Although given that CassandraCommitter implements isCheckpointCommitted as 
> checkpointID <= this.lastCommittedCheckpointID, it actually means that when 
> it goes back to try the uncommitted older checkpoint it will consider it 
> committed, even though some of its data may not have been written out, and 
> the data will be discarded.
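
A paraphrase of the committer predicate the description quotes (not the 
verbatim CassandraCommitter source), showing why a skipped older checkpoint 
is later treated as committed:

{code}
class CommitterSketch {
    private long lastCommittedCheckpointID;

    // As described: any ID at or below the last committed one counts as
    // committed, even if writing its data previously failed.
    boolean isCheckpointCommitted(long checkpointID) {
        return checkpointID <= lastCommittedCheckpointID;
    }

    void checkpointCommitted(long checkpointID) {
        lastCommittedCheckpointID =
                Math.max(lastCommittedCheckpointID, checkpointID);
    }
}
{code}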





[GitHub] flink pull request #2416: FLINK-4480: Incorrect link to elastic.co in docume...

2016-08-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2416




[jira] [Commented] (FLINK-4418) ClusterClient/ConnectionUtils#findConnectingAddress fails immediately if InetAddress.getLocalHost throws exception

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439379#comment-15439379
 ] 

ASF GitHub Bot commented on FLINK-4418:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2383


> ClusterClient/ConnectionUtils#findConnectingAddress fails immediately if 
> InetAddress.getLocalHost throws exception
> --
>
> Key: FLINK-4418
> URL: https://issues.apache.org/jira/browse/FLINK-4418
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Shannon Carey
>Assignee: Shannon Carey
>
> When attempting to connect to a cluster with a ClusterClient, if the 
> machine's hostname is not resolvable to an IP, an exception is thrown 
> preventing success.
> This is the case if, for example, the hostname is not present & mapped to a 
> local IP in /etc/hosts.
> The exception is below. I suggest that findAddressUsingStrategy() should 
> catch java.net.UnknownHostException thrown by InetAddress.getLocalHost() and 
> return null, allowing alternative strategies to be attempted by 
> findConnectingAddress(). I will open a PR to this effect. Ideally this could 
> be included in both 1.2 and 1.1.2.
> In the stack trace below, "ip-10-2-64-47" is the internal host name of an AWS 
> EC2 instance.
> {code}
> 21:11:35 org.apache.flink.client.program.ProgramInvocationException: Failed 
> to retrieve the JobManager gateway.
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:430)
> 21:11:35  at 
> org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:90)
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:389)
> 21:11:35  at 
> org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:75)
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:334)
> 21:11:35  at 
> com.expedia.www.flink.job.scheduler.FlinkJobSubmitter.get(FlinkJobSubmitter.java:81)
> 21:11:35  at 
> com.expedia.www.flink.job.scheduler.streaming.StreamingJobManager.run(StreamingJobManager.java:105)
> 21:11:35  at 
> com.expedia.www.flink.job.scheduler.JobScheduler.runStreamingApp(JobScheduler.java:69)
> 21:11:35  at 
> com.expedia.www.flink.job.scheduler.JobScheduler.main(JobScheduler.java:34)
> 21:11:35 Caused by: java.lang.RuntimeException: Failed to resolve JobManager 
> address at /10.2.89.80:43126
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:189)
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:649)
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:428)
> 21:11:35  ... 8 more
> 21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: 
> ip-10-2-64-47: unknown error
> 21:11:35  at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
> 21:11:35  at 
> org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:232)
> 21:11:35  at 
> org.apache.flink.runtime.net.ConnectionUtils.findConnectingAddress(ConnectionUtils.java:123)
> 21:11:35  at 
> org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:187)
> 21:11:35  ... 10 more
> 21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: unknown 
> error
> 21:11:35  at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
> 21:11:35  at 
> java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
> 21:11:35  at 
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
> 21:11:35  at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
> 21:11:35  ... 13 more
> {code}
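
A hedged sketch of the suggested change (the real findAddressUsingStrategy 
takes strategy and target-address parameters; this is a simplified 
illustration):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressDetectionSketch {
    // Swallow the UnknownHostException from InetAddress.getLocalHost() and
    // return null so findConnectingAddress() can try its other strategies.
    static InetAddress tryLocalHostStrategy() {
        try {
            return InetAddress.getLocalHost();
        } catch (UnknownHostException e) {
            return null; // log and fall through to the next strategy
        }
    }
}
{code}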





[jira] [Commented] (FLINK-4480) Incorrect link to elastic.co in documentation

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439380#comment-15439380
 ] 

ASF GitHub Bot commented on FLINK-4480:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2416


> Incorrect link to elastic.co in documentation
> -
>
> Key: FLINK-4480
> URL: https://issues.apache.org/jira/browse/FLINK-4480
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.2.0, 1.1.1
>Reporter: Fabian Hueske
>Assignee: Suneel Marthi
>Priority: Trivial
>
> The link URL of the entry "Elasticsearch 2x (sink)" on the connector's 
> documentation page 
> https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/index.html
>  is pointing to http://elastic.com but should point to http://elastic.co





[GitHub] flink pull request #2383: [FLINK-4418] [client] Improve resilience when Inet...

2016-08-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2383




[jira] [Commented] (FLINK-4347) Implement SlotManager core

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439373#comment-15439373
 ] 

ASF GitHub Bot commented on FLINK-4347:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2388
  
Looks quite good all in all.

It would be good to get this in, so that the JobManager follow-up work can 
build on it. I would take this, rebase it, and merge it. If I find any issues 
I would create follow-up issues.


> Implement SlotManager core
> --
>
> Key: FLINK-4347
> URL: https://issues.apache.org/jira/browse/FLINK-4347
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: Kurt Young
>Assignee: Kurt Young
>
> The slot manager is responsible for maintaining the list of slot requests and 
> slot allocations. It allows requesting slots from the registered 
> TaskExecutors and issues container allocation requests in case there are 
> not enough available resources.





[jira] [Commented] (FLINK-3755) Introduce key groups for key-value state to support dynamic scaling

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439372#comment-15439372
 ] 

ASF GitHub Bot commented on FLINK-3755:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2376
  
Very good work and very nice code!

Some comments after a joint review:

  - The most critical issue is that there should not be any blocking on 
async threads during task shutdown. This unnecessarily delays responses to 
canceling and redeployment.

  - At this point, the `KeyGroupAssigner` interface seems a bit useless, 
especially if it is not parametrized with variable key group mappings. For the 
sake of making this simpler and more efficient, one could just have a static 
method for that.

  - I would suggest making the assumption that key groups are always used 
(they should be, even if their number is equal to the parallelism), and drop 
the checks for `numberOfKeyGroups > 0`, for example in the 
KeyGroupHashPartitioner.

  - A bit more difficult is what to assume as the default number of key 
groups. We thought about assuming a default of `128`. That has no overhead in 
state backends like RocksDB, and it also gives initial job deployments that 
did not properly configure this some freedom to scale out. If the parallelism 
is >= 128, this should probably round up to the next highest power of two.

  - There are some log statements which cause log flooding, like an INFO 
log statement for every checkpoint stream (factory) created.
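A minimal sketch of the default-key-groups point above, together with key group assignment as a plain static method; the class and method names are illustrative, not the actual Flink API:

```java
public class KeyGroupDefaults {

    // Default of 128 key groups; if the parallelism exceeds 128, round up
    // to the next highest power of two.
    static int defaultNumberOfKeyGroups(int parallelism) {
        if (parallelism <= 128) {
            return 128;
        }
        return Integer.highestOneBit(parallelism - 1) << 1;
    }

    // Static key group assignment, in place of a KeyGroupAssigner interface.
    static int assignToKeyGroup(Object key, int numberOfKeyGroups) {
        return (key.hashCode() & 0x7FFFFFFF) % numberOfKeyGroups;
    }

    public static void main(String[] args) {
        System.out.println(defaultNumberOfKeyGroups(100)); // 128
        System.out.println(defaultNumberOfKeyGroups(200)); // 256
    }
}
```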



> Introduce key groups for key-value state to support dynamic scaling
> ---
>
> Key: FLINK-3755
> URL: https://issues.apache.org/jira/browse/FLINK-3755
> Project: Flink
>  Issue Type: New Feature
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> In order to support dynamic scaling, it is necessary to sub-partition the 
> key-value states of each operator. This sub-partitioning, which produces a 
> set of key groups, allows Flink jobs to be easily scaled in and out by simply 
> reassigning the different key groups to the new set of subtasks. The idea of 
> key groups is described in this design document [1]. 
> [1] 
> https://docs.google.com/document/d/1G1OS1z3xEBOrYD4wSu-LuBCyPUWyFd9l3T9WyssQ63w/edit?usp=sharing





[GitHub] flink issue #2388: [FLINK-4347][cluster management] Implement SlotManager co...

2016-08-26 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2388
  
Looks quite good all in all.

It would be good to get this in, so that the JobManager follow-up work can 
build on it. I would take this, rebase it, and merge it. If I find any issues 
I would create follow-up issues.




[GitHub] flink issue #2376: [FLINK-3755] Introduce key groups for key-value state to ...

2016-08-26 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2376
  
Very good work and very nice code!

Some comments after a joint review:

  - The most critical issue is that there should not be any blocking on 
async threads during task shutdown. This unnecessarily delays responses to 
canceling and redeployment.

  - At this point, the `KeyGroupAssigner` interface seems a bit useless, 
especially if it is not parametrized with variable key group mappings. For the 
sake of making this simpler and more efficient, one could just have a static 
method for that.

  - I would suggest making the assumption that key groups are always used 
(they should be, even if their number is equal to the parallelism), and drop 
the checks for `numberOfKeyGroups > 0`, for example in the 
KeyGroupHashPartitioner.

  - A bit more difficult is what to assume as the default number of key 
groups. We thought about assuming a default of `128`. That has no overhead in 
state backends like RocksDB, and it also gives initial job deployments that 
did not properly configure this some freedom to scale out. If the parallelism 
is >= 128, this should probably round up to the next highest power of two.

  - There are some log statements which cause log flooding, like an INFO 
log statement for every checkpoint stream (factory) created.





[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439364#comment-15439364
 ] 

ASF GitHub Bot commented on FLINK-3677:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2109
  
@mxm @aljoscha @StephanEwen Thank you for your comments. Seems like this is 
sorted out :)
I'll remove the Guava calls this evening.


> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.





[GitHub] flink issue #2109: [FLINK-3677] FileInputFormat: Allow to specify include/ex...

2016-08-26 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2109
  
@mxm @aljoscha @StephanEwen Thank you for your comments. Seems like this is 
sorted out :)
I'll remove the Guava calls this evening.




[jira] [Commented] (FLINK-4245) Metric naming improvements

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439359#comment-15439359
 ] 

ASF GitHub Bot commented on FLINK-4245:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2418
  
Actually, it may be possible, and not too bad if we assume the Hashtable is 
immutable. Something like this:

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Map;

public class HashtableWrapper extends Hashtable<String, String> {

    private final Map<String, String> backingMap;
    private final String name;

    public HashtableWrapper(Map<String, String> backingMap, String name) {
        super(1);
        this.backingMap = backingMap;
        this.name = name;
    }

    @Override
    public synchronized String get(Object key) {
        if ("name".equals(key)) {
            return name;
        } else {
            return backingMap.get(key);
        }
    }

    @Override
    public synchronized String put(String key, String value) {
        throw new UnsupportedOperationException("immutable hashtable");
    }

    // adapt the backing Map's views to the Enumerations expected by Hashtable

    @Override
    public synchronized Enumeration<String> keys() {
        return Collections.enumeration(backingMap.keySet());
    }

    @Override
    public synchronized Enumeration<String> elements() {
        return Collections.enumeration(backingMap.values());
    }

    // and so on ...
}
```


> Metric naming improvements
> --
>
> Key: FLINK-4245
> URL: https://issues.apache.org/jira/browse/FLINK-4245
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Stephan Ewen
>
> A metric currently has two parts to it:
>   - The name of that particular metric
>   - The "scope" (or namespace), defined by the group that contains the metric.
> A metric group actually always implicitly has a map of naming "tags", like:
>   - taskmanager_host : 
>   - taskmanager_id : 
>   - task_name : "map() -> filter()"
> We derive the scope from that map, following the defined scope formats.
> For JMX (and some users that use JMX), it would be natural to expose that map 
> of tags. Some users reconstruct that map by parsing the metric scope. In JMX, 
> we can expose a metric like:
>   - domain: "taskmanager.task.operator.io"
>   - name: "numRecordsIn"
>   - tags: { "hostname" -> "localhost", "operator_name" -> "map() at 
> X.java:123", ... }
> For many other reporters, the formatted scope makes a lot of sense, since 
> they think only in terms of (scope, metric-name).
> We may even have the formatted scope in JMX as well (in the domain), if we 
> want to go that route. 
> [~jgrier] and [~Zentol] - what do you think about that?
> [~mdaxini] Does that match your use of the metrics?





[GitHub] flink issue #2418: [FLINK-4245] JMXReporter exposes all defined variables

2016-08-26 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2418
  
Actually, it may be possible, and not too bad if we assume the Hashtable is 
immutable. Something like this:

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Map;

public class HashtableWrapper extends Hashtable<String, String> {

    private final Map<String, String> backingMap;
    private final String name;

    public HashtableWrapper(Map<String, String> backingMap, String name) {
        super(1);
        this.backingMap = backingMap;
        this.name = name;
    }

    @Override
    public synchronized String get(Object key) {
        if ("name".equals(key)) {
            return name;
        } else {
            return backingMap.get(key);
        }
    }

    @Override
    public synchronized String put(String key, String value) {
        throw new UnsupportedOperationException("immutable hashtable");
    }

    // adapt the backing Map's views to the Enumerations expected by Hashtable

    @Override
    public synchronized Enumeration<String> keys() {
        return Collections.enumeration(backingMap.keySet());
    }

    @Override
    public synchronized Enumeration<String> elements() {
        return Collections.enumeration(backingMap.values());
    }

    // and so on ...
}
```




[jira] [Commented] (FLINK-4245) Metric naming improvements

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439348#comment-15439348
 ] 

ASF GitHub Bot commented on FLINK-4245:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2418
  
Looks good to me!

The only not-so-nice thing is that we have to convert the `Map` to a 
`Hashtable` for every metric. Given how we tried to optimize the metric 
creation overhead, this is pretty tough.
It seems very hard to "disguise" a map as a hashtable, though...


> Metric naming improvements
> --
>
> Key: FLINK-4245
> URL: https://issues.apache.org/jira/browse/FLINK-4245
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Stephan Ewen
>
> A metric currently has two parts to it:
>   - The name of that particular metric
>   - The "scope" (or namespace), defined by the group that contains the metric.
> A metric group actually always implicitly has a map of naming "tags", like:
>   - taskmanager_host : 
>   - taskmanager_id : 
>   - task_name : "map() -> filter()"
> We derive the scope from that map, following the defined scope formats.
> For JMX (and some users that use JMX), it would be natural to expose that map 
> of tags. Some users reconstruct that map by parsing the metric scope. In JMX, 
> we can expose a metric like:
>   - domain: "taskmanager.task.operator.io"
>   - name: "numRecordsIn"
>   - tags: { "hostname" -> "localhost", "operator_name" -> "map() at 
> X.java:123", ... }
> For many other reporters, the formatted scope makes a lot of sense, since 
> they think only in terms of (scope, metric-name).
> We may even have the formatted scope in JMX as well (in the domain), if we 
> want to go that route. 
> [~jgrier] and [~Zentol] - what do you think about that?
> [~mdaxini] Does that match your use of the metrics?





[GitHub] flink issue #2418: [FLINK-4245] JMXReporter exposes all defined variables

2016-08-26 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2418
  
Looks good to me!

The only not-so-nice thing is that we have to convert the `Map` to a 
`Hashtable` for every metric. Given how we tried to optimize the metric 
creation overhead, this is pretty tough.
It seems very hard to "disguise" a map as a hashtable, though...




[jira] [Commented] (FLINK-4488) Prevent cluster shutdown after job execution for non-detached jobs

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439346#comment-15439346
 ] 

ASF GitHub Bot commented on FLINK-4488:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2419
  
Our current tests make it difficult to test such behavior. I added a check 
to the `YarnTestBase`. Basically, I'm skipping the cluster shutdown to check if 
the JobManager is still alive and hasn't been shut down through other means.


> Prevent cluster shutdown after job execution for non-detached jobs
> --
>
> Key: FLINK-4488
> URL: https://issues.apache.org/jira/browse/FLINK-4488
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.2.0, 1.1.1
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.2.0, 1.1.2
>
>
> In per-job mode, the Yarn cluster currently shuts down after the first 
> interactively executed job. Users may want to execute multiple jobs in one 
> Jar. I would suggest using this mechanism only for jobs which run detached. 
> For interactive jobs, shutdown of the cluster is additionally handled by the 
> CLI which should be sufficient to ensure cluster shutdown. Cluster shutdown 
> could only become a problem in case of a network partition to the cluster or 
> outage of the CLI.





[GitHub] flink issue #2419: [FLINK-4488] only automatically shutdown clusters for det...

2016-08-26 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2419
  
Our current tests make it difficult to test such behavior. I added a check 
to the `YarnTestBase`. Basically, I'm skipping the cluster shutdown to check if 
the JobManager is still alive and hasn't been shut down through other means.




[jira] [Commented] (FLINK-4419) Batch improvement for supporting dfs as a ResultPartitionType

2016-08-26 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439327#comment-15439327
 ] 

Greg Hogan commented on FLINK-4419:
---

My apologies for being late to this discussion. Are we not able to recover when 
a downstream operator fails if the spilled files are written to redundant 
storage? That would not require changes to the DataSet API and would not reduce 
performance.

CC'ing [~StephanEwen] and [~till.rohrmann]

> Batch improvement for supporting dfs as a ResultPartitionType
> -
>
> Key: FLINK-4419
> URL: https://issues.apache.org/jira/browse/FLINK-4419
> Project: Flink
>  Issue Type: Improvement
>  Components: Batch Connectors and Input/Output Formats
>Reporter: shuai.xu
>Assignee: shuai.xu
>
> This is the root issue to track an improvement for batch, which will enable 
> dfs as a ResultPartitionType, so that an upstream node can exit completely 
> after finishing and need not be restarted if downstream nodes fail.
> Full design is shown in 
> (https://docs.google.com/document/d/15HtCtc9Gk8SyHsAezM7Od1opAHgnxLeHm3VX7A8fa-4/edit#).





[jira] [Commented] (FLINK-4509) Specify savepoint directory per savepoint

2016-08-26 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439317#comment-15439317
 ] 

Stephan Ewen commented on FLINK-4509:
-

I think this needs some prerequisite changes to the checkpointing mechanism, 
for example to allow parameterizing the checkpoint (via parameters attached to 
the barrier?).

> Specify savepoint directory per savepoint
> -
>
> Key: FLINK-4509
> URL: https://issues.apache.org/jira/browse/FLINK-4509
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> Currently, savepoints go to a per cluster configured default directory 
> (configured via {{savepoints.state.backend}} and 
> {{savepoints.state.backend.fs.dir}}).
> We shall allow specifying the directory per triggered savepoint in case no 
> default is configured.





[jira] [Commented] (FLINK-4512) Add option for persistent checkpoints

2016-08-26 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439315#comment-15439315
 ] 

Stephan Ewen commented on FLINK-4512:
-

+1 that makes a lot of sense.

One thing to watch out for is that **stopping** a job currently lets it end in 
state FINISHED. Is it desired to remove externalized checkpoints in that case?

It may make sense to change the behavior of "stop()" anyway (have STOPPING and 
STOPPED), but for now, I guess that this may cause confusion. Stopping is 
frequently used as a "soft cancelling".

> Add option for persistent checkpoints
> -
>
> Key: FLINK-4512
> URL: https://issues.apache.org/jira/browse/FLINK-4512
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Ufuk Celebi
>
> Allow periodic checkpoints to be persisted by writing out their meta data. 
> This is what we currently do for savepoints, but in the future checkpoints 
> and savepoints are likely to diverge with respect to guarantees they give for 
> updatability, etc.
> This means that the difference between persistent checkpoints and savepoints 
> in the long term will be that persistent checkpoints can only be restored 
> with the same job settings (like parallelism, etc.)
> Regular and persisted checkpoints should behave differently with respect to 
> disposal in *globally* terminal job states (FINISHED, CANCELLED, FAILED): 
> regular checkpoints are cleaned up in all of these cases whereas persistent 
> checkpoints only on FINISHED. Maybe with the option to customize behaviour on 
> CANCELLED or FAILED.





[jira] [Commented] (FLINK-4510) Always create CheckpointCoordinator

2016-08-26 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439312#comment-15439312
 ] 

Stephan Ewen commented on FLINK-4510:
-

[~jark] Thanks for helping out here. I assigned the issue to you.

> Always create CheckpointCoordinator
> ---
>
> Key: FLINK-4510
> URL: https://issues.apache.org/jira/browse/FLINK-4510
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Ufuk Celebi
>Assignee: Jark Wu
>
> The checkpoint coordinator is only created if a checkpointing interval is 
> configured. This means that no savepoints can be triggered if there is no 
> checkpointing interval specified.
> Instead we should always create it and allow an interval of 0 for disabled 
> periodic checkpoints. 





[jira] [Updated] (FLINK-4510) Always create CheckpointCoordinator

2016-08-26 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-4510:

Assignee: Jark Wu

> Always create CheckpointCoordinator
> ---
>
> Key: FLINK-4510
> URL: https://issues.apache.org/jira/browse/FLINK-4510
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Ufuk Celebi
>Assignee: Jark Wu
>
> The checkpoint coordinator is only created if a checkpointing interval is 
> configured. This means that no savepoints can be triggered if there is no 
> checkpointing interval specified.
> Instead we should always create it and allow an interval of 0 for disabled 
> periodic checkpoints. 





[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439303#comment-15439303
 ] 

ASF GitHub Bot commented on FLINK-3677:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2109
  
Ah, got it too now. The diff only makes it appear like the deprecated 
method is new. Then I take that back :-)
A follow-up on removing Guava would be nice, though...


> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.





[GitHub] flink issue #2109: [FLINK-3677] FileInputFormat: Allow to specify include/ex...

2016-08-26 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2109
  
Ah, got it too now. The diff only makes it appear like the deprecated 
method is new. Then I take that back :-)
A follow-up on removing Guava would be nice, though...




[jira] [Commented] (FLINK-4510) Always create CheckpointCoordinator

2016-08-26 Thread Jark Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439292#comment-15439292
 ] 

Jark Wu commented on FLINK-4510:


Hi [~uce], I would like to work on this issue.

> Always create CheckpointCoordinator
> ---
>
> Key: FLINK-4510
> URL: https://issues.apache.org/jira/browse/FLINK-4510
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Ufuk Celebi
>
> The checkpoint coordinator is only created if a checkpointing interval is 
> configured. This means that no savepoints can be triggered if there is no 
> checkpointing interval specified.
> Instead we should always create it and allow an interval of 0 for disabled 
> periodic checkpoints. 





[jira] [Created] (FLINK-4513) Kafka connector documentation refers to Flink 1.1-SNAPSHOT

2016-08-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4513:


 Summary: Kafka connector documentation refers to Flink 1.1-SNAPSHOT
 Key: FLINK-4513
 URL: https://issues.apache.org/jira/browse/FLINK-4513
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.1.1
Reporter: Fabian Hueske
Priority: Trivial
 Fix For: 1.1.2


The Kafka connector documentation: 
https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/connectors/kafka.html
 of Flink 1.1 refers to a Flink 1.1-SNAPSHOT Maven version. 





[GitHub] flink issue #2109: [FLINK-3677] FileInputFormat: Allow to specify include/ex...

2016-08-26 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2109
  
@mxm @mushketyk Technically this change does add a new method that is the 
now deprecated method minus the `FilePathFilter`. That's a good change, though, 
since it simplifies the API in the long run. So apologies for making a fuss 
about it. 




[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439277#comment-15439277
 ] 

ASF GitHub Bot commented on FLINK-3677:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2109
  
@mxm @mushketyk Technically this change does add a new method that is the 
now deprecated method minus the `FilePathFilter`. That's a good change, though, 
since it simplifies the API in the long run. So apologies for making a fuss 
about it. 


> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.





[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439249#comment-15439249
 ] 

ASF GitHub Bot commented on FLINK-4190:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2269
  
Hi,
I'm very sorry for the delays! I still have this sitting at the top of my list 
and I'm hoping to get this in by the beginning of next week.


> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.
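To illustrate the generalisation the issue asks for, a bucketing interface driven by the element rather than by system time could look like this (a hedged sketch; the names are hypothetical and not the API of the eventual fix):

{code:java}
import java.io.Serializable;

// Hypothetical interface: derive the bucket from the element itself, so that
// several buckets can be active at the same time.
interface Bucketer<T> extends Serializable {
    String getBucketPath(T element);
}

// Example: bucket string records by their first word.
class FirstWordBucketer implements Bucketer<String> {
    @Override
    public String getBucketPath(String element) {
        int space = element.indexOf(' ');
        return "/buckets/" + (space < 0 ? element : element.substring(0, space));
    }
}
{code}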





[GitHub] flink issue #2269: [FLINK-4190] Generalise RollingSink to work with arbitrar...

2016-08-26 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2269
  
Hi,
I'm very sorry for the delays! I still have this sitting at the top of my list 
and I'm hoping to get this in by the beginning of next week.




[GitHub] flink pull request #2424: [FLINK-4459][Scheduler] Introduce SlotProvider for...

2016-08-26 Thread KurtYoung
GitHub user KurtYoung opened a pull request:

https://github.com/apache/flink/pull/2424

[FLINK-4459][Scheduler] Introduce SlotProvider for Scheduler

Introduce SlotProvider, prepare for the further slot allocation refactoring

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KurtYoung/flink flink-4459

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2424.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2424


commit b88f024e6c6b134d00588af6aab8c03a189d2d3a
Author: Kurt Young 
Date:   2016-08-26T09:51:40Z

[FLINK-4459][Scheduler] Introduce SlotProvider for Scheduler






[jira] [Commented] (FLINK-4459) Introduce SlotProvider for Scheduler

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439239#comment-15439239
 ] 

ASF GitHub Bot commented on FLINK-4459:
---

GitHub user KurtYoung opened a pull request:

https://github.com/apache/flink/pull/2424

[FLINK-4459][Scheduler] Introduce SlotProvider for Scheduler

Introduce SlotProvider, prepare for the further slot allocation refactoring

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KurtYoung/flink flink-4459

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2424.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2424


commit b88f024e6c6b134d00588af6aab8c03a189d2d3a
Author: Kurt Young 
Date:   2016-08-26T09:51:40Z

[FLINK-4459][Scheduler] Introduce SlotProvider for Scheduler




> Introduce SlotProvider for Scheduler
> 
>
> Key: FLINK-4459
> URL: https://issues.apache.org/jira/browse/FLINK-4459
> Project: Flink
>  Issue Type: Improvement
>  Components: Scheduler
>Reporter: Till Rohrmann
>Assignee: Kurt Young
>
> Currently the {{Scheduler}} maintains a queue of available instances which it 
> scans if it needs a new slot. If it finds a suitable instance (having free 
> slots available) it will allocate a slot from it. 
> This slot allocation logic can be factored out and be made available via a 
> {{SlotProvider}} interface. The {{SlotProvider}} has methods to allocate a 
> slot given a set of location preferences. Slots should be returned as 
> {{Futures}}, because in the future the slot allocation might happen 
> asynchronously (Flip-6). 
> In the first version, the {{SlotProvider}} implementation will simply 
> encapsulate the existing slot allocation logic extracted from the 
> {{Scheduler}}. When a slot is requested it will return a completed or failed 
> future since the allocation happens synchronously.
> The refactoring will have the advantage of simplifying the {{Scheduler}} class 
> and to pave the way for upcoming refactorings (Flip-6).
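A hedged sketch of what such an interface could look like, based only on the description above (the placeholder types stand in for the real scheduler classes):

{code:java}
import java.util.Collection;
import java.util.concurrent.CompletableFuture;

// Placeholder types, for illustration only.
class SimpleSlot {}
class ScheduledUnit {}
class TaskManagerLocation {}

interface SlotProvider {
    // Returns a future so that the allocation may later become asynchronous
    // (Flip-6); the first implementation would return an already completed
    // or failed future.
    CompletableFuture<SimpleSlot> allocateSlot(
            ScheduledUnit task, Collection<TaskManagerLocation> locationPreferences);
}
{code}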





[jira] [Closed] (FLINK-4440) Make API for edge/vertex creation less verbose

2016-08-26 Thread Ivan Mushketyk (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mushketyk closed FLINK-4440.
-
Resolution: Not A Problem

> Make API for edge/vertex creation less verbose
> --
>
> Key: FLINK-4440
> URL: https://issues.apache.org/jira/browse/FLINK-4440
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Trivial
>
> It would be better if one could create vertex/edges like this:
> {code:java}
> Vertex<Long, NullValue> v = Vertex.create(42);
> Edge<Long, NullValue> e = Edge.create(5, 6);
> {code}
> Instead of this:
> {code:java}
> Vertex<Long, NullValue> v = new Vertex<Long, NullValue>(42, 
> NullValue.getInstance());
> Edge<Long, NullValue> e = new Edge<Long, NullValue>(5, 6, NullValue.getInstance());
> {code}





[jira] [Commented] (FLINK-4440) Make API for edge/vertex creation less verbose

2016-08-26 Thread Ivan Mushketyk (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439188#comment-15439188
 ] 

Ivan Mushketyk commented on FLINK-4440:
---

It's ok with me. I'll close the JIRA issue and the PR.

> Make API for edge/vertex creation less verbose
> --
>
> Key: FLINK-4440
> URL: https://issues.apache.org/jira/browse/FLINK-4440
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Trivial
>
> It would be better if one could create vertex/edges like this:
> {code:java}
> Vertex<Long, NullValue> v = Vertex.create(42);
> Edge<Long, NullValue> e = Edge.create(5, 6);
> {code}
> Instead of this:
> {code:java}
> Vertex<Long, NullValue> v = new Vertex<Long, NullValue>(42, 
> NullValue.getInstance());
> Edge<Long, NullValue> e = new Edge<Long, NullValue>(5, 6, NullValue.getInstance());
> {code}





[jira] [Commented] (FLINK-4440) Make API for edge/vertex creation less verbose

2016-08-26 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439178#comment-15439178
 ] 

Vasia Kalavri commented on FLINK-4440:
--

Alright, makes sense. Let's close this then. Is that fine with you 
[~ivan.mushketyk] or is there anything we're overlooking here?

> Make API for edge/vertex creation less verbose
> --
>
> Key: FLINK-4440
> URL: https://issues.apache.org/jira/browse/FLINK-4440
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>Priority: Trivial
>
> It would be better if one could create vertex/edges like this:
> {code:java}
> Vertex<Long, NullValue> v = Vertex.create(42);
> Edge<Long, NullValue> e = Edge.create(5, 6);
> {code}
> Instead of this:
> {code:java}
> Vertex<Long, NullValue> v = new Vertex<Long, NullValue>(42, 
> NullValue.getInstance());
> Edge<Long, NullValue> e = new Edge<Long, NullValue>(5, 6, NullValue.getInstance());
> {code}





[jira] [Created] (FLINK-4512) Add option for persistent checkpoints

2016-08-26 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4512:
--

 Summary: Add option for persistent checkpoints
 Key: FLINK-4512
 URL: https://issues.apache.org/jira/browse/FLINK-4512
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Ufuk Celebi


Allow periodic checkpoints to be persisted by writing out their meta data. This 
is what we currently do for savepoints, but in the future checkpoints and 
savepoints are likely to diverge with respect to guarantees they give for 
updatability, etc.

This means that the difference between persistent checkpoints and savepoints in 
the long term will be that persistent checkpoints can only be restored with the 
same job settings (like parallelism, etc.)

Regular and persisted checkpoints should behave differently with respect to 
disposal in *globally* terminal job states (FINISHED, CANCELLED, FAILED): 
regular checkpoints are cleaned up in all of these cases whereas persistent 
checkpoints only on FINISHED. Maybe with the option to customize behaviour on 
CANCELLED or FAILED.
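The proposed disposal rule can be summarised in a small sketch (assuming only the job states named above; this is not actual Flink code):

{code:java}
public class CheckpointDisposal {

    enum JobStatus { FINISHED, CANCELLED, FAILED }

    // Regular checkpoints are discarded in every globally terminal state;
    // persistent checkpoints only on FINISHED.
    static boolean discardOnTerminalState(JobStatus status, boolean persistent) {
        return !persistent || status == JobStatus.FINISHED;
    }

    public static void main(String[] args) {
        System.out.println(discardOnTerminalState(JobStatus.CANCELLED, true));  // false
        System.out.println(discardOnTerminalState(JobStatus.FINISHED, true));   // true
    }
}
{code}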





[jira] [Commented] (FLINK-4508) Consolidate DummyEnvironment and MockEnvironment for Tests

2016-08-26 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439165#comment-15439165
 ] 

Aljoscha Krettek commented on FLINK-4508:
-

Plus there are custom-mocked Environments in some tests.

And we should probably use the builder pattern for constructing the mock since 
there are so many parameters already.
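A minimal sketch of such a builder (the parameters shown are invented for illustration; the real mock has different ones):

{code:java}
public class TestEnvironment {
    final String taskName;
    final int parallelism;
    final long memorySize;

    private TestEnvironment(Builder b) {
        this.taskName = b.taskName;
        this.parallelism = b.parallelism;
        this.memorySize = b.memorySize;
    }

    public static class Builder {
        private String taskName = "test-task";
        private int parallelism = 1;
        private long memorySize = 32 * 1024;

        public Builder setTaskName(String taskName) { this.taskName = taskName; return this; }
        public Builder setParallelism(int parallelism) { this.parallelism = parallelism; return this; }
        public Builder setMemorySize(long bytes) { this.memorySize = bytes; return this; }

        public TestEnvironment build() { return new TestEnvironment(this); }
    }
}

// usage: new TestEnvironment.Builder().setTaskName("map").setParallelism(4).build();
{code}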

> Consolidate DummyEnvironment and MockEnvironment for Tests
> --
>
> Key: FLINK-4508
> URL: https://issues.apache.org/jira/browse/FLINK-4508
> Project: Flink
>  Issue Type: Test
>Reporter: Stefan Richter
>Priority: Minor
>
> Currently we have {{DummyEnvironment}} and {{MockEnvironment}} as implementations 
> of Environment for our tests. Both serve a similar purpose, but offer 
> slightly different features. We should consolidate this by merging them into 
> one class that offers the best of both previous implementations.





[jira] [Created] (FLINK-4511) Schedule periodic savepoints

2016-08-26 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4511:
--

 Summary: Schedule periodic savepoints
 Key: FLINK-4511
 URL: https://issues.apache.org/jira/browse/FLINK-4511
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Ufuk Celebi


Allow triggering of periodic savepoints, which are kept in a bounded queue 
(like completed checkpoints currently, but separate).

If periodic checkpointing is not enabled, only periodic savepoints should 
be scheduled.

If periodic checkpointing is enabled, the periodic savepoints should not be 
scheduled independently; instead, the checkpoint scheduler should trigger a 
savepoint. This will ensure that no unexpected interference between 
checkpoints and savepoints happens. For this, I would restrict the savepoint 
interval to be a multiple of the checkpointing interval (if enabled).
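A hedged sketch of how the two schedules could be coupled, following the restriction above (illustrative only, not actual Flink code):

{code:java}
public class PeriodicSavepointSchedule {

    private final long checkpointInterval; // ms; > 0 when periodic checkpointing is enabled
    private final long savepointInterval;  // ms
    private long triggerCount;

    public PeriodicSavepointSchedule(long checkpointInterval, long savepointInterval) {
        if (checkpointInterval > 0 && savepointInterval % checkpointInterval != 0) {
            throw new IllegalArgumentException(
                    "savepoint interval must be a multiple of the checkpoint interval");
        }
        this.checkpointInterval = checkpointInterval;
        this.savepointInterval = savepointInterval;
    }

    // Called by the checkpoint scheduler on every trigger (so only when
    // checkpointInterval > 0): every n-th trigger becomes a savepoint
    // instead of a plain checkpoint.
    public boolean triggerSavepointInsteadOfCheckpoint() {
        triggerCount++;
        return triggerCount % (savepointInterval / checkpointInterval) == 0;
    }
}
{code}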






[jira] [Created] (FLINK-4510) Always create CheckpointCoordinator

2016-08-26 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4510:
--

 Summary: Always create CheckpointCoordinator
 Key: FLINK-4510
 URL: https://issues.apache.org/jira/browse/FLINK-4510
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Ufuk Celebi


The checkpoint coordinator is only created if a checkpointing interval is 
configured. This means that no savepoints can be triggered if there is no 
checkpointing interval specified.

Instead we should always create it and allow an interval of 0 for disabled 
periodic checkpoints. 





[jira] [Created] (FLINK-4509) Specify savepoint directory per savepoint

2016-08-26 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4509:
--

 Summary: Specify savepoint directory per savepoint
 Key: FLINK-4509
 URL: https://issues.apache.org/jira/browse/FLINK-4509
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi


Currently, savepoints go to a per cluster configured default directory 
(configured via {{savepoints.state.backend}} and 
{{savepoints.state.backend.fs.dir}}).

We shall allow specifying the directory per triggered savepoint in case no 
default is configured.





[jira] [Commented] (FLINK-3580) Reintroduce Date/Time and implement scalar functions for it

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439125#comment-15439125
 ] 

ASF GitHub Bot commented on FLINK-3580:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2391


> Reintroduce Date/Time and implement scalar functions for it
> ---
>
> Key: FLINK-3580
> URL: https://issues.apache.org/jira/browse/FLINK-3580
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> This task includes:
> {code}
> DATETIME_PLUS
> EXTRACT_DATE
> FLOOR
> CEIL
> CURRENT_TIME
> CURRENT_TIMESTAMP
> LOCALTIME
> LOCALTIMESTAMP
> {code}





[GitHub] flink pull request #2391: [FLINK-3580] [table] Implement FLOOR/CEIL for time...

2016-08-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2391




[jira] [Commented] (FLINK-4487) Need tools for managing the yarn-session better

2016-08-26 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439112#comment-15439112
 ] 

Maximilian Michels commented on FLINK-4487:
---

I totally forgot that there is also a way to shut down the detached yarn session 
through {{ yarn-session.sh -id }}, which resumes the yarn 
session. After resuming, you can "stop" the Yarn session CLI to shut down the 
yarn session and application.

We could add an additional parameter to just shut down instead of resuming it 
first.

> Need tools for managing the yarn-session better
> ---
>
> Key: FLINK-4487
> URL: https://issues.apache.org/jira/browse/FLINK-4487
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN Client
>Affects Versions: 1.1.0
>Reporter: Niels Basjes
>
> There is a script yarn-session.sh which starts an empty JobManager on a Yarn 
> cluster. 
> Desired improvements:
> # If there is already a yarn-session running then yarn-session does not start 
> a new one (or it kills the old one?). Note that the file with ip/port may 
> exist yet the corresponding JobManager may have been killed in another way.
> # A script that effectively lets me stop a yarn session and cleanup the file 
> that contains the ip/port of this yarn session and the .flink directory on 
> HDFS.





[GitHub] flink issue #2391: [FLINK-3580] [table] Implement FLOOR/CEIL for time points

2016-08-26 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2391
  
Thanks @fhueske. I will address your comments and merge this.




[jira] [Commented] (FLINK-3580) Reintroduce Date/Time and implement scalar functions for it

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439108#comment-15439108
 ] 

ASF GitHub Bot commented on FLINK-3580:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2391
  
Thanks @fhueske. I will address your comments and merge this.


> Reintroduce Date/Time and implement scalar functions for it
> ---
>
> Key: FLINK-3580
> URL: https://issues.apache.org/jira/browse/FLINK-3580
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> This task includes:
> {code}
> DATETIME_PLUS
> EXTRACT_DATE
> FLOOR
> CEIL
> CURRENT_TIME
> CURRENT_TIMESTAMP
> LOCALTIME
> LOCALTIMESTAMP
> {code}





[jira] [Commented] (FLINK-4311) TableInputFormat fails when reused on next split

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439102#comment-15439102
 ] 

ASF GitHub Bot commented on FLINK-4311:
---

Github user nielsbasjes commented on the issue:

https://github.com/apache/flink/pull/2330
  
I managed to resolve the problems with running these unit tests. 
These problems were caused by version conflicts in Guava.
Now an HBaseMiniCluster is started and a table with multiple regions is 
created, and the TableInputFormat is used to extract the rows again. By 
setting the parallelism to 1, the same TableInputFormat instance is used for 
multiple regions and succeeds (the problem this all started with).

Please review.


> TableInputFormat fails when reused on next split
> 
>
> Key: FLINK-4311
> URL: https://issues.apache.org/jira/browse/FLINK-4311
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Critical
>
> We have written a batch job that uses data from HBase by means of using the 
> TableInputFormat.
> We have found that this class sometimes fails with this exception:
> {quote}
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
>   at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:152)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:47)
>   at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:147)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>   ... 10 more
> {quote}
> As you can see the ThreadPoolExecutor was terminated at this point.
> We tracked it down to the fact that 
> # the configure method opens the table
> # the open method obtains the result scanner
> # the closes method closes the table.
> If a second split arrives on the same instance then the open method will fail 
> because the table has already been closed.
> We also found that this error varies with the versions of HBase that are 
> used. I have also seen this exception:
> {quote}
> Caused by: java.io.IOException: hconnection-0x19d37183 closed
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1146)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
>   ... 37 more
> {quote}
> I found that in the [documentation of the InputFormat 
> interface|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/io/InputFormat.html]
>  it is clearly stated
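A minimal sketch of the lifecycle fix implied by the description above, i.e. (re)opening the table per split instead of once in configure() (names are illustrative, not the actual patch):

{code:java}
// Illustrative stand-in for the HBase table handle.
interface TableHandle extends AutoCloseable {}

abstract class ReusableTableFormat {

    private TableHandle table;

    abstract TableHandle connectToTable() throws Exception;

    // Called once per input split; must also work after close() was called
    // for the previous split, so the table is (re)opened here rather than
    // in configure().
    void open(Object split) throws Exception {
        if (table == null) {
            table = connectToTable();
        }
        // ... create the scanner for this split ...
    }

    // Called after each split: release the table so resources do not leak.
    void close() throws Exception {
        if (table != null) {
            table.close();
            table = null;
        }
    }
}
{code}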

[GitHub] flink issue #2330: FLINK-4311 Fixed several problems in TableInputFormat

2016-08-26 Thread nielsbasjes
Github user nielsbasjes commented on the issue:

https://github.com/apache/flink/pull/2330
  
I managed to resolve the problems with running these unit tests. 
These problems were caused by version conflicts in Guava.
Now an HBaseMiniCluster is started and a table with multiple regions is 
created, and the TableInputFormat is used to extract the rows again. By 
setting the parallelism to 1, the same TableInputFormat instance is used for 
multiple regions and succeeds (the problem this all started with).

Please review.




[jira] [Created] (FLINK-4508) Consolidate DummyEnvironment and MockEnvironment for Tests

2016-08-26 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-4508:
-

 Summary: Consolidate DummyEnvironment and MockEnvironment for Tests
 Key: FLINK-4508
 URL: https://issues.apache.org/jira/browse/FLINK-4508
 Project: Flink
  Issue Type: Test
Reporter: Stefan Richter
Priority: Minor


Currently we have {{DummyEnvironment}} and {{MockEnvironment}} as implementations of 
Environment for our tests. Both serve a similar purpose, but offer slightly 
different features. We should consolidate this by merging them into one class 
that offers the best of both previous implementations.





[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438938#comment-15438938
 ] 

ASF GitHub Bot commented on FLINK-3677:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2109
  
@aljoscha As already pointed out in the JIRA issue, the method is not new. 
A now obsolete method has been deprecated (the diff is hard to read).


> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.





[GitHub] flink issue #2109: [FLINK-3677] FileInputFormat: Allow to specify include/ex...

2016-08-26 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2109
  
@aljoscha As already pointed out in the JIRA issue, the method is not new. 
A now obsolete method has been deprecated (the diff is hard to read).




[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438936#comment-15438936
 ] 

Maximilian Michels commented on FLINK-3677:
---

No worries. Your changes are not wrong in any way. 

A lot of people simply do not like the Guava library because it usually just 
adds an additional dependency without adding much simplification to the code. I 
think it is fine to have it in testing scope.

Further, I think it is ok to first deprecate an old API method before removing 
it, even if it is PublicEvolving.

> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-08-26 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438935#comment-15438935
 ] 

Maximilian Michels commented on FLINK-3677:
---

I reviewed the pull requests and I was aware of these changes.

Guava is only a test dependency, which shouldn't affect users or the download size. 
I think it is fine to have it in testing scope.

The diff is hard to read, but he deprecated the old read method and introduced 
a new one. I explicitly told him to not simply change the existing method. We 
can safely remove the deprecated method after the next release. 

> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
> Fix For: 1.2.0
>
>
> It would be nice to be able to specify a regular expression to filter files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4487) Need tools for managing the yarn-session better

2016-08-26 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438883#comment-15438883
 ] 

Maximilian Michels commented on FLINK-4487:
---

{quote}
1. If there is already a yarn-session running then yarn-session does not start 
a new one (or it kills the old one?). Note that the file with ip/port may exist 
yet the corresponding JobManager may have been killed in another way.
{quote}

The yarn properties file does not contain the ip/port anymore. Instead, it uses 
the Yarn application id. A new Yarn session always overrides the previous 
session information.

{quote}
2. A script that effectively lets me stop a yarn session and clean up the file 
that contains the ip/port of this yarn session and the .flink directory on HDFS.
{quote}

If you use the non-detached Yarn session, cleanup is automatically done. If you 
use the detached session, you will have to stop the yarn application (using 
yarn commands) and delete the yarn properties file in the temp directory. Good 
idea to let a script do that! We can now do that reliably because we have the 
yarn application id in the properties file.
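
A minimal sketch of such a cleanup script (the properties file location and 
key name are assumptions based on Flink's temp-directory convention, and the 
HDFS path is an example):

{code}
#!/bin/bash
# Look up the Yarn application id that the Flink client stored for this user.
PROPS=/tmp/.yarn-properties-$USER
APP_ID=$(grep applicationID "$PROPS" | cut -d'=' -f2)

# Stop the detached Yarn session and remove the stale client state.
yarn application -kill "$APP_ID"
rm -f "$PROPS"
hdfs dfs -rm -r -f ".flink/$APP_ID"   # per-application staging directory on HDFS
{code}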

> Need tools for managing the yarn-session better
> ---
>
> Key: FLINK-4487
> URL: https://issues.apache.org/jira/browse/FLINK-4487
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN Client
>Affects Versions: 1.1.0
>Reporter: Niels Basjes
>
> There is a script yarn-session.sh which starts an empty JobManager on a Yarn 
> cluster. 
> Desired improvements:
> # If there is already a yarn-session running then yarn-session does not start 
> a new one (or it kills the old one?). Note that the file with ip/port may 
> exist yet the corresponding JobManager may have been killed in another way.
> # A script that effectively lets me stop a yarn session and clean up the file 
> that contains the ip/port of this yarn session and the .flink directory on 
> HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2374: [FLINK-3950] Add Meter Metric Type

2016-08-26 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2374#discussion_r76411144
  
--- Diff: 
flink-metrics/flink-metrics-core/src/main/java/org/apache/flink/metrics/Meter.java
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.metrics;
+
+/**
+ * Metric for measuring average throughput.
--- End diff --

Can we cut `average`? Maybe it's just me, but I think we should provide raw 
data and let the metric backends handle averaging/weighting and all that fancy 
stuff.
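
For context, a sketch of the interface under review (the method set follows 
the PR discussion; exact signatures and javadoc may differ from the final 
class):

```java
package org.apache.flink.metrics;

public interface Meter extends Metric {
    // Register one occurrence of an event.
    void markEvent();

    // Register n occurrences of an event at once.
    void markEvent(long n);

    // Current throughput; per the comment above, the suggestion is to
    // document this as a raw rate and leave averaging to the backends.
    double getRate();

    // Total number of events marked so far.
    long getCount();
}
```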


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3950) Add Meter Metric Type

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438884#comment-15438884
 ] 

ASF GitHub Bot commented on FLINK-3950:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2374#discussion_r76411144
  
--- Diff: 
flink-metrics/flink-metrics-core/src/main/java/org/apache/flink/metrics/Meter.java
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.metrics;
+
+/**
+ * Metric for measuring average throughput.
--- End diff --

Can we cut `average`? Maybe it's just me, but I think we should provide raw 
data and let the metric backends handle averaging/weighting and all that fancy 
stuff.


> Add Meter Metric Type
> -
>
> Key: FLINK-3950
> URL: https://issues.apache.org/jira/browse/FLINK-3950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Stephan Ewen
>Assignee: Ivan Mushketyk
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3950) Add Meter Metric Type

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438875#comment-15438875
 ] 

ASF GitHub Bot commented on FLINK-3950:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2374#discussion_r76410471
  
--- Diff: 
flink-metrics/flink-metrics-dropwizard/src/test/java/org/apache/flink/dropwizard/metrics/DropwizardMeterWrapperTest.java
 ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.dropwizard.metrics;
+
+import org.junit.Test;
+
+import static org.junit.Assert.assertEquals;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;
+
+public class DropwizardMeterWrapperTest {
+   private static final double DELTA = 0.0001;
+
+   @Test
+   public void testWrapper() {
+   com.codahale.metrics.Meter dropwizardMeter = mock(com.codahale.metrics.Meter.class);
+   when(dropwizardMeter.getOneMinuteRate()).thenReturn(1.0);
+   when(dropwizardMeter.getCount()).thenReturn(100L);
+
+   DropwizardMeterWrapper wrapper = new DropwizardMeterWrapper(dropwizardMeter);
+
+   assertEquals(1.0, wrapper.getRate(), DELTA);
--- End diff --

This is minor, but I don't think we need a constant if it is only used once.


> Add Meter Metric Type
> -
>
> Key: FLINK-3950
> URL: https://issues.apache.org/jira/browse/FLINK-3950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Stephan Ewen
>Assignee: Ivan Mushketyk
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2374: [FLINK-3950] Add Meter Metric Type

2016-08-26 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2374#discussion_r76410471
  
--- Diff: 
flink-metrics/flink-metrics-dropwizard/src/test/java/org/apache/flink/dropwizard/metrics/DropwizardMeterWrapperTest.java
 ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.dropwizard.metrics;
+
+import org.junit.Test;
+
+import static org.junit.Assert.assertEquals;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;
+
+public class DropwizardMeterWrapperTest {
+   private static final double DELTA = 0.0001;
+
+   @Test
+   public void testWrapper() {
+   com.codahale.metrics.Meter dropwizardMeter = mock(com.codahale.metrics.Meter.class);
+   when(dropwizardMeter.getOneMinuteRate()).thenReturn(1.0);
+   when(dropwizardMeter.getCount()).thenReturn(100L);
+
+   DropwizardMeterWrapper wrapper = new DropwizardMeterWrapper(dropwizardMeter);
+
+   assertEquals(1.0, wrapper.getRate(), DELTA);
--- End diff --

This is minor, but I don't think we need a constant if it is only used once.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3950) Add Meter Metric Type

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438868#comment-15438868
 ] 

ASF GitHub Bot commented on FLINK-3950:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2374#discussion_r76410073
  
--- Diff: docs/monitoring/metrics.md ---
@@ -155,6 +155,55 @@ public class MyMapper extends RichMapFunction {
 }
 {% endhighlight %}
 
+#### Meter
+
+A `Meter` measures an average throughput. An occurrence of an event can be 
registered with the `markEvent()` method. Occurrence of multiple events at the 
same time can be registered with `markEvent(long n)` method.
+You can register a meter by calling `meter(String name, Meter histogram)` 
on a `MetricGroup`.
+
+{% highlight java %}
+public class MyMapper extends RichMapFunction<Long, Integer> {
+  private Meter meter;
+
+  @Override
+  public void open(Configuration config) {
+this.meter = getRuntimeContext()
+  .getMetricGroup()
+  .meter("myMeter", new MyMeter());
+  }
+
+  @Override
+  public Integer map(Long value) throws Exception {
+this.meter.markEvent();
+  }
+}
+{% endhighlight %}
+
+Flink offers a {% gh_link 
flink-metrics/flink-metrics-dropwizard/src/main/java/org/apache/flink/dropwizard/metrics/DropwizardMeterWrapper.java
 "Wrapper" %} that allows usage of Codahale/DropWizard meters.
+To use this wrapper add the following dependency in your `pom.xml`:
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-metrics-dropwizard</artifactId>
+  <version>{{site.version}}</version>
+</dependency>
+{% endhighlight %}
+
+You can then register a Codahale/DropWizard meter like this:
+
+{% highlight java %}
+public class MyMapper extends RichMapFunction<Long, Integer> {
+  private Meter meter;
+
+  @Override
+  public void open(Configuration config) {
+com.codahale.metrics.Meter meter = new com.codahale.metrics.Meter();
+
+this.meter = getRuntimeContext()
+  .getMetricGroup()
+  .histogram("myMeter", new DropWizardMeterWrapper(meter));
--- End diff --

`.histogram` -> `.meter`
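
With that fix applied, a sketch of the corrected registration (note also that 
the wrapper class is `DropwizardMeterWrapper`, matching the file path above, 
rather than the `DropWizardMeterWrapper` spelling in the snippet):

{code}
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.dropwizard.metrics.DropwizardMeterWrapper;
import org.apache.flink.metrics.Meter;

public class MyMapper extends RichMapFunction<Long, Integer> {
  private Meter meter;

  @Override
  public void open(Configuration config) {
    com.codahale.metrics.Meter dropwizard = new com.codahale.metrics.Meter();
    this.meter = getRuntimeContext()
      .getMetricGroup()
      .meter("myMeter", new DropwizardMeterWrapper(dropwizard)); // .meter, not .histogram
  }

  @Override
  public Integer map(Long value) throws Exception {
    this.meter.markEvent();
    return value.intValue();
  }
}
{code}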


> Add Meter Metric Type
> -
>
> Key: FLINK-3950
> URL: https://issues.apache.org/jira/browse/FLINK-3950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Stephan Ewen
>Assignee: Ivan Mushketyk
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2374: [FLINK-3950] Add Meter Metric Type

2016-08-26 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2374#discussion_r76410073
  
--- Diff: docs/monitoring/metrics.md ---
@@ -155,6 +155,55 @@ public class MyMapper extends RichMapFunction {
 }
 {% endhighlight %}
 
+#### Meter
+
+A `Meter` measures an average throughput. An occurrence of an event can be 
registered with the `markEvent()` method. Occurrence of multiple events at the 
same time can be registered with `markEvent(long n)` method.
+You can register a meter by calling `meter(String name, Meter histogram)` 
on a `MetricGroup`.
+
+{% highlight java %}
+public class MyMapper extends RichMapFunction<Long, Integer> {
+  private Meter meter;
+
+  @Override
+  public void open(Configuration config) {
+this.meter = getRuntimeContext()
+  .getMetricGroup()
+  .meter("myMeter", new MyMeter());
+  }
+
+  @Override
+  public Integer map(Long value) throws Exception {
+this.meter.markEvent();
+  }
+}
+{% endhighlight %}
+
+Flink offers a {% gh_link 
flink-metrics/flink-metrics-dropwizard/src/main/java/org/apache/flink/dropwizard/metrics/DropwizardMeterWrapper.java
 "Wrapper" %} that allows usage of Codahale/DropWizard meters.
+To use this wrapper add the following dependency in your `pom.xml`:
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-metrics-dropwizard</artifactId>
+  <version>{{site.version}}</version>
+</dependency>
+{% endhighlight %}
+
+You can then register a Codahale/DropWizard meter like this:
+
+{% highlight java %}
+public class MyMapper extends RichMapFunction<Long, Integer> {
+  private Meter meter;
+
+  @Override
+  public void open(Configuration config) {
+com.codahale.metrics.Meter meter = new com.codahale.metrics.Meter();
+
+this.meter = getRuntimeContext()
+  .getMetricGroup()
+  .histogram("myMeter", new DropWizardMeterWrapper(meter));
--- End diff --

`.histogram` -> `.meter`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-4507) Deprecate savepoint backend config

2016-08-26 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4507:
--

 Summary: Deprecate savepoint backend config
 Key: FLINK-4507
 URL: https://issues.apache.org/jira/browse/FLINK-4507
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi


The savepoint backend configuration allows both {{jobmanager}} and 
{{filesystem}} as values. The {{jobmanager}} variant is used as default if 
nothing is configured.

As part of FLIP-10, we want to get rid of this distinction and make all 
savepoints go to a file. Savepoints backed by JobManagers are only relevant for 
testing. Users could only recover from them if they did not shut down the 
current cluster.

Deprecate the {{savepoints.state.backend.fs.dir}} key and add 
{{state.savepoints.dir}} as the new config key. This is used as the default 
savepoint directory. Users can override it. 
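
A sketch of the resulting {{flink-conf.yaml}} entries (key names as proposed 
in this issue; the path is an example):

{code}
# New key: default directory for savepoints.
state.savepoints.dir: hdfs:///flink/savepoints

# Deprecated, kept only for backwards compatibility:
# savepoints.state.backend.fs.dir: hdfs:///flink/savepoints
{code}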



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438770#comment-15438770
 ] 

ASF GitHub Bot commented on FLINK-4190:
---

Github user joshfg commented on the issue:

https://github.com/apache/flink/pull/2269
  
Hi Aljoscha, just wanted to remind you about this - any idea when the 
changes will be merged in? Thanks!


> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.
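
A hypothetical sketch of the proposed generalisation: derive the bucket from 
the element itself rather than from system time (interface name and signature 
are illustrative, not necessarily the merged API):

{code}
import java.io.Serializable;

import org.apache.flink.core.fs.Path;

public interface Bucketer<T> extends Serializable {
    // Decide which bucket (subdirectory under the base path) this element
    // belongs to, e.g. based on one of its attributes.
    Path getBucketPath(Path basePath, T element);
}
{code}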



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2269: [FLINK-4190] Generalise RollingSink to work with arbitrar...

2016-08-26 Thread joshfg
Github user joshfg commented on the issue:

https://github.com/apache/flink/pull/2269
  
Hi Aljoscha, just wanted to remind you about this - any idea when the 
changes will be merged in? Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3318) Add support for quantifiers to CEP's pattern API

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438769#comment-15438769
 ] 

ASF GitHub Bot commented on FLINK-3318:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2361
  
Could someone please take a look at this PR? It has been here without a 
review for more than 2 weeks.


> Add support for quantifiers to CEP's pattern API
> 
>
> Key: FLINK-3318
> URL: https://issues.apache.org/jira/browse/FLINK-3318
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP
>Affects Versions: 1.0.0
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> It would be a good addition to extend the pattern API to support quantifiers 
> known from regular expressions (e.g. Kleene star, ?, +, or count bounds). 
> This would considerably enrich the set of supported patterns.
> Implementing the count bounds could be done by unrolling the pattern state. 
> In order to support the Kleene star operator, the {{NFACompiler}} has to be 
> extended to insert epsilon-transitions between a Kleene star state and the 
> succeeding pattern state. In order to support {{?}}, one could insert two 
> paths from the preceding state, one which accepts the event and another which 
> directly goes into the next pattern state.
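
A hypothetical sketch of what quantifier support in the pattern API could 
look like (method names are illustrative, not the merged API; {{Event}} is 
the usual example type from the CEP docs):

{code}
Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
    .where(evt -> evt.getId() == 42)
    .oneOrMore()        // Kleene plus: one or more matching events
    .next("middle")
    .optional()         // '?': zero or one occurrence
    .followedBy("end")
    .times(3);          // count bound: exactly three occurrences
{code}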



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2361: [FLINK-3318][cep] Add support for quantifiers to CEP's pa...

2016-08-26 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2361
  
Could someone please take a look at this PR? It has been here without a 
review for more than 2 weeks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4502) Cassandra connector documentation has misleading consistency guarantees

2016-08-26 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-4502:

Component/s: Cassandra Connector

> Cassandra connector documentation has misleading consistency guarantees
> ---
>
> Key: FLINK-4502
> URL: https://issues.apache.org/jira/browse/FLINK-4502
> Project: Flink
>  Issue Type: Bug
>  Components: Cassandra Connector, Documentation
>Affects Versions: 1.1.0
>Reporter: Elias Levy
>
> The Cassandra connector documentation states that  "enableWriteAheadLog() is 
> an optional method, that allows exactly-once processing for non-deterministic 
> algorithms."  This claim appears to be false.
> From what I gather, the write ahead log feature of the connector works as 
> follows:
> - The sink is replaced with a stateful operator that writes incoming messages 
> to the state backend based on the checkpoint they belong in.
> - When the operator is notified that a Flink checkpoint has been completed, 
> then for each checkpoint older than and including the committed one, it:
>   * reads its messages from the state backend
>   * writes them to Cassandra
>   * records that it has committed them to Cassandra for the specific 
> checkpoint and operator instance
>   * and erases them from the state backend.
> This process attempts to avoid resubmitting queries to Cassandra that would 
> otherwise occur when recovering a job from a checkpoint and having messages 
> replayed.
> Alas, this does not guarantee exactly once semantics at the sink.  The writes 
> to Cassandra that occur when the operator is notified that a checkpoint is 
> completed are not atomic and they are potentially non-idempotent.  If the job 
> dies while writing to Cassandra or before committing the checkpoint via 
> committer, queries will be replayed when the job recovers.  Thus the 
> documentation appears to be incorrect in stating this provides exactly-once 
> semantics.
> There also seems to be an issue in GenericWriteAheadSink's 
> notifyOfCompletedCheckpoint which may result in incorrect output.  If 
> sendValues returns false because a write failed, instead of bailing, it 
> simply moves on to the next checkpoint to commit if there is one, keeping the 
> previous one around to try again later.  But that can result in newer data 
> being overwritten with older data when the previous checkpoint is retried.  
> Although given that CassandraCommitter implements isCheckpointCommitted as 
> checkpointID <= this.lastCommittedCheckpointID, it actually means that when 
> it goes back to try the uncommitted older checkpoint it will consider it 
> committed, even though some of its data may not have been written out, and 
> the data will be discarded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4498) Better Cassandra sink documentation

2016-08-26 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-4498:

Component/s: Cassandra Connector

> Better Cassandra sink documentation
> ---
>
> Key: FLINK-4498
> URL: https://issues.apache.org/jira/browse/FLINK-4498
> Project: Flink
>  Issue Type: Improvement
>  Components: Cassandra Connector, Documentation
>Affects Versions: 1.1.0
>Reporter: Elias Levy
>
> The Cassandra sink documentation is somewhat muddled and could be improved.  
> For instance, the fact that it only supports tuples and POJOs that use 
> DataStax Mapper annotations is only mentioned in passing, and it is not clear 
> that the reference to tuples only applies to Flink Java tuples and not Scala 
> tuples.  
> The documentation also does not mention that setQuery() is only necessary for 
> tuple streams. 
> The explanation of the write ahead log could use some cleaning up to clarify 
> when it is appropriate to use, ideally with an example.  Maybe this would be 
> best as a blog post to expand on the type of non-deterministic streams this 
> applies to.
> It would also be useful to mention that tuple elements will be mapped to 
> Cassandra columns using the Datastax Java driver's default encoders, which 
> are somewhat limited (e.g. to write to a blob column the type in the tuple 
> must be a java.nio.ByteBuffer and not just a byte[]).
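
For instance, clearer documentation might show a tuple-stream example along 
these lines (a sketch; the host and query are placeholders, and setQuery() is 
the part that is required for tuple streams):

{code}
import java.nio.ByteBuffer;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.cassandra.CassandraSink;

public class CassandraTupleSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Flink *Java* tuples only; Scala tuples are not supported here.
        // The blob column requires java.nio.ByteBuffer, not byte[], because
        // the Datastax driver's default encoders are used for tuple fields.
        DataStream<Tuple2<String, ByteBuffer>> rows =
            env.fromElements(Tuple2.of("k1", ByteBuffer.wrap(new byte[] {1, 2, 3})));

        CassandraSink.addSink(rows)
            .setQuery("INSERT INTO example.blobs (id, payload) VALUES (?, ?);")
            .setHost("127.0.0.1")
            .build();

        env.execute("cassandra tuple sink sketch");
    }
}
{code}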



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2423: [FLINK-4486] detached YarnSession: wait until clus...

2016-08-26 Thread mxm
GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/2423

[FLINK-4486] detached YarnSession: wait until cluster startup is complete



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink FLINK-4486

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2423


commit 2ec30c2c25204ff04270db9d072085f85909c8be
Author: Maximilian Michels 
Date:   2016-08-26T10:06:36Z

[FLINK-4486] detached YarnSession: wait until cluster startup is complete




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4486) JobManager not fully running when yarn-session.sh finishes

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438759#comment-15438759
 ] 

ASF GitHub Bot commented on FLINK-4486:
---

GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/2423

[FLINK-4486] detached YarnSession: wait until cluster startup is complete



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink FLINK-4486

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2423


commit 2ec30c2c25204ff04270db9d072085f85909c8be
Author: Maximilian Michels 
Date:   2016-08-26T10:06:36Z

[FLINK-4486] detached YarnSession: wait until cluster startup is complete




> JobManager not fully running when yarn-session.sh finishes
> --
>
> Key: FLINK-4486
> URL: https://issues.apache.org/jira/browse/FLINK-4486
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.1.0
>Reporter: Niels Basjes
>Assignee: Maximilian Michels
> Fix For: 1.2.0, 1.1.2
>
>
> I start a detached yarn-session.sh.
> If the Yarn cluster is very busy then the yarn-session.sh script completes 
> BEFORE all the task slots have been allocated. As a consequence I sometimes 
> have a jobmanager without any task slots. Over time these task slots are 
> assigned by the Yarn cluster but these are not available for the first job 
> that is submitted.
> As a consequence I have found that the first few tasks in my job fail with 
> this error "Not enough free slots available to run the job.".
> I think the desirable behavior is that yarn-session waits until the 
> jobmanager is fully functional and capable of actually running the jobs.
> {code}
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (CHAIN DataSource (Read prefix 
> '4') -> Map (Map prefix '4') (8/10)) @ (unassigned) - [SCHEDULED] > with 
> groupID < cd6c37df290564e603da908a8783a9bf > in sharing group < 
> SlotSharingGroup [c0b6eff6ce93967182cdb6dfeae9359b, 
> 8b2c3b39f3a55adf9f123243ab03c9c1, 55fb94dd8a3e5f59a10dbbf5c4925db4, 
> 433b2e4a05a5e685b48c517249755a89, 8c74690c35454064e4815ac3756cdca2, 
> 4b4fbd24f3483030fd852b38ff2249c1, 5e36a56ea4dece18fe5ba04352d90dc8, 
> cd6c37df290564e603da908a8783a9bf, 64eafa845087bee70735f7250df9994f, 
> 706a5d6fe48ae57724a00a9fce5dae8a, 7bee4297e0e839e53a153dfcbcca8624, 
> 21b58f7d408d237540ae7b4734f81a1d, b429b1ff338d9d73677f42717cfc0dbc, 
> cc7491db641f557c6aa8c749ebc2de62, f61cbf0ae00331f67aaf60ace78b05aa, 
> 606f02ea9e0f4ad57f0cc0232dd70842] >. Resources available to scheduler: Number 
> of instances=1, total number of slots=7, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:306)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:454)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:326)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:734)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1332)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.jav

[jira] [Created] (FLINK-4506) CsvOutputFormat defaults allowNullValues to false, even though doc and declaration says true

2016-08-26 Thread Michael Wong (JIRA)
Michael Wong created FLINK-4506:
---

 Summary: CsvOutputFormat defaults allowNullValues to false, even 
though doc and declaration says true
 Key: FLINK-4506
 URL: https://issues.apache.org/jira/browse/FLINK-4506
 Project: Flink
  Issue Type: Bug
  Components: Batch Connectors and Input/Output Formats, Documentation
Reporter: Michael Wong
Priority: Minor


In the constructor, it has this

{code}
this.allowNullValues = false;
{code}

But in the setAllowNullValues() method, the doc says the allowNullValues is 
true by default. Also, in the declaration of allowNullValues, the value is set 
to true. It probably makes the most sense to change the constructor.

{code}
/**
 * Configures the format to either allow null values (writing an empty 
field),
 * or to throw an exception when encountering a null field.
 * 
 * by default, null values are allowed.
 *
 * @param allowNulls Flag to indicate whether the output format should 
accept null values.
 */
public void setAllowNullValues(boolean allowNulls) {
this.allowNullValues = allowNulls;
}
{code}
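
A minimal sketch of that fix (hypothetical and condensed to the relevant 
field and constructor, not the full CsvOutputFormat source):

{code}
public class CsvOutputFormatSketch {
    // Declaration and javadoc both say null values are allowed by default.
    private boolean allowNullValues = true;

    public CsvOutputFormatSketch() {
        // Bug: "this.allowNullValues = false;" silently reverted the
        // documented default. Fix: drop the assignment, or set it to true.
        this.allowNullValues = true;
    }

    public void setAllowNullValues(boolean allowNulls) {
        this.allowNullValues = allowNulls;
    }
}
{code}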



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4502) Cassandra connector documentation has misleading consistency guarantees

2016-08-26 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438754#comment-15438754
 ] 

Chesnay Schepler commented on FLINK-4502:
-

Alright, so let's break this down point by point:

Your description of how the WAL works is correct. However, it only attempts 
to prevent *a different version* of a checkpoint from being committed. 

You are correct that the sink should stop sending data when sendValues returns 
false, which can lead to some data for that checkpoint never being written. 
However, you are incorrect in regard to this potentially overwriting newer 
data. We manually set the timestamp of the queries based on the checkpoint id; 
newer checkpoint => newer timestamp, which means that even if the query is 
submitted to Cassandra it will not overwrite anything.

The documentation states that we provide exactly-once semantics for idempotent 
updates; as such, by definition, the writes to Cassandra at any point cannot be 
non-idempotent.

The sink does in fact not guarantee exactly-once *delivery*, but it doesn't 
claim that it does. It fulfills exactly-once *semantics* insofar as, at any 
point in time, the data in Cassandra will show a state that could have been 
reached if the messages had been delivered exactly once.
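
A minimal sketch of that timestamp trick (not Flink's actual implementation; 
the keyspace, table, and the direct use of the checkpoint id as the write 
timestamp are illustrative):

{code}
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class CheckpointTimestampedWrites {
    // Cassandra resolves conflicting writes by last-write-wins on the write
    // timestamp, so any strictly monotonic function of the checkpoint id
    // ensures a replayed (older) checkpoint can never overwrite newer data.
    static Statement insertForCheckpoint(long checkpointId, String id, String value) {
        Statement stmt = new SimpleStatement(
            "INSERT INTO example.sink (id, value) VALUES (?, ?);", id, value);
        stmt.setDefaultTimestamp(checkpointId);
        return stmt;
    }

    static void replay(Session session, long checkpointId, Iterable<String[]> rows) {
        for (String[] row : rows) {
            session.execute(insertForCheckpoint(checkpointId, row[0], row[1]));
        }
    }
}
{code}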

> Cassandra connector documentation has misleading consistency guarantees
> ---
>
> Key: FLINK-4502
> URL: https://issues.apache.org/jira/browse/FLINK-4502
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Elias Levy
>
> The Cassandra connector documentation states that  "enableWriteAheadLog() is 
> an optional method, that allows exactly-once processing for non-deterministic 
> algorithms."  This claim appears to be false.
> From what I gather, the write ahead log feature of the connector works as 
> follows:
> - The sink is replaced with a stateful operator that writes incoming messages 
> to the state backend based on the checkpoint they belong in.
> - When the operator is notified that a Flink checkpoint has been completed, 
> then for each checkpoint older than and including the committed one, it:
>   * reads its messages from the state backend
>   * writes them to Cassandra
>   * records that it has committed them to Cassandra for the specific 
> checkpoint and operator instance
>   * and erases them from the state backend.
> This process attempts to avoid resubmitting queries to Cassandra that would 
> otherwise occur when recovering a job from a checkpoint and having messages 
> replayed.
> Alas, this does not guarantee exactly once semantics at the sink.  The writes 
> to Cassandra that occur when the operator is notified that a checkpoint is 
> completed are not atomic and they are potentially non-idempotent.  If the job 
> dies while writing to Cassandra or before committing the checkpoint via 
> committer, queries will be replayed when the job recovers.  Thus the 
> documentation appears to be incorrect in stating this provides exactly-once 
> semantics.
> There also seems to be an issue in GenericWriteAheadSink's 
> notifyOfCompletedCheckpoint which may result in incorrect output.  If 
> sendValues returns false because a write failed, instead of bailing, it 
> simply moves on to the next checkpoint to commit if there is one, keeping the 
> previous one around to try again later.  But that can result in newer data 
> being overwritten with older data when the previous checkpoint is retried.  
> Although given that CassandraCommitter implements isCheckpointCommitted as 
> checkpointID <= this.lastCommittedCheckpointID, it actually means that when 
> it goes back to try the uncommitted older checkpoint it will consider it 
> committed, even though some of its data may not have been written out, and 
> the data will be discarded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

