date:20150722


[ 
https://issues.apache.org/jira/browse/FLINK-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637045#comment-14637045
 ] 

ASF GitHub Bot commented on FLINK-2097:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/858#issuecomment-123755792
  
It should just add more nodes to the ExecutionGraph. Existing ones should 
not be modified. For batch, I think the assumption is that it needs to be 
finished. For streaming, I could also picture attaching nodes at runtime but 
this has to be carefully implemented..


 Add support for JobSessions
 ---

 Key: FLINK-2097
 URL: https://issues.apache.org/jira/browse/FLINK-2097
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Maximilian Michels
 Fix For: 0.9


 Sessions make sure that the JobManager does not immediately discard a 
 JobGraph after execution, but keeps it around for further operations to be 
 attached to the graph. By keeping the JobGraph around, the cached streams 
 (intermediate data) are also kept,
 That is the way of realizing interactive sessions on top of a streaming 
 dataflow abstraction.
 ExecutionGraphs should be kept as long as
 - no timeout occurred or
 - the session has not been explicitly ended



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2097] Implement job session management

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/858#issuecomment-123755792
  
It should just add more nodes to the ExecutionGraph. Existing ones should 
not be modified. For batch, I think the assumption is that it needs to be 
finished. For streaming, I could also picture attaching nodes at runtime but 
this has to be carefully implemented..


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Created] (FLINK-2393) Add a stateless at-least-once mode for streaming

2015-07-22 Thread Stephan Ewen (JIRA)

Stephan Ewen created FLINK-2393:
---

 Summary: Add a stateless at-least-once mode for streaming
 Key: FLINK-2393
 URL: https://issues.apache.org/jira/browse/FLINK-2393
 Project: Flink
  Issue Type: New Feature
  Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen


Currently, the checkpointing mechanism provides exactly once guarantees. Part 
of that is the step that temporarily aligns the data streams. This step 
increases the tuple latency temporarily.

By offering a version that does not provide exactly-once, but only 
at-least-once, we can avoid the latency increase. For super-low-latency 
applications, that tolerate duplicates, this may be an interesting option.

To realize that, we would use a slightly modified version of the checkpointing 
algorithm. Effectively, the streams would not be aligned, but tasks would only 
count the received barriers and emit their own barrier as soon as the saw a 
barrier from all inputs.

My feeling is that it makes not sense to implement state backups, when being 
concerned with this super low latency. The mode would hence be a purely 
stateless at-least-once mode.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2097) Add support for JobSessions


[ 
https://issues.apache.org/jira/browse/FLINK-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637038#comment-14637038
 ] 

ASF GitHub Bot commented on FLINK-2097:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/858#issuecomment-123754538
  
I think right now, it pretty much behaves as if someone started a new job, 
with the grown execution graph.


 Add support for JobSessions
 ---

 Key: FLINK-2097
 URL: https://issues.apache.org/jira/browse/FLINK-2097
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Maximilian Michels
 Fix For: 0.9


 Sessions make sure that the JobManager does not immediately discard a 
 JobGraph after execution, but keeps it around for further operations to be 
 attached to the graph. By keeping the JobGraph around, the cached streams 
 (intermediate data) are also kept,
 That is the way of realizing interactive sessions on top of a streaming 
 dataflow abstraction.
 ExecutionGraphs should be kept as long as
 - no timeout occurred or
 - the session has not been explicitly ended



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2097] Implement job session management

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/858#issuecomment-123754538
  
I think right now, it pretty much behaves as if someone started a new job, 
with the grown execution graph.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Created] (FLINK-2394) HadoopOutFormat OutputCommitter is default to FileOutputCommiter

2015-07-22 Thread Stefano Bortoli (JIRA)

Stefano Bortoli created FLINK-2394:
--

 Summary: HadoopOutFormat OutputCommitter is default to 
FileOutputCommiter
 Key: FLINK-2394
 URL: https://issues.apache.org/jira/browse/FLINK-2394
 Project: Flink
  Issue Type: Bug
  Components: Hadoop Compatibility
Affects Versions: 0.9.0
Reporter: Stefano Bortoli


MongoOutputFormat does not write back in collection because the 
HadoopOutputFormat wrapper does not allow to set the MongoOutputCommiter and is 
set as default to FileOutputCommitter. Therefore, on close and globalFinalize 
execution the commit does not happen and mongo collection stays untouched. 

A simple solution would be to:

1 - create a constructor of HadoopOutputFormatBase and HadoopOutputFormat that 
gets the OutputCommitter as a parameter
2 - change the outputCommitter field of HadoopOutputFormatBase to be a generic 
OutputCommitter
3 - remove the default assignment in the open() and finalizeGlobal to the 
outputCommitter to FileOutputCommitter(), or keep it as a default in case of no 
specific assignment.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2385] [scala api] Add parenthesis to Sc...

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/933

[FLINK-2385] [scala api] Add parenthesis to Scala 'distinct' transformation.

The operation is not side-effect free. It does not mutate the original 
DataSet, but defines distributed computation.

This pull request also correct some comment markups.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink scala_distinct

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/933.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #933


commit 828761adaeff7a08e0927170e61b1f46c77b55d4
Author: Stephan Ewen se...@apache.org
Date:   2015-07-21T14:55:44Z

[FLINK-2385] [scala api] Add parenthesis to 'distinct' transformation.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2332] [runtime] Adds leader session IDs...

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/917#issuecomment-123755971
  
Addressed the following comments: Corrected order of visibility and 
abstract modifiers. Removed the lazy log field from `FlinkActor`. Now all 
implementing subclasses have to implement it. Made `RequiresLeaderSessionID` a 
Java interface.

All other comments haven been resolved by discussion.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2332) Assign session IDs to JobManager and TaskManager messages

[
https://issues.apache.org/jira/browse/FLINK-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637059#comment-14637059
]

ASF GitHub Bot commented on FLINK-2332:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/917#issuecomment-123755971

Addressed the following comments: Corrected order of visibility and
abstract modifiers. Removed the lazy log field from `FlinkActor`. Now all
implementing subclasses have to implement it. Made `RequiresLeaderSessionID` a
Java interface.

All other comments haven been resolved by discussion.

Assign session IDs to JobManager and TaskManager messages
-

Key: FLINK-2332
URL: https://issues.apache.org/jira/browse/FLINK-2332
Project: Flink
Issue Type: Sub-task
Components: JobManager, TaskManager
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Fix For: 0.10

In order to support true high availability {{TaskManager}} and {{JobManager}}
have to be able to distinguish whether a message was sent from the leader or
whether a message was sent from a former leader. Messages which come from a
former leader have to be discarded in order to guarantee a consistent state.
A way to do achieve this is to assign a leader session ID to a {{JobManager}}
once he's elected as leader. This leader session ID is sent to the
{{TaskManager}} upon registration at the {{JobManager}}. All subsequent
messages should then be decorated with this leader session ID to mark them as
valid. On the {{TaskManager}} side the received leader session ID as a
response to the registration message, can then be used to validate incoming
messages.
The same holds true for registration messages which should have a
registration session ID, too. That way, it is possible to distinguish invalid
registration messages from valid ones. The registration session ID can be
assigned once the TaskManager is notified about the new leader.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2393) Add a stateless at-least-once mode for streaming

2015-07-22 Thread Aljoscha Krettek (JIRA)

[
https://issues.apache.org/jira/browse/FLINK-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637083#comment-14637083
]

Aljoscha Krettek commented on FLINK-2393:
-

I guess you meant no state backup for operators other than sources. Otherwise
you wouldn't need to state barriers at all since they don't do anything.

Add a stateless at-least-once mode for streaming
--

Key: FLINK-2393
URL: https://issues.apache.org/jira/browse/FLINK-2393
Project: Flink
Issue Type: New Feature
Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen

Currently, the checkpointing mechanism provides exactly once guarantees.
Part of that is the step that temporarily aligns the data streams. This
step increases the tuple latency temporarily.
By offering a version that does not provide exactly-once, but only
at-least-once, we can avoid the latency increase. For super-low-latency
applications, that tolerate duplicates, this may be an interesting option.
To realize that, we would use a slightly modified version of the
checkpointing algorithm. Effectively, the streams would not be aligned, but
tasks would only count the received barriers and emit their own barrier as
soon as the saw a barrier from all inputs.
My feeling is that it makes not sense to implement state backups, when being
concerned with this super low latency. The mode would hence be a purely
stateless at-least-once mode.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread kl0u

Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123752371
  
Hi @mxm. Thanks a lot!
I don't have your email unfortunately. 
Could you somehow send it to me?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2332] [runtime] Adds leader session IDs...

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/917#issuecomment-123751721
  
Do you mean `JobManager.getJobManagerGateway`? This is only a temporary 
solution to obtain an `ActorGateway` for the JobManager for which you have to 
know the current leader session ID. This will be changed once HA with ZooKeeper 
is introduced.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123756678
  
@kl0u Sure, I've sent you an email.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2332] [runtime] Adds leader session IDs...

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/917#issuecomment-123757309
  
Looks good. +1 to merge this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2332) Assign session IDs to JobManager and TaskManager messages


[ 
https://issues.apache.org/jira/browse/FLINK-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637064#comment-14637064
 ] 

ASF GitHub Bot commented on FLINK-2332:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/917#issuecomment-123757309
  
Looks good. +1 to merge this...


 Assign session IDs to JobManager and TaskManager messages
 -

 Key: FLINK-2332
 URL: https://issues.apache.org/jira/browse/FLINK-2332
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager, TaskManager
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 0.10


 In order to support true high availability {{TaskManager}} and {{JobManager}} 
 have to be able to distinguish whether a message was sent from the leader or 
 whether a message was sent from a former leader. Messages which come from a 
 former leader have to be discarded in order to guarantee a consistent state.
 A way to do achieve this is to assign a leader session ID to a {{JobManager}} 
 once he's elected as leader. This leader session ID is sent to the 
 {{TaskManager}} upon registration at the {{JobManager}}. All subsequent 
 messages should then be decorated with this leader session ID to mark them as 
 valid. On the {{TaskManager}} side the received leader session ID as a 
 response to the registration message, can then be used to validate incoming 
 messages.
 The same holds true for registration messages which should have a 
 registration session ID, too. That way, it is possible to distinguish invalid 
 registration messages from valid ones. The registration session ID can be 
 assigned once the TaskManager is notified about the new leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2385) Scala DataSet.distinct should have parenthesis


[ 
https://issues.apache.org/jira/browse/FLINK-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637068#comment-14637068
 ] 

ASF GitHub Bot commented on FLINK-2385:
---

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/933

[FLINK-2385] [scala api] Add parenthesis to Scala 'distinct' transformation.

The operation is not side-effect free. It does not mutate the original 
DataSet, but defines distributed computation.

This pull request also correct some comment markups.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink scala_distinct

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/933.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #933


commit 828761adaeff7a08e0927170e61b1f46c77b55d4
Author: Stephan Ewen se...@apache.org
Date:   2015-07-21T14:55:44Z

[FLINK-2385] [scala api] Add parenthesis to 'distinct' transformation.




 Scala DataSet.distinct should have parenthesis
 --

 Key: FLINK-2385
 URL: https://issues.apache.org/jira/browse/FLINK-2385
 Project: Flink
  Issue Type: Bug
  Components: Scala API
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 The method is not a side-effect free accessor, but defines heavy computation, 
 even if it does not mutate the original data set.
 This is a somewhat API breaking change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2332) Assign session IDs to JobManager and TaskManager messages


[ 
https://issues.apache.org/jira/browse/FLINK-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637031#comment-14637031
 ] 

ASF GitHub Bot commented on FLINK-2332:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/917#issuecomment-123751721
  
Do you mean `JobManager.getJobManagerGateway`? This is only a temporary 
solution to obtain an `ActorGateway` for the JobManager for which you have to 
know the current leader session ID. This will be changed once HA with ZooKeeper 
is introduced.


 Assign session IDs to JobManager and TaskManager messages
 -

 Key: FLINK-2332
 URL: https://issues.apache.org/jira/browse/FLINK-2332
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager, TaskManager
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 0.10


 In order to support true high availability {{TaskManager}} and {{JobManager}} 
 have to be able to distinguish whether a message was sent from the leader or 
 whether a message was sent from a former leader. Messages which come from a 
 former leader have to be discarded in order to guarantee a consistent state.
 A way to do achieve this is to assign a leader session ID to a {{JobManager}} 
 once he's elected as leader. This leader session ID is sent to the 
 {{TaskManager}} upon registration at the {{JobManager}}. All subsequent 
 messages should then be decorated with this leader session ID to mark them as 
 valid. On the {{TaskManager}} side the received leader session ID as a 
 response to the registration message, can then be used to validate incoming 
 messages.
 The same holds true for registration messages which should have a 
 registration session ID, too. That way, it is possible to distinguish invalid 
 registration messages from valid ones. The registration session ID can be 
 assigned once the TaskManager is notified about the new leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread kl0u

Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123756855
  
Thanks a lot!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240445
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
+*
+* @param Accepts job id and cancels the job
+*
+*/
+
+
+   public int cancel(JobID jobId) throws ProgramInvocationException {
--- End diff --

You only return 0 or 1, so you might also return a Boolean. Or just return 
nothing and throw an Exception in case of an error.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1818) Provide API to cancel running job


[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637250#comment-14637250
 ] 

ASF GitHub Bot commented on FLINK-1818:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240445
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
+*
+* @param Accepts job id and cancels the job
+*
+*/
+
+
+   public int cancel(JobID jobId) throws ProgramInvocationException {
--- End diff --

You only return 0 or 1, so you might also return a Boolean. Or just return 
nothing and throw an Exception in case of an error.


 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240432
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
--- End diff --

Could be a bit more explanatory :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1818) Provide API to cancel running job


[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637248#comment-14637248
 ] 

ASF GitHub Bot commented on FLINK-1818:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240432
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
--- End diff --

Could be a bit more explanatory :)


 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-1818) Provide API to cancel running job


[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637249#comment-14637249
 ] 

ASF GitHub Bot commented on FLINK-1818:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240436
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
+*
+* @param Accepts job id and cancels the job
--- End diff --

JavaDoc is not correct.


 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2373) Add configuration parameter to createRemoteEnvironment method

2015-07-22 Thread Stephan Ewen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637329#comment-14637329
 ] 

Stephan Ewen commented on FLINK-2373:
-

That is definitely a good idea. Would you be up for creating a patch?

 Add configuration parameter to createRemoteEnvironment method
 -

 Key: FLINK-2373
 URL: https://issues.apache.org/jira/browse/FLINK-2373
 Project: Flink
  Issue Type: Bug
  Components: other
Reporter: Andreas Kunft
Priority: Minor
   Original Estimate: 24h
  Remaining Estimate: 24h

 Currently there is no way to provide a custom configuration upon creation of 
 a remote environment (via ExecutionEnvironment.createRemoteEnvironment(...)).
 This leads to errors when the submitted job exceeds the default value for the 
 max. payload size in Akka, as we can not increase the configuration value 
 (akka.remote.OversizedPayloadException: Discarding oversized payload...)
 Providing an overloaded method with a configuration parameter for the remote 
 environment fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-1818) Provide API to cancel running job

2015-07-22 Thread niraj rai (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637385#comment-14637385
 ] 

niraj rai commented on FLINK-1818:
--

Thanks Max for mentoring me.. Really appreciate your help.. Looking forward to 
contribute more ..

 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/642#issuecomment-123815171
  
@rainiraj Thanks for your contribution!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/642#issuecomment-123802868
  
Thanks @rainiraj. I think we can merge your changes with some small 
adjustments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240453
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
+*
+* @param Accepts job id and cancels the job
+*
+*/
+
+
+   public int cancel(JobID jobId) throws ProgramInvocationException {
+   LOG.info(Executing 'cancel' command.);
+   final FiniteDuration timeout = 
AkkaUtils.getTimeout(configuration);
+   //String address = 
configuration.getString(ConfigConstants.JOB_MANAGER_IPC_ADDRESS_KEY, null);
--- End diff --

This comment can probably be removed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1818) Provide API to cancel running job


[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637251#comment-14637251
 ] 

ASF GitHub Bot commented on FLINK-1818:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240453
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
+*
+* @param Accepts job id and cancels the job
+*
+*/
+
+
+   public int cancel(JobID jobId) throws ProgramInvocationException {
+   LOG.info(Executing 'cancel' command.);
+   final FiniteDuration timeout = 
AkkaUtils.getTimeout(configuration);
+   //String address = 
configuration.getString(ConfigConstants.JOB_MANAGER_IPC_ADDRESS_KEY, null);
--- End diff --

This comment can probably be removed.


 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/642#discussion_r35240436
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -421,6 +426,50 @@ public JobSubmissionResult run(JobGraph jobGraph, 
boolean wait) throws ProgramIn
}
}
 
+   /**
+* Executes the CANCEL action through Client API.
+*
+* @param Accepts job id and cancels the job
--- End diff --

JavaDoc is not correct.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Closed] (FLINK-2388) JobManager should try retrieving jobs from archive


 [ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-2388.
-
Resolution: Fixed

Thanks for your help, [~ebautistabar]!

 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

2015-07-22 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/930


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637274#comment-14637274
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/930


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-1818) Provide API to cancel running job


[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637347#comment-14637347
 ] 

ASF GitHub Bot commented on FLINK-1818:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/642#issuecomment-123815171
  
@rainiraj Thanks for your contribution!


 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2304) Add named attribute access to Storm compatibility layer


[ 
https://issues.apache.org/jira/browse/FLINK-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636427#comment-14636427
 ] 

ASF GitHub Bot commented on FLINK-2304:
---

Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/878#issuecomment-123589583
  
Thanks I will merge it later today! :)


 Add named attribute access to Storm compatibility layer
 ---

 Key: FLINK-2304
 URL: https://issues.apache.org/jira/browse/FLINK-2304
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Currently, Bolts running in Flink can access fields only by index. Enabling 
 named index access is possible for whole topologies and Tuple-type as well as 
 POJO type inputs for embedded Bolts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2304] Add named attribute access to Sto...

2015-07-22 Thread gyfora

Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/878#issuecomment-123589583
  
Thanks I will merge it later today! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent


[ 
https://issues.apache.org/jira/browse/FLINK-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636520#comment-14636520
 ] 

ASF GitHub Bot commented on FLINK-1658:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/929#issuecomment-123620462
  
No, +1.


 Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent
 --

 Key: FLINK-1658
 URL: https://issues.apache.org/jira/browse/FLINK-1658
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Local Runtime
Reporter: Gyula Fora
Assignee: Matthias J. Sax
Priority: Trivial

 The same name is used for different event classes in the runtime which can 
 cause confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-1658] Rename AbstractEvent to AbstractT...

2015-07-22 Thread uce

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/929#issuecomment-123620462
  
No, +1.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive

[
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636531#comment-14636531
]

Maximilian Michels commented on FLINK-2388:
---

Hi [~ebautistabar], thanks for finding and even looking into the problem. It's
wrong to return a null value if the accumulators are not found. Instead, we
should send a {{AccumulatorNotFound}} message. Then, the
{{JobManagerInfoServlet}} doesn't have to deal with null values. I've prepared
a fix in the pull request.

I will add another commit to the pull request that forwards the message to the
{{MemoryArchivist}} that in turn replies to the {{JobManagerInfoServlet}}.

JobManager should try retrieving jobs from archive
--

Key: FLINK-2388
URL: https://issues.apache.org/jira/browse/FLINK-2388
Project: Flink
Issue Type: Task
Components: JobManager
Reporter: Enrique Bautista Barahona

I was following the quickstart guide with the WordCount example and, when I
entered the analyze page for the job, nothing came up. Apparently the
JobManagerInfoServlet fails with a NullPointerException.
I've been reading the code and I've seen the problem is in the processing of
the RequestAccumulatorResultsStringified message in JobManager. There's a
TODO where the accumulators should be retrieved from the archive.
As I wanted to know more about Flink internals, I decided to try and fix it.
I've later seen that there's currently ongoing work in that part of the code,
so I guess maybe it's not needed, but if you want I could submit a PR. If you
have already taken it into account and will solve it shortly, please feel
free to close the issue.
If you want to take a look, the commit is here:
https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636533#comment-14636533
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/930

[FLINK-2388] return AccumulatorResultsNotFound until archive is checked



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink accumulator-archive-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/930.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #930


commit 0df9996bc2c054d11e837f1d1556ee14f76240f9
Author: Maximilian Michels m...@apache.org
Date:   2015-07-22T08:38:43Z

[JobManager] return AccumulatorResultsNotFound until archive is checked




 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/930

[FLINK-2388] return AccumulatorResultsNotFound until archive is checked



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink accumulator-archive-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/930.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #930


commit 0df9996bc2c054d11e837f1d1556ee14f76240f9
Author: Maximilian Michels m...@apache.org
Date:   2015-07-22T08:38:43Z

[JobManager] return AccumulatorResultsNotFound until archive is checked




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636549#comment-14636549
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35193777
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -688,16 +688,15 @@ class JobManager(
 
   case RequestAccumulatorResults(jobID) =
 try {
-  val accumulatorValues: java.util.Map[String, 
SerializedValue[Object]] = {
-currentJobs.get(jobID) match {
-  case Some((graph, jobInfo)) =
-graph.getAccumulatorsSerialized
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobID) match {
+case Some((graph, jobInfo)) =
+  val accumulatorValues = graph.getAccumulatorsSerialized
+  sender() ! AccumulatorResultsFound(jobID, accumulatorValues)
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobID)
--- End diff --

Why not properly resolving the TODO?


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636551#comment-14636551
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35193782
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -707,33 +706,33 @@ class JobManager(
 
   case RequestAccumulatorResultsStringified(jobId) =
 try {
-  val accumulatorValues: Array[StringifiedAccumulatorResult] = {
-currentJobs.get(jobId) match {
-  case Some((graph, jobInfo)) =
-val accumulators = graph.aggregateUserAccumulators()
-
-val result: Array[StringifiedAccumulatorResult] = new
-Array[StringifiedAccumulatorResult](accumulators.size)
-
-var i = 0
-accumulators foreach {
-  case (name, accumulator) =
-val (typeString, valueString) =
-  if (accumulator != null) {
-(accumulator.getClass.getSimpleName, 
accumulator.toString)
-  } else {
-(null, null)
-  }
-result(i) = new StringifiedAccumulatorResult(name, 
typeString, valueString)
-i += 1
-}
-result
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobId) match {
+case Some((graph, jobInfo)) =
+  val accumulators = graph.aggregateUserAccumulators()
+
+  val result: Array[StringifiedAccumulatorResult] = new
+  Array[StringifiedAccumulatorResult](accumulators.size)
+
+  var i = 0
+  accumulators foreach {
+case (name, accumulator) =
+  val (typeString, valueString) =
+if (accumulator != null) {
+  (accumulator.getClass.getSimpleName, 
accumulator.toString)
+} else {
+  (null, null)
+}
+  result(i) = new StringifiedAccumulatorResult(name, 
typeString, valueString)
+  i += 1
+  }
+
+  sender() ! AccumulatorResultStringsFound(jobId, result)
+
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobId)
--- End diff --

Same here.


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35193782
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -707,33 +706,33 @@ class JobManager(
 
   case RequestAccumulatorResultsStringified(jobId) =
 try {
-  val accumulatorValues: Array[StringifiedAccumulatorResult] = {
-currentJobs.get(jobId) match {
-  case Some((graph, jobInfo)) =
-val accumulators = graph.aggregateUserAccumulators()
-
-val result: Array[StringifiedAccumulatorResult] = new
-Array[StringifiedAccumulatorResult](accumulators.size)
-
-var i = 0
-accumulators foreach {
-  case (name, accumulator) =
-val (typeString, valueString) =
-  if (accumulator != null) {
-(accumulator.getClass.getSimpleName, 
accumulator.toString)
-  } else {
-(null, null)
-  }
-result(i) = new StringifiedAccumulatorResult(name, 
typeString, valueString)
-i += 1
-}
-result
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobId) match {
+case Some((graph, jobInfo)) =
+  val accumulators = graph.aggregateUserAccumulators()
+
+  val result: Array[StringifiedAccumulatorResult] = new
+  Array[StringifiedAccumulatorResult](accumulators.size)
+
+  var i = 0
+  accumulators foreach {
+case (name, accumulator) =
+  val (typeString, valueString) =
+if (accumulator != null) {
+  (accumulator.getClass.getSimpleName, 
accumulator.toString)
+} else {
+  (null, null)
+}
+  result(i) = new StringifiedAccumulatorResult(name, 
typeString, valueString)
+  i += 1
+  }
+
+  sender() ! AccumulatorResultStringsFound(jobId, result)
+
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobId)
--- End diff --

Same here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2363) Add an end-to-end overview of program execution in Flink to the docs


[ 
https://issues.apache.org/jira/browse/FLINK-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636431#comment-14636431
 ] 

ASF GitHub Bot commented on FLINK-2363:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/913#issuecomment-123590116
  
@fhueske Naa, I just want to see it finished. So I'll probably have to 
write some other parts myself but it would be cool if you could contribute this 
part.

I'll open my own branch and then people can do pull requests against that, 
I will try to write the stuff myself or delegate to others for stuff that I 
don't know very well.


 Add an end-to-end overview of program execution in Flink to the docs
 

 Key: FLINK-2363
 URL: https://issues.apache.org/jira/browse/FLINK-2363
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Stephan Ewen





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2363] [docs] First part of internals - ...

2015-07-22 Thread aljoscha

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/913#issuecomment-123590116
  
@fhueske Naa, I just want to see it finished. So I'll probably have to 
write some other parts myself but it would be cool if you could contribute this 
part.

I'll open my own branch and then people can do pull requests against that, 
I will try to write the stuff myself or delegate to others for stuff that I 
don't know very well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1901) Create sample operator for Dataset

2015-07-22 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636451#comment-14636451
 ] 

Till Rohrmann commented on FLINK-1901:
--

If you use the sampling operator this way, it works. However, usually your 
iteration data set is something like the weight vector of your model and you 
have another training dataset from which you want to take a small sample to 
update your weight vector in each iteration (e.g. SGD). When you write a 
program like that, then you'll see that the output of the sampling operator 
will always be the same (for every iteration). The reason is that the sampling 
no longer is on the dynamic path of the iteration and thus it is only once 
calculated and then cached. This is not the intended behaviour, though.

 Create sample operator for Dataset
 --

 Key: FLINK-1901
 URL: https://issues.apache.org/jira/browse/FLINK-1901
 Project: Flink
  Issue Type: Improvement
  Components: Core
Reporter: Theodore Vasiloudis
Assignee: Chengxiang Li

 In order to be able to implement Stochastic Gradient Descent and a number of 
 other machine learning algorithms we need to have a way to take a random 
 sample from a Dataset.
 We need to be able to sample with or without replacement from the Dataset, 
 choose the relative size of the sample, and set a seed for reproducibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2390) Replace iteration timeout with algorithm for detecting termination

2015-07-22 Thread Aljoscha Krettek (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636497#comment-14636497
 ] 

Aljoscha Krettek commented on FLINK-2390:
-

Yes, this could be a good idea.

How would the head operator figure out that no more input is arriving? I'm 
asking because I expect streaming jobs to be long-running, so there might not 
arrive input for a while before more input arrives.

 Replace iteration timeout with algorithm for detecting termination
 --

 Key: FLINK-2390
 URL: https://issues.apache.org/jira/browse/FLINK-2390
 Project: Flink
  Issue Type: New Feature
  Components: Streaming
Reporter: Gyula Fora
 Fix For: 0.10


 Currently the user can set a timeout which will shut down the iteration 
 source/sink nodes if no new data is received during that time to allow 
 program termination in iterative streaming jobs.
 This method is used due to the non-trivial nature of termination in iterative 
 streaming jobs. While termination is not a main concern in long running 
 streaming jobs, this behaviour makes iterative tests non-deterministic and 
 they often fail on travis due to the timeout. Also setting a timeout can 
 cause jobs to terminate prematurely.
 I propose to remove iteration timeouts and replace it with the following 
 algorithm for detecting termination:
 -We first identify loop edges in the jobgraph (the channels from the 
 iteration sources to the head operators)
 -Once the head operators (the ones with loop input) finish with all their 
 non-loop inputs they broadcast a marker to their outputs.
 -Each operator will broadcast a marker once it received a marker from all its 
 non-finished inputs
 -Iteration sources are terminated when they receive 2 consecutive markers 
 without receiving any record in-between
 The idea behind the algorithm is to find out when no more outputs are 
 generated from the operators inside an iteration after their normal inputs 
 are finished.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-22 Thread zentol

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/931

[FLINK-1927][py] Operator distribution rework

Python operators are no longer serialized and shipped across the
cluster. Instead the plan file is executed on each node, followed by
usage of the respective operator object.

removed dill library
also fixed [FLINK-2173] by always passing file paths explicitly to python

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink python_operator4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #931


commit 40fd3501cacb7b382c7265f0370d0f94887b7e85
Author: zentol s.mo...@web.de
Date:   2015-07-21T19:22:19Z

[FLINK-1927][py] Operator distribution rework

Python operators are no longer serialized and shipped across the
cluster. Instead the plan file is executed on each node, followed by
usage of the respective operator object.

removed dill library
[FLINK-2173] filepaths are always explicitly passed to python




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Assigned] (FLINK-2173) Python uses different tmp file than Flink

2015-07-22 Thread Chesnay Schepler (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-2173:
---

Assignee: Chesnay Schepler

 Python uses different tmp file than Flink
 -

 Key: FLINK-2173
 URL: https://issues.apache.org/jira/browse/FLINK-2173
 Project: Flink
  Issue Type: Bug
  Components: Python API
 Environment: Debian Linux
Reporter: Matthias J. Sax
Assignee: Chesnay Schepler
Priority: Critical

 Flink gets the temp file path using System.getProperty(java.io.tmpdir) 
 while Python uses the tempfile.gettempdir() method. However, both can be 
 semantically different.
 On my system Flink uses /tmp while Pyhton used /tmp/users/1000 (1000 is 
 my Linux user-id)
 This issues leads (at least) to failing tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-1927) [Py] Rework operator distribution


[ 
https://issues.apache.org/jira/browse/FLINK-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636537#comment-14636537
 ] 

ASF GitHub Bot commented on FLINK-1927:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/931

[FLINK-1927][py] Operator distribution rework

Python operators are no longer serialized and shipped across the
cluster. Instead the plan file is executed on each node, followed by
usage of the respective operator object.

removed dill library
also fixed [FLINK-2173] by always passing file paths explicitly to python

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink python_operator4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #931


commit 40fd3501cacb7b382c7265f0370d0f94887b7e85
Author: zentol s.mo...@web.de
Date:   2015-07-21T19:22:19Z

[FLINK-1927][py] Operator distribution rework

Python operators are no longer serialized and shipped across the
cluster. Instead the plan file is executed on each node, followed by
usage of the respective operator object.

removed dill library
[FLINK-2173] filepaths are always explicitly passed to python




 [Py] Rework operator distribution
 -

 Key: FLINK-1927
 URL: https://issues.apache.org/jira/browse/FLINK-1927
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.9
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Minor
 Fix For: 0.9


 Currently, the python operator is created when execution the python plan 
 file, serialized using dill and saved as a byte[] in the java function. It is 
 then deserialized at runtime on each node.
 The current implementation is fairly hacky, and imposes certain limitations 
 that make it hard to work with. Chaining, or generally saving other 
 user-code, always requires a separate deserialization step after 
 deserializing the operator.
 These issues can be easily circumvented by rebuilding the (python) plan on 
 each node, instead of serializing the operator. The plan creation is 
 deterministic, and every operator is uniquely identified by an ID that is 
 already known to the java function.
 This change will allow us to easily support custom serializers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Closed] (FLINK-2346) Mesos clustering

2015-07-22 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-2346.
-
Resolution: Duplicate

I'm closing this issue, its a duplicate of FLINK-1984.

 Mesos clustering
 

 Key: FLINK-2346
 URL: https://issues.apache.org/jira/browse/FLINK-2346
 Project: Flink
  Issue Type: New Feature
Reporter: Suminda Dharmasena





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35193777
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -688,16 +688,15 @@ class JobManager(
 
   case RequestAccumulatorResults(jobID) =
 try {
-  val accumulatorValues: java.util.Map[String, 
SerializedValue[Object]] = {
-currentJobs.get(jobID) match {
-  case Some((graph, jobInfo)) =
-graph.getAccumulatorsSerialized
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobID) match {
+case Some((graph, jobInfo)) =
+  val accumulatorValues = graph.getAccumulatorsSerialized
+  sender() ! AccumulatorResultsFound(jobID, accumulatorValues)
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobID)
--- End diff --

Why not properly resolving the TODO?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository


[ 
https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636575#comment-14636575
 ] 

ASF GitHub Bot commented on FLINK-2200:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/885#discussion_r35194868
  
--- Diff: 
flink-contrib/flink-storm-compatibility/flink-storm-compatibility-examples/src/assembly/word-count-storm.xml
 ---
@@ -36,7 +36,7 @@ under the License.
 includes
 !-- need to be added explicitly to get 'defaults.yaml' --
 includeorg.apache.storm:storm-core:jar/include
-
includeorg.apache.flink:flink-storm-examples:jar/include
+
includeorg.apache.flink:flink-storm-compatibility-examples${scala.suffix}:jar/include
--- End diff --

good finding ;)


 Flink API with Scala 2.11 - Maven Repository
 

 Key: FLINK-2200
 URL: https://issues.apache.org/jira/browse/FLINK-2200
 Project: Flink
  Issue Type: Wish
  Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
  Labels: maven

 It would be nice if you could upload a pre-built version of the Flink API 
 with Scala 2.11 to the maven repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2200] Add Flink with Scala 2.11 in Mave...

2015-07-22 Thread rmetzger

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/885#discussion_r35194868
  
--- Diff: 
flink-contrib/flink-storm-compatibility/flink-storm-compatibility-examples/src/assembly/word-count-storm.xml
 ---
@@ -36,7 +36,7 @@ under the License.
 includes
 !-- need to be added explicitly to get 'defaults.yaml' --
 includeorg.apache.storm:storm-core:jar/include
-
includeorg.apache.flink:flink-storm-examples:jar/include
+
includeorg.apache.flink:flink-storm-compatibility-examples${scala.suffix}:jar/include
--- End diff --

good finding ;)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2218) Web client cannot distinguesh between Flink options and program arguments


[ 
https://issues.apache.org/jira/browse/FLINK-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636627#comment-14636627
 ] 

ASF GitHub Bot commented on FLINK-2218:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123648113
  
This only occurs with your changes, right? Then let's first fix the issue 
of the failing tests before merging.


 Web client cannot distinguesh between Flink options and program arguments
 -

 Key: FLINK-2218
 URL: https://issues.apache.org/jira/browse/FLINK-2218
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Matthias J. Sax

 WebClient has only one input field for arguments. This field is used for 
 Flink options (e.g., `-p`) and program arguments. Thus, supported Flink 
 options restrict the possible program arguments. CliFrontend in contrast can 
 distinguish both and thus `-p` can also be used as an program argument.
 Solution: add a second input field for Flink options to WebClient



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2218] Web client cannot distinguesh bet...

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123648113
  
This only occurs with your changes, right? Then let's first fix the issue 
of the failing tests before merging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636740#comment-14636740
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user ebautistabar commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35202991
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -687,61 +687,23 @@ class JobManager(
 message match {
 
   case RequestAccumulatorResults(jobID) =
-try {
--- End diff --

Are the try/catch unnecessary?


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive

[
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636601#comment-14636601
]

Maximilian Michels commented on FLINK-2388:
---

[~ebautistabar] Can you take a look at the pull request and see if you're
satisfied with that one. I just took a look at your commit as well and I think
it's similar. I tried to clean up where possible.

JobManager should try retrieving jobs from archive
--

Key: FLINK-2388
URL: https://issues.apache.org/jira/browse/FLINK-2388
Project: Flink
Issue Type: Task
Components: JobManager
Reporter: Enrique Bautista Barahona

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository


[ 
https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636598#comment-14636598
 ] 

ASF GitHub Bot commented on FLINK-2200:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-123643760
  
No need to hurry .. I needed 10 days to look at it .. so a few hours don't 
matter ;)

I tried to install the artifacts to my local repository, but the _2.11 
suffix is not added to the artifacts.

There are a few points still missing:
the parent pom probably also needs the suffix, + all the parent 
definitions in the modules and the module.../module definitions in the 
pom projects.

In this commit you can see what I mean: 
https://github.com/rmetzger/flink/commit/77622269a50babfdc85f46a4235bcdf093cb8e50

Please let me know if you have further questions.


 Flink API with Scala 2.11 - Maven Repository
 

 Key: FLINK-2200
 URL: https://issues.apache.org/jira/browse/FLINK-2200
 Project: Flink
  Issue Type: Wish
  Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
  Labels: maven

 It would be nice if you could upload a pre-built version of the Flink API 
 with Scala 2.11 to the maven repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2200] Add Flink with Scala 2.11 in Mave...

2015-07-22 Thread rmetzger

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-123643760
  
No need to hurry .. I needed 10 days to look at it .. so a few hours don't 
matter ;)

I tried to install the artifacts to my local repository, but the _2.11 
suffix is not added to the artifacts.

There are a few points still missing:
the parent pom probably also needs the suffix, + all the parent 
definitions in the modules and the module.../module definitions in the 
pom projects.

In this commit you can see what I mean: 
https://github.com/rmetzger/flink/commit/77622269a50babfdc85f46a4235bcdf093cb8e50

Please let me know if you have further questions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2218) Web client cannot distinguesh between Flink options and program arguments


[ 
https://issues.apache.org/jira/browse/FLINK-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636606#comment-14636606
 ] 

ASF GitHub Bot commented on FLINK-2218:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123645039
  
+1 for merging it!


 Web client cannot distinguesh between Flink options and program arguments
 -

 Key: FLINK-2218
 URL: https://issues.apache.org/jira/browse/FLINK-2218
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Matthias J. Sax

 WebClient has only one input field for arguments. This field is used for 
 Flink options (e.g., `-p`) and program arguments. Thus, supported Flink 
 options restrict the possible program arguments. CliFrontend in contrast can 
 distinguish both and thus `-p` can also be used as an program argument.
 Solution: add a second input field for Flink options to WebClient



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2218] Web client cannot distinguesh bet...

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123645039
  
+1 for merging it!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/930#issuecomment-123649614
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636741#comment-14636741
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user ebautistabar commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35202994
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -580,6 +581,34 @@ public ExecutionContext getExecutionContext() {
return result;
}
 
+   /**
+* Returns the a stringified version of the user-defined accumulators.
+* @return an Array containing the StringifiedAccumulatorResult objects
+*/
+   public StringifiedAccumulatorResult[] 
getAccumulatorResultsStringified() {
+
+   MapString, Accumulator?, ? accumulatorMap = 
aggregateUserAccumulators();
+
+   int num = accumulatorMap.size();
+   StringifiedAccumulatorResult[] resultStrings = new 
StringifiedAccumulatorResult[num];
+
+   int i = 0;
+   for (Map.EntryString, Accumulator?, ? entry : 
accumulatorMap.entrySet()) {
+
+   StringifiedAccumulatorResult result;
+   Accumulator?, ? value = entry.getValue();
+   if (value != null) {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), value.getClass().getSimpleName(), 
value.toString());
+   } else {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), null, null);
--- End diff --

Why do you use the string `null` instead of `null`?


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

2015-07-22 Thread ebautistabar

Github user ebautistabar commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35202994
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -580,6 +581,34 @@ public ExecutionContext getExecutionContext() {
return result;
}
 
+   /**
+* Returns the a stringified version of the user-defined accumulators.
+* @return an Array containing the StringifiedAccumulatorResult objects
+*/
+   public StringifiedAccumulatorResult[] 
getAccumulatorResultsStringified() {
+
+   MapString, Accumulator?, ? accumulatorMap = 
aggregateUserAccumulators();
+
+   int num = accumulatorMap.size();
+   StringifiedAccumulatorResult[] resultStrings = new 
StringifiedAccumulatorResult[num];
+
+   int i = 0;
+   for (Map.EntryString, Accumulator?, ? entry : 
accumulatorMap.entrySet()) {
+
+   StringifiedAccumulatorResult result;
+   Accumulator?, ? value = entry.getValue();
+   if (value != null) {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), value.getClass().getSimpleName(), 
value.toString());
+   } else {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), null, null);
--- End diff --

Why do you use the string `null` instead of `null`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

2015-07-22 Thread ebautistabar

Github user ebautistabar commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35202991
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -687,61 +687,23 @@ class JobManager(
 message match {
 
   case RequestAccumulatorResults(jobID) =
-try {
--- End diff --

Are the try/catch unnecessary?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636591#comment-14636591
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35195856
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -688,16 +688,15 @@ class JobManager(
 
   case RequestAccumulatorResults(jobID) =
 try {
-  val accumulatorValues: java.util.Map[String, 
SerializedValue[Object]] = {
-currentJobs.get(jobID) match {
-  case Some((graph, jobInfo)) =
-graph.getAccumulatorsSerialized
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobID) match {
+case Some((graph, jobInfo)) =
+  val accumulatorValues = graph.getAccumulatorsSerialized
+  sender() ! AccumulatorResultsFound(jobID, accumulatorValues)
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobID)
--- End diff --

I've updated the pull request.


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35195856
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -688,16 +688,15 @@ class JobManager(
 
   case RequestAccumulatorResults(jobID) =
 try {
-  val accumulatorValues: java.util.Map[String, 
SerializedValue[Object]] = {
-currentJobs.get(jobID) match {
-  case Some((graph, jobInfo)) =
-graph.getAccumulatorsSerialized
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobID) match {
+case Some((graph, jobInfo)) =
+  val accumulatorValues = graph.getAccumulatorsSerialized
+  sender() ! AccumulatorResultsFound(jobID, accumulatorValues)
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobID)
--- End diff --

I've updated the pull request.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2391) Storm-compatibility:method FlinkTopologyBuilder.createTopology() throws java.lang.NullPointerException


[ 
https://issues.apache.org/jira/browse/FLINK-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636603#comment-14636603
 ] 

ASF GitHub Bot commented on FLINK-2391:
---

GitHub user ffbin opened a pull request:

https://github.com/apache/flink/pull/932

[FLINK-2391]Fix Storm-compatibility FlinkTopologyBuilder.createTopology bug

1.Error Scene:
Error happend in program like this:
builder.setSpout(source0, new Generator(pt), 
pt.getInt(sourceParallelism));
builder.setBolt(sa, new RepartPassThroughBolt(pt), 
pt.getInt(sinkParallelism)).fieldsGrouping(source0, new Fields(id));
builder.setBolt(sink, new Sink(pt), 
pt.getInt(sinkParallelism)).fieldsGrouping(sa, new Fields(id));
final FlinkLocalCluster cluster = FlinkLocalCluster.getLocalCluster();
cluster.submitTopology(throughput, conf, builder.createTopology());

if the last bolt use fieldsGrouping, createTopology will throw 
NullPointerException.

2.Reason:
where get streaming group attribute index, it get downstream operator 
outputFields,this is error。Because the last bolt has no
outputFields, so the outputSchema of declarer in null and throw 
NullPointerException.

3.Modify:
where get streaming group attribute index, it get upstream operator 
outputFields.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ffbin/flink FFB_Flink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/932.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #932


commit e9c098670a14020d38f9b5e05442a2a606d90dcb
Author: ffbin 869218...@qq.com
Date:   2015-07-22T09:04:45Z

modify




 Storm-compatibility:method FlinkTopologyBuilder.createTopology() throws 
 java.lang.NullPointerException
 --

 Key: FLINK-2391
 URL: https://issues.apache.org/jira/browse/FLINK-2391
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
  Labels: features
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 core dumped at FlinkOutputFieldsDeclarer.java : 160(package 
 FlinkOutputFieldsDeclarer).
 code : fieldIndexes[i] = this.outputSchema.fieldIndex(groupingFields.get(i));
 in this line, the var this.outputSchema may be null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2391]Fix Storm-compatibility FlinkTopol...

2015-07-22 Thread ffbin

GitHub user ffbin opened a pull request:

https://github.com/apache/flink/pull/932

[FLINK-2391]Fix Storm-compatibility FlinkTopologyBuilder.createTopology bug

1.Error Scene:
Error happend in program like this:
builder.setSpout(source0, new Generator(pt), 
pt.getInt(sourceParallelism));
builder.setBolt(sa, new RepartPassThroughBolt(pt), 
pt.getInt(sinkParallelism)).fieldsGrouping(source0, new Fields(id));
builder.setBolt(sink, new Sink(pt), 
pt.getInt(sinkParallelism)).fieldsGrouping(sa, new Fields(id));
final FlinkLocalCluster cluster = FlinkLocalCluster.getLocalCluster();
cluster.submitTopology(throughput, conf, builder.createTopology());

if the last bolt use fieldsGrouping, createTopology will throw 
NullPointerException.

2.Reason:
where get streaming group attribute index, it get downstream operator 
outputFields,this is errorãBecause the last bolt has no
outputFields, so the outputSchema of declarer in null and throw 
NullPointerException.

3.Modify:
where get streaming group attribute index, it get upstream operator 
outputFields.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ffbin/flink FFB_Flink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/932.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #932


commit e9c098670a14020d38f9b5e05442a2a606d90dcb
Author: ffbin 869218...@qq.com
Date:   2015-07-22T09:04:45Z

modify




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2218] Web client cannot distinguesh bet...

2015-07-22 Thread mjsax

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123646720
  
You are right. I just updated it.
What about the failing YARN test. Is there already a JIRA for it? 
https://travis-ci.org/apache/flink/builds/72019686


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1658] Rename AbstractEvent to AbstractT...

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/929#issuecomment-123646781
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent


[ 
https://issues.apache.org/jira/browse/FLINK-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636621#comment-14636621
 ] 

ASF GitHub Bot commented on FLINK-1658:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/929#issuecomment-123646781
  
+1


 Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent
 --

 Key: FLINK-1658
 URL: https://issues.apache.org/jira/browse/FLINK-1658
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Local Runtime
Reporter: Gyula Fora
Assignee: Matthias J. Sax
Priority: Trivial

 The same name is used for different event classes in the runtime which can 
 cause confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2218) Web client cannot distinguesh between Flink options and program arguments


[ 
https://issues.apache.org/jira/browse/FLINK-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636620#comment-14636620
 ] 

ASF GitHub Bot commented on FLINK-2218:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123646720
  
You are right. I just updated it.
What about the failing YARN test. Is there already a JIRA for it? 
https://travis-ci.org/apache/flink/builds/72019686


 Web client cannot distinguesh between Flink options and program arguments
 -

 Key: FLINK-2218
 URL: https://issues.apache.org/jira/browse/FLINK-2218
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Matthias J. Sax

 WebClient has only one input field for arguments. This field is used for 
 Flink options (e.g., `-p`) and program arguments. Thus, supported Flink 
 options restrict the possible program arguments. CliFrontend in contrast can 
 distinguish both and thus `-p` can also be used as an program argument.
 Solution: add a second input field for Flink options to WebClient



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (FLINK-2388) JobManager should try retrieving jobs from archive

2015-07-22 Thread Enrique Bautista Barahona (JIRA)

[
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636727#comment-14636727
]

Enrique Bautista Barahona edited comment on FLINK-2388 at 7/22/15 11:10 AM:

Sure [~mxm]. As far as I can see it's very similar. I have some comments/doubts
that I will post in the PR.

was (Author: ebautistabar):
Sure [~mxm], I'll take a look.

JobManager should try retrieving jobs from archive
--

Key: FLINK-2388
URL: https://issues.apache.org/jira/browse/FLINK-2388
Project: Flink
Issue Type: Task
Components: JobManager
Reporter: Enrique Bautista Barahona

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive

2015-07-22 Thread Enrique Bautista Barahona (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636727#comment-14636727
 ] 

Enrique Bautista Barahona commented on FLINK-2388:
--

Sure [~mxm], I'll take a look.

 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2218] Web client cannot distinguesh bet...

2015-07-22 Thread mjsax

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123674841
  
I don't know exactly. My changes should not influence YARN component (as 
fas as I can tell). YARN test have been instable a lot -- I would guess, it is 
a general issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository

2015-07-22 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636578#comment-14636578
 ] 

Robert Metzger commented on FLINK-2200:
---

[~pgoetze], can you check if  Chiwan's pull request is resolving your issue?

 Flink API with Scala 2.11 - Maven Repository
 

 Key: FLINK-2200
 URL: https://issues.apache.org/jira/browse/FLINK-2200
 Project: Flink
  Issue Type: Wish
  Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
  Labels: maven

 It would be nice if you could upload a pre-built version of the Flink API 
 with Scala 2.11 to the maven repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35195608
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -688,16 +688,15 @@ class JobManager(
 
   case RequestAccumulatorResults(jobID) =
 try {
-  val accumulatorValues: java.util.Map[String, 
SerializedValue[Object]] = {
-currentJobs.get(jobID) match {
-  case Some((graph, jobInfo)) =
-graph.getAccumulatorsSerialized
-  case None =
-null // TODO check also archive
-}
+  currentJobs.get(jobID) match {
+case Some((graph, jobInfo)) =
+  val accumulatorValues = graph.getAccumulatorsSerialized
+  sender() ! AccumulatorResultsFound(jobID, accumulatorValues)
+case None =
+  // TODO check also archive
+  sender() ! AccumulatorResultsNotFound(jobID)
--- End diff --

See the JIRA for an explanation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2200] Add Flink with Scala 2.11 in Mave...

2015-07-22 Thread rmetzger

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-123640287
  
Sorry for the delay.
I will deploy the artifacts from this branch to the maven snapshot 
repository to see if everything works as expected.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository


[ 
https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636589#comment-14636589
 ] 

ASF GitHub Bot commented on FLINK-2200:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-123640287
  
Sorry for the delay.
I will deploy the artifacts from this branch to the maven snapshot 
repository to see if everything works as expected.


 Flink API with Scala 2.11 - Maven Repository
 

 Key: FLINK-2200
 URL: https://issues.apache.org/jira/browse/FLINK-2200
 Project: Flink
  Issue Type: Wish
  Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
  Labels: maven

 It would be nice if you could upload a pre-built version of the Flink API 
 with Scala 2.11 to the maven repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2200] Add Flink with Scala 2.11 in Mave...

2015-07-22 Thread chiwanpark

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-123642852
  
Hi, currently this PR is not ready to merge, because this PR doesn't 
contain changes for #677. I'll update soon. Unfortunately I'm outside now. 
Maybe I can update this PR in 4-5 hours.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository


[ 
https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636595#comment-14636595
 ] 

ASF GitHub Bot commented on FLINK-2200:
---

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-123642852
  
Hi, currently this PR is not ready to merge, because this PR doesn't 
contain changes for #677. I'll update soon. Unfortunately I'm outside now. 
Maybe I can update this PR in 4-5 hours.


 Flink API with Scala 2.11 - Maven Repository
 

 Key: FLINK-2200
 URL: https://issues.apache.org/jira/browse/FLINK-2200
 Project: Flink
  Issue Type: Wish
  Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
  Labels: maven

 It would be nice if you could upload a pre-built version of the Flink API 
 with Scala 2.11 to the maven repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Closed] (FLINK-2371) AccumulatorLiveITCase fails

2015-07-22 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax closed FLINK-2371.
--

Fixed test passed 10 Travis runs for me without failing. Seems to be stable now.

 AccumulatorLiveITCase fails
 ---

 Key: FLINK-2371
 URL: https://issues.apache.org/jira/browse/FLINK-2371
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Matthias J. Sax
Assignee: Maximilian Michels

 AccumulatorLiveITCase fails regularly (however, not in each run). The tests 
 relies on timing (via sleep) which does not work well on Travis.
 See dev-list for more details: 
 https://mail-archives.apache.org/mod_mbox/flink-dev/201507.mbox/browser
 AccumulatorLiveITCase.testProgram:106-access$1100:68-checkFlinkAccumulators:189



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-685) Add support for semi-joins

2015-07-22 Thread Stephan Ewen (JIRA)

[
https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636714#comment-14636714
]

Stephan Ewen commented on FLINK-685:

So far, we have had the paradigm that any operation in Flink's batch API must
be size safe, so it has to scale to groups beyond main memory without an
{{OutOfMemoryError}}.

This put a bit of a hurdle on implementing certain operations.

How about we start a class for extended operations, where we add simpler
implementations? We would not give these guarantees to handle super large
groups, but that is probably still good enough for a big part of the use cases.
We could add your implementation to that class of extended operations.

Add support for semi-joins
--

Key: FLINK-685
URL: https://issues.apache.org/jira/browse/FLINK-685
Project: Flink
Issue Type: New Feature
Reporter: GitHub Import
Assignee: pietro pinoli
Priority: Minor
Labels: github-import
Fix For: pre-apache

A semi-join is basically a join filter. One input is filtering and the
other one is filtered.
A tuple of the filtered input is emitted exactly once if the filtering
input has one (ore more) tuples with matching join keys. That means that the
output of a semi-join has the same type as the filtered input and the
filtering input is completely discarded.
In order to support a semi-join, we need to add an additional physical
execution strategy, that ensures, that a tuple of the filtered input is
emitted only once if the filtering input has more than one tuple with
matching keys. Furthermore, a couple of optimizations compared to standard
joins can be done such as storing only keys and not the full tuple of the
filtering input in a hash table.
Imported from GitHub
Url: https://github.com/stratosphere/stratosphere/issues/685
Created by: [fhueske|https://github.com/fhueske]
Labels: enhancement, java api, runtime,
Milestone: Release 0.6 (unplanned)
Created at: Mon Apr 14 12:05:29 CEST 2014
State: open

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2371) AccumulatorLiveITCase fails


[ 
https://issues.apache.org/jira/browse/FLINK-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636632#comment-14636632
 ] 

Maximilian Michels commented on FLINK-2371:
---

Good to hear :)

 AccumulatorLiveITCase fails
 ---

 Key: FLINK-2371
 URL: https://issues.apache.org/jira/browse/FLINK-2371
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Matthias J. Sax
Assignee: Maximilian Michels

 AccumulatorLiveITCase fails regularly (however, not in each run). The tests 
 relies on timing (via sleep) which does not work well on Travis.
 See dev-list for more details: 
 https://mail-archives.apache.org/mod_mbox/flink-dev/201507.mbox/browser
 AccumulatorLiveITCase.testProgram:106-access$1100:68-checkFlinkAccumulators:189



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636635#comment-14636635
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/930#issuecomment-123649614
  
LGTM


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2388) JobManager should try retrieving jobs from archive


[ 
https://issues.apache.org/jira/browse/FLINK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636764#comment-14636764
 ] 

ASF GitHub Bot commented on FLINK-2388:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35204559
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -580,6 +581,34 @@ public ExecutionContext getExecutionContext() {
return result;
}
 
+   /**
+* Returns the a stringified version of the user-defined accumulators.
+* @return an Array containing the StringifiedAccumulatorResult objects
+*/
+   public StringifiedAccumulatorResult[] 
getAccumulatorResultsStringified() {
+
+   MapString, Accumulator?, ? accumulatorMap = 
aggregateUserAccumulators();
+
+   int num = accumulatorMap.size();
+   StringifiedAccumulatorResult[] resultStrings = new 
StringifiedAccumulatorResult[num];
+
+   int i = 0;
+   for (Map.EntryString, Accumulator?, ? entry : 
accumulatorMap.entrySet()) {
+
+   StringifiedAccumulatorResult result;
+   Accumulator?, ? value = entry.getValue();
+   if (value != null) {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), value.getClass().getSimpleName(), 
value.toString());
+   } else {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), null, null);
--- End diff --

StringifiedAccumulatorResult expects a String. I thought it's better to 
pass a null String instead of null.


 JobManager should try retrieving jobs from archive
 --

 Key: FLINK-2388
 URL: https://issues.apache.org/jira/browse/FLINK-2388
 Project: Flink
  Issue Type: Task
  Components: JobManager
Reporter: Enrique Bautista Barahona

 I was following the quickstart guide with the WordCount example and, when I 
 entered the analyze page for the job, nothing came up. Apparently the 
 JobManagerInfoServlet fails with a NullPointerException.
 I've been reading the code and I've seen the problem is in the processing of 
 the RequestAccumulatorResultsStringified message in JobManager. There's a 
 TODO where the accumulators should be retrieved from the archive.
 As I wanted to know more about Flink internals, I decided to try and fix it. 
 I've later seen that there's currently ongoing work in that part of the code, 
 so I guess maybe it's not needed, but if you want I could submit a PR. If you 
 have already taken it into account and will solve it shortly, please feel 
 free to close the issue.
 If you want to take a look, the commit is here: 
 https://github.com/ebautistabar/flink/commit/8536352b21fb6c78ad8840b0397509df04358c6b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35204559
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -580,6 +581,34 @@ public ExecutionContext getExecutionContext() {
return result;
}
 
+   /**
+* Returns the a stringified version of the user-defined accumulators.
+* @return an Array containing the StringifiedAccumulatorResult objects
+*/
+   public StringifiedAccumulatorResult[] 
getAccumulatorResultsStringified() {
+
+   MapString, Accumulator?, ? accumulatorMap = 
aggregateUserAccumulators();
+
+   int num = accumulatorMap.size();
+   StringifiedAccumulatorResult[] resultStrings = new 
StringifiedAccumulatorResult[num];
+
+   int i = 0;
+   for (Map.EntryString, Accumulator?, ? entry : 
accumulatorMap.entrySet()) {
+
+   StringifiedAccumulatorResult result;
+   Accumulator?, ? value = entry.getValue();
+   if (value != null) {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), value.getClass().getSimpleName(), 
value.toString());
+   } else {
+   result = new 
StringifiedAccumulatorResult(entry.getKey(), null, null);
--- End diff --

StringifiedAccumulatorResult expects a String. I thought it's better to 
pass a null String instead of null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2388] return AccumulatorResultsNotFound...

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/930#discussion_r35204984
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -687,61 +687,23 @@ class JobManager(
 message match {
 
   case RequestAccumulatorResults(jobID) =
-try {
--- End diff --

You're right, this should stay in there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2205) Confusing entries in JM Webfrontend Job Configuration section


[ 
https://issues.apache.org/jira/browse/FLINK-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636798#comment-14636798
 ] 

ASF GitHub Bot commented on FLINK-2205:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/927#issuecomment-123700358
  
The job manager UI is under rework right now, there is a new version coming 
up, see here:

https://github.com/apache/flink/tree/master/flink-runtime-web

This fix (if going to the old UI) would be only temporary...


 Confusing entries in JM Webfrontend Job Configuration section
 -

 Key: FLINK-2205
 URL: https://issues.apache.org/jira/browse/FLINK-2205
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
  Labels: starter

 The Job Configuration section of the job history / analyze page of the 
 JobManager webinterface contains two confusing entries:
 - {{Number of execution retries}} is actually the maximum number of retries 
 and should be renamed accordingly. The default value is -1 and should be 
 changed to deactivated (or 0).
 - {{Job parallelism}} which is -1 by default. A parallelism of -1 is not very 
 meaningful. It would be better to show something like auto



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2205] Fix confusing entries in JM UI Jo...

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/927#issuecomment-123700358
  
The job manager UI is under rework right now, there is a new version coming 
up, see here:

https://github.com/apache/flink/tree/master/flink-runtime-web

This fix (if going to the old UI) would be only temporary...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: FLINK-2322 Unclosed stream may leak resource

2015-07-22 Thread aljoscha

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/928#issuecomment-123700349
  
Maybe your vim is set to replace tabs with spaces, but in the files that 
you changed there are definitely spaces now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-2322) Unclosed stream may leak resource


[ 
https://issues.apache.org/jira/browse/FLINK-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636797#comment-14636797
 ] 

ASF GitHub Bot commented on FLINK-2322:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/928#issuecomment-123700349
  
Maybe your vim is set to replace tabs with spaces, but in the files that 
you changed there are definitely spaces now.


 Unclosed stream may leak resource
 -

 Key: FLINK-2322
 URL: https://issues.apache.org/jira/browse/FLINK-2322
 Project: Flink
  Issue Type: Bug
Reporter: Ted Yu
  Labels: starter

 In UdfAnalyzerUtils.java :
 {code}
 ClassReader cr = new 
 ClassReader(Thread.currentThread().getContextClassLoader()
 .getResourceAsStream(internalClassName.replace('.', '/') + 
 .class));
 {code}
 The stream returned by getResourceAsStream() should be closed upon exit of 
 findMethodNode()
 In ParameterTool#fromPropertiesFile():
 {code}
 props.load(new FileInputStream(propertiesFile));
 {code}
 The FileInputStream should be closed before returning.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-1658] Rename AbstractEvent to AbstractT...

2015-07-22 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/929


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Commented] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent


[ 
https://issues.apache.org/jira/browse/FLINK-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636828#comment-14636828
 ] 

ASF GitHub Bot commented on FLINK-1658:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/929


 Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent
 --

 Key: FLINK-1658
 URL: https://issues.apache.org/jira/browse/FLINK-1658
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Local Runtime
Reporter: Gyula Fora
Assignee: Matthias J. Sax
Priority: Trivial

 The same name is used for different event classes in the runtime which can 
 cause confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-2218) Web client cannot distinguesh between Flink options and program arguments


[ 
https://issues.apache.org/jira/browse/FLINK-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636854#comment-14636854
 ] 

ASF GitHub Bot commented on FLINK-2218:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/904#issuecomment-123707491
  
Yes, please open a JIRA. Merging.


 Web client cannot distinguesh between Flink options and program arguments
 -

 Key: FLINK-2218
 URL: https://issues.apache.org/jira/browse/FLINK-2218
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Matthias J. Sax

 WebClient has only one input field for arguments. This field is used for 
 Flink options (e.g., `-p`) and program arguments. Thus, supported Flink 
 options restrict the possible program arguments. CliFrontend in contrast can 
 distinguish both and thus `-p` can also be used as an program argument.
 Solution: add a second input field for Flink options to WebClient



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Resolved] (FLINK-2218) Web client cannot distinguesh between Flink options and program arguments


 [ 
https://issues.apache.org/jira/browse/FLINK-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-2218.
---
Resolution: Fixed

 Web client cannot distinguesh between Flink options and program arguments
 -

 Key: FLINK-2218
 URL: https://issues.apache.org/jira/browse/FLINK-2218
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Matthias J. Sax

 WebClient has only one input field for arguments. This field is used for 
 Flink options (e.g., `-p`) and program arguments. Thus, supported Flink 
 options restrict the possible program arguments. CliFrontend in contrast can 
 distinguish both and thus `-p` can also be used as an program argument.
 Solution: add a second input field for Flink options to WebClient



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[GitHub] flink pull request: [FLINK-2332] [runtime] Adds leader session IDs...