[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/378


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-11 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73908802
  
I added the UpdateTask message aggregation. I also had to rework the 
PartitionInfo creation to make it work with the concurrent task updates. This 
requires another review of the code before we can merge it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-11 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73911265
  
The job that was previously failing is fixed with this change.

We should merge this change ASAP, because its kinda impossible right now to 
seriously use flink 0.9-SNAPSHOT without it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-11 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73854842
  
There is a problem: https://travis-ci.org/apache/flink/jobs/50215407
```
java.lang.IllegalStateException: Consumer state is FINISHED but was 
expected to be RUNNING.
at 
org.apache.flink.runtime.deployment.PartialPartitionInfo.createPartitionInfo(PartialPartitionInfo.java:81)
at 
org.apache.flink.runtime.executiongraph.Execution.sendPartitionInfos(Execution.java:581)
at 
org.apache.flink.runtime.executiongraph.Execution.switchToRunning(Execution.java:654)
at 
org.apache.flink.runtime.executiongraph.Execution.access$100(Execution.java:88)
at 
org.apache.flink.runtime.executiongraph.Execution$2.onComplete(Execution.java:336)
at akka.dispatch.OnComplete.internal(Future.scala:247)
at akka.dispatch.OnComplete.internal(Future.scala:244)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:174)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:171)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at 
scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-11 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73851481
  
You're right. At the moment there is no aggregation of messages. I'll add 
it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24415175
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java
 ---
@@ -416,13 +426,26 @@ boolean 
scheduleOrUpdateConsumers(ListListExecutionEdge consumers) throws Ex
final ExecutionState consumerState = 
consumerVertex.getExecutionState();
 
if (consumerState == CREATED) {
-   if (state == RUNNING) {
-   if 
(!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(),
 false)) {
-   success = false;
+   
consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+   future(new CallableBoolean(){
+   @Override
+   public Boolean call() throws Exception {
+   try {
+   
consumerVertex.scheduleForExecution(
+   
consumerVertex.getExecutionGraph().getScheduler(), false);
+   } catch (Exception exception) {
+   fail(new 
IllegalStateException(Could not schedule consumer  +
+   vertex 
 + consumerVertex, exception));
+   }
+
+   return true;
}
-   }
-   else {
-   success = false;
+   }, AkkaUtils.globalExecutionContext());
+
+   // double check to resolve race conditions
+   if(consumerVertex.getExecutionState() == 
RUNNING){
+   consumerVertex.sendPartitionInfos();
--- End diff --

Just to verify: the double check  send relies on the fact that update 
messages at the task manager are idempotent, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73708514
  
Looks good to me. +1

We chatted about batching update task calls. Did you realize a problem with 
it or can we open an improvement issue for it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73693561
  
Very nice. I will have a detailed look later.

@zentol Can you also test it with the Python API? I think you initially 
noticed the problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-09 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/378

[FLINK-1489] Fixes blocking scheduleOrUpdateConsumers message calls

Replaces the blocking calls with futures which in case of an exception let 
the respective task fail. Furthermore, the PartitionInfos are buffered on the 
JobManager in case that some of the consumers are not yet scheduled. Once the 
state of the consumers switched to running, all buffered partition infos are 
sent to the consumers.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
fixScheduleOrUpdateConsumers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/378.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #378


commit d17f15ac966d59444aed86ed7d1c9cc1a2716152
Author: Till Rohrmann trohrm...@apache.org
Date:   2015-02-06T14:13:28Z

[FLINK-1489] Replaces blocking scheduleOrUpdateConsumers message calls with 
asynchronous futures. Buffers PartitionInfos at the JobManager in case that the 
respective consumer has not been scheduled.

Conflicts:

flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---