[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328795#comment-14328795 ] ASF GitHub Bot commented on FLINK-1501: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-75218017 This looks great! ^^ Integrate metrics library and report basic metrics to JobManager web interface -- Key: FLINK-1501 URL: https://issues.apache.org/jira/browse/FLINK-1501 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: pre-apache As per mailing list, the library: https://github.com/dropwizard/metrics The goal of this task is to get the basic infrastructure in place. Subsequent issues will integrate more features into the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328808#comment-14328808 ] Vasia Kalavri commented on FLINK-1587: -- Hi, you can easily reproduce this if you have an edge set with an invalid edge ID, as [~andralungu] says. This is a case that shouldn't happen, so I suggest we add a check and throw an exception. This should probably also be taken care of in the other two degree methods. [~andralungu] or [~cebe], would either of you like to take care of this? :) -V. coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt I am receiving the following exception when running a simple job that extracts the outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally; will try that in the next days.
{noformat}
02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED
java.util.NoSuchElementException
	at java.util.Collections$EmptyIterator.next(Collections.java:3006)
	at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665)
	at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130)
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
	at java.lang.Thread.run(Thread.java:745)
02/20/2015 02:27:02: Job execution switched to status FAILING
...
{noformat}
The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup iterator may be empty? As far as I can see there is a bug somewhere. I'd like to write a test case to reproduce the issue. Where can I put such a test?
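A guard of the kind suggested above (checking the co-group side before calling next()) can be sketched with plain java.util iterators; `CoGroupGuard` and `countOrZero` are hypothetical stand-ins, not Gelly's actual classes:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

public class CoGroupGuard {
    /** Counts the neighbor side of a co-group, tolerating an empty iterator
     *  instead of failing with NoSuchElementException on next(). */
    static long countOrZero(Iterator<Integer> neighbors) {
        long count = 0;
        while (neighbors.hasNext()) { // guard before every next()
            neighbors.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // An invalid edge ID yields an empty vertex side, as in the stack trace above.
        System.out.println(countOrZero(Collections.emptyIterator()));      // 0
        System.out.println(countOrZero(Arrays.asList(1, 2, 3).iterator())); // 3
    }
}
```

Whether an empty side should count as zero or raise a descriptive exception is exactly the design question raised in the comment.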
[GitHub] flink pull request: [FLINK-1515]Splitted runVertexCentricIteration...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75220973 Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): ``` Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout ``` Is this fixed by #422? Shall I proceed? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1515) [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328814#comment-14328814 ] ASF GitHub Bot commented on FLINK-1515: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75220973 Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): ``` Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout ``` Is this fixed by #422? Shall I proceed? Thanks! [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration --- Key: FLINK-1515 URL: https://issues.apache.org/jira/browse/FLINK-1515 Project: Flink Issue Type: Improvement Components: Gelly Reporter: Vasia Kalavri Assignee: Martin Kiefer Currently, aggregators and broadcast sets cannot be accessed through Gelly's {{runVertexCentricIteration}} method. The functionality is already present in the {{VertexCentricIteration}} and we just need to expose it. This could be done like this: We create a method {{createVertexCentricIteration}}, which will return a {{VertexCentricIteration}} object and we change {{runVertexCentricIteration}} to accept this as a parameter (and return the graph after running this iteration). The user can configure the {{VertexCentricIteration}} by directly calling the public methods {{registerAggregator}}, {{setName}}, etc.
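The two-step pattern the issue proposes (create the iteration object, configure it directly, then run it) can be sketched with stand-in classes; only the method names `createVertexCentricIteration`, `runVertexCentricIteration`, `registerAggregator`, and `setName` come from the issue text — everything else here is illustrative, not Gelly's API:

```java
import java.util.ArrayList;
import java.util.List;

public class IterationConfigSketch {
    static class VertexCentricIteration {
        String name = "unnamed";
        final List<String> aggregators = new ArrayList<>();
        void setName(String n) { name = n; }
        void registerAggregator(String n) { aggregators.add(n); }
    }

    static class Graph {
        // Step 1: hand out the configurable iteration object.
        VertexCentricIteration createVertexCentricIteration() {
            return new VertexCentricIteration();
        }
        // Step 2: accept the configured object, run it, return the result.
        String runVertexCentricIteration(VertexCentricIteration it) {
            return "ran '" + it.name + "' with aggregators " + it.aggregators;
        }
    }

    public static void main(String[] args) {
        Graph g = new Graph();
        VertexCentricIteration it = g.createVertexCentricIteration();
        it.setName("connected components");
        it.registerAggregator("componentCount");
        System.out.println(g.runVertexCentricIteration(it));
        // ran 'connected components' with aggregators [componentCount]
    }
}
```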
[jira] [Created] (FLINK-1588) Load flink configuration also from classloader
Robert Metzger created FLINK-1588: - Summary: Load flink configuration also from classloader Key: FLINK-1588 URL: https://issues.apache.org/jira/browse/FLINK-1588 Project: Flink Issue Type: New Feature Reporter: Robert Metzger The GlobalConfiguration object should also check if it finds the flink-config.yaml in the classpath and load it from there. This allows users to inject configuration files in local standalone or embedded environments.
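The lookup the issue proposes (check the classpath first, then fall back to the usual config directory) can be sketched with the JDK alone; the class and method names are illustrative, and the file name is taken verbatim from the issue text:

```java
import java.io.InputStream;

public class ClasspathConfigLoader {
    // File name as written in the issue.
    static final String CONFIG_NAME = "flink-config.yaml";

    /** Returns the config as a stream if it is on the classpath, else null. */
    static InputStream fromClasspath(ClassLoader cl) {
        return cl.getResourceAsStream(CONFIG_NAME);
    }

    public static void main(String[] args) {
        InputStream in = fromClasspath(ClasspathConfigLoader.class.getClassLoader());
        // A null result means: fall back to the configured directory (e.g. FLINK_CONF_DIR).
        System.out.println(in == null ? "not on classpath, using fallback" : "loaded from classpath");
    }
}
```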
[jira] [Created] (FLINK-1589) Add option to pass Configuration to LocalExecutor
Robert Metzger created FLINK-1589: - Summary: Add option to pass Configuration to LocalExecutor Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[jira] [Assigned] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1589: - Assignee: Robert Metzger Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/427 [FLINK-1589] Add option to pass configuration to LocalExecutor Please review the changes. I'll add a test case and update the documentation later today. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1589 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/427.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #427 commit b75b4c285f4810faa5d02d638b61dc7b8e125c8d Author: Robert Metzger rmetz...@apache.org Date: 2015-02-20T11:40:41Z [FLINK-1589] Add option to pass configuration to LocalExecutor
[jira] [Updated] (FLINK-1586) Add support for iteration visualization for Streaming programs
[ https://issues.apache.org/jira/browse/FLINK-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1586: -- Component/s: Streaming Add support for iteration visualization for Streaming programs -- Key: FLINK-1586 URL: https://issues.apache.org/jira/browse/FLINK-1586 Project: Flink Issue Type: Bug Components: Streaming Reporter: Gyula Fora Priority: Minor The plan visualizer currently does not support streaming programs containing iterations. There is no visualization at all due to an exception thrown in the visualizer script.
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328841#comment-14328841 ] ASF GitHub Bot commented on FLINK-1589: --- GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/427 [FLINK-1589] Add option to pass configuration to LocalExecutor Please review the changes. I'll add a test case and update the documentation later today. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1589 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/427.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #427 commit b75b4c285f4810faa5d02d638b61dc7b8e125c8d Author: Robert Metzger rmetz...@apache.org Date: 2015-02-20T11:40:41Z [FLINK-1589] Add option to pass configuration to LocalExecutor Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[jira] [Resolved] (FLINK-1584) Spurious failure of TaskManagerFailsITCase
[ https://issues.apache.org/jira/browse/FLINK-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1584. - Resolution: Fixed Fix Version/s: 0.9 Assignee: Till Rohrmann Fixed via 52b40debad6293cc0591eabb847eaac322d174aa Spurious failure of TaskManagerFailsITCase -- Key: FLINK-1584 URL: https://issues.apache.org/jira/browse/FLINK-1584 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 The {{TaskManagerFailsITCase}} fails spuriously on Travis. The reason might be that different test cases try to access the same {{JobManager}}.
[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-75209814 Indeed :-) What does the OS load mean? It would be really awesome to show the CPU load, too. I think this is a helpful indicator.
[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328723#comment-14328723 ] ASF GitHub Bot commented on FLINK-1501: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-75209814 Indeed :-) What does the OS load mean? It would be really awesome to show the CPU load, too. I think this is a helpful indicator. Integrate metrics library and report basic metrics to JobManager web interface -- Key: FLINK-1501 URL: https://issues.apache.org/jira/browse/FLINK-1501 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: pre-apache As per mailing list, the library: https://github.com/dropwizard/metrics The goal of this task is to get the basic infrastructure in place. Subsequent issues will integrate more features into the system.
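The "OS load" being asked about likely maps to the JVM-visible system load average; a minimal JDK-only way to read it (no metrics library required) is via the platform MXBean:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class OsLoadProbe {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // 1-minute system load average, or -1.0 if the platform cannot provide it.
        double load = os.getSystemLoadAverage();
        System.out.println("processors=" + os.getAvailableProcessors() + " load=" + load);
    }
}
```

Dividing the load average by the processor count gives a rough per-core utilization figure, which is one way a web interface could present "CPU load".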
[jira] [Resolved] (FLINK-1556) JobClient does not wait until a job failed completely if submission exception
[ https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1556. - Resolution: Fixed Fix Version/s: 0.9 Re-fixed in 4ff91cd7cd3e05637d70a56cfc6f5a4ba2f2501a JobClient does not wait until a job failed completely if submission exception - Key: FLINK-1556 URL: https://issues.apache.org/jira/browse/FLINK-1556 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 If an exception occurs during job submission, the {{JobClient}} receives a {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}} terminates itself and returns the error to the {{Client}}. This indicates to the user that the job has completely failed, which is not necessarily true. If the user submits another job directly after such a failure, it might be the case that not all slots of the formerly failed job have been returned. This can lead to a {{NoRessourceAvailableException}}. We can solve this problem by waiting for the completion of the job failure in the {{JobClient}}.
[jira] [Resolved] (FLINK-1451) Enable parallel execution of streaming file sources
[ https://issues.apache.org/jira/browse/FLINK-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora resolved FLINK-1451. --- Resolution: Fixed Assignee: Gyula Fora (was: Márton Balassi) https://github.com/apache/flink/commit/c56e3f10b27e1e5be38b8a731f330891b190a268 Enable parallel execution of streaming file sources --- Key: FLINK-1451 URL: https://issues.apache.org/jira/browse/FLINK-1451 Project: Flink Issue Type: Task Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Gyula Fora Currently the streaming FileSourceFunction does not support parallelism greater than 1 as it does not implement the ParallelSourceFunction interface. Usage with distributed filesystems should be checked. In relation to this issue, a possible parallel solution for the FileMonitoringFunction should also be considered.
[jira] [Commented] (FLINK-1421) Implement a SAMOA Adapter for Flink Streaming
[ https://issues.apache.org/jira/browse/FLINK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328882#comment-14328882 ] Paris Carbone commented on FLINK-1421: -- Thanks for all the great feedback everyone! I believe the adapter is ready for a PR! :) The current Flink-Samoa adapter runs well for all existing Tasks in local and remote mode. The only external requirement, at least from the samoa executable script, is to have the Flink CLI and its dependencies installed and exported into a FLINK_HOME variable. [~StephanEwen] We should address the proper serialisation in the next patch. [~azaroth] We have rebased our current branch [1] with the incubator-samoa repository master and we also created a JIRA for the PR [2]. Do you think it is ok to do the PR now, or should we wait for some upcoming refactoring that affects the adapters? [1] https://github.com/senorcarbone/samoa/tree/flink-integration [2] https://issues.apache.org/jira/browse/SAMOA-16 Implement a SAMOA Adapter for Flink Streaming - Key: FLINK-1421 URL: https://issues.apache.org/jira/browse/FLINK-1421 Project: Flink Issue Type: New Feature Components: Streaming Reporter: Paris Carbone Assignee: Paris Carbone Original Estimate: 336h Remaining Estimate: 336h Yahoo's Samoa is an experimental incremental machine learning library that builds on an abstract compositional data streaming model to write streaming algorithms. The task is to provide an adapter from SAMOA topologies to Flink-streaming job graphs in order to support Flink as a backend engine for SAMOA tasks. A startup guide can be viewed here: https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub The main working branch of the adapter: https://github.com/senorcarbone/samoa/tree/flink-integration
[jira] [Assigned] (FLINK-1555) Add utility to log the serializers of composite types
[ https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1555: - Assignee: Robert Metzger Add utility to log the serializers of composite types - Key: FLINK-1555 URL: https://issues.apache.org/jira/browse/FLINK-1555 Project: Flink Issue Type: Improvement Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Users affected by poor performance might want to understand how Flink is serializing their data. Therefore, it would be cool to have a utility which logs the serializers, like this: {{SerializerUtils.getSerializers(TypeInformation<POJO> t);}} to get
{code}
PojoSerializer
 TupleSerializer
 IntSer
 DateSer
 GenericTypeSer(java.sql.Date)
 PojoSerializer
 GenericTypeSer(HashMap)
{code}
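A toy analogue of the proposed utility can be built with plain reflection: walk a POJO's fields recursively and record a nested type description, the way a PojoSerializer nests field serializers. All names here are illustrative; this is not Flink's serializer machinery:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class TypeDescriber {
    /** Appends one indented line per type, recursing into user-defined types. */
    static void describe(Class<?> clazz, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) line.append("  ");
        out.add(line.append(clazz.getSimpleName()).toString());
        // Stop at primitives and JDK types, mirroring GenericTypeSer leaves.
        if (clazz.isPrimitive() || clazz.getName().startsWith("java.")) {
            return;
        }
        // Field order usually follows declaration order but is not guaranteed.
        for (Field f : clazz.getDeclaredFields()) {
            describe(f.getType(), depth + 1, out);
        }
    }

    static class Inner { int id; java.util.Date created; }
    static class Outer { Inner inner; String name; }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        describe(Outer.class, 0, out);
        out.forEach(System.out::println);
    }
}
```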
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328963#comment-14328963 ] ASF GitHub Bot commented on FLINK-1589: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-75244751 I've added documentation and tests to the change. Let's see if Travis gives us a green light. Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-75244751 I've added documentation and tests to the change. Let's see if Travis gives us a green light.
[jira] [Created] (FLINK-1590) Log environment information also in YARN mode
Robert Metzger created FLINK-1590: - Summary: Log environment information also in YARN mode Key: FLINK-1590 URL: https://issues.apache.org/jira/browse/FLINK-1590 Project: Flink Issue Type: Improvement Reporter: Robert Metzger Priority: Minor
[jira] [Created] (FLINK-1591) Remove window merge before flatten as an optimization
Gyula Fora created FLINK-1591: - Summary: Remove window merge before flatten as an optimization Key: FLINK-1591 URL: https://issues.apache.org/jira/browse/FLINK-1591 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora After a Window Reduce or Map transformation there is always a merge step when the transformation is parallel or grouped. This merge step should be removed when the windowing operator is followed by flatten, to avoid unnecessary bottlenecks in the program. This feature should be added as an optimization step to the WindowingOptimizer class.
[jira] [Commented] (FLINK-1514) [Gelly] Add a Gather-Sum-Apply iteration method
[ https://issues.apache.org/jira/browse/FLINK-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328998#comment-14328998 ] ASF GitHub Bot commented on FLINK-1514: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/408#issuecomment-75249793 Hi @balidani! Thanks a lot for this PR! Gather-Sum-Apply will be an awesome addition to Gelly ^^ Here come my comments:
- There's no need for Gather, Sum and Apply functions to implement MapFunction, FlatJoinFunction, etc., since they are wrapped inside those in the GatherSumApplyIteration class. Actually, I would use the Rich* versions instead, so that we can have access to open() and close() methods. You can look at how `VertexCentricIteration` wraps the `VertexUpdateFunction` inside a `RichCoGroupFunction`.
- With this small change above, we could also allow access to aggregators and broadcast sets. This must be straight-forward to add (again, look at `VertexCentricIteration` for hints). We should also add `getName()`, `setName()`, `getParallelism()`, `setParallelism()` methods to `GatherSumApplyIteration`.
- Finally, it'd be great if you could add the tests you have as examples, i.e. one for Greedy Graph Coloring and one for GSAShortestPaths.
Let me know if you have any doubts! Thanks again :sunny:
[Gelly] Add a Gather-Sum-Apply iteration method --- Key: FLINK-1514 URL: https://issues.apache.org/jira/browse/FLINK-1514 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Daniel Bali This will be a method that implements the GAS computation model, but without the scatter step. The phases can be mapped into the following steps inside a delta iteration: gather: a map on each <srcVertex, edge, trgVertex> triple that produces a partial value; sum: a reduce that combines the partial values; apply: a join with the vertex set to update the vertex values using the results of sum and the previous state.
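The gather/sum/apply phases listed in the issue can be sketched for single-source shortest paths with the JDK alone; this toy superstep is illustrative and is not Gelly's API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class GsaSketch {
    /** One GSA superstep over edges {src, dst, weight} and a distance map. */
    static Map<Integer, Integer> superstep(Map<Integer, Integer> dist, int[][] edges) {
        // gather: per (srcVertex, edge, trgVertex) triple, a partial value;
        // sum: reduce the partial values per target vertex with min.
        Map<Integer, Integer> summed = new HashMap<>();
        for (int[] e : edges) {
            Integer d = dist.get(e[0]);
            if (d != null) summed.merge(e[1], d + e[2], Math::min);
        }
        // apply: join with the vertex set, keeping any improvement.
        Map<Integer, Integer> next = new HashMap<>(dist);
        summed.forEach((v, d) -> next.merge(v, d, Math::min));
        return next;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> dist = new HashMap<>();
        dist.put(1, 0); // source vertex
        int[][] edges = {{1, 2, 5}, {1, 3, 2}, {3, 2, 1}};
        System.out.println(new TreeMap<>(superstep(dist, edges))); // {1=0, 2=5, 3=2}
    }
}
```

Running the superstep until no distance changes would complete the delta-iteration picture sketched in the issue.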
[jira] [Created] (FLINK-1592) Refactor StreamGraph to store vertex IDs as Integers instead of Strings
Gyula Fora created FLINK-1592: - Summary: Refactor StreamGraph to store vertex IDs as Integers instead of Strings Key: FLINK-1592 URL: https://issues.apache.org/jira/browse/FLINK-1592 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Priority: Minor The vertex IDs are currently stored as Strings reflecting some deprecated usage. It should be refactored to use Integers instead.
[GitHub] flink pull request: [FLINK-1444][api-extending] Add support for sp...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/379
[jira] [Updated] (FLINK-1592) Refactor StreamGraph to store vertex IDs as Integers instead of Strings
[ https://issues.apache.org/jira/browse/FLINK-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1592: -- Labels: starter (was: ) Refactor StreamGraph to store vertex IDs as Integers instead of Strings --- Key: FLINK-1592 URL: https://issues.apache.org/jira/browse/FLINK-1592 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Priority: Minor Labels: starter The vertex IDs are currently stored as Strings reflecting some deprecated usage. It should be refactored to use Integers instead.
[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/428#issuecomment-75322938 We have rebased and tried it; it works fine for us, thank you. I will add several minor changes afterwards, but that's just for our convenience. So from my side: +1
[GitHub] flink pull request: [FLINK-1582][streaming]Allow SocketStream to r...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/424#issuecomment-75330500 I was thinking of a limit as high as a minute. But after reconsidering it, I am for your argument: trying to establish the connection every couple of seconds should not be a big overhead, so my whole idea might be overkill. Let us have your version for now and generalize it on request.
[jira] [Commented] (FLINK-1582) SocketStream gets stuck when socket closes
[ https://issues.apache.org/jira/browse/FLINK-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329670#comment-14329670 ] ASF GitHub Bot commented on FLINK-1582: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/424#issuecomment-75330500 I was thinking of a limit as high as a minute. But after reconsidering it, I am for your argument: trying to establish the connection every couple of seconds should not be a big overhead, so my whole idea might be overkill. Let us have your version for now and generalize it on request. SocketStream gets stuck when socket closes -- Key: FLINK-1582 URL: https://issues.apache.org/jira/browse/FLINK-1582 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.8, 0.9 Reporter: Márton Balassi Labels: starter When the server side of the socket closes, the socket stream reader does not terminate. When the socket is reinitiated, it does not reconnect; it just gets stuck. It would be nice to add options for how the reader should behave when the socket is down: terminate immediately (good for testing and examples) or wait a specified time - possibly forever.
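The reconnect policy discussed above (retry the connection every couple of seconds, up to a limit or forever) can be sketched as a generic bounded-retry helper; the names, the simulated socket, and the intervals are all illustrative, not Flink's API:

```java
import java.util.concurrent.Callable;

public class RetryConnect {
    /** Retries connect() up to maxAttempts times, sleeping between attempts;
     *  rethrows the last failure once the limit is reached. */
    static <T> T retry(Callable<T> connect, int maxAttempts, long sleepMillis) throws Exception {
        Exception last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e;
                Thread.sleep(sleepMillis);
            }
        }
        throw last; // terminate after the limit, as suggested for tests/examples
    }

    public static void main(String[] args) throws Exception {
        // Simulated socket that refuses twice before "connecting".
        int[] attempts = {0};
        String conn = retry(() -> {
            if (++attempts[0] < 3) throw new java.io.IOException("connection refused");
            return "connected";
        }, 5, 1);
        System.out.println(conn + " after " + attempts[0] + " attempts");
        // connected after 3 attempts
    }
}
```

A "wait forever" mode falls out of passing Integer.MAX_VALUE for maxAttempts; "terminate immediately" is maxAttempts = 1.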
[GitHub] flink pull request: [FLINK-1461][api-extending] Add SortPartition ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/381
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329046#comment-14329046 ] ASF GitHub Bot commented on FLINK-1484: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75254417 @tillrohrmann and @StephanEwen worked on some other reliability issues. Will the changes in this PR be subsumed by the upcoming changes? If not, we should merge this. :-) JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 A JobManager restart can happen due to an uncaught exception. However, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager.
[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic
[ https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329028#comment-14329028 ] ASF GitHub Bot commented on FLINK-1505: --- GitHub user uce opened a pull request: https://github.com/apache/flink/pull/428 [FLINK-1505] Separate reader API from result consumption @gyfora, can you please rebase on this branch and verify that everything is still working as expected for you? This PR separates the reader API (record and buffer readers) from result consumption (input gate). The buffer reader was a huge component with mixed responsibilities, both as the runtime component to set up input channels for intermediate result consumption and as a lower-level user API to consume buffers/events. The separation makes it easier for users of the API (e.g. flink-streaming) to extend the handling of low-level buffers and events. Gyula's initial feedback confirmed this. In view of FLINK-1568, this PR also makes it easier to test the result consumption logic for failure scenarios. I will rebase #356 on these changes.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink flink-1505-input_gate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/428.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #428 commit db1dc5be12427664a418ce6e4fb41de39838fac0 Author: Ufuk Celebi u...@apache.org Date: 2015-02-10T14:05:44Z [FLINK-1505] [distributed runtime] Separate reader API from result consumption Separate buffer reader and channel consumption logic Key: FLINK-1505 URL: https://issues.apache.org/jira/browse/FLINK-1505 Project: Flink Issue Type: Improvement Components: Distributed Runtime Affects Versions: master Reporter: Ufuk Celebi Priority: Minor Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is bloated. There is no separation between consumption of the input channels and the buffer readers. This was not the case up until release-0.8 and has been introduced by me with intermediate results. I think this was a mistake and we should separate this again. flink-streaming is currently the heaviest user of these lower-level APIs and I have received feedback from [~gyfora] to undo this as well.
[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/428#issuecomment-75257236 Thank you! I will try it over the weekend and give you feedback. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/427#discussion_r25080967 --- Diff: flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/ExecutionEnvironmentITCase.java --- @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.test.javaApiOperators; + +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.configuration.ConfigConstants; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.test.javaApiOperators.util.CollectionDataSets; +import org.apache.flink.test.util.MultipleProgramsTestBase; +import org.junit.Assert; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.Parameterized; + +import java.util.ArrayList; +import java.util.Collection; + + +@RunWith(Parameterized.class) +public class ExecutionEnvironmentITCase extends MultipleProgramsTestBase { + + + public ExecutionEnvironmentITCase(ExecutionMode mode) { + super(mode); + } + + @Parameterized.Parameters(name = "Execution mode = {0}") + public static Collection<ExecutionMode[]> executionModes(){ + Collection<ExecutionMode[]> c = new ArrayList<ExecutionMode[]>(1); + c.add(new ExecutionMode[] {ExecutionMode.CLUSTER}); + return c; + } + + + @Test + public void testLocalEnvironmentWithConfig() throws Exception { + IllegalArgumentException e = null; + try { + Configuration conf = new Configuration(); + conf.setBoolean(ConfigConstants.FILESYSTEM_DEFAULT_OVERWRITE_KEY, true); + conf.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, "/tmp/thelikelyhoodthatthisdirectoryexisitsisreallylow"); --- End diff -- Are you sure?
[jira] [Commented] (FLINK-1461) Add sortPartition operator
[ https://issues.apache.org/jira/browse/FLINK-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329033#comment-14329033 ] ASF GitHub Bot commented on FLINK-1461: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/381 Add sortPartition operator -- Key: FLINK-1461 URL: https://issues.apache.org/jira/browse/FLINK-1461 Project: Flink Issue Type: New Feature Components: Java API, Local Runtime, Optimizer, Scala API Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor A {{sortPartition()}} operator can be used to * sort the input of a {{mapPartition()}} operator * enforce a certain sorting of the input of a given operator of a program. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
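The partition-local nature of such a sort can be illustrated without any Flink types: each partition is sorted independently, and no order holds across partitions. A minimal plain-Java sketch (names are illustrative, not the Flink API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortPartitionSketch {

    // Sorts each partition independently, mirroring what a sortPartition()
    // operator would guarantee before a mapPartition() function runs.
    static List<List<Integer>> sortPartitions(List<List<Integer>> partitions) {
        List<List<Integer>> sorted = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Integer> copy = new ArrayList<>(partition);
            Collections.sort(copy); // local sort only; no global order
            sorted.add(copy);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<List<Integer>> input = Arrays.asList(
                Arrays.asList(3, 1, 2),
                Arrays.asList(9, 7, 8));
        System.out.println(sortPartitions(input)); // [[1, 2, 3], [7, 8, 9]]
    }
}
```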
[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic
[ https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329061#comment-14329061 ] ASF GitHub Bot commented on FLINK-1505: --- Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/428#issuecomment-75257236 Thank you! I will try it over the weekend and give you feedback. Separate buffer reader and channel consumption logic Key: FLINK-1505 URL: https://issues.apache.org/jira/browse/FLINK-1505 Project: Flink Issue Type: Improvement Components: Distributed Runtime Affects Versions: master Reporter: Ufuk Celebi Priority: Minor Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is bloated. There is no separation between consumption of the input channels and the buffer readers. This was not the case up until release-0.8 and has been introduced by me with intermediate results. I think this was a mistake and we should separate this again. flink-streaming is currently the heaviest user of these lower-level APIs and I have received feedback from [~gyfora] to undo this as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1466) Add InputFormat to read HCatalog tables
[ https://issues.apache.org/jira/browse/FLINK-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1466. -- Resolution: Implemented Implemented with bed3da4a61a8637c0faa9632b8a05ccef8c5a6dc Add InputFormat to read HCatalog tables --- Key: FLINK-1466 URL: https://issues.apache.org/jira/browse/FLINK-1466 Project: Flink Issue Type: New Feature Components: Java API, Scala API Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor HCatalog is a metadata repository and InputFormat to make Hive tables accessible to other frameworks such as Pig. Adding support for HCatalog would give access to Hive managed data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...
GitHub user uce opened a pull request: https://github.com/apache/flink/pull/428 [FLINK-1505] Separate reader API from result consumption @gyfora, can you please rebase on this branch and verify that everything is still working as expected for you? This PR separates the reader API (record and buffer readers) from result consumption (input gate). The buffer reader was a huge component with mixed responsibilities both as the runtime component to set up input channels for intermediate result consumption and as a lower-level user API to consume buffers/events. The separation makes it easier for users of the API (e.g. flink-streaming) to extend the handling of low-level buffers and events. Gyula's initial feedback confirmed this. In view of FLINK-1568, this PR also makes it easier to test the result consumption logic for failure scenarios. I will rebase #356 on these changes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink flink-1505-input_gate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/428.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #428 commit db1dc5be12427664a418ce6e4fb41de39838fac0 Author: Ufuk Celebi u...@apache.org Date: 2015-02-10T14:05:44Z [FLINK-1505] [distributed runtime] Separate reader API from result consumption
[jira] [Commented] (FLINK-1466) Add InputFormat to read HCatalog tables
[ https://issues.apache.org/jira/browse/FLINK-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329032#comment-14329032 ] ASF GitHub Bot commented on FLINK-1466: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/411 Add InputFormat to read HCatalog tables --- Key: FLINK-1466 URL: https://issues.apache.org/jira/browse/FLINK-1466 Project: Flink Issue Type: New Feature Components: Java API, Scala API Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor HCatalog is a metadata repository and InputFormat to make Hive tables accessible to other frameworks such as Pig. Adding support for HCatalog would give access to Hive managed data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1444) Add data properties for data sources
[ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329031#comment-14329031 ] ASF GitHub Bot commented on FLINK-1444: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/379 Add data properties for data sources Key: FLINK-1444 URL: https://issues.apache.org/jira/browse/FLINK-1444 Project: Flink Issue Type: New Feature Components: Java API, JobManager, Optimizer Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor This issue proposes to add support for attaching data properties to data sources. These data properties are defined with respect to input splits. Possible properties are: - partitioning across splits: all elements of the same key (combination) are contained in one split - sorting / grouping with splits: elements are sorted or grouped on certain keys within a split - key uniqueness: a certain key (combination) is unique for all elements of the data source. This property is not defined wrt. input splits. The optimizer can leverage this information to generate more efficient execution plans. The InputFormat will be responsible to generate input splits such that the promised data properties are actually in place. Otherwise, the program will produce invalid results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1594) DataStreams don't support self-join
Daniel Bali created FLINK-1594: -- Summary: DataStreams don't support self-join Key: FLINK-1594 URL: https://issues.apache.org/jira/browse/FLINK-1594 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Environment: flink-0.9.0-SNAPSHOT Reporter: Daniel Bali Trying to join a DataSet with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170) ... 20 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
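The failing precondition above can be illustrated in isolation: a union reader is only meaningful over two or more inputs, and a self-join that wires the same stream to both sides can collapse to a single input. A hedged plain-Java sketch of the check, not the actual UnionBufferReader:

```java
import java.util.Arrays;
import java.util.List;

public class UnionReaderSketch {

    // Mirrors the precondition seen in the stack trace: a union reader
    // needs at least two distinct inputs. Illustrative stand-in only.
    static int checkUnionInputs(List<String> readers) {
        if (readers.size() < 2) {
            throw new IllegalArgumentException(
                "Union buffer reader must be initialized with at least two individual buffer readers");
        }
        return readers.size();
    }

    public static void main(String[] args) {
        // A self-join that collapses both sides into one reader would
        // trip the check; two distinct inputs pass it.
        System.out.println(checkUnionInputs(Arrays.asList("left", "right"))); // 2
    }
}
```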
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329117#comment-14329117 ] ASF GitHub Bot commented on FLINK-1589: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/427#discussion_r25080967 --- Diff: flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/ExecutionEnvironmentITCase.java --- @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.test.javaApiOperators; + +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.configuration.ConfigConstants; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.test.javaApiOperators.util.CollectionDataSets; +import org.apache.flink.test.util.MultipleProgramsTestBase; +import org.junit.Assert; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.Parameterized; + +import java.util.ArrayList; +import java.util.Collection; + + +@RunWith(Parameterized.class) +public class ExecutionEnvironmentITCase extends MultipleProgramsTestBase { + + + public ExecutionEnvironmentITCase(ExecutionMode mode) { + super(mode); + } + + @Parameterized.Parameters(name = "Execution mode = {0}") + public static Collection<ExecutionMode[]> executionModes(){ + Collection<ExecutionMode[]> c = new ArrayList<ExecutionMode[]>(1); + c.add(new ExecutionMode[] {ExecutionMode.CLUSTER}); + return c; + } + + + @Test + public void testLocalEnvironmentWithConfig() throws Exception { + IllegalArgumentException e = null; + try { + Configuration conf = new Configuration(); + conf.setBoolean(ConfigConstants.FILESYSTEM_DEFAULT_OVERWRITE_KEY, true); + conf.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, "/tmp/thelikelyhoodthatthisdirectoryexisitsisreallylow"); --- End diff -- Are you sure? Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329056#comment-14329056 ] ASF GitHub Bot commented on FLINK-1484: --- Github user tillrohrmann closed the pull request at: https://github.com/apache/flink/pull/368 JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 In case of a JobManager restart, which can happen due to an uncaught exception, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user tillrohrmann closed the pull request at: https://github.com/apache/flink/pull/368
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329055#comment-14329055 ] ASF GitHub Bot commented on FLINK-1484: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75256160 This PR has been merged as part of PR #423 JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 In case of a JobManager restart, which can happen due to an uncaught exception, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75256160 This PR has been merged as part of PR #423
[jira] [Updated] (FLINK-1594) DataStreams don't support self-join
[ https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Bali updated FLINK-1594: --- Description: Trying to join a DataSet with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170) ... 20 more {noformat} was: Trying to join a DataSets with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at
[jira] [Resolved] (FLINK-1444) Add data properties for data sources
[ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1444. -- Resolution: Implemented Implemented with f0a28bf5345084a0a43df16021e60078e322e087 Add data properties for data sources Key: FLINK-1444 URL: https://issues.apache.org/jira/browse/FLINK-1444 Project: Flink Issue Type: New Feature Components: Java API, JobManager, Optimizer Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor This issue proposes to add support for attaching data properties to data sources. These data properties are defined with respect to input splits. Possible properties are: - partitioning across splits: all elements of the same key (combination) are contained in one split - sorting / grouping within splits: elements are sorted or grouped on certain keys within a split - key uniqueness: a certain key (combination) is unique for all elements of the data source. This property is not defined wrt. input splits. The optimizer can leverage this information to generate more efficient execution plans. The InputFormat will be responsible to generate input splits such that the promised data properties are actually in place. Otherwise, the program will produce invalid results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
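The first of these properties (all elements of the same key contained in one split) reduces to a simple invariant that an InputFormat must uphold. A hedged plain-Java sketch of checking it, with splits modeled as lists of keys; the method name is illustrative, not part of Flink:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitPropertySketch {

    // Returns true if no key occurs in more than one split, i.e. the
    // source really is partitioned by key across its input splits.
    static boolean isPartitionedByKey(List<List<String>> splits) {
        Map<String, Integer> firstSplitOfKey = new HashMap<>();
        for (int i = 0; i < splits.size(); i++) {
            for (String key : splits.get(i)) {
                Integer seen = firstSplitOfKey.putIfAbsent(key, i);
                if (seen != null && seen != i) {
                    return false; // same key appears in two splits
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPartitionedByKey(Arrays.asList(
                Arrays.asList("a", "a"), Arrays.asList("b")))); // true
        System.out.println(isPartitionedByKey(Arrays.asList(
                Arrays.asList("a"), Arrays.asList("a")))); // false
    }
}
```

If the invariant is violated while the property is declared, the optimizer's assumptions no longer hold, which is exactly the "invalid results" caveat in the issue description.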
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75254417 @tillrohrmann and @StephanEwen worked on some other reliability issues. Will the changes in this PR be subsumed by the upcoming changes? If not, we should merge this. :-)
[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/413#discussion_r25086186 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/GenericTypeInfo.java --- @@ -66,6 +71,9 @@ public boolean isKeyType() { @Override public TypeSerializer<T> createSerializer() { + if(useAvroSerializer) { + return new AvroSerializer<T>(this.typeClass); + } return new KryoSerializer<T>(this.typeClass); --- End diff -- If we enclose the ```return new KryoSerializer``` in an else block, then the control flow looks nicer imho.
[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329218#comment-14329218 ] Andra Lungu commented on FLINK-1587: If there are no time constraints, I will take it :) coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt I am receiving the following exception when running a simple job that extracts outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally. Will try that in the next days. {noformat} 02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED java.util.NoSuchElementException at java.util.Collections$EmptyIterator.next(Collections.java:3006) at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665) at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:745) 02/20/2015 02:27:02: Job execution switched to status FAILING ... {noformat} The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup Iterator may be empty? As far as I see there is a bug somewhere. I'd like to write a test case for this to reproduce the issue. Where can I put such a test? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
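The fix suggested in the discussion — check `hasNext()` before calling `next()` and fail with a descriptive exception instead of the opaque NoSuchElementException — can be sketched as follows. Names and the exact message are illustrative, not the final Gelly code:

```java
import java.util.Collections;
import java.util.Iterator;

public class DegreeCheckSketch {

    // Guard for the coGroup used in the degree methods: the vertex side
    // may be empty when the edge set references an ID with no matching
    // vertex; calling next() directly would throw the opaque
    // NoSuchElementException from the stack trace above.
    static String firstVertexOrFail(Iterator<String> vertices) {
        if (!vertices.hasNext()) {
            throw new IllegalStateException(
                "The edge set contains a vertex ID that does not exist in the vertex set");
        }
        return vertices.next();
    }

    public static void main(String[] args) {
        System.out.println(firstVertexOrFail(
                Collections.singletonList("v1").iterator())); // v1
    }
}
```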
[jira] [Commented] (FLINK-1515) [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329204#comment-14329204 ] ASF GitHub Bot commented on FLINK-1515: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75279454 I'm currently working on fixing this problem. You can ignore it for the moment. On Fri, Feb 20, 2015 at 11:58 AM, Vasia Kalavri notificati...@github.com wrote: Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout Is this fixed by #422 https://github.com/apache/flink/pull/422? Shall I proceed? Thanks! — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/402#issuecomment-75220973. [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration --- Key: FLINK-1515 URL: https://issues.apache.org/jira/browse/FLINK-1515 Project: Flink Issue Type: Improvement Components: Gelly Reporter: Vasia Kalavri Assignee: Martin Kiefer Currently, aggregators and broadcast sets cannot be accessed through Gelly's {{runVertexCentricIteration}} method. The functionality is already present in the {{VertexCentricIteration}} and we just need to expose it. This could be done like this: We create a method {{createVertexCentricIteration}}, which will return a {{VertexCentricIteration}} object and we change {{runVertexCentricIteration}} to accept this as a parameter (and return the graph after running this iteration). 
The user can configure the {{VertexCentricIteration}} by directly calling the public methods {{registerAggregator}}, {{setName}}, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
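The create-configure-run shape proposed here can be sketched with plain-Java stand-ins. All names below are illustrative approximations of the proposal, not the final Gelly signatures:

```java
import java.util.ArrayList;
import java.util.List;

public class IterationConfigSketch {

    // Minimal stand-in for the proposed API shape: create the iteration,
    // configure it through public methods, then hand it back to run.
    static class VertexCentricIteration {
        String name;
        final List<String> aggregators = new ArrayList<>();

        void setName(String name) { this.name = name; }
        void registerAggregator(String id) { aggregators.add(id); }
    }

    static VertexCentricIteration createVertexCentricIteration() {
        return new VertexCentricIteration();
    }

    static String runVertexCentricIteration(VertexCentricIteration it) {
        // In the proposal this would run the iteration and return the
        // resulting graph; here we just report the configuration.
        return "ran '" + it.name + "' with " + it.aggregators.size() + " aggregator(s)";
    }

    public static void main(String[] args) {
        VertexCentricIteration it = createVertexCentricIteration();
        it.setName("connected components");
        it.registerAggregator("component-count");
        System.out.println(runVertexCentricIteration(it));
    }
}
```

The design point is that configuration (aggregators, broadcast sets, name) happens on the iteration object between creation and execution, so `runVertexCentricIteration` itself needs no extra parameters.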
[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes
[ https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329215#comment-14329215 ] ASF GitHub Bot commented on FLINK-1567: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/413#discussion_r25086186 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/GenericTypeInfo.java --- @@ -66,6 +71,9 @@ public boolean isKeyType() { @Override public TypeSerializer<T> createSerializer() { + if(useAvroSerializer) { + return new AvroSerializer<T>(this.typeClass); + } return new KryoSerializer<T>(this.typeClass); --- End diff -- If we enclose the ```return new KryoSerializer``` in an else block, then the control flow looks nicer imho. Add option to switch between Avro and Kryo serialization for GenericTypes - Key: FLINK-1567 URL: https://issues.apache.org/jira/browse/FLINK-1567 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Allow users to switch the underlying serializer for GenericTypes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
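The shape of the reviewed change, with the reviewer's suggested explicit else branch, can be sketched like this. These are self-contained toy types (`*Sketch` names are made up, not Flink's `GenericTypeInfo` or its serializers); only the flag-based selection pattern is the point.

```java
// Toy serializer interface; real code returns Flink TypeSerializer<T>.
interface SerializerSketch<T> { String name(); }

class AvroSerializerSketch<T> implements SerializerSketch<T> {
    public String name() { return "avro"; }
}

class KryoSerializerSketch<T> implements SerializerSketch<T> {
    public String name() { return "kryo"; }
}

class GenericTypeInfoSketch<T> {
    private final boolean useAvroSerializer;

    GenericTypeInfoSketch(boolean useAvroSerializer) {
        this.useAvroSerializer = useAvroSerializer;
    }

    // The switch under review: both outcomes in an explicit if/else,
    // as suggested in the review comment.
    public SerializerSketch<T> createSerializer() {
        if (useAvroSerializer) {
            return new AvroSerializerSketch<T>();
        } else {
            return new KryoSerializerSketch<T>();
        }
    }
}
```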
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-75271706 I think we should rework the test case to check that the configuration is properly passed to the system. Right now the exception is thrown in ```ds.writeAsText(null)``` because we pass ```null```. I'd propose something like @StephanEwen did in the PR #410. We set the number of slots in the configuration and the job to ```PARALLELISM_AUTO_MAX```. With the special input format which produces only a single element per split, we can count the number of parallel tasks, given that every task receives only one input split. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1555] Add serializer hierarchy debug ut...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/415#discussion_r25085047 --- Diff: flink-core/src/main/java/org/apache/flink/api/common/typeutils/CompositeType.java --- @@ -132,7 +133,31 @@ public CompositeType(Class<T> typeClass) { } return getNewComparator(config); } - + + // + + /** +* Debugging utility to understand the hierarchy of serializers created by the Java API. +*/ + public static <T> String getSerializerTree(TypeInformation<T> ti) { + return getSerializerTree(ti, 0); + } + + private static <T> String getSerializerTree(TypeInformation<T> ti, int indent) { + String ret = ""; + if(ti instanceof CompositeType) { + ret += ti.toString()+"\n"; --- End diff -- Should the ```toString``` method not already print the whole tree? Thus, the information would be redundant.
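The indentation-based recursion in the reviewed utility can be sketched in a self-contained form. This uses a hypothetical `TypeNode` toy type instead of Flink's `TypeInformation`/`CompositeType`; the recursion-with-deeper-indent structure is what the debug utility builds.

```java
import java.util.Arrays;
import java.util.List;

// Toy tree node standing in for TypeInformation; composite nodes have children.
class TypeNode {
    final String name;
    final List<TypeNode> children;

    TypeNode(String name, TypeNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

class SerializerTreePrinter {
    // Public entry point mirrors getSerializerTree(ti): start at indent 0.
    static String getSerializerTree(TypeNode node) {
        return getSerializerTree(node, 0);
    }

    // Each level prints its own name, then recurses into children two
    // spaces deeper, yielding a readable hierarchy.
    private static String getSerializerTree(TypeNode node, int indent) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < indent; i++) sb.append(' ');
        sb.append(node.name).append('\n');
        for (TypeNode child : node.children) {
            sb.append(getSerializerTree(child, indent + 2));
        }
        return sb.toString();
    }
}
```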
[jira] [Assigned] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andra Lungu reassigned FLINK-1587: -- Assignee: Andra Lungu coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt Assignee: Andra Lungu I am receiving the following exception when running a simple job that extracts the outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally. Will try that in the next days. {noformat} 02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED java.util.NoSuchElementException at java.util.Collections$EmptyIterator.next(Collections.java:3006) at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665) at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:745) 02/20/2015 02:27:02: Job execution switched to status FAILING ... {noformat} The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup Iterator may be empty? As far as I see there is a bug somewhere. I'd like to write a test case for this to reproduce the issue. Where can I put such a test? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
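The fix discussed on the list, checking for the empty group and throwing a descriptive exception instead of letting `iterator.next()` fail, can be sketched in plain Java. This is a hedged illustration only: `NeighborCountSketch` and `countOrFail` are made-up names, not the actual `CountNeighborsCoGroup` code.

```java
import java.util.Iterator;

class NeighborCountSketch {
    // Guard the vertex side of the coGroup: an empty vertex group means
    // the edge set references a vertex id that does not exist, so fail
    // with a descriptive message rather than a bare NoSuchElementException.
    static long countOrFail(Iterator<Long> vertexGroup, Iterator<Long> neighborIds) {
        if (!vertexGroup.hasNext()) {
            throw new IllegalStateException(
                "coGroup received an empty vertex group; "
                + "the edge set may reference a vertex id that does not exist");
        }
        vertexGroup.next(); // the (single) vertex record
        long count = 0;
        while (neighborIds.hasNext()) {
            neighborIds.next();
            count++;
        }
        return count;
    }
}
```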
[GitHub] flink pull request: [FLINK-1515]Splitted runVertexCentricIteration...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75279454 I'm currently working on fixing this problem. You can ignore it for the moment. On Fri, Feb 20, 2015 at 11:58 AM, Vasia Kalavri notificati...@github.com wrote: Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout Is this fixed by #422 https://github.com/apache/flink/pull/422? Shall I proceed? Thanks! — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/402#issuecomment-75220973.
[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes
[ https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329219#comment-14329219 ] ASF GitHub Bot commented on FLINK-1567: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/413#discussion_r25086445 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/AvroSerializer.java --- @@ -91,7 +98,22 @@ public T createInstance() { @Override public T copy(T from) { checkKryoInitialized(); - return this.kryo.copy(from); + try { + return this.kryo.copy(from); + } catch(KryoException ke) { + // kryo was unable to copy it, so we do it through serialization: + ByteArrayOutputStream baout = new ByteArrayOutputStream(); + Output output = new Output(baout); + + kryo.writeObject(output, from); + + output.close(); + + ByteArrayInputStream bain = new ByteArrayInputStream(baout.toByteArray()); + Input input = new Input(bain); + + return (T)kryo.readObject(input, from.getClass()); --- End diff -- Which cases do we cover with this method? Add option to switch between Avro and Kryo serialization for GenericTypes - Key: FLINK-1567 URL: https://issues.apache.org/jira/browse/FLINK-1567 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Allow users to switch the underlying serializer for GenericTypes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
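The copy-through-serialization fallback under review can be illustrated with a self-contained sketch. Note the substitution: this uses plain `java.io` object serialization as a stand-in, whereas the reviewed code round-trips through Kryo's `Output`/`Input`; only the round-trip-as-deep-copy technique is the same.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CopyFallbackSketch {
    // When a direct copy is impossible, serialize the object to a byte
    // array and deserialize it again, producing an independent deep copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyViaSerialization(T from) {
        try {
            ByteArrayOutputStream baout = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(baout)) {
                out.writeObject(from);
            }
            ByteArrayInputStream bain = new ByteArrayInputStream(baout.toByteArray());
            try (ObjectInputStream in = new ObjectInputStream(bain)) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException("copy through serialization failed", e);
        }
    }
}
```

The reviewer's question ("which cases do we cover?") targets exactly the objects a direct copy cannot handle; the round-trip works for anything the serializer itself supports.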
[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/413#issuecomment-75283551 LGTM besides of my small comments.
[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes
[ https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329227#comment-14329227 ] ASF GitHub Bot commented on FLINK-1567: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/413#issuecomment-75283551 LGTM besides of my small comments. Add option to switch between Avro and Kryo serialization for GenericTypes - Key: FLINK-1567 URL: https://issues.apache.org/jira/browse/FLINK-1567 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Allow users to switch the underlying serializer for GenericTypes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1595) Add a complex integration test for Streaming API
Gyula Fora created FLINK-1595: - Summary: Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora The streaming tests currently lack a sophisticated integration test that would test many API features at once. This should include different merging, partitioning, grouping, aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect by unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1595) Add a complex integration test for Streaming API
[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1595: -- Labels: Starter (was: ) Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Labels: Starter The streaming tests currently lack a sophisticated integration test that would test many API features at once. This should include different merging, partitioning, grouping, aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect by unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1594) DataStreams don't support self-join
[ https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1594: -- Description: Trying to window-join a DataStream with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.init(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:170) ... 20 more {noformat} was: Trying to join a DataSet with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at
[jira] [Commented] (FLINK-1594) DataStreams don't support self-join
[ https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329268#comment-14329268 ] Gyula Fora commented on FLINK-1594: --- We will have to take a look at how self joins are handled in the batch runtime. DataStreams don't support self-join --- Key: FLINK-1594 URL: https://issues.apache.org/jira/browse/FLINK-1594 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Environment: flink-0.9.0-SNAPSHOT Reporter: Daniel Bali Trying to window-join a DataStream with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at 
akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.init(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:170) ... 20 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...
Github user andralungu closed the pull request at: https://github.com/apache/flink/pull/414
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329381#comment-14329381 ] ASF GitHub Bot commented on FLINK-1522: --- GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/429 [FLINK-1522][gelly] Added test for SSSP Example You can merge this pull request into a Git repository by running: $ git pull https://github.com/andralungu/flink tidySSSP Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/429.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #429 commit 93cb052c4c432ccd28b008b2436e00d2382ea0ba Author: andralungu lungu.an...@gmail.com Date: 2015-02-20T19:11:42Z [FLINK-1522][gelly] Added test for SSSP Example Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329379#comment-14329379 ] ASF GitHub Bot commented on FLINK-1522: --- Github user andralungu closed the pull request at: https://github.com/apache/flink/pull/414 Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75310637 Btw, you can update a PR by pushing into your remote branch and don't need to close and open a new PR. If you want to change previous commits (including commit message) or rebase the PR, you can do a force push. However, when doing a force push all code comments get lost, so you should avoid that if somebody reviewed the code and made comments.
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329479#comment-14329479 ] ASF GitHub Bot commented on FLINK-1522: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75310637 Btw, you can update a PR by pushing into your remote branch and don't need to close and open a new PR. If you want to change previous commits (including commit message) or rebase the PR, you can do a force push. However, when doing a force push all code comments get lost, so you should avoid that if somebody reviewed the code and made comments. Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75312569 Thank you for the tip! I am still learning the dirty insights of Git :) Next time I will just update the PR.
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329497#comment-14329497 ] ASF GitHub Bot commented on FLINK-1522: --- Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75312569 Thank you for the tip! I am still learning the dirty insights of Git :) Next time I will just update the PR. Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)