[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-05-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/581 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabl

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-05-03 Thread mbalassi
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-98522518 Looks good, merging.

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-05-03 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-98457842 OK, I moved it to contrib.streaming.

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-05-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-98455786 Maybe under flink-contrib/streaming at least. Personally I think we should include this in the core later. After we merge and test this thoroughly I will work on integratin

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-05-02 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-98402525 Thanks! I added the integration test. @gyfora, I can't decide where this should be placed. Originally, collect() was a method of DataStream, and then putting it in c

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-30 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-97956542 I think this looks good now! I think this needs a test (an integration test); otherwise it will probably get broken by some change soon. Starting a `ForkableFlin

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-28 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-97125728 I tried it on EC2, and it worked properly when I submitted the job from the command line. I cannot submit to EC2 using the remote environment, but that's probably a firewall

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-28 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-97051927 I tested it from the IDE and by submitting remotely to the local cluster; it seems to work properly. Later today I will try running it on EC2. It is actually a very neat featu

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-26 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-96373660 I did the small change suggested by Gyula.

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-25 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-96220876 I made the modification to use NetUtils.findConnectingAddress. I tested it in both local and remote environments. I also tested the scenario that you mentioned where I laun

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-23 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-95528015 `NetUtils.findConnectingAddress` is a useful util, when you know that the endpoint is up already. If you can make the assumption that the master is running already you

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-23 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-95523580 OK, I see your point. I am thinking about using NetUtils.findConnectingAddress to determine which interface is used for the communication with the cluster. For this, I woul
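
The idea discussed here, finding out which local interface is used to reach an endpoint that is already running, can be illustrated with plain JDK sockets. The following is only a sketch under that assumption; it is not the implementation of `NetUtils.findConnectingAddress`, and the class and method names (`ConnectingAddress`, `localAddressTowards`) are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Flink: asks the OS which local address
// it picks when connecting to a target that is already listening.
class ConnectingAddress {

    static InetAddress localAddressTowards(InetSocketAddress target, int timeoutMillis)
            throws IOException {
        try (Socket socket = new Socket()) {
            // Fails with an IOException if the target is not up yet,
            // which is why the endpoint must already be running.
            socket.connect(target, timeoutMillis);
            // The local end of the connection is the address of the interface in use.
            return socket.getLocalAddress();
        }
    }
}
```

Unlike `InetAddress.getLocalHost()`, this asks the operating system's routing for the address actually used on the connection, at the cost of requiring a reachable, listening endpoint.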

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-21 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-94911176 This looks much better. Not being tightly integrated into the DataStream makes it easier to maintain. The `InetAddress.getLocalHost().getHostAddress()` problem

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-21 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-94904085 I updated the pull request as per the above points.

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-19 Thread ggevay
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-94310035 Thank you for your comments. I am very sorry for not replying earlier, but I was extremely busy this week with other things. I will try to address the issues that you menti

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-17 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-93991530 What about reworking this as a library function and adding it to flink-contrib? Make this a special streaming sink and a local util that is used in the program that recei

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-17 Thread mbalassi
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-93988704 @ggevay this PR has been a bit quiet lately, what do you think about the comments?

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-08 Thread mbalassi
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-91082306 Thanks for picking up the issue, @ggevay. I would like to add to the list Stephan mentioned: I personally prefer the name collect for the method, it can still ret

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-08 Thread mbalassi
Github user mbalassi commented on a diff in the pull request: https://github.com/apache/flink/pull/581#discussion_r28028612 --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/datastream/DataStream.java --- @@ -1165,6 +1168,7 @

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-08 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-90987769 It is an interesting idea to collect back a data stream. This solution here has, however, quite a few limitations and implications (I assume it was only locally tested

[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable

2015-04-08 Thread ggevay
GitHub user ggevay opened a pull request: https://github.com/apache/flink/pull/581 [FLINK-1670] Made DataStream iterable I created a DataStreamIterator class, and added an iterator() method to DataStream, which returns an instance of it. The iterator creates a TCP server on which i
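
The general pattern described in this message, an iterator that opens a TCP server and yields whatever a sender writes to it, can be sketched with the JDK alone. This is an illustrative assumption about the approach, not the `DataStreamIterator` from the pull request: the class name `SocketLineIterator`, the line-per-element text format, and the `send` helper standing in for the sink are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: an Iterator fed by one TCP connection, one text line per element.
class SocketLineIterator implements Iterator<String>, AutoCloseable {
    private final ServerSocket server;
    private BufferedReader reader; // created lazily once the sender connects
    private String next;           // one-element look-ahead buffer
    private boolean finished;

    SocketLineIterator() throws IOException {
        // Port 0 lets the OS pick a free port; loopback only, to keep the sketch local.
        server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
    }

    int getPort() {
        return server.getLocalPort();
    }

    @Override
    public boolean hasNext() {
        if (next != null) return true;
        if (finished) return false;
        try {
            if (reader == null) {
                Socket socket = server.accept(); // blocks until the sender connects
                reader = new BufferedReader(
                        new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            }
            next = reader.readLine(); // null means the sender closed the connection
            finished = (next == null);
            return !finished;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String result = next;
        next = null;
        return result;
    }

    @Override
    public void close() throws IOException {
        if (reader != null) reader.close();
        server.close();
    }

    // Stand-in for the sink side: connects to the iterator and writes each element as a line.
    static void send(int port, String... elements) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            for (String element : elements) {
                out.println(element);
            }
        }
    }
}
```

In a real job the sender would run on the cluster and would need the client's reachable address rather than loopback, and elements would need a proper serializer instead of text lines.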