Re: Could not build up connection to JobManager

2015-02-27 Thread Dulaj Viduranga
Here is the taskmanager log when I tried taskmanager.sh start flink-Vidura-taskmanager-localhost.log https://gist.github.com/anonymous/aef5a0bf8722feee9b97#file-flink-vidura-taskmanager-localhost-log On Feb 27, 2015, at 4:12 PM, Till Rohrmann trohrm...@apache.org wrote: It depends on how

Thoughts About Object Reuse and Collection Execution

2015-02-27 Thread Aljoscha Krettek
Hello Nation of Flink, while figuring out this bug: https://issues.apache.org/jira/browse/FLINK-1569 I came upon some difficulties. The problem is that the KeyExtractorMappers always return the same tuple. This is problematic, since Collection Execution does simply store the returned values in a

Re: [DISCUSS] Dedicated streaming mode and start scripts

2015-02-27 Thread Márton Balassi
Today we had a discussion with Robert on this issue. I would like to eventually have the streaming grouped and the windowing buffers/state maybe along with the crucial state of the user in the managed memory. If we had this separating the two modes could became less important as streaming would

Re: Tweets Custom Input Format

2015-02-27 Thread Mustafa Elbehery
@robert, I have created the PR https://github.com/apache/flink/pull/442, On Fri, Feb 27, 2015 at 11:58 AM, Mustafa Elbehery elbeherymust...@gmail.com wrote: @Robert, Thanks I was asking about the procedure. I have opened a Jira ticket for Flink-Contrib and I will create a PR with the

Re: Thoughts About Object Reuse and Collection Execution

2015-02-27 Thread Stephan Ewen
I vote to have the key extractor return a new value each time. That means that objects are not reused everywhere where it is possible, but still in most places, which still helps. What still puzzles me: I thought that the collection execution stores copies of the returned records by default

Flink Streaming parallelism bug report

2015-02-27 Thread Szabó Péter
As I know, the time of creation of the execution environment has been slightly modified in the streaming API, which caused that dataStream.getParallelism() and dataStream.env.getDegreeOfParallelism() may return different values. Usage of the former is recommended. In theory, the latter is

Re: Flink Streaming parallelism bug report

2015-02-27 Thread Gyula Fóra
They should actually return different values in many cases. Datastream.env.getDegreeOfParallelism returns the environment parallelism (default) Datastream.getparallelism() returns the parallelism of the operator. There is a reason when one or the other is used. Please watch out when you try to

Re: Contributing to Flink

2015-02-27 Thread Max Michels
Hi Niraj, Pleased to here you want to start contributing to Flink :) In terms of security, there are some open issues. Like Robert metioned, it would be great if you could implement proper HDFS Kerberos authentication. Basically, the HDFS Delegation Token needs to be transferred to the workers

[DISCUSS] URI NullPointerException in TestBaseUtils

2015-02-27 Thread Szabó Péter
The following code snippet in from TestBaseUtils: protected static File asFile(String path) { try { URI uri = new URI(path); if (uri.getScheme().equals(file)) { return new File(uri.getPath()); } else { throw new IllegalArgumentException(This path does not

Re: Flink Streaming parallelism bug report

2015-02-27 Thread Szabó Péter
Okay, thanks! In my case, I tried to run an ITCase test and the environment parallelism is happened to be -1, and an exception was thrown. The other ITCases ran properly, so I figured, the problem is with the windowing. Can you check it out for me? (WindowedDataStream, line 348) Peter

Re: Drop support for CDH4 / Hadoop 2.0.0-alpha

2015-02-27 Thread Robert Metzger
@Henry: We would still shade Hadoop because of its Guava / ASM dependencies which interfere with our dependencies. The nice thing of my change is that all the other flink modules don't have to care about the details of our Hadoop dependencie. Its basically an abstract hadoop dependency, without

Re: Tweets Custom Input Format

2015-02-27 Thread Robert Metzger
Hi, cool! Can you generalize the input format to read JSON into an arbitrary POJO? It would be great if you could contribute the InputFormat into the flink-contrib module. I've seen many users reading JSON data with Flink, so its good to have a standard solution for that. If you want you can add

Re: [DISCUSS] Iterative streaming example

2015-02-27 Thread Szabó Péter
Cool! At the moment I don't have any good use cases, but I will read some literature about it in the near future. The first priority for me is to make a good streaming iteration example, and Márton liked the machine-learning idea. That, and there is a group in SZTAKI that develops recommendation

Re: Flink Streaming parallelism bug report

2015-02-27 Thread Gyula Fóra
I can't look at it at the moment, I am on vacation and don't have my laptop. On Feb 27, 2015 9:41 AM, Szabó Péter nemderogator...@gmail.com wrote: Okay, thanks! In my case, I tried to run an ITCase test and the environment parallelism is happened to be -1, and an exception was thrown. The

[jira] [Created] (FLINK-1615) Introduces a new InputFormat for Tweets

2015-02-27 Thread mustafa elbehery (JIRA)
mustafa elbehery created FLINK-1615: --- Summary: Introduces a new InputFormat for Tweets Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New

Re: Tweets Custom Input Format

2015-02-27 Thread Robert Metzger
I'm glad you've found the how to contribute guide. I can not describe the process to open a pull request better than already written in the guide. Maybe this link is also helpful for you: https://help.github.com/articles/creating-a-pull-request/ Are you facing a particular error message? Maybe

Re: Tweets Custom Input Format

2015-02-27 Thread Mustafa Elbehery
@Robert, Thanks I was asking about the procedure. I have opened a Jira ticket for Flink-Contrib and I will create a PR with the naming convention on Wiki, https://issues.apache.org/jira/browse/FLINK-1615, On Fri, Feb 27, 2015 at 11:55 AM, Robert Metzger rmetz...@apache.org wrote: I'm glad

Re: [DISCUSS] URI NullPointerException in TestBaseUtils

2015-02-27 Thread Szabó Péter
Yeah, I agree, it is at best a cosmetic issue. I just wanted to let you know about it. Peter 2015-02-27 11:10 GMT+01:00 Till Rohrmann trohrm...@apache.org: Catching the NullPointerException and throwing an IllegalArgumentException with a meaningful message might clarify things. Considering

Re: Flink Streaming parallelism bug report

2015-02-27 Thread Szabó Péter
No problem. I will not commit the modification until it is clarified. Peter 2015-02-27 10:48 GMT+01:00 Gyula Fóra gyf...@apache.org: I can't look at it at the moment, I am on vacation and don't have my laptop. On Feb 27, 2015 9:41 AM, Szabó Péter nemderogator...@gmail.com wrote: Okay,

Re: [DISCUSS] URI NullPointerException in TestBaseUtils

2015-02-27 Thread Till Rohrmann
Catching the NullPointerException and throwing an IllegalArgumentException with a meaningful message might clarify things. Considering that it only affects the TestBaseUtils, it should not be big deal to change it. On Fri, Feb 27, 2015 at 10:30 AM, Szabó Péter nemderogator...@gmail.com wrote:

[jira] [Created] (FLINK-1614) JM Webfrontend doesn't always show the correct state of Tasks

2015-02-27 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1614: - Summary: JM Webfrontend doesn't always show the correct state of Tasks Key: FLINK-1614 URL: https://issues.apache.org/jira/browse/FLINK-1614 Project: Flink

Re: Could not build up connection to JobManager

2015-02-27 Thread Till Rohrmann
It depends on how you started Flink. If you started a local cluster, then the TaskManager log is contained in the JobManager log we just don't see the respective log output in the snippet you posted. If you started a TaskManager independently, either by taskmanager.sh or by start-cluster.sh, then