Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
Thanks, Till! I used the ALS from FlinkML and it works :) Best regards, Lydia > On 02.10.2015 at 14:14, Till Rohrmann wrote: > > Hi Lydia, > > I think the APIs of the versions 0.8-incubating-SNAPSHOT and 0.10-SNAPSHOT > are not compatible. Thus, it’s not just simply setting the dependencies

Error trying to access JM through proxy

2015-10-02 Thread Emmanuel
When trying to access the JM through a proxy I get: 19:26:23,113 ERROR akka.remote.EndpointWriter - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@10.155.241.168:6123/]] arriving at [akka.tcp://flink@1

Re: JM/TM startup time

2015-10-02 Thread Stephan Ewen
Yeah, registration is fast, JVM heatup is what takes time. You can try two things: - Use the off-heap memory variant and see if that allocates the memory faster. Just add the entry "taskmanager.memory.off-heap: true" to the config. - Or start the system in "streaming" mode. Then, it will not

Re: JM/TM startup time

2015-10-02 Thread Robert Schmidtke
Looking into the logs of each TM it only took about 5 seconds per TM to go from "Trying to register" to "Successful registration". On Fri, Oct 2, 2015 at 5:50 PM, Robert Schmidtke wrote: > I recently switched from running Flink on YARN to running Flink Standalone > and I realized I had to add a

Re: JM/TM startup time

2015-10-02 Thread Robert Schmidtke
I recently switched from running Flink on YARN to running Flink Standalone and I realized I had to add a sleep after ./start-cluster.sh (well, my Slurm adaptation of it). I did not have to explicitly wait before since Flink would wait until all YARN containers became available, so to be honest I do

Re: JM/TM startup time

2015-10-02 Thread Stephan Ewen
Is that a new observation that it takes so long, or has it always taken so long? On Fri, Oct 2, 2015 at 5:40 PM, Robert Schmidtke wrote: > I figured the JM would be waiting for the TMs. Each of my nodes has 64G of > memory available. > > On Fri, Oct 2, 2015 at 5:38 PM, Maximilian Michels wrote:

Re: JM/TM startup time

2015-10-02 Thread Robert Schmidtke
I figured the JM would be waiting for the TMs. Each of my nodes has 64G of memory available. On Fri, Oct 2, 2015 at 5:38 PM, Maximilian Michels wrote: > Hi Robert, > > During startup, the task manager allocates the entire managed memory. > > From the log: > 17:03:33,554 INFO org.apache.flink.ru

Re: JM/TM startup time

2015-10-02 Thread Maximilian Michels
Hi Robert, During startup, the task manager allocates the entire managed memory. From the log: 17:03:33,554 INFO org.apache.flink.runtime.taskmanager.TaskManager - Using 0.7 of the currently free heap space for Flink managed heap memory (34395 MB). It seems like you are allocating al
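
As a rough worked check of the log line quoted above: if the 0.7 fraction is applied directly to the free heap at that moment (an assumption on my side, not confirmed in the thread), the 34395 MB figure implies roughly 49 GB of free heap. The class and method names below are made up for illustration and are not Flink API.

```java
public class ManagedMemoryEstimate {

    // Hypothetical helper: free heap (MB) implied by a managed-memory size and fraction.
    static long impliedFreeHeapMb(long managedMb, double fraction) {
        return Math.round(managedMb / fraction);
    }

    // Hypothetical helper: managed memory (MB) for a given free heap and fraction,
    // truncated to whole megabytes.
    static long managedMb(long freeHeapMb, double fraction) {
        return (long) (freeHeapMb * fraction);
    }

    public static void main(String[] args) {
        long managed = 34395;   // MB, taken from the log line
        double fraction = 0.7;  // taken from the log line
        long freeHeap = impliedFreeHeapMb(managed, fraction);
        System.out.println("Implied free heap: " + freeHeap + " MB");
        System.out.println("Round trip: " + managedMb(freeHeap, fraction) + " MB managed");
    }
}
```

That is the amount of memory the TaskManager has to allocate up front in this setup, which is in line with the startup delay being discussed.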

Re: JM/TM startup time

2015-10-02 Thread Robert Schmidtke
Yes, they're both set to the same value. These are the JVM Options as reported by the TM: 17:36:12,137 INFO org.apache.flink.runtime.taskmanager.TaskManager - JVM Options: 17:36:12,137 INFO org.apache.flink.runtime.taskmanager.TaskManager - -Xms51355M 17:36:12,137 INFO org.apache

Re: JM/TM startup time

2015-10-02 Thread Stephan Ewen
The delay you see happens when the TaskManager allocates the memory for its memory manager. Allocating that much in a JVM can take a bit, although 40 seconds looks like a lot to me... How do you start the JVM? Are Xmx and Xms set to the same value? If not, the JVM incrementally grows through multiple g
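
A minimal sketch for checking the Xms/Xmx question from inside a running JVM, using only the standard java.lang.Runtime calls. The class name, the helper and the 5% tolerance are assumptions for illustration. The idea: when -Xms equals -Xmx, the committed heap (totalMemory) should start out close to the maximum heap (maxMemory); when it does not, the heap still has to grow.

```java
public class HeapSizing {

    // Hypothetical helper: true if the committed heap is within `tolerance`
    // (e.g. 0.05 = 5%) of the maximum heap, i.e. the heap looks pre-sized.
    static boolean isPreSized(long totalBytes, long maxBytes, double tolerance) {
        if (maxBytes == Long.MAX_VALUE) {
            return false; // the JVM reports no inherent limit, nothing to compare against
        }
        return totalBytes >= (1.0 - tolerance) * maxBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently committed by the JVM
        long max = rt.maxMemory();     // upper bound the JVM will try to use
        long mb = 1024L * 1024L;
        System.out.println("Committed heap: " + total / mb + " MB");
        System.out.println("Maximum heap:   " + max / mb + " MB");
        System.out.println("Looks pre-sized (Xms close to Xmx): " + isPreSized(total, max, 0.05));
    }
}
```

Running it once with, say, java -Xms2g -Xmx2g HeapSizing and once with only -Xmx2g should show the difference; the exact numbers depend on the JVM and garbage collector.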

JM/TM startup time

2015-10-02 Thread Robert Schmidtke
Hi everyone, I'm wondering about the startup times of the TMs: ... 17:03:33,255 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager actor 17:03:33,262 INFO org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig [server address: cumu02-05/130.73.1

Config files content read

2015-10-02 Thread Flavio Pompermaier
Hi to all, in many of my jobs I have to read a config file that can be either on the local fs or on HDFS. I'm looking for an intuitive API to read the content of such config files (JSON) before converting them to Java objects through Jackson. Is there any Flink API to easily achieve this? I really

Re: data flow example on cluster

2015-10-02 Thread Till Rohrmann
Hi Lydia, I think the APIs of the versions 0.8-incubating-SNAPSHOT and 0.10-SNAPSHOT are not compatible. Thus, it’s not just simply setting the dependencies to 0.10-SNAPSHOT. You also have to fix the API changes. This might not be trivial. Therefore, I’d recommend you to simply use the ALS impleme

Re: data flow example on cluster

2015-10-02 Thread Robert Metzger
Lydia, can you check the log of Flink installed on the cluster? During startup, it is writing the exact commit your 0.10-SNAPSHOT is based on. I would recommend to check out exactly that commit locally and then build Flink locally. After that, you can rebuild your jobs jar again. With that method,

Re: data flow example on cluster

2015-10-02 Thread Stefano Bortoli
I had problems running a Flink job with Maven, probably there is some classloading issue. What worked for me was running a simple java command with the uberjar. So I build the jar using Maven, and then run it this way: java -Xmx2g -cp target/youruberjar.jar yourclass arg1 arg2 Hope it helps, Stefano 201

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
Hi, I did not create anything by myself. I just downloaded the files from here: https://github.com/tillrohrmann/flink-perf And then executed mvn clean install -DskipTests Then I opened the project within IntelliJ and there it works fine. Then I expo

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
> On 02.10.2015 at 11:55, Lydia Ickler wrote: > > 0.10-SNAPSHOT

Re: data flow example on cluster

2015-10-02 Thread Stephan Ewen
@Lydia Did you create the POM files for your job with a 0.8.x quickstart? Can you try to simply re-create your project's POM files with a new quickstart? I think that the POMs between 0.8-incubating-SNAPSHOT and 0.10-SNAPSHOT may not be quite compatible any more... On Fri, Oct 2, 2015 at 12:0

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
It is a university cluster which we have to use. So I am forced to use it :( How can I bypass that version conflict? > On 02.10.2015 at 12:07, Robert Metzger wrote: > > Are you relying on a feature only available in 0.10-SNAPSHOT? > Otherwise, I would recommend to use the latest stable releas

Re: data flow example on cluster

2015-10-02 Thread Robert Metzger
Are you relying on a feature only available in 0.10-SNAPSHOT? Otherwise, I would recommend using the latest stable release (0.9.1) for your Flink job and on the cluster. On Fri, Oct 2, 2015 at 11:55 AM, Lydia Ickler wrote: > Hi, > > but inside the pom of flink-job is the flink version set to 0.

Re: kryo exception due to race condition

2015-10-02 Thread Stefano Bortoli
I don't know whether it is the same issue, but after switching from my POJOs to BSONObject I have got a race condition issue with kryo serialization. I could complete the process using the byte[], but at this point I actually need the POJO. I truly believe it is related to the reuse of the Kryo ins

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
Hi, but inside the pom of flink-job the Flink version is set to 0.8 0.8-incubating-SNAPSHOT how can I change it to the newest? 0.10-SNAPSHOT is not working > On 02.10.2015 at 11:48, Robert Metzger wrote: > > I think there is a version mismatch between the Flin

Re: data flow example on cluster

2015-10-02 Thread Robert Metzger
I think there is a version mismatch between the Flink version you've used to compile your job and the Flink version installed on the cluster. Maven automagically pulls newer 0.10-SNAPSHOT versions every time you're building your job. On Fri, Oct 2, 2015 at 11:45 AM, Lydia Ickler wrote: > Hi Til

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
Hi Till, I want to execute your Matrix Completion program „ALSJoin“. Locally it works perfectly. Now I want to execute it on the cluster with: run -c com.github.projectflink.als.ALSJoin -cp /tmp/icklerly/flink-jobs-0.1-SNAPSHOT.jar 0 2 0.001 10 1 1 but I get the following error: java.lang.NoSuchM

Re: kryo exception due to race condition

2015-10-02 Thread Stefano Bortoli
here it is: https://issues.apache.org/jira/browse/FLINK-2800 saluti, Stefano 2015-10-01 18:50 GMT+02:00 Stephan Ewen : > This looks to me like a bug where type registrations are not properly > forwarded to all Serializers. > > Can you open a JIRA ticket for this? > > On Thu, Oct 1, 2015 at 6:46