Re: ResourceManager not using correct akka URI in standalone cluster (?)

2016-09-28 Thread AJ Heller
Thank you Till. I was in a time crunch, and rebuilt my cluster from the ground up with hadoop installed. All works fine now, `netstat -pn | grep 6123` shows flink's pid. Hadoop may be irrelevant, I can't rule out PEBKAC yet :-). Sorry, when I have time I'll attempt to reproduce the scenario, on

Create stream from multiple files using "env.readFile(format, input1, FileProcessingMode.PROCESS_CONTINUOUSLY, 1000, FilePathFilter.createDefaultFilter())" ?

2016-09-28 Thread Anchit Jatana
Hi All, I have a use case where in need to create multiple source streams from multiple files and monitor the files for any changes using the " FileProcessingMode.PROCESS_CONTINUOUSLY" Intention is to achieve something like this(have a monitored stream for each of the multiple files), something

Re: Error while adding data to RocksDB: No more bytes left

2016-09-28 Thread Stephan Ewen
Hi Shannon! The stack trace you pasted is independent of checkpointing - it seems to be from the regular processing. Does this only happen when checkpoints are activated? Can you also share which checkpoint method you use? - FullyAsynchronous - SemiAsynchronous I think there are two

Error while adding data to RocksDB: No more bytes left

2016-09-28 Thread Shannon Carey
It appears that when one of my jobs tries to checkpoint, the following exception is triggered. I am using Flink 1.1.1 in Scala 2.11. RocksDB checkpoints are being saved to S3. java.lang.RuntimeException: Error while adding data to RocksDB at

Re: Best way to trigger dataset sampling

2016-09-28 Thread Flavio Pompermaier
I think I'll probably end with submitting the job through YARN in order to have a more standard approach :) Thanks, Flavio On Wed, Sep 28, 2016 at 5:19 PM, Maximilian Michels wrote: > I meant that you simply keep the sampling jar on the machine where you > want to sample.

Re: Merge the states of different partition in streaming

2016-09-28 Thread Ufuk Celebi
Great to hear! On Wed, Sep 28, 2016 at 5:18 PM, Simone Robutti wrote: > Solved. Probably there was an error in the way I was testing. Also I > simplified the job and it works now. > > 2016-09-27 16:01 GMT+02:00 Simone Robutti : >> >>

Re: Best way to trigger dataset sampling

2016-09-28 Thread Maximilian Michels
I meant that you simply keep the sampling jar on the machine where you want to sample. However, you mentioned that it is a requirement for it to be on the cluster. Cheers, Max On Tue, Sep 27, 2016 at 3:18 PM, Flavio Pompermaier wrote: > Hi max, > that's exactly what I was

Re: Merge the states of different partition in streaming

2016-09-28 Thread Simone Robutti
Solved. Probably there was an error in the way I was testing. Also I simplified the job and it works now. 2016-09-27 16:01 GMT+02:00 Simone Robutti : > Hello, > > I'm dealing with an analytical job in streaming and I don't know how to > write the last part. > >

Re: Flink 1.2-SNAPSHOT fails to initialize keyed state backend

2016-09-28 Thread Stephan Ewen
Hi! This was a temporary regression on the snapshot that has been fixed a few days ago. It should be in the snapshot repositories by now. Can you check if the problem persists if you force an update of your snapshots dependencies? Greetings, Stephan On Tue, Sep 27, 2016 at 5:04 PM, Timo

Re: Flink Checkpoint runs slow for low load stream

2016-09-28 Thread Chakravarthy varaga
Hi Stephan, That should be great. Let me know once the fix is done and the snapshot version to use, I'll check and revert then. Can you also share the JIRA that tracks the issue? With regards to offset commit issue, I'm not sure as to how to proceed here. Probably I'll use your

Re: checkpoints not removed on hdfs.

2016-09-28 Thread Ufuk Celebi
Hey! Any update on this? On Mon, Sep 5, 2016 at 11:29 AM, Aljoscha Krettek wrote: > Hi, > which version of Flink are you using? Are the checkpoints being reported as > successful in the Web Frontend, i.e. in the "checkpoints" tab of the running > job? > > Cheers, > Aljoscha

Re: RemoteEnv connect failed

2016-09-28 Thread Ufuk Celebi
Hey Dayong, can you check the logs of the Flink cluster on the virtual machine? The client side (what you posted) looks ok. – Ufuk On Wed, Sep 14, 2016 at 3:52 PM, Dayong wrote: > Hi folks, > I need to run a java app to submit a job to remote flink cluster. I am > testing

Re: How to interact with a running flink application?

2016-09-28 Thread Ufuk Celebi
Hey Anchit, the usual recommendation for this is to use a CoMap/CoFlatMap operator, where the second input are the lookup location changes. You can then use this input to update the location. Search for CoMap/CoFlatMap here:

Re: Failures on DataSet programs

2016-09-28 Thread Ufuk Celebi
Hey Paulo! I think it's not possible out of the box at the moment, but you can try the following as a work around: 1) Create a custom OutputFormat that extends TextOutputFormat and override the clean up method: public class NoCleanupTextOutputFormat extends TextOutputFormat { @Override