Re: Duplicated data when using Externalized Checkpoints in a Flink Highly Available cluster

2017-06-08 Thread Nico Kruber
Hi Amara, please refer to [1] for some details about our checkpointing mechanism, in short, for your situation: * checkpoints are made at certain checkpoint barriers, * in between those barriers, processing continues and so do outputs * in case of a failure the state at the latest checkpoint is r

Re: Painful KryoException: java.lang.IndexOutOfBoundsException on Flink Batch Api scala

2017-06-08 Thread Aljoscha Krettek
@Flavio, doesn’t this look like the exception you often encountered a while back? If I remember correctly that was fixed by Kurt, right? Best, Aljoscha > On 7. Jun 2017, at 18:11, Tzu-Li (Gordon) Tai wrote: > > Hi Andrea, > > I did some quick issue searching, and it seems like this is a frequ

Streaming job monitoring

2017-06-08 Thread Flavio Pompermaier
Hi to all, we've successfully ran our first straming job on a Flink cluster (with some problems with the shading of guava..) and it really outperforms Logstash, from the point of view of indexing speed and easiness of use. However there's only one problem: when the job is running, in the Job Monit

Re: Use Single Sink For All windows

2017-06-08 Thread Nico Kruber
How about using asynchronous I/O operations? https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/ asyncio.html Nico On Tuesday, 6 June 2017 16:22:31 CEST rhashmi wrote: > because of parallelism i am seeing db contention. Wondering if i can merge > sink of multiple windows and

Re: Flink on kubernetes -> shell deployment

2017-06-08 Thread Nico Kruber
If you have access to the web dashboard, you probably have access to the Jobmanager in general and can submit jobs from your command line by passing flink run --jobmanager ... I've looped in Patrick in case I am missing something kubernetes-specific here. Nico On Wednesday, 7 June 2017 16:0

Re: 回复:Question regarding configuring number of network buffers

2017-06-08 Thread Nico Kruber
yes, also please consider the new and simplified network buffer configuration from 1.3 onwards: https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/ config.html#configuring-the-network-buffers Nico On Thursday, 8 June 2017 05:26:57 CEST Zhijiang(wangzhijiang999) wrote: > Hi Ray, >

Re: Streaming job monitoring

2017-06-08 Thread Chesnay Schepler
Hello Flavio, I'm not sure what source you are using, but it looks like the ContinouosFileMonitoringSource which works with 2 operators. The first operator (what is displayed as the actual Source) emits input splits (chunks of files that should be read) and passes these to the second operator

Re: Deterministic Update

2017-06-08 Thread Nico Kruber
yes - you need to implement the CheckpointedFunction interface. (as an example: our BucketingSink uses this) Nico On Thursday, 8 June 2017 06:44:10 CEST rhashmi wrote: > Is there any possibility to trigger sink operator on completion of > checkpoint? > > > > -- > View this message in context:

Re: Painful KryoException: java.lang.IndexOutOfBoundsException on Flink Batch Api scala

2017-06-08 Thread Flavio Pompermaier
Yes it looks very similar to the Exception I experienced ( https://issues.apache.org/jira/browse/FLINK-6398) but my error was more related to Row serialization/deserialization (see [1]) while this looks more like something related to Kryo. However also with Flink 1.3.0 the error seems to appear fro

Re: Guava version conflict

2017-06-08 Thread Flavio Pompermaier
On an empty machine (with Ubuntu 14.04.5 LTS) and an empty maven local repo I did: 1. git clone https://github.com/apache/flink.git && cd flink && git checkout tags/release-1.2.1 2. /opt/devel/apache-*maven-3.3.9*/bin/mvn clean install -Dhadoop.version=2.6.0-cdh5.9.0 -Dhbase.version=1.

Re: Queries regarding FlinkCEP

2017-06-08 Thread Biplob Biswas
Hi, Can anyone check, whether they can reproduce this issue on their end? There's no log yet as t what is happening. Any idea to debug this issue is well appreciated. Regards, Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Queries-r

Re: Streaming job monitoring

2017-06-08 Thread Flavio Pompermaier
Hi Chesnay, this is basically my job: TextInputFormat input = new TextInputFormat(new Path(jsonDir, fileName)); DataStream json = env.createInput(input, BasicTypeInfo.STRING_TYPE_INFO); json.addSink(new ElasticsearchSink<>(userConf, transportAddr, sink)); JobExecutionResult jobInfo = env.execute("

Error running Flink job in Yarn-cluster mode

2017-06-08 Thread Biplob Biswas
I am trying to run a flink job on our cluster with 3 dn and 1 nn. I am usng the following command line argument to run this job, but I get an exception saying "Could not connect to the leading JobManager. Please check that the JobManager is running" ... what could I be doing wrong? Surprisingly, o

Re: [POLL] Who still uses Java 7 with Flink ?

2017-06-08 Thread Robert Metzger
Hi all, as promised in March, I want to revive this discussion! Our users are begging for Scala 2.12 support [1], migration to Akka 2.4 would solve a bunch of shading / dependency issues (Akka 2.4 will remove Akka's protobuf dependency [2][3]) and generally Java 8's new language features all spea

Re: Error running Flink job in Yarn-cluster mode

2017-06-08 Thread Nico Kruber
I'm no expert here, but are 4 yarn containers/task managers (-yn 4) not too many for 3 data nodes (=3 dn?)? also, isn't the YARN UI reflecting its own jobs, i.e. running flink, as opposed to running the actual flink job? or did you mean that the flink web ui (through yarn) showed the submitted

Re: Painful KryoException: java.lang.IndexOutOfBoundsException on Flink Batch Api scala

2017-06-08 Thread Andrea Spina
Hi guys, thank you for your interest. Yes @Flavio, I tried both 1.2.0 and 1.3.0 versions. Following Gordon suggestion I tried to put setReference to false but sadly it didn't help. What I did then was to declare a custom serializer as the following: class BlockSerializer extends Serializer[Block

Re: Fink: KafkaProducer Data Loss

2017-06-08 Thread ninad
I built Flink v1.3.0 with cloudera hadoop and am able to see the data loss. Here are the details: *tmOneCloudera583.log* Received session window task: *2017-06-08 15:10:46,131 INFO org.apache.flink.runtime.taskmanager.Task - TriggerWindow(ProcessingTimeSessionWindows(3

Re: In-transit Data Encryption in EMR

2017-06-08 Thread vinay patil
Hi Guys, I am able to setup SSL correctly, however the following command does not work correctly and results in the error I had mailed earlier flink run -m yarn-cluster -yt deploy-keys/ TestJob.jar Few Doubts: 1. Can anyone please explain me how do you test if SSL is working correctly ? Curren

Re: Flink on kubernetes -> shell deployment

2017-06-08 Thread Kaepke, Marc
Hi Nico, thanks for your help. $ kubectl exex -it /bin/bash that was what I was looking for. This command provides a shell directly into my job-manager instance. Best, Marc > Am 08.06.2017 um 12:05 schrieb Nico Kruber : > > If you have access to the web dashboard, you probably have access to

SingleOutputStreamOperator addsink Error

2017-06-08 Thread G.S.Vijay Raajaa
Hi, I am trying to pass the SingleOutputStreamOperator to a custom sink. I am getting an error while implementing the same. Code snippet: SingleOutputStreamOperator stream = env.addSource(source) .flatMap(new ExtractHashTagsSymbols(tickers)) .keyBy(0) .time

Re: SingleOutputStreamOperator addsink Error

2017-06-08 Thread Ted Yu
bq. new SinkFunction(){ Note the case in JsonObject. It should be JSONObject FYI On Thu, Jun 8, 2017 at 1:27 PM, G.S.Vijay Raajaa wrote: > Hi, > > I am trying to pass the SingleOutputStreamOperator to a custom sink. I am > getting an error while implementing the same. > > Code snippet: > > Sin

Re: SingleOutputStreamOperator addsink Error

2017-06-08 Thread G.S.Vijay Raajaa
That's right. Jackson and Gson similar naming convention. Thanks for the quick catch. Regards, Vijay Raajaa GS On Fri, Jun 9, 2017 at 1:59 AM, Ted Yu wrote: > bq. new SinkFunction(){ > > Note the case in JsonObject. It should be JSONObject > > FYI > > On Thu, Jun 8, 2017 at 1:27 PM, G.S.Vijay

Flink with Mesos: Fetcher error

2017-06-08 Thread ani.desh1512
I am trying to configure Flink to work on top of Mesos. I am using Flink release-1.3. I am using DCOS 1.9's underlying mesos which is version 1.2. I am able to start Flink without any issues when the taskmanager starts on the same host as that of appmaster. But when the taskmanager is launched on a

At what point do watermarks get injected into the stream?

2017-06-08 Thread Ray Ruvinskiy
I’m trying to build a mental model of how watermarks get injected into the stream. Suppose I have a stream with a parallel source, and I’m running a cluster with multiple task managers. Does each parallel source reader inject watermarks, which are then forwarded to downstream consumers and shuff