Re: s3 statebackend user state size

2016-05-10 Thread Chen Qin
Hi Ufuk, Yes, it does help with Rocksdb backend! After tune checkpoint frequency align with network throughput, task manager released and job get cancelled are gone. Chen > On May 10, 2016, at 10:33 AM, Ufuk Celebi wrote: > >> On Tue, May 10, 2016 at 5:07 PM, Chen Qin

HBase write problem

2016-05-10 Thread Palle
HBase write problem Hi all. I have a problem writing to HBase. I am using a slightly modified example of this class to proof the concept: https://github.com/apache/flink/blob/master/flink-batch-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteExample.java

Re: Cassandra sink wrt Counters

2016-05-10 Thread Ufuk Celebi
On Tue, May 10, 2016 at 5:36 PM, milind parikh wrote: > When will the Cassandra sink be released? I am ready to test it out even > now. You can work with Chesnay's branch here: https://github.com/apache/flink/pull/1771 Clone his repo via Git, check out the branch, and

Re: s3 statebackend user state size

2016-05-10 Thread Ufuk Celebi
On Tue, May 10, 2016 at 5:07 PM, Chen Qin wrote: > Future, to keep large key/value space, wiki point out using rocksdb as > backend. My understanding is using rocksdb will write to local file systems > instead of sync to s3. Does flink support memory->rocksdb(local disk)->s3 >

Re: Force triggering events on watermark

2016-05-10 Thread Srikanth
Yes, will work. I was trying another route of having a "finalize & purge trigger" that will i) onElement - Register for event time watermark but not alter nested trigger's TriggerResult ii) OnEventTime - Always purge after fire That will work with CountTrigger and other custom trigger too

Re: Force triggering events on watermark

2016-05-10 Thread Fabian Hueske
Maybe the last example of this blog post is helpful [1]. Best, Fabian [1] https://www.mapr.com/blog/essential-guide-streaming-first-processing-apache-flink 2016-05-10 17:24 GMT+02:00 Srikanth : > Hi, > > I read the following in Flink doc "We can explicitly specify a

Re: Cassandra sink wrt Counters

2016-05-10 Thread milind parikh
Hi Chesnay Sorry for asking the question in a confusing manner. Being new to flink, there are many questions swirling around in my head. Thanks for the details in your answers. Here's the facts , as I see them: (a) Cassandra Counters are not idempotent (b) The failures, in context of Cassandra,

Force triggering events on watermark

2016-05-10 Thread Srikanth
Hi, I read the following in Flink doc "We can explicitly specify a Trigger to overwrite the default Trigger provided by the WindowAssigner. Note that specifying a triggers does not add an additional trigger condition but replaces the current trigger." So, I tested out the below code with count

s3 statebackend user state size

2016-05-10 Thread Chen Qin
Hi there, With S3 as state backend, as well as keeping a large chunk of user state on heap. I can see task manager starts to fail without showing OOM exception. Instead, it shows a generic error message (below) when checkpoint triggered. I assume this has something to do with how state were

Re: writing tests for my program

2016-05-10 Thread Igor Berman
thanks Alexander, I'll take a look On 10 May 2016 at 13:07, lofifnc wrote: > Hi, > > Some shameless self promotion: > > You can also checkout: > https://github.com/ottogroup/flink-spector > which has to the goal to remove such hurdles when testing flink programs. > >

Re: writing tests for my program

2016-05-10 Thread lofifnc
Hi, Some shameless self promotion: You can also checkout: https://github.com/ottogroup/flink-spector which has to the goal to remove such hurdles when testing flink programs. Best, Alex -- View this message in context:

回复:Blocking or pipelined mode for batch job

2016-05-10 Thread wangzhijiang999
Hi Ufuk,       Thank you for quick response! I am not very clear of the internal realize for iteration, so would you explain in detail why blocking results can not be reset after each superstep? In addition,  for the below example, why it may cause deadlock in pipelined mode?  DataSet mapped1 =

Re: Blocking or pipelined mode for batch job

2016-05-10 Thread Ufuk Celebi
On Tue, May 10, 2016 at 10:56 AM, wangzhijiang999 wrote: >As I reviewed the flink source code, if the ExecutionMode is set > "Pipelined", the DataExechangeMode will be set "Batch" if breaking pipelined > property is true for two input or iteration situation in

Re: Cassandra sink wrt Counters

2016-05-10 Thread Chesnay Schepler
Hello Milind, I'm not entirely sure i fully understood your question, but I'll try anyway :) There is now way to provide exactly-once semantics for Cassandra's counters. As such we (will) only provide exactly-once semantics for a subset of Cassandra operations; idempotent inserts/updates.

Blocking or pipelined mode for batch job

2016-05-10 Thread wangzhijiang999
Hi ,        As I reviewed the flink source code, if the ExecutionMode is set "Pipelined", the DataExechangeMode will be set "Batch" if breaking pipelined property is true for two input or iteration situation in order to avoid deadlock. When the DataExechangeMode is set "Batch", the

multi-application correlated savepoints

2016-05-10 Thread Krzysztof Zarzycki
Hi! I'm thinking about using a great Flink functionality - savepoints . I would like to be able to stop my streaming application, rollback the state of it and restart it (for example to update code, to fix a bug). Let's say I would like travel back in time and reprocess some data. But what if I

Re: reading from latest kafka offset when flink starts

2016-05-10 Thread Balaji Rajagopalan
Robert, Regarding the event qps 4500 events/sec may not be large no, but I am seeing some issue in processing the events due to processing power that I am using, I have deployed flink app on 3 node yarn cluster one node is a master, 2 slave nodes which has the taskmanager running. Each machine