Re: Best Practice for Querying Flink State

2022-08-29 Thread Chen Qin
Hi Lu & Ken, Flink is a stream processing engine (albeit stateful) that doesn't aim to serve queries directly. When it comes to serving systems, AFAIK, has two campuses of user requirements. - the one that runs a really simple query (single indexing, like dynamo) serving a large number of

RE: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-22 Thread Chen Qin
Could you more specific on what “failed message” means here?In general side output can do something like were  def process(ele) {   try{    biz} catch {   Sideout( ele + exception context)}}  process(func).sideoutput(tag).addSink(kafkasink) Thanks,Chen   From: Eleanore JinSent: Wednesday, July

Re: Kafka Rate Limit in FlinkConsumer ?

2020-07-06 Thread Chen Qin
My two cents here, - flink job already has back pressure so rate limit can be done via setting parallelism to proper number in some use cases. There is an open issue of checkpointing reliability when back pressure, community seems working on it. - rate limit can be abused easily and cause lot

Re: Flink operator throttle

2020-05-17 Thread Chen Qin
Hi Ray, In a bit abstract point of view, you can always throttle source and get proper sink throughput control. One approach might be just override base KafkaFetcher and use shaded guava rate limtier.

Re: Flink consuming rate increases slowly

2020-05-10 Thread Chen Qin
Hi Eyal, It’s unclear what warmup phase does in your use cases. Usually we see Flink start consume at high rate and drop to a point downstream can handle. Thanks Chen > On May 10, 2020, at 12:25 AM, Eyal Pe'er wrote: > > Hi all, > Lately I've added more resources to my Flink cluster which

Re: Bootstrapping

2018-01-25 Thread Chen Qin
Hi Gregory, I have similar issue when dealing with historical data. We choose Lambda and figure out use case specific hand off protocol. Unless storage side can support replay logs within a time range, Streaming application authors still needs to carry extra work to implement batching layer

Re: SideOutput doesn't receive anything if filter is applied after the process function

2018-01-16 Thread Chen Qin
>>> .filter(...) >>>>> >>>>> filteredSideOutput = processed >>>>> .getSideOutput(...) >>>>> .filter(...) >>>>> >>>>> >>>>> On 15.01.2018 09:55, Juho Autio wrote: >>

Re: SideOutput doesn't receive anything if filter is applied after the process function

2018-01-13 Thread Chen Qin
Hi Juho, I think sideoutput might deserve a seperate class which inherit form singleoutput. It might prevent lot of confusions. A more generic question is whether datastream api can be mulitple ins and mulitple outs natively. It's more like scheduling problem when you come from single process

ayncIO & TM akka response

2017-12-09 Thread Chen Qin
Hi there, In recent, our production fink jobs observed some weird performance issue. When job tailing kafka source failed and try to catch up, asyncIO after event trigger get much higher load on task thread. Since each TM allocated two virtual CPU in docker, my assumption was akka message between

Re: Data loss in Flink Kafka Pipeline

2017-12-07 Thread Chen Qin
Nishu You might consider sideouput with metrics at least after window. I would suggest having that to catch data screw or partition screw in all flink jobs and amend if needed. Chen On Thu, Dec 7, 2017 at 9:48 AM Fabian Hueske wrote: > Is it possible that the data is

Re: question on sideoutput from ProcessWindow function

2017-09-23 Thread Chen Qin
; do since the interface is very similar to ProcessFunction and we could > also > > add that to the Context. > > > > Best, > > Aljoscha > > > > On 9. Sep 2017, at 06:22, Chen Qin <qinnc...@gmail.com> wrote: > > > > Hi Prabhu, > > > &

Re: ETL with changing reference data

2017-09-17 Thread Chen Qin
Hi Peter, If I understand correctly, I think you are facing a delima of having efficient dynamic referencing as well as salable processing. I don't have answer to how thing would work for your specific case. Yet this is just interesting topic to discuss. Fabian provides insights and I would like

Re: question on sideoutput from ProcessWindow function

2017-09-08 Thread Chen Qin
Hi Prabhu, That is good question, the short answer is not yet, only ProcessFunction was given flexibility of doing customized sideoutput at the moment. Window Function wasn't given such flexibility partially due to sideoutput initially targeting late arriving event for window use cases.

Re: GRPC source in Flink

2017-07-31 Thread Chen Qin
Hi Basanth, Given the fact that Flink put failure recovery garantee on checkpointing and source rewinding. I can imagine a lossless rpc source would be tricky. In essence, any rpc source needs to provide rewind api which can buffer at least to last success checkpoint. In production use cases, put

Re: Checkpoints very slow with high backpressure

2017-05-31 Thread Chen Qin
What is root cause of back pressure? The reason why I ask is we investigated and applied metrics to measure time to process event and ends up finding bottle neck at frequent managed state updates. Our approach was keeping mem cache and periodical updates states before checkpointing cycle kick in.

Re: large sliding window perf question

2017-05-29 Thread Chen Qin
B.T.W It might be better off to pre aggregation via slidingWindow with controlled bucket size and batch update as well as retention. Thanks, Chen > On May 29, 2017, at 3:05 PM, Chen Qin <qinnc...@gmail.com> wrote: > > I see, not sure this this hack works. It utilize operator st

Re: large sliding window perf question

2017-05-29 Thread Chen Qin
t and we’re not > firing a timer (the two cases where we have a key scope). > > Best, > Aljoscha > >> On 24. May 2017, at 21:05, Chen Qin <qinnc...@gmail.com >> <mailto:qinnc...@gmail.com>> wrote: >> >> Got it! Looks like 30days window and trigger 10se

Re: large sliding window perf question

2017-05-24 Thread Chen Qin
gt; Thanks, > Carst > > *From: *Stefan Richter <s.rich...@data-artisans.com> > *Date: *Tuesday, May 23, 2017 at 21:35 > *To: *"user@flink.apache.org" <user@flink.apache.org> > *Subject: *Re: large sliding window perf question > > Hi, > > Which state backen

large sliding window perf question

2017-05-23 Thread Chen Qin
Hi there, I have seen some weird perf issue while running event time based job with large sliding window (24 hours offset every 10s) pipeline looks simple, tail kafka topic and assign timestamp and watermark, forward to large sliding window (30days) and fire every 10 seconds and print out. what

Re: Incremental checkpoint branch

2017-03-03 Thread Chen Qin
Hi Vishnu, My best gussing is there are lots of customized "incremental checkpointing" done via patch around rocksdb statebackend and rocksdb checkpoints. http://rocksdb.org/blog/2015/11/10/use-checkpoints-for-efficient-snapshots.html Thanks, Chen On Fri, Mar 3, 2017 at 1:16 PM, Ted Yu

Re: Side outputs

2017-02-08 Thread Chen Qin
Hi Billy, Without looking into detail how batch api works. I thought filter approach might not the most efficient in general to implement toplogy conditional branching. Again, It may not answer your question in term of prof improvement. If you are willing to use streaming api, you might consider

complete digraph

2017-02-07 Thread Chen Qin
Hi there, I don't think this would be a urgent topic but definitely seems interesting topic to me. Does flink topology able to run complete digraph (excluding sources and sinks)? The use case is more around support event based arbitary state

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-26 Thread Chen Qin
We worked around S3 and had a beer with our Hadoop engineers... -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-snapshotting-to-S3-Timeout-waiting-for-connection-from-pool-tp10994p11330.html Sent from the Apache Flink User Mailing List

Re: multi tenant workflow execution

2017-01-24 Thread Chen Qin
flink-docs- > release-1.2/dev/stream/process_function.html > > 2017-01-24 0:41 GMT+01:00 Chen Qin <qinnc...@gmail.com>: > >> Hi there, >> >> I am researching running one flink job to support customized event driven >> workflow executions. The use case is to

multi tenant workflow execution

2017-01-23 Thread Chen Qin
Hi there, I am researching running one flink job to support customized event driven workflow executions. The use case is to support running various workflows that listen to a set of kafka topics and performing various rpc checks, a user travel through multiple stages in a rule execution(workflow

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-12 Thread Chen Qin
We have seen this issue back to Flink 1.0. Our finding back then was traffic congestion to AWS in internal network. Many teams too dependent on S3 and bandwidth is shared, cause traffic congestion from time to time. Hope it helps! Thanks Chen > On Jan 12, 2017, at 03:30, Ufuk Celebi

Re: HashMap/HashSet Serialization Issue

2017-01-06 Thread Chen Qin
My understanding is HashMap doesn't work with Flink Native serialization framework, though I might be wrong. This might worth reading ​ https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization -Chen​ On Fri, Jan 6, 2017 at 6:06 PM, Charith Wickramarachchi <

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-06 Thread Chen Qin
Just noticed there are only two partitions per topic. Regardless of how large parallelism set. Only two of those will get partition assigned at most. Sent from my iPhone > On Jan 6, 2017, at 02:40, Chakravarthy varaga > wrote: > > Hi All, > > Any updates on

Re: Apache siddhi into Flink

2016-08-27 Thread Chen Qin
​+1​ On Aug 26, 2016, at 11:23 PM, Aparup Banerjee (apbanerj) wrote: Hi- Has anyone looked into embedding apache siddhi into Flink. Thanks, Aparup

Re: State in external db (dynamodb)

2016-07-25 Thread Chen Qin
>> store in snapshotState(...)? >> >> Thanks, >> Josh >> >> On Sun, Jul 24, 2016 at 6:00 PM, Chen Qin <qinnc...@gmail.com> wrote: >> >>> >>> >>> On Jul 22, 2016, at 2:54 AM, Josh <jof...@gmail.com> wrote: >>> &

Re: State in external db (dynamodb)

2016-07-24 Thread Chen Qin
> On Jul 22, 2016, at 2:54 AM, Josh wrote: > > Hi all, > > >(1) Only write to the DB upon a checkpoint, at which point it is known that > >no replay of that data will occur any more. Values from partially successful > >writes will be overwritten >with correct value. I

Re: Late arriving events

2016-07-06 Thread Chen Qin
Jamie, Sorry for late reply, some of my thoughts inline. -Chen > > Another way to do this is to kick off a parallel job to do the backfill > from the previous savepoint without stopping the current "realtime" job. > This way you would not have to have a "blackout". This assumes your final >

Late arriving events

2016-07-05 Thread Chen Qin
Hi there, I understand Flink currently doesn't support handling late arriving events. In reality, a exact-once link job needs to deal data missing or backfill from time to time without rewind to previous save point, which implies restored job suffers blackout while it tried to catch up. In

Re: s3 statebackend user state size

2016-05-10 Thread Chen Qin
Hi Ufuk, Yes, it does help with Rocksdb backend! After tune checkpoint frequency align with network throughput, task manager released and job get cancelled are gone. Chen > On May 10, 2016, at 10:33 AM, Ufuk Celebi <u...@apache.org> wrote: > >> On Tue, May 10, 2016 at

s3 statebackend user state size

2016-05-10 Thread Chen Qin
Hi there, With S3 as state backend, as well as keeping a large chunk of user state on heap. I can see task manager starts to fail without showing OOM exception. Instead, it shows a generic error message (below) when checkpoint triggered. I assume this has something to do with how state were

Re: s3 checkpointing issue

2016-05-04 Thread Chen Qin
fer.dir > /tmp > > > > On 4 May 2016 at 11:10, Ufuk Celebi <u...@apache.org> wrote: > >> Hey Chen Qin, >> >> this seems to be an issue with the S3 file system. The root cause is: >> >> Caused by: java.lang.NullPointerException at >> >>

fan out parallel-able operator sub-task beyond total slots number

2016-04-17 Thread Chen Qin
asks for 1s delay mapper"); } Thanks, Chen Qin