Re: RocksDB CPU resource usage

2021-06-17 Thread Padarn Wilson
- > *From:* Robert Metzger > *Sent:* Thursday, June 17, 2021 14:11 > *To:* Padarn Wilson > *Cc:* JING ZHANG ; user > *Subject:* Re: RocksDB CPU resource usage > > If you are able to execute your job locally as well (with enough data), > y

Re: Re: Re: Upgrade job topology in checkpoint

2021-06-16 Thread Padarn Wilson
ted to > the state ? I think we may first check if the operator logic is right and > whether > the precedent tasks have indeed emitted records to the new sink. > > Best, > Yun > > --Original Mail ------ > *Sender:*Padarn Wilson > *Send Date:*W

Re: RocksDB CPU resource usage

2021-06-16 Thread Padarn Wilson
To find out the reason, we need more profile on CPU cost, such as Flame >> Graphs. BTW, starting with Flink 1.13, Flame Graphs are natively supported >> in Flink[1]. >> >> [1] >> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/debugging/flame_graphs/ &
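The reply above points at native Flame Graph support in Flink 1.13. A minimal sketch of enabling it, assuming the standard `flink-conf.yaml` and bearing in mind that the sampling adds overhead:

```yaml
# flink-conf.yaml — enable the built-in Flame Graphs (Flink 1.13+).
# Off by default because the sampling has a measurable CPU cost.
rest.flamegraph.enabled: true
```

Once enabled, the Flame Graph is rendered per operator in the web UI's job view.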

Re: Re: Upgrade job topology in checkpoint

2021-06-15 Thread Padarn Wilson
Padarn On Wed, Jun 16, 2021 at 10:59 AM Yun Gao wrote: > Hi Padarn, > > Sorry I might not fully got the mean of new topology was ignored. > Do you mean the topology is not the same as expected ? > > Best, > Yun > > > --Original Mail -----

Checkpoint loading failure

2021-06-15 Thread Padarn Wilson
Hi all, We have a job that has a medium size state (around 4GB) and after adding a new part of the job graph (which should not impact the job too much) we found that every single checkpoint restore has the following error: Caused by: java.io.IOException: s3a://: Stream is closed! > at > org.apach

RocksDB CPU resource usage

2021-06-15 Thread Padarn Wilson
Hi all, We have a job that we just enabled rocksdb on (instead of file backend), and see that the CPU usage is almost 3x greater (we had to increase taskmanagers 3x to get it to run). I don't really understand this, is there something we can look at to understand why CPU use is so high? Our sta

Upgrade job topology in checkpoint

2021-06-11 Thread Padarn Wilson
Hi all, I'm looking for some clarity about changing job topology as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/upgrading/#application-topology My question is simple: Does this only apply to savepoints? Or can it also work for checkpoints? (also why if no

Re: Setting max parallelism via properties

2021-02-28 Thread Padarn Wilson
dy cover the bridge work, so I think it should > just work in your case. > > > pipeline-max-parallelism: > https://ci.apache.org/projects/flink/flink-docs-stable/deployment/config.html#pipeline-max-parallelism > > > Best, > Kezhu Wang > > On February 28, 2021 at 16
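The `pipeline-max-parallelism` option cited in the reply can be sketched as a plain `flink-conf.yaml` entry, with no `setMaxParallelism` call needed in code (the value 128 is an arbitrary example):

```yaml
# flink-conf.yaml — set max parallelism cluster-wide instead of calling
# ExecutionConfig#setMaxParallelism in code. 128 is an example value.
pipeline.max-parallelism: 128
```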

Setting max parallelism via properties

2021-02-28 Thread Padarn Wilson
Hi all, Sorry for the basic question, but is it possible to set max parallelism using the flink conf file, rather than explicitly in code: https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html#setting-the-maximum-parallelism Need this for a PR I am working on for the flink ope

Re: Additional options to S3 Filesystem: Interest?

2020-10-13 Thread Padarn Wilson
e it. > > [1] https://flink.apache.org/contributing/contribute-code.html > > On Tue, Oct 13, 2020 at 3:37 AM Padarn Wilson wrote: > >> Thanks for the feedback. I've created a JIRA here >> https://issues.apache.org/jira/browse/FLINK-19589. >> >> @Dan:

Re: Additional options to S3 Filesystem: Interest?

2020-10-12 Thread Padarn Wilson
eriod would certainly be welcome. (I could probably figure out a way to do > this from the S3 admin UI too though) > > On Sat, Oct 10, 2020 at 10:45 PM Padarn Wilson wrote: > >> Hi Flink Users, >> >> We need to expose some additional options for the s3 hadoop filesyste

Additional options to S3 Filesystem: Interest?

2020-10-10 Thread Padarn Wilson
Hi Flink Users, We need to expose some additional options for the s3 hadoop filesystem: Specifically, we want to set object tagging and lifecycle. This would be a fairly easy change and we initially thought to create a new Filesystem with very minor changes to allow this. However then I wondered,

Re: [External] Measuring Kafka consumer lag

2020-06-15 Thread Padarn Wilson
't find a > video. I'm sure there were users talking about best practices for > monitoring Flink in the past ... > > Best, > Robert > > On Sun, Jun 14, 2020 at 5:47 AM Padarn Wilson > wrote: > >> Hi all, >> >> I'm looking for some advice on

[External] Measuring Kafka consumer lag

2020-06-13 Thread Padarn Wilson
Hi all, I'm looking for some advice on how other people measure consumer lag for Kafka consumers. Recently we had an application that looked like it was performing identically to before, but all of a sudden the throughput of the job decreased dramatically. However it was not clear from our Flink m

Re: Implicit Flink Context Documentation

2020-03-16 Thread Padarn Wilson
g/confluence/display/FLINK/Flink+Improvement+Proposals > > On 13 Mar 2020, at 16:14, Padarn Wilson wrote: > > Thanks Piotr, > > Conceptually I understand (and use) the key'ed state quite a lot, but the > implementation details are what I was looking for. > > It l

Re: Implicit Flink Context Documentation

2020-03-13 Thread Padarn Wilson
e of `CountWindowAverage#sum` field. > For details you could take a look at > `org.apache.flink.streaming.api.operators.AbstractStreamOperator#setKeyContextElement1` > method and how it’s being used/implemented. > > I hope that helps. > > Piotrek > > On 13 Mar 2020, at 08:20, Padarn Wilson wrote: > > Hi

Implicit Flink Context Documentation

2020-03-13 Thread Padarn Wilson
Hi Users, I am trying to understand the details of how some aspects of Flink work. While understanding `keyed state` I kept coming up against a claim that `there is a specific key implicitly in context` I would like to understand how this works, which I'm guessing means understanding the details

Re: End of Window Marker

2019-09-02 Thread Padarn Wilson
between substreams should never grow that much >> anyway. >> >> This approach should be particularly useful when the number of partitions >> or keying mechanism is different between the input and output topics. >> >> Hopefully that doesn't sound like a terrible

Re: End of Window Marker

2019-08-27 Thread Padarn Wilson
ssage to each of > its assigned partitions. > > Hope this helps, > Fabian > > > Am Sa., 17. Aug. 2019 um 05:48 Uhr schrieb Padarn Wilson >: > >> Hi Fabian, thanks for your input >> >> Exactly. Actually my first instinct was to see if it was possible to >

Re: End of Window Marker

2019-08-16 Thread Padarn Wilson
options to let the consuming app know about the > progress. > You can either > > The ProcessFunction could count per window end timestamp how many records > passed and forward that information via a side output. > You could then > > > Essentially, you'd like to publis

End of Window Marker

2019-08-12 Thread Padarn Wilson
Hello Users, I have a question that is perhaps not best solved within Flink: It has to do with notifying a downstream application that a Flink window has completed. The (simplified) scenario is this: - We have a Flink job that consumes from Kafka, does some preprocessing, and then has a sliding w

Re: [External] Flink 1.7.1 on EMR metrics

2019-06-01 Thread Padarn Wilson
"STATSD SENDING: {} {}", name, value); > > > - > > > > > On Sat, Jun 1, 2019 at 7:30 PM Padarn Wilson wrote: > >> Thanks both: Using the inbuilt Slf4j reporter is a great idea, I >> will do this. >> >> @Peter.Groesbeck - appreciate the

Re: [External] Flink 1.7.1 on EMR metrics

2019-06-01 Thread Padarn Wilson
porter, you could >> then compare the configurations. >> >> Best >> Yun Tang >> >> [1] >> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#slf4j-orgapacheflinkmetricsslf4jslf4jreporter >> >> -
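The Slf4j reporter suggested in the reply takes two `flink-conf.yaml` lines; the 60-second interval here is an assumption, not from the thread:

```yaml
# flink-conf.yaml — log all metrics, so you can verify metrics are being
# produced at all before debugging a specific reporter's configuration.
metrics.reporter.slf4j.class: org.apache.flink.metrics.slf4j.Slf4jReporter
metrics.reporter.slf4j.interval: 60 SECONDS
```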

[External] Flink 1.7.1 on EMR metrics

2019-05-30 Thread Padarn Wilson
Hello all, I am trying to run Flink 1.7.1 on EMR and having some trouble with metric reporting. I was using the DataDogHttpReporter, but have also tried the StatsDReporter, but with both was seeing no metrics being collected. To debug this I implemented my own reporter (based on StatsDReporter)
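For reference, a minimal DatadogHttpReporter configuration as given in the Flink metrics documentation (the API key is a placeholder):

```yaml
# flink-conf.yaml — DatadogHttpReporter setup; <DD_API_KEY> is a placeholder.
metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
metrics.reporter.dghttp.apikey: <DD_API_KEY>
```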

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread Padarn Wilson
ist-arch...@davidhaglund.se> wrote: > Padarn Wilson-2 wrote > > I am running Flink 1.7.2 on Kubernetes in a setup with task manager and > job > > manager separate. > > > > I'm having trouble seeing the metrics from my Flink job in the UI > > dashboard. Actually

Re: Flink metrics missing from UI 1.7.2

2019-03-23 Thread Padarn Wilson
> exported to datadog. You will find them at > > http://localhost:8081/jobs//metrics > > with IDs like > > > latency.source_id.bc764cd8ddf7a0cff126f51c16239658.operator_id.ea632d67b7d595e5b851708ae9ad79d6.operator_subtask_index.0.latency_p90 > > On Sat, Mar 23

Flink metrics missing from UI 1.7.2

2019-03-23 Thread Padarn Wilson
Hi User, I am running Flink 1.7.2 on Kubernetes in a setup with task manager and job manager separate. I'm having trouble seeing the metrics from my Flink job in the UI dashboard. Actually I'm using the Datadog reporter to expose most of my metrics, but latency tracking does not seem to be exporte
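As the replies in this thread note, latency metrics are only exported when latency tracking is switched on, and since Flink 1.7 it is disabled by default. A sketch, with the interval value an arbitrary example:

```yaml
# flink-conf.yaml — emit latency markers every 30s (0, the default, disables them).
metrics.latency.interval: 30000
```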

Re: Setting source vs sink vs window parallelism with data increase

2019-03-23 Thread Padarn Wilson
Well.. it turned out I was registering millions of timers by accident, which was why garbage collection was blowing up. Oops. Thanks for your help again. On Wed, Mar 6, 2019 at 9:44 PM Padarn Wilson wrote: > Thanks a lot for your suggestion. I’ll dig into it and update for the > mailing l

Re: Setting source vs sink vs window parallelism with data increase

2019-03-06 Thread Padarn Wilson
hread, can cause > performance bottlenecks in other unrelated places. > > If that’s not a GC issue, you can use Flink metrics (like number of > buffered input/output data) to find Task that’s causing a bottleneck. Then > you can use CPU profiler to analyse why is that happening. > >

Re: Setting source vs sink vs window parallelism with data increase

2019-03-01 Thread Padarn Wilson
cause this to happen with the JVM old generation metric, which appears to be the leading metric before a problem is noticed. Other metrics such as network buffers also show that at the checkpoint time things start to go haywire and the situation never recovers. Thanks On Thu, Feb 28, 2019 at 5:50

Re: Flink Standalone cluster - production settings

2019-02-28 Thread Padarn Wilson
Are you able to give some detail on the cases in which you might be better off setting higher (or lower) parallelism for an operator? On Thu, Feb 21, 2019 at 9:54 PM Hung wrote: > / Each job has 3 asynch operators > with Executors with thread counts of 20,20,100/ > > Flink handles parallelisms for y

Setting source vs sink vs window parallelism with data increase

2019-02-28 Thread Padarn Wilson
Hi all, I'm trying to process many records, and I have an expensive operation I'm trying to optimize. Simplified it is something like: Data: (key1, count, time) Source -> Map(x -> (x, newKeyList(x.key1)) -> Flatmap(x -> x._2.map(y => (y, x._1.count, x._1.time)) -> Keyby(_

Re: Collapsing watermarks after keyby

2019-02-28 Thread Padarn Wilson
verse.head) }) sessionsStream.print() countStream.print() streamEnv.execute() } } On Tue, Feb 26, 2019 at 10:49 PM Padarn Wilson wrote: > Okay. I think I still must misunderstand something here. I will work on > building a unit test around this, hopefully this cl

Re: Collapsing watermarks after keyby

2019-02-26 Thread Padarn Wilson
;s watermarks > downstream. In case of a keyBy this means that the watermark is sent to all > downstream consumers. > > Cheers, > Till > > On Tue, Feb 26, 2019 at 1:47 PM Padarn Wilson wrote: > >> Just to add: by printing intermediate results I see that I definitely >

Re: Collapsing watermarks after keyby

2019-02-26 Thread Padarn Wilson
, Padarn Wilson wrote: > Hi Till, > > I will work on an example, but I’m a little confused by how keyBy and > watermarks work in this case. This documentation says ( > > https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_time.html#watermarks-in-parallel-streams > ):

Re: Collapsing watermarks after keyby

2019-02-26 Thread Padarn Wilson
a working example > program with example input could be helpful for further debugging. > > Cheers, > Till > > On Tue, Feb 26, 2019 at 2:53 AM Padarn Wilson wrote: > >> Hi Flink Mailing List, >> >> Long story short - I want to somehow collapse watermarks at a

Collapsing watermarks after keyby

2019-02-25 Thread Padarn Wilson
Hi Flink Mailing List, Long story short - I want to somehow collapse watermarks at an operator across keys, so that keys with dragging watermarks do not drag behind. Details below: --- I have an application in which I want to perform the following sequence of steps: Assume my data is made up of dat

Re: StreamingFileSink causing AmazonS3Exception

2019-02-21 Thread Padarn Wilson
Thanks Kostas! On Mon, Feb 18, 2019 at 5:10 PM Kostas Kloudas wrote: > Hi Padarn, > > This is the jira issue: https://issues.apache.org/jira/browse/FLINK-11187 > and the fix, as you can see, was first included in version 1.7.2. > > Cheers, > Kostas > > On Mon, Feb

Re: StreamingFileSink causing AmazonS3Exception

2019-02-17 Thread Padarn Wilson
Hi Addison, Kostas, Steffan, I am also encountering this exact issue. I cannot find a JIRA ticket on this, is there some planned work on implementing a fix? @Addison - Did you manage to find a fix that you could apply without modifying the Flink codebase? If possible it would be better not to patch

Re: Clarification around versioning and flink's state serialization

2018-12-21 Thread Padarn Wilson
ll traveling because of the recent Flink Forward event; > will send you a copy of the latest slides I presented about the topic once > I get back. > > Cheers, > Gordon > > On Fri, Dec 21, 2018, 10:42 PM Padarn Wilson wrote: > >> Yes that helps a lot! >> >> Just

Re: Clarification around versioning and flink's state serialization

2018-12-21 Thread Padarn Wilson
our question? >> >> By the way, the slides your are looking at I would consider quite >> outdated for the topic, since Flink 1.7 was released with much smoother >> support for state schema evolution. >> An updated version of the slides is not yet publicly availab

Clarification around versioning and flink's state serialization

2018-12-21 Thread Padarn Wilson
Hi all, I am trying to understand the situation with state serialization in flink. I'm looking at a number of sources, but slide 35 from here crystallizes my confusion: https://www.slideshare.net/FlinkForward/flink-forward-berlin-2017-tzuli-gordon-tai-managing-state-in-apache-flink So, I understa

Re: Checkpointing when reading from files?

2018-05-27 Thread Padarn Wilson
I'm a bit confused about this too actually. I think the above would work as a solution if you want to continuously monitor a directory, but for a "PROCESS_ONCE" readFile source I don't think you will get a checkpoint emitted indicating the end of the stream. My understanding of this is that there

AvroInputFormat Serialisation Issue

2018-05-14 Thread Padarn Wilson
Hi all - sorry this seems like a silly question, but I can't figure it out. I'm using an AvroInputFormat in order to read an Avro file like this: val textInputFormat = new AvroInputFormat[GenericRecord](infile, classOf[GenericRecord]) val lines = env.readFile(textInputFormat, path) This works fi
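The snippet above is cut off, but the same read can be sketched self-contained, mapping each `GenericRecord` to a `String` immediately so that nothing downstream has to serialize the Avro record itself — a commonly suggested workaround, not necessarily this thread's resolution; the input path is a placeholder, and a Flink runtime plus `flink-avro` on the classpath are assumed:

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.flink.core.fs.Path
import org.apache.flink.formats.avro.AvroInputFormat
import org.apache.flink.streaming.api.scala._

object AvroReadSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val inPath = new Path("file:///tmp/example.avro") // placeholder path
    val format = new AvroInputFormat[GenericRecord](inPath, classOf[GenericRecord])
    // Convert to String right away: GenericRecord drags its Schema along,
    // a frequent source of serialization errors further down the pipeline.
    env.readFile(format, inPath.toString).map(_.toString).print()
    env.execute("avro-read-sketch")
  }
}
```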

Batch writing from Flink streaming job

2018-05-13 Thread Padarn Wilson
Hi all, I am writing some some jobs intended to run using the DataStream API using a Kafka source. However we also have a lot of data in Avro archives (of the same Kafka source). I would like to be able to run the processing code over parts of the archive so I can generate some "example output".

Window over events defined by a time range

2018-05-02 Thread Padarn Wilson
Hi all, I'm trying to figure out what the "Flink" way of achieving what I'd like to is. Imagine I have the following three events in my stream event1: {"start_time": 0, "end_time": 1, "value": "a"} event2: {"start_time": 0, "end_time": 2, "value": "b"} event3: {"start_time": 1, "end_time": 2, "v