Re: Dynamic StreamingFileSink

2021-02-06 Thread Sidney Feiner
If anybody is interested, I've implemented a StreamingFileSink with dynamic paths: https://github.com/sidfeiner/DynamicPathFileSink Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature] From: Rafi Aroch

Dynamic StreamingFileSink

2020-12-26 Thread Sidney Feiner
be much appreciated :) Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature]

Re: Flink logs with extra pipeline property

2020-12-07 Thread Sidney Feiner
log4j2 twice. Once without using the java dynamic options and the second twice saying it required setting the java dynamic version so I'm a bit confused here 邏 I appreciate the support btw  Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp

Flink logs with extra pipeline property

2020-12-06 Thread Sidney Feiner
Hi, We're using Apache Flink 1.9.2 and we've started logging everything as JSON with log4j (standard log4j1 that comes with Flink). When I say JSON logging, I just mean that I've formatted in according to: log4j.appender.console.layout.ConversionPattern={"level": "%p", "ts": "%d{ISO8601}",

Re: Increase in parallelism has very bad impact on performance

2020-11-04 Thread Sidney Feiner
hese weird and unstable numbers of % in expected increase even though I'm not using a KeyedWindow anymore? Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature] From: Arvid Heise Sent: Tuesday, November 3,

Re: Increase in parallelism has very bad impact on performance

2020-11-03 Thread Sidney Feiner
in performance appear only for higher parallelism? Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature] From: Arvid Heise Sent: Tuesday, November 3, 2020 12:09 PM To: Yangze Guo Cc: Sidney Feiner ; user

Re: Increase in parallelism has very bad impact on performance

2020-11-03 Thread Sidney Feiner
Hey, I just ran a simple consumer that does nothing but consume event event (without aggregating) and every slot handles above 3K per second, and with parallelism set to 15, it succesffully handles 45K events per second Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype

Increase in parallelism has very bad impact on performance

2020-11-02 Thread Sidney Feiner
cause this dramatic decrease in performance? Extra info: * Flink version 1.9.2 * Flink High Availability mode * 3 task managers, 66 slots total Execution plan: [cid:04ba7b84-819d-45b6-98cd-446127a0255b] Any help would be much appreciated  Sidney Feiner / Data Platform Developer M

Re: Windows on SinkFunctions

2020-03-29 Thread Sidney Feiner
Thanks! What am I supposed to put in the apply/process function for the sink to be invoked on a List of items? Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature] From: tison Sent: Sunday, March 22, 2020

Windows on SinkFunctions

2020-03-22 Thread Sidney Feiner
in the pipeline? Because if I have multiple sinks that that only for one of them I need a Window, the second solution might be problematic. Thanks :) Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature]

KafkaConsumer keeps getting InstanceAlreadyExistsException

2020-03-15 Thread Sidney Feiner
t.id" on my consumer to a random UUID, making sure I don't have any duplicates but that didn't help either. Any idea what could be causing this? Thanks  Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature]

Re: Flink Metrics - PrometheusReporter

2020-01-22 Thread Sidney Feiner
Ok, I configured the PrometheusReporter's ports to be a range and now every TaskManager has it's own port where I can see it's metrics. Thank you very much! Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature

Flink Metrics - PrometheusReporter

2020-01-22 Thread Sidney Feiner
are configured but not being reported in high availability but are reported when I run the job locally though IntelliJ? Thanks  Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature]

Different jobName per Job when reporting Flink metrics to PushGateway

2019-12-17 Thread Sidney Feiner
've posted this on StackOverflow as well - here<https://stackoverflow.com/questions/59376693/different-jobname-per-job-when-reporting-flink-metrics-to-pushgateway> :) Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature]

Re: Fw: Metrics based on data filtered from DataStreamSource

2019-12-16 Thread Sidney Feiner
to filter out and then simply not handle them? Which means I must filter them in the task itself and I have no way of filtering them directly from the data source? Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature

Fw: Metrics based on data filtered from DataStreamSource

2019-12-15 Thread Sidney Feiner
Hey, I have a question about using metrics based on filtered data. Basically, I have handlers for many types of events I get from my data source (in my case, Kafka), and every handler has it's own filter function. That given handler also has a Counter, incrementing every time it filters out an