Hi Theo,
We had a very similar problem with one of our Spark streaming jobs. The best
solution was to create a custom source that keeps all external records in a
cache, periodically reads the external data and compares it to the cache. All
changed records were then broadcast to the task managers (rough sketch below). We tried to ...
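In Flink terms, the core of such a source could look roughly like this
(simplified and untested; ExternalRecord and fetchExternal() are placeholders
for the real type and the real external lookup):

import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext;

public class ExternalPollingSource extends RichSourceFunction<ExternalRecord> {

  private volatile boolean running = true;
  private final Map<String, ExternalRecord> cache = new HashMap<>();

  @Override
  public void run(SourceContext<ExternalRecord> ctx) throws Exception {
    while (running) {
      for (ExternalRecord r : fetchExternal()) {
        ExternalRecord previous = cache.put(r.getKey(), r);
        if (previous == null || !previous.equals(r)) {
          // emit only new or changed records
          synchronized (ctx.getCheckpointLock()) {
            ctx.collect(r);
          }
        }
      }
      Thread.sleep(60_000); // poll interval
    }
  }

  @Override
  public void cancel() {
    running = false;
  }

  // placeholder for the actual call to the external system
  private List<ExternalRecord> fetchExternal() {
    return Collections.emptyList();
  }
}

Downstream, the emitted changes can then be sent to every task with
.broadcast() on the resulting stream (or via broadcast state in newer Flink
versions).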
> A good way is to write your logs to separate files that can roll
> via log4j. If you want to access them in the Flink web UI,
> upgrade to version 1.11. Then you will find a "Log List" tab in the
> JobManager sidebar.
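For example, in the log4j 1.x properties file Flink ships under conf/, a
separate rolling file for your own packages could look roughly like this
(appender name and logger package are made up):

log4j.logger.com.mycompany.myjob=INFO, rolling
log4j.additivity.com.mycompany.myjob=false
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=${log.file}.business
log4j.appender.rolling.MaxFileSize=100MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n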
>
>
> Best,
> Yang
>
> ... to the
> stdout/stderr, the two files will increase
> over time.
>
> When you stop the Flink application, YARN will clean up all the jars and
> logs, so you will find that the disk space comes back.
>
>
> Best,
> Yang
>
> On Thu, Jul 30, 2020 at 10:00 PM, Maxim Parkachov wrote:
Hi everyone,
I have a strange issue with Flink logging. I use a pretty much standard log4j
config, which writes to standard output so that the logs show up in the Flink
GUI. Deployment is on YARN in per-job mode. I can see the logs in the UI, no
problem. On the servers where the Flink YARN containers are running, ...
Hi Till,
thank you for the very detailed answer, now it is absolutely clear.
Regards,
Maxim.
On Thu, Apr 30, 2020 at 7:19 PM Till Rohrmann wrote:
> Hi Maxim,
>
> I think your problem should be solvable with the CEP library:
>
> So what we are doing here is to define a pattern forward followed by
Hi everyone,
I need to implement the following functionality. There is a Kafka topic where
"forward" events arrive, and in the same topic there are "cancel" events. For
each "forward" event I need to wait 1 minute for a possible "cancel" event. I
can uniquely match both events. If a "cancel" event ...
f1cb7c47747b7/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java#L871
>
> Best
> Yun Tang
> --
> *From:* Maxim Parkachov
> *Sent:* Monday, April 6, 2020 23:16
> *To:* user@
Hi everyone,
I'm trying to test exactly-once functionality with my job under production
load. The job reads from Kafka, uses the Kafka timestamp as event time,
aggregates every minute and outputs to another Kafka topic. I use a checkpoint
interval of 10 seconds.
Everything seems to be working fine, ...
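A minimal version of such a setup could look like this (untested; assuming the
universal Kafka connector, with broker address and topic as placeholders):

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);

Properties props = new Properties();
props.setProperty("bootstrap.servers", "broker:9092");
// must exceed the maximum checkpoint duration plus expected downtime,
// and stay within the broker's transaction.max.timeout.ms
props.setProperty("transaction.timeout.ms", "900000");

FlinkKafkaProducer<String> sink = new FlinkKafkaProducer<>(
    "output-topic",
    (KafkaSerializationSchema<String>) (element, timestamp) ->
        new ProducerRecord<>("output-topic", element.getBytes(StandardCharsets.UTF_8)),
    props,
    FlinkKafkaProducer.Semantic.EXACTLY_ONCE);

Note that consumers of the output topic then have to read with
isolation.level=read_committed, otherwise they will also see records from
aborted transactions.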
getClassLoader().getResource("job.properties")
>
>
> Best,
> Yang
>
> On Mon, Feb 17, 2020 at 6:47 PM, Maxim Parkachov wrote:
>
>> Hi Yang,
>>
>> thanks, this explains why the classpath behavior changed, but now I
>> struggle to understand how ...
Yang Wang wrote:
> Hi Maxim Parkachov,
>
> The user files have also been shipped to the JobManager and TaskManager.
> However, they are not directly added to the classpath. Instead, the parent
> directory is added to the classpath. These changes are meant to make
> resource classloading ...
Hi everyone,
I'm trying to run my job with Flink 1.10 in YARN per-job cluster mode. In
previous versions, all files in the lib/ folder were automatically included
in the classpath. Now, with 1.10, I see that only *.jar files are included in
the classpath, but not "other" files. Is this a deliberate change or ...
wrote:
> No, since a) HA will never use classes from the user-jar and b) zookeeper
> is relocated to a different package (to avoid conflicts) and hence any
> replacement has to follow the same relocation convention.
>
> On 05/02/2020 15:38, Maxim Parkachov wrote:
>
> Hi Chesnay,
>
> On 05/02/2020 15:12, Maxim Parkachov wrote:
>
> Hi everyone,
>
> I have already written about an issue with Flink 1.9 on a secure MapR cluster
> and high availability. The issue was resolved with a custom-compiled Flink
> with the vendor MapR repositories enabled. The history could be ...
Hi everyone,
I have already written about an issue with Flink 1.9 on a secure MapR cluster
and high availability. The issue was resolved with a custom-compiled Flink
with the vendor MapR repositories enabled. The history can be found at
https://www.mail-archive.com/user@flink.apache.org/msg28235.html
Hi Vasily,
unfortunately, this is a known issue with Flink; you can read the discussion
under
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
.
At the moment I have seen three solutions for this issue:
1. You buffer the fact stream in local state before the broadcast ... (rough
sketch below)
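A rough sketch of option 1 (untested; Fact, Rule and Enriched are
placeholders, and `facts` and `rules` are the two input streams):

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

MapStateDescriptor<String, Rule> rulesDesc =
    new MapStateDescriptor<>("rules", String.class, Rule.class);

facts.keyBy(Fact::getKey)
    .connect(rules.broadcast(rulesDesc))
    .process(new KeyedBroadcastProcessFunction<String, Fact, Rule, Enriched>() {

      private transient ListState<Fact> buffer;

      @Override
      public void open(Configuration parameters) {
        buffer = getRuntimeContext().getListState(
            new ListStateDescriptor<>("buffered-facts", Fact.class));
      }

      @Override
      public void processElement(Fact fact, ReadOnlyContext ctx,
          Collector<Enriched> out) throws Exception {
        Rule rule = ctx.getBroadcastState(rulesDesc).get(fact.getKey());
        if (rule != null) {
          out.collect(new Enriched(fact, rule)); // placeholder constructor
        } else {
          buffer.add(fact); // rule not arrived yet: buffer locally
        }
      }

      @Override
      public void processBroadcastElement(Rule rule, Context ctx,
          Collector<Enriched> out) throws Exception {
        ctx.getBroadcastState(rulesDesc).put(rule.getKey(), rule);
        // keyed state cannot be read directly here; buffered facts are
        // typically flushed lazily from processElement or via
        // ctx.applyToKeyedState(...)
      }
    });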
Hi Stephan,
sorry for the late answer, I didn't have access to the cluster.
Here are the log and stacktrace.
Hope this helps,
Maxim.
-
2019-09-16 18:00:31,804 INFO
ZK?
>
> Best,
> Stephan
>
>
> On Tue, Aug 27, 2019 at 11:03 AM Maxim Parkachov
> wrote:
>
>> Hi everyone,
>>
>> I'm testing release 1.9 on a secure MapR cluster. I took the Flink binaries
>> from the download page and am trying to start a YARN session cluster. All MapR ...
Hi everyone,
I'm testing release 1.9 on a secure MapR cluster. I took the Flink binaries
from the download page and am trying to start a YARN session cluster. All
MapR-specific libraries and configs are added according to the documentation.
When I start the yarn-session without high availability, it uses zookeeper
Hi Vasily,
as far as I know, by default the console consumer reads uncommitted messages.
Try setting isolation.level to read_committed in the console-consumer
properties.
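For example (broker and topic are placeholders):

kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic output-topic \
  --consumer-property isolation.level=read_committed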
Hope this helps,
Maxim.
On Fri, Aug 2, 2019 at 12:42 PM Vasily Melnik <
vasily.mel...@glowbyteconsulting.com> wrote:
> Hi, Eduardo.
> Maybe
Hi Vishwas,
it took me some time to figure out as well. If you have your properties file
under lib/, the following will work:
val kafkaPropertiesInputStream =
getClass.getClassLoader.getResourceAsStream("lib/config/kafka.properties")
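and then, for example, load it into a java.util.Properties (untested):

val kafkaProps = new java.util.Properties()
kafkaProps.load(kafkaPropertiesInputStream)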
Hope this helps,
Maxim.
On Wed, Jul 17, 2019 at 7:23 PM Vishwas
link/flink-docs-release-1.8/ops/jobmanager_high_availability.html#yarn-cluster-high-availability
> .
>
> Best,
> Haibo
>
> At 2019-07-17 23:53:15, "Maxim Parkachov" wrote:
>
> Hi,
>
> I'm looking for advice on how to run flink streaming jobs on Yarn cluste
Hi,
I'm looking for advice on how to run Flink streaming jobs on a YARN cluster
in a production environment. In a test environment I tried both approaches
with HA mode, namely a YARN session with multiple jobs vs. a cluster per job;
both seem to work for my cases, with a slight preference for the YARN session
... https://github.com/ing-bank/flink-deployer. The deployer will
> allow you to deploy or upgrade your jobs. All you need to do is integrate
> it into your CI/CD.
>
> Kind regards
>
> Marc
> On 16 Jul 2019, 02:46 +0200, Maxim Parkachov ,
> wrote:
>
> Hi,
>
Hi,
I'm trying to bring my first stateful streaming Flink job to production and
am having trouble understanding how to integrate it with a CI/CD pipeline. I
can cancel the job with a savepoint, but in order to start the new version of
the application I need to specify the savepoint path manually?
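Today the manual flow looks roughly like this (directory, job id, class and
jar names are just examples):

flink cancel -s hdfs:///flink/savepoints <jobId>
# the CLI prints the path of the created savepoint
flink run -d -s hdfs:///flink/savepoints/savepoint-xxxx \
  -c com.example.MyJob my-job.jar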
So, basically my
... snapshot and restarts the streaming job, so no magic here, but it is
nicely automated.
Regards,
Maxim.
On Tue, Jan 29, 2019 at 5:23 AM Maxim Parkachov
wrote:
> Hi,
>
> I had the impression that in order to change parallelism, one needs to stop
> the Flink streaming job and re-start it with ...
Hi,
I had the impression that in order to change parallelism, one needs to stop
the Flink streaming job and re-start it with new settings.
According to
https://docs.aws.amazon.com/kinesisanalytics/latest/java/how-scaling.html
auto-scaling works out of the box. Could someone with experience running
Flink ...
Hi,
> This is certainly possible. What you can do is use a
> BroadcastProcessFunction where you receive the rule code on the broadcast
> side.
>
Yes, this part works, no problem.
> You probably cannot send newly compiled objects this way but what you can
> do is either send a reference to some
Hi everyone,
I have a job with an event stream and a control stream delivering rules for
event transformation. The rules are broadcast and used in a flatMap-like
CoProcessFunction. The rules are defined in a custom JSON format. The number
of rules and their complexity rise significantly with every new feature.
What I ...
Hi everyone,
I'm writing a streaming job which needs to query Cassandra multiple times for
each event, around 20. I would like to use Async I/O for that, but am not
sure which option to choose (rough sketch of option 1 after the list):
1. Implement one AsyncFunction with 20 queries inside
2. Implement 20 AsyncFunctions, each with 1 query
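For illustration, option 1 could look roughly like this (untested; assuming
the DataStax 4.x driver; Event, Enriched, the query and `events` are
placeholders):

import java.util.Collections;
import java.util.concurrent.TimeUnit;
import com.datastax.oss.driver.api.core.CqlSession;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class CassandraEnricher extends RichAsyncFunction<Event, Enriched> {

  private transient CqlSession session;

  @Override
  public void open(Configuration parameters) {
    session = CqlSession.builder().build(); // contact points from config
  }

  @Override
  public void close() {
    session.close();
  }

  @Override
  public void asyncInvoke(Event event, ResultFuture<Enriched> resultFuture) {
    // first of the ~20 lookups; the remaining queries would be chained or
    // combined, e.g. with CompletableFuture.allOf(...)
    session.executeAsync("SELECT ... WHERE id = " + event.getId()) // placeholder
        .whenComplete((rs, err) -> {
          if (err != null) {
            resultFuture.completeExceptionally(err);
          } else {
            resultFuture.complete(
                Collections.singleton(new Enriched(event, rs))); // placeholder
          }
        });
  }
}

DataStream<Enriched> enriched = AsyncDataStream.unorderedWait(
    events, new CassandraEnricher(), 1, TimeUnit.SECONDS, 100);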
Hi Ron,
I’m joining two streams - one is a “decoration” stream that we have in a
> compacted Kafka topic, produced using a view on a MySQL table AND using
> Kafka Connect; the other is the “event data” we want to decorate, coming in
> over time via Kafka. These streams are keyed the same way -
im.
>
> Best,
> Xingcan
>
>
> On Fri, Nov 3, 2017 at 5:54 AM, Maxim Parkachov <lazy.gop...@gmail.com>
> wrote:
>
>> Hi Flink users,
>>
>> I'm struggling with a basic concept and would appreciate some help. I
>> have 2 input streams, ...
Hi Flink users,
I'm struggling with a basic concept and would appreciate some help. I have
two input streams: one is a fast event stream and one is a slowly changing
dimension. They have the same key, and I use a CoProcessFunction to store the
slow data in state and enrich the fast data from this state.
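Roughly like this (untested; Event, Dimension and Enriched are placeholders):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class EnrichFunction extends CoProcessFunction<Event, Dimension, Enriched> {

  private transient ValueState<Dimension> dimState;

  @Override
  public void open(Configuration parameters) {
    dimState = getRuntimeContext().getState(
        new ValueStateDescriptor<>("dimension", Dimension.class));
  }

  @Override
  public void processElement1(Event event, Context ctx, Collector<Enriched> out)
      throws Exception {
    Dimension dim = dimState.value();
    if (dim != null) {
      out.collect(new Enriched(event, dim)); // placeholder constructor
    }
    // else: dimension not seen yet - drop, buffer, or emit unenriched
  }

  @Override
  public void processElement2(Dimension dim, Context ctx, Collector<Enriched> out)
      throws Exception {
    dimState.update(dim); // the slow stream only updates keyed state
  }
}

// both streams keyed the same way:
fastEvents.keyBy(Event::getKey)
    .connect(dimensions.keyBy(Dimension::getKey))
    .process(new EnrichFunction());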