I can describe a use that has been successful for me. We have a Flink workflow
that calculates reports over many days; it is currently set up to recompute the
last 10 days or so by recovering this "deep history" from our databases and then
switching over to the live flow to process all subsequent events.
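For what it's worth, the switch-over can also be expressed with Flink's
HybridSource (available from 1.14). A minimal sketch, assuming file-based
history and a Kafka live topic; the path, topic and broker names are
placeholders:

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.source.hybrid.HybridSource;
    import org.apache.flink.connector.file.src.FileSource;
    import org.apache.flink.connector.file.src.reader.TextLineFormat;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.core.fs.Path;

    public class HistoricThenLive {
        public static void main(String[] args) {
            // Bounded "deep history" source, e.g. files exported from the database.
            FileSource<String> history = FileSource
                    .forRecordStreamFormat(new TextLineFormat(), new Path("file:///data/history"))
                    .build();

            // Unbounded live source for everything after the history.
            KafkaSource<String> live = KafkaSource.<String>builder()
                    .setBootstrapServers("broker:9092")
                    .setTopics("live-events")
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            // Reads all of the history first, then switches to the live flow:
            HybridSource<String> source = HybridSource.builder(history).addSource(live).build();
            // env.fromSource(source, WatermarkStrategy.noWatermarks(), "history+live");
        }
    }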
Hello,
I know we can’t set a timer in the processBroadcastElement() of the
KeyedBroadcastProcessFunction as there is no key.
However, there is a context.applyToKeyedState() method which allows us to
iterate over the keyed state in the scope of each key. So it would be possible
to add access to the TimerService
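For reference, a minimal sketch of what already works today (the Event and Rule
types and the 60-second interval are made up): timers are only reachable from
processElement(), while processBroadcastElement() can still visit every key's
state.

    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
    import org.apache.flink.util.Collector;

    public class RuleFunction
            extends KeyedBroadcastProcessFunction<String, RuleFunction.Event, RuleFunction.Rule, String> {

        // Placeholder types for this sketch.
        public static class Event { public String key; public long ts; }
        public static class Rule { public long cutoffTs; }

        private final ValueStateDescriptor<Long> lastSeenDesc =
                new ValueStateDescriptor<>("lastSeen", Long.class);

        @Override
        public void processElement(Event value, ReadOnlyContext ctx, Collector<String> out) throws Exception {
            // In the scope of a key, so state and timers are both available.
            getRuntimeContext().getState(lastSeenDesc).update(value.ts);
            ctx.timerService().registerProcessingTimeTimer(
                    ctx.timerService().currentProcessingTime() + 60_000);
        }

        @Override
        public void processBroadcastElement(Rule rule, Context ctx, Collector<String> out) throws Exception {
            // No key here, hence no timerService(), but we can visit every key's state.
            ctx.applyToKeyedState(lastSeenDesc, (key, state) -> {
                Long ts = state.value();
                if (ts != null && ts < rule.cutoffTs) {
                    state.clear(); // e.g. drop state for keys idle past the rule's cutoff
                }
            });
        }
    }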
in order to work
- as your jobmanager cannot access the checkpoint files, it also cannot clean up
those files.
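A hedged sketch of a setup that avoids this; the path is a placeholder, and any
location both the jobmanager and the taskmanagers can reach (an NFS mount, S3,
HDFS) will do:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointSetup {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(60_000);
            // Must not be a taskmanager-local disk: the jobmanager is what deletes old
            // chk-x directories, so it needs read/delete access to this location too.
            env.getCheckpointConfig().setCheckpointStorage("file:///mnt/shared/flink-checkpoints");
            // (equivalently, set state.checkpoints.dir in flink-conf.yaml)
        }
    }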
Hope that helps
Regards
Thias
From: James Sandys-Lumsdaine
Sent: Tuesday, May 17, 2022 3:55 PM
To: Hangxiang Yu ; user@flink.apache.org
Subject: Re: Checkpoint directories not cleared as TaskManagers run
___
From: Hangxiang Yu
Sent: 17 May 2022 14:38
To: James Sandys-Lumsdaine ; user@flink.apache.org
Subject: Re: Checkpoint directories not cleared as TaskManagers run
Hi, James.
I may not fully understand what the problem is.
All checkpoints will be stored in the location you set.
IIUC, TMs will write some
s going on a bit better, it seems pointless for me to have file
checkpoints that can't be read by the jobmanager for failover.
If anyone can clarify/correct me I would appreciate it.
James.
____
From: James Sandys-Lumsdaine
Sent: 16 May 2022 18:52
To: user@flink.apache.org
Hello,
I'm seeing my Flink deployment's checkpoint storage directories build up and
never clear down.
When I run from my own IDE, I see only the latest "chk-x" directory under
the job id folder. So the first checkpoint is "chk-1", which is then replaced
with "chk-2", etc.
However, when
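To illustrate, a minimal version of the setup I am describing (the interval is
illustrative):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointRetention {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Each completed checkpoint creates a chk-<n> directory under the job id folder.
            env.enableCheckpointing(10_000);
            // With the default state.checkpoints.num-retained of 1, the jobmanager deletes
            // chk-(n-1) once chk-n completes, provided it can actually reach those files.
        }
    }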
Is anyone able to comment on the below? My worry is this class isn’t well
supported, so I may need to find an alternative way to bulk copy data into SQL
Server, e.g. use a simple file sink and then have some process bulk copy the files.
From: Sandys-Lumsdaine, James
Sent
Hello,
We are using the GenericWriteAheadSink to buffer up values and then send them to
a SQL Server database with a fast bulk copy upload. However, when I watch my
process running it seems to spend a huge amount of time iterating over the
Iterable provided to the sendValues() method. It takes such a long time
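For reference, a cut-down sketch of the sink; the Trade type, connection string
and SQL are placeholders, and real code would reuse a pooled connection:

    import org.apache.flink.api.common.typeutils.TypeSerializer;
    import org.apache.flink.streaming.runtime.operators.CheckpointCommitter;
    import org.apache.flink.streaming.runtime.operators.GenericWriteAheadSink;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class BulkCopySink extends GenericWriteAheadSink<BulkCopySink.Trade> {

        public static class Trade { public long id; public double price; } // placeholder type

        public BulkCopySink(CheckpointCommitter committer, TypeSerializer<Trade> serializer) throws Exception {
            super(committer, serializer, "bulk-copy-sink"); // the job id string is illustrative
        }

        @Override
        protected boolean sendValues(Iterable<Trade> values, long checkpointId, long timestamp) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:sqlserver://host;databaseName=db");
                 PreparedStatement ps = conn.prepareStatement("INSERT INTO trades (id, price) VALUES (?, ?)")) {
                for (Trade t : values) { // iterating this Iterable is where we see the time go
                    ps.setLong(1, t.id);
                    ps.setDouble(2, t.price);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            return true; // return true only once the batch is durably written
        }
    }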
Hi all,
I have a 1.14 Flink streaming workflow with many stateful functions that has a
FsStateBackend and checkpointing enabled, although I haven't set a location for
the checkpointed state.
I've really struggled to understand how I can stop my Flink job and restart it
and ensure it carries on from where it left off
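For context, a hedged sketch of what I understand the 1.14-style setup to be,
with the restart command as a comment (paths are examples):

    import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
    import org.apache.flink.streaming.api.environment.CheckpointConfig;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RestartableJob {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(30_000);
            // 1.14 replacement for FsStateBackend: heap state + filesystem checkpoint storage.
            env.setStateBackend(new HashMapStateBackend());
            env.getCheckpointConfig().setCheckpointStorage("file:///mnt/shared/flink-checkpoints");
            // Keep the last checkpoint around when the job is cancelled...
            env.getCheckpointConfig().enableExternalizedCheckpoints(
                    CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
            // ...then resume from it (path is an example):
            //   bin/flink run -s file:///mnt/shared/flink-checkpoints/<job-id>/chk-42 my-job.jar
        }
    }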
would suggest following the example at
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/testing/#testing-flink-jobs
and using a job that only contains the source (+ something to either extract
the results or verify it within the job).
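Something along these lines, where MyLegacySource stands in for the source
class under test:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.util.CloseableIterator;

    import java.util.ArrayList;
    import java.util.List;

    public class LegacySourceTest {
        // e.g. run as a JUnit test method
        public void emitsExpectedRecords() throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(1);
            // The job contains only the source under test; executeAndCollect()
            // pulls its output back into the test process.
            List<String> results = new ArrayList<>();
            try (CloseableIterator<String> it =
                         env.addSource(new MyLegacySource()).executeAndCollect()) {
                it.forEachRemaining(results::add);
            }
            // assert on `results` here
        }
    }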
On 14/02/2022 18:06, James Sandys-Lumsdaine wrote:
Hi all,
I've been using the test harness classes to unit test my stateful one- and
two-input stream functions. But I also have some stateful legacy Source classes I would
like to unit test and can't find any documentation or example for that - is
this possible?
Thanks,
James.
is feature is indeed available and working in 1.14.0, and
how to correctly enable it?
Thanks,
James.
From: Austin Cawley-Edwards
Sent: 04 November 2021 18:46
To: James Sandys-Lumsdaine
Cc: user@flink.apache.org
Subject: Re: GenericWriteAheadSink, declined checkp
Hello,
I have a Flink workflow where I need to upload the output data into a legacy
SQL Server database and so I have read the section in the Flink book about data
sinks and utilizing the GenericWriteAheadSink base class. I am currently using
Flink 1.12.3, although we plan to upgrade to 1.14 shortly.
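As I understand it, GenericWriteAheadSink is a stream operator rather than a
SinkFunction, so it is attached with transform(); roughly like this, with the
stream, committer and sink class as placeholders:

    // `trades` is the DataStream to persist; BulkCopySink extends GenericWriteAheadSink<Trade>.
    trades.transform(
            "bulk copy write-ahead sink",
            trades.getType(),
            new BulkCopySink(committer, trades.getType().createSerializer(env.getConfig())));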
Ah thanks for the feedback. I can work around for now but will upgrade as soon
as I can to the latest version.
Thanks very much,
James.
From: Piotr Nowojski
Sent: 08 October 2021 13:17
To: James Sandys-Lumsdaine
Cc: user@flink.apache.org
Subject: Re: Empty
Hi everyone,
I'm putting together a Flink workflow that needs to merge historic data from a
custom JDBC source with a Kafka flow (for the realtime data). I have
successfully written the custom JDBC source that emits a watermark for the last
event time after all the DB data has been emitted but
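For concreteness, the relevant part of the source looks roughly like this, with
Trade and queryHistoricTrades() standing in for my actual schema and JDBC code:

    import org.apache.flink.streaming.api.functions.source.SourceFunction;
    import org.apache.flink.streaming.api.watermark.Watermark;

    import java.util.List;

    public class HistoricJdbcSource implements SourceFunction<HistoricJdbcSource.Trade> {

        public static class Trade { public long ts; } // placeholder type

        @Override
        public void run(SourceContext<Trade> ctx) throws Exception {
            long maxTs = Long.MIN_VALUE;
            for (Trade t : queryHistoricTrades()) {
                ctx.collectWithTimestamp(t, t.ts); // emit each historic row at its event time
                maxTs = Math.max(maxTs, t.ts);
            }
            // One watermark at the last event time tells downstream operators the
            // historic data is complete before the Kafka flow takes over.
            ctx.emitWatermark(new Watermark(maxTs));
        }

        @Override
        public void cancel() {}

        private List<Trade> queryHistoricTrades() {
            throw new UnsupportedOperationException("JDBC access elided in this sketch");
        }
    }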
Hello,
I have a Flink workflow which is partitioned on a key common to all the stream
objects, one best suited to the high volume of data I am processing. I now want
to add a new stream of prices that I want to make available to all partitioned
streams - however, this new stream
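I have been looking at the broadcast state pattern for this; a sketch of what I
mean, with placeholder names (prices is the new low-volume stream, trades the
existing keyed stream):

    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;

    // MyJoinFunction would extend KeyedBroadcastProcessFunction and read this state.
    MapStateDescriptor<String, Double> pricesDesc =
            new MapStateDescriptor<>("prices", Types.STRING, Types.DOUBLE);

    trades.keyBy(t -> t.instrument)                 // the existing partitioning key
          .connect(prices.broadcast(pricesDesc))    // prices go to every parallel instance
          .process(new MyJoinFunction(pricesDesc));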
Hello,
I'm starting a new Flink application to allow my company to perform lots of
reporting. We have an existing legacy system with most of the data we need held in
SQL Server databases. We will need to consume data from these databases
initially before starting to consume more data from newly