Hey,
> 1. The Flink community agrees that we upgrade Kryo to a later version,
which means breaking all checkpoint/savepoint compatibility and releasing a
Flink 2.0 with Java 17 support added and Java 8 and Flink Scala API support
dropped. This is probably the quickest way, but would still mean
:", however this is not what I observe, instead the job fails. I
> am attaching the relevant part of the log. This error happens upon
> trying to recover from a one month old savepoint.
>
> Regards,
> Yordan
>
> On Tue, 15 Nov 2022 at 18:53, Piotr Nowojski wrote:
> >
>
Hi Yordan,
I don't understand where the problem is, why do you think savepoints are
unusable? If you recover with `ignoreFailuresAfterTransactionTimeout`
enabled, the current Flink behaviour shouldn't cause any problems (except
for maybe some logged errors).
Best,
Piotrek
wt., 15 lis 2022 o
ent connector can have different
> serialisation and de-serlisation technique right?. Wouldn't that impact?.
> If I use StateProcessor API, would that be agnostic to all the sources and
> sinks?.
>
> On Fri, Oct 21, 2022, 18:00 Piotr Nowojski wrote:
>
>> ops
>>
>>
to use StateProcessor API.
Best,
Piotrek
pt., 21 paź 2022 o 10:54 Sriram Ganesh napisał(a):
> Thanks !. Will try this.
>
> On Fri, Oct 21, 2022 at 2:22 PM Piotr Nowojski
> wrote:
>
>> Hi Sriram,
>>
>> You can read and modify savepoints using StateProcessor API [1].
>
Hi Sriram,
You can read and modify savepoints using StateProcessor API [1].
Alternatively, you can modify a code of your function/operator for which
you want to modify the state. For example in the
`org.apache.flink.streaming.api.checkpoint.CheckpointedFunction#initializeState`
method you could
Hi Xintong,
I'm not sure if slack is the right tool for the job. IMO it works great as
an adhoc tool for discussion between developers, but it's not searchable
and it's not persistent. Between devs, it works fine, as long as the result
of the ad hoc discussions is backported to JIRA/mailing
Hey,
Could you be more specific about how it is not working? A compiler error
that there is no such class as RuntimeContextInitializationContextAdapters?
This class has been introduced in Flink 1.12 in FLINK-18363 [1]. I don't
know this code and I also don't know where it's documented, but:
a)
Hi,
It's the minimal watermark among all 10 parallel instances of that Task.
Using metric (currentInputWatermark) [1] you can access the watermark of
each of those 10 sub tasks individually.
Best,
Piotrek
[1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/
pt., 25 lut
gt;>
>>
>> From what I’ve read there is state, checkpoints and save points – all of
>> them hold state - and currently I can’t get any of these to restore when
>> developing in an IDE and the program builds up all state from scratch. So
>> what else do I need to do in
Hi James,
Sure! The basic idea of checkpoints is that they are fully owned by the
running job and used for failure recovery. Thus by default if you stopped
the job, checkpoints are being removed. If you want to stop a job and then
later resume working from the same point that it has previously
Hi Frank,
I'm not sure exactly what you are trying to accomplish, but yes. In
the TimestampAssigner you can only return what should be the new timestamp
for the given record.
If you want to use "ingestion time" - "true even time" as some kind of
delay metric, you will indeed need to have both
Hi,
As far as I can tell the answer is unfortunately no. With Table API (SQL)
things are much simpler, as you have a restricted number of types of
columns that you need to support and you don't need to support arbitrary
Java classes as the records.
I'm shooting blindly here, but maybe you can
Hi Josson,
Would you be able to reproduce this issue on a more recent version of
Flink? I'm afraid that we won't be able to help with this issue as this
affects a Flink version that is not supported for quite some time and
moreover `SlotSharingManager` has been completed removed in Flink 1.13.
Hi,
Unfortunately the new KafkaSource was contributed without good benchmarks,
and so far you are the first one that noticed and reported this issue.
Without more direct comparison (as Martijn suggested), it's hard for us to
help right away. It would be a tremendous help for us if you could for
in a (Process)JoinFunction?
> The join needs keys, but I don’t know if the resulting stream counts as
> keyed from the state’s point of view.
>
>
>
> Regards,
>
> Alexis.
>
>
>
> *From:* Piotr Nowojski
> *Sent:* Montag, 6. Dezember 2021 08:43
> *To:* David Mo
gt;
>
>
>
>
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/config/#state-backend-rocksdb-metrics-estimate-num-keys
>
>
>
> *From:* Mason Chen
> *Sent:* Dienstag, 11. Januar 2022 19:20
> *To:* Piotr Nowojski
n's posts where he mentioned that this way of creating
> DS is not "encouraged and tested". So, I figured out an alternate way of
> using side output and now I can do what I was aiming for.
>
> Thanks,
> Sid.
>
> On Mon, Jan 10, 2022 at 5:29 PM Piotr Now
Ah, I see. Pitty. You could always use reflection if you really had to, but
that's of course not a long term solution.
I will raise this issue to the KafkaSource/AWS contributors.
Best,
Piotr Nowojski
pon., 10 sty 2022 o 16:55 Clayton Wohl napisał(a):
> Custom code can create subclas
Hi,
Unfortunately there is no such metric. Regarding the logs, I'm not sure
what Flink version you are using, but since Flink 1.13.0 [1][2], you could
relay on the tasks/subtasks switch from `INITIALIZING` to `RUNNING` to
check when the task/subtask has finished recovering it's state.
Best,
Hi Puneet,
Have you seen this thread before? [1]. It looks like the same issue and
especially this part might be the key:
> Be aware that the filesystem used by the FileUploadHandler
> is java.nio.file.FileSystem and not
> Flink's org.apache.flink.core.fs.FileSystem for which we provide
Hi Clayton,
I think in principle this example should be still valid, however instead of
providing a `CustomFlinkKafkaConsumer` and overriding it's `open` method,
you would probably need to override
`org.apache.flink.connector.kafka.source.reader.KafkaSourceReader#start`.
So you would most likely
Hi Sid,
I don't see on the stackoverflow explanation of what are you trying to do
here (no mentions of MapFunction or a tuple).
If you want to create a `DataStream` from some a pre
existing/static Tuple of Strings, the easiest thing would be to convert the
tuple to a collection/iterator and use
Hi Basil,
1. What do you mean by:
> The only way we could stop these stuck jobs was to patch the finalizers.
?
2. Do you mean that your job is stuck when doing stop-with-savepoint?
3. What Flink version are you using? Have you tried upgrading to the most
recent version, or at least the most
r to see the huge jumps in window fires. I can this
> benefiting other users who troubleshoot the problem of large number of
> window firing.
>
> Best,
> Mason
>
> On Dec 29, 2021, at 2:56 AM, Piotr Nowojski wrote:
>
> Hi Mason,
>
> > and it has to finish proce
d (retain on cancellation)
> Tolerable Failed Checkpoints 10
> ```
>
> Are there other metrics should I look at—why else should tasks fail
> acknowledgement in unaligned mode? Is it something about the implementation
> details of window function that I am not considering? My main hunch
Hi,
In principle you can register metric/metric groups dynamically it should be
working just fine. However your code probably won't work, because per every
record you are creating a new group and new counter, that most likely will
be colliding with an old one. So every time you are defining a new
Hi,
It might be simply because the binary artifacts are not yet
published/visible. The blog post [1] mentions that it should be visible
within 24h from yesterday), so please try again later/tomorrow. This is
also mentioned in the dev mailing list thread [2]
Best,
Piotrek
[1]
Hi,
Reading in the DataStream API (that's what I'm using you are doing) from
Parquet files is officially supported and documented only since 1.14 [1].
Before that it was only supported for the Table API. As far as I can tell,
the basic classes (`FileSource` and `ParquetColumnarRowInputFormat`)
Hi Tao,
Could you prepare a minimalistic example that would reproduce this issue?
Also what Flink version are you using?
Best,
Piotrek
czw., 16 gru 2021 o 09:44 tao xiao napisał(a):
> >Your upstream is not inflating the record size?
> No, this is a simply dedup function
>
> On Thu, Dec 16,
Hi Mason,
In Flink 1.14 we have also changed the timeout behavior from checking
against the alignment duration, to simply checking how old is the
checkpoint barrier (so it would also account for the start delay) [1]. It
was done in order to solve problems as you are describing. Unfortunately we
Hi Alexis and David,
This actually can not happen. There are mechanisms in the code to make sure
none of the input is starved IF there is some data to be read.
The only time when input can be blocked is during the alignment phase of
aligned checkpoints under back pressure. If there was a back
>
> Best wishes,
>
> Shazia
>
>
> - Original message -
> From: "Piotr Nowojski"
> To: "Shazia Kayani"
> Cc: mart...@ververica.com, "user"
> Subject: [EXTERNAL] Re: Input Selectable & Checkpointing
> Date: Wed, Nov 24, 2
Hi Shazia,
FLIP-182 [1] might be a thing that will let you address issues like this in
the future. With it, maybe you could do some magic with assigning
watermarks to make sure that one stream doesn't run too much into the
future which would effectively prioritise the other stream. But that's
Hi Vasily,
Unfortunately no, I don't think there is such an option in your case. With
per job mode, you could try to use the Distributed Cache, it should be
working in streaming as well [1], but this doesn't work in the application
mode, as in that case no code is executed on the JobMaster [2]
Hi All,
to me it looks like something deadlocked, maybe due to this OOM error from
Kafka, preventing a Task from making any progress. To confirm Dongwan you
could collecte stack traces while the job is in such a blocked state.
Deadlocked Kafka could easily explain those symptoms and it would be
Hi Dongwon,
Thanks for reporting the issue, I've created a ticket for it [1] and we
will analyse and try to fix it soon. In the meantime it should be safe for
you to ignore this problem. If this failure happens only rarely, you can
always retry stop-with-savepoint command and there should be no
Hi Simon,
>From the top of my head I do not see a reason why this shouldn't work in
Flink. I'm not sure what your question is here.
For reading both from the FileSource and Kafka at the same time you might
want to take a look at the Hybrid Source [1]. Apart from that there are
r downstream, it
> just broadcasts it to all output channels”.
>
>
>
> I’ll see what I can do about upgrading the Flink version and do some more
> tests with unaligned checkpoints. Thanks again for all the info.
>
>
>
> Regards,
>
> Alexis.
>
>
>
> *Fro
say, the last 5 minutes of data. Then it
> fails again because the checkpoint times out and, after restarting, would
> it try to read, for example, 15 minutes of data? If there was no
> backpressure in the source, it could be that the new checkpoint barriers
> created after the first restar
arrier for n-1 and “last” the one for n?
>3. Start delay also refers to the “first checkpoint barrier to reach
>this subtask”. As before, what is “first” in this context?
>4. Maybe this will be answered by the previous questions, but what
>happens to barriers if a down
Hi Alexis,
You can read about those metrics in the documentation [1]. Long alignment
duration and start delay almost always come together. High values indicate
long checkpoint barrier propagation times through the job graph, that's
always (at least so far I haven't seen a different reason) caused
Great, thanks!
pon., 11 paź 2021 o 17:24 James Sandys-Lumsdaine
napisał(a):
> Ah thanks for the feedback. I can work around for now but will upgrade as
> soon as I can to the latest version.
>
> Thanks very much,
>
> James.
> --
> *From:* Piot
Hi James,
I believe you have encountered a bug that we have already fixed [1]. The
small problem is that in order to fix this bug, we had to change some
`@PublicEvolving` interfaces and thus we were not able to backport this fix
to earlier minor releases. As such, this is only fixed starting from
Hi,
`JobManagerCommunicationUtils` was never part of Flink's API. It was an
internal class, for our internal unit tests. Note that Flink's public API
is annotated with `@Public`, `@PublicEvolving` or `@Experimental`. Anything
else by default is internal (sometimes to avoid confusion we are
improving the approach.
>
>
>
> Thanks,
>
> Sanket Agrawal
>
>
>
> *From:* Ragini Manjaiah
> *Sent:* Wednesday, September 29, 2021 11:17 AM
> *To:* Sanket Agrawal
> *Cc:* Piotr Nowojski ; user@flink.apache.org
> *Subject:* Re: Event is taking a lot of t
Hi,
With Flink 1.8.0 I'm not sure how reliable the backpressure status is in
the WebUI when it comes to the Async operators. If I remember correctly
until around Flink 1.10 (+/- 2 version) backpressure monitoring was
checking for thread dumps stuck in requesting Flink's network memory
buffers. If
Hi David,
I can confirm that I'm able to reproduce this behaviour. I've tried
profiling/flame graphs and I was not able to make much sense out of those
results. There are no IO/Memory bottlenecks that I could notice, it looks
indeed like the Job is stuck inside RocksDB itself. This might be an
Hi Peter,
Can you provide relevant JobManager logs? And can you write down what steps
have you taken before the failure happened? Did this failure occur during
upgrading Flink, or after the upgrade etc.
Best,
Piotrek
śr., 8 wrz 2021 o 16:11 Peter Westermann
napisał(a):
> We recently upgraded
Hi,
FYI, the performance regression after upgrading RocksDB was clearly visible
in all of our RocksDB related benchmarks, like for example:
http://codespeed.dak8s.net:8000/timeline/?ben=stateBackends.ROCKS=2
http://codespeed.dak8s.net:8000/timeline/?ben=stateBackends.ROCKS_INC=2
(and many more
Thanks for the detailed explanation Yun Tang and clearly all of the effort
you have put into it. Based on what was described here I would also vote
for going forward with the upgrade.
It's a pity that this regression wasn't caught in the RocksDB community. I
would have two questions/ideas:
1. Can
Hi Ivan,
It sounds to me like a bug in FlinkKinesisConsumer that it's not cancelling
properly. The change in the behaviour could have been introduced as a bug
fix [1], where we had to stop interrupting the source thread. This also
might be related or at least relevant for fixing [2].
Ivan, the
d the High Heap Usage is because of a Memory Leak in a
> library we are using.
>
> Thanks for your help.
>
> On Mon, Jul 19, 2021 at 8:31 PM Piotr Nowojski
> wrote:
>
>> Thanks for the update.
>>
>> > Could the backpressure timeout and heartbeat ti
apache.flink.runtime.jobmaster.JobMaster$TaskManagerHeartbeatListener.notifyHeartbeatTimeout(JobMaster.java:1193)
> ..
>
> Because of heartbeat timeout, there was an internal restart of Flink and
> the Kafka consumption rate recovered after the restart.
>
> Could the backpressure timeout and
ly called windows), or when you do event processing where the
> > time when an event occurred is important.
> > ci.apache.org
> >
> >
> > [2]
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/watermark/Watermar
2)\n\tat
> org.apache.flink.streaming.connectors.kafka.internal.Handover.pollNext(Handover.java:74)\n\tat
> org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:133)\n\tat
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(Fl
hat the entire flow is the
> same(operators and sink); is there any good practice to achieve a single
> job for that?
>
> Tamir.
>
> [1]
> https://stackoverflow.com/questions/54687372/flink-append-an-event-to-the-end-of-finite-datastream#answer-54697302
> -
il.concurrent.CompletableFuture.get(CompletableFuture.java:1908)\n\tat
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegmentBlocking(LocalBufferPool.java:293)\n\tat
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilderBlocking(Local
Hi,
Sources when finishing are emitting
{{org.apache.flink.streaming.api.watermark.Watermark#MAX_WATERMARK}}, so I
think the best approach is to register an even time timer for
{{Watermark#MAX_WATERMARK}} or maybe {{Watermark#MAX_WATERMARK - 1}}. If
your function registers such a timer, it would
Hi,
I'm not sure, maybe someone will be able to help you, but it sounds like it
would be better for you to:
- google search something like "Kafka Error sending fetch request
TimeoutException" (I see there are quite a lot of results, some of them
might be related)
- ask this question on the Kafka
Hi,
It's mentioned in the docs [1], but unfortunately this is not very well
documented in 1.10. In short you have to provide a custom implementation of
a `DeserializationSchemaFactory`. Please look at the built-in factories for
examples of how it can be done.
In newer versions it's both easier
Great, thanks for coming back and I'm glad that it works for you!
Piotrek
czw., 8 lip 2021 o 13:34 Yik San Chan
napisał(a):
> Hi Piotr,
>
> Thanks! I end up doing option 1, and that works great.
>
> Best,
> Yik San
>
> On Tue, May 25, 2021 at 11:43 PM Piotr No
ob id.
>>
>> A bit of a mystery. Is there a way to at least catch it in the future?
>> Any additional telemetry (logs, metrics) we can extract to better
>> understand what is happening.
>>
>> Alex
>>
>> On Tue, Jun 8, 2021 at 12:01 AM Piotr Nowojski
&g
You are welcome :)
Piotrek
śr., 30 cze 2021 o 08:34 Thomas Wang napisał(a):
> Thanks Piotr. This is helpful.
>
> Thomas
>
> On Mon, Jun 28, 2021 at 8:29 AM Piotr Nowojski
> wrote:
>
>> Hi,
>>
>> You should still be able to get the Flink log
Hi,
You should still be able to get the Flink logs via:
> yarn logs -applicationId application_1623861596410_0010
And it should give you more answers about what has happened.
About the Flink and YARN behaviour, have you seen the documentation? [1]
Especially this part:
> Failed containers
Thomas J. Raef
> Founder, WeWatchYourWebsite.com
> http://wewatchyourwebsite.com
> tr...@wewatchyourwebsite.com
> LinkedIn <https://www.linkedin.com/in/thomas-raef-74b93a14/>
> Facebook <https://www.facebook.com/WeWatchYourWebsite>
>
>
>
> On Mon, Jun 28, 2021
Hi,
It's hard to say from the log fragment, but I presume this task has
correctly switched to "CANCELLED" state and this error should not have been
logged as an ERROR, right? How did you get this stack trace? Maybe it was
logged as a DEBUG message? If not, that would be probably a minor bug in
Hi,
We are glad that you want to try out Flink, but if you would like to get
help you need to be a bit more specific. What are you exactly doing, and
what, on which step exactly and how is not working (including logs and/or
error messages) is necessary for someone to help you.
In terms of how to
TERMARKS.
>> WHY?
>> processing watermark: 0
>> processing watermark: 0
>> processing watermark: 0
>> Attempts restart: 1
>> processing watermark: 1
>> processing watermark: 1
>> processing watermark: 1
>> processing watermark: 1
>> A
17 Lei Wang napisał(a):
> There's enough slots on the jobmanger UI, but the slots are not available.
>
> After i add taskmanager.host: localhost to flink-conf.yaml, it works.
>
> But i don't know why.
>
> Thanks,
> Lei
>
>
> On Fri, Jun 18, 2021 at 6:07 PM
I'm glad I could help, I hope it will solve your problem :)
Best,
Piotrek
pt., 18 cze 2021 o 14:38 Felipe Gutierrez
napisał(a):
> On Fri, Jun 18, 2021 at 1:41 PM Piotr Nowojski
> wrote:
>
>> Hi,
>>
>> Keep in mind that this is a quite low level approach to this p
Hi,
As far as I know there are no plans to support other state backends with
BroadcastState. I don't know about any particular technical limitation, it
probably just hasn't been done. Also I don't know how much effort that
would be. Probably it wouldn't be easy.
Timo, can you chip in how for
Hi,
always when upgrading I would suggest to check release notes first [1]
Best,
Piotrek
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/release-notes/flink-1.11.html#memory-management
pt., 18 cze 2021 o 12:24 Haihang Jing napisał(a):
> Ask a question, the same business
ctStreamOperator` or
`OneInputStreamOperator`.
Best,
Piotrek
pt., 18 cze 2021 o 12:49 Felipe Gutierrez
napisał(a):
> Hello Piotrek,
>
> On Fri, Jun 18, 2021 at 11:48 AM Piotr Nowojski
> wrote:
>
>> Hi,
>>
>> As far as I can tell timers should be che
Hi,
I would start by looking at the Job Manager and Task Manager logs. Take a
look if Task Managers connected/registered in the Job Manager and if so, if
there were no problems when submitting the job. It seems like either there
are not enough slots, or slots are actually not available.
Best,
Hi,
In old Flink versions (prior to 1.9) that would be the case. If operator D
emitted a record to Operator B, but Operator B hasn't yet processed when
checkpoint is happening, this record would be lost during recovery.
Operator D would be recovered with it's state as it was after emitting this
Hi,
As far as I can tell timers should be checkpointed and recovered. What may
be happening is that the state of the last seen watermarks by operators on
different inputs and different channels inside an input is not persisted.
Flink is assuming that after the restart, watermark assigners will
Hi,
Unfortunately at the moment I think there are no plans to push for this. I
would suggest you to bump/cast a vote on
https://issues.apache.org/jira/browse/FLINK-13856 in order to allows us
more accurately prioritise efforts.
Best,
Piotrek
śr., 16 cze 2021 o 05:46 Jiahui Jiang napisał(a):
>
Hi Haocheng,
Regarding the first part, yes. For a very long time there was a trivial bug
that was displaying the maximum "backpressure status" ("HIGH" in your case)
from all of the subtasks, for every subtask, instead of showing the
subtask's individual status. [1] It is/will be fixed in Flink
Hi Thomas. The bug https://issues.apache.org/jira/browse/FLINK-21028 is
still present in 1.12.1. You would need to upgrade to at least 1.13.0,
1.12.2 or 1.11.4. However as I mentioned before, 1.11.4 hasn't yet been
released. On the other hand both 1.12.2 and 1.13.0 have already been
superseded by
Yes good catch Kezhu, IllegalStateException sounds very much like
FLINK-21028.
Thomas, could you try upgrading to Flink 1.13.1 or 1.12.4? (1.11.4 hasn't
been released yet)?
Piotrek
wt., 8 cze 2021 o 17:18 Kezhu Wang napisał(a):
> Could it be same as FLINK-21028[1] (titled as “Streaming
nts.
>
> Alex
>
> On Mon, Jun 7, 2021 at 3:12 AM Piotr Nowojski
> wrote:
>
>> Hi Alex,
>>
>> A quick question. Are you using incremental checkpoints?
>>
>> Best, Piotrek
>>
>> sob., 5 cze 2021 o 21:23 napisał(a):
>>
>>
>>>> Unexpected error in InitProducerIdResponse; Producer attempted an
>>>> operation with an old epoch. Either there is a newer producer with the same
>>>> transactionalId, or the producer's transaction has been expired by the
>>>> broker.
>>>
Hi Alex,
A quick question. Are you using incremental checkpoints?
Best, Piotrek
sob., 5 cze 2021 o 21:23 napisał(a):
> Small correction, in T4 and T5 I mean Job2, not Job 1 (as job 1 was save
> pointed).
>
> Thank you,
> Alex
>
> On Jun 4, 2021, at 3:07 PM, Alexander Filipchik
> wrote:
>
>
Hi,
I think there is no generic way. If this error has happened indeed after
starting a second job from the same savepoint, or something like that, user
can change Sink's operator UID.
If this is an issue of intentional recovery from an earlier
checkpoint/savepoint, maybe
>>
>> Yes, it should be possible to register a timer for Long.MAX_WATERMARK if
>> you want to apply a transformation at the end of each key. You could
>> also use the reduce operation (DataStream#keyBy#reduce) in BATCH mode.
>
> According to [0], timer time is irrelevant since timer will be
Glad to hear it! Thanks for confirming that it works.
Piotrek
śr., 26 maj 2021 o 12:59 Barak Ben Nathan
napisał(a):
>
>
> Hi Piotrek,
>
>
>
> This is exactly what I was searching for. Thanks!
>
>
>
> Barak
>
>
>
> *From:* Piotr Nowojski
> *Sent
k.beforeInvoke(StreamTask.java:475)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:526)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> at java
Hi Barak,
Before starting the JobManager I don't think there is any API running at
all. If you want to be able to submit/stop multiple jobs to the same
cluster session mode is indeed the way to go. But first you need to to
start the cluster ( start-cluster.sh ) [1]
Piotrek
[1]
Hi,
You could always buffer records in your sink function/operator, until a
large enough batch is accumulated and upload the whole batch at once. Note
that if you want to have at-least-once or exactly-once semantics, you would
need to take care of those buffered records in one way or another. For
Hi,
1. I don't know if there is a built-in way of doing it. You can always pass
this information anyway on your own when you are starting the job (via
operator/function's constructors).
2. Yes, I think this should work.
Best,
Piotrek
wt., 25 maj 2021 o 17:05 ChangZhuo Chen (陳昌倬)
napisał(a):
>
Hi Georgi,
I don't think it's a bug in Flink. It sounds like some problem with
dependencies or jars in your job. Can you explain a bit more what do you
mean by:
> that some of them are constantly restarting with the following exception.
After restart, everything is working as expected
Hi,
That's a throughput of 700 records/second, which should be well below
theoretical limits of any deserializer (from hundreds thousands up to tens
of millions records/second/per single operator), unless your records are
huge or very complex.
Long story short, I don't know of a magic bullet to
Hi Vijay,
I'm not sure if I understand your question correctly. You have jar and
configs (1, 2, 3 and 4) on S3 and you want to start a Flink job using
those? Can you simply download those things (whole directory containing
those) to the machine that will be starting the Flink job?
Best, Piotrek
Hi Marco,
How are you starting the job? For example, are you using Yarn as the
resource manager? It looks like there is just enough resources in the
cluster to run this job. Assuming the cluster is correctly configured and
Task Managers are able to connect with the Job Manager (can you share full
Yes, thanks a lot for driving this release Arvid :)
Piotrek
czw., 29 kwi 2021 o 19:04 Till Rohrmann napisał(a):
> Great to hear. Thanks a lot for being our release manager Arvid and to
> everyone who has contributed to this release!
>
> Cheers,
> Till
>
> On Thu, Apr 29, 2021 at 4:11 PM Arvid
d that count is about 20-25% short (and not
> consistent) of what comes out consistently when parallelism is set to 1.
>
>
>
> Dylan
>
>
>
> *From: *Dylan Forciea
> *Date: *Wednesday, April 14, 2021 at 9:08 AM
> *To: *Piotr Nowojski
> *Cc: *"user@flink.a
think the only thing you can do is to either speed up the split enumeration
(probably difficult) or increase the timeouts that are failing. I don't
know if there is some other workaround at the moment (Becket?).
Piotrek
śr., 14 kwi 2021 o 15:57 Piotr Nowojski napisał(a):
> Hey,
>
>
Hi,
Yes, it looks like your query is non deterministic because of `FIRST_VALUE`
used inside `GROUP BY`. If you have many different parallel sources, each
time you run your query your first value might be different. If that's the
case, you could try to confirm it with even smaller query:
Hey,
could you provide full logs from both task managers and job managers?
Piotrek
śr., 14 kwi 2021 o 15:43 太平洋 <495635...@qq.com> napisał(a):
> After submit job, I received 'Failed to execute job' error. And the time
> between initialization and scheduling last 214s. What has happened during
1 - 100 of 607 matches
Mail list logo