Hi all,
After running some tests with the proposed default values (
taskmanager.network.sort-shuffle.min-parallelism: 1,
taskmanager.network.sort-shuffle.min-buffers: 512,
taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m,
taskmanager.network.blocking-shuffle.compression.enabled: true),
Hi everyone,
We (team in seznam.cz) are actually using the Gelly library for batch
anomaly detection in our graphs. It will be very nice to somehow keep this
functionality, maybe in a separate repository. Is there any replacement?
Best,
Lukas
On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser wrote:
Hi Thomas,
AFAIK there are no specific plans in this direction with the native
integration, but I'd like to share some thoughts on the topic.
In my understanding there are three major groups of workloads in Flink:
1) Batch workloads
2) Interactive workloads (both Batch and Streaming; e.g. SQL Gate
Thanks for the experiment. +1 for the changes.
Yingjie Cao wrote on Tue, Jan 4, 2022 at 17:35:
> Hi all,
>
> After running some tests with the proposed default value (
> taskmanager.network.sort-shuffle.min-parallelism: 1,
> taskmanager.network.sort-shuffle.min-buffers: 512,
> taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m,
Hi Martijn,
Thanks for the feedback. I am not proposing to bundle the new graph
library with Alink. I am +1 for dropping the DataSet-based Gelly library,
but we probably need a new graph library in Flink for the possible
migration.
We haven't decided what to do yet and probably need more discussion
Hi,
thanks Martijn for bringing it up for discussion. I think we could make the
discussion a little bit clearer by splitting it into two questions:
1. should Flink drop Gelly?
2. should Flink drop the graph computing?
The answer to the first question could be yes, since there have been no
change
I am seeing an issue with class loaders not being GC'ed and the metaspace
eventually going OOM. Here is my setup:
- Flink 1.13.1 on EMR using JDK 8 in session mode
- Job manager is a long-running yarn session
- New jobs are submitted every 5m (and typically run for less than 5m)
I find that after a few
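Two configuration knobs that are often relevant to this symptom, assuming settings go in flink-conf.yaml; the size value below is purely illustrative, and this is a mitigation sketch rather than a fix for whatever is pinning the classloaders:

```yaml
# flink-conf.yaml (illustrative values; this mitigates, it doesn't fix a leak)
# Give the metaspace more headroom before the OOM hits.
taskmanager.memory.jvm-metaspace.size: 512m
# Keep the leak check enabled so uses of a closed user-code classloader
# are reported, which helps pinpoint what is holding the reference.
classloader.check-leaked-classloader: true
```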
Hi Piotrek,
> In other words, something (presumably a watermark) has fired more than
> 151 200 windows at once, which is taking ~1h 10 minutes to process, and
> during this time the checkpoint cannot make any progress. Is this number
> of triggered windows plausible in your scenario?
It seems p
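As a sanity check on the quoted figures (assuming "151 200" is the single number 151,200 and "~1h 10 minutes" is 4,200 seconds; both are readings of the quoted text, not confirmed values), the implied sustained rate works out to about 36 windows per second:

```java
// Back-of-envelope check, not from the original thread: what window-firing
// rate do 151,200 windows processed over 4,200 seconds (~1h 10m) imply?
public class WindowRate {
    static long windowsPerSecond(long windows, long seconds) {
        return windows / seconds;
    }

    public static void main(String[] args) {
        System.out.println(windowsPerSecond(151_200L, 4_200L)); // prints 36
    }
}
```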
Hello Flink community,
We have a Flink cluster deployed to AWS EKS along with many other applications.
This cluster is managed by Spotify’s Flink operator. After deployment I noticed
that the stateful pods of the job manager and task managers intermittently
receive SIGTERM to terminate themselves. I ass
After a lot of struggle with the pure Jackson library, which doesn't have a
strict mode, I wasn't able to validate the JSON schema. I finally found one
way of doing it, but now I am not able to map the correct *Success* and
*Failure* messages in order to call the Process Function
Can someone help me understand how Flink deals with the following scenario?
I have a job that reads from a source Kafka (starting-offset: latest) and
writes to a sink Kafka with exactly-once execution. Let's say that I have 2
records in the source. The 1st one is processed without issue and the job
Hi Zhipeng,
I think we're seeing more code being externalised, for example with
the Flink Remote Shuffle service [1] and the ongoing discussion on the
external connector repository [2], so it makes sense to go for your second
option. Maybe it fits under Flink Extended [3].
The main question bec
We are trying to write objects with encryption at rest. To enable this, the
request containing the payload we intend to upload must include an
x-amz-server-side-encryption header [1]. I would imagine this is a common
use case, but after some digging, I cannot find any article that covers
this. Has
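One possible avenue, assuming the flink-s3-fs-hadoop filesystem is in use: Flink forwards s3.-prefixed keys from flink-conf.yaml to the underlying Hadoop S3A client, and S3A's server-side encryption option makes the client attach the x-amz-server-side-encryption header itself. The key name and value below are an assumption to verify against your Flink and Hadoop versions, not a confirmed recipe:

```yaml
# flink-conf.yaml (sketch; key name assumed, verify against your versions)
# Forwarded by the S3 filesystem as fs.s3a.server-side-encryption-algorithm,
# which makes S3A send the x-amz-server-side-encryption header on uploads.
s3.server-side-encryption-algorithm: AES256
```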
Unsubscribe
Hi!
The last expression in your try block is validationMessages.foreach(msg =>
println(msg.getMessage)), which does not produce a value (foreach returns
Unit). You need to use an expression that produces type T in your try block.
Siddhesh Kalgaonkar wrote on Wed, Jan 5, 2022 at 02:35:
> After a lot of struggle with the pure Jacks
Hi!
This is a valid case. The starting-offset is the offset the Kafka source
reads from when the job starts *without a checkpoint*. That is to say, if
your job has been running for a while, completed several checkpoints and
then restarted, the Kafka source won't read from starting-offset, but from th
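The recovery rule described above can be sketched as a toy model (plain Java, not Flink's actual KafkaSource implementation; all names here are hypothetical): the configured starting offset applies only when no checkpointed offset exists, and after a restore the checkpointed offset wins.

```java
import java.util.OptionalLong;

// Toy model (not Flink code) of the recovery rule: the configured starting
// offset is only used when there is no checkpointed offset to restore from.
public class OffsetRecovery {
    static long resumeFrom(OptionalLong checkpointedOffset, long startingOffset) {
        return checkpointedOffset.orElse(startingOffset);
    }

    public static void main(String[] args) {
        // Fresh start: no checkpoint, so read from the configured offset.
        System.out.println(resumeFrom(OptionalLong.empty(), 100L)); // prints 100
        // Restart after a checkpoint at offset 42: the starting offset is ignored.
        System.out.println(resumeFrom(OptionalLong.of(42L), 100L)); // prints 42
    }
}
```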
Hi,
Many thanks for initiating the discussion!
Also +1 to drop the current DataSet based Gelly library so that we could
finally drop the
legacy DataSet API.
As for whether to keep the graph computing ability: from my side, graph query /
graph computing and
chaining them with the preprocessing pi
Hi Caizhi,
Thank you for the quick response. Can you help me understand how
reprocessing the data with the earliest starting-offset ensures exactly-once
processing? First, the earliest offset could be way beyond the 1st
record in my example since the first time the job started from the latest
offset