Hello,
We are migrating our HA setup from ZK to K8S, and we have a question
regarding the RPC port.
Previously with ZK, the RPC connection config was:
high-availability.jobmanager.port
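For reference, this is roughly how it looks in our flink-conf.yaml with ZooKeeper today (values are illustrative, not our real ones):

# flink-conf.yaml (ZooKeeper HA) -- illustrative values only
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181
# pin the JobManager RPC port (range) used in HA mode
high-availability.jobmanager.port: 50010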
We were expecting the config to be the same with K8S HA, as the doc
says: "The port (range) used
Can someone help me?
*Regards,*
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
ns that we drop some of our Kafka source state to read
again from the Kafka committed offsets, but maybe nobody does that ^^
Anyway, thanks for the focus on the release note!
Best Regards,
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Wed, Jun 15, 20
whether this is a changed behavior or I am missing
something
Bastien DINE
Freelance
Data Architect / Software Engineer / Sysadmin
http://bastiendine.io
On Tue, Jun 14, 2022 at 11:08 PM, Jing Ge wrote:
> Hi Bastien,
>
> Thanks for asking. I didn't find any call of setStartFromGroupOffsets
tedOffsets(OffsetResetStrategy.EARLIEST
This change can lead to big trouble if users pay no attention to this point
when migrating from the old KafkaConsumer to the new KafkaSource.
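To make the point concrete, here is a minimal sketch (topic, group and broker names are made up) of how the old group-offsets behaviour must now be requested explicitly on the new KafkaSource:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

// Sketch only: reproduces the old setStartFromGroupOffsets() behaviour
// with the new KafkaSource (names are hypothetical).
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("kafka:9092")
        .setTopics("my-topic")
        .setGroupId("my-group")
        // start from the committed group offsets, falling back to EARLIEST
        // if no committed offset exists for a partition
        .setStartingOffsets(
                OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();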
Regards,
Bastien
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
ink as we are
> planning to write to S3?
>
> Thanks again for your help.
>
> Antonio.
>
> On Thu, Feb 10, 2022 at 11:49 AM bastien dine
> wrote:
>
>> Hello Antonio,
>>
>> The .collect() method should be used with caution as it's collecting the
>> Da
it into a formatted file instead
(like a CSV one) or, if it's too big, into something splittable.
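If it helps, here is a rough sketch of what I mean (bucket path and types are made up, adapt to your job):

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;

// Sketch: write the results as row-encoded files (e.g. CSV lines) on S3
// instead of collecting them on the client.
static void writeAsCsv(DataStream<String> csvLines) {
    csvLines.sinkTo(
            FileSink.forRowFormat(
                            new Path("s3://my-bucket/output/"),     // hypothetical bucket
                            new SimpleStringEncoder<String>("UTF-8"))
                    .build());
}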
Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Thu, Feb 10, 2022 at 8:32 PM, Antonio Si wrote:
> Hi,
>
> I am using the
Hello,
On K8s the current recommendation is to set up 1 JobManager with HA
enabled, so that the cluster does not lose state upon a crash.
1. The storage dir can for sure be on a Kube PV; the directory should be
shared between all JMs, and you will need to map the volume to the same local
directory (e.g. /data); see the config sketch below
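A rough sketch of the corresponding flink-conf.yaml entries (cluster id and path are placeholders):

# Kubernetes HA services -- illustrative values only
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
kubernetes.cluster-id: my-flink-cluster
# shared directory, e.g. a PV mounted at the same path on every JM
high-availability.storageDir: file:///data/flink/ha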
t throwing an
exception?
Or do I try to find the root cause, namely why the field serializer of the
deleted field is still present?
Regards,
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Wed, Feb 2, 2022 at 4:37 PM, Alexis Sarda-Espinosa <
alexi
Hello,
I have some trouble restoring a (POJO) state after deleting a field.
According to the documentation, it should not be a problem with POJOs:
*"Fields can be removed. Once removed, the previous value for the removed
field will be dropped in future checkpoints and savepoints."*
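To give an idea of the shape of the state, a minimal sketch (class and field names are made up, not the real ones):

// State POJO after the change; the removed field is shown commented out.
public class MyState {
    public long field1;
    // public String field2;   // <- removed field; per the docs, its previous
    //                            value should simply be dropped on restore
    public MyState() {}         // POJO requirements: public no-arg constructor,
                                // public fields (or getters/setters)
}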
Here is a short
,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Tue, Dec 8, 2020 at 8:54 AM, Yun Tang wrote:
> Hi Bastien,
>
> Flink supports registering state via a state descriptor when
> calling runtimeContext.getState(). However, on
on't want them to contain unused state.
If anybody has experienced an issue like that before or knows how to handle
this, I would be glad to discuss!
Best regards,
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
Hello Congxian,
Thanks for your response,
Don't you have an example with an Operator extending the
AbstractUdfStreamOperator?
Using the context.getRawKeyedStateInputs() (& output to snapshots)
TimeService is reimplementing the whole stuff :/
--
Bastien DINE
Data Archi
(need to access it and update it), so it's taking
a crazy amount of time and we would like to have it serialized only on
snapshot, so using raw state is a possible good solution,
but I cannot find any example of it :/
Thanks and best regards,
Bastien DINE
Freelance
Data Architect / Software Engineer
enough resources to increase my parallelism),
As far as I have seen, this is not possible.
When I write something like
Async.orderedWait(myStream.keyBy(myKeyselector)), the keyBy is totally
ignored
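For reference, the pattern I am talking about looks roughly like this (names and types are simplified, not my real job):

import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.async.AsyncFunction;

static SingleOutputStreamOperator<String> applyAsync(
        DataStream<String> myStream,
        AsyncFunction<String, String> myAsyncFunction) {
    // keyBy() here only repartitions the input; the async operator itself
    // is not a keyed operator, so the keying is effectively ignored
    return AsyncDataStream.orderedWait(
            myStream.keyBy(value -> value),   // hypothetical key selector
            myAsyncFunction,
            1000, TimeUnit.MILLISECONDS,
            100);                             // queue capacity
}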
Do you have a solution for this?
Best Regards,
Bastien
--
Bastien DINE
Data Architect
(Constructor.java:423)
at
org.apache.flink.runtime.taskmanager.Task.loadAndInstantiateInvokable(Task.java:1405)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:689)
at java.lang.Thread.run(Thread.java:748)
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
with this entrypoint) it will read from ZK
/ HDFS and restore the previous execution
Regards,
Bastien
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Fri, May 24, 2019 at 11:22 PM, Boris Lublinsky wrote:
> Hi,
> I was experimenting with HA lately
containing
field1, field2
I would like to split it into two operators B & C with, respectively, state
field1 and state field2.
How can I split my state across multiple operators?
Best Regards,
Bastien
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
between job manager / task manager / zk
Best Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
sing the client level to submit a program? (with a regular execution
environment)
https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html#client-level
What are the caveats of each method?
Best Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engi
Hello Jamie,
Does #1 apply to batch jobs too?
Regards,
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Mon, Jan 14, 2019 at 8:39 PM, Jamie Grier wrote:
> There are a lot of different ways to deploy Flink. It would be easier to
>
Never mind...
The problem is already discussed in the thread:
"Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s
environment"
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Tue, Jan 15, 2019 at 3:16 PM, bastien dine wrote:
eed to add a service or something to make it work?
Best Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
cally
do the processing, right?
How can I deal with that?
Best regards and many thanks!
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Wed, Dec 12, 2018 at 1:39 PM, Hequn Cheng wrote:
> Hi Bastien,
>
> Each key “belongs” to exa
is related, does "works with the keys" mean always the same keys?
Best Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
Yeah, that was what I was thinking...
Batch can be quick.
Can I report a metric on the "added" action? (Should I override
notifyOfAddedMetric to report it?)
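Something like this is what I have in mind (sketch only; whether this is the right hook is exactly my question):

import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;

// Sketch of a reporter reacting to newly registered ("added") metrics
public class AddedMetricReporter implements MetricReporter {

    @Override
    public void open(MetricConfig config) {}

    @Override
    public void close() {}

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        // report / forward the metric here, at registration time
        System.out.println("added metric: " + group.getMetricIdentifier(metricName));
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {}
}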
------
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Wed, Nov 28, 2018 at 3:54 PM, Chesnay Schepler wrote:
logy?
*Note*: I am using the DataSet API (so my program is batch, not continuous)
Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Tue, Nov 27, 2018 at 5:07 PM, Chesnay Schepler wrote:
> Please enable WARN logging an
Name))
.output(new MetricGaugeOutputFormat<>()); // dummy output
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
Oh god, if we have some code with an Accumulator after the env.execute(), this
will not be executed on the JobManager either?
Thanks, I would be interested indeed!
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Fri, Nov 23, 2018 at 4:37 PM, Flavio
Hello,
I need to chain processing in the DataSet API, so I am launching several jobs,
with multiple env.execute():
topology1.define();
env.execute();
topology2.define();
env.execute();
This is working fine when I am running it within IntelliJ,
but when I am deploying it to my cluster, it only launch
Hi Eric,
You can run a job from another one, using the REST API.
This is the only way we have found to launch a batch job from a streaming
job.
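A rough sketch of what that looks like from inside the streaming job (host, port and jar id are placeholders; the jar has to be uploaded to the cluster beforehand):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: trigger the batch job through the JobManager REST API
// (POST /jars/:jarid/run) using a previously uploaded jar.
static void triggerBatchJob() throws Exception {
    String jarId = "abcd1234_batch-job.jar";   // hypothetical id from /jars/upload
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://jobmanager:8081/jars/" + jarId + "/run"))
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
    // on success the response body contains the id of the started job
}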
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Fri, Nov 23, 2018 at 11:52 AM, Piotr Nowojski wrote:
Hello,
I would like to use a broadcast variable in my OutputFormat (to pass some
information and control the execution flow).
How would I do it?
.output() does not have a .withBroadcastSet() function as it does not extend
SingleInputUdfOperator
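To illustrate the difference, a small sketch (names and types are made up):

import java.util.List;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;

// works: a map operator can receive a broadcast set
static DataSet<String> withBroadcast(DataSet<String> data, DataSet<String> meta) {
    return data
            .map(new RichMapFunction<String, String>() {
                @Override
                public String map(String value) throws Exception {
                    // broadcast variable is available through the runtime context
                    List<String> info = getRuntimeContext().getBroadcastVariable("meta");
                    return value + "|" + info.size();
                }
            })
            .withBroadcastSet(meta, "meta");
}

// does not work: output() returns a DataSink, which has no withBroadcastSet()
// data.output(new MyOutputFormat()).withBroadcastSet(meta, "meta");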
--
Bastien DINE
Data Architect / Software
}
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Wed, Nov 21, 2018 at 10:44 PM, Jamie Grier wrote:
> What you're describing is not possible. There is no runtime context or
> metrics you can use at that point.
>
> The best you can
(i.e. not
in an operator) before & after the env.execute()...
How can I retrieve the runtime context, from the execution env maybe?
Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
Hi Fabian,
Thanks for the response, I am going to use the second solution!
Regards,
Bastien
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Wed, Nov 7, 2018 at 2:16 PM, Fabian Hueske wrote:
> Another option for certain tasks is to w
Hello,
I would like a way to count a dataset to check if it is empty or not...
But .count() triggers an execution and I do not want a separate job
execution plan, as this will trigger multiple readings...
I would like to have something like:
Source -> map -> count -> if 0 -> do something
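To make it concrete, the naive version would look like this sketch (names are made up), which is exactly what I want to avoid because the count triggers its own execution:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;

static void checkEmpty(DataSet<String> source) throws Exception {
    DataSet<String> mapped = source.map(new MapFunction<String, String>() {
        @Override
        public String map(String value) {
            return value.toLowerCase();
        }
    });
    // count() triggers its own job execution (it collects the count on the
    // client), so the source is read here and again for the "real" sink
    long n = mapped.count();
    if (n == 0) {
        // do something
    }
}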
Best, Hequn
> [1] https://flink.apache.org/flink-applications.html#layered-apis
> [2]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql.html#joins
>
> On Tue, Sep 25, 2018 at 4:14 PM bastien dine
> wrote:
>
>> Hello everyone,
>>
>> I need
Hello everyone,
I need to join some files to perform some processing. The DataSet API is a
perfect way to achieve this; I am able to do it when I read the files in batch
(CSV).
However, in the prod environment, I will receive those files in Kafka
messages (one message = one line of a file).
So I am