Hi Marvin,
Thanks for reporting this issue.
Can you share more details about the failed checkpoint, such as the logs, the
exception stack trace, which state backend is used, and the HA configuration?
This information can help to trace the issue.
Thanks, vino.
2018-07-26 10:12 GMT+08:00 Marvin777 :
> Hi, all:
>
Hi Youjun,
Thanks. You can try this, but I am not sure it will work correctly, because
the REST client changed quite a bit from 1.4 to 1.5.
Maybe you can customize the 1.4 source code by referring to the specific
implementation in 1.5? Another option is to upgrade your Flink version.
To Chesnay
Hi, all:
The Flink job runs normally, but the checkpoint always fails, like this:
[image: image.png]
[image: image.png]
checkpoint configuration:
[image: image.png]
thanks.
Thanks for the information. I forgot to mention that I am using Flink 1.4; the
RestClusterClient doesn't seem to have the ability to retrieve the leader address.
I did notice there is a webMonitorRetrievalService member in Flink 1.5.
I wonder if I can use
Hi Michael,
We are currently using 15 TMs with 4 cores and 4 slots each, 10GB of memory on
each TM. We have 15 partitions on Kafka for stream and 6 for context/smaller
stream. Heap is around 50%, GC is about 150ms and CPU loads are low. We may be
able to reduce resources on this if need be.
Hi,
I'm trying to run a job with Flink's new Python streaming API but I'm
running into issues with Java imports.
I have a Jython project in IntelliJ with a lot of Java dependencies
configured through Maven. I can't figure out how to make Flink "see" these
dependencies.
An example script that
Hi Till,
Thanks for your reply. But I think maybe I did not make my question clear. My
question is not about whether the state within each keyed operator instance
will run out of memory. My question is whether the unlimited keyed
operator instances themselves will run out of memory.
I have been using the JMX reporter to gather data from a running flink
instance, more specifically to access a Gauge in one of my functions.
I am able to access this metric through:
JMXClientListener listener = new JMXClientListener();
JMXServiceURL url = new
Hi Vino,
The data is OK, I double-checked. The input is plain JSON and it can be processed by
the same code compiled and run on Flink 1.3.1. Thanks for the hint about the Avro
and Parquet versions. I got my fat jar synced up with the Flink 1.5.1
Avro/Parquet versions. Hope was high that it would help to resolve the
Hi Till,
Server start up entrypoint log
2018-07-25T12:19:12.268+
[org.apache.flink.runtime.entrypoint.ClusterEntrypoint] INFO
2018-07-25T12:19:12.271+
[org.apache.flink.runtime.entrypoint.ClusterEntrypoint]
Hi,
You are welcome. Hope for enhancing Flink ML in the future.
Vino yang
Thanks.
On 2018-07-25 20:02, Cederic Bosmans wrote:
> Dear
> I was able to fix the problem. Thank you very much for your support!
> Kind regards
> Cederic
> On Wed, Jul 25, 2018 at 1:57 PM Cederic Bosmans wrote:
> > Dear I
Hi Cussac,
If I understand correctly, you want to pass rules.consumer.topic=test
and rules.consumer.topic=test
to the Flink JVM.
I think you can try:
flink run -m $HOSTPORT -yD rules.consumer.topic=test -yD
rules.consumer.topic=test
Hope this helps.
Hequn
On Wed, Jul 25, 2018 at 3:26 PM, Cussac,
Hi Chesnay,
No error in the logs. That is why I am not able to understand why
checkpoints are not getting triggered.
Regards,
Vinay Patil
On Wed, Jul 25, 2018 at 4:36 PM Chesnay Schepler wrote:
> Please check the job- and taskmanager logs for anything suspicious.
>
> On 25.07.2018 12:33, Vinay
Hi Hequn,
Thanks for your answer. I just tested and it doesn’t work.
I’m using PureConfig to parse my conf files. With Java I can override any
argument using the -D= syntax. How can I do the same with Flink in YARN
mode?
Franck.
From: Hequn Cheng [mailto:chenghe...@gmail.com]
Sent: Wednesday 25
Hey guys,
We just built a brand new Flink 1.4.0 cluster with HA and everything seems
to be working fine, but we are getting some errors with savepoints.
For example, I have a running job
-- Running/Restarting Jobs ---
25.07.2018 11:55:18 :
You can use the REST API, documented here:
https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/rest_api.html
On Wed, Jul 25, 2018 at 5:18 PM 从六品州同 <26304...@qq.com> wrote:
> dear all:
> Is there a notification mechanism for Flink? When job's status changes,
> such as restart, failure, notify
Can you provide us with the job code?
I assume that checkpointing runs properly if you submit the same job to
a normal cluster?
On 25.07.2018 13:15, Vinay Patil wrote:
No error in the logs. That is why I am not able to understand why
checkpoints are not getting triggered.
Regards,
Vinay
No error in the logs. That is why I am not able to understand why
checkpoints are not getting triggered.
Regards,
Vinay Patil
On Wed, Jul 25, 2018 at 4:44 PM Vinay Patil wrote:
> Hi Chesnay,
>
> No error in the logs. That is why I am not able to understand why
> checkpoints are getting
Hi Martin,
For a standalone cluster with multiple JM instances, if you do not
use the REST API but use the cluster client provided by Flink, the client can
determine which of the JM instances is the leader.
For example, you can use the CLI to submit a Flink job from a non-leader node.
But
Hi Cederic,
The README says the "*latest*" version is 1.3.2; that was true as of the time it was written.
For the latest Flink version, I think there is no guarantee.
I suggest replacing the version value LATEST with a specific version
number.
Thanks, vino.
2018-07-25 15:49 GMT+08:00 Cederic Bosmans :
> Dear
Please check the job- and taskmanager logs for anything suspicious.
On 25.07.2018 12:33, Vinay Patil wrote:
Hi,
I am starting the cluster using a bootstrap application, wherein I am
calling the Job Manager and Task Manager main classes to form the cluster.
The HA cluster is formed correctly and I am
Hi,
I am starting the cluster using a bootstrap application, wherein I am calling
the Job Manager and Task Manager main classes to form the cluster. The HA cluster
is formed correctly and I am able to submit jobs to this cluster using
RemoteExecutionEnvironment, but when I enable checkpointing in code I
Thanks Chesnay for your inputs.
I am actually starting the cluster using a bootstrap application, wherein I
am calling the Job Manager and Task Manager main classes to form the cluster.
So I have removed the flink-runtime-web dependency and used only the flink_runtime
dependency for forming the cluster, but
Hi,
First of all, the ticket reports a bug (or improvement or feature
suggestion) such that others are aware of the problem and understand its
cause.
At some point it might be picked up and implemented. In general, there is
no guarantee whether or when this happens, but the Flink community is of
Hi,
Following the documentation, I want to use the -yD option to override some params in
my conf, like this:
flink run -m $HOSTPORT -yD
"env.java.opts.taskmanager=-Drules.consumer.topic=test" -yD
"env.java.opts.jobmanager=-Drules.consumer.topic=test" myjar mymain
but it is just ignored. Nothing
Hi,
This is actually very relevant to us as well.
We want to deploy Flink 1.3.2 on a 3 node DCOS cluster. In the case of
Mesos/DCOS, Flink HA runs only one JobManager, which gets restarted on
another node by Marathon in case of failure and reloads its state from
ZooKeeper.
Yuan I am guessing
There's no built-in mechanism to send notifications.
To monitor the job status you can poll the REST API (/jobs/:jobid).
Alternatively you could implement a MetricReporter that explicitly
checks for the availability metrics
Hi, all:
The checkpoint always fails, like this:
https://jira.apache.org/jira/browse/FLINK-9945
[image: image.png]
thanks.
Dear all: Is there a notification mechanism for Flink? When a job's status
changes, such as restart or failure, can it notify other systems?
Or
I have a system to monitor the job state of Flink. What should I do?
thanks
Thank you Fabian for the guide to implement the fix.
I'm not quite clear about the best practice for creating a JIRA ticket. I
changed its priority to Major, as you said that it is important.
What happens next with that issue? Will someone (anyone) pick it up
and create a fix, then include
Hi,
Thanks for creating the Jira issue.
I'm not sure if I would consider this a blocker but it is certainly an
important problem to fix.
Anyway, in the original version Flink checkpoints the modification
timestamp up to which all files have been read (or at least up to which
point it *thinks* to
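The idea of checkpointing a modification timestamp can be sketched in Python as follows. This illustrates the scheme as described in this message and is not Flink's actual implementation. The function name and the dict layout (path mapped to modification timestamp) are hypothetical.

```python
def files_to_process(files, checkpointed_ts):
    """Select files whose modification timestamp is newer than the checkpoint.

    files: dict mapping path -> modification timestamp (hypothetical layout).
    Returns (paths ordered by timestamp, new timestamp to checkpoint).
    """
    selected = sorted(
        (path for path, mtime in files.items() if mtime > checkpointed_ts),
        key=lambda path: files[path],
    )
    # The new checkpoint value is the largest timestamp seen so far.
    new_ts = max([checkpointed_ts] + [files[path] for path in selected])
    return selected, new_ts


# First scan: both files are newer than the initial timestamp 0.
first, ts = files_to_process({"a": 100, "b": 200}, 0)

# Second scan: "c" became visible later but carries an older timestamp (150),
# so a purely timestamp-based filter never selects it.
second, ts = files_to_process({"a": 100, "b": 200, "c": 150}, ts)
```

Under these assumptions, the second scan returns no files even though "c" was never read. This is one way a single checkpointed timestamp can overstate what has actually been processed.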
Hi Ashish,
We are planning for a similar use case and I was wondering if you can share
the amount of resources you have allocated for this flow?
Thanks,
Michael
On Tue, Jul 24, 2018, 18:57 ashish pok wrote:
> BTW,
>
> We got around bootstrap problem for similar use case using a “nohup” topic
Hi Alex,
Could you share with us the full logs of the client and the cluster
entrypoint? That would be tremendously helpful.
Cheers,
Till
On Wed, Jul 25, 2018 at 4:08 AM vino yang wrote:
> Hi Alex,
>
> Is it possible that the data has been corrupted?
>
> Or have you confirmed that the avro
Hi Yuan Youjun,
Actually, RestClusterClient has a method named getWebMonitorBaseUrl, which
automatically retrieves the webmonitor's leader address when you submit a
job.[1]
Ideally, you do not need to retrieve the JM yourself. Currently, the
webmonitor is bound to the JobManager, maybe if JM