Re: In native k8s application mode, how can I know whether the job failed or finished?

2021-06-03 Thread LIU Xiao
Thank you for the timely help!

I've tried session mode a little bit, and it's better than I thought: the
TaskManagers can be allocated and de-allocated dynamically. But it seems the
memory size of the TaskManagers is fixed when the session starts and cannot
be adjusted per job.
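
To illustrate what I mean (the cluster-id and sizes below are only example
values, not required ones), the TaskManager size comes from the options the
session cluster is started with and applies to every TM the session
allocates afterwards:

    ./bin/kubernetes-session.sh \
        -Dkubernetes.cluster-id=my-flink-session \
        -Dtaskmanager.memory.process.size=2048m \
        -Dtaskmanager.numberOfTaskSlots=2
    # jobs submitted to this session cannot override the TM memory
    # size on a per-job basis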

I'll try to deploy a history server on k8s later...


Re: In native k8s application mode, how can I know whether the job failed or finished?

2021-06-03 Thread Xintong Song
There are two ways to access the status of a job after it is finished.

1. You can try native K8s deployment in session mode. When jobs are
finished in this mode, the TMs will be automatically released after a
short period of time, while the JM will not be terminated until you
explicitly shut down the session cluster. Thus, the status of historical
jobs can still be accessed via the JM (see the example commands after
this list).

2. You can try setting up a history server [1], where information about
finished jobs can be archived (a configuration sketch follows the link
below).
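
As a rough sketch of option 1 (the cluster-id and the example jar below
are placeholders, not required values):

    # 1) start a session cluster on native K8s
    ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-flink-session

    # 2) submit a job to the running session
    ./bin/flink run --target kubernetes-session \
        -Dkubernetes.cluster-id=my-flink-session \
        ./examples/streaming/TopSpeedWindowing.jar

    # 3) after the job finishes, its final state can still be queried
    #    from the JM; "-a" also lists jobs in a terminal state
    ./bin/flink list -a --target kubernetes-session \
        -Dkubernetes.cluster-id=my-flink-session

    # 4) the session cluster keeps running until you delete it yourself
    kubectl delete deployment/my-flink-session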

Thank you~

Xintong Song


[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/advanced/historyserver/
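
In case it helps, a minimal sketch of the history server setup (the
archive directory, port, and host below are placeholders; any
Flink-supported filesystem such as HDFS or S3 should work):

    # in flink-conf.yaml, on the JobManager side (where to archive
    # finished jobs) and on the history server side (where to read
    # them from):
    #   jobmanager.archive.fs.dir: s3://my-bucket/flink/completed-jobs
    #   historyserver.archive.fs.dir: s3://my-bucket/flink/completed-jobs
    #   historyserver.web.port: 8082

    # start the history server process (it can run as its own
    # deployment on K8s)
    ./bin/historyserver.sh start

    # the history server exposes a REST API similar to the JM's, so a
    # scheduling system can poll the final status of archived jobs:
    curl http://<history-server-host>:8082/jobs/overview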

On Thu, Jun 3, 2021 at 2:46 PM 刘逍  wrote:

> Hi,
>
> We are currently using Flink 1.6 in standalone mode, but the lack of
> isolation is a headache for us. At present, I am trying the
> application mode of Flink 1.13.0 on native K8s.
>
> I found that as soon as the job ends, whether normally or abnormally,
> the JobManager can no longer be accessed, so the "flink list" command
> cannot get the final state of the job.
>
> The K8s pods are also deleted immediately; "kubectl get pod" only
> shows "running", then "terminating", and finally "not found".
>
> The Flink job needs to be managed by our internal scheduling system, so
> I need to find a way to let the scheduling system know whether the job
> ends normally or abnormally.
>
> Is there any way?
>


In native k8s application mode, how can I know whether the job failed or finished?

2021-06-02 Thread 刘逍
Hi,

We are currently using Flink 1.6 in standalone mode, but the lack of
isolation is a headache for us. At present, I am trying the
application mode of Flink 1.13.0 on native K8s.

I found that as soon as the job ends, whether normally or abnormally,
the JobManager can no longer be accessed, so the "flink list" command
cannot get the final state of the job.

The K8s pods are also deleted immediately; "kubectl get pod" only
shows "running", then "terminating", and finally "not found".

The Flink job needs to be managed by our internal scheduling system, so
I need to find a way to let the scheduling system know whether the job
ends normally or abnormally.

Is there any way?