from:"DONG, Weike"

Question on getting the last successfully externalized checkpoint path for crashed jobs

2021-01-11 Thread DONG, Weike

Hi community, We are currently using *Externalized Checkpoints* to prevent abrupt YARN application failures, as it saves a "_metadata" file within the checkpoint folder which is essential for the job's cold recovery. As it is designed in Flink, the completed checkpoint paths are like *hdfs:///fli

Re: TaskManager takes abnormally long time to register with JobManager on Kubernetes for Flink 1.11.0

2020-10-16 Thread DONG, Weike

wrote: > Great, thanks a lot Weike. I think the first step would be to open a JIRA > issue, get assigned and then start on fixing it and opening a PR. > > Cheers, > Till > > On Fri, Oct 16, 2020 at 10:02 AM DONG, Weike > wrote: > >> Hi all, >> >> Than

Re: TaskManager takes abnormally long time to register with JobManager on Kubernetes for Flink 1.11.0

2020-10-16 Thread DONG, Weike

plit > assignments and for the LocationPreferenceSlotSelectionStrategy to > calculate how many TMs run on the same machine). > > Do you want to fix this issue? > > Cheers, > Till > > On Thu, Oct 15, 2020 at 11:38 AM DONG, Weike > wrote: > >> Hi Till and community, >> >>

Re: TaskManager takes abnormally long time to register with JobManager on Kubernetes for Flink 1.11.0

2020-10-15 Thread DONG, Weike

high variance, i.e., normally it completes fast but occasionally some slow results would block the thread. So an unstable DNS server might have a great impact on the performance of Flink job startup. Best, Weike On Thu, Oct 15, 2020 at 5:19 PM DONG, Weike wrote: > Hi Till and commun

Re: TaskManager takes abnormally long time to register with JobManager on Kubernetes for Flink 1.11.0

2020-10-15 Thread DONG, Weike

k at them. My suspicion >> would be that there is some operation blocking the JobMaster's main thread >> which causes the registrations from the TMs to time out. Maybe the logs >> allow me to validate/falsify this suspicion. >> >> Cheers, >> Till >> >> O

Re: TaskManager takes abnormally long time to register with JobManager on Kubernetes for Flink 1.11.0

2020-10-12 Thread DONG, Weike

://gist.github.com/kylemeow/740c470d9b5a1ab3552376193920adce TaskManager-1-1: https://gist.github.com/kylemeow/41b9a8fe91975875c40afaf58276c2fe Thanks : ) Best regards, Weike On Mon, Oct 12, 2020 at 4:14 PM DONG, Weike wrote: > Hi community, > > Recently we have noticed a strange behavior

TaskManager takes abnormally long time to register with JobManager on Kubernetes for Flink 1.11.0

2020-10-12 Thread DONG, Weike

Hi community, Recently we have noticed a strange behavior for Flink jobs on Kubernetes per-job mode: when the parallelism increases, the time it takes for the TaskManagers to register with *JobManager* becomes abnormally long (for a task with parallelism of 50, it could take 60 ~ 120 seconds or ev

Re: Flink YARN app terminated before the client receives the result

2020-03-20 Thread DONG, Weike

>> remember whether a request is currently ongoing or not. >> >> Cheers, >> Till >> >> On Tue, Mar 17, 2020 at 9:01 AM DONG, Weike >> wrote: >> >>> Hi Tison & Till and all, >>> >>> I have uploaded the client, taskmanager an

Re: Flink YARN app terminated before the client receives the result

2020-03-17 Thread DONG, Weike

>> RestServer which then is not able to serve the response to the client. I'm >>> pulling in Aljoscha and Tison who introduced this change. They might be >>> able to verify my theory and propose a solution for it. >>> >>> [1] https://issues.apa

Re: Flink YARN app terminated before the client receives the result

2020-03-12 Thread DONG, Weike

hy the task executor > is killed? If it is killed by Yarn, you might get such info in Yarn > NM/RM logs. > > Best, > Yangze Guo > > Best, > Yangze Guo > > > On Fri, Mar 13, 2020 at 12:31 PM DONG, Weike > wrote: > > > > Hi, > > > > Recently

Flink YARN app terminated before the client receives the result

2020-03-12 Thread DONG, Weike

Hi, Recently I have encountered a strange behavior of Flink on YARN, which is that when I try to cancel a Flink job running in per-job mode on YARN using commands like "cancel -m yarn-cluster -yid application_1559388106022_9412 ed7e2e0ab0a7316c1b65df6047bc6aae" the client happily found and conne

Question on the SQL "GROUPING SETS" and "CUBE" syntax usability

2020-03-09 Thread DONG, Weike

Hi, From the Flink 1.10 official documentation ( https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/queries.html), we could see that GROUPING SETS is only supported in Batch mode. However, we also found that in https://issues.apache.org/jira/browse/FLINK-1

12 matches
