Zhijiang
--
From: 杨力
Sent: Friday, September 7, 2018, 13:09
To: user
Subject: Flink 1.6 Job fails with IllegalStateException: Buffer pool is destroyed.
Hi all,
I am encountering a weird problem when running Flink 1.6 in YARN per-job
clusters.
The job
You can check the log for the stack trace related to the OOM; maybe we can
confirm the cause from it. Or you can dump the heap to analyze memory usage
after the OOM.
Best,
Zhijiang
--
From: Darshan Singh
Sent: Wednesday, August 29, 2018, 19:22
To
I remember,
that means the downstream will be scheduled after the upstream finishes, so a
slower downstream will not block the upstream from running; backpressure may
not exist in this case.
Best,
Zhijiang
--
From: Darshan Singh
Sent
buffers in record
serializers. If the record size is large and the downstream parallelism is
large, it may cause an OOM issue during serialization.
Could you show the stack trace of the OOM? If this is the case, the following [1]
can solve it; it is a work in progress.
Zhijiang
[1] https
TaskManager received the task deployment late from the
JobManager, or some operation in upstream task initialization unexpectedly cost
more time before registering the result partition.
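If the partition request arrives before the upstream has registered its result partition, giving the downstream more time to retry can help. A minimal flink-conf.yaml sketch, assuming a Flink version where the taskmanager.network.request-backoff.* keys exist; the values are illustrative, not recommendations:

```yaml
# flink-conf.yaml -- illustrative values
# initial backoff before re-requesting an upstream result partition
taskmanager.network.request-backoff.initial: 100
# maximum total backoff before the request fails with PartitionNotFound
taskmanager.network.request-backoff.max: 30000
```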
Best,
Zhijiang
--
From: Steffen Wohlers
Sent: Sunday, July 22, 2018
for a lock which is also
held by the task output process.
As you mentioned, it makes sense to check the data structure of the output
record and reduce its size or make it lightweight to handle.
Best,
Zhijiang
--
From: Gerard Garcia
framework. You can also monitor the GC status to check for full GC delays.
Best,
Zhijiang
--
From: Gerard Garcia
Sent: Friday, July 13, 2018, 16:22
To: wangzhijiang999
Cc: user
Subject: Re: Flink job hangs/deadlocks (possibly related to out of memory)
metrics for some help.
--
From: Vishal Santoshi
Sent: Friday, July 6, 2018, 22:05
To: Zhijiang(wangzhijiang999)
Cc: user
Subject: Re: Limiting in flight data
Further, if there are metrics that allow us to chart delays per pipe on network
buffers, that would be immensely helpful
(will not cause OOM). I think you should not worry
about that. Normally it is better to consider the TPS of both sides and set a
proper parallelism to avoid backpressure to some extent.
Zhijiang
--
From: Mich Talebzadeh
Sent: Wednesday, July 4, 2018
and
taskmanager.network.memory.floating-buffers-per-gate.
If you have other questions about them, let me know and I can explain further.
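For reference, these settings live in flink-conf.yaml. A minimal sketch with illustrative values; the per-channel key shown alongside is an assumption, since the fragment above is truncated:

```yaml
# flink-conf.yaml -- illustrative values, not recommendations
# exclusive buffers allotted to each channel (assumed companion key)
taskmanager.network.memory.buffers-per-channel: 2
# extra floating buffers shared among all channels of one input gate
taskmanager.network.memory.floating-buffers-per-gate: 8
```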
Zhijiang
--
From: Vishal Santoshi
Sent: Thursday, July 5, 2018, 22:28
To: user
Subject: Limiting in flight data
"Yes,
to trigger restarting the job.
Zhijiang
--
From: Gerard Garcia
Sent: Monday, July 2, 2018, 18:29
To: wangzhijiang999
Cc: user
Subject: Re: Flink job hangs/deadlocks (possibly related to out of memory)
Thanks Zhijiang,
We haven't found any other
whether and where the OOM was caused. Maybe check the task
failure logs.
Zhijiang
--
From: gerardg
Sent: Saturday, June 30, 2018, 00:12
To: user
Subject: Re: Flink job hangs/deadlocks (possibly related to out of memory)
(fixed formatting)
Hello
Hi Osh,
As far as I know, currently one DataSet source cannot be consumed by several
different vertices, and from the API you cannot construct the topology you
describe.
I think your way of merging the different reduce functions into one UDF is feasible.
Maybe someone has a better solution. :)
zhijiang
to improve
barrier alignment, which has already been verified to greatly decrease alignment
time in backpressure scenarios.
zhijiang
--
From: Piotr Nowojski <pi...@data-artisans.com>
Sent: Friday, April 6, 2018, 00:06
To: Edward
memory
usage by Netty PooledByteBuffer can be largely reduced and easily
controlled.
Cheers, Zhijiang
--
From: Kurt Young <k...@apache.org>
Sent: Friday, June 30, 2017, 15:51
To: dev <d...@flink.apache.org>; user <user@flink
it can
help you.
Cheers, Zhijiang
--
From: Ray Ruvinskiy <ray.ruvins...@arcticwolf.com>
Sent: Wednesday, June 7, 2017, 23:59
To: user@flink.apache.org <user@flink.apache.org>
Subject: Question regarding configuring number of net
Cheers, Zhijiang
--
From: albertjonathan <alb...@cs.umn.edu>
Sent: Wednesday, April 26, 2017, 02:37
To: user <user@flink.apache.org>
Subject: Multiple consumers on a subpartition
Hello,
Is there a way Flink allows a (pipelined) subpartition
native memory, so you can try to upgrade the version as
Stephan suggested. Good luck!
Cheers, Zhijiang
--
From: Stephan Ewen <se...@apache.org>
Sent: Wednesday, April 19, 2017, 21:25
To: Shannon Carey <sca...@expedia.com>
Cc: user@flin
for
ack in HDFS.
Cheers, Zhijiang
--
From: Jürgen Thomann <juergen.thom...@innogames.com>
Sent: Thursday, April 13, 2017, 15:32
To: user <user@flink.apache.org>
Subject: Re: Re: Changing timeout for cancel command
Hi zhijiang,
Hi Jürgen,
You can set the timeout in the configuration via the key
"akka.ask.timeout"; the current default value is 10 s. Hope this helps.
Cheers, Zhijiang
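For reference, the key mentioned above goes in flink-conf.yaml; a minimal sketch, with an illustrative value rather than a recommendation:

```yaml
# flink-conf.yaml
# raise the Akka ask timeout from its 10 s default, e.g. for slow cancel commands
akka.ask.timeout: 60 s
```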
--
From: Jürgen Thomann <juergen.thom...@innog
when responding with PartitionNotFound,
to track the reason. Looking forward to your further findings!
Cheers, Zhijiang
--
From: Kamil Dziublinski <kamil.dziublin...@gmail.com>
Sent: Tuesday, April 4, 2017, 17:20
To: user <user@flink.apach
buffers.
Cheers, Zhijiang
--
From: lining jing <jinglini...@gmail.com>
Sent: Monday, March 27, 2017, 15:46
To: user <user@flink.apache.org>
Subject: question about record
Hi all, data transmission is achieved through the buffer. If recor
-B1, A1-IntermediateResultPartition-B2, A2-IntermediateResultPartition-B1,
A2-IntermediateResultPartition-B2 in the right graph.
Cheers,
Zhijiang
--
From: lining jing <jinglini...@gmail.com>
Sent: Wednesday, March 15, 2017, 10:54
To: user
case with
JobVertex(A).
Cheers,
Zhijiang
--
From: 윤형덕 <ynoo...@naver.com>
Sent: Monday, March 13, 2017, 12:43
To: user <user@flink.apache.org>
Subject: multiple consumer of intermediate data set
Hi all, figure 1:
https://ci.apache.org/projects
onent in the JobManager is in charge
of recovering state from the completed checkpoint, and the state is then set onto
the Execution in the ExecutionGraph.
Best,
Zhijiang
For YARN cluster mode,
--
From: Dominik Safaric <dominiksafa...@gmail.com>
Sent: 201