Hi Vijay,
From the information you provided (the configurations, error message &
screenshot), I'm not able to find out what the problem is or how to
resolve it.
The error message comes from a healthy task manager, which discovered that
another task manager is not responding. We would need to
Hi Vijay,
The error message suggests that another task manager (10.127.106.54) is not
responding. This could happen when the remote task manager has failed or is
under severe GC pressure. You would need to find the log of the remote task
manager to understand what is happening.
Thank you~
Xintong
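To make the advice above concrete, here is a minimal sketch of how one might scan the remote TaskManager's log for the usual suspects (long GC pauses, OOMs, heartbeat timeouts). The log path and the demo lines are assumptions for illustration, not taken from the thread:

```shell
# Sketch: scan a TaskManager log for common symptoms behind an
# unresponsive TaskManager -- long GC pauses, OOMs, heartbeat timeouts.
# The LOG path and the demo lines below are assumptions for illustration.
LOG=/tmp/sample-taskmanager.log

# Demo input so the grep is runnable as-is; with a real log, skip this step.
printf '%s\n' \
  '2020-05-22 02:49:01 WARN  Heartbeat of TaskManager timed out.' \
  '2020-05-22 02:49:05 INFO  Full GC (Allocation Failure) took 12.3 secs' \
  '2020-05-22 02:49:06 INFO  Task Source (1/1) switched to RUNNING' \
  > "$LOG"

# Print any lines that suggest GC pressure or lost heartbeats.
grep -E 'Full GC|OutOfMemoryError|Heartbeat .* timed out' "$LOG"
```

With a real deployment you would point `LOG` at the remote TaskManager's log file instead of the demo file.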
Thx, Xintong, for the detailed explanation of the memory fraction. I
increased the memory fraction now.
As I increase the defaultParallelism, I keep getting this error:
org.apache.flink.runtime.io.network.partition.consumer.PartitionConnectionException:
Connection for partition
Ah, I guess I had misunderstood what you meant.
Below 18000 tasks, the Flink Job is able to start up.
> Even though I increased the number of slots, it still works when 312 slots
> are being used.
>
When you say "it still works", I thought you meant that you increased the
parallelism and the job was still
Thanks so much, Xintong for guiding me through this. I looked at the Flink
logs to see the errors.
I had to change taskmanager.network.memory.max: 4gb and akka.ask.timeout:
240s to increase the number of tasks.
Now, I am able to increase the number of tasks (aka task vertices).
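For reference, the two settings mentioned above would look like this in flink-conf.yaml. This is a sketch; the key names are valid for Flink 1.x (in Flink 1.10+ the network memory options were renamed to the taskmanager.memory.network.* keys):

```yaml
# flink-conf.yaml (sketch; values taken from the message above)
taskmanager.network.memory.max: 4gb   # upper bound for network buffer memory
akka.ask.timeout: 240s                # timeout for internal RPC round-trips
```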
Could you also explain how you set the parallelism when getting this
execution plan?
I'm asking because this json file itself only shows the resulting execution
plan. It is not clear to me what is not working as expected in your case.
E.g., you set the parallelism for an operator to 10 but the
>
> Increasing network memory buffers (fraction, min, max) seems to increase
> tasks slightly.
That's weird. I don't think the number of network memory buffers has
anything to do with the number of tasks.
Let me try to clarify a few things.
Please be aware that how many tasks a Flink job has, and
Hi Xintong,
Thx for your reply. Increasing network memory buffers (fraction, min, max)
seems to increase tasks slightly.
Streaming job
Standalone
Vijay
On Fri, May 22, 2020 at 2:49 AM Xintong Song wrote:
> Hi Vijay,
>
> I don't think your problem is related to the number of open files. The
>
Hi Vijay,
I don't think your problem is related to the number of open files. The
parallelism of your job is decided before it actually tries to open the
files. And if the OS limit on open files is reached, you should see a job
execution failure, instead of a successful execution with a lower
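As an aside, the OS open-file limit discussed here is easy to inspect. A minimal sketch (the 65536 value is only an illustrative target, not a recommendation from this thread):

```shell
# Print the current per-process open-file soft limit.
ulimit -n

# Raising it for the shell that launches the TaskManager (may require
# adjusting the hard limit, e.g. via /etc/security/limits.conf):
# ulimit -n 65536
```

If the limit were actually hit, you would expect I/O failures in the logs rather than a silently reduced parallelism, which matches the point above.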
Hi,
I have increased the number of slots available, but the job is not using all
the slots and runs into this approximate 18000-task limit. Looking into
the source code, it seems to be opening a file -