I found out that there is a limit on the number of schedulers in
SchedulerFactory.java [1]:
"executor = ExecutorFactory.singleton().createOrGet("SchedulerFactory", 100);"
It can be tested as follows:
1. Set the number in SchedulerFactory (100 in the line above) to a small
   value, for example 16.
2. Run notes with interpreters in isolated mode per user and per note.
3. Observe pending paragraphs once about a dozen interpreter processes have
   started.
There is no limit on the total number of started interpreter processes, but
there is a limit on the number of schedulers.
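To illustrate the effect, here is a minimal sketch in plain Java. It does not
use Zeppelin's classes. It assumes that the shared executor behaves like a
fixed-size thread pool and that each scheduler holds one pool thread for as
long as its interpreter lives. The names SchedulerPoolDemo and
startedSchedulers are made up for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SchedulerPoolDemo {

    /**
     * Submits `schedulers` long-lived tasks to a pool of `poolSize` threads
     * and returns how many of them managed to start while the others wait.
     */
    public static int startedSchedulers(int poolSize, int schedulers) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        // Reaches zero once every task that can get a thread has started.
        CountDownLatch saturated =
                new CountDownLatch(Math.min(poolSize, schedulers));
        AtomicInteger started = new AtomicInteger();
        try {
            for (int i = 0; i < schedulers; i++) {
                pool.execute(() -> {
                    started.incrementAndGet();
                    saturated.countDown();
                    try {
                        // Stand-in for a scheduler loop that keeps its thread.
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            saturated.await(5, TimeUnit.SECONDS);
            // Grace period: queued tasks still cannot start, all threads are busy.
            Thread.sleep(200);
            return started.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();  // let the blocked tasks finish
            pool.shutdownNow();   // discard whatever is still queued
        }
    }

    public static void main(String[] args) {
        int total = 20;
        int started = startedSchedulers(16, total);
        System.out.println("started " + started + " of " + total
                + ", pending " + (total - started));
    }
}
```

With a pool of 16 and 20 long-lived tasks, 16 start and the other 4 stay
queued until a thread is freed. This would match what we see when an
interpreter is restarted and pending paragraphs begin to run.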
Schedulers are created inside interpreters. If a limit is needed, it would be
better to limit the number of interpreter processes.
Is this limitation in schedulers useful?
[1] https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/scheduler/SchedulerFactory.java
Maksim Belousov
From: Belousov Maksim Eduardovich [mailto:[email protected]]
Sent: Tuesday, October 03, 2017 10:37 AM
To: [email protected]
Subject: RE: Is any limitation of maximum interpreter processes?
> Which interpreter is pending?
At some point no paragraph runs, whichever interpreter it uses, and every
paragraph remains in the 'Pending' state.
We use local Spark instances in the Spark interpreter.
The logs don't contain errors.
Maksim Belousov
Architect
Reporting and Data Marts Department
Data Warehouse and Reporting Division
Tel.: +7 495 648-10-00, ext. 2271
From: Jianfeng (Jeff) Zhang [mailto:[email protected]]
Sent: Tuesday, October 03, 2017 2:01 AM
To: [email protected]
Subject: Re: Is any limitation of maximum interpreter processes?
Which interpreter is pending? The Spark interpreter may stay pending because
of YARN resource capacity if you run it in yarn-client mode.
If it is pending, check the log first.
Best regards,
Jeff Zhang
From: Belousov Maksim Eduardovich <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, October 2, 2017 at 9:26 PM
To: "[email protected]" <[email protected]>
Subject: Is any limitation of maximum interpreter processes?
Hello, users!
Our analysts run notes with the following interpreters: markdown, one or two
jdbc, and pyspark. The interpreters are instantiated per user in an isolated
process and per note in an isolated process.
The analysts complain that paragraphs sometimes aren't processed and stay in
the 'Pending' status.
We noticed that this happens when the number of started interpreter processes
reaches about 90-100.
If an admin restarts one of the popular interpreters (which kills some
interpreter processes), the pending paragraphs switch to 'Running'.
We don't see any load on the Zeppelin server while paragraphs are pending: RAM
is sufficient and iowait is ~0.
We also can't find any parameter that sets a maximum number of interpreter
processes.
Has anyone faced the same problem? How can it be solved?
Thanks,
Maksim Belousov