Let's say we have a Spark interpreter set up as
"The interpreter will be instantiated *Globally* in *shared* process".

While one user is running code through the Spark interpreter,
other users trying to use the same interpreter
get PENDING until the first user's code completes.

Per Spark documentation,
https://spark.apache.org/docs/latest/job-scheduling.html

" *within* each Spark application, multiple “jobs” (Spark actions) may be
> running concurrently if they were submitted by different threads
> ... /skip/
> threads. By “job”, in this section, we mean a Spark action (e.g. save,
> collect) and any tasks that need to run to evaluate that action. Spark’s
> scheduler is fully thread-safe and supports this use case to enable
> applications that serve multiple requests (e.g. queries for multiple users).
> ... /skip/
> Without any intervention, newly submitted jobs go into a *default pool*,
> but jobs’ pools can be set by adding the *spark.scheduler.pool* “local
> property” to the SparkContext in the thread that’s submitting them.    "


So Spark does allow multiple users to share the same SparkContext.

Two quick questions:
1. Why are concurrent users getting PENDING in Zeppelin?
2. Does Zeppelin set *spark.scheduler.pool* per thread as described above?

PS.
We have set the following Spark interpreter settings:
- zeppelin.spark.concurrentSQL = true
- spark.scheduler.mode = FAIR
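
For context, my understanding of how these map onto Spark's API (a sketch;
the app name and allocation-file path are illustrative, and the allocation
file is optional):

    import org.apache.spark.{SparkConf, SparkContext}

    // FAIR mode replaces the default FIFO scheduling between jobs.
    val conf = new SparkConf()
      .setAppName("zeppelin-shared")  // illustrative app name
      .set("spark.scheduler.mode", "FAIR")
      // Optional: pools (weight, minShare, schedulingMode) can be
      // declared in an allocation file; otherwise jobs land in the
      // default pool with default settings.
      .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
    val sc = new SparkContext(conf)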


Thank you,
Ruslan Dautkhanov
