Hi Moon,
How about tracking a dedicated SparkContext per notebook in Spark's
remote interpreter? This would allow multiple users to run their Spark
paragraphs in parallel, while within a single notebook only one
paragraph is executed at a time.
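Roughly what I have in mind, as a hypothetical sketch (the noteId
plumbing and the exact SchedulerFactory method names are my assumptions,
not the current API):

import org.apache.zeppelin.scheduler.Scheduler;
import org.apache.zeppelin.scheduler.SchedulerFactory;

// Hypothetical sketch: key the FIFO scheduler on the note id instead of
// the interpreter instance. Paragraphs within one note still run one by
// one, but different notes get separate queues and can run in parallel.
class PerNoteScheduling {
  Scheduler getSchedulerForNote(String noteId) { // noteId plumbing assumed
    return SchedulerFactory.singleton()
        .createOrGetFIFOScheduler("RemoteSparkInterpreter-" + noteId);
  }
}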
Regards,
-Pranav.
On 15/07/15 7:15 pm, moon soo Lee wrote:
Hi,
Thanks for asking the question.
The reason is simply that the interpreter runs code statements, and the
statements can have order and dependency. Imagine I have two paragraphs:
%spark
val a = 1

%spark
print(a)
If they do not run one by one, they may run in an arbitrary order, and
the output will not be deterministic: either '1' or an error like
'not found: value a'.
This is the reason why. But if there is a nice idea to handle this
problem, I agree that using a parallel scheduler would help a lot.
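For reference, the choice linxi mentioned is along these lines inside
SparkSqlInterpreter (a simplified sketch from memory; the property
handling and the max concurrency value are approximations, not the
exact code):

public Scheduler getScheduler() {
  if (Boolean.parseBoolean(getProperty("zeppelin.spark.concurrentSQL"))) {
    // Parallel scheduler: SQL paragraphs may run concurrently, since a
    // SQL statement usually has no val-style dependency on earlier ones.
    return SchedulerFactory.singleton().createOrGetParallelScheduler(
        SparkSqlInterpreter.class.getName() + this.hashCode(), 10);
  }
  // FIFO scheduler keeps the ordering guarantee described above. (The
  // real code may instead reuse SparkInterpreter's scheduler here.)
  return SchedulerFactory.singleton().createOrGetFIFOScheduler(
      SparkSqlInterpreter.class.getName() + this.hashCode());
}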
Thanks,
moon
On Tue, Jul 14, 2015 at 7:59 PM linxi zeng
<linxizeng0...@gmail.com> wrote:
Does anyone have the same question as me? Or is this not a question?
2015-07-14 11:47 GMT+08:00 linxi zeng <linxizeng0...@gmail.com>:
Hi, Moon:
I notice that the getScheduler function in SparkInterpreter.java
returns a FIFOScheduler, which makes the Spark interpreter run
Spark jobs one by one. It's not a good experience when a couple of
users work on Zeppelin at the same time, because they have to wait
for each other. Meanwhile, SparkSqlInterpreter can choose which
scheduler to use via "zeppelin.spark.concurrentSQL".
My question is: what considerations is this decision based on?
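For context, the method I am referring to looks roughly like this
(quoted from memory, so the details may differ):

public Scheduler getScheduler() {
  // Every Spark paragraph, from every user, lands on this single FIFO
  // queue, which is why users end up waiting for each other.
  return SchedulerFactory.singleton().createOrGetFIFOScheduler(
      SparkInterpreter.class.getName() + this.hashCode());
}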