Hi Moon,

How about tracking a dedicated SparkContext per notebook in Spark's remote interpreter? That would allow multiple users to run their Spark paragraphs in parallel, while within a single notebook only one paragraph executes at a time.
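A rough sketch of the idea (names below are hypothetical, and it glosses
over the fact that Spark allows only one active SparkContext per JVM, so
in practice each note's context would need its own remote interpreter
process):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

// Hypothetical registry: one SparkContext per note, so different notes
// can run in parallel while each note's paragraphs stay serialized on
// that note's own context.
public class NoteContextRegistry {
  private final Map<String, SparkContext> contexts = new ConcurrentHashMap<>();

  public SparkContext contextForNote(String noteId) {
    return contexts.computeIfAbsent(noteId, id ->
        // master and other Spark settings would come from the
        // interpreter's configured properties
        new SparkContext(new SparkConf().setAppName("zeppelin-note-" + id)));
  }
}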

Regards,
-Pranav.


On 15/07/15 7:15 pm, moon soo Lee wrote:
Hi,

Thanks for the question.

The reason is simply that it runs code statements, and statements can have ordering and dependencies. Imagine I have two paragraphs:

%spark
val a = 1

%spark
print(a)

If they don't run one by one, they may run in arbitrary order, so the output is not deterministic: either '1' or an error like 'not found: value a'.

That is the reason. But if there is a good way to handle this problem, I agree that using a parallel scheduler would help a lot.
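To see it in scheduler terms: FIFO is like a single-thread executor, and
a parallel scheduler is like a thread pool. A plain Java illustration of
the general hazard (this is not Zeppelin code):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OrderingDemo {
  public static void main(String[] args) {
    // FIFO-style: one worker, tasks run strictly in submission order,
    // so 'print(a)' always sees the earlier 'val a = 1'.
    ExecutorService fifo = Executors.newSingleThreadExecutor();
    fifo.submit(() -> System.out.println("run paragraph: val a = 1"));
    fifo.submit(() -> System.out.println("run paragraph: print(a)"));
    fifo.shutdown();

    // Parallel-style: two workers, so the second paragraph may start
    // before the first finishes -- the 'not found: value a' case.
    ExecutorService parallel = Executors.newFixedThreadPool(2);
    parallel.submit(() -> System.out.println("run paragraph: val a = 1"));
    parallel.submit(() -> System.out.println("run paragraph: print(a)"));
    parallel.shutdown();
  }
}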

Thanks,
moon
On Tue, 14 Jul 2015 at 7:59 PM linxi zeng <linxizeng0...@gmail.com> wrote:

    Anyone who has the same question as me? Or is this not a question?

    2015-07-14 11:47 GMT+08:00 linxi zeng <linxizeng0...@gmail.com>:

        hi, Moon:
           I notice that the getScheduler function in
        SparkInterpreter.java returns a FIFOScheduler, which makes the
        Spark interpreter run Spark jobs one by one. It's not a good
        experience when a couple of users are working on Zeppelin at
        the same time, because they have to wait for each other. At
        the same time, SparkSqlInterpreter can choose which scheduler
        to use via "zeppelin.spark.concurrentSQL".
           My question is: what considerations is this decision based
        on?
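        For reference, the scheduler choice in SparkSqlInterpreter
        looks roughly like this (my paraphrase of the source, so treat
        the details as approximate):

        import org.apache.zeppelin.interpreter.Interpreter;
        import org.apache.zeppelin.scheduler.Scheduler;
        import org.apache.zeppelin.scheduler.SchedulerFactory;

        public class SparkSqlInterpreter extends Interpreter {
          // ... rest of the interpreter omitted ...

          boolean concurrentSQL() {
            return Boolean.parseBoolean(getProperty("zeppelin.spark.concurrentSQL"));
          }

          @Override
          public Scheduler getScheduler() {
            if (concurrentSQL()) {
              // Parallel scheduler: up to maxConcurrency paragraphs at once.
              int maxConcurrency = 10;
              return SchedulerFactory.singleton().createOrGetParallelScheduler(
                  SparkSqlInterpreter.class.getName() + this.hashCode(), maxConcurrency);
            }
            // Otherwise run serially; the actual source delegates to
            // SparkInterpreter's scheduler, which is a FIFOScheduler.
            return SchedulerFactory.singleton().createOrGetFIFOScheduler(
                SparkSqlInterpreter.class.getName() + this.hashCode());
          }
        }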


