Hi Piyush,

Separate instances of SparkILoop and SparkIMain for each notebook, while
sharing the SparkContext, sounds great.

Actually, I tried to do it and found a problem: multiple SparkILoop instances
can generate the same class names, and the Spark executors confuse those
class names since they read classes from a single SparkContext.
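To illustrate the clash, here is a simplified, hypothetical sketch. Spark's
real REPL wraps each statement in generated classes with names along the
lines of `$lineN.$read` (the exact scheme is internal to SparkIMain and may
differ by version), so two sessions produce identical names, while scoping
the names with a per-notebook id would keep them distinct:

```java
// Simplified sketch of REPL-generated class names. The real naming is
// internal to SparkIMain; the per-notebook prefix below is hypothetical.
class ReplClassNames {
    // Name a single session generates for statement N (simplified).
    static String wrapperClass(int line) {
        return "$line" + line + ".$read";
    }

    // A conceivable fix: scope generated names by a notebook id, so two
    // notebooks compiling "line 1" no longer collide on the executors.
    static String scopedWrapperClass(String noteId, int line) {
        return "$" + noteId + ".$line" + line + ".$read";
    }
}
```

With the unscoped scheme, two independent sessions both emit `$line1.$read`
for their first statement; the scoped variant disambiguates them.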

If someone can share an idea for safely sharing a single SparkContext across
multiple SparkILoop instances, it would be really helpful.
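For the scheduling side, here is a minimal sketch (hypothetical, not
Zeppelin's actual scheduler API) of "parallel across notebooks, sequential
within a notebook", which is the behavior discussed below:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one single-thread executor per notebook id keeps
// paragraphs of the same notebook strictly ordered, while paragraphs of
// different notebooks can run in parallel.
class NotebookScheduler {
    private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

    Future<?> submit(String noteId, Runnable paragraph) {
        return executors
                .computeIfAbsent(noteId, id -> Executors.newSingleThreadExecutor())
                .submit(paragraph);
    }
}
```

The single-thread executor preserves the ordering guarantee that the FIFO
scheduler gives today, just at notebook granularity instead of globally.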

Thanks,
moon


On Mon, Aug 10, 2015 at 1:21 AM Piyush Mukati (Data Platform) <
piyush.muk...@flipkart.com> wrote:

> Hi Moon,
> Any suggestions on this? We have to wait a lot when multiple people are
> working with Spark.
> Can we create a separate instance of SparkILoop, SparkIMain, and print
> streams for each notebook, while sharing the SparkContext, ZeppelinContext,
> SQLContext, and DependencyResolver, and then use the parallel scheduler?
> thanks
>
> -piyush
>
>
> Hi Moon,
>
> How about tracking a dedicated SparkContext for each notebook in Spark's
> remote interpreter? This would allow multiple users to run their Spark
> paragraphs in parallel. Also, within a notebook only one paragraph is
> executed at a time.
>
> Regards,
> -Pranav.
>
>
> On 15/07/15 7:15 pm, moon soo Lee wrote:
> > Hi,
> >
> > Thanks for asking the question.
> >
> > The reason is simply that it is running code statements. The
> > statements can have ordering and dependencies. Imagine I have two paragraphs:
> >
> > %spark
> > val a = 1
> >
> > %spark
> > print(a)
> >
> > If they're not run one by one, they may run in random order, and the
> > output will vary from run to run: either '1' or an error that 'a'
> > cannot be found.
> >
> > This is the reason why. But if there is a nice idea to handle this
> > problem, I agree that using a parallel scheduler would help a lot.
> >
> > Thanks,
> > moon
> > On Tue, Jul 14, 2015 at 7:59 PM linxi zeng
> > <linxizeng0...@gmail.com> wrote:
> >
> >     Anyone else with the same question as me? Or is this not a question?
> >
> >     2015-07-14 11:47 GMT+08:00 linxi zeng <linxizeng0...@gmail.com>:
> >
> >         hi, Moon:
> >            I notice that the getScheduler function in
> >         SparkInterpreter.java returns a FIFOScheduler, which makes
> >         the Spark interpreter run Spark jobs one by one. It's not a
> >         good experience when several users work on Zeppelin at the
> >         same time, because they have to wait for each other.
> >         At the same time, SparkSqlInterpreter can choose which
> >         scheduler to use via "zeppelin.spark.concurrentSQL".
> >         My question is: what considerations did you base this
> >         decision on?
> >
> >
>
>
>
>
