Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-11 Thread Michael Armbrust
The way that we do this is to have a single context with a server in front
that multiplexes jobs onto that shared context.  Even if you aren't sharing
data, this will give you the best fine-grained sharing of the resources that
the context is managing.
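
A minimal sketch of that pattern in Scala, assuming the FAIR scheduler for the
fine-grained sharing; the handler shape and input paths are illustrative
placeholders, not something prescribed in this thread:

import org.apache.spark.{SparkConf, SparkContext}

// One shared context for the whole server process.
val conf = new SparkConf()
  .setAppName("shared-context-server")
  .set("spark.scheduler.mode", "FAIR") // fair sharing across concurrent jobs
val sc = new SparkContext(conf)

// Hypothetical per-request handler: each request runs its job on its own
// thread against the shared context, in its own scheduler pool.
def handleRequest(requestId: String, inputPath: String): Long = {
  sc.setLocalProperty("spark.scheduler.pool", s"pool-$requestId")
  try sc.textFile(inputPath).count()
  finally sc.setLocalProperty("spark.scheduler.pool", null)
}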



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-11 Thread Mike Wright
Thanks for the insight!







Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-11 Thread Mike Wright
Somewhat related: what's the correct implementation when you have a single
cluster supporting multiple jobs that are unrelated and NOT sharing data? I
was directed to support "multiple contexts" via the job server, and told that
multiple contexts per JVM is not really supported. So, via the job server, how
does one support multiple contexts in DIFFERENT JVMs? When I specify multiple
contexts in the conf file, initialization of the subsequent contexts fails.
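
On the different-JVMs part: spark-jobserver has, in some versions, a mode that
runs each context in its own JVM. A sketch of what that configuration might
look like, where the setting name is an assumption to verify against your
jobserver version, not something confirmed in this thread:

# jobserver conf sketch -- verify 'context-per-jvm' exists in your version
spark {
  jobserver {
    context-per-jvm = true  # each context gets a separate JVM process
  }
}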





is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread prateek arora
Hi

I want to create multiple SparkContexts in my application.
Many of the articles I have read suggest that "usage of multiple contexts is
discouraged, since SPARK-2243 is still not resolved."
I want to know: does Spark 1.5.0 support creating multiple contexts without
error? And if it does, do we need to set the
"spark.driver.allowMultipleContexts" configuration parameter?

Regards
Prateek
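
For reference, a minimal sketch of what setting that flag looks like. Note
that the flag only suppresses Spark's multiple-context error check; it does
not make the pattern safe or supported while SPARK-2243 remains open:

import org.apache.spark.{SparkConf, SparkContext}

// Disables the error check only; shared driver-side global state can still
// misbehave, which is why multiple contexts remain discouraged.
val conf1 = new SparkConf().setAppName("ctx-1")
  .set("spark.driver.allowMultipleContexts", "true")
val sc1 = new SparkContext(conf1)

val conf2 = new SparkConf().setAppName("ctx-2")
  .set("spark.driver.allowMultipleContexts", "true")
val sc2 = new SparkContext(conf2) // constructs without error, but unsupported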






Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread Ted Yu
See Josh's response in this thread:

http://search-hadoop.com/m/q3RTt1z1hUw4TiG1=Re+Question+about+yarn+cluster+mode+and+spark+driver+allowMultipleContexts

Cheers



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread Michael Armbrust

Still, if you want fine-grained sharing of compute resources as well, you
want to use a single SparkContext.


Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread prateek arora
Hi Ted,
Thanks for the information.
Is there any way for two different Spark applications to share their data?

Regards
Prateek



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread prateek arora
Thanks ...

Is there any way for my second application to run in parallel and wait to
fetch data from HBase or another data storage system?

Regards
Prateek



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread Ted Yu
How about using a NoSQL data store such as HBase? :-)
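
A minimal sketch of the write side of that idea, assuming an existing
SparkContext sc, HBase 1.x, and a hypothetical table "shared_results" with
column family "cf"; a second Spark application would then read the same table:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job

val hconf = HBaseConfiguration.create()
hconf.set(TableOutputFormat.OUTPUT_TABLE, "shared_results")
val job = Job.getInstance(hconf)
job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

// Write (rowKey, value) pairs from the first application into HBase.
sc.parallelize(Seq(("row1", "v1"), ("row2", "v2")))
  .map { case (key, value) =>
    val put = new Put(Bytes.toBytes(key))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
    (new ImmutableBytesWritable, put)
  }
  .saveAsNewAPIHadoopDataset(job.getConfiguration)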



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread Michael Armbrust
To be clear, I don't think there is ever a compelling reason to create more
than one SparkContext in a single application.  The context is thread-safe
and can launch many jobs in parallel from multiple threads.  Even if there
weren't global state that made it unsafe to do so, creating more than one
context creates an artificial barrier that prevents sharing of RDDs between
the two.
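
A minimal sketch of that pattern, with placeholder input paths; each Future
launches an independent job against the one shared context:

import org.apache.spark.{SparkConf, SparkContext}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val sc = new SparkContext(new SparkConf().setAppName("shared-context"))

// Two unrelated jobs running concurrently on the same SparkContext.
val jobA = Future { sc.textFile("hdfs:///data/a").count() }
val jobB = Future { sc.textFile("hdfs:///data/b").count() }

val countA = Await.result(jobA, 10.minutes)
val countB = Await.result(jobB, 10.minutes)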



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread Mark Hamstra
Where it could start to make some sense is if you wanted a single
application to be able to work with more than one Spark cluster -- but
that's a pretty weird or unusual thing to do, and I'm pretty sure it
wouldn't work correctly at present.



Re: is Multiple Spark Contexts is supported in spark 1.5.0 ?

2015-12-04 Thread Anfernee Xu
If multiple users are looking at the same data set, then sharing the
SparkContext is a good choice.

But my use cases are different: users are looking at different data (I use a
custom Hadoop InputFormat to load data from my data source based on the user
input), and the data might not have any overlap. For now I'm taking the
approach below:

 1.  Once my webapp receives a user request, it submits a custom Yarn
application to my Yarn cluster to allocate a container, where a SparkContext
is created and my driver runs.
 2. I have to design a coordination/callback protocol so that the user
session in my webapp can be notified when the Spark job is finished and the
result can be pushed back (see the sketch after this list).
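
A minimal sketch of step 2's callback, where the endpoint, payload shape, and
helper name are all hypothetical, not from this thread:

import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

// Driver-side notification once the job completes: POST the result location
// back to the webapp session. Error handling and retries omitted.
def notifyWebapp(callbackUrl: String, sessionId: String, resultPath: String): Unit = {
  val conn = new URL(callbackUrl).openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("POST")
  conn.setDoOutput(true)
  conn.setRequestProperty("Content-Type", "application/json")
  val body = s"""{"sessionId":"$sessionId","resultPath":"$resultPath"}"""
  conn.getOutputStream.write(body.getBytes(StandardCharsets.UTF_8))
  conn.getResponseCode  // trigger the request; a real caller would check == 200
  conn.disconnect()
}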

Let me know if you have a better solution.

Thanks



-- 
--Anfernee