Re: Problem with version compatibility

2015-06-25 Thread jimfcarroll
Hi Sean,

I'm packaging Spark with my (standalone) driver app using Maven. Any
assemblies that are used on the Mesos workers, whether through extending the
classpath or by providing the jars in the driver (via the SparkConf), aren't
packaged with Spark (it seems obvious that doing so would be a mistake).

I need, for example, RDD on my classpath in order for my driver app to
run. Are you saying I need to mark Spark as 'provided' in Maven and put an
installed distribution's lib directory jars on my classpath?

I'm not using anything but the jar files from a Spark install in my driver,
so that seemed superfluous (and it makes the deployment slightly harder to
manage). Also, even if that's the case, I don't understand why the Maven
dependency for the same version as a deployable distribution would contain
different versions of classes than the distribution itself.

Thanks for your patience.
Jim







Re: Problem with version compatibility

2015-06-25 Thread Sean Owen
Yes, spark-submit adds all of this for you. You don't bring Spark classes in
your app.
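
For example, with Spark marked as 'provided' in your build, launching the
driver is roughly this (the master URL, class name and jar name are
illustrative):

  spark-submit \
    --master mesos://your-mesos-master:5050 \
    --class com.example.MyDriver \
    my-driver-app.jar

spark-submit puts the cluster's own Spark assembly on the classpath, so the
driver and the executors see the same Spark build.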

On Thu, Jun 25, 2015, 4:01 PM jimfcarroll jimfcarr...@gmail.com wrote:

 Hi Sean,

 I'm packaging Spark with my (standalone) driver app using Maven. Any
 assemblies that are used on the Mesos workers, whether through extending the
 classpath or by providing the jars in the driver (via the SparkConf), aren't
 packaged with Spark (it seems obvious that doing so would be a mistake).

 I need, for example, RDD on my classpath in order for my driver app to
 run. Are you saying I need to mark Spark as 'provided' in Maven and put an
 installed distribution's lib directory jars on my classpath?

 I'm not using anything but the jar files from a Spark install in my driver,
 so that seemed superfluous (and it makes the deployment slightly harder to
 manage). Also, even if that's the case, I don't understand why the Maven
 dependency for the same version as a deployable distribution would contain
 different versions of classes than the distribution itself.

 Thanks for your patience.
 Jim








Re: Problem with version compatibility

2015-06-25 Thread Yana Kadiyska
Jim, I do something similar to you. I mark all dependencies as provided and
then make sure to drop the same version of spark-assembly in my war as I
have on the executors. I don't remember if dropping it in server/lib works;
I think I ran into an issue with that. I'd love to know the best practices
when it comes to Tomcat and Spark.

On Thu, Jun 25, 2015 at 11:23 AM, Sean Owen so...@cloudera.com wrote:

 Try putting your same Mesos assembly on the classpath of your client
 then, to emulate what spark-submit does. I don't think you merely want to
 put it on the classpath; you also want to make sure nothing else from Spark
 is coming from your app.

 In 1.4 there is the 'launcher' API, which makes programmatic access a lot
 more feasible, but it still kinda needs you to get Spark code to your
 driver program, and if it's not the same as on your cluster you'd still
 risk some incompatibilities.
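
 For example, a rough (untested) sketch with the launcher API; the Spark
 home, jar path, class name and master URL below are all illustrative:

   import org.apache.spark.launcher.SparkLauncher

   // Starts spark-submit in a child process using the cluster's own Spark
   // install, so the Spark classes actually used come from SPARK_HOME
   // rather than from whatever the application bundled.
   val driver = new SparkLauncher()
     .setSparkHome("/opt/spark-1.4.0-bin-hadoop2.4")
     .setAppResource("/path/to/my-driver-app.jar")
     .setMainClass("com.example.MyDriver")
     .setMaster("mesos://your-mesos-master:5050")
     .launch()                     // returns a java.lang.Process
   driver.waitFor()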

 On Thu, Jun 25, 2015 at 6:05 PM, jimfcarroll jimfcarr...@gmail.com
 wrote:
  Ah. I've avoided using spark-submit primarily because our use of Spark is
  as part of an analytics library that's meant to be embedded in other
  applications with their own lifecycle management.

  One of those applications is a REST app running in Tomcat, which makes the
  use of spark-submit difficult (if not impossible).

  Also, we're trying to avoid sending jars over the wire per job, so we
  install our library (minus the Spark dependencies) on the Mesos workers and
  refer to it in the Spark configuration using spark.executor.extraClassPath.
  If I'm reading SparkSubmit.scala correctly, it looks like the user's
  assembly ends up being sent to the cluster (at least in the case of YARN),
  though I could be wrong about that.

  Is there a standard way of running an app that's in control of its own
  runtime lifecycle without spark-submit?
 
  Thanks again.
  Jim
 
 
 
 




Re: Problem with version compatibility

2015-06-25 Thread jimfcarroll
Ah. I've avoided using spark-submit primarily because our use of Spark is as
part of an analytics library that's meant to be embedded in other
applications with their own lifecycle management.

One of those applications is a REST app running in Tomcat, which makes the
use of spark-submit difficult (if not impossible).

Also, we're trying to avoid sending jars over the wire per job, so we
install our library (minus the Spark dependencies) on the Mesos workers and
refer to it in the Spark configuration using spark.executor.extraClassPath.
If I'm reading SparkSubmit.scala correctly, it looks like the user's
assembly ends up being sent to the cluster (at least in the case of YARN),
though I could be wrong about that.
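
Roughly, the driver does something like this (just a sketch; the app name,
master URL and install path are illustrative):

  import org.apache.spark.{SparkConf, SparkContext}

  // The executors pick up our pre-installed library from a worker-local
  // path instead of having jars shipped to them per job.
  val conf = new SparkConf()
    .setAppName("analytics-driver")
    .setMaster("mesos://your-mesos-master:5050")
    .set("spark.executor.extraClassPath", "/opt/our-analytics-lib/lib/*")
  val sc = new SparkContext(conf)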

Is there a standard way of running an app that's in control of its own
runtime lifecycle without spark-submit?

Thanks again.
Jim







Re: Problem with version compatibility

2015-06-25 Thread jimfcarroll
Yana and Sean,

Thanks for the feedback. I can get it to work a number of ways; I'm just
wondering whether there's a preferred one.

One last question: is there a reason the deployed Spark install doesn't
contain the same versions of several classes as the Maven dependency? Is
this intentional?

Thanks again.
Jim







Re: Problem with version compatibility

2015-06-25 Thread Sean Owen
-dev +user

That all sounds fine, except: are you packaging Spark classes with your
app? That's the bit I'm wondering about. You would mark it as a 'provided'
dependency in Maven.
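
For example, something along these lines in the pom (coordinates shown for
Spark 1.4.0 built for Scala 2.10):

  <!-- Spark comes from the cluster / spark-submit at runtime,
       so it is not packaged into the application artifact -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.4.0</version>
    <scope>provided</scope>
  </dependency>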

On Thu, Jun 25, 2015 at 5:12 AM, jimfcarroll jimfcarr...@gmail.com wrote:
 Hi Sean,

 I'm running a Mesos cluster. My driver app is built with Maven against the
 Spark 1.4.0 Maven dependency.

 The Mesos slave machines have the Spark distribution installed from the
 distribution link.

 I have a hard time understanding how this isn't a standard app deployment,
 but maybe I'm missing something.

 If you build a driver app against 1.4.0 using Maven and run it against a
 Mesos cluster that has the 1.4.0 binary distribution installed, your driver
 won't run correctly.

 I meant to publish this question on the user list, so my apologies if it's
 in the wrong place.

 Jim









Re: Problem with version compatibility

2015-06-24 Thread jimfcarroll
Hi Sean,

I'm running a Mesos cluster. My driver app is built with Maven against the
Spark 1.4.0 Maven dependency.

The Mesos slave machines have the Spark distribution installed from the
distribution link.

I have a hard time understanding how this isn't a standard app deployment,
but maybe I'm missing something.

If you build a driver app against 1.4.0 using Maven and run it against a
Mesos cluster that has the 1.4.0 binary distribution installed, your driver
won't run correctly.

I meant to publish this question on the user list, so my apologies if it's
in the wrong place.

Jim







Re: Problem with version compatibility

2015-06-24 Thread Sean Owen
They are even different classes. Your problem isn't class-not-found,
though. You're also really comparing different builds. You should not be
including Spark code in your app.

On Wed, Jun 24, 2015, 9:48 PM jimfcarroll jimfcarr...@gmail.com wrote:

 These jars are simply incompatible. You can see this by looking at that
 class in both the Maven repo for 1.4.0 here:


 http://central.maven.org/maven2/org/apache/spark/spark-core_2.10/1.4.0/spark-core_2.10-1.4.0.jar

 as well as the spark-assembly jar inside the .tgz file you can get from the
 official download here:

 http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz

 Am I missing something?

 Thanks
 Jim








Re: Problem with version compatibility

2015-06-24 Thread jimfcarroll
These jars are simply incompatible. You can see this by looking at that class
in both the Maven repo for 1.4.0 here:

http://central.maven.org/maven2/org/apache/spark/spark-core_2.10/1.4.0/spark-core_2.10-1.4.0.jar

as well as the spark-assembly jar inside the .tgz file you can get from the
official download here:

http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz
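
For instance, one way to compare is to pull the same class out of each jar
and diff what javap reports for it (the class name, jar names and paths
below are only examples; in the unpacked .tgz the assembly jar is under lib/):

  mkdir -p /tmp/from-maven /tmp/from-dist
  unzip -o spark-core_2.10-1.4.0.jar 'org/apache/spark/rdd/RDD*.class' -d /tmp/from-maven
  unzip -o spark-assembly-1.4.0-hadoop2.4.0.jar 'org/apache/spark/rdd/RDD*.class' -d /tmp/from-dist
  diff <(javap -classpath /tmp/from-maven org.apache.spark.rdd.RDD) \
       <(javap -classpath /tmp/from-dist org.apache.spark.rdd.RDD)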

Am I missing something?

Thanks
Jim



