Re: Problem with version compatibility
Hi Sean,

I'm packaging Spark with my (standalone) driver app using Maven. Any assemblies that are used on the Mesos workers, through extending the classpath or providing the jars in the driver (via the SparkConf), aren't packaged with Spark (it seems obvious that would be a mistake). I need, for example, RDD on my classpath in order for my driver app to run.

Are you saying I need to mark Spark as 'provided' in Maven and include an installed distribution's lib directory jars on my classpath? I'm not using anything but the jar files from a Spark install in my driver, so that seemed superfluous (and slightly more difficult to manage in deployment).

Also, even if that's the case, I don't understand why the Maven dependency for the same version as a deployable distribution would have different versions of classes in it than the deployable distribution itself.

Thanks for your patience.

Jim
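(To make sure I understand the suggestion, something like this in the pom? A sketch only; these are the standard 1.4.0 Scala 2.10 coordinates.)

    <!-- Spark marked 'provided': compiled against, but not bundled into the
         app's assembly; the installed distribution supplies it at runtime. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.4.0</version>
      <scope>provided</scope>
    </dependency>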
Re: Problem with version compatibility
Yes, spark-submit adds all this for you. You don't bring Spark classes in your app.

On Thu, Jun 25, 2015, 4:01 PM jimfcarroll <jimfcarr...@gmail.com> wrote:
> ...
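(For reference, a typical spark-submit invocation along these lines; the master URL, main class, and jar name are placeholders:)

    # Launch the driver through spark-submit so the installed distribution's
    # own Spark classes end up on the driver classpath for you.
    spark-submit \
      --master mesos://mesos-master:5050 \
      --class com.example.MyDriver \
      my-driver-app.jar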
Re: Problem with version compatibility
Jim, I do something similar to you. I mark all dependencies as 'provided' and then make sure to drop the same version of spark-assembly into my war as I have on the executors. I don't remember if dropping it in server/lib works; I think I ran into an issue with that. Would love to know best practices when it comes to Tomcat and Spark.

On Thu, Jun 25, 2015 at 11:23 AM, Sean Owen <so...@cloudera.com> wrote:
> Try putting your same Mesos assembly on the classpath of your client
> then, to emulate what spark-submit does. I don't think you merely also
> want to put it on the classpath but make sure nothing else from Spark
> is coming from your app.
>
> In 1.4 there is the 'launcher' API, which makes programmatic access a
> lot more feasible, but it still kinda needs you to get Spark code to
> your driver program, and if it's not the same as on your cluster you'd
> still risk some incompatibilities.
>
> On Thu, Jun 25, 2015 at 6:05 PM, jimfcarroll <jimfcarr...@gmail.com> wrote:
>> ...
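(For anyone curious, the 1.4 launcher API Sean mentions looks roughly like this from Scala. A sketch only: the Spark home, jar path, class name, and master URL are all made up.)

    import org.apache.spark.launcher.SparkLauncher

    object LaunchDriver {
      def main(args: Array[String]): Unit = {
        // Spawn the driver as a child process via the launcher API instead
        // of shelling out to spark-submit by hand. launch() returns a
        // java.lang.Process that the host application can monitor.
        val driver = new SparkLauncher()
          .setSparkHome("/opt/spark-1.4.0")
          .setAppResource("/path/to/my-driver-app.jar")
          .setMainClass("com.example.MyDriver")
          .setMaster("mesos://mesos-master:5050")
          .launch()
        driver.waitFor()
      }
    }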
Re: Problem with version compatibility
Ah. I've avoided using spark-submit primarily because our use of Spark is as part of an analytics library that's meant to be embedded in other applications with their own lifecycle management. One of those applications is a REST app running in Tomcat, which will make the use of spark-submit difficult (if not impossible).

Also, we're trying to avoid sending jars over the wire per-job, so we install our library (minus the Spark dependencies) on the Mesos workers and refer to it in the Spark configuration using spark.executor.extraClassPath. And if I'm reading SparkSubmit.scala correctly, it looks like the user's assembly ends up being sent to the cluster (at least in the case of YARN), though I could be wrong on this.

Is there a standard way of running an app that's in control of its own runtime lifecycle without spark-submit?

Thanks again.

Jim
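(Concretely, the embedded setup I'm describing is along these lines; a sketch, with a placeholder master URL, app name, and install path:)

    import org.apache.spark.{SparkConf, SparkContext}

    // An embedded driver: build the SparkContext directly rather than going
    // through spark-submit, and point executors at a copy of our library
    // that is pre-installed on every Mesos worker.
    val conf = new SparkConf()
      .setMaster("mesos://mesos-master:5050")
      .setAppName("embedded-analytics")
      .set("spark.executor.extraClassPath", "/opt/our-library/our-library.jar")
    val sc = new SparkContext(conf)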
Re: Problem with version compatibility
Yana and Sean,

Thanks for the feedback. I can get it to work a number of ways; I'm just wondering if there's a preferred means.

One last question: is there a reason the deployed Spark install doesn't contain the same version of several classes as the Maven dependency? Is this intentional?

Thanks again.

Jim
Re: Problem with version compatibility
-dev +user

That all sounds fine, except: are you packaging Spark classes with your app? That's the bit I'm wondering about. You would mark it as a 'provided' dependency in Maven.

On Thu, Jun 25, 2015 at 5:12 AM, jimfcarroll <jimfcarr...@gmail.com> wrote:
> ...
Re: Problem with version compatibility
Hi Sean,

I'm running a Mesos cluster. My driver app is built using Maven against the 1.4.0 Maven dependency. The Mesos slave machines have the Spark distribution installed from the distribution link.

I have a hard time understanding how this isn't a standard app deployment, but maybe I'm missing something. If you build a driver app against 1.4.0 using Maven and run it against a Mesos cluster that has the 1.4.0 binary distribution installed, your driver won't run right.

I meant to publish this question on the user list, so my apologies if it's in the wrong place.

Jim
Re: Problem with version compatibility
They are different classes, even. Your problem isn't class-not-found, though. You're also comparing different builds, really. You should not be including Spark code in your app.

On Wed, Jun 24, 2015, 9:48 PM jimfcarroll <jimfcarr...@gmail.com> wrote:
> ...
Re: Problem with version compatibility
These jars are simply incompatible. You can see this by looking at that class in both the Maven repo for 1.4.0 here:

http://central.maven.org/maven2/org/apache/spark/spark-core_2.10/1.4.0/spark-core_2.10-1.4.0.jar

as well as in the spark-assembly jar inside the .tgz file you can get from the official download here:

http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz

Am I missing something?

Thanks
Jim
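(One quick way to compare, assuming the class in question lives at a path like org/apache/spark/rdd/RDD.class and that the assembly inside the .tgz is named spark-assembly-1.4.0-hadoop2.4.0.jar; both names are illustrative:)

    # Extract the same class file from each jar and compare checksums;
    # differing digests mean the bytecode really does differ.
    unzip -p spark-core_2.10-1.4.0.jar org/apache/spark/rdd/RDD.class | md5sum
    unzip -p spark-assembly-1.4.0-hadoop2.4.0.jar org/apache/spark/rdd/RDD.class | md5sum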