spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Robert James
spark-submit includes a spark-assembly uber jar, which has older
versions of many common libraries.  These conflict with some of the
dependencies we need.  I have been racking my brain trying to find a
solution (including experimenting with ProGuard), but haven't been
able to find one: when we use spark-submit, we get NoSuchMethodErrors,
even though the code compiles fine, because the runtime classes are
different from the compile-time classes!

Can someone recommend a solution? We are using Scala, sbt, and
sbt-assembly, but are happy to use another tool (please provide
instructions on how to).
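
One commonly suggested workaround (a sketch only; the dependency names and
versions below are illustrative, not taken from this thread) is to mark Spark
as provided in build.sbt, so that Spark and its transitive libraries stay out
of the assembly that sbt-assembly builds:

    // build.sbt -- minimal sketch; versions are placeholders
    name := "my-spark-app"

    scalaVersion := "2.10.4"

    libraryDependencies ++= Seq(
      // compile against Spark, but rely on the spark-assembly at runtime
      "org.apache.spark" %% "spark-core" % "1.0.0" % "provided",
      // application dependencies are packaged into the assembly as usual
      "joda-time" % "joda-time" % "2.3"
    )

This avoids double-packaging Spark's own classes, but it does not by itself
resolve conflicts with libraries that the spark-assembly also bundles (e.g.
Guava), which is what the rest of this thread is about.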


Re: spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Koert Kuipers
Spark has a setting to put user jars in front of the classpath, which should
do the trick.
However, I had no luck with it. See here:

https://issues.apache.org/jira/browse/SPARK-1863
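
For context, the setting being referred to is presumably the experimental
user-classpath-first flag; the exact key has changed across Spark versions
(spark.files.userClassPathFirst in early 1.x releases, later
spark.executor.userClassPathFirst and spark.driver.userClassPathFirst). A
minimal sketch of setting it from code, assuming an early 1.x release:

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: the property name below is the early-1.x, executor-side
    // key, and the feature was marked experimental at the time.
    val conf = new SparkConf()
      .setAppName("user-classpath-first-demo")
      .set("spark.files.userClassPathFirst", "true")

    val sc = new SparkContext(conf)

Depending on the version, the same property can also be placed in
conf/spark-defaults.conf instead of being set in code.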



On Mon, Jul 7, 2014 at 1:31 PM, Robert James srobertja...@gmail.com wrote:

 spark-submit includes a spark-assembly uber jar, which has older
 versions of many common libraries.  These conflict with some of the
 dependencies we need.  I have been racking my brain trying to find a
 solution (including experimenting with ProGuard), but haven't been
 able to find one: when we use spark-submit, we get NoSuchMethodErrors,
 even though the code compiles fine, because the runtime classes are
 different from the compile-time classes!

 Can someone recommend a solution? We are using Scala, sbt, and
 sbt-assembly, but are happy to use another tool (please provide
 instructions on how to).



Re: spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Chester Chen
I don't have experience deploying to EC2.  Can you use the add-jar conf to
add the missing jar at runtime?  I haven't tried this myself; just a guess.
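
For reference, the standard mechanisms for shipping an extra jar at runtime
are SparkContext.addJar from code and the --jars option of spark-submit. A
minimal sketch using addJar, with a placeholder path that is not from this
thread:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("add-jar-demo"))

    // Distribute an extra jar to the executors for this job.
    // The path is a placeholder and must be readable from the driver.
    sc.addJar("/path/to/extra-dependency.jar")

The equivalent at submit time is a comma-separated list passed to
spark-submit --jars.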


On Mon, Jul 7, 2014 at 12:16 PM, Chester Chen ches...@alpinenow.com wrote:

 With provided scope, you need to supply the provided jars at runtime
 yourself. I guess in this case that means the Hadoop jar files.


 On Mon, Jul 7, 2014 at 12:13 PM, Robert James srobertja...@gmail.com
 wrote:

 Thanks - that did solve my error, but instead I got a different one:
   java.lang.NoClassDefFoundError:
 org/apache/hadoop/mapreduce/lib/input/FileInputFormat

 It seems that with that setting, Spark can't find Hadoop.

 On 7/7/14, Koert Kuipers ko...@tresata.com wrote:
  Spark has a setting to put user jars in front of the classpath, which
  should do the trick.
  However, I had no luck with it. See here:

  https://issues.apache.org/jira/browse/SPARK-1863
 
 
 
  On Mon, Jul 7, 2014 at 1:31 PM, Robert James srobertja...@gmail.com
  wrote:
 
  spark-submit includes a spark-assembly uber jar, which has older
  versions of many common libraries.  These conflict with some of the
  dependencies we need.  I have been racking my brain trying to find a
  solution (including experimenting with ProGuard), but haven't been
  able to find one: when we use spark-submit, we get NoSuchMethodErrors,
  even though the code compiles fine, because the runtime classes are
  different from the compile-time classes!

  Can someone recommend a solution? We are using Scala, sbt, and
  sbt-assembly, but are happy to use another tool (please provide
  instructions on how to).
 
 





Re: spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Robert James
Thanks - that did solve my error, but instead I got a different one:
  java.lang.NoClassDefFoundError:
org/apache/hadoop/mapreduce/lib/input/FileInputFormat

It seems that with that setting, Spark can't find Hadoop.
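
That class lives in hadoop-mapreduce-client-core, which hadoop-client pulls
in transitively. One possible workaround (a sketch only; the version numbers
are placeholders, and the Hadoop version should match what the cluster runs)
is to package the Hadoop client classes into the application assembly so they
are found even when user jars are searched first:

    // build.sbt fragment -- sketch only; versions are placeholders
    libraryDependencies ++= Seq(
      // Spark itself still comes from the spark-assembly at runtime
      "org.apache.spark" %% "spark-core"    % "1.0.0" % "provided",
      // brings in hadoop-mapreduce-client-core, which contains
      // org.apache.hadoop.mapreduce.lib.input.FileInputFormat
      "org.apache.hadoop" %  "hadoop-client" % "2.2.0"
    )

Whether this is needed at all depends on how the user-classpath-first loader
falls back to the Spark assembly, which is what SPARK-1863 (linked below)
tracks.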

On 7/7/14, Koert Kuipers ko...@tresata.com wrote:
 Spark has a setting to put user jars in front of the classpath, which should
 do the trick.
 However, I had no luck with it. See here:

 https://issues.apache.org/jira/browse/SPARK-1863



 On Mon, Jul 7, 2014 at 1:31 PM, Robert James srobertja...@gmail.com
 wrote:

 spark-submit includes a spark-assembly uber jar, which has older
 versions of many common libraries.  These conflict with some of the
 dependencies we need.  I have been racking my brain trying to find a
 solution (including experimenting with ProGuard), but haven't been
 able to find one: when we use spark-submit, we get NoSuchMethodErrors,
 even though the code compiles fine, because the runtime classes are
 different from the compile-time classes!

 Can someone recommend a solution? We are using Scala, sbt, and
 sbt-assembly, but are happy to use another tool (please provide
 instructions on how to).