Re: Spark on Windows
Thanks, Sree! Are you able to run your applications using spark-submit? Even after we were able to build successfully, we ran into problems running the spark-submit script. If everything worked correctly for you, we can hope that things will be smoother when 1.4.0 is made generally available.

arun

On Thu, Apr 16, 2015 at 10:18 PM, Sree V sree_at_ch...@yahoo.com wrote:

Spark's 'master' branch (i.e., v1.4.0) builds successfully on Windows 8.1 (Intel i7, 64-bit) with Oracle JDK 8u45, with MAVEN_OPTS set but without the -XX:ReservedCodeCacheSize=1g flag. The build takes about 33 minutes.

Thanking you. With Regards,
Sree

On Thursday, April 16, 2015 9:07 PM, Arun Lists lists.a...@gmail.com wrote:

Here is what I got from the engineer who worked on building Spark and using it on Windows:

1) Hadoop's winutils.exe is needed on Windows, even for local files, and you have to set hadoop.home.dir in spark-class2.cmd (on the two lines with $RUNNER near the end, by adding "-Dhadoop.home.dir=<dir>") after downloading the Hadoop binaries plus winutils.

2) Java/Spark cannot delete the Spark temporary files and throws an exception (the program still works, though). Manual clean-up works just fine, and it is not a permissions issue, since the process has rights to create the files (I have also tried using my own directory rather than the default, with the same error).

3) We tried building Spark again, and have attached the log. I don't get any errors, just warnings. However, when I try to use that JAR, I just get the error message "Error: Could not find or load main class org.apache.spark.deploy.SparkSubmit".

On Thu, Apr 16, 2015 at 12:19 PM, Arun Lists lists.a...@gmail.com wrote:

Thanks, Matei! We'll try that and let you know if it works. You are correct in inferring that some of the problems we had were with dependencies. We also had problems with the spark-submit scripts. I will get the details from the engineer who worked on the Windows builds and provide them to you.

arun

On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

You could build Spark with Scala 2.11 on Mac / Linux and transfer it over to Windows. AFAIK it should build on Windows too; the only problem is that Maven might take a long time to download dependencies. What errors are you seeing?

Matei

On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:

We run Spark on Mac and Linux but also need to run it on Windows 8.1 and Windows Server. We ran into problems with the Scala 2.10 binary bundle for Spark 1.3.0 but managed to get it working. On Mac/Linux, we are on Scala 2.11.6 (we built Spark from the sources). On Windows, however, despite our best efforts, we cannot get Spark 1.3.0 as built from sources working with Scala 2.11.6. Spark has too many moving parts and dependencies!

When can we expect to see a binary bundle for Spark 1.3.0 that is built for Scala 2.11.6? I read somewhere that the only reason Spark 1.3.0 is still built for Scala 2.10 is that Kafka is still on Scala 2.10. For those of us who don't use Kafka, can we have a Scala 2.11 bundle? If there isn't an official bundle arriving any time soon, can someone who has built it for Windows 8.1 successfully please share it with the group?

Thanks,
arun
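For point 1 above, a minimal sketch of an alternative that avoids editing spark-class2.cmd: set hadoop.home.dir programmatically before the SparkContext is created. This assumes winutils.exe has been placed in a bin\ subdirectory of that path; C:\hadoop and the app name are hypothetical.

===
import org.apache.spark.{SparkConf, SparkContext}

object WindowsApp {
  def main(args: Array[String]): Unit = {
    // winutils.exe must live in <hadoop.home.dir>\bin, i.e. C:\hadoop\bin here;
    // set the property before any Hadoop/Spark class reads it.
    System.setProperty("hadoop.home.dir", "C:\\hadoop")

    // master is supplied externally by spark-submit
    val sc = new SparkContext(new SparkConf().setAppName("WindowsApp"))
    try {
      println(sc.textFile("README.md").count()) // any local-file action as a smoke test
    } finally {
      sc.stop()
    }
  }
}
===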
Spark on Windows
We run Spark on Mac and Linux but also need to run it on Windows 8.1 and Windows Server. We ran into problems with the Scala 2.10 binary bundle for Spark 1.3.0 but managed to get it working. On Mac/Linux, we are on Scala 2.11.6 (we built Spark from the sources). On Windows, however, despite our best efforts, we cannot get Spark 1.3.0 as built from sources working with Scala 2.11.6. Spark has too many moving parts and dependencies!

When can we expect to see a binary bundle for Spark 1.3.0 that is built for Scala 2.11.6? I read somewhere that the only reason Spark 1.3.0 is still built for Scala 2.10 is that Kafka is still on Scala 2.10. For those of us who don't use Kafka, can we have a Scala 2.11 bundle? If there isn't an official bundle arriving any time soon, can someone who has built it for Windows 8.1 successfully please share it with the group?

Thanks,
arun
Re: Spark on Windows
Thanks, Matei! We'll try that and let you know if it works. You are correct in inferring that some of the problems we had were with dependencies. We also had problems with the spark-submit scripts. I will get the details from the engineer who worked on the Windows builds and provide them to you.

arun

On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

You could build Spark with Scala 2.11 on Mac / Linux and transfer it over to Windows. AFAIK it should build on Windows too; the only problem is that Maven might take a long time to download dependencies. What errors are you seeing?

Matei

On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:

We run Spark on Mac and Linux but also need to run it on Windows 8.1 and Windows Server. We ran into problems with the Scala 2.10 binary bundle for Spark 1.3.0 but managed to get it working. On Mac/Linux, we are on Scala 2.11.6 (we built Spark from the sources). On Windows, however, despite our best efforts, we cannot get Spark 1.3.0 as built from sources working with Scala 2.11.6. Spark has too many moving parts and dependencies!

When can we expect to see a binary bundle for Spark 1.3.0 that is built for Scala 2.11.6? I read somewhere that the only reason Spark 1.3.0 is still built for Scala 2.10 is that Kafka is still on Scala 2.10. For those of us who don't use Kafka, can we have a Scala 2.11 bundle? If there isn't an official bundle arriving any time soon, can someone who has built it for Windows 8.1 successfully please share it with the group?

Thanks,
arun
Re: Registering classes with KryoSerializer
Wow, it all works now! Thanks, Imran! In case someone else finds this useful, here are the additional classes that I had to register (in addition to my application-specific classes):

val tuple3ArrayClass = classOf[Array[Tuple3[Any, Any, Any]]]
val anonClass = Class.forName("scala.reflect.ClassTag$$anon$1")
val javaClassClass = classOf[java.lang.Class[Any]]

arun

On Tue, Apr 14, 2015 at 6:23 PM, Imran Rashid iras...@cloudera.com wrote:

hmm, I dunno why IntelliJ is unhappy, but you can always fall back to getting a class from the String:

Class.forName("scala.reflect.ClassTag$$anon$1")

perhaps the class is package private or something, and the repl somehow subverts it ...

On Tue, Apr 14, 2015 at 5:44 PM, Arun Lists lists.a...@gmail.com wrote:

Hi Imran,

Thanks for the response! However, I am still not there yet. In the Scala interpreter, I can do:

scala> classOf[scala.reflect.ClassTag$$anon$1]

but when I try to do this in my program in IntelliJ, it indicates an error:

Cannot resolve symbol ClassTag$$anon$1

Hence I am not any closer to making this work. If you have any further suggestions, they would be most welcome.

arun

On Tue, Apr 14, 2015 at 2:33 PM, Imran Rashid iras...@cloudera.com wrote:

Hi Arun,

It can be hard to use Kryo with required registration because of issues like this -- there isn't a good way to register all the classes that you need transitively. In this case, it looks like one of your classes has a reference to a ClassTag, which in turn has a reference to some anonymous inner class. I'd suggest:

(a) figuring out whether you really want to be serializing this thing -- it's possible you're serializing an RDD which keeps a ClassTag, but normally you wouldn't want to serialize your RDDs

(b) you might want to bring this up w/ chill -- Spark offloads most of the Kryo setup for all the Scala internals to chill, so I'm surprised they don't handle this already. Looks like they still handle ClassManifests, which are from pre-Scala-2.10: https://github.com/twitter/chill/blob/master/chill-scala/src/main/scala/com/twitter/chill/ScalaKryoInstantiator.scala#L189

(c) you can always register these classes yourself, despite the crazy names, though you'll just need to knock these out one-by-one:

scala> classOf[scala.reflect.ClassTag$$anon$1]
res0: Class[scala.reflect.ClassTag[T]{def unapply(x$1: scala.runtime.BoxedUnit): Option[_]; def arrayClass(x$1: Class[_]): Class[_]}] = class scala.reflect.ClassTag$$anon$1

On Mon, Apr 13, 2015 at 6:09 PM, Arun Lists lists.a...@gmail.com wrote:

Hi,

I am trying to register classes with KryoSerializer. This has worked with other programs. Usually the error messages are helpful in indicating which classes need to be registered, but with my current program, I get the following cryptic error message:

Caused by: java.lang.IllegalArgumentException: Class is not registered: scala.reflect.ClassTag$$anon$1
Note: To register this class use: kryo.register(scala.reflect.ClassTag$$anon$1.class);

How do I find out which class needs to be registered? I looked at my program and registered all classes used in RDDs, but clearly more classes remain to be registered, if I can figure out which ones.

Thanks for your help!
arun
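Putting the pieces of this thread together, a minimal sketch of the resulting configuration; MyKey and MyValue are hypothetical stand-ins for the application-specific classes:

===
import org.apache.spark.SparkConf

// Hypothetical stand-ins for the application-specific classes.
case class MyKey(id: Int)
case class MyValue(name: String)

val sparkConf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrationRequired", "true")

sparkConf.registerKryoClasses(Array[Class[_]](
  classOf[MyKey],
  classOf[MyValue],
  classOf[Array[Tuple3[Any, Any, Any]]],           // arrays of Tuple3 used in RDDs
  Class.forName("scala.reflect.ClassTag$$anon$1"), // anonymous ClassTag inner class
  classOf[java.lang.Class[Any]]                    // java.lang.Class itself
))
===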
Re: Registering classes with KryoSerializer
Hi Imran,

Thanks for the response! However, I am still not there yet. In the Scala interpreter, I can do:

scala> classOf[scala.reflect.ClassTag$$anon$1]

but when I try to do this in my program in IntelliJ, it indicates an error:

Cannot resolve symbol ClassTag$$anon$1

Hence I am not any closer to making this work. If you have any further suggestions, they would be most welcome.

arun

On Tue, Apr 14, 2015 at 2:33 PM, Imran Rashid iras...@cloudera.com wrote:

Hi Arun,

It can be hard to use Kryo with required registration because of issues like this -- there isn't a good way to register all the classes that you need transitively. In this case, it looks like one of your classes has a reference to a ClassTag, which in turn has a reference to some anonymous inner class. I'd suggest:

(a) figuring out whether you really want to be serializing this thing -- it's possible you're serializing an RDD which keeps a ClassTag, but normally you wouldn't want to serialize your RDDs

(b) you might want to bring this up w/ chill -- Spark offloads most of the Kryo setup for all the Scala internals to chill, so I'm surprised they don't handle this already. Looks like they still handle ClassManifests, which are from pre-Scala-2.10: https://github.com/twitter/chill/blob/master/chill-scala/src/main/scala/com/twitter/chill/ScalaKryoInstantiator.scala#L189

(c) you can always register these classes yourself, despite the crazy names, though you'll just need to knock these out one-by-one:

scala> classOf[scala.reflect.ClassTag$$anon$1]
res0: Class[scala.reflect.ClassTag[T]{def unapply(x$1: scala.runtime.BoxedUnit): Option[_]; def arrayClass(x$1: Class[_]): Class[_]}] = class scala.reflect.ClassTag$$anon$1

On Mon, Apr 13, 2015 at 6:09 PM, Arun Lists lists.a...@gmail.com wrote:

Hi,

I am trying to register classes with KryoSerializer. This has worked with other programs. Usually the error messages are helpful in indicating which classes need to be registered, but with my current program, I get the following cryptic error message:

Caused by: java.lang.IllegalArgumentException: Class is not registered: scala.reflect.ClassTag$$anon$1
Note: To register this class use: kryo.register(scala.reflect.ClassTag$$anon$1.class);

How do I find out which class needs to be registered? I looked at my program and registered all classes used in RDDs, but clearly more classes remain to be registered, if I can figure out which ones.

Thanks for your help!
arun
Registering classes with KryoSerializer
Hi,

I am trying to register classes with KryoSerializer. This has worked with other programs. Usually the error messages are helpful in indicating which classes need to be registered, but with my current program, I get the following cryptic error message:

Caused by: java.lang.IllegalArgumentException: Class is not registered: scala.reflect.ClassTag$$anon$1
Note: To register this class use: kryo.register(scala.reflect.ClassTag$$anon$1.class);

How do I find out which class needs to be registered? I looked at my program and registered all classes used in RDDs, but clearly more classes remain to be registered, if I can figure out which ones.

Thanks for your help!
arun
Reading file with Unicode characters
Hi,

Does SparkContext's textFile() method handle files with Unicode characters? How about files in UTF-8 format? Going further, is it possible to specify encodings to the method? If not, what should one do if the files to be read are in some other encoding?

Thanks,
arun
Re: Reading file with Unicode characters
Thanks!

arun

On Wed, Apr 8, 2015 at 10:51 AM, java8964 java8...@hotmail.com wrote:

Spark uses the Hadoop TextInputFormat to read the file. Since Hadoop effectively supports only Linux, UTF-8 is the only encoding supported, as it is the default encoding on Linux. If you have data in other encodings, you may want to vote for this JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-232

Yong

--
Date: Wed, 8 Apr 2015 10:35:18 -0700
Subject: Reading file with Unicode characters
From: lists.a...@gmail.com
To: user@spark.apache.org
CC: lists.a...@gmail.com

Hi,

Does SparkContext's textFile() method handle files with Unicode characters? How about files in UTF-8 format? Going further, is it possible to specify encodings to the method? If not, what should one do if the files to be read are in some other encoding?

Thanks,
arun
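For files in some other known encoding, a sketch of a common workaround (the ISO-8859-1 charset and the path are assumptions for illustration): read the raw Hadoop Text records that textFile() would otherwise decode as UTF-8, and decode the bytes with the charset the files actually use.

===
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("ReadLatin1"))

// textFile() decodes as UTF-8; instead take the undecoded Text bytes from
// TextInputFormat and apply the charset the files are written in. Decode
// inside the map immediately, since Hadoop reuses the Text object.
val lines = sc.hadoopFile[LongWritable, Text, TextInputFormat]("data/latin1") // hypothetical path
  .map { case (_, text) => new String(text.getBytes, 0, text.getLength, "ISO-8859-1") }
===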
Specifying Spark property from command line?
Hi, Is it possible to specify a Spark property like spark.local.dir from the command line when running an application using spark-submit? Thanks, arun
Error when running Spark on Windows 8.1
Hi,

We are trying to run a Spark application using spark-submit on Windows 8.1. The application runs successfully to completion on Mac OS X 10.10 and on Ubuntu Linux. On Windows, we get the error messages below. It appears that Spark is trying to delete some temporary directory that it creates. How do we solve this problem?

Thanks,
arun

15/04/07 10:55:14 ERROR Utils: Exception while deleting Spark temp dir: C:\Users\JOSHMC~1\AppData\Local\Temp\spark-339bf2d9-8b89-46e9-b5c1-404caf9d3cd7\userFiles-62976ef7-ab56-41c0-a35b-793c7dca31c7
java.io.IOException: Failed to delete: C:\Users\JOSHMC~1\AppData\Local\Temp\spark-339bf2d9-8b89-46e9-b5c1-404caf9d3cd7\userFiles-62976ef7-ab56-41c0-a35b-793c7dca31c7
	at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:932)
	at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1$$anonfun$apply$mcV$sp$2.apply(Utils.scala:181)
	at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1$$anonfun$apply$mcV$sp$2.apply(Utils.scala:179)
	at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
	at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply$mcV$sp(Utils.scala:179)
	at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
	at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
	at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:177)
Re: Specifying Spark property from command line?
I just figured this out from the documentation:

--conf spark.local.dir=C:\Temp

On Tue, Apr 7, 2015 at 5:00 PM, Arun Lists lists.a...@gmail.com wrote:

Hi,

Is it possible to specify a Spark property like spark.local.dir from the command line when running an application using spark-submit?

Thanks,
arun
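For reference, the same property can also be set programmatically; per the Spark configuration docs, values set explicitly on a SparkConf take precedence over --conf flags passed to spark-submit, which in turn take precedence over spark-defaults.conf. A minimal sketch (the app name is a placeholder):

===
import org.apache.spark.{SparkConf, SparkContext}

// Programmatic equivalent of --conf spark.local.dir=C:\Temp;
// values set here override anything passed on the command line.
val conf = new SparkConf()
  .setAppName("LocalDirExample") // hypothetical application name
  .set("spark.local.dir", "C:\\Temp")
val sc = new SparkContext(conf)
===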
Registering classes with KryoSerializer
I am trying to register classes with KryoSerializer. I get the following error message:

com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Class is not registered: com.comp.common.base.OpenHashMap$mcI$sp
Note: To register this class use: kryo.register(com.dtex.common.base.OpenHashMap$mcI$sp.class);

How do I find out what class is being referred to by OpenHashMap$mcI$sp?

I have registered other classes with it by using:

sparkConf.registerKryoClasses(Array(
  classOf[MyClass]
))

Thanks,
arun
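For what it's worth, $mcI$sp is the suffix scalac appends to the Int-specialized variant it generates for a @specialized class, so the message refers to OpenHashMap's specialization for Int. That variant has no source-level name, so a sketch of registering it by its runtime name (the package is copied from the error message above):

===
// $mcI$sp = the compiler-generated Int specialization of a @specialized class;
// it cannot be written as classOf[...], so look it up via Class.forName.
sparkConf.registerKryoClasses(Array[Class[_]](
  Class.forName("com.comp.common.base.OpenHashMap$mcI$sp")
))
===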
ClassNotFoundException when registering classes with Kryo
Here is the relevant snippet of code in my main program:

===
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
sparkConf.set("spark.kryo.registrationRequired", "true")

val summaryDataClass = classOf[SummaryData]
val summaryViewClass = classOf[SummaryView]
sparkConf.registerKryoClasses(Array(
  summaryDataClass,
  summaryViewClass))
===

I get the following error:

Exception in thread "main" java.lang.reflect.InvocationTargetException
...
Caused by: org.apache.spark.SparkException: Failed to load class to register with Kryo
...
Caused by: java.lang.ClassNotFoundException: com.dtex.analysis.transform.SummaryData

Note that the class in question, SummaryData, is in the same package as the main program and hence in the same jar. What do I need to do to make this work?

Thanks,
arun
Re: ClassNotFoundException when registering classes with Kryo
Thanks for the notification! For now, I'll use the Kryo serializer without registering classes until the bug fix has been merged into the next version of Spark (I guess that will be 1.3, right?).

arun

On Sun, Feb 1, 2015 at 10:58 PM, Shixiong Zhu zsxw...@gmail.com wrote:

It's a bug that has been fixed in https://github.com/apache/spark/pull/4258 but has not yet been merged.

Best Regards,
Shixiong Zhu

2015-02-02 10:08 GMT+08:00 Arun Lists lists.a...@gmail.com:

Here is the relevant snippet of code in my main program:

===
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
sparkConf.set("spark.kryo.registrationRequired", "true")

val summaryDataClass = classOf[SummaryData]
val summaryViewClass = classOf[SummaryView]
sparkConf.registerKryoClasses(Array(
  summaryDataClass,
  summaryViewClass))
===

I get the following error:

Exception in thread "main" java.lang.reflect.InvocationTargetException
...
Caused by: org.apache.spark.SparkException: Failed to load class to register with Kryo
...
Caused by: java.lang.ClassNotFoundException: com.dtex.analysis.transform.SummaryData

Note that the class in question, SummaryData, is in the same package as the main program and hence in the same jar. What do I need to do to make this work?

Thanks,
arun
Re: Reading resource files in a Spark application
The problem is that it gives an error message saying something to the effect that the URI is not hierarchical. This is consistent with your explanation.

Thanks,
arun

On Wed, Jan 14, 2015 at 1:14 AM, Sean Owen so...@cloudera.com wrote:

My hunch is that it is because the URI of a resource in a JAR file will necessarily be specific to where the JAR is on the local filesystem, and that is not portable or the right way to read a resource. But you didn't specify the problem here.

On Jan 14, 2015 5:15 AM, Arun Lists lists.a...@gmail.com wrote:

I experimented with using getResourceAsStream(cls, fileName) instead of cls.getResource(fileName).toURI. That works! I have no idea why the latter method does not work in Spark. Any explanations would be welcome.

Thanks,
arun

On Tue, Jan 13, 2015 at 6:35 PM, Arun Lists lists.a...@gmail.com wrote:

In some classes, I initialize some values from resource files using the following snippet:

new File(cls.getResource(fileName).toURI)

This works fine in SBT. When I run it using spark-submit, I get a bunch of errors because the classes cannot be initialized. What can I do to make such initialization Spark-friendly?

Thanks,
arun
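A sketch of why the stream-based approach works where the File-based one fails: inside an assembled JAR, a resource's URI has the non-hierarchical jar: scheme, which java.io.File cannot represent, whereas an InputStream works whether the class is loaded from a directory (as under SBT) or from a JAR (as under spark-submit). "defaults.conf" is a hypothetical resource name:

===
import scala.io.Source

// Under SBT, getResource yields a file: URI; inside an assembly jar it yields
// a jar:file:...!/... URI, which new File(uri) rejects as non-hierarchical.
// Reading through a stream avoids the File conversion entirely.
val stream = getClass.getResourceAsStream("defaults.conf") // hypothetical resource
try {
  val lines = Source.fromInputStream(stream, "UTF-8").getLines().toList
  // ... initialize values from lines ...
} finally {
  stream.close()
}
===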
Re: Reading resource files in a Spark application
I experimented with using getResourceAsStream(cls, fileName) instead of cls.getResource(fileName).toURI. That works! I have no idea why the latter method does not work in Spark. Any explanations would be welcome.

Thanks,
arun

On Tue, Jan 13, 2015 at 6:35 PM, Arun Lists lists.a...@gmail.com wrote:

In some classes, I initialize some values from resource files using the following snippet:

new File(cls.getResource(fileName).toURI)

This works fine in SBT. When I run it using spark-submit, I get a bunch of errors because the classes cannot be initialized. What can I do to make such initialization Spark-friendly?

Thanks,
arun
Re: Running Spark application from command line
Yes, I am running with Scala 2.11. Here is what I see when I do scala -version:

$ scala -version
Scala code runner version 2.11.4 -- Copyright 2002-2013, LAMP/EPFL

On Tue, Jan 13, 2015 at 2:30 AM, Sean Owen so...@cloudera.com wrote:

It sounds like possibly a Scala version mismatch? Are you sure you're running with Scala 2.11 too?

On Tue, Jan 13, 2015 at 6:58 AM, Arun Lists lists.a...@gmail.com wrote:

I have a Spark application that was assembled using sbt 0.13.7, Scala 2.11, and Spark 1.2.0. In build.sbt, I use "provided" for the Spark dependencies. I am running on Mac OS X Yosemite. I can run the application fine within sbt. I run into problems when I try to run it from the command line. Here is the command I use:

ADD_JARS=analysis/target/scala-2.11/dtex-analysis_2.11-0.1.jar scala -cp /Applications/spark-1.2.0-bin-hadoop2.4/lib/spark-assembly-1.2.0-hadoop2.4.0.jar:analysis/target/scala-2.11/dtex-analysis_2.11-0.1.jar com.dtex.analysis.transform.GenUserSummaryView ...

I get the error messages below. Please advise what I can do to resolve this issue. Thanks!

arun

15/01/12 22:47:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/12 22:47:18 WARN BlockManager: Putting block broadcast_0 failed
java.lang.NoSuchMethodError: scala.collection.immutable.$colon$colon.hd$1()Ljava/lang/Object;
	at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:84)
	at org.apache.spark.util.collection.SizeTracker$class.resetSamples(SizeTracker.scala:61)
	at org.apache.spark.util.collection.SizeTrackingVector.resetSamples(SizeTrackingVector.scala:25)
	at org.apache.spark.util.collection.SizeTracker$class.$init$(SizeTracker.scala:51)
	at org.apache.spark.util.collection.SizeTrackingVector.<init>(SizeTrackingVector.scala:25)
	at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:236)
	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:136)
	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:114)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:787)
	at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
	at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:992)
	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:98)
	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:84)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:945)
	at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:695)
	at org.apache.spark.SparkContext.textFile(SparkContext.scala:540)
	at com.dtex.analysis.transform.TransformUtils$$anonfun$2.apply(TransformUtils.scala:97)
	at com.dtex.analysis.transform.TransformUtils$$anonfun$2.apply(TransformUtils.scala:97)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at com.dtex.analysis.transform.TransformUtils$.generateUserSummaryData(TransformUtils.scala:97)
	at com.dtex.analysis.transform.GenUserSummaryView$.main(GenUserSummaryView.scala:77)
	at com.dtex.analysis.transform.GenUserSummaryView.main(GenUserSummaryView.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at scala.reflect.internal.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:70)
	at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
	at scala.reflect.internal.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:101)
	at scala.reflect.internal.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:70)
	at scala.reflect.internal.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:101)
	at scala.tools.nsc.CommonRunner$class.run(ObjectRunner.scala:22)
	at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:39)
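A quick way to check Sean's hypothesis from inside the exact classpath being used: print the version of the Scala library actually loaded at runtime. If this reports 2.10.x while the application jar is dtex-analysis_2.11, the NoSuchMethodError on scala.collection.immutable.$colon$colon would be the classic 2.10/2.11 binary incompatibility (the prebuilt spark-assembly jars for Spark 1.2.0 were compiled against Scala 2.10). A hypothetical diagnostic sketch:

===
// Run with the same -cp as the failing command to see which Scala library
// version sits first on the classpath.
object VersionCheck {
  def main(args: Array[String]): Unit = {
    println(scala.util.Properties.versionString) // e.g. "version 2.11.4" or "version 2.10.4"
  }
}
===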
Reading resource files in a Spark application
In some classes, I initialize some values from resource files using the following snippet:

new File(cls.getResource(fileName).toURI)

This works fine in SBT. When I run it using spark-submit, I get a bunch of errors because the classes cannot be initialized. What can I do to make such initialization Spark-friendly?

Thanks,
arun
Running Spark application from command line
I have a Spark application that was assembled using sbt 0.13.7, Scala 2.11, and Spark 1.2.0. In build.sbt, I use "provided" for the Spark dependencies. I am running on Mac OS X Yosemite. I can run the application fine within sbt. I run into problems when I try to run it from the command line. Here is the command I use:

ADD_JARS=analysis/target/scala-2.11/dtex-analysis_2.11-0.1.jar scala -cp /Applications/spark-1.2.0-bin-hadoop2.4/lib/spark-assembly-1.2.0-hadoop2.4.0.jar:analysis/target/scala-2.11/dtex-analysis_2.11-0.1.jar com.dtex.analysis.transform.GenUserSummaryView ...

I get the error messages below. Please advise what I can do to resolve this issue. Thanks!

arun

15/01/12 22:47:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/12 22:47:18 WARN BlockManager: Putting block broadcast_0 failed
java.lang.NoSuchMethodError: scala.collection.immutable.$colon$colon.hd$1()Ljava/lang/Object;
	at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:84)
	at org.apache.spark.util.collection.SizeTracker$class.resetSamples(SizeTracker.scala:61)
	at org.apache.spark.util.collection.SizeTrackingVector.resetSamples(SizeTrackingVector.scala:25)
	at org.apache.spark.util.collection.SizeTracker$class.$init$(SizeTracker.scala:51)
	at org.apache.spark.util.collection.SizeTrackingVector.<init>(SizeTrackingVector.scala:25)
	at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:236)
	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:136)
	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:114)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:787)
	at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
	at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:992)
	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:98)
	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:84)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:945)
	at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:695)
	at org.apache.spark.SparkContext.textFile(SparkContext.scala:540)
	at com.dtex.analysis.transform.TransformUtils$$anonfun$2.apply(TransformUtils.scala:97)
	at com.dtex.analysis.transform.TransformUtils$$anonfun$2.apply(TransformUtils.scala:97)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at com.dtex.analysis.transform.TransformUtils$.generateUserSummaryData(TransformUtils.scala:97)
	at com.dtex.analysis.transform.GenUserSummaryView$.main(GenUserSummaryView.scala:77)
	at com.dtex.analysis.transform.GenUserSummaryView.main(GenUserSummaryView.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at scala.reflect.internal.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:70)
	at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
	at scala.reflect.internal.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:101)
	at scala.reflect.internal.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:70)
	at scala.reflect.internal.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:101)
	at scala.tools.nsc.CommonRunner$class.run(ObjectRunner.scala:22)
	at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:39)
	at scala.tools.nsc.CommonRunner$class.runAndCatch(ObjectRunner.scala:29)
	at scala.tools.nsc.ObjectRunner$.runAndCatch(ObjectRunner.scala:39)
	at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:65)
	at scala.tools.nsc.MainGenericRunner.run$1(MainGenericRunner.scala:87)
	at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:98)
	at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:103)
	at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)