I'll kick this vote off with a +1.
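One note on the issue Alex reported below: an "IncompatibleClassChangeError: Implementing class" thrown from SparkHadoopMapRedUtil is the classic symptom of Hadoop 1.x and Hadoop 2.x classes mixed on one classpath, which lines up with Patrick's guess that an incompatible Hadoop version is being pulled in. If I remember right, the published spark-core_2.10 artifact drags in a Hadoop 1.x hadoop-client transitively, so a project targeting YARN/Hadoop 2.2 probably needs to declare the client version itself. A rough sketch of what that could look like in the sample pom; the 2.2.0 version is my assumption based on Alex mentioning YARN 2.2, not something verified in this thread:

  <!-- Sketch only: pin hadoop-client to the Hadoop line the app targets
       (2.2.0 assumed here), so a single Hadoop version wins on the classpath. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.2.0</version>
  </dependency>

Running "mvn dependency:tree" on the sample project would show which Hadoop artifacts actually end up on the classpath; if pinning the version makes the example run, the failure is a classpath mismatch in the sample project rather than a problem with this rc.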
On Thu, Jan 16, 2014 at 10:43 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> I also ran your example locally and it worked with 0.8.1 and
> 0.9.0-rc1. So it's possible somehow you are pulling in an older
> version of Spark or an incompatible version of Hadoop.
>
> - Patrick
>
> On Thu, Jan 16, 2014 at 9:39 AM, Patrick Wendell <pwend...@gmail.com> wrote:
>> Hey Alex,
>>
>> Thanks for testing out this rc. Would you mind forking this into a different
>> thread so we can discuss there?
>>
>> Also, does your application build and run correctly with Spark 0.8.1? That
>> would determine whether the problem is specifically with this rc...
>>
>> Patrick
>>
>> ---
>> sent from my phone
>>
>> On Jan 15, 2014 11:44 PM, "Alex Cozzi" <alexco...@gmail.com> wrote:
>>>
>>> Oh, I forgot: I am using the “yarn” Maven profile to target YARN 2.2.
>>>
>>> Alex Cozzi
>>> alexco...@gmail.com
>>>
>>> On Jan 15, 2014, at 11:41 PM, Alex Cozzi <alexco...@gmail.com> wrote:
>>>
>>> > Just testing out rc1. I created a dependent project (using Maven) and
>>> > copied the HdfsTest.scala test, but added a single line to save the file
>>> > back to disk:
>>> >
>>> > package org.apache.spark.examples
>>> >
>>> > import org.apache.spark._
>>> >
>>> > object HdfsTest {
>>> >   def main(args: Array[String]) {
>>> >     val sc = new SparkContext(args(0), "HdfsTest",
>>> >       System.getenv("SPARK_HOME"), SparkContext.jarOfClass(this.getClass))
>>> >     val file = sc.textFile(args(1))
>>> >     val mapped = file.map(s => s.length).cache()
>>> >     for (iter <- 1 to 10) {
>>> >       val start = System.currentTimeMillis()
>>> >       for (x <- mapped) { x + 2 }
>>> >       // println("Processing: " + x)
>>> >       val end = System.currentTimeMillis()
>>> >       println("Iteration " + iter + " took " + (end-start) + " ms")
>>> >       mapped.saveAsTextFile("out")
>>> >     }
>>> >     System.exit(0)
>>> >   }
>>> > }
>>> >
>>> > and this is my pom file:
>>> >
>>> > <project xmlns="http://maven.apache.org/POM/4.0.0"
>>> >          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> >          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>>> >                              http://maven.apache.org/maven-v4_0_0.xsd">
>>> >   <modelVersion>4.0.0</modelVersion>
>>> >   <groupId>my.examples</groupId>
>>> >   <artifactId>spark-samples</artifactId>
>>> >   <version>0.0.1-SNAPSHOT</version>
>>> >   <inceptionYear>2014</inceptionYear>
>>> >
>>> >   <properties>
>>> >     <maven.compiler.source>1.6</maven.compiler.source>
>>> >     <maven.compiler.target>1.6</maven.compiler.target>
>>> >     <encoding>UTF-8</encoding>
>>> >     <scala.tools.version>2.10</scala.tools.version>
>>> >     <scala.version>2.10.0</scala.version>
>>> >   </properties>
>>> >
>>> >   <repositories>
>>> >     <repository>
>>> >       <id>spark staging</id>
>>> >       <url>https://repository.apache.org/content/repositories/orgapachespark-1001</url>
>>> >     </repository>
>>> >   </repositories>
>>> >
>>> >   <dependencies>
>>> >     <dependency>
>>> >       <groupId>org.scala-lang</groupId>
>>> >       <artifactId>scala-library</artifactId>
>>> >       <version>${scala.version}</version>
>>> >     </dependency>
>>> >
>>> >     <dependency>
>>> >       <groupId>org.apache.spark</groupId>
>>> >       <artifactId>spark-core_${scala.tools.version}</artifactId>
>>> >       <version>0.9.0-incubating</version>
>>> >     </dependency>
>>> >
>>> >     <!-- Test -->
>>> >     <dependency>
>>> >       <groupId>junit</groupId>
>>> >       <artifactId>junit</artifactId>
>>> >       <version>4.11</version>
>>> >       <scope>test</scope>
>>> >     </dependency>
>>> >     <dependency>
>>> >       <groupId>org.specs2</groupId>
>>> >       <artifactId>specs2_${scala.tools.version}</artifactId>
>>> >       <version>1.13</version>
>>> >       <scope>test</scope>
>>> >     </dependency>
>>> >     <dependency>
>>> >       <groupId>org.scalatest</groupId>
>>> >       <artifactId>scalatest_${scala.tools.version}</artifactId>
>>> >       <version>2.0.M6-SNAP8</version>
>>> >       <scope>test</scope>
>>> >     </dependency>
>>> >   </dependencies>
>>> >
>>> >   <build>
>>> >     <sourceDirectory>src/main/scala</sourceDirectory>
>>> >     <testSourceDirectory>src/test/scala</testSourceDirectory>
>>> >     <plugins>
>>> >       <plugin>
>>> >         <!-- see http://davidb.github.com/scala-maven-plugin -->
>>> >         <groupId>net.alchim31.maven</groupId>
>>> >         <artifactId>scala-maven-plugin</artifactId>
>>> >         <version>3.1.6</version>
>>> >         <configuration>
>>> >           <scalaCompatVersion>2.10</scalaCompatVersion>
>>> >           <jvmArgs>
>>> >             <jvmArg>-Xms128m</jvmArg>
>>> >             <jvmArg>-Xmx2048m</jvmArg>
>>> >           </jvmArgs>
>>> >         </configuration>
>>> >         <executions>
>>> >           <execution>
>>> >             <goals>
>>> >               <goal>compile</goal>
>>> >               <goal>testCompile</goal>
>>> >             </goals>
>>> >             <configuration>
>>> >               <args>
>>> >                 <arg>-make:transitive</arg>
>>> >                 <arg>-dependencyfile</arg>
>>> >                 <arg>${project.build.directory}/.scala_dependencies</arg>
>>> >               </args>
>>> >             </configuration>
>>> >           </execution>
>>> >         </executions>
>>> >       </plugin>
>>> >       <plugin>
>>> >         <groupId>org.apache.maven.plugins</groupId>
>>> >         <artifactId>maven-surefire-plugin</artifactId>
>>> >         <version>2.13</version>
>>> >         <configuration>
>>> >           <useFile>false</useFile>
>>> >           <disableXmlReport>true</disableXmlReport>
>>> >           <!-- If you have classpath issue like NoDefClassError,... -->
>>> >           <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
>>> >           <includes>
>>> >             <include>**/*Test.*</include>
>>> >             <include>**/*Suite.*</include>
>>> >           </includes>
>>> >         </configuration>
>>> >       </plugin>
>>> >       <plugin>
>>> >         <groupId>org.codehaus.mojo</groupId>
>>> >         <artifactId>exec-maven-plugin</artifactId>
>>> >         <version>1.2.1</version>
>>> >         <executions>
>>> >           <execution>
>>> >             <goals>
>>> >               <goal>exec</goal>
>>> >             </goals>
>>> >           </execution>
>>> >         </executions>
>>> >         <configuration>
>>> >           <mainClass>org.apache.spark.examples.HdfsTest</mainClass>
>>> >           <arguments>
>>> >             <argument>local</argument>
>>> >             <argument>pom.xml</argument>
>>> >           </arguments>
>>> >         </configuration>
>>> >       </plugin>
>>> >     </plugins>
>>> >   </build>
>>> > </project>
>>> >
>>> > Now, when I run it, either in Eclipse or with "mvn exec:java", I get the
>>> > following error:
>>> >
>>> > [INFO]
>>> > [INFO] --- exec-maven-plugin:1.2.1:java (default-cli) @ spark-samples ---
>>> > SLF4J: Class path contains multiple SLF4J bindings.
>>> > SLF4J: Found binding in [jar:file:/Users/acozzi/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>> > SLF4J: Found binding in [jar:file:/Users/acozzi/.m2/repository/org/slf4j/slf4j-simple/1.6.1/slf4j-simple-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>> > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>> > 14/01/15 23:37:57 INFO slf4j.Slf4jLogger: Slf4jLogger started
>>> > 14/01/15 23:37:57 INFO Remoting: Starting remoting
>>> > 14/01/15 23:37:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@10.0.1.10:53682]
>>> > 14/01/15 23:37:57 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@10.0.1.10:53682]
>>> > 14/01/15 23:37:57 INFO spark.SparkEnv: Registering BlockManagerMaster
>>> > 14/01/15 23:37:57 INFO storage.DiskBlockManager: Created local directory at /var/folders/mm/4qxz27w91p96v2zp5f9ncmqm38ychm/T/spark-local-20140115233757-7a41
>>> > 14/01/15 23:37:57 INFO storage.MemoryStore: MemoryStore started with capacity 1218.8 MB.
>>> > 14/01/15 23:37:57 INFO network.ConnectionManager: Bound socket to port 53683 with id = ConnectionManagerId(10.0.1.10,53683)
>>> > 14/01/15 23:37:57 INFO storage.BlockManagerMaster: Trying to register BlockManager
>>> > 14/01/15 23:37:57 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager 10.0.1.10:53683 with 1218.8 MB RAM
>>> > 14/01/15 23:37:57 INFO storage.BlockManagerMaster: Registered BlockManager
>>> > 14/01/15 23:37:57 INFO spark.HttpServer: Starting HTTP Server
>>> > 14/01/15 23:37:57 INFO server.Server: jetty-7.6.8.v20121106
>>> > 14/01/15 23:37:57 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:53684
>>> > 14/01/15 23:37:57 INFO broadcast.HttpBroadcast: Broadcast server started at http://10.0.1.10:53684
>>> > 14/01/15 23:37:57 INFO spark.SparkEnv: Registering MapOutputTracker
>>> > 14/01/15 23:37:57 INFO spark.HttpFileServer: HTTP File server directory is /var/folders/mm/4qxz27w91p96v2zp5f9ncmqm38ychm/T/spark-e9304513-3714-430f-aa14-1a430a915d98
>>> > 14/01/15 23:37:57 INFO spark.HttpServer: Starting HTTP Server
>>> > 14/01/15 23:37:57 INFO server.Server: jetty-7.6.8.v20121106
>>> > 14/01/15 23:37:57 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:53685
>>> > 14/01/15 23:37:57 INFO server.Server: jetty-7.6.8.v20121106
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage/rdd,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/stage,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/pool,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/environment,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/executors,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/metrics/json,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/static,null}
>>> > 14/01/15 23:37:57 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/,null}
>>> > 14/01/15 23:37:57 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
>>> > 14/01/15 23:37:57 INFO ui.SparkUI: Started Spark Web UI at http://10.0.1.10:4040
>>> > 2014-01-15 23:37:57.929 java[34819:1020b] Unable to load realm mapping info from SCDynamicStore
>>> > 14/01/15 23:37:58 INFO storage.MemoryStore: ensureFreeSpace(35456) called with curMem=0, maxMem=1278030643
>>> > 14/01/15 23:37:58 INFO storage.MemoryStore: Block broadcast_0 stored as values to memory (estimated size 34.6 KB, free 1218.8 MB)
>>> > 14/01/15 23:37:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> > 14/01/15 23:37:58 WARN snappy.LoadSnappy: Snappy native library not loaded
>>> > 14/01/15 23:37:58 INFO mapred.FileInputFormat: Total input paths to process : 1
>>> > 14/01/15 23:37:58 INFO spark.SparkContext: Starting job: foreach at HdfsTest.scala:30
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Got job 0 (foreach at HdfsTest.scala:30) with 1 output partitions (allowLocal=false)
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Final stage: Stage 0 (foreach at HdfsTest.scala:30)
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Parents of final stage: List()
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Missing parents: List()
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[2] at map at HdfsTest.scala:27), which has no missing parents
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (MappedRDD[2] at map at HdfsTest.scala:27)
>>> > 14/01/15 23:37:58 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
>>> > 14/01/15 23:37:58 INFO scheduler.TaskSetManager: Starting task 0.0:0 as TID 0 on executor localhost: localhost (PROCESS_LOCAL)
>>> > 14/01/15 23:37:58 INFO scheduler.TaskSetManager: Serialized task 0.0:0 as 1778 bytes in 5 ms
>>> > 14/01/15 23:37:58 INFO executor.Executor: Running task ID 0
>>> > 14/01/15 23:37:58 INFO storage.BlockManager: Found block broadcast_0 locally
>>> > 14/01/15 23:37:58 INFO spark.CacheManager: Partition rdd_2_0 not found, computing it
>>> > 14/01/15 23:37:58 INFO rdd.HadoopRDD: Input split: file:/Users/acozzi/Documents/workspace/spark-samples/pom.xml:0+4092
>>> > 14/01/15 23:37:58 INFO storage.MemoryStore: ensureFreeSpace(2853) called with curMem=35456, maxMem=1278030643
>>> > 14/01/15 23:37:58 INFO storage.MemoryStore: Block rdd_2_0 stored as values to memory (estimated size 2.8 KB, free 1218.8 MB)
>>> > 14/01/15 23:37:58 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Added rdd_2_0 in memory on 10.0.1.10:53683 (size: 2.8 KB, free: 1218.8 MB)
>>> > 14/01/15 23:37:58 INFO storage.BlockManagerMaster: Updated info of block rdd_2_0
>>> > 14/01/15 23:37:58 INFO executor.Executor: Serialized size of result for 0 is 525
>>> > 14/01/15 23:37:58 INFO executor.Executor: Sending result for 0 directly to driver
>>> > 14/01/15 23:37:58 INFO executor.Executor: Finished task ID 0
>>> > 14/01/15 23:37:58 INFO scheduler.TaskSetManager: Finished TID 0 in 61 ms on localhost (progress: 0/1)
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Completed ResultTask(0, 0)
>>> > 14/01/15 23:37:58 INFO scheduler.TaskSchedulerImpl: Remove TaskSet 0.0 from pool
>>> > 14/01/15 23:37:58 INFO scheduler.DAGScheduler: Stage 0 (foreach at HdfsTest.scala:30) finished in 0.071 s
>>> > 14/01/15 23:37:58 INFO spark.SparkContext: Job finished: foreach at HdfsTest.scala:30, took 0.151199 s
>>> > Iteration 1 took 189 ms
>>> > [WARNING]
>>> > java.lang.reflect.InvocationTargetException
>>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >         at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
>>> >         at java.lang.Thread.run(Thread.java:695)
>>> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
>>> >         at java.lang.ClassLoader.defineClass1(Native Method)
>>> >         at java.lang.ClassLoader.defineClassCond(ClassLoader.java:637)
>>> >         at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
>>> >         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>>> >         at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>>> >         at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>>> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>>> >         at java.security.AccessController.doPrivileged(Native Method)
>>> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>>> >         at java.lang.Class.forName0(Native Method)
>>> >         at java.lang.Class.forName(Class.java:171)
>>> >         at org.apache.hadoop.mapred.SparkHadoopMapRedUtil$class.firstAvailableClass(SparkHadoopMapRedUtil.scala:48)
>>> >         at org.apache.hadoop.mapred.SparkHadoopMapRedUtil$class.newJobContext(SparkHadoopMapRedUtil.scala:23)
>>> >         at org.apache.hadoop.mapred.SparkHadoopWriter.newJobContext(SparkHadoopWriter.scala:40)
>>> >         at org.apache.hadoop.mapred.SparkHadoopWriter.getJobContext(SparkHadoopWriter.scala:149)
>>> >         at org.apache.hadoop.mapred.SparkHadoopWriter.preSetup(SparkHadoopWriter.scala:64)
>>> >         at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:713)
>>> >         at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:686)
>>> >         at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:572)
>>> >         at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:894)
>>> >         at org.apache.spark.examples.HdfsTest$$anonfun$main$1.apply$mcVI$sp(HdfsTest.scala:34)
>>> >         at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:142)
>>> >         at org.apache.spark.examples.HdfsTest$.main(HdfsTest.scala:28)
>>> >         at org.apache.spark.examples.HdfsTest.main(HdfsTest.scala)
>>> >         ... 6 more
>>> > [INFO] ------------------------------------------------------------------------
>>> > [INFO] BUILD FAILURE
>>> > [INFO] ------------------------------------------------------------------------
>>> > [INFO] Total time: 3.224s
>>> > [INFO] Finished at: Wed Jan 15 23:37:58 PST 2014
>>> > [INFO] Final Memory: 12M/81M
>>> > [INFO] ------------------------------------------------------------------------
>>> > [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project spark-samples: An exception occured while executing the Java class. null: InvocationTargetException: Implementing class -> [Help 1]
>>> >
>>> > Alex Cozzi
>>> > alexco...@gmail.com
>>> >
>>> > On Jan 15, 2014, at 5:48 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>> >
>>> >> Please vote on releasing the following candidate as Apache Spark
>>> >> (incubating) version 0.9.0.
>>> >>
>>> >> A draft of the release notes along with the changes file is attached
>>> >> to this e-mail.
>>> >>
>>> >> The tag to be voted on is v0.9.0-incubating (commit 7348893):
>>> >> https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=commit;h=7348893f0edd96dacce2f00970db1976266f7008
>>> >>
>>> >> The release files, including signatures, digests, etc. can be found at:
>>> >> http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc1/
>>> >>
>>> >> Release artifacts are signed with the following key:
>>> >> https://people.apache.org/keys/committer/pwendell.asc
>>> >>
>>> >> The staging repository for this release can be found at:
>>> >> https://repository.apache.org/content/repositories/orgapachespark-1001/
>>> >>
>>> >> The documentation corresponding to this release can be found at:
>>> >> http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc1-docs/
>>> >>
>>> >> Please vote on releasing this package as Apache Spark 0.9.0-incubating!
>>> >>
>>> >> The vote is open until Sunday, January 19, at 02:00 UTC
>>> >> and passes if a majority of at least 3 +1 PPMC votes are cast.
>>> >>
>>> >> [ ] +1 Release this package as Apache Spark 0.9.0-incubating
>>> >> [ ] -1 Do not release this package because ...
>>> >>
>>> >> To learn more about Apache Spark, please see
>>> >> http://spark.incubator.apache.org/