I'm trying to get my feet wet with Spark.  I've done some simple stuff in the 
shell in standalone mode, and now I'm trying to connect to HDFS resources, but 
I'm running into a problem.

I synced to the git master branch (c399baa - "SPARK-1456 Remove view bounds on 
Ordered in favor of a context bound on Ordering. (3 days ago) <Michael 
Armbrust>") and built like so:

    SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly

This created various jars in various places, including these (I think):

    ./examples/target/scala-2.10/spark-examples-assembly-1.0.0-SNAPSHOT.jar
    ./tools/target/scala-2.10/spark-tools-assembly-1.0.0-SNAPSHOT.jar
    ./assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.2.0.jar
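
To sanity-check that the assembly that actually ends up on the shell's 
classpath bundles the Hadoop 2.2.0 client, I've been querying Hadoop's own 
version API from the shell (just a quick probe; I'd expect it to print 2.2.0 
if the right jar won):

    scala> org.apache.hadoop.util.VersionInfo.getVersion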

In `conf/spark-env.sh`, I added this (actually before I did the assembly):

    export HADOOP_CONF_DIR=/etc/hadoop/conf
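
Assuming the shell honors HADOOP_CONF_DIR, I'd also expect the settings from 
/etc/hadoop/conf to show up on the SparkContext's Hadoop configuration. Here's 
the check I have in mind (fs.defaultFS is the Hadoop 2 key for the default 
filesystem; I haven't verified this is actually being picked up):

    scala> sc.hadoopConfiguration.get("fs.defaultFS")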

Now I fire up the shell (bin/spark-shell) and try to grab data from HDFS, and 
I get the following exception:

    scala> var hdf = sc.hadoopFile("hdfs:///user/kwilliams/dat/part-m-00000")
    hdf: org.apache.spark.rdd.RDD[(Nothing, Nothing)] = HadoopRDD[0] at hadoopFile at <console>:12

    scala> hdf.count()
    java.lang.RuntimeException: java.lang.InstantiationException
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
        at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:155)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:168)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:209)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:207)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1064)
        at org.apache.spark.rdd.RDD.count(RDD.scala:806)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:15)
        at $iwC$$iwC$$iwC.<init>(<console>:20)
        at $iwC$$iwC.<init>(<console>:22)
        at $iwC.<init>(<console>:24)
        at <init>(<console>:26)
        at .<init>(<console>:30)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:777)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1045)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
        at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:601)
        at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:608)
        at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:611)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:936)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
    Caused by: java.lang.InstantiationException
        at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
        ... 41 more
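
One detail that may matter: since I didn't give hadoopFile any type 
parameters, the inferred element type is (Nothing, Nothing). My next 
experiment is to spell the types out explicitly. Here's the untested sketch I 
have in mind, assuming the data is plain text (LongWritable/Text keys and 
values with the old-API TextInputFormat):

    scala> import org.apache.hadoop.io.{LongWritable, Text}
    scala> import org.apache.hadoop.mapred.TextInputFormat
    scala> val hdf = sc.hadoopFile[LongWritable, Text, TextInputFormat]("hdfs:///user/kwilliams/dat/part-m-00000")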


Does anyone recognize this as a build problem, a config problem, or something 
else? Failing that, is there any way to get more information about where in 
the process it's failing?
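
For what it's worth, the cross-check I plan to run next is the plain textFile 
entry point on the same path, to separate a type-inference problem in my 
hadoopFile call from a genuine HDFS-connectivity problem:

    scala> val lines = sc.textFile("hdfs:///user/kwilliams/dat/part-m-00000")
    scala> lines.count()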

Thanks.

--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com



