It didn’t work after adding file:// at the front. I compiled it again and ran 
it, and the same error is coming. Do you think there could be some problem with 
the Java dependency? Also, I don’t want to install Hadoop; I just want to run 
it on my local machine. The reason is that whenever I install these things they 
don’t run because of some missing dependency, and then I have to spend time 
figuring out which dependencies are needed.
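
As an aside, since I only want to run it locally, would it be enough to set the 
master to local directly in the code? Something like this (my guess, based on 
the quick start guide):

   import org.apache.spark.{SparkConf, SparkContext}

   // run everything on the local machine with 2 worker threads,
   // so no cluster is needed
   val conf = new SparkConf()
     .setAppName("Simple Application")
     .setMaster("local[2]")
   val sc = new SparkContext(conf)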



Thank you for helping, but it is depressing that I am not even able to run a 
simple example with Spark.



Regards,

Vineet



From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Wednesday, 27 August 2014 16:01
To: Hingorani, Vineet
Cc: user@spark.apache.org
Subject: Re: Example File not running



You can install Hadoop 2 by following this doc: 
https://wiki.apache.org/hadoop/Hadoop2OnWindows. Once you are done with it, 
set the environment variable HADOOP_HOME and it should work.
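
To double-check that the variable is actually visible to the JVM afterwards, a one-line sanity check like this can help (plain Scala, nothing Spark-specific):

   // prints the Hadoop home the JVM sees, or a warning if the variable is unset
   println(sys.env.getOrElse("HADOOP_HOME", "HADOOP_HOME is not set"))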



Also, I'm not sure if it will work, but can you add file:// at the front and 
give it a go? I don't see any requirement for Hadoop here.



   /* SimpleApp.scala */
   import org.apache.spark.SparkContext
   import org.apache.spark.SparkContext._
   import org.apache.spark.SparkConf

   object SimpleApp {
     def main(args: Array[String]) {
       val logFile = "file://C:/Users/D062844/Desktop/HandsOnSpark/Install/spark-1.0.2-bin-hadoop2/README.md" // Should be some file on your system
       val conf = new SparkConf().setAppName("Simple Application")
       val sc = new SparkContext(conf)
       val logData = sc.textFile(logFile, 2).cache()
       val numAs = logData.filter(line => line.contains("a")).count()
       val numBs = logData.filter(line => line.contains("b")).count()
       println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
     }
   }
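
One more thing worth checking, though I'm not sure it applies here: a local file URI on Windows usually needs three slashes rather than two, i.e.:

       val logFile = "file:///C:/Users/D062844/Desktop/HandsOnSpark/Install/spark-1.0.2-bin-hadoop2/README.md"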








   Thanks

   Best Regards



   On Wed, Aug 27, 2014 at 5:09 PM, Hingorani, Vineet <vineet.hingor...@sap.com> wrote:

   The code is the example given on Spark site:



   /* SimpleApp.scala */
   import org.apache.spark.SparkContext
   import org.apache.spark.SparkContext._
   import org.apache.spark.SparkConf

   object SimpleApp {
     def main(args: Array[String]) {
       val logFile = "C:/Users/D062844/Desktop/HandsOnSpark/Install/spark-1.0.2-bin-hadoop2/README.md" // Should be some file on your system
       val conf = new SparkConf().setAppName("Simple Application")
       val sc = new SparkContext(conf)
       val logData = sc.textFile(logFile, 2).cache()
       val numAs = logData.filter(line => line.contains("a")).count()
       val numBs = logData.filter(line => line.contains("b")).count()
       println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
     }
   }



   It goes on like this and doesn’t show me the result of the counts. I had 
installed the pre-built version of Spark 1.0.2 labelled “Hadoop 2” from the 
site. What I think the problem is: because I am running it on my local machine, 
it is not able to find some Hadoop dependencies. Please tell me which file I 
should download to work on my local machine (pre-built, so that I don’t have 
to build it again).









   From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
   Sent: Wednesday, 27 August 2014 13:35
   To: Hingorani, Vineet
   Cc: user@spark.apache.org
   Subject: Re: Example File not running



   It should point to your Hadoop installation directory (like C:\hadoop\).



   Since you don't have Hadoop installed, what is the code that you are running?




   Thanks

   Best Regards



   On Wed, Aug 27, 2014 at 4:50 PM, Hingorani, Vineet <vineet.hingor...@sap.com> wrote:

   What value should I set for that environment variable? I want to run the 
scripts locally on my machine and do not have Hadoop installed.



   Thank you



   From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
   Sent: Wednesday, 27 August 2014 12:54
   To: Hingorani, Vineet
   Cc: user@spark.apache.org
   Subject: Re: Example File not running



   The statement

       java.io.IOException: Could not locate executable null\bin\winutils.exe

   means that null was returned when an environment variable was expanded, i.e. the path was built from a variable that is not set.

   I'm guessing that you are missing HADOOP_HOME in your environment variables.
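
   If you don't want a full Hadoop installation, one workaround I have seen suggested (I haven't verified it on Windows myself) is to put just winutils.exe under a directory of your choice and point Hadoop at it before the SparkContext is created; Hadoop's Shell class also reads the hadoop.home.dir system property as a fallback for HADOOP_HOME:

       // assumes winutils.exe has been placed at C:\hadoop\bin\winutils.exe
       // (the directory name is just an example); must run before the
       // SparkContext is created, because Hadoop's Shell class resolves
       // this path in a static initializer
       System.setProperty("hadoop.home.dir", "C:\\hadoop")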




   Thanks

   Best Regards



   On Wed, Aug 27, 2014 at 3:52 PM, Hingorani, Vineet <vineet.hingor...@sap.com> wrote:

   Hello all,



   I am able to use Spark in the shell, but I am not able to run a Spark 
application from a file. I am using sbt and the jar is created, but even the 
SimpleApp example given on the site 
http://spark.apache.org/docs/latest/quick-start.html is not running. I 
installed a pre-built version of Spark, and sbt package compiles the Scala 
file to a jar. I am running it locally on my machine. The error log is huge, 
but it starts with something like this:



   14/08/27 12:14:21 INFO SecurityManager: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
   14/08/27 12:14:21 INFO SecurityManager: Changing view acls to: D062844
   14/08/27 12:14:21 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(D062844)
   14/08/27 12:14:22 INFO Slf4jLogger: Slf4jLogger started
   14/08/27 12:14:22 INFO Remoting: Starting remoting
   14/08/27 12:14:22 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@10.94.74.159:51157]
   14/08/27 12:14:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@10.94.74.159:51157]
   14/08/27 12:14:22 INFO SparkEnv: Registering MapOutputTracker
   14/08/27 12:14:22 INFO SparkEnv: Registering BlockManagerMaster
   14/08/27 12:14:22 INFO DiskBlockManager: Created local directory at C:\Users\D062844\AppData\Local\Temp\spark-local-20140827121422-dec8
   14/08/27 12:14:22 INFO MemoryStore: MemoryStore started with capacity 294.9 MB.
   14/08/27 12:14:22 INFO ConnectionManager: Bound socket to port 51160 with id = ConnectionManagerId(10.94.74.159,51160)
   14/08/27 12:14:22 INFO BlockManagerMaster: Trying to register BlockManager
   14/08/27 12:14:22 INFO BlockManagerInfo: Registering block manager 10.94.74.159:51160 with 294.9 MB RAM
   14/08/27 12:14:22 INFO BlockManagerMaster: Registered BlockManager
   14/08/27 12:14:22 INFO HttpServer: Starting HTTP Server
   14/08/27 12:14:22 INFO HttpBroadcast: Broadcast server started at http://10.94.74.159:51161
   14/08/27 12:14:22 INFO HttpFileServer: HTTP File server directory is C:\Users\D062844\AppData\Local\Temp\spark-d79d2857-3d85-4b16-8d76-ade83d465f10
   14/08/27 12:14:22 INFO HttpServer: Starting HTTP Server
   14/08/27 12:14:22 INFO SparkUI: Started SparkUI at http://10.94.74.159:4040
   14/08/27 12:14:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   14/08/27 12:14:23 INFO SparkContext: Added JAR file:/C:/Users/D062844/Desktop/HandsOnSpark/Install/spark-1.0.2-bin-hadoop2/target/scala-2.10/simple-project_2.10-1.0.jar at http://10.94.74.159:51162/jars/simple-project_2.10-1.0.jar with timestamp 1409134463198
   14/08/27 12:14:23 INFO MemoryStore: ensureFreeSpace(138763) called with curMem=0, maxMem=309225062
   14/08/27 12:14:23 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 135.5 KB, free 294.8 MB)
   14/08/27 12:14:23 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
   java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
           at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
           at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
           at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
           at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
           at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
           at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
           at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
           at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:145)
           at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:145)
           at scala.Option.map(Option.scala:145)
           at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:145)
           at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:168)
           at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
           at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
           at scala.Option.getOrElse(Option.scala:120)
           at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
           at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
           at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
           at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
           at scala.Option.getOrElse(Option.scala:120)
           at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
           at org.apache.spark.rdd.FilteredRDD.getPartitions(FilteredRDD.scala:29)
           at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)



   ………………

   ……………..





   It goes on like this and doesn’t show me the result of the counts. I had 
installed the pre-built version labelled “Hadoop 2”. What I think the problem 
is: because I am running it on my local machine, it is not able to find some 
Hadoop dependencies. Please tell me which file I should download to work on my 
local machine (pre-built, so that I don’t have to build it again).
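
   For reference, here is the build file I am using, copied from the quick start guide (the exact Scala version is my assumption):

       name := "Simple Project"

       version := "1.0"

       scalaVersion := "2.10.4"

       libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.2"

   After sbt package succeeds, I submit the jar with bin\spark-submit --class "SimpleApp" --master local[4] target\scala-2.10\simple-project_2.10-1.0.jar, roughly as the guide describes.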



   Thank you



   Regards,



   Vineet Hingorani






