You need to set HADOOP_HOME and make sure winutils.exe is available on the PATH.
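As a minimal sketch (the C:\hadoop location below is just an example path, not anything from your setup -- use wherever you unpacked winutils.exe), you can also point Hadoop at the binary programmatically before the StreamingContext is created:

```scala
import java.io.File

object WinutilsSetup {
  def main(args: Array[String]): Unit = {
    // Hypothetical install location -- adjust to where winutils.exe lives.
    // The binary must sit under <hadoopHome>\bin\winutils.exe.
    val hadoopHome = "C:\\hadoop"

    // Hadoop's Shell utilities resolve winutils.exe via the hadoop.home.dir
    // system property (or the HADOOP_HOME environment variable), so set it
    // before building the SparkConf / StreamingContext.
    System.setProperty("hadoop.home.dir", hadoopHome)

    // Sanity check so a missing binary fails loudly at startup rather than
    // as an NPE inside the checkpoint writer.
    val winutils = new File(hadoopHome + File.separator + "bin" + File.separator + "winutils.exe")
    println(s"winutils.exe present: ${winutils.exists()}")
  }
}
```

This only works on the driver-side local filesystem path; on a real cluster the environment variable is the usual route.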
Here's a discussion around the same issue:
http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path

Also this JIRA: https://issues.apache.org/jira/browse/SPARK-2356

Thanks
Best Regards

On Wed, Sep 9, 2015 at 11:30 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:

> Hello.
>
> I have some basic code that counts numbers using updateStateByKey. I set
> up a streaming context with checkpointing as follows:
>
>     def createStreamingContext(masterName: String, checkpointDirectory: String,
>                                timeWindow: Int): StreamingContext = {
>       val sparkConf = new SparkConf().setAppName("Program")
>       val ssc = new StreamingContext(sparkConf, Seconds(timeWindow))
>       ssc.checkpoint(checkpointDirectory)
>       ssc
>     }
>
> This runs fine on my distributed (Linux) cluster, writing checkpoints to
> local disk. However, when I run on my Windows desktop I am seeing a number
> of checkpoint errors:
>
> 15/09/09 13:57:06 INFO CheckpointWriter: Saving checkpoint for time
> 1441821426000 ms to file
> 'file:/C:/Temp/sparkcheckpoint/checkpoint-1441821426000'
> Exception in thread "pool-14-thread-4" java.lang.NullPointerException
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
>     at org.apache.hadoop.util.Shell.run(Shell.java:379)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>     at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
>     at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
>     at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
>     at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
>     at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:181)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> JAVA_HOME is set correctly, the code runs correctly, and it's not a
> permissions issue (I've run this as Administrator). Directories and files
> are being created in C:\Temp, although all of the files appear to be empty.
>
> Does anyone have an idea of what is causing these errors? Has anyone seen
> something similar?
>
> Regards,
>
> Bryan Jeffrey