Akhil,

This looks like the issue. I'll update my PATH to include the (soon to be added) winutils and associated DLLs.

Thank you,
Bryan
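A minimal sketch of that workaround, assuming winutils.exe and its companion DLLs have been copied to C:\hadoop\bin (the C:\hadoop location is illustrative, not from the thread). Hadoop's Shell class resolves winutils.exe from the hadoop.home.dir system property, falling back to the HADOOP_HOME environment variable, so the property must be set before any Spark or Hadoop classes are initialized:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object WinutilsSketch {
      def main(args: Array[String]): Unit = {
        // Assumption: winutils.exe and its DLLs live in C:\hadoop\bin.
        // hadoop.home.dir must point at the PARENT of the bin directory
        // and be set before the first Hadoop class loads.
        System.setProperty("hadoop.home.dir", "C:\\hadoop")

        val sparkConf = new SparkConf().setAppName("Program").setMaster("local[2]")
        val ssc = new StreamingContext(sparkConf, Seconds(10))
        ssc.checkpoint("C:\\Temp\\sparkcheckpoint") // local checkpoint dir, as in the thread
        // ... define the streaming job, then ssc.start() / ssc.awaitTermination()
      }
    }

Setting the environment variable HADOOP_HOME=C:\hadoop (and adding %HADOOP_HOME%\bin to PATH) before launching the JVM achieves the same thing without touching the code.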
-----Original Message-----
From: "Akhil Das" <ak...@sigmoidanalytics.com>
Sent: 9/14/2015 6:46 AM
To: "Bryan Jeffrey" <bryan.jeff...@gmail.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Problems with Local Checkpoints

You need to set your HADOOP_HOME and make sure winutils.exe is available on the PATH. Here's a discussion of the same issue:

http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path

Also see this JIRA: https://issues.apache.org/jira/browse/SPARK-2356

Thanks
Best Regards

On Wed, Sep 9, 2015 at 11:30 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:

Hello.

I have some basic code that counts numbers using updateStateByKey. I set up a streaming context with checkpointing as follows:

    def createStreamingContext(masterName: String, checkpointDirectory: String, timeWindow: Int): StreamingContext = {
      val sparkConf = new SparkConf().setAppName("Program")
      val ssc = new StreamingContext(sparkConf, Seconds(timeWindow))
      ssc.checkpoint(checkpointDirectory)
      ssc
    }

This runs fine on my distributed (Linux) cluster, writing checkpoints to local disk. However, when I run it on my Windows desktop I see a number of checkpoint errors:

15/09/09 13:57:06 INFO CheckpointWriter: Saving checkpoint for time 1441821426000 ms to file 'file:/C:/Temp/sparkcheckpoint/checkpoint-1441821426000'
Exception in thread "pool-14-thread-4" java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
        at org.apache.hadoop.util.Shell.run(Shell.java:379)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
        at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
        at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:181)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

JAVA_HOME is set correctly, and the code itself runs correctly. It is not a permissions issue (I have run this as Administrator). Directories and files are being created in C:\Temp, although all of the files appear to be empty.

Does anyone have an idea of what is causing these errors? Has anyone seen something similar?

Regards,

Bryan Jeffrey
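For completeness, once checkpointing works, the same directory can also drive driver recovery via StreamingContext.getOrCreate. This is a minimal sketch, not Bryan's actual code: the directory and batch interval are illustrative, and the updateStateByKey pipeline is elided since it isn't shown in full in the thread. Note that the DStream definitions must live inside the factory function, because on recovery Spark rebuilds the whole pipeline from the checkpoint rather than calling the factory:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CheckpointRecoverySketch {
      // Illustrative values; substitute your own directory and batch interval.
      val checkpointDirectory = "C:\\Temp\\sparkcheckpoint"

      def createContext(): StreamingContext = {
        val sparkConf = new SparkConf().setAppName("Program")
        val ssc = new StreamingContext(sparkConf, Seconds(10))
        ssc.checkpoint(checkpointDirectory)
        // ... build the DStream pipeline (e.g. updateStateByKey) here ...
        ssc
      }

      def main(args: Array[String]): Unit = {
        // Recovers the context from the checkpoint if one exists;
        // otherwise builds a fresh one with createContext().
        val ssc = StreamingContext.getOrCreate(checkpointDirectory, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }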