Akhil,

This looks like the issue. I'll update my PATH to include the (soon-to-be-added) 
winutils.exe and associated DLLs.

Thank you,

Bryan

-----Original Message-----
From: "Akhil Das" <ak...@sigmoidanalytics.com>
Sent: 9/14/2015 6:46 AM
To: "Bryan Jeffrey" <bryan.jeff...@gmail.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Problems with Local Checkpoints

You need to set HADOOP_HOME and make sure winutils.exe is available on the 
PATH.
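
As a quick in-code alternative to the environment variable, some users set the 
hadoop.home.dir system property before any Spark/Hadoop classes initialize. A 
minimal sketch, assuming winutils.exe has been placed at 
C:\hadoop\bin\winutils.exe (the path is just an example, not a requirement):

    // Hypothetical local winutils location; winutils.exe must live in the
    // bin\ subdirectory of the folder named here.
    System.setProperty("hadoop.home.dir", "C:\\hadoop")

Hadoop's Shell utility checks this property first and falls back to the 
HADOOP_HOME environment variable.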


Here's a discussion around the same issue:
http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path
and the related JIRA: https://issues.apache.org/jira/browse/SPARK-2356


Thanks
Best Regards


On Wed, Sep 9, 2015 at 11:30 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:

Hello.


I have some basic code that counts numbers using updateStateByKey.  I set up a 
streaming context with checkpointing as follows:


def createStreamingContext(masterName : String, checkpointDirectory : String, 
    timeWindow : Int) : StreamingContext = {
  val sparkConf = new SparkConf().setAppName("Program")
  val ssc = new StreamingContext(sparkConf, Seconds(timeWindow))
  ssc.checkpoint(checkpointDirectory)
  ssc
}
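
For context, the counting itself is roughly the following sketch, assuming a 
DStream of (key, 1) pairs named "pairs" and the usual 
org.apache.spark.streaming imports (names are illustrative, not my exact code):

    // Fold each batch's new values into the running per-key total.
    val updateFunc = (newValues: Seq[Int], runningCount: Option[Int]) =>
      Some(newValues.sum + runningCount.getOrElse(0))
    val stateCounts = pairs.updateStateByKey[Int](updateFunc)

The factory above is then passed to StreamingContext.getOrCreate, e.g. 
StreamingContext.getOrCreate(checkpointDirectory, () => 
createStreamingContext(master, checkpointDirectory, window)), so a restart can 
recover from the checkpoint.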

This runs fine on my distributed (Linux) cluster, writing checkpoints to local 
disk. However, when I run on my Windows desktop, I see a number of checkpoint 
errors:


15/09/09 13:57:06 INFO CheckpointWriter: Saving checkpoint for time 
1441821426000 ms to file 
'file:/C:/Temp/sparkcheckpoint/checkpoint-1441821426000'
Exception in thread "pool-14-thread-4" java.lang.NullPointerException
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
 at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
 at 
org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
 at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
 at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
 at 
org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:181)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)


JAVA_HOME is set correctly, the code otherwise runs correctly, and it is not a 
permissions issue (I've run this as Administrator).  Directories and files are 
being created in C:\Temp, although all of the files appear to be empty.


Does anyone have an idea of what is causing these errors?  Has anyone seen 
something similar?


Regards,


Bryan Jeffrey
