Akhil,
This looks like the issue. I'll update my PATH to include the (soon-to-be-added)
winutils.exe and associated DLLs.
Thank you,
Bryan
-----Original Message-----
From: "Akhil Das" <ak...@sigmoidanalytics.com>
Sent: 9/14/2015 6:46 AM
To: "Bryan Jeffrey" <bryan.jeff...@gmail.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Problems with Local Checkpoints
You need to set your HADOOP_HOME and make sure the winutils.exe is available in
the PATH.
Here's a discussion of the same issue:
http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path
See also this JIRA: https://issues.apache.org/jira/browse/SPARK-2356
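If editing the Windows environment is awkward, a minimal sketch of an alternative
(assuming the binaries are unpacked under C:\hadoop, which is just an example path)
is to set hadoop.home.dir from the driver before the StreamingContext is created:

object WinutilsWorkaround {
  def main(args: Array[String]): Unit = {
    // C:\hadoop is an assumed location; its bin\ subfolder must contain winutils.exe.
    // Set this before the StreamingContext (or any Hadoop FileSystem call) is created.
    System.setProperty("hadoop.home.dir", "C:\\hadoop")
    // ... build the SparkConf / StreamingContext as usual from here ...
  }
}

Setting the HADOOP_HOME environment variable to the same directory (and putting
%HADOOP_HOME%\bin on the PATH) accomplishes the same thing without a code change.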
Thanks
Best Regards
On Wed, Sep 9, 2015 at 11:30 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
Hello.
I have some basic code that counts numbers using updateStateByKey. I set up a
streaming context with checkpointing as follows:
def createStreamingContext(masterName : String, checkpointDirectory : String,
                           timeWindow : Int) : StreamingContext = {
  val sparkConf = new SparkConf().setAppName("Program")
  val ssc = new StreamingContext(sparkConf, Seconds(timeWindow))
  ssc.checkpoint(checkpointDirectory)
  ssc
}
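For context, the counting itself follows the standard updateStateByKey pattern from
the Spark Streaming programming guide; a minimal standalone sketch (the socket
source, port, master, batch interval, and checkpoint path are illustrative
placeholders, not the actual job) looks roughly like this:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulCount {
  def main(args: Array[String]): Unit = {
    // Placeholder master, batch interval, and checkpoint path for this sketch.
    val conf = new SparkConf().setAppName("Program").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("file:///C:/Temp/sparkcheckpoint")

    // Stand-in input source; the real job reads from whatever source it uses.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Running count per key; updateStateByKey is what requires the checkpoint dir.
    val updateCount: (Seq[Int], Option[Int]) => Option[Int] =
      (newValues, runningCount) => Some(newValues.sum + runningCount.getOrElse(0))

    lines.map(value => (value, 1)).updateStateByKey(updateCount).print()

    ssc.start()
    ssc.awaitTermination()
  }
}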
This runs fine on my distributed (Linux) cluster, writing checkpoints to local
disk. However, when I run it on my Windows desktop I see a number of checkpoint
errors:
15/09/09 13:57:06 INFO CheckpointWriter: Saving checkpoint for time
1441821426000 ms to file
'file:/C:/Temp/sparkcheckpoint/checkpoint-1441821426000'
Exception in thread "pool-14-thread-4" java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:181)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
JAVA_HOME is set correctly, the code otherwise runs correctly, and it's not a
permissions issue (I've run this as Administrator). Directories and files are
being created in C:\Temp, although all of the files appear to be empty.
Does anyone have an idea of what is causing these errors? Has anyone seen
something similar?
Regards,
Bryan Jeffrey