Hi,

What happens when you create the parent directory /home/stuti on every node? I
think the failure is due to missing parent directories on the workers — with a
local (file:) path, each executor writes to its own local disk. What's the OS?
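A minimal sketch of that check, assuming password-less SSH and hypothetical worker hostnames (server1, server2 — substitute your own) — it creates the output's parent directory on the driver and on every worker, so the committer can find file:/home/stuti/test1/_temporary locally on each node:

```shell
# Hypothetical worker hostnames -- replace with your actual slaves.
WORKERS="server1 server2"
# Parent of the output path used in saveAsTextFile.
PARENT=/home/stuti

# Create it on the driver/master node first.
mkdir -p "$PARENT" 2>/dev/null || echo "could not create $PARENT locally"

# Then on every worker: with a file:// URI each executor writes its
# partition to its *own* local filesystem, so the directory must exist
# (and be writable) on all nodes, not just the driver.
for h in $WORKERS; do
  ssh -o BatchMode=yes -o ConnectTimeout=3 "$h" "mkdir -p $PARENT" \
    || echo "could not reach $h"
done
```

Note that even with this in place, each worker ends up holding only its own part-NNNNN files; without a shared filesystem you still have to gather the pieces yourself.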

Jacek
On 24 May 2016 11:27 a.m., "Stuti Awasthi" <stutiawas...@hcl.com> wrote:

Hi All,

I have a 3-node Spark 1.6 standalone-mode cluster with 1 master and 2 slaves,
and I am not using Hadoop (HDFS) as the filesystem. I am able to launch the
shell, read the input file from the local filesystem, and perform
transformations successfully. But when I try to write my output to a local
filesystem path, I receive the error below.



I searched the web and found a similar JIRA:
https://issues.apache.org/jira/browse/SPARK-2984 . Even though it is marked
resolved for Spark 1.3+, people have posted reports that the same issue still
persists in later versions.



*ERROR*

scala> data.saveAsTextFile("/home/stuti/test1")

16/05/24 05:03:42 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2,
server1): java.io.IOException: The temporary job-output directory
file:/home/stuti/test1/_temporary doesn't exist!
        at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
        at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
        at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
        at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:91)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1193)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)



What is the best way to resolve this issue if I don't want to install Hadoop?
Or is it mandatory to have Hadoop in order to write output from a
standalone-mode cluster?



Please suggest.



Thanks & Regards

Stuti Awasthi





