Hi Vasil!

It looks like the WordCount application is trying to open an intermediate file and failing. Do you see a directory under D:/tmp/hadoop-Vasil Grigorov/? I can think of a few possible reasons (apologies, I'm not very familiar with the filesystem on Windows 10):

1. Spaces in the file name are not being encoded/decoded properly. Can you try changing your name/username to remove the space?
2. There is not enough space on the D:/tmp directory.
3. The application does not have the right permissions to create the file.
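If renaming the Windows account isn't practical, possibility 1 can usually be worked around by pointing Hadoop's local scratch space at a path with no spaces in it. A minimal sketch, assuming you edit etc/hadoop/core-site.xml in your Hadoop install (D:/hadoop-tmp is just an example path, pick any space-free directory the job can write to):

```xml
<!-- core-site.xml: move Hadoop's local scratch space off the default
     /tmp/hadoop-${user.name}, which inherits the space in the Windows
     username and shows up URL-encoded (%20) in the failing path. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>D:/hadoop-tmp</value>
  </property>
</configuration>
```

Note that passing the same override on the command line with -Dhadoop.tmp.dir=... would only take effect once the job implements the Tool interface and is launched through ToolRunner; the "Hadoop command-line option parsing not performed" WARN in your log suggests that isn't the case yet, so editing core-site.xml is the safer route.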
HTH,
Ravi

On Wed, Feb 22, 2017 at 10:51 AM, Васил Григоров <vask...@abv.bg> wrote:
> Hello, I've been trying to run the WordCount example provided on the
> website on my Windows 10 machine. I have built the latest Hadoop version
> (2.7.3) successfully and I want to run the code in Local (Standalone)
> Mode. Thus, I have not specified any configuration, apart from setting the
> JAVA_HOME path in the "hadoop-env.cmd" file. When I try to run the
> WordCount file, it completes the Map tasks but fails to run the Reduce
> task. I get the following output:
>
> D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount>hadoop jar wc.jar WordCount D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\input D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\output
> 17/02/22 18:40:43 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
> 17/02/22 18:40:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
> 17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
> 17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
> 17/02/22 18:40:44 INFO input.FileInputFormat: Total input paths to process : 2
> 17/02/22 18:40:44 INFO mapreduce.JobSubmitter: number of splits:2
> 17/02/22 18:40:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local334410887_0001
> 17/02/22 18:40:45 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
> 17/02/22 18:40:45 INFO mapreduce.Job: Running job: job_local334410887_0001
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter set in config null
> 17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Waiting for map tasks
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000000_0
> 17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:45 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> 17/02/22 18:40:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@3019d00f
> 17/02/22 18:40:45 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file02:0+27
> 17/02/22 18:40:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 17/02/22 18:40:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 17/02/22 18:40:45 INFO mapred.MapTask: soft limit at 83886080
> 17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 17/02/22 18:40:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner:
> 17/02/22 18:40:45 INFO mapred.MapTask: Starting flush of map output
> 17/02/22 18:40:45 INFO mapred.MapTask: Spilling map output
> 17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600
> 17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
> 17/02/22 18:40:45 INFO mapred.MapTask: Finished spill 0
> 17/02/22 18:40:45 INFO mapred.Task: Task:attempt_local334410887_0001_m_000000_0 is done. And is in the process of committing
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: map
> 17/02/22 18:40:45 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000000_0' done.
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000000_0
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000001_0
> 17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> 17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39ef3a7
> 17/02/22 18:40:46 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file01:0+25
> 17/02/22 18:40:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 17/02/22 18:40:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 17/02/22 18:40:46 INFO mapred.MapTask: soft limit at 83886080
> 17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 17/02/22 18:40:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner:
> 17/02/22 18:40:46 INFO mapred.MapTask: Starting flush of map output
> 17/02/22 18:40:46 INFO mapred.MapTask: Spilling map output
> 17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufend = 42; bufvoid = 104857600
> 17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
> 17/02/22 18:40:46 INFO mapred.MapTask: Finished spill 0
> 17/02/22 18:40:46 INFO mapred.Task: Task:attempt_local334410887_0001_m_000001_0 is done. And is in the process of committing
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: map
> 17/02/22 18:40:46 INFO mapreduce.Job: Job job_local334410887_0001 running in uber mode : false
> 17/02/22 18:40:46 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000001_0' done.
> 17/02/22 18:40:46 INFO mapreduce.Job: map 100% reduce 0%
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000001_0
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: map task executor complete.
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: Waiting for reduce tasks
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_r_000000_0
> 17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> 17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@13ac822f
> 17/02/22 18:40:46 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6c4d20c4
> 17/02/22 18:40:46 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
> 17/02/22 18:40:46 INFO reduce.EventFetcher: attempt_local334410887_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: reduce task executor complete.
> 17/02/22 18:40:46 WARN mapred.LocalJobRunner: job_local334410887_0001
> java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
>         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: D:/tmp/hadoop-Vasil%20Grigorov/mapred/local/localRunner/Vasil%20Grigorov/jobcache/job_local334410887_0001/attempt_local334410887_0001_m_000000_0/output/file.out.index
>         at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:200)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
>         at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:156)
>         at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
>         at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
>         at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
>         at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:124)
>         at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:102)
>         at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:85)
> 17/02/22 18:40:47 INFO mapreduce.Job: Job job_local334410887_0001 failed with state FAILED due to: NA
> 17/02/22 18:40:47 INFO mapreduce.Job: Counters: 18
>         File System Counters
>                 FILE: Number of bytes read=1158
>                 FILE: Number of bytes written=591978
>                 FILE: Number of read operations=0
>                 FILE: Number of large read operations=0
>                 FILE: Number of write operations=0
>         Map-Reduce Framework
>                 Map input records=2
>                 Map output records=8
>                 Map output bytes=86
>                 Map output materialized bytes=89
>                 Input split bytes=308
>                 Combine input records=8
>                 Combine output records=6
>                 Spilled Records=6
>                 Failed Shuffles=0
>                 Merged Map outputs=0
>                 GC time elapsed (ms)=0
>                 Total committed heap usage (bytes)=574095360
>         File Input Format Counters
>                 Bytes Read=52
>
> I have followed every tutorial available and looked for a potential
> solution to the error I get, but I have been unsuccessful. As I mentioned
> before, I have not set any further configuration in any files because I
> want to run in Standalone mode, rather than pseudo-distributed or fully
> distributed mode. I've spent a lot of time and effort to get this far and
> I've hit a brick wall with this error, so any help would be GREATLY
> appreciated.
>
> Thank you in advance!