Hi Vasil!

It looks like the WordCount application is trying to open an intermediate file and failing. Do you see a directory under D:/tmp/hadoop-Vasil Grigorov/? I can think of a few possible reasons (apologies, I'm not very familiar with the filesystem on Windows 10):

1. Spaces in the file name are not being encoded/decoded properly. Can you try changing your name/username to remove the space?
2. There is not enough space on the D:/tmp directory.
3. The application does not have the right permissions to create the file.
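If renaming the Windows account isn't practical, possibility 1 can usually be worked around by pointing Hadoop's local scratch space at a path with no spaces in it. A minimal sketch, assuming you edit etc/hadoop/core-site.xml in your Hadoop install (D:/hadoop-tmp is just an example path, pick any space-free directory the job can write to):

```xml
<!-- core-site.xml: move Hadoop's local scratch space off the default
     /tmp/hadoop-${user.name}, which inherits the space in the Windows
     username and shows up URL-encoded (%20) in the failing path. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>D:/hadoop-tmp</value>
  </property>
</configuration>
```

Note that passing the same override on the command line with -Dhadoop.tmp.dir=... would only take effect once the job implements the Tool interface and is launched through ToolRunner; the "Hadoop command-line option parsing not performed" WARN in your log suggests that isn't the case yet, so editing core-site.xml is the safer route.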
HTH,
Ravi

On Wed, Feb 22, 2017 at 10:51 AM, Васил Григоров <vask...@abv.bg> wrote:
> Hello, I've been trying to run the WordCount example provided on the
> website on my Windows 10 machine. I have built the latest Hadoop version
> (2.7.3) successfully and I want to run the code in Local (Standalone)
> Mode. Thus, I have not specified any configuration, apart from setting the
> JAVA_HOME path in the "hadoop-env.cmd" file. When I try to run the
> WordCount file, it completes the Map tasks but fails to run the Reduce
> task. I get the following output:
>
> D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount>hadoop jar wc.jar WordCount D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\input D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\output
> 17/02/22 18:40:43 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
> 17/02/22 18:40:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
> 17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
> 17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
> 17/02/22 18:40:44 INFO input.FileInputFormat: Total input paths to process : 2
> 17/02/22 18:40:44 INFO mapreduce.JobSubmitter: number of splits:2
> 17/02/22 18:40:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local334410887_0001
> 17/02/22 18:40:45 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
> 17/02/22 18:40:45 INFO mapreduce.Job: Running job: job_local334410887_0001
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter set in config null
> 17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Waiting for map tasks
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000000_0
> 17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:45 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> 17/02/22 18:40:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@3019d00f
> 17/02/22 18:40:45 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file02:0+27
> 17/02/22 18:40:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 17/02/22 18:40:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 17/02/22 18:40:45 INFO mapred.MapTask: soft limit at 83886080
> 17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 17/02/22 18:40:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner:
> 17/02/22 18:40:45 INFO mapred.MapTask: Starting flush of map output
> 17/02/22 18:40:45 INFO mapred.MapTask: Spilling map output
> 17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600
> 17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
> 17/02/22 18:40:45 INFO mapred.MapTask: Finished spill 0
> 17/02/22 18:40:45 INFO mapred.Task: Task:attempt_local334410887_0001_m_000000_0 is done. And is in the process of committing
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: map
> 17/02/22 18:40:45 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000000_0' done.
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000000_0
> 17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000001_0
> 17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> 17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39ef3a7
> 17/02/22 18:40:46 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file01:0+25
> 17/02/22 18:40:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 17/02/22 18:40:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 17/02/22 18:40:46 INFO mapred.MapTask: soft limit at 83886080
> 17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 17/02/22 18:40:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner:
> 17/02/22 18:40:46 INFO mapred.MapTask: Starting flush of map output
> 17/02/22 18:40:46 INFO mapred.MapTask: Spilling map output
> 17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufend = 42; bufvoid = 104857600
> 17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
> 17/02/22 18:40:46 INFO mapred.MapTask: Finished spill 0
> 17/02/22 18:40:46 INFO mapred.Task: Task:attempt_local334410887_0001_m_000001_0 is done. And is in the process of committing
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: map
> 17/02/22 18:40:46 INFO mapreduce.Job: Job job_local334410887_0001 running in uber mode : false
> 17/02/22 18:40:46 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000001_0' done.
> 17/02/22 18:40:46 INFO mapreduce.Job: map 100% reduce 0%
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000001_0
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: map task executor complete.
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: Waiting for reduce tasks
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_r_000000_0
> 17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> 17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@13ac822f
> 17/02/22 18:40:46 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6c4d20c4
> 17/02/22 18:40:46 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
> 17/02/22 18:40:46 INFO reduce.EventFetcher: attempt_local334410887_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
> 17/02/22 18:40:46 INFO mapred.LocalJobRunner: reduce task executor complete.
> 17/02/22 18:40:46 WARN mapred.LocalJobRunner: job_local334410887_0001
> java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
>         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: D:/tmp/hadoop-Vasil%20Grigorov/mapred/local/localRunner/Vasil%20Grigorov/jobcache/job_local334410887_0001/attempt_local334410887_0001_m_000000_0/output/file.out.index
>         at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:200)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
>         at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:156)
>         at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
>         at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
>         at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
>         at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:124)
>         at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:102)
>         at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:85)
> 17/02/22 18:40:47 INFO mapreduce.Job: Job job_local334410887_0001 failed with state FAILED due to: NA
> 17/02/22 18:40:47 INFO mapreduce.Job: Counters: 18
>         File System Counters
>                 FILE: Number of bytes read=1158
>                 FILE: Number of bytes written=591978
>                 FILE: Number of read operations=0
>                 FILE: Number of large read operations=0
>                 FILE: Number of write operations=0
>         Map-Reduce Framework
>                 Map input records=2
>                 Map output records=8
>                 Map output bytes=86
>                 Map output materialized bytes=89
>                 Input split bytes=308
>                 Combine input records=8
>                 Combine output records=6
>                 Spilled Records=6
>                 Failed Shuffles=0
>                 Merged Map outputs=0
>                 GC time elapsed (ms)=0
>                 Total committed heap usage (bytes)=574095360
>         File Input Format Counters
>                 Bytes Read=52
>
> I have followed every tutorial available and looked for a potential
> solution to the error I get, but I have been unsuccessful. As I mentioned
> before, I have not set any further configuration in any files because I
> want to run in Standalone mode, rather than pseudo-distributed or fully
> distributed mode. I've spent a lot of time and effort to get this far and
> I've hit a brick wall with this error, so any help would be GREATLY
> appreciated.
>
> Thank you in advance!