Hi,

I tried it too and got similar output. It looks like a bug in the code, although the code seems to have been there since the stone age... I tried a fix: a "." (period) was missing from the key when setting the conf, while the code retrieving the value used the key with the period. I have put the change here:
https://github.com/ayushtkn/hadoop/commit/ab7da425e204903e867855b05b7c8fc2fbdd8b0e
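
To illustrate the kind of mismatch I mean, here is a minimal sketch in plain Java, with a HashMap standing in for the job configuration. The key prefix, method names and descriptor strings are all made up for illustration; they are not the real Hadoop constants and this is not the code in the commit above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a job configuration; NOT Hadoop's Configuration class.
class ConfKeyMismatchSketch {
    // Hypothetical key prefix, chosen for illustration only.
    static final String PREFIX = "example.aggregator.descriptor";

    // Buggy writer: the "." separator is missing, so the key is PREFIX + "0".
    static void setBuggy(Map<String, String> conf, int i, String value) {
        conf.put(PREFIX + i, value);
    }

    // Fixed writer: the key is PREFIX + ".0", matching what the reader expects.
    static void setFixed(Map<String, String> conf, int i, String value) {
        conf.put(PREFIX + "." + i, value);
    }

    // Reader: always looks the value up with the period in the key.
    static String get(Map<String, String> conf, int i, String fallback) {
        return conf.getOrDefault(PREFIX + "." + i, fallback);
    }

    public static void main(String[] args) {
        Map<String, String> buggy = new HashMap<>();
        setBuggy(buggy, 0, "WordCountDescriptor");
        // The key written and the key read differ, so the fallback is returned.
        System.out.println(get(buggy, 0, "DefaultDescriptor"));

        Map<String, String> fixed = new HashMap<>();
        setFixed(fixed, 0, "WordCountDescriptor");
        // Both sides now agree on the key, so the stored value is returned.
        System.out.println(get(fixed, 0, "DefaultDescriptor"));
    }
}
```

With the buggy writer the reader never finds the entry and quietly falls back to a default, which is the general shape of the problem described above.
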
I patched it on top of trunk and tried your use case locally; with the fix, the output looks correct. I will check and raise a MAPREDUCE Jira to fix it. If it gets reviewed and committed, you can either patch your Hadoop distro or wait for the next release that contains the fix.

hadoop-3.4.0-SNAPSHOT % bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.0-SNAPSHOT.jar aggregatewordcount /testData /testOut 1 textinputformat

hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut/part-r-00000
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2

> Does this mean that Aggregate WordCount is merely counting the number of
> files in the input directory?

No, not when it works as intended. The JavaDoc says:
*It reads the text input files, breaks each line into words and counts them. The output is a locally sorted list of words and the count of how often they occurred.*

On Mon, 2 May 2022 at 10:23, Pratyush Das <reik...@gmail.com> wrote:

> Hi,
>
> I had some questions about what the Aggregate Word Count example in the
> hadoop-mapreduce-examples-3.3.1.jar actually does.
>
> This is how I executed the AggregateWordCount example - hadoop jar
> hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar
> aggregatewordcount /examples-input/wordcount/ /examples-output/wordcount/ 1
> textinputformat
>
> /examples-input/wordcount/ contains 2 files - wc01.txt and wc02.txt.
>
> These are the contents of wc01.txt:
> Hello World Bye World
>
> These are the contents of wc02.txt:
> Hello Hadoop Goodbye Hadoop
>
> The generated output file - /examples-output/wordcount/part-r-00000
> contains the following line:
> record_count 2
>
> I tried adding another file - wc03.txt which changed the content of the
> generated file to:
> record_count 3
>
> Does this mean that Aggregate WordCount is merely counting the number of
> files in the input directory?
>
> Regards,
>
>
> --
> Pratyush Das
>