Hi, When I run the program below, I see two files in HDFS because the number of partitions is 2. But one of the files is empty. Why is that? Is the work not distributed equally across all the tasks?
textFile.flatMap(lambda line: line.split()).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b).repartition(2).saveAsTextFile("hdfs://localhost:9000/user/praveen/output/")

Thanks,
Praveen