On Feb 4, 2008 2:11 PM, Miles Osborne <[EMAIL PROTECTED]> wrote:
> This is exactly the same as word counting, except that you have a second
> pass to find the top n per block of data (this can be done in a mapper) and
> then a reducer can quite easily merge the results together.
This would mean I have to write a second program that reads the output of
the first and does the job. I was wondering if it could be done in one
program.

> This wouldn't be homework, would it?

No, it isn't homework. I read the word-count program that comes with Hadoop
and wanted to extend it to solve my problem.

thanks,
Taran

> Miles
>
> On 04/02/2008, Tarandeep Singh <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> >
> > Can someone guide me on how to write a program using the Hadoop framework
> > that analyzes log files and finds the most frequently occurring keywords?
> > The log file has the format:
> >
> > keyword source dateId
> >
> > Thanks,
> > Tarandeep
>
> --
> The University of Edinburgh is a charitable body, registered in Scotland,
> with registration number SC005336.
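For anyone following the thread, Miles's two-pass recipe can be sketched in plain Python. This is only an illustration of the mapper/reducer logic each Hadoop job would run, not real Hadoop API code; the function names (`count_mapper`, `topn_mapper`, etc.) are made up for the sketch.

```python
from collections import Counter
import heapq

# Job 1: word counting over log lines of the form "keyword source dateId".
def count_mapper(lines):
    """Emit a (keyword, 1) pair for each log line."""
    for line in lines:
        fields = line.split()
        if fields:                      # skip blank lines
            yield fields[0], 1

def count_reducer(pairs):
    """Sum the counts for each keyword."""
    totals = Counter()
    for keyword, n in pairs:
        totals[keyword] += n
    return dict(totals)

# Job 2: each mapper keeps only the top n of its block of (keyword, count)
# records, then a single reducer merges the partial lists.
def topn_mapper(counts, n):
    """Return the n most frequent keywords seen in this mapper's block."""
    return heapq.nlargest(n, counts.items(), key=lambda kv: kv[1])

def topn_reducer(partial_lists, n):
    """Merge per-block top-n lists into a global top n. After job 1 each
    keyword appears in at most one block, so no re-summing is needed."""
    merged = [kv for partial in partial_lists for kv in partial]
    return heapq.nlargest(n, merged, key=lambda kv: kv[1])

if __name__ == "__main__":
    lines = [
        "apple web 20080204",
        "banana web 20080204",
        "apple mail 20080203",
        "cherry web 20080203",
        "apple mail 20080202",
        "banana web 20080202",
    ]
    counts = count_reducer(count_mapper(lines))
    print(topn_reducer([topn_mapper(counts, 2)], 2))
```

As to doing it "in one program": the two jobs still run as two MapReduce passes, but a single driver can submit them back-to-back (in the Java API of that era, two sequential `JobClient.runJob` calls, the second reading the first's output directory), so it need not be two separate programs.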