That doesn't quite do what the poster requested. They wanted to pass the entire file to the mapper. That requires either a custom input format or an indirect approach (a list of file names as the input); a sketch of each is below.
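Something along these lines should work for the custom input format (an untested sketch against the old org.apache.hadoop.mapred API; WholeFileInputFormat and WholeFileRecordReader are names made up here, not classes that ship with Hadoop):

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.*;

// Hands each input file to a single mapper as one record.
public class WholeFileInputFormat
    extends FileInputFormat<NullWritable, BytesWritable> {

  // Never split a file, so one map task sees the whole thing.
  protected boolean isSplitable(FileSystem fs, Path filename) {
    return false;
  }

  public RecordReader<NullWritable, BytesWritable> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new WholeFileRecordReader((FileSplit) split, job);
  }
}

// Returns exactly one record: the full contents of the file.
class WholeFileRecordReader
    implements RecordReader<NullWritable, BytesWritable> {

  private final FileSplit split;
  private final JobConf job;
  private boolean processed = false;

  WholeFileRecordReader(FileSplit split, JobConf job) {
    this.split = split;
    this.job = job;
  }

  public boolean next(NullWritable key, BytesWritable value) throws IOException {
    if (processed) {
      return false;  // only one record per file
    }
    byte[] contents = new byte[(int) split.getLength()];
    Path file = split.getPath();
    FileSystem fs = file.getFileSystem(job);
    FSDataInputStream in = fs.open(file);
    try {
      IOUtils.readFully(in, contents, 0, contents.length);
    } finally {
      in.close();
    }
    value.set(contents, 0, contents.length);
    processed = true;
    return true;
  }

  public NullWritable createKey() { return NullWritable.get(); }
  public BytesWritable createValue() { return new BytesWritable(); }
  public long getPos() { return processed ? split.getLength() : 0; }
  public float getProgress() { return processed ? 1.0f : 0.0f; }
  public void close() { }
}

Wire it in with conf.setInputFormat(WholeFileInputFormat.class). The obvious caveat is that each file has to fit in a task's memory.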
On 10/15/07 9:57 AM, "Rick Cox" <[EMAIL PROTECTED]> wrote:

> You can also gzip each input file. Hadoop will not split a compressed
> input file (but will automatically decompress it before feeding it to
> your mapper).
>
> rick
>
> On 10/15/07, Ted Dunning <[EMAIL PROTECTED]> wrote:
>>
>> Use a list of file names as your map input. Then your mapper can read a
>> line and use that to open and read a file for processing.
>>
>> This is similar to the problem of web crawling, where the input is a
>> list of URLs.
>>
>> On 10/15/07 6:57 AM, "Ming Yang" <[EMAIL PROTECTED]> wrote:
>>
>>> I was writing a test MapReduce program and noticed that the
>>> input file was always broken down into separate lines and fed
>>> to the mapper. However, in my case I need to process the whole
>>> file in the mapper, since there are some dependencies between
>>> lines in the input file. Is there any way I can achieve this --
>>> process the whole input file, either text or binary, in the mapper?
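For the indirect approach Ted describes, the mapper does the file opening itself. Roughly (again untested, same old API; FileNameMapper is a placeholder name and the per-file processing is left as a stub):

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// The job's real input is an ordinary text file listing one path per
// line; each map call opens the named file and can read all of it.
public class FileNameMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private JobConf job;

  public void configure(JobConf job) {
    this.job = job;
  }

  public void map(LongWritable key, Text value,
      OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    Path file = new Path(value.toString().trim());
    FileSystem fs = file.getFileSystem(job);
    FSDataInputStream in = fs.open(file);
    try {
      // ... read the whole stream here, then emit via output.collect(...) ...
    } finally {
      in.close();
    }
  }
}

One caveat with this scheme: data locality is lost, since the task that reads a path isn't necessarily scheduled near that file's blocks.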