
Re: one input file per map

2008-07-03 Thread Yang Chen

Maybe consider a hierarchy. The first level is one map per file, and the second level is map/reduce for the parent level.

YC

On 7/3/08, Jason Venner <[EMAIL PROTECTED]> wrote:
> You could also set your input split size to Long.MAX_VALUE.
>
> Goel, Ankur wrote:
>> Nope, But if the intent is so the

Re: one input file per map

2008-07-03 Thread Jason Venner

You could also set your input split size to Long.MAX_VALUE.

Goel, Ankur wrote:
Nope, But if the intent is so then there are 2 ways of doing it.
1. Just extend the input format of your choice and override isSplitable() method to return false.
2. Compress your text file using a compression forma

RE: one input file per map

2008-07-02 Thread Goel, Ankur

Nope, but if that is the intent then there are 2 ways of doing it.
1. Just extend the input format of your choice and override the isSplitable() method to return false.
2. Compress your text file using a compression format supported by Hadoop (e.g. gzip). This will ensure that one map task processes 1 f
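
To see why the suggestions in this thread lead to one map per file, here is a rough, self-contained Java sketch of the split arithmetic. It is a simplified model, not Hadoop code: the class SplitModel and its methods are hypothetical names, and the formula max(minSize, min(goalSize, blockSize)) is assumed to mirror how FileInputFormat picks a split size. The real implementation has further details (such as slack on the last split) that are left out here.

```java
public class SplitModel {
    // Assumed split-size rule: the minimum split size wins over the
    // smaller of the goal size and the block size.
    public static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Number of splits (and hence map tasks) for one non-empty file in this model.
    public static long countSplits(long fileLen, long goalSize, long minSize,
                                   long blockSize, boolean splittable) {
        if (!splittable) {
            // An input format whose isSplitable() returns false, or a gzip
            // file, is handed to a single map task as a whole.
            return 1;
        }
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        // Ceiling division written without addition, so that a split size of
        // Long.MAX_VALUE cannot overflow.
        return fileLen / splitSize + (fileLen % splitSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileLen = 200 * mb;   // hypothetical 200 MB input file
        long blockSize = 64 * mb;  // hypothetical 64 MB block size

        // Default-like settings: the file is cut at block boundaries.
        System.out.println(countSplits(fileLen, fileLen, 1, blockSize, true));

        // Minimum split size set to Long.MAX_VALUE: one split for the file.
        System.out.println(countSplits(fileLen, fileLen, Long.MAX_VALUE, blockSize, true));

        // Non-splittable input: one split regardless of sizes.
        System.out.println(countSplits(fileLen, fileLen, 1, blockSize, false));
    }
}
```

In this model the first call yields 4 splits (three full 64 MB blocks plus an 8 MB remainder), while the other two yield 1, which is the behaviour the thread is after. Overriding isSplitable() is the more explicit of the two approaches, since it does not depend on how the split size is computed.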