On Thu, Nov 26, 2009 at 5:32 PM, Cubic <cubicdes...@gmail.com> wrote:
> Hi list.
>
> I have small files containing data that has to be processed. A file
> can be small, even down to 10MB (but it can also be 100-600MB large),
> and contains at least 30,000 records to be processed. Processing one
> record can take 30 seconds to 2 minutes. My cluster is about 10
> nodes. Each node has 16 cores.

Sorry for deviating from the question, but I am curious: what does
"core" refer to here?

> Can anybody give an idea about how to deal with these small files?
> It is not quite a common Hadoop task, I know. For example, how many
> map tasks should I set in this case?

--
Regards,
~Sid~
I have never met a man so ignorant that I couldn't learn something from him
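P.S. On the map-task question itself: since a single record runs 30
seconds to 2 minutes, the usual trick is to split the input by record
count rather than by HDFS block, e.g. with NLineInputFormat, so even a
10MB file fans out across the whole cluster. Below is a minimal,
untested sketch assuming one record per line and the new-API
NLineInputFormat (org.apache.hadoop.mapreduce.lib.input); the
ProcessMapper body and the lines-per-split value are placeholders you
would tune. With ~30,000 records and 10 nodes x 16 cores, a couple
hundred lines per split gives enough map tasks to keep every core busy.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SmallFileDriver {

    // Placeholder mapper: the expensive 30 s - 2 min per-record work
    // goes inside map(). NLineInputFormat hands it (byte offset, line).
    public static class ProcessMapper
            extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable offset, Text record, Context ctx)
                throws IOException, InterruptedException {
            // ... heavy per-record processing here ...
            ctx.write(offset, record);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "record-processing");
        job.setJarByClass(SmallFileDriver.class);
        job.setMapperClass(ProcessMapper.class);
        job.setNumReduceTasks(0);                  // map-only job

        // One split per 200 input lines instead of one per HDFS block,
        // so small files still produce many map tasks.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 200);

        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

On 0.20-style clusters the old-API equivalent is
org.apache.hadoop.mapred.lib.NLineInputFormat with the
mapred.line.input.format.linespermap property. You would also want
mapred.tasktracker.map.tasks.maximum raised to roughly the per-node
core count (16 here) so all ~160 map slots actually run concurrently.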