You might be hitting the "small files" problem. This has been
discussed multiple times on the list; grepping through the archives will help.
See also http://www.cloudera.com/blog/2009/02/02/the-small-files-problem/
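One common remedy is to pack the many small files into a single SequenceFile
(file name as key, file contents as value) and run the job over that instead,
so the framework handles a few large files rather than ~80,000 tiny ones.
A minimal sketch, assuming your small files sit under one input directory
(the class name, paths, and arguments here are just placeholders):

// PackSmallFiles: copy every file under an input directory into one
// SequenceFile, keyed by file name, valued by the raw file bytes.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path inputDir = new Path(args[0]);  // directory holding the small files
    Path packed = new Path(args[1]);    // output SequenceFile

    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, packed, Text.class, BytesWritable.class);
    try {
      for (FileStatus status : fs.listStatus(inputDir)) {
        if (status.isDir()) continue;
        byte[] contents = new byte[(int) status.getLen()];
        FSDataInputStream in = fs.open(status.getPath());
        try {
          in.readFully(0, contents);
        } finally {
          in.close();
        }
        // key = original file name, value = raw file contents
        writer.append(new Text(status.getPath().getName()),
                      new BytesWritable(contents));
      }
    } finally {
      IOUtils.closeStream(writer);
    }
  }
}

Your existing map logic would then read pairs out of the packed file instead
of reading each small file separately.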

Ashutosh

On Sun, Oct 18, 2009 at 22:57, Kunsheng Chen <ke...@yahoo.com> wrote:

> I am running a Hadoop program to perform MapReduce work on files inside a
> folder.
>
> My program is basically doing Map and Reduce work: each line of any file is
> a pair of strings, and the result is each string associated with its
> occurrence count across all files.
>
> The program works fine until the number of files grows to about 80,000;
> then a 'cannot allocate memory' error occurs for some reason.
>
> Each of the files contains around 50 lines, but the total size of all files
> is no more than 1.5 GB. There are 3 datanodes performing the calculation,
> each of them with more than 10 GB of disk space left.
>
> I am wondering if that is normal for Hadoop because the data is too large,
> or if it might be my program's problem?
>
> It is really not supposed to be, since Hadoop was developed for processing
> large data sets.
>
>
> Any ideas are appreciated.
>
