So you mean that if we want to taste the sweetness of Hadoop, we need huge datasets... Well, I don't have any dataset of that size at the moment. Can anyone suggest a website, or share a dataset of that scale, so that I can see the efficiency of Hadoop?

Thanks,
Praveenesh
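[Editor's note: if no public dataset of that size is at hand, one workaround is to generate a synthetic one. Below is a minimal sketch, not from the thread; the class name, output file name, and ~1 GB target are arbitrary choices for illustration. It writes random ASCII text to a local file, which can then be copied into HDFS.]

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.util.Random;

    public class GenerateTestData {
        public static void main(String[] args) throws IOException {
            // Target roughly 1 GB so the file spans many 64 MB blocks.
            long targetBytes = 1L << 30;
            Random rnd = new Random(42); // fixed seed for reproducibility
            try (BufferedWriter out =
                    new BufferedWriter(new FileWriter("testdata.txt"))) {
                long written = 0;
                while (written < targetBytes) {
                    StringBuilder line = new StringBuilder();
                    for (int i = 0; i < 10; i++) { // 10 random "words" per line
                        int len = 3 + rnd.nextInt(8);
                        for (int j = 0; j < len; j++) {
                            line.append((char) ('a' + rnd.nextInt(26)));
                        }
                        line.append(' ');
                    }
                    line.append('\n');
                    out.write(line.toString());
                    written += line.length(); // ASCII, so chars == bytes
                }
            }
        }
    }

Once generated, the file can be copied into HDFS with something like hadoop fs -put testdata.txt input/ (the target directory is a placeholder). Hadoop's bundled examples jar also ships data generators such as teragen and randomtextwriter, which produce large inputs directly in HDFS.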
On Tue, Apr 19, 2011 at 10:55 AM, Mathias Herberts
<mathias.herbe...@gmail.com> wrote:
> Hi Praveenesh,
>
> On Tue, Apr 19, 2011 at 07:06, praveenesh kumar <praveen...@gmail.com> wrote:
> > The inputs were 3 plain text files.
> >
> > 1 file was around 665 KB and the other 2 files were around 1.5 MB each.
>
> That's not Hadoop's sweet spot: with the default block size (64 MB),
> none of those files will be split, meaning each will be processed by a
> single mapper, so adding machines won't improve performance.
>
> Try with files that span several blocks; you should see a difference
> in performance when adding machines.
>
> Also, the penalty of starting mappers and so on won't be amortized
> unless your input files are suitably large.
>
> Mathias.
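[Editor's note: to make Mathias's point concrete, for splittable input such as plain text the number of input splits, and hence map tasks, is roughly ceil(fileSize / blockSize). A quick back-of-the-envelope sketch follows; the three small file sizes are taken from the thread, while the 1 GB entry is an added illustration.]

    public class SplitMath {
        public static void main(String[] args) {
            long blockSize = 64L * 1024 * 1024; // default block size cited in the thread
            // 665 KB, 1.5 MB, 1.5 MB (from the thread), plus a hypothetical 1 GB file
            long[] sizes = { 665L * 1024, 1536L * 1024, 1536L * 1024, 1L << 30 };
            for (long size : sizes) {
                long splits = (size + blockSize - 1) / blockSize; // ceil(size / blockSize)
                System.out.println(size + " bytes -> " + splits + " split(s)");
            }
        }
    }

Each of the three small files fits within a single block and gets exactly one mapper, while a 1 GB file spans 16 blocks and could be processed by 16 mappers in parallel.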