Re: Query regarding HBase Mapreduce

2012-10-25 Thread Bertrand Dechoux
Hi Amit, You might want to add details to your question. 1) A lot of small files is a known 'problem' for Hadoop MapReduce, and you will find information on it by searching: http://blog.cloudera.com/blog/2009/02/the-small-files-problem/ I assume you have a more specific issue; what is it? 2) I am

Re: Query regarding HBase Mapreduce

2012-10-25 Thread Nick maillard
Hi Amit, I am just starting with HBase and MR, so my opinion is based more on what I have read than on real-world experience. However, the documentation says Hadoop deals better with a set of large files than with a lot of small ones. Regards. amit bohra bohra.a@... writes:

Re: Query regarding HBase Mapreduce

2012-10-25 Thread lohit
When you say small files, do you mean that they are stored within HBase columns? If so, you need not worry, as HBase will eventually write bigger HFiles on disk (or HDFS). If you are storing a lot of small files on HDFS itself, then you will have scalability problems, as a single NameNode cannot
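
To put a rough number on the NameNode concern raised in this thread, here is a minimal back-of-the-envelope sketch. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (file, directory, or block), as described in the Cloudera post linked earlier in the thread. The function name and the example file counts are hypothetical, and real usage depends on the Hadoop version and configuration.

```python
# Rule-of-thumb assumption (not a measured value): each file, directory,
# and block costs roughly 150 bytes of NameNode heap.
BYTES_PER_OBJECT = 150


def estimate_namenode_heap_bytes(num_files, blocks_per_file=1, num_dirs=0):
    """Estimate NameNode heap needed to track the given namespace.

    Hypothetical helper for illustration only: counts one object per file,
    one per block, and one per directory.
    """
    objects = num_files + num_files * blocks_per_file + num_dirs
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    # Ten million small files, each fitting in a single block.
    small = estimate_namenode_heap_bytes(10_000_000)
    # Hypothetical alternative: the same data packed into 100,000 larger
    # files of 10 blocks each.
    packed = estimate_namenode_heap_bytes(100_000, blocks_per_file=10)
    print(f"small files : {small / 1e9:.2f} GB of NameNode heap")
    print(f"packed files: {packed / 1e9:.3f} GB of NameNode heap")
```

Under these assumptions the estimate grows with the number of files rather than with the volume of data, which is why packing many small files into fewer large ones (or storing them as HBase cells) eases the pressure on a single NameNode.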