Hello Masoud,

You can use the Bulk Load feature. You might find it more efficient than the normal client APIs or TableOutputFormat.

The bulk load feature uses a MapReduce job to write table data in HBase's internal data format (HFiles) and then directly loads the generated StoreFiles into a running cluster. Using bulk load consumes less CPU and network than simply going through the HBase API. For detailed info you can go here: http://hbase.apache.org/book/arch.bulk.load.html
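In case it helps, here is a minimal sketch of what such a job could look like. The CSV input layout, the column family "cf", the qualifier "col", and the command-line arguments are just assumptions for illustration; adapt them to your schema:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadDriver {

  // Hypothetical input: text lines of the form "rowkey,value".
  static class BulkLoadMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

    @Override
    protected void map(LongWritable key, Text line, Context context)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split(",", 2);
      byte[] rowKey = Bytes.toBytes(fields[0]);
      Put put = new Put(rowKey);
      // "cf" and "col" are placeholder family/qualifier names.
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(fields[1]));
      context.write(new ImmutableBytesWritable(rowKey), put);
    }
  }

  public static void main(String[] args) throws Exception {
    // args[0] = input dir, args[1] = HFile output dir, args[2] = table name
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk-load");
    job.setJarByClass(BulkLoadDriver.class);

    job.setMapperClass(BulkLoadMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    job.setInputFormatClass(TextInputFormat.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // Configures the output format, partitioner, and sort reducer
    // based on the table's current region boundaries.
    HTable table = new HTable(conf, args[2]);
    HFileOutputFormat.configureIncrementalLoad(job, table);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Once the job finishes, you can load the generated HFiles into the table with the completebulkload tool, something like:

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <output-path> <table-name>

One note on your point about skipping the Reducer: configureIncrementalLoad() wires in a reducer (PutSortReducer) and a TotalOrderPartitioner behind the scenes, since the HFiles must be totally sorted by row key. You don't have to write the reducer yourself, but the job is not strictly map-only.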
Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Mon, Feb 18, 2013 at 5:00 PM, Masoud <mas...@agape.hanyang.ac.kr> wrote:

> Dear All,
>
> We are running an experiment for a scientific paper and must insert data
> into our database for later analysis: almost 300 tables, each with
> 2,000,000 records.
> As you know, this takes a lot of time on a single machine, so we are
> going to use our Hadoop cluster (32 machines) and divide the 300
> insertion tasks among them.
> I need some hints to make this go faster:
> 1- As far as I know, we don't need a Reducer; a Mapper alone is enough.
> 2- So we just need to implement a Mapper class with the needed code.
>
> Please let me know if there is any point I should consider.
>
> Best Regards,
> Masoud