What database is this? Was HBase mentioned?

On Monday, February 18, 2013, Mohammad Tariq wrote:
> Hello Masoud,
>
> You can use the Bulk Load feature. You might find it more
> efficient than the normal client APIs or using the TableOutputFormat.
>
> The bulk load feature uses a MapReduce job to output table data
> in HBase's internal data format, and then directly loads the
> generated StoreFiles into a running cluster. Using bulk load will use
> less CPU and network resources than simply using the HBase API.
>
> For detailed info you can go here:
> http://hbase.apache.org/book/arch.bulk.load.html
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Mon, Feb 18, 2013 at 5:00 PM, Masoud <mas...@agape.hanyang.ac.kr> wrote:
>
>> Dear All,
>>
>> We are going to run the experiments for a scientific paper,
>> so we must insert data into our database for later analysis.
>> There are almost 300 tables, each with 2,000,000 records.
>> As you know, this takes a lot of time on a single machine,
>> so we are going to use our Hadoop cluster (32 machines) and divide
>> the 300 insertion tasks among them.
>> I need some hints to make this faster:
>> 1- As far as I know we don't need a Reducer; a Mapper alone is enough.
>> 2- So we only need to implement the Mapper class with the needed code.
>>
>> Please let me know if there is anything else to consider.
>>
>> Best Regards,
>> Masoud
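For reference, the bulk load workflow Tariq describes can be sketched with the tools that ship with HBase. This is a minimal sketch, not a tested command set for your cluster: the table name (`mytable`), column family (`cf`), column names, and HDFS paths below are all placeholders you would replace with your own. It assumes the source data can be expressed as TSV; for other formats you would write your own MapReduce job using HFileOutputFormat instead of ImportTsv.

```shell
# Step 1 (sketch): generate HFiles in HBase's internal StoreFile format.
# ImportTsv ships with HBase; with -Dimporttsv.bulk.output it writes HFiles
# to HDFS instead of pushing Puts through the client API.
# All paths, the table name, and the column spec are hypothetical placeholders.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  -Dimporttsv.bulk.output=hdfs:///user/masoud/hfiles/mytable \
  mytable hdfs:///user/masoud/input/mytable.tsv

# Step 2 (sketch): hand the generated StoreFiles directly to the running
# cluster. This bypasses the normal write path, which is why bulk load uses
# less CPU and network than the plain HBase API.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  hdfs:///user/masoud/hfiles/mytable mytable
```

One caveat on the map-only plan in the original question: when a job writes HFiles for bulk load, HBase configures a reduce phase behind the scenes to sort and partition the output by region, so the "Mapper only" assumption holds for direct Puts via TableOutputFormat, but not for the bulk-output path.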