RE: Database insertion by HAdoop

2013-02-20 Thread Guillaume Polaert
From: Masoud [mailto:mas...@agape.hanyang.ac.kr] Sent: Monday, 18 February 2013 12:20 To: common-user@hadoop.apache.org Subject: Database insertion by HAdoop Dear All, We are going to run an experiment for a scientific paper. We must insert data into our database for later consideration

Re: Database insertion by HAdoop

2013-02-19 Thread Mohammad Tariq
Hello Masoud, So you want to pull your data from SQL server into your Hadoop cluster first and then do the processing. Please correct me if I am wrong. You can do that using Sqoop, as mentioned by Hemanth. BTW, what kind of processing exactly are you planning to do on your data?

Re: Database insertion by HAdoop

2013-02-19 Thread Masoud
Dear Tariq, No, exactly the opposite: we compute the similarity between documents and insert the results into the database, almost 2,000,000 records in every table. Best Regards On 02/19/2013 06:41 PM, Mohammad Tariq wrote: Hello Masoud, So you want to pull your data from SQL server

Re: Database insertion by HAdoop

2013-02-19 Thread Hemanth Yamijala
Sqoop can be used to export as well. Thanks Hemanth On Tuesday, February 19, 2013, Masoud wrote: Dear Tariq, No, exactly the opposite: we compute the similarity between documents and insert the results into the database, almost 2,000,000 records in every table. Best Regards On
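
For readers following the thread, here is a minimal sketch of what an export in that direction (HDFS to SQL Server) could look like. It only assembles a `sqoop export` command line for one table; it does not run anything. The host, database, table, user, and HDFS paths are hypothetical placeholders, not values from this thread, and the sketch assumes the job output is comma-delimited text and that the SQL Server JDBC driver is available to Sqoop.

```python
import shlex


def build_sqoop_export_cmd(host, database, table, export_dir, username,
                           password_file, num_mappers=4, field_sep=","):
    """Return the argument list for one `sqoop export` run (one table).

    All argument values are supplied by the caller; nothing here is taken
    from the thread. The target table must already exist in the database.
    """
    # SQL Server JDBC URL; 1433 is the default SQL Server port (assumed here).
    jdbc_url = f"jdbc:sqlserver://{host}:1433;databaseName={database}"
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # file holding the password
        "--table", table,                   # destination table in the database
        "--export-dir", export_dir,         # HDFS directory with the job output
        "--input-fields-terminated-by", field_sep,
        "--num-mappers", str(num_mappers),  # number of parallel export tasks
        "--batch",                          # send rows as batched statements
    ]


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    cmd = build_sqoop_export_cmd(
        host="dbhost.example.org",
        database="papers",
        table="similarity_001",
        export_dir="/user/hadoop/similarity_001",
        username="loader",
        password_file="/user/hadoop/.sqlserver.password",
    )
    print(shlex.join(cmd))
```

With many tables, the same helper could be called once per table. The number of mappers is worth keeping modest, since every mapper opens its own connection to the one database server.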

Database insertion by HAdoop

2013-02-18 Thread Masoud
Dear All, We are going to run an experiment for a scientific paper. We must insert data into our database for later consideration: almost 300 tables, each with 2,000,000 records. As you know, it takes a lot of time to do this on a single machine, so we are going to use our Hadoop cluster (32

Re: Database insertion by HAdoop

2013-02-18 Thread Mohammad Tariq
Hello Masoud, You can use the Bulk Load feature. You might find it more efficient than normal client APIs or using the TableOutputFormat. The bulk load feature uses a MapReduce job to output table data in HBase's internal data format, and then directly loads the generated StoreFiles

Re: Database insertion by HAdoop

2013-02-18 Thread Hemanth Yamijala
What database is this? Was HBase mentioned? On Monday, February 18, 2013, Mohammad Tariq wrote: Hello Masoud, You can use the Bulk Load feature. You might find it more efficient than normal client APIs or using the TableOutputFormat. The bulk load feature uses a MapReduce job

Re: Database insertion by HAdoop

2013-02-18 Thread Michael Segel
Nope, HBase wasn't mentioned. The OP could be talking about using external tables and Hive. The OP could still be stuck in the RDBMS world and hasn't flattened his data yet. 2 million records? Kinda small, dontcha think? Not Enough Information ... On Feb 18, 2013, at 8:58 AM, Hemanth

Re: Database insertion by HAdoop

2013-02-18 Thread Masoud
Hello Tariq, Our database is SQL Server 2008, and we don't need to develop a professional app; we just need to develop it fast and get our experiment results soon. Thanks On 02/18/2013 11:58 PM, Hemanth Yamijala wrote: What database is this ? Was hbase mentioned ? On Monday, February 18,