Edmon,
Are you targeting a parallel database (Teradata, Netezza, ...)? I believe that
if you use Sqoop (with JDBC) for loading, you cannot achieve parallelism,
because the table deadlocks when you specify more mappers. But you can use
Sqoop + a parallel database connector (you can find them on the Cloudera site)
to achieve the native
Agree. Apache Sqoop is what you're looking for:
http://incubator.apache.org/sqoop/
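For the CSV-to-database case asked about in this thread, a Sqoop export run might be assembled roughly as in the sketch below. This is only an illustration under stated assumptions: the JDBC URL, table name, and HDFS directory are hypothetical placeholders, and the helper name build_sqoop_export is made up for this example. The flags used (--connect, --table, --export-dir, --input-fields-terminated-by, --num-mappers) are standard `sqoop export` options. The function only builds the argument list and does not run anything.

```python
import shlex


def build_sqoop_export(jdbc_url, table, export_dir, num_mappers=1, delimiter=","):
    """Return the argv list for a `sqoop export` run (nothing is executed).

    All argument values are supplied by the caller; the ones used in the
    demo at the bottom are hypothetical placeholders.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "export",
        "--connect", jdbc_url,                       # JDBC URL of the target database
        "--table", table,                            # existing target table
        "--export-dir", export_dir,                  # HDFS directory holding the CSV files
        "--input-fields-terminated-by", delimiter,   # field separator in the input files
        "--num-mappers", str(num_mappers),           # number of parallel map tasks
    ]


if __name__ == "__main__":
    argv = build_sqoop_export(
        "jdbc:postgresql://db.example.com/warehouse",  # hypothetical URL
        "csv_rows",                                    # hypothetical table
        "/user/edmon/csv_input",                       # hypothetical HDFS path
        num_mappers=4,
    )
    # Print a shell-safe command line for inspection.
    print(" ".join(shlex.quote(a) for a in argv))
```

Given the locking concern raised above for plain JDBC loads, it may be safer to start with a single mapper and raise the count only after checking how the target database behaves under concurrent inserts.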
On Tue, Jan 24, 2012 at 10:51 PM, Prashant Kommireddi wrote:
I am assuming you want to move data between Hadoop and a database.
Please take a look at Sqoop.
Thanks,
Prashant
On Jan 24, 2012, at 9:19 AM, Edmon Begoli wrote:
I am looking to use Hadoop for parallel loading of a CSV file into a
non-Hadoop, parallel database.
Is there an existing utility that allows one to pick entries
row by row, synchronized and in parallel, and load them into a database?
Thank you in advance,
Edmon