RE: ETL using Hadoop
The closest thing I can think of to a .NET API would be to set up Hive external tables and use a vendor’s (Cloudera, et al.) ODBC driver. You could connect from your .NET app to the Hive tables over ODBC, and SELECT/INSERT to read/write. If you’re desperate. ☺

As far as ETL, I’d recommend you give SyncSort DMX-h a try. It’s a great little ETL tool that can translate its ETL tasks to MapReduce jobs. I’ve been using it for almost a year now, and it’s fantastic. Blazing fast, and with a trial download. (Disclaimer: I’m not affiliated with SyncSort, other than being a happy customer.)

Andrew

Andrew Machtolff / Senior Consultant

In reply to Azuryy Yu (Wednesday, October 08, 2014 1:41 AM).
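To make the Hive-over-ODBC suggestion concrete, here is a minimal sketch in Python rather than .NET; the same statements could be sent from any ODBC client. Everything specific in it is an assumption made for illustration: the table names `staged_rows` and `cleaned_rows`, the columns, the HDFS path, and the DSN name are hypothetical, and `cleaned_rows` is assumed to exist already. The helper functions accept any DB-API style connection, so nothing here depends on a particular vendor driver.

```python
# Hypothetical names throughout: tables, columns, HDFS path and DSN are placeholders.

# An external table maps a Hive schema onto files that already sit in HDFS.
CREATE_EXTERNAL = """
CREATE EXTERNAL TABLE IF NOT EXISTS staged_rows (
    id INT,
    code STRING,
    description STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '/data/etl/staged_rows'
"""

# Read path: a plain SELECT against the external table.
READ_SQL = "SELECT id, code, description FROM staged_rows LIMIT 10"

# Write path: INSERT ... SELECT into a second table (assumed to exist),
# applying a substring and a replace along the way.
WRITE_SQL = (
    "INSERT INTO TABLE cleaned_rows "
    "SELECT id, substr(code, 1, 3), regexp_replace(description, '&', 'and') "
    "FROM staged_rows"
)


def run_statements(conn, statements):
    """Execute each statement in order on a DB-API style connection."""
    cursor = conn.cursor()
    for statement in statements:
        cursor.execute(statement)


def read_rows(conn):
    """Run the read query and return all fetched rows."""
    cursor = conn.cursor()
    cursor.execute(READ_SQL)
    return cursor.fetchall()


# With a Hive ODBC driver and DSN configured, usage might look like:
#   import pyodbc
#   conn = pyodbc.connect("DSN=HiveDSN", autocommit=True)  # DSN name is hypothetical
#   run_statements(conn, [CREATE_EXTERNAL, WRITE_SQL])
#   print(read_rows(conn))
```

The connection is opened with `autocommit=True` in the usage comment because Hive ODBC drivers generally do not support transactions; check the vendor driver's documentation before relying on that.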
Re: ETL using Hadoop
The fastest way to do ETL on Hadoop is via HBase with the Phoenix JDBC driver (http://phoenix.apache.org/). As for ODBC mapping, you could use Thrift or one of the ODBC-JDBC bridges: http://stackoverflow.com/questions/5352956/odbc-jdbc-bridge-that-maps-its-own-calls-to-jdbc-driver

In reply to Andrew Machtolff (Thu, Oct 9, 2014 at 8:16 AM).
Re: Fwd: ETL using Hadoop
Hadoop is in effect a very high-throughput ETL engine, with high latency as the tradeoff. Other solutions allow different tradeoffs, some made in the map phase and some in the reduce phase (e.g. stream or columnar stores).

In reply to Dattatrya Moin (Oct 7, 2014 11:32 PM).
Fwd: ETL using Hadoop
Hi,

We have our own ETL, but we are planning to use Hadoop for data processing, as it gives better scalability and performance. As I am new to Hadoop, kindly guide me on how to get started. Can we replace our ETL with Hadoop? And is there any API to connect to Hadoop from .NET?

Thanks,
Dattatrya Moin
Re: ETL using Hadoop
Hi Moin,

Yes, you can replace your ETL with Hadoop, but it would be a big change: data collection, pre-processing, and the ETL tasks would all have to be rewritten. I don't think there is a .NET API in Hadoop.

In reply to Dattatrya Moin (Wed, Oct 8, 2014 at 2:31 PM).
Re: Fwd: ETL using Hadoop
Hi,

We extract data from an Oracle database, transform it (e.g. Substring, Replace), and load the modified data back into the Oracle database.

On Wed, Oct 8, 2014 at 12:13 PM, snv wrote (in reply to Dattatrya Moin's message of Wednesday, October 8, 2014, 3:01 PM):
Where do you extract from, and where do you load to? What do you do in the E and T steps?
Re: Fwd: ETL using Hadoop
Use a MapReduce job to run the ETL on the data staged temporarily in HDFS.

Best wishes!
Blog: http://snv.iteye.com/

In reply to Dattatrya Moin (Wednesday, October 8, 2014, 3:44 PM).
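As a rough illustration of the MapReduce suggestion, the Substring and Replace transformations mentioned earlier in the thread could be written as a map-only Hadoop Streaming mapper, which reads rows on standard input and writes transformed rows to standard output. This is a sketch under stated assumptions, not anyone's actual job: the three-column tab-separated layout (`id`, `code`, `description`) and the two specific rules are hypothetical. Moving the data between Oracle and HDFS (for example with a tool such as Apache Sqoop) is outside the sketch.

```python
import sys

FIELD_SEP = "\t"  # assumption: rows were exported from Oracle as tab-separated text


def transform_line(line):
    """Apply the two example transformations to one exported row.

    Assumed (hypothetical) layout: id, code, description.
    Returns the transformed row, or None if the row is malformed.
    """
    fields = line.rstrip("\n").split(FIELD_SEP)
    if len(fields) != 3:
        return None  # skip malformed rows rather than failing the whole task
    row_id, code, description = fields
    code = code[:3]                                # Substring: keep the first three characters
    description = description.replace("&", "and")  # Replace: swap one token for another
    return FIELD_SEP.join((row_id, code, description))


def run(lines, out):
    """Map-only job body: write one transformed row per valid input row."""
    for line in lines:
        result = transform_line(line)
        if result is not None:
            out.write(result + "\n")


if __name__ == "__main__":
    # Under Hadoop Streaming the mapper would call run(sys.stdin, sys.stdout).
    # A two-row in-memory sample is used here so the sketch runs on its own.
    sample = ["1\tABCDEF\tnuts & bolts\n", "2\tXY\tplain\n"]
    run(sample, sys.stdout)
```

Because each row is transformed independently, no reduce phase is needed, and the job scales by adding mappers over the staged files.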
Re: Fwd: ETL using Hadoop
Why not process the data on the fly? I guess reading and writing data in HDFS would incur high latency.

In reply to snv (2014-10-08 15:20 GMT+08:00).
Re: Fwd: ETL using Hadoop
If the processing is complex, use scripts, such as a BRMS (business rules management system), to handle the business logic.

Best wishes!
Blog: http://snv.iteye.com/

In reply to Blade Liu (Wednesday, October 8, 2014, 4:54 PM).
RE: Fwd: ETL using Hadoop
For Oracle, you can also use GoldenGate for real-time replication and ODI (Oracle Data Integrator) to simplify the ETL workflow.

Regards,
Rahul Chaudhari

In reply to Dattatrya Moin (Wednesday, October 08, 2014 12:44 PM).