RE: ETL using Hadoop

2014-10-09 Thread Andrew Machtolff
The closest thing I can think of to a .NET API would be to set up Hive external 
tables, and use a vendor’s (Cloudera, et al.) ODBC driver. You could connect 
from your .NET app using ODBC to the Hive tables, and SELECT/INSERT to 
read/write. If you’re desperate. ☺

As far as ETL, I’d recommend you give SyncSort DMX-h a try. It’s a great little 
ETL tool that can translate its ETL tasks to MapReduce jobs. I’ve been using it 
for almost a year now, and it’s fantastic. Blazing fast, and with a trial 
download.
(Disclaimer: I’m not affiliated with SyncSort, other than being a happy 
customer)

Andrew


Andrew Machtolff / Senior Consultant
205.259.2558 o
205.447.0956 c
205.259.2301 f
www.askcts.com
amachto...@askcts.com





Re: ETL using Hadoop

2014-10-09 Thread Alex Kamil
The fastest way to do ETL on Hadoop is via HBase plus the Phoenix JDBC driver
(http://phoenix.apache.org/).
As for ODBC mapping, you could use Thrift or one of the ODBC-JDBC bridges:
http://stackoverflow.com/questions/5352956/odbc-jdbc-bridge-that-maps-its-own-calls-to-jdbc-driver








Re: Fwd: ETL using Hadoop

2014-10-09 Thread daemeon reiydelle
Hadoop is in effect a massively fast ETL, with high latency as the tradeoff.

Other solutions allow different tradeoffs, and some of those occur in the map
phase, some in the reduce phase (e.g. stream or columnar stores).




Fwd: ETL using Hadoop

2014-10-08 Thread Dattatrya Moin
Hi,

We have our own ETL, but we are planning to use Hadoop for data processing
because it gives better scalability and performance. As I am new to Hadoop,
kindly guide me on how to get started. Can we replace our ETL with Hadoop?
And is there any API to connect to Hadoop from .NET?


Thanks,
Dattatrya Moin


Re: ETL using Hadoop

2014-10-08 Thread Azuryy Yu
Hi Moin,
Yes, you can replace your ETL with Hadoop, but it would be a big change:
data collection, pre-processing, rewriting the ETL tasks, and so on.

I don't think there is a .NET API in Hadoop.





Re: Fwd: ETL using Hadoop

2014-10-08 Thread Dattatrya Moin
Hi,

We are extracting data from an Oracle database, processing that data
(transformation, i.e. Substring and Replace), and loading the modified data
back into the Oracle database.
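
The Substring and Replace steps described here fit a map-only job, since each row can be transformed on its own. Below is a minimal sketch of such a mapper for Hadoop Streaming, written in Python. The tab-separated row layout, the column index, and the substring and replace parameters are all assumptions made for illustration; none of them come from the thread.

```python
import sys

# Hypothetical layout: tab-separated rows; transform the column at this index.
TARGET_COLUMN = 1


def transform_value(value, start=0, length=10, old="-", new=""):
    """Take a substring, then replace every occurrence of `old` with `new`."""
    return value[start:start + length].replace(old, new)


def transform_line(line, column=TARGET_COLUMN):
    """Transform one tab-separated record; rows without that column pass through."""
    fields = line.rstrip("\n").split("\t")
    if column < len(fields):
        fields[column] = transform_value(fields[column])
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Hadoop Streaming feeds input records on stdin and collects stdout."""
    for line in stdin:
        stdout.write(transform_line(line) + "\n")
```

Saved as a script that ends with a call to `main()`, this could be handed to Hadoop Streaming as the mapper. The same functions can be tried locally by piping a sample export file through the script.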

On Wed, Oct 8, 2014 at 12:13 PM, snv smallnetvisi...@foxmail.com wrote:

 Where do you extract from, and where do you load to? What do you do in E and T?





Re: Fwd: ETL using Hadoop

2014-10-08 Thread snv
Use an MR job to ETL the temporary data stored in HDFS.
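
One way to picture this suggestion is as three stages: copy the Oracle rows into a temporary HDFS directory, run a map-only job over them, then push the result back. The sketch below only assembles the command lines. It assumes Apache Sqoop for the extract and load stages and Hadoop Streaming for the transform, neither of which is named in the thread, and every path, table name, and script name is hypothetical.

```python
import os


def build_pipeline(jdbc_url, user, src_table, dst_table,
                   staging_dir, output_dir, streaming_jar, mapper_script):
    """Return the extract, transform, and load commands as argv lists."""
    extract = [
        "sqoop", "import",
        "--connect", jdbc_url, "--username", user, "-P",  # -P prompts for the password
        "--table", src_table,
        "--target-dir", staging_dir,           # temporary data store in HDFS
        "--fields-terminated-by", "\\t",       # Sqoop interprets the \t escape itself
    ]
    transform = [
        "hadoop", "jar", streaming_jar,
        "-D", "mapreduce.job.reduces=0",       # map-only job, no reduce phase
        "-files", mapper_script,               # ship the script to the task nodes
        "-input", staging_dir,
        "-output", output_dir,
        "-mapper", os.path.basename(mapper_script),
    ]
    load = [
        "sqoop", "export",
        "--connect", jdbc_url, "--username", user, "-P",
        "--table", dst_table,
        "--export-dir", output_dir,
        "--input-fields-terminated-by", "\\t",
    ]
    return [extract, transform, load]
```

Each list could be passed to `subprocess.run(cmd, check=True)` in order, so that a failed stage stops the pipeline before the next one starts.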


--
Best wishes!
Blog: http://snv.iteye.com/


 

 





Re: Fwd: ETL using Hadoop

2014-10-08 Thread Blade Liu
Why not process the data on the fly? I guess reading and writing data in HDFS
would incur high latency.
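
One possible reading of "on the fly" is a single pass that transforms rows as they are read and writes them back in batches, with no intermediate copy in HDFS. The sketch below is in Python; the function and parameter names are hypothetical. In practice `rows` might be a database cursor and `load_batch` a wrapper around a DB-API `executemany` call.

```python
from itertools import islice


def stream_etl(rows, transform, load_batch, batch_size=1000):
    """Transform rows as they arrive and load them in batches, with no staging step."""
    transformed = (transform(row) for row in rows)  # lazy: nothing is buffered up front
    total = 0
    while True:
        batch = list(islice(transformed, batch_size))
        if not batch:
            return total
        load_batch(batch)
        total += len(batch)
```

This avoids the HDFS round trip, but it runs in one process, so it gives up the parallelism that a MapReduce job would provide.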






Re: Fwd: ETL using Hadoop

2014-10-08 Thread snv
If it is complex, use a script-based tool such as a BRMS (business rules
management system) to handle the business logic.



RE: Fwd: ETL using Hadoop

2014-10-08 Thread Chaudhari, Rahul A
For Oracle, you can also use GoldenGate for real-time replication and ODI
(Oracle Data Integrator) to simplify the ETL workflow.

Regards,
Rahul Chaudhari
