Smee,

The Cookbook talks about updating data in the RDBMS when you are exporting
data with Sqoop.
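
For reference, a minimal sketch of what that export looks like (the
connection string, table name, and key column below are made-up
placeholders, not from the Cookbook):

    sqoop export \
      --connect jdbc:mysql://db.example.com/sales \
      --username sqoop_user \
      --table orders \
      --export-dir /user/hive/warehouse/orders \
      --update-key order_id \
      --update-mode allowinsert

With --update-key, Sqoop generates UPDATE statements keyed on that column
instead of plain INSERTs; --update-mode allowinsert additionally inserts
rows that match nothing (an upsert), though not every connector supports
that mode.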

If you want to update data in Hive, you need to replace the data. I have
implemented SCD Type 2 on Hive tables recently.

You can refer to my presentation at
http://files.meetup.com/1624468/Getting%20Jiggy%20with%20Change%20Data%20Capture%20and%20Slowly%20Changing%20Dimen.pdf
which shows a simple example of how to update data on Hive.
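
The core of the pattern is a full rewrite of the table, since Hive today
has no row-level UPDATE. A rough sketch of an SCD Type 2 load (the table
and column names here are invented for illustration, not taken from the
presentation; ${hivevar:load_date} would be passed in via
hive --hivevar load_date=...):

    -- Assumed layout: customer_dim is the dimension with SCD Type 2
    -- columns (effective_date, end_date, is_current); customer_delta
    -- holds today's changed rows imported from the RDBMS via Sqoop.
    INSERT OVERWRITE TABLE customer_dim
    SELECT * FROM (
      -- carry existing rows forward, closing out the current version
      -- of any customer that shows up in today's delta
      SELECT
        d.customer_id,
        d.name,
        d.effective_date,
        CASE WHEN c.customer_id IS NOT NULL AND d.is_current = 1
             THEN '${hivevar:load_date}' ELSE d.end_date END AS end_date,
        CASE WHEN c.customer_id IS NOT NULL AND d.is_current = 1
             THEN 0 ELSE d.is_current END AS is_current
      FROM customer_dim d
      LEFT OUTER JOIN customer_delta c
        ON d.customer_id = c.customer_id
      UNION ALL
      -- append each delta row as the new current version
      SELECT
        c.customer_id,
        c.name,
        '${hivevar:load_date}' AS effective_date,
        CAST(NULL AS STRING) AS end_date,
        1 AS is_current
      FROM customer_delta c
    ) merged;

The key point is that nothing is updated in place: the whole table (or
just the affected partitions, with dynamic partitioning) is rewritten on
each load.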

Regards,
Venkat

From: Anthony Smee [mailto:[email protected]]
Sent: Wednesday, August 27, 2014 8:55 AM
To: [email protected]
Subject: Updating data in HDFS

Hi there

I am new to Sqoop, have recently been reading the Apache Sqoop Cookbook,
and wanted to ask a question. I noticed that section 5 of the cookbook
details how Sqoop can update data in an existing dataset in the RDBMS.
Sorry, just to be clear - I am aware of the hive-import switch, but I have
tables in an RDBMS whose data is updated going back numerous days over
months, and my HDFS data is partitioned by numerous columns, meaning lots
of partitions need to have their files merged.

My question: have you ever considered the same requirement, but for
updating data in Hive tables, i.e. the files in HDFS?

Just wondering if this is on the roadmap, and if not, why not?

Thanks


