Smee, the Cookbook talks about updating data in an RDBMS when you are EXPORTing data with Sqoop.
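For reference, the update-on-export call looks roughly like this; the connection string, credentials, table, key column, and paths are made up for illustration:

# Rows whose customer_id already exists in the target table are UPDATEd;
# with --update-mode allowinsert, connectors that support it INSERT new
# keys instead (an upsert).
sqoop export \
    --connect jdbc:mysql://db.example.com/sales \
    --username sqoop_user \
    --password-file /user/sqoop/sales.password \
    --table customers \
    --export-dir /user/hive/warehouse/customers \
    --update-key customer_id \
    --update-mode allowinsert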
If you want to update data on Hive, you have to replace the data instead. I have implemented SCD Type 2 on Hive tables recently; you can refer to my presentation at http://files.meetup.com/1624468/Getting%20Jiggy%20with%20Change%20Data%20Capture%20and%20Slowly%20Changing%20Dimen.pdf, which shows a simple example of how to update data on Hive.
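The core idea is that Hive has no row-level UPDATE, so the job rewrites the whole table (or the affected partitions) with INSERT OVERWRITE. A rough sketch of the Type 2 rewrite follows; the table and column names are illustrative (dim_customer is the dimension, customer_staging holds the latest extract, and load_date is passed in with --hiveconf), and a real job would also compare attribute values so that unchanged rows are not re-versioned:

-- Close out the current version of every customer that appears in the
-- staging extract, carry all other rows through unchanged, and append
-- the incoming rows as the new current versions. The UNION ALL sits in
-- a subquery for compatibility with older Hive releases.
INSERT OVERWRITE TABLE dim_customer
SELECT * FROM (
    SELECT d.customer_id, d.name, d.address,
           d.start_date,
           CASE WHEN s.customer_id IS NOT NULL AND d.is_current = 1
                THEN '${hiveconf:load_date}' ELSE d.end_date END AS end_date,
           CASE WHEN s.customer_id IS NOT NULL AND d.is_current = 1
                THEN 0 ELSE d.is_current END AS is_current
    FROM dim_customer d
    LEFT OUTER JOIN customer_staging s
      ON d.customer_id = s.customer_id
    UNION ALL
    SELECT s.customer_id, s.name, s.address,
           '${hiveconf:load_date}' AS start_date,
           CAST(NULL AS STRING)   AS end_date,
           1                      AS is_current
    FROM customer_staging s
) merged;

Hive materializes the query result before it replaces the table's files, so reading from dim_customer in the same statement works, though writing to a scratch table and swapping it in is the more cautious pattern.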
Regards,
Venkat

From: Anthony Smee [mailto:[email protected]]
Sent: Wednesday, August 27, 2014 8:55 AM
To: [email protected]
Subject: Updating data in HDFS

Hi there,

I am new to Sqoop and have recently been reading the Apache Sqoop Cookbook, and I wanted to ask a question. I noticed that section 5 of the cookbook details how Sqoop can update data in an existing dataset in the RDBMS.

Sorry, just to be clear: I am aware of the hive-import switch, but I have tables in an RDBMS whose data is updated going back numerous days over months, and my HDFS data is partitioned by numerous columns, meaning lots of partitions need to have their files merged.

My question: have you ever considered the same requirement, but for updating data in Hive tables, i.e. the files in HDFS? Just wondering if this is on the roadmap, and if not, why not?

Thanks