---------- Forwarded message ----------
From: burberry blues <[email protected]>
Date: Fri, Nov 8, 2013 at 12:11 AM
Subject: Extracting Updated records using Sqoop
To: [email protected]

Hi Harsh,

I am trying to extract modified records, in addition to the incremental (newly inserted) records, from an Oracle database into a Hive table using Sqoop. However, I get duplicate entries when I import based on a particular last-value attribute.

Below is my Sqoop command:

    sqoop import --connect jdbc:oracle:thin:xxx:xxx:xxx \
      --username xxx --password xxx \
      --hive-import --table xxx --target-dir xxx --hive-table xxx \
      --incremental append --check-column COLUMN_3 \
      --split-by COLUMN_2 \
      --columns COLUMN_1,COLUMN_2,COLUMN_3 \
      --last-value "2013-11-05 00:00:00"

My output is as follows:

    Column_1      Column_2   Column_3
    new change1   1.0        2013-11-07 11:05:55.0
    change3       3.0        2013-11-07 11:19:25.0
    change1       1.0        2013-11-05 11:15:50.0
    new change1   2.0        2013-11-07 11:18:55.0
    NULL          4.0        2013-11-07 12:13:00.0
    change2       2.0        2013-11-05 11:15:55.0

The highlighted record is getting inserted again instead of updating the existing record.

Is there any command for this?

Thanks,
Burberry
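
To make the duplication concrete, here is a minimal Python sketch of the result I am after. It is not Sqoop code. It assumes that Column_2 is the record key and that Column_3 is the last-modified timestamp; both are my assumptions about the table. The function name `latest_per_key` is hypothetical and only for illustration. With `--incremental append`, rows newer than `--last-value` are added without looking at the key, so an updated record ends up next to its old version. The sketch keeps only the newest row per key:

```python
from datetime import datetime

# Rows as they appear in the Hive table after the append import,
# copied from the output above: (Column_1, Column_2, Column_3).
rows = [
    ("new change1", 1.0, "2013-11-07 11:05:55.0"),
    ("change3", 3.0, "2013-11-07 11:19:25.0"),
    ("change1", 1.0, "2013-11-05 11:15:50.0"),
    ("new change1", 2.0, "2013-11-07 11:18:55.0"),
    (None, 4.0, "2013-11-07 12:13:00.0"),
    ("change2", 2.0, "2013-11-05 11:15:55.0"),
]


def parse_ts(text):
    """Parse a timestamp such as '2013-11-07 11:05:55.0'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def latest_per_key(rows):
    """Keep one row per key, choosing the most recent timestamp.

    Assumption: Column_2 (index 1) is the key and Column_3 (index 2)
    is the last-modified time.
    """
    latest = {}
    for row in rows:
        key = row[1]
        if key not in latest or parse_ts(row[2]) > parse_ts(latest[key][2]):
            latest[key] = row
    # Sort by key so the result is stable and easy to compare.
    return sorted(latest.values(), key=lambda r: r[1])


if __name__ == "__main__":
    for row in latest_per_key(rows):
        print(row)
```

Under those assumptions, the six imported rows reduce to four, one per Column_2 value, and keys 1.0 and 2.0 keep only their 2013-11-07 versions.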
