Naresh AR created SQOOP-3436:
--------------------------------
Summary: sqoop imports data from oracle exadata has duplicates
Key: SQOOP-3436
URL: https://issues.apache.org/jira/browse/SQOOP-3436
Project: Sqoop
Issue Type: Bug
Components: sqoop2-build, sqoop2-jdbc-connector
Affects Versions: 1.4.7
Environment: sqoop1.4.7,hortonworks2.6.3
Reporter: Naresh AR
Hi, I am using Sqoop to import data from Oracle Exadata, and the imported
data contains fully duplicated rows. At present we remove the duplicates with a
DISTINCT query and load the result into another target table. Please suggest a fix.
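The workaround described above might look roughly like the following. This is only a minimal sketch, not the reporter's actual script: the helper name build_dedup_sql and the table names in the usage comment are hypothetical, and it assumes the staging and target Hive tables share the same schema.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): prints a HiveQL statement that
# copies only the distinct rows of a staging table into a target table.
# Assumption: both tables already exist and have identical columns.
build_dedup_sql() {
    staging_table=$1   # table the Sqoop import wrote to (--hive-table)
    target_table=$2    # separate table that receives the de-duplicated rows
    printf 'INSERT OVERWRITE TABLE %s SELECT DISTINCT * FROM %s;\n' \
        "$target_table" "$staging_table"
}

# Usage (placeholder table names); run only where the hive CLI is installed:
#   hive -e "$(build_dedup_sql staging_db.orders_raw clean_db.orders)"
```

SELECT DISTINCT * removes only rows that are identical in every column, which matches the "complete row duplicate" symptom. It costs an extra full pass over the imported data on every load.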
Background on the Oracle tables:
The Oracle tables used for the Sqoop import have no simple primary key; that is,
the tables are SCD type 2 and use complex keys as primary keys, which do not suit
the --split-by option. The tables are also very large (100 GB).
Command used for the Sqoop import from Oracle Exadata:
sqoop import --connect %s@//%s:%s/%s --username %s -password %s --table %s.%s
--fields-terminated-by '%s' --hive-drop-import-delims --hive-import
--hive-overwrite --hive-table %s.%s --null-string '\\\N' --null-non-string
'\\\N' --m %s --fetch-size=2500