I asked the exact same question before and even opened a ticket against CDH. The answer I got back, however, was to use Hive to clean the data first.
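For reference, here is a minimal sketch of what that Hive cleanup step could look like, based on the CTAS + SUBSTR suggestion quoted below. It only builds the HiveQL text; the table and column names (`staging.events_raw`, `staging.events_clean`, `description`) and the helper `build_truncating_ctas` are hypothetical, and the 255 limit is taken from the VARCHAR(255) in the Netezza log message. It does not deal with control characters, and whether the Netezza limit counts characters or bytes would need to be checked separately.

```python
def build_truncating_ctas(source_table, target_table, columns):
    """Build a Hive CTAS statement that truncates over-long string columns.

    columns is a list of (column_name, max_length) pairs. A max_length of
    None means the column is copied through unchanged.
    All names passed in are illustrative; nothing here talks to Hive.
    """
    select_items = []
    for name, max_length in columns:
        if max_length is None:
            select_items.append(f"  `{name}`")
        else:
            # Hive's SUBSTR is 1-based: keep the first max_length characters.
            select_items.append(
                f"  SUBSTR(`{name}`, 1, {int(max_length)}) AS `{name}`"
            )
    select_list = ",\n".join(select_items)
    return (
        f"CREATE TABLE {target_table} AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM {source_table}"
    )


if __name__ == "__main__":
    # Hypothetical tables: truncate `description` to fit VARCHAR(255).
    print(
        build_truncating_ctas(
            "staging.events_raw",
            "staging.events_clean",
            [("id", None), ("description", 255)],
        )
    )
```

The generated statement would be run in Hive first, and the Sqoop export would then point at the cleaned table instead of the raw one.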

From: Munjuluri, Shyam [mailto:[email protected]]
Sent: Monday, February 09, 2015 6:05 PM
To: [email protected]
Subject: RE: any way to truncate data based on column length?

To add to this, the Sqoop export to Netezza also fails if there are control characters in the data. I hope there is a setting in the Sqoop/Netezza driver to ignore control characters!

Has anyone faced this issue and, if so, was there any resolution?

From: Frank Luo [mailto:[email protected]]
Sent: Monday, February 09, 2015 4:38 PM
To: [email protected]
Subject: RE: any way to truncate data based on column length?

If that is the only way, then we will do that. I just hope Sqoop or Netezza is able to do it. After all, it is a pretty standard ETL requirement.

From: Abraham Elmahrek [mailto:[email protected]]
Sent: Monday, February 09, 2015 2:31 PM
To: [email protected]
Subject: Re: any way to truncate data based on column length?

Hey man,

Why not use Hive CTAS (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableAsSelect(CTAS)) to sanitize your data first? You should be able to use CTAS in conjunction with SUBSTR (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF).

-Abe

On Mon, Feb 9, 2015 at 12:00 PM, Frank Luo <[email protected]> wrote:

We are using HDP 2.2 with Sqoop 1.4.5.2.2.0.0-2041. When loading data into Netezza tables, we found that some input data is longer than what is defined in Netezza, and when that happens, the load fails.

I am wondering if there is a way to tell the Sqoop/Netezza driver to truncate the data when it is too large to fit.

Here is the message from the NTZ log:

1: 484(308) [6, VARCHAR(255)] text field too long for column, "xxx"

Thx

Frank
