Sometimes I get the error below. The column that holds the quoted string ('Duplicate order detected. Rejecting current order.') is user-generated free text. Might this be because the column contains a Ctrl-A delimiter in it?
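For context, here is a minimal, self-contained Java sketch (not Sqoop's actual generated class; the three-column layout, column names, and sample record are invented) of what the stack trace below suggests: the generated __loadFromFields() calls next() once per expected column, so a line that splits into fewer fields than expected exhausts the iterator and throws NoSuchElementException.

import java.util.Arrays;
import java.util.Iterator;

// Toy model of the field-by-field parsing that Sqoop's generated
// __loadFromFields() performs: one next() call per expected column.
// The three-column layout and the sample record are invented.
public class DelimiterDemo {

    static void load(String record, String delim) {
        Iterator<String> it = Arrays.asList(record.split(delim, -1)).iterator();
        String id = it.next();        // column 1
        String message = it.next();   // column 2
        String created = it.next();   // column 3
        System.out.println(id + " | " + message + " | " + created);
    }

    public static void main(String[] args) {
        // Record written with Ctrl-A (\u0001) as the field separator.
        String record = "42\u0001Duplicate order detected. Rejecting current order."
                + "\u00012015-05-21 10:00:00";

        load(record, "\u0001"); // OK: splits into exactly 3 fields

        // Parsed with a different delimiter (say the default comma), the whole
        // line becomes a single field and the second next() throws
        // java.util.NoSuchElementException -- the same "Caused by" as the log.
        load(record, ",");
    }
}

Note the asymmetry: an extra Ctrl-A embedded inside the user-generated text would produce too many fields (values shifted into the wrong columns), while the NoSuchElementException below means the parser saw too few. That points at an embedded newline in the data, or at the merge step re-reading files with a different delimiter than the one they were written with, as discussed later in this thread.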
Log Length: 13320

log4j:ERROR Could not find value for key log4j.appender.CLA
log4j:ERROR Could not instantiate appender named "CLA".
log4j:ERROR Could not find value for key log4j.appender.CLA
log4j:ERROR Could not instantiate appender named "CLA".
log4j:ERROR Could not find value for key log4j.appender.CLA
log4j:ERROR Could not instantiate appender named "CLA".
Note: /tmp/sqoop-hadoop/compile/c0d035312d3a623ccfad1ea5d20623ac/order_completed.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Error: java.lang.RuntimeException: Can't parse input data: 'Duplicate order detected. Rejecting current order.'
    at order_completed.__loadFromFields(order_completed.java:989)
    at order_completed.parse(order_completed.java:847)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
Caused by: java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:834)
    at order_completed.__loadFromFields(order_completed.java:944)
    ... 11 more

On Thu, May 21, 2015 at 11:31 PM, Manikandan R <[email protected]> wrote:

> It works perfectly. Thanks.
>
> On Thu, May 21, 2015 at 12:11 AM, Michael Arena <[email protected]> wrote:
>
>> I have found that you must specify these 6 parameters for incremental
>> imports to work:
>>
>> --null-string '\\N'
>> --null-non-string '\\N'
>> --fields-terminated-by '\001'
>> --input-null-string '\\N'
>> --input-null-non-string '\\N'
>> --input-fields-terminated-by '\001'
>>
>> I believe that the first 3 control how output files (created by Sqoop)
>> are delimited and the second 3 control how input files are delimited.
>> Since an incremental import merges existing files, they are treated
>> like input.
>>
>> From: Manikandan R
>> Reply-To: "[email protected]"
>> Date: Wednesday, May 20, 2015 at 1:52 PM
>> To: "[email protected]"
>> Subject: Re: Merge failed - timestamp column with null values
>>
>> Hello Swati,
>>
>> Thanks for your reply.
>>
>> I am not using --class-name in my Sqoop command.
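To see Michael's six options together in one place, a standalone sqoop import using all of them would look roughly like the sketch below. Everything else here is a placeholder (host, database, table, key and check columns, paths, last value); only the six delimiter/null options are the point.

sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username root \
    --password-file /tmp/.password \
    --table order_completed \
    --incremental lastmodified \
    --check-column updated_at \
    --merge-key order_id \
    --last-value '2015-05-20 00:00:00' \
    --target-dir /data/mydb/stg_order_completed \
    --fields-terminated-by '\001' \
    --null-string '\\N' \
    --null-non-string '\\N' \
    --input-fields-terminated-by '\001' \
    --input-null-string '\\N' \
    --input-null-non-string '\\N'

Per Michael's explanation, the first three delimiter/null options control how Sqoop writes files and the three --input-* options control how it reads them back; the merge step re-reads the existing files, so omitting the --input-* options (as in the Oozie action quoted next) leaves the merge parsing with defaults.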
>>
>> Here is my Sqoop action in Oozie:
>>
>> <action name="sqoop-saved-job">
>>     <sqoop xmlns="uri:oozie:sqoop-action:0.2">
>>         <job-tracker>${jobTracker}</job-tracker>
>>         <name-node>${nameNode}</name-node>
>>         <job-xml>/tmp/sqoop-site.xml</job-xml>
>>         <arg>job</arg>
>>         <arg>--create</arg>
>>         <arg>${dbName}-${tableName}-sync-job</arg>
>>         <arg>--</arg>
>>         <arg>import</arg>
>>         <arg>--connect</arg>
>>         <arg>jdbc:mysql://${dbHost}/${dbName}</arg>
>>         <arg>--username</arg>
>>         <arg>root</arg>
>>         <arg>--password-file</arg>
>>         <arg>/tmp/.password</arg>
>>         <arg>--table</arg>
>>         <arg>${tableName}</arg>
>>         <arg>--incremental</arg>
>>         <arg>${incrementalMode}</arg>
>>         <arg>--merge-key</arg>
>>         <arg>${mergeKey}</arg>
>>         <arg>--check-column</arg>
>>         <arg>${checkColumn}</arg>
>>         <arg>--last-value</arg>
>>         <arg>${lastValue}</arg>
>>         <arg>--target-dir</arg>
>>         <arg>/data/${dbName}/${stgPrefix}_${tableName}</arg>
>>         <arg>--fields-terminated-by</arg>
>>         <arg>\001</arg>
>>         <arg>--null-string</arg>
>>         <arg>\\N</arg>
>>         <arg>--null-non-string</arg>
>>         <arg>\\N</arg>
>>         <arg>${directOption}</arg>
>>     </sqoop>
>>     <ok to="sqoop-run-or-saved-job-check" />
>>     <error to="sqoop-run-or-saved-job-check" />
>> </action>
>>
>> and here is the exception:
>>
>> Error: java.lang.RuntimeException: Can't parse input data: '\N'
>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:473)
>>     at dim_scd_table.parse(dim_scd_table.java:391)
>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>> Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
>>     at java.sql.Timestamp.valueOf(Timestamp.java:202)
>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:455)
>>
>> The table name is dim_scd_table. It has an scd_end_date column of
>> "timestamp" datatype. When this column has a NULL value, I get the
>> above exception.
>>
>> Please let me know on this.
>>
>> Thanks,
>> Mani
>>
>> On Wed, May 20, 2015 at 10:49 PM, Swati Ambulkar -X (sambulka -
>> PERSISTENT SYSTEMS INC at Cisco) <[email protected]> wrote:
>>
>>> Can you paste your Sqoop command please?
>>>
>>> Are you generating your class with the --class-name option?
>>>
>>> Once you do that, you should see some code for handling a timestamp
>>> column similar to what is listed below. If it encounters \N,
>>> __cur_str.length() will not be 0, so it will go through the else
>>> branch; you can check whether this is what is failing for you.
>>>
>>> __cur_str = __it.next();
>>> if (__cur_str.equals("null") || __cur_str.length() == 0) {
>>>     this.STARTTIME = null;
>>> } else {
>>>     this.STARTTIME = java.sql.Timestamp.valueOf(__cur_str);
>>> }
>>>
>>> __cur_str = __it.next();
>>> if (__cur_str.equals("null") || __cur_str.length() == 0) {
>>>     this.ENDTIME = null;
>>> } else {
>>>     this.ENDTIME = java.sql.Timestamp.valueOf(__cur_str);
>>> }
>>>
>>> You can direct Sqoop to use the empty string ("") for the
>>> --null-string and --null-non-string options:
>>>
>>> options.add("--null-string");
>>> options.add("");
>>>
>>> options.add("--null-non-string");
>>> options.add("");
>>>
>>> This would put an empty string in the imported row, and the
>>> above-mentioned check would then set the timestamp column value to
>>> null.
>>>
>>> Thanks,
>>> Swati
>>>
>>> *From:* Manikandan R [mailto:[email protected]]
>>> *Sent:* Wednesday, May 20, 2015 12:02 AM
>>> *To:* [email protected]
>>> *Subject:* Merge failed - timestamp column with null values
>>>
>>> Hello Everyone,
>>>
>>> I am trying to push incremental updates from MySQL to HDFS using the
>>> Sqoop import command with the --merge-key option and incremental mode
>>> "lastmodified".
>>>
>>> My table has some timestamp columns. I don't see any problems as long
>>> as the timestamp columns have values; the problem arises only when
>>> they are NULL. I copied the exception below from my logs.
>>> Non-timestamp columns with NULL values cause no issues.
>>>
>>> Error: java.lang.RuntimeException: Can't parse input data: '\N'
>>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:473)
>>>     at dim_scd_table.parse(dim_scd_table.java:391)
>>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
>>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>>> Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
>>>     at java.sql.Timestamp.valueOf(Timestamp.java:202)
>>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:455)
>>>     ... 11 more
>>>
>>> Kindly let me know on this.
>>>
>>> Thanks,
>>> Mani
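Swati's snippet pins down the failure seen in both stack traces in this thread: the files contain the literal marker \N for NULL (because of --null-string/--null-non-string '\\N'), but the default generated check only recognizes "null" or the empty string, so \N reaches java.sql.Timestamp.valueOf() and fails. A minimal sketch of that logic, with invented values (this is not the actual generated class):

import java.sql.Timestamp;

// Toy reproduction of the NULL handling in the generated __loadFromFields().
public class NullTimestampDemo {

    static Timestamp parseColumn(String cur) {
        // Default generated check: only the literal "null" and the empty
        // string are treated as SQL NULL...
        if (cur.equals("null") || cur.length() == 0) {
            return null;
        }
        // ...so Sqoop's default text NULL marker "\N" falls through to
        // valueOf() and throws IllegalArgumentException:
        //   Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
        return Timestamp.valueOf(cur);
    }

    public static void main(String[] args) {
        System.out.println(parseColumn("2015-05-20 13:52:00")); // parses fine
        System.out.println(parseColumn(""));                    // -> null
        System.out.println(parseColumn("\\N"));                 // throws
    }
}

With --input-null-string and --input-null-non-string set to '\\N' (Michael's fix above), the generated parser also compares the field against \N and maps it to null before valueOf() is ever called, which is consistent with Manikandan's "It works perfectly" confirmation.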
