It works perfectly. Thanks. On Thu, May 21, 2015 at 12:11 AM, Michael Arena <[email protected]> wrote:
> I have found that you must specify these 6 parameters for incremental > imports to work: > --null-string '\\N' > --null-non-string '\\N' > --fields-terminated-by ‘\001' > --input-null-string '\\N' > --input-null-non-string '\\N' > --input-fields-terminated-by ‘\001' > > I believe that the first 3 control how output files (created by Sqoop) > are delimited and the second 3 control how input files are delimited. > Since an incremental import merges existing files, they are treated like > input. > > From: Manikandan R > Reply-To: "[email protected]" > Date: Wednesday, May 20, 2015 at 1:52 PM > To: "[email protected]" > Subject: Re: Merge failed - timestamp column with null values > > Hello Swati, > > Thanks for your reply. > > I am not using --class-name in sqoop command. > > Here is my sqoop action in oozie > > <action name="sqoop-saved-job"> > <sqoop xmlns="uri:oozie:sqoop-action:0.2"> > <job-tracker>${jobTracker}</job-tracker> > <name-node>${nameNode}</name-node> > <job-xml>/tmp/sqoop-site.xml</job-xml> > <arg>job</arg> > <arg>--create</arg> > <arg>${dbName}-${tableName}-sync-job</arg> > <arg>--</arg> > <arg>import</arg> > <arg>--connect</arg> > <arg>jdbc:mysql://${dbHost}/${dbName}</arg> > <arg>--username</arg> > <arg>root</arg> > <arg>--password-file</arg> > <arg>/tmp/.password</arg> > <arg>--table</arg> > <arg>${tableName}</arg> > <arg>--incremental</arg> > <arg>${incrementalMode}</arg> > <arg>--merge-key</arg> > <arg>${mergeKey}</arg> > <arg>--check-column</arg> > <arg>${checkColumn}</arg> > <arg>--last-value</arg> > <arg>${lastValue}</arg> > <arg>--target-dir</arg> > > <arg>/data/${dbName}/${stgPrefix}_${tableName}</arg> > <arg>--fields-terminated-by</arg> > <arg>\001</arg> > <arg>--null-string</arg> > <arg>\\N</arg> > <arg>--null-non-string</arg> > <arg>\\N</arg> > <arg>${directOption}</arg> > </sqoop> > > <ok to="sqoop-run-or-saved-job-check" /> > <error to="sqoop-run-or-saved-job-check" /> > </action> > > and here is the exception - > > Error: java.lang.RuntimeException: Can't parse input data: '\N' > at dim_scd_table.__loadFromFields(dim_scd_table.java:473) > at dim_scd_table.parse(dim_scd_table.java:391) > at > org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53) > at > org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) > Caused by: java.lang.IllegalArgumentException: Timestamp format must be > yyyy-mm-dd hh:mm:ss[.fffffffff] > at java.sql.Timestamp.valueOf(Timestamp.java:202) > at dim_scd_table.__loadFromFields(dim_scd_table.java:455) > > Table name is dim_scd_table. It has scd_end_date column of "timestamp" > datatype. When this column has NULL value, am getting the above exception. > > Please let me know on this. > > Thanks, > Mani > > On Wed, May 20, 2015 at 10:49 PM, Swati Ambulkar -X (sambulka - PERSISTENT > SYSTEMS INC at Cisco) <[email protected]> wrote: > >> Can you paste your sqoop command please? >> >> >> >> Are you generating your class with –class-name option? : >> >> >> >> Once you do that you should see some code for handling timestamp column >> similar to listed below. Here if it encounters \N __cur_str.length() will >> not be 0 and it will go through else part and you can check if this is what >> is failing for you. >> >> >> >> __cur_str = __it.next(); >> >> if (__cur_str.equals("null") || __cur_str.length() == 0) { >> this.STARTTIME = null; } else { >> >> this.STARTTIME = java.sql.Timestamp.valueOf(__cur_str); >> >> } >> >> >> >> __cur_str = __it.next(); >> >> if (__cur_str.equals("null") || __cur_str.length() == 0) { >> this.ENDTIME = null; } else { >> >> this.ENDTIME = java.sql.Timestamp.valueOf(__cur_str); >> >> } >> >> >> >> You can direct sqoop to use null string (“”) for –null-non-string option. >> >> options.add("--null-string"); >> >> options.add(""); >> >> >> >> options.add("--null-non-string"); >> >> options.add(""); >> >> >> >> This would put null string in imported row and then the abovementioned >> check should timestamp column value to null. >> >> >> >> Thanks, >> >> Swati >> >> >> >> *From:* Manikandan R [mailto:[email protected]] >> *Sent:* Wednesday, May 20, 2015 12:02 AM >> *To:* [email protected] >> *Subject:* Merge failed - timestamp column with null values >> >> >> >> Hello Everyone, >> >> >> >> I am trying to push incremental updates from mysql to hdfs using sqoop >> import command with Mergekey option and incremental mode as "lastmodified". >> >> >> >> My table has some timestamp columns. I don't see any problems as long as >> timestamp columns has some values. But, Problem arises only when it has >> NULL values. I copied the below exception from my logs. Also, Incase of >> Non-timestamp columns having null values, there is no issues. >> >> >> >> Error: java.lang.RuntimeException: Can't parse input data: '\N' >> >> at dim_scd_table.__loadFromFields(dim_scd_table.java:473) >> >> at dim_scd_table.parse(dim_scd_table.java:391) >> >> at >> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53) >> >> at >> org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34) >> >> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) >> >> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775) >> >> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) >> >> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) >> >> at java.security.AccessController.doPrivileged(Native Method) >> >> at javax.security.auth.Subject.doAs(Subject.java:415) >> >> at >> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >> >> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) >> >> Caused by: java.lang.IllegalArgumentException: Timestamp format must be >> yyyy-mm-dd hh:mm:ss[.fffffffff] >> >> at java.sql.Timestamp.valueOf(Timestamp.java:202) >> >> at dim_scd_table.__loadFromFields(dim_scd_table.java:455) >> >> ... 11 more >> >> >> >> Kindly let me know on this. >> >> >> >> Thanks, >> >> Mani >> > >
