Sometimes I get the error below. The column that holds the quoted string ('Duplicate order detected. Rejecting current order.') is user-generated free text. Might this be because the column contains a Ctrl-A delimiter in it?
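For context, here is a minimal, self-contained Java sketch (not Sqoop's actual generated class; the three-column layout, column names, and sample record are invented) of what the stack trace below suggests: the generated __loadFromFields() calls next() once per expected column, so a line that splits into fewer fields than expected exhausts the iterator and throws NoSuchElementException.

import java.util.Arrays;
import java.util.Iterator;

// Toy model of the field-by-field parsing that Sqoop's generated
// __loadFromFields() performs: one next() call per expected column.
// The three-column layout and the sample record are invented.
public class DelimiterDemo {

    static void load(String record, String delim) {
        Iterator<String> it = Arrays.asList(record.split(delim, -1)).iterator();
        String id = it.next();        // column 1
        String message = it.next();   // column 2
        String created = it.next();   // column 3
        System.out.println(id + " | " + message + " | " + created);
    }

    public static void main(String[] args) {
        // Record written with Ctrl-A (\u0001) as the field separator.
        String record = "42\u0001Duplicate order detected. Rejecting current order."
                + "\u00012015-05-21 10:00:00";

        load(record, "\u0001"); // OK: splits into exactly 3 fields

        // Parsed with a different delimiter (say the default comma), the whole
        // line becomes a single field and the second next() throws
        // java.util.NoSuchElementException -- the same "Caused by" as the log.
        load(record, ",");
    }
}

Note the asymmetry: an extra Ctrl-A embedded inside the user-generated text would produce too many fields (values shifted into the wrong columns), while the NoSuchElementException below means the parser saw too few. That points at an embedded newline in the data, or at the merge step re-reading files with a different delimiter than the one they were written with, as discussed later in this thread.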
Log Length: 13320

log4j:ERROR Could not find value for key log4j.appender.CLA
log4j:ERROR Could not instantiate appender named "CLA".
log4j:ERROR Could not find value for key log4j.appender.CLA
log4j:ERROR Could not instantiate appender named "CLA".
log4j:ERROR Could not find value for key log4j.appender.CLA
log4j:ERROR Could not instantiate appender named "CLA".
Note: /tmp/sqoop-hadoop/compile/c0d035312d3a623ccfad1ea5d20623ac/order_completed.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Error: java.lang.RuntimeException: Can't parse input data: 'Duplicate order detected. Rejecting current order.'
    at order_completed.__loadFromFields(order_completed.java:989)
    at order_completed.parse(order_completed.java:847)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
Caused by: java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:834)
    at order_completed.__loadFromFields(order_completed.java:944)
    ... 11 more

On Thu, May 21, 2015 at 11:31 PM, Manikandan R <[email protected]> wrote:

> It works perfectly. Thanks.
>
> On Thu, May 21, 2015 at 12:11 AM, Michael Arena <[email protected]> wrote:
>
>> I have found that you must specify these 6 parameters for incremental
>> imports to work:
>>
>> --null-string '\\N'
>> --null-non-string '\\N'
>> --fields-terminated-by '\001'
>> --input-null-string '\\N'
>> --input-null-non-string '\\N'
>> --input-fields-terminated-by '\001'
>>
>> I believe that the first 3 control how output files (created by Sqoop)
>> are delimited and the second 3 control how input files are delimited.
>> Since an incremental import merges existing files, they are treated
>> like input.
>>
>> From: Manikandan R
>> Reply-To: "[email protected]"
>> Date: Wednesday, May 20, 2015 at 1:52 PM
>> To: "[email protected]"
>> Subject: Re: Merge failed - timestamp column with null values
>>
>> Hello Swati,
>>
>> Thanks for your reply.
>>
>> I am not using --class-name in my Sqoop command.
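To see Michael's six options together in one place, a standalone sqoop import using all of them would look roughly like the sketch below. Everything else here is a placeholder (host, database, table, key and check columns, paths, last value); only the six delimiter/null options are the point.

sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username root \
    --password-file /tmp/.password \
    --table order_completed \
    --incremental lastmodified \
    --check-column updated_at \
    --merge-key order_id \
    --last-value '2015-05-20 00:00:00' \
    --target-dir /data/mydb/stg_order_completed \
    --fields-terminated-by '\001' \
    --null-string '\\N' \
    --null-non-string '\\N' \
    --input-fields-terminated-by '\001' \
    --input-null-string '\\N' \
    --input-null-non-string '\\N'

Per Michael's explanation, the first three delimiter/null options control how Sqoop writes files and the three --input-* options control how it reads them back; the merge step re-reads the existing files, so omitting the --input-* options (as in the Oozie action quoted next) leaves the merge parsing with defaults.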
>>
>> Here is my Sqoop action in Oozie:
>>
>> <action name="sqoop-saved-job">
>>     <sqoop xmlns="uri:oozie:sqoop-action:0.2">
>>         <job-tracker>${jobTracker}</job-tracker>
>>         <name-node>${nameNode}</name-node>
>>         <job-xml>/tmp/sqoop-site.xml</job-xml>
>>         <arg>job</arg>
>>         <arg>--create</arg>
>>         <arg>${dbName}-${tableName}-sync-job</arg>
>>         <arg>--</arg>
>>         <arg>import</arg>
>>         <arg>--connect</arg>
>>         <arg>jdbc:mysql://${dbHost}/${dbName}</arg>
>>         <arg>--username</arg>
>>         <arg>root</arg>
>>         <arg>--password-file</arg>
>>         <arg>/tmp/.password</arg>
>>         <arg>--table</arg>
>>         <arg>${tableName}</arg>
>>         <arg>--incremental</arg>
>>         <arg>${incrementalMode}</arg>
>>         <arg>--merge-key</arg>
>>         <arg>${mergeKey}</arg>
>>         <arg>--check-column</arg>
>>         <arg>${checkColumn}</arg>
>>         <arg>--last-value</arg>
>>         <arg>${lastValue}</arg>
>>         <arg>--target-dir</arg>
>>         <arg>/data/${dbName}/${stgPrefix}_${tableName}</arg>
>>         <arg>--fields-terminated-by</arg>
>>         <arg>\001</arg>
>>         <arg>--null-string</arg>
>>         <arg>\\N</arg>
>>         <arg>--null-non-string</arg>
>>         <arg>\\N</arg>
>>         <arg>${directOption}</arg>
>>     </sqoop>
>>     <ok to="sqoop-run-or-saved-job-check" />
>>     <error to="sqoop-run-or-saved-job-check" />
>> </action>
>>
>> and here is the exception:
>>
>> Error: java.lang.RuntimeException: Can't parse input data: '\N'
>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:473)
>>     at dim_scd_table.parse(dim_scd_table.java:391)
>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>> Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
>>     at java.sql.Timestamp.valueOf(Timestamp.java:202)
>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:455)
>>
>> The table name is dim_scd_table. It has an scd_end_date column of
>> "timestamp" datatype. When this column has a NULL value, I get the
>> above exception.
>>
>> Please let me know on this.
>>
>> Thanks,
>> Mani
>>
>> On Wed, May 20, 2015 at 10:49 PM, Swati Ambulkar -X (sambulka -
>> PERSISTENT SYSTEMS INC at Cisco) <[email protected]> wrote:
>>
>>> Can you paste your Sqoop command please?
>>>
>>> Are you generating your class with the --class-name option?
>>>
>>> Once you do that, you should see some code for handling a timestamp
>>> column similar to what is listed below. If it encounters \N,
>>> __cur_str.length() will not be 0, so it will go through the else
>>> branch; you can check whether this is what is failing for you.
>>>
>>> __cur_str = __it.next();
>>> if (__cur_str.equals("null") || __cur_str.length() == 0) {
>>>     this.STARTTIME = null;
>>> } else {
>>>     this.STARTTIME = java.sql.Timestamp.valueOf(__cur_str);
>>> }
>>>
>>> __cur_str = __it.next();
>>> if (__cur_str.equals("null") || __cur_str.length() == 0) {
>>>     this.ENDTIME = null;
>>> } else {
>>>     this.ENDTIME = java.sql.Timestamp.valueOf(__cur_str);
>>> }
>>>
>>> You can direct Sqoop to use the empty string ("") for the
>>> --null-string and --null-non-string options:
>>>
>>> options.add("--null-string");
>>> options.add("");
>>>
>>> options.add("--null-non-string");
>>> options.add("");
>>>
>>> This would put an empty string in the imported row, and the
>>> above-mentioned check would then set the timestamp column value to
>>> null.
>>>
>>> Thanks,
>>> Swati
>>>
>>> *From:* Manikandan R [mailto:[email protected]]
>>> *Sent:* Wednesday, May 20, 2015 12:02 AM
>>> *To:* [email protected]
>>> *Subject:* Merge failed - timestamp column with null values
>>>
>>> Hello Everyone,
>>>
>>> I am trying to push incremental updates from MySQL to HDFS using the
>>> Sqoop import command with the --merge-key option and incremental mode
>>> "lastmodified".
>>>
>>> My table has some timestamp columns. I don't see any problems as long
>>> as the timestamp columns have values; the problem arises only when
>>> they are NULL. I copied the exception below from my logs.
>>> Non-timestamp columns with NULL values cause no issues.
>>>
>>> Error: java.lang.RuntimeException: Can't parse input data: '\N'
>>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:473)
>>>     at dim_scd_table.parse(dim_scd_table.java:391)
>>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
>>>     at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>>> Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
>>>     at java.sql.Timestamp.valueOf(Timestamp.java:202)
>>>     at dim_scd_table.__loadFromFields(dim_scd_table.java:455)
>>>     ... 11 more
>>>
>>> Kindly let me know on this.
>>>
>>> Thanks,
>>> Mani
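Swati's snippet pins down the failure seen in both stack traces in this thread: the files contain the literal marker \N for NULL (because of --null-string/--null-non-string '\\N'), but the default generated check only recognizes "null" or the empty string, so \N reaches java.sql.Timestamp.valueOf() and fails. A minimal sketch of that logic, with invented values (this is not the actual generated class):

import java.sql.Timestamp;

// Toy reproduction of the NULL handling in the generated __loadFromFields().
public class NullTimestampDemo {

    static Timestamp parseColumn(String cur) {
        // Default generated check: only the literal "null" and the empty
        // string are treated as SQL NULL...
        if (cur.equals("null") || cur.length() == 0) {
            return null;
        }
        // ...so Sqoop's default text NULL marker "\N" falls through to
        // valueOf() and throws IllegalArgumentException:
        //   Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
        return Timestamp.valueOf(cur);
    }

    public static void main(String[] args) {
        System.out.println(parseColumn("2015-05-20 13:52:00")); // parses fine
        System.out.println(parseColumn(""));                    // -> null
        System.out.println(parseColumn("\\N"));                 // throws
    }
}

With --input-null-string and --input-null-non-string set to '\\N' (Michael's fix above), the generated parser also compares the field against \N and maps it to null before valueOf() is ever called, which is consistent with Manikandan's "It works perfectly" confirmation.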
