[ https://issues.apache.org/jira/browse/SQOOP-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Szabo updated SQOOP-1600:
--------------------------------
    Fix Version/s:     (was: 1.4.7)
                   1.5.0

> Exception when import data using Data Connector for Oracle with TIMESTAMP column type to Parquet files
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1600
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1600
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>         Environment: Hadoop version: 2.5.0-cdh5.2.0
>                      Sqoop: 1.4.5
>            Reporter: Daniel Lanza García
>            Assignee: Qian Xu
>              Labels: Connector, Oracle, Parquet, Timestamp
>             Fix For: 1.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> An error is thrown in each mapper when an import job is run using the Quest
> Data Connector for Oracle (--direct argument), the source table has a column
> of type TIMESTAMP, and the destination files are in Parquet format.
> The mapper's log shows the following error:
> {code}
> WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child :
> org.apache.avro.UnresolvedUnionException: Not in union ["long","null"]:
> 2012-7-1 0:4:44. 403000000
> {code}
> This means that the data obtained by the mapper (through the connector) is not
> of the type that the schema describes for this field. As we can read in the
> error, the problem is related to the column UTC_STAMP (the only column in the
> source table that stores a timestamp).
> If we check the generated schema for this column, we can see that the column
> has type long and SQL data type TIMESTAMP (93), which is correct.
> {code}
> Schema: {"name" : "UTC_STAMP","type" : [ "long", "null" ],"columnName" :
> "UTC_STAMP","sqlType" : "93"}
> {code}
> If we debug the method where the exception is thrown,
> {{org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:605)}},
> we can see that the problem arises because the data obtained by the mapper is
> of type String, which does not match the type described by the schema (long).
> The exception is not thrown when the destination files are text files,
> because no schema is generated when importing to text files.
> Solution:
> The documentation has a section that describes how dates and timestamps are
> handled when using the Data Connector for Oracle and Hadoop. As that section
> explains, this connector handles this type of data differently. However, this
> behavior can be disabled, as the section describes, with the following
> parameter:
> -Doraoop.timestamp.string=false
> Although this parameter solves the problem (and is mandatory under these
> conditions), the software should handle this type of column without throwing
> an exception.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
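
For reference, the workaround described in the issue could look roughly like the command below. This is only a sketch: the JDBC URL, user name, table name and target directory are hypothetical placeholders that do not come from the report. Only the -Doraoop.timestamp.string=false property, the direct (Data Connector for Oracle) mode and the Parquet output format are taken from the issue itself.

```shell
# Sketch of an import that applies the workaround from SQOOP-1600.
# Host, service, user, table and target directory are hypothetical placeholders.
# Generic -D options have to appear directly after the tool name ("import"),
# before any tool-specific arguments.
sqoop import \
  -Doraoop.timestamp.string=false \
  --direct \
  --connect "jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL" \
  --username "EXAMPLE_USER" \
  -P \
  --table "EXAMPLE_TABLE" \
  --as-parquetfile \
  --target-dir "/user/example/example_table"
```

With the property set to false, the connector should no longer hand timestamps to the mapper as strings, so the value for UTC_STAMP can match the "long" branch of the generated schema.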