[ https://issues.apache.org/jira/browse/ATLAS-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068231#comment-15068231 ]
Aaron Dossett edited comment on ATLAS-409 at 12/22/15 3:14 PM:
---------------------------------------------------------------

Sorry for not providing more details initially. Here is the exception I get when importing a Hive table whose schema is defined by reference to an Avro schema file. The schema file does define columns; note, however, that the hive_storagedesc in the request contains no columns. I will try to reproduce this with a simple table and post the DDL and schema file. For what it's worth, applying my fix (sketched below) resolves this error.

{code}
2015-12-22 08:12:06,307 ERROR - [qtp151405253-17 - 89808aec-1c06-42ce-8815-0d634b7a152d:] ~ Unable to persist entity instance due to a desrialization error (EntityResource:134)
org.apache.atlas.typesystem.types.ValueConversionException: Cannot convert value '{Id='(type: hive_table, id: <unassigned>)', traits=[], values={db={Id='(type: hive_db, id: 11e69cb7-2f66-446f-8a00-dde5ab805a70)', traits=[], values={}}, createTime=1450129254, lastAccessTime=0, sd={Id='(type: hive_storagedesc, id: <unassigned>)', traits=[], values={compressed=false, serdeInfo=org.apache.atlas.typesystem.Struct@a1719f80, location=hdfs://redstackambari.vagrant.tgt:8020/user/vagrant/firefly, outputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, qualifiedName=firefly.estore@primary, storedAsSubDirectories=false, numBuckets=-1, inputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, parameters={}}}, columns=[], temporary=false, partitionKeys=[], tableName=testtable, name=testdb.testtable@primary, owner=vagrant, retention=0, tableType=EXTERNAL_TABLE, parameters={rawDataSize=-1, numFiles=0, transient_lastDdlTime=1450129254, totalSize=0, avro.schema.url=hdfs://redstackambari.vagrant.tgt:8020/user/vagrant/schema/test_schema.avsc, EXTERNAL=TRUE, COLUMN_STATS_ACCURATE=false, numRows=-1}, comment=null}}' to datatype hive_table
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:145)
    at org.apache.atlas.services.DefaultMetadataService.deserializeClassInstances(DefaultMetadataService.java:304)
    at org.apache.atlas.services.DefaultMetadataService.createEntities(DefaultMetadataService.java:278)
    ......
Caused by: org.apache.atlas.typesystem.types.ValueConversionException: Cannot convert value '{Id='(type: hive_storagedesc, id: <unassigned>)', traits=[], values={compressed=false, serdeInfo=org.apache.atlas.typesystem.Struct@a1719f80, location=hdfs://redstackambari.vagrant.tgt:8020/user/vagrant/firefly, outputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, qualifiedName=firefly.estore@primary, storedAsSubDirectories=false, numBuckets=-1, inputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, parameters={}}}' to datatype hive_storagedesc
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:145)
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:43)
    at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:122)
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:143)
    ... 52 more
Caused by: org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException: For field 'cols'
    at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:124)
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:143)
    ... 55 more
Caused by: org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException: Null value not allowed for multiplicty Multiplicity{lower=1, upper=2147483647, isUnique=false}
    at org.apache.atlas.typesystem.types.DataTypes$ArrayType.convert(DataTypes.java:549)
    at org.apache.atlas.typesystem.types.DataTypes$ArrayType.convert(DataTypes.java:495)
    at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:122)
    ... 56 more
{code}
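To make "my fix" concrete: I won't inline the attached patch here (see ATLAS-409.patch for the real change), but a change along these lines resolves the error for me. This is only a sketch against the 0.6 HiveDataModelGenerator as I read it; the AttributeDefinition signature and HiveDataTypes helper are taken from the source and worth verifying against the patch.

{code}
// Sketch only -- see ATLAS-409.patch for the actual change. In the 0.6
// Hive model, hive_storagedesc declares its column list with
// Multiplicity.COLLECTION (lower bound 1), so a null 'cols' value is
// rejected during conversion:
new AttributeDefinition("cols",
        String.format("array<%s>", HiveDataTypes.HIVE_COLUMN.getName()),
        Multiplicity.COLLECTION, false, null);

// Relaxing it to OPTIONAL (lower bound 0) mirrors hive_table's own
// optional 'columns' attribute and lets a column-less storage
// descriptor through:
new AttributeDefinition("cols",
        String.format("array<%s>", HiveDataTypes.HIVE_COLUMN.getName()),
        Multiplicity.OPTIONAL, false, null);
{code}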
> Atlas will not import hive tables with no columns
> -------------------------------------------------
>
>                 Key: ATLAS-409
>                 URL: https://issues.apache.org/jira/browse/ATLAS-409
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 0.6-incubating
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>         Attachments: ATLAS-409.patch
>
> Atlas won't import a Hive table with no columns (see below for an example of a valid Hive table with no explicit columns). This is because the Atlas Hive Storage Descriptor class REQUIRES columns, but the Hive Table class allows them to be OPTIONAL.
> {code}
> CREATE TABLE example
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES (
> 'avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');
> {code}
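For reference on the REQUIRES/OPTIONAL distinction above: the Multiplicity{lower=1, upper=2147483647, isUnique=false} in the stack trace is the typesystem's COLLECTION multiplicity. The constants below are copied from my reading of org.apache.atlas.typesystem.types.Multiplicity in 0.6 (worth double-checking against the source); null is only permitted when the lower bound is 0.

{code}
// Stock multiplicities in the Atlas typesystem (0.6 source, as I read it).
// A lower bound of 1 makes a value mandatory, which is why the storage
// descriptor's 'cols' array may not be null while the table's 'columns'
// array may be empty or absent.
public static final Multiplicity OPTIONAL   = new Multiplicity(0, 1, false);
public static final Multiplicity REQUIRED   = new Multiplicity(1, 1, false);
public static final Multiplicity COLLECTION = new Multiplicity(1, Integer.MAX_VALUE, false); // the one in the trace
public static final Multiplicity SET        = new Multiplicity(1, Integer.MAX_VALUE, true);
{code}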