[jira] [Commented] (ATLAS-409) Atlas will not import hive tables with no columns

2015-12-22 Thread Aaron Dossett (JIRA)

    [ https://issues.apache.org/jira/browse/ATLAS-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068231#comment-15068231 ] 

Aaron Dossett commented on ATLAS-409:
-------------------------------------

Sorry for not providing more details initially.  Here is the exception I get 
when trying to import a Hive table whose schema is defined by reference to an 
Avro schema file.  The schema file does define columns.  I will try to 
reproduce this with a simple table and post the DDL and schema file.  Note that 
the hive_storagedesc in the exception below does not contain any columns.

When I apply my fix, this error is resolved, for what it's worth.

2015-12-22 08:12:06,307 ERROR - [qtp151405253-17 - 89808aec-1c06-42ce-8815-0d634b7a152d:] ~ Unable to persist entity instance due to a desrialization error  (EntityResource:134)
org.apache.atlas.typesystem.types.ValueConversionException: Cannot convert value '{Id='(type: hive_table, id: )', traits=[], values={db={Id='(type: hive_db, id: 11e69cb7-2f66-446f-8a00-dde5ab805a70)', traits=[], values={}}, createTime=1450129254, lastAccessTime=0, sd={Id='(type: hive_storagedesc, id: )', traits=[], values={compressed=false, serdeInfo=org.apache.atlas.typesystem.Struct@a1719f80, location=hdfs://redstackambari.vagrant.tgt:8020/user/vagrant/firefly, outputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, qualifiedName=firefly.estore@primary, storedAsSubDirectories=false, numBuckets=-1, inputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, parameters={}}}, columns=[], temporary=false, partitionKeys=[], tableName=testtable, name=testdb.testtable@primary, owner=vagrant, retention=0, tableType=EXTERNAL_TABLE, parameters={rawDataSize=-1, numFiles=0, transient_lastDdlTime=1450129254, totalSize=0, avro.schema.url=hdfs://redstackambari.vagrant.tgt:8020/user/vagrant/schema/test_schema.avsc, EXTERNAL=TRUE, COLUMN_STATS_ACCURATE=false, numRows=-1}, comment=null}}' to datatype hive_table
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:145)
    at org.apache.atlas.services.DefaultMetadataService.deserializeClassInstances(DefaultMetadataService.java:304)
    at org.apache.atlas.services.DefaultMetadataService.createEntities(DefaultMetadataService.java:278)
    ..
Caused by: org.apache.atlas.typesystem.types.ValueConversionException: Cannot convert value '{Id='(type: hive_storagedesc, id: )', traits=[], values={compressed=false, serdeInfo=org.apache.atlas.typesystem.Struct@a1719f80, location=hdfs://redstackambari.vagrant.tgt:8020/user/vagrant/firefly, outputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, qualifiedName=firefly.estore@primary, storedAsSubDirectories=false, numBuckets=-1, inputFormat=org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, parameters={}}}' to datatype hive_storagedesc
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:145)
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:43)
    at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:122)
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:143)
    ... 52 more
Caused by: org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException: For field 'cols'
    at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:124)
    at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:143)
    ... 55 more
Caused by: org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException: Null value not allowed for multiplicty Multiplicity{lower=1, upper=2147483647, isUnique=false}
    at org.apache.atlas.typesystem.types.DataTypes$ArrayType.convert(DataTypes.java:549)
    at org.apache.atlas.typesystem.types.DataTypes$ArrayType.convert(DataTypes.java:495)
    at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:122)
    ... 56 more
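
The innermost cause in the trace (a null 'cols' value rejected because the
multiplicity has lower=1) can be illustrated with a small sketch.  This is not
the Atlas code: Multiplicity, NullConversionError and convert_array are
hypothetical stand-ins written in Python, and the OPTIONAL bound is only an
assumption about what a relaxed definition might look like, not a description
of the attached patch.  It models the null check only.

```python
from dataclasses import dataclass

INT_MAX = 2147483647  # upper bound printed in the log's Multiplicity{...}


class NullConversionError(ValueError):
    """Hypothetical stand-in for ValueConversionException$NullConversionException."""


@dataclass(frozen=True)
class Multiplicity:
    """Hypothetical model of the bounds printed in the log."""
    lower: int
    upper: int
    is_unique: bool = False


# lower=1 mirrors the bound reported for the storage descriptor's 'cols' field.
REQUIRED = Multiplicity(lower=1, upper=INT_MAX)
# lower=0 is an assumed relaxed bound, used here only for comparison.
OPTIONAL = Multiplicity(lower=0, upper=INT_MAX)


def convert_array(value, multiplicity):
    """Return value as a list; reject None when at least one element is required."""
    if value is None:
        if multiplicity.lower > 0:
            raise NullConversionError(
                f"Null value not allowed for multiplicity {multiplicity}"
            )
        return []
    return list(value)


if __name__ == "__main__":
    # A storage descriptor with no 'cols' entry yields None for that field.
    storage_desc = {"compressed": False, "numBuckets": -1}
    try:
        convert_array(storage_desc.get("cols"), REQUIRED)
    except NullConversionError as err:
        print("rejected:", err)
    print("accepted:", convert_array(storage_desc.get("cols"), OPTIONAL))
```

With the required bound the missing field is rejected; with the assumed
optional bound the same input converts to an empty list.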

> Atlas will not import hive tables with no columns
> -------------------------------------------------
>
>                 Key: ATLAS-409
>                 URL: https://issues.apache.org/jira/browse/ATLAS-409
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 0.6-incubating
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>         Attachments: ATLAS-409.patch
>
>
> Atlas won't import a Hive table with no columns (see below for an example of 
> a valid hive table with no explicit columns).  This is because the Atlas Hive 
> Storage Descriptor class REQUIRES columns, but the Hive Table class allows 
> them to be OPTIONAL.
> {code}
> CREATE TABLE example
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   

[jira] [Commented] (ATLAS-409) Atlas will not import hive tables with no columns

2015-12-22 Thread Shwetha G S (JIRA)

    [ https://issues.apache.org/jira/browse/ATLAS-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067875#comment-15067875 ] 

Shwetha G S commented on ATLAS-409:
-----------------------------------

Doesn't Hive try to figure out the columns from the custom SerDe? In the Hive 
hook, we load the table metadata from Hive and derive the entity attributes 
from that, not from the Hive command directly. So shouldn't the table metadata 
from Hive always have columns?

> Atlas will not import hive tables with no columns
> -------------------------------------------------
>
>                 Key: ATLAS-409
>                 URL: https://issues.apache.org/jira/browse/ATLAS-409
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 0.6-incubating
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>         Attachments: ATLAS-409.patch
>
>
> Atlas won't import a Hive table with no columns (see below for an example of 
> a valid hive table with no explicit columns).  This is because the Atlas Hive 
> Storage Descriptor class REQUIRES columns, but the Hive Table class allows 
> them to be OPTIONAL.
> {code}
> CREATE TABLE example
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   TBLPROPERTIES (
> 'avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');
> {code}
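
The DDL quoted above declares no columns itself; the column definitions live in
the file named by 'avro.schema.url'.  As a rough illustration of where those
columns come from, the sketch below lists the field names of an Avro record
schema.  The schema text is made up for this example (the real
test_serializer.avsc is not shown in this thread), and avro_field_names is a
hypothetical helper, not part of Hive or Atlas.

```python
import json

# Hypothetical .avsc contents, invented for illustration only.
EXAMPLE_AVSC = """
{
  "type": "record",
  "name": "Example",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "label", "type": ["null", "string"], "default": null}
  ]
}
"""


def avro_field_names(avsc_text):
    """Return the top-level field names of an Avro record schema given as JSON text."""
    schema = json.loads(avsc_text)
    if schema.get("type") != "record":
        raise ValueError("expected a top-level Avro record schema")
    return [field["name"] for field in schema.get("fields", [])]


if __name__ == "__main__":
    print(avro_field_names(EXAMPLE_AVSC))  # prints ['id', 'label']
```

Each top-level field of the record would correspond to one column of the
table, even though the CREATE TABLE statement lists none.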



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)