[ 
https://issues.apache.org/jira/browse/PARQUET-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Blue updated PARQUET-155:
------------------------------
    Description: 
Hi.
I have the following Avro schema:

{code}
{
     "namespace" : "com.example.test",
     "type" : "record",
     "name" : "TestRecord",
     "fields" : [{"name" : "objectLink", "type" : [
         {"type": "record", "name" : "TestObj1",
          "fields" : [{"name":"obj1VisitorId","type":["null","string"]}]},
         {"type": "record", "name" : "TestObj2",
          "fields" : [{"name":"obj2VisitorId","type":["null","string"]}]}
     ]}],
     "doc" : "event for test purposes"
}
{code}

Using this schema I can create Avro objects, and I can also create a table 
backed by Avro in Hive. But when I try to create a table backed by Parquet, 
I run:

CREATE TABLE parquet_table 
STORED AS parquet
AS SELECT * FROM avro_table

and I get:
SemanticException java.lang.UnsupportedOperationException: Unknown field type: 
uniontype<struct<obj1visitorid:string>,struct<obj2visitorid:string>>
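
The type named in the error can be traced back to the schema. The following is only a rough sketch of that correspondence, not Hive's actual AvroSerDe code: the function `avro_to_hive` and its primitive table are hypothetical, and it assumes that a `["null", T]` union is treated as a nullable `T`, that any other union becomes a `uniontype`, and that field names are lowercased, as the error message suggests.

```python
import json

# Hypothetical, simplified Avro -> Hive type mapping for illustration only.
# It covers just the constructs used in this issue.
PRIMITIVES = {"string": "string", "int": "int", "long": "bigint",
              "boolean": "boolean", "float": "float", "double": "double"}

def avro_to_hive(avro_type):
    """Return a Hive type string for a (simplified) Avro type."""
    if isinstance(avro_type, str):
        return PRIMITIVES[avro_type]
    if isinstance(avro_type, list):
        # Assumption: ["null", T] is a nullable T, not a real union.
        branches = [t for t in avro_type if t != "null"]
        if len(branches) == 1:
            return avro_to_hive(branches[0])
        inner = ",".join(avro_to_hive(t) for t in branches)
        return "uniontype<" + inner + ">"
    if isinstance(avro_type, dict) and avro_type.get("type") == "record":
        # Assumption: field names are lowercased, as seen in the error.
        fields = ",".join(
            f["name"].lower() + ":" + avro_to_hive(f["type"])
            for f in avro_type["fields"]
        )
        return "struct<" + fields + ">"
    raise ValueError("type not covered by this sketch: %r" % (avro_type,))

SCHEMA = json.loads("""
{
  "namespace": "com.example.test",
  "type": "record",
  "name": "TestRecord",
  "fields": [{"name": "objectLink", "type": [
    {"type": "record", "name": "TestObj1",
     "fields": [{"name": "obj1VisitorId", "type": ["null", "string"]}]},
    {"type": "record", "name": "TestObj2",
     "fields": [{"name": "obj2VisitorId", "type": ["null", "string"]}]}
  ]}],
  "doc": "event for test purposes"
}
""")

# The objectLink field is a union of two records, so it maps to a uniontype.
hive_type = avro_to_hive(SCHEMA["fields"][0]["type"])
print(hive_type)
```

Under these assumptions the nullable string fields collapse to plain `string`, while the two-record union on `objectLink` is what yields the `uniontype<...>` column that the Parquet table creation rejects.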

Is there a way to convert such structures so they can be stored in a Hive table 
backed by Parquet? This is a simple example, but I have a large data structure 
described in Avro, so I can't convert it manually. I also have data that is 
already stored in Avro and needs to be loaded into a table backed by Parquet. 
Is there any way to do this?

I'm using Hive 0.13.


> Hive Avro to Parquet table conversion
> -------------------------------------
>
>                 Key: PARQUET-155
>                 URL: https://issues.apache.org/jira/browse/PARQUET-155
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Dmitriy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
