[ 
https://issues.apache.org/jira/browse/PARQUET-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258692#comment-14258692
 ] 

Dmitriy commented on PARQUET-155:
---------------------------------

>>> What you could try is to write a simple map-only job that reads with Avro 
>>> and writes with Parquet-avro. Then you wouldn't have to do any conversion

If I do this, I will need to define the Parquet-backed table myself, am I 
right? That is not a good approach for me, because my real schema is very 
big. Or is there some other way to create a Hive table from existing Parquet 
files? I searched for one, but found only two approaches: "create as select" 
and defining the table explicitly when creating it.
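
To make the "define the table explicitly" option concrete for a big schema, here is a minimal sketch that generates the Hive DDL from the Avro schema instead of typing it by hand. It is only an illustration under assumptions, not a tested solution: the helper names `hive_type` and `create_table_ddl` and the location `/path/to/parquet` are made up for this example. Mapping a multi-branch union to a struct with fields `member0`, `member1`, ... is an assumption about how the Parquet files were written and has to be checked against the actual file schema. Named type references, enums and fixed types are not handled.

```python
import json

# Avro primitive type names mapped to Hive type names.
PRIMITIVES = {
    "string": "string",
    "int": "int",
    "long": "bigint",
    "float": "float",
    "double": "double",
    "boolean": "boolean",
    "bytes": "binary",
}


def hive_type(avro_type):
    """Return a Hive type string for an Avro type given as parsed JSON."""
    if isinstance(avro_type, str):
        return PRIMITIVES[avro_type]
    if isinstance(avro_type, list):
        # A union: drop "null", since Hive columns are nullable anyway.
        branches = [t for t in avro_type if t != "null"]
        if len(branches) == 1:
            return hive_type(branches[0])
        # ASSUMPTION: a union with several branches is stored as a struct
        # with one optional field per branch, named member0, member1, ...
        members = ",".join(
            "member%d:%s" % (i, hive_type(t)) for i, t in enumerate(branches)
        )
        return "struct<%s>" % members
    kind = avro_type["type"]
    if kind == "record":
        # Hive lower-cases field names, so do the same here.
        fields = ",".join(
            "%s:%s" % (f["name"].lower(), hive_type(f["type"]))
            for f in avro_type["fields"]
        )
        return "struct<%s>" % fields
    if kind == "array":
        return "array<%s>" % hive_type(avro_type["items"])
    if kind == "map":
        return "map<string,%s>" % hive_type(avro_type["values"])
    # Wrapped type such as {"type": "string"}.
    return hive_type(kind)


def create_table_ddl(schema, table, location):
    """Build a CREATE EXTERNAL TABLE statement for a top-level Avro record."""
    columns = ",\n".join(
        "  %s %s" % (f["name"].lower(), hive_type(f["type"]))
        for f in schema["fields"]
    )
    return (
        "CREATE EXTERNAL TABLE %s (\n%s\n)\nSTORED AS PARQUET\nLOCATION '%s'"
        % (table, columns, location)
    )


# The schema from the issue description.
SCHEMA = json.loads("""
{
  "namespace": "com.example.test",
  "type": "record",
  "name": "TestRecord",
  "fields": [{"name": "objectLink", "type": [
    {"type": "record", "name": "TestObj1",
     "fields": [{"name": "obj1VisitorId", "type": ["null", "string"]}]},
    {"type": "record", "name": "TestObj2",
     "fields": [{"name": "obj2VisitorId", "type": ["null", "string"]}]}
  ]}],
  "doc": "event for test purposes"
}
""")

if __name__ == "__main__":
    # Hypothetical table name and location, for illustration only.
    print(create_table_ddl(SCHEMA, "parquet_table", "/path/to/parquet"))
```

The generated statement avoids `uniontype` entirely, which is the type the error message complains about. Whether Hive 0.13 can then read the files depends on the column names matching what is really in the Parquet files.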

> Hive Avro to Parquet table conversion
> -------------------------------------
>
>                 Key: PARQUET-155
>                 URL: https://issues.apache.org/jira/browse/PARQUET-155
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Dmitriy
>
> Hi.
> I have the following Avro schema:
> {code}
> {
>      "namespace" : "com.example.test",
>      "type" : "record",
>      "name" : "TestRecord",
>      "fields" : [{"name" : "objectLink", "type" : [
>                            {"type": "record", "name" : "TestObj1", "fields" : 
> [{"name":"obj1VisitorId","type":["null","string"]}] },
>                            {"type": "record", "name" : "TestObj2", "fields" : 
> [{"name":"obj2VisitorId","type":["null","string"]}]}
>                        ]
>                  }],
>      "doc" : "event for test purposes"
> }
> {code}
> Using this schema I can create Avro objects, and I can also create a table 
> backed by Avro in Hive. But when I try to create a table backed by Parquet 
> with the statement
> CREATE TABLE parquet_table 
> STORED AS parquet
> AS SELECT * FROM avro_table
> and I get 
> SemanticException java.lang.UnsupportedOperationException: Unknown field 
> type: uniontype<struct<obj1visitorid:string>,struct<obj2visitorid:string>>
> Is there a way to convert such structures so that they can be stored in a 
> Hive table backed by Parquet? This is a simple example, but my real data 
> structure described in Avro is big, so I can't convert it manually. I also 
> have data that is already stored in Avro and needs to be loaded into a 
> table backed by Parquet. Is there any way to do this?
> I'm using Hive 0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
