[ https://issues.apache.org/jira/browse/HAWQ-53?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivram Mani updated HAWQ-53:
-----------------------------
    Description: 
*Current Behavior*
Creating a PXF external table with an Avro schema that contains a complex nullable 
type with null as its default value fails.

*Expected Behavior*
Complex Avro types should be mapped to the proper HAWQ types, and the external table 
should be created and its data loaded without error.

*Analysis*
From a code review of 
[AvroResolver.java|https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-hdfs/src/main/java/com/pivotal/pxf/plugins/hdfs/AvroResolver.java],
the resolver does not handle the null case for these complex types:
- Array
- Map
- Record

The failure occurs when a union wraps a nullable complex type and uses null as its 
default. I think the problem is a complex type whose value schema is itself nullable 
and whose default is null.

{code:javascript}
...
"name" : "meta__kvpairs",
    "type" : [ "null", {
      "type" : "map",
      "values" : [ "null", "string" ],
      "default" : null
    } ],
    "default" : null
  },
...
{code}
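
The null handling that the analysis says is missing can be sketched as follows. This is an illustration in Python, not the Java code in AvroResolver.java, and it is not the actual fix. The helper names (unwrap_nullable, type_name, resolve_map) and the "{key:value}" text rendering are hypothetical, and the field JSON is the fragment above with its braces closed so that it parses.

```python
import json

# Field definition modeled on the fragment above. The braces are closed here
# only so that the JSON parses; the rest of the schema is not reproduced.
FIELD_JSON = """
{
  "name": "meta__kvpairs",
  "type": ["null", {
    "type": "map",
    "values": ["null", "string"],
    "default": null
  }],
  "default": null
}
"""

COMPLEX_TYPES = {"array", "map", "record"}


def unwrap_nullable(schema):
    """Return (inner_schema, nullable) for a union of the form ["null", X]."""
    if isinstance(schema, list):
        non_null = [s for s in schema if s != "null"]
        if len(non_null) == 1 and len(non_null) < len(schema):
            return non_null[0], True
    return schema, False


def type_name(schema):
    """Name of an Avro type given as a string or as a {"type": ...} object."""
    return schema["type"] if isinstance(schema, dict) else schema


def resolve_map(schema, value):
    """Render a map as text (hypothetical format), passing nulls through."""
    inner, nullable = unwrap_nullable(schema)
    if value is None:
        if not nullable:
            raise ValueError("null value for a non-nullable type")
        return None  # the whole map is null
    _, value_nullable = unwrap_nullable(inner["values"])
    parts = []
    for key, item in value.items():
        if item is None and not value_nullable:
            raise ValueError("null map value for a non-nullable value type")
        parts.append(f"{key}:{'' if item is None else item}")
    return "{" + ",".join(parts) + "}"


field = json.loads(FIELD_JSON)
inner, nullable = unwrap_nullable(field["type"])
print(type_name(inner), nullable)
print(resolve_map(field["type"], None))
print(resolve_map(field["type"], {"a": "1", "b": None}))
```

The sketch checks for null at two levels: the outer union around the map, and the union inside the map's value schema. Both levels appear in the failing field.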

Create a PXF external table to read it:

{code:sql}
CREATE EXTERNAL TABLE meetmeevent_hdfs(
sourcetimestamp bigint,
sourceagent varchar(200),
sourceip_address varchar(20),
meetmeuser_id bigint,
meetmeinterested_uid bigint,
meetmevote varchar(4),
meetmeis_match int,
meetmenetwork_score bigint,
meetmeresponsiveness_score bigint,
meetmesession_id varchar(32),
meetmefriends int,
meetmeprevious_view varchar(32), 
meetmemodel bigint,
meetmescore float8,
meetmemethod int,
meetmecontributions varchar(120),
meetmeclicksource varchar(20),
meetmeclickaction varchar(20),
meetmeprofileview_ts bigint,
meetmeplatform varchar(20),
metatopic_name varchar(100),
metarequest_user_agent varchar(200),
metarequest_session_id varchar(50),
metarequest_id text,
metakvpairs varchar(1000),
meta_handlers varchar(1000)
)
LOCATION 
('pxf://dahdp2nn01.tag-dev.com:50070/data/ramblas/event_log/meetme/20150824/12/s_meetme.20.0.3492572.11600054147.1440442800000.avro?PROFILE=Avro&DATA-SCHEMA=/data/ramblas/schema/meetme')
FORMAT 'CUSTOM' (formatter='pxfwritable_import');
{code}

  was:
*Current Behavior*
Creating a PXF external table with an Avro schema that contains a complex nullable 
type with null as its default value fails.

*Expected Behavior*
Complex Avro types should be mapped to the proper HAWQ types, and the external table 
should be created and its data loaded without error.

*Analysis*
From a code review of 
[AvroResolver.java|https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-hdfs/src/main/java/com/pivotal/pxf/plugins/hdfs/AvroResolver.java],
the resolver does not handle the null case for these complex types:
- Array
- Map
- Record

The failure occurs when a union wraps a nullable complex type and uses null as its 
default. I think the problem is a complex type whose value schema is itself nullable 
and whose default is null.

{code:javascript}
...
"name" : "meta__kvpairs",
    "type" : [ "null", {
      "type" : "map",
      "values" : [ "null", "string" ],
      "default" : null
    } ],
    "default" : null
  },
...
{code}

Create a PXF external table to read it:

{code:sql}
CREATE EXTERNAL TABLE meetmeevent_hdfs(
sourcetimestamp bigint,
sourceagent varchar(200),
sourceip_address varchar(20),
meetmeuser_id numeric(15,0),
meetmeinterested_uid numeric(15,0),
meetmevote varchar(4),
meetmeis_match numeric(3,0),
meetmenetwork_score numeric (10,0),
meetmeresponsiveness_score numeric (10,0),
meetmesession_id varchar(32),
meetmefriends numeric(1,0),
meetmeprevious_view varchar(32), 
meetmemodel numeric(15,0),
meetmescore numeric(15,0),
meetmemethod numeric(1,0),
meetmecontributions varchar(120),
meetmeclicksource varchar(20),
meetmeclickaction varchar(20),
meetmeprofileview_ts bigint,
meetmeplatform varchar(20),
metatopic_name varchar(100),
metarequest_user_agent varchar(200),
metarequest_session_id varchar(50),
metarequest_id bigint,
metakvpairs varchar(1000),
meta_handlers varchar(1000),
dt varchar(6),
hour varchar(2)
)
LOCATION 
('pxf://dahdp2nn01.tag-dev.com:50070/data/ramblas/event_log/meetme/20150824/12/s_meetme.20.0.3492572.11600054147.1440442800000.avro?PROFILE=Avro&DATA-SCHEMA=/data/ramblas/schema/meetme')
FORMAT 'CUSTOM' (formatter='pxfwritable_import');
{code}


> Avro Union(complex) type containing nullable type and using null as default 
> failed
> ----------------------------------------------------------------------------------
>
>                 Key: HAWQ-53
>                 URL: https://issues.apache.org/jira/browse/HAWQ-53
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: PXF
>            Reporter: Goden Yao
>            Assignee: Shivram Mani
>         Attachments: meetme.avsc, meetme_example.json
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
