[ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
-----------------------------------
    Description: 
When reading a text table with vectorization enabled and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is 
map<string,struct<id:string,name:string>>, the \u0004 character appears in the 
output. Here is an example:

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table6` as
select
  'bob' as name,
  MAP(
    "Key1",
    ARRAY(
      1,
      2,
      3
    ),
    "Key2",
    ARRAY(
      4,
      5,
      6
    )
  ) as testmarks;

select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob·    {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.
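
To make the role of \u0004 concrete, here is a minimal sketch of how the row from the sample above is assumed to be laid out in the text file. It assumes the default LazySimpleSerDe behaviour of one control-character separator per nesting level (\u0001 for columns, \u0002 for map entries, \u0003 between key and value, \u0004 for the array elements inside the map value). The names SEPARATORS, encode and encode_row are hypothetical and only illustrate the layout; they are not Hive APIs.

```python
# Assumption: default text-format separators, one control character per nesting level.
SEPARATORS = ["\u0001", "\u0002", "\u0003", "\u0004", "\u0005"]


def encode(value, level):
    """Serialize a nested value using the separator assigned to its nesting level."""
    if isinstance(value, dict):
        # A map consumes two levels: one separator between entries,
        # the next one between a key and its value.
        entry_sep, kv_sep = SEPARATORS[level], SEPARATORS[level + 1]
        return entry_sep.join(
            encode(k, level + 2) + kv_sep + encode(v, level + 2)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        # An array consumes one level.
        return SEPARATORS[level].join(encode(v, level + 1) for v in value)
    return str(value)


def encode_row(columns):
    """Join top-level columns with the level-0 separator."""
    return SEPARATORS[0].join(encode(c, 1) for c in columns)


row = encode_row(["bob", {"Key1": [1, 2, 3], "Key2": [4, 5, 6]}])
# Print with the control characters escaped so they are visible.
print(row.encode("unicode_escape").decode("ascii"))
```

Under these assumptions the array elements nested inside the map value are separated by \u0004, so a reader that does not advance the separator level for the nested value cannot split them, which is consistent with the null values in the vectorized output.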

  was:
When reading a text table with vectorization enabled and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is 
map<string,struct<id:string,name:string>>, the \u0004 character appears in the 
output. Here is an example:

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table6` as
select
  'bob'                                           as name,
  MAP(
    "Key1",
    ARRAY(
      1,
      2,
      3
    ),
    "Key2",
    ARRAY(
    4,
    5,
    6
    )
  )                                               as testmarks;

select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob·    {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
 


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27662
>                 URL: https://issues.apache.org/jira/browse/HIVE-27662
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Raghav Aggarwal
>            Assignee: Raghav Aggarwal
>            Priority: Major
>
> When reading a text table with vectorization enabled and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema is 
> map<string,struct<id:string,name:string>>, the \u0004 character appears in 
> the output. Here is an example:
> Sample q file:
>  
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table6` as
> select
>   'bob' as name,
>   MAP(
>     "Key1",
>     ARRAY(
>       1,
>       2,
>       3
>     ),
>     "Key2",
>     ARRAY(
>       4,
>       5,
>       6
>     )
>   ) as testmarks;
> select * from table6;
> set hive.vectorized.execution.enabled=false;
> select * from table6; {code}
> Output of 1st select statement:
> {code:java}
> bob·    {"Key1":null,"Key2":null} {code}
> Output of 2nd select statement:
> {code:java}
> bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
>  
> The MAP complex type does not handle the case where it contains a nested 
> complex type such as STRUCT, ARRAY, or UNION.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
