[ 
https://issues.apache.org/jira/browse/IMPALA-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-9821:
---------------------------------
    Description: 
While the Binary implementation is ongoing, the ds_hll_sketch() and 
ds_hll_union() functions return serialized sketches in String format. Once 
Binary is available in Impala, these can return the serialized sketches in 
Binary format.

Currently, when sketches are written by Hive as BINARY to an ORC table and this 
table is loaded into Impala with the sketch columns declared as STRING, we get 
an error:
{code:java}
ERROR: Type mismatch: table column STRING is map to column binary in ORC file
{code}
Interestingly, this works with the Parquet format.

Once we have Binary support, make sure to add test coverage for an ORC table 
that is created and populated by Hive and then read by Impala to compute 
estimates from the sketches.


  was: While the Binary implementation is ongoing, the ds_hll_sketch() and 
ds_hll_union() functions return serialized sketches in String format. Once 
Binary is available in Impala, these can return the serialized sketches in 
Binary format.


> Rewrite ds_hll_sketch() and ds_hll_union() functions to return Binary
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-9821
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9821
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Gabor Kaszab
>            Priority: Major
>
> While the Binary implementation is ongoing, the ds_hll_sketch() and 
> ds_hll_union() functions return serialized sketches in String format. Once 
> Binary is available in Impala, these can return the serialized sketches in 
> Binary format.
> Currently, when sketches are written by Hive as BINARY to an ORC table and 
> this table is loaded into Impala with the sketch columns declared as STRING, 
> we get an error:
> {code:java}
> ERROR: Type mismatch: table column STRING is map to column binary in ORC file
> {code}
> Interestingly, this works with the Parquet format.
> Once we have Binary support, make sure to add test coverage for an ORC table 
> that is created and populated by Hive and then read by Impala to compute 
> estimates from the sketches.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
