[ 
https://issues.apache.org/jira/browse/HIVE-19943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523378#comment-16523378
 ] 

Liam De Lee edited comment on HIVE-19943 at 6/26/18 8:11 AM:
-------------------------------------------------------------

We are already working with ORC tables that hold gzip-compressed data.

I tried what you suggested, but I still get the header back. We also use 
external tables, and those cannot be stored as ORC because the data arrives as 
CSV files. We see the same problem there.

Could this be a problem on the HDInsight side that they need to fix? If so, I 
can raise a ticket with them to look into it, because in my opinion this is a 
fairly basic option.


was (Author: liam de lee):
We are already working with ORC tables that hold gzip-compressed data.

I tried what you suggested, but I still get the header back.

Could this be a problem on the HDInsight side that they need to fix? If so, I 
can raise a ticket with them to look into it, because in my opinion this is a 
fairly basic option.

> Header values keep showing up in result sets
> --------------------------------------------
>
>                 Key: HIVE-19943
>                 URL: https://issues.apache.org/jira/browse/HIVE-19943
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 2.1.0
>         Environment: HDInsight Hive Interactive Query
> [Components|https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-component-versioning#hadoop-components-available-with-different-hdinsight-versions]
>            Reporter: Liam De Lee
>            Priority: Major
>
> We use the tblproperties ("skip.header.line.count"="1") when creating 
> an external table.
> When we run select * from table, the result comes back as expected, without 
> the header row.
> However, when we run, for instance, a count(1), the header is included in the 
> count (verified by running select * from table and pasting the output into 
> Notepad to count the rows).
> Likewise, select distinct(column) from table returns the header as a 
> distinct value.
> File structure:
> ||_TESTING_TYPE||
> |adf|
> |hyg|
> |abc|
>  
> *Update: 26/06/2018*
> Create statement:
> {code:java}
> -----------------------------------
> --test_type--
> -----------------------------------
> CREATE EXTERNAL TABLE IF NOT EXISTS ext.test_type_in
>   (
>     test_type      string
>     )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\073'
> STORED AS TEXTFILE
> LOCATION 'adl://{adlslocation}data/data2/test'
> tblproperties ("skip.header.line.count"="1");
> {code}
>  Select statement:
> {code:java}
> select * from test_type_in;
> {code}
> Distinct statement:
> {code:java}
> select distinct test_type from test_type_in ORDER BY test_type;
> {code}
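> Count statement (a sketch reconstructed from the description above, not the 
> exact query that was run; the second query is a possible stopgap that assumes 
> the header text in the file is literally test_type):
> {code:sql}
> -- Reported to include the header row in the count.
> select count(1) from test_type_in;
>
> -- Hypothetical workaround: filter the header value out explicitly.
> -- Note: this also drops NULLs and any data row equal to 'test_type'.
> select count(1) from test_type_in where test_type <> 'test_type';
> {code}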
> I cannot show the exact statements because of an NDA, so I changed those 
> values to test.
>  
> I can also tell you that this happens not only in our HDInsight environment 
> but also at another company we work for. It does not matter what is in the 
> data either. For testing purposes:
> {code:java}
> test_type,abcg,gjeiza,aze,grriajj,gd,rrjri,vdju{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
