[jira] [Work logged] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

ASF GitHub Bot (Jira) Mon, 18 Oct 2021 04:38:04 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-25505?focusedWorklogId=666255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-666255
 ]


ASF GitHub Bot logged work on HIVE-25505:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Oct/21 11:37
            Start Date: 18/Oct/21 11:37
    Worklog Time Spent: 10m 
      Work Description: pgaref commented on pull request #2717:
URL: https://github.com/apache/hive/pull/2717#issuecomment-945677144


   > thanks @pgaref for the patch, the fix looks good to me! however, I can see 
that it only applies to LLAP IO codepaths, I'm wondering if we're facing the 
same issue in tez container mode (with TestMiniTezCliDriver)
   
   Thanks @abstractdog for taking a look -- just added tests for Tez container 
mode, for both plain and compressed files


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 666255)
    Time Spent: 1h 10m  (was: 1h)

> Incorrect results with header. skip.header.line.count if first line is blank
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-25505
>                 URL: https://issues.apache.org/jira/browse/HIVE-25505
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Steve Carlin
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> A table with header. skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table, and set header. skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
>  If you do select count(*) on it, you get 3 (incorrect)
> {code:java}
> CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
>   LOCATION '${system:test.tmp.dir}/testcase1'
>   TBLPROPERTIES ("skip.header.line.count"="1");
> SET hive.fetch.task.conversion = more;
> select * from testcase1;
> select count(*) from testcase1;
> set hive.fetch.task.conversion=none;
> select * from testcase1;
> select count(*) from testcase1;
> Test file:
> 1,2019-12-31
> 2,2019-12-31
> 3,2019-12-31
> Should both yield (with the above test file):
> #### A masked pattern was here ####
> 1     2019-12-31
> 2     2019-12-31
> 3     2019-12-31
> 3
> {code}
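
For illustration only, the expected behaviour described in the report can be
sketched outside Hive. This is not Hive code and not the fix in the pull
request; it assumes the intended semantics are "drop the first N physical
lines, whether or not they are blank". The function name and the sample data
are hypothetical.

```python
from typing import List


def rows_after_header_skip(lines: List[str], skip_count: int) -> List[str]:
    """Drop the first `skip_count` physical lines, blank or not.

    Assumption: a blank first line still counts as a header line.
    """
    return lines[skip_count:]


# Hypothetical file content: a blank first line, then two data lines,
# as in the reproduction steps of the report.
sample = ["", "1,2019-12-31", "2,2019-12-31"]

rows = rows_after_header_skip(sample, 1)
print(rows)       # rows a "select *" would be expected to return
print(len(rows))  # value a "select count(*)" would be expected to return
```

Under that assumption both queries should agree on the number of rows, which
is the consistency the report says is missing outside a fetch task.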



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
