[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2015-12-16 Thread Sivanesan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061630#comment-15061630
 ] 

Sivanesan commented on HIVE-5795:
-

But it was the other way around. The skip worked when the file was smaller 
than the block size, and a random detail record was skipped when the file was 
larger than the block size.

My assumption: when using CombineHiveInputFormat, a record near the end of the 
first block is skipped.
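
One way to check this assumption is to list the byte offsets Hive reports for 
rows near the block boundary and compare them with the source file. This is 
only a rough sketch: it assumes the example table from the issue description 
(testtable), an uncompressed text file, and a 128 MB HDFS block size 
(134217728 bytes; substitute the real value for your cluster).

{code}
-- Rows whose starting offset lies within about 1 KB of the assumed block boundary.
-- INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE are Hive virtual columns.
SELECT INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE, name, message
FROM testtable
WHERE BLOCK__OFFSET__INSIDE__FILE BETWEEN 134217728 - 1024 AND 134217728 + 1024
ORDER BY BLOCK__OFFSET__INSIDE__FILE;
{code}

If a line that exists in the file has no matching offset in the result, that 
would suggest the record is being dropped near the boundary.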

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
> Issue Type: New Feature
> Reporter: Shuaishuai Nie
> Assignee: Shuaishuai Nie
> Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch, HIVE-5795.5.patch
>
>
> Hive should be able to skip header and footer lines when reading a data file 
> for a table. This way, users do not need to process data generated by 
> another application with a header or footer, and can use the file directly 
> for table operations.
> To implement this, the idea is to add new properties to the table description 
> that define the number of header and footer lines, and to skip those lines 
> when reading records from the record reader. A DDL example for creating a 
> table with a header and footer looks like this:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.line.count"="1", 
> "skip.footer.line.count"="2");
> {code}
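
To confirm that the properties from the DDL above took effect, something like 
the following might help. It is a sketch that reuses the example table name 
testtable; the row limit is arbitrary.

{code}
-- List the properties recorded on the table; both skip.*.line.count
-- entries should appear with the values given at creation time.
SHOW TBLPROPERTIES testtable;

-- The properties can also be set on an existing table.
ALTER TABLE testtable SET TBLPROPERTIES ("skip.header.line.count"="1",
  "skip.footer.line.count"="2");

-- The first returned row should be the second line of the data file.
SELECT * FROM testtable LIMIT 5;
{code}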



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2015-06-28 Thread Sivanesan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604797#comment-14604797
 ] 

Sivanesan commented on HIVE-5795:
-

I agree with Prashant Kumar; I face this exact issue. I see it only when I use 
CombineHiveInputFormat, not when I use HiveInputFormat. Does this have 
something to do with InputSplit? Please help.
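
To reproduce the difference, the same query can be run under each input 
format in one session. This is a sketch that assumes the example table from 
the issue description (testtable) and a data file larger than one HDFS block.

{code}
-- Row count with the non-combining input format.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
SELECT COUNT(*) FROM testtable;

-- Row count with the combining input format; a different count here
-- would match the behavior described above.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
SELECT COUNT(*) FROM testtable;
{code}

Both counts should equal the number of lines in the file minus the configured 
header and footer lines.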



