Re: Review Request 16526: more usage of paths

2013-12-30 Thread Ashutosh Chauhan


> On Dec. 31, 2013, 6:40 a.m., Xuefu Zhang wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java, line 
> > 503
> > 
> >
> > There might be a tab/indentation problem.

will take care of this one.


> On Dec. 31, 2013, 6:40 a.m., Xuefu Zhang wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java, 
> > line 74
> > 
> >
> > Is it safer to do path.toUri().toString()? I saw it's done this way in 
> > other parts of the changes.

Actually, I had path.toUri().toString() in my first version of the patch, but 
that resulted in a few failures. On investigation, I found that the 
pathToPartitionInfo() map assumes path.toString() as its key (not 
path.toUri().toString()). This is evident from 
HiveFileFormatUtils#getPartitionDescFromPathRecursively(). The best way to 
avoid this confusion is to change pathToPartitionInfo() to a 
Map<Path, PartitionDesc>, but that's a bigger change since that data structure 
is used all over the place. I plan to make that change in the next patch in 
this series.
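For context, a minimal sketch of the key-mismatch problem (plain java.util.HashMap with String keys, not Hive's actual pathToPartitionInfo structure; the class and key values here are hypothetical): a map keyed on one string rendering of a path silently misses lookups done with any other rendering of the same location.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: string-keyed path maps match textually, not
// semantically, so the key must be produced the same way everywhere.
public class PathKeyMismatch {
    public static void main(String[] args) {
        Map<String, String> pathToPartitionInfo = new HashMap<>();

        // Key stored in one form (analogous to path.toString()).
        pathToPartitionInfo.put("hdfs://nn:8020/warehouse/t/part=1", "partDesc");

        // A slightly different rendering of the same location (e.g. a
        // trailing slash, or a differently encoded URI form) misses.
        System.out.println(
            pathToPartitionInfo.containsKey("hdfs://nn:8020/warehouse/t/part=1/")); // false

        // Only the exact same string form hits.
        System.out.println(
            pathToPartitionInfo.containsKey("hdfs://nn:8020/warehouse/t/part=1")); // true
    }
}
```

Keying the map on Path objects instead would make equality semantic rather than textual, which is why moving away from String keys is the bigger change.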


> On Dec. 31, 2013, 6:40 a.m., Xuefu Zhang wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java, 
> > line 180
> > 
> >
> > Same as above.

This is actually the other way around: inputPaths() was List<String>; now it's 
List<Path>.


> On Dec. 31, 2013, 6:40 a.m., Xuefu Zhang wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java, 
> > line 190
> > 
> >
> > Same as above.

This is actually the other way around: inputPaths() was List<String>; now it's 
List<Path>.
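As a sketch of that refactor direction (using java.nio.file.Path as a stdlib stand-in for Hadoop's org.apache.hadoop.fs.Path; the class and method names here are hypothetical, not the patch's actual code), the idea is to convert raw strings to typed paths once at the boundary and pass the typed objects around:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: replace List<String> plumbing with List<Path>,
// converting once instead of re-parsing strings at every call site.
public class InputPathsRefactor {
    // Old shape: callers juggled raw strings.
    static List<String> inputPathStrings() {
        return Arrays.asList("/warehouse/t/part=1", "/warehouse/t/part=2");
    }

    // New shape: typed paths, converted at one boundary.
    static List<Path> inputPaths() {
        return inputPathStrings().stream()
                .map(Paths::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Path p : inputPaths()) {
            System.out.println(p.getFileName()); // part=1, part=2
        }
    }
}
```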


> On Dec. 31, 2013, 6:40 a.m., Xuefu Zhang wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java, line 35
> > 
> >
> > Could you confirm that tmpDir isn't used? The old code seemed to spend 
> > quite some effort generating the temp directory.

Yup, I did verify that it's not used anywhere. Pure dead code. Even worse, it 
was unnecessarily made part of the explain output, which resulted in changes 
to all those .q.out files.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16526/#review30995
---


On Dec. 31, 2013, 12:42 a.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16526/
> ---
> 
> (Updated Dec. 31, 2013, 12:42 a.m.)
> 
> 
> Review request for hive, Xuefu Zhang and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6121
> https://issues.apache.org/jira/browse/HIVE-6121
> 
> 
> Repository: hive
> 
> 
> Description
> ---
> 
> Refactoring patch.
> 
> 
> Diffs
> -
> 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1554326 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 
> 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanWork.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateWork.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
> 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/opt

[jira] [Commented] (HIVE-6121) Use Paths Consistently - IV

2013-12-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859370#comment-13859370
 ] 

Xuefu Zhang commented on HIVE-6121:
---

Sorry for the delayed +1 for HIVE-6116, as I was travelling today.

The patch for this JIRA looks good. This is great work that's important for 
keeping the Hive code clean. I do have a few minor comments/questions on the 
review board.

> Use Paths Consistently - IV
> ---
>
> Key: HIVE-6121
> URL: https://issues.apache.org/jira/browse/HIVE-6121
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6121.2.patch, HIVE-6121.patch
>
>
> Next one in patch series to fix Hive to use paths consistently.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 16526: more usage of paths

2013-12-30 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16526/#review30995
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java


There might be a tab/indentation problem.



trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java


Is it safer to do path.toUri().toString()? I saw it's done this way in 
other parts of the changes.



trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java


Same as above.



trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java


Same as above.



trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java


Could you confirm that tmpDir isn't used? The old code seemed to spend 
quite some effort generating the temp directory.


- Xuefu Zhang


On Dec. 31, 2013, 12:42 a.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16526/
> ---
> 
> (Updated Dec. 31, 2013, 12:42 a.m.)
> 
> 
> Review request for hive, Xuefu Zhang and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6121
> https://issues.apache.org/jira/browse/HIVE-6121
> 
> 
> Repository: hive
> 
> 
> Description
> ---
> 
> Refactoring patch.
> 
> 
> Diffs
> -
> 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1554326 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 
> 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanWork.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateWork.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
> 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTablePartMergeFilesDesc.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java
>  1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ArchiveWork.java 1554326 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
>  1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CopyWork.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 1554326 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadMultiFilesDesc

[jira] [Commented] (HIVE-6116) Use Paths consistently III

2013-12-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859357#comment-13859357
 ] 

Xuefu Zhang commented on HIVE-6116:
---

+1

> Use Paths consistently III
> --
>
> Key: HIVE-6116
> URL: https://issues.apache.org/jira/browse/HIVE-6116
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6116.2.patch, HIVE-6116.3.patch, HIVE-6116.patch
>
>
> Another one in patch series to make use of Paths consistently.





Re: Review Request 16502: Use paths consisently - 3

2013-12-30 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16502/#review30993
---

Ship it!


Ship It!

- Xuefu Zhang


On Dec. 30, 2013, 6:29 p.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16502/
> ---
> 
> (Updated Dec. 30, 2013, 6:29 p.m.)
> 
> 
> Review request for hive, Xuefu Zhang and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6116
> https://issues.apache.org/jira/browse/HIVE-6116
> 
> 
> Repository: hive
> 
> 
> Description
> ---
> 
> Refactor patch.
> 
> 
> Diffs
> -
> 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1554291 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
> 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
> 1554291 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
>  1554291 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
>  1554291 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
>  1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 
> 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 1554291 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java 
> 1554291 
> 
> Diff: https://reviews.apache.org/r/16502/diff/
> 
> 
> Testing
> ---
> 
> No new tests. Refactor only patch. Regression suite suffices.
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>



[jira] [Commented] (HIVE-6083) User provided table properties are not assigned to the TableDesc of the FileSinkDesc in a CTAS query

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859340#comment-13859340
 ] 

Hive QA commented on HIVE-6083:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620896/HIVE-6083.2.patch.txt

{color:green}SUCCESS:{color} +1 4818 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/775/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/775/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620896

> User provided table properties are not assigned to the TableDesc of the 
> FileSinkDesc in a CTAS query
> 
>
> Key: HIVE-6083
> URL: https://issues.apache.org/jira/browse/HIVE-6083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Yin Huai
>Assignee: Yin Huai
> Attachments: HIVE-6083.1.patch.txt, HIVE-6083.2.patch.txt
>
>
> I was trying to use a CTAS query to create a table stored as ORC with 
> orc.compress set to SNAPPY. However, the table was still compressed with 
> ZLIB (although the result of DESCRIBE still shows that this table is 
> compressed with SNAPPY). For a CTAS query, SemanticAnalyzer.genFileSinkPlan 
> uses CreateTableDesc to generate the TableDesc for the FileSinkDesc by 
> calling PlanUtils.getTableDesc. However, in PlanUtils.getTableDesc, I do not 
> see user-provided table properties being assigned to the returned TableDesc 
> (CreateTableDesc.getTblProps is not called in this method).
> BTW, I only checked the code of 0.12 and trunk.
> Two examples:
> * Snappy compression
> {code}
> create table web_sales_wrong_orc_snappy
> stored as orc tblproperties ("orc.compress"="SNAPPY")
> as select * from web_sales;
> {code}
> {code}
> describe formatted web_sales_wrong_orc_snappy;
> 
> Location: 
> hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_snappy
> Table Type:   MANAGED_TABLE
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   true
>   numFiles1   
>   numRows 719384  
>   orc.compressSNAPPY  
>   rawDataSize 97815412
>   totalSize   40625243
>   transient_lastDdlTime   1387566015   
>    
> {code}
> {code}
> bin/hive --orcfiledump 
> /user/hive/warehouse/web_sales_wrong_orc_snappy/00_0
> Rows: 719384
> Compression: ZLIB
> Compression size: 262144
> ...
> {code}
> * No compression
> {code}
> create table web_sales_wrong_orc_none
> stored as orc tblproperties ("orc.compress"="NONE")
> as select * from web_sales;
> {code}
> {code}
> describe formatted web_sales_wrong_orc_none;
> 
> Location: 
> hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_none  
> Table Type:   MANAGED_TABLE
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   true
>   numFiles1   
>   numRows 719384  
>   orc.compressNONE
>   rawDataSize 97815412
>   totalSize   40625243
>   transient_lastDdlTime   1387566064   
>    
> {code}
> {code}
> bin/hive --orcfiledump /user/hive/warehouse/web_sales_wrong_orc_none/00_0
> Rows: 719384
> Compression: ZLIB
> Compression size: 262144
> ...
> {code}
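A minimal sketch of the fix idea behind this issue (plain java.util.Properties, not Hive's actual PlanUtils code; the class name is hypothetical): the user-supplied tblproperties need to be merged over the format defaults when building the sink's table properties, so that "orc.compress"="SNAPPY" actually takes effect.

```java
import java.util.Properties;

// Hypothetical illustration: later putAll wins, so applying the user's
// tblproperties after the defaults lets them override "orc.compress".
public class TblPropsMerge {
    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("orc.compress", "ZLIB"); // format default

        Properties userTblProps = new Properties();
        userTblProps.setProperty("orc.compress", "SNAPPY"); // from CTAS DDL

        Properties effective = new Properties();
        effective.putAll(defaults);
        effective.putAll(userTblProps); // user-provided values win

        System.out.println(effective.getProperty("orc.compress")); // SNAPPY
    }
}
```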





[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859316#comment-13859316
 ] 

Lefty Leverenz commented on HIVE-5795:
--

bq.  How do I add the new parameters to the wikidoc?

Just add a release note to this ticket and I'll put the information in the 
wikidoc after the patch gets committed.  Or you could do it yourself, as long 
as you remember to say it applies to version 0.13.0 and later.

* hive.file.max.footer goes in Configuration Properties in the [Query 
Execution|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryExecution]
 section
* skip.header.line.count & skip.footer.line.count need a new section somewhere, 
most likely in the Create Table section of the 
[DDL|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL] doc 
-- should there also be a link from the 
[Select|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select] 
doc or one of the SerDe docs 
([HiveSerDe|https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-HiveSerDe]
 in the Developer Guide or 
[SerDe|https://cwiki.apache.org/confluence/display/Hive/SerDe])?

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>
> Hive should be able to skip header and footer lines when reading a data file 
> for a table. This way, users don't need to preprocess data generated by other 
> applications with a header or footer and can use the file directly for table 
> operations.
> To implement this, the idea is to add new properties to the table description 
> that define the number of header and footer lines and skip them when reading 
> records from the record reader. A DDL example for creating a table with a 
> header and footer:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.line.count"="1", 
> "skip.footer.line.count"="2");
> {code}
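For illustration only, a self-contained sketch of the skipping idea (not Hive's actual record-reader code; the class and method names are hypothetical): header lines are dropped up front, and footer lines are handled by buffering, since a line can only be emitted once enough lines have followed it to prove it is not part of the footer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of header/footer skipping with a small buffer.
public class SkipHeaderFooter {
    static List<String> readSkipping(List<String> lines, int header, int footer) {
        List<String> out = new ArrayList<>();
        Deque<String> buffer = new ArrayDeque<>();
        int seen = 0;
        for (String line : lines) {
            if (seen++ < header) {
                continue; // drop header lines
            }
            buffer.addLast(line);
            if (buffer.size() > footer) {
                // Safe to emit: `footer` lines follow it, so it isn't footer.
                out.add(buffer.removeFirst());
            }
        }
        return out; // the last `footer` lines stay buffered, never emitted
    }

    public static void main(String[] args) {
        List<String> file = Arrays.asList("h", "r1", "r2", "r3", "f1", "f2");
        System.out.println(readSkipping(file, 1, 2)); // [r1, r2, r3]
    }
}
```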





[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859294#comment-13859294
 ] 

Hive QA commented on HIVE-5795:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620881/HIVE-5795.4.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 4818 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hive.hcatalog.listener.TestNotificationListener.testAMQListener
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/774/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/774/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620881

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>
> Hive should be able to skip header and footer lines when reading a data file 
> for a table. This way, users don't need to preprocess data generated by other 
> applications with a header or footer and can use the file directly for table 
> operations.
> To implement this, the idea is to add new properties to the table description 
> that define the number of header and footer lines and skip them when reading 
> records from the record reader. A DDL example for creating a table with a 
> header and footer:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.line.count"="1", 
> "skip.footer.line.count"="2");
> {code}





Re: How do you run single query test(s) after mavenization?

2013-12-30 Thread Lefty Leverenz
Wow, immediate gratification.  Thanks very much, Brock, and Happy New Year
everybody!

-- Lefty


On Mon, Dec 30, 2013 at 7:27 AM, Brock Noland  wrote:

> On Mon, Nov 18, 2013 at 2:21 PM, Lefty Leverenz  >wrote:
>
> > Thanks for the typo alert Remus, I've changed -Dcase=TestCliDriver to
> > -Dtest=TestCliDriver.
> >
>
> Thank you for this!!
>
>
> >
> > But HowToContribute<
> >
> https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute
> > >still
> > has several instances of "ant" that should be changed to "mvn" --
> > some are simple replacements but others might need additional changes:
>
>
> >- Check for new Checkstyle 
> > violations
> >by running ant checkstyle, ...  [mvn checkstyle?]
> >
>
> We have not implemented checkstyle on maven yet. I created
> https://issues.apache.org/jira/browse/HIVE-6123
>
>
> >- Define methods within your class whose names begin with test, and
> call
> >JUnit's many assert methods to verify conditions; these methods will
> be
> >executed when you run ant test.  [simple replacement]
> >- (2 ants) We can run "ant test -Dtestcase=TestAbc" where TestAbc is
> the
> >name of the new class. This will test only the new testcase, which
> will
> > be
> >faster than "ant test" which tests all testcases.  [change ant to mvn
> >twice; also change -Dtestcase to -Dtest?]
> >- Folks should run ant clean package test before selecting *Submit
> > Patch*.
> > [mvn clean package?]
> >
>
> I have updated the above.
>
>
> >
> > The rest of the "ant" instances are okay because the MVN section
> afterwards
> > gives the alternative, but should we keep ant or make the replacements?
> >
> >- 9.  Now you can run the ant 'thriftif' target ...
> >- 11.  ant thriftif -Dthrift.home=...
> >- 15.  ant thriftif
> >- 18. ant clean package
> >- The maven equivalent of ant thriftif is:
> >
> > mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local
> >
> >
> >
> I have not generated the thrift stuff recently. It would be great if Alan
> or someone else who has would update this section.
>
> Thank you!!
>


[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Shuaishuai Nie (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859255#comment-13859255
 ] 

Shuaishuai Nie commented on HIVE-5795:
--

Thanks [~leftylev]. Changed the parameter name in the JIRA description. How do 
I add the new parameters to the wikidoc?

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>
> Hive should be able to skip header and footer lines when reading a data file 
> for a table. This way, users don't need to preprocess data generated by other 
> applications with a header or footer and can use the file directly for table 
> operations.
> To implement this, the idea is to add new properties to the table description 
> that define the number of header and footer lines and skip them when reading 
> records from the record reader. A DDL example for creating a table with a 
> header and footer:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.line.count"="1", 
> "skip.footer.line.count"="2");
> {code}





[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-5795:
-

Description: 
Hive should be able to skip header and footer lines when reading a data file 
for a table. This way, users don't need to preprocess data generated by other 
applications with a header or footer and can use the file directly for table 
operations.
To implement this, the idea is to add new properties to the table description 
that define the number of header and footer lines and skip them when reading 
records from the record reader. A DDL example for creating a table with a 
header and footer:
{code}
Create external table testtable (name string, message string) row format 
delimited fields terminated by '\t' lines terminated by '\n' location 
'/testtable' tblproperties ("skip.header.line.count"="1", 
"skip.footer.line.count"="2");
{code}

  was:
Hive should be able to skip header and footer lines when reading a data file 
for a table. This way, users don't need to preprocess data generated by other 
applications with a header or footer and can use the file directly for table 
operations.
To implement this, the idea is to add new properties to the table description 
that define the number of header and footer lines and skip them when reading 
records from the record reader. A DDL example for creating a table with a 
header and footer:
{code}
Create external table testtable (name string, message string) row format 
delimited fields terminated by '\t' lines terminated by '\n' location 
'/testtable' tblproperties ("skip.header.number"="1", "skip.footer.number"="2");
{code}


> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>
> Hive should be able to skip header and footer lines when reading a data file 
> for a table. This way, users don't need to preprocess data generated by other 
> applications with a header or footer and can use the file directly for table 
> operations.
> To implement this, the idea is to add new properties to the table description 
> that define the number of header and footer lines and skip them when reading 
> records from the record reader. A DDL example for creating a table with a 
> header and footer:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.line.count"="1", 
> "skip.footer.line.count"="2");
> {code}





[jira] [Updated] (HIVE-6083) User provided table properties are not assigned to the TableDesc of the FileSinkDesc in a CTAS query

2013-12-30 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-6083:
---

Status: Patch Available  (was: Open)

> User provided table properties are not assigned to the TableDesc of the 
> FileSinkDesc in a CTAS query
> 
>
> Key: HIVE-6083
> URL: https://issues.apache.org/jira/browse/HIVE-6083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Yin Huai
>Assignee: Yin Huai
> Attachments: HIVE-6083.1.patch.txt, HIVE-6083.2.patch.txt
>
>
> I was trying to use a CTAS query to create a table stored as ORC with 
> orc.compress set to SNAPPY. However, the table was still compressed with 
> ZLIB (although the result of DESCRIBE still shows that this table is 
> compressed with SNAPPY). For a CTAS query, SemanticAnalyzer.genFileSinkPlan 
> uses CreateTableDesc to generate the TableDesc for the FileSinkDesc by 
> calling PlanUtils.getTableDesc. However, in PlanUtils.getTableDesc, I do not 
> see user-provided table properties being assigned to the returned TableDesc 
> (CreateTableDesc.getTblProps is not called in this method).
> BTW, I only checked the code of 0.12 and trunk.
> Two examples:
> * Snappy compression
> {code}
> create table web_sales_wrong_orc_snappy
> stored as orc tblproperties ("orc.compress"="SNAPPY")
> as select * from web_sales;
> {code}
> {code}
> describe formatted web_sales_wrong_orc_snappy;
> 
> Location: 
> hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_snappy
> Table Type:   MANAGED_TABLE
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   true
>   numFiles1   
>   numRows 719384  
>   orc.compressSNAPPY  
>   rawDataSize 97815412
>   totalSize   40625243
>   transient_lastDdlTime   1387566015   
>    
> {code}
> {code}
> bin/hive --orcfiledump 
> /user/hive/warehouse/web_sales_wrong_orc_snappy/00_0
> Rows: 719384
> Compression: ZLIB
> Compression size: 262144
> ...
> {code}
> * No compression
> {code}
> create table web_sales_wrong_orc_none
> stored as orc tblproperties ("orc.compress"="NONE")
> as select * from web_sales;
> {code}
> {code}
> describe formatted web_sales_wrong_orc_none;
> 
> Location: 
> hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_none  
> Table Type:   MANAGED_TABLE
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   true
>   numFiles1   
>   numRows 719384  
>   orc.compressNONE
>   rawDataSize 97815412
>   totalSize   40625243
>   transient_lastDdlTime   1387566064   
>    
> {code}
> {code}
> bin/hive --orcfiledump /user/hive/warehouse/web_sales_wrong_orc_none/00_0
> Rows: 719384
> Compression: ZLIB
> Compression size: 262144
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6083) User provided table properties are not assigned to the TableDesc of the FileSinkDesc in a CTAS query

2013-12-30 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-6083:
---

Attachment: HIVE-6083.2.patch.txt

Let me trigger HiveQA again.

> User provided table properties are not assigned to the TableDesc of the 
> FileSinkDesc in a CTAS query
> 
>
> Key: HIVE-6083
> URL: https://issues.apache.org/jira/browse/HIVE-6083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Yin Huai
>Assignee: Yin Huai
> Attachments: HIVE-6083.1.patch.txt, HIVE-6083.2.patch.txt
>
>





Review Request 16531: HIVE-6083: User provided table properties are not assigned to the TableDesc of the FileSinkDesc in a CTAS query

2013-12-30 Thread Yin Huai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16531/
---

Review request for hive.


Bugs: HIVE-6083
https://issues.apache.org/jira/browse/HIVE-6083


Repository: hive-git


Description
---

User provided table properties are not assigned to the TableDesc of the 
FileSinkDesc in a CTAS query


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java cdd8b9c 

Diff: https://reviews.apache.org/r/16531/diff/


Testing
---


Thanks,

Yin Huai



[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859243#comment-13859243
 ] 

Lefty Leverenz commented on HIVE-5795:
--

When this is committed, we'll need documentation for:

* hive.file.max.footer -- new config parameter
* skip.header.line.count -- serde property for table
* skip.footer.line.count -- serde property for table

The patch documents hive.file.max.footer in hive-default.xml.template, so the 
parameter just needs to be added to the wikidoc. Note the property names: they 
are skip.xxx.line.count, not skip.xxx.number as shown in this ticket's 
description.
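
A minimal DDL sketch for the wikidoc, using the final property names (the table 
name and layout are illustrative, adapted from this ticket's own example):

{code}
create external table testtable (name string, message string)
row format delimited fields terminated by '\t'
lines terminated by '\n'
location '/testtable'
tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2");
{code}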

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>
> Hive should be able to skip header and footer lines when reading the data file 
> for a table. This way, users don't need to preprocess data generated by other 
> applications with a header or footer and can use the file directly for table 
> operations.
> To implement this, the idea is to add new properties to the table description 
> that define the number of header and footer lines, and to skip those lines when 
> reading records from the record reader. A DDL example for creating a table with 
> a header and footer would look like this:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.number"="1", 
> "skip.footer.number"="2");
> {code}





[jira] [Updated] (HIVE-6083) User provided table properties are not assigned to the TableDesc of the FileSinkDesc in a CTAS query

2013-12-30 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-6083:
---

Status: Open  (was: Patch Available)

> User provided table properties are not assigned to the TableDesc of the 
> FileSinkDesc in a CTAS query
> 
>
> Key: HIVE-6083
> URL: https://issues.apache.org/jira/browse/HIVE-6083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Yin Huai
>Assignee: Yin Huai
> Attachments: HIVE-6083.1.patch.txt
>
>





[jira] [Commented] (HIVE-6121) Use Paths Consistently - IV

2013-12-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859214#comment-13859214
 ] 

Ashutosh Chauhan commented on HIVE-6121:


[~xuefuz] This one is ready for review as well. The above failure is a minor 
.q.out file update. RB request: https://reviews.apache.org/r/16526/ Currently it 
includes the HIVE-6116 patch, since that one is not yet committed on trunk 
(waiting for a +1 on the latest patch, which only has whitespace changes). But 
that shouldn't be a problem for this review.

> Use Paths Consistently - IV
> ---
>
> Key: HIVE-6121
> URL: https://issues.apache.org/jira/browse/HIVE-6121
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6121.2.patch, HIVE-6121.patch
>
>
> Next one in patch series to fix Hive to use paths consistently.





Review Request 16526: more usage of paths

2013-12-30 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16526/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-6121
https://issues.apache.org/jira/browse/HIVE-6121


Repository: hive


Description
---

Refactoring patch.


Diffs
-

  
trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
 1554326 
  
trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 
1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanWork.java
 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java
 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java
 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateWork.java
 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java 
1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTablePartMergeFilesDesc.java
 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ArchiveWork.java 1554326 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CopyWork.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadMultiFilesDesc.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/TruncateTableDesc.java 
1554326 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java 1554326 
  trunk/ql/src/test/results/clientpositive/binary_output_format.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucket1.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucket2.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucket3.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucket4.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucket5.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin4.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin5.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 
1554326 
  trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 
1554326 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr.q.out 1554326 
  trunk/ql/src/test/results/clientpositive/groupby_map

[jira] [Commented] (HIVE-6121) Use Paths Consistently - IV

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859207#comment-13859207
 ] 

Hive QA commented on HIVE-6121:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620879/HIVE-6121.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4818 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union22
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/773/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/773/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620879

> Use Paths Consistently - IV
> ---
>
> Key: HIVE-6121
> URL: https://issues.apache.org/jira/browse/HIVE-6121
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6121.2.patch, HIVE-6121.patch
>
>
> Next one in patch series to fix Hive to use paths consistently.





[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859188#comment-13859188
 ] 

Thejas M Nair commented on HIVE-5795:
-

+1

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>





[jira] [Commented] (HIVE-6017) Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive

2013-12-30 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859172#comment-13859172
 ] 

Jitendra Nath Pandey commented on HIVE-6017:


The code looks good to me. +1
It seems the copyright needs to be mentioned in the NOTICE file as well, 
although I am not an expert on these rules. Please also refer to 
http://www.apache.org/licenses/ to comply with the guidelines when submitting 
code with an employer copyright or third-party code. Does it require a Software 
Grant Agreement (SGA) with the PMC?


> Contribute Decimal128 high-performance decimal(p, s) package from Microsoft 
> to Hive
> ---
>
> Key: HIVE-6017
> URL: https://issues.apache.org/jira/browse/HIVE-6017
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.13.0
>Reporter: Eric Hanson
>Assignee: Eric Hanson
> Attachments: HIVE-6017.01.patch, HIVE-6017.02.patch, 
> HIVE-6017.03.patch, HIVE-6017.04.patch
>
>
> Contribute the Decimal128 high-performance decimal package developed by 
> Microsoft to Hive. This was originally written for Microsoft PolyBase by 
> Hideaki Kimura.
> This code is about 8X more efficient than Java BigDecimal for typical 
> operations. It uses a finite (128 bit) precision and can handle up to 
> decimal(38, X). It is also "mutable" so you can change the contents of an 
> existing object. This helps reduce the cost of new() and garbage collection.
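
To illustrate the mutability point, here is a generic sketch (this is not the 
actual Decimal128 API; MutableDecimal below is a hypothetical fixed-scale 
accumulator): java.math.BigDecimal allocates a new object on every operation, 
while a mutable accumulator updates one object in place.

```java
import java.math.BigDecimal;

class MutableSumSketch {
    // Immutable style: each add() allocates a fresh BigDecimal.
    static BigDecimal sumImmutable(BigDecimal[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);  // a new object on every iteration
        }
        return sum;
    }

    // Mutable style (hypothetical): one object updated in place, which is the
    // kind of allocation and GC saving the Decimal128 description refers to.
    static final class MutableDecimal {
        long unscaled;  // illustrative fixed-scale (scale 2) representation
        void addUnscaled(long v) { unscaled += v; }
    }

    public static void main(String[] args) {
        BigDecimal[] values = { new BigDecimal("1.25"), new BigDecimal("2.75") };
        System.out.println(sumImmutable(values));  // prints 4.00

        MutableDecimal acc = new MutableDecimal();
        acc.addUnscaled(125);  // 1.25 at scale 2
        acc.addUnscaled(275);  // 2.75 at scale 2
        System.out.println(acc.unscaled);          // prints 400
    }
}
```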





Hive-trunk-h0.21 - Build # 2534 - Still Failing

2013-12-30 Thread Apache Jenkins Server
Changes for Build #2493
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #2494
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #2495
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #2496
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5994 : ORC RLEv2 encodes wrongly for large negative BIGINTs  
(64 bits ) (Prasanth J via Owen Omalley)

[hashutosh] HIVE-5991 : ORC RLEv2 fails with ArrayIndexOutOfBounds exception 
for PATCHED_BLOB encoding (Prasanth J via Owen Omalley)

[prasadm] HIVE-4395: Support TFetchOrientation.FIRST for HiveServer2 
FetchResults (Prasad Mujumdar reviewed by Thejas Nair)

[ehans] HIVE-5756: Implement vectorized support for IF conditional expression 
(Eric Hanson)

[hashutosh] HIVE-6018 : FetchTask should not reference metastore classes (Navis 
via Prasad Mujumdar)

[hashutosh] HIVE-5979. Failure in cast to timestamps. (Jitendra Pandey)

[hashutosh] HIVE-5897 : Fix hadoop2 execution environment Milestone 2 (Vikram 
Dixit via Brock Noland)


Changes for Build #2497

Changes for Build #2498
[hashutosh] HIVE-6004 : Fix statistics annotation related test failures in 
hadoop2 (Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-6027 : non-vectorized log10 has rounding issue (Sergey 
Shelukhin via Ashutosh Chauhan)

[prasadm] HIVE-5993: JDBC Driver should not hard-code the database name (Szehon 
Ho via Prasad Mujumdar)


Changes for Build #2499
[navis] HIVE-5985 : Make qfile_regex to accept multiple patterns (Navis 
reviewed by Ashutosh Chauhan)


Changes for Build #2500

Changes for Build #2501

Changes for Build #2502
[navis] HIVE-5276 : Skip redundant string encoding/decoding for hiveserver2 
(Navis Reviewed by Carl Steinbach)


Changes for Build #2503
[xuefu] HIVE-6022: Load statements with incorrect order of partitions put input 
files to unreadable places (Teruyoshi Zenmyo via Xuefu)


Changes for Build #2504

Changes for Build #2505
[thejas] HIVE-5975 : [WebHCat] templeton mapreduce job failed if provide 
"define" parameters (Shanyu Zhao via Thejas Nair)


Changes for Build #2506
[prasadm] HIVE-1466: Add NULL DEFINED AS to ROW FORMAT specification (Prasad 
Mujumdar reviewed by Xuefu Zhang)


Changes for Build #2507
[jitendra] HIVE-5521 : Remove CommonRCFileInputFormat. (hashutosh via jitendra)

[rhbutani] HIVE-5973 SMB joins produce incorrect results with multiple 
partitions and buckets (Vikram Dixit via Harish Butani)

[ehans] HIVE-6015: vectorized logarithm produces results for 0 that are 
different from a non-vectorized one (Sergey Shelukhin via Eric Hanson)


Changes for Build #2508
[brock] HIVE-5812 - HiveServer2 SSL connection transport binds to loopback 
address by default (Prasad Mujumdar via Brock Noland)


Changes for Build #2509
[hashutosh] HIVE-5936 : analyze command failing to collect stats with counter 
mechanism (Navis via Ashutosh Chauhan)


Changes for Build #2510
[thejas] HIVE-5230 : Better error reporting by async threads in HiveServer2 
(Vaibhav Gumashta via Thejas Nair)


Changes for Build #2511
[navis] HIVE-5879 : Fix spelling errors in hive-default.xml.template (Lefty 
Leverenz via Navis)


Changes for Build #2512

Changes for Build #2513
[xuefu] HIVE-6021: Problem in GroupByOperator for handling distinct aggrgations 
(Sun Rui via Xuefu)


Changes for Build #2514
[prasadm] HIVE-6036: A test case for embedded beeline - with URL 
jdbc:hive2:///default (Anandha L Ranganathan via Prasad Mujumdar)

[prasadm] HIVE-4256: JDBC2 HiveConnection does not use the specified database 
(Anandha L Ranganathan via Prasad Mujumdar)


Changes for Build #2515
[brock] HIVE-5966 - Fix eclipse:eclipse post shim aggregation changes (Szehon 
Ho via Brock Noland)


Changes for Build #2516
[daijy] HIVE-5540: webhcat e2e test failures: "Expe

Re: Review Request 16184: Hive should be able to skip header and footer rows when reading data file for a table (HIVE-5795)

2013-12-30 Thread Shuaishuai Nie

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16184/
---

(Updated Dec. 30, 2013, 10:41 p.m.)


Review request for hive, Eric Hanson and Thejas Nair.


Changes
---

Fixed the patch based on Thejas's comment


Bugs: hive-5795
https://issues.apache.org/jira/browse/hive-5795


Repository: hive-git


Description
---

Hive should be able to skip header and footer rows when reading data file for a 
table
(follow up with review https://reviews.apache.org/r/15663/diff/#index_header)


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2ddb08f 
  conf/hive-default.xml.template b94013a 
  data/files/header_footer_table_1/0001.txt PRE-CREATION 
  data/files/header_footer_table_1/0002.txt PRE-CREATION 
  data/files/header_footer_table_1/0003.txt PRE-CREATION 
  data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION 
  data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION 
  data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION 
  itests/qtest/pom.xml 88e0890 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 03fd30a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java fc9b7e4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FooterBuffer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java daf4e4a 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java 
dd5cb6b 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 974a5d6 
  
ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java 
85dd975 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 
0686d9b 
  ql/src/test/queries/clientnegative/file_with_header_footer_negative.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION 
  ql/src/test/results/clientnegative/file_with_header_footer_negative.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/file_with_header_footer.q.out PRE-CREATION 
  serde/if/serde.thrift 2ceb572 
  
serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java
 22a6168 

Diff: https://reviews.apache.org/r/16184/diff/


Testing
---


Thanks,

Shuaishuai Nie



[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-30 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-5795:
-

Attachment: HIVE-5795.4.patch

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch
>
>





[jira] [Updated] (HIVE-6121) Use Paths Consistently - IV

2013-12-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6121:
---

Status: Patch Available  (was: Open)

> Use Paths Consistently - IV
> ---
>
> Key: HIVE-6121
> URL: https://issues.apache.org/jira/browse/HIVE-6121
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6121.2.patch, HIVE-6121.patch
>
>
> Next one in patch series to fix Hive to use paths consistently.





[jira] [Updated] (HIVE-6121) Use Paths Consistently - IV

2013-12-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6121:
---

Status: Open  (was: Patch Available)

> Use Paths Consistently - IV
> ---
>
> Key: HIVE-6121
> URL: https://issues.apache.org/jira/browse/HIVE-6121
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6121.2.patch, HIVE-6121.patch
>
>
> Next one in patch series to fix Hive to use paths consistently.





[jira] [Updated] (HIVE-6121) Use Paths Consistently - IV

2013-12-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6121:
---

Attachment: HIVE-6121.2.patch

> Use Paths Consistently - IV
> ---
>
> Key: HIVE-6121
> URL: https://issues.apache.org/jira/browse/HIVE-6121
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6121.2.patch, HIVE-6121.patch
>
>
> Next one in patch series to fix Hive to use paths consistently.





[jira] [Commented] (HIVE-5945) ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those tables which are not used in the child of this conditional task.

2013-12-30 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859114#comment-13859114
 ] 

Yin Huai commented on HIVE-5945:


Thanks Navis :) I played with your patch and found an issue, which I commented 
on at the review board. I am also attaching more info here. For the query in the 
description, we can have 4 map-joins. There will be 3 different intermediate 
tables called $INTNAME. The current patch does not update the size of $INTNAME.

Here are logs.
{code}
13/12/30 16:48:25 INFO ql.Driver: MapReduce Jobs Launched: 
Job 0: Map: 1   Cumulative CPU: 12.76 sec   HDFS Read: 388445624 HDFS Write: 
20815654 SUCCESS
13/12/30 16:48:25 INFO ql.Driver: Job 0: Map: 1   Cumulative CPU: 12.76 sec   
HDFS Read: 388445624 HDFS Write: 20815654 SUCCESS
Job 1: Map: 1   Cumulative CPU: 9.18 sec   HDFS Read: 20816111 HDFS Write: 
28593993 SUCCESS
13/12/30 16:48:25 INFO ql.Driver: Job 1: Map: 1   Cumulative CPU: 9.18 sec   
HDFS Read: 20816111 HDFS Write: 28593993 SUCCESS
Job 2: Map: 1   Cumulative CPU: 17.38 sec   HDFS Read: 80660331 HDFS Write: 
378063 SUCCESS
13/12/30 16:48:25 INFO ql.Driver: Job 2: Map: 1   Cumulative CPU: 17.38 sec   
HDFS Read: 80660331 HDFS Write: 378063 SUCCESS
Job 3: Map: 1   Cumulative CPU: 2.06 sec   HDFS Read: 378520 HDFS Write: 96 
SUCCESS
13/12/30 16:48:25 INFO ql.Driver: Job 3: Map: 1   Cumulative CPU: 2.06 sec   
HDFS Read: 378520 HDFS Write: 96 SUCCESS
Job 4: Map: 1  Reduce: 1   Cumulative CPU: 2.45 sec   HDFS Read: 553 HDFS 
Write: 96 SUCCESS
13/12/30 16:48:25 INFO ql.Driver: Job 4: Map: 1  Reduce: 1   Cumulative CPU: 
2.45 sec   HDFS Read: 553 HDFS Write: 96 SUCCESS
Job 5: Map: 1  Reduce: 1   Cumulative CPU: 2.33 sec   HDFS Read: 553 HDFS 
Write: 0 SUCCESS
13/12/30 16:48:25 INFO ql.Driver: Job 5: Map: 1  Reduce: 1   Cumulative CPU: 
2.33 sec   HDFS Read: 553 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 46 seconds 160 msec
{code}

{code}
Map-join1:
plan.ConditionalResolverCommonJoin: Driver alias is store_sales with size 
388445409 (total size of others : 0, threshold : 2500)
Stage-28 is selected by condition resolver.

Map-join2:
plan.ConditionalResolverCommonJoin: Driver alias is $INTNAME with size 20815654 
(total size of others : 5051899, threshold : 2500)
Stage-26 is selected by condition resolver.

Map-join3:
 plan.ConditionalResolverCommonJoin: Driver alias is customer_demographics with 
size 80660096 (total size of others : 20815654, threshold : 2500)
Stage-24 is filtered out by condition resolver.

Map-join4:
plan.ConditionalResolverCommonJoin: Driver alias is $INTNAME with size 20815654 
(total size of others : 3155, threshold : 2500)
Stage-22 is selected by condition resolver.
{code}


btw, a minor question: why does the log of map-join 1 show the size of others as 0?
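The alias collision described above can be shown with a toy model: aliasToFileSizeMap is keyed by the alias string, so every intermediate table named "$INTNAME" shares one entry. This is only a sketch (plain JDK HashMap, sizes copied from the logs above), not Hive's actual ConditionalResolverCommonJoin code.

```java
import java.util.HashMap;
import java.util.Map;

public class AliasSizeCollision {
    public static void main(String[] args) {
        // Toy model of aliasToFileSizeMap: keyed by alias name, so two
        // distinct intermediate tables that both use "$INTNAME" collide.
        Map<String, Long> aliasToFileSize = new HashMap<>();
        aliasToFileSize.put("$INTNAME", 20815654L);   // output of one join
        aliasToFileSize.put("$INTNAME", 28593993L);   // a later join overwrites it

        // Only the last writer survives; a resolver consulting this map for
        // an earlier intermediate table sees the wrong size.
        if (aliasToFileSize.size() != 1
                || aliasToFileSize.get("$INTNAME") != 28593993L) {
            throw new AssertionError();
        }
        System.out.println("entries=" + aliasToFileSize.size());
    }
}
```

Running it shows a single surviving entry, which is why the patch cannot track three different "$INTNAME" tables with one key.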

> ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those 
> tables which are not used in the child of this conditional task.
> -
>
> Key: HIVE-5945
> URL: https://issues.apache.org/jira/browse/HIVE-5945
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
>Reporter: Yin Huai
>Assignee: Navis
>Priority: Critical
> Attachments: HIVE-5945.1.patch.txt, HIVE-5945.2.patch.txt, 
> HIVE-5945.3.patch.txt, HIVE-5945.4.patch.txt, HIVE-5945.5.patch.txt
>
>
> Here is an example
> {code}
> select
>i_item_id,
>s_state,
>avg(ss_quantity) agg1,
>avg(ss_list_price) agg2,
>avg(ss_coupon_amt) agg3,
>avg(ss_sales_price) agg4
> FROM store_sales
> JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
> JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
> JOIN customer_demographics on (store_sales.ss_cdemo_sk = 
> customer_demographics.cd_demo_sk)
> JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
> where
>cd_gender = 'F' and
>cd_marital_status = 'U' and
>cd_education_status = 'Primary' and
>d_year = 2002 and
>s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
> group by
>i_item_id,
>s_state
> order by
>i_item_id,
>s_state
> limit 100;
> {code}
> I turned off noconditionaltask. So, I expected that there would be 4 Map-only 
> jobs for this query. However, I got 1 Map-only job (joining store_sales and 
> date_dim) and 3 MR jobs (for reduce joins).
> So, I checked the conditional task determining the plan of the join involving 
> item. In ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask, 
> aliasToFileSizeMap contains all input tables used in this query and the 
> intermediate table generated by joining store_sales and date_dim. So, when we 
> sum the size of all small tables, the size of store_sales (which is around 
> 45GB in my test) will be also counted.

Re: Review Request 16172: ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those tables which are not used in the child of this conditional task.

2013-12-30 Thread Yin Huai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16172/#review30977
---



ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverCommonJoin.java


For the example in the description of HIVE-5945, the same alias "$INTNAME" 
can actually refer to different intermediate tables. So, here, we will not 
update the correct size for the alias "$INTNAME".


- Yin Huai


On Dec. 30, 2013, 2:20 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16172/
> ---
> 
> (Updated Dec. 30, 2013, 2:20 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-5945
> https://issues.apache.org/jira/browse/HIVE-5945
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Here is an example
> {code}
> select
>i_item_id,
>s_state,
>avg(ss_quantity) agg1,
>avg(ss_list_price) agg2,
>avg(ss_coupon_amt) agg3,
>avg(ss_sales_price) agg4
> FROM store_sales
> JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
> JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
> JOIN customer_demographics on (store_sales.ss_cdemo_sk = 
> customer_demographics.cd_demo_sk)
> JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
> where
>cd_gender = 'F' and
>cd_marital_status = 'U' and
>cd_education_status = 'Primary' and
>d_year = 2002 and
>s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
> group by
>i_item_id,
>s_state
> order by
>i_item_id,
>s_state
> limit 100;
> {code}
> I turned off noconditionaltask. So, I expected that there would be 4 Map-only 
> jobs for this query. However, I got 1 Map-only job (joining store_sales and 
> date_dim) and 3 MR jobs (for reduce joins).
> 
> So, I checked the conditional task determining the plan of the join involving 
> item. In ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask, 
> aliasToFileSizeMap contains all input tables used in this query and the 
> intermediate table generated by joining store_sales and date_dim. So, when we 
> sum the size of all small tables, the size of store_sales (which is around 
> 45GB in my test) will be also counted.  
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java daf4e4a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
>  37ed275 
>   
> ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverCommonJoin.java 
> f75e366 
>   
> ql/src/test/org/apache/hadoop/hive/ql/plan/TestConditionalResolverCommonJoin.java
>  67203c9 
>   ql/src/test/results/clientpositive/auto_join25.q.out 7427239 
>   ql/src/test/results/clientpositive/infer_bucket_sort_convert_join.q.out 
> 7d06739 
>   ql/src/test/results/clientpositive/mapjoin_hook.q.out d60d16e 
> 
> Diff: https://reviews.apache.org/r/16172/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



Hive-trunk-hadoop2 - Build # 635 - Still Failing

2013-12-30 Thread Apache Jenkins Server
Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #592
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #593
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #594
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5994 : ORC RLEv2 encodes wrongly for large negative BIGINTs  
(64 bits ) (Prasanth J via Owen Omalley)

[hashutosh] HIVE-5991 : ORC RLEv2 fails with ArrayIndexOutOfBounds exception 
for PATCHED_BLOB encoding (Prasanth J via Owen Omalley)

[prasadm] HIVE-4395: Support TFetchOrientation.FIRST for HiveServer2 
FetchResults (Prasad Mujumdar reviewed by Thejas Nair)

[ehans] HIVE-5756: Implement vectorized support for IF conditional expression 
(Eric Hanson)

[hashutosh] HIVE-6018 : FetchTask should not reference metastore classes (Navis 
via Prasad Mujumdar)

[hashutosh] HIVE-5979. Failure in cast to timestamps. (Jitendra Pandey)

[hashutosh] HIVE-5897 : Fix hadoop2 execution environment Milestone 2 (Vikram 
Dixit via Brock Noland)


Changes for Build #595

Changes for Build #596
[hashutosh] HIVE-6027 : non-vectorized log10 has rounding issue (Sergey 
Shelukhin via Ashutosh Chauhan)

[prasadm] HIVE-5993: JDBC Driver should not hard-code the database name (Szehon 
Ho via Prasad Mujumdar)


Changes for Build #597
[hashutosh] HIVE-6004 : Fix statistics annotation related test failures in 
hadoop2 (Prasanth J via Ashutosh Chauhan)


Changes for Build #598
[navis] HIVE-5985 : Make qfile_regex to accept multiple patterns (Navis 
reviewed by Ashutosh Chauhan)


Changes for Build #599

Changes for Build #600

Changes for Build #601
[navis] HIVE-5276 : Skip redundant string encoding/decoding for hiveserver2 
(Navis Reviewed by Carl Steinbach)


Changes for Build #602
[xuefu] HIVE-6022: Load statements with incorrect order of partitions put input 
files to unreadable places (Teruyoshi Zenmyo via Xuefu)


Changes for Build #603

Changes for Build #604
[thejas] HIVE-5975 : [WebHCat] templeton mapreduce job failed if provide 
"define" parameters (Shanyu Zhao via Thejas Nair)


Changes for Build #605
[prasadm] HIVE-1466: Add NULL DEFINED AS to ROW FORMAT specification (Prasad 
Mujumdar reviewed by Xuefu Zhang)


Changes for Build #606
[jitendra] HIVE-5521 : Remove CommonRCFileInputFormat. (hashutosh via jitendra)

[rhbutani] HIVE-5973 SMB joins produce incorrect results with multiple 
partitions and buckets (Vikram Dixit via Harish Butani)

[ehans] HIVE-6015: vectorized logarithm produces results for 0 that are 
different from a non-vectorized one (Sergey Shelukhin via Eric Hanson)


Changes for Build #607
[brock] HIVE-5812 - HiveServer2 SSL connection transport binds to loopback 
address by default (Prasad Mujumdar via Brock Noland)


Changes for Build #608
[hashutosh] HIVE-5936 : analyze command failing to collect stats with counter 
mechanism (Navis via Ashutosh Chauhan)


Changes for Build #609
[thejas] HIVE-5230 : Better error reporting by async threads in HiveServer2 
(Vaibhav Gumashta via Thejas Nair)


Changes for Build #610
[navis] HIVE-5879 : Fix spelling errors in hive-default.xml.template (Lefty 
Leverenz via Navis)


Changes for Build #611

Changes for Build #612
[xuefu] HIVE-6021: Problem in GroupByOperator for handling distinct aggrgations 
(Sun Rui via Xuefu)


Changes for Build #613
[prasadm] HIVE-6036: A test case for embedded beeline - with URL 
jdbc:hive2:///default (Anandha L Ranganathan via Prasad Mujumdar)

[prasadm] HIVE-4256: JDBC2 HiveConnection does not use the specified database 
(Anandha L Ranganathan via Prasad Mujumdar)


Changes for Build #614
[brock] HIVE-5966 - Fix eclipse:eclipse post shim aggregation changes (Szehon 
Ho via Brock Noland)


Changes for Build #615
[daijy] HIVE-5540: webhcat e2e test failures: "Expe

[jira] [Updated] (HIVE-6082) Certain KeeperException should be ignored in ZooKeeperHiveLockManage.unlockPrimitive

2013-12-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6082:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk! Thank you for your contribution!!

> Certain KeeperException should be ignored in 
> ZooKeeperHiveLockManage.unlockPrimitive
> 
>
> Key: HIVE-6082
> URL: https://issues.apache.org/jira/browse/HIVE-6082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.13.0
>
> Attachments: HIVE-6082.patch, Hive-6082.patch
>
>
> KeeperException.NoNodeException and NotEmptyException should be ignored when 
> deleting a zLock or its parent in ZooKeeperHiveLockManager unlockPrimitive. 
> The exceptions can happen when: 
> 1) ZooKeeperHiveLockManager retries deleting a zLock after a failure, but the 
> node has already been deleted; 
> 2) there is a race condition where another process adds a zLock just before 
> the parent is about to be deleted.
> Otherwise, unlock may unnecessarily be retried numRetriesForUnLock times.
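The catch-and-ignore behavior this report asks for can be sketched as below. The stub exception classes stand in for the org.apache.zookeeper.KeeperException subclasses so the sketch runs without a ZooKeeper client on the classpath, and names like deleteQuietly are illustrative, not Hive's actual method names.

```java
public class UnlockSketch {
    // Local stand-ins for KeeperException.NoNodeException / NotEmptyException.
    static class NoNodeException extends Exception {}
    static class NotEmptyException extends Exception {}

    interface ZkDelete { void delete(String path) throws Exception; }

    // Returns true when the zLock path needs no further work: "already
    // deleted" and "parent just gained a child" are both treated as benign,
    // so the caller does not spin through numRetriesForUnLock retries.
    static boolean deleteQuietly(ZkDelete zk, String path) throws Exception {
        try {
            zk.delete(path);
            return true;        // we removed the node
        } catch (NoNodeException e) {
            return true;        // a prior retry already removed it
        } catch (NotEmptyException e) {
            return true;        // a racing process owns the parent now; stop retrying
        }
    }

    public static void main(String[] args) throws Exception {
        if (!deleteQuietly(p -> { throw new NoNodeException(); }, "/hive/zlock")
                || !deleteQuietly(p -> { throw new NotEmptyException(); }, "/hive/zlock")) {
            throw new AssertionError();
        }
        System.out.println("ok");
    }
}
```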





[jira] [Commented] (HIVE-5923) SQL std auth - parser changes

2013-12-30 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859042#comment-13859042
 ] 

Thejas M Nair commented on HIVE-5923:
-

Sure, will post it on RB. I am looking at the unit test failures and will upload 
the new patch there.


> SQL std auth - parser changes
> -
>
> Key: HIVE-5923
> URL: https://issues.apache.org/jira/browse/HIVE-5923
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch
>
>   Original Estimate: 96h
>  Time Spent: 72h
>  Remaining Estimate: 12h
>
> There are new access control statements proposed in the functional spec in 
> HIVE-5837 . It also proposes some small changes to the existing query syntax 
> (mostly extensions and some optional keywords).
> The syntax supported should depend on the current authorization mode.





[jira] [Work started] (HIVE-6067) Implement vectorized decimal column-scalar comparison filters

2013-12-30 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6067 started by Eric Hanson.

> Implement vectorized decimal column-scalar comparison filters
> -
>
> Key: HIVE-6067
> URL: https://issues.apache.org/jira/browse/HIVE-6067
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Eric Hanson
> Attachments: HIVE-6067.01.patch, HIVE-6067.02.patch
>
>
> Using the new DecimalColumnVector type, implement a template to generate 
> VectorExpression subclasses for Decimal comparison filters (<, <=, >, >=, =, 
> !=).





[jira] [Work started] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal

2013-12-30 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6051 started by Eric Hanson.

> Create DecimalColumnVector and a representative VectorExpression for decimal
> 
>
> Key: HIVE-6051
> URL: https://issues.apache.org/jira/browse/HIVE-6051
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.13.0
>Reporter: Eric Hanson
>Assignee: Eric Hanson
> Attachments: HIVE-6051.01.patch
>
>
> Create a DecimalColumnVector to use as a basis for vectorized decimal 
> operations. Include a representative VectorExpression on decimal (e.g. 
> column-column addition) to demonstrate its use.





[jira] [Commented] (HIVE-1996) "LOAD DATA INPATH" fails when the table already contains a file of the same name

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859024#comment-13859024
 ] 

Hive QA commented on HIVE-1996:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12503097/HIVE-1996.2.Patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/772/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/772/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-772/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/Driver.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target 
hcatalog/server-extensions/target hcatalog/core/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen 
contrib/target service/target serde/target beeline/target odbc/target 
cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update
U
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java

Fetching external item into 'hcatalog/src/test/e2e/harness'
Updated external to revision 1554298.

Updated to revision 1554298.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12503097

> "LOAD DATA INPATH" fails when the table already contains a file of the same 
> name
> 
>
> Key: HIVE-1996
> URL: https://issues.apache.org/jira/browse/HIVE-1996
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0, 0.8.1
>Reporter: Kirk True
>Assignee: Chinna Rao Lalam
> Attachments: HIVE-1996.1.Patch, HIVE-1996.2.Patch, HIVE-1996.Patch
>
>
> Steps:
> 1. From the command line copy the kv2.txt data file into the current user's 
> HDFS directory:
> {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt 
> kv2.txt}}
> 2. In Hive, create the table:
> {{create table tst_src1 (key_ int, value_ string);}}
> 3. Load the data into the table from HDFS:
> {{load data inpath './kv2.txt' into table tst_src1;}}
> 4. Repeat step 1
> 5. Repeat step 3
> Expected:
> To have kv2.txt renamed in HDFS and then copied to the destination as per 
> HIVE-307.
> Actual:
> File is renamed, but {{Hive.copyFiles}} doesn't "see" the change in {{srcs}} 
> as it continues to use the same array elements (with the un-renamed, old file 
> names). It crashes with this error:
> {noformat}
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
> at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
> at 
> org.apache.hadoop.hive.ql.exec.Task

[jira] [Commented] (HIVE-1996) "LOAD DATA INPATH" fails when the table already contains a file of the same name

2013-12-30 Thread Antonio Bastardo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858997#comment-13858997
 ] 

Antonio Bastardo commented on HIVE-1996:


Is this bug fixed in hive-0.10.0_cdh4.2.0_20130411_1129?
I'm still seeing this behavior in that version.
Regards

> "LOAD DATA INPATH" fails when the table already contains a file of the same 
> name
> 
>
> Key: HIVE-1996
> URL: https://issues.apache.org/jira/browse/HIVE-1996
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0, 0.8.1
>Reporter: Kirk True
>Assignee: Chinna Rao Lalam
> Attachments: HIVE-1996.1.Patch, HIVE-1996.2.Patch, HIVE-1996.Patch
>
>
> Steps:
> 1. From the command line copy the kv2.txt data file into the current user's 
> HDFS directory:
> {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt 
> kv2.txt}}
> 2. In Hive, create the table:
> {{create table tst_src1 (key_ int, value_ string);}}
> 3. Load the data into the table from HDFS:
> {{load data inpath './kv2.txt' into table tst_src1;}}
> 4. Repeat step 1
> 5. Repeat step 3
> Expected:
> To have kv2.txt renamed in HDFS and then copied to the destination as per 
> HIVE-307.
> Actual:
> File is renamed, but {{Hive.copyFiles}} doesn't "see" the change in {{srcs}} 
> as it continues to use the same array elements (with the un-renamed, old file 
> names). It crashes with this error:
> {noformat}
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
> at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> {noformat}
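The rename-on-collision behavior from HIVE-307, and the staleness HIVE-1996 hits, can be sketched with plain java.nio.file. This is a toy model under assumed naming (loadInto and the "_copy_N" suffix are illustrative, not Hive's actual code); the point is that the destination name must be computed after checking for collisions, whereas Hive.copyFiles kept using a pre-rename srcs array.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LoadRenameSketch {
    // If the destination already has a file of the same name, pick a fresh
    // "_copy_N" name instead of overwriting (models HIVE-307).
    static Path loadInto(Path src, Path destDir) throws IOException {
        Path dest = destDir.resolve(src.getFileName().toString());
        int copy = 0;
        while (Files.exists(dest)) {
            dest = destDir.resolve(src.getFileName() + "_copy_" + (++copy));
        }
        // Use the freshly computed name; HIVE-1996 came from reusing a stale,
        // pre-rename file listing at this step.
        return Files.move(src, dest);
    }

    public static void main(String[] args) throws IOException {
        Path srcDir = Files.createTempDirectory("src");
        Path destDir = Files.createTempDirectory("warehouse");
        Files.writeString(srcDir.resolve("kv2.txt"), "row1");
        loadInto(srcDir.resolve("kv2.txt"), destDir);               // first load
        Files.writeString(srcDir.resolve("kv2.txt"), "row2");
        Path second = loadInto(srcDir.resolve("kv2.txt"), destDir); // second load is renamed
        if (!second.getFileName().toString().equals("kv2.txt_copy_1")) {
            throw new AssertionError(second.toString());
        }
        System.out.println(second.getFileName());
    }
}
```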





[jira] [Commented] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression

2013-12-30 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858974#comment-13858974
 ] 

Brock Noland commented on HIVE-3746:


Hey guys, this patch is a huge improvement over the existing RS serialization 
and is a big patch, so we don't want to have Navis continually rebasing. I think 
we should have a follow-on JIRA to:

1) test backwards compatibility with the older driver and fix any outstanding 
issues
2) remove the debug stuff that is included (printStackTrace and System.out)

Brock

> Fix HS2 ResultSet Serialization Performance Regression
> --
>
> Key: HIVE-3746
> URL: https://issues.apache.org/jira/browse/HIVE-3746
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, Server Infrastructure
>Reporter: Carl Steinbach
>Assignee: Navis
>  Labels: HiveServer2, jdbc, thrift
> Attachments: HIVE-3746.1.patch.txt, HIVE-3746.2.patch.txt, 
> HIVE-3746.3.patch.txt, HIVE-3746.4.patch.txt, HIVE-3746.5.patch.txt, 
> HIVE-3746.6.patch.txt, HIVE-3746.7.patch.txt
>
>






[jira] [Commented] (HIVE-6116) Use Paths consistently III

2013-12-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858971#comment-13858971
 ] 

Ashutosh Chauhan commented on HIVE-6116:


Updated patch on RB with correct formatting. No code changes. Only whitespace 
changes.

> Use Paths consistently III
> --
>
> Key: HIVE-6116
> URL: https://issues.apache.org/jira/browse/HIVE-6116
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6116.2.patch, HIVE-6116.3.patch, HIVE-6116.patch
>
>
> Another one in patch series to make use of Paths consistently.





Re: Review Request 16502: Use paths consisently - 3

2013-12-30 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16502/
---

(Updated Dec. 30, 2013, 6:29 p.m.)


Review request for hive and Xuefu Zhang.


Changes
---

Same patch as last with tab/spacing corrected. Only whitespace changes.


Bugs: HIVE-6116
https://issues.apache.org/jira/browse/HIVE-6116


Repository: hive


Description
---

Refactor patch.


Diffs (updated)
-

  
trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
 1554291 
  
trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
1554291 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java 
1554291 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
 1554291 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 
1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 1554291 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java 1554291 

Diff: https://reviews.apache.org/r/16502/diff/


Testing
---

No new tests. Refactor only patch. Regression suite suffices.


Thanks,

Ashutosh Chauhan



Re: [ANNOUNCE] New Hive PMC Member - Gunther Hagleitner

2013-12-30 Thread Gunther Hagleitner
This is awesome. Thanks everyone! I really appreciate it!

Cheers,
Gunther.


On Sat, Dec 28, 2013 at 12:42 AM, Clark Yang (杨卓荦) wrote:

> Congrats Gunther!
>
> Cheers,
> Zhuoluo (Clark) Yang
>
>
> 2013/12/28 Biswajit Nayak 
>
>> Congratulations Gunther...
>>
>>
>> On Fri, Dec 27, 2013 at 7:20 PM, Prasanth Jayachandran <
>> pjayachand...@hortonworks.com> wrote:
>>
>>> Congrats Gunther!!
>>>
>>> Sent from my iPhone
>>>
>>> > On Dec 27, 2013, at 4:46 PM, Lefty Leverenz 
>>> wrote:
>>> >
>>> > Congratulations Gunther, well deserved!
>>> >
>>> > -- Lefty
>>> >
>>> >
>>> > On Fri, Dec 27, 2013 at 12:00 AM, Jarek Jarcec Cecho <
>>> jar...@apache.org>wrote:
>>> >
>>> >> Congratulations Gunther, good job!
>>> >>
>>> >> Jarcec
>>> >>
>>> >>> On Thu, Dec 26, 2013 at 08:59:37PM -0800, Carl Steinbach wrote:
>>> >>> I am pleased to announce that Gunther Hagleitner has been elected to
>>> the
>>> >>> Hive Project Management Committee. Please join me in congratulating
>>> >> Gunther!
>>> >>>
>>> >>> Thanks.
>>> >>>
>>> >>> Carl
>>> >>
>>>
>>
>>
>
>
>


[jira] [Commented] (HIVE-6117) mapreduce.RecordReader instance needs to be initialized

2013-12-30 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858955#comment-13858955
 ] 

Nick Dimiduk commented on HIVE-6117:


bq. that we wrap the mapreduce RR from HBase in a mapred RR. Therefore it's 
hive's responsibility to call initialize which we are not doing. Therefore the 
patch looks correct and I will commit.

Yes, that's my understanding too. Thanks [~brocknoland], [~jxiang] for the 
context and quick response!

> mapreduce.RecordReader instance needs to be initialized
> ---
>
> Key: HIVE-6117
> URL: https://issues.apache.org/jira/browse/HIVE-6117
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.13.0
>
> Attachments: 6117.00.patch, HIVE-6117.0.patch
>
>
> The HBase storage handler makes use of a mapreduce.RecordReader instance but 
> does not initialize it when consumed from local context. This results in a 
> NPE for some queries, for instance
> {noformat}
> create table hbase_1(key string, age int) stored by 
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties ( 
> "hbase.columns.mapping" = "info:age");
> insert overwrite table hbase_1 select name, SUM(age) from studenttab10k group 
> by name;
> select * from hbase_1;
> {noformat}
> The select statement throws the following exception
> {noformat}
> 13/12/18 01:30:32 ERROR CliDriver: Failed with exception 
> java.io.IOException:java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:551)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1494)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:196)
> at 
> org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
> at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:234)
> at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:193)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6112) SQL std auth - support new privileges INSERT, DELETE

2013-12-30 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858946#comment-13858946
 ] 

Brock Noland commented on HIVE-6112:


Looks like this patch misses the getPrivTypeByName() and toString() methods 
on Privilege. FWIW, this class looks like a perfect candidate for a unit test, 
and I think we could get rid of all those if/else and case statements by 
storing this stuff in a map.
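The map-based lookup being suggested can be sketched as follows. This is a minimal illustration under stated assumptions: the enum constants and fromName() name are hypothetical, not the actual Privilege API in org.apache.hadoop.hive.ql.security.authorization.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch: instead of an if/else ladder or switch in getPrivTypeByName(),
// build a name-to-privilege map once and do a single lookup per call.
enum Priv {
    SELECT, INSERT, DELETE, UPDATE;

    private static final Map<String, Priv> BY_NAME = new HashMap<>();
    static {
        for (Priv p : values()) {
            BY_NAME.put(p.name(), p);
        }
    }

    // One map lookup replaces the whole if/else chain; unknown names
    // fail fast with a clear message.
    static Priv fromName(String name) {
        Priv p = BY_NAME.get(name.toUpperCase(Locale.ROOT));
        if (p == null) {
            throw new IllegalArgumentException("unknown privilege: " + name);
        }
        return p;
    }
}

public class PrivilegeMapSketch {
    public static void main(String[] args) {
        System.out.println(Priv.fromName("insert"));
        System.out.println(Priv.fromName("DELETE"));
    }
}
```

A table-driven lookup like this is also trivially unit-testable: one test can iterate over every enum constant and assert the round trip through fromName().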

> SQL std auth - support new privileges INSERT, DELETE
> 
>
> Key: HIVE-6112
> URL: https://issues.apache.org/jira/browse/HIVE-6112
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
> Attachments: new-privs.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Includes  INSERT, DELETE privileges. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Hive-trunk-h0.21 - Build # 2533 - Still Failing

2013-12-30 Thread Apache Jenkins Server
Changes for Build #2493
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #2494
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #2495
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #2496
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5994 : ORC RLEv2 encodes wrongly for large negative BIGINTs  
(64 bits ) (Prasanth J via Owen Omalley)

[hashutosh] HIVE-5991 : ORC RLEv2 fails with ArrayIndexOutOfBounds exception 
for PATCHED_BLOB encoding (Prasanth J via Owen Omalley)

[prasadm] HIVE-4395: Support TFetchOrientation.FIRST for HiveServer2 
FetchResults (Prasad Mujumdar reviewed by Thejas Nair)

[ehans] HIVE-5756: Implement vectorized support for IF conditional expression 
(Eric Hanson)

[hashutosh] HIVE-6018 : FetchTask should not reference metastore classes (Navis 
via Prasad Mujumdar)

[hashutosh] HIVE-5979. Failure in cast to timestamps. (Jitendra Pandey)

[hashutosh] HIVE-5897 : Fix hadoop2 execution environment Milestone 2 (Vikram 
Dixit via Brock Noland)


Changes for Build #2497

Changes for Build #2498
[hashutosh] HIVE-6004 : Fix statistics annotation related test failures in 
hadoop2 (Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-6027 : non-vectorized log10 has rounding issue (Sergey 
Shelukhin via Ashutosh Chauhan)

[prasadm] HIVE-5993: JDBC Driver should not hard-code the database name (Szehon 
Ho via Prasad Mujumdar)


Changes for Build #2499
[navis] HIVE-5985 : Make qfile_regex to accept multiple patterns (Navis 
reviewed by Ashutosh Chauhan)


Changes for Build #2500

Changes for Build #2501

Changes for Build #2502
[navis] HIVE-5276 : Skip redundant string encoding/decoding for hiveserver2 
(Navis Reviewed by Carl Steinbach)


Changes for Build #2503
[xuefu] HIVE-6022: Load statements with incorrect order of partitions put input 
files to unreadable places (Teruyoshi Zenmyo via Xuefu)


Changes for Build #2504

Changes for Build #2505
[thejas] HIVE-5975 : [WebHCat] templeton mapreduce job failed if provide 
"define" parameters (Shanyu Zhao via Thejas Nair)


Changes for Build #2506
[prasadm] HIVE-1466: Add NULL DEFINED AS to ROW FORMAT specification (Prasad 
Mujumdar reviewed by Xuefu Zhang)


Changes for Build #2507
[jitendra] HIVE-5521 : Remove CommonRCFileInputFormat. (hashutosh via jitendra)

[rhbutani] HIVE-5973 SMB joins produce incorrect results with multiple 
partitions and buckets (Vikram Dixit via Harish Butani)

[ehans] HIVE-6015: vectorized logarithm produces results for 0 that are 
different from a non-vectorized one (Sergey Shelukhin via Eric Hanson)


Changes for Build #2508
[brock] HIVE-5812 - HiveServer2 SSL connection transport binds to loopback 
address by default (Prasad Mujumdar via Brock Noland)


Changes for Build #2509
[hashutosh] HIVE-5936 : analyze command failing to collect stats with counter 
mechanism (Navis via Ashutosh Chauhan)


Changes for Build #2510
[thejas] HIVE-5230 : Better error reporting by async threads in HiveServer2 
(Vaibhav Gumashta via Thejas Nair)


Changes for Build #2511
[navis] HIVE-5879 : Fix spelling errors in hive-default.xml.template (Lefty 
Leverenz via Navis)


Changes for Build #2512

Changes for Build #2513
[xuefu] HIVE-6021: Problem in GroupByOperator for handling distinct aggrgations 
(Sun Rui via Xuefu)


Changes for Build #2514
[prasadm] HIVE-6036: A test case for embedded beeline - with URL 
jdbc:hive2:///default (Anandha L Ranganathan via Prasad Mujumdar)

[prasadm] HIVE-4256: JDBC2 HiveConnection does not use the specified database 
(Anandha L Ranganathan via Prasad Mujumdar)


Changes for Build #2515
[brock] HIVE-5966 - Fix eclipse:eclipse post shim aggregation changes (Szehon 
Ho via Brock Noland)


Changes for Build #2516
[daijy] HIVE-5540: webhcat e2e test failures: "Expe

Hive-trunk-hadoop2 - Build # 634 - Still Failing

2013-12-30 Thread Apache Jenkins Server
Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #592
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #593
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #594
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5994 : ORC RLEv2 encodes wrongly for large negative BIGINTs  
(64 bits ) (Prasanth J via Owen Omalley)

[hashutosh] HIVE-5991 : ORC RLEv2 fails with ArrayIndexOutOfBounds exception 
for PATCHED_BLOB encoding (Prasanth J via Owen Omalley)

[prasadm] HIVE-4395: Support TFetchOrientation.FIRST for HiveServer2 
FetchResults (Prasad Mujumdar reviewed by Thejas Nair)

[ehans] HIVE-5756: Implement vectorized support for IF conditional expression 
(Eric Hanson)

[hashutosh] HIVE-6018 : FetchTask should not reference metastore classes (Navis 
via Prasad Mujumdar)

[hashutosh] HIVE-5979. Failure in cast to timestamps. (Jitendra Pandey)

[hashutosh] HIVE-5897 : Fix hadoop2 execution environment Milestone 2 (Vikram 
Dixit via Brock Noland)


Changes for Build #595

Changes for Build #596
[hashutosh] HIVE-6027 : non-vectorized log10 has rounding issue (Sergey 
Shelukhin via Ashutosh Chauhan)

[prasadm] HIVE-5993: JDBC Driver should not hard-code the database name (Szehon 
Ho via Prasad Mujumdar)


Changes for Build #597
[hashutosh] HIVE-6004 : Fix statistics annotation related test failures in 
hadoop2 (Prasanth J via Ashutosh Chauhan)


Changes for Build #598
[navis] HIVE-5985 : Make qfile_regex to accept multiple patterns (Navis 
reviewed by Ashutosh Chauhan)


Changes for Build #599

Changes for Build #600

Changes for Build #601
[navis] HIVE-5276 : Skip redundant string encoding/decoding for hiveserver2 
(Navis Reviewed by Carl Steinbach)


Changes for Build #602
[xuefu] HIVE-6022: Load statements with incorrect order of partitions put input 
files to unreadable places (Teruyoshi Zenmyo via Xuefu)


Changes for Build #603

Changes for Build #604
[thejas] HIVE-5975 : [WebHCat] templeton mapreduce job failed if provide 
"define" parameters (Shanyu Zhao via Thejas Nair)


Changes for Build #605
[prasadm] HIVE-1466: Add NULL DEFINED AS to ROW FORMAT specification (Prasad 
Mujumdar reviewed by Xuefu Zhang)


Changes for Build #606
[jitendra] HIVE-5521 : Remove CommonRCFileInputFormat. (hashutosh via jitendra)

[rhbutani] HIVE-5973 SMB joins produce incorrect results with multiple 
partitions and buckets (Vikram Dixit via Harish Butani)

[ehans] HIVE-6015: vectorized logarithm produces results for 0 that are 
different from a non-vectorized one (Sergey Shelukhin via Eric Hanson)


Changes for Build #607
[brock] HIVE-5812 - HiveServer2 SSL connection transport binds to loopback 
address by default (Prasad Mujumdar via Brock Noland)


Changes for Build #608
[hashutosh] HIVE-5936 : analyze command failing to collect stats with counter 
mechanism (Navis via Ashutosh Chauhan)


Changes for Build #609
[thejas] HIVE-5230 : Better error reporting by async threads in HiveServer2 
(Vaibhav Gumashta via Thejas Nair)


Changes for Build #610
[navis] HIVE-5879 : Fix spelling errors in hive-default.xml.template (Lefty 
Leverenz via Navis)


Changes for Build #611

Changes for Build #612
[xuefu] HIVE-6021: Problem in GroupByOperator for handling distinct aggrgations 
(Sun Rui via Xuefu)


Changes for Build #613
[prasadm] HIVE-6036: A test case for embedded beeline - with URL 
jdbc:hive2:///default (Anandha L Ranganathan via Prasad Mujumdar)

[prasadm] HIVE-4256: JDBC2 HiveConnection does not use the specified database 
(Anandha L Ranganathan via Prasad Mujumdar)


Changes for Build #614
[brock] HIVE-5966 - Fix eclipse:eclipse post shim aggregation changes (Szehon 
Ho via Brock Noland)


Changes for Build #615
[daijy] HIVE-5540: webhcat e2e test failures: "Expe

Hive-trunk-hadoop2 - Build # 633 - Still Failing

2013-12-30 Thread Apache Jenkins Server
Changes for Build #590
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #592
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #593
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #594
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5994 : ORC RLEv2 encodes wrongly for large negative BIGINTs  
(64 bits ) (Prasanth J via Owen Omalley)

[hashutosh] HIVE-5991 : ORC RLEv2 fails with ArrayIndexOutOfBounds exception 
for PATCHED_BLOB encoding (Prasanth J via Owen Omalley)

[prasadm] HIVE-4395: Support TFetchOrientation.FIRST for HiveServer2 
FetchResults (Prasad Mujumdar reviewed by Thejas Nair)

[ehans] HIVE-5756: Implement vectorized support for IF conditional expression 
(Eric Hanson)

[hashutosh] HIVE-6018 : FetchTask should not reference metastore classes (Navis 
via Prasad Mujumdar)

[hashutosh] HIVE-5979. Failure in cast to timestamps. (Jitendra Pandey)

[hashutosh] HIVE-5897 : Fix hadoop2 execution environment Milestone 2 (Vikram 
Dixit via Brock Noland)


Changes for Build #595

Changes for Build #596
[hashutosh] HIVE-6027 : non-vectorized log10 has rounding issue (Sergey 
Shelukhin via Ashutosh Chauhan)

[prasadm] HIVE-5993: JDBC Driver should not hard-code the database name (Szehon 
Ho via Prasad Mujumdar)


Changes for Build #597
[hashutosh] HIVE-6004 : Fix statistics annotation related test failures in 
hadoop2 (Prasanth J via Ashutosh Chauhan)


Changes for Build #598
[navis] HIVE-5985 : Make qfile_regex to accept multiple patterns (Navis 
reviewed by Ashutosh Chauhan)


Changes for Build #599

Changes for Build #600

Changes for Build #601
[navis] HIVE-5276 : Skip redundant string encoding/decoding for hiveserver2 
(Navis Reviewed by Carl Steinbach)


Changes for Build #602
[xuefu] HIVE-6022: Load statements with incorrect order of partitions put input 
files to unreadable places (Teruyoshi Zenmyo via Xuefu)


Changes for Build #603

Changes for Build #604
[thejas] HIVE-5975 : [WebHCat] templeton mapreduce job failed if provide 
"define" parameters (Shanyu Zhao via Thejas Nair)


Changes for Build #605
[prasadm] HIVE-1466: Add NULL DEFINED AS to ROW FORMAT specification (Prasad 
Mujumdar reviewed by Xuefu Zhang)


Changes for Build #606
[jitendra] HIVE-5521 : Remove CommonRCFileInputFormat. (hashutosh via jitendra)

[rhbutani] HIVE-5973 SMB joins produce incorrect results with multiple 
partitions and buckets (Vikram Dixit via Harish Butani)

[ehans] HIVE-6015: vectorized logarithm produces results for 0 that are 
different from a non-vectorized one (Sergey Shelukhin via Eric Hanson)


Changes for Build #607
[brock] HIVE-5812 - HiveServer2 SSL connection transport binds to loopback 
address by default (Prasad Mujumdar via Brock Noland)


Changes for Build #608
[hashutosh] HIVE-5936 : analyze command failing to collect stats with counter 
mechanism (Navis via Ashutosh Chauhan)


Changes for Build #609
[thejas] HIVE-5230 : Better error reporting by async threads in HiveServer2 
(Vaibhav Gumashta via Thejas Nair)


Changes for Build #610
[navis] HIVE-5879 : Fix spelling errors in hive-default.xml.template (Lefty 
Leverenz via Navis)


Changes for Build #611

Changes for Build #612
[xuefu] HIVE-6021: Problem in GroupByOperator for handling distinct aggrgations 
(Sun Rui via Xuefu)


Changes for Build #613
[prasadm] HIVE-6036: A test case for embedded beeline - with URL 
jdbc:hive2:///default (Anandha L Ranganathan via Prasad Mujumdar)

[prasadm] HIVE-4256: JDBC2 HiveConnection does not use the specified database 
(Anandha L Ranganathan via Prasad Mujumdar)


Changes for Build #614
[brock] HIVE-5966 - Fix eclipse:eclipse post shim aggregation c

[jira] [Commented] (HIVE-5923) SQL std auth - parser changes

2013-12-30 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858874#comment-13858874
 ] 

Brock Noland commented on HIVE-5923:


Hey Thejas, would you mind creating a RB item for this? Thanks!

> SQL std auth - parser changes
> -
>
> Key: HIVE-5923
> URL: https://issues.apache.org/jira/browse/HIVE-5923
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch
>
>   Original Estimate: 96h
>  Time Spent: 72h
>  Remaining Estimate: 12h
>
> There are new access control statements proposed in the functional spec in 
> HIVE-5837 . It also proposes some small changes to the existing query syntax 
> (mostly extensions and some optional keywords).
> The syntax supported should depend on the current authorization mode.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: How do you run single query test(s) after mavenization?

2013-12-30 Thread Brock Noland
On Mon, Nov 18, 2013 at 2:21 PM, Lefty Leverenz wrote:

> Thanks for the typo alert Remus, I've changed -Dcase=TestCliDriver to
> -Dtest=TestCliDriver.
>

Thank you for this!!


>
> But HowToContribute<
> https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute
> >still
> has several instances of "ant" that should be changed to "mvn" --
> some are simple replacements but others might need additional changes:


>- Check for new Checkstyle 
> violations
>by running ant checkstyle, ...  [mvn checkstyle?]
>

We have not implemented checkstyle on maven yet. I created
https://issues.apache.org/jira/browse/HIVE-6123


>- Define methods within your class whose names begin with test, and call
>JUnit's many assert methods to verify conditions; these methods will be
>executed when you run ant test.  [simple replacement]
>- (2 ants) We can run "ant test -Dtestcase=TestAbc" where TestAbc is the
>name of the new class. This will test only the new testcase, which will
> be
>faster than "ant test" which tests all testcases.  [change ant to mvn
>twice; also change -Dtestcase to -Dtest?]
>- Folks should run ant clean package test before selecting *Submit
> Patch*.
> [mvn clean package?]
>

I have updated the above.


>
> The rest of the "ant" instances are okay because the MVN section afterwards
> gives the alternative, but should we keep ant or make the replacements?
>
>- 9.  Now you can run the ant 'thriftif' target ...
>- 11.  ant thriftif -Dthrift.home=...
>- 15.  ant thriftif
>- 18. ant clean package
>- The maven equivalent of ant thriftif is:
>
> mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local
>
>
>
I have not generated the thrift stuff recently. It would be great if Alan
or someone else who has would update this section.

Thank you!!


[jira] [Created] (HIVE-6123) Implement checkstyle in maven

2013-12-30 Thread Brock Noland (JIRA)
Brock Noland created HIVE-6123:
--

 Summary: Implement checkstyle in maven
 Key: HIVE-6123
 URL: https://issues.apache.org/jira/browse/HIVE-6123
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland


ant had a checkstyle target; we should do something similar for maven.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6116) Use Paths consistently III

2013-12-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858865#comment-13858865
 ] 

Xuefu Zhang commented on HIVE-6116:
---

Patch looks good. A minor comment on RB.

> Use Paths consistently III
> --
>
> Key: HIVE-6116
> URL: https://issues.apache.org/jira/browse/HIVE-6116
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6116.2.patch, HIVE-6116.3.patch, HIVE-6116.patch
>
>
> Another one in patch series to make use of Paths consistently.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 16502: Use paths consisently - 3

2013-12-30 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16502/#review30965
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java


It would be nice if this can be reformatted: tab/space.


- Xuefu Zhang


On Dec. 29, 2013, 4:22 a.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16502/
> ---
> 
> (Updated Dec. 29, 2013, 4:22 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6116
> https://issues.apache.org/jira/browse/HIVE-6116
> 
> 
> Repository: hive
> 
> 
> Description
> ---
> 
> Refactor patch.
> 
> 
> Diffs
> -
> 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1553986 
>   
> trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
>  1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
> 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
> 1553986 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
>  1553986 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
>  1553986 
>   
> trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
>  1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 
> 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 1553986 
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java 
> 1553986 
> 
> Diff: https://reviews.apache.org/r/16502/diff/
> 
> 
> Testing
> ---
> 
> No new tests. Refactor only patch. Regression suite suffices.
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>



Hive-trunk-h0.21 - Build # 2532 - Still Failing

2013-12-30 Thread Apache Jenkins Server
Changes for Build #2492
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #2493
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #2494
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #2495
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #2496
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5994 : ORC RLEv2 encodes wrongly for large negative BIGINTs  
(64 bits ) (Prasanth J via Owen Omalley)

[hashutosh] HIVE-5991 : ORC RLEv2 fails with ArrayIndexOutOfBounds exception 
for PATCHED_BLOB encoding (Prasanth J via Owen Omalley)

[prasadm] HIVE-4395: Support TFetchOrientation.FIRST for HiveServer2 
FetchResults (Prasad Mujumdar reviewed by Thejas Nair)

[ehans] HIVE-5756: Implement vectorized support for IF conditional expression 
(Eric Hanson)

[hashutosh] HIVE-6018 : FetchTask should not reference metastore classes (Navis 
via Prasad Mujumdar)

[hashutosh] HIVE-5979. Failure in cast to timestamps. (Jitendra Pandey)

[hashutosh] HIVE-5897 : Fix hadoop2 execution environment Milestone 2 (Vikram 
Dixit via Brock Noland)


Changes for Build #2497

Changes for Build #2498
[hashutosh] HIVE-6004 : Fix statistics annotation related test failures in 
hadoop2 (Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-6027 : non-vectorized log10 has rounding issue (Sergey 
Shelukhin via Ashutosh Chauhan)

[prasadm] HIVE-5993: JDBC Driver should not hard-code the database name (Szehon 
Ho via Prasad Mujumdar)


Changes for Build #2499
[navis] HIVE-5985 : Make qfile_regex to accept multiple patterns (Navis 
reviewed by Ashutosh Chauhan)


Changes for Build #2500

Changes for Build #2501

Changes for Build #2502
[navis] HIVE-5276 : Skip redundant string encoding/decoding for hiveserver2 
(Navis Reviewed by Carl Steinbach)


Changes for Build #2503
[xuefu] HIVE-6022: Load statements with incorrect order of partitions put input 
files to unreadable places (Teruyoshi Zenmyo via Xuefu)


Changes for Build #2504

Changes for Build #2505
[thejas] HIVE-5975 : [WebHCat] templeton mapreduce job failed if provide 
"define" parameters (Shanyu Zhao via Thejas Nair)


Changes for Build #2506
[prasadm] HIVE-1466: Add NULL DEFINED AS to ROW FORMAT specification (Prasad 
Mujumdar reviewed by Xuefu Zhang)


Changes for Build #2507
[jitendra] HIVE-5521 : Remove CommonRCFileInputFormat. (hashutosh via jitendra)

[rhbutani] HIVE-5973 SMB joins produce incorrect results with multiple 
partitions and buckets (Vikram Dixit via Harish Butani)

[ehans] HIVE-6015: vectorized logarithm produces results for 0 that are 
different from a non-vectorized one (Sergey Shelukhin via Eric Hanson)


Changes for Build #2508
[brock] HIVE-5812 - HiveServer2 SSL connection transport binds to loopback 
address by default (Prasad Mujumdar via Brock Noland)


Changes for Build #2509
[hashutosh] HIVE-5936 : analyze command failing to collect stats with counter 
mechanism (Navis via Ashutosh Chauhan)


Changes for Build #2510
[thejas] HIVE-5230 : Better error reporting by async threads in HiveServer2 
(Vaibhav Gumashta via Thejas Nair)


Changes for Build #2511
[navis] HIVE-5879 : Fix spelling errors in hive-default.xml.template (Lefty 
Leverenz via Navis)


Changes for Build #2512

Changes for Build #2513
[xuefu] HIVE-6021: Problem in GroupByOperator for handling distinct aggrgations 
(Sun Rui via Xuefu)


Changes for Build #2514
[prasadm] HIVE-6036: A test case for embedded beeline - with URL 
jdbc:hive2:///default (Anandha L Ranganathan via Prasad Mujumdar)

[prasadm] HIVE-4256: JDBC2 HiveConnection does not use the specified database 
(Anandha L Ranganathan via Prasad Mujumdar)


Changes for Build #2515
[brock] HIVE-5966 - Fix eclipse:eclipse post shim aggregation c

[jira] [Updated] (HIVE-6117) mapreduce.RecordReader instance needs to be initialized

2013-12-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6117:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk! Thank you Nick and Jimmy for your help on this one!

> mapreduce.RecordReader instance needs to be initialized
> ---
>
> Key: HIVE-6117
> URL: https://issues.apache.org/jira/browse/HIVE-6117
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 0.13.0
>
> Attachments: 6117.00.patch, HIVE-6117.0.patch
>
>
> The HBase storage handler makes use of a mapreduce.RecordReader instance but 
> does not initialize it when consumed from local context. This results in a 
> NPE for some queries, for instance
> {noformat}
> create table hbase_1(key string, age int) stored by 
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties ( 
> "hbase.columns.mapping" = "info:age");
> insert overwrite table hbase_1 select name, SUM(age) from studenttab10k group 
> by name;
> select * from hbase_1;
> {noformat}
> The select statement throws the following exception
> {noformat}
> 13/12/18 01:30:32 ERROR CliDriver: Failed with exception 
> java.io.IOException:java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:551)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1494)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:196)
> at 
> org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
> at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:234)
> at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:193)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6117) mapreduce.RecordReader instance needs to be initialized

2013-12-30 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858845#comment-13858845
 ] 

Brock Noland commented on HIVE-6117:


I also verified that this patch fixes the NPE I was seeing with *both* select 
\* from table (no-MR job) and select count(\*) from table (MR job).



[jira] [Commented] (HIVE-6117) mapreduce.RecordReader instance needs to be initialized

2013-12-30 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858837#comment-13858837
 ] 

Brock Noland commented on HIVE-6117:


Hey guys,

I think I see the issue here. o.a.h.mapreduce.RecordReader has an initialize 
method which is called by the mapper; o.a.h.mapred.RecordReader does not have 
an initialize method. Hive uses the mapred API, so as you can see here: 
https://github.com/apache/hive/blob/fb63a28cd5fddb5e5c974cab84cd9c3a4155e40d/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java#L178

we wrap the mapreduce RR from HBase in a mapred RR. It is therefore Hive's 
responsibility to call initialize, which we are not doing. The patch looks 
correct, so I will commit.
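The wrapping described above can be sketched with a minimal, self-contained 
analogue. The classes below are illustrative stand-ins only (NewApiReader for 
o.a.h.mapreduce.RecordReader, OldApiWrapper for the mapred-style wrapper built 
in HiveHBaseTableInputFormat); they are not Hive's actual code:

```java
// Sketch of the HIVE-6117 bug pattern: a new-API (mapreduce-style) reader
// must be initialize()d before nextKeyValue(), and the old-API (mapred-style)
// wrapper is the only place that can do it.
public class RecordReaderShim {

    // Stand-in for o.a.h.mapreduce.RecordReader: unusable until initialize().
    static class NewApiReader {
        private int[] data;   // set by initialize(); null until then
        private int pos = -1;

        void initialize(int[] split) { this.data = split; }

        boolean nextKeyValue() {
            // Without initialize(), data is null -> NullPointerException,
            // mirroring the NPE in TableRecordReaderImpl.nextKeyValue().
            return ++pos < data.length;
        }

        int getCurrentValue() { return data[pos]; }
    }

    // Stand-in for the mapred-style wrapper: the old API has no initialize(),
    // so the wrapper must call it itself -- the call the patch adds.
    static class OldApiWrapper {
        private final NewApiReader inner;

        OldApiWrapper(NewApiReader inner, int[] split) {
            this.inner = inner;
            inner.initialize(split);  // the missing call that caused the NPE
        }

        boolean next(int[] valueHolder) {
            if (!inner.nextKeyValue()) return false;
            valueHolder[0] = inner.getCurrentValue();
            return true;
        }
    }

    // Drains the wrapped reader, as FetchOperator drains rows.
    static int sumAll(int[] split) {
        OldApiWrapper reader = new OldApiWrapper(new NewApiReader(), split);
        int[] holder = new int[1];
        int sum = 0;
        while (reader.next(holder)) sum += holder[0];
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumAll(new int[]{1, 2, 3}));  // prints 6
    }
}
```

Dropping the initialize(split) call from the wrapper's constructor reproduces 
the failure mode: nextKeyValue() dereferences state that was never set up, 
just as in the stack trace above.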



[jira] [Commented] (HIVE-5901) Query cancel should stop running MR tasks

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858714#comment-13858714
 ] 

Hive QA commented on HIVE-5901:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620812/HIVE-5901.3.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4818 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/771/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/771/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620812

> Query cancel should stop running MR tasks
> -
>
> Key: HIVE-5901
> URL: https://issues.apache.org/jira/browse/HIVE-5901
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5901.1.patch.txt, HIVE-5901.2.patch.txt, 
> HIVE-5901.3.patch.txt
>
>
> Currently, query canceling does not stop running MR job immediately.





[jira] [Commented] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858684#comment-13858684
 ] 

Hive QA commented on HIVE-3746:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620811/HIVE-3746.7.patch.txt

{color:green}SUCCESS:{color} +1 4818 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/770/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/770/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620811

> Fix HS2 ResultSet Serialization Performance Regression
> --
>
> Key: HIVE-3746
> URL: https://issues.apache.org/jira/browse/HIVE-3746
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, Server Infrastructure
>Reporter: Carl Steinbach
>Assignee: Navis
>  Labels: HiveServer2, jdbc, thrift
> Attachments: HIVE-3746.1.patch.txt, HIVE-3746.2.patch.txt, 
> HIVE-3746.3.patch.txt, HIVE-3746.4.patch.txt, HIVE-3746.5.patch.txt, 
> HIVE-3746.6.patch.txt, HIVE-3746.7.patch.txt
>
>






[jira] [Commented] (HIVE-5414) The result of show grant is not visible via JDBC

2013-12-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858657#comment-13858657
 ] 

Hive QA commented on HIVE-5414:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620797/D13209.4.patch

{color:green}SUCCESS:{color} +1 4820 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/769/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/769/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620797

> The result of show grant is not visible via JDBC
> 
>
> Key: HIVE-5414
> URL: https://issues.apache.org/jira/browse/HIVE-5414
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: D13209.1.patch, D13209.2.patch, D13209.3.patch, 
> D13209.4.patch, HIVE-5414.4.patch.txt, HIVE-5414.5.patch.txt
>
>
> Currently, show grant / show role grant does not make fetch task, which 
> provides the result schema for jdbc clients.


