[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

2014-07-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3072:
-

Labels:   (was: TODOC10)

> Hive List Bucketing - DDL support
> -
>
> Key: HIVE-3072
> URL: https://issues.apache.org/jira/browse/HIVE-3072
> Project: Hive
>  Issue Type: New Feature
>  Components: SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Fix For: 0.10.0
>
> Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, 
> HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6, 
> HIVE-3072.patch.7
>
>
> If a Hive table column has skewed keys, query performance on non-skewed keys 
> is always impacted. The Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This JIRA issue will track the DDL changes for the feature. It covers both a 
> single skewed column and multiple skewed columns.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

2014-07-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049667#comment-14049667
 ] 

Lefty Leverenz commented on HIVE-3072:
--

This is documented in the wiki here:

* [Language Manual -- DDL -- Skewed Tables | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-SkewedTables]

> Hive List Bucketing - DDL support
> -
>
> Key: HIVE-3072
> URL: https://issues.apache.org/jira/browse/HIVE-3072
> Project: Hive
>  Issue Type: New Feature
>  Components: SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Fix For: 0.10.0
>
> Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, 
> HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6, 
> HIVE-3072.patch.7
>
>
> If a Hive table column has skewed keys, query performance on non-skewed keys 
> is always impacted. The Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This JIRA issue will track the DDL changes for the feature. It covers both a 
> single skewed column and multiple skewed columns.





[jira] [Commented] (HIVE-7291) Refactor TestParser to understand test-property file

2014-07-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049625#comment-14049625
 ] 

Brock Noland commented on HIVE-7291:


+1

> Refactor TestParser to understand test-property file
> 
>
> Key: HIVE-7291
> URL: https://issues.apache.org/jira/browse/HIVE-7291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7291.2.patch, HIVE-7291.3.patch, HIVE-7291.4.patch, 
> HIVE-7291.patch, trunk-mr2.properties
>
>
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-7303) IllegalMonitorStateException when stmtHandle is null in HiveStatement

2014-07-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049558#comment-14049558
 ] 

Brock Noland commented on HIVE-7303:


Thank you [~navis]! Do you think we should implement the unwrap* functions? 
What would the use case be there?

> IllegalMonitorStateException when stmtHandle is null in HiveStatement
> -
>
> Key: HIVE-7303
> URL: https://issues.apache.org/jira/browse/HIVE-7303
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Navis
> Attachments: HIVE-7303.1.patch.txt
>
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg75617.html
> Unlock can be called even when it's not locked in some situations.
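
The failure mode described above can be shown in isolation. This is a minimal sketch, not the HiveStatement code: it assumes a `java.util.concurrent.locks.ReentrantLock`-style lock, whose `unlock()` throws `IllegalMonitorStateException` when the calling thread does not hold it. The class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

final class UnlockGuardSketch {
    // Returns true if unlocking a lock this thread never acquired throws.
    static boolean unguardedUnlockThrows() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock(); // not held by the current thread
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // Releases the lock only if the current thread holds it;
    // returns whether a release actually happened.
    static boolean guardedUnlock(ReentrantLock lock) {
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
            return true;
        }
        return false;
    }
}
```

A guard like `guardedUnlock` is one option; another is to structure the code so that `unlock()` is only reachable after a successful `lock()`.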





[jira] [Updated] (HIVE-860) Persistent distributed cache

2014-07-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-860:
--

Attachment: HIVE-860.patch

Running this one again. Some of the failures are quite strange.

> Persistent distributed cache
> 
>
> Key: HIVE-860
> URL: https://issues.apache.org/jira/browse/HIVE-860
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.12.0
>Reporter: Zheng Shao
>Assignee: Brock Noland
> Fix For: 0.14.0
>
> Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch
>
>
> DistributedCache is shared across multiple jobs if the HDFS file name is the 
> same.
> We need to make sure Hive puts the same file into the same location every time 
> and does not overwrite it if the file content is the same.
> We can achieve 2 different results:
> A1. Files added with the same name, timestamp, and md5 in the same session 
> will have a single copy in the distributed cache.
> A2. Files added with the same name, timestamp, and md5 will have a single 
> copy in the distributed cache.
> A2 has a bigger benefit in sharing but may raise a question on when Hive 
> should clean it up in HDFS.
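
Option A2 above amounts to deriving the cache location purely from the file's name, timestamp, and md5. The following is a rough sketch of that idea under stated assumptions: the directory layout, the class name, and the method names are hypothetical and are not taken from the HIVE-860 patch.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class CachePathSketch {
    // Hex-encoded MD5 of the file content.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Same name, timestamp, and content always map to the same path,
    // so an existing copy can be reused instead of overwritten.
    static String cachePath(String baseDir, String fileName, long timestamp, byte[] content) {
        return baseDir + "/" + md5Hex(content) + "_" + timestamp + "/" + fileName;
    }
}
```

Because the path does not include a session identifier, two sessions adding the same file would resolve to the same location, which is where the clean-up question raised for A2 comes from.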





Re: Review Request 22996: HIVE-7090 Support session-level temporary tables in Hive

2014-07-01 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22996/#review47170
---


Hey Jason, looks good! Nice work! I have a question or two below and a few nits.


itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java


When the error message does not contain the text we are looking for, 
putting the actual text in the error message is useful.

I.e. when this assertion fails we won't have any idea what the actual 
message was. Thus the person debugging will have to actually make a code change 
and re-run the test to see what happened.
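
A sketch of the suggestion above, written without a test framework so it stands alone: the helper includes the actual text in the failure message. The class and method names are hypothetical; in a JUnit test the same effect comes from passing a message that embeds the actual string to the assertion.

```java
final class AssertMessageSketch {
    // Fails with a message that shows both the expected fragment and the actual text.
    static void assertContains(String actual, String expectedFragment) {
        if (actual == null || !actual.contains(expectedFragment)) {
            throw new AssertionError(
                "Expected message to contain '" + expectedFragment
                    + "' but was: '" + actual + "'");
        }
    }
}
```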



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java


I am sure this is a stupid question but why are we subclassing HMSC?



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java


nit: 

Is "Partition columns are not supported on temporary tables and source 
table in CREATE TABLE LIKE is partitioned." more clear?




ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java


It looks to me like these can be private since they are not accessed 
outside this class?



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java


These // should be javadoc style.



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java


I understand it's coded today such that these three conf.get() calls will not 
return null. However, I believe we should use Preconditions.checkNotNull here to 
ensure that, once that assumption no longer holds, we don't give the dev/user a 
terrible error message.
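
A minimal sketch of the null check suggested above. To stay self-contained it uses `java.util.Objects.requireNonNull` from the JDK rather than Guava's `Preconditions.checkNotNull`; both throw a `NullPointerException` carrying the supplied message. The `Map` stands in for the real configuration object, and the class name and key are hypothetical.

```java
import java.util.Map;
import java.util.Objects;

final class ConfCheckSketch {
    // Returns the configured value, or fails early with a message naming the missing key.
    static String requireConf(Map<String, String> conf, String key) {
        return Objects.requireNonNull(
            conf.get(key), "Configuration value missing for key: " + key);
    }
}
```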



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java


nit: 

Is "Cannot create directory" more clear?



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java


Setter is not being used.


- Brock Noland


On June 28, 2014, 12:35 a.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22996/
> ---
> 
> (Updated June 28, 2014, 12:35 a.m.)
> 
> 
> Review request for hive, Gunther Hagleitner, Navis Ryu, and Harish Butani.
> 
> 
> Bugs: HIVE-7090
> https://issues.apache.org/jira/browse/HIVE-7090
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Temp tables managed in memory by SessionState.
> SessionHiveMetaStoreClient overrides table-related methods in HiveMetaStore 
> to access the temp tables saved in the SessionState when appropriate.
> 
> 
> Diffs
> -
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 
> 9fb7550 
>   itests/qtest/testconfiguration.properties 1462ecd 
>   metastore/if/hive_metastore.thrift cc802c6 
>   metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 9e8d912 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java d8d900b 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 4d35176 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 3df2690 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 
> 1270520 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 71471f4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 2537b75 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableLikeDesc.java cb5d64c 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2143d0c 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 43125f7 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 98c3cc3 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 
> 91de8da 
>   
> ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java
>  20d08b3 
>   ql/src/test/queries/clientnegative/temp_table_authorize_create_tbl.q 
> PRE-CREATION 
>   ql/src/test/queries/clientnegative/temp_table_column_stats.q PRE-CREATION 
>   ql/src/test/queries/clientnegative/temp_table_create_like_partitions.q 
> PRE-CREATION 
>   ql/src/test/queries/clientnegative/temp_tab

[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049519#comment-14049519
 ] 

Hive QA commented on HIVE-7262:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653480/HIVE-7262.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5672 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/656/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/656/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-656/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653480

> Partitioned Table Function (PTF) query fails on ORC table when attempting to 
> vectorize
> --
>
> Key: HIVE-7262
> URL: https://issues.apache.org/jira/browse/HIVE-7262
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
>
> In ptf.q, create the part table with STORED AS ORC and SET 
> hive.vectorized.execution.enabled=true;
> Queries fail to find the BLOCKOFFSET virtual column during vectorization and 
> suffer an exception.
> ERROR vector.VectorizationContext 
> (VectorizationContext.java:getInputColumnIndex(186)) - The column 
> BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
> Jitendra pointed out that the routine in Vectorize.java that returns the 
> VectorizationContext needs to add virtual columns to the map, too.





[jira] [Issue Comment Deleted] (HIVE-7292) Hive on Spark

2014-07-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7292:
---

Comment: was deleted

(was: I am in OOO, so, the replying to the email might get delayed.
)

> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantage of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> users a new alternative so that they can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption as 
> it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience as Tez does.
> This is an umbrella JIRA which will cover many upcoming subtasks. The design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7262:
---

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-7318

> Partitioned Table Function (PTF) query fails on ORC table when attempting to 
> vectorize
> --
>
> Key: HIVE-7262
> URL: https://issues.apache.org/jira/browse/HIVE-7262
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
>
> In ptf.q, create the part table with STORED AS ORC and SET 
> hive.vectorized.execution.enabled=true;
> Queries fail to find the BLOCKOFFSET virtual column during vectorization and 
> suffer an exception.
> ERROR vector.VectorizationContext 
> (VectorizationContext.java:getInputColumnIndex(186)) - The column 
> BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
> Jitendra pointed out that the routine in Vectorize.java that returns the 
> VectorizationContext needs to add virtual columns to the map, too.





[jira] [Commented] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set

2014-07-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049510#comment-14049510
 ] 

Ashutosh Chauhan commented on HIVE-7314:


+1

> Wrong results of UDF when hive.cache.expr.evaluation is set
> ---
>
> Key: HIVE-7314
> URL: https://issues.apache.org/jira/browse/HIVE-7314
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: dima machlin
>Assignee: Navis
> Attachments: HIVE-7314.1.patch.txt
>
>
> It seems that the expression caching doesn't work when using a UDF inside 
> another UDF or a Hive function.
> For example:
> tbl has one row: 'a','b'
> The following query:
> {code:sql} select concat(custUDF(a),' ', custUDF(b)) from tbl; {code}
> returns 'a a'.
> It seems to cache custUDF(a) and use it for custUDF(b).
> The same query without the concat works fine.
> Replacing the concat with another custom UDF also returns 'a a'.





[jira] [Updated] (HIVE-7127) Handover more details on exception in hiveserver2

2014-07-01 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7127:


   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Szehon Ho, for the review!

> Handover more details on exception in hiveserver2
> -
>
> Key: HIVE-7127
> URL: https://issues.apache.org/jira/browse/HIVE-7127
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Fix For: 0.14.0
>
> Attachments: HIVE-7127.1.patch.txt, HIVE-7127.2.patch.txt, 
> HIVE-7127.4.patch.txt, HIVE-7127.5.patch.txt
>
>
> Currently, JDBC hands over the exception message and error codes, but that is 
> not helpful for debugging.
> {noformat}
> org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' ''
>   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
>   at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
>   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
>   at org.apache.hive.beeline.Commands.execute(Commands.java:736)
>   at org.apache.hive.beeline.Commands.sql(Commands.java:657)
>   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889)
>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744)
>   at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459)
>   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> {noformat}
> With this patch, the JDBC client can get more details from HiveServer2.
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' ''
>   at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown Source)
>   at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown Source)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown Source)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(Unknown Source)
>   at org.apache.hive.service.cli.CLIService.executeStatementAsync(Unknown Source)
>   at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(Unknown Source)
>   at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown Source)
>   at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown Source)
>   at org.apache.thrift.ProcessFunction.process(Unknown Source)
>   at org.apache.thrift.TBaseProcessor.process(Unknown Source)
>   at org.apache.hive.service.auth.TSetIpAddressProcessor.process(Unknown Source)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> {noformat}
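
The "Caused by:" section in the second trace is what Java prints when an exception carries a cause. The sketch below illustrates that general mechanism only, using the JDK's `java.sql.SQLException(String, Throwable)` constructor; it is not how the patch transports server-side details, and the class and method names are hypothetical.

```java
import java.sql.SQLException;

final class CauseChainSketch {
    // Wraps a lower-level failure so the original stays reachable via getCause().
    static SQLException wrap(Exception serverSide) {
        return new SQLException("Error while compiling statement", serverSide);
    }

    // Flattens an exception and its chain of causes into one line.
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur != t) {
                sb.append(" <- caused by: ");
            }
            sb.append(cur.getClass().getSimpleName())
              .append(": ")
              .append(cur.getMessage());
        }
        return sb.toString();
    }
}
```

Calling `printStackTrace()` on the wrapped exception would print both stack traces, with the inner one introduced by "Caused by:".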





[jira] [Commented] (HIVE-7294) sql std auth - authorize show grant statements

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049471#comment-14049471
 ] 

Hive QA commented on HIVE-7294:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653354/HIVE-7294.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5663 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/654/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/654/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-654/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653354

> sql std auth - authorize show grant statements
> --
>
> Key: HIVE-7294
> URL: https://issues.apache.org/jira/browse/HIVE-7294
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, SQLStandardAuthorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7294.1.patch, HIVE-7294.2.patch
>
>
> A non-admin user should be allowed to run show grant commands only for 
> themselves or a role they belong to.
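
Assuming the intended rule is that admins may run show grant for any principal while other users may do so only for themselves or for roles they belong to, the check could look roughly like this. All names are hypothetical and this is not the SQL standard authorization code from the patch.

```java
import java.util.Set;

final class ShowGrantAuthSketch {
    // Decides whether 'caller' may run SHOW GRANT for 'principal'.
    static boolean mayShowGrant(String caller, boolean callerIsAdmin,
                                Set<String> callerRoles,
                                String principal, boolean principalIsRole) {
        if (callerIsAdmin) {
            return true; // admins may inspect any principal
        }
        return principalIsRole
            ? callerRoles.contains(principal) // a role the caller belongs to
            : caller.equals(principal);       // the caller themselves
    }
}
```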





[jira] [Updated] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates

2014-07-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-7326:


Summary: Hive complains invalid column reference with 'having' aggregate 
predicates  (was: Hive complains invalid column reference with group by having 
aggregate predicates)

> Hive complains invalid column reference with 'having' aggregate predicates
> --
>
> Key: HIVE-7326
> URL: https://issues.apache.org/jira/browse/HIVE-7326
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> CREATE TABLE TestV1_Staples (
>   Item_Count INT,
>   Ship_Priority STRING,
>   Order_Priority STRING,
>   Order_Status STRING,
>   Order_Quantity DOUBLE,
>   Sales_Total DOUBLE,
>   Discount DOUBLE,
>   Tax_Rate DOUBLE,
>   Ship_Mode STRING,
>   Fill_Time DOUBLE,
>   Gross_Profit DOUBLE,
>   Price DOUBLE,
>   Ship_Handle_Cost DOUBLE,
>   Employee_Name STRING,
>   Employee_Dept STRING,
>   Manager_Name STRING,
>   Employee_Yrs_Exp DOUBLE,
>   Employee_Salary DOUBLE,
>   Customer_Name STRING,
>   Customer_State STRING,
>   Call_Center_Region STRING,
>   Customer_Balance DOUBLE,
>   Customer_Segment STRING,
>   Prod_Type1 STRING,
>   Prod_Type2 STRING,
>   Prod_Type3 STRING,
>   Prod_Type4 STRING,
>   Product_Name STRING,
>   Product_Container STRING,
>   Ship_Promo STRING,
>   Supplier_Name STRING,
>   Supplier_Balance DOUBLE,
>   Supplier_Region STRING,
>   Supplier_State STRING,
>   Order_ID STRING,
>   Order_Year INT,
>   Order_Month INT,
>   Order_Day INT,
>   Order_Date_ STRING,
>   Order_Quarter STRING,
>   Product_Base_Margin DOUBLE,
>   Product_ID STRING,
>   Receive_Time DOUBLE,
>   Received_Date_ STRING,
>   Ship_Date_ STRING,
>   Ship_Charge DOUBLE,
>   Total_Cycle_Time DOUBLE,
>   Product_In_Stock STRING,
>   PID INT,
>   Market_Segment STRING
>   );
> Query that works:
> SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
> default.testv1_staples s1 GROUP BY customer_name HAVING (
> (COUNT(s1.discount) <= 822) AND
> (SUM(customer_balance) <= 4074689.00041)
> );
> Query that fails:
> SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
> default.testv1_staples s1 GROUP BY customer_name HAVING (
> (SUM(customer_balance) <= 4074689.00041)
> AND (COUNT(s1.discount) <= 822)
> );





[jira] [Created] (HIVE-7326) Hive complains invalid column reference with group by having aggregate predicates

2014-07-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-7326:
---

 Summary: Hive complains invalid column reference with group by 
having aggregate predicates
 Key: HIVE-7326
 URL: https://issues.apache.org/jira/browse/HIVE-7326
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan



CREATE TABLE TestV1_Staples (
  Item_Count INT,
  Ship_Priority STRING,
  Order_Priority STRING,
  Order_Status STRING,
  Order_Quantity DOUBLE,
  Sales_Total DOUBLE,
  Discount DOUBLE,
  Tax_Rate DOUBLE,
  Ship_Mode STRING,
  Fill_Time DOUBLE,
  Gross_Profit DOUBLE,
  Price DOUBLE,
  Ship_Handle_Cost DOUBLE,
  Employee_Name STRING,
  Employee_Dept STRING,
  Manager_Name STRING,
  Employee_Yrs_Exp DOUBLE,
  Employee_Salary DOUBLE,
  Customer_Name STRING,
  Customer_State STRING,
  Call_Center_Region STRING,
  Customer_Balance DOUBLE,
  Customer_Segment STRING,
  Prod_Type1 STRING,
  Prod_Type2 STRING,
  Prod_Type3 STRING,
  Prod_Type4 STRING,
  Product_Name STRING,
  Product_Container STRING,
  Ship_Promo STRING,
  Supplier_Name STRING,
  Supplier_Balance DOUBLE,
  Supplier_Region STRING,
  Supplier_State STRING,
  Order_ID STRING,
  Order_Year INT,
  Order_Month INT,
  Order_Day INT,
  Order_Date_ STRING,
  Order_Quarter STRING,
  Product_Base_Margin DOUBLE,
  Product_ID STRING,
  Receive_Time DOUBLE,
  Received_Date_ STRING,
  Ship_Date_ STRING,
  Ship_Charge DOUBLE,
  Total_Cycle_Time DOUBLE,
  Product_In_Stock STRING,
  PID INT,
  Market_Segment STRING
  );

Query that works:
SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
default.testv1_staples s1 GROUP BY customer_name HAVING (
(COUNT(s1.discount) <= 822) AND
(SUM(customer_balance) <= 4074689.00041)
);
Query that fails:
SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
default.testv1_staples s1 GROUP BY customer_name HAVING (
(SUM(customer_balance) <= 4074689.00041)
AND (COUNT(s1.discount) <= 822)
);






[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7262:
---

Status: In Progress  (was: Patch Available)

> Partitioned Table Function (PTF) query fails on ORC table when attempting to 
> vectorize
> --
>
> Key: HIVE-7262
> URL: https://issues.apache.org/jira/browse/HIVE-7262
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
>
> In ptf.q, create the part table with STORED AS ORC and SET 
> hive.vectorized.execution.enabled=true;
> Queries fail to find the BLOCKOFFSET virtual column during vectorization and 
> suffer an exception.
> ERROR vector.VectorizationContext 
> (VectorizationContext.java:getInputColumnIndex(186)) - The column 
> BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
> Jitendra pointed out that the routine in Vectorize.java that returns the 
> VectorizationContext needs to add virtual columns to the map, too.





[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7262:
---

Status: Patch Available  (was: In Progress)

> Partitioned Table Function (PTF) query fails on ORC table when attempting to 
> vectorize
> --
>
> Key: HIVE-7262
> URL: https://issues.apache.org/jira/browse/HIVE-7262
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
>
> In ptf.q, create the part table with STORED AS ORC and SET 
> hive.vectorized.execution.enabled=true;
> Queries fail to find the BLOCKOFFSET virtual column during vectorization and 
> suffer an exception.
> ERROR vector.VectorizationContext 
> (VectorizationContext.java:getInputColumnIndex(186)) - The column 
> BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
> Jitendra pointed out that the routine in Vectorize.java that returns the 
> VectorizationContext needs to add virtual columns to the map, too.





[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7262:
---

Attachment: HIVE-7262.2.patch

> Partitioned Table Function (PTF) query fails on ORC table when attempting to 
> vectorize
> --
>
> Key: HIVE-7262
> URL: https://issues.apache.org/jira/browse/HIVE-7262
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
>
> In ptf.q, create the part table with STORED AS ORC and SET 
> hive.vectorized.execution.enabled=true;
> Queries fail to find the BLOCKOFFSET virtual column during vectorization and 
> suffer an exception.
> ERROR vector.VectorizationContext 
> (VectorizationContext.java:getInputColumnIndex(186)) - The column 
> BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
> Jitendra pointed out that the routine in Vectorize.java that returns the 
> VectorizationContext needs to add virtual columns to the map, too.





[jira] [Commented] (HIVE-494) Select columns by index instead of name

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049363#comment-14049363
 ] 

Hive QA commented on HIVE-494:
--



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653349/HIVE-494.3.patch.txt

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5675 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/652/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/652/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-652/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653349

> Select columns by index instead of name
> ---
>
> Key: HIVE-494
> URL: https://issues.apache.org/jira/browse/HIVE-494
> Project: Hive
>  Issue Type: Wish
>  Components: Clients, Query Processor
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
>  Labels: SQL
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-494.D1641.1.patch, 
> HIVE-494.2.patch.txt, HIVE-494.3.patch.txt, HIVE-494.D12153.1.patch
>
>
> SELECT mytable[0], mytable[2] FROM some_table_name mytable;
> ...should return the first and third columns, respectively, from mytable 
> regardless of their column names.
> The need for "names" specifically is kind of silly when they just get 
> translated into numbers anyway.





[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049302#comment-14049302
 ] 

Hive QA commented on HIVE-5976:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653346/HIVE-5976.4.patch

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5673 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_file_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_storage_format_descriptor
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/650/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/650/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-650/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653346

> Decouple input formats from STORED as keywords
> --
>
> Key: HIVE-5976
> URL: https://issues.apache.org/jira/browse/HIVE-5976
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, 
> HIVE-5976.4.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, 
> HIVE-5976.patch
>
>
> As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
> It'd be nice if there was a registration system so we didn't need to do that.





[jira] [Updated] (HIVE-7325) Support non-constant expressions for MAP type indices.

2014-07-01 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7325:
--

Description: 
Here is my sample:
{code}
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
TBLPROPERTIES ("hbase.table.name" = "RECORD"); 


CREATE TABLE KEY_RECORD(KeyValue String, RecordId map) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 
{code}
The following join statement doesn't work. 
{code}
SELECT a.*, b.* from KEY_RECORD a join RECORD b 
WHERE a.RecordId[b.RecordID] is not null;
{code}
FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
supported. Error encountered near token 'RecordID' 

  was:
Here is my sample: 
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
TBLPROPERTIES ("hbase.table.name" = "RECORD"); 


CREATE TABLE KEY_RECORD(KeyValue String, RecordId map) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 

The following join statement doesn't work. 

SELECT a.*, b.* from KEY_RECORD a join RECORD b 
WHERE a.RecordId[b.RecordID] is not null;

FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
supported. Error encountered near token 'RecordID' 


> Support non-constant expressions for MAP type indices.
> --
>
> Key: HIVE-7325
> URL: https://issues.apache.org/jira/browse/HIVE-7325
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Mala Chikka Kempanna
> Fix For: 0.14.0
>
>
> Here is my sample:
> {code}
> CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
> TBLPROPERTIES ("hbase.table.name" = "RECORD"); 
> CREATE TABLE KEY_RECORD(KeyValue String, RecordId map) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
> TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 
> {code}
> The following join statement doesn't work. 
> {code}
> SELECT a.*, b.* from KEY_RECORD a join RECORD b 
> WHERE a.RecordId[b.RecordID] is not null;
> {code}
> FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
> supported. Error encountered near token 'RecordID' 





[jira] [Created] (HIVE-7325) Support non-constant expressions for MAP indexes.

2014-07-01 Thread Mala Chikka Kempanna (JIRA)
Mala Chikka Kempanna created HIVE-7325:
--

 Summary: Support non-constant expressions for MAP indexes.
 Key: HIVE-7325
 URL: https://issues.apache.org/jira/browse/HIVE-7325
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Mala Chikka Kempanna
 Fix For: 0.14.0


Here is my sample: 
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
TBLPROPERTIES ("hbase.table.name" = "RECORD"); 


CREATE TABLE KEY_RECORD(KeyValue String, RecordId map) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 

The following join statement doesn't work. 

SELECT a.*, b.* from KEY_RECORD a join RECORD b 
WHERE a.RecordId[b.RecordID] is not null;

FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
supported. Error encountered near token 'RecordID' 





[jira] [Updated] (HIVE-7325) Support non-constant expressions for MAP type indices.

2014-07-01 Thread Mala Chikka Kempanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mala Chikka Kempanna updated HIVE-7325:
---

Summary: Support non-constant expressions for MAP type indices.  (was: 
Support non-constant expressions for MAP indexes.)

> Support non-constant expressions for MAP type indices.
> --
>
> Key: HIVE-7325
> URL: https://issues.apache.org/jira/browse/HIVE-7325
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Mala Chikka Kempanna
> Fix For: 0.14.0
>
>
> Here is my sample: 
> CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
> TBLPROPERTIES ("hbase.table.name" = "RECORD"); 
> CREATE TABLE KEY_RECORD(KeyValue String, RecordId map) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
> TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 
> The following join statement doesn't work. 
> SELECT a.*, b.* from KEY_RECORD a join RECORD b 
> WHERE a.RecordId[b.RecordID] is not null;
> FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
> supported. Error encountered near token 'RecordID' 





[jira] [Commented] (HIVE-5020) HCat reading null-key map entries causes NPE

2014-07-01 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049238#comment-14049238
 ] 

Sushanth Sowmyan commented on HIVE-5020:


Sorry for the late response to this jira, and thanks for the input, all. I'd 
initially wanted to give it time for more people to respond, and then this fell 
by the wayside.

Thrift structures do not support map null keys. I agree that sortedness is not 
important for maps, and in fact, we should not guarantee it for something 
that's just called a map.

And while I'd like to see a use case for nulls in keys supported, it looks like 
the conventional hive semantics for maps ignores null keys, and changing things 
for rcfile users so that they suddenly start getting null keys is a recipe for 
trouble for a lot of users. So having orc match rc behaviour, and making that 
the standard "hive" behaviour, might make more sense. 
[~owen.omalley]/[~prasanth_j], could you comment on what you think the impact 
of changing orc behaviour that way might be?

HCat should adopt whatever behaviour we standardize on for hive, and can follow 
after that.

> HCat reading null-key map entries causes NPE
> 
>
> Key: HIVE-5020
> URL: https://issues.apache.org/jira/browse/HIVE-5020
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> Currently, if someone has a null key in a map, HCatInputFormat will terminate 
> with an NPE while trying to read it.
> {noformat}
> java.lang.NullPointerException
> at java.lang.String.compareTo(String.java:1167)
> at java.lang.String.compareTo(String.java:92)
> at java.util.TreeMap.put(TreeMap.java:545)
> at 
> org.apache.hcatalog.data.HCatRecordSerDe.serializeMap(HCatRecordSerDe.java:222)
> at 
> org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:198)
> at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53)
> at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97)
> at 
> org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203)
> {noformat}
> This is because we use a TreeMap to preserve order of elements in the map 
> when reading from the underlying storage/serde.
> This problem is easily fixed in a number of ways:
> a) Switch to HashMap, which allows null keys. That does not preserve order of 
> keys, which should not be important for map fields, but if we desire that, we 
> have a solution for that too - LinkedHashMap, which would both retain order 
> and allow us to insert null keys into the map.
> b) Ignore null keyed entries - check if the field we read is null, and if it 
> is, then ignore that item in the record altogether. This way, HCat is robust 
> in what it does - it does not terminate with an NPE, and it does not allow 
> null keys in maps that might be problematic to layers above us that are not 
> used to seeing nulls as keys in maps.
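
The two fixes above can be sketched in plain Java. This is only an illustration built on java.util collections, not the HCatRecordSerDe code itself; the class and method names (NullKeyMapSketch, copyKeepingNullKeys, copyDroppingNullKeys, treeMapRejectsNullKey) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class for illustration; not part of HCatalog.
class NullKeyMapSketch {

    // Fix (a): LinkedHashMap keeps insertion order and accepts a null key.
    static <K, V> Map<K, V> copyKeepingNullKeys(Map<K, V> source) {
        return new LinkedHashMap<K, V>(source);
    }

    // Fix (b): skip entries whose key is null, so a sorted TreeMap stays usable.
    static <K extends Comparable<K>, V> Map<K, V> copyDroppingNullKeys(Map<K, V> source) {
        Map<K, V> result = new TreeMap<K, V>();
        for (Map.Entry<K, V> e : source.entrySet()) {
            if (e.getKey() != null) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Reproduces the failure: a TreeMap with natural ordering rejects null keys.
    static boolean treeMapRejectsNullKey() {
        Map<String, String> sorted = new TreeMap<String, String>();
        sorted.put("a", "1");
        try {
            sorted.put(null, "2");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

Variant (b) drops data silently, so a real implementation would probably want to log or count the skipped entries.
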
> Why do I bring up the second fix? First, I bring it up because of the way we 
> discovered this bug. When reading from an RCFile, we do not notice this bug. 
> If the same query that produced the RCFile instead produces an Orcfile, and 
> we try reading from it, we see this problem.
> RCFile seems to be quietly stripping any null key entries, whereas Orc 
> retains them. This is why we didn't notice this problem for a long while, and 
> suddenly, now, we are. Now, if we fix our code to allow nulls in map keys 
> through to layers above, we expose layers above to this change, which may 
> then cause them to break. (Technically, this is stretching the case because 
> we already break now if they care) More importantly, though, we have a case 
> now, where the same data will be exposed differently if it were stored as orc 
> or if it were stored as rcfile. And as a layer that is supposed to make 
> storage invisible to the end user, HCat should attempt to provide some 
> consistency in how data behaves to the end user.
> Secondly, whether or not nulls should be supported as keys in Maps seems to 
> be almost a religious view. Some people see it from a perspective of a 
> "mapping", which lends itself to a "Sure, if we encounter a null, we map to 
> this other value" kind of a view, whereas other people view it from a "lookup 
> index" kind of view, which lends itself to a "null as a key makes no sense - 
> What kind of lookup do you expect to perform?" kind of view. Both views have 
> their points, and it makes sense to see if we need to support it.
> That said...
> There is another important concern at hand here: nulls in map keys might be 
> due to bad data (corruption or loading error), and by stripping them, we might 
> be silently hiding that from the user. So "silent stripping" is bad. This is 
> an important point that does steer 

[jira] [Commented] (HIVE-7127) Handover more details on exception in hiveserver2

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049232#comment-14049232
 ] 

Hive QA commented on HIVE-7127:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652341/HIVE-7127.5.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5657 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/649/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/649/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-649/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652341

> Handover more details on exception in hiveserver2
> -
>
> Key: HIVE-7127
> URL: https://issues.apache.org/jira/browse/HIVE-7127
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-7127.1.patch.txt, HIVE-7127.2.patch.txt, 
> HIVE-7127.4.patch.txt, HIVE-7127.5.patch.txt
>
>
> Currently, JDBC hands over only the exception message and error codes, which 
> is not helpful for debugging.
> {noformat}
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: ParseException line 1:0 cannot recognize input near 
> 'createa' 'asd' ''
>   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
>   at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
>   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
>   at org.apache.hive.beeline.Commands.execute(Commands.java:736)
>   at org.apache.hive.beeline.Commands.sql(Commands.java:657)
>   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889)
>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744)
>   at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459)
>   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> {noformat}
> With this patch, the JDBC client can get more details from hiveserver2. 
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while 
> compiling statement: FAILED: ParseException line 1:0 cannot recognize input 
> near 'createa' 'asd' ''
>   at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown 
> Source)
>   at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown 
> Source)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown
>  Source)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(Unknown
>  Source)
>   at org.apache.hive.service.cli.CLIService.executeStatementAsync(Unknown 
> Source)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(Unknown 
> Source)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown
>  Source)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown
>  Source)
>   at org.apache.thrift.ProcessFunction.process(Unknown Source)
>   at org.apache.thrift.TBaseProcessor.process(Unknown Source)
>   at org.apache.hive.service.auth.TSetIpAddressProcessor.process(Unknown 
> Source)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown 
> Source)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> {noformat}
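
On the client side, the extra detail could be read by walking the exception's cause chain. The sketch below assumes the driver attaches the server-side exception as the cause of the SQLException it throws. The issue text does not spell that out, and SqlErrorDetails and describe are hypothetical names, not part of the Hive JDBC driver.

```java
import java.sql.SQLException;

// Hypothetical client-side helper; not part of the Hive JDBC driver.
class SqlErrorDetails {

    // Builds one line per throwable in the cause chain, outermost first.
    static String describe(SQLException e) {
        StringBuilder out = new StringBuilder();
        out.append("SQLState=").append(e.getSQLState())
           .append(", errorCode=").append(e.getErrorCode())
           .append('\n');
        int depth = 0;
        // The depth limit guards against a cyclic cause chain.
        for (Throwable t = e; t != null && depth < 32; t = t.getCause(), depth++) {
            out.append(depth == 0 ? "" : "Caused by: ")
               .append(t.getClass().getName())
               .append(": ")
               .append(t.getMessage())
               .append('\n');
        }
        return out.toString();
    }
}
```

A caller would typically pass the SQLException caught around Statement.execute() to describe() and log the result.
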





[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key

2014-07-01 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049228#comment-14049228
 ] 

Sushanth Sowmyan commented on HIVE-7282:


While this shields HCat from the difference between orc and rcfile, HIVE-5020 
is about the differences in behaviour between rcfile and orc in how they handle 
nulls in maps, and should not be closed until hive behaves consistently. 
I would actually prefer to solve this in a consistent manner in hive before 
applying this to hcat, as explained in comments in that jira. I'll try to 
revive the discussion there.

> HCatLoader fail to load Orc map with null key
> -
>
> Key: HIVE-7282
> URL: https://issues.apache.org/jira/browse/HIVE-7282
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.14.0
>
> Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch
>
>
> Here is the stack:
> Get exception:
> AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374)
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> ... 13 more





[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-07-01 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049146#comment-14049146
 ] 

Jason Dere commented on HIVE-7090:
--

I will add the view check as a follow-up item; I think this can be done during 
semantic analysis of the view creation.

> Support session-level temporary tables in Hive
> --
>
> Key: HIVE-7090
> URL: https://issues.apache.org/jira/browse/HIVE-7090
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Gunther Hagleitner
>Assignee: Jason Dere
> Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, 
> HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch
>
>
> It's common to see sql scripts that create some temporary table as an 
> intermediate result, run some additional queries against it and then clean up 
> at the end.
> We should support temporary tables properly, meaning automatically manage the 
> life cycle and make sure the visibility is restricted to the creating 
> connection/session. Without these it's common to see left over tables in 
> meta-store or weird errors with clashing tmp table names.
> Proposed syntax:
> CREATE TEMPORARY TABLE 
> CTAS, CTL, INSERT INTO, should all be supported as usual.
> Knowing that a user wants a temp table can enable us to further optimize 
> access to it. E.g.: temp tables should be kept in memory where possible, 
> compactions and merging table files aren't required, ...





[jira] [Commented] (HIVE-7292) Hive on Spark

2014-07-01 Thread niraj rai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049138#comment-14049138
 ] 

niraj rai commented on HIVE-7292:
-

I am out of office, so replies to email might be delayed.


> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantage of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> users a new alternative so that they can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption as 
> it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience as Tez does.
> This is an umbrella JIRA which will cover many coming subtasks. A design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Issue Comment Deleted] (HIVE-7292) Hive on Spark

2014-07-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7292:
---

Comment: was deleted

(was: I am in OOO, so, the replying to the email might get delayed. Please 
reach out to me at (408) 799-8605 if you need something urgent.
Regards
Niraj

)

> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantage of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> users a new alternative so that they can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption as 
> it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience as Tez does.
> This is an umbrella JIRA which will cover many coming subtasks. A design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-07-01 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049124#comment-14049124
 ] 

Alan Gates commented on HIVE-7090:
--

I believe we need to solve the views issue, as being able to create a view on a 
table when others can see the view and not the table is bogus.

Other than that I'm +1 on the patch.

> Support session-level temporary tables in Hive
> --
>
> Key: HIVE-7090
> URL: https://issues.apache.org/jira/browse/HIVE-7090
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Gunther Hagleitner
>Assignee: Jason Dere
> Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, 
> HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch
>
>
> It's common to see sql scripts that create some temporary table as an 
> intermediate result, run some additional queries against it and then clean up 
> at the end.
> We should support temporary tables properly, meaning automatically manage the 
> life cycle and make sure the visibility is restricted to the creating 
> connection/session. Without these it's common to see left over tables in 
> meta-store or weird errors with clashing tmp table names.
> Proposed syntax:
> CREATE TEMPORARY TABLE 
> CTAS, CTL, INSERT INTO, should all be supported as usual.
> Knowing that a user wants a temp table can enable us to further optimize 
> access to it. E.g.: temp tables should be kept in memory where possible, 
> compactions and merging table files aren't required, ...





[jira] [Resolved] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()

2014-07-01 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HIVE-7307.
--

Resolution: Later

> Lack of synchronization for TxnHandler#getDbConn()
> --
>
> Key: HIVE-7307
> URL: https://issues.apache.org/jira/browse/HIVE-7307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> TxnHandler#getDbConn() accesses connPool without holding lock on 
> TxnHandler.class
> {code}
>   Connection dbConn = connPool.getConnection();
>   dbConn.setAutoCommit(false);
> {code}
> A null check should be performed on the return value, dbConn.
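
The null check suggested in the description could look roughly like this. It is a sketch only: ConnectionSource is a hypothetical stand-in for the pool held in connPool, DbConnSketch is not the real TxnHandler, and the synchronization question raised in the title is not addressed here.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical sketch; not the actual TxnHandler code.
class DbConnSketch {

    // Stand-in for the connection pool, which the issue only names as connPool.
    interface ConnectionSource {
        Connection getConnection() throws SQLException;
    }

    // Returns a connection with auto-commit off, failing clearly on null.
    static Connection getDbConn(ConnectionSource pool) throws SQLException {
        Connection dbConn = pool.getConnection();
        if (dbConn == null) {
            throw new SQLException("Connection pool returned null");
        }
        dbConn.setAutoCommit(false);
        return dbConn;
    }
}
```

Throwing an SQLException keeps the failure in the same exception type that callers of a connection getter would already handle, instead of surfacing later as a NullPointerException.
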





[jira] [Commented] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()

2014-07-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049106#comment-14049106
 ] 

Ted Yu commented on HIVE-7307:
--

That was the initial thought.

Thread safety would be handled by BoneCP.

> Lack of synchronization for TxnHandler#getDbConn()
> --
>
> Key: HIVE-7307
> URL: https://issues.apache.org/jira/browse/HIVE-7307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> TxnHandler#getDbConn() accesses connPool without holding lock on 
> TxnHandler.class
> {code}
>   Connection dbConn = connPool.getConnection();
>   dbConn.setAutoCommit(false);
> {code}
> A null check should be performed on the return value, dbConn.





[jira] [Commented] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()

2014-07-01 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049066#comment-14049066
 ] 

Alan Gates commented on HIVE-7307:
--

[~ted_yu], not sure why we should be synchronizing calls to connPool.  Are you 
worried that they are not thread safe?

> Lack of synchronization for TxnHandler#getDbConn()
> --
>
> Key: HIVE-7307
> URL: https://issues.apache.org/jira/browse/HIVE-7307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> TxnHandler#getDbConn() accesses connPool without holding lock on 
> TxnHandler.class
> {code}
>   Connection dbConn = connPool.getConnection();
>   dbConn.setAutoCommit(false);
> {code}
> A null check should be performed on the return value, dbConn.





[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049060#comment-14049060
 ] 

Hive QA commented on HIVE-7205:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12651916/HIVE-7205.2.patch.txt

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 5657 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/648/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/648/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-648/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12651916

> Wrong results when union all of grouping followed by group by with 
> correlation optimization
> ---
>
> Key: HIVE-7205
> URL: https://issues.apache.org/jira/browse/HIVE-7205
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: dima machlin
>Assignee: Navis
>Priority: Critical
> Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt
>
>
> Use case:
> Table TBL (a string, b string) contains a single row: 'a','a'
> The following query:
> {code:sql}
> select b, sum(cc) from (
> select b,count(1) as cc from TBL group by b
> union all
> select a as b,count(1) as cc from TBL group by a
> ) z
> group by b
> {code}
> returns
> a 1
> a 1
> when hive.optimize.correlation=true is set.
> With hive.optimize.correlation=false, it returns the correct result: a 2
> The plan with correlation optimization :
> {code:sql}
> ABSTRACT SYNTAX TREE:
>   (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM 
> (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
> TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR 
> (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY 
> (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION 
> (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) 
> (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL 
> a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT 
> (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum 
> (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Alias -> Map Operator Tree:
> null-subquery1:z-subquery1:TBL 
>   TableScan
> alias: TBL
> Select Operator
>   expressions:
> expr: b
> type: string
>   outputColumnNames: b
>   Group By Operator
> aggregations:
>   expr: count(1)
> bucketGroup: false
> keys:
>   expr: b
>   type: string
> mode: hash
> outputColumnNames: _col0, _col1
>  
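
As a cross-check of the expected answer in the HIVE-7205 use case quoted above, the query's semantics can be sketched in plain Java. This is an illustration only: `UnionAllGroupBySketch` and its methods are hypothetical names, and the code says nothing about how Hive plans the query.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class UnionAllGroupBySketch {
    /** SELECT col, count(1) FROM rows GROUP BY col */
    static Map<String, Long> countBy(List<String[]> rows, int col) {
        Map<String, Long> counts = new TreeMap<>();
        for (String[] row : rows) {
            counts.merge(row[col], 1L, Long::sum);
        }
        return counts;
    }

    /** Outer query: sum(cc) grouped by b over the UNION ALL of both branches. */
    static Map<String, Long> expected(List<String[]> rows) {
        Map<String, Long> total = new TreeMap<>();
        countBy(rows, 1).forEach((k, v) -> total.merge(k, v, Long::sum)); // group by b
        countBy(rows, 0).forEach((k, v) -> total.merge(k, v, Long::sum)); // group by a (aliased as b)
        return total;
    }
}
```

For the single row ('a','a'), both branches produce (a, 1), so the outer aggregation should merge them into one row (a, 2).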

[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-07-01 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049026#comment-14049026
 ] 

Jason Dere commented on HIVE-7090:
--

[~brocknoland], does the patch look okay?

> Support session-level temporary tables in Hive
> --
>
> Key: HIVE-7090
> URL: https://issues.apache.org/jira/browse/HIVE-7090
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Gunther Hagleitner
>Assignee: Jason Dere
> Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, 
> HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch
>
>
> It's common to see sql scripts that create some temporary table as an 
> intermediate result, run some additional queries against it and then clean up 
> at the end.
> We should support temporary tables properly, meaning automatically manage the 
> life cycle and make sure the visibility is restricted to the creating 
> connection/session. Without these it's common to see left over tables in 
> meta-store or weird errors with clashing tmp table names.
> Proposed syntax:
> CREATE TEMPORARY TABLE 
> CTAS, CTL, INSERT INTO, should all be supported as usual.
> Knowing that a user wants a temp table can enable us to further optimize 
> access to it. E.g.: temp tables should be kept in memory where possible, 
> compactions and merging table files aren't required, ...





[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-07-01 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049027#comment-14049027
 ] 

Jason Dere commented on HIVE-7090:
--

I don't think the failure in TestHCatLoader is related; it passes locally for 
me.

> Support session-level temporary tables in Hive
> --
>
> Key: HIVE-7090
> URL: https://issues.apache.org/jira/browse/HIVE-7090
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Gunther Hagleitner
>Assignee: Jason Dere
> Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, 
> HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch
>
>
> It's common to see sql scripts that create some temporary table as an 
> intermediate result, run some additional queries against it and then clean up 
> at the end.
> We should support temporary tables properly, meaning automatically manage the 
> life cycle and make sure the visibility is restricted to the creating 
> connection/session. Without these it's common to see left over tables in 
> meta-store or weird errors with clashing tmp table names.
> Proposed syntax:
> CREATE TEMPORARY TABLE 
> CTAS, CTL, INSERT INTO, should all be supported as usual.
> Knowing that a user wants a temp table can enable us to further optimize 
> access to it. E.g.: temp tables should be kept in memory where possible, 
> compactions and merging table files aren't required, ...





[jira] [Commented] (HIVE-7292) Hive on Spark

2014-07-01 Thread niraj rai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048936#comment-14048936
 ] 

niraj rai commented on HIVE-7292:
-

I am out of office, so replies to email might be delayed. Please reach out 
to me at (408) 799-8605 if you need something urgently.
Regards
Niraj



> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantages of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> user a new alternative so that those user can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption as 
> it exposes Spark users  to a viable, feature-rich de facto standard SQL tools 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience as Tez does.
> This is an umbrella JIRA which will cover many coming subtask. Design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Updated] (HIVE-7292) Hive on Spark

2014-07-01 Thread Jeff Hammerbacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Hammerbacher updated HIVE-7292:


Description: 
Spark as an open-source data analytics cluster computing framework has gained 
significant momentum recently. Many Hive users already have Spark installed as 
their computing backbone. To take advantages of Hive, they still need to have 
either MapReduce or Tez on their cluster. This initiative will provide user a 
new alternative so that those user can consolidate their backend. 

Secondly, providing such an alternative further increases Hive's adoption as it 
exposes Spark users  to a viable, feature-rich de facto standard SQL tools on 
Hadoop.

Finally, allowing Hive to run on Spark also has performance benefits. Hive 
queries, especially those involving multiple reducer stages, will run faster, 
thus improving user experience as Tez does.

This is an umbrella JIRA which will cover many coming subtask. Design doc will 
be attached here shortly, and will be on the wiki as well. Feedback from the 
community is greatly appreciated!

  was:
Spark as an open-source data analytics cluster computing framework has gained 
significant momentum recently. Many Hive users already have Spark installed as 
their computing backbone. To take advantages of Hive, they still need to have 
either MapReduce or Tez on their cluster. This initiative will provide user a 
new alternative so that those user can consolidate their backend. 

Secondly, providing such an alternative further increases Hive's adoption as it 
exposes Spark users  to a viable, feature-rich de facto standard SQL tools on 
Hadoop.

Finally, allowing Hive to run on Spark also has performance benefits. Hive 
queries, especially those involving multiple reducer stages, will run faster, 
thus improving user experience as Tez does.

This is an umber JIRA which will cover many coming subtask. Design doc will be 
attached here shortly, and will be on the wiki as well. Feedback from the 
community is greatly appreciated!


> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantages of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> user a new alternative so that those user can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption as 
> it exposes Spark users  to a viable, feature-rich de facto standard SQL tools 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience as Tez does.
> This is an umbrella JIRA which will cover many coming subtask. Design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Commented] (HIVE-860) Persistent distributed cache

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048892#comment-14048892
 ] 

Hive QA commented on HIVE-860:
--



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653325/HIVE-860.patch

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 5656 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_cast
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_empty_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_long
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_columnarserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag_queries
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udtf_output_on_close
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/647/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/647/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-647/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653325

> Persistent distributed cache
> 
>
> Key: HIVE-860
> URL: https://issues.apache.org/jira/browse/HIVE-860
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.12.0
>Reporter: Zheng Shao
>Assignee: Brock Noland
> Fix For: 0.14.0
>
> Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch
>
>
> DistributedCache is shared across multiple jobs, if the hdfs file name is the 
> same.
> We need to make sure Hive puts the same file into the same location every time 
> and does not overwrite it if the file content is the same.
> We can achieve 2 different results:
> A1. Files added with the same name, timestamp, and md5 in the same session 
> will have a single copy in distributed cache.
> A2. Files added with the same name, timestamp, and md5 will have a single 
> copy in distributed cache.
> A2 has a bigger benefit in sharing but may raise a question on when Hive 
> should clean it up in hdfs.
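
One way to picture "same name, timestamp, and md5 means a single copy" is a deterministic cache path derived from those three values. The sketch below is an assumption for illustration, not the patch: `CacheKeySketch`, `cachePath`, and the path layout are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CacheKeySketch {
    /** Lower-case hex MD5 of the file content. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Same name, timestamp, and content always map to the same location. */
    static String cachePath(String baseDir, String name, long timestamp, byte[] content) {
        return baseDir + "/" + md5Hex(content) + "-" + timestamp + "/" + name;
    }
}
```

With a path like this, a second upload of an identical file resolves to an existing location and can be skipped. Cleanup (the question raised for A2) is not addressed by the sketch.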





[jira] [Updated] (HIVE-7324) CBO: provide a mechanism to test CBO features based on table stats only (w/o table data)

2014-07-01 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-7324:
---

Description: 
Since lot of the CBO work is focused on planning, it will be nice to be able to 
run explain query to test CBO features. TPCDS has a rich enough schema and 
query set. So the patch loads a dump TPCDS(Scale 1) stats.

1. TestCBO shows a way to load stats from a dump and run explain on a tpcds 
query. The output is currently dumped to Sys.out. This can be improved by 
hooking to QTestUtil, but hopefully this is a good start.

2. Uncovered couple of issues in the process of testing this:
a) PartitionPruner fails on 'true' constants. For e.g. you will get an error 
for 
{code:sql}
SELECT * 
FROM t WHERE
partCol < 100 AND true
{code}
This gets exposed because the predicates coming out of Optiq can contain 'true' 
predicates.
b) OpTraitsRulesProcFactory:checkBucketedTable checks that number of files = 
numBuckets. This fails because there are no dataFiles. So I have altered it to 
catch exceptions and assume bucketMapJoinConvertible = false if an exception is 
encountered here.
Uploading with these changes in this patch for now. Will carve them out as 
separate patches.

[~ashutoshc], [~hagleitn] can you please take a look. 



  was:
Since lot of the CBO work is focused on planning, it will be nice to be able to 
run explain query to test CBO features. TPCDS has a rich enough schema and 
query set. So the patch loads a dump TPCDS(Scale 1) stats.

1. TestCBO shows a way to load stats from a dump and run explain on a tpcds 
query. The output is currently dumped to Sys.out. This can be improved by 
hooking to QTestUtil, but hopefully this is a good start.

2. Uncovered couple of issues in the process of testing this:
a) PartitionPruner fails on 'true' constants. For e.g. you will get an error 
for 
{code}
select * from t where partCol < 100 and true
{code}
This gets exposed because the predicates coming out of Optiq can contain 'true' 
predicates.
b) OpTraitsRulesProcFactory:checkBucketedTable checks that number of files = 
numBuckets. This fails because there are no dataFiles. So I have altered it to 
catch exceptions and assume bucketMapJoinConvertible = false if an exception is 
encountered here.
Uploading with these changes in this patch for now. Will carve them out as 
separate patches.

[~ashutoshc], [~hagleitn] can you please take a look. 




> CBO: provide a mechanism to test CBO features based on table stats only (w/o 
> table data)
> 
>
> Key: HIVE-7324
> URL: https://issues.apache.org/jira/browse/HIVE-7324
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: HIVE-7324.1.patch
>
>
> Since lot of the CBO work is focused on planning, it will be nice to be able 
> to run explain query to test CBO features. TPCDS has a rich enough schema and 
> query set. So the patch loads a dump TPCDS(Scale 1) stats.
> 1. TestCBO shows a way to load stats from a dump and run explain on a tpcds 
> query. The output is currently dumped to Sys.out. This can be improved by 
> hooking to QTestUtil, but hopefully this is a good start.
> 2. Uncovered couple of issues in the process of testing this:
> a) PartitionPruner fails on 'true' constants. For e.g. you will get an error 
> for 
> {code:sql}
> SELECT * 
> FROM t WHERE
> partCol < 100 AND true
> {code}
> This gets exposed because the predicates coming out of Optiq can contain 
> 'true' predicates.
> b) OpTraitsRulesProcFactory:checkBucketedTable checks that number of files = 
> numBuckets. This fails because there are no dataFiles. So I have altered it 
> to catch exceptions and assume bucketMapJoinConvertible = false if an 
> exception is encountered here.
> Uploading with these changes in this patch for now. Will carve them out as 
> separate patches.
> [~ashutoshc], [~hagleitn] can you please take a look. 





[jira] [Commented] (HIVE-5275) HiveServer2 should respect hive.aux.jars.path property and add aux jars to distributed cache

2014-07-01 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048786#comment-14048786
 ] 

Hari Sekhon commented on HIVE-5275:
---

I've observed this, but on IBM BigInsights 2.1, which has many integration bugs, 
so I don't know whether that's just IBM having done something funny or whether 
this is a widespread problem.

> HiveServer2 should respect hive.aux.jars.path property and add aux jars to 
> distributed cache
> 
>
> Key: HIVE-5275
> URL: https://issues.apache.org/jira/browse/HIVE-5275
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Alex Favaro
>
> HiveServer2 currently ignores the hive.aux.jars.path property in 
> hive-site.xml. That means that the only way to use a custom SerDe is to add 
> it to AUX_CLASSPATH on the server and manually distribute the jar to the 
> cluster nodes. Hive CLI does this automatically when hive.aux.jars.path is 
> set. It would be nice if HiveServer2 did the same.





[jira] [Commented] (HIVE-7231) Improve ORC padding

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048713#comment-14048713
 ] 

Hive QA commented on HIVE-7231:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653314/HIVE-7231.6.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5671 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary.org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/645/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/645/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-645/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653314

> Improve ORC padding
> ---
>
> Key: HIVE-7231
> URL: https://issues.apache.org/jira/browse/HIVE-7231
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
>  Labels: orcfile
> Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, 
> HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch
>
>
> Current ORC padding is not optimal because of fixed stripe sizes within a 
> block. The padding overhead will be significant in some cases. Also, the 
> padding percentage relative to stripe size is not configurable.





Re: Review Request 23139: HIVE-7294 : sql std auth - authorize show grant statements

2014-07-01 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23139/
---

(Updated July 1, 2014, 9:16 a.m.)


Review request for hive.


Changes
---

HIVE-7294.2.patch - also authorizes 'show role grant' statements.


Bugs: HIVE-7294
https://issues.apache.org/jira/browse/HIVE-7294


Repository: hive-git


Description
---

See jira


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java d8d900b
  ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java e4f5aac
  ql/src/test/queries/clientnegative/authorization_insertoverwrite_nodel.q 90fe6e1
  ql/src/test/queries/clientnegative/authorization_priv_current_role_neg.q bbf3b66
  ql/src/test/queries/clientnegative/authorization_role_grant_otherrole.q PRE-CREATION
  ql/src/test/queries/clientnegative/authorization_role_grant_otheruser.q PRE-CREATION
  ql/src/test/queries/clientnegative/authorization_show_grant_otherrole.q PRE-CREATION
  ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_all.q PRE-CREATION
  ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_alltabs.q PRE-CREATION
  ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_wtab.q PRE-CREATION
  ql/src/test/queries/clientpositive/authorization_grant_public_role.q 8473178
  ql/src/test/queries/clientpositive/authorization_grant_table_priv.q 02d364e
  ql/src/test/queries/clientpositive/authorization_insert.q 5de6f50
  ql/src/test/queries/clientpositive/authorization_revoke_table_priv.q ccda3b5
  ql/src/test/queries/clientpositive/authorization_role_grant2.q fd6aa38
  ql/src/test/queries/clientpositive/authorization_show_grant.q PRE-CREATION
  ql/src/test/queries/clientpositive/authorization_view_sqlstd.q bd7bbfe
  ql/src/test/results/clientnegative/authorization_insertoverwrite_nodel.q.out de1d230
  ql/src/test/results/clientnegative/authorization_role_grant_otherrole.q.out PRE-CREATION
  ql/src/test/results/clientnegative/authorization_role_grant_otheruser.q.out PRE-CREATION
  ql/src/test/results/clientnegative/authorization_show_grant_otherrole.q.out PRE-CREATION
  ql/src/test/results/clientnegative/authorization_show_grant_otheruser_all.q.out PRE-CREATION
  ql/src/test/results/clientnegative/authorization_show_grant_otheruser_alltabs.q.out PRE-CREATION
  ql/src/test/results/clientnegative/authorization_show_grant_otheruser_wtab.q.out PRE-CREATION
  ql/src/test/results/clientpositive/authorization_grant_public_role.q.out a0a45f7
  ql/src/test/results/clientpositive/authorization_grant_table_priv.q.out 9a6ec17
  ql/src/test/results/clientpositive/authorization_insert.q.out f94d9a9
  ql/src/test/results/clientpositive/authorization_role_grant2.q.out 2e94af3
  ql/src/test/results/clientpositive/authorization_show_grant.q.out PRE-CREATION
  ql/src/test/results/clientpositive/authorization_view_sqlstd.q.out 50c0247

Diff: https://reviews.apache.org/r/23139/diff/


Testing
---

test cases included.


Thanks,

Thejas Nair



[jira] [Commented] (HIVE-7040) TCP KeepAlive for HiveServer2

2014-07-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048669#comment-14048669
 ] 

Nicolas Thiébaud commented on HIVE-7040:


Let me know which one to focus on; I'd like to see one or the other merged. I 
don't mind closing this one in favor of HIVE-6679.

> TCP KeepAlive for HiveServer2
> -
>
> Key: HIVE-7040
> URL: https://issues.apache.org/jira/browse/HIVE-7040
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Server Infrastructure
>Affects Versions: 0.13.1
>Reporter: Nicolas Thiébaud
> Attachments: HIVE-7040.3.patch, HIVE-7040.patch, HIVE-7040.patch.2
>
>
> Implement TCP KeepAlive for HiveServer2 to avoid half-open connections.
> A new setting is added. This works for ThriftBinaryCLIService with and 
> without SSL.
> {code}
> <property>
>   <name>hive.server2.tcp.keepalive</name>
>   <value>true</value>
>   <description>Whether to enable TCP keepalive for Hive Server 2</description>
> </property>
> {code}
> The default proposed value is true, in the same way this is the default for 
> the metastore, see HIVE-1410.
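
At the socket level, the setting described above comes down to the standard `java.net.Socket#setKeepAlive` call. The sketch below assumes a boolean read from `hive.server2.tcp.keepalive`; `KeepAliveSketch` and `applyKeepAlive` are hypothetical names, and how the patch hooks this into the Thrift server socket is not shown here.

```java
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

class KeepAliveSketch {
    /**
     * Enables or disables SO_KEEPALIVE on a client socket so the OS can
     * eventually detect a peer that disappeared without closing the connection.
     */
    static Socket applyKeepAlive(Socket socket, boolean keepAliveEnabled) {
        try {
            socket.setKeepAlive(keepAliveEnabled);
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return socket;
    }
}
```

The probe interval and retry count are operating-system settings, so enabling the option only makes detection of half-open connections possible; it does not set how quickly they are detected.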





[jira] [Updated] (HIVE-7294) sql std auth - authorize show grant statements

2014-07-01 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7294:


Attachment: HIVE-7294.2.patch

HIVE-7294.2.patch - also authorizes 'show role grant' statements.


> sql std auth - authorize show grant statements
> --
>
> Key: HIVE-7294
> URL: https://issues.apache.org/jira/browse/HIVE-7294
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, SQLStandardAuthorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7294.1.patch, HIVE-7294.2.patch
>
>
> A non-admin user should be allowed to run show grant commands only for 
> themselves or a role they belong to.





[jira] [Commented] (HIVE-7323) Date type stats in ORC sometimes go stale

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048651#comment-14048651
 ] 

Hive QA commented on HIVE-7323:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653308/HIVE-7323.1.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5671 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/644/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/644/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-644/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653308

> Date type stats in ORC sometimes go stale
> -
>
> Key: HIVE-7323
> URL: https://issues.apache.org/jira/browse/HIVE-7323
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7323.1.patch.txt
>
>
> I cannot make a proper test case, but sometimes the min/max value in date type 
> stats changes at runtime. Stats for other types contain immutable values, 
> but date type stats contain a DateWritable, whose inner value can be 
> changed at any time.
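
The aliasing problem described above, and one common remedy (copying the value instead of keeping the reference), can be sketched as follows. `DateStatsSketch` and `MutableDays` are hypothetical stand-ins for the ORC statistics class and DateWritable, and this is not necessarily how the attached patch fixes it.

```java
class DateStatsSketch {
    /** Hypothetical stand-in for a reusable, mutable writable such as DateWritable. */
    static final class MutableDays {
        int days;
        MutableDays(int days) { this.days = days; }
    }

    private Integer minDays;
    private Integer maxDays;

    /** Copies the primitive value so later mutation of the holder cannot change the stats. */
    void update(MutableDays value) {
        int days = value.days;
        if (minDays == null || days < minDays) {
            minDays = days;
        }
        if (maxDays == null || days > maxDays) {
            maxDays = days;
        }
    }

    Integer getMin() { return minDays; }
    Integer getMax() { return maxDays; }
}
```

Had `update` stored the `MutableDays` reference itself, a reader that reuses the same object for the next row would silently change the recorded minimum and maximum.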





[jira] [Commented] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-07-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048650#comment-14048650
 ] 

Lefty Leverenz commented on HIVE-6782:
--

*hive.localize.resource.wait.interval* & 
*hive.localize.resource.num.wait.attempts* are documented in the wiki here:

* [Configuration Properties -- Tez -- hive.localize.resource.wait.interval | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.localize.resource.wait.interval]
* [Configuration Properties -- Tez -- hive.localize.resource.num.wait.attempts 
| 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.localize.resource.num.wait.attempts]

I also added a comment to HIVE-6586 so they won't get lost in the shuffle when 
HIVE-6037 changes HiveConf.java.
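
For readers unfamiliar with this kind of parameter pair, here is a rough sketch of how a wait interval and an attempt count typically combine into a bounded polling loop. The class LocalizeWait, its method, and the values used are hypothetical illustrations, not Hive code; the precise semantics of the two properties are the ones given on the wiki pages linked above.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical bounded polling loop driven by a wait interval and an attempt count. */
final class LocalizeWait {

    /**
     * Polls isReady up to numWaitAttempts times, sleeping waitIntervalMs
     * after each unsuccessful poll, then checks one last time.
     */
    static boolean waitForResource(BooleanSupplier isReady, long waitIntervalMs, int numWaitAttempts) {
        for (int attempt = 0; attempt < numWaitAttempts; attempt++) {
            if (isReady.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(waitIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return isReady.getAsBoolean();
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Illustrative values only: 10 ms between polls, at most 5 attempts.
        boolean ready = waitForResource(() -> ++polls[0] >= 3, 10L, 5);
        System.out.println(ready + " after " + polls[0] + " polls"); // prints "true after 3 polls"
    }
}
```

With this shape, the worst-case wait is roughly the interval multiplied by the number of attempts, which is the trade-off to keep in mind when tuning either value.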

> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
> HIVE-6782.11.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, 
> HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, 
> HIVE-6782.9.patch
>
>
> HiveServer2 concurrency fails intermittently when running with Tez, throwing 
> an "org.apache.tez.dag.api.SessionNotRunning: Application not running" error.





[jira] [Updated] (HIVE-494) Select columns by index instead of name

2014-07-01 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-494:
---

Attachment: HIVE-494.3.patch.txt

Checked for ambiguous columns and added negative tests.

> Select columns by index instead of name
> ---
>
> Key: HIVE-494
> URL: https://issues.apache.org/jira/browse/HIVE-494
> Project: Hive
>  Issue Type: Wish
>  Components: Clients, Query Processor
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
>  Labels: SQL
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-494.D1641.1.patch, 
> HIVE-494.2.patch.txt, HIVE-494.3.patch.txt, HIVE-494.D12153.1.patch
>
>
> SELECT mytable[0], mytable[2] FROM some_table_name mytable;
> ...should return the first and third columns, respectively, from mytable 
> regardless of their column names.
> The need for "names" specifically is kind of silly when they just get 
> translated into numbers anyway.
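
The requested behavior amounts to a positional lookup against the table schema. The sketch below shows only that idea, using a hypothetical ColumnIndexResolver over a plain list of column names; the actual patch changes Hive's parser grammar and SemanticAnalyzer, which this does not attempt to reproduce.

```java
import java.util.List;

/** Hypothetical helper that maps a zero-based column index to a column name. */
final class ColumnIndexResolver {

    /** Returns the column name at the given index, or throws if the index is out of range. */
    static String resolve(List<String> schema, int index) {
        if (index < 0 || index >= schema.size()) {
            throw new IllegalArgumentException(
                "Column index " + index + " is out of range for " + schema.size() + " columns");
        }
        return schema.get(index);
    }

    public static void main(String[] args) {
        // Illustrative schema; the column names are made up.
        List<String> schema = List.of("id", "name", "created_at");
        System.out.println(resolve(schema, 0)); // prints "id"
        System.out.println(resolve(schema, 2)); // prints "created_at"
    }
}
```

An out-of-range index is rejected with an exception here; how the real feature reports such errors, or ambiguous references, is covered by the patch's negative tests rather than by this sketch.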





Re: Review Request 22926: Select columns by index instead of name

2014-07-01 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22926/
---

(Updated July 1, 2014, 8:43 a.m.)


Review request for hive.


Changes
---

Checked for ambiguous columns and added negative tests.


Bugs: HIVE-494
https://issues.apache.org/jira/browse/HIVE-494


Repository: hive-git


Description
---

SELECT mytable[0], mytable[2] FROM some_table_name mytable;

...should return the first and third columns, respectively, from mytable 
regardless of their column names.

The need for "names" specifically is kind of silly when they just get 
translated into numbers anyway.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnInfo.java feb8558 
  ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g f448b16 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 9c001c1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1d8d764 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java e7da289 
  ql/src/test/queries/clientnegative/select_by_column_index_negative0.q 
PRE-CREATION 
  ql/src/test/queries/clientnegative/select_by_column_index_negative1.q 
PRE-CREATION 
  ql/src/test/queries/clientnegative/select_by_column_index_negative2.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/select_by_column_index.q PRE-CREATION 
  ql/src/test/results/clientnegative/select_by_column_index_negative0.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/select_by_column_index_negative1.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/select_by_column_index_negative2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/select_by_column_index.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/22926/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-6586) Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)

2014-07-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048645#comment-14048645
 ] 

Lefty Leverenz commented on HIVE-6586:
--

HIVE-6782 added hive.localize.resource.wait.interval & 
hive.localize.resource.num.wait.attempts in 0.13.0. They aren't in patch 
HIVE-6037-0.13.0.

> Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)
> ---
>
> Key: HIVE-6586
> URL: https://issues.apache.org/jira/browse/HIVE-6586
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Lefty Leverenz
>  Labels: TODOC14
>
> HIVE-6037 puts the definitions of configuration parameters into the 
> HiveConf.java file, but several recent jiras for release 0.13.0 introduce new 
> parameters that aren't in HiveConf.java yet and some parameter definitions 
> need to be altered for 0.13.0.  This jira will patch HiveConf.java after 
> HIVE-6037 gets committed.
> Also, four typos patched in HIVE-6582 need to be fixed in the new 
> HiveConf.java.





[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords

2014-07-01 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-5976:
-

Attachment: HIVE-5976.4.patch

Rebasing on trunk.

> Decouple input formats from STORED as keywords
> --
>
> Key: HIVE-5976
> URL: https://issues.apache.org/jira/browse/HIVE-5976
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, 
> HIVE-5976.4.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, 
> HIVE-5976.patch
>
>
> As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
> It'd be nice if there was a registration system so we didn't need to do that.
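
A registration system of the kind described could look roughly like the sketch below. FormatDescriptor, SimpleFormatDescriptor, FormatRegistry, and the class-name strings are all hypothetical; they are not the StorageFormatDescriptor and StorageFormatFactory classes added by the patch. The patch's file list includes a META-INF/services entry, which suggests discovery through java.util.ServiceLoader, but this sketch registers descriptors by hand to stay self-contained.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical descriptor tying a STORED AS keyword to an input format class name. */
interface FormatDescriptor {
    String keyword();

    String inputFormatClass();
}

/** Hypothetical simple implementation of the descriptor. */
final class SimpleFormatDescriptor implements FormatDescriptor {
    private final String keyword;
    private final String inputFormatClass;

    SimpleFormatDescriptor(String keyword, String inputFormatClass) {
        this.keyword = keyword;
        this.inputFormatClass = inputFormatClass;
    }

    @Override
    public String keyword() { return keyword; }

    @Override
    public String inputFormatClass() { return inputFormatClass; }
}

/** Hypothetical registry that replaces a hard-coded keyword switch with a lookup table. */
final class FormatRegistry {
    private final Map<String, FormatDescriptor> byKeyword = new HashMap<>();

    /** Registers a descriptor under its case-insensitive keyword; duplicates are rejected. */
    void register(FormatDescriptor descriptor) {
        String key = descriptor.keyword().toUpperCase(Locale.ROOT);
        if (byKeyword.putIfAbsent(key, descriptor) != null) {
            throw new IllegalStateException("Duplicate storage format keyword: " + key);
        }
    }

    /** Returns the descriptor for the keyword, or null if none is registered. */
    FormatDescriptor lookup(String keyword) {
        return byKeyword.get(keyword.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        FormatRegistry registry = new FormatRegistry();
        // Placeholder class name, not a real Hive or Hadoop class.
        registry.register(new SimpleFormatDescriptor("textfile", "example.TextInput"));
        System.out.println(registry.lookup("TEXTFILE").inputFormatClass()); // prints "example.TextInput"
        System.out.println(registry.lookup("orc")); // prints "null"
    }
}
```

The point of the design is that adding a format means adding a descriptor, not editing the parser or a central constants file.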





Re: Review Request 23153: Fix some test output files.

2014-07-01 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23153/
---

(Updated July 1, 2014, 8:29 a.m.)


Review request for hive.


Bugs: HIVE-5976
https://issues.apache.org/jira/browse/HIVE-5976


Repository: hive-git


Description (updated)
---

Fix some test output files.


Use JavaUtils.getClassLoader.


Apply patch


Diffs (updated)
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java
 ec24531117203a5c75c62d0e5b54d5a43d37fa79 
  
itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextSerDe.java
 PRE-CREATION 
  
itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextStorageFormatDescriptor.java
 PRE-CREATION 
  
itests/custom-serde/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/AbstractStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 
41310661ced0616f6bee27af2b1195127e5230e8 
  ql/src/java/org/apache/hadoop/hive/ql/io/ORCFileStorageFormatDescriptor.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/ParquetFileStorageFormatDescriptor.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFileStorageFormatDescriptor.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/SequenceFileStorageFormatDescriptor.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatFactory.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/TextFileStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
60d54b6a04e1a9601342b0159387114f7b666338 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
640b6b319ce84a875cc78cb8b29fa6bbc1067fc5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 
412a046488eaea42a6416c7cbd514715d37e249f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 
f934ac4e3b736eed1b3060fa516124c67f9a2f87 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 
9c001c1495b423c19f3fa710c74f1bb1e24a08f4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 
0af25360ee6f3088c764f0c4d812f30d1eeb91d6 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
399f92a6b8006e52891d7f864393846276a6c2b3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java PRE-CREATION 
  
ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor
 PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/storage_format_descriptor.q PRE-CREATION 
  ql/src/test/results/clientnegative/fileformat_bad_class.q.out 
ab1e9357c0a7d4e21816290fbf7ed99396932b92 
  ql/src/test/results/clientnegative/genericFileFormat.q.out 
9613df95c8fc977c0ad1f717afa2db3870dfd904 
  ql/src/test/results/clientpositive/ctas.q.out 
0040f3c9df690c44a1bc2f258cb075dbaaa585f3 
  ql/src/test/results/clientpositive/storage_format_descriptor.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/tez/ctas.q.out 
a58e16639d725c851cedfc7bb81d65c25f3c56c3 

Diff: https://reviews.apache.org/r/23153/diff/


Testing
---


Thanks,

David Chen



[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords

2014-07-01 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048632#comment-14048632
 ] 

David Chen commented on HIVE-5976:
--

I have posted a new patch that should fix some of the tests. Some of the test 
failures appear to be caused by slightly different output that Hive now prints 
because of this patch. The changes should fix the following tests:

{{org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ctas}}
{{org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas}}

These tests failed because the expected output files contained the old parse 
tree dump.

{{org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fileformat_bad_class}}

Hive now outputs {{FAILED: SemanticException Cannot find class 
'ClassDoesNotExist'}} if the SerDe class is not found.

{{org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_genericFileFormat}}

Hive now prints the format name in all caps and in quotes.

I am still looking into the other test failures.

> Decouple input formats from STORED as keywords
> --
>
> Key: HIVE-5976
> URL: https://issues.apache.org/jira/browse/HIVE-5976
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, 
> HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch
>
>
> As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
> It'd be nice if there was a registration system so we didn't need to do that.





[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords

2014-07-01 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-5976:
-

Attachment: HIVE-5976.3.patch

> Decouple input formats from STORED as keywords
> --
>
> Key: HIVE-5976
> URL: https://issues.apache.org/jira/browse/HIVE-5976
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, 
> HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch
>
>
> As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
> It'd be nice if there was a registration system so we didn't need to do that.





Re: Review Request 23153: Fix some test output files.

2014-07-01 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23153/
---

(Updated July 1, 2014, 8:03 a.m.)


Review request for hive.


Bugs: HIVE-5976
https://issues.apache.org/jira/browse/HIVE-5976


Repository: hive-git


Description
---

Update expected output files based on patch changes.


Diffs
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java
 ec24531117203a5c75c62d0e5b54d5a43d37fa79 
  
itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextSerDe.java
 PRE-CREATION 
  
itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextStorageFormatDescriptor.java
 PRE-CREATION 
  
itests/custom-serde/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/AbstractStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 
41310661ced0616f6bee27af2b1195127e5230e8 
  ql/src/java/org/apache/hadoop/hive/ql/io/ORCFileStorageFormatDescriptor.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/ParquetFileStorageFormatDescriptor.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFileStorageFormatDescriptor.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/SequenceFileStorageFormatDescriptor.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatFactory.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/TextFileStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
60d54b6a04e1a9601342b0159387114f7b666338 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
640b6b319ce84a875cc78cb8b29fa6bbc1067fc5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 
412a046488eaea42a6416c7cbd514715d37e249f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 
f934ac4e3b736eed1b3060fa516124c67f9a2f87 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 
9c001c1495b423c19f3fa710c74f1bb1e24a08f4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 
0af25360ee6f3088c764f0c4d812f30d1eeb91d6 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
83d09c079f3ce035c4d905280a40611b41516356 
  ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java PRE-CREATION 
  
ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor
 PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/storage_format_descriptor.q PRE-CREATION 
  ql/src/test/results/clientnegative/fileformat_bad_class.q.out 
ab1e9357c0a7d4e21816290fbf7ed99396932b92 
  ql/src/test/results/clientnegative/genericFileFormat.q.out 
9613df95c8fc977c0ad1f717afa2db3870dfd904 
  ql/src/test/results/clientpositive/ctas.q.out 
0040f3c9df690c44a1bc2f258cb075dbaaa585f3 
  ql/src/test/results/clientpositive/storage_format_descriptor.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/tez/ctas.q.out 
a58e16639d725c851cedfc7bb81d65c25f3c56c3 

Diff: https://reviews.apache.org/r/23153/diff/


Testing
---


Thanks,

David Chen



Re: Review Request 23153: Fix some test output files.

2014-07-01 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23153/
---

(Updated July 1, 2014, 8:03 a.m.)


Review request for hive.


Summary (updated)
-

Fix some test output files.


Bugs: HIVE-5976
https://issues.apache.org/jira/browse/HIVE-5976


Repository: hive-git


Description (updated)
---

Update expected output files based on patch changes.


Diffs (updated)
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java
 ec24531117203a5c75c62d0e5b54d5a43d37fa79 
  
itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextSerDe.java
 PRE-CREATION 
  
itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextStorageFormatDescriptor.java
 PRE-CREATION 
  
itests/custom-serde/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/AbstractStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 
41310661ced0616f6bee27af2b1195127e5230e8 
  ql/src/java/org/apache/hadoop/hive/ql/io/ORCFileStorageFormatDescriptor.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/ParquetFileStorageFormatDescriptor.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFileStorageFormatDescriptor.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/SequenceFileStorageFormatDescriptor.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatFactory.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/TextFileStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
60d54b6a04e1a9601342b0159387114f7b666338 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
640b6b319ce84a875cc78cb8b29fa6bbc1067fc5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 
412a046488eaea42a6416c7cbd514715d37e249f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 
f934ac4e3b736eed1b3060fa516124c67f9a2f87 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 
9c001c1495b423c19f3fa710c74f1bb1e24a08f4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 
0af25360ee6f3088c764f0c4d812f30d1eeb91d6 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
83d09c079f3ce035c4d905280a40611b41516356 
  ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java PRE-CREATION 
  
ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor
 PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/storage_format_descriptor.q PRE-CREATION 
  ql/src/test/results/clientnegative/fileformat_bad_class.q.out 
ab1e9357c0a7d4e21816290fbf7ed99396932b92 
  ql/src/test/results/clientnegative/genericFileFormat.q.out 
9613df95c8fc977c0ad1f717afa2db3870dfd904 
  ql/src/test/results/clientpositive/ctas.q.out 
0040f3c9df690c44a1bc2f258cb075dbaaa585f3 
  ql/src/test/results/clientpositive/storage_format_descriptor.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/tez/ctas.q.out 
a58e16639d725c851cedfc7bb81d65c25f3c56c3 

Diff: https://reviews.apache.org/r/23153/diff/


Testing
---


Thanks,

David Chen



[jira] [Commented] (HIVE-860) Persistent distributed cache

2014-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048595#comment-14048595
 ] 

Hive QA commented on HIVE-860:
--



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653298/HIVE-860.patch

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 5610 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_func1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_test_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/643/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/643/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-643/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653298

> Persistent distributed cache
> 
>
> Key: HIVE-860
> URL: https://issues.apache.org/jira/browse/HIVE-860
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.12.0
>Reporter: Zheng Shao
>Assignee: Brock Noland
> Fix For: 0.14.0
>
> Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch
>
>
> DistributedCache is shared across multiple jobs if the HDFS file name is the 
> same.
> We need to make sure Hive puts the same file into the same location every 
> time and does not overwrite it if the file content is the same.
> We can achieve 2 different results:
> A1. Files added with the same name, timestamp, and md5 in the same session 
> will have a single copy in distributed cache.
> A2. Files added with the same name, timestamp, and md5 will have a single 
> copy in distributed cache.
> A2 has a bigger benefit in sharing but raises the question of when Hive 
> should clean it up in HDFS.
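
One way to get "same name, timestamp, and md5 means a single copy" is to derive the cache location from those three values, so identical files always map to the same path. The sketch below assumes a hypothetical CacheKey helper and an invented path layout; it is not taken from the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that derives a stable cache path from name, timestamp, and content hash. */
final class CacheKey {

    /** Returns the lowercase hex MD5 digest of the content. */
    static String md5Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a path so that identical (name, timestamp, content) triples always collide. */
    static String cachePath(String baseDir, String name, long timestamp, byte[] content) {
        return baseDir + "/" + md5Hex(content) + "_" + timestamp + "/" + name;
    }

    public static void main(String[] args) {
        byte[] jar = "example bytes".getBytes(StandardCharsets.UTF_8);
        // The base directory and file name are made-up examples.
        String first = cachePath("/tmp/hive-cache", "udf.jar", 1404200000000L, jar);
        String second = cachePath("/tmp/hive-cache", "udf.jar", 1404200000000L, jar);
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

Because the path is deterministic, a job can check whether the path already exists and skip the upload. The cleanup question raised for A2 remains open in this scheme, since nothing records which sessions still need a given path.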


