[jira] [Commented] (HIVE-10630) Renaming tables across encryption zones renames table even though the operation throws error

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543296#comment-14543296
 ] 

Hive QA commented on HIVE-10630:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732747/HIVE-10630.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8922 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3887/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3887/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3887/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732747 - PreCommit-HIVE-TRUNK-Build

> Renaming tables across encryption zones renames table even though the 
> operation throws error
> 
>
> Key: HIVE-10630
> URL: https://issues.apache.org/jira/browse/HIVE-10630
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Security
>Reporter: Deepesh Khandelwal
>Assignee: Eugene Koifman
> Attachments: HIVE-10630.patch, HIVE-10630.patch
>
>
> Create a table with data in an encrypted zone 1 and then rename it to 
> encrypted zone 2.
> {noformat}
> hive> alter table encdb1.testtbl rename to encdb2.testtbl;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. Unable to 
> access old location 
> hdfs://node-1.example.com:8020/apps/hive/warehouse/encdb1.db/testtbl for 
> table encdb1.testtbl
> {noformat}
> Even though the command errors out, the table is renamed. I think the right 
> behavior should be to not rename the table at all, including the metadata.
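The fix direction implied here can be sketched as ordering the operations so the metadata change happens only after the data move succeeds. A minimal, hypothetical model (not Hive's actual metastore code; the class and field names are invented):

```java
// Hedged sketch: rename metadata only after the data move succeeds, so a
// cross-encryption-zone failure leaves the table name and location in sync.
import java.util.HashMap;
import java.util.Map;

class RenameSketch {
    final Map<String, String> metastore = new HashMap<>(); // table -> location
    boolean dataMoveSucceeds = true; // toggle to simulate an HDFS failure

    void renameTable(String oldName, String newName, String newLoc) {
        String oldLoc = metastore.get(oldName);
        if (!moveData(oldLoc, newLoc)) {
            // Data move failed (e.g. across encryption zones): leave metadata untouched.
            throw new RuntimeException("Unable to access old location " + oldLoc);
        }
        metastore.remove(oldName);
        metastore.put(newName, newLoc);
    }

    boolean moveData(String from, String to) { return dataMoveSucceeds; }
}
```

With this ordering, a failed move raises the same error but the old table name survives.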



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10706) Make vectorized_timestamp_funcs test more stable

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543281#comment-14543281
 ] 

Alexander Pivovarov commented on HIVE-10706:


Depending on how big the number n is, I use different options:
- for E9 and smaller I use round(n, 3)
- for E13 - round(n, 0)
- for E19 - n between x and y
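Those three options can be sketched as plain Java helpers (illustrative only; in the actual .q tests this is expressed in SQL):

```java
// Sketch of the comment's options for stabilizing flaky double comparisons,
// chosen by the magnitude of n.
class DoubleCompare {
    // ~1e9 and smaller: compare after rounding to 3 decimal places.
    static double round3(double n) { return Math.round(n * 1000.0) / 1000.0; }

    // ~1e13: rounding to the nearest integer is enough.
    static double round0(double n) { return (double) Math.round(n); }

    // ~1e19: even the integer part is imprecise, so just range-check.
    static boolean between(double n, double lo, double hi) { return lo <= n && n <= hi; }
}
```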

> Make vectorized_timestamp_funcs test more stable
> 
>
> Key: HIVE-10706
> URL: https://issues.apache.org/jira/browse/HIVE-10706
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10706.1.patch
>
>
> TestCliDriver.testCliDriver_vectorized_timestamp_funcs failed recently
> The problem is a Double-to-Double comparison without a delta.
> {code}
> Running: diff -a 
> /home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/vectorized_timestamp_funcs.q.out
>  
> /home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out
> 729c729
> < 1123143.857003
> ---
> > 1123143.856998
> {code}
> I also noticed that the last query results differ between TestCliDriver and 
> TestSparkCliDriver tests (last line in vectorized_timestamp_funcs.q.out)
> {code}
> -- regular
> 2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
> -- spark
> 2.8798560435897438E13  8.970772952794214E19  8.970772952794214E19  9.206845925236167E19  9.471416447815086E9  9.471416447815086E9  9.471416447815086E9  9.595231068211004E9
> -- tez
> 2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10706) Make vectorized_timestamp_funcs test more stable

2015-05-13 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-10706:
---
Attachment: HIVE-10706.1.patch

patch #1

> Make vectorized_timestamp_funcs test more stable
> 
>
> Key: HIVE-10706
> URL: https://issues.apache.org/jira/browse/HIVE-10706
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10706.1.patch
>
>
> TestCliDriver.testCliDriver_vectorized_timestamp_funcs failed recently
> The problem is a Double-to-Double comparison without a delta.
> {code}
> Running: diff -a 
> /home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/vectorized_timestamp_funcs.q.out
>  
> /home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out
> 729c729
> < 1123143.857003
> ---
> > 1123143.856998
> {code}
> I also noticed that the last query results differ between TestCliDriver and 
> TestSparkCliDriver tests (last line in vectorized_timestamp_funcs.q.out)
> {code}
> -- regular
> 2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
> -- spark
> 2.8798560435897438E13  8.970772952794214E19  8.970772952794214E19  9.206845925236167E19  9.471416447815086E9  9.471416447815086E9  9.471416447815086E9  9.595231068211004E9
> -- tez
> 2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8211) Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute correctly because aggregation output is double

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543249#comment-14543249
 ] 

Alexander Pivovarov commented on HIVE-8211:
---

Correction:
{code}
select cast(cast('2015-05-14 23:20:34.123456789' as timestamp) as double);
OK
1.4316708341234567E9
{code}

> Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute 
> correctly because aggregation output is double
> --
>
> Key: HIVE-8211
> URL: https://issues.apache.org/jira/browse/HIVE-8211
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, Vectorization
>Reporter: Matt McCline
>Assignee: Matt McCline
>
> Vectorization of SUM(timestamp) is currently turned off because the output of 
> aggregation is a double (DoubleColumnVector) and the execution code is 
> expecting a long (LongColumnVector).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8211) Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute correctly because aggregation output is double

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543246#comment-14543246
 ] 

Alexander Pivovarov commented on HIVE-8211:
---

If a timestamp value is converted to a number, then the integer part is 
seconds and the fractional part is the sub-second value (milliseconds, 
microseconds, or nanoseconds), but only up to 7 fractional digits survive 
(not 9).
e.g.
{code}
select cast(cast('2015-05-14 23:20:34.123456789' as timestamp) as 
decimal(32,12));
OK
1431670834.1234567

select cast(cast('2015-05-14 23:20:34.123456789' as timestamp) as double());
OK
1.4316708341234567E9
{code}
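The 7-digit ceiling follows from double precision: a double carries roughly 15-17 significant decimal digits, and an epoch value near 1.43e9 spends 10 of them on the integer part. A small illustration (the helper below is a stand-in for Hive's timestamp-to-double cast, not its actual code path):

```java
// A double has ~15-17 significant decimal digits; ~1.43e9 seconds uses 10
// of them for the integer part, leaving roughly 6-7 for the fraction --
// which is why the 9-digit nanosecond fraction comes back with only ~7.
class TimestampPrecisionDemo {
    static double toEpochDouble(long seconds, long nanos) {
        return seconds + nanos / 1e9; // mirrors casting a timestamp to double
    }
}
```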

> Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute 
> correctly because aggregation output is double
> --
>
> Key: HIVE-8211
> URL: https://issues.apache.org/jira/browse/HIVE-8211
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, Vectorization
>Reporter: Matt McCline
>Assignee: Matt McCline
>
> Vectorization of SUM(timestamp) is currently turned off because the output of 
> aggregation is a double (DoubleColumnVector) and the execution code is 
> expecting a long (LongColumnVector).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10623:

Attachment: HIVE-10623.3.patch

Thanks [~apivovarov] for your review. Updated the patch to address your comments.

> Implement hive cli options using beeline functionality
> --
>
> Key: HIVE-10623
> URL: https://issues.apache.org/jira/browse/HIVE-10623
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, 
> HIVE-10623.3.patch, HIVE-10623.patch
>
>
> We need to support the original hive cli options for backwards 
> compatibility. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10623:

Attachment: (was: HIVE-10623.3.patch)

> Implement hive cli options using beeline functionality
> --
>
> Key: HIVE-10623
> URL: https://issues.apache.org/jira/browse/HIVE-10623
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, HIVE-10623.patch
>
>
> We need to support the original hive cli options for backwards 
> compatibility. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10101) LLAP: enable yourkit profiling of tasks

2015-05-13 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543232#comment-14543232
 ] 

Mostafa Mokhtar commented on HIVE-10101:


[~sershe]
This is an alternative to an API 
https://docs.oracle.com/javacomponents/jmc-5-4/jfr-runtime-guide/comline.htm#BABBGJCF


> LLAP: enable yourkit profiling of tasks
> ---
>
> Key: HIVE-10101
> URL: https://issues.apache.org/jira/browse/HIVE-10101
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10101.02.patch, HIVE-10101.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10662) LLAP: Wait queue pre-emption

2015-05-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10662.
--
Resolution: Pending Closed

Committed patch to llap branch.

> LLAP: Wait queue pre-emption
> 
>
> Key: HIVE-10662
> URL: https://issues.apache.org/jira/browse/HIVE-10662
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-10662.1.patch, HIVE-10662.2.patch
>
>
> If the wait queue is full, currently the task scheduler throws 
> RejectedExecutionException. Instead it should kick out the lowest priority 
> task and notify the AM as task killed.
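The behavior the description asks for can be sketched with a bounded min-heap: on a full queue, evict the lowest-priority waiting task instead of throwing. This is an illustrative model, not the llap branch code (the class and field names are invented):

```java
// Hedged sketch: a full wait queue pre-empts its lowest-priority task and
// records a "killed" notification, rather than throwing
// RejectedExecutionException.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class WaitQueueSketch {
    static class Task {
        final String id; final int priority;   // higher value = more important
        Task(String id, int priority) { this.id = id; this.priority = priority; }
    }
    final int capacity;
    // min-heap on priority, so peek() is the lowest-priority waiting task
    final PriorityQueue<Task> queue =
            new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));
    final List<String> killedAndNotified = new ArrayList<>(); // stand-in for telling the AM

    WaitQueueSketch(int capacity) { this.capacity = capacity; }

    void offer(Task t) {
        if (queue.size() == capacity) {
            Task lowest = queue.peek();
            if (lowest.priority < t.priority) {
                queue.poll();                       // pre-empt instead of rejecting
                killedAndNotified.add(lowest.id);
            } else {
                killedAndNotified.add(t.id);        // incoming task is the lowest
                return;
            }
        }
        queue.add(t);
    }
}
```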



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10671) yarn-cluster mode offers a degraded performance from yarn-client [Spark Branch]

2015-05-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10671:
--
Attachment: HIVE-10671.2-spark.patch

Addressed the RB comments.
I don't think the failures are related.

> yarn-cluster mode offers a degraded performance from yarn-client [Spark 
> Branch]
> ---
>
> Key: HIVE-10671
> URL: https://issues.apache.org/jira/browse/HIVE-10671
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10671.1-spark.patch, HIVE-10671.2-spark.patch
>
>
> With Hive on Spark, users noticed that in certain cases 
> spark.master=yarn-client offers 2x or 3x better performance than if 
> spark.master=yarn-cluster. However, yarn-cluster is what we recommend and 
> support. Thus, we should investigate and fix the problem. One such query is 
> TPC-H query 22.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10623:

Attachment: HIVE-10623.3.patch

The failed cases are caused by a permission issue. Updated the patch to 
specify the metastore path.
{noformat}
Caused by: MetaException(message:Unable to create database path 
file:/user/hive/warehouse/test.db, failed to create database test)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:857)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:903)
{noformat}

> Implement hive cli options using beeline functionality
> --
>
> Key: HIVE-10623
> URL: https://issues.apache.org/jira/browse/HIVE-10623
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, 
> HIVE-10623.3.patch, HIVE-10623.patch
>
>
> We need to support the original hive cli options for backwards 
> compatibility. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10662) LLAP: Wait queue pre-emption

2015-05-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10662:
-
Attachment: HIVE-10662.2.patch

> LLAP: Wait queue pre-emption
> 
>
> Key: HIVE-10662
> URL: https://issues.apache.org/jira/browse/HIVE-10662
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-10662.1.patch, HIVE-10662.2.patch
>
>
> If the wait queue is full, currently the task scheduler throws 
> RejectedExecutionException. Instead it should kick out the lowest priority 
> task and notify the AM as task killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10662) LLAP: Wait queue pre-emption

2015-05-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10662:
-
Attachment: (was: HIVE-10662.2.patch)

> LLAP: Wait queue pre-emption
> 
>
> Key: HIVE-10662
> URL: https://issues.apache.org/jira/browse/HIVE-10662
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-10662.1.patch
>
>
> If the wait queue is full, currently the task scheduler throws 
> RejectedExecutionException. Instead it should kick out the lowest priority 
> task and notify the AM as task killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10671) yarn-cluster mode offers a degraded performance from yarn-client [Spark Branch]

2015-05-13 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543222#comment-14543222
 ] 

Chengxiang Li commented on HIVE-10671:
--

LGTM, +1

> yarn-cluster mode offers a degraded performance from yarn-client [Spark 
> Branch]
> ---
>
> Key: HIVE-10671
> URL: https://issues.apache.org/jira/browse/HIVE-10671
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10671.1-spark.patch
>
>
> With Hive on Spark, users noticed that in certain cases 
> spark.master=yarn-client offers 2x or 3x better performance than if 
> spark.master=yarn-cluster. However, yarn-cluster is what we recommend and 
> support. Thus, we should investigate and fix the problem. One such query is 
> TPC-H query 22.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10636) CASE comparison operator rotation optimization

2015-05-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543221#comment-14543221
 ] 

Gopal V commented on HIVE-10636:


[~ashutoshc]: +1 on the case rotation.

But it misses one of the CASE WHEN folds, so HIVE-9644 wasn't 
comprehensive for the NULL and TRUE branch cases.

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date_sk when 
2 then true else null end);
...
Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: store_sales
  filterExpr: CASE (ss_sold_date_sk) WHEN (2) THEN (true) ELSE 
(null) END (type: boolean)
{code}

The (true) / (null) WHEN is not getting folded, which is the last case (and the 
very first).

It would be nice to get that one folded as well (along with the no-ELSE form, 
which defaults to NULL).

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date_sk when 
2 then true end);
OK
...
  Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: store_sales
  filterExpr: CASE (ss_sold_date_sk) WHEN (2) THEN (true) END 
(type: boolean)
{code}

Other than those two cases, everything seems to work fine.
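The requested fold is sound under WHERE-clause semantics, where both FALSE and NULL drop the row, so the CASE above keeps exactly the rows where ss_sold_date_sk = 2. A small three-valued-logic check, with Boolean-plus-null standing in for SQL's boolean (illustrative, not optimizer code):

```java
// As a WHERE filter, CASE x WHEN 2 THEN true ELSE null END keeps exactly
// the rows where x = 2, because both NULL and false drop the row.
class CaseFoldCheck {
    // Boolean with null models SQL's three-valued boolean.
    static Boolean caseExpr(Integer x) {
        if (x == null) return null;          // comparing NULL yields NULL
        return x == 2 ? Boolean.TRUE : null; // THEN true ELSE null
    }
    static Boolean folded(Integer x) {
        if (x == null) return null;
        return x == 2;                       // the folded predicate x = 2
    }
    // Filter semantics: only TRUE passes.
    static boolean passes(Boolean b) { return Boolean.TRUE.equals(b); }
}
```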

> CASE comparison operator rotation optimization
> --
>
> Key: HIVE-10636
> URL: https://issues.apache.org/jira/browse/HIVE-10636
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10636.1.patch, HIVE-10636.2.patch, 
> HIVE-10636.3.patch, HIVE-10636.patch
>
>
> Step 1 as outlined in description of HIVE-9644



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10704) Errors in Tez HashTableLoader when estimated table size is 0

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543210#comment-14543210
 ] 

Hive QA commented on HIVE-10704:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732741/HIVE-10704.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8920 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3886/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3886/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3886/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732741 - PreCommit-HIVE-TRUNK-Build

> Errors in Tez HashTableLoader when estimated table size is 0
> 
>
> Key: HIVE-10704
> URL: https://issues.apache.org/jira/browse/HIVE-10704
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-10704.1.patch
>
>
> A couple of issues:
> - If the table sizes in MapJoinOperator.getParentDataSizes() are 0 for all 
> tables, the largest-small-table selection is wrong and could select the large 
> table (which results in an NPE)
> - The memory estimates can either divide by zero, or allocate 0 memory if the 
> table size is 0. Try to come up with a sensible default for this.
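The "sensible default" can be sketched as a positive floor on estimated sizes, so the fraction math never divides by zero or hands a table 0 bytes. The constant and method names here are hypothetical, not the HashTableLoader API:

```java
// Hedged sketch of a zero-size guard: fall back to a small positive default
// so memory-share math stays well defined when an estimate is 0.
class HashTableMemorySketch {
    static final long DEFAULT_TABLE_SIZE = 1024L * 1024L; // assumed 1 MB floor

    static long effectiveSize(long estimatedSize) {
        return estimatedSize > 0 ? estimatedSize : DEFAULT_TABLE_SIZE;
    }

    // Share of the memory budget for one small table, by size proportion.
    static long memoryFor(long tableSize, long totalSize, long budget) {
        long total = effectiveSize(totalSize);
        return Math.max(1, budget * effectiveSize(tableSize) / total);
    }
}
```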



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10671) yarn-cluster mode offers a degraded performance from yarn-client [Spark Branch]

2015-05-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543212#comment-14543212
 ] 

Xuefu Zhang commented on HIVE-10671:


1. Not sure if the test failures are related.
2. [~chengxiang li], could you also give a review? thanks.

> yarn-cluster mode offers a degraded performance from yarn-client [Spark 
> Branch]
> ---
>
> Key: HIVE-10671
> URL: https://issues.apache.org/jira/browse/HIVE-10671
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10671.1-spark.patch
>
>
> With Hive on Spark, users noticed that in certain cases 
> spark.master=yarn-client offers 2x or 3x better performance than if 
> spark.master=yarn-cluster. However, yarn-cluster is what we recommend and 
> support. Thus, we should investigate and fix the problem. One such query is 
> TPC-H query 22.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10257) Ensure Parquet Hive has null optimization

2015-05-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10257:

Component/s: Tests

> Ensure Parquet Hive has null optimization
> -
>
> Key: HIVE-10257
> URL: https://issues.apache.org/jira/browse/HIVE-10257
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Dong Chen
>Assignee: Dong Chen
> Attachments: HIVE-10257-parquet.1.patch, HIVE-10257-parquet.patch
>
>
> In Parquet statistics, a boolean value {{hasNonNullValue}} is used for each 
> column chunk. Hive could use this value to skip a column, avoid null-checking 
> logic, and speed up vectorization as in HIVE-4478 (in the future; Parquet 
> vectorization is not complete yet).
> In this JIRA we could check whether this null optimization works, and make 
> changes if needed.
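The optimization under test can be sketched as a stats-gated read: when a chunk's statistics say it holds no non-null values, skip decoding and emit nulls. ChunkStats here is a stand-in, not the Parquet API:

```java
// Hedged sketch (hypothetical names): a column chunk whose statistics report
// no non-null values can be materialized as all-null without decoding.
class NullChunkSkipSketch {
    static class ChunkStats {
        final boolean hasNonNullValue; final int numValues;
        ChunkStats(boolean hasNonNullValue, int numValues) {
            this.hasNonNullValue = hasNonNullValue; this.numValues = numValues;
        }
    }

    // Returns the decoded values, or an all-null column without touching data pages.
    static Integer[] readChunk(ChunkStats stats, Integer[] dataPages) {
        if (!stats.hasNonNullValue) {
            return new Integer[stats.numValues]; // all nulls, no decode work
        }
        return dataPages;
    }
}
```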



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10706) Make vectorized_timestamp_funcs test more stable

2015-05-13 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-10706:
---
Description: 
TestCliDriver.testCliDriver_vectorized_timestamp_funcs failed recently
The problem is a Double-to-Double comparison without a delta.
{code}
Running: diff -a 
/home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/vectorized_timestamp_funcs.q.out
 
/home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out
729c729
< 1123143.857003
---
> 1123143.856998
{code}

I also noticed that the last query results differ between TestCliDriver and 
TestSparkCliDriver tests (last line in vectorized_timestamp_funcs.q.out)
{code}
-- regular
2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9

-- spark
2.8798560435897438E13  8.970772952794214E19  8.970772952794214E19  9.206845925236167E19  9.471416447815086E9  9.471416447815086E9  9.471416447815086E9  9.595231068211004E9

-- tez
2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
{code}

  was:
TestCliDriver.testCliDriver_vectorized_timestamp_funcs failed recently
The problem is Double to Double numbers comparison without delta.
{code}
Running: diff -a 
/home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/vectorized_timestamp_funcs.q.out
 
/home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out
729c729
< 1123143.857003
---
> 1123143.856998
{code}


> Make vectorized_timestamp_funcs test more stable
> 
>
> Key: HIVE-10706
> URL: https://issues.apache.org/jira/browse/HIVE-10706
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
>
> TestCliDriver.testCliDriver_vectorized_timestamp_funcs failed recently
> The problem is a Double-to-Double comparison without a delta.
> {code}
> Running: diff -a 
> /home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/vectorized_timestamp_funcs.q.out
>  
> /home/hiveptest/54.196.24.219-hiveptest-1/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out
> 729c729
> < 1123143.857003
> ---
> > 1123143.856998
> {code}
> I also noticed that the last query results differ between TestCliDriver and 
> TestSparkCliDriver tests (last line in vectorized_timestamp_funcs.q.out)
> {code}
> -- regular
> 2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
> -- spark
> 2.8798560435897438E13  8.970772952794214E19  8.970772952794214E19  9.206845925236167E19  9.471416447815086E9  9.471416447815086E9  9.471416447815086E9  9.595231068211004E9
> -- tez
> 2.8798560435897438E13  8.970772952794212E19  8.970772952794212E19  9.206845925236166E19  9.471416447815084E9  9.471416447815084E9  9.471416447815084E9  9.595231068211002E9
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10665) Continue to make udaf_percentile_approx_23.q test more stable

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543167#comment-14543167
 ] 

Alexander Pivovarov commented on HIVE-10665:


I opened HIVE-10706 to fix the failed vectorized_timestamp_funcs test

> Continue to make udaf_percentile_approx_23.q test more stable
> -
>
> Key: HIVE-10665
> URL: https://issues.apache.org/jira/browse/HIVE-10665
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10665.1.patch
>
>
> HIVE-10059 fixed line 628 in q.out
> A similar issue exists on line 567 and should be fixed as well.
> {code}
> Running: diff -a 
> /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_percentile_approx_23.q.out
>  
> /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out
> 567c567
> < 342.0
> ---
> > 341.5
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10662) LLAP: Wait queue pre-emption

2015-05-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10662:
-
Attachment: HIVE-10662.2.patch

> LLAP: Wait queue pre-emption
> 
>
> Key: HIVE-10662
> URL: https://issues.apache.org/jira/browse/HIVE-10662
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-10662.1.patch, HIVE-10662.2.patch
>
>
> If the wait queue is full, currently the task scheduler throws 
> RejectedExecutionException. Instead it should kick out the lowest priority 
> task and notify the AM as task killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10665) Continue to make udaf_percentile_approx_23.q test more stable

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543144#comment-14543144
 ] 

Hive QA commented on HIVE-10665:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732730/HIVE-10665.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8921 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_timestamp_funcs
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3885/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3885/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3885/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732730 - PreCommit-HIVE-TRUNK-Build

> Continue to make udaf_percentile_approx_23.q test more stable
> -
>
> Key: HIVE-10665
> URL: https://issues.apache.org/jira/browse/HIVE-10665
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10665.1.patch
>
>
> HIVE-10059 fixed line 628 in q.out
> A similar issue exists on line 567 and should be fixed as well.
> {code}
> Running: diff -a 
> /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_percentile_approx_23.q.out
>  
> /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out
> 567c567
> < 342.0
> ---
> > 341.5
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10685) Alter table concatenate operator will cause duplicate data

2015-05-13 Thread guoliming (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guoliming updated HIVE-10685:
-
Attachment: HIVE-10685.patch

> Alter table concatenate operator will cause duplicate data
> --
>
> Key: HIVE-10685
> URL: https://issues.apache.org/jira/browse/HIVE-10685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: guoliming
>Assignee: guoliming
> Fix For: 1.1.0
>
> Attachments: HIVE-10685.patch
>
>
> The "Orders" table has 15 rows and is stored as ORC.
> {noformat}
> hive> select count(*) from orders;
> OK
> 15
> Time taken: 37.692 seconds, Fetched: 1 row(s)
> {noformat}
> The table contains 14 files; each file is about 2.1 to 3.2 GB.
> After executing the command ALTER TABLE orders CONCATENATE;
> the table now has 1530115000 rows.
> My hive version is 1.1.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10685) Alter table concatenate operator will cause duplicate data

2015-05-13 Thread guoliming (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guoliming reassigned HIVE-10685:


Assignee: guoliming

> Alter table concatenate operator will cause duplicate data
> --
>
> Key: HIVE-10685
> URL: https://issues.apache.org/jira/browse/HIVE-10685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: guoliming
>Assignee: guoliming
>
> The "Orders" table has 15 rows and is stored as ORC.
> {noformat}
> hive> select count(*) from orders;
> OK
> 15
> Time taken: 37.692 seconds, Fetched: 1 row(s)
> {noformat}
> The table contains 14 files; each file is about 2.1 to 3.2 GB.
> After executing the command ALTER TABLE orders CONCATENATE;
> the table now has 1530115000 rows.
> My hive version is 1.1.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-2327) Optimize REGEX UDFs with constant parameter information

2015-05-13 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-2327:
--
Attachment: HIVE-2327.2.patch

patch #2:
- updated q.out files

> Optimize REGEX UDFs with constant parameter information
> ---
>
> Key: HIVE-2327
> URL: https://issues.apache.org/jira/browse/HIVE-2327
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Adam Kramer
>Assignee: Alexander Pivovarov
> Attachments: HIVE-2327.01.patch, HIVE-2327.2.patch
>
>
> There are a lot of UDFs which would show major performance differences if one 
> assumes that some of their arguments are constant.
> Consider, for example, any UDF that takes a regular expression as input: this 
> can be compiled once (fast) if it's a constant, or once per row (wicked slow) 
> if it's not a constant.
> Or, consider any UDF that reads from a file and/or takes a filename as input; 
> it would have to re-read the whole file if the filename changes.
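
The compile-once point can be sketched with plain java.util.regex (a hypothetical standalone class for illustration, not Hive's actual UDF code):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the optimization: when the pattern argument is known to be a
// constant, compile it once up front; a naive per-row evaluation would
// recompile the pattern on every call instead.
public class ConstantRegexSketch {
    private final Pattern compiled;  // compiled exactly once

    public ConstantRegexSketch(String constantPattern) {
        this.compiled = Pattern.compile(constantPattern);
    }

    // Per-row evaluation reuses the precompiled Pattern.
    public boolean evaluate(String value) {
        Matcher m = compiled.matcher(value);
        return m.find();  // RLIKE-style "contains a match" semantics
    }

    public static void main(String[] args) {
        ConstantRegexSketch rlike = new ConstantRegexSketch("ab+c");
        System.out.println(rlike.evaluate("xabbcy"));  // true
        System.out.println(rlike.evaluate("xacy"));    // false
    }
}
```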



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10641) create CRC32 UDF

2015-05-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543097#comment-14543097
 ] 

Jason Dere commented on HIVE-10641:
---

+1

> create CRC32 UDF
> 
>
> Key: HIVE-10641
> URL: https://issues.apache.org/jira/browse/HIVE-10641
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10641.1.patch, HIVE-10641.2.patch
>
>
> CRC32 computes a cyclic redundancy check value for a string or binary argument 
> and returns a bigint value. The result is NULL if the argument is NULL.
> MySQL has a similar function: 
> https://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_crc32
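
The described semantics can be sketched with the plain JDK class java.util.zip.CRC32 (a standalone sketch, not the patch's actual UDF code; the expected value for 'MySQL' is the example documented in the MySQL manual):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class Crc32Sketch {
    // CRC-32 of the UTF-8 bytes, returned as an unsigned 32-bit value
    // widened to long, which maps naturally onto Hive's bigint.
    static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        System.out.println(crc32("MySQL"));  // 3259397556, per the MySQL docs
    }
}
```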



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10630) Renaming tables across encryption zones renames table even though the operation throws error

2015-05-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543092#comment-14543092
 ] 

Eugene Koifman commented on HIVE-10630:
---

[~ashutoshc], could you review?
The fix isn't specific to TDE; it's a general fix to ALTER TABLE error handling.

> Renaming tables across encryption zones renames table even though the 
> operation throws error
> 
>
> Key: HIVE-10630
> URL: https://issues.apache.org/jira/browse/HIVE-10630
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Security
>Reporter: Deepesh Khandelwal
>Assignee: Eugene Koifman
> Attachments: HIVE-10630.patch, HIVE-10630.patch
>
>
> Create a table with data in an encrypted zone 1 and then rename it to 
> encrypted zone 2.
> {noformat}
> hive> alter table encdb1.testtbl rename to encdb2.testtbl;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. Unable to 
> access old location 
> hdfs://node-1.example.com:8020/apps/hive/warehouse/encdb1.db/testtbl for 
> table encdb1.testtbl
> {noformat}
> Even though the command errors out, the table is renamed. I think the right 
> behavior is to not rename the table at all, including the metadata.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10630) Renaming tables across encryption zones renames table even though the operation throws error

2015-05-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10630:
--
Attachment: HIVE-10630.patch

> Renaming tables across encryption zones renames table even though the 
> operation throws error
> 
>
> Key: HIVE-10630
> URL: https://issues.apache.org/jira/browse/HIVE-10630
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Security
>Reporter: Deepesh Khandelwal
>Assignee: Eugene Koifman
> Attachments: HIVE-10630.patch, HIVE-10630.patch
>
>
> Create a table with data in an encrypted zone 1 and then rename it to 
> encrypted zone 2.
> {noformat}
> hive> alter table encdb1.testtbl rename to encdb2.testtbl;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. Unable to 
> access old location 
> hdfs://node-1.example.com:8020/apps/hive/warehouse/encdb1.db/testtbl for 
> table encdb1.testtbl
> {noformat}
> Even though the command errors out, the table is renamed. I think the right 
> behavior is to not rename the table at all, including the metadata.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8583) HIVE-8341 Cleanup & Test for hive.script.operator.env.blacklist

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543086#comment-14543086
 ] 

Hive QA commented on HIVE-8583:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732719/HIVE-8583.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3884/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3884/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3884/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3884/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3138334 HIVE-10690: ArrayIndexOutOfBounds exception in 
MetaStoreDirectSql.aggrColStatsForPartitions() (Vaibhav Gumashta reviewed by 
Jason Dere)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 3138334 HIVE-10690: ArrayIndexOutOfBounds exception in 
MetaStoreDirectSql.aggrColStatsForPartitions() (Vaibhav Gumashta reviewed by 
Jason Dere)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732719 - PreCommit-HIVE-TRUNK-Build

> HIVE-8341 Cleanup & Test for hive.script.operator.env.blacklist
> ---
>
> Key: HIVE-8583
> URL: https://issues.apache.org/jira/browse/HIVE-8583
> Project: Hive
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Minor
> Attachments: HIVE-8583.1.patch, HIVE-8583.2.patch
>
>
> [~alangates] added the following in HIVE-8341:
> {code}
> String bl = 
> hconf.get(HiveConf.ConfVars.HIVESCRIPT_ENV_BLACKLIST.toString());
> if (bl != null && bl.length() > 0) {
>   String[] bls = bl.split(",");
>   for (String b : bls) {
> b.replaceAll(".", "_");
> blackListedConfEntries.add(b);
>   }
> }
> {code}
> The {{replaceAll}} call is confusing as its result is not used at all.
> This patch contains the following:
> * Minor style modification (missorted modifiers)
> * Adds reading of default value for HIVESCRIPT_ENV_BLACKLIST
> * Removes replaceAll
> * Lets blackListed take a Configuration job as parameter which allowed me to 
> add a test for this
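
Since Java strings are immutable, replaceAll returns a new string and leaves b untouched, so the quoted loop added the unmodified entries. A sketch of what capturing the result would look like (a hypothetical standalone class; note that a bare "." is a regex metacharacter that matches every character, so a literal dot needs "\\." — whether dots-to-underscores was the intent is an assumption here):

```java
import java.util.HashSet;
import java.util.Set;

public class BlacklistSketch {
    // Returns the comma-separated blacklist entries with dots mapped to
    // underscores. replaceAll returns a modified copy (String is immutable),
    // so its result must be captured; "\\." matches a literal dot, whereas
    // the original bare "." would have replaced every character.
    static Set<String> sanitize(String bl) {
        Set<String> blackListedConfEntries = new HashSet<>();
        if (bl != null && bl.length() > 0) {
            for (String b : bl.split(",")) {
                blackListedConfEntries.add(b.replaceAll("\\.", "_"));
            }
        }
        return blackListedConfEntries;
    }

    public static void main(String[] args) {
        System.out.println(sanitize("hive.script.a,other.var"));
        // contains hive_script_a and other_var
    }
}
```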



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10563) MiniTezCliDriver tests ordering issues

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543085#comment-14543085
 ] 

Hive QA commented on HIVE-10563:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732700/HIVE-10563.6.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8921 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3883/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3883/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3883/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732700 - PreCommit-HIVE-TRUNK-Build

> MiniTezCliDriver tests ordering issues
> --
>
> Key: HIVE-10563
> URL: https://issues.apache.org/jira/browse/HIVE-10563
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10563.1.patch, HIVE-10563.2.patch, 
> HIVE-10563.3.patch, HIVE-10563.4.patch, HIVE-10563.5.patch, HIVE-10563.6.patch
>
>
> There are a bunch of tests related to TestMiniTezCliDriver which give 
> ordering issues when run on CentOS/Windows/OS X



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2327) Optimize REGEX UDFs with constant parameter information

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543068#comment-14543068
 ] 

Alexander Pivovarov commented on HIVE-2327:
---

I found the reason for the diff in the q.out files and the place where isOperator is 
used.

1. GenericUDFBridge.isOperator is only used in the getDisplayString method.
This method adds parentheses around "(a regexp b)":
{code}
if (isOperator) {
  ...
  return "(" + children[0] + " " + udfName + " " + children[1] + ")";
}
{code}

I do not think we need parentheses in the getDisplayString output. This is why the new 
GenericUDFRegExp.getDisplayString() returns just:
{code}
@Override
public String getDisplayString(String[] children) {
  return children[0] + " regexp " + children[1];
}
{code}

2. The reason rlike is replaced with regexp in the query plan is that 
GenericUDFRegExp.getFuncName returns "regexp" (it is the primary name for 
the function):
{code}
  @Override
  protected String getFuncName() {
return "regexp";
  }
{code}

I'll update the q.out files soon.

> Optimize REGEX UDFs with constant parameter information
> ---
>
> Key: HIVE-2327
> URL: https://issues.apache.org/jira/browse/HIVE-2327
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Adam Kramer
>Assignee: Alexander Pivovarov
> Attachments: HIVE-2327.01.patch
>
>
> There are a lot of UDFs which would show major performance differences if one 
> assumes that some of their arguments are constant.
> Consider, for example, any UDF that takes a regular expression as input: this 
> can be compiled once (fast) if it's a constant, or once per row (wicked slow) 
> if it's not a constant.
> Or, consider any UDF that reads from a file and/or takes a filename as input; 
> it would have to re-read the whole file if the filename changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10639) create SHA1 UDF

2015-05-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543063#comment-14543063
 ] 

Jason Dere commented on HIVE-10639:
---

+1

> create SHA1 UDF
> ---
>
> Key: HIVE-10639
> URL: https://issues.apache.org/jira/browse/HIVE-10639
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10639.1.patch, HIVE-10639.2.patch, 
> HIVE-10639.3.patch
>
>
> Calculates the SHA-1 160-bit checksum for a string or binary argument, as described 
> in RFC 3174 (Secure Hash Algorithm). The value is returned as a string of 40 
> hex digits, or NULL if the argument was NULL.
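
The described behavior matches what the JDK's MessageDigest produces; a standalone sketch (not the patch's actual UDF code) using the RFC 3174 test vector "abc":

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Sketch {
    // SHA-1 digest of the UTF-8 bytes, rendered as 40 lowercase hex digits.
    static String sha1Hex(String s) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(40);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // RFC 3174 test vector: SHA-1("abc")
        System.out.println(sha1Hex("abc"));
        // a9993e364706816aba3e25717850c26c9cd0d89d
    }
}
```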



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10657) Remove copyBytes operation from MD5 UDF

2015-05-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543060#comment-14543060
 ] 

Jason Dere commented on HIVE-10657:
---

+1

> Remove copyBytes operation from MD5 UDF
> ---
>
> Key: HIVE-10657
> URL: https://issues.apache.org/jira/browse/HIVE-10657
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10657.1.patch, HIVE-10657.2.patch
>
>
> The current MD5 UDF implementation uses the Apache Commons DigestUtils.md5Hex method 
> to get the md5 hex string.
> DigestUtils does not provide an md5Hex method with the signature (byte[], start, 
> length). This is why a copyBytes method was added to UDFMd5 to get a byte[] from 
> the BytesWritable.
> To avoid copying bytes from the BytesWritable into a new byte array, we can use the 
> Java MessageDigest API directly:
> MessageDigest has the method update(byte[], start, length).
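
The zero-copy idea can be sketched with plain JDK APIs (a minimal sketch, not the patch itself; the BytesWritable backing array, which getBytes() may return larger than getLength(), is simulated here with a padded byte[]):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5SliceSketch {
    // MD5 of backing[start, start+len) without copying into a new array,
    // mirroring how BytesWritable exposes getBytes() plus getLength().
    static String md5Hex(byte[] backing, int start, int len)
            throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(backing, start, len);  // digest only the valid slice, no copy
        StringBuilder hex = new StringBuilder(32);
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // The backing array is larger than the valid length, like a reused
        // BytesWritable buffer with trailing garbage.
        byte[] backing = "abcPADDING".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(backing, 0, 3));
        // MD5("abc") = 900150983cd24fb0d6963f7d28e17f72
    }
}
```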



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10704) Errors in Tez HashTableLoader when estimated table size is 0

2015-05-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10704:
--
Attachment: HIVE-10704.1.patch

> Errors in Tez HashTableLoader when estimated table size is 0
> 
>
> Key: HIVE-10704
> URL: https://issues.apache.org/jira/browse/HIVE-10704
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-10704.1.patch
>
>
> A couple of issues:
> - If the table sizes in MapJoinOperator.getParentDataSizes() are 0 for all 
> tables, the largest-small-table selection is wrong and could select the large 
> table (which results in an NPE).
> - The memory estimates can either divide by zero or allocate 0 memory if the 
> table size is 0. Try to come up with a sensible default for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2327) Optimize REGEX UDFs with constant parameter information

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543059#comment-14543059
 ] 

Alexander Pivovarov commented on HIVE-2327:
---

I ran the vectorization_short_regress.q test locally and noticed a couple of changes 
in the q.out file:
- rlike is replaced with regexp (they are synonyms)
- parentheses around "(a regexp b)" are removed in the explain query output.

I also noticed that the old UDFRegExp registration set isOperator=true:
{code}
system.registerUDF("rlike", UDFRegExp.class, true);
system.registerUDF("regexp", UDFRegExp.class, true);
{code}
But the new implementation extends GenericUDF, and generic UDF registration does not 
take an isOperator parameter. Can that cause any issues?
{code}
system.registerGenericUDF("rlike", GenericUDFRegExp.class);
system.registerGenericUDF("regexp", GenericUDFRegExp.class);
{code}

> Optimize REGEX UDFs with constant parameter information
> ---
>
> Key: HIVE-2327
> URL: https://issues.apache.org/jira/browse/HIVE-2327
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Adam Kramer
>Assignee: Alexander Pivovarov
> Attachments: HIVE-2327.01.patch
>
>
> There are a lot of UDFs which would show major performance differences if one 
> assumes that some of their arguments are constant.
> Consider, for example, any UDF that takes a regular expression as input: this 
> can be compiled once (fast) if it's a constant, or once per row (wicked slow) 
> if it's not a constant.
> Or, consider any UDF that reads from a file and/or takes a filename as input; 
> it would have to re-read the whole file if the filename changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10550) Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543048#comment-14543048
 ] 

Xuefu Zhang commented on HIVE-10550:


[~chengxiang li], when the patch is ready to review, please create a RB for it. 
Thanks.

> Dynamic RDD caching optimization for HoS.[Spark Branch]
> ---
>
> Key: HIVE-10550
> URL: https://issues.apache.org/jira/browse/HIVE-10550
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Attachments: HIVE-10550.1-spark.patch, HIVE-10550.1.patch
>
>
> A Hive query may scan the same table multiple times, e.g. in a self-join or 
> self-union, or several operators may share the same subquery; [TPC-DS 
> Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql]
>  is an example. Spark supports caching RDD data, meaning Spark can keep a 
> computed RDD in memory and read it from memory directly the next time, which 
> avoids the calculation cost of that RDD (and all the cost of its dependencies) 
> at the price of higher memory usage. By analyzing the query context, we should 
> be able to understand which parts of the query can be shared, so that we can 
> reuse the cached RDD in the generated Spark job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10671) yarn-cluster mode offers a degraded performance from yarn-client [Spark Branch]

2015-05-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543046#comment-14543046
 ] 

Xuefu Zhang commented on HIVE-10671:


[~lirui], could you please create a RB entry for better review?

> yarn-cluster mode offers a degraded performance from yarn-client [Spark 
> Branch]
> ---
>
> Key: HIVE-10671
> URL: https://issues.apache.org/jira/browse/HIVE-10671
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10671.1-spark.patch
>
>
> With Hive on Spark, users noticed that in certain cases 
> spark.master=yarn-client offers 2x or 3x better performance than 
> spark.master=yarn-cluster. However, yarn-cluster is what we recommend and 
> support, so we should investigate and fix the problem. One such query is 
> TPC-H Q22.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10696) TestAddResource tests are non-portable

2015-05-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543045#comment-14543045
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10696:
--

[~apivovarov]  Thank you for looking at the change. There are a couple of 
reasons why I didn't use new Path(path).toUri() for both Windows and Linux:
1. It is a slightly more expensive call (it involves some additional 
parsing) that can be avoided on non-Windows systems, since Linux and HDFS 
always use / as the path delimiter.
2. A user could accidentally pass an escape sequence in the 
jar path on Unix and, instead of getting an exception, receive a valid URI 
object that might get exposed elsewhere.

Apart from the minor reasons stated above, using new Path().toUri() is fine on 
both Linux and Windows.

Thanks
Hari
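
Point 2 above can be illustrated with plain java.net.URI (a standalone sketch; whether this mirrors the patched test's exact code paths is an assumption): a backslash is not a legal URI character, so an accidentally escaped path fails fast instead of silently producing a URI object.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriEscapeSketch {
    // Returns true if java.net.URI accepts the string as-is.
    static boolean parses(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;  // e.g. a backslash is not a legal URI character
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("/tmp/jars/udf.jar"));     // true
        System.out.println(parses("/tmp/my\\jars/udf.jar")); // false: backslash
    }
}
```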

> TestAddResource tests are non-portable
> --
>
> Key: HIVE-10696
> URL: https://issues.apache.org/jira/browse/HIVE-10696
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10696.1.patch
>
>
> We need to make sure these tests work in windows as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10687) AvroDeserializer fails to deserialize evolved union fields

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543016#comment-14543016
 ] 

Alexander Pivovarov commented on HIVE-10687:


It would be easier to review if you add an RB link to the JIRA.

> AvroDeserializer fails to deserialize evolved union fields
> --
>
> Key: HIVE-10687
> URL: https://issues.apache.org/jira/browse/HIVE-10687
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-10687.1.patch
>
>
> Consider the union field:
> {noformat}
> union {int, string}
> {noformat}
> and now this field evolves to
> {noformat}
> union {null, int, string}.
> {noformat}
> Running them through the Avro schema compatibility check [1], they are actually 
> compatible, which means that the latter could be used to deserialize data 
> written with the former. However, the Avro deserializer fails to do that, mainly 
> because of the way it reads the tags from the reader schema and then reads the 
> corresponding data from the writer schema. [2]
> [1] http://pastebin.cerner.corp/31078
> [2] 
> https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10696) TestAddResource tests are non-portable

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543014#comment-14543014
 ] 

Alexander Pivovarov commented on HIVE-10696:


Can we use new Path(path).toUri() for both Win and Unix cases?

What is the benefit of using new URI(path) for Unix?

It would be easier to review if you add an RB link to the JIRA.

> TestAddResource tests are non-portable
> --
>
> Key: HIVE-10696
> URL: https://issues.apache.org/jira/browse/HIVE-10696
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10696.1.patch
>
>
> We need to make sure these tests work in windows as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10684) Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-13 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543007#comment-14543007
 ] 

Ferdinand Xu commented on HIVE-10684:
-

The failed case is irrelevant.

> Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary 
> jar files
> --
>
> Key: HIVE-10684
> URL: https://issues.apache.org/jira/browse/HIVE-10684
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10684.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542996#comment-14542996
 ] 

Hive QA commented on HIVE-8769:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732699/HIVE-8769.02.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8921 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3882/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3882/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3882/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732699 - PreCommit-HIVE-TRUNK-Build

> Physical optimizer : Incorrect CE results in a shuffle join instead of a Map 
> join (PK/FK pattern not detected)
> --
>
> Key: HIVE-8769
> URL: https://issues.apache.org/jira/browse/HIVE-8769
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch
>
>
> TPC-DS Q82 is running slower than Hive 13 because the join type is not 
> correct.
> The estimate for item x inventory x date_dim is 227 million rows while the 
> actual is 3K rows.
> Hive 13 finishes in 753 seconds.
> Hive 14 finishes in 1,267 seconds.
> Hive 14 + force map join finishes in 431 seconds.
> Query
> {code}
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> Plan 
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Map 7 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE)
> Reducer 4 <- Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
> Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
> Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
>   DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: item
>   filterExpr: ((i_current_price BETWEEN 30 AND 60 and 
> (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
> boolean)
>   Statistics: Num rows: 462000 Data size: 663862160 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((i_current_price BETWEEN 30 AND 60 and 
> (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
> boolean)
> Statistics: Num rows: 115500 Data size: 34185680 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
>   expressions: i_item_sk (type: int), i_item_id (type: 
> string), i_item_desc (type: string), i_current_price (type: float)
>   outputColumnNames: _col0, _col1, _col2, _col3
>   Statistics: Num rows: 115500 Data size: 33724832 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 115500 Data size: 33724832 
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col1 (type: string), _col2 (type: 
> string), _col3 (type: float)
> Execution mode: vectorized
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
> and 

[jira] [Updated] (HIVE-10665) Continue to make udaf_percentile_approx_23.q test more stable

2015-05-13 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-10665:
---
Attachment: HIVE-10665.1.patch

patch #1

> Continue to make udaf_percentile_approx_23.q test more stable
> -
>
> Key: HIVE-10665
> URL: https://issues.apache.org/jira/browse/HIVE-10665
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10665.1.patch
>
>
> HIVE-10059 fixed line 628 in q.out.
> A similar issue exists on line 567 and should be fixed as well.
> {code}
> Running: diff -a 
> /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_percentile_approx_23.q.out
>  
> /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out
> 567c567
> < 342.0
> ---
> > 341.5
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10574) Metastore to handle expired tokens inline

2015-05-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-10574:
---
Assignee: (was: Chaoyu Tang)

> Metastore to handle expired tokens inline
> -
>
> Key: HIVE-10574
> URL: https://issues.apache.org/jira/browse/HIVE-10574
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Xuefu Zhang
>
> This is a followup for HIVE-9625.
> Metastore has a garbage-collection thread that removes expired tokens. 
> However, that still leaves a window (1 hour by default) in which clients could 
> retrieve a token that has expired or is about to expire. An option is for the 
> metastore to handle expired tokens inline. 
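The inline handling the description proposes could look roughly like the sketch below. This is purely illustrative — the real metastore token store uses different types and method names — but it shows the idea: check expiry at retrieval time instead of relying only on the background GC thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TokenStoreDemo {
    // Hypothetical token record: token string -> expiry time in millis.
    static final Map<String, Long> expiryByToken = new ConcurrentHashMap<>();

    static void addToken(String token, long expiresAtMillis) {
        expiryByToken.put(token, expiresAtMillis);
    }

    // Inline check: an expired token is removed and treated as missing,
    // instead of waiting (up to the GC interval) for a reaper thread.
    static boolean getToken(String token, long nowMillis) {
        Long exp = expiryByToken.get(token);
        if (exp == null) {
            return false;
        }
        if (exp <= nowMillis) {
            expiryByToken.remove(token); // reap inline
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        addToken("t1", 1000L);
        System.out.println(getToken("t1", 500L));   // still valid
        System.out.println(getToken("t1", 2000L));  // expired, removed inline
    }
}
```

This closes the window described above at the cost of a timestamp comparison on every retrieval; the GC thread can remain as a backstop for tokens that are never fetched again.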



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8583) HIVE-8341 Cleanup & Test for hive.script.operator.env.blacklist

2015-05-13 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542973#comment-14542973
 ] 

Lars Francke commented on HIVE-8583:


I dropped the ball on this one, but I've rebased and uploaded a new patch.

> HIVE-8341 Cleanup & Test for hive.script.operator.env.blacklist
> ---
>
> Key: HIVE-8583
> URL: https://issues.apache.org/jira/browse/HIVE-8583
> Project: Hive
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Minor
> Attachments: HIVE-8583.1.patch, HIVE-8583.2.patch
>
>
> [~alangates] added the following in HIVE-8341:
> {code}
> String bl = 
> hconf.get(HiveConf.ConfVars.HIVESCRIPT_ENV_BLACKLIST.toString());
> if (bl != null && bl.length() > 0) {
>   String[] bls = bl.split(",");
>   for (String b : bls) {
> b.replaceAll(".", "_");
> blackListedConfEntries.add(b);
>   }
> }
> {code}
> The {{replaceAll}} call is confusing as its result is not used at all.
> This patch contains the following:
> * Minor style modification (missorted modifiers)
> * Adds reading of default value for HIVESCRIPT_ENV_BLACKLIST
> * Removes replaceAll
> * Lets blackListed take a Configuration job as a parameter, which allowed me to 
> add a test for this
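For illustration, a minimal standalone sketch (not the actual patch) of what a corrected version of the quoted loop might look like. Note the two problems with `b.replaceAll(".", "_")`: `replaceAll` takes a regex, so an unescaped `"."` matches every character, and Java strings are immutable, so the return value must actually be used. `String.replace(char, char)` sidesteps both. The helper name here is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class BlacklistDemo {
    // Hypothetical helper mirroring the loop from HIVE-8341.
    static Set<String> parseBlacklist(String bl) {
        Set<String> blackListedConfEntries = new HashSet<>();
        if (bl != null && bl.length() > 0) {
            for (String b : bl.split(",")) {
                // replace('.', '_') avoids the regex pitfall of
                // replaceAll(".", "_"), and its result is kept this time.
                blackListedConfEntries.add(b.replace('.', '_'));
            }
        }
        return blackListedConfEntries;
    }

    public static void main(String[] args) {
        System.out.println(parseBlacklist("hive.exec.parallel,foo.bar"));
    }
}
```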



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8583) HIVE-8341 Cleanup & Test for hive.script.operator.env.blacklist

2015-05-13 Thread Lars Francke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Francke updated HIVE-8583:
---
Attachment: HIVE-8583.2.patch

> HIVE-8341 Cleanup & Test for hive.script.operator.env.blacklist
> ---
>
> Key: HIVE-8583
> URL: https://issues.apache.org/jira/browse/HIVE-8583
> Project: Hive
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Minor
> Attachments: HIVE-8583.1.patch, HIVE-8583.2.patch
>
>
> [~alangates] added the following in HIVE-8341:
> {code}
> String bl = 
> hconf.get(HiveConf.ConfVars.HIVESCRIPT_ENV_BLACKLIST.toString());
> if (bl != null && bl.length() > 0) {
>   String[] bls = bl.split(",");
>   for (String b : bls) {
> b.replaceAll(".", "_");
> blackListedConfEntries.add(b);
>   }
> }
> {code}
> The {{replaceAll}} call is confusing as its result is not used at all.
> This patch contains the following:
> * Minor style modification (missorted modifiers)
> * Adds reading of default value for HIVESCRIPT_ENV_BLACKLIST
> * Removes replaceAll
> * Lets blackListed take a Configuration job as a parameter, which allowed me to 
> add a test for this



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10678) update sql standard authorization configuration whitelist for 1.0.x , 1.1.x

2015-05-13 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10678:
-
Summary: update sql standard authorization configuration whitelist for 
1.0.x , 1.1.x  (was: update sql standard authorization configuration whitelist 
- more optimization flags)

> update sql standard authorization configuration whitelist for 1.0.x , 1.1.x
> ---
>
> Key: HIVE-10678
> URL: https://issues.apache.org/jira/browse/HIVE-10678
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-10678.1.patch
>
>
> hive.exec.parallel and hive.groupby.orderby.position.alias are optimization 
> config parameters that should be settable when sql standard authorization is 
> enabled.
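For operators who need these flags before a patched release lands, recent Hive versions let the whitelist be extended via configuration. The property name below is the one used by SQL standard authorization in later releases and is an assumption for the 1.0.x/1.1.x branches, so verify it exists in your release before relying on it; its value is a `|`-separated regex, hence the escaped dots.

```xml
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>hive\.exec\.parallel|hive\.groupby\.orderby\.position\.alias</value>
</property>
```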



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10678) update sql standard authorization configuration whitelist - more optimization flags

2015-05-13 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10678:
-
Summary: update sql standard authorization configuration whitelist - more 
optimization flags  (was: update sql standard authorization configuration 
whitelist for 1.0.x , 1.1.x)

> update sql standard authorization configuration whitelist - more optimization 
> flags
> ---
>
> Key: HIVE-10678
> URL: https://issues.apache.org/jira/browse/HIVE-10678
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-10678.1.patch
>
>
> hive.exec.parallel and hive.groupby.orderby.position.alias are optimization 
> config parameters that should be settable when sql standard authorization is 
> enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10678) update sql standard authorization configuration whitelist - more optimization flags

2015-05-13 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10678:
-
Attachment: HIVE-10678.1.patch

> update sql standard authorization configuration whitelist - more optimization 
> flags
> ---
>
> Key: HIVE-10678
> URL: https://issues.apache.org/jira/browse/HIVE-10678
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-10678.1.patch
>
>
> hive.exec.parallel and hive.groupby.orderby.position.alias are optimization 
> config parameters that should be settable when sql standard authorization is 
> enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10647) Hive on LLAP: Limit HS2 from overwhelming LLAP

2015-05-13 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542963#comment-14542963
 ] 

Thejas M Nair commented on HIVE-10647:
--

Clusters with more than one HS2 instance are going to be common with the new 
rolling-upgrade and high-availability features in HS2.
This will not have the desired effect in such cases.
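The kind of per-instance limit under discussion can be sketched as a simple admission semaphore. This is illustrative only, not the HIVE-10647 patch, but it makes the caveat above concrete: the cap is per HS2 process, so N independent instances multiply the effective load on LLAP.

```java
import java.util.concurrent.Semaphore;

public class AdmissionDemo {
    // Per-process cap; with N independent HS2 instances the effective
    // cluster-wide cap becomes N * MAX_CONCURRENT, which is the concern above.
    static final int MAX_CONCURRENT = 2;
    static final Semaphore slots = new Semaphore(MAX_CONCURRENT);

    // Non-blocking admission: reject (or queue) the query when full.
    static boolean tryAdmit() {
        return slots.tryAcquire();
    }

    // Called when the admitted query finishes.
    static void release() {
        slots.release();
    }

    public static void main(String[] args) {
        System.out.println(tryAdmit()); // true
        System.out.println(tryAdmit()); // true
        System.out.println(tryAdmit()); // false, cap reached
        release();
        System.out.println(tryAdmit()); // true again
    }
}
```

A cluster-wide limit would instead need shared state (e.g. coordination through ZooKeeper or LLAP-side admission control) rather than a process-local counter.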


> Hive on LLAP: Limit HS2 from overwhelming LLAP
> --
>
> Key: HIVE-10647
> URL: https://issues.apache.org/jira/browse/HIVE-10647
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10647.1.patch
>
>
> We want to restrict the number of queries that flow through LLAP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager

2015-05-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10233:
--
Attachment: (was: HIVE-10233.1.patch)

> Hive on LLAP: Memory manager
> 
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager

2015-05-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10233:
--
Attachment: HIVE-10233-WIP-6.patch

> Hive on LLAP: Memory manager
> 
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10563) MiniTezCliDriver tests ordering issues

2015-05-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10563:
-
Attachment: HIVE-10563.6.patch

> MiniTezCliDriver tests ordering issues
> --
>
> Key: HIVE-10563
> URL: https://issues.apache.org/jira/browse/HIVE-10563
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10563.1.patch, HIVE-10563.2.patch, 
> HIVE-10563.3.patch, HIVE-10563.4.patch, HIVE-10563.5.patch, HIVE-10563.6.patch
>
>
> There are a bunch of tests related to TestMiniTezCliDriver that exhibit 
> ordering issues when run on CentOS/Windows/OSX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)

2015-05-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8769:
--
Attachment: HIVE-8769.02.patch

> Physical optimizer : Incorrect CE results in a shuffle join instead of a Map 
> join (PK/FK pattern not detected)
> --
>
> Key: HIVE-8769
> URL: https://issues.apache.org/jira/browse/HIVE-8769
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch
>
>
> TPC-DS Q82 is running slower than on Hive 13 because the join type is not 
> correct.
> The estimate for item x inventory x date_dim is 227 million rows while the 
> actual is 3K rows.
> Hive 13 finishes in 753 seconds.
> Hive 14 finishes in 1,267 seconds.
> Hive 14 + force map join finishes in 431 seconds.
> Query
> {code}
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> Plan 
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Map 7 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE)
> Reducer 4 <- Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
> Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
> Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
>   DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: item
>   filterExpr: ((i_current_price BETWEEN 30 AND 60 and 
> (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
> boolean)
>   Statistics: Num rows: 462000 Data size: 663862160 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((i_current_price BETWEEN 30 AND 60 and 
> (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
> boolean)
> Statistics: Num rows: 115500 Data size: 34185680 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
>   expressions: i_item_sk (type: int), i_item_id (type: 
> string), i_item_desc (type: string), i_current_price (type: float)
>   outputColumnNames: _col0, _col1, _col2, _col3
>   Statistics: Num rows: 115500 Data size: 33724832 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 115500 Data size: 33724832 
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col1 (type: string), _col2 (type: 
> string), _col3 (type: float)
> Execution mode: vectorized
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
> and d_date_sk is not null) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 81741831 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
> and d_date_sk is not null) (type: boolean)
> Statistics: Num rows: 36524 Data size: 3579352 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
>   expressions: d_date_sk (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 36524 Data size: 146096 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 36524 Data size: 146096 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: _col0 (type: int)
> ou

[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542863#comment-14542863
 ] 

Alexander Pivovarov commented on HIVE-10580:


Can you +1 the bug fix then? Sure, we can deprecate it when a new 
getConstantLongValue becomes available.

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}
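The quoted branch can never succeed: once `constValue instanceof IntWritable` is true, the cast to `LongWritable` must throw `ClassCastException`. The standalone sketch below uses minimal stand-ins for Hadoop's writables (so it is self-contained, not the real classes) to demonstrate the failure and the presumed fix of casting to the type that was just checked.

```java
public class CastDemo {
    // Minimal stand-ins for Hadoop's IntWritable/LongWritable.
    static class IntWritable {
        final int v;
        IntWritable(int v) { this.v = v; }
        int get() { return v; }
    }
    static class LongWritable {
        final long v;
        LongWritable(long v) { this.v = v; }
        long get() { return v; }
    }

    // Mirrors the buggy branch: always throws when given an IntWritable.
    static long buggy(Object constValue) {
        if (constValue instanceof IntWritable) {
            return ((LongWritable) constValue).get(); // ClassCastException
        }
        return -1;
    }

    // Presumed fix: cast to the type the instanceof check established.
    static long fixed(Object constValue) {
        if (constValue instanceof IntWritable) {
            return ((IntWritable) constValue).get();
        }
        return -1;
    }

    public static void main(String[] args) {
        try {
            buggy(new IntWritable(42));
        } catch (ClassCastException e) {
            System.out.println("buggy threw ClassCastException");
        }
        System.out.println(fixed(new IntWritable(42))); // 42
    }
}
```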



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542860#comment-14542860
 ] 

Swarnim Kulkarni commented on HIVE-10580:
-

+1 on changes (non-binding)

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542853#comment-14542853
 ] 

Swarnim Kulkarni commented on HIVE-10580:
-

{quote}
So I think we should go ahead and just deprecate this method since it is not 
being used.
{quote}

I don't think I worded this properly. It should be an "if" instead of a "since" ;)

So my only point was that if we must remove this method, let's not remove it 
outright but deprecate it and document it accordingly, so our consumers have a 
chance to migrate to the newer API.

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10432) Need to add more e2e like tests between HiveServer2 and JDBC using wiremock or equivalent

2015-05-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10432:
-
Attachment: HIVE-10432.1.patch

cc-ing [~thejas] / [~vgumashta] for reviewing patch #1. This contains the basic 
framework integration and some simple tests. 

Thanks
Hari

> Need to add more e2e like tests between HiveServer2 and JDBC using wiremock 
> or equivalent
> -
>
> Key: HIVE-10432
> URL: https://issues.apache.org/jira/browse/HIVE-10432
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10432.1.patch
>
>
> The current unit tests use ThriftCLIService to test client-server 
> interaction. We will need to mock HS2 to facilitate writing test cases 
> in which we can parse the HTTP request/response.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10054) Clean up ETypeConverter since Parquet supports timestamp type already

2015-05-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542822#comment-14542822
 ] 

Sergio Peña commented on HIVE-10054:


[~Ferd] There is a duplicate patch for this on HIVE-10642. 
We should stay compatible with older files that still use INT96 for timestamps. 
The patch on HIVE-10642 addresses that, and also supports the new 
TIMESTAMP_MILLIS.

I will set this as duplicate. 

> Clean up ETypeConverter since Parquet supports timestamp type already
> -
>
> Key: HIVE-10054
> URL: https://issues.apache.org/jira/browse/HIVE-10054
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10054.1.patch, HIVE-10054.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542814#comment-14542814
 ] 

Alexander Pivovarov commented on HIVE-10580:


Let's look at the following facts:
- after we fix the bug, the implementation will be correct
- the name of the method is also good, and we have no other method that 
replaces it
- currently the method is not used by any UDF in the hive-exec project
- the method might be used in Hive users' custom UDFs

Hive is not just a product in itself; it is an extensible platform. End users 
can develop their own UDFs, SerDes, etc.
The platform should provide clear and useful interfaces for external developers.

Do you still want to deprecate a useful base-class method with a correct name and 
implementation just because it's not used in hive-exec itself?

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9658) Reduce parquet memory usage by bypassing java primitive objects on ETypeConverter

2015-05-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-9658:
--
Summary: Reduce parquet memory usage by bypassing java primitive objects on 
ETypeConverter  (was: Reduce parquet memory use by bypassing java primitive 
objects on ETypeConverter)

> Reduce parquet memory usage by bypassing java primitive objects on 
> ETypeConverter
> -
>
> Key: HIVE-9658
> URL: https://issues.apache.org/jira/browse/HIVE-9658
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-9658.1.patch, HIVE-9658.2.patch, HIVE-9658.3.patch, 
> HIVE-9658.4.patch, HIVE-9658.5.patch
>
>
> NO PRECOMMIT TESTS
> The ETypeConverter class passes Writable objects to the collection converters 
> in order to be read later by the map/reduce functions. These objects are all 
> wrapped in a unique ArrayWritable object.
> We can save some memory by returning the Java primitive objects instead, in 
> order to prevent memory allocation. The only writable object needed by 
> map/reduce is ArrayWritable. If we create another writable class in which to 
> store primitive objects (Object), then we can stop using all primitive 
> writables.
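A hypothetical holder in the spirit of this description is sketched below: one container whose cells are plain Java values (`Integer`, `Double`, `String`, ...) rather than one `Writable` wrapper allocated per cell. The real class would also implement Hadoop's `Writable` interface; that part is omitted here to keep the sketch self-contained, and the class name is invented.

```java
import java.util.Arrays;

public class ObjectArrayDemo {
    // Hypothetical replacement for a row of per-cell Writable wrappers:
    // a single object holding raw java values, avoiding N wrapper
    // allocations per row.
    static class ObjectArrayHolder {
        private final Object[] values;

        ObjectArrayHolder(Object[] values) {
            this.values = values;
        }

        Object get(int i) {
            return values[i];
        }

        int size() {
            return values.length;
        }
    }

    public static void main(String[] args) {
        ObjectArrayHolder row = new ObjectArrayHolder(new Object[]{1, 2.5, "x"});
        System.out.println(row.size()); // 3
        System.out.println(Arrays.asList(row.get(0), row.get(1), row.get(2)));
    }
}
```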



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9658) Reduce parquet memory use by bypassing java primitive objects on ETypeConverter

2015-05-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-9658:
--
Description: 
NO PRECOMMIT TESTS

The ETypeConverter class passes Writable objects to the collection converters 
in order to be read later by the map/reduce functions. These objects are all 
wrapped in a unique ArrayWritable object.

We can save some memory by returning the Java primitive objects instead, in 
order to prevent memory allocation. The only writable object needed by 
map/reduce is ArrayWritable. If we create another writable class in which to store 
primitive objects (Object), then we can stop using all primitive writables.

  was:
The ETypeConverter class passes Writable objects to the collection converters 
in order to be read later by the map/reduce functions. These objects are all 
wrapped in a unique ArrayWritable object.

We can save some memory by returning the Java primitive objects instead, in 
order to prevent memory allocation. The only writable object needed by 
map/reduce is ArrayWritable. If we create another writable class in which to store 
primitive objects (Object), then we can stop using all primitive writables.


> Reduce parquet memory use by bypassing java primitive objects on 
> ETypeConverter
> ---
>
> Key: HIVE-9658
> URL: https://issues.apache.org/jira/browse/HIVE-9658
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-9658.1.patch, HIVE-9658.2.patch, HIVE-9658.3.patch, 
> HIVE-9658.4.patch, HIVE-9658.5.patch
>
>
> NO PRECOMMIT TESTS
> The ETypeConverter class passes Writable objects to the collection converters 
> in order to be read later by the map/reduce functions. These objects are all 
> wrapped in a unique ArrayWritable object.
> We can save some memory by returning the Java primitive objects instead, in 
> order to prevent memory allocation. The only writable object needed by 
> map/reduce is ArrayWritable. If we create another writable class in which to 
> store primitive objects (Object), then we can stop using all primitive 
> writables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9658) Reduce parquet memory use by bypassing java primitive objects on ETypeConverter

2015-05-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-9658:
--
Attachment: HIVE-9658.5.patch

This patch has changes due to other changes made on the parquet branch. 

> Reduce parquet memory use by bypassing java primitive objects on 
> ETypeConverter
> ---
>
> Key: HIVE-9658
> URL: https://issues.apache.org/jira/browse/HIVE-9658
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-9658.1.patch, HIVE-9658.2.patch, HIVE-9658.3.patch, 
> HIVE-9658.4.patch, HIVE-9658.5.patch
>
>
> The ETypeConverter class passes Writable objects to the collection converters 
> in order to be read later by the map/reduce functions. These objects are all 
> wrapped in a unique ArrayWritable object.
> We can save some memory by returning the java primitive objects instead in 
> order to prevent memory allocation. The only writable object needed by 
> map/reduce is ArrayWritable. If we create another writable class where to 
> store primitive objects (Object), then we can stop using all primitive 
> wirtables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10627) Queries fail with Failed to breakup Windowing invocations into Groups

2015-05-13 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542800#comment-14542800
 ] 

Laljo John Pullokkaran commented on HIVE-10627:
---

+1

> Queries fail with Failed to breakup Windowing invocations into Groups
> -
>
> Key: HIVE-10627
> URL: https://issues.apache.org/jira/browse/HIVE-10627
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10627.01.patch, HIVE-10627.01.patch, 
> HIVE-10627.02.patch, HIVE-10627.03.patch, HIVE-10627.patch
>
>
> TPC-DS query 51 fails with Failed to breakup Windowing invocations into 
> Groups. At least 1 group must only depend on input columns. Also check for 
> circular dependencies.
> {code}
> explain  
> WITH web_v1 as (
> select
>   ws_item_sk item_sk, d_date, sum(ws_sales_price),
>   sum(sum(ws_sales_price))
>   over (partition by ws_item_sk order by d_date rows between unbounded 
> preceding and current row) cume_sales
> from web_sales
> ,date_dim
> where ws_sold_date_sk=d_date_sk
>   and d_month_seq between 1193 and 1193+11
>   and ws_item_sk is not NULL
> group by ws_item_sk, d_date),
> store_v1 as (
> select
>   ss_item_sk item_sk, d_date, sum(ss_sales_price),
>   sum(sum(ss_sales_price))
>   over (partition by ss_item_sk order by d_date rows between unbounded 
> preceding and current row) cume_sales
> from store_sales
> ,date_dim
> where ss_sold_date_sk=d_date_sk
>   and d_month_seq between 1193 and 1193+11
>   and ss_item_sk is not NULL
> group by ss_item_sk, d_date)
>  select  *
> from (select item_sk
>  ,d_date
>  ,web_sales
>  ,store_sales
>  ,max(web_sales)
>  over (partition by item_sk order by d_date rows between unbounded 
> preceding and current row) web_cumulative
>  ,max(store_sales)
>  over (partition by item_sk order by d_date rows between unbounded 
> preceding and current row) store_cumulative
>  from (select case when web.item_sk is not null then web.item_sk else 
> store.item_sk end item_sk
>  ,case when web.d_date is not null then web.d_date else 
> store.d_date end d_date
>  ,web.cume_sales web_sales
>  ,store.cume_sales store_sales
>from web_v1 web full outer join store_v1 store on (web.item_sk = 
> store.item_sk
>   and web.d_date = 
> store.d_date)
>   )x )y
> where web_cumulative > store_cumulative
> order by item_sk
> ,d_date
> limit 100;
> {code}
> Exception 
> {code}
> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to breakup 
> Windowing invocations into Groups. At least 1 group must only depend on input 
> columns. Also check for circular dependencies. 
> Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 
> 0:-1 Invalid column reference '$f2' 
>   at 
> org.apache.hadoop.hive.ql.parse.WindowingComponentizer.next(WindowingComponentizer.java:94)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11538)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8514)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8472)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9304)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9189)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9210)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9189)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9210)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9189)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9210)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9189)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9210)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9189)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9210)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9592)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:208)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:7

[jira] [Commented] (HIVE-10686) java.lang.IndexOutOfBoundsException for query with rank() over(partition ...)

2015-05-13 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542797#comment-14542797
 ] 

Laljo John Pullokkaran commented on HIVE-10686:
---

[~jcamachorodriguez] In "adjustOBSchema":
{code}
if (rn instanceof RexCall) {
  operands.add(adjustOBSchema((RexCall) rn, obChild, resultSchema));
} else {
  operands.add(rn);
}
{code}

The else branch can only be a literal, right?

> java.lang.IndexOutOfBoundsException for query with rank() over(partition ...)
> -
>
> Key: HIVE-10686
> URL: https://issues.apache.org/jira/browse/HIVE-10686
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10686.01.patch, HIVE-10686.02.patch, 
> HIVE-10686.03.patch, HIVE-10686.patch
>
>
> CBO throws Index out of bound exception for TPC-DS Q70.
> Query 
> {code}
> explain
> select
> sum(ss_net_profit) as total_sum
>,s_state
>,s_county
>,grouping__id as lochierarchy
>, rank() over(partition by grouping__id, case when grouping__id == 2 then 
> s_state end order by sum(ss_net_profit)) as rank_within_parent
> from
> store_sales ss join date_dim d1 on d1.d_date_sk = ss.ss_sold_date_sk
> join store s on s.s_store_sk  = ss.ss_store_sk
>  where
> d1.d_month_seq between 1193 and 1193+11
>  and s.s_state in
>  ( select s_state
>from  (select s_state as s_state, sum(ss_net_profit),
>  rank() over ( partition by s_state order by 
> sum(ss_net_profit) desc) as ranking
>   from   store_sales, store, date_dim
>   where  d_month_seq between 1193 and 1193+11
> and date_dim.d_date_sk = 
> store_sales.ss_sold_date_sk
> and store.s_store_sk  = store_sales.ss_store_sk
>   group by s_state
>  ) tmp1
>where ranking <= 5
>  )
>  group by s_state,s_county with rollup
> order by
>lochierarchy desc
>   ,case when lochierarchy = 0 then s_state end
>   ,rank_within_parent
>  limit 100
> {code}
> Original plan (correct)
> {code}
>  HiveSort(fetch=[100])
>   HiveSort(sort0=[$3], sort1=[$5], sort2=[$4], dir0=[DESC], dir1=[ASC], 
> dir2=[ASC])
> HiveProject(total_sum=[$4], s_state=[$0], s_county=[$1], 
> lochierarchy=[$5], rank_within_parent=[rank() OVER (PARTITION BY $5, 
> when(==($5, 2), $0) ORDER BY $4 ROWS BETWEEN 2147483647 FOLLOWING AND 
> 2147483647 PRECEDING)], (tok_function when (= (tok_table_or_col lochierarchy) 
> 0) (tok_table_or_col s_state))=[when(=($5, 0), $0)])
>   HiveAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}, {}]], 
> indicator=[true], agg#0=[sum($2)], GROUPING__ID=[GROUPING__ID()])
> HiveProject($f0=[$7], $f1=[$6], $f2=[$1])
>   HiveJoin(condition=[=($5, $2)], joinType=[inner], algorithm=[none], 
> cost=[{1177.2086187101072 rows, 0.0 cpu, 0.0 io}])
> HiveJoin(condition=[=($3, $0)], joinType=[inner], 
> algorithm=[none], cost=[{2880430.428726483 rows, 0.0 cpu, 0.0 io}])
>   HiveProject(ss_sold_date_sk=[$0], ss_net_profit=[$21], 
> ss_store_sk=[$22])
> HiveTableScan(table=[[tpcds.store_sales]])
>   HiveProject(d_date_sk=[$0], d_month_seq=[$3])
> HiveFilter(condition=[between(false, $3, 1193, +(1193, 11))])
>   HiveTableScan(table=[[tpcds.date_dim]])
> HiveProject(s_store_sk=[$0], s_county=[$1], s_state=[$2])
>   SemiJoin(condition=[=($2, $3)], joinType=[inner])
> HiveProject(s_store_sk=[$0], s_county=[$23], s_state=[$24])
>   HiveTableScan(table=[[tpcds.store]])
> HiveProject(s_state=[$0])
>   HiveFilter(condition=[<=($1, 5)])
> HiveProject((tok_table_or_col s_state)=[$0], 
> rank_window_0=[rank() OVER (PARTITION BY $0 ORDER BY $1 DESC ROWS BETWEEN 
> 2147483647 FOLLOWING AND 2147483647 PRECEDING)])
>   HiveAggregate(group=[{0}], agg#0=[sum($1)])
> HiveProject($f0=[$6], $f1=[$1])
>   HiveJoin(condition=[=($5, $2)], joinType=[inner], 
> algorithm=[none], cost=[{1177.2086187101072 rows, 0.0 cpu, 0.0 io}])
> HiveJoin(condition=[=($3, $0)], joinType=[inner], 
> algorithm=[none], cost=[{2880430.428726483 rows, 0.0 cpu, 0.0 io}])
>   HiveProject(ss_sold_date_sk=[$0], 
> ss_net_profit=[$21], ss_store_sk=[$22])
> HiveTableScan(table=[[tpcds.store_sales]])
>   HiveProject(d_date_sk=[$0], d_month_seq=[$3])
>   

[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542779#comment-14542779
 ] 

Swarnim Kulkarni commented on HIVE-10580:
-

{quote}
protected API should be handled the same as public API. Once added it should 
not be removed.
{quote}

+1 for not directly removing the method in order to maintain passivity, but 
that statement is not completely true. The ideal approach is to deprecate the 
method and point out an alternative in the javadoc. So I think we should go 
ahead and deprecate this method, since it is not being used.
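For reference, the deprecate-and-delegate pattern being suggested might look like the sketch below. The signatures and the replacement method name are simplified, hypothetical stand-ins, not Hive's actual API:

```java
public class DeprecationSketch {
    // Hedged sketch of deprecating rather than removing a protected helper.
    // Signatures and the replacement name are illustrative, not Hive's real API.

    /**
     * @deprecated kept only for passivity; use {@link #getLongConstant(long)}.
     */
    @Deprecated
    protected static long getConstantLongValue(long v) {
        // Delegate to the replacement so existing callers keep working.
        return getLongConstant(v);
    }

    protected static long getLongConstant(long v) {
        return v;
    }

    public static void main(String[] args) {
        // The old entry point still compiles and behaves the same.
        System.out.println(getConstantLongValue(42L));  // prints 42
    }
}
```

Existing subclasses that call the old method keep compiling (with a deprecation warning), while the javadoc steers new code to the replacement.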

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}
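The fix is presumably to cast to the type just checked for. A minimal self-contained sketch of the corrected branch, with toy local stand-ins for Hadoop's writables so it compiles without Hadoop on the classpath:

```java
public class CastSketch {
    // Toy stand-ins for org.apache.hadoop.io.IntWritable / LongWritable,
    // kept local so this sketch compiles without Hadoop on the classpath.
    static class IntWritable {
        private final int v;
        IntWritable(int v) { this.v = v; }
        int get() { return v; }
    }
    static class LongWritable {
        private final long v;
        LongWritable(long v) { this.v = v; }
        long get() { return v; }
    }

    static long constantLongValue(Object constValue) {
        long v;
        if (constValue instanceof IntWritable) {
            // Corrected: cast to IntWritable, the type just checked for.
            // Casting to LongWritable here would throw ClassCastException.
            v = ((IntWritable) constValue).get();
        } else if (constValue instanceof LongWritable) {
            v = ((LongWritable) constValue).get();
        } else {
            throw new IllegalArgumentException("unsupported constant type");
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(constantLongValue(new IntWritable(7)));    // prints 7
        System.out.println(constantLongValue(new LongWritable(9L)));  // prints 9
    }
}
```

With the original code, any IntWritable constant would hit the LongWritable cast and fail at runtime; the branch was effectively dead-on-arrival.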



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542728#comment-14542728
 ] 

Hive QA commented on HIVE-10623:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732594/HIVE-10623.2.patch

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 8929 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hive.beeline.cli.TestHiveCli.testDatabaseOptions
org.apache.hive.beeline.cli.TestHiveCli.testHelp
org.apache.hive.beeline.cli.TestHiveCli.testInValidCmd
org.apache.hive.beeline.cli.TestHiveCli.testInvalidDatabaseOptions
org.apache.hive.beeline.cli.TestHiveCli.testInvalidOptions
org.apache.hive.beeline.cli.TestHiveCli.testInvalidOptions2
org.apache.hive.beeline.cli.TestHiveCli.testSqlFromCmd
org.apache.hive.beeline.cli.TestHiveCli.testSqlFromCmdWithDBName
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3881/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3881/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3881/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732594 - PreCommit-HIVE-TRUNK-Build

> Implement hive cli options using beeline functionality
> --
>
> Key: HIVE-10623
> URL: https://issues.apache.org/jira/browse/HIVE-10623
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, HIVE-10623.patch
>
>
> We need to support the original hive cli options for the purpose of backwards 
> compatibility. 





[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542700#comment-14542700
 ] 

Alexander Pivovarov commented on HIVE-10580:


Protected means that the method might be used by actual UDFs. Hive itself 
contains only common UDFs, but serious Hive users and companies have their own 
UDFs, and it's quite possible that custom UDFs take a bigint constant as a 
parameter.

A protected API should be handled the same as a public API: once added, it 
should not be removed, because both are exposed to external users. Looking at a 
UML class diagram:
- a protected API can be used in an "inheritance" relationship between classes
- a public API can be used in any relationship between classes

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}





[jira] [Commented] (HIVE-10565) LLAP: Native Vector Map Join doesn't handle filtering and matching on LEFT OUTER JOIN repeated key correctly

2015-05-13 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542696#comment-14542696
 ] 

Matt McCline commented on HIVE-10565:
-

Test failures are unrelated.

> LLAP: Native Vector Map Join doesn't handle filtering and matching on LEFT 
> OUTER JOIN repeated key correctly
> 
>
> Key: HIVE-10565
> URL: https://issues.apache.org/jira/browse/HIVE-10565
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0
>
> Attachments: HIVE-10565.01.patch, HIVE-10565.02.patch, 
> HIVE-10565.03.patch, HIVE-10565.04.patch, HIVE-10565.05.patch, 
> HIVE-10565.06.patch, HIVE-10565.07.patch, HIVE-10565.08.patch, 
> HIVE-10565.09.patch, HIVE-10565.091.patch, HIVE-10565.092.patch
>
>
> Filtering can knock out some of the rows for a repeated key, but those 
> knocked out rows need to be included in the LEFT OUTER JOIN result and are 
> currently not when only some rows are filtered out.
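The required semantics can be illustrated with a toy row-at-a-time join. This is only a model of the expected output, not Hive's native vector map join code; the key/value layout and the filter predicate are invented for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LeftOuterJoinSketch {
    // Toy model of LEFT OUTER JOIN with an ON-clause filter: when the filter
    // knocks out a probe row's match, the left row must still be emitted,
    // padded with null, rather than dropped.
    static List<String> leftOuterJoin(List<int[]> left, Map<Integer, Integer> right,
                                      int filterThreshold) {
        List<String> out = new ArrayList<>();
        for (int[] row : left) {                       // row = {key, value}
            Integer r = right.get(row[0]);
            // The match counts only if it also passes the join-side filter.
            if (r != null && r > filterThreshold) {
                out.add(row[0] + ":" + row[1] + "->" + r);
            } else {
                out.add(row[0] + ":" + row[1] + "->null");  // padded, not dropped
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> left = Arrays.asList(
            new int[]{1, 10}, new int[]{1, 11},        // repeated key 1
            new int[]{2, 20});                          // key 2: match filtered out
        Map<Integer, Integer> right = new HashMap<>();
        right.put(1, 5);   // passes the filter (> 3)
        right.put(2, 2);   // fails the filter -> key 2 still appears with null
        System.out.println(leftOuterJoin(left, right, 3));
        // prints [1:10->5, 1:11->5, 2:20->null]
    }
}
```

The bug described above amounts to the `2:20->null` row going missing when the repeated-key fast path skips filtered rows entirely.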





[jira] [Commented] (HIVE-10690) ArrayIndexOutOfBounds exception in MetaStoreDirectSql.aggrColStatsForPartitions()

2015-05-13 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542695#comment-14542695
 ] 

Sushanth Sowmyan commented on HIVE-10690:
-

Yes, please add it to the wiki list for 1.2.1 inclusion in 
https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

> ArrayIndexOutOfBounds exception in 
> MetaStoreDirectSql.aggrColStatsForPartitions()
> -
>
> Key: HIVE-10690
> URL: https://issues.apache.org/jira/browse/HIVE-10690
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 1.2.0
>Reporter: Jason Dere
>Assignee: Vaibhav Gumashta
> Fix For: 1.3.0
>
> Attachments: HIVE-10690.1.patch
>
>
> Noticed a bunch of these stack traces in hive.log while running some unit 
> tests:
> {noformat}
> 2015-05-11 21:18:59,371 WARN  [main]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(2420)) - Direct SQL failed
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1132)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6162)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6158)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2385)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6158)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
> at com.sun.proxy.$Proxy84.get_aggr_stats_for(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_aggr_stats_for(HiveMetaStore.java:5662)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy86.get_aggr_stats_for(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAggrColStatsFor(HiveMetaStoreClient.java:2064)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
> at com.sun.proxy.$Proxy87.getAggrColStatsFor(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAggrColStatsFor(Hive.java:3110)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:245)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:329)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:399)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:392)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:150)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:77)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:64)
> at sun.reflect.GeneratedMethodAccessor296.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
> at com.sun.proxy.$Proxy108.getDistinctRowCoun

[jira] [Updated] (HIVE-10690) ArrayIndexOutOfBounds exception in MetaStoreDirectSql.aggrColStatsForPartitions()

2015-05-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10690:

Fix Version/s: 1.3.0

> ArrayIndexOutOfBounds exception in 
> MetaStoreDirectSql.aggrColStatsForPartitions()
> -
>
> Key: HIVE-10690
> URL: https://issues.apache.org/jira/browse/HIVE-10690
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 1.2.0
>Reporter: Jason Dere
>Assignee: Vaibhav Gumashta
> Fix For: 1.3.0
>
> Attachments: HIVE-10690.1.patch
>
>
> Noticed a bunch of these stack traces in hive.log while running some unit 
> tests:
> {noformat}
> 2015-05-11 21:18:59,371 WARN  [main]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(2420)) - Direct SQL failed
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1132)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6162)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6158)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2385)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6158)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
> at com.sun.proxy.$Proxy84.get_aggr_stats_for(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_aggr_stats_for(HiveMetaStore.java:5662)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy86.get_aggr_stats_for(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAggrColStatsFor(HiveMetaStoreClient.java:2064)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
> at com.sun.proxy.$Proxy87.getAggrColStatsFor(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAggrColStatsFor(Hive.java:3110)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:245)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:329)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:399)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:392)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:150)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:77)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:64)
> at sun.reflect.GeneratedMethodAccessor296.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
> at com.sun.proxy.$Proxy108.getDistinctRowCount(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor234.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(Delegatin

[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542674#comment-14542674
 ] 

Ashutosh Chauhan commented on HIVE-10580:
-

The method is protected, so it's not meant to be used outside of Hive.

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}





[jira] [Updated] (HIVE-10676) Update Hive's README to mention spark, and to remove jdk1.6

2015-05-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10676:
--
Labels:   (was: TODOC1.2)

> Update Hive's README to mention spark, and to remove jdk1.6
> ---
>
> Key: HIVE-10676
> URL: https://issues.apache.org/jira/browse/HIVE-10676
> Project: Hive
>  Issue Type: Task
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-10676.2.patch, HIVE-10676.patch
>
>
> a) Hive's README file mentions only two execution frameworks and does not 
> mention Spark. We should add that.
> b) We should remove jdk1.6 from the README, since Hive no longer supports or 
> even compiles under jdk1.6.
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-10676) Update Hive's README to mention spark, and to remove jdk1.6

2015-05-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542672#comment-14542672
 ] 

Lefty Leverenz commented on HIVE-10676:
---

Your work is always good, [~sushanth].  ;)

I like "onward" -- much nicer than "and later" -- so I'll be borrowing that 
term.  Thanks.

Removing the TODOC1.2 label.

> Update Hive's README to mention spark, and to remove jdk1.6
> ---
>
> Key: HIVE-10676
> URL: https://issues.apache.org/jira/browse/HIVE-10676
> Project: Hive
>  Issue Type: Task
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-10676.2.patch, HIVE-10676.patch
>
>
> a) Hive's README file mentions only two execution frameworks and does not 
> mention Spark. We should add that.
> b) We should remove jdk1.6 from the README, since Hive no longer supports or 
> even compiles under jdk1.6.
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542662#comment-14542662
 ] 

Ashutosh Chauhan commented on HIVE-10650:
-

I see.
+1

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}
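The frame semantics of {{rows between 2 preceding and 1 preceding}} can be modeled with a small sketch. This is illustrative only -- Hive's actual PTF/windowing implementation differs -- and it ignores partitioning and ordering, assuming the rows are already one sorted partition:

```java
import java.util.Arrays;

public class WindowFrameSum {
    // For each row i, sum the frame [i - frameStart, i - frameEnd], clamped
    // at the partition start; an empty frame contributes 0 (SQL would yield
    // NULL, simplified here to 0 for the sketch).
    static long[] frameSums(long[] vals, int frameStart, int frameEnd) {
        long[] out = new long[vals.length];
        for (int i = 0; i < vals.length; i++) {
            long s = 0;
            for (int j = Math.max(0, i - frameStart); j <= i - frameEnd; j++) {
                s += vals[j];
            }
            out[i] = s;
        }
        return out;
    }

    public static void main(String[] args) {
        // "rows between 2 preceding and 1 preceding" over values 1,2,3,4:
        // row 0: empty frame -> 0; row 1: {1} -> 1; row 2: {1,2} -> 3; row 3: {2,3} -> 5
        System.out.println(Arrays.toString(frameSums(new long[]{1, 2, 3, 4}, 2, 1)));
        // prints [0, 1, 3, 5]
    }
}
```

The key point of the frame is that the current row itself is excluded, which is exactly the case the patch makes work.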





[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-13 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542639#comment-14542639
 ] 

Aihua Xu commented on HIVE-10650:
-

[~ashutoshc] Thanks for reviewing. You are right. I added such a test case in 
HIVE-10140 and then noticed that this kind of windowing actually doesn't work. 
I mentioned there that the result was incorrect, and this is the patch to fix 
that issue.

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}





[jira] [Commented] (HIVE-10590) fix potential NPE in HiveMetaStore.equals

2015-05-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542630#comment-14542630
 ] 

Ashutosh Chauhan commented on HIVE-10590:
-

+1

> fix potential NPE in HiveMetaStore.equals
> -
>
> Key: HIVE-10590
> URL: https://issues.apache.org/jira/browse/HIVE-10590
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-10590.1.patch, rb33798.patch
>
>
> The following code will throw NPE if both v1 and v2 are null. HiveMetaStore 
> (2028-2029)
> {code}
> String v1 = p1.getValues().get(i), v2 = p2.getValues().get(i);
> if ((v1 == null && v2 != null) || !v1.equals(v2)) return false;
> {code}
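A null-safe version of that comparison (equivalent to {{java.util.Objects.equals}}) might look like the sketch below; the actual HIVE-10590 patch may differ:

```java
import java.util.Objects;

public class NullSafeEquals {
    // Hedged sketch: treats two nulls as equal and never dereferences null.
    // This mirrors java.util.Objects.equals(a, b).
    static boolean valuesEqual(String v1, String v2) {
        if (v1 == null || v2 == null) {
            return v1 == v2;  // true only when both are null
        }
        return v1.equals(v2);
    }

    public static void main(String[] args) {
        System.out.println(valuesEqual(null, null));     // prints true
        System.out.println(valuesEqual(null, "a"));      // prints false
        System.out.println(valuesEqual("a", "a"));       // prints true
        // Equivalent one-liner from the standard library:
        System.out.println(Objects.equals(null, null));  // prints true
    }
}
```

With the original code, v1 == null and v2 == null falls through the first condition and then dereferences v1 in `!v1.equals(v2)`, producing the NPE described above.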





[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542604#comment-14542604
 ] 

Alexander Pivovarov commented on HIVE-10580:


GenericUDF is a base class for building UDFs, and some third-party UDFs might 
have bigint constants.

The getConstantLongValue API was added in Hive 1.2.0. I do not think it's a 
good idea to remove this API in 1.3.0; some people might have built UDFs that 
use getConstantLongValue already.

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}





[jira] [Resolved] (HIVE-10700) LLAP: Log additional debug information in the scheduler

2015-05-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-10700.
---
Resolution: Fixed

> LLAP: Log additional debug information in the scheduler
> ---
>
> Key: HIVE-10700
> URL: https://issues.apache.org/jira/browse/HIVE-10700
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: llap
>
> Attachments: HIVE-10700.1.txt
>
>
> Temporarily, while we're debugging issues. Changing to the DEBUG log level is 
> too verbose.





[jira] [Updated] (HIVE-10700) LLAP: Log additional debug information in the scheduler

2015-05-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10700:
--
Attachment: HIVE-10700.1.txt

> LLAP: Log additional debug information in the scheduler
> ---
>
> Key: HIVE-10700
> URL: https://issues.apache.org/jira/browse/HIVE-10700
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: llap
>
> Attachments: HIVE-10700.1.txt
>
>
> Temporarily, while we're debugging issues. Changing to the DEBUG log level is 
> too verbose.





[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542566#comment-14542566
 ] 

Ashutosh Chauhan commented on HIVE-10650:
-

Hey [~aihuaxu] I see that you have deleted
{code:sql}
select f, sum(f) over (partition by ts order by f rows between 2 preceding and 
1 preceding) from over10k limit 100;
{code}
But you have added the same query with an additional projection of ts in 
another q file, and the results are quite different. This led me to wonder 
whether we have a bug in current Hive with wrong results checked in. I then 
tested this query and the others in your patch on Postgres, and the results 
matched yours. So it seems we have a correctness bug in current trunk, which 
this patch fixes. Is that correct?

I am still going through code changes.

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}





[jira] [Comment Edited] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542566#comment-14542566
 ] 

Ashutosh Chauhan edited comment on HIVE-10650 at 5/13/15 7:52 PM:
--

Hey [~aihuaxu] I see that you have deleted
{code}
select f, sum(f) over (partition by ts order by f rows between 2 preceding and 
1 preceding) from over10k limit 100;
{code}
But you have added the same query with an additional projection of ts in 
another q file, and the results are quite different. This led me to wonder 
whether we have a bug in current Hive with wrong results checked in. I then 
tested this query and the others in your patch on Postgres, and the results 
matched yours. So it seems we have a correctness bug in current trunk, which 
this patch fixes. Is that correct?

I am still going through code changes.


was (Author: ashutoshc):
Hey [~aihuaxu] I see that you have deleted
{code : sql}
select f, sum(f) over (partition by ts order by f rows between 2 preceding and 
1 preceding) from over10k limit 100;
{code}
But you have added same query with additional projection of ts in another q 
file. But, results are quite different. This led me to think if we had bug in 
current Hive and have wrong results checked in. I then tested this query and 
others you have in patch on postgres and results matched with what you have. 
So,  it seems like we have a correctness bug in current trunk, which this patch 
is fixing. Is that correct?

I am still going through code changes.

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10684) Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542536#comment-14542536
 ] 

Hive QA commented on HIVE-10684:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732502/HIVE-10684.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8921 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3879/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3879/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3879/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732502 - PreCommit-HIVE-TRUNK-Build

> Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary 
> jar files
> --
>
> Key: HIVE-10684
> URL: https://issues.apache.org/jira/browse/HIVE-10684
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10684.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10697) ObjecInspectorConvertors#UnionConvertor does a faulty conversion

2015-05-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542504#comment-14542504
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10697:
--

[~swarnim] You are right that ObjectInspectorConverters are meant to convert 
from one object to another given the two OIs, not to convert the OIs 
themselves. The line below needs to be changed from 
fieldConverters.get(f).convert(inputOI) to 
fieldConverters.get(f).convert(input):

{code}
@Override
public Object convert(Object input) {
.
  for (int f = 0; f < minFields; f++) {
Object outputFieldValue = fieldConverters.get(f).convert(inputOI);
 
}
{code}

Thanks
Hari
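
The fix described above can be sketched with toy classes (hypothetical names,
not the Hive API): each field converter must receive the field *value*, not the
object inspector that describes it.

```python
class Converter:
    """Toy field converter: wraps a conversion function over values."""
    def __init__(self, fn):
        self.fn = fn

    def convert(self, value):
        return self.fn(value)

def convert_union_field(field_converters, f, input_value):
    # Correct: pass the input value. The reported bug passed the
    # object inspector (inputOI) here instead of the value.
    return field_converters[f].convert(input_value)

converters = [Converter(str), Converter(int)]
assert convert_union_field(converters, 1, "42") == 42
```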



> ObjecInspectorConvertors#UnionConvertor does a faulty conversion
> 
>
> Key: HIVE-10697
> URL: https://issues.apache.org/jira/browse/HIVE-10697
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
>
> Currently the UnionConvertor in the ObjectInspectorConvertors class has an 
> issue with the convert method where it attempts to convert the 
> objectinspector itself instead of converting the field.[1]. This should be 
> changed to convert the field itself. This could result in a 
> ClassCastException as shown below:
> {code}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector 
> cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString
>   at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>   ... 9 more
> {code}
> [1] 
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10582) variable typo in HiveOpConverter (714) and SemanticAnalyzer (7496)

2015-05-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10582:

Component/s: (was: Logical Optimizer)
 Query Planning

> variable typo in HiveOpConverter (714) and SemanticAnalyzer (7496)
> --
>
> Key: HIVE-10582
> URL: https://issues.apache.org/jira/browse/HIVE-10582
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-10582.1.patch, rb33790.patch
>
>
> HiveOpConverter lines 703-717
> {code}
>   int kindex = exprBack == null ? -1 : 
> ExprNodeDescUtils.indexOf(exprBack, reduceKeysBack);
>   if (kindex >= 0) {
> ColumnInfo newColInfo = new ColumnInfo(colInfo);
> newColInfo.setInternalName(Utilities.ReduceField.KEY + 
> ".reducesinkkey" + kindex);
> newColInfo.setAlias(outputColName);
> newColInfo.setTabAlias(colInfo.getTabAlias());
> outputColumns.add(newColInfo);
> index[i] = kindex;
> continue;
>   }
>   int vindex = exprBack == null ? -1 : 
> ExprNodeDescUtils.indexOf(exprBack, reduceValuesBack);
>   if (kindex >= 0) { // looks like it should be vindex instead of kindex
> index[i] = -vindex - 1;
> continue;
>   }
> {code}
> Most probably the second "if (kindex >= 0)" (line 714) should be replaced 
> with "if (vindex >= 0)"
> The same situation in SemanticAnalyzer (7483-7499)
> {code}
>   int kindex = exprBack == null ? -1 : 
> ExprNodeDescUtils.indexOf(exprBack, reduceKeysBack);
>   if (kindex >= 0) {
> ColumnInfo newColInfo = new ColumnInfo(colInfo);
> newColInfo.setInternalName(Utilities.ReduceField.KEY + 
> ".reducesinkkey" + kindex);
> newColInfo.setTabAlias(nm[0]);
> outputRR.put(nm[0], nm[1], newColInfo);
> if (nm2 != null) {
>   outputRR.addMappingOnly(nm2[0], nm2[1], newColInfo);
> }
> index[i] = kindex;
> continue;
>   }
>   int vindex = exprBack == null ? -1 : 
> ExprNodeDescUtils.indexOf(exprBack, reduceValuesBack);
>   if (kindex >= 0) { // looks like it should be vindex instead of kindex
> index[i] = -vindex - 1;
> continue;
>   }
> {code}
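
The effect of the typo is that the value branch is unreachable exactly when it
should fire (kindex < 0). A minimal Python sketch of the corrected control flow
(hypothetical names, not the Hive code):

```python
def map_output_column(kindex, vindex, index, i):
    """Map column i to a reduce key slot or a reduce value slot."""
    if kindex >= 0:            # expression found among reduce keys
        index[i] = kindex
        return "key"
    if vindex >= 0:            # was: `if (kindex >= 0)` -- the typo
        index[i] = -vindex - 1
        return "value"
    return "none"

index = [None]
assert map_output_column(-1, 2, index, 0) == "value" and index[0] == -3
```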



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10101) LLAP: enable yourkit profiling of tasks

2015-05-13 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542457#comment-14542457
 ] 

Mostafa Mokhtar commented on HIVE-10101:


[~sershe]
An unsupported API example for JMC:
http://hirt.se/blog/?p=277

> LLAP: enable yourkit profiling of tasks
> ---
>
> Key: HIVE-10101
> URL: https://issues.apache.org/jira/browse/HIVE-10101
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10101.02.patch, HIVE-10101.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10101) LLAP: enable yourkit profiling of tasks

2015-05-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542448#comment-14542448
 ] 

Sergey Shelukhin commented on HIVE-10101:
-

It's harder with LLAP because tasks overlap, but still better than nothing.

> LLAP: enable yourkit profiling of tasks
> ---
>
> Key: HIVE-10101
> URL: https://issues.apache.org/jira/browse/HIVE-10101
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10101.02.patch, HIVE-10101.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10101) LLAP: enable yourkit profiling of tasks

2015-05-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542446#comment-14542446
 ] 

Sergey Shelukhin commented on HIVE-10101:
-

I cannot find any API to control JMC. The whole point is to avoid manual steps 
when collecting profiles.
I.e. if a job had one slow task, I just want to download a file for that slow 
task, not keep profiling enabled all the time or try to manually catch things 
and connect to places.

> LLAP: enable yourkit profiling of tasks
> ---
>
> Key: HIVE-10101
> URL: https://issues.apache.org/jira/browse/HIVE-10101
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10101.02.patch, HIVE-10101.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10580) Fix impossible cast in GenericUDF.getConstantLongValue

2015-05-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542428#comment-14542428
 ] 

Ashutosh Chauhan commented on HIVE-10580:
-

[~apivovarov] Do you want to update the patch to delete this method? No point 
in maintaining dead code.

> Fix impossible cast in GenericUDF.getConstantLongValue
> --
>
> Key: HIVE-10580
> URL: https://issues.apache.org/jira/browse/HIVE-10580
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10580.1.patch
>
>
> line 548-549
> {code}
> if (constValue instanceof IntWritable) {
>   v = ((LongWritable) constValue).get();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO

2015-05-13 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542383#comment-14542383
 ] 

Mostafa Mokhtar commented on HIVE-9069:
---

[~jcamachorodriguez]
It looks like we either push the filter to the scan or have it in the join.

This doesn't get pushed to web_sales
{code}
select 
avg(ws_quantity) wq,
avg(wr_refunded_cash) ref,
avg(wr_fee) fee
from
web_sales,
web_returns,
customer_demographics cd1,
customer_address
where
web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and ((cd1.cd_marital_status = 'M'
and cd1.cd_education_status = '4 yr Degree'
and ws_sales_price between 100.00 and 150.00)
or (ws_sales_price between 50.00 and 100.00)
or (ws_sales_price between 150.00 and 200.00))
and ((ws_net_profit between 100 and 200)
or (ca_country = 'United States'
and ca_state in ('MT' , 'OR', 'IN')
and ws_net_profit between 150 and 300)
or (ws_net_profit between 50 and 250))
{code}

Plan for the scan 
{code}
   Map 4
Map Operator Tree:
TableScan
  alias: web_sales
  filterExpr: (ws_item_sk is not null and ws_order_number is 
not null) (type: boolean)
  Statistics: Num rows: 143966864 Data size: 33110363004 Basic 
stats: COMPLETE Column stats: COMPLETE
  Filter Operator
predicate: (ws_item_sk is not null and ws_order_number is 
not null) (type: boolean)
Statistics: Num rows: 143966864 Data size: 2879193520 Basic 
stats: COMPLETE Column stats: COMPLETE
Select Operator
  expressions: ws_item_sk (type: int), ws_order_number 
(type: int), ws_quantity (type: int), ws_sales_price (type: float), 
ws_net_profit (type: float)
  outputColumnNames: _col0, _col1, _col2, _col3, _col4
  Statistics: Num rows: 143966864 Data size: 2879193520 
Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
key expressions: _col0 (type: int), _col1 (type: int)
sort order: ++
Map-reduce partition columns: _col0 (type: int), _col1 
(type: int)
Statistics: Num rows: 143966864 Data size: 2879193520 
Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: int), _col3 (type: 
float), _col4 (type: float)
Execution mode: vectorized
{code}

While in this query the filter ends up pushed down to web_sales
{code}
select 
avg(ws_quantity) wq,
avg(wr_refunded_cash) ref,
avg(wr_fee) fee
from
web_sales,
web_returns,
customer_demographics cd1,
customer_address
where
web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and ((cd1.cd_marital_status = 'M'
and cd1.cd_education_status = '4 yr Degree'
and ws_sales_price between 100.00 and 150.00)
or (ws_sales_price between 50.00 and 100.00)
or (ws_sales_price between 150.00 and 200.00))
and ((ws_net_profit between 100 and 200)
or (ws_net_profit between 150 and 300)
or (ws_net_profit between 50 and 250))
{code}
Scan 
{code}
 Map 6
Map Operator Tree:
TableScan
  alias: web_sales
  filterExpr: (((ws_net_profit BETWEEN 100 AND 200 or 
(ws_net_profit BETWEEN 150 AND 300 or ws_net_profit BETWEEN 50 AND 250)) and 
ws_item_sk is not null) and ws_order_number is not null) (type: boolean)
  Statistics: Num rows: 143966864 Data size: 33110363004 Basic 
stats: COMPLETE Column stats: COMPLETE
  Filter Operator
predicate: (((ws_net_profit BETWEEN 100 AND 200 or 
(ws_net_profit BETWEEN 150 AND 300 or ws_net_profit BETWEEN 50 AND 250)) and 
ws_item_sk is not null) and ws_order_number is not null) (type: boolean)
Statistics: Num rows: 143966864 Data size: 33110363004 
Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
  expressions: ws_item_sk (type: int), ws_order_number 
(type: int), ws_quantity (type: int), ws_sales_price (type: float)
  outputColumnNames: _col0, _col1, _col2, _col3
  Statistics: Num rows: 143966864 Data size: 2303326064 
Basic stats: COMPLETE Column stats: COMPLETE
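
The behavior shown by the two plans follows a general rule: a disjunctive
filter can be pushed down to a table scan only if every disjunct references
columns of that table alone. A rough Python sketch of the check (illustrative
only, not Hive's optimizer):

```python
def pushable_to_scan(disjuncts, table_cols):
    """True if every disjunct of an OR-ed predicate references only
    columns of the target table, so the whole OR can be pushed."""
    return all(refs <= table_cols for refs in disjuncts)

ws_cols = {"ws_net_profit", "ws_sales_price"}

# Second query: every disjunct touches only web_sales -> pushed.
assert pushable_to_scan([{"ws_net_profit"}] * 3, ws_cols)

# First query: one disjunct also needs customer_address columns -> not pushed.
assert not pushable_to_scan(
    [{"ws_net_profit"},
     {"ca_country", "ca_state", "ws_net_profit"},
     {"ws_net_profit"}],
    ws_cols)
```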
 

[jira] [Assigned] (HIVE-10698) query on view results fails with table not found error if view is created with subquery alias (CTE).

2015-05-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-10698:
--

Assignee: Pengcheng Xiong

> query on view results fails with table not found error if view is created 
> with subquery alias (CTE).
> 
>
> Key: HIVE-10698
> URL: https://issues.apache.org/jira/browse/HIVE-10698
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> To reproduce it, 
> {code}
> use bugtest;
> create table basetb(id int, name string);
> create view testv1 as
> with subtb as (select id, name from bugtest.basetb)
> select id from subtb;
> use castest;
> explain select * from bugtest.testv1;
> hive> explain select * from bugtest.testv1;
> FAILED: SemanticException Line 2:15 Table not found 'subtb' in definition of 
> VIEW testv1 [
> with subtb as (select id, name from bugtest.basetb)
> select id from `bugtest`.`subtb`
> ] used as testv1 at Line 1:22
> Note that there is a database prefix `bugtest`.`subtb`
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2915) up-to-date metastore ER diagram

2015-05-13 Thread Devarajan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542359#comment-14542359
 ] 

Devarajan commented on HIVE-2915:
-

The attached pdf does not contain the column names for the tables, hence it's 
tough to infer things. Can we prepare an ER diagram like the old one in the 
wiki? I have attached the link below.

link: 
https://issues.apache.org/jira/secure/attachment/12471108/HiveMetaStore.pdf

> up-to-date metastore ER diagram
> ---
>
> Key: HIVE-2915
> URL: https://issues.apache.org/jira/browse/HIVE-2915
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.8.1
> Environment: Ubuntu, MySQL workbech
>Reporter: Mahsa Mofidpoor
>Priority: Trivial
>  Labels: ER, Hive, metadata, metastore
> Fix For: 0.8.1
>
> Attachments: HiveMetastoreCompleterelations.pdf
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The up-to-date ER diagram for metastore is not available in Hive wiki. Now 
> there are 31 tables residing in metastore including the ones for indexes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10644) create SHA2 UDF

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542339#comment-14542339
 ] 

Hive QA commented on HIVE-10644:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732488/HIVE-10644.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8934 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3878/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3878/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732488 - PreCommit-HIVE-TRUNK-Build

> create SHA2 UDF
> ---
>
> Key: HIVE-10644
> URL: https://issues.apache.org/jira/browse/HIVE-10644
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10644.1.patch, HIVE-10644.2.patch
>
>
> Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and 
> SHA-512). The first argument is the cleartext string to be hashed. The second 
> argument indicates the desired bit length of the result, which must have a 
> value of 224, 256, 384, 512, or 0 (which is equivalent to 256). If either 
> argument is NULL or the hash length is not one of the permitted values, the 
> return value is NULL.
> MySQL also has SHA2 function 
> https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html#function_sha2
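
The contract described above can be sketched with Python's hashlib (a sketch of
the semantics, not the Hive UDF code):

```python
import hashlib

def sha2(cleartext, bitlen):
    """SHA-2 family hash per the description above: bit length must be
    224, 256, 384, 512, or 0 (treated as 256); a NULL argument or an
    unsupported length returns NULL (None)."""
    if cleartext is None or bitlen is None:
        return None
    algos = {0: "sha256", 224: "sha224", 256: "sha256",
             384: "sha384", 512: "sha512"}
    if bitlen not in algos:
        return None
    return hashlib.new(algos[bitlen], cleartext.encode("utf-8")).hexdigest()

assert sha2("abc", 0) == sha2("abc", 256)
assert sha2("abc", 100) is None and sha2(None, 256) is None
```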



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10548) Remove dependency to s3 repository in root pom

2015-05-13 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10548:

Fix Version/s: 1.2.0

> Remove dependency to s3 repository in root pom
> --
>
> Key: HIVE-10548
> URL: https://issues.apache.org/jira/browse/HIVE-10548
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Szehon Ho
>Assignee: Chengxiang Li
> Fix For: 1.2.0, 1.3.0
>
> Attachments: HIVE-10548.2.patch, HIVE-10548.2.patch, HIVE-10548.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10548) Remove dependency to s3 repository in root pom

2015-05-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542224#comment-14542224
 ] 

Xuefu Zhang commented on HIVE-10548:


[~sushanth], this artifact is only needed for testing, not for the build. 
Thus, I think it's okay to have it there. This is different from what was 
removed via this JIRA. Also, in the previous release, we had a similar site 
for this, but we didn't have a private maven repo for build artifacts.

Of course, it's nice if there is a public (apache) place where we can host 
binaries for testing, but we don't have it at the moment.

> Remove dependency to s3 repository in root pom
> --
>
> Key: HIVE-10548
> URL: https://issues.apache.org/jira/browse/HIVE-10548
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Szehon Ho
>Assignee: Chengxiang Li
> Fix For: 1.3.0
>
> Attachments: HIVE-10548.2.patch, HIVE-10548.2.patch, HIVE-10548.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10636) CASE comparison operator rotation optimization

2015-05-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542200#comment-14542200
 ] 

Hive QA commented on HIVE-10636:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12732486/HIVE-10636.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8921 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3877/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3877/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3877/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12732486 - PreCommit-HIVE-TRUNK-Build

> CASE comparison operator rotation optimization
> --
>
> Key: HIVE-10636
> URL: https://issues.apache.org/jira/browse/HIVE-10636
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10636.1.patch, HIVE-10636.2.patch, 
> HIVE-10636.3.patch, HIVE-10636.patch
>
>
> Step 1 as outlined in description of HIVE-9644



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10697) ObjecInspectorConvertors#UnionConvertor does a faulty conversion

2015-05-13 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542182#comment-14542182
 ] 

Swarnim Kulkarni commented on HIVE-10697:
-

Also, it seems we are missing unit tests for the UnionConverter; they should 
be added as part of this JIRA.

> ObjecInspectorConvertors#UnionConvertor does a faulty conversion
> 
>
> Key: HIVE-10697
> URL: https://issues.apache.org/jira/browse/HIVE-10697
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
>
> Currently the UnionConvertor in the ObjectInspectorConvertors class has an 
> issue with the convert method where it attempts to convert the 
> objectinspector itself instead of converting the field.[1]. This should be 
> changed to convert the field itself. This could result in a 
> ClassCastException as shown below:
> {code}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector 
> cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString
>   at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>   ... 9 more
> {code}
> [1] 
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10548) Remove dependency to s3 repository in root pom

2015-05-13 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542179#comment-14542179
 ] 

Sushanth Sowmyan commented on HIVE-10548:
-

[~xuefuz], thanks for the catch! I do think it makes sense to include this. 
Also, on that note, I have a further question before I pick this - In 
itests/pom.xml, in the spark-test profile, there is a download-spark goal which 
downloads the following:

{noformat}
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz
{noformat}

Is there a better place to download this from, or would it make more sense to 
remove this section for the 1.2 release?

> Remove dependency to s3 repository in root pom
> --
>
> Key: HIVE-10548
> URL: https://issues.apache.org/jira/browse/HIVE-10548
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Szehon Ho
>Assignee: Chengxiang Li
> Fix For: 1.3.0
>
> Attachments: HIVE-10548.2.patch, HIVE-10548.2.patch, HIVE-10548.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10697) ObjecInspectorConvertors#UnionConvertor does a faulty conversion

2015-05-13 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542178#comment-14542178
 ] 

Swarnim Kulkarni commented on HIVE-10697:
-

[~hsubramaniyan] Seems like this UnionConvertor was added as part of 
HIVE-5202. Could you please help me understand why we are doing a conversion 
from one objectinspector to another in this code?[1] Aren't the 
ObjectInspectorConverters supposed to convert from one object to another 
given the two OIs, not the OIs themselves?

[1] 
https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466

> ObjecInspectorConvertors#UnionConvertor does a faulty conversion
> 
>
> Key: HIVE-10697
> URL: https://issues.apache.org/jira/browse/HIVE-10697
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
>
> Currently the UnionConvertor in the ObjectInspectorConvertors class has an 
> issue with the convert method where it attempts to convert the 
> objectinspector itself instead of converting the field.[1]. This should be 
> changed to convert the field itself. This could result in a 
> ClassCastException as shown below:
> {code}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector 
> cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString
>   at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>   ... 9 more
> {code}
> [1] 
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-13 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10623:

Attachment: HIVE-10623.2.patch

Rebased the code and retrying the CI.

> Implement hive cli options using beeline functionality
> --
>
> Key: HIVE-10623
> URL: https://issues.apache.org/jira/browse/HIVE-10623
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, HIVE-10623.patch
>
>
> We need to support the original hive cli options for the purpose of backwards 
> compatibility. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >