[jira] [Updated] (HIVE-10957) QueryPlan's start time is incorrect in certain cases

2015-06-05 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10957:
-
Attachment: HIVE-10957.2.patch

Thanks [~hagleitn] for creating the JIRA.

Uploading patch #2 since #1 was generated by IntelliJ. Not sure if that one can 
be picked up by the PreCommit test queue.

> QueryPlan's start time is incorrect in certain cases
> 
>
> Key: HIVE-10957
> URL: https://issues.apache.org/jira/browse/HIVE-10957
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Wei Zheng
> Attachments: HIVE-10957.1.patch, HIVE-10957.2.patch
>
>
> In some cases the start time of the previous query is used mistakenly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10673) Dynamically partitioned hash join for Tez

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575603#comment-14575603
 ] 

Hive QA commented on HIVE-10673:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738066/HIVE-10673.4.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9004 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4195/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4195/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4195/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738066 - PreCommit-HIVE-TRUNK-Build

> Dynamically partitioned hash join for Tez
> -
>
> Key: HIVE-10673
> URL: https://issues.apache.org/jira/browse/HIVE-10673
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-10673.1.patch, HIVE-10673.2.patch, 
> HIVE-10673.3.patch, HIVE-10673.4.patch
>
>
> Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
> reducer are unsorted.





[jira] [Updated] (HIVE-10957) QueryPlan's start time is incorrect in certain cases

2015-06-05 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10957:
--
Attachment: HIVE-10957.1.patch

Uploading patch for [~wzheng]

> QueryPlan's start time is incorrect in certain cases
> 
>
> Key: HIVE-10957
> URL: https://issues.apache.org/jira/browse/HIVE-10957
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Wei Zheng
> Attachments: HIVE-10957.1.patch
>
>
> In some cases the start time of the previous query is used mistakenly.





[jira] [Commented] (HIVE-10957) QueryPlan's start time is incorrect in certain cases

2015-06-05 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575584#comment-14575584
 ] 

Gunther Hagleitner commented on HIVE-10957:
---

+1

> QueryPlan's start time is incorrect in certain cases
> 
>
> Key: HIVE-10957
> URL: https://issues.apache.org/jira/browse/HIVE-10957
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Wei Zheng
> Attachments: HIVE-10957.1.patch
>
>
> In some cases the start time of the previous query is used mistakenly.





[jira] [Updated] (HIVE-10957) QueryPlan's start time is incorrect in certain cases

2015-06-05 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10957:
--
Description: In some cases the start time of the previous query is used 
mistakenly.

> QueryPlan's start time is incorrect in certain cases
> 
>
> Key: HIVE-10957
> URL: https://issues.apache.org/jira/browse/HIVE-10957
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Wei Zheng
>
> In some cases the start time of the previous query is used mistakenly.





[jira] [Updated] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-05 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10941:
--
Fix Version/s: 1.2.1

> Provide option to disable spark tests outside itests
> 
>
> Key: HIVE-10941
> URL: https://issues.apache.org/jira/browse/HIVE-10941
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 1.2.1
>
> Attachments: HIVE-10941.1.patch, HIVE-10941.2.patch
>
>
> HIVE-10477 provided an option to disable the spark module; however, we missed the 
> following files that are outside the itests directory, i.e. we need to couple the 
> option with disabling the following tests as well:
> {code}
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
> org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
> {code}





[jira] [Commented] (HIVE-10949) Disable hive-minikdc tests in Windows

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575567#comment-14575567
 ] 

Hive QA commented on HIVE-10949:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738050/HIVE-10949.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9000 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4194/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4194/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4194/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738050 - PreCommit-HIVE-TRUNK-Build

> Disable hive-minikdc tests in Windows
> -
>
> Key: HIVE-10949
> URL: https://issues.apache.org/jira/browse/HIVE-10949
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10949.1.patch
>
>
> hive-minikdc tests need to be disabled on Windows, since we don't yet have 
> Kerberos support for a Hadoop cluster running on Windows.





[jira] [Updated] (HIVE-10956) HS2 leaks HMS connections

2015-06-05 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10956:
---
Attachment: HIVE-10956.1.patch

> HS2 leaks HMS connections
> -
>
> Key: HIVE-10956
> URL: https://issues.apache.org/jira/browse/HIVE-10956
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 1.3.0
>
> Attachments: HIVE-10956.1.patch
>
>
> HS2 uses a ThreadLocal in class Hive to cache the HMS client. When the thread 
> dies, the HMS client is not closed, so the connection to the HMS is leaked.
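
A minimal sketch of the leak pattern described above, using hypothetical stand-in names rather than the actual Hive/HMS classes:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the HMS client (not the real Hive classes):
// each construction "opens" a connection that only close() releases.
class FakeMetaStoreClient {
    static final AtomicInteger openConnections = new AtomicInteger();
    FakeMetaStoreClient() { openConnections.incrementAndGet(); }
    void close() { openConnections.decrementAndGet(); }
}

public class ThreadLocalLeakDemo {
    // Mirrors the pattern described above: each thread lazily creates
    // and caches its own client in a ThreadLocal.
    static final ThreadLocal<FakeMetaStoreClient> CLIENT =
            ThreadLocal.withInitial(FakeMetaStoreClient::new);

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> CLIENT.get()); // opens a connection
        worker.start();
        worker.join(); // the worker thread is now dead...
        // ...but nothing ever called close() on its cached client, so the
        // connection it opened is still counted as open: one leak per dead thread.
        System.out.println(FakeMetaStoreClient.openConnections.get()); // prints 1
    }
}
```

Since the ThreadLocal offers no hook when its owning thread exits, a fix has to close the client explicitly (or pool connections) rather than rely on per-thread caching alone.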





[jira] [Updated] (HIVE-10647) Hive on LLAP: Limit HS2 from overwhelming LLAP

2015-06-05 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10647:
--
Attachment: HIVE-10647.2.patch

Changed to use a semaphore. I do not see a specific need to grow the size 
beyond 1 in the default case because:

1. In CLI mode, the user basically never lets go of the session, and 1 is all 
they need.
2. In HS2, the user needs to configure this correctly anyway.

I can change it to a large number but don't really see a good reason for it.
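
The admission-control idea discussed above can be sketched as follows; this is purely illustrative with hypothetical names, not the actual HIVE-10647 patch:

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: admitting queries through a Semaphore so that at
// most N queries run against LLAP concurrently.
public class QueryGate {
    private final Semaphore slots;

    // A default of 1 matches the CLI case: one interactive session at a time.
    QueryGate(int maxConcurrentQueries) {
        slots = new Semaphore(maxConcurrentQueries, true); // fair ordering
    }

    // Blocks until a slot is free, runs the query, then frees the slot.
    void runQuery(Runnable query) {
        slots.acquireUninterruptibly();
        try {
            query.run();
        } finally {
            slots.release();
        }
    }

    public static void main(String[] args) {
        QueryGate gate = new QueryGate(1);
        StringBuilder log = new StringBuilder();
        gate.runQuery(() -> log.append("q1;"));
        gate.runQuery(() -> log.append("q2;")); // admitted only after q1 releases
        System.out.println(log); // prints q1;q2;
    }
}
```

The fair semaphore keeps admission ordered, and the finally block guarantees the slot is returned even if a query throws.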

> Hive on LLAP: Limit HS2 from overwhelming LLAP
> --
>
> Key: HIVE-10647
> URL: https://issues.apache.org/jira/browse/HIVE-10647
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10647.1.patch, HIVE-10647.2.patch
>
>
> We want to restrict the number of queries that flow through LLAP. 





[jira] [Updated] (HIVE-8931) Test TestAccumuloCliDriver is not completing

2015-06-05 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HIVE-8931:
-
Attachment: HIVE-8931.001.patch

Finally circling back around on this.

Gave up trying to make the tests work in itests/qtests and moved them to 
itests/qtests-accumulo, where I can properly rip out all of the thrift 0.9.2 
dependencies that sneak in all over.

This gets the tests running, but sadly something appears to have broken since 
these stopped running, as 3 of 7 are now failing. I will look into why they are 
broken.

> Test TestAccumuloCliDriver is not completing
> 
>
> Key: HIVE-8931
> URL: https://issues.apache.org/jira/browse/HIVE-8931
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Josh Elser
> Fix For: 0.12.1
>
> Attachments: HIVE-8931.001.patch
>
>
> Tests are taking 3 hours due to {{TestAccumuloCliDriver}} not finishing.
> Logs:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1848/failed/TestAccumuloCliDriver/





[jira] [Commented] (HIVE-10728) deprecate unix_timestamp(void) and make it deterministic

2015-06-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575534#comment-14575534
 ] 

Lefty Leverenz commented on HIVE-10728:
---

Doc note:  Adding a TODOC1.3 label.  Document this in the Date Functions 
section of UDFs.

* [Hive Operators and UDFs -- Date Functions | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]

> deprecate unix_timestamp(void) and make it deterministic
> 
>
> Key: HIVE-10728
> URL: https://issues.apache.org/jira/browse/HIVE-10728
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10728.01.patch, HIVE-10728.02.patch, 
> HIVE-10728.03.patch, HIVE-10728.patch
>
>
> We have a proper current_timestamp function that is not evaluated at runtime.
> The behavior of unix_timestamp(void) is both surprising and prevents some 
> optimizations on the other overload, since the function becomes 
> non-deterministic.





[jira] [Updated] (HIVE-10728) deprecate unix_timestamp(void) and make it deterministic

2015-06-05 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10728:
--
Labels: TODOC1.3  (was: )

> deprecate unix_timestamp(void) and make it deterministic
> 
>
> Key: HIVE-10728
> URL: https://issues.apache.org/jira/browse/HIVE-10728
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10728.01.patch, HIVE-10728.02.patch, 
> HIVE-10728.03.patch, HIVE-10728.patch
>
>
> We have a proper current_timestamp function that is not evaluated at runtime.
> The behavior of unix_timestamp(void) is both surprising and prevents some 
> optimizations on the other overload, since the function becomes 
> non-deterministic.





[jira] [Commented] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575533#comment-14575533
 ] 

Hive QA commented on HIVE-10911:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738043/HIVE-10911.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9002 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4193/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4193/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4193/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738043 - PreCommit-HIVE-TRUNK-Build

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, with hiredate of date type, an exception 
> is thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It would be valuable to support this type, with the number of days as the 
> value difference.
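
The intended semantics, a RANGE window over dates where the value difference is a day count, can be sketched outside Hive with java.time; this is a hypothetical illustration, not the patch itself:

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DateRangeWindow {
    // True if 'candidate' falls within 'precedingDays' days before 'current',
    // i.e. the RANGE n PRECEDING test with days as the value difference.
    static boolean inWindow(LocalDate current, LocalDate candidate, long precedingDays) {
        long diff = ChronoUnit.DAYS.between(candidate, current);
        return diff >= 0 && diff <= precedingDays;
    }

    public static void main(String[] args) {
        LocalDate hired = LocalDate.of(1981, 6, 9);
        // 1981-04-02 is 68 days earlier: inside a 90-day preceding window.
        System.out.println(inWindow(hired, LocalDate.of(1981, 4, 2), 90));   // prints true
        // 1980-12-17 is 174 days earlier: outside the window.
        System.out.println(inWindow(hired, LocalDate.of(1980, 12, 17), 90)); // prints false
    }
}
```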





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575529#comment-14575529
 ] 

Lefty Leverenz commented on HIVE-10761:
---

Doc note:  This adds several configuration parameters, and usage information 
should also be documented.

#  hive.metastore.metrics.enabled
#  hive.server2.metrics.enabled
#  hive.service.metrics.class
#  hive.service.metrics.reporter
#  hive.service.metrics.file.location
#  hive.service.metrics.file.frequency




> Create codahale-based metrics system for Hive
> -
>
> Key: HIVE-10761
> URL: https://issues.apache.org/jira/browse/HIVE-10761
> Project: Hive
>  Issue Type: New Feature
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
> HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
> hms-metrics.json
>
>
> There is a current Hive metrics system that hooks up to JMX reporting, but 
> all its measurements and models are custom.
> This is to make another metrics system based on Codahale (i.e. 
> Yammer, Dropwizard), which has the following advantages:
> * A well-defined metric model for frequently needed metrics (e.g. JVM metrics)
> * Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate, 
> etc.)
> * Built-in reporting frameworks such as JMX, Console, Log, and a JSON webserver
> It is used by many projects, including several Apache projects such as Oozie.  
> Overall, monitoring tools should find it easier to understand these common 
> metric, measurement, and reporting models.
> The existing metrics subsystem will be kept and can be enabled if backward 
> compatibility is desired.
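
To illustrate the kind of well-defined measurements mentioned above, here is a plain-Java sketch of the max/mean/stddev snapshot a metrics library typically reports; hypothetical names only, not the Codahale/Dropwizard API:

```java
import java.util.Arrays;

public class MetricSnapshot {
    // The standard measurements over a set of recorded values.
    static double max(double[] v)  { return Arrays.stream(v).max().orElse(0); }
    static double mean(double[] v) { return Arrays.stream(v).average().orElse(0); }
    static double stddev(double[] v) {
        double m = mean(v);
        double var = Arrays.stream(v).map(x -> (x - m) * (x - m)).average().orElse(0);
        return Math.sqrt(var); // population standard deviation
    }

    public static void main(String[] args) {
        double[] latenciesMs = {10, 20, 30};
        System.out.println(max(latenciesMs));    // prints 30.0
        System.out.println(mean(latenciesMs));   // prints 20.0
        System.out.println(stddev(latenciesMs)); // population stddev, about 8.165
    }
}
```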





[jira] [Updated] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-05 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10761:
--
Labels: TODOC1.3  (was: )

> Create codahale-based metrics system for Hive
> -
>
> Key: HIVE-10761
> URL: https://issues.apache.org/jira/browse/HIVE-10761
> Project: Hive
>  Issue Type: New Feature
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
> HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
> hms-metrics.json
>
>
> There is a current Hive metrics system that hooks up to JMX reporting, but 
> all its measurements and models are custom.
> This is to make another metrics system based on Codahale (i.e. 
> Yammer, Dropwizard), which has the following advantages:
> * A well-defined metric model for frequently needed metrics (e.g. JVM metrics)
> * Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate, 
> etc.)
> * Built-in reporting frameworks such as JMX, Console, Log, and a JSON webserver
> It is used by many projects, including several Apache projects such as Oozie.  
> Overall, monitoring tools should find it easier to understand these common 
> metric, measurement, and reporting models.
> The existing metrics subsystem will be kept and can be enabled if backward 
> compatibility is desired.





[jira] [Commented] (HIVE-10821) Beeline-CLI: Implement CLI source command using Beeline functionality

2015-06-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575511#comment-14575511
 ] 

Lefty Leverenz commented on HIVE-10821:
---

Doc note:  Adding a link to HIVE-10810:  Document Beeline/CLI changes.

> Beeline-CLI: Implement CLI source command using Beeline functionality
> -
>
> Key: HIVE-10821
> URL: https://issues.apache.org/jira/browse/HIVE-10821
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10821.1-beeline-cli.patch, 
> HIVE-10821.1-beeline-cli.patch, HIVE-10821.2-beeline-cli.patch, 
> HIVE-10821.3-beeline-cli.patch, HIVE-10821.patch
>
>






[jira] [Updated] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS

2015-06-05 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10910:
--
Attachment: HIVE-10910.2.patch

> Alter table drop partition queries in encrypted zone failing to remove data 
> from HDFS
> -
>
> Key: HIVE-10910
> URL: https://issues.apache.org/jira/browse/HIVE-10910
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Eugene Koifman
> Attachments: HIVE-10910.2.patch, HIVE-10910.patch
>
>
> An ALTER TABLE query that drops a partition removes the partition's metadata 
> but fails to remove the data from HDFS:
> hive> create table table_1(name string, age int, gpa double) partitioned by 
> (b string) stored as textfile;
> OK
> Time taken: 0.732 seconds
> hive> alter table table_1 add partition (b='2010-10-10');
> OK
> Time taken: 0.496 seconds
> hive> show partitions table_1;
> OK
> b=2010-10-10
> Time taken: 0.781 seconds, Fetched: 1 row(s)
> hive> alter table table_1 drop partition (b='2010-10-10');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException 
> Failed to move to trash: 
> hdfs://:8020//table_1/b=2010-10-10
> hive> show partitions table_1;
> OK
> Time taken: 0.622 seconds





[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-05 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575506#comment-14575506
 ] 

Laljo John Pullokkaran commented on HIVE-10841:
---

Attached a modified patch.
This patch addresses correctness and predicate pushdown to the mapper.

> [WHERE col is not null] does not work sometimes for queries with many JOIN 
> statements
> -
>
> Key: HIVE-10841
> URL: https://issues.apache.org/jira/browse/HIVE-10841
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10841.1.patch, HIVE-10841.patch
>
>
> The result from the following SELECT query is 3 rows but it should be 1 row.
> I checked it in MySQL - it returned 1 row.
> To reproduce the issue in Hive
> 1. prepare tables
> {code}
> drop table if exists L;
> drop table if exists LA;
> drop table if exists FR;
> drop table if exists A;
> drop table if exists PI;
> drop table if exists acct;
> create table L as select 4436 id;
> create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
> create table FR as select 4436 loan_id;
> create table A as select 4748 id;
> create table PI as select 4415 id;
> create table acct as select 4748 aid, 10 acc_n, 122 brn;
> insert into table acct values(4748, null, null);
> insert into table acct values(4748, null, null);
> {code}
> 2. run SELECT query
> {code}
> select
>   acct.ACC_N,
>   acct.brn
> FROM L
> JOIN LA ON L.id = LA.loan_id
> JOIN FR ON L.id = FR.loan_id
> JOIN A ON LA.aid = A.id
> JOIN PI ON PI.id = LA.pi_id
> JOIN acct ON A.id = acct.aid
> WHERE
>   L.id = 4436
>   and acct.brn is not null;
> {code}
> the result is 3 rows
> {code}
> 10    122
> NULL  NULL
> NULL  NULL
> {code}
> but it should be 1 row
> {code}
> 10    122
> {code}
> 2.1 "explain select ..." output for hive-1.3.0 MR
> {code}
> STAGE DEPENDENCIES:
>   Stage-12 is a root stage
>   Stage-9 depends on stages: Stage-12
>   Stage-0 depends on stages: Stage-9
> STAGE PLANS:
>   Stage: Stage-12
> Map Reduce Local Work
>   Alias -> Map Local Tables:
> a 
>   Fetch Operator
> limit: -1
> acct 
>   Fetch Operator
> limit: -1
> fr 
>   Fetch Operator
> limit: -1
> l 
>   Fetch Operator
> limit: -1
> pi 
>   Fetch Operator
> limit: -1
>   Alias -> Map Local Operator Tree:
> a 
>   TableScan
> alias: a
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: id is not null (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> acct 
>   TableScan
> alias: acct
> Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: aid is not null (type: boolean)
>   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> fr 
>   TableScan
> alias: fr
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (loan_id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> l 
>   TableScan
> alias: l
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> pi 
>   TableScan
> alias: pi
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Co

[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-06-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575502#comment-14575502
 ] 

Lefty Leverenz commented on HIVE-10934:
---

Doc note:  Adding a TODOC1.2 label because the documentation needs to be 
updated to show that this is available in version 1.2.1, not 1.2.0.

* [LanguageManual DDL -- Drop Partitions | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropPartitions]

> Restore support for DROP PARTITION PURGE
> 
>
> Key: HIVE-10934
> URL: https://issues.apache.org/jira/browse/HIVE-10934
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>  Labels: TODOC1.2
> Fix For: 1.2.1
>
> Attachments: HIVE-10934.patch
>
>
> HIVE-9086 added support for PURGE in 
> {noformat}
> ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = "sayonara") 
> IGNORE PROTECTION PURGE;
> {noformat}
> It looks like this was accidentally lost in HIVE-10228.





[jira] [Updated] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-05 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10841:
--
Attachment: HIVE-10841.1.patch

> [WHERE col is not null] does not work sometimes for queries with many JOIN 
> statements
> -
>
> Key: HIVE-10841
> URL: https://issues.apache.org/jira/browse/HIVE-10841
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10841.1.patch, HIVE-10841.patch
>
>
> The result from the following SELECT query is 3 rows but it should be 1 row.
> I checked it in MySQL - it returned 1 row.
> To reproduce the issue in Hive
> 1. prepare tables
> {code}
> drop table if exists L;
> drop table if exists LA;
> drop table if exists FR;
> drop table if exists A;
> drop table if exists PI;
> drop table if exists acct;
> create table L as select 4436 id;
> create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
> create table FR as select 4436 loan_id;
> create table A as select 4748 id;
> create table PI as select 4415 id;
> create table acct as select 4748 aid, 10 acc_n, 122 brn;
> insert into table acct values(4748, null, null);
> insert into table acct values(4748, null, null);
> {code}
> 2. run SELECT query
> {code}
> select
>   acct.ACC_N,
>   acct.brn
> FROM L
> JOIN LA ON L.id = LA.loan_id
> JOIN FR ON L.id = FR.loan_id
> JOIN A ON LA.aid = A.id
> JOIN PI ON PI.id = LA.pi_id
> JOIN acct ON A.id = acct.aid
> WHERE
>   L.id = 4436
>   and acct.brn is not null;
> {code}
> the result is 3 rows
> {code}
> 10    122
> NULL  NULL
> NULL  NULL
> {code}
> but it should be 1 row
> {code}
> 10    122
> {code}
> 2.1 "explain select ..." output for hive-1.3.0 MR
> {code}
> STAGE DEPENDENCIES:
>   Stage-12 is a root stage
>   Stage-9 depends on stages: Stage-12
>   Stage-0 depends on stages: Stage-9
> STAGE PLANS:
>   Stage: Stage-12
> Map Reduce Local Work
>   Alias -> Map Local Tables:
> a 
>   Fetch Operator
> limit: -1
> acct 
>   Fetch Operator
> limit: -1
> fr 
>   Fetch Operator
> limit: -1
> l 
>   Fetch Operator
> limit: -1
> pi 
>   Fetch Operator
> limit: -1
>   Alias -> Map Local Operator Tree:
> a 
>   TableScan
> alias: a
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: id is not null (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> acct 
>   TableScan
> alias: acct
> Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: aid is not null (type: boolean)
>   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> fr 
>   TableScan
> alias: fr
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (loan_id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> l 
>   TableScan
> alias: l
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> pi 
>   TableScan
> alias: pi
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: id is not null (type: boolean)
>   

[jira] [Updated] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-06-05 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10934:
--
Labels: TODOC1.2  (was: )

> Restore support for DROP PARTITION PURGE
> 
>
> Key: HIVE-10934
> URL: https://issues.apache.org/jira/browse/HIVE-10934
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>  Labels: TODOC1.2
> Fix For: 1.2.1
>
> Attachments: HIVE-10934.patch
>
>
> HIVE-9086 added support for PURGE in 
> {noformat}
> ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = "sayonara") 
> IGNORE PROTECTION PURGE;
> {noformat}
> looks like this was accidentally lost in HIVE-10228



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-05 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10841:
--
Affects Version/s: 1.3.0

> [WHERE col is not null] does not work sometimes for queries with many JOIN 
> statements
> -
>
> Key: HIVE-10841
> URL: https://issues.apache.org/jira/browse/HIVE-10841
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10841.patch
>
>
> The result of the following SELECT query is 3 rows, but it should be 1 row.
> I checked it in MySQL - it returned 1 row.
> To reproduce the issue in Hive
> 1. prepare tables
> {code}
> drop table if exists L;
> drop table if exists LA;
> drop table if exists FR;
> drop table if exists A;
> drop table if exists PI;
> drop table if exists acct;
> create table L as select 4436 id;
> create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
> create table FR as select 4436 loan_id;
> create table A as select 4748 id;
> create table PI as select 4415 id;
> create table acct as select 4748 aid, 10 acc_n, 122 brn;
> insert into table acct values(4748, null, null);
> insert into table acct values(4748, null, null);
> {code}
> 2. run SELECT query
> {code}
> select
>   acct.ACC_N,
>   acct.brn
> FROM L
> JOIN LA ON L.id = LA.loan_id
> JOIN FR ON L.id = FR.loan_id
> JOIN A ON LA.aid = A.id
> JOIN PI ON PI.id = LA.pi_id
> JOIN acct ON A.id = acct.aid
> WHERE
>   L.id = 4436
>   and acct.brn is not null;
> {code}
> the result is 3 rows
> {code}
> 10	122
> NULL  NULL
> NULL  NULL
> {code}
> but it should be 1 row
> {code}
> 10	122
> {code}
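A quick way to sanity-check the expected single-row result is to replay the join in plain Python (illustrative only, not Hive code; the table contents come from the setup script above):

```python
# Replay of the HIVE-10841 repro query over in-memory rows.
L    = [{"id": 4436}]
LA   = [{"loan_id": 4436, "aid": 4748, "pi_id": 4415}]
FR   = [{"loan_id": 4436}]
A    = [{"id": 4748}]
PI   = [{"id": 4415}]
acct = [{"aid": 4748, "acc_n": 10,   "brn": 122},
        {"aid": 4748, "acc_n": None, "brn": None},
        {"aid": 4748, "acc_n": None, "brn": None}]

# Inner joins plus the WHERE clause, as written in the SQL above.
rows = [(ac["acc_n"], ac["brn"])
        for l in L for la in LA for fr in FR
        for a in A for pi in PI for ac in acct
        if l["id"] == la["loan_id"] and l["id"] == fr["loan_id"]
        and la["aid"] == a["id"] and pi["id"] == la["pi_id"]
        and a["id"] == ac["aid"]
        and l["id"] == 4436 and ac["brn"] is not None]

print(rows)  # expected: [(10, 122)] -- a single row
```

The `acct.brn is not null` predicate must eliminate the two all-NULL `acct` rows, which is exactly what the buggy plan fails to do.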
> 2.1 "explain select ..." output for hive-1.3.0 MR
> {code}
> STAGE DEPENDENCIES:
>   Stage-12 is a root stage
>   Stage-9 depends on stages: Stage-12
>   Stage-0 depends on stages: Stage-9
> STAGE PLANS:
>   Stage: Stage-12
> Map Reduce Local Work
>   Alias -> Map Local Tables:
> a 
>   Fetch Operator
> limit: -1
> acct 
>   Fetch Operator
> limit: -1
> fr 
>   Fetch Operator
> limit: -1
> l 
>   Fetch Operator
> limit: -1
> pi 
>   Fetch Operator
> limit: -1
>   Alias -> Map Local Operator Tree:
> a 
>   TableScan
> alias: a
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: id is not null (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> acct 
>   TableScan
> alias: acct
> Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: aid is not null (type: boolean)
>   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> fr 
>   TableScan
> alias: fr
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (loan_id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> l 
>   TableScan
> alias: l
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> pi 
>   TableScan
> alias: pi
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: id is not null (type: boolean)
>   Statistics: Nu

[jira] [Updated] (HIVE-3628) Provide a way to use counters in Hive through UDF

2015-06-05 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3628:
-
Labels: TODOC11  (was: )

> Provide a way to use counters in Hive through UDF
> -
>
> Key: HIVE-3628
> URL: https://issues.apache.org/jira/browse/HIVE-3628
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Viji
>Assignee: Navis
>Priority: Minor
>  Labels: TODOC11
> Fix For: 0.11.0
>
> Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, 
> HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, 
> HIVE-3628.D8007.6.patch
>
>
> Currently it is not possible to generate counters through UDF. We should 
> support this. 
> Pig currently allows this.
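As background, the counter-reporting pattern being requested can be sketched in a few lines of Python. This is a toy stand-in, not the actual Hive API: the `Reporter` hook and `configure` method are illustrative names only.

```python
from collections import Counter

class Reporter:
    """Toy stand-in for a MapReduce counter reporter."""
    def __init__(self):
        self.counters = Counter()

    def incr_counter(self, group, name, amount=1):
        self.counters[(group, name)] += amount

class NullSafeUpper:
    """Toy UDF that increments a counter for every NULL input."""
    def configure(self, reporter):
        # The framework would hand in the reporter hook at task setup.
        self.reporter = reporter

    def evaluate(self, value):
        if value is None:
            self.reporter.incr_counter("MyUDF", "NULL_INPUTS")
            return None
        return value.upper()

reporter = Reporter()
udf = NullSafeUpper()
udf.configure(reporter)
results = [udf.evaluate(v) for v in ["a", None, "b", None]]
print(results)                                      # ['A', None, 'B', None]
print(reporter.counters[("MyUDF", "NULL_INPUTS")])  # 2
```

The point of the pattern: counter updates flow through a hook the framework injects, so the UDF itself stays free of job-tracking plumbing.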



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF

2015-06-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575491#comment-14575491
 ] 

Lefty Leverenz commented on HIVE-3628:
--

Thanks for the reminder.  We still need to document this, so I'm adding a 
TODOC11 label.

> Provide a way to use counters in Hive through UDF
> -
>
> Key: HIVE-3628
> URL: https://issues.apache.org/jira/browse/HIVE-3628
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Viji
>Assignee: Navis
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, 
> HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, 
> HIVE-3628.D8007.6.patch
>
>
> Currently it is not possible to generate counters through UDF. We should 
> support this. 
> Pig currently allows this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575479#comment-14575479
 ] 

Hive QA commented on HIVE-10941:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738042/HIVE-10941.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9001 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4192/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4192/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4192/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738042 - PreCommit-HIVE-TRUNK-Build

> Provide option to disable spark tests outside itests
> 
>
> Key: HIVE-10941
> URL: https://issues.apache.org/jira/browse/HIVE-10941
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10941.1.patch, HIVE-10941.2.patch
>
>
> HIVE-10477 provided an option to disable the Spark module; however, we missed 
> the following files that are outside the itests directory, i.e. we need to 
> combine the option with disabling the following tests as well:
> {code}
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
> org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10906) Value based UDAF function without orderby expression throws NPE

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10906:

Attachment: HIVE-10906.2.patch

> Value based UDAF function without orderby expression throws NPE
> ---
>
> Key: HIVE-10906
> URL: https://issues.apache.org/jira/browse/HIVE-10906
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10906.2.patch, HIVE-10906.patch
>
>
> The following query throws NPE.
> {noformat}
> select key, value, min(value) over (partition by key range between unbounded 
> preceding and current row) from small;
> FAILED: NullPointerException null
> 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver 
> (SessionState.java:printError(957)) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
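The NPE arises because the value-boundary validation dereferences the order-by spec, which is absent when the window has no ORDER BY expression. A null-guarded version of that check might look like the following Python sketch (all names are illustrative, not the actual `WindowingSpec` code):

```python
class OrderSpec:
    """Minimal stand-in for an ORDER BY specification."""
    def __init__(self, expressions):
        self.expressions = expressions

def validate_value_boundary(window_frame, order_spec):
    """Reject value-based (RANGE) frames that lack a usable ORDER BY.

    window_frame mirrors the real signature's shape and is unused here.
    """
    if order_spec is None or not order_spec.expressions:
        raise ValueError(
            "Value-based windowing (RANGE) requires an ORDER BY "
            "expression; add one or use ROWS instead")
    if len(order_spec.expressions) != 1:
        raise ValueError("RANGE frames support a single ORDER BY expression")

# Without ORDER BY the check now fails with a clear error, not an NPE:
try:
    validate_value_boundary(object(), None)
except ValueError as e:
    print("rejected:", e)

validate_value_boundary(object(), OrderSpec(["key"]))  # passes
```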



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10955) CliDriver leaves tables behind at end of test run

2015-06-05 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10955:

Attachment: HIVE-10955.patch

> CliDriver leaves tables behind at end of test run 
> --
>
> Key: HIVE-10955
> URL: https://issues.apache.org/jira/browse/HIVE-10955
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure
>Affects Versions: 1.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10955.patch
>
>
> When run serially with other drivers, this causes problems



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10906) Value based UDAF function without orderby expression throws NPE

2015-06-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575422#comment-14575422
 ] 

Aihua Xu commented on HIVE-10906:
-

Thanks, [~ashutoshc]. There will be a merge conflict with HIVE-10911. Could you 
please submit that first, and then I will regenerate the test baseline?

> Value based UDAF function without orderby expression throws NPE
> ---
>
> Key: HIVE-10906
> URL: https://issues.apache.org/jira/browse/HIVE-10906
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10906.patch
>
>
> The following query throws NPE.
> {noformat}
> select key, value, min(value) over (partition by key range between unbounded 
> preceding and current row) from small;
> FAILED: NullPointerException null
> 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver 
> (SessionState.java:printError(957)) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575399#comment-14575399
 ] 

Hive QA commented on HIVE-10165:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738028/HIVE-10165.6.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9080 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4191/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4191/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4191/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738028 - PreCommit-HIVE-TRUNK-Build

> Improve hive-hcatalog-streaming extensibility and support updates and deletes.
> --
>
> Key: HIVE-10165
> URL: https://issues.apache.org/jira/browse/HIVE-10165
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Elliot West
>Assignee: Elliot West
>  Labels: streaming_api
> Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, 
> HIVE-10165.5.patch, HIVE-10165.6.patch, mutate-system-overview.png
>
>
> h3. Overview
> I'd like to extend the 
> [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
>  API so that it also supports the writing of record updates and deletes in 
> addition to the already supported inserts.
> h3. Motivation
> We have many Hadoop processes outside of Hive that merge changed facts into 
> existing datasets. Traditionally we achieve this by: reading in a 
> ground-truth dataset and a modified dataset, grouping by a key, sorting by a 
> sequence and then applying a function to determine inserted, updated, and 
> deleted rows. However, in our current scheme we must rewrite all partitions 
> that may potentially contain changes. In practice the number of mutated 
> records is very small when compared with the records contained in a 
> partition. This approach results in a number of operational issues:
> * Excessive amount of write activity required for small data changes.
> * Downstream applications cannot robustly read these datasets while they are 
> being updated.
> * Due to scale of the updates (hundreds or partitions) the scope for 
> contention is high. 
> I believe we can address this problem by instead writing only the changed 
> records to a Hive transactional table. This should drastically reduce the 
> amount of data that we need to write and also provide a means for managing 
> concurrent access to the data. Our existing merge processes can read and 
> retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to 
> an updated form of the hive-hcatalog-streaming API which will then have the 
> required data to perform an update or insert in a transactional manner. 
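The merge process described above (group by key, compare ground truth against the modified dataset, classify each row) can be sketched in a few lines of Python. This is an illustration of the classification step only; the real API operates on Hive ACID tables keyed by `RecordIdentifier`.

```python
def classify_mutations(ground_truth, modified):
    """Classify rows as inserts, updates, or deletes by primary key.

    Both arguments are dicts mapping primary key -> row dict.
    """
    inserts, updates, deletes = [], [], []
    for key, row in modified.items():
        if key not in ground_truth:
            inserts.append(row)          # new key: insert
        elif row != ground_truth[key]:
            updates.append(row)          # key present but row changed: update
    for key, row in ground_truth.items():
        if key not in modified:
            deletes.append(row)          # key vanished: delete
    return inserts, updates, deletes

old = {1: {"id": 1, "v": "a"}, 2: {"id": 2, "v": "b"}}
new = {2: {"id": 2, "v": "B"}, 3: {"id": 3, "v": "c"}}
ins, upd, dele = classify_mutations(old, new)
print(ins, upd, dele)
# [{'id': 3, 'v': 'c'}] [{'id': 2, 'v': 'B'}] [{'id': 1, 'v': 'a'}]
```

With the proposed API, only the `ins`/`upd`/`dele` rows would be written to the transactional table, instead of rewriting every affected partition.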
> h3. Benefits
> * Enables the creation of large-scale dataset merge processes  
> * Opens up Hive transactional functionality in an accessible manner to 
> processes that operate outside of Hive.
> h3. Implementation
> Our changes do not break the existing API contracts. Instead our approach has 
> been to consider the functionality offered by the existing API and our 
> proposed API as fulfilling separate and distinct use-cases. The existing API 
> is primarily focused on the task of continuously writing large volumes of new 
> data into a Hive table for near-immediate analysis. Our use-case however, is 
> concerned more with the frequent but not continuous ingestion of mutations to 
> a Hive table from some ETL merge process. Consequently we feel it is 
> justifiable to add our new functionality via an alternative set of public 
> interfaces and leave the existing API as is. This keeps both APIs clean and 
> focused at the expense of presenting additional options to potential users. 
> Wherever possible, shared implementation concerns have been factored out into 
> abstract base classes that are open to third-party extension. A detailed 
> breakdown of the changes is as follows:
> * We've introduced a public {{RecordMutator}} interface whose purpose is to 
> expose insert/update/delete operations to the user. This is a counterpart t

[jira] [Commented] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures

2015-06-05 Thread Richard Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575389#comment-14575389
 ] 

Richard Williams commented on HIVE-10410:
-

[~thejas] In the test case I use to reproduce this issue, I'm not sharing a 
session among multiple queries or setting hive.exec.parallel=true. I am, 
however, using the same session across multiple calls to the HiveServer2 Thrift 
API--in particular, I'm running many concurrent processes that each open a 
session, execute an asynchronous operation, poll HiveServer2 for the status of 
that operation until it completes, and then close their session. I think that 
what's happening is that the foreground thread is continuing to talk to the 
metastore using its Hive object while the pooled background thread is executing 
the asynchronous operation (and thus also using the same Hive object).

Right, the issue you mentioned is what I was talking about in my earlier 
comment--the patch I uploaded relies on the UserGroupInformation.doAs call in 
the submitted background operation to ensure that the current user in the 
background thread is the same as the current user in the foreground thread. 
Thus, when the background thread calls Hive.get, the call to isCurrentUserOwner 
will return false if the existing MetaStoreClient is associated with the wrong 
user and a new connection, this time associated with the correct user, will be 
created. Presumably the code that does this is the safeguard you're referring 
to?

{noformat}
  public static Hive get(HiveConf c, boolean needsRefresh) throws HiveException {
    Hive db = hiveDB.get();
    if (db == null || needsRefresh || !db.isCurrentUserOwner()) {
      if (db != null) {
        LOG.debug("Creating new db. db = " + db + ", needsRefresh = " + needsRefresh +
            ", db.isCurrentUserOwner = " + db.isCurrentUserOwner());
      }
      closeCurrent();
      c.set("fs.scheme.class", "dfs");
      Hive newdb = new Hive(c);
      hiveDB.set(newdb);
      return newdb;
    }
    return db;
  }
{noformat}

We haven't noticed any performance issues attributable to frequent 
reconnections to the metastore as a result of this change. Then again, we 
probably wouldn't; we're using Sentry and thus have HiveServer2 impersonation 
disabled, so I would expect the current user to always be the user running the 
HiveServer2 process. When I get a chance, I'll try changing 
Hive.createMetaStoreClient to wrap the client it creates using 
HiveMetaStoreClient.newSynchronizedClient instead of removing the code that 
sets the background thread's Hive object.
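The synchronized-client idea mentioned above, a proxy that serializes every call on a shared client, can be illustrated with a small Python stand-in (this is not the actual `HiveMetaStoreClient.newSynchronizedClient` code; `FakeClient` is a made-up placeholder):

```python
import threading

def synchronized_proxy(obj):
    """Return a wrapper that serializes every method call on obj."""
    lock = threading.Lock()

    class Proxy:
        def __getattr__(self, name):
            attr = getattr(obj, name)
            if not callable(attr):
                return attr
            def locked(*args, **kwargs):
                with lock:  # only one thread talks to the client at a time
                    return attr(*args, **kwargs)
            return locked

    return Proxy()

class FakeClient:
    """Placeholder for a non-thread-safe metastore client."""
    def __init__(self):
        self.calls = 0

    def show_databases(self):
        self.calls += 1
        return ["default"]

client = synchronized_proxy(FakeClient())
threads = [threading.Thread(target=client.show_databases) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(client.calls)  # 8 -- all calls completed, serialized by the lock
```

The trade-off is the same one discussed in the comment: serializing calls avoids interleaved Thrift frames on a shared connection, at the cost of concurrency on that connection.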

> Apparent race condition in HiveServer2 causing intermittent query failures
> --
>
> Key: HIVE-10410
> URL: https://issues.apache.org/jira/browse/HIVE-10410
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.1
> Environment: CDH 5.3.3
> CentOS 6.4
>Reporter: Richard Williams
> Attachments: HIVE-10410.1.patch
>
>
> On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
> occasionally trigger odd Thrift exceptions with messages such as "Read a 
> negative frame size (-2147418110)!" or "out of sequence response" in 
> HiveServer2's connections to the metastore. For certain metastore calls (for 
> example, showDatabases), these Thrift exceptions are converted to 
> MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
> from retrying these calls and thus causes the failure to bubble out to the 
> JDBC client.
> Note that as far as we can tell, this issue appears to only affect queries 
> that are submitted with the runAsync flag on TExecuteStatementReq set to true 
> (which, in practice, seems to mean all JDBC queries), and it appears to only 
> manifest when HiveServer2 is using the new HTTP transport mechanism. When 
> both these conditions hold, we are able to fairly reliably reproduce the 
> issue by spawning about 100 simple, concurrent hive queries (we have been 
> using "show databases"), two or three of which typically fail. However, when 
> either of these conditions do not hold, we are no longer able to reproduce 
> the issue.
> Some example stack traces from the HiveServer2 logs:
> {noformat}
> 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: 
> org.apache.thrift.transport.TTransportException Read a negative frame size 
> (-2147418110)!
> org.apache.thrift.transport.TTransportException: Read a negative frame size 
> (-2147418110)!
> at 
> org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
> at 
> org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
> at 
> org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
> at org.apache.th

[jira] [Commented] (HIVE-10900) Fix the indeterministic stats for some hive queries

2015-06-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575372#comment-14575372
 ] 

Ashutosh Chauhan commented on HIVE-10900:
-

+1

> Fix the indeterministic stats for some hive queries 
> 
>
> Key: HIVE-10900
> URL: https://issues.apache.org/jira/browse/HIVE-10900
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-10900.01.patch
>
>
> If we do not run compute stats for a table and then we do some operation on 
> that table, we will get different stats numbers when we run explain. The main 
> reason is due to the different OS/FS configurations that Hive Stats depends 
> on when there are no table stats. A simple fix is to add compute stats for 
> those indeterministic stats.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10900) Fix the indeterministic stats for some hive queries

2015-06-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10900:
---
Attachment: HIVE-10900.01.patch

Temporary fix for the Accumulo stats. [~ashutoshc], could you please take a look? 
Also cc'ing [~jpullokkaran].

> Fix the indeterministic stats for some hive queries 
> 
>
> Key: HIVE-10900
> URL: https://issues.apache.org/jira/browse/HIVE-10900
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-10900.01.patch
>
>
> If we do not run compute stats for a table and then we do some operation on 
> that table, we will get different stats numbers when we run explain. The main 
> reason is due to the different OS/FS configurations that Hive Stats depends 
> on when there are no table stats. A simple fix is to add compute stats for 
> those indeterministic stats.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10827) LLAP: support parallel query compilation in HS2

2015-06-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575336#comment-14575336
 ] 

Lefty Leverenz commented on HIVE-10827:
---

Document this or HIVE-4239 (if this issue gets reverted after committing 
HIVE-4239) -- new configuration parameter *hive.driver.parallel.compilation*.

Linking to HIVE-9850 for llap documentation.

> LLAP: support parallel query compilation in HS2
> ---
>
> Key: HIVE-10827
> URL: https://issues.apache.org/jira/browse/HIVE-10827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key & values

2015-06-05 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10929:
--
Attachment: HIVE-10929.4.patch

> In Tez mode,dynamic partitioning query with union all fails at 
> moveTask,Invalid partition key & values
> --
>
> Key: HIVE-10929
> URL: https://issues.apache.org/jira/browse/HIVE-10929
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10929.1.patch, HIVE-10929.2.patch, 
> HIVE-10929.3.patch, HIVE-10929.4.patch
>
>
> {code}
> create table dummy(i int);
> insert into table dummy values (1);
> select * from dummy;
> create table partunion1(id1 int) partitioned by (part1 string);
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.execution.engine=tez;
> explain insert into table partunion1 partition(part1)
> select temps.* from (
> select 1 as id1, '2014' as part1 from dummy 
> union all 
> select 2 as id1, '2014' as part1 from dummy ) temps;
> insert into table partunion1 partition(part1)
> select temps.* from (
> select 1 as id1, '2014' as part1 from dummy 
> union all 
> select 2 as id1, '2014' as part1 from dummy ) temps;
> select * from partunion1;
> {code}
> fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-06-05 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575327#comment-14575327
 ] 

Vaibhav Gumashta commented on HIVE-10453:
-

Committed to 1.2.1.

> HS2 leaking open file descriptors when using UDFs
> -
>
> Key: HIVE-10453
> URL: https://issues.apache.org/jira/browse/HIVE-10453
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 1.2.1, 2.0.0
>
> Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch
>
>
> 1. Create a custom function by:
> CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
> 2. Create a simple JDBC client; just 
> connect and 
> run a simple query which uses the function, such as:
> select myfunc(col1) from sometable
> 3. Disconnect.
> Check open files for HiveServer2 by:
> lsof -p HSProcID | grep myudf.jar
> You will see the leak as:
> {noformat}
> java  28718 ychen  txt  REG1,4741 212977666 
> /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
> java  28718 ychen  330r REG1,4741 212977666 
> /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
> {noformat}
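The attached patches contain the actual fix, which is not reproduced in this thread. A leak like the one shown above typically comes from a jar-backed classloader that is never closed; the following is only a minimal, hedged sketch of releasing the jar's file handle (the path and class name are illustrative, not Hive code):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class UdfLoaderSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical: load a UDF jar, then close the loader so the
        // underlying file descriptor is released (URLClassLoader.close()
        // exists since Java 7). Without the close, lsof keeps showing the jar.
        URLClassLoader loader = new URLClassLoader(
                new URL[] { new URL("file:///tmp/myudf.jar") });
        try {
            // ... resolve and use UDF classes here ...
        } finally {
            loader.close();
        }
        System.out.println("closed");
    }
}
```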



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-06-05 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10453:

Fix Version/s: 1.2.1

> HS2 leaking open file descriptors when using UDFs
> -
>
> Key: HIVE-10453
> URL: https://issues.apache.org/jira/browse/HIVE-10453
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 1.2.1, 2.0.0
>
> Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch
>
>
> 1. Create a custom function by:
> CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
> 2. Create a simple JDBC client; just 
> connect and 
> run a simple query which uses the function, such as:
> select myfunc(col1) from sometable
> 3. Disconnect.
> Check open files for HiveServer2 by:
> lsof -p HSProcID | grep myudf.jar
> You will see the leak as:
> {noformat}
> java  28718 ychen  txt  REG1,4741 212977666 
> /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
> java  28718 ychen  330r REG1,4741 212977666 
> /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters

2015-06-05 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575310#comment-14575310
 ] 

Chaoyu Tang commented on HIVE-7193:
---

[~ngangam] Overall, the patch looks good to me. I still have some questions; 
could you help clarify them? Thanks

> Hive should support additional LDAP authentication parameters
> -
>
> Key: HIVE-7193
> URL: https://issues.apache.org/jira/browse/HIVE-7193
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Mala Chikka Kempanna
>Assignee: Naveen Gangam
> Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.patch, 
> LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx
>
>
> Currently Hive has only the following authentication parameters for LDAP 
> authentication for HiveServer2:
> <property>
>   <name>hive.server2.authentication</name>
>   <value>LDAP</value>
> </property>
> <property>
>   <name>hive.server2.authentication.ldap.url</name>
>   <value>ldap://our_ldap_address</value>
> </property>
> We need to include other LDAP properties as part of Hive LDAP authentication, 
> like below:
> a group search base -> dc=domain,dc=com 
> a group search filter -> member={0} 
> a user search base -> dc=domain,dc=com 
> a user search filter -> sAMAccountName={0} 
> a list of valid user groups -> group1,group2,group3 
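As a hedged sketch only, the requested options might look like the following in hive-site.xml; the property keys below are illustrative placeholders invented for this example, not the final names that a patch would introduce:

```xml
<!-- Illustrative sketch: hypothetical LDAP filtering properties -->
<property>
  <name>hive.server2.authentication.ldap.userFilter</name>
  <value>sAMAccountName={0}</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupFilter</name>
  <value>member={0}</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>dc=domain,dc=com</value>
</property>
```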



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-06-05 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575314#comment-14575314
 ] 

Vaibhav Gumashta commented on HIVE-10453:
-

This is a good candidate for 1.2.1. I'll commit it to the branch as well. 
Thanks [~ychena] and [~szehon] for getting this in.

> HS2 leaking open file descriptors when using UDFs
> -
>
> Key: HIVE-10453
> URL: https://issues.apache.org/jira/browse/HIVE-10453
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch
>
>
> 1. Create a custom function by:
> CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
> 2. Create a simple JDBC client; just 
> connect and 
> run a simple query which uses the function, such as:
> select myfunc(col1) from sometable
> 3. Disconnect.
> Check open files for HiveServer2 by:
> lsof -p HSProcID | grep myudf.jar
> You will see the leak as:
> {noformat}
> java  28718 ychen  txt  REG1,4741 212977666 
> /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
> java  28718 ychen  330r REG1,4741 212977666 
> /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key & values

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575296#comment-14575296
 ] 

Hive QA commented on HIVE-10929:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738023/HIVE-10929.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9002 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_dynamic_partition
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4190/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4190/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4190/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738023 - PreCommit-HIVE-TRUNK-Build

> In Tez mode,dynamic partitioning query with union all fails at 
> moveTask,Invalid partition key & values
> --
>
> Key: HIVE-10929
> URL: https://issues.apache.org/jira/browse/HIVE-10929
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10929.1.patch, HIVE-10929.2.patch, 
> HIVE-10929.3.patch
>
>
> {code}
> create table dummy(i int);
> insert into table dummy values (1);
> select * from dummy;
> create table partunion1(id1 int) partitioned by (part1 string);
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.execution.engine=tez;
> explain insert into table partunion1 partition(part1)
> select temps.* from (
> select 1 as id1, '2014' as part1 from dummy 
> union all 
> select 2 as id1, '2014' as part1 from dummy ) temps;
> insert into table partunion1 partition(part1)
> select temps.* from (
> select 1 as id1, '2014' as part1 from dummy 
> union all 
> select 2 as id1, '2014' as part1 from dummy ) temps;
> select * from partunion1;
> {code}
> fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10950) Unit test against HBase Metastore

2015-06-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-10950:
--
Attachment: HIVE-10950-1.patch

Initial patch to get TestCliDriver running.

> Unit test against HBase Metastore
> -
>
> Key: HIVE-10950
> URL: https://issues.apache.org/jira/browse/HIVE-10950
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-10950-1.patch
>
>
> We need to run the entire Hive UT suite against the HBase Metastore and make 
> sure it passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10673) Dynamically partitioned hash join for Tez

2015-06-05 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10673:
--
Attachment: HIVE-10673.4.patch

Patch v4: proper rebase of v2 (I hope).

> Dynamically partitioned hash join for Tez
> -
>
> Key: HIVE-10673
> URL: https://issues.apache.org/jira/browse/HIVE-10673
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-10673.1.patch, HIVE-10673.2.patch, 
> HIVE-10673.3.patch, HIVE-10673.4.patch
>
>
> Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
> reducer are unsorted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8955) alter partition should check for "hive.stats.autogather" in hiveConf

2015-06-05 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen resolved HIVE-8955.

Resolution: Not A Problem

> alter partition should check for "hive.stats.autogather" in hiveConf
> 
>
> Key: HIVE-8955
> URL: https://issues.apache.org/jira/browse/HIVE-8955
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Pankit Thapar
>Assignee: Yongzhi Chen
>
> When the alter-partition code path is triggered, it should check the flag 
> "hive.stats.autogather"; only if it is true should it update stats, else skip them.
> This is done in the append_partition code flow. 
> Is there any specific reason alter_partition does not respect this conf 
> variable?
> //code snippet : HiveMetaStore.java 
>  private Partition append_partition_common(RawStore ms, String dbName, String 
> tableName,
> List<String> part_vals, EnvironmentContext envContext) throws 
> InvalidObjectException,
> AlreadyExistsException, MetaException {
> ...
> 
> if (HiveConf.getBoolVar(hiveConf, 
> HiveConf.ConfVars.HIVESTATSAUTOGATHER) &&
> !MetaStoreUtils.isView(tbl)) {
>   MetaStoreUtils.updatePartitionStatsFast(part, wh, madeDir);
> }
> ...
> ...
> }
> The above code snippet checks for the variable but this same check is absent 
> in 
> //code snippet : HiveAlterHandler.java 
> public Partition alterPartition(final RawStore msdb, Warehouse wh, final 
> String dbname,
>   final String name, final List<String> part_vals, final Partition 
> new_part)
>   throws InvalidOperationException, InvalidObjectException, 
> AlreadyExistsException,
>   MetaException {
> 
> ...
> }
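The missing guard described above can be illustrated with a minimal runnable sketch; every name here is a hypothetical stand-in for the Hive classes quoted in the snippet (a map stands in for HiveConf, a counter for MetaStoreUtils.updatePartitionStatsFast), showing only the proposed behavior of skipping the stats update unless autogather is enabled:

```java
import java.util.HashMap;
import java.util.Map;

public class AutogatherGuardSketch {
    static final Map<String, Boolean> conf = new HashMap<>();
    static int statsUpdates = 0;

    // Stand-in for alterPartition with the proposed check added:
    // update stats only when hive.stats.autogather is true and the
    // object is not a view.
    static void alterPartition(boolean isView) {
        if (conf.getOrDefault("hive.stats.autogather", false) && !isView) {
            statsUpdates++; // stands in for updatePartitionStatsFast(...)
        }
    }

    public static void main(String[] args) {
        conf.put("hive.stats.autogather", false);
        alterPartition(false); // guard skips the update
        conf.put("hive.stats.autogather", true);
        alterPartition(false); // guard allows the update
        System.out.println(statsUpdates); // 1
    }
}
```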



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-06-05 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575239#comment-14575239
 ] 

Yongzhi Chen commented on HIVE-7018:


[~ctang.ma], could you review the change? Thanks

> Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but 
> not others
> -
>
> Key: HIVE-7018
> URL: https://issues.apache.org/jira/browse/HIVE-7018
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Yongzhi Chen
> Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch, HIVE-7018.3.patch
>
>
> It appears that at least postgres and oracle do not have the LINK_TARGET_ID 
> column while mysql does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10949) Disable hive-minikdc tests in Windows

2015-06-05 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575219#comment-14575219
 ] 

Vaibhav Gumashta commented on HIVE-10949:
-

+1

> Disable hive-minikdc tests in Windows
> -
>
> Key: HIVE-10949
> URL: https://issues.apache.org/jira/browse/HIVE-10949
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10949.1.patch
>
>
> hive-minikdc needs to be disabled for Windows OS since we don't have Kerberos 
> support yet for a Hadoop cluster running under Windows OS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10949) Disable hive-minikdc tests in Windows

2015-06-05 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10949:
-
Attachment: HIVE-10949.1.patch

[~vgumashta] or [~thejas], can you please review the change?

Thanks
Hari

> Disable hive-minikdc tests in Windows
> -
>
> Key: HIVE-10949
> URL: https://issues.apache.org/jira/browse/HIVE-10949
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10949.1.patch
>
>
> hive-minikdc needs to be disabled for Windows OS since we don't have Kerberos 
> support yet for a Hadoop cluster running under Windows OS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575214#comment-14575214
 ] 

Ashutosh Chauhan commented on HIVE-10911:
-

That is fine. The SQL spec allows placing NULLs either at the start or the end 
for sorts, so both result sets are acceptable.
+1

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, with hiredate of date type, an exception 
> will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It's valuable to support such a type with the number of days as the value 
> difference. 
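The proposed day-difference semantics can be sketched independently of Hive's windowing internals. This is only an illustration of what "range 90 preceding" would mean for a date column (the dates are made up, and java.time is used for clarity; it is not what Hive's PTF code itself uses):

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DateRangeSketch {
    public static void main(String[] args) {
        LocalDate current = LocalDate.of(2015, 6, 5);
        LocalDate candidate = LocalDate.of(2015, 3, 10);
        // The value difference between two date rows is the number of days.
        long days = ChronoUnit.DAYS.between(candidate, current);
        // "range 90 preceding": the candidate row is in the window if its
        // date falls within 90 days before the current row's date.
        boolean inWindow = days >= 0 && days <= 90;
        System.out.println(days + " " + inWindow);
    }
}
```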



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10911:

Attachment: HIVE-10911.patch

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, with hiredate of date type, an exception 
> will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It's valuable to support such a type with the number of days as the value 
> difference. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10911:

Attachment: (was: HIVE-10911.patch)

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, with hiredate of date type, an exception 
> will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It's valuable to support such a type with the number of days as the value 
> difference. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-05 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10941:
-
Attachment: HIVE-10941.2.patch

> Provide option to disable spark tests outside itests
> 
>
> Key: HIVE-10941
> URL: https://issues.apache.org/jira/browse/HIVE-10941
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10941.1.patch, HIVE-10941.2.patch
>
>
> HIVE-10477 provided an option to disable the Spark module; however, we missed 
> the following files that are outside the itests directory, i.e. we need to 
> combine the option with disabling the following tests as well:
> {code}
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
> org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10948) Slf4j warning in HiveCLI due to spark

2015-06-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-10948:
---
Description: 
The spark-assembly-1.3.1.jar is added to the Hive classpath 
./hive.distro:  export SPARK_HOME=$sparkHome
./hive.distro:  sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
./hive.distro:  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"

When launching HiveCLI, we can see the following message:
===
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/.../hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/.../spark/lib/spark-assembly-1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/.../hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/.../spark/lib/spark-assembly-1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]


This bug is similar to HIVE-9496.

  was:
The spark-assembly-1.3.1.jar is added to the Hive classpath 
./hive.distro:  export SPARK_HOME=$sparkHome
./hive.distro:  sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
./hive.distro:  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"

When launching HiveCLI, we can see the following message:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/.../hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/.../spark/lib/spark-assembly-1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/.../hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/.../spark/lib/spark-assembly-1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]


> Slf4j warning in HiveCLI due to spark
> -
>
> Key: HIVE-10948
> URL: https://issues.apache.org/jira/browse/HIVE-10948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.0
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Minor
>
> The spark-assembly-1.3.1.jar is added to the Hive classpath 
> ./hive.distro:  export SPARK_HOME=$sparkHome
> ./hive.distro:  sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
> ./hive.distro:  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
> When launching HiveCLI, we can see the following message:
> ===
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/.../hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/.../spark/lib/spark-assembly-1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> WARNING: Use "yarn jar" to launch YARN applications.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/.../hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/.../spark/lib/spark-assembly-1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 
> This bug is similar to HIVE-9496.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly

2015-06-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575154#comment-14575154
 ] 

Thiruvel Thirumoolan commented on HIVE-10815:
-

[~nemon] Thanks for raising this. This will be good to have and will be a 
complement to rolling upgrade.

Is it possible to do the shuffle in open() instead of the constructor? Was 
there a reason you didn't want to do it that way? That way a reconnect() will 
also choose a random host and will play well with HIVE-9508 (if enabled).
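The suggestion above (shuffling inside open() so every reconnect gets a fresh order) can be sketched as follows; the URIs are illustrative and this is only a sketch of the idea, not the HiveMetaStoreClient code under review:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MetastoreShuffleSketch {
    public static void main(String[] args) {
        List<String> uris = new ArrayList<>(Arrays.asList(
                "thrift://ms1:9083", "thrift://ms2:9083", "thrift://ms3:9083"));
        // Shuffling here (i.e., inside open()) rather than once in the
        // constructor means a reconnect() also picks a random starting host,
        // spreading load across all configured metastores.
        Collections.shuffle(uris);
        // connect to uris.get(0), fall back to the rest on failure ...
        System.out.println(uris.size());
    }
}
```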

> Let HiveMetaStoreClient Choose MetaStore Randomly
> -
>
> Key: HIVE-10815
> URL: https://issues.apache.org/jira/browse/HIVE-10815
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Affects Versions: 1.2.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
> Attachments: HIVE-10815.patch
>
>
> Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs 
> when multiple metastores are configured.
>  Choosing a MetaStore randomly would be good for load balancing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-06-05 Thread Elliot West (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliot West updated HIVE-10165:
---
Attachment: HIVE-10165.6.patch

Thank you for such an in-depth review. I've updated the patch: 
[^HIVE-10165.6.patch]. If one of your review points is not replied to here then 
you may consider it addressed directly in the patch. Here are my thoughts 
regarding the other specific issues, including any resolutions that I've 
applied:

{quote}
Lock.internalRelease: You've built in handling for releasing locks that are not 
part of transactions. When you do envision users locking something that isn't 
part of a transaction? Since this is doing write operations I would assume 
you'll always have a transaction.
{quote}

My intention is that this class will also be independently useful for  
components that read ACID data, as they will need to initiate a read lock with 
the MetaStore. As an example, I hope to replace [this code in the 
cascading-hive 
project|https://github.com/Cascading/cascading-hive/blob/wip-2.0/src/main/java/cascading/tap/hive/LockManager.java]
 with an instance of this {{Lock}} class.

{quote}
Transaction: Why do commit() and abort() release the locks? Since these locks 
are part of a transaction they will always be released when the transaction is 
committed or aborted.
{quote}

This is in place purely to cancel the heartbeat timer. If you look at 
{{Lock:143}} you may notice that an {{unlock}} call isn't issued to the 
MetaStore as part of the {{release}} if the lock is part of a transaction.

{quote}
MutatorClient: Why is Lock external to this class? It seems like Lock is a 
component of this class. Or do you envision users using one Lock object to 
manage multiple MutatorClients?
{quote}

This comment helped me see that I was missing a test, thanks. No sharing of 
locks was intended; I wanted a way to inject a mock, which I then discovered was 
unnecessary once I had written said test. Lock construction now takes place in 
the {{Transaction}} not the {{MutatorClientBuilder}}.

{quote}
MutatorCoordinator: In the constructor, why are you passing in 
CreatePartitionHelper and SequenceValidator when there's only one instance of 
these?
{quote}

I wanted the ability to mock them in the {{TestMutatorCoordinator}} test. They 
are package private, so this separation doesn't leak into the public API.
If this is undesirable, can you recommend an alternative approach?

{quote}
MutatorCoordinator.resetMutator, this code is closing the Mutator everytime you 
switch Mutators. But if I understand correctly this is going to result in 
writing a footer in the ORC file. You're going to end up with a thousand tiny 
stripes in your files. That is not what you want. You do need to make sure you 
don't have too many open at a time to avoid OOMs and too many file handles 
open errors. But you'll need to keep a list of which ones are open and then 
close them on an LRU basis (or maybe pick the one with the most records since 
it will give you the best stripe size) as you need to open more rather than 
closing each one each time. Owen O'Malley comments?
{quote}

This class relies on the correct grouping of the data (by partition,bucket) to 
avoid the problem that you describe. As long as the data arrives grouped in 
this way, we can guarantee that once a given {{Mutator}} has been closed it'll 
never be needed again. It's also a resource-light approach: only one 
{{Mutator}} (and hence file) need be open at a given time. A {{Mutator}} cache 
would introduce more flexibility and resilience by relaxing the data grouping 
requirement, but this could then push optimisation decisions back to the user, 
now having to trade off the number of open mutators against stripe size. I felt 
that as the user must sort the data anyway (lastTxnId,rowId) a grouping could 
generally be obtained for 'free' at the same time and thus allow a simpler 
mechanism to be employed. Very keen to hear your thoughts on this.
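The single-open-mutator invariant described above can be sketched in miniature: because records arrive grouped by (partition, bucket), at most one writer is open at a time, and a key change closes the previous one exactly once. All names here are illustrative, not the hcatalog-streaming API:

```java
import java.util.Objects;

public class MutatorGroupingSketch {
    static String openKey = null;
    static int opens = 0, closes = 0;

    // Stand-in for the coordinator's per-record path: switch mutators only
    // when the (partition, bucket) group changes.
    static void write(String partition, int bucket) {
        String key = partition + "/" + bucket;
        if (!Objects.equals(key, openKey)) {
            if (openKey != null) {
                closes++; // close previous mutator; its ORC footer is written once
            }
            openKey = key;
            opens++;      // open a mutator for the new group
        }
        // ... append the record via the open mutator ...
    }

    public static void main(String[] args) {
        // Grouped input: repeated keys reuse the open mutator.
        write("p1", 0); write("p1", 0); write("p1", 1); write("p2", 0);
        System.out.println(opens + " " + closes); // 3 2
    }
}
```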

{quote}
CreationPartitionHelper.createPartitionIfNotExists: Why are you running the 
Driver class here? Why not call IMetaStoreClient.addPartition()? That would be 
much lighter weight.
{quote}

Agreed. I lifted [this code from the original streaming 
API|https://github.com/apache/hive/blob/80fb8913196eef8e4125544c3138b0c73be267b7/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java#L350]
 and assumed that this worked around a limitation that I wasn't aware of. I've 
modified it to use the MetaStoreClient.


> Improve hive-hcatalog-streaming extensibility and support updates and deletes.
> --
>
> Key: HIVE-10165
> URL: https://issues.apache.org/jira/browse/HIVE-10165
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 1.2.0
>  

[jira] [Resolved] (HIVE-10827) LLAP: support parallel query compilation in HS2

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10827.
-
Resolution: Fixed

In branch; the 06 patch from the original JIRA. We will revert this when that 
JIRA goes in; there's some hair-splitting going on there with regard to what 
goes into which JIRA, and when and how.

> LLAP: support parallel query compilation in HS2
> ---
>
> Key: HIVE-10827
> URL: https://issues.apache.org/jira/browse/HIVE-10827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key & values

2015-06-05 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10929:
--
Attachment: HIVE-10929.3.patch

> In Tez mode,dynamic partitioning query with union all fails at 
> moveTask,Invalid partition key & values
> --
>
> Key: HIVE-10929
> URL: https://issues.apache.org/jira/browse/HIVE-10929
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10929.1.patch, HIVE-10929.2.patch, 
> HIVE-10929.3.patch
>
>
> {code}
> create table dummy(i int);
> insert into table dummy values (1);
> select * from dummy;
> create table partunion1(id1 int) partitioned by (part1 string);
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.execution.engine=tez;
> explain insert into table partunion1 partition(part1)
> select temps.* from (
> select 1 as id1, '2014' as part1 from dummy 
> union all 
> select 2 as id1, '2014' as part1 from dummy ) temps;
> insert into table partunion1 partition(part1)
> select temps.* from (
> select 1 as id1, '2014' as part1 from dummy 
> union all 
> select 2 as id1, '2014' as part1 from dummy ) temps;
> select * from partunion1;
> {code}
> fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10937:

Attachment: HIVE-10937.patch

[~hagleitn] can you take a look? It's basically a cache of object pools, which 
should give better reuse than thread-locals. If we had a query-scoped object 
this could be cleaner.
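To illustrate the idea only (this is a hypothetical sketch, not the actual Hive ObjectCache API or the attached patch): a per-key cache of object pools lets concurrent fragments that share a key hand objects back and forth, instead of each thread pinning its own copy in a ThreadLocal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

// Hypothetical sketch of a cache of object pools. Names and API are
// illustrative, not Hive's.
class PooledObjectCache<T> {
  private final Map<String, ConcurrentLinkedQueue<T>> pools = new ConcurrentHashMap<>();

  // Take an object for the given key, creating one via the supplier on a miss.
  T take(String key, Supplier<T> factory) {
    ConcurrentLinkedQueue<T> pool =
        pools.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>());
    T obj = pool.poll();
    return obj != null ? obj : factory.get();
  }

  // Return an object to the pool so another fragment can reuse it.
  void give(String key, T obj) {
    pools.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).offer(obj);
  }

  public static void main(String[] args) {
    PooledObjectCache<StringBuilder> cache = new PooledObjectCache<>();
    StringBuilder a = cache.take("plan-1", StringBuilder::new);
    cache.give("plan-1", a);
    StringBuilder b = cache.take("plan-1", StringBuilder::new);
    // The second take reuses the pooled instance rather than allocating.
    System.out.println(a == b);
  }
}
```

Unlike a ThreadLocal, nothing here ties a cached object to the thread that created it, which is what gives the better reuse across daemon threads.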

> LLAP: make ObjectCache for plans work properly in the daemon
> 
>
> Key: HIVE-10937
> URL: https://issues.apache.org/jira/browse/HIVE-10937
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
> Attachments: HIVE-10937.patch
>
>
> There's a perf hit otherwise, esp. when the planner creates 1009 reducers of 
> 4Mb each.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10946) LLAP: recent optimization introduced wrong assert to elevator causing test to fail

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10946.
-
Resolution: Fixed
  Assignee: Sergey Shelukhin

in branch

> LLAP: recent optimization introduced wrong assert to elevator causing test to 
> fail
> --
>
> Key: HIVE-10946
> URL: https://issues.apache.org/jira/browse/HIVE-10946
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>
> Doesn't happen in real queries that I run, but orc_llap fails for the 
> 0-column case. Need to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10942) LLAP: expose what's running on the daemon thru JMX

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10942.
-
Resolution: Fixed

in branch

> LLAP: expose what's running on the daemon thru JMX
> --
>
> Key: HIVE-10942
> URL: https://issues.apache.org/jira/browse/HIVE-10942
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10946) LLAP: recent optimization introduced wrong assert to elevator causing test to fail

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10946:

Fix Version/s: llap

> LLAP: recent optimization introduced wrong assert to elevator causing test to 
> fail
> --
>
> Key: HIVE-10946
> URL: https://issues.apache.org/jira/browse/HIVE-10946
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
> Fix For: llap
>
>
> Doesn't happen in real queries that I run, but orc_llap fails for the 
> 0-column case. Need to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10470) LLAP: NPE in IO when returning 0 rows with no projection

2015-06-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575071#comment-14575071
 ] 

Sergey Shelukhin commented on HIVE-10470:
-

[~prasanth_j] may not be an issue, but can you check?
{noformat}
-  if (hasIndexOnlyCols) {
+  if (hasIndexOnlyCols && (includedRgs == null)) {
{noformat}
This only returns the dummy batch when there's no RG filtering at all. Could 
there be partial RG filtering in this case? I'd assume not, but I'm not sure... 
after all, something does filter out all the rows. It will probably be hard to 
cover with a q file test, unless enough values are added to create partial RG 
filtering.

> LLAP: NPE in IO when returning 0 rows with no projection
> 
>
> Key: HIVE-10470
> URL: https://issues.apache.org/jira/browse/HIVE-10470
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-10470.1.patch
>
>
> Looks like a trivial fix, unless I'm missing something. I may do it later if 
> you don't ;)
> {noformat}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.io.orc.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1764)
>   at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:92)
>   at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:39)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:116)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:329)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:299)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:55)
>   at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
>   ... 4 more
> {noformat}
> Running q file
> {noformat}
> SET hive.vectorized.execution.enabled=true;
> SET hive.llap.io.enabled=false;
> SET hive.exec.orc.default.row.index.stride=1000;
> SET hive.optimize.index.filter=true;
> DROP TABLE orc_llap;
> CREATE TABLE orc_llap(
> ctinyint TINYINT,
> csmallint SMALLINT,
> cint INT,
> cbigint BIGINT,
> cfloat FLOAT,
> cdouble DOUBLE,
> cstring1 STRING,
> cstring2 STRING,
> ctimestamp1 TIMESTAMP,
> ctimestamp2 TIMESTAMP,
> cboolean1 BOOLEAN,
> cboolean2 BOOLEAN)
> STORED AS ORC tblproperties ("orc.compress"="ZLIB");
> insert into table orc_llap
> select ctinyint, csmallint, cint, cbigint, cfloat, cdouble, cstring1, 
> cstring2, ctimestamp1, ctimestamp2, cboolean1, cboolean2
> from alltypesorc limit 10;
> SET hive.llap.io.enabled=true;
> select count(*) from orc_llap where cint < 6000;
> DROP TABLE orc_llap;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10746) Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 produces 1-byte FileSplits from TextInputFormat

2015-06-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10746:
---
Summary:  Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 produces 1-byte FileSplits from 
TextInputFormat  (was: Hive 0.14.x and Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 Slow 
group by/order by)

>  Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 produces 1-byte FileSplits from 
> TextInputFormat
> --
>
> Key: HIVE-10746
> URL: https://issues.apache.org/jira/browse/HIVE-10746
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 0.14.0, 0.14.1, 1.2.0, 1.1.0, 1.1.1
>Reporter: Greg Senia
>Assignee: Gopal V
>Priority: Critical
> Attachments: slow_query_output.zip
>
>
> The following query: "SELECT appl_user_id, arsn_cd, COUNT(*) as RecordCount 
> FROM adw.crc_arsn GROUP BY appl_user_id,arsn_cd ORDER BY appl_user_id;" runs 
> consistently fast in Spark and MapReduce on Hive 1.2.0. When running this 
> same query with Tez as the execution engine, it consistently runs for 
> 300-500 seconds, which seems extremely long. This is a basic external table 
> delimited by tabs, stored as a single file in a folder. In Hive 0.13 this 
> query with Tez runs fast; I tested with Hive 0.14, 0.14.1/1.0.0 and now Hive 
> 1.2.0, and there is clearly something going awry with Hive w/Tez as an 
> execution engine on single-file or small-file tables. I can attach further 
> logs if someone needs them for deeper analysis.
> HDFS Output:
> hadoop fs -ls /example_dw/crc/arsn
> Found 2 items
> -rwxr-x---   6 loaduser hadoopusers  0 2015-05-17 20:03 
> /example_dw/crc/arsn/_SUCCESS
> -rwxr-x---   6 loaduser hadoopusers3883880 2015-05-17 20:03 
> /example_dw/crc/arsn/part-m-0
> Hive Table Describe:
> hive> describe formatted crc_arsn;
> OK
> # col_name  data_type   comment 
>  
> arsn_cd string  
> clmlvl_cd   string  
> arclss_cd   string  
> arclssg_cd  string  
> arsn_prcsr_rmk_ind  string  
> arsn_mbr_rspns_ind  string  
> savtyp_cd   string  
> arsn_eff_dt string  
> arsn_exp_dt string  
> arsn_pstd_dts   string  
> arsn_lstupd_dts string  
> arsn_updrsn_txt string  
> appl_user_idstring  
> arsntyp_cd  string  
> pre_d_indicator string  
> arsn_display_txtstring  
> arstat_cd   string  
> arsn_tracking_nostring  
> arsn_cstspcfc_ind   string  
> arsn_mstr_rcrd_ind  string  
> state_specific_ind  string  
> region_specific_in  string  
> arsn_dpndnt_cd  string  
> unit_adjustment_in  string  
> arsn_mbr_only_ind   string  
> arsn_qrmb_ind   string  
>  
> # Detailed Table Information 
> Database:   adw  
> Owner:  loadu...@exa.example.com   
> CreateTime: Mon Apr 28 13:28:05 EDT 2014 
> LastAccessTime: UNKNOWN  
> Protect Mode:   None 
> Retention:  0
> Location:   hdfs://xhadnnm1p.example.com:8020/example_dw/crc/arsn 
>
> Table Type: EXTERNAL_TABLE   
> Table Parameters:
> EXTERNALTRUE
> transient_lastDdlTime   1398706085  
>  
> # Storage Information
> SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat:org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat:   
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed: No   

[jira] [Assigned] (HIVE-10746) Hive 0.14.x and Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 Slow group by/order by

2015-06-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-10746:
--

Assignee: Gopal V

> Hive 0.14.x and Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 Slow group by/order by
> 
>
> Key: HIVE-10746
> URL: https://issues.apache.org/jira/browse/HIVE-10746
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 0.14.0, 0.14.1, 1.2.0, 1.1.0, 1.1.1
>Reporter: Greg Senia
>Assignee: Gopal V
>Priority: Critical
> Attachments: slow_query_output.zip
>
>
> The following query: "SELECT appl_user_id, arsn_cd, COUNT(*) as RecordCount 
> FROM adw.crc_arsn GROUP BY appl_user_id,arsn_cd ORDER BY appl_user_id;" runs 
> consistently fast in Spark and MapReduce on Hive 1.2.0. When running this 
> same query with Tez as the execution engine, it consistently runs for 
> 300-500 seconds, which seems extremely long. This is a basic external table 
> delimited by tabs, stored as a single file in a folder. In Hive 0.13 this 
> query with Tez runs fast; I tested with Hive 0.14, 0.14.1/1.0.0 and now Hive 
> 1.2.0, and there is clearly something going awry with Hive w/Tez as an 
> execution engine on single-file or small-file tables. I can attach further 
> logs if someone needs them for deeper analysis.
> HDFS Output:
> hadoop fs -ls /example_dw/crc/arsn
> Found 2 items
> -rwxr-x---   6 loaduser hadoopusers  0 2015-05-17 20:03 
> /example_dw/crc/arsn/_SUCCESS
> -rwxr-x---   6 loaduser hadoopusers3883880 2015-05-17 20:03 
> /example_dw/crc/arsn/part-m-0
> Hive Table Describe:
> hive> describe formatted crc_arsn;
> OK
> # col_name  data_type   comment 
>  
> arsn_cd string  
> clmlvl_cd   string  
> arclss_cd   string  
> arclssg_cd  string  
> arsn_prcsr_rmk_ind  string  
> arsn_mbr_rspns_ind  string  
> savtyp_cd   string  
> arsn_eff_dt string  
> arsn_exp_dt string  
> arsn_pstd_dts   string  
> arsn_lstupd_dts string  
> arsn_updrsn_txt string  
> appl_user_idstring  
> arsntyp_cd  string  
> pre_d_indicator string  
> arsn_display_txtstring  
> arstat_cd   string  
> arsn_tracking_nostring  
> arsn_cstspcfc_ind   string  
> arsn_mstr_rcrd_ind  string  
> state_specific_ind  string  
> region_specific_in  string  
> arsn_dpndnt_cd  string  
> unit_adjustment_in  string  
> arsn_mbr_only_ind   string  
> arsn_qrmb_ind   string  
>  
> # Detailed Table Information 
> Database:   adw  
> Owner:  loadu...@exa.example.com   
> CreateTime: Mon Apr 28 13:28:05 EDT 2014 
> LastAccessTime: UNKNOWN  
> Protect Mode:   None 
> Retention:  0
> Location:   hdfs://xhadnnm1p.example.com:8020/example_dw/crc/arsn 
>
> Table Type: EXTERNAL_TABLE   
> Table Parameters:
> EXTERNALTRUE
> transient_lastDdlTime   1398706085  
>  
> # Storage Information
> SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat:org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat:   
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed: No   
> Num Buckets:-1   
> Bucket Columns: []   
> Sort Columns:   []   
> Storag

[jira] [Commented] (HIVE-10906) Value based UDAF function without orderby expression throws NPE

2015-06-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575016#comment-14575016
 ] 

Ashutosh Chauhan commented on HIVE-10906:
-

+1

> Value based UDAF function without orderby expression throws NPE
> ---
>
> Key: HIVE-10906
> URL: https://issues.apache.org/jira/browse/HIVE-10906
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10906.patch
>
>
> The following query throws NPE.
> {noformat}
> select key, value, min(value) over (partition by key range between unbounded 
> preceding and current row) from small;
> FAILED: NullPointerException null
> 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver 
> (SessionState.java:printError(957)) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures

2015-06-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575005#comment-14575005
 ] 

Thejas M Nair commented on HIVE-10410:
--

[~Richard Williams]
Are you sharing the same session for multiple queries in your cluster, or 
setting hive.exec.parallel=true? That is the only case where I think this 
would happen.
Also, there is a problem with the change in the patch. The HiveMetaStoreClient 
within the Hive object is associated with a specific user, so if it re-uses 
any Hive object from the current thread, it would get the wrong user. There is 
a safeguard against this that was added recently, but it would result in an 
expensive creation of a new Hive object (and metastore client).
I think instead of pursuing that option, we should just look at synchronizing 
thrift client use within the metastore client. There is already a 
HiveMetaStoreClient.newSynchronizedClient that the Hive object could use to 
get a synchronized client.

We also need to look at other issues like what [~ctang.ma] mentioned, but all 
of that does not have to happen in this jira.
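As a rough sketch of the synchronized-client idea (the Counter interface and all names here are illustrative stand-ins, not Hive types): a dynamic proxy that takes a shared lock around every call serializes access to a non-thread-safe client, so concurrent threads cannot interleave requests on the same underlying connection.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Illustrative sketch of wrapping a non-thread-safe object in a proxy that
// serializes every method call on one lock. Hypothetical types, not Hive's.
class SynchronizedProxyDemo {
  interface Counter {
    int increment();
  }

  static Counter synchronize(Counter delegate) {
    Object lock = new Object();
    InvocationHandler handler = (proxy, method, args) -> {
      synchronized (lock) { // one call at a time on the underlying object
        return method.invoke(delegate, args);
      }
    };
    return (Counter) Proxy.newProxyInstance(
        Counter.class.getClassLoader(), new Class<?>[] {Counter.class}, handler);
  }

  public static void main(String[] args) throws InterruptedException {
    int[] count = {0};
    Counter unsafe = () -> ++count[0]; // not thread-safe on its own
    Counter safe = synchronize(unsafe);

    Thread[] threads = new Thread[4];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(() -> {
        for (int j = 0; j < 1000; j++) safe.increment();
      });
      threads[i].start();
    }
    for (Thread t : threads) t.join();
    // All 4000 increments ran under the lock, so no updates were lost.
    System.out.println(count[0]);
  }
}
```

The trade-off is that all callers funnel through one lock, but that is exactly what prevents the interleaved-frame symptoms described above.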



> Apparent race condition in HiveServer2 causing intermittent query failures
> --
>
> Key: HIVE-10410
> URL: https://issues.apache.org/jira/browse/HIVE-10410
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.1
> Environment: CDH 5.3.3
> CentOS 6.4
>Reporter: Richard Williams
> Attachments: HIVE-10410.1.patch
>
>
> On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
> occasionally trigger odd Thrift exceptions with messages such as "Read a 
> negative frame size (-2147418110)!" or "out of sequence response" in 
> HiveServer2's connections to the metastore. For certain metastore calls (for 
> example, showDatabases), these Thrift exceptions are converted to 
> MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
> from retrying these calls and thus causes the failure to bubble out to the 
> JDBC client.
> Note that as far as we can tell, this issue appears to only affect queries 
> that are submitted with the runAsync flag on TExecuteStatementReq set to true 
> (which, in practice, seems to mean all JDBC queries), and it appears to only 
> manifest when HiveServer2 is using the new HTTP transport mechanism. When 
> both these conditions hold, we are able to fairly reliably reproduce the 
> issue by spawning about 100 simple, concurrent hive queries (we have been 
> using "show databases"), two or three of which typically fail. However, when 
> either of these conditions do not hold, we are no longer able to reproduce 
> the issue.
> Some example stack traces from the HiveServer2 logs:
> {noformat}
> 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: 
> org.apache.thrift.transport.TTransportException Read a negative frame size 
> (-2147418110)!
> org.apache.thrift.transport.TTransportException: Read a negative frame size 
> (-2147418110)!
> at 
> org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
> at 
> org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
> at 
> org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
> at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
> at 
> org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
> at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
> at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getDataba

[jira] [Commented] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574973#comment-14574973
 ] 

Ashutosh Chauhan commented on HIVE-10911:
-

Added a comment for testcase on RB.

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, with hiredate of date type, an exception 
> will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It would be valuable to support this type, using the number of days as the 
> value difference.
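The day-difference idea can be sketched as follows (a minimal illustration of the boundary check, not Hive's actual ValueBoundaryScanner code): for a window like "range 90 preceding" over a date column, deciding whether a candidate row belongs to the current row's window reduces to the difference in days between the two dates.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch: boundary check for a value-based date window,
// using the number of days as the value difference.
class DateRangeWindow {
  // true if 'candidate' falls within 'amt' days preceding 'current'
  static boolean inPrecedingRange(LocalDate current, LocalDate candidate, long amt) {
    long diffDays = ChronoUnit.DAYS.between(candidate, current);
    return diffDays >= 0 && diffDays <= amt;
  }

  public static void main(String[] args) {
    LocalDate hire = LocalDate.of(2015, 6, 5);
    // 30 days back is inside a "range 90 preceding" window; 120 days back is not.
    System.out.println(inPrecedingRange(hire, hire.minusDays(30), 90));
    System.out.println(inPrecedingRange(hire, hire.minusDays(120), 90));
  }
}
```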



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574965#comment-14574965
 ] 

Ashutosh Chauhan commented on HIVE-10941:
-

+1

> Provide option to disable spark tests outside itests
> 
>
> Key: HIVE-10941
> URL: https://issues.apache.org/jira/browse/HIVE-10941
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10941.1.patch
>
>
> HIVE-10477 provided an option to disable the spark module; however, we 
> missed the following files outside the itests directory, i.e. we need to 
> combine that option with disabling the following tests as well:
> {code}
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
> org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-05 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574923#comment-14574923
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10941:
--

cc-ing [~ashutoshc] for review.

Thanks
Hari

> Provide option to disable spark tests outside itests
> 
>
> Key: HIVE-10941
> URL: https://issues.apache.org/jira/browse/HIVE-10941
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-10941.1.patch
>
>
> HIVE-10477 provided an option to disable the spark module; however, we 
> missed the following files outside the itests directory, i.e. we need to 
> combine that option with disabling the following tests as well:
> {code}
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
> org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
> org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10729) Query failed when select complex columns from joinned table (tez map join only)

2015-06-05 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574924#comment-14574924
 ] 

Vikram Dixit K commented on HIVE-10729:
---

[~selinazh] Any updates here? Can you take a look at [~gss2002]'s comment? 
[~gss2002] Can you provide a simple test where you saw the above exception?

Thanks
Vikram.

> Query failed when select complex columns from joinned table (tez map join 
> only)
> ---
>
> Key: HIVE-10729
> URL: https://issues.apache.org/jira/browse/HIVE-10729
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-10729.1.patch, HIVE-10729.2.patch
>
>
> When a map join happens, if the projection columns include complex data 
> types, the query will fail. 
> Steps to reproduce:
> {code:sql}
> hive> set hive.auto.convert.join;
> hive.auto.convert.join=true
> hive> desc foo;
> a array<int>
> hive> select * from foo;
> [1,2]
> hive> desc src_int;
> key   int
> value string
> hive> select * from src_int where key=2;
> 2val_2
> hive> select * from foo join src_int src  on src.key = foo.a[1];
> {code}
> Query will fail with stack trace
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:111)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:314)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246)
>   at 
> org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:692)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
>   ... 23 more
> {noformat}
> Similar error when projection columns include a map:
> {code:sql}
> hive> CREATE TABLE test (a INT, b MAP<INT, STRING>) STORED AS ORC;
> hive> INSERT OVERWRITE TABLE test SELECT 1, MAP(1, "val_1", 2, "val_2") FROM 
> src LIMIT 1;
> hive> select * from src join test where src.key=test.a;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument

2015-06-05 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10427:

Labels: TODOC1.3  (was: TODOC2.0)

> collect_list() and collect_set() should accept struct types as argument
> ---
>
> Key: HIVE-10427
> URL: https://issues.apache.org/jira/browse/HIVE-10427
> Project: Hive
>  Issue Type: Wish
>  Components: UDF
>Reporter: Alexander Behm
>Assignee: Chao Sun
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, 
> HIVE-10427.3.patch, HIVE-10427.4.patch
>
>
> The collect_list() and collect_set() functions currently only accept scalar 
> argument types. It would be very useful if these functions could also accept 
> struct argument types for creating nested data from flat data.
> For example, suppose I wanted to create a nested customers/orders table from 
> two flat tables, customers and orders. Then it'd be very convenient to write 
> something like this:
> {code}
> insert into table nested_customers_orders
> select c.*, collect_list(named_struct("oid", o.oid, "order_date": o.date...))
> from customers c inner join orders o on (c.cid = o.oid)
> group by c.cid
> {code}
> Thanks you for your consideration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574893#comment-14574893
 ] 

Aihua Xu commented on HIVE-10911:
-

They are not related to the patch.

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, with hiredate of date type, an exception 
> will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It would be valuable to support this type, using the number of days as the 
> value difference.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574876#comment-14574876
 ] 

Hive QA commented on HIVE-10911:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737968/HIVE-10911.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9002 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.testReadDataFromEncryptedHiveTableByPig[1]
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4189/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4189/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4189/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737968 - PreCommit-HIVE-TRUNK-Build

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently date datatype is not supported in value based windowing function. 
> For the following query with hiredate to be date type, an exception will be 
> thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It's valuable to support such type with number of days as the value 
> difference. 





[jira] [Commented] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS

2015-06-05 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574845#comment-14574845
 ] 

Gunther Hagleitner commented on HIVE-10910:
---

Minor thing: parseFloat can throw NumberFormatException. Also, Hadoop 
Configuration objects have a getFloat method. Otherwise +1 
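The review point above can be illustrated with a standalone analogue of Hadoop's Configuration.getFloat. The map-backed helper below is a stand-in for the Hadoop class, not its actual source; the key name is made up for the example:

```java
import java.util.Map;

public class FloatConf {
    // Analogue of Configuration.getFloat(name, defaultValue): never throws,
    // falling back to the default on a missing or malformed value, whereas a
    // bare Float.parseFloat would propagate NumberFormatException.
    static float getFloat(Map<String, String> conf, String name, float defaultValue) {
        String raw = conf.get(name);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Float.parseFloat(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of("hive.some.ratio", "not-a-number");
        System.out.println(getFloat(conf, "hive.some.ratio", 0.75f)); // 0.75
        System.out.println(getFloat(conf, "missing.key", 0.5f));      // 0.5
    }
}
```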

> Alter table drop partition queries in encrypted zone failing to remove data 
> from HDFS
> -
>
> Key: HIVE-10910
> URL: https://issues.apache.org/jira/browse/HIVE-10910
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Eugene Koifman
> Attachments: HIVE-10910.patch
>
>
> An alter table query that drops a partition removes the partition's metadata 
> but fails to remove the data from HDFS:
> hive> create table table_1(name string, age int, gpa double) partitioned by 
> (b string) stored as textfile;
> OK
> Time taken: 0.732 seconds
> hive> alter table table_1 add partition (b='2010-10-10');
> OK
> Time taken: 0.496 seconds
> hive> show partitions table_1;
> OK
> b=2010-10-10
> Time taken: 0.781 seconds, Fetched: 1 row(s)
> hive> alter table table_1 drop partition (b='2010-10-10');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException 
> Failed to move to trash: 
> hdfs://:8020//table_1/b=2010-10-10
> hive> show partitions table_1;
> OK
> Time taken: 0.622 seconds





[jira] [Updated] (HIVE-10827) LLAP: support parallel query compilation in HS2

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10827:

Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-7926

> LLAP: support parallel query compilation in HS2
> ---
>
> Key: HIVE-10827
> URL: https://issues.apache.org/jira/browse/HIVE-10827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>






[jira] [Reopened] (HIVE-10827) support parallel query compilation in HS2

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-10827:
-

repurposing to commit to branch for now

> support parallel query compilation in HS2
> -
>
> Key: HIVE-10827
> URL: https://issues.apache.org/jira/browse/HIVE-10827
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>






[jira] [Updated] (HIVE-10827) LLAP: support parallel query compilation in HS2

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10827:

Summary: LLAP: support parallel query compilation in HS2  (was: support 
parallel query compilation in HS2)

> LLAP: support parallel query compilation in HS2
> ---
>
> Key: HIVE-10827
> URL: https://issues.apache.org/jira/browse/HIVE-10827
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>






[jira] [Updated] (HIVE-10827) LLAP: support parallel query compilation in HS2

2015-06-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10827:

Fix Version/s: llap

> LLAP: support parallel query compilation in HS2
> ---
>
> Key: HIVE-10827
> URL: https://issues.apache.org/jira/browse/HIVE-10827
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>






[jira] [Commented] (HIVE-10944) Fix HS2 for Metrics

2015-06-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574832#comment-14574832
 ] 

Sergey Shelukhin commented on HIVE-10944:
-

The synchronized (as far as I can see) call to isInitialized from 
incrementMetricsCounter might affect perf for parallel workloads, like we saw 
before with locks in logging, kryo, and HDFS code. Also, config checks in other 
places may not be cheap. Is it possible to cache those values? I assume once 
initialized it won't change back to false.
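One way to act on this suggestion, assuming the flag really is monotonic (a sketch, not the actual patch): cache the positive result in a volatile field so the hot path skips the lock once initialization has been observed.

```java
public class MetricsGate {
    private volatile boolean initSeen = false;   // cached, monotonic: false -> true only
    private boolean initialized = false;         // guarded by 'this'

    synchronized void markInitialized() { initialized = true; }

    private synchronized boolean isInitializedSlow() { return initialized; }

    // Hot path: after the first positive check, no lock is ever taken again.
    boolean isInitialized() {
        if (initSeen) {
            return true;
        }
        if (isInitializedSlow()) {
            initSeen = true;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        MetricsGate gate = new MetricsGate();
        System.out.println(gate.isInitialized()); // false
        gate.markInitialized();
        System.out.println(gate.isInitialized()); // true, and cached from now on
    }
}
```

This pattern is only safe because the flag never flips back to false; a flag that can toggle would need the lock (or an atomic) on every read.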

> Fix HS2 for Metrics
> ---
>
> Key: HIVE-10944
> URL: https://issues.apache.org/jira/browse/HIVE-10944
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-10944.patch
>
>
> Some issues with initializing the new HS2 metrics:
> 1.  Metrics is not working properly in HS2 due to wrong init checks.
> 2.  If not enabled, JVMPauseMonitor logs trash to the HS2 logs, as it wasn't 
> checking whether metrics was enabled.





[jira] [Commented] (HIVE-10936) incorrect result set when hive.vectorized.execution.enabled = true with predicate casting to CHAR or VARCHAR

2015-06-05 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574829#comment-14574829
 ] 

Mostafa Mokhtar commented on HIVE-10936:


[~mmccline] FYI

> incorrect result set when hive.vectorized.execution.enabled = true with 
> predicate casting to CHAR or VARCHAR
> 
>
> Key: HIVE-10936
> URL: https://issues.apache.org/jira/browse/HIVE-10936
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
> Environment: In this case using HDP install of Hive - 0.14.0.2.2.4.2-2
>Reporter: N Campbell
> Attachments: GO_TIME_DIM.zip
>
>
> The query returns data when hive.vectorized.execution.enabled is set to 
> false, or if the target of the CAST is STRING rather than CHAR/VARCHAR.
> set hive.vectorized.execution.enabled = true;
> select 
>   `GO_TIME_DIM`.`day_key`
> from 
>   `gosalesdw1021`.`go_time_dim` `GO_TIME_DIM` 
> where 
>   CAST(`GO_TIME_DIM`.`current_year` AS CHAR(4)) = '2010' 
> group by 
>   `GO_TIME_DIM`.`day_key`;
> create table GO_TIME_DIM ( DAY_KEY int , DAY_DATE timestamp , MONTH_KEY int , 
> CURRENT_MONTH smallint , MONTH_NUMBER int , QUARTER_KEY int , CURRENT_QUARTER 
> smallint , CURRENT_YEAR smallint , DAY_OF_WEEK smallint , DAY_OF_MONTH 
> smallint , DAYS_IN_MONTH smallint , DAY_OF_YEAR smallint , WEEK_OF_MONTH 
> smallint , WEEK_OF_QUARTER smallint , WEEK_OF_YEAR smallint , MONTH_EN string 
> , WEEKDAY_EN string , MONTH_DE string , WEEKDAY_DE string , MONTH_FR string , 
> WEEKDAY_FR string , MONTH_JA string , WEEKDAY_JA string , MONTH_AR string , 
> WEEKDAY_AR string , MONTH_CS string , WEEKDAY_CS string , MONTH_DA string , 
> WEEKDAY_DA string , MONTH_EL string , WEEKDAY_EL string , MONTH_ES string , 
> WEEKDAY_ES string , MONTH_FI string , WEEKDAY_FI string , MONTH_HR string , 
> WEEKDAY_HR string , MONTH_HU string , WEEKDAY_HU string , MONTH_ID string , 
> WEEKDAY_ID string , MONTH_IT string , WEEKDAY_IT string , MONTH_KK string , 
> WEEKDAY_KK string , MONTH_KO string , WEEKDAY_KO string , MONTH_MS string , 
> WEEKDAY_MS string , MONTH_NL string , WEEKDAY_NL string , MONTH_NO string , 
> WEEKDAY_NO string , MONTH_PL string , WEEKDAY_PL string , MONTH_PT string , 
> WEEKDAY_PT string , MONTH_RO string , WEEKDAY_RO string , MONTH_RU string , 
> WEEKDAY_RU string , MONTH_SC string , WEEKDAY_SC string , MONTH_SL string , 
> WEEKDAY_SL string , MONTH_SV string , WEEKDAY_SV string , MONTH_TC string , 
> WEEKDAY_TC string , MONTH_TH string , WEEKDAY_TH string , MONTH_TR string , 
> WEEKDAY_TR string )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS TEXTFILE
> LOCATION '../GO_TIME_DIM';
> Then create an ORC equivalent table and load it
> insert overwrite table 
> GO_TIME_DIM
> select * from TEXT.GO_TIME_DIM
> ;
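The predicate above compares a SMALLINT cast to CHAR(4) against the literal '2010'. Standard SQL CHAR comparison is blank-padded (trailing spaces are ignored), which is the semantics any execution path, vectorized or not, must honor. A standalone sketch of that behavior (illustrative only, not Hive's implementation):

```java
public class CharCompare {
    // CHAR(n) semantics: the value is space-padded to length n on storage,
    // truncated if longer than n.
    static String toChar(String s, int n) {
        if (s.length() > n) {
            return s.substring(0, n);
        }
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < n) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // CHAR comparison ignores trailing blanks on both sides.
    static boolean charEquals(String a, String b) {
        return stripTrailingSpaces(a).equals(stripTrailingSpaces(b));
    }

    private static String stripTrailingSpaces(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        short currentYear = 2010;
        String asChar = toChar(String.valueOf(currentYear), 4);
        System.out.println(charEquals(asChar, "2010")); // true
    }
}
```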





[jira] [Resolved] (HIVE-10923) encryption_join_with_different_encryption_keys.q fails on CentOS 6

2015-06-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-10923.

Resolution: Fixed

> encryption_join_with_different_encryption_keys.q fails on CentOS 6
> --
>
> Key: HIVE-10923
> URL: https://issues.apache.org/jira/browse/HIVE-10923
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>
> Here is the stack trace
> {code}
> Task with the most failures(4):
> -
> Task ID:
>   task_1433377676690_0015_m_00
> URL:
>   
> http://ip-10-0-0-249.ec2.internal:44717/taskdetails.jsp?jobid=job_1433377676690_0015&tipid=task_1433377676690_0015_m_00
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"key":"238","value":"val_238"}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"key":"238","value":"val_238"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> java.security.InvalidKeyException: Illegal key size
>   at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension$DefaultCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:264)
>   at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2489)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2620)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2519)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:566)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> Caused by: java.security.InvalidKeyException: Illegal key size
>   at javax.crypto.Cipher.checkCryptoPerm(Cipher.java:1024)
>   at javax.crypto.Cipher.implInit(Cipher.java:790)
>   at javax.crypto.Cipher.chooseProvider(Cipher.java:849)
>   at javax.crypto.Cipher.init(Cipher.java:1348)
>   at javax.crypto.Cipher.init(Cipher.java:1282)
>   at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:113)
>   ... 16 more
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:577)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.

[jira] [Commented] (HIVE-10923) encryption_join_with_different_encryption_keys.q fails on CentOS 6

2015-06-05 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574790#comment-14574790
 ] 

Pengcheng Xiong commented on HIVE-10923:


[~spena], thanks a lot for your reply. It is now resolved. Could you please 
also take a quick look at https://issues.apache.org/jira/browse/HIVE-10938 if 
you have time? Any suggestions/comments are welcome. Thanks again.  

> encryption_join_with_different_encryption_keys.q fails on CentOS 6
> --
>
> Key: HIVE-10923
> URL: https://issues.apache.org/jira/browse/HIVE-10923
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>
> Here is the stack trace
> {code}
> Task with the most failures(4):
> -
> Task ID:
>   task_1433377676690_0015_m_00
> URL:
>   
> http://ip-10-0-0-249.ec2.internal:44717/taskdetails.jsp?jobid=job_1433377676690_0015&tipid=task_1433377676690_0015_m_00
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"key":"238","value":"val_238"}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"key":"238","value":"val_238"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> java.security.InvalidKeyException: Illegal key size
>   at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension$DefaultCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:264)
>   at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2489)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2620)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2519)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:566)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> Caused by: java.security.InvalidKeyException: Illegal key size
>   at javax.crypto.Cipher.checkCryptoPerm(Cipher.java:1024)
>   at javax.crypto.Cipher.implInit(Cipher.java:790)
>   at javax.crypto.Cipher.chooseProvider(Cipher.java:849)
>   at javax.crypto.Cipher.init(Cipher.java:1348)
>   at javax.crypto.Cipher.init(Cipher.java:1282)
>   at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:113)
>   ... 16 more
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:577)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
>   at org.ap

[jira] [Updated] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10911:

Affects Version/s: 1.3.0

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, where hiredate is of date type, an 
> exception will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It would be valuable to support this type, using the number of days as the 
> value difference. 





[jira] [Updated] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10911:

Attachment: HIVE-10911.patch

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10911.patch
>
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, where hiredate is of date type, an 
> exception will be thrown.
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It would be valuable to support this type, using the number of days as the 
> value difference. 





[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF

2015-06-05 Thread Pratik Khadloya (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574700#comment-14574700
 ] 

Pratik Khadloya commented on HIVE-3628:
---

Thanks for providing this feature. I am wondering how to use it; does anyone 
know whether the docs have been updated?

> Provide a way to use counters in Hive through UDF
> -
>
> Key: HIVE-3628
> URL: https://issues.apache.org/jira/browse/HIVE-3628
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Viji
>Assignee: Navis
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, 
> HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, 
> HIVE-3628.D8007.6.patch
>
>
> Currently it is not possible to generate counters through UDF. We should 
> support this. 
> Pig currently allows this.





[jira] [Commented] (HIVE-10906) Value based UDAF function without orderby expression throws NPE

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574680#comment-14574680
 ] 

Hive QA commented on HIVE-10906:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737942/HIVE-10906.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9002 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4188/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4188/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4188/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737942 - PreCommit-HIVE-TRUNK-Build

> Value based UDAF function without orderby expression throws NPE
> ---
>
> Key: HIVE-10906
> URL: https://issues.apache.org/jira/browse/HIVE-10906
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10906.patch
>
>
> The following query throws NPE.
> {noformat}
> select key, value, min(value) over (partition by key range between unbounded 
> preceding and current row) from small;
> FAILED: NullPointerException null
> 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver 
> (SessionState.java:printError(957)) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
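The trace ends in WindowingSpec.validateValueBoundary, which dereferences the window's order specification; the failing query uses a value-based (RANGE) frame with no ORDER BY. A minimal guard for that failure mode can be sketched as follows — a hypothetical stand-in, not the actual patch — turning the NPE into a diagnostic:

```java
public class BoundaryCheck {
    // Hypothetical stand-in for WindowingSpec.validateValueBoundary:
    // a value-based (RANGE) frame requires an ORDER BY expression, so
    // check for its presence before dereferencing it.
    static void validateValueBoundary(Object orderSpec) {
        if (orderSpec == null) {
            throw new IllegalArgumentException(
                "Value-based windowing (RANGE) requires an ORDER BY expression");
        }
        // ...further validation would dereference orderSpec here...
    }

    public static void main(String[] args) {
        try {
            // The query in this report has a RANGE frame but no ORDER BY.
            validateValueBoundary(null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```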





[jira] [Commented] (HIVE-10923) encryption_join_with_different_encryption_keys.q fails on CentOS 6

2015-06-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574618#comment-14574618
 ] 

Sergio Peña commented on HIVE-10923:


Do you have the Java Cryptography Extension (JCE) Unlimited Strength policy 
files installed? They are needed for 256-bit key sizes. 
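A quick way to verify this on the failing host is to ask the JRE for its maximum allowed AES key length (Cipher.getMaxAllowedKeyLength is a standard javax.crypto API):

```java
import javax.crypto.Cipher;

public class JceCheck {
    public static void main(String[] args) throws Exception {
        // Returns Integer.MAX_VALUE when the unlimited-strength policy is in
        // effect; a JRE restricted by the default export policy reports 128,
        // which matches the "Illegal key size" failure in this stack trace.
        int maxAes = Cipher.getMaxAllowedKeyLength("AES");
        System.out.println("Max AES key length: " + maxAes);
        if (maxAes < 256) {
            System.out.println("AES-256 unavailable: install the JCE "
                + "Unlimited Strength Jurisdiction Policy Files");
        }
    }
}
```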

> encryption_join_with_different_encryption_keys.q fails on CentOS 6
> --
>
> Key: HIVE-10923
> URL: https://issues.apache.org/jira/browse/HIVE-10923
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>
> Here is the stack trace
> {code}
> Task with the most failures(4):
> -
> Task ID:
>   task_1433377676690_0015_m_00
> URL:
>   
> http://ip-10-0-0-249.ec2.internal:44717/taskdetails.jsp?jobid=job_1433377676690_0015&tipid=task_1433377676690_0015_m_00
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"key":"238","value":"val_238"}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"key":"238","value":"val_238"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> java.security.InvalidKeyException: Illegal key size
>   at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension$DefaultCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:264)
>   at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2489)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2620)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2519)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:566)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> Caused by: java.security.InvalidKeyException: Illegal key size
>   at javax.crypto.Cipher.checkCryptoPerm(Cipher.java:1024)
>   at javax.crypto.Cipher.implInit(Cipher.java:790)
>   at javax.crypto.Cipher.chooseProvider(Cipher.java:849)
>   at javax.crypto.Cipher.init(Cipher.java:1348)
>   at javax.crypto.Cipher.init(Cipher.java:1282)
>   at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:113)
>   ... 16 more
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:577)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOper

[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-06-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574591#comment-14574591
 ] 

Aihua Xu commented on HIVE-10754:
-

I'm wondering if some older versions of Hadoop have such issues. Let me take a 
look. Thanks.

> Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
> HCatLoader
> 
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values( '1', '111');
> insert into tbl2 values('1', '2');
> {noformat}
> Pig script:
> {noformat}
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>key as tbl1_key,
>value as tbl1_value,
>'333' as tbl1_v1;
>
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>key as tbl2_key,
>value as tbl2_value;
>
> dump prj_tbl1;
> dump prj_tbl2;
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>   GENERATE  prj_tbl1::tbl1_key AS key1,
> prj_tbl1::tbl1_value AS value1,
> prj_tbl1::tbl1_v1 AS v1,
> prj_tbl2::tbl2_key AS key2,
> prj_tbl2::tbl2_value AS value2;
>
> dump prj_result;
> {noformat}
> The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We 
> need to clone the job instance in HCatLoader.
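The cloning fix described above can be illustrated with plain JDK objects (an analogy only, not the HCatalog API — HCatLoader works with the Hadoop Job/Configuration): when two loaders mutate one shared settings object, the second load clobbers the first loader's state, which is why each loader needs its own clone.

```java
import java.util.Properties;

public class SharedStateSketch {
    // Illustration with java.util.Properties (not the real HCatalog classes):
    // returns the projection the FIRST loader ends up seeing after a second
    // loader configures its own projection, with or without cloning.
    static String projectionSeenByFirstLoader(boolean cloneBeforeSecondLoad) {
        Properties job = new Properties();
        job.setProperty("columns", "key,value");          // first loader's projection
        Properties second = cloneBeforeSecondLoad
                ? (Properties) job.clone()                // independent copy
                : job;                                    // shared instance
        second.setProperty("columns", "key");             // second loader's projection
        return job.getProperty("columns");
    }

    public static void main(String[] args) {
        System.out.println(projectionSeenByFirstLoader(false)); // key       (clobbered)
        System.out.println(projectionSeenByFirstLoader(true));  // key,value (preserved)
    }
}
```

Without the clone, the second FILTER/FOREACH silently rewrites the shared job state, matching the wrong-column symptom reported above.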



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch

2015-06-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574582#comment-14574582
 ] 

Sergio Peña commented on HIVE-10943:


We need to create a new Jenkins job and a property file on the instance.
I will create a new one for this branch.


> Beeline-cli: Enable precommit for beelie-cli branch 
> 
>
> Key: HIVE-10943
> URL: https://issues.apache.org/jira/browse/HIVE-10943
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-10943.patch
>
>






[jira] [Updated] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch

2015-06-05 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10943:
---
Description: 



  was:
NO PRECOMMIT TESTS



> Beeline-cli: Enable precommit for beelie-cli branch 
> 
>
> Key: HIVE-10943
> URL: https://issues.apache.org/jira/browse/HIVE-10943
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-10943.patch
>
>






[jira] [Updated] (HIVE-10906) Value based UDAF function without orderby expression throws NPE

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10906:

Attachment: HIVE-10906.patch

> Value based UDAF function without orderby expression throws NPE
> ---
>
> Key: HIVE-10906
> URL: https://issues.apache.org/jira/browse/HIVE-10906
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10906.patch
>
>
> The following query throws NPE.
> {noformat}
> select key, value, min(value) over (partition by key range between unbounded 
> preceding and current row) from small;
> FAILED: NullPointerException null
> 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver 
> (SessionState.java:printError(957)) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
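The NPE above is thrown while validating a value-based (RANGE) frame that has no ORDER BY expression. A minimal plain-Java sketch of the kind of defensive check involved (names are hypothetical, not the actual WindowingSpec code):

```java
import java.util.Collections;
import java.util.List;

public class WindowValidationSketch {
    // Hypothetical analogue of validateValueBoundary: a value-based (RANGE)
    // frame needs an ORDER BY expression to compare values against, so check
    // for it explicitly instead of dereferencing a null order spec.
    static String validateValueBoundary(List<String> orderExpressions) {
        if (orderExpressions == null || orderExpressions.isEmpty()) {
            return "RANGE-based window requires an ORDER BY expression";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Previously an NPE; with the guard it is a clear semantic error.
        System.out.println(validateValueBoundary(null));
        System.out.println(validateValueBoundary(Collections.singletonList("key")));
    }
}
```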





[jira] [Updated] (HIVE-10906) Value based UDAF function without orderby expression throws NPE

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10906:

Attachment: (was: HIVE-10906.patch)

> Value based UDAF function without orderby expression throws NPE
> ---
>
> Key: HIVE-10906
> URL: https://issues.apache.org/jira/browse/HIVE-10906
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10906.patch
>
>
> The following query throws NPE.
> {noformat}
> select key, value, min(value) over (partition by key range between unbounded 
> preceding and current row) from small;
> FAILED: NullPointerException null
> 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver 
> (SessionState.java:printError(957)) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281)
> at 
> org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}





[jira] [Assigned] (HIVE-10911) Add support for date datatype in the value based windowing function

2015-06-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-10911:
---

Assignee: Aihua Xu

> Add support for date datatype in the value based windowing function
> ---
>
> Key: HIVE-10911
> URL: https://issues.apache.org/jira/browse/HIVE-10911
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Currently the date datatype is not supported in value-based windowing 
> functions. For the following query, where hiredate is of date type, an 
> exception is thrown:
> {{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno 
> order by hiredate range 90 preceding) from emp;}}
> It would be valuable to support this type, using the number of days as the 
> value difference. 
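A sketch of the proposed semantics, assuming the day difference between the order key and the current row is what defines the frame (plain java.time, not Hive code):

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DateRangeSketch {
    // Hypothetical frame test for "range 90 preceding" over a DATE order key:
    // a candidate row is in the frame when it is between 0 and `precedingDays`
    // days older than the current row.
    static boolean inFrame(LocalDate candidate, LocalDate current, long precedingDays) {
        long diff = ChronoUnit.DAYS.between(candidate, current);
        return diff >= 0 && diff <= precedingDays;
    }

    public static void main(String[] args) {
        LocalDate current = LocalDate.of(1981, 12, 3);
        System.out.println(inFrame(LocalDate.of(1981, 9, 8), current, 90)); // true (86 days)
        System.out.println(inFrame(LocalDate.of(1981, 1, 1), current, 90)); // false
    }
}
```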





[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.

2015-06-05 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574432#comment-14574432
 ] 

Yongzhi Chen commented on HIVE-10880:
-

The failures are not related.
The following two tests have an age greater than 10 (they have been failing for 
more than 10 builds):
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable

org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias also 
failed in build 4179 (the build after this one).

For the Spark failure, I ran the tests locally and they all pass. My code change 
only takes effect when hive.enforce.bucketing is true, and the Spark test never 
sets that value, so it is not related. 

---
 T E S T S
---

---
 T E S T S
---
Running org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 72.272 sec - in 
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark

Results :

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0

Could anyone review the code? Thanks.

> The bucket number is not respected in insert overwrite.
> ---
>
> Key: HIVE-10880
> URL: https://issues.apache.org/jira/browse/HIVE-10880
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Blocker
> Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, 
> HIVE-10880.3.patch
>
>
> When hive.enforce.bucketing is true, the bucket number defined in the table 
> is no longer respected in current master and 1.2. This is a regression.
> Reproduce:
> {noformat}
> CREATE TABLE IF NOT EXISTS buckettestinput( 
> data string 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> CREATE TABLE IF NOT EXISTS buckettestoutput1( 
> data string 
> )CLUSTERED BY(data) 
> INTO 2 BUCKETS 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> CREATE TABLE IF NOT EXISTS buckettestoutput2( 
> data string 
> )CLUSTERED BY(data) 
> INTO 2 BUCKETS 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> Then I inserted the following data into the "buckettestinput" table
> firstinsert1 
> firstinsert2 
> firstinsert3 
> firstinsert4 
> firstinsert5 
> firstinsert6 
> firstinsert7 
> firstinsert8 
> secondinsert1 
> secondinsert2 
> secondinsert3 
> secondinsert4 
> secondinsert5 
> secondinsert6 
> secondinsert7 
> secondinsert8
> set hive.enforce.bucketing = true; 
> set hive.enforce.sorting=true;
> insert overwrite table buckettestoutput1 
> select * from buckettestinput where data like 'first%';
> set hive.auto.convert.sortmerge.join=true; 
> set hive.optimize.bucketmapjoin = true; 
> set hive.optimize.bucketmapjoin.sortedmerge = true; 
> select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
> bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number 
> of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 
> (state=42000,code=10141)
> {noformat}
> The related debug information related to insert overwrite:
> {noformat}
> 0: jdbc:hive2://localhost:1> insert overwrite table buckettestoutput1 
> select * from buckettestinput where data like 'first%'insert overwrite table 
> buckettestoutput1 
> 0: jdbc:hive2://localhost:1> ;
> select * from buckettestinput where data like ' 
> first%';
> INFO  : Number of reduce tasks determined at compile time: 2
> INFO  : In order to change the average load for a reducer (in bytes):
> INFO  :   set hive.exec.reducers.bytes.per.reducer=
> INFO  : In order to limit the maximum number of reducers:
> INFO  :   set hive.exec.reducers.max=
> INFO  : In order to set a constant number of reducers:
> INFO  :   set mapred.reduce.tasks=
> INFO  : Job running in-process (local Hadoop)
> INFO  : 2015-06-01 11:09:29,650 Stage-1 map = 86%,  reduce = 100%
> INFO  : Ended Job = job_local107155352_0001
> INFO  : Loading data to table default.buckettestoutput1 from 
> file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
> INFO  : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, 
> totalSize=52, rawDataSize=48]
> No rows affected (1.692 seconds)
> {noformat}
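The invariant the regression breaks can be sketched in plain Java (the hash used here is illustrative, not Hive's actual bucketing hash): with hive.enforce.bucketing=true, each row's clustering key hashes into one of N buckets and the writer should emit one file per non-empty bucket, so the table declared INTO 2 BUCKETS should report numFiles=2 rather than the numFiles=1 seen above.

```java
public class BucketCountSketch {
    // Counts how many bucket files a set of rows would produce if each row
    // is routed by (hash mod numBuckets). Hypothetical routing, for
    // illustration only.
    static int filesWritten(String[] rows, int numBuckets) {
        boolean[] bucketUsed = new boolean[numBuckets];
        for (String row : rows) {
            bucketUsed[(row.hashCode() & Integer.MAX_VALUE) % numBuckets] = true;
        }
        int files = 0;
        for (boolean used : bucketUsed) {
            if (used) files++;
        }
        return files;
    }

    public static void main(String[] args) {
        String[] rows = {"firstinsert1", "firstinsert2", "firstinsert3", "firstinsert4",
                         "firstinsert5", "firstinsert6", "firstinsert7", "firstinsert8"};
        // Rows land in both buckets, so the writer should create 2 files.
        System.out.println(filesWritten(rows, 2)); // 2
    }
}
```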





[jira] [Commented] (HIVE-10944) Fix HS2 for Metrics

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574360#comment-14574360
 ] 

Hive QA commented on HIVE-10944:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737896/HIVE-10944.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9001 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4187/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4187/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4187/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737896 - PreCommit-HIVE-TRUNK-Build

> Fix HS2 for Metrics
> ---
>
> Key: HIVE-10944
> URL: https://issues.apache.org/jira/browse/HIVE-10944
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-10944.patch
>
>
> Some issues with initializing the new HS2 metrics:
> 1.  Metrics does not work properly in HS2 due to incorrect init checks.
> 2.  If metrics is not enabled, JVMPauseMonitor spams the HS2 logs because it 
> wasn't checking whether metrics was enabled.





[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch

2015-06-05 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574283#comment-14574283
 ] 

Xuefu Zhang commented on HIVE-10943:


Not sure. [~spena], any comments?

> Beeline-cli: Enable precommit for beelie-cli branch 
> 
>
> Key: HIVE-10943
> URL: https://issues.apache.org/jira/browse/HIVE-10943
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-10943.patch
>
>
> NO PRECOMMIT TESTS





[jira] [Assigned] (HIVE-10944) Fix HS2 for Metrics

2015-06-05 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-10944:


Assignee: Szehon Ho

> Fix HS2 for Metrics
> ---
>
> Key: HIVE-10944
> URL: https://issues.apache.org/jira/browse/HIVE-10944
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-10944.patch
>
>
> Some issues with initializing the new HS2 metrics:
> 1.  Metrics does not work properly in HS2 due to incorrect init checks.
> 2.  If metrics is not enabled, JVMPauseMonitor spams the HS2 logs because it 
> wasn't checking whether metrics was enabled.





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-05 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574246#comment-14574246
 ] 

Szehon Ho commented on HIVE-10761:
--

HIVE-10944

> Create codahale-based metrics system for Hive
> -
>
> Key: HIVE-10761
> URL: https://issues.apache.org/jira/browse/HIVE-10761
> Project: Hive
>  Issue Type: New Feature
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 1.3.0
>
> Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
> HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
> hms-metrics.json
>
>
> There is a current Hive metrics system that hooks up to a JMX reporting, but 
> all its measurements, models are custom.
> This is to make another metrics system that will be based on Codahale (ie 
> yammer, dropwizard), which has the following advantage:
> * Well-defined metric model for frequently-needed metrics (ie JVM metrics)
> * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, 
> etc), 
> * Built-in reporting frameworks like JMX, Console, Log, JSON webserver
> It is used for many projects, including several Apache projects like Oozie.  
> Overall, monitoring tools should find it easier to understand these common 
> metric, measurement, reporting models.
> The existing metric subsystem will be kept and can be enabled if backward 
> compatibility is desired.
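The idea of a shared registry of named metrics with uniform measurements can be sketched with plain JDK types (a stand-in for the concept only, not the actual Codahale MetricRegistry API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class MetricsRegistrySketch {
    // Minimal counter registry: every metric has a name and a uniform
    // read-out, so any reporter (JMX, console, JSON) can render all metrics
    // the same way -- the property the Codahale model provides.
    static final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    static void inc(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    static long value(String name) {
        LongAdder c = counters.get(name);
        return c == null ? 0L : c.sum();
    }

    public static void main(String[] args) {
        inc("open_connections");
        inc("open_connections");
        System.out.println(value("open_connections")); // 2
    }
}
```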







[jira] [Updated] (HIVE-10944) Fix HS2 for Metrics

2015-06-05 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-10944:
-
Attachment: HIVE-10944.patch

Fixes the issues.

> Fix HS2 for Metrics
> ---
>
> Key: HIVE-10944
> URL: https://issues.apache.org/jira/browse/HIVE-10944
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
> Attachments: HIVE-10944.patch
>
>
> Some issues with initializing the new HS2 metrics:
> 1.  Metrics does not work properly in HS2 due to incorrect init checks.
> 2.  If metrics is not enabled, JVMPauseMonitor spams the HS2 logs because it 
> wasn't checking whether metrics was enabled.





[jira] [Commented] (HIVE-9248) Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is Hash mode

2015-06-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574221#comment-14574221
 ] 

Hive QA commented on HIVE-9248:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737861/HIVE-9248.05.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9003 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4186/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4186/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4186/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737861 - PreCommit-HIVE-TRUNK-Build

> Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is 
> Hash mode
> ---
>
> Key: HIVE-9248
> URL: https://issues.apache.org/jira/browse/HIVE-9248
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-9248.01.patch, HIVE-9248.02.patch, 
> HIVE-9248.03.patch, HIVE-9248.04.patch, HIVE-9248.05.patch
>
>
> Under Tez with vectorization, ReduceWork does not get vectorized unless its 
> GROUP BY operator is in MergePartial mode.  Add the valid cases where GROUP BY 
> is in Hash mode (and presumably there are downstream reducers that will do the 
> MergePartial).





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-05 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574201#comment-14574201
 ] 

Szehon Ho commented on HIVE-10761:
--

Yea I'm working on a patch right now.

> Create codahale-based metrics system for Hive
> -
>
> Key: HIVE-10761
> URL: https://issues.apache.org/jira/browse/HIVE-10761
> Project: Hive
>  Issue Type: New Feature
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 1.3.0
>
> Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
> HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
> hms-metrics.json
>
>
> There is a current Hive metrics system that hooks up to a JMX reporting, but 
> all its measurements, models are custom.
> This is to make another metrics system that will be based on Codahale (ie 
> yammer, dropwizard), which has the following advantage:
> * Well-defined metric model for frequently-needed metrics (ie JVM metrics)
> * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, 
> etc), 
> * Built-in reporting frameworks like JMX, Console, Log, JSON webserver
> It is used for many projects, including several Apache projects like Oozie.  
> Overall, monitoring tools should find it easier to understand these common 
> metric, measurement, reporting models.
> The existing metric subsystem will be kept and can be enabled if backward 
> compatibility is desired.





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-05 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574181#comment-14574181
 ] 

Gopal V commented on HIVE-10761:


Setting {{hive.server2.metrics.enabled=true}} seems to trigger another error

{code}
2015-06-05 02:18:28,818 ERROR [main()]: server.HiveServer2 
(HiveServer2.java:stop(314)) - error in Metrics deinit: 
java.lang.NullPointerException null
java.lang.NullPointerException
at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:312)
at 
org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:387)
at 
org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:75)
at 
org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:609)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:482)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
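The stack trace suggests stop() deinitializes metrics unconditionally, even when metrics were never initialized. A hypothetical illustration of the guard that avoids this (the names here are invented, not the actual HiveServer2 code):

```java
public class DeinitGuardSketch {
    // Stays null when metrics are disabled at startup.
    static Object metrics;

    // Guarded teardown: only deinitialize metrics if they were actually
    // initialized, so a server started with metrics disabled still stops
    // cleanly instead of hitting an NPE in the deinit path.
    static String stop() {
        if (metrics == null) {
            return "no metrics to deinit";
        }
        metrics = null; // stand-in for the real metrics deinit call
        return "metrics deinitialized";
    }

    public static void main(String[] args) {
        System.out.println(stop()); // no metrics to deinit
    }
}
```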

> Create codahale-based metrics system for Hive
> -
>
> Key: HIVE-10761
> URL: https://issues.apache.org/jira/browse/HIVE-10761
> Project: Hive
>  Issue Type: New Feature
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 1.3.0
>
> Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
> HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
> hms-metrics.json
>
>
> There is a current Hive metrics system that hooks up to a JMX reporting, but 
> all its measurements, models are custom.
> This is to make another metrics system that will be based on Codahale (ie 
> yammer, dropwizard), which has the following advantage:
> * Well-defined metric model for frequently-needed metrics (ie JVM metrics)
> * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, 
> etc), 
> * Built-in reporting frameworks like JMX, Console, Log, JSON webserver
> It is used for many projects, including several Apache projects like Oozie.  
> Overall, monitoring tools should find it easier to understand these common 
> metric, measurement, reporting models.
> The existing metric subsystem will be kept and can be enabled if backward 
> compatibility is desired.




