[jira] [Updated] (HIVE-8609) Move beeline to jline2

2014-11-04 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8609:
---
Status: Patch Available  (was: In Progress)

Trigger Hive QA to see the impact.

> Move beeline to jline2
> --
>
> Key: HIVE-8609
> URL: https://issues.apache.org/jira/browse/HIVE-8609
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Blocker
>
> We found a serious bug in jline in HIVE-8565. We should move to jline2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27566: HIVE-8609: move beeline to jline2

2014-11-04 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27566/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-8609: move beeline to jline2
The following will be changed:
* MultiCompletor -> AggregateCompleter
* SimpleCompletor -> StringsCompleter
* Terminal.getTerminalWidth() -> Terminal.getWidth()
* Terminal is an interface now; use TerminalFactory to get instances of a Terminal
* String -> CharSequence
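
For reference, a minimal illustrative sketch of the jline2 replacements listed 
above (not BeeLine's actual code; the completer strings and prompt are made up, 
while the jline 2.x class and method names are the real ones):

{code}
import jline.Terminal;
import jline.TerminalFactory;
import jline.console.ConsoleReader;
import jline.console.completer.AggregateCompleter;
import jline.console.completer.StringsCompleter;

public class JLine2Sketch {
  public static void main(String[] args) throws Exception {
    // Terminal is an interface in jline2; obtain an instance via TerminalFactory.
    Terminal terminal = TerminalFactory.get();
    int width = terminal.getWidth();   // replaces Terminal.getTerminalWidth()

    ConsoleReader reader = new ConsoleReader();
    // SimpleCompletor -> StringsCompleter, MultiCompletor -> AggregateCompleter
    reader.addCompleter(new AggregateCompleter(
        new StringsCompleter("select", "insert", "drop"),
        new StringsCompleter("!connect", "!quit")));

    String line = reader.readLine("sketch> ");
    System.out.println("terminal width = " + width + ", read: " + line);
  }
}
{code}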


Diffs
-

  beeline/src/java/org/apache/hive/beeline/AbstractCommandHandler.java 
a9479d56a3dbb922e917762e25267999ff9277ae 
  beeline/src/java/org/apache/hive/beeline/BeeLine.java 
8539a415b288af0e8f7ee1056932e40c0155e1ea 
  beeline/src/java/org/apache/hive/beeline/BeeLineCommandCompletor.java 
52313e6abc7974a9c2261063f19248f3293335d3 
  beeline/src/java/org/apache/hive/beeline/BeeLineCompletor.java 
c6bb4feb99f24074b69dff201b697ed2c1adeede 
  beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 
f73fb445aeff0052f43179a2314f3f342f72ea5f 
  beeline/src/java/org/apache/hive/beeline/BooleanCompletor.java 
3e88c531c761e9d42bc301fe1b4f1c22c5d0cbbd 
  beeline/src/java/org/apache/hive/beeline/ClassNameCompleter.java PRE-CREATION 
  beeline/src/java/org/apache/hive/beeline/CommandHandler.java 
bab17789b8b9d97b98ee3b64efa10fef48260af4 
  beeline/src/java/org/apache/hive/beeline/Commands.java 
7e366dc1d821a04d64f4e4923ebab2011e041a67 
  beeline/src/java/org/apache/hive/beeline/DatabaseConnection.java 
ab67700d3a83bc51902aa479f5f33ecb33401a47 
  beeline/src/java/org/apache/hive/beeline/ReflectiveCommandHandler.java 
2b957f20a63c17fd9decdfe4f2b2f92b30ecdb58 
  beeline/src/java/org/apache/hive/beeline/SQLCompletor.java 
844b9ae313e5d4cb5680d55a90255418b84155f9 
  beeline/src/java/org/apache/hive/beeline/TableNameCompletor.java 
bc0d9beb62ccbb0bff88d65e5e72d12c6f6bbb3b 
  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 
d7a9b0ea6730e06637ac5796f173a9f99ea054c3 
  cli/src/test/org/apache/hadoop/hive/cli/TestCliDriverMethods.java 
63668bca8a998c799002e0492866f80e0d730f0b 
  pom.xml a5f851f31df15660cebef0e4691ea34699c6d1ef 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezJobMonitor.java 
284acbc8ae70026ffb878de0f76921b4816737ba 

Diff: https://reviews.apache.org/r/27566/diff/


Testing
---


Thanks,

cheng xu



[jira] [Commented] (HIVE-8711) DB deadlocks not handled in TxnHandler for Postgres, Oracle, and SQLServer

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195856#comment-14195856
 ] 

Hive QA commented on HIVE-8711:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679142/HIVE-8711.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6669 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1626/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1626/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1626/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679142 - PreCommit-HIVE-TRUNK-Build

> DB deadlocks not handled in TxnHandler for Postgres, Oracle, and SQLServer
> --
>
> Key: HIVE-8711
> URL: https://issues.apache.org/jira/browse/HIVE-8711
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8711.patch
>
>
> TxnHandler.detectDeadlock has code to catch deadlocks in MySQL and Derby, but 
> it does not detect deadlocks for Postgres, Oracle, or SQL Server.
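
For illustration only (this is not the attached patch), a sketch of how 
deadlock detection could be broadened to cover those databases by inspecting 
the SQLState and vendor error codes they commonly report for deadlocks:

{code}
import java.sql.SQLException;

final class DeadlockCheckSketch {
  // Returns true if the exception looks like a database deadlock.
  static boolean isDeadlock(SQLException e) {
    String state = e.getSQLState();
    int code = e.getErrorCode();
    return "40001".equals(state)   // Derby / MySQL / SQL Server serialization failure
        || "40P01".equals(state)   // PostgreSQL deadlock_detected
        || code == 60              // Oracle ORA-00060: deadlock detected
        || code == 1213            // MySQL ER_LOCK_DEADLOCK
        || code == 1205;           // SQL Server chosen as deadlock victim
  }
}
{code}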



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8609) Move beeline to jline2

2014-11-04 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8609:
---
Attachment: HIVE-8609.patch

> Move beeline to jline2
> --
>
> Key: HIVE-8609
> URL: https://issues.apache.org/jira/browse/HIVE-8609
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Blocker
> Attachments: HIVE-8609.patch
>
>
> We found a serious bug in jline in HIVE-8565. We should move to jline2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8724) Right outer join produces incorrect result on Tez

2014-11-04 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-8724:


 Summary: Right outer join produces incorrect result on Tez
 Key: HIVE-8724
 URL: https://issues.apache.org/jira/browse/HIVE-8724
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Critical
 Fix For: 0.14.0


Still working on repro. But when testing h14 in the cluster I've seen some 
errors with right outer joins. Rewriting as left outer joins works fine. It 
seems that when initializing the groups in CommonMergeJoinOperator the state 
machine gets confused about what position in the stream we're at.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-7073) Implement Binary in ParquetSerDe

2014-11-04 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reassigned HIVE-7073:
--

Assignee: Ferdinand Xu

> Implement Binary in ParquetSerDe
> 
>
> Key: HIVE-7073
> URL: https://issues.apache.org/jira/browse/HIVE-7073
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Chen
>Assignee: Ferdinand Xu
>
> The ParquetSerDe currently does not support the BINARY data type. This ticket 
> is to implement the BINARY data type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8725) spark-client build failed sometime.[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-8725:
---

 Summary: spark-client build failed sometime.[Spark Branch]
 Key: HIVE-8725
 URL: https://issues.apache.org/jira/browse/HIVE-8725
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li


Sometimes the spark-client build fails due to a missing Spark dependency 
version. I'm not sure of the root cause, but adding the dependency version 
definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8725) spark-client build failed sometime.[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8725:

Attachment: HIVE-8725.1-spark.patch

> spark-client build failed sometime.[Spark Branch]
> -
>
> Key: HIVE-8725
> URL: https://issues.apache.org/jira/browse/HIVE-8725
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8725.1-spark.patch
>
>
> Sometimes the spark-client build fails due to a missing Spark dependency 
> version. I'm not sure of the root cause, but adding the dependency version 
> definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8725) spark-client build failed sometimes.[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8725:

Summary: spark-client build failed sometimes.[Spark Branch]  (was: 
spark-client build failed sometime.[Spark Branch])

> spark-client build failed sometimes.[Spark Branch]
> --
>
> Key: HIVE-8725
> URL: https://issues.apache.org/jira/browse/HIVE-8725
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8725.1-spark.patch
>
>
> Sometimes the spark-client build fails due to a missing Spark dependency 
> version. I'm not sure of the root cause, but adding the dependency version 
> definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8725) spark-client build failed sometime.[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8725:

Status: Patch Available  (was: Open)

> spark-client build failed sometime.[Spark Branch]
> -
>
> Key: HIVE-8725
> URL: https://issues.apache.org/jira/browse/HIVE-8725
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8725.1-spark.patch
>
>
> Sometimes the spark-client build fails due to a missing Spark dependency 
> version. I'm not sure of the root cause, but adding the dependency version 
> definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8721) Enable transactional unit against other databases

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195943#comment-14195943
 ] 

Hive QA commented on HIVE-8721:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679143/HIVE-8721.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6669 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1627/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1627/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1627/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679143 - PreCommit-HIVE-TRUNK-Build

> Enable transactional unit against other databases
> -
>
> Key: HIVE-8721
> URL: https://issues.apache.org/jira/browse/HIVE-8721
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure, Transactions
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-8721.patch
>
>
> Since TxnHandler and subclasses use JDBC to directly connect to the 
> underlying database (rather than relying on DataNucleus) it is important to 
> test that all of the operations work against different database flavors.  An 
> easy way to do this is to enable the unit tests to run against an external 
> database.
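
As an illustration of the idea (the property names and defaults below are 
assumptions, not the attached patch), a test helper could pick up JDBC settings 
for an external database from system properties and fall back to embedded Derby:

{code}
import java.sql.Connection;
import java.sql.DriverManager;

public class TxnTestDbSketch {
  // Hypothetical property names; the real patch may use different ones.
  static Connection open() throws Exception {
    String url  = System.getProperty("test.txn.jdbc.url",
        "jdbc:derby:memory:txnTest;create=true");
    String user = System.getProperty("test.txn.jdbc.user", "APP");
    String pass = System.getProperty("test.txn.jdbc.password", "mine");
    return DriverManager.getConnection(url, user, pass);
  }
}
{code}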



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8725) spark-client build failed sometimes.[Spark Branch]

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195955#comment-14195955
 ] 

Hive QA commented on HIVE-8725:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679179/HIVE-8725.1-spark.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 7098 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.ql.io.parquet.serde.TestParquetTimestampUtils.testTimezone
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/303/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/303/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679179 - PreCommit-HIVE-SPARK-Build

> spark-client build failed sometimes.[Spark Branch]
> --
>
> Key: HIVE-8725
> URL: https://issues.apache.org/jira/browse/HIVE-8725
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8725.1-spark.patch
>
>
> Sometimes the spark-client build fails due to a missing Spark dependency 
> version. I'm not sure of the root cause, but adding the dependency version 
> definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8705) Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195977#comment-14195977
 ] 

Hive QA commented on HIVE-8705:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679162/HIVE-8705.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6668 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_pop
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1628/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1628/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1628/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679162 - PreCommit-HIVE-TRUNK-Build

> Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode 
> 
>
> Key: HIVE-8705
> URL: https://issues.apache.org/jira/browse/HIVE-8705
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2, JDBC
>Affects Versions: 0.13.0, 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.15.0
>
> Attachments: HIVE-8705.1.patch, HIVE-8705.1.patch, 
> TestPreAuthenticatedKerberosSubject.java
>
>
> HIVE-6486 provided a patch to utilize pre-authenticated subject (someone who 
> has programmatically done a JAAS login and is not doing kinit before 
> connecting to HiveServer2 using the JDBC driver). However, that feature was 
> only for the binary mode code path. We need a similar feature when the driver 
> and server communicate using the HTTP transport.
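
For illustration, a sketch of what using a pre-authenticated JAAS subject over 
the JDBC driver might look like in HTTP mode; the URL parameters follow the 
binary-mode convention introduced by HIVE-6486 (kerberosAuthType=fromSubject), 
while the host, principal, and the caller-supplied subject are assumptions:

{code}
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import javax.security.auth.Subject;

public class PreAuthSubjectSketch {
  static Connection connect(Subject preAuthenticatedSubject) throws Exception {
    final String url = "jdbc:hive2://host:10001/default;"
        + "principal=hive/host@EXAMPLE.COM;"
        + "transportMode=http;httpPath=cliservice;"
        + "auth=kerberos;kerberosAuthType=fromSubject";
    // Run the connection attempt as the already-logged-in subject,
    // so no kinit / ticket cache is required.
    return Subject.doAs(preAuthenticatedSubject,
        new PrivilegedExceptionAction<Connection>() {
          public Connection run() throws Exception {
            return DriverManager.getConnection(url);
          }
        });
  }
}
{code}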



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8724) Right outer join produces incorrect result on Tez

2014-11-04 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8724:
-
Attachment: HIVE-8724.1.patch

> Right outer join produces incorrect result on Tez
> -
>
> Key: HIVE-8724
> URL: https://issues.apache.org/jira/browse/HIVE-8724
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8724.1.patch
>
>
> Still working on repro. But when testing h14 in the cluster I've seen some 
> errors with right outer joins. Rewriting as left outer joins works fine. It 
> seems that when initializing the groups in CommonMergeJoinOperator the state 
> machine gets confused about what position in the stream we're at.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8724) Right outer join produces incorrect result on Tez

2014-11-04 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8724:
-
Status: Patch Available  (was: Open)

> Right outer join produces incorrect result on Tez
> -
>
> Key: HIVE-8724
> URL: https://issues.apache.org/jira/browse/HIVE-8724
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8724.1.patch
>
>
> Still working on repro. But when testing h14 in the cluster I've seen some 
> errors with right outer joins. Rewriting as left outer joins works fine. It 
> seems that when initializing the groups in CommonMergeJoinOperator the state 
> machine gets confused about what position in the stream we're at.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8723) Set reasonable connection timeout for CuratorFramework ZooKeeper clients in Hive

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196005#comment-14196005
 ] 

Hive QA commented on HIVE-8723:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679173/HIVE-8723.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6669 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1629/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1629/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1629/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679173 - PreCommit-HIVE-TRUNK-Build

> Set reasonable connection timeout for CuratorFramework ZooKeeper clients in 
> Hive
> 
>
> Key: HIVE-8723
> URL: https://issues.apache.org/jira/browse/HIVE-8723
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-8723.1.patch
>
>
> Currently we use -1, which means any elapsed time is always greater than the 
> timeout value, resulting in an unnecessary connection loss exception. 
> Relevant code from the Curator framework:
> {code}
> private synchronized void checkTimeouts() throws Exception
> {
>     int minTimeout = Math.min(sessionTimeoutMs, connectionTimeoutMs);
>     long elapsed = System.currentTimeMillis() - connectionStartMs;
>     if ( elapsed >= minTimeout )
>     {
>         if ( zooKeeper.hasNewConnectionString() )
>         {
>             handleNewConnectionString();
>         }
>         else
>         {
>             int maxTimeout = Math.max(sessionTimeoutMs, connectionTimeoutMs);
>             if ( elapsed > maxTimeout )
>             {
>                 if ( !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
>                 {
>                     log.warn(String.format("Connection attempt unsuccessful after %d (greater than max timeout of %d). Resetting connection and trying again with a new connection.", elapsed, maxTimeout));
>                 }
>                 reset();
>             }
>             else
>             {
>                 KeeperException.ConnectionLossException connectionLossException = new CuratorConnectionLossException();
>                 if ( !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
>                 {
>                     log.error(String.format("Connection timed out for connection string (%s) and timeout (%d) / elapsed (%d)", zooKeeper.getConnectionString(), connectionTimeoutMs, elapsed), connectionLossException);
>                 }
>                 tracer.get().addCount("connections-timed-out", 1);
>                 throw connectionLossException;
>             }
>         }
>     }
> }
> {code}
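
For illustration only (not the attached patch), a sketch of building a 
CuratorFramework client with explicit, finite timeouts instead of -1; the 
specific values and retry policy are assumptions:

{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkClientSketch {
  public static CuratorFramework newClient(String zkQuorum) {
    CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString(zkQuorum)
        .sessionTimeoutMs(60 * 1000)       // finite session timeout
        .connectionTimeoutMs(15 * 1000)    // finite connection timeout instead of -1
        .retryPolicy(new ExponentialBackoffRetry(1000, 3))
        .build();
    client.start();
    return client;
  }
}
{code}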



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8609) Move beeline to jline2

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196051#comment-14196051
 ] 

Hive QA commented on HIVE-8609:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679178/HIVE-8609.patch

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 6669 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.beeline.TestBeeLineWithArgs.testBeelineHiveConfVariable
org.apache.hive.beeline.TestBeeLineWithArgs.testBeelineHiveVariable
org.apache.hive.beeline.TestBeeLineWithArgs.testBeelineMultiHiveVariable
org.apache.hive.beeline.TestBeeLineWithArgs.testBeelineShellCommand
org.apache.hive.beeline.TestBeeLineWithArgs.testBreakOnErrorScriptFile
org.apache.hive.beeline.TestBeeLineWithArgs.testCSVOutput
org.apache.hive.beeline.TestBeeLineWithArgs.testCSVOutputDeprecation
org.apache.hive.beeline.TestBeeLineWithArgs.testDSVOutput
org.apache.hive.beeline.TestBeeLineWithArgs.testEmbeddedBeelineConnection
org.apache.hive.beeline.TestBeeLineWithArgs.testGetVariableValue
org.apache.hive.beeline.TestBeeLineWithArgs.testHiveVarSubstitution
org.apache.hive.beeline.TestBeeLineWithArgs.testNPE
org.apache.hive.beeline.TestBeeLineWithArgs.testNegativeScriptFile
org.apache.hive.beeline.TestBeeLineWithArgs.testNullDefault
org.apache.hive.beeline.TestBeeLineWithArgs.testNullEmpty
org.apache.hive.beeline.TestBeeLineWithArgs.testNullEmptyCmdArg
org.apache.hive.beeline.TestBeeLineWithArgs.testNullNonEmpty
org.apache.hive.beeline.TestBeeLineWithArgs.testPositiveScriptFile
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressHidden
org.apache.hive.beeline.TestBeeLineWithArgs.testTSV2Output
org.apache.hive.beeline.TestBeeLineWithArgs.testTSVOutput
org.apache.hive.beeline.TestBeeLineWithArgs.testTSVOutputDeprecation
org.apache.hive.beeline.TestBeeLineWithArgs.testWhitespaceBeforeCommentScriptFile
org.apache.hive.beeline.TestBeelineArgParsing.testBeelineOpts
org.apache.hive.beeline.TestBeelineArgParsing.testDuplicateArgs
org.apache.hive.beeline.TestBeelineArgParsing.testHelp
org.apache.hive.beeline.TestBeelineArgParsing.testHiveConfAndVars
org.apache.hive.beeline.TestBeelineArgParsing.testQueryScripts
org.apache.hive.beeline.TestBeelineArgParsing.testScriptFile
org.apache.hive.beeline.TestBeelineArgParsing.testSimpleArgs
org.apache.hive.beeline.TestBeelineArgParsing.testUnmatchedArgs
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgradeDryRun
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[0]
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[1]
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[2]
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[4]
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[5]
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1630/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1630/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1630/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679178 - PreCommit-HIVE-TRUNK-Build

> Move beeline to jline2
> --
>
> Key: HIVE-8609
> URL: https://issues.apache.org/jira/browse/HIVE-8609
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Blocker
> Attachments: HIVE-8609.patch
>
>
> We found a serious bug in jline in HIVE-8565. We should move to jline2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8670) Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8670:

Attachment: HIVE-8670.1-spark.patch

Identified job statistic API and statistic builder logic.

> Combine Hive Operator statistic and Spark Metric to an uniformed query 
> statistic.[Spark Branch]
> ---
>
> Key: HIVE-8670
> URL: https://issues.apache.org/jira/browse/HIVE-8670
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8670.1-spark.patch
>
>
> For a Hive on Spark query, job statistics include Hive operator-level 
> statistics and Spark job metrics; we should combine these statistics into a 
> unified format and expose them to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27569: HIVE-8670 Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27569/
---

Review request for hive, Rui Li and Xuefu Zhang.


Bugs: HIVE-8670
https://issues.apache.org/jira/browse/HIVE-8670


Repository: hive-git


Description
---

Add the job statistic API and constructor logic; the Spark task metric 
collection logic is still missing.
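
As a rough sketch of the shape such an API could take (class and method names 
here are assumptions, not the classes in the diff below), a builder that merges 
named counters from Hive operators and Spark task metrics into groups keyed by 
source:

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a tiny statistics builder keyed by source ("HIVE", "SPARK").
class StatisticsBuilderSketch {
  private final Map<String, Map<String, Long>> groups =
      new LinkedHashMap<String, Map<String, Long>>();

  StatisticsBuilderSketch add(String source, String name, long value) {
    Map<String, Long> group = groups.get(source);
    if (group == null) {
      group = new LinkedHashMap<String, Long>();
      groups.put(source, group);
    }
    group.put(name, value);
    return this;
  }

  Map<String, Map<String, Long>> build() {
    return groups;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Long>> stats = new StatisticsBuilderSketch()
        .add("HIVE", "RECORDS_OUT_OPERATOR_FS_4", 500L)   // example operator counter
        .add("SPARK", "ExecutorRunTime", 1234L)           // example task metric
        .build();
    System.out.println(stats);
  }
}
{code}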


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 04323bb 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatistic.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatisticGroup.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatistics.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatisticsBuilder.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobStatus.java 
a450af4 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/SimpleSparkJobStatus.java
 db2eca4 

Diff: https://reviews.apache.org/r/27569/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Created] (HIVE-8726) Collect Spark TaskMetrics and build job statistic[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-8726:
---

 Summary: Collect Spark TaskMetrics and build job statistic[Spark 
Branch]
 Key: HIVE-8726
 URL: https://issues.apache.org/jira/browse/HIVE-8726
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li


Implement SparkListener to collect TaskMetrics, and build SparkStatistic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8670) Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8670:

Status: Patch Available  (was: Open)

> Combine Hive Operator statistic and Spark Metric to an uniformed query 
> statistic.[Spark Branch]
> ---
>
> Key: HIVE-8670
> URL: https://issues.apache.org/jira/browse/HIVE-8670
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8670.1-spark.patch
>
>
> For a Hive on Spark query, job statistics include Hive operator-level 
> statistics and Spark job metrics; we should combine these statistics into a 
> unified format and expose them to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8609) Move beeline to jline2

2014-11-04 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8609:
---
Status: Open  (was: Patch Available)

> Move beeline to jline2
> --
>
> Key: HIVE-8609
> URL: https://issues.apache.org/jira/browse/HIVE-8609
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Blocker
> Attachments: HIVE-8609.patch
>
>
> We found a serious bug in jline in HIVE-8565. We should move to jline2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8724) Right outer join produces incorrect result on Tez

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196099#comment-14196099
 ] 

Hive QA commented on HIVE-8724:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679200/HIVE-8724.1.patch

{color:red}ERROR:{color} -1 due to 28 failed/errored test(s), 6669 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_correctness
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_script_env_var1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_script_env_var2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_temp_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_tests
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_joins_explain
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_main
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_decimal
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_shufflejoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1631/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1631/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1631/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 28 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679200 - PreCommit-HIVE-TRUNK-Build

> Right outer join produces incorrect result on Tez
> -
>
> Key: HIVE-8724
> URL: https://issues.apache.org/jira/browse/HIVE-8724
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8724.1.patch
>
>
> Still working on repro. But when testing h14 in the cluster I've seen some 
> errors with right outer joins. Rewriting as left outer joins works fine. It 
> seems that when initializing the groups in CommonMergeJoinOperator the state 
> machine gets confused about what position in the stream we're at.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8670) Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196114#comment-14196114
 ] 

Hive QA commented on HIVE-8670:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679207/HIVE-8670.1-spark.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 7098 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.ql.io.parquet.serde.TestParquetTimestampUtils.testTimezone
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/304/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/304/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-304/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679207 - PreCommit-HIVE-SPARK-Build

> Combine Hive Operator statistic and Spark Metric to an uniformed query 
> statistic.[Spark Branch]
> ---
>
> Key: HIVE-8670
> URL: https://issues.apache.org/jira/browse/HIVE-8670
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8670.1-spark.patch
>
>
> For a Hive on Spark query, job statistics include Hive operator-level 
> statistics and Spark job metrics; we should combine these statistics into a 
> unified format and expose them to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27572: HIVE-8718: Refactoring: move mapLocalWork field from MapWork to BaseWork

2014-11-04 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27572/
---

Review request for hive, Brock Noland and Szehon Ho.


Bugs: HIVE-8718
https://issues.apache.org/jira/browse/HIVE-8718


Repository: hive-git


Description
---

See JIRA description. Pure refactoring. 
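
Illustrative only (the class shapes below are simplified assumptions, not 
Hive's actual definitions): the refactoring amounts to hoisting the 
mapLocalWork field and its accessors from MapWork into BaseWork so that other 
work types can reuse them:

{code}
// Before: the field lived on MapWork only. After: it lives on the base class.
abstract class BaseWork {
  private MapredLocalWork mapLocalWork;   // hoisted up from MapWork

  public MapredLocalWork getMapLocalWork() {
    return mapLocalWork;
  }

  public void setMapLocalWork(MapredLocalWork mapLocalWork) {
    this.mapLocalWork = mapLocalWork;
  }
}

class MapWork extends BaseWork { }     // no longer declares the field itself
class ReduceWork extends BaseWork { }  // sibling work types can now carry local work

class MapredLocalWork { }              // stand-in for the real plan class
{code}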


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 4e3df75 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java b6a7388 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 6e679b6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java b5f939b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 46dcfaf 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
 1a4fcbf 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
 8afe218 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 
984963b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java a1cc90d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java a808fc9 

Diff: https://reviews.apache.org/r/27572/diff/


Testing
---

None.


Thanks,

Xuefu Zhang



[jira] [Commented] (HIVE-8725) spark-client build failed sometimes.[Spark Branch]

2014-11-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196158#comment-14196158
 ] 

Xuefu Zhang commented on HIVE-8725:
---

I have never encountered the problem, but the change seems harmless and at 
least makes the dependency versions consistent.

+1

> spark-client build failed sometimes.[Spark Branch]
> --
>
> Key: HIVE-8725
> URL: https://issues.apache.org/jira/browse/HIVE-8725
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8725.1-spark.patch
>
>
> Sometimes the spark-client build fails due to a missing Spark dependency 
> version. I'm not sure of the root cause, but adding the dependency version 
> definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8725) spark-client build failed sometimes.[Spark Branch]

2014-11-04 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-8725:
--
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks to ChengXiang for the contribution.

> spark-client build failed sometimes.[Spark Branch]
> --
>
> Key: HIVE-8725
> URL: https://issues.apache.org/jira/browse/HIVE-8725
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Fix For: spark-branch
>
> Attachments: HIVE-8725.1-spark.patch
>
>
> Sometimes the spark-client build fails due to a missing Spark dependency 
> version. I'm not sure of the root cause, but adding the dependency version 
> definitely fixes my problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27569: HIVE-8670 Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27569/#review59768
---

Ship it!


Ship It!

- Xuefu Zhang


On Nov. 4, 2014, 12:35 p.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27569/
> ---
> 
> (Updated Nov. 4, 2014, 12:35 p.m.)
> 
> 
> Review request for hive, Rui Li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-8670
> https://issues.apache.org/jira/browse/HIVE-8670
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add the job statistic API and constructor logic; the Spark task metric 
> collection logic is still missing.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 04323bb 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatistic.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatisticGroup.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatistics.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/Statistic/SparkStatisticsBuilder.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobStatus.java 
> a450af4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/SimpleSparkJobStatus.java
>  db2eca4 
> 
> Diff: https://reviews.apache.org/r/27569/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[jira] [Commented] (HIVE-8670) Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196193#comment-14196193
 ] 

Xuefu Zhang commented on HIVE-8670:
---

+1

> Combine Hive Operator statistic and Spark Metric to an uniformed query 
> statistic.[Spark Branch]
> ---
>
> Key: HIVE-8670
> URL: https://issues.apache.org/jira/browse/HIVE-8670
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Attachments: HIVE-8670.1-spark.patch
>
>
> For a Hive on Spark query, job statistics include Hive operator-level 
> statistics and Spark job metrics; we should combine these statistics into a 
> unified format and expose them to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8670) Combine Hive Operator statistic and Spark Metric to an uniformed query statistic.[Spark Branch]

2014-11-04 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-8670:
--
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks to Chengxiang for the contribution.

> Combine Hive Operator statistic and Spark Metric to an uniformed query 
> statistic.[Spark Branch]
> ---
>
> Key: HIVE-8670
> URL: https://issues.apache.org/jira/browse/HIVE-8670
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M3
> Fix For: spark-branch
>
> Attachments: HIVE-8670.1-spark.patch
>
>
> For a Hive on Spark query, job statistics include Hive operator-level 
> statistics and Spark job metrics; we should combine these statistics into a 
> unified format and expose them to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8721) Enable transactional unit tests against other databases

2014-11-04 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8721:
-
Summary: Enable transactional unit tests against other databases  (was: 
Enable transactional unit against other databases)

> Enable transactional unit tests against other databases
> ---
>
> Key: HIVE-8721
> URL: https://issues.apache.org/jira/browse/HIVE-8721
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure, Transactions
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-8721.patch
>
>
> Since TxnHandler and subclasses use JDBC to directly connect to the 
> underlying database (rather than relying on DataNucleus) it is important to 
> test that all of the operations work against different database flavors.  An 
> easy way to do this is to enable the unit tests to run against an external 
> database.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8621) Dump small table join data for map-join [Spark Branch]

2014-11-04 Thread Suhas Satish (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suhas Satish updated HIVE-8621:
---
Assignee: (was: Suhas Satish)

> Dump small table join data for map-join [Spark Branch]
> --
>
> Key: HIVE-8621
> URL: https://issues.apache.org/jira/browse/HIVE-8621
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>
> This jira aims to re-use a slightly modified approach of map-reduce 
> distributed cache in spark to dump map-joined small tables as hash tables 
> onto spark DFS cluster. 
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616
> The original thought process was to use Spark's broadcast variable concept 
> for the small tables. The number of broadcast variables that must be created 
> is m x n, where 'm' is the number of small tables in the (m+1)-way join and 
> 'n' is the number of buckets per table (n=1 if unbucketed).
> But it was discovered that objects compressed with Kryo serialization on disk 
> can occupy 20x or more memory when deserialized. For bucketed map-join, the 
> Spark driver has to hold all the buckets in memory (to provide fault 
> tolerance against executor failures), although the executors only need 
> individual buckets in their memory. So the broadcast variable approach may 
> not be the right one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8623) Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-8623:
--
Attachment: HIVE-8623.2-spark.patch

Attached v2, which makes sure all input streams are closed at the end.

> Implement HashTableLoader for Spark map-join [Spark Branch]
> ---
>
> Key: HIVE-8623
> URL: https://issues.apache.org/jira/browse/HIVE-8623
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-8623.1-spark.patch, HIVE-8623.2-spark.patch
>
>
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hive on Spark for simple hive queries

2014-11-04 Thread Prabu Soundar Rajan -X (prabsoun - MINDTREE LIMITED at Cisco)
Hi Team,

In spite of setting the Hive execution engine to Spark, when we try simple Hive 
queries that have only a mapper phase, such as (select * from table where 
column=xyz), we observe that the jobs are not submitted to the Spark master; we 
do not see those jobs in the Spark master web UI. But when we try queries with 
a reducer phase (in MR execution terms), we see the job as a "Hive on Spark" 
application in the Spark master web UI. We would appreciate your help in 
understanding this behavior. Am I missing something obvious here?

Thanks & Regards,
Prabu



[jira] [Assigned] (HIVE-8509) UT: fix list_bucket_dml_2 test

2014-11-04 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam reassigned HIVE-8509:
--

Assignee: Chinna Rao Lalam

> UT: fix list_bucket_dml_2 test
> --
>
> Key: HIVE-8509
> URL: https://issues.apache.org/jira/browse/HIVE-8509
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Thomas Friedrich
>Assignee: Chinna Rao Lalam
>Priority: Minor
>
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: 
> StatsPublisher cannot be connected to.There was a error while connecting to 
> the StatsPublisher, and retrying might help. If you dont want the query to 
> fail because accurate statistics could not be collected, set 
> hive.stats.reliable=false
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1079)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:971)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:121)
> I debugged and found that FileSinkOperator.publishStats throws the exception 
> when calling statsPublisher.connect here:
> if (!statsPublisher.connect(hconf)) {
> // just return, stats gathering should not block the main query
> LOG.error("StatsPublishing error: cannot connect to database");
> if (isStatsReliable)
> { throw new 
> HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg()); }
> return;
> }
> With hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml, the 
> statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, the exception is thrown because getReporter() 
> returns null for the MapredContext:
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null)
> { return false; }
> When changing hive.stats.dbclass to jdbc:derby in 
> data/conf/spark/hive-site.xml, similar to TestCliDriver, it works:
> <property>
>   <name>hive.stats.dbclass</name>
>   <value>jdbc:derby</value>
>   <description>The default storage that stores temporary Hive statistics. 
>   Currently, jdbc, hbase and counter types are supported.</description>
> </property>
> In addition, I had to generate the out file for the test case for spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to 
> counter, the test case still works. The reporter is set to 
> org.apache.hadoop.mapred.Task$TaskReporter. 
> Might need some additional investigation why the CounterStatsPublisher has no 
> reporter in case of spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8509) UT: fix list_bucket_dml_2 test

2014-11-04 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8509:
---
Attachment: HIVE-8509-spark.patch

> UT: fix list_bucket_dml_2 test
> --
>
> Key: HIVE-8509
> URL: https://issues.apache.org/jira/browse/HIVE-8509
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Thomas Friedrich
>Assignee: Chinna Rao Lalam
>Priority: Minor
> Attachments: HIVE-8509-spark.patch
>
>
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: 
> StatsPublisher cannot be connected to.There was a error while connecting to 
> the StatsPublisher, and retrying might help. If you dont want the query to 
> fail because accurate statistics could not be collected, set 
> hive.stats.reliable=false
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1079)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:971)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:121)
> I debugged and found that FileSinkOperator.publishStats throws the exception 
> when calling statsPublisher.connect here:
> if (!statsPublisher.connect(hconf)) {
> // just return, stats gathering should not block the main query
> LOG.error("StatsPublishing error: cannot connect to database");
> if (isStatsReliable)
> { throw new 
> HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg()); }
> return;
> }
> With hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml, the 
> statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, the exception is thrown because getReporter() 
> returns null for the MapredContext:
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null)
> { return false; }
> When changing hive.stats.dbclass to jdbc:derby in 
> data/conf/spark/hive-site.xml, similar to TestCliDriver, it works:
> <property>
>   <name>hive.stats.dbclass</name>
>   <value>jdbc:derby</value>
>   <description>The default storage that stores temporary Hive statistics. 
>   Currently, jdbc, hbase and counter types are supported.</description>
> </property>
> In addition, I had to generate the out file for the test case for spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to 
> counter, the test case still works. The reporter is set to 
> org.apache.hadoop.mapred.Task$TaskReporter. 
> Might need some additional investigation why the CounterStatsPublisher has no 
> reporter in case of spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8509) UT: fix list_bucket_dml_2 test

2014-11-04 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8509:
---
Status: Patch Available  (was: Open)

> UT: fix list_bucket_dml_2 test
> --
>
> Key: HIVE-8509
> URL: https://issues.apache.org/jira/browse/HIVE-8509
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Thomas Friedrich
>Assignee: Chinna Rao Lalam
>Priority: Minor
> Attachments: HIVE-8509-spark.patch
>
>
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: 
> StatsPublisher cannot be connected to.There was a error while connecting to 
> the StatsPublisher, and retrying might help. If you dont want the query to 
> fail because accurate statistics could not be collected, set 
> hive.stats.reliable=false
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1079)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:971)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:121)
> I debugged and found that FileSinkOperator.publishStats throws the exception 
> when calling statsPublisher.connect here:
> if (!statsPublisher.connect(hconf)) {
> // just return, stats gathering should not block the main query
> LOG.error("StatsPublishing error: cannot connect to database");
> if (isStatsReliable)
> { throw new 
> HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg()); }
> return;
> }
> With the hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml, 
> the statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, the exception is thrown because getReporter() 
> returns null for the MapredContext:
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null)
> { return false; }
> When changing hive.stats.dbclass to jdbc:derby in 
> data/conf/spark/hive-site.xml, similar to TestCliDriver it works:
> 
> hive.stats.dbclass
> 
> jdbc:derby
> The default storage that stores temporary hive statistics. 
> Currently, jdbc, hbase and counter types are supported
> 
> In addition, I had to generate the out file for the test case for spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to 
> counter, the test case still works. The reporter is set to 
> org.apache.hadoop.mapred.Task$TaskReporter. 
> Might need some additional investigation why the CounterStatsPublisher has no 
> reporter in case of spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8703) More Windows unit test fixes

2014-11-04 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196489#comment-14196489
 ] 

Xiaobing Zhou commented on HIVE-8703:
-

Yes, test only changes.

> More Windows unit test fixes
> 
>
> Key: HIVE-8703
> URL: https://issues.apache.org/jira/browse/HIVE-8703
> Project: Hive
>  Issue Type: Bug
>  Components: Tests, Windows
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-8703.1.patch, HIVE-8703.2.patch
>
>
> - TestStorageBasedMetastoreAuthorizationReads - needs to call 
> WindowsPathUtil.convertPathsFromWindowsToHdfs()
> - TestAuthorizationApiAuthorizer - created role should have a name. This was 
> causing TestLocationQueries to fail when run together because 
> TestLocationQueries was unable to drop a role with a null name. This one 
> fails on Unix as well.
> - create_like.q, stats_noscan_2.q: system:hive.root wasn't working on 
> Windows, change test to use system:test.tmp.dir.
> - Also update the golden files for a few Windows-only .q file tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8623) Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196506#comment-14196506
 ] 

Hive QA commented on HIVE-8623:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679249/HIVE-8623.2-spark.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7098 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.ql.io.parquet.serde.TestParquetTimestampUtils.testTimezone
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/305/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/305/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-305/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679249 - PreCommit-HIVE-SPARK-Build

> Implement HashTableLoader for Spark map-join [Spark Branch]
> ---
>
> Key: HIVE-8623
> URL: https://issues.apache.org/jira/browse/HIVE-8623
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-8623.1-spark.patch, HIVE-8623.2-spark.patch
>
>
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8720) Update orc_merge tests to make it consistent across OS'es

2014-11-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196510#comment-14196510
 ] 

Sergey Shelukhin commented on HIVE-8720:


+1

> Update orc_merge tests to make it consistent across OS'es
> -
>
> Key: HIVE-8720
> URL: https://issues.apache.org/jira/browse/HIVE-8720
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-8720.1.patch, orc_merge5_filedump_macosx.txt, 
> orc_merge5_filedump_opensuse.txt
>
>
> orc_merge*.q test cases fail with qfile diffs related to file size on 
> different OSes. I have seen failures with Open SUSE and CentOS. The order of 
> insertion of rows into ORC table impacts the file size because of run length 
> encoding. Since the order of rows is not guaranteed during insertion into 
> table we may get different file sizes. We cannot add ORDER BY to insert 
> queries as it will force insertion through a single reducer, which will disable 
> orc merge file optimization. Since these test cases test if the files are 
> merged or not it is sufficient to know the number of files after merging. 
> Instead of DESCRIBE FORMATTED (which shows the numFiles and fileSize) we can 
> use "dfs -ls" to know the number of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Hive on Spark for simple hive queries

2014-11-04 Thread Xuefu Zhang
For certain simple queries like your example, Hive does some
optimization by executing them locally, which means no jobs are submitted to
the cluster (MR or Spark). I'm not sure if there is a way to turn this off,
but this is true for all execution engines.

Thanks,
Xuefu
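
If the behavior described here is Hive's fetch-task conversion (an assumption on the editor's part), it is controlled by hive.fetch.task.conversion, either via set in the CLI or programmatically. A minimal sketch of disabling it so even a simple filtering SELECT is planned as a job on the configured engine:

{code}
// Hedged sketch: disable fetch-task conversion, assuming that is the local
// optimization described above. Valid values are none/minimal/more.
import org.apache.hadoop.hive.conf.HiveConf;

public class DisableFetchTask {
  public static HiveConf build() {
    HiveConf conf = new HiveConf();
    conf.set("hive.execution.engine", "spark");
    conf.set("hive.fetch.task.conversion", "none"); // "minimal"/"more" allow local fetch
    return conf;
  }
}
{code}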

On Tue, Nov 4, 2014 at 8:36 AM, Prabu Soundar Rajan -X (prabsoun - MINDTREE
LIMITED at Cisco)  wrote:

> Hi Team,
>
> In spite of setting hive execution engine as Spark, when we try simple
> hive queries having only mapper phase like (select * from table where
> column=xyz) - we observe the jobs are not submitted to Spark master. We do
> not see those jobs in Spark master web UI. But when we try some queries
> with reducer phase(in mr execution style), we see the job as "Hive on
> Spark" application in Spark master web UI. Appreciate if you could help us
> understand this behavior.   Am I missing something obvious here?
>
> Thanks & Regards,
> Prabu
>
>


[jira] [Updated] (HIVE-8703) More Windows unit test fixes

2014-11-04 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8703:
-
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk/branch-0.14

> More Windows unit test fixes
> 
>
> Key: HIVE-8703
> URL: https://issues.apache.org/jira/browse/HIVE-8703
> Project: Hive
>  Issue Type: Bug
>  Components: Tests, Windows
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 0.14.0
>
> Attachments: HIVE-8703.1.patch, HIVE-8703.2.patch
>
>
> - TestStorageBasedMetastoreAuthorizationReads - needs to call 
> WindowsPathUtil.convertPathsFromWindowsToHdfs()
> - TestAuthorizationApiAuthorizer - created role should have a name. This was 
> causing TestLocationQueries to fail when run together because 
> TestLocationQueries was unable to drop a role with a null name. This one 
> fails on Unix as well.
> - create_like.q, stats_noscan_2.q: system:hive.root wasn't working on 
> Windows, change test to use system:test.tmp.dir.
> - Also update the golden files for a few Windows-only .q file tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8713) Unit test TestParquetTimestampUtils.testTimezone failing

2014-11-04 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196523#comment-14196523
 ] 

Jason Dere commented on HIVE-8713:
--

I've committed this to branch-0.14

> Unit test TestParquetTimestampUtils.testTimezone failing
> 
>
> Key: HIVE-8713
> URL: https://issues.apache.org/jira/browse/HIVE-8713
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Szehon Ho
> Fix For: 0.14.0
>
> Attachments: HIVE-8713.patch
>
>
> Has started failing in the precommit tests recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8713) Unit test TestParquetTimestampUtils.testTimezone failing

2014-11-04 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8713:
-
Fix Version/s: (was: 0.15.0)
   0.14.0

> Unit test TestParquetTimestampUtils.testTimezone failing
> 
>
> Key: HIVE-8713
> URL: https://issues.apache.org/jira/browse/HIVE-8713
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Szehon Ho
> Fix For: 0.14.0
>
> Attachments: HIVE-8713.patch
>
>
> Has started failing in the precommit tests recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8697) Vectorized round(decimal, negative) produces wrong results

2014-11-04 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8697:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and branch.

> Vectorized round(decimal, negative)  produces wrong results
> ---
>
> Key: HIVE-8697
> URL: https://issues.apache.org/jira/browse/HIVE-8697
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8697.02.patch
>
>
> Separated a 2nd issue -- wrong result -- found by [~thiruvel] in HIVE-8417 
> from 1st issue (IndexOutOfBoundsException due to scratch columns in 
> reduce-side).
> Interesting to note that in HIVE-8461 we converted Vectorization to use 
> HiveDecimal instead of Decimal128, yet the rounding problem still occurs...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8623) Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196594#comment-14196594
 ] 

Xuefu Zhang commented on HIVE-8623:
---

[~jxiang] Could you create a review board entry for this? Thanks.

> Implement HashTableLoader for Spark map-join [Spark Branch]
> ---
>
> Key: HIVE-8623
> URL: https://issues.apache.org/jira/browse/HIVE-8623
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-8623.1-spark.patch, HIVE-8623.2-spark.patch
>
>
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27580: HIVE-8623 Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27580/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-8623
https://issues.apache.org/jira/browse/HIVE-8623


Repository: hive-git


Description
---

Loading HashTable for Spark map-join. It's assumed that all tables share the 
same base dir. Each table has its own sub-folder. There could be several 
HashTable files for each table.
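
A minimal sketch, assuming the layout described above (a base dir with one sub-folder per small-table alias, each holding one or more serialized hash table files); this is only an illustration of what a loader would iterate over, not the patch itself:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HashTableDirWalker {
  /** Returns, per small-table alias, the hash table files a loader would read. */
  public static Map<String, List<Path>> listHashTableFiles(Configuration conf, Path baseDir)
      throws Exception {
    FileSystem fs = baseDir.getFileSystem(conf);
    Map<String, List<Path>> filesPerTable = new HashMap<String, List<Path>>();
    for (FileStatus tableDir : fs.listStatus(baseDir)) {
      if (!tableDir.isDirectory()) {
        continue; // only per-table sub-folders are expected under the base dir
      }
      List<Path> files = new ArrayList<Path>();
      for (FileStatus f : fs.listStatus(tableDir.getPath())) {
        if (f.isFile()) {
          files.add(f.getPath()); // several hash table files per table are possible
        }
      }
      filesPerTable.put(tableDir.getPath().getName(), files);
    }
    return filesPerTable;
  }
}
{code}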


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/HashTableLoaderFactory.java 10ad933 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 da36848 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/27580/diff/


Testing
---


Thanks,

Jimmy Xiang



Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-11-04 Thread Sergey Shelukhin


> On Oct. 31, 2014, 12:42 a.m., Ashutosh Chauhan wrote:
> > data/scripts/q_test_init.sql, line 305
> > 
> >
> > Any reason for this? We have hive.stats.dbclass = fs at the top of 
> > file. That should be sufficient for this.

this is copied from CBO correctness test; I can remove it


> On Oct. 31, 2014, 12:42 a.m., Ashutosh Chauhan wrote:
> > data/scripts/q_test_init.sql, line 306
> > 
> >
> > you can just say analyze table cbo_t1 compute statistics;

again c/p from existing test


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/#review59288
---


On Oct. 30, 2014, 11:18 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27401/
> ---
> 
> (Updated Oct. 30, 2014, 11:18 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA
> 
> 
> Diffs
> -
> 
>   data/scripts/q_test_cleanup.sql 8ec0f9f 
>   data/scripts/q_test_init.sql 7484f0c 
>   itests/src/test/resources/testconfiguration.properties 2c84a36 
>   pom.xml bd74830 
>   ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
>   ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
>   ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
>   ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27401/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Commented] (HIVE-8623) Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196612#comment-14196612
 ] 

Jimmy Xiang commented on HIVE-8623:
---

Sure. Patch v2 is posted on RB: https://reviews.apache.org/r/27580/.

> Implement HashTableLoader for Spark map-join [Spark Branch]
> ---
>
> Key: HIVE-8623
> URL: https://issues.apache.org/jira/browse/HIVE-8623
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-8623.1-spark.patch, HIVE-8623.2-spark.patch
>
>
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8621) Dump small table join data for map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-8621:
-

Assignee: Jimmy Xiang

> Dump small table join data for map-join [Spark Branch]
> --
>
> Key: HIVE-8621
> URL: https://issues.apache.org/jira/browse/HIVE-8621
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>Assignee: Jimmy Xiang
>
> This JIRA aims to re-use a slightly modified version of the map-reduce 
> distributed-cache approach in Spark, dumping map-joined small tables as hash 
> tables onto the Spark cluster's DFS. 
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616
> The original thought process was to use broadcast variable concept in spark, 
> for the small tables. 
> The number of broadcast variables that must be created is m x n where
> 'm' is  the number of small tables in the (m+1) way join and n is the number 
> of buckets of tables. If unbucketed, n=1
> But it was discovered that objects compressed with Kryo serialization on 
> disk can occupy 20X or more memory when deserialized. For bucket join, 
> the Spark driver has to hold all the buckets (for bucketed tables) in memory 
> (to provide fault tolerance against executor failures), although the 
> executors only need individual buckets in their memory. So the broadcast 
> variable approach may not be the right approach. 
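
For illustration only (not the approach the JIRA settles on), a small sketch of the broadcast-variable idea weighed above; table contents, names and the local master are made up for the sketch:

{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

public class BroadcastJoinSketch {
  public static void main(String[] args) {
    // Local master only so the sketch runs standalone.
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("broadcast-sketch").setMaster("local[2]"));

    // Small table loaded on the driver; broadcasting ships one deserialized copy per executor.
    Map<String, String> smallTable = new HashMap<String, String>();
    smallTable.put("1", "one");
    smallTable.put("2", "two");
    final Broadcast<Map<String, String>> smallSide = sc.broadcast(smallTable);

    // Big-table rows look up the broadcast map instead of shuffling for the join.
    sc.parallelize(Arrays.asList("1", "2", "3"))
      .map(key -> key + " -> " + smallSide.value().get(key))
      .collect()
      .forEach(System.out::println);

    sc.stop();
  }
}
{code}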



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8720) Update orc_merge tests to make it consistent across OS'es

2014-11-04 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196623#comment-14196623
 ] 

Prasanth J commented on HIVE-8720:
--

[~hagleitn] Can we have this for 0.14? These are just test file diffs to make 
the qfile results consistent across platforms.

> Update orc_merge tests to make it consistent across OS'es
> -
>
> Key: HIVE-8720
> URL: https://issues.apache.org/jira/browse/HIVE-8720
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-8720.1.patch, orc_merge5_filedump_macosx.txt, 
> orc_merge5_filedump_opensuse.txt
>
>
> orc_merge*.q test cases fail with qfile diffs related to file size on 
> different OSes. I have seen failures with Open SUSE and CentOS. The order of 
> insertion of rows into ORC table impacts the file size because of run length 
> encoding. Since the order of rows is not guaranteed during insertion into 
> table we may get different file sizes. We cannot add ORDER BY to insert 
> queries as it will force insertion through a single reducer, which will disable 
> orc merge file optimization. Since these test cases test if the files are 
> merged or not it is sufficient to know the number of files after merging. 
> Instead of DESCRIBE FORMATTED (which shows the numFiles and fileSize) we can 
> use "dfs -ls" to know the number of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-11-04 Thread Sergey Shelukhin


> On Oct. 31, 2014, 1:01 a.m., John Pullokkaran wrote:
> > itests/src/test/resources/testconfiguration.properties, line 65
> > 
> >
> > Why don't we put all CBO subqueries in to a single test?

seems it will be too big


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/#review59290
---


On Oct. 30, 2014, 11:18 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27401/
> ---
> 
> (Updated Oct. 30, 2014, 11:18 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA
> 
> 
> Diffs
> -
> 
>   data/scripts/q_test_cleanup.sql 8ec0f9f 
>   data/scripts/q_test_init.sql 7484f0c 
>   itests/src/test/resources/testconfiguration.properties 2c84a36 
>   pom.xml bd74830 
>   ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
>   ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
>   ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
>   ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27401/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Commented] (HIVE-8705) Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode

2014-11-04 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196629#comment-14196629
 ] 

Vaibhav Gumashta commented on HIVE-8705:


Test failures are unrelated. 

[~hagleitn] it will be good to have it in 14.

> Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode 
> 
>
> Key: HIVE-8705
> URL: https://issues.apache.org/jira/browse/HIVE-8705
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2, JDBC
>Affects Versions: 0.13.0, 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.15.0
>
> Attachments: HIVE-8705.1.patch, HIVE-8705.1.patch, 
> TestPreAuthenticatedKerberosSubject.java
>
>
> HIVE-6486 provided a patch to utilize pre-authenticated subject (someone who 
> has programmatically done a JAAS login and is not doing kinit before 
> connecting to HiveServer2 using the JDBC driver). However, that feature was 
> only for the binary mode code path. We need to have a similar feature when 
> the driver and server communicate using HTTP transport. 
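
For context, a hedged sketch of how the binary-mode path from HIVE-6486 is typically used: the application performs its own JAAS login and opens the JDBC connection inside Subject.doAs. The JAAS configuration name "HiveClient" and the exact JDBC URL parameters are assumptions made for the sketch:

{code}
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;

import javax.security.auth.Subject;
import javax.security.auth.login.LoginContext;

public class PreAuthSubjectJdbc {
  public static Connection connect() throws Exception {
    // Programmatic JAAS login instead of an external kinit.
    LoginContext login = new LoginContext("HiveClient");
    login.login();
    Subject subject = login.getSubject();

    // Assumed URL: the driver picks up Kerberos credentials from the current subject.
    final String url = "jdbc:hive2://hs2host:10000/default;"
        + "principal=hive/_HOST@EXAMPLE.COM;"
        + "auth=kerberos;kerberosAuthType=fromSubject";

    return Subject.doAs(subject, new PrivilegedExceptionAction<Connection>() {
      @Override
      public Connection run() throws Exception {
        return DriverManager.getConnection(url);
      }
    });
  }
}
{code}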



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Mostafa Mokhtar (JIRA)
Mostafa Mokhtar created HIVE-8727:
-

 Summary: Dag summary has incorrect row counts and duration per 
vertex
 Key: HIVE-8727
 URL: https://issues.apache.org/jira/browse/HIVE-8727
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth J
 Fix For: 0.14.0


During the code review for HIVE-8495 some code was reworked which broke some of 
INPUT/OUTPUT counters and duration.

Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-8727:
--
Attachment: HIVE-8727.1.patch

> Dag summary has incorrect row counts and duration per vertex
> 
>
> Key: HIVE-8727
> URL: https://issues.apache.org/jira/browse/HIVE-8727
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
> Fix For: 0.14.0
>
> Attachments: HIVE-8727.1.patch
>
>
> During the code review for HIVE-8495 some code was reworked which broke some 
> of INPUT/OUTPUT counters and duration.
> Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8509) UT: fix list_bucket_dml_2 test

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196643#comment-14196643
 ] 

Hive QA commented on HIVE-8509:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679259/HIVE-8509-spark.patch

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 7099 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.ql.io.parquet.serde.TestParquetTimestampUtils.testTimezone
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/306/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/306/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-306/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679259 - PreCommit-HIVE-SPARK-Build

> UT: fix list_bucket_dml_2 test
> --
>
> Key: HIVE-8509
> URL: https://issues.apache.org/jira/browse/HIVE-8509
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Thomas Friedrich
>Assignee: Chinna Rao Lalam
>Priority: Minor
> Attachments: HIVE-8509-spark.patch
>
>
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: 
> StatsPublisher cannot be connected to.There was a error while connecting to 
> the StatsPublisher, and retrying might help. If you dont want the query to 
> fail because accurate statistics could not be collected, set 
> hive.stats.reliable=false
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1079)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:971)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:121)
> I debugged and found that FileSinkOperator.publishStats throws the exception 
> when calling statsPublisher.connect here:
> if (!statsPublisher.connect(hconf)) {
> // just return, stats gathering should not block the main query
> LOG.error("StatsPublishing error: cannot connect to database");
> if (isStatsReliable)
> { throw new 
> HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg()); }
> return;
> }
> With the hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml, 
> the statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, the exception is thrown because getReporter() 
> returns null for the MapredContext:
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null)
> { return false; }
> When changing hive.stats.dbclass to jdbc:derby in 
> data/conf/spark/hive-site.xml, similar to TestCliDriver it works:
> 
> hive.stats.dbclass
> 
> jdbc:derby
> The default storage that stores temporary hive statistics. 
> Currently, jdbc, hbase and counter types are supported
> 
> In addition, I had to generate the out file for the test case for spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to 
> counter, the test case still works. The reporter is set to 
> org.apache.hadoop.mapred.Task$TaskReporter. 
> Might need some additional investigation why the CounterStatsPublisher has no 
> reporter in case of spark.

[jira] [Updated] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-8727:
-
Status: Patch Available  (was: Open)

> Dag summary has incorrect row counts and duration per vertex
> 
>
> Key: HIVE-8727
> URL: https://issues.apache.org/jira/browse/HIVE-8727
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
> Fix For: 0.14.0
>
> Attachments: HIVE-8727.1.patch
>
>
> During the code review for HIVE-8495 some code was reworked which broke some 
> of INPUT/OUTPUT counters and duration.
> Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196665#comment-14196665
 ] 

Prasanth J commented on HIVE-8727:
--

+1

> Dag summary has incorrect row counts and duration per vertex
> 
>
> Key: HIVE-8727
> URL: https://issues.apache.org/jira/browse/HIVE-8727
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
> Fix For: 0.14.0
>
> Attachments: HIVE-8727.1.patch
>
>
> During the code review for HIVE-8495 some code was reworked which broke some 
> of INPUT/OUTPUT counters and duration.
> Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27565: Set reasonable connection timeout for CuratorFramework ZooKeeper clients in Hive

2014-11-04 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27565/
---

(Updated Nov. 4, 2014, 7:40 p.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-8723
https://issues.apache.org/jira/browse/HIVE-8723


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-8723


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java
 26d4d97 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 3ed933a 
  jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java d515ce5 
  service/src/java/org/apache/hive/service/server/HiveServer2.java b814e4b 
  
shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/ZooKeeperTokenStore.java
 16a52e4 

Diff: https://reviews.apache.org/r/27565/diff/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Updated] (HIVE-8723) Set reasonable connection timeout for CuratorFramework ZooKeeper clients in Hive

2014-11-04 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8723:
---
Attachment: HIVE-8723.2.patch

> Set reasonable connection timeout for CuratorFramework ZooKeeper clients in 
> Hive
> 
>
> Key: HIVE-8723
> URL: https://issues.apache.org/jira/browse/HIVE-8723
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-8723.1.patch, HIVE-8723.2.patch
>
>
> Currently we use -1, due to which "any" elapsed time is always greater than 
> any timeout value resulting in an unnecessary connection loss exception. 
> Relevant code from curator framework:
> {code}
>  private synchronized void checkTimeouts() throws Exception
> {
> int minTimeout = Math.min(sessionTimeoutMs, connectionTimeoutMs);
> long elapsed = System.currentTimeMillis() - connectionStartMs;
> if ( elapsed >= minTimeout )
> {
> if ( zooKeeper.hasNewConnectionString() )
> {
> handleNewConnectionString();
> }
> else
> {
> int maxTimeout = Math.max(sessionTimeoutMs, 
> connectionTimeoutMs);
> if ( elapsed > maxTimeout )
> {
> if ( 
> !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
> {
> log.warn(String.format("Connection attempt 
> unsuccessful after %d (greater than max timeout of %d). Resetting connection 
> and trying again with a new connection.", elapsed, maxTimeout));
> }
> reset();
> }
> else
> {
> KeeperException.ConnectionLossException 
> connectionLossException = new CuratorConnectionLossException();
> if ( 
> !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
> {
> log.error(String.format("Connection timed out for 
> connection string (%s) and timeout (%d) / elapsed (%d)", 
> zooKeeper.getConnectionString(), connectionTimeoutMs, elapsed), 
> connectionLossException);
> }
> tracer.get().addCount("connections-timed-out", 1);
> throw connectionLossException;
> }
> }
> }
> }
> {code}
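
A minimal sketch, assuming CuratorFrameworkFactory's builder is used, of what finite timeouts look like on the client; the concrete values are illustrative, not necessarily the ones the patch picks:

{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkClientFactory {
  public static CuratorFramework newClient(String zkEnsemble) {
    CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString(zkEnsemble)
        .sessionTimeoutMs(600000)     // finite session timeout
        .connectionTimeoutMs(15000)   // finite connection timeout, so the
                                      // elapsed >= minTimeout check no longer
                                      // fires on every call
        .retryPolicy(new ExponentialBackoffRetry(1000, 3))
        .build();
    client.start();
    return client;
  }
}
{code}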



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8723) Set reasonable connection timeout for CuratorFramework ZooKeeper clients in Hive

2014-11-04 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196679#comment-14196679
 ] 

Vaibhav Gumashta commented on HIVE-8723:


[~hagleitn] Bug fix which doesn't make any functional change, but avoids 
unnecessary error logging. Will be good to have in 14. 

> Set reasonable connection timeout for CuratorFramework ZooKeeper clients in 
> Hive
> 
>
> Key: HIVE-8723
> URL: https://issues.apache.org/jira/browse/HIVE-8723
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-8723.1.patch, HIVE-8723.2.patch
>
>
> Currently we use -1, due to which "any" elapsed time is always greater than 
> any timeout value resulting in an unnecessary connection loss exception. 
> Relevant code from curator framework:
> {code}
>  private synchronized void checkTimeouts() throws Exception
> {
> int minTimeout = Math.min(sessionTimeoutMs, connectionTimeoutMs);
> long elapsed = System.currentTimeMillis() - connectionStartMs;
> if ( elapsed >= minTimeout )
> {
> if ( zooKeeper.hasNewConnectionString() )
> {
> handleNewConnectionString();
> }
> else
> {
> int maxTimeout = Math.max(sessionTimeoutMs, 
> connectionTimeoutMs);
> if ( elapsed > maxTimeout )
> {
> if ( 
> !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
> {
> log.warn(String.format("Connection attempt 
> unsuccessful after %d (greater than max timeout of %d). Resetting connection 
> and trying again with a new connection.", elapsed, maxTimeout));
> }
> reset();
> }
> else
> {
> KeeperException.ConnectionLossException 
> connectionLossException = new CuratorConnectionLossException();
> if ( 
> !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
> {
> log.error(String.format("Connection timed out for 
> connection string (%s) and timeout (%d) / elapsed (%d)", 
> zooKeeper.getConnectionString(), connectionTimeoutMs, elapsed), 
> connectionLossException);
> }
> tracer.get().addCount("connections-timed-out", 1);
> throw connectionLossException;
> }
> }
> }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8395) CBO: enable by default

2014-11-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8395:
---
Attachment: HIVE-8395.22.patch

rinse, repeat.
[~ashutoshc]  [~rhbutani] [~jpullokkaran] [~hagleitn] do you guys want to start 
reviewing? Perhaps split the diffs into equal chunks by first letter.

This patch is hard to keep up to date against bit rot.

Probably the best way to review is to apply the patch locally and use a diff tool like 
diffmerge, meld or opendiff (which can handle NULL characters). I am trying to upload 
to RB, but the new fancy JavaScript upload widget seems to be stuck on this patch.

> CBO: enable by default
> --
>
> Key: HIVE-8395
> URL: https://issues.apache.org/jira/browse/HIVE-8395
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, 
> HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, 
> HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, 
> HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, 
> HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, 
> HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, 
> HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, 
> HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, 
> HIVE-8395.21.patch, HIVE-8395.22.patch, HIVE-8395.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8705) Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode

2014-11-04 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196692#comment-14196692
 ] 

Vaibhav Gumashta commented on HIVE-8705:


Patch committed to trunk. Thanks for reviewing [~thejas]. Will wait for 14 
approval before closing this one.

> Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode 
> 
>
> Key: HIVE-8705
> URL: https://issues.apache.org/jira/browse/HIVE-8705
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2, JDBC
>Affects Versions: 0.13.0, 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.15.0
>
> Attachments: HIVE-8705.1.patch, HIVE-8705.1.patch, 
> TestPreAuthenticatedKerberosSubject.java
>
>
> HIVE-6486 provided a patch to utilize pre-authenticated subject (someone who 
> has programmatically done a JAAS login and is not doing kinit before 
> connecting to HiveServer2 using the JDBC driver). However, that feature was 
> only for the binary mode code path. We need to have a similar feature when 
> the driver and server communicate using HTTP transport. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196697#comment-14196697
 ] 

Prasanth J commented on HIVE-8727:
--

[~hagleitn] HIVE-8495 broke output of dag summary. Can we have this for 0.14?

> Dag summary has incorrect row counts and duration per vertex
> 
>
> Key: HIVE-8727
> URL: https://issues.apache.org/jira/browse/HIVE-8727
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
> Fix For: 0.14.0
>
> Attachments: HIVE-8727.1.patch
>
>
> During the code review for HIVE-8495 some code was reworked which broke some 
> of INPUT/OUTPUT counters and duration.
> Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8705) Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode

2014-11-04 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8705:
---
Issue Type: Bug  (was: New Feature)

> Support using pre-authenticated subject in kerberized HiveServer2 HTTP mode 
> 
>
> Key: HIVE-8705
> URL: https://issues.apache.org/jira/browse/HIVE-8705
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.13.0, 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.15.0
>
> Attachments: HIVE-8705.1.patch, HIVE-8705.1.patch, 
> TestPreAuthenticatedKerberosSubject.java
>
>
> HIVE-6486 provided a patch to utilize pre-authenticated subject (someone who 
> has programmatically done a JAAS login and is not doing kinit before 
> connecting to HiveServer2 using the JDBC driver). However, that feature was 
> only for the binary mode code path. We need to have a similar feature when 
> the driver and server communicate using HTTP transport. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8699) Enable support for common map join [Spark Branch]

2014-11-04 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-8699:
---

Assignee: Szehon Ho

> Enable support for common map join [Spark Branch]
> -
>
> Key: HIVE-8699
> URL: https://issues.apache.org/jira/browse/HIVE-8699
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
>
> This JIRA is to track issues related to common map-join support in Spark, 
> including logical and physical optimizations. HIVE-8616 provided initial 
> processing, mainly represented by SparkMapJoinOptimizer. We need to continue 
> the work to make map join work from end to end, including enhancement needed 
> for SparkMapJoinOptimizer and subsequent physical optimization 
> SparkMapJoinResolver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8701) Combine nested map joins into the parent map join if possible [Spark Branch]

2014-11-04 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-8701:
---

Assignee: Szehon Ho

> Combine nested map joins into the parent map join if possible [Spark Branch]
> 
>
> Key: HIVE-8701
> URL: https://issues.apache.org/jira/browse/HIVE-8701
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
>
> With the work in HIVE-8616 enabled, the generated plan shows that the nested 
> map join operator isn't merged to its parent when possible. This is 
> demonstrated in auto_join2.q. The MR plan shown that this optimization is in 
> place. We should do the same for Spark.
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Spark
>   Edges:
> Map 2 <- Map 3 (NONE, 0)
> Map 3 <- Map 1 (NONE, 0)
>   DagName: xzhang_20141102074141_ac089634-bf01-4386-b1cf-3e7f2e99f6eb:3
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: src2
>   Statistics: Num rows: 58 Data size: 5812 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 29 Data size: 2906 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: key (type: string)
>   sort order: +
>   Map-reduce partition columns: key (type: string)
>   Statistics: Num rows: 29 Data size: 2906 Basic stats: 
> COMPLETE Column stats: NONE
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: src3
>   Statistics: Num rows: 29 Data size: 5812 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: UDFToDouble(key) is not null (type: boolean)
> Statistics: Num rows: 15 Data size: 3006 Basic stats: 
> COMPLETE Column stats: NONE
> Map Join Operator
>   condition map:
>Inner Join 0 to 1
>   condition expressions:
> 0 {_col0}
> 1 {value}
>   keys:
> 0 (_col0 + _col5) (type: double)
> 1 UDFToDouble(key) (type: double)
>   outputColumnNames: _col0, _col11
>   input vertices:
> 0 Map 3
>   Statistics: Num rows: 17 Data size: 1813 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: _col0 (type: string), _col11 (type: 
> string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 17 Data size: 1813 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 17 Data size: 1813 Basic 
> stats: COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Map 3 
> Map Operator Tree:
> TableScan
>   alias: src1
>   Statistics: Num rows: 58 Data size: 5812 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 29 Data size: 2906 Basic stats: 
> COMPLETE Column stats: NONE
> Map Join Operator
>   condition map:
>Inner Join 0 to 1
>   condition expressions:
> 0 {key}
> 1 {key}
>   keys:
> 0 key (type: string)
> 1 key (type: string)
>   outputColumnNames: _col0, _col5
>   input vertices:
> 1 Map 1
>   Statistics: Num rows: 31 Data size: 3196 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (_col0 + _col5) is not null (type: boolean)
> Statistics: Num rows: 16 Data 

[jira] [Assigned] (HIVE-8699) Enable support for common map join [Spark Branch]

2014-11-04 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-8699:
---

Assignee: (was: Szehon Ho)

> Enable support for common map join [Spark Branch]
> -
>
> Key: HIVE-8699
> URL: https://issues.apache.org/jira/browse/HIVE-8699
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>
> This JIRA is to track issues related to common map-join support in Spark, 
> including logical and physical optimizations. HIVE-8616 provided initial 
> processing, mainly represented by SparkMapJoinOptimizer. We need to continue 
> the work to make map join work from end to end, including enhancement needed 
> for SparkMapJoinOptimizer and subsequent physical optimization 
> SparkMapJoinResolver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8718) Refactoring: move mapLocalWork field from MapWork to BaseWork

2014-11-04 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196724#comment-14196724
 ] 

Szehon Ho commented on HIVE-8718:
-

+1 looks good to me

> Refactoring: move mapLocalWork field from MapWork to BaseWork
> -
>
> Key: HIVE-8718
> URL: https://issues.apache.org/jira/browse/HIVE-8718
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.15.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Minor
> Attachments: HIVE-8718.patch
>
>
> Currently the MapRedLocalWork instance is a field on MapWork. This is MR-specific: 
> the small-table input to a map join always comes from a MapWork, which isn't 
> true for other execution engines, where work doesn't need to be chopped at the RS.
> The refactoring moves the MapRedLocalWork to its parent class, BaseWork. 
> Purely refactoring. No behavior change is expected. Test output might need to 
> be updated though.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27580: HIVE-8623 Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27580/#review59823
---



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java


Nit: There is potential fd leak in case fs.open is successful while new 
ObjectInputStream() fails.
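
A minimal sketch of the pattern the nit asks for: close the raw stream if wrapping it in an ObjectInputStream fails. The names are illustrative, not the actual MapJoinTableContainerSerDe code:

{code}
import java.io.IOException;
import java.io.ObjectInputStream;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class SafeOpen {
  private SafeOpen() {}

  /** Wraps an HDFS file in an ObjectInputStream without leaking the fd if wrapping fails. */
  public static ObjectInputStream open(FileSystem fs, Path path) throws IOException {
    FSDataInputStream rawIn = fs.open(path);
    try {
      return new ObjectInputStream(rawIn);
    } catch (IOException e) {
      rawIn.close(); // ObjectInputStream construction failed; release the descriptor
      throw e;
    }
  }
}
{code}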


- Xuefu Zhang


On Nov. 4, 2014, 6:58 p.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27580/
> ---
> 
> (Updated Nov. 4, 2014, 6:58 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-8623
> https://issues.apache.org/jira/browse/HIVE-8623
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Loading HashTable for Spark map-join. It's assumed that all tables share the 
> same base dir. Each table has its own sub-folder. There could be several 
> HashTable files for each table.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/HashTableLoaderFactory.java 10ad933 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  da36848 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27580/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Commented] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196775#comment-14196775
 ] 

Gunther Hagleitner commented on HIVE-8727:
--

+1 for 0.14

> Dag summary has incorrect row counts and duration per vertex
> 
>
> Key: HIVE-8727
> URL: https://issues.apache.org/jira/browse/HIVE-8727
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
> Fix For: 0.14.0
>
> Attachments: HIVE-8727.1.patch
>
>
> During the code review for HIVE-8495 some code was reworked which broke some 
> of INPUT/OUTPUT counters and duration.
> Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8728) Fix ptf.q determinism

2014-11-04 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HIVE-8728:
-

 Summary: Fix ptf.q determinism
 Key: HIVE-8728
 URL: https://issues.apache.org/jira/browse/HIVE-8728
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor


ptf.q only orders by one column, so the output row order is not fully 
deterministic. We need to add SORT_QUERY_RESULTS for determinism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8728) Fix ptf.q determinism

2014-11-04 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-8728:
--
Status: Patch Available  (was: Open)

> Fix ptf.q determinism
> -
>
> Key: HIVE-8728
> URL: https://issues.apache.org/jira/browse/HIVE-8728
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: HIVE-8728.1.patch
>
>
> ptf.q only orders by one column, so the output row order is not fully 
> deterministic. We need to add SORT_QUERY_RESULTS for determinism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8728) Fix ptf.q determinism

2014-11-04 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-8728:
--
Attachment: HIVE-8728.1.patch

> Fix ptf.q determinism
> -
>
> Key: HIVE-8728
> URL: https://issues.apache.org/jira/browse/HIVE-8728
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: HIVE-8728.1.patch
>
>
> ptf.q only orders by one column, so the output row order is not fully 
> deterministic. We need to add SORT_QUERY_RESULTS for determinism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8728) Fix ptf.q determinism

2014-11-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196788#comment-14196788
 ] 

Xuefu Zhang commented on HIVE-8728:
---

+1 pending on test result.

> Fix ptf.q determinism
> -
>
> Key: HIVE-8728
> URL: https://issues.apache.org/jira/browse/HIVE-8728
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: HIVE-8728.1.patch
>
>
> ptf.q only orders by one column, so the output row order is not fully 
> deterministic. We need to add SORT_QUERY_RESULTS for determinism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8719) LoadSemanticAnalyzer ignores previous partition location if inserting into partition that already exists

2014-11-04 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-8719:
---
Status: Patch Available  (was: Open)

> LoadSemanticAnalyzer ignores previous partition location if inserting into 
> partition that already exists
> 
>
> Key: HIVE-8719
> URL: https://issues.apache.org/jira/browse/HIVE-8719
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-8719.patch
>
>
> LOAD DATA ... INTO TABLE currently seems to be broken for partitions that do 
> not use Hive's native directory-naming scheme: it ignores any location 
> previously set by an ALTER TABLE ... ADD PARTITION ... LOCATION ... 
> command.
> Here is a simple reproducer:
> {noformat}
> echo 1 > /tmp/data1.txt
> hive -e "create external table testpart(id int) partitioned by (date string) 
> location '/tmp/testpart';"
> hive -e "alter table testpart add partition(date='2014-09-16')  location 
> '/tmp/testpart/20140916';"
> hive -e "describe formatted testpart partition(date='2014-09-16') ;" | egrep 
> '/tmp/testpart/(date=.?)?2014-?09-?16' > /tmp/a
> hive -e "load data local inpath '/tmp/data1.txt' into table testpart 
> partition(date='2014-09-16');"
> hive -e "describe formatted testpart partition(date='2014-09-16') ;" | egrep 
> '/tmp/testpart/(date=.?)?2014-?09-?16' > /tmp/b
> diff /tmp/a /tmp/b
> hadoop fs -ls /tmp/testpart/
> {noformat}
> Basically, what happens is that after the ALTER TABLE ADD PARTITION ... 
> LOCATION, the location is "/tmp/testpart/20140916". After the LOAD DATA has 
> run, the partition location becomes "/tmp/testpart/date=2014-09-16/". Any 
> data previously present in the other location will then be ignored as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8729) VectorizedRowBatch#selected is not restored upon exception in VectorFilterOperator#processOp()

2014-11-04 Thread Ted Yu (JIRA)
Ted Yu created HIVE-8729:


 Summary: VectorizedRowBatch#selected is not restored upon 
exception in VectorFilterOperator#processOp()
 Key: HIVE-8729
 URL: https://issues.apache.org/jira/browse/HIVE-8729
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


Here is related code:
{code}
int [] selectedBackup = vrg.selected;
vrg.selected = temporarySelected;
...
if (vrg.size > 0) {
  forward(vrg, null);
}
{code}
If forward() throws exception, selected is not restored.
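
One way the restore could be made exception-safe, sketched against the snippet above (this is not the actual fix), is to wrap the forwarding in try/finally:

{code}
int[] selectedBackup = vrg.selected;
vrg.selected = temporarySelected;
try {
  // ... evaluate the filter, populating temporarySelected and vrg.size ...
  if (vrg.size > 0) {
    forward(vrg, null);
  }
} finally {
  // restored whether forward() returns normally or throws
  vrg.selected = selectedBackup;
}
{code}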



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27580: HIVE-8623 Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27580/
---

(Updated Nov. 4, 2014, 9:02 p.m.)


Review request for hive and Xuefu Zhang.


Changes
---

Addressed the fd leak issue Xuefu pointed out.


Bugs: HIVE-8623
https://issues.apache.org/jira/browse/HIVE-8623


Repository: hive-git


Description
---

Loading HashTable for Spark map-join. It's assumed that all tables share the 
same base dir. Each table has its own sub-folder. There could be several 
HashTable files for each table.
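
To illustrate the assumed layout (one sub-folder per small table under a shared base dir, with one or more HashTable files per sub-folder), here is a hypothetical walk of that directory tree using the Hadoop FileSystem API; it is not taken from the patch:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class HashTableLayoutSketch {
  // Lists every HashTable file under the assumed layout:
  //   <baseDir>/<table sub-folder>/<hashtable file(s)>
  static void listHashTableFiles(Configuration conf, Path baseDir) throws Exception {
    FileSystem fs = baseDir.getFileSystem(conf);
    for (FileStatus tableDir : fs.listStatus(baseDir)) {
      if (!tableDir.isDirectory()) {
        continue;                       // only sub-folders represent tables
      }
      for (FileStatus file : fs.listStatus(tableDir.getPath())) {
        System.out.println(tableDir.getPath().getName() + " -> " + file.getPath());
      }
    }
  }
}
{code}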


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/HashTableLoaderFactory.java 10ad933 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 da36848 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/27580/diff/


Testing
---


Thanks,

Jimmy Xiang



Re: Review Request 27580: HIVE-8623 Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang


> On Nov. 4, 2014, 8:38 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java,
> >  line 123
> > 
> >
> > Nit: There is a potential fd leak if fs.open() succeeds but the new 
> > ObjectInputStream() constructor fails.

Good catch. Fixed in v3.


- Jimmy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27580/#review59823
---


On Nov. 4, 2014, 9:02 p.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27580/
> ---
> 
> (Updated Nov. 4, 2014, 9:02 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-8623
> https://issues.apache.org/jira/browse/HIVE-8623
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Loading HashTable for Spark map-join. It's assumed that all tables share the 
> same base dir. Each table has its own sub-folder. There could be several 
> HashTable files for each table.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/HashTableLoaderFactory.java 10ad933 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  da36848 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27580/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Updated] (HIVE-8623) Implement HashTableLoader for Spark map-join [Spark Branch]

2014-11-04 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-8623:
--
Attachment: HIVE-8623.3-spark.patch

Attached v3 that addressed Xuefu's review comments. Thanks.

> Implement HashTableLoader for Spark map-join [Spark Branch]
> ---
>
> Key: HIVE-8623
> URL: https://issues.apache.org/jira/browse/HIVE-8623
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Suhas Satish
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-8623.1-spark.patch, HIVE-8623.2-spark.patch, 
> HIVE-8623.3-spark.patch
>
>
> This is a sub-task of map-join for spark 
> https://issues.apache.org/jira/browse/HIVE-7613
> This can use the baseline patch for map-join
> https://issues.apache.org/jira/browse/HIVE-8616



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8636) CBO: split cbo_correctness test

2014-11-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8636:
---
Attachment: HIVE-8636.01.patch

Update the tests that do "show tables", add a reset so cbo is not left enabled 
for other tests(?), and address CR feedback. I'll see what else fails.

> CBO: split cbo_correctness test
> ---
>
> Key: HIVE-8636
> URL: https://issues.apache.org/jira/browse/HIVE-8636
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8636.01.patch, HIVE-8636.patch
>
>
> The CBO correctness test is extremely unwieldy: it runs for a very long time, 
> and when anything fails it is hard to debug because of the volume of logs 
> from everything else. It also stops at the first failure, so if multiple 
> things fail they can only be discovered one by one. In addition, 
> SORT_QUERY_RESULTS cannot be used, because some of the queries presumably 
> rely on sorted output.
> It should be split into separate tests; the section numbers in there now may 
> be good boundaries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2014-11-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-7150:
-
Labels: jdbc  (was: )

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>  Labels: jdbc
> Fix For: 0.15.0
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.
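
A minimal sketch of the fix pattern, assuming the same trust-store path and password variables as the snippet above; try-with-resources (Java 7+) closes the stream even when KeyStore.load() throws:

{code}
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;

class TrustStoreSketch {
  static KeyStore loadTrustStore(String sslTrustStorePath, String sslTrustStorePassword)
      throws Exception {
    KeyStore sslTrustStore = KeyStore.getInstance(KeyStore.getDefaultType());
    // the stream is closed on both the normal and the exceptional path
    try (InputStream in = new FileInputStream(sslTrustStorePath)) {
      sslTrustStore.load(in, sslTrustStorePassword.toCharArray());
    }
    return sslTrustStore;
  }
}
{code}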



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8730) schemaTool failure when date partition has non-date value

2014-11-04 Thread Johndee Burks (JIRA)
Johndee Burks created HIVE-8730:
---

 Summary: schemaTool failure when date partition has non-date value
 Key: HIVE-8730
 URL: https://issues.apache.org/jira/browse/HIVE-8730
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
 Environment: CDH5.2
Reporter: Johndee Burks
Priority: Minor


If there is a non-date value in the PART_KEY_VAL column of the 
PARTITION_KEY_VALS table in the metastore db, the HIVE-5700 script will fail. 
The failure is then picked up by the schemaTool, causing the upgrade to fail. A 
classic example of a value that can be present without users really being aware 
of it is __HIVE_DEFAULT_PARTITION__, which Hive fills in automatically during 
dynamic partitioning when no value is present in the source data for the 
partition column.

The reason for the failure is that the upgrade script does not account for 
non-date values. What the script currently does:

{code}
UPDATE PARTITION_KEY_VALS
  INNER JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID
  INNER JOIN PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID
AND PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX
AND PARTITION_KEYS.PKEY_TYPE = 'date'
SET PART_KEY_VAL = IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), 
PART_KEY_VAL);
{code}

What it should be to avoid the issue: 

{code}
UPDATE PARTITION_KEY_VALS
  INNER JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID
  INNER JOIN PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID
AND PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX
AND PARTITION_KEYS.PKEY_TYPE = 'date'
AND PART_KEY_VAL != '__HIVE_DEFAULT_PARTITION__'
SET PART_KEY_VAL = IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), 
PART_KEY_VAL);
{code}

== Metastore DB

{code}
mysql> select * from PARTITION_KEY_VALS;
+-++-+
| PART_ID | PART_KEY_VAL   | INTEGER_IDX |
+-++-+
| 171 | 2099-12-31 |   0 |
| 172 | __HIVE_DEFAULT_PARTITION__ |   0 |
| 184 | 2099-12-01 |   0 |
| 185 | 2099-12-30 |   0 |
+-++-+
{code} 

== stdout.log

0: jdbc:mysql://10.16.8.121:3306/metastore> !autocommit on
0: jdbc:mysql://10.16.8.121:3306/metastore> SELECT 'Upgrading MetaStore schema 
from 0.12.0 to 0.13.0' AS ' '
+---+--+
|   |
+---+--+
| Upgrading MetaStore schema from 0.12.0 to 0.13.0  |
+---+--+
0: jdbc:mysql://10.16.8.121:3306/metastore> SELECT '< HIVE-5700 enforce single 
date format for partition column storage >' AS ' '
++--+
||
++--+
| < HIVE-5700 enforce single date format for partition column storage >  |
++--+
0: jdbc:mysql://10.16.8.121:3306/metastore> UPDATE PARTITION_KEY_VALS INNER 
JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID INNER JOIN 
PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID AND 
PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX AND 
PARTITION_KEYS.PKEY_TYPE = 'date' SET PART_KEY_VAL = 
IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), PART_KEY_VAL)

== stderr.log

exec /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop jar 
/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-cli-0.13.1-cdh5.2.0.jar
 org.apache.hive.beeline.HiveSchemaTool -verbose -dbType mysql -upgradeSchema
Connecting to 
jdbc:mysql://10.16.8.121:3306/metastore?useUnicode=true&characterEncoding=UTF-8
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( 
Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
Autocommit status: true
1 row selected (0.025 seconds)
1 row selected (0.004 seconds)
Closing: 0: 
jdbc:mysql://10.16.8.121:3306/metastore?useUnicode=true&characterEncoding=UTF-8
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
state would be inconsistent !!
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
state would be inconsistent !!
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:252)
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:220)
at org.ap

Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-11-04 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/
---

(Updated Nov. 4, 2014, 9:07 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

See JIRA


Diffs (updated)
-

  data/scripts/q_test_cleanup.sql 8ec0f9f 
  data/scripts/q_test_init.sql 7484f0c 
  itests/src/test/resources/testconfiguration.properties 3ae001d 
  pom.xml a5f851f 
  ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
  ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
  ql/src/test/results/clientpositive/add_part_exist.q.out 107cfdb 
  ql/src/test/results/clientpositive/alter1.q.out 7c78410 
  ql/src/test/results/clientpositive/alter2.q.out 3356ab9 
  ql/src/test/results/clientpositive/alter3.q.out 70353d3 
  ql/src/test/results/clientpositive/alter4.q.out 42fa2d1 
  ql/src/test/results/clientpositive/alter5.q.out a83b68d 
  ql/src/test/results/clientpositive/alter_index.q.out c69127a 
  ql/src/test/results/clientpositive/alter_rename_partition.q.out 82eeb82 
  ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
  ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/describe_table_json.q.out e9c0977 
  ql/src/test/results/clientpositive/index_creation.q.out a313266 
  ql/src/test/results/clientpositive/input2.q.out 6ec74e0 
  ql/src/test/results/clientpositive/input3.q.out 2bd5475 
  ql/src/test/results/clientpositive/rename_column.q.out 4704acf 
  ql/src/test/results/clientpositive/show_tables.q.out 0a208f7 
  ql/src/test/results/clientpositive/temp_table.q.out b5891c2 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
  ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27401/diff/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-11-04 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/#review59835
---



data/scripts/q_test_init.sql


will fix


- Sergey Shelukhin


On Nov. 4, 2014, 9:07 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27401/
> ---
> 
> (Updated Nov. 4, 2014, 9:07 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA
> 
> 
> Diffs
> -
> 
>   data/scripts/q_test_cleanup.sql 8ec0f9f 
>   data/scripts/q_test_init.sql 7484f0c 
>   itests/src/test/resources/testconfiguration.properties 3ae001d 
>   pom.xml a5f851f 
>   ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
>   ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
>   ql/src/test/results/clientpositive/add_part_exist.q.out 107cfdb 
>   ql/src/test/results/clientpositive/alter1.q.out 7c78410 
>   ql/src/test/results/clientpositive/alter2.q.out 3356ab9 
>   ql/src/test/results/clientpositive/alter3.q.out 70353d3 
>   ql/src/test/results/clientpositive/alter4.q.out 42fa2d1 
>   ql/src/test/results/clientpositive/alter5.q.out a83b68d 
>   ql/src/test/results/clientpositive/alter_index.q.out c69127a 
>   ql/src/test/results/clientpositive/alter_rename_partition.q.out 82eeb82 
>   ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
>   ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/describe_table_json.q.out e9c0977 
>   ql/src/test/results/clientpositive/index_creation.q.out a313266 
>   ql/src/test/results/clientpositive/input2.q.out 6ec74e0 
>   ql/src/test/results/clientpositive/input3.q.out 2bd5475 
>   ql/src/test/results/clientpositive/rename_column.q.out 4704acf 
>   ql/src/test/results/clientpositive/show_tables.q.out 0a208f7 
>   ql/src/test/results/clientpositive/temp_table.q.out b5891c2 
>   ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
>   ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/resu

[jira] [Created] (HIVE-8731) TPC-DS Q49 : More rows are returned than the limit set in the query

2014-11-04 Thread Mostafa Mokhtar (JIRA)
Mostafa Mokhtar created HIVE-8731:
-

 Summary: TPC-DS Q49 : More rows are returned than the limit set in 
the query 
 Key: HIVE-8731
 URL: https://issues.apache.org/jira/browse/HIVE-8731
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
Priority: Critical
 Fix For: 0.14.0


TPC-DS query 49 returns more rows than the limit set in the query.

Query 
{code}
set hive.cbo.enable=true;
set hive.stats.fetch.column.stats=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.tez.auto.reducer.parallelism=true;
set hive.auto.convert.join.noconditionaltask.size=128000;
set hive.exec.reducers.bytes.per.reducer=1;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
set hive.support.concurrency=false;
set hive.tez.exec.print.summary=true;

explain  

select  
 'web' as channel
 ,web.item
 ,web.return_ratio
 ,web.return_rank
 ,web.currency_rank
 from (
select 
 item
,return_ratio
,currency_ratio
,rank() over (order by return_ratio) as return_rank
,rank() over (order by currency_ratio) as currency_rank
from
(   select ws.ws_item_sk as item
,(cast(sum(coalesce(wr.wr_return_quantity,0)) as decimal(15,4))/
cast(sum(coalesce(ws.ws_quantity,0)) as decimal(15,4) )) as 
return_ratio
,(cast(sum(coalesce(wr.wr_return_amt,0)) as decimal(15,4))/
cast(sum(coalesce(ws.ws_net_paid,0)) as decimal(15,4) )) as 
currency_ratio
from 
 web_sales ws left outer join web_returns wr 
on (ws.ws_order_number = wr.wr_order_number and 
ws.ws_item_sk = wr.wr_item_sk)
 ,date_dim
where 
wr.wr_return_amt > 1 
and ws.ws_net_profit > 1
 and ws.ws_net_paid > 0
 and ws.ws_quantity > 0
 and ws.ws_sold_date_sk = date_dim.d_date_sk
 and d_year = 2000
 and d_moy = 12
group by ws.ws_item_sk
) in_web
 ) web
 where 
 (
 web.return_rank <= 10
 or
 web.currency_rank <= 10
 )
 union all
 select 
 'catalog' as channel
 ,catalog.item
 ,catalog.return_ratio
 ,catalog.return_rank
 ,catalog.currency_rank
 from (
select 
 item
,return_ratio
,currency_ratio
,rank() over (order by return_ratio) as return_rank
,rank() over (order by currency_ratio) as currency_rank
from
(   select 
cs.cs_item_sk as item
,(cast(sum(coalesce(cr.cr_return_quantity,0)) as decimal(15,4))/
cast(sum(coalesce(cs.cs_quantity,0)) as decimal(15,4) )) as 
return_ratio
,(cast(sum(coalesce(cr.cr_return_amount,0)) as decimal(15,4))/
cast(sum(coalesce(cs.cs_net_paid,0)) as decimal(15,4) )) as 
currency_ratio
from 
catalog_sales cs left outer join catalog_returns cr
on (cs.cs_order_number = cr.cr_order_number and 
cs.cs_item_sk = cr.cr_item_sk)
,date_dim
where 
cr.cr_return_amount > 1 
and cs.cs_net_profit > 1
 and cs.cs_net_paid > 0
 and cs.cs_quantity > 0
 and cs_sold_date_sk = d_date_sk
 and d_year = 2000
 and d_moy = 12
 group by cs.cs_item_sk
) in_cat
 ) catalog
 where 
 (
 catalog.return_rank <= 10
 or
 catalog.currency_rank <=10
 )
 union all
 select 
 'store' as channel
 ,store.item
 ,store.return_ratio
 ,store.return_rank
 ,store.currency_rank
 from (
select 
 item
,return_ratio
,currency_ratio
,rank() over (order by return_ratio) as return_rank
,rank() over (order by currency_ratio) as currency_rank
from
(   select sts.ss_item_sk as item
,(cast(sum(coalesce(sr.sr_return_quantity,0)) as 
decimal(15,4))/cast(sum(coalesce(sts.ss_quantity,0)) as decimal(15,4) )) as 
return_ratio
,(cast(sum(coalesce(sr.sr_return_amt,0)) as 
decimal(15,4))/cast(sum(coalesce(sts.ss_net_paid,0)) as decimal(15,4) )) as 
currency_ratio
from 
store_sales sts left outer join store_returns sr
on (sts.ss_ticket_number = sr.sr_ticket_number and 
sts.ss_item_sk = sr.sr_item_sk)
,date_dim
where 
sr.sr_return_amt > 1 
and sts.ss_net_profit > 1

[jira] [Created] (HIVE-8732) ORC string statistics are not merged correctly

2014-11-04 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-8732:
---

 Summary: ORC string statistics are not merged correctly
 Key: HIVE-8732
 URL: https://issues.apache.org/jira/browse/HIVE-8732
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently ORC's string statistics do not merge correctly causing incorrect 
maximum values.
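
For illustration only (these are not ORC's actual statistics classes), a correct merge of string min/max statistics takes the smaller of the two minimums and the larger of the two maximums, treating a side that saw no rows as having no data:

{code}
class StringColumnStatsSketch {
  private String minimum;   // null means "no values seen yet"
  private String maximum;

  void updateString(String value) {
    if (minimum == null || value.compareTo(minimum) < 0) {
      minimum = value;
    }
    if (maximum == null || value.compareTo(maximum) > 0) {
      maximum = value;
    }
  }

  void merge(StringColumnStatsSketch other) {
    if (other.minimum == null) {
      return;                          // the other side saw no rows
    }
    if (minimum == null) {
      minimum = other.minimum;         // this side saw no rows yet
      maximum = other.maximum;
      return;
    }
    if (other.minimum.compareTo(minimum) < 0) {
      minimum = other.minimum;
    }
    if (other.maximum.compareTo(maximum) > 0) {
      maximum = other.maximum;
    }
  }
}
{code}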



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8732) ORC string statistics are not merged correctly

2014-11-04 Thread Dain Sundstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196843#comment-14196843
 ] 

Dain Sundstrom commented on HIVE-8732:
--

Decimal and Date are also broken

> ORC string statistics are not merged correctly
> --
>
> Key: HIVE-8732
> URL: https://issues.apache.org/jira/browse/HIVE-8732
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> Currently ORC's string statistics do not merge correctly causing incorrect 
> maximum values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8732) ORC string statistics are not merged correctly

2014-11-04 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196847#comment-14196847
 ] 

Owen O'Malley commented on HIVE-8732:
-

Timestamp too. *sigh*

> ORC string statistics are not merged correctly
> --
>
> Key: HIVE-8732
> URL: https://issues.apache.org/jira/browse/HIVE-8732
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> Currently ORC's string statistics do not merge correctly causing incorrect 
> maximum values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8730) schemaTool failure when date partition has non-date value

2014-11-04 Thread Johndee Burks (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johndee Burks updated HIVE-8730:

Description: 
If there is a non-date value in the PART_KEY_VAL column of the 
PARTITION_KEY_VALS table in the metastore db, the HIVE-5700 script will fail. 
The failure is then picked up by the schemaTool, causing the upgrade to fail. A 
classic example of a value that can be present without users really being aware 
of it is __HIVE_DEFAULT_PARTITION__, which Hive fills in automatically during 
dynamic partitioning when no value is present in the source data for the 
partition column.

The reason for the failure is that the upgrade script does not account for 
non-date values. What the script currently does:

{code}
UPDATE PARTITION_KEY_VALS
  INNER JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID
  INNER JOIN PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID
AND PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX
AND PARTITION_KEYS.PKEY_TYPE = 'date'
SET PART_KEY_VAL = IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), 
PART_KEY_VAL);
{code}

What it should be to avoid the issue: 

{code}
UPDATE PARTITION_KEY_VALS
  INNER JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID
  INNER JOIN PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID
AND PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX
AND PARTITION_KEYS.PKEY_TYPE = 'date'
AND PART_KEY_VAL != '__HIVE_DEFAULT_PARTITION__'
SET PART_KEY_VAL = IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), 
PART_KEY_VAL);
{code}

== Metastore DB

{code}
mysql> select * from PARTITION_KEY_VALS;
+-++-+
| PART_ID | PART_KEY_VAL   | INTEGER_IDX |
+-++-+
| 171 | 2099-12-31 |   0 |
| 172 | __HIVE_DEFAULT_PARTITION__ |   0 |
| 184 | 2099-12-01 |   0 |
| 185 | 2099-12-30 |   0 |
+-++-+
{code} 

== stdout.log

{code}
0: jdbc:mysql://10.16.8.121:3306/metastore> !autocommit on
0: jdbc:mysql://10.16.8.121:3306/metastore> SELECT 'Upgrading MetaStore schema 
from 0.12.0 to 0.13.0' AS ' '
+---+--+
|   |
+---+--+
| Upgrading MetaStore schema from 0.12.0 to 0.13.0  |
+---+--+
0: jdbc:mysql://10.16.8.121:3306/metastore> SELECT '< HIVE-5700 enforce single 
date format for partition column storage >' AS ' '
++--+
||
++--+
| < HIVE-5700 enforce single date format for partition column storage >  |
++--+
0: jdbc:mysql://10.16.8.121:3306/metastore> UPDATE PARTITION_KEY_VALS INNER 
JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID INNER JOIN 
PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID AND 
PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX AND 
PARTITION_KEYS.PKEY_TYPE = 'date' SET PART_KEY_VAL = 
IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), PART_KEY_VAL)
{code}

== stderr.log

{code}
exec /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop jar 
/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-cli-0.13.1-cdh5.2.0.jar
 org.apache.hive.beeline.HiveSchemaTool -verbose -dbType mysql -upgradeSchema
Connecting to 
jdbc:mysql://10.16.8.121:3306/metastore?useUnicode=true&characterEncoding=UTF-8
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( 
Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
Autocommit status: true
1 row selected (0.025 seconds)
1 row selected (0.004 seconds)
Closing: 0: 
jdbc:mysql://10.16.8.121:3306/metastore?useUnicode=true&characterEncoding=UTF-8
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
state would be inconsistent !!
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
state would be inconsistent !!
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:252)
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:220)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:530)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 

[jira] [Updated] (HIVE-8732) ORC string statistics are not merged correctly

2014-11-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-8732:

 Priority: Blocker  (was: Major)
Fix Version/s: 0.14.0

> ORC string statistics are not merged correctly
> --
>
> Key: HIVE-8732
> URL: https://issues.apache.org/jira/browse/HIVE-8732
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.14.0
>
>
> Currently ORC's string statistics do not merge correctly causing incorrect 
> maximum values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8733) HiveServer2 not picking correct IP address when hive.server2.thrift.bind.host is not set

2014-11-04 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-8733:
--

 Summary: HiveServer2 not picking correct IP address when 
hive.server2.thrift.bind.host is not set
 Key: HIVE-8733
 URL: https://issues.apache.org/jira/browse/HIVE-8733
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0, 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8733) HiveServer2 not picking correct IP address when hive.server2.thrift.bind.host is not set

2014-11-04 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8733:
---
Affects Version/s: 0.12.0

> HiveServer2 not picking correct IP address when hive.server2.thrift.bind.host 
> is not set
> 
>
> Key: HIVE-8733
> URL: https://issues.apache.org/jira/browse/HIVE-8733
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27565: Set reasonable connection timeout for CuratorFramework ZooKeeper clients in Hive

2014-11-04 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27565/#review59843
---

Ship it!


Ship It!

- Thejas Nair


On Nov. 4, 2014, 7:40 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27565/
> ---
> 
> (Updated Nov. 4, 2014, 7:40 p.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-8723
> https://issues.apache.org/jira/browse/HIVE-8723
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-8723
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java
>  26d4d97 
>   jdbc/src/java/org/apache/hive/jdbc/Utils.java 3ed933a 
>   jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java d515ce5 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java b814e4b 
>   
> shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/ZooKeeperTokenStore.java
>  16a52e4 
> 
> Diff: https://reviews.apache.org/r/27565/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Commented] (HIVE-8723) Set reasonable connection timeout for CuratorFramework ZooKeeper clients in Hive

2014-11-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196901#comment-14196901
 ] 

Thejas M Nair commented on HIVE-8723:
-

+1

> Set reasonable connection timeout for CuratorFramework ZooKeeper clients in 
> Hive
> 
>
> Key: HIVE-8723
> URL: https://issues.apache.org/jira/browse/HIVE-8723
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-8723.1.patch, HIVE-8723.2.patch
>
>
> Currently we use -1, so any elapsed time is always greater than the timeout 
> value, resulting in an unnecessary connection-loss exception. Relevant code 
> from the Curator framework:
> {code}
>  private synchronized void checkTimeouts() throws Exception
> {
> int minTimeout = Math.min(sessionTimeoutMs, connectionTimeoutMs);
> long elapsed = System.currentTimeMillis() - connectionStartMs;
> if ( elapsed >= minTimeout )
> {
> if ( zooKeeper.hasNewConnectionString() )
> {
> handleNewConnectionString();
> }
> else
> {
> int maxTimeout = Math.max(sessionTimeoutMs, 
> connectionTimeoutMs);
> if ( elapsed > maxTimeout )
> {
> if ( 
> !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
> {
> log.warn(String.format("Connection attempt 
> unsuccessful after %d (greater than max timeout of %d). Resetting connection 
> and trying again with a new connection.", elapsed, maxTimeout));
> }
> reset();
> }
> else
> {
> KeeperException.ConnectionLossException 
> connectionLossException = new CuratorConnectionLossException();
> if ( 
> !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
> {
> log.error(String.format("Connection timed out for 
> connection string (%s) and timeout (%d) / elapsed (%d)", 
> zooKeeper.getConnectionString(), connectionTimeoutMs, elapsed), 
> connectionLossException);
> }
> tracer.get().addCount("connections-timed-out", 1);
> throw connectionLossException;
> }
> }
> }
> }
> {code}
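
A minimal sketch of configuring explicit timeouts with the Curator builder API instead of -1; the timeout and retry values shown are illustrative, not the ones chosen in the patch:

{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

class ZkClientSketch {
  static CuratorFramework newClient(String connectString) {
    // Finite timeouts let checkTimeouts() above behave as intended.
    CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString(connectString)
        .sessionTimeoutMs(60 * 1000)        // example value
        .connectionTimeoutMs(15 * 1000)     // example value
        .retryPolicy(new ExponentialBackoffRetry(1000, 3))
        .build();
    client.start();
    return client;
  }
}
{code}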



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196902#comment-14196902
 ] 

Hive QA commented on HIVE-8727:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679285/HIVE-8727.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6671 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1632/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1632/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1632/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12679285 - PreCommit-HIVE-TRUNK-Build

> Dag summary has incorrect row counts and duration per vertex
> 
>
> Key: HIVE-8727
> URL: https://issues.apache.org/jira/browse/HIVE-8727
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
> Fix For: 0.14.0
>
> Attachments: HIVE-8727.1.patch
>
>
> During the code review for HIVE-8495 some code was reworked which broke some 
> of INPUT/OUTPUT counters and duration.
> Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8734) Renaming tables leaves stat store

2014-11-04 Thread Ryan Pridgeon (JIRA)
Ryan Pridgeon created HIVE-8734:
---

 Summary: Renaming tables leaves stat store 
 Key: HIVE-8734
 URL: https://issues.apache.org/jira/browse/HIVE-8734
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Ryan Pridgeon


Renaming a table does not alter the data stored in TAB_COL_STATS, so if a user 
renames table1 to table2 they will also need to re-compute statistics for the 
table.

This is an inconvenience and leaves the metastore in an inconsistent state: the 
rename neither drops nor updates TAB_COL_STATS.TABLE_NAME, so there will be an 
entry for both the old table name and the new table name.

This lingering data is also inherited by any new table that later takes the 
previously used table name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8734) Renaming tables leaves stat store in stale state

2014-11-04 Thread Ryan Pridgeon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Pridgeon updated HIVE-8734:

Summary: Renaming tables leaves stat store in stale state  (was: Renaming 
tables leaves stat store )

> Renaming tables leaves stat store in stale state
> 
>
> Key: HIVE-8734
> URL: https://issues.apache.org/jira/browse/HIVE-8734
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Ryan Pridgeon
>
> Renaming a table does not alter the data stored in TAB_COL_STATS, so if a 
> user renames table1 to table2 they will also need to re-compute statistics 
> for the table.
> This is an inconvenience and leaves the metastore in an inconsistent state: 
> the rename neither drops nor updates TAB_COL_STATS.TABLE_NAME, so there will 
> be an entry for both the old table name and the new table name.
> This lingering data is also inherited by any new table that later takes the 
> previously used table name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8700) Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]

2014-11-04 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196925#comment-14196925
 ] 

Szehon Ho commented on HIVE-8700:
-

Thanks a lot [~ssatish]. I am wondering whether we can commit a partial patch 
that just does the replacement, so Chao can proceed with his task. Let me know 
your thoughts, or whether the patch still needs more work. Thanks

> Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]
> --
>
> Key: HIVE-8700
> URL: https://issues.apache.org/jira/browse/HIVE-8700
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Suhas Satish
> Attachments: HIVE-8700.patch
>
>
> With HIVE-8616 enabled, the new plan has a ReduceSinkOperator for the small 
> tables. For example, the following represents the operator plan for the small 
> table dec1 derived from the query {code}explain select /*+ MAPJOIN(dec)*/ * from 
> dec join dec1 on dec.value=dec1.d;{code}
> {code}
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: dec1
>   Statistics: Num rows: 0 Data size: 107 Basic stats: PARTIAL 
> Column stats: NONE
>   Filter Operator
> predicate: d is not null (type: boolean)
> Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: d (type: decimal(5,2))
>   sort order: +
>   Map-reduce partition columns: d (type: decimal(5,2))
>   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
> Column stats: NONE
>   value expressions: i (type: int)
> {code}
> With the new design for broadcasting small tables, we need to replace the 
> ReduceSinkOperator with a HashTableSinkOperator (or an equivalent) in the new plan.
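
Purely as an illustration of the kind of rewrite involved (the classes below are stand-ins, not Hive's operator API), replacing a sink node in a parent/child DAG looks roughly like this:

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type standing in for Hive's operator tree.
class PlanNode {
  final String name;
  final List<PlanNode> parents = new ArrayList<>();
  final List<PlanNode> children = new ArrayList<>();
  PlanNode(String name) { this.name = name; }
}

class ReplaceSinkSketch {
  // Splice out oldSink and wire newSink into its place on both sides.
  static void replace(PlanNode oldSink, PlanNode newSink) {
    for (PlanNode parent : oldSink.parents) {
      parent.children.set(parent.children.indexOf(oldSink), newSink);
      newSink.parents.add(parent);
    }
    for (PlanNode child : oldSink.children) {
      child.parents.set(child.parents.indexOf(oldSink), newSink);
      newSink.children.add(child);
    }
    oldSink.parents.clear();
    oldSink.children.clear();
  }
}
{code}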



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27594: HiveServer2 not picking correct IP address when hive.server2.thrift.bind.host is not set

2014-11-04 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27594/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-8733
https://issues.apache.org/jira/browse/HIVE-8733


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-8733


Diffs
-

  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
a0a6e18 
  service/src/java/org/apache/hive/service/server/HiveServer2.java b814e4b 

Diff: https://reviews.apache.org/r/27594/diff/


Testing
---


Thanks,

Vaibhav Gumashta


