[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594437#comment-14594437
 ] 

Hive QA commented on HIVE-11037:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740793/HIVE-11037.05.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9012 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4330/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4330/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4330/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740793 - PreCommit-HIVE-TRUNK-Build

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, 
> HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch
>
>
> In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would 
> like to make it run by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11044) Some optimizable predicates being missed by constant propagation

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594397#comment-14594397
 ] 

Hive QA commented on HIVE-11044:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740749/HIVE-11044.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9011 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4329/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4329/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4329/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740749 - PreCommit-HIVE-TRUNK-Build

> Some optimizable predicates being missed by constant propagation
> 
>
> Key: HIVE-11044
> URL: https://issues.apache.org/jira/browse/HIVE-11044
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11044.1.patch, HIVE-11044.2.patch
>
>
> Some of the qfile explain plans show some predicates that could be taken care 
> of by running ConstantPropagate after the PartitionPruner:
> index_auto_unused.q:
> {noformat}
> filterExpr: ((12.0 = 12.0) and (UDFToDouble(key) < 10.0)) (type: boolean)
> {noformat}
> join28.q:
> {noformat}
> predicate: ((11.0 = 11.0) and key is not null) (type: boolean)
> {noformat}
> bucketsort_optimize_insert_7.q ("is not null" is unnecessary)
> {noformat}
> predicate: (((key < 8) and key is not null) and ((key = 0) or (key = 5))) 
> (type: boolean)
> {noformat}





[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters

2015-06-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594389#comment-14594389
 ] 

Lefty Leverenz commented on HIVE-7193:
--

The parameter descriptions in patch 5 look good.  Just one nit unfixed:  
"return" should be "returns" in the description of 
hive.server2.authentication.ldap.customLDAPQuery.  Thanks.

> Hive should support additional LDAP authentication parameters
> -
>
> Key: HIVE-7193
> URL: https://issues.apache.org/jira/browse/HIVE-7193
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Mala Chikka Kempanna
>Assignee: Naveen Gangam
> Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, 
> HIVE-7193.5.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, 
> LDAPAuthentication_Design_Doc_V2.docx
>
>
> Currently Hive has only the following authentication parameters for LDAP 
> authentication for HiveServer2:
> {code:xml}
>  
>   hive.server2.authentication 
>   LDAP 
>  
>  
>   hive.server2.authentication.ldap.url 
>   ldap://our_ldap_address 
>  
> {code}
> We need to include other LDAP properties as part of Hive LDAP authentication, 
> like those below:
> {noformat}
> a group search base -> dc=domain,dc=com 
> a group search filter -> member={0} 
> a user search base -> dc=domain,dc=com 
> a user search filter -> sAMAAccountName={0} 
> a list of valid user groups -> group1,group2,group3 
> {noformat}
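As a sketch of how such settings could look in hive-site.xml. Note: apart from hive.server2.authentication.ldap.customLDAPQuery, which the review comment above mentions, the property names and filter values below are illustrative assumptions, not names confirmed by the patch.

```xml
<!-- Sketch only: property names are illustrative, not confirmed by the patch -->
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>dc=domain,dc=com</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupFilter</name>
  <value>group1,group2,group3</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.customLDAPQuery</name>
  <value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>
</property>
```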





[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11037:
---
Attachment: HIVE-11037.05.patch

Address the newly added tez_self_join.q test case.

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, 
> HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch
>
>
> In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would 
> like to make it run by default.





[jira] [Commented] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594342#comment-14594342
 ] 

Hive QA commented on HIVE-11059:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740748/HIVE-11059.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9004 tests executed
*Failed tests:*
{noformat}
TestSSL - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4328/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4328/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4328/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740748 - PreCommit-HIVE-TRUNK-Build

> hcatalog-server-extensions tests scope should depend on hive-exec
> -
>
> Key: HIVE-11059
> URL: https://issues.apache.org/jira/browse/HIVE-11059
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Attachments: HIVE-11059.patch
>
>
> (causes test failures in Windows due to the lack of WindowsPathUtil being 
> available otherwise)





[jira] [Commented] (HIVE-11057) HBase metastore chokes on partition with ':' in name

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594268#comment-14594268
 ] 

Hive QA commented on HIVE-11057:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740711/HIVE-11057.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4327/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4327/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4327/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4327/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   74fe6f7..379cd85  branch-1   -> origin/branch-1
+ git reset --hard HEAD
HEAD is now at b97303c HIVE-11050: testCliDriver_vector_outer_join.* failures 
in Unit tests due to unstable data creation queries (Matt McCline reviewed by 
Gunther Hagleitner)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at b97303c HIVE-11050: testCliDriver_vector_outer_join.* failures 
in Unit tests due to unstable data creation queries (Matt McCline reviewed by 
Gunther Hagleitner)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740711 - PreCommit-HIVE-TRUNK-Build

> HBase metastore chokes on partition with ':' in name
> 
>
> Key: HIVE-11057
> URL: https://issues.apache.org/jira/browse/HIVE-11057
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11057.patch
>
>
> The HBase metastore uses ':' as a key separator when building keys for the 
> partition table.  This means that partitions with a colon in the name (which 
> is legal) cause problems.
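A common fix pattern for this kind of collision, sketched generically here (hypothetical helpers, not the actual patch), is to escape the separator character inside each key component before joining, so a legal ':' in a partition value can no longer be confused with the delimiter:

```java
public class KeyEscapeSketch {
    // Escape the hypothetical ':' separator (and the escape char itself)
    // inside a single key component.
    static String escape(String component) {
        return component.replace("\\", "\\\\").replace(":", "\\:");
    }

    // Join escaped components with ':' as the hypothetical key separator.
    static String buildKey(String... components) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < components.length; i++) {
            if (i > 0) sb.append(':');
            sb.append(escape(components[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A partition value containing ':' no longer collides with the separator.
        System.out.println(buildKey("db", "tbl", "ts=12:30")); // db:tbl:ts=12\:30
    }
}
```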





[jira] [Commented] (HIVE-11042) Need to fix Utilities.replaceTaskId method

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594266#comment-14594266
 ] 

Hive QA commented on HIVE-11042:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740740/HIVE-11042.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9012 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4326/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4326/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4326/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740740 - PreCommit-HIVE-TRUNK-Build

> Need to fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch, 
> HIVE-11042.3.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5) 
> returns "5".
> It should return "(ds%3D1)05".
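The intended behavior can be sketched as follows (a minimal hypothetical helper, not the actual Hive implementation): replace only the trailing task-id digits, keep the rest of the name, and preserve the zero padding of the old id.

```java
public class ReplaceTaskIdSketch {
    // Replace the trailing digit run of taskDirName with newId, zero-padded
    // to the original width, e.g. "(ds%3D1)01" with 5 -> "(ds%3D1)05".
    static String replaceTaskId(String taskDirName, int newId) {
        int i = taskDirName.length();
        while (i > 0 && Character.isDigit(taskDirName.charAt(i - 1))) {
            i--; // scan back over the trailing task-id digits
        }
        String prefix = taskDirName.substring(0, i); // e.g. "(ds%3D1)"
        int width = taskDirName.length() - i;        // width of the old id, e.g. 2
        return prefix + String.format("%0" + width + "d", newId);
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("(ds%3D1)01", 5)); // (ds%3D1)05
    }
}
```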





[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10233:
--
Attachment: HIVE-10233.10.patch

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 





[jira] [Resolved] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-11050.
--
Resolution: Fixed

Committed to branch-1 as well. Thanks [~mmccline]!


> testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data 
> creation queries
> --
>
> Key: HIVE-11050
> URL: https://issues.apache.org/jira/browse/HIVE-11050
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 1.2.1
>
> Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch
>
>
> In some environments the Q file tests vector_outer_join\{1-4\}.q fail because 
> the data creation queries produce different input files.





[jira] [Commented] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries

2015-06-19 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594213#comment-14594213
 ] 

Matt McCline commented on HIVE-11050:
-

I could not apply the patch to branch-1 either.  I recreated the changes using 
a diff tool and attached that patch as HIVE-11050.01.branch-1.patch.

I tried to commit, but it seems I don't have permissions for that branch.

[~prasanthj] Can you try committing the alternate patch?  Thanks.


> testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data 
> creation queries
> --
>
> Key: HIVE-11050
> URL: https://issues.apache.org/jira/browse/HIVE-11050
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 1.2.1
>
> Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch
>
>
> In some environments the Q file tests vector_outer_join\{1-4\}.q fail because 
> the data creation queries produce different input files.





[jira] [Updated] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries

2015-06-19 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11050:

Attachment: HIVE-11050.01.branch-1.patch

> testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data 
> creation queries
> --
>
> Key: HIVE-11050
> URL: https://issues.apache.org/jira/browse/HIVE-11050
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 1.2.1
>
> Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch
>
>
> In some environments the Q file tests vector_outer_join\{1-4\}.q fail because 
> the data creation queries produce different input files.





[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594150#comment-14594150
 ] 

Hive QA commented on HIVE-11037:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740685/HIVE-11037.03.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9012 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_self_join
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4325/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4325/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4325/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740685 - PreCommit-HIVE-TRUNK-Build

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, 
> HIVE-11037.03.patch, HIVE-11037.04.patch
>
>
> In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would 
> like to make it run by default.





[jira] [Commented] (HIVE-11060) Make test windowing.q robust

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594148#comment-14594148
 ] 

Jesus Camacho Rodriguez commented on HIVE-11060:


Thanks Ashutosh! I just did it.

> Make test windowing.q robust
> 
>
> Key: HIVE-11060
> URL: https://issues.apache.org/jira/browse/HIVE-11060
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11060.01.patch, HIVE-11060.patch
>
>
> Add partition / order by in over clause to make result set deterministic.





[jira] [Updated] (HIVE-11060) Make test windowing.q robust

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11060:
---
Attachment: HIVE-11060.01.patch

> Make test windowing.q robust
> 
>
> Key: HIVE-11060
> URL: https://issues.apache.org/jira/browse/HIVE-11060
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11060.01.patch, HIVE-11060.patch
>
>
> Add partition / order by in over clause to make result set deterministic.





[jira] [Updated] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11050:
-
Fix Version/s: (was: 2.0.0)

> testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data 
> creation queries
> --
>
> Key: HIVE-11050
> URL: https://issues.apache.org/jira/browse/HIVE-11050
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 1.2.1
>
> Attachments: HIVE-11050.01.patch
>
>
> In some environments the Q file tests vector_outer_join\{1-4\}.q fail because 
> the data creation queries produce different input files.





[jira] [Reopened] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reopened HIVE-11050:
--

[~mmccline] Patch does not apply cleanly in branch-1. Reopening the issue.

> testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data 
> creation queries
> --
>
> Key: HIVE-11050
> URL: https://issues.apache.org/jira/browse/HIVE-11050
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 1.2.1, 2.0.0
>
> Attachments: HIVE-11050.01.patch
>
>
> In some environments the Q file tests vector_outer_join\{1-4\}.q fail because 
> the data creation queries produce different input files.





[jira] [Updated] (HIVE-11033) BloomFilter index is not honored by ORC reader

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11033:
-
Fix Version/s: (was: 2.0.0)

> BloomFilter index is not honored by ORC reader
> --
>
> Key: HIVE-11033
> URL: https://issues.apache.org/jira/browse/HIVE-11033
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Allan Yan
>Assignee: Prasanth Jayachandran
> Fix For: 1.2.1
>
> Attachments: HIVE-11033.2.patch, HIVE-11033.patch
>
>
> There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class 
> that causes the bloom filter index saved in the ORC file not to be used. The 
> root cause is that the bloomFilterIndices variable defined in the SargApplier 
> class shadows the one defined in its parent class. Therefore, in 
> ReaderImpl.pickRowGroups()
> {code}
>   protected boolean[] pickRowGroups() throws IOException {
> // if we don't have a sarg or indexes, we read everything
> if (sargApp == null) {
>   return null;
> }
> readRowIndex(currentStripe, included, sargApp.sargColumns);
> return sargApp.pickRowGroups(stripes.get(currentStripe), indexes);
>   }
> {code}
> The bloomFilterIndices array populated by readRowIndex() is not picked up by 
> the sargApp object. One solution is to make SargApplier.bloomFilterIndices a 
> reference to its parent's counterpart.
> {noformat}
> 18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java 
> src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
> 174d173
> < bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 178c177
> <   sarg, options.getColumnNames(), strideRate, types, 
> included.length, bloomFilterIndices);
> ---
> >   sarg, options.getColumnNames(), strideRate, types, 
> > included.length);
> 204a204
> > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 673c673
> < List types, int includedCount, 
> OrcProto.BloomFilterIndex[] bloomFilterIndices) {
> ---
> > List types, int includedCount) {
> 677c677
> <   this.bloomFilterIndices = bloomFilterIndices;
> ---
> >   bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> {noformat}
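The shadowing described above can be reproduced in isolation. The following is a generic sketch (simplified stand-in classes, not Hive's actual hierarchy) of why a field populated through the parent type stays invisible through a same-named field in the subclass:

```java
public class ShadowSketch {
    static class Reader {
        Object[] bloomFilterIndices; // populated by the index-reading code
    }

    // Re-declaring the field shadows Reader.bloomFilterIndices: the two
    // fields occupy separate slots, and which slot an access touches is
    // decided by the static type of the reference, not the runtime object.
    static class SargApplier extends Reader {
        Object[] bloomFilterIndices;
    }

    public static void main(String[] args) {
        SargApplier s = new SargApplier();
        Reader asReader = s;
        asReader.bloomFilterIndices = new Object[3];    // writes Reader's field

        System.out.println(s.bloomFilterIndices == null);       // true: shadowed field untouched
        System.out.println(asReader.bloomFilterIndices.length); // 3
    }
}
```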





[jira] [Updated] (HIVE-11031) ORC concatenation of old files can fail while merging column statistics

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11031:
-
Fix Version/s: (was: 2.0.0)
   (was: 1.1.1)
   (was: 1.0.1)

> ORC concatenation of old files can fail while merging column statistics
> ---
>
> Key: HIVE-11031
> URL: https://issues.apache.org/jira/browse/HIVE-11031
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 1.2.1
>
> Attachments: HIVE-11031-branch-1.0.patch, HIVE-11031.2.patch, 
> HIVE-11031.3.patch, HIVE-11031.4.patch, HIVE-11031.patch
>
>
> Column statistics in ORC are optional protobuf fields. Old ORC files might 
> not have statistics for newly added types like decimal, date, timestamp, etc. 
> But column statistics merging assumes that column statistics exist for these 
> types and invokes merge. For example, merging of TimestampColumnStatistics 
> directly casts the received ColumnStatistics object without doing an 
> instanceof check. If the ORC file contains timestamp column statistics, this 
> will work; otherwise it will throw a ClassCastException.
> Also, the file merge operator swallows the exception.
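A defensive merge, sketched here with a simplified stand-in hierarchy (not Hive's actual ColumnStatistics classes), checks the runtime type before casting instead of assuming the optional statistics are present:

```java
public class StatsMergeSketch {
    interface ColumnStatistics { }

    static class TimestampColumnStatistics implements ColumnStatistics {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;

        // Merge only when the other side really carries timestamp statistics;
        // an old file may hand us a bare ColumnStatistics for this column.
        void merge(ColumnStatistics other) {
            if (!(other instanceof TimestampColumnStatistics)) {
                return; // statistics absent in the old file: skip, don't cast
            }
            TimestampColumnStatistics o = (TimestampColumnStatistics) other;
            min = Math.min(min, o.min);
            max = Math.max(max, o.max);
        }
    }

    public static void main(String[] args) {
        TimestampColumnStatistics a = new TimestampColumnStatistics();
        a.min = 10;
        a.max = 20;
        a.merge(new ColumnStatistics() { }); // old file: no ClassCastException
        System.out.println(a.min + ".." + a.max); // 10..20
    }
}
```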





[jira] [Updated] (HIVE-11035) PPD: Orc Split elimination fails because filterColumns=[-1]

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11035:
-
Fix Version/s: (was: 2.0.0)
   (was: 1.1.1)
   (was: 1.0.1)

> PPD: Orc Split elimination fails because filterColumns=[-1]
> ---
>
> Key: HIVE-11035
> URL: https://issues.apache.org/jira/browse/HIVE-11035
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
> Fix For: 1.2.1
>
> Attachments: HIVE-11035-branch-1.0.patch, HIVE-11035.patch
>
>
> {code}
> create temporary table xx (x int) stored as orc ;
> insert into xx values (20),(200);
> set hive.fetch.task.conversion=none;
> select * from xx where x is null;
> {code}
> This should generate zero tasks after optional split elimination in the app 
> master, instead of generating the one task that currently runs, which only 
> hits the row-index filters and removes all rows anyway.
> Right now, this runs one task for the stripe containing (min=20, max=200, 
> has_null=false), which is broken.
> Instead, it returns YES_NO_NULL from the following default case
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L976
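The intended evaluation can be sketched like this (heavily simplified from ORC's SearchArgument handling, with a reduced TruthValue enum): an IS NULL leaf over a row group whose statistics report has_null=false can be answered NO outright, which lets the split be eliminated instead of falling through to YES_NO_NULL.

```java
public class IsNullPpdSketch {
    // Reduced version of ORC's SearchArgument truth values.
    enum TruthValue { NO, YES_NO_NULL }

    // Decide whether an "x is null" predicate can possibly match a row
    // group, given the hasNull flag from that group's column statistics.
    static TruthValue evaluateIsNull(boolean statsHasNull) {
        // No nulls recorded -> the predicate can never match: prune the group,
        // rather than returning YES_NO_NULL from a default case.
        return statsHasNull ? TruthValue.YES_NO_NULL : TruthValue.NO;
    }

    public static void main(String[] args) {
        // Stripe with (min=20, max=200, has_null=false) from the example above:
        System.out.println(evaluateIsNull(false)); // NO -> zero tasks needed
    }
}
```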





[jira] [Updated] (HIVE-10685) Alter table concatenate operator will cause duplicate data

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10685:
-
Fix Version/s: (was: 2.0.0)
   (was: 1.1.1)
   (was: 1.0.1)

> Alter table concatenate operator will cause duplicate data
> --
>
> Key: HIVE-10685
> URL: https://issues.apache.org/jira/browse/HIVE-10685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 1.2.1
>Reporter: guoliming
>Assignee: guoliming
>Priority: Critical
> Fix For: 1.2.1
>
> Attachments: HIVE-10685.patch
>
>
> "Orders" table has 15 rows and stored as ORC. 
> {noformat}
> hive> select count(*) from orders;
> OK
> 15
> Time taken: 37.692 seconds, Fetched: 1 row(s)
> {noformat}
> The table contains 14 files; the size of each file is about 2.1 ~ 3.2 GB.
> After executing the command ALTER TABLE orders CONCATENATE;
> the table already has 1530115000 rows.
> My Hive version is 1.1.0.





[jira] [Updated] (HIVE-11027) Hive on tez: Bucket map joins fail when hashcode goes negative

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11027:
-
Fix Version/s: (was: 2.0.0)
   (was: 1.1.1)
   (was: 1.0.1)

> Hive on tez: Bucket map joins fail when hashcode goes negative
> --
>
> Key: HIVE-11027
> URL: https://issues.apache.org/jira/browse/HIVE-11027
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0, 1.0.0, 0.13
>Reporter: Vikram Dixit K
>Assignee: Prasanth Jayachandran
> Fix For: 1.2.1
>
> Attachments: HIVE-11027.patch
>
>
> Seeing an issue when dynamic sort optimization is enabled while doing an 
> insert into a bucketed table. We seem to be flipping the negative sign on the 
> hashcode instead of taking the complement of it for routing the data 
> correctly. This results in correctness issues in bucket map joins in Hive on 
> Tez when the hash code goes negative.
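The mismatch described above can be sketched with a small example. This is an illustration only (BucketRouting and both method names are made up, not Hive's actual code): negating a negative hash routes rows to a different bucket than masking the sign bit does, and negation also overflows for Integer.MIN_VALUE, which can leave a negative bucket index.

```java
public class BucketRouting {
    // Buggy variant: flip the sign of a negative hash before taking the modulus.
    // Besides disagreeing with the masked variant below, -Integer.MIN_VALUE
    // overflows back to Integer.MIN_VALUE, so the result can stay negative.
    static int signFlipBucket(int hashCode, int numBuckets) {
        if (hashCode < 0) {
            hashCode = -hashCode;
        }
        return hashCode % numBuckets;
    }

    // Masked variant: clear the sign bit so the result is always
    // non-negative and consistent for every input, including Integer.MIN_VALUE.
    static int maskedBucket(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int h = -1544935945;
        // The two schemes disagree, so a writer using one and a reader
        // using the other will look in the wrong bucket.
        System.out.println(signFlipBucket(h, 4) + " vs " + maskedBucket(h, 4));
        // Negation overflows here, so the "absolute value" is still negative.
        System.out.println(signFlipBucket(Integer.MIN_VALUE, 3));
    }
}
```

The point is not which convention is used, but that the write path and the bucket map join read path must agree on one.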





[jira] [Commented] (HIVE-11060) Make test windowing.q robust

2015-06-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594085#comment-14594085
 ] 

Ashutosh Chauhan commented on HIVE-11060:
-

You also need to update golden file for SparkCliDriver for this test. Otherwise 
looks good, +1

> Make test windowing.q robust
> 
>
> Key: HIVE-11060
> URL: https://issues.apache.org/jira/browse/HIVE-11060
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11060.patch
>
>
> Add partition / order by in over clause to make result set deterministic.





[jira] [Updated] (HIVE-11060) Make test windowing.q robust

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11060:
---
Attachment: HIVE-11060.patch

> Make test windowing.q robust
> 
>
> Key: HIVE-11060
> URL: https://issues.apache.org/jira/browse/HIVE-11060
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11060.patch
>
>
> Add partition / order by in over clause to make result set deterministic.





[jira] [Commented] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec

2015-06-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594079#comment-14594079
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11059:
--

+1

> hcatalog-server-extensions tests scope should depend on hive-exec
> -
>
> Key: HIVE-11059
> URL: https://issues.apache.org/jira/browse/HIVE-11059
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Attachments: HIVE-11059.patch
>
>
> (causes test failures in Windows due to the lack of WindowsPathUtil being 
> available otherwise)





[jira] [Commented] (HIVE-11044) Some optimizable predicates being missed by constant propagation

2015-06-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594078#comment-14594078
 ] 

Ashutosh Chauhan commented on HIVE-11044:
-

+1

> Some optimizable predicates being missed by constant propagation
> 
>
> Key: HIVE-11044
> URL: https://issues.apache.org/jira/browse/HIVE-11044
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11044.1.patch, HIVE-11044.2.patch
>
>
> Some of the qfile explain plans show some predicates that could be taken care 
> of by running ConstantPropagate after the PartitionPruner:
> index_auto_unused.q:
> {noformat}
> filterExpr: ((12.0 = 12.0) and (UDFToDouble(key) < 10.0)) (type: boolean)
> {noformat}
> join28.q:
> {noformat}
> predicate: ((11.0 = 11.0) and key is not null) (type: boolean)
> {noformat}
> bucketsort_optimize_insert_7.q ("is not null" is unnecessary)
> {noformat}
> predicate: (((key < 8) and key is not null) and ((key = 0) or (key = 5))) 
> (type: boolean)
> {noformat}
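A minimal sketch of the kind of folding being requested (illustrative only; Hive's ConstantPropagate works on its expression tree, and FoldTautology is a made-up name): once the partition pruner has substituted constants, a conjunct comparing two identical literals such as (12.0 = 12.0) is always true and can be dropped from the filter.

```java
import java.util.ArrayList;
import java.util.List;

public class FoldTautology {
    // Drop conjuncts of the form "<literal> = <literal>" where both sides
    // are the same constant, e.g. "12.0 = 12.0".
    static List<String> foldConjuncts(List<String> conjuncts) {
        List<String> kept = new ArrayList<>();
        for (String c : conjuncts) {
            String[] sides = c.split(" = ", 2);
            if (sides.length == 2 && sides[0].equals(sides[1])) {
                continue;  // always-true comparison, fold it away
            }
            kept.add(c);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> filter = List.of("12.0 = 12.0", "UDFToDouble(key) < 10.0");
        // Only the non-trivial predicate survives.
        System.out.println(foldConjuncts(filter));
    }
}
```

Running ConstantPropagate after the PartitionPruner would perform the tree-level equivalent of this simplification on the explain plans above.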





[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10996:
---
Attachment: HIVE-10996.07.patch

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, 
> HIVE-10996.06.patch, HIVE-10996.07.patch, HIVE-10996.patch, explain_q1.txt, 
> explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are simplified queries that reproduce the 
> issue. I will note that if I convert the inner join into a table and select 
> from that, the issue does not appear.
> Update: Found that turning off  hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435





[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11037:
---
Attachment: HIVE-11037.04.patch

Addressed [~jpullokkaran]'s comments.

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, 
> HIVE-11037.03.patch, HIVE-11037.04.patch
>
>
> In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would 
> like to make it run by default.





[jira] [Updated] (HIVE-11044) Some optimizable predicates being missed by constant propagation

2015-06-19 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11044:
--
Attachment: HIVE-11044.2.patch

patch v2 - updating the golden file for explainuser_2.q, now that the patch for 
HIVE-11028 has been committed.

> Some optimizable predicates being missed by constant propagation
> 
>
> Key: HIVE-11044
> URL: https://issues.apache.org/jira/browse/HIVE-11044
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11044.1.patch, HIVE-11044.2.patch
>
>
> Some of the qfile explain plans show some predicates that could be taken care 
> of by running ConstantPropagate after the PartitionPruner:
> index_auto_unused.q:
> {noformat}
> filterExpr: ((12.0 = 12.0) and (UDFToDouble(key) < 10.0)) (type: boolean)
> {noformat}
> join28.q:
> {noformat}
> predicate: ((11.0 = 11.0) and key is not null) (type: boolean)
> {noformat}
> bucketsort_optimize_insert_7.q ("is not null" is unnecessary)
> {noformat}
> predicate: (((key < 8) and key is not null) and ((key = 0) or (key = 5))) 
> (type: boolean)
> {noformat}





[jira] [Updated] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec

2015-06-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11059:

Attachment: HIVE-11059.patch

Patch attached. [~hsubramaniyan], could you please review?

> hcatalog-server-extensions tests scope should depend on hive-exec
> -
>
> Key: HIVE-11059
> URL: https://issues.apache.org/jira/browse/HIVE-11059
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Attachments: HIVE-11059.patch
>
>
> (causes test failures in Windows due to the lack of WindowsPathUtil being 
> available otherwise)





[jira] [Updated] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec

2015-06-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11059:

Description: (causes test failures in Windows due to the lack of 
WindowsPathUtil being available otherwise)

> hcatalog-server-extensions tests scope should depend on hive-exec
> -
>
> Key: HIVE-11059
> URL: https://issues.apache.org/jira/browse/HIVE-11059
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
>
> (causes test failures in Windows due to the lack of WindowsPathUtil being 
> available otherwise)





[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594045#comment-14594045
 ] 

Hive QA commented on HIVE-10996:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740684/HIVE-10996.06.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9011 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4324/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4324/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4324/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740684 - PreCommit-HIVE-TRUNK-Build

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, 
> HIVE-10996.06.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are simplified queries that reproduce the 
> issue. 
> I will note that if I convert the inner join to a 

[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11042:

Attachment: HIVE-11042.3.patch

Attach patch3 to remove the extra line.

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch, 
> HIVE-11042.3.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5);
> returns 5.
> It should return (ds%3D1)05.
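The intended contract can be sketched as follows. This is a hedged illustration (ReplaceTaskIdSketch is a made-up name, not Hive's actual Utilities code): only the trailing run of task-id digits is replaced, zero-padded to the same width, so a prefix such as the "(ds%3D1)" partition spec survives.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceTaskIdSketch {
    private static final Pattern TRAILING_DIGITS = Pattern.compile("(\\d+)$");

    // Replace the trailing digits of fileName with newId, zero-padded to the
    // same width, leaving everything before them untouched.
    static String replaceTaskId(String fileName, int newId) {
        Matcher m = TRAILING_DIGITS.matcher(fileName);
        if (!m.find()) {
            return fileName;  // no trailing task id to replace
        }
        int width = m.group(1).length();
        return fileName.substring(0, m.start(1))
            + String.format("%0" + width + "d", newId);
    }

    public static void main(String[] args) {
        // The example from the report: the "(ds%3D1)" prefix must survive.
        System.out.println(replaceTaskId("(ds%3D1)01", 5)); // (ds%3D1)05
    }
}
```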





[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593971#comment-14593971
 ] 

Chao Sun commented on HIVE-11042:
-

+1
Can you also remove the extra line before this method?

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5);
> returns 5.
> It should return (ds%3D1)05.





[jira] [Updated] (HIVE-11058) Make alter_merge* tests (ORC only) stable across different OSes

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11058:
-
Summary: Make alter_merge* tests (ORC only) stable across different OSes  
(was: Make alter_merge* tests stable across different OSes)

> Make alter_merge* tests (ORC only) stable across different OSes
> ---
>
> Key: HIVE-11058
> URL: https://issues.apache.org/jira/browse/HIVE-11058
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> alter_merge* (ORC only) tests are showing stats diff in different OSes.





[jira] [Updated] (HIVE-11058) Make alter_merge* tests stable across different OSes

2015-06-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11058:
-
Description: alter_merge* (ORC only) tests are showing stats diff in 
different OSes.  (was: alter_merge* tests are showing stats diff in different 
OSes.)

> Make alter_merge* tests stable across different OSes
> 
>
> Key: HIVE-11058
> URL: https://issues.apache.org/jira/browse/HIVE-11058
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> alter_merge* (ORC only) tests are showing stats diff in different OSes.





[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593843#comment-14593843
 ] 

Hive QA commented on HIVE-11042:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740664/HIVE-11042.2.patch

{color:green}SUCCESS:{color} +1 9011 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4323/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4323/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4323/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740664 - PreCommit-HIVE-TRUNK-Build

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5);
> returns 5.
> It should return (ds%3D1)05.





[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593833#comment-14593833
 ] 

Yongzhi Chen commented on HIVE-11042:
-

[~csun], I think replaceTaskId(String, int) is the right name; we may need to fix 
other method names. I will file a separate jira and work on that once I fully 
understand the other methods' use cases. 
I made the method public so that I could write unit tests for it. It should do no 
harm; a couple of the other replaceTask... methods are public too.
I do not know why my git diff kept giving me a previous version of the change, but 
after I created a new branch and cherry-picked my change, the issue was solved. 
Attaching a new version of patch 2. Thanks. 

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5);
> returns 5.
> It should return (ds%3D1)05.





[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11042:

Attachment: (was: HIVE-11042.2.patch)

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5);
> returns 5.
> It should return (ds%3D1)05.





[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11042:

Attachment: HIVE-11042.2.patch

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5);
> returns 5.
> It should return (ds%3D1)05.





[jira] [Updated] (HIVE-11057) HBase metastore chokes on partition with ':' in name

2015-06-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-11057:
--
Attachment: HIVE-11057.patch

This patch changes the separator from a colon to ^A.

> HBase metastore chokes on partition with ':' in name
> 
>
> Key: HIVE-11057
> URL: https://issues.apache.org/jira/browse/HIVE-11057
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11057.patch
>
>
> The HBase metastore uses ':' as a key separator when building keys for the 
> partition table.  This means that partitions with a colon in the name (which 
> is legal) cause problems.
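The failure mode can be sketched like this (illustrative only; KeySeparatorSketch is a made-up name, not the actual HBase metastore key code): splitting a colon-delimited key back apart mis-parses any partition value that itself contains ':', while a control character such as ^A (\u0001) splits cleanly, on the assumption that it never occurs inside a value.

```java
public class KeySeparatorSketch {
    // Join key components with the given separator.
    static String buildKey(String sep, String... parts) {
        return String.join(sep, parts);
    }

    // Split a key back into its components.
    static String[] splitKey(String sep, String key) {
        return key.split(java.util.regex.Pattern.quote(sep));
    }

    public static void main(String[] args) {
        String[] parts = {"default", "orders", "ts=12:30"};

        // With ':' as the separator, the colon inside the partition value
        // makes the key split into four fields instead of three.
        System.out.println(splitKey(":", buildKey(":", parts)).length);

        // With ^A (\u0001) as the separator, the split round-trips cleanly,
        // assuming the separator never appears inside a value.
        System.out.println(splitKey("\u0001", buildKey("\u0001", parts)).length);
    }
}
```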





[jira] [Commented] (HIVE-10970) Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs

2015-06-19 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593803#comment-14593803
 ] 

Yongzhi Chen commented on HIVE-10970:
-

[~vgumashta], have you found a way to reproduce this? If so, could you share it 
with me? HIVE-10453 has to be fixed. Thanks

> Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs
> 
>
> Key: HIVE-10970
> URL: https://issues.apache.org/jira/browse/HIVE-10970
> Project: Hive
>  Issue Type: Bug
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>






[jira] [Commented] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593772#comment-14593772
 ] 

Hive QA commented on HIVE-10594:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740689/HIVE-10594.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7987 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap_auto
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/897/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/897/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740689 - PreCommit-HIVE-SPARK-Build

> Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
> --
>
> Key: HIVE-10594
> URL: https://issues.apache.org/jira/browse/HIVE-10594
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Chao Sun
>Assignee: Xuefu Zhang
> Attachments: HIVE-10594.1-spark.patch
>
>
> Reporting a problem found by one of the HoS users:
> Currently, if a user is running Beeline on a different host than HS2, and 
> did not run kinit on the HS2 host, then they may get the following 
> error:
> {code}
> 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: 
> 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException 
> as:hive (auth:KERBEROS) cause:java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: 
> Exception in thread "main" java.io.IOException: Failed on local exception: 
> java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)]; Host Details : local host is: 
> "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: 
> "secure-hos-1.ent.cloudera.com":8032;
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at java.lang.reflect.Method.invoke(Method.java:606)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 2015-04-29 15:49:34,657 I

[jira] [Commented] (HIVE-11048) Make test cbo_windowing robust

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593771#comment-14593771
 ] 

Jesus Camacho Rodriguez commented on HIVE-11048:


LGTM, +1

> Make test cbo_windowing robust
> --
>
> Key: HIVE-11048
> URL: https://issues.apache.org/jira/browse/HIVE-11048
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11048.patch
>
>
> Add partition / order by in over clause to make result set deterministic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11034) Joining multiple tables producing different results with different order of join

2015-06-19 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-11034.

Resolution: Duplicate

I'm closing this as a duplicate. However, please feel free to reopen if the problem 
persists.

> Joining multiple tables producing different results with different order of 
> join
> 
>
> Key: HIVE-11034
> URL: https://issues.apache.org/jira/browse/HIVE-11034
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0
> Environment: Linux 2.6.32-279.19.1.el6.x86_64
>Reporter: Srini Pindi
>Priority: Critical
>
> {panel}
> A join between tables on different join columns from the main table yields 
> wrong results in Hive. 
> Changing the order of the joins between the main table and the other tables 
> produces different results.
> {panel}
> Please see below for the steps to reproduce the issue:
> 1. Create tables as follows:
> create table p(ck string, email string);
> create table a1(ck string, flag string);
> create table a2(email string, flag string);
> create table a3(ck string, flag string);
> 2. Load data into the tables as follows:
> P
> ||ck||email||
> |10|e10|
> |20|e20|
> |30|e30|
> |40|e40|
>  
> A1
> ||ck||flag||
> |10|N|
> |20|Y|
> |30|Y|
> |40|Y|
> A2
> ||email||flag||
> |e10|Y|
> |e20|N|
> |e30|Y|
> |e40|Y|
>  
> A3
> ||ck||flag||
> |10|Y|
> |20|Y|
> |30|N|
> |40|Y|
>   
>  3. Good query:
> {panel}
> select p.ck 
> from p 
> left outer join a1 on p.ck = a1.ck 
> left outer join a3 on p.ck = a3.ck 
> left outer join a2 on p.email = a2.email 
> where a1.flag = 'Y'
>   and a3.flag = 'Y'
>   and a2.flag = 'Y'
> ;
> {panel}
> and results are
>   40
> 4. Bad query
> {panel}
> select p.ck 
> from p 
> left outer join a1 on p.ck = a1.ck 
> left outer join a2 on p.email = a2.email 
> left outer join a3 on p.ck = a3.ck 
> where a1.flag = 'Y'
>   and a2.flag = 'Y'
>   and a3.flag = 'Y'
> ;
> {panel}
>  Producing results as:
>  30
>  40
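Because the WHERE clause filters on columns from a1, a2, and a3, the left outer joins above degenerate to inner joins, so both join orders should logically return only ck = 40. A minimal sanity-check sketch in plain Python (table contents copied from the description; the join-order parameter only reorders the lookups, which cannot change the result):

```python
# Tables from the issue description, as simple lookup dicts.
p  = {"10": "e10", "20": "e20", "30": "e30", "40": "e40"}   # ck -> email
a1 = {"10": "N", "20": "Y", "30": "Y", "40": "Y"}           # ck -> flag
a2 = {"e10": "Y", "e20": "N", "e30": "Y", "e40": "Y"}       # email -> flag
a3 = {"10": "Y", "20": "Y", "30": "N", "40": "Y"}           # ck -> flag

def expected(join_order):
    # With the flag = 'Y' filters, each "left outer join" must behave like an
    # inner join, so evaluation order is irrelevant for the correct result.
    checks = {"a1": lambda ck: a1.get(ck) == "Y",
              "a2": lambda ck: a2.get(p[ck]) == "Y",
              "a3": lambda ck: a3.get(ck) == "Y"}
    return sorted(ck for ck in p if all(checks[t](ck) for t in join_order))
```

Both orderings from the report (a1, a3, a2 and a1, a2, a3) give `["40"]` here, which matches the "good" query and shows the "bad" query's extra row 30 is wrong.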



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593714#comment-14593714
 ] 

Hive QA commented on HIVE-10533:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740660/HIVE-10533.04.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9012 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join2
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4322/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4322/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4322/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740660 - PreCommit-HIVE-TRUNK-Build

> CBO (Calcite Return Path): Join to MultiJoin support for outer joins
> 
>
> Key: HIVE-10533
> URL: https://issues.apache.org/jira/browse/HIVE-10533
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, 
> HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, 
> HIVE-10533.patch
>
>
> CBO return path: auto_join7.q can be used to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.

2015-06-19 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593685#comment-14593685
 ] 

Alan Gates commented on HIVE-10972:
---

Yes, you're right.  I see where it's getting the parent locks.

+1 to committing this patch.

> DummyTxnManager always locks the current database in shared mode, which is 
> incorrect.
> -
>
> Key: HIVE-10972
> URL: https://issues.apache.org/jira/browse/HIVE-10972
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10972.2.patch, HIVE-10972.patch
>
>
> In DummyTxnManager [line 163 | 
> http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163],
>  it always locks the current database. 
> That is not correct since the current database can be "db1", and the query 
> can be "select * from db2.tb1", which will lock db1 unnecessarily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10479) CBO: Calcite Operator To Hive Operator (Calcite Return Path) Empty tabAlias in columnInfo which triggers PPD

2015-06-19 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593672#comment-14593672
 ] 

Laljo John Pullokkaran commented on HIVE-10479:
---

+1

> CBO: Calcite Operator To Hive Operator (Calcite Return Path) Empty tabAlias 
> in columnInfo which triggers PPD
> 
>
> Key: HIVE-10479
> URL: https://issues.apache.org/jira/browse/HIVE-10479
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10479.01.patch, HIVE-10479.02.patch, 
> HIVE-10479.03.patch, HIVE-10479.patch
>
>
> in ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 477, 
> when aliases contains empty string "" and key is an empty string "" too, it 
> assumes that aliases contains key. This will trigger incorrect PPD. To 
> reproduce it, apply the HIVE-10455 and run cbo_subq_notin.q.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]

2015-06-19 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10594:
---
Attachment: HIVE-10594.1-spark.patch

> Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
> --
>
> Key: HIVE-10594
> URL: https://issues.apache.org/jira/browse/HIVE-10594
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Chao Sun
>Assignee: Xuefu Zhang
> Attachments: HIVE-10594.1-spark.patch
>
>
> Reporting problem found by one of the HoS users:
> Currently, if user is running Beeline on a different host than HS2, and 
> he/she didn't do kinit on the HS2 host, then he/she may get the following 
> error:
> {code}
> 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: 
> 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException 
> as:hive (auth:KERBEROS) cause:java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: 
> Exception in thread "main" java.io.IOException: Failed on local exception: 
> java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)]; Host Details : local host is: 
> "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: 
> "secure-hos-1.ent.cloudera.com":8032;
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at java.lang.reflect.Method.invoke(Method.java:606)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90)
> 2015-04-29 15:49:34,658 INFO org.ap

[jira] [Assigned] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]

2015-06-19 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-10594:
--

Assignee: Xuefu Zhang

> Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
> --
>
> Key: HIVE-10594
> URL: https://issues.apache.org/jira/browse/HIVE-10594
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Chao Sun
>Assignee: Xuefu Zhang
>
> Reporting problem found by one of the HoS users:
> Currently, if user is running Beeline on a different host than HS2, and 
> he/she didn't do kinit on the HS2 host, then he/she may get the following 
> error:
> {code}
> 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: 
> 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException 
> as:hive (auth:KERBEROS) cause:java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: 
> Exception in thread "main" java.io.IOException: Failed on local exception: 
> java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)]; Host Details : local host is: 
> "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: 
> "secure-hos-1.ent.cloudera.com":8032;
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at java.lang.reflect.Method.invoke(Method.java:606)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90)
> 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apa

[jira] [Resolved] (HIVE-11000) Hive not able to pass Hive's Kerberos credential to spark-submit process [Spark Branch]

2015-06-19 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-11000.

Resolution: Duplicate

> Hive not able to pass Hive's Kerberos credential to spark-submit process 
> [Spark Branch]
> ---
>
> Key: HIVE-11000
> URL: https://issues.apache.org/jira/browse/HIVE-11000
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>
> The end of the result is that manual kinit with Hive's keytab on the host 
> where HS2 is running, or the following error may appear:
> {code}
> 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: 
> 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException 
> as:hive (auth:KERBEROS) cause:java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: 
> Exception in thread "main" java.io.IOException: Failed on local exception: 
> java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)]; Host Details : local host is: 
> "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: 
> "secure-hos-1.ent.cloudera.com":8032;
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at java.lang.reflect.Method.invoke(Method.java:606)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90)
> 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.run(Client.scala:619)
> 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClien

[jira] [Updated] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]

2015-06-19 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10594:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-7292

> Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
> --
>
> Key: HIVE-10594
> URL: https://issues.apache.org/jira/browse/HIVE-10594
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Chao Sun
>
> Reporting problem found by one of the HoS users:
> Currently, if user is running Beeline on a different host than HS2, and 
> he/she didn't do kinit on the HS2 host, then he/she may get the following 
> error:
> {code}
> 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: 
> 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException 
> as:hive (auth:KERBEROS) cause:java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: 
> Exception in thread "main" java.io.IOException: Failed on local exception: 
> java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)]; Host Details : local host is: 
> "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: 
> "secure-hos-1.ent.cloudera.com":8032;
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at java.lang.reflect.Method.invoke(Method.java:606)
> 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at 
> org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49)
> 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90)
> 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl:
>   at org.apache.s

[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11037:
---
Attachment: HIVE-11037.03.patch

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, 
> HIVE-11037.03.patch
>
>
> In Hive-9780, we introduced a new level of explain for hive on tez. We would 
> like to make it running by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10996:
---
Attachment: HIVE-10996.06.patch

Updating q files.

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, 
> HIVE-10996.06.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context 
> but what lead us to this issue was select count(distinct s) was not returning 
> results. The above queries are the simplified queries that produce the issue. 
> I will note that if I convert the inner join to a table and select from that 
> the issue does not appear.
> Update: Found that turning off  hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593597#comment-14593597
 ] 

Chao Sun commented on HIVE-11042:
-

But it is confusing to have two replaceTaskId methods (albeit with slightly 
different parameters) doing very different things.
Maybe rename them? Also, why is this method public instead of private? The 
comment for Patch #2 is still not changed.

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5) 
> returns 5.
> It should return (ds%3D1)05.
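
The intended behavior can be sketched independently of Hive (the class name below is illustrative, not Hive's actual Utilities code): replace the trailing task-id digits with the new number, zero-padded to the same width, preserving any non-digit prefix.

```java
// Illustrative sketch of the behavior described above, not Hive's actual
// Utilities.replaceTaskId implementation.
public class ReplaceTaskIdSketch {
    // Replace the trailing digits of s with newId, zero-padded to the
    // same width, keeping any non-digit prefix such as "(ds%3D1)".
    static String replaceTaskId(String s, int newId) {
        int end = s.length();
        int start = end;
        while (start > 0 && Character.isDigit(s.charAt(start - 1))) {
            start--;
        }
        if (start == end) {
            return s; // no trailing task id to replace
        }
        String padded = String.format("%0" + (end - start) + "d", newId);
        return s.substring(0, start) + padded;
    }

    public static void main(String[] args) {
        // Prints "(ds%3D1)05", the value the report says should be returned.
        System.out.println(replaceTaskId("(ds%3D1)01", 5));
    }
}
```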



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593595#comment-14593595
 ] 

Hive QA commented on HIVE-10996:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740656/HIVE-10996.05.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9011 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_having
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4321/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4321/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4321/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740656 - PreCommit-HIVE-TRUNK-Build

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, 
> HIVE-10996.patch, explain_q1.txt, explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context, 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are the simplified queries that produce 
> the issue. I will note that if I convert the inner join to a table and select 
> from that, the issue does not appear.
> Update: Found that turning off hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435

[jira] [Commented] (HIVE-11043) ORC split strategies should adapt based on number of files

2015-06-19 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593575#comment-14593575
 ] 

Prasanth Jayachandran commented on HIVE-11043:
--

Mostly looks good. 
A few questions/comments:
1) Can we use the same default for numSplits as MR, i.e. 1 instead of -1? This 
will make the ETL strategy the default even in the presence of a single small 
file.
{code}
return generateSplitsInfo(conf, -1);
{code}
2) The condition should be numFiles <= context.minSplits, right? This will 
avoid choosing BI in the case of 1 small file.
3) I tried some queries, and the numSplits arg in getSplits() can become 0, in 
which case we will end up using BI as the default even though there are only a 
small number of files.
4) Some more tests for these corner cases would be helpful.
5) Should we make this independently configurable instead of using the cache 
max size?
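
The selection logic under discussion can be roughly sketched as follows (class, enum, and parameter names are illustrative, not Hive's actual OrcInputFormat code): pick ETL when the file count is at or below minSplits or the files are large on average, and BI otherwise.

```java
// Rough sketch of the strategy selection under discussion; names and
// thresholds are illustrative, not Hive's actual OrcInputFormat logic.
public class SplitStrategyChooser {
    enum Strategy { ETL, BI }

    static Strategy choose(int numFiles, long totalSize,
                           int minSplits, long maxAvgFileSize) {
        if (numFiles == 0) {
            return Strategy.ETL;
        }
        // Point 2 above: use <= so a single small file still picks ETL.
        if (numFiles <= minSplits) {
            return Strategy.ETL;
        }
        long avgFileSize = totalSize / numFiles;
        // Few large files favor footer-reading ETL splits; many small
        // files favor the cheaper BI strategy.
        return avgFileSize > maxAvgFileSize ? Strategy.ETL : Strategy.BI;
    }
}
```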

> ORC split strategies should adapt based on number of files
> --
>
> Key: HIVE-11043
> URL: https://issues.apache.org/jira/browse/HIVE-11043
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.0.0
>
> Attachments: HIVE-11043.1.patch
>
>
> ORC split strategies added in HIVE-10114 chose strategies based on average 
> file size. It would be beneficial to choose a different strategy based on 
> number of files as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593540#comment-14593540
 ] 

Hive QA commented on HIVE-10999:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740653/HIVE-10999.2-spark.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8017 tests executed
*Failed tests:*
{noformat}
TestCliDriver-interval_udf.q-metadataonly1.q-union13.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/896/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/896/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-896/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740653 - PreCommit-HIVE-SPARK-Build

> Upgrade Spark dependency to 1.4 [Spark Branch]
> --
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch, 
> HIVE-10999.2-spark.patch
>
>
> Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to 
> 1.4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593521#comment-14593521
 ] 

Yongzhi Chen commented on HIVE-11042:
-

PATCH 2 is attached. Please review. 

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5) 
> returns 5.
> It should return (ds%3D1)05.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11042:

Attachment: HIVE-11042.2.patch

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5) 
> returns 5.
> It should return (ds%3D1)05.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593492#comment-14593492
 ] 

Yongzhi Chen commented on HIVE-11042:
-

[~csun], it would cause more confusion if I changed the existing 
replaceTaskId(String, String): the major part is different and the two cannot 
be combined. In replaceTaskId(String param1, String param2), the use case is a 
pattern in param2: it gets the pattern from param2, gets the length from 
param1, and replaces within param2. param2 is called a bucket number, but I 
think it is mainly for bucket files.
replaceTaskId(String param1, int param2) gets the pattern and length from 
param1 and uses the number param2.
It is not just switching the order of the two params. 
I will change the comment and resubmit the patch. Thanks

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch
>
>
> While looking at another bug, I found that the Utilities.replaceTaskId(String, 
> int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5) 
> returns 5.
> It should return (ds%3D1)05.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10533:
---
Attachment: HIVE-10533.04.patch

> CBO (Calcite Return Path): Join to MultiJoin support for outer joins
> 
>
> Key: HIVE-10533
> URL: https://issues.apache.org/jira/browse/HIVE-10533
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, 
> HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, 
> HIVE-10533.patch
>
>
> CBO return path: auto_join7.q can be used to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10996:
---
Attachment: HIVE-10996.05.patch

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, 
> HIVE-10996.patch, explain_q1.txt, explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context, 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are the simplified queries that produce 
> the issue. I will note that if I convert the inner join to a table and select 
> from that, the issue does not appear.
> Update: Found that turning off hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-19 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10999:
--
Attachment: HIVE-10999.2-spark.patch

Can't reproduce the failures locally. Try again.

> Upgrade Spark dependency to 1.4 [Spark Branch]
> --
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch, 
> HIVE-10999.2-spark.patch
>
>
> Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to 
> 1.4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593410#comment-14593410
 ] 

Hive QA commented on HIVE-10996:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740625/HIVE-10996.04.patch

{color:red}ERROR:{color} -1 due to 125 failed/errored test(s), 9011 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_count
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_distinct_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_eq_with_case_when
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_distinct_samekey
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_duplicate_key
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_insert_common_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_position
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join18_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataOnlyOptimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_gby3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup4_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqual_corr_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_count
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_count_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_leftsemi_mapjoin
org.apache.h

[jira] [Commented] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593406#comment-14593406
 ] 

Hive QA commented on HIVE-10999:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740629/HIVE-10999.2-spark.patch

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 7972 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_bigdata
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_leftsemijoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin_noskew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_pushdown
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/895/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/895/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-895/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740629 - PreCommit-HIVE-SPARK-Build

> Upgrade Spark dependency to 1.4 [Spark Branch]
> --
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch
>
>
> Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to 
> 1.4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11054) Read error : Partition Varchar column cannot be cast to string

2015-06-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593402#comment-14593402
 ] 

Xuefu Zhang commented on HIVE-11054:


I think this might be fixed in latest Hive already. [~ctang.ma], any comments?

> Read error : Partition Varchar column cannot be cast to string
> --
>
> Key: HIVE-11054
> URL: https://issues.apache.org/jira/browse/HIVE-11054
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 0.14.0
>Reporter: Devansh Srivastava
>
> Hi,
> I have one table with VARCHAR and CHAR datatypes. My target table has the 
> following structure:
> CREATE EXTERNAL TABLE test_table(
> dob string COMMENT '',
> version_nbr int COMMENT '',
> record_status string COMMENT '',
> creation_timestamp timestamp COMMENT '')
> PARTITIONED BY (
> src_sys_cd varchar(10) COMMENT '',batch_id string COMMENT '')
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> STORED AS ORC
> LOCATION
> '/test/test_table';
> My source table has the following structure:
> CREATE EXTERNAL TABLE test_staging_table(
> dob string COMMENT '',
> version_nbr int COMMENT '',
> record_status string COMMENT '',
> creation_timestamp timestamp COMMENT '',
> src_sys_cd varchar(10) COMMENT '',
> batch_id string COMMENT '')
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> STORED AS ORC
> LOCATION
> '/test/test_staging_table';
> We were loading data using a Pig script. It is a direct load; no 
> transformation is needed. But when I was checking test_table's data in Hive, 
> it gave the error below:
> Diagnostic Messages for this Task:
> Error: java.io.IOException: java.io.IOException: java.lang.RuntimeException: 
> java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar 
> cannot be cast to java.lang.String
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:273)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:183)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.io.IOException: java.lang.RuntimeException: 
> java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar 
> cannot be cast to java.lang.String
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:271)
> ... 11 more
> Caused by: java.lang.RuntimeException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to 
> java.lang.String
> at 
> org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:95)
> at 
> org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:49)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
> ... 15 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to 
> java.lang.String
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(V

[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.

2015-06-19 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593401#comment-14593401
 ] 

Aihua Xu commented on HIVE-10972:
-

[~alangates] Ping. +[~ashutoshc] as well, since it seems you also have 
knowledge on that front.

> DummyTxnManager always locks the current database in shared mode, which is 
> incorrect.
> -
>
> Key: HIVE-10972
> URL: https://issues.apache.org/jira/browse/HIVE-10972
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10972.2.patch, HIVE-10972.patch
>
>
> In DummyTxnManager [line 163 | 
> http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163],
>  it always locks the current database. 
> That is not correct since the current database can be "db1", and the query 
> can be "select * from db2.tb1", which will lock db1 unnecessarily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-19 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10999:
--
Attachment: HIVE-10999.2-spark.patch

Talked about this with Chengxiang. We think the reason is that Spark doesn't 
depend on {{jersey-servlet}}. So during tests, we only add {{jersey-server}} 
to the classpath, at version 1.14 (although I don't know why Maven doesn't pick 
up the 1.9 version, which is what Spark really depends on). Then we get the 
class-not-found error. There's no problem at runtime because we'll have 
spark-assembly on the classpath, which is why the qtests passed.

To solve the issue, we can either explicitly add the {{jersey-servlet}} 
dependency to the failed tests, or we can change Hive's Jersey version to 1.9 and 
remove the dependency on {{jersey-servlet}} (it doesn't exist in 1.9).
Patch v2 takes the first approach, which is simpler, but I think the second 
approach may be better, if possible.
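
For the first approach, the missing test dependency could be declared roughly 
like this (a sketch of a pom.xml fragment; the 1.14 version and the test scope 
are assumptions based on the versions discussed above):

```xml
<!-- Hypothetical pom.xml fragment: add jersey-servlet for tests only. -->
<dependency>
  <groupId>com.sun.jersey</groupId>
  <artifactId>jersey-servlet</artifactId>
  <version>1.14</version>
  <scope>test</scope>
</dependency>
```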

> Upgrade Spark dependency to 1.4 [Spark Branch]
> --
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch
>
>
> Spark 1.4.0 is release. Let's update the dependency version from 1.3.1 to 
> 1.4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10996:
---
Attachment: HIVE-10996.04.patch

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.patch, explain_q1.txt, 
> explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context, 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are the simplified queries that reproduce 
> the issue. I will note that if I convert the inner join to a table and select 
> from that, the issue does not appear.
> Update: Found that turning off  hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435
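
The session-level workaround mentioned in the update above can be applied 
directly (the property name is taken from this report; verify it against your 
Hive version):

```sql
-- Workaround from the report: disable identity-project removal for the session.
set hive.optimize.remove.identity.project=false;
```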



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593269#comment-14593269
 ] 

Hive QA commented on HIVE-7193:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740574/HIVE-7193.5.patch

{color:green}SUCCESS:{color} +1 9010 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4319/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4319/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4319/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740574 - PreCommit-HIVE-TRUNK-Build

> Hive should support additional LDAP authentication parameters
> -
>
> Key: HIVE-7193
> URL: https://issues.apache.org/jira/browse/HIVE-7193
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Mala Chikka Kempanna
>Assignee: Naveen Gangam
> Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, 
> HIVE-7193.5.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, 
> LDAPAuthentication_Design_Doc_V2.docx
>
>
> Currently hive has only following authenticator parameters for LDAP 
> authentication for hiveserver2:
> {code:xml}
>  
>   hive.server2.authentication 
>   LDAP 
>  
>  
>   hive.server2.authentication.ldap.url 
>   ldap://our_ldap_address 
>  
> {code}
> We need to include other LDAP properties as part of hive-LDAP authentication 
> like below:
> {noformat}
> a group search base -> dc=domain,dc=com 
> a group search filter -> member={0} 
> a user search base -> dc=domain,dc=com 
> a user search filter -> sAMAAccountName={0} 
> a list of valid user groups -> group1,group2,group3 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593260#comment-14593260
 ] 

Jesus Camacho Rodriguez commented on HIVE-10996:


[~jpullokkaran], thanks for the comments.
1) You are right, the solution should not be tailored towards the FIL operator 
only, as this could happen in other cases; I will change the patch accordingly.
2) The error is produced because the SEL operator above the FIL operator is 
removed by the IdentityProjectRemoval optimization (as SEL and FIL have the same 
schema, this is what IdentityProjectRemoval is supposed to do). But that SEL 
operator was actually pruning columns out of the input tuples.
In the specific example provided in this case, after the SEL operator there is 
a JOIN operator, which was then joining on the wrong columns and thus not 
producing the right results.

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like 
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context, 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are the simplified queries that reproduce 
> the issue. I will note that if I convert the inner join to a table and select 
> from that, the issue does not appear.
> Update: Found that turning off  hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8128) Improve Parquet Vectorization

2015-06-19 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593228#comment-14593228
 ] 

Dong Chen commented on HIVE-8128:
-

Sure, I did a quick update of Hive using the new changes, and the test results 
seem fine! Thanks, [~nezihyigitbasi]

I will write the complete code and try more cases to see what can be found next 
week.

> Improve Parquet Vectorization
> -
>
> Key: HIVE-8128
> URL: https://issues.apache.org/jira/browse/HIVE-8128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Dong Chen
> Fix For: parquet-branch
>
> Attachments: HIVE-8128-parquet.patch.POC, HIVE-8128.1-parquet.patch
>
>
> NO PRECOMMIT TESTS
> What we'll want to do is finish the vectorization work (e.g. VectorizedOrcSerde, 
> VectorizedOrcSerde) which was partially done in HIVE-5998.
> As discussed in PARQUET-131, we will work out a Hive POC based on the new 
> Parquet vectorized API, and then finish the implementation after it is finalized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4605) Hive job fails while closing reducer output - Unable to rename

2015-06-19 Thread Benoit Perroud (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593185#comment-14593185
 ] 

Benoit Perroud commented on HIVE-4605:
--

I'm seeing this error too, with Hive 0.13, but it's because another process 
deleted the {{_tmp.-ext-10001}} folder.
So it's not really a bug from my perspective. 
To find the guilty process, have a look at the HDFS audit logs 
to figure out who deleted the folder.
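
The audit-log check described above can be sketched like this (the log path and 
entry format are assumptions; adjust both for your cluster, and note the sample 
entry below is fabricated for illustration):

```shell
# Assumed location; often /var/log/hadoop-hdfs/hdfs-audit.log on the NameNode.
AUDIT_LOG=hdfs-audit.log

# Fabricated sample entry in the usual NameNode audit format, so the
# pipeline below has something to run against.
cat > "$AUDIT_LOG" <<'EOF'
2015-06-19 10:01:02,345 INFO FSNamesystem.audit: allowed=true ugi=etl_user ip=/10.0.0.7 cmd=delete src=/tmp/hive-hadoop/_tmp.-ext-10001 dst=null perm=null
EOF

# cmd=delete entries carry the caller in the ugi= field; filter by the
# scratch-directory name to find who removed it.
grep 'cmd=delete' "$AUDIT_LOG" | grep -- '-ext-10001'
```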

> Hive job fails while closing reducer output - Unable to rename
> --
>
> Key: HIVE-4605
> URL: https://issues.apache.org/jira/browse/HIVE-4605
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
> Environment: OS: 2.6.18-194.el5xen #1 SMP Fri Apr 2 15:34:40 EDT 2010 
> x86_64 x86_64 x86_64 GNU/Linux
> Hadoop 1.1.2
>Reporter: Link Qian
>Assignee: Brock Noland
> Attachments: HIVE-4605.patch
>
>
> 1. Create a table with the ORC storage model:
> create table iparea_analysis_orc (network int, ip string,   )
> stored as ORC;
> 2. insert table iparea_analysis_orc select  network, ip,  ; the script 
> succeeds, but fails after the *OVERWRITE* keyword is added. The main error log 
> is listed here.
> java.lang.RuntimeException: Hive Runtime Error while closing operators: Unable 
> to rename output from: 
> hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0
>  to: 
> hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0
>   at 
> org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:317)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename 
> output from: 
> hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0
>  to: 
> hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:108)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:867)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:309)
>   ... 7 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593170#comment-14593170
 ] 

Hive QA commented on HIVE-11037:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740570/HIVE-11037.02.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9011 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4318/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4318/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4318/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740570 - PreCommit-HIVE-TRUNK-Build

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch
>
>
> In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would 
> like to make it run by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11040) Change Derby dependency version to 10.10.2.0

2015-06-19 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-11040:

Component/s: Metastore

> Change Derby dependency version to 10.10.2.0
> 
>
> Key: HIVE-11040
> URL: https://issues.apache.org/jira/browse/HIVE-11040
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 1.2.1, 2.0.0
>
> Attachments: HIVE-11040.1.patch
>
>
> We don't see this on the Apache pre-commit tests because it uses PTest, but 
> running the entire TestCliDriver suite results in failures in some of the 
> partition-related qtests (partition_coltype_literals, partition_date, 
> partition_date2). I've only really seen this on Linux (I was using CentOS).
> HIVE-8879 changed the Derby dependency version from 10.10.1.1 to 10.11.1.1. 
> Testing with 10.10.1.1 or 10.10.2.0 seems to allow the partition-related 
> tests to pass. I'd like to change the dependency version to 10.10.2.0, since 
> that version should also contain the fix for HIVE-8879.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11040) Change Derby dependency version to 10.10.2.0

2015-06-19 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-11040:

Fix Version/s: 2.0.0

> Change Derby dependency version to 10.10.2.0
> 
>
> Key: HIVE-11040
> URL: https://issues.apache.org/jira/browse/HIVE-11040
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 1.2.1, 2.0.0
>
> Attachments: HIVE-11040.1.patch
>
>
> We don't see this on the Apache pre-commit tests because it uses PTest, but 
> running the entire TestCliDriver suite results in failures in some of the 
> partition-related qtests (partition_coltype_literals, partition_date, 
> partition_date2). I've only really seen this on Linux (I was using CentOS).
> HIVE-8879 changed the Derby dependency version from 10.10.1.1 to 10.11.1.1. 
> Testing with 10.10.1.1 or 10.10.2.0 seems to allow the partition-related 
> tests to pass. I'd like to change the dependency version to 10.10.2.0, since 
> that version should also contain the fix for HIVE-8879.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593141#comment-14593141
 ] 

Chao Sun commented on HIVE-11042:
-

Also, the comments for this method need a little improvement. For instance, "for 
example" -> "For example", "This method, pattern is in taskId." -> "In this 
method, pattern is in taskId", etc.

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch
>
>
> While I was looking at another bug, I found that the 
> Utilities.replaceTaskId(String, int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5); 
> returns 5.
> It should return (ds%3D1)05.
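
A minimal sketch of the behavior this report asks for (an illustration only, not 
Hive's actual patch; the regex-based approach and the class name 
ReplaceTaskIdSketch are my own):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceTaskIdSketch {
    // Replace the trailing digit run of taskId with bucketNum, zero-padded
    // to the same width, keeping any prefix such as "(ds%3D1)" intact.
    public static String replaceTaskId(String taskId, int bucketNum) {
        Matcher m = Pattern.compile("[0-9]+$").matcher(taskId);
        if (!m.find()) {
            return taskId; // no numeric suffix to replace
        }
        int width = m.group().length();
        String padded = String.format("%0" + width + "d", bucketNum);
        return taskId.substring(0, m.start()) + padded;
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("(ds%3D1)01", 5)); // prints (ds%3D1)05
    }
}
```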



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method

2015-06-19 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593140#comment-14593140
 ] 

Chao Sun commented on HIVE-11042:
-

This code overlaps a lot with the existing replaceTaskId(String, String). Is it 
possible to just modify that method?

> Need fix Utilities.replaceTaskId method
> ---
>
> Key: HIVE-11042
> URL: https://issues.apache.org/jira/browse/HIVE-11042
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11042.1.patch
>
>
> While I was looking at another bug, I found that the 
> Utilities.replaceTaskId(String, int) method is not right.
> For example, 
> Utilities.replaceTaskId("(ds%3D1)01", 5); 
> returns 5.
> It should return (ds%3D1)05.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10746) Hive 1.2.0+Tez produces 1-byte FileSplits from mapred.TextInputFormat

2015-06-19 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-10746:

Fix Version/s: 2.0.0

>  Hive 1.2.0+Tez produces 1-byte FileSplits from mapred.TextInputFormat
> --
>
> Key: HIVE-10746
> URL: https://issues.apache.org/jira/browse/HIVE-10746
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 0.14.0, 0.14.1, 1.2.0, 1.1.0, 1.1.1
>Reporter: Greg Senia
>Assignee: Gopal V
>Priority: Critical
> Fix For: 1.2.1, 2.0.0
>
> Attachments: HIVE-10746.1.patch, HIVE-10746.2.patch, 
> slow_query_output.zip
>
>
> The following query: 
> {code:sql}
> SELECT appl_user_id, arsn_cd, COUNT(*) as RecordCount FROM adw.crc_arsn GROUP 
> BY appl_user_id,arsn_cd ORDER BY appl_user_id;
> {code}
> runs consistently fast in Spark and MapReduce on Hive 1.2.0. When attempting 
> to run this same query against Tez as the execution engine, it consistently 
> runs for over 300-500 seconds, which seems extremely long. This is a basic 
> external table delimited by tabs, with a single file in a folder. In Hive 
> 0.13 this query with Tez runs fast; I tested with Hive 0.14, 0.14.1/1.0.0 
> and now Hive 1.2.0, and there clearly is something going awry with Hive w/Tez 
> as an execution engine on single-file or small-file tables. I can attach further 
> logs if someone needs them for deeper analysis.
> HDFS Output:
> {noformat}
> hadoop fs -ls /example_dw/crc/arsn
> Found 2 items
> -rwxr-x---   6 loaduser hadoopusers  0 2015-05-17 20:03 
> /example_dw/crc/arsn/_SUCCESS
> -rwxr-x---   6 loaduser hadoopusers3883880 2015-05-17 20:03 
> /example_dw/crc/arsn/part-m-0
> {noformat}
> Hive Table Describe:
> {noformat}
> hive> describe formatted crc_arsn;
> OK
> # col_name  data_type   comment 
>  
> arsn_cd string  
> clmlvl_cd   string  
> arclss_cd   string  
> arclssg_cd  string  
> arsn_prcsr_rmk_ind  string  
> arsn_mbr_rspns_ind  string  
> savtyp_cd   string  
> arsn_eff_dt string  
> arsn_exp_dt string  
> arsn_pstd_dts   string  
> arsn_lstupd_dts string  
> arsn_updrsn_txt string  
> appl_user_idstring  
> arsntyp_cd  string  
> pre_d_indicator string  
> arsn_display_txtstring  
> arstat_cd   string  
> arsn_tracking_nostring  
> arsn_cstspcfc_ind   string  
> arsn_mstr_rcrd_ind  string  
> state_specific_ind  string  
> region_specific_in  string  
> arsn_dpndnt_cd  string  
> unit_adjustment_in  string  
> arsn_mbr_only_ind   string  
> arsn_qrmb_ind   string  
>  
> # Detailed Table Information 
> Database:   adw  
> Owner:  loadu...@exa.example.com   
> CreateTime: Mon Apr 28 13:28:05 EDT 2014 
> LastAccessTime: UNKNOWN  
> Protect Mode:   None 
> Retention:  0
> Location:   hdfs://xhadnnm1p.example.com:8020/example_dw/crc/arsn 
>
> Table Type: EXTERNAL_TABLE   
> Table Parameters:
> EXTERNALTRUE
> transient_lastDdlTime   1398706085  
>  
> # Storage Information
> SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat:org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat:   
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed: No   
> Num Buckets: